Search Results - "Proceedings of the 26th International Symposium on Computer Architecture (Cat. No.99CB36367)"

  1.

    PipeRench: a coprocessor for streaming multimedia acceleration by Goldstein, S.C., Schmit, H., Moe, M., Budiu, M., Cadambi, S., Taylor, R.R., Laufer, R.

    “…Future computing workloads will emphasize an architecture's ability to perform relatively simple calculations on massive quantities of mixed-width data. This…”
    Conference Proceeding
  2.

    A performance comparison of contemporary DRAM architectures by Cuppu, V., Jacob, B., Davis, B., Mudge, T.

    “…In response to the growing gap between memory access time and processor speed, DRAM manufacturers have created several new DRAM architectures. This paper…”
    Conference Proceeding
  3.

    Simultaneous subordinate microthreading (SSMT) by Chappell, R.S., Stark, J., Kim, S.P., Reinhardt, S.K., Patt, Y.N.

    “…Current work in Simultaneous Multithreading provides little benefit to programs that aren't partitioned into threads. We propose Simultaneous Subordinate…”
    Conference Proceeding
  4.

    Effective jump-pointer prefetching for linked data structures by Roth, A., Sohi, G.S.

    “…Current techniques for prefetching linked data structures (LDS) exploit the work available in one loop iteration or recursive call to overlap pointer chasing…”
    Conference Proceeding
  5.

    Speculation techniques for improving load related instruction scheduling by Yoaz, A., Erez, M., Ronen, R., Jourdan, S.

    “…State of the art microprocessors achieve high performance by executing multiple instructions per cycle. In an out-of-order engine, the instruction scheduler is…”
    Conference Proceeding
  6.

    Selective value prediction by Calder, B., Reinman, G., Tullsen, D.M.

    “…Value prediction is a relatively new technique to increase instruction-level parallelism by breaking true data dependence chains. A value prediction…”
    Conference Proceeding
  7.

    A scalable front-end architecture for fast instruction delivery by Reinman, G., Austin, T., Calder, B.

    “…In the pursuit of instruction-level parallelism, significant demands are placed on a processor's instruction delivery mechanism. Delivering the performance…”
    Conference Proceeding
  8.

    A hardware-driven profiling scheme for identifying program hot spots to support runtime optimization by Merten, M.C., Trick, A.R., George, C.N., Gyllenhaal, J.C., Hwu, W.W.

    “…This paper presents a novel hardware-based approach for identifying, profiling, and monitoring hot spots in order to support runtime optimization of…”
    Conference Proceeding
  9.

    Multicast snooping: a new coherence method using a multicast address network by Bilir, E.E., Dickson, R.M., Ying Hu, Plakal, M., Sorin, D.J., Hill, M.D., Wood, D.A.

    “…This paper proposes a new coherence method called "multicast snooping" that dynamically adapts between broadcast snooping and a directory protocol. Multicast…”
    Conference Proceeding
  10.

    Correlated load-address predictors by Bekerman, M., Jourdan, S., Ronen, R., Kirshenboim, G., Rappoport, L., Yoaz, A., Weiser, U.

    “…As microprocessors become faster, the relative performance cost of memory accesses increases. Bigger and faster caches significantly reduce the absolute…”
    Conference Proceeding
  11.

    The block-based trace cache by Black, B., Rychlik, B., Shen, J.P.

    “…The trace cache is a recently proposed solution to achieving high instruction fetch bandwidth by buffering and reusing dynamic instruction traces. This work…”
    Conference Proceeding
  12.

    Maps: a compiler-managed memory system for Raw machines by Barua, R., Lee, W., Amarasinghe, S., Agarwal, A.

    “…This paper describes Maps, a compiler managed memory system for Raw architectures. Traditional processors for sequential programs maintain the abstraction of a…”
    Conference Proceeding
  13.

    Storageless value prediction using prior register values by Tullsen, D.M., Seng, J.S.

    “…This paper presents a technique called register value prediction (RVP) which uses a type of locality called register-value reuse. By predicting that an…”
    Conference Proceeding
  14.

    Scaling application performance on a cache-coherent multiprocessor by Dongming Jiang, Singh, J.P.

    “…Hardware-coherent, distributed shared address space systems are increasingly successful at moderate scale. However, it is unclear whether, or with how much…”
    Conference Proceeding
  15.

    Commit-Reconcile and Fences (CRF): a new memory model for architects and compiler writers by Xiaowei Shen, Arvind, Rudolph, L.

    “…We present a new mechanism-oriented memory model called Commit-Reconcile & Fences (CRF) and define it using algebraic rules. Many existing memory models can be…”
    Conference Proceeding
  16.

    Decoupling local variable accesses in a wide-issue superscalar processor by Sangyeun Cho, Pen-Chung Yew, Gyungho Lee

    “…Providing adequate data bandwidth is extremely important for a wide-issue superscalar processor to achieve its full performance potential. Adding a large…”
    Conference Proceeding
  17.

    Memory forwarding: enabling aggressive layout optimizations by guaranteeing the safety of data relocation by Chi-Keung Luk, Mowry, T.C.

    “…By optimizing data layout at run-time, we can potentially enhance the performance of caches by actively creating spatial locality, facilitating prefetching,…”
    Conference Proceeding
  18.

    Performance of image and video processing with general-purpose processors and media ISA extensions by Ranganathan, P., Adve, S., Jouppi, N.P.

    “…This paper aims to provide a quantitative understanding of the performance of image and video processing applications on general-purpose processors, without…”
    Conference Proceeding
  19.

    Memory sharing predictor: the key to a speculative coherent DSM by An-Chow Lai, Falsafi, B.

    “…Recent research advocates using general message predictors to learn and predict the coherence activity in distributed shared memory (DSM). By accurately…”
    Conference Proceeding
  20.

    Area efficient architectures for information integrity in cache memories by Seongwoo Kim, Somani, A.K.

    “…Information integrity in cache memories is a fundamental requirement for dependable computing. Conventional architectures for enhancing cache reliability using…”
    Conference Proceeding