Data Layout Transformation for Enhancing Data Locality on NUCA Chip Multiprocessors
Published in: | 2009 18th International Conference on Parallel Architectures and Compilation Techniques pp. 348 - 357 |
---|---|
Format: | Conference Proceeding |
Language: | English |
Published: | IEEE, 01-09-2009 |
Summary: | With increasing numbers of cores, future CMPs (chip multi-processors) are likely to have a tiled architecture with a portion of shared L2 cache on each tile and a bank-interleaved distribution of the address space. Although such an organization is effective for avoiding access hot-spots, it can cause a significant number of non-local L2 accesses for many commonly occurring regular data access patterns. In this paper we develop a compile-time framework for data locality optimization via data layout transformation. Using a polyhedral model, the program's localizability is determined by analysis of its index set and array reference functions, followed by non-canonical data layout transformation to reduce non-local accesses for localizable computations. Simulation-based results on a 16-core 2D tiled CMP demonstrate the effectiveness of the approach. The developed program transformation technique is also useful in several other data layout transformation contexts. |
---|---|
ISBN: | 9780769537719 0769537715 |
ISSN: | 1089-795X 2641-7944 |
DOI: | 10.1109/PACT.2009.36 |
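
The summary's central claim, that bank-interleaved addressing makes most L2 accesses non-local for regular access patterns and that a non-canonical layout can remove them, can be illustrated with a small sketch. This is not the paper's polyhedral framework; it is a hand-written remapping for one simple case, a 1D array block-partitioned across cores. The line size, element size, line-granularity interleaving, and all function names (`home_bank`, `owner`, `canonical`, `striped`, `local_fraction`) are assumptions made for illustration.

```python
# Illustrative model only: every constant below is an assumption, not a
# value taken from the paper.
NUM_BANKS = 16                    # assumed: one L2 bank per tile, 16 tiles
LINE_BYTES = 64                   # assumed cache-line size
ELEM_BYTES = 8                    # assumed element size (e.g. a double)
ELEMS_PER_LINE = LINE_BYTES // ELEM_BYTES


def home_bank(index):
    """Bank holding the element stored at `index`, assuming consecutive
    cache lines are interleaved round-robin across the banks."""
    return (index * ELEM_BYTES // LINE_BYTES) % NUM_BANKS


def owner(i, n):
    """Core that computes logical element i under a block partition."""
    return i // (n // NUM_BANKS)


def canonical(i, n):
    """Canonical layout: logical element i is stored at index i."""
    return i


def striped(i, n):
    """Hypothetical non-canonical layout: place each core's k-th cache line
    at global line k * NUM_BANKS + core, so it maps to that core's bank.
    Requires n divisible by NUM_BANKS * ELEMS_PER_LINE."""
    chunk = n // NUM_BANKS
    core, j = divmod(i, chunk)             # owning core, offset in its block
    line, off = divmod(j, ELEMS_PER_LINE)  # line within block, slot in line
    return (line * NUM_BANKS + core) * ELEMS_PER_LINE + off


def local_fraction(layout, n):
    """Fraction of elements whose home bank is on the owning core's tile."""
    hits = sum(home_bank(layout(i, n)) == owner(i, n) for i in range(n))
    return hits / n


if __name__ == "__main__":
    n = 512
    print("canonical:", local_fraction(canonical, n))
    print("striped:  ", local_fraction(striped, n))
```

Under these assumptions and with `n = 512`, only 1 in 16 elements is local in the canonical layout, while the striped layout makes every element local to its owning core. The remapping is a permutation of the index range, so no storage is wasted.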