Overcoming the memory wall in packet processing hammers or ladders?
Overhead of memory accesses limits the performance of packet processing applications. To overcome this bottleneck, today's network processors can utilize a wide-range of mechanisms-such as multi-level memory hierarchy, wide-word accesses, special-purpose result-caches, asynchronous memory, and...
Saved in:
Published in: | Proceedings of the 2005 ACM symposium on Architecture for networking and communications systems pp. 1 - 10 |
---|---|
Main Authors: | , , |
Format: | Conference Proceeding |
Language: | English |
Published: |
New York, NY, USA
ACM
26-10-2005
|
Series: | ACM Conferences |
Subjects: | |
Online Access: | Get full text |
Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
Summary: | Overhead of memory accesses limits the performance of packet processing applications. To overcome this bottleneck, today's network processors can utilize a wide-range of mechanisms-such as multi-level memory hierarchy, wide-word accesses, special-purpose result-caches, asynchronous memory, and hardware multi-threading. However, supporting all of these mechanisms complicates programmability and hardware design, and wastes systemresources. In this paper, we address the following fundamental question: what minimal set of hardware mechanisms must a network processor support to achieve the twin goals of simplified programmability and high packet throughput? We show that no single mechanism sufficies; the minimal set must include data-caches and multi-threading. Data-caches and multi-threading are complementary; whereas data-caches exploit locality to reduce the number of context-switches and the off-chip memory bandwidth requirement, multi-threading exploits parallelism to hide long cache-miss latencies. |
---|---|
Bibliography: | SourceType-Conference Papers & Proceedings-1 ObjectType-Conference Paper-1 content type line 25 |
ISBN: | 9781595930828 1595930825 |
DOI: | 10.1145/1095890.1095892 |