Overcoming the memory wall in packet processing hammers or ladders?

Overhead of memory accesses limits the performance of packet processing applications. To overcome this bottleneck, today's network processors can utilize a wide-range of mechanisms-such as multi-level memory hierarchy, wide-word accesses, special-purpose result-caches, asynchronous memory, and...

Full description

Saved in:
Bibliographic Details
Published in:Proceedings of the 2005 ACM symposium on Architecture for networking and communications systems pp. 1 - 10
Main Authors: Mudigonda, Jayaram, Vin, Harrick M., Yavatkar, Raj
Format: Conference Proceeding
Language:English
Published: New York, NY, USA ACM 26-10-2005
Series:ACM Conferences
Subjects:
Online Access:Get full text
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:Overhead of memory accesses limits the performance of packet processing applications. To overcome this bottleneck, today's network processors can utilize a wide-range of mechanisms-such as multi-level memory hierarchy, wide-word accesses, special-purpose result-caches, asynchronous memory, and hardware multi-threading. However, supporting all of these mechanisms complicates programmability and hardware design, and wastes systemresources. In this paper, we address the following fundamental question: what minimal set of hardware mechanisms must a network processor support to achieve the twin goals of simplified programmability and high packet throughput? We show that no single mechanism sufficies; the minimal set must include data-caches and multi-threading. Data-caches and multi-threading are complementary; whereas data-caches exploit locality to reduce the number of context-switches and the off-chip memory bandwidth requirement, multi-threading exploits parallelism to hide long cache-miss latencies.
Bibliography:SourceType-Conference Papers & Proceedings-1
ObjectType-Conference Paper-1
content type line 25
ISBN:9781595930828
1595930825
DOI:10.1145/1095890.1095892