(Invited) Cross-Layer Resilience: Challenges, Insights, and the Road Ahead

Bibliographic Details
Published in: 2019 56th ACM/IEEE Design Automation Conference (DAC), pp. 1-4
Main Authors: Cheng, Eric, Mueller-Gritschneder, Daniel, Abraham, Jacob, Bose, Pradip, Buyuktosunoglu, Alper, Chen, Deming, Cho, Hyungmin, Li, Yanjing, Sharif, Uzair, Skadron, Kevin, Stan, Mircea, Schlichtmann, Ulf, Mitra, Subhasish
Format: Conference Proceeding
Language: English
Published: ACM 01-06-2019
Abstract: Resilience to errors in the underlying hardware is a key design objective for a large class of computing systems, from embedded systems all the way to the cloud. Sources of hardware errors include radiation, circuit aging, variability induced by manufacturing and operating conditions, manufacturing test escapes, and early-life failures. Many publications have suggested that cross-layer resilience, where multiple error resilience techniques from different layers of the system stack cooperate to achieve cost-effective resilience, is essential for designing cost-effective resilient digital systems. This paper presents a comprehensive overview of cross-layer resilience by addressing fundamental cross-layer resilience questions, by summarizing insights derived from recent advances in cross-layer resilience research, and by discussing future cross-layer resilience challenges.
CCS Concepts: • General and reference → Reliability; • Hardware → Fault tolerance; • Computer systems organization → Reliability
Author_xml – sequence: 1
  givenname: Eric
  surname: Cheng
  fullname: Cheng, Eric
  organization: Stanford University
– sequence: 2
  givenname: Daniel
  surname: Mueller-Gritschneder
  fullname: Mueller-Gritschneder, Daniel
  organization: Technical University of Munich
– sequence: 3
  givenname: Jacob
  surname: Abraham
  fullname: Abraham, Jacob
  organization: University of Texas at Austin
– sequence: 4
  givenname: Pradip
  surname: Bose
  fullname: Bose, Pradip
  organization: IBM Research
– sequence: 5
  givenname: Alper
  surname: Buyuktosunoglu
  fullname: Buyuktosunoglu, Alper
  organization: IBM Research
– sequence: 6
  givenname: Deming
  surname: Chen
  fullname: Chen, Deming
  organization: University of Illinois at Urbana-Champaign
– sequence: 7
  givenname: Hyungmin
  surname: Cho
  fullname: Cho, Hyungmin
  organization: Hongik University
– sequence: 8
  givenname: Yanjing
  surname: Li
  fullname: Li, Yanjing
  organization: University of Chicago
– sequence: 9
  givenname: Uzair
  surname: Sharif
  fullname: Sharif, Uzair
  organization: Technical University of Munich
– sequence: 10
  givenname: Kevin
  surname: Skadron
  fullname: Skadron, Kevin
  organization: University of Virginia
– sequence: 11
  givenname: Mircea
  surname: Stan
  fullname: Stan, Mircea
  organization: University of Virginia
– sequence: 12
  givenname: Ulf
  surname: Schlichtmann
  fullname: Schlichtmann, Ulf
  organization: Technical University of Munich
– sequence: 13
  givenname: Subhasish
  surname: Mitra
  fullname: Mitra, Subhasish
  organization: Stanford University
ContentType Conference Proceeding
DBID 6IE
6IH
CBEJK
RIE
RIO
DOI 10.1145/3316781.3323474
DatabaseName IEEE Electronic Library (IEL) Conference Proceedings
IEEE Proceedings Order Plan (POP) 1998-present by volume
IEEE Xplore All Conference Proceedings
IEEE Electronic Library Online
IEEE Proceedings Order Plans (POP) 1998-present
DatabaseTitleList
Database_xml – sequence: 1
  dbid: RIE
  name: IEEE Electronic Library Online
  url: http://ieeexplore.ieee.org/Xplore/DynWel.jsp
  sourceTypes: Publisher
DeliveryMethod fulltext_linktorsrc
EISBN 1450367259
9781450367257
EndPage 4
ExternalDocumentID 8807053
Genre orig-research
GroupedDBID 6IE
6IH
ACM
ADPZR
ALMA_UNASSIGNED_HOLDINGS
APO
CBEJK
GUFHI
LHSKQ
RIE
RIO
IEDL.DBID RIE
IngestDate Wed Jun 26 19:29:04 EDT 2024
IsPeerReviewed false
IsScholarly true
Language English
LinkModel DirectLink
PageCount 4
ParticipantIDs ieee_primary_8807053
PublicationCentury 2000
PublicationDate 2019-June
PublicationDateYYYYMMDD 2019-06-01
PublicationDate_xml – month: 06
  year: 2019
  text: 2019-June
PublicationDecade 2010
PublicationTitle 2019 56th ACM/IEEE Design Automation Conference (DAC)
PublicationTitleAbbrev DAC
PublicationYear 2019
Publisher ACM
Publisher_xml – name: ACM
SourceID ieee
SourceType Publisher
StartPage 1
SubjectTerms Aging
Cross-layer resilience
Fault tolerance
Hardware
Integrated circuit modeling
Manufacturing
Program processors
reliability
Resilience
Title (Invited) Cross-Layer Resilience: Challenges, Insights, and the Road Ahead
URI https://ieeexplore.ieee.org/document/8807053