Cyclic Storage for Fault-Tolerant Distributed Executions

Bibliographic Details
Published in: IEEE Transactions on Parallel and Distributed Systems, Vol. 17, no. 9, pp. 1028-1036
Main Authors: Marcelín-Jiménez, R., Rajsbaum, S., Stevens, B.
Format: Journal Article
Language: English
Published: New York: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 2006-09-01
Abstract: Given a set $V$ of active components in charge of a distributed execution, a storage scheme is a sequence $B_0, B_1, \ldots, B_{b-1}$ of subsets of $V$, where successive global states are recorded. The subsets, also called blocks, have the same size and are scheduled according to some fixed and cyclic calendar of $b$ steps. During the $i$th step, block $B_i$ is selected. Each component takes a copy of its local state and sends it to one of the components in $B_i$, in such a way that each component stores (approximately) the same number of local states. Afterward, if a component of $B_i$ crashes, all of its stored data is lost and the computation cannot continue. If there exists a block with no failed components in it, then a recent global state can be retrieved and the computation does not need to start over from the very beginning. The goal is to design storage schemes that tolerate as many crashes as possible, while trying to have each component participating in as few blocks as possible and, at the same time, working with large blocks (so that a component in a block stores a small number of local states). In this paper, several such schemes are described and compared in terms of these measures.
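The model in the abstract can be illustrated with a small sketch. This is not one of the paper's schemes: the function names, the round-robin assignment, the brute-force tolerance search, and the toy blocks below are all assumptions chosen only to make the definitions concrete.

```python
from itertools import combinations


def assign_states(components, block):
    """Spread every component's local state over the members of one block.

    Round-robin assignment (an assumption, not the paper's rule) keeps the
    number of states per holder within one of each other.
    """
    holders = sorted(block)
    load = {h: [] for h in holders}
    for idx, c in enumerate(sorted(components)):
        load[holders[idx % len(holders)]].append(c)
    return load


def intact_blocks(blocks, crashed):
    """Indices of blocks that contain no crashed component."""
    crashed = set(crashed)
    return [i for i, blk in enumerate(blocks) if not (set(blk) & crashed)]


def crash_tolerance(components, blocks):
    """Largest f such that every set of f crashes leaves some block intact.

    Brute force over all crash sets; only sensible for tiny examples.
    """
    components = sorted(components)
    for f in range(1, len(components) + 1):
        for crashed in combinations(components, f):
            if not intact_blocks(blocks, crashed):
                return f - 1
    return len(components)


# Hypothetical toy scheme: 6 components, b = 3 disjoint blocks of size 2.
V = range(6)
blocks = [{0, 1}, {2, 3}, {4, 5}]
print(assign_states(V, blocks[0]))
print(intact_blocks(blocks, {0, 3}))
print(crash_tolerance(V, blocks))
```

In this toy scheme each component belongs to exactly one block and each holder stores three local states. Larger blocks would lower the per-holder load, which is the trade-off the abstract describes.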
Author affiliation: Marcelín-Jiménez, R., Departamento de Ingeniería Eléctrica, UAM-Iztapalapa
CODEN: ITDSEO
Copyright: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 2006
DOI: 10.1109/TPDS.2006.120
Discipline: Engineering; Computer Science
EISSN: 1558-2183
ISSN: 1045-9219
Peer reviewed: yes
Scholarly: yes
Subjects:
Bismuth
Calendars
Centralized control
Charge
checkpoint/restart
Computation
Computer crashes
Computer networks
Crashes
Data mining
Design engineering
distributed applications
Distributed control
distributed systems
Fault tolerance
Fault tolerant systems
Load balancing and task assignment
network repositories/data mining/backup
Reproduction
Resumes
storage/repositories
Stores
URI: https://ieeexplore.ieee.org/document/1668066
https://www.proquest.com/docview/912206796
https://search.proquest.com/docview/896189094