Cyclic Storage for Fault-Tolerant Distributed Executions
Given a set V of active components in charge of a distributed execution, a storage scheme is a sequence B_0, B_1, ..., B_{b-1} of subsets of V, where successive global states are recorded. The subsets, also called blocks, have the same size and are scheduled according to some fixed and cyclic calenda...
Published in: | IEEE Transactions on Parallel and Distributed Systems, Vol. 17, no. 9, pp. 1028-1036 |
---|---|
Main Authors: | Marcelín-Jiménez, R.; Rajsbaum, S.; Stevens, B. |
Format: | Journal Article |
Language: | English |
Published: | New York: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 2006-09-01 |
Abstract | Given a set V of active components in charge of a distributed execution, a storage scheme is a sequence B_0, B_1, ..., B_{b-1} of subsets of V, where successive global states are recorded. The subsets, also called blocks, have the same size and are scheduled according to some fixed and cyclic calendar of b steps. During the i-th step, block B_i is selected. Each component takes a copy of its local state and sends it to one of the components in B_i, in such a way that each component stores (approximately) the same number of local states. Afterward, if a component of B_i crashes, all of its stored data is lost and the computation cannot continue. If there exists a block with no failed components in it, then a recent global state can be retrieved and the computation does not need to start over from the very beginning. The goal is to design storage schemes that tolerate as many crashes as possible, while trying to have each component participating in as few blocks as possible and, at the same time, working with large blocks (so that a component in a block stores a small number of local states). In this paper, several such schemes are described and compared in terms of these measures. |
---|---|
Author | Marcelín-Jiménez, R. (Departamento de Ingeniería Eléctrica, UAM-Iztapalapa); Rajsbaum, S.; Stevens, B. |
CODEN | ITDSEO |
ContentType | Journal Article |
Copyright | Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2006 |
DOI | 10.1109/TPDS.2006.120 |
Discipline | Engineering Computer Science |
EISSN | 1558-2183 |
EndPage | 1036 |
Genre | orig-research |
ISSN | 1045-9219 |
IsPeerReviewed | true |
IsScholarly | true |
Issue | 9 |
Language | English |
LinkModel | DirectLink |
PageCount | 9 |
PublicationCentury | 2000 |
PublicationDate | 2006-09-01 |
PublicationDateYYYYMMDD | 2006-09-01 |
PublicationDecade | 2000 |
PublicationPlace | New York |
PublicationTitle | IEEE transactions on parallel and distributed systems |
PublicationTitleAbbrev | TPDS |
PublicationYear | 2006 |
Publisher | IEEE (The Institute of Electrical and Electronics Engineers, Inc.) |
StartPage | 1028 |
SubjectTerms | Bismuth; Calendars; Centralized control; Charge; checkpoint/restart; Computation; Computer crashes; Computer networks; Crashes; Data mining; Design engineering; distributed applications; Distributed control; distributed systems; Fault tolerance; Fault tolerant systems; Load balancing and task assignment; network repositories/data mining/backup; Reproduction; Resumes; storage/repositories; Stores |
Title | Cyclic Storage for Fault-Tolerant Distributed Executions |
URI | https://ieeexplore.ieee.org/document/1668066 https://www.proquest.com/docview/912206796 https://search.proquest.com/docview/896189094 |
Volume | 17 |
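
The storage scheme summarized in the abstract can be illustrated with a small sketch. It is not taken from the paper: the shifted-window block construction, the round-robin assignment of local states, and every function name below are assumptions chosen only to make the three measures (crashes tolerated, blocks per component, block size) concrete on a toy example.

```python
from itertools import combinations


def cyclic_blocks(n, block_size, num_blocks, stride):
    """Build blocks B_0..B_{b-1} over components 0..n-1.

    Hypothetical construction: block i is a window of `block_size`
    consecutive components starting at i * stride (mod n).
    """
    return [
        frozenset((i * stride + j) % n for j in range(block_size))
        for i in range(num_blocks)
    ]


def assign_storage(components, block):
    """Map each component to the block member that stores its local state.

    Round-robin assignment, so the loads of block members differ by at most one.
    """
    holders = sorted(block)
    return {c: holders[k % len(holders)] for k, c in enumerate(sorted(components))}


def surviving_block(blocks, crashed):
    """Return the index of some block with no crashed component, or None."""
    for i, block in enumerate(blocks):
        if block.isdisjoint(crashed):
            return i
    return None


def tolerated_crashes(n, blocks):
    """Largest t such that every set of t crashes leaves a block intact.

    Brute force over all crash sets, so only suitable for small n.
    """
    t = 0
    for size in range(1, n + 1):
        if all(
            surviving_block(blocks, set(crash_set)) is not None
            for crash_set in combinations(range(n), size)
        ):
            t = size
        else:
            break
    return t


if __name__ == "__main__":
    blocks = cyclic_blocks(n=6, block_size=2, num_blocks=3, stride=2)
    print([sorted(b) for b in blocks])
    print(assign_storage(range(6), blocks[0]))
    print(tolerated_crashes(6, blocks))
```

With six components and three disjoint blocks of size two, each component belongs to exactly one block and each block member stores three local states. The trade-off named in the abstract shows up when the parameters change: larger blocks lower the per-component storage load, while the number and overlap of the blocks determine how many crashes can be survived.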