An Effective Way To Reduce Network Transmission In Backup System
Published in: | 2022 23rd IEEE International Conference on Mobile Data Management (MDM) pp. 125 - 127 |
---|---|
Main Authors: | , |
Format: | Conference Proceeding |
Language: | English |
Published: | IEEE, 01-06-2022 |
Subjects: | |
Summary: | Content-defined chunking (CDC) algorithms play an important role in data deduplication, data synchronization, and cloud storage. Existing CDC algorithms suffer from unstable chunk size variance and low chunking throughput when processing low-entropy strings. To solve these problems, this paper proposes the Double Extreme (DE) and Rapid Double Extreme (RDE) CDC algorithms. Both are hash-free chunking algorithms. DE uses the byte values in the sliding window to determine the cut point. Using both the maximum and the minimum allows DE to handle low-entropy strings better and achieve a small chunk size variance. RDE, which builds on DE, uses a multi-step strategy to achieve higher chunking throughput. We compared DE and RDE with existing CDC algorithms. The experimental results show that DE and RDE significantly reduce chunk size variance and improve chunking throughput compared with other CDC algorithms. |
ISSN: | 2375-0324 |
DOI: | 10.1109/MDM55031.2022.00038 |
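
The summary does not spell out DE's cut-point rule, so the following Python sketch is only a rough illustration of hash-free, extremum-based chunking in general. It is not the authors' DE or RDE algorithm. The function name `extremum_chunks`, the parameters `window` and `max_size`, and the cut rule itself (cut once neither the running maximum nor the running minimum byte has changed for `window` bytes) are all assumptions made for this example.

```python
def extremum_chunks(data: bytes, window: int = 64, max_size: int = 4096) -> list[bytes]:
    """Split data into content-defined chunks without computing any hash.

    Hypothetical rule (an assumption, not taken from the paper): track the
    largest and smallest byte seen since the chunk started, and cut once
    both extremes have stayed unchanged for `window` bytes, or once the
    chunk reaches `max_size` bytes.
    """
    chunks = []
    start = 0
    n = len(data)
    while start < n:
        hi = lo = data[start]          # running maximum and minimum byte
        hi_pos = lo_pos = start        # positions where they were last updated
        cut = n                        # default: the rest of the data is one chunk
        i = start + 1
        while i < n:
            b = data[i]
            if b > hi:                 # strict comparison: ties do not move the extreme
                hi, hi_pos = b, i
            elif b < lo:
                lo, lo_pos = b, i
            stable = i - max(hi_pos, lo_pos) >= window
            too_long = i - start + 1 >= max_size
            if stable or too_long:
                cut = i + 1            # byte i is the last byte of this chunk
                break
            i += 1
        chunks.append(data[start:cut])
        start = cut
    return chunks


if __name__ == "__main__":
    zeros = bytes(1000)                # a low-entropy input: 1000 zero bytes
    sizes = [len(c) for c in extremum_chunks(zeros, window=64)]
    print(sizes)
```

Because ties never move an extreme, a run of identical bytes is still cut every `window + 1` bytes in this sketch (65 bytes with `window=64`), rather than growing until `max_size`. That only illustrates why byte-comparison rules can behave predictably on low-entropy input; the paper's actual behavior and measurements should be taken from the full text.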