Implementing the context tree weighting method for text compression
Published in: | DCC (Los Alamitos, Calif.) pp. 123 - 132 |
---|---|
Main Authors: | , , |
Format: | Conference Proceeding; Journal Article |
Language: | English |
Published: | IEEE, 2000 |
Summary: | The context tree weighting (CTW) method is a universal compression algorithm for FSMX sources. Although it is expected to achieve a good compression ratio in practice, it is difficult to implement, and in many cases the implementation serves only to estimate the compression ratio. Willems and Tjalkens (1997) showed a practical implementation that uses conditional probabilities instead of block probabilities, but it applies only to binary-alphabet sequences. We extend the method to multi-alphabet sequences and show a simple implementation using PPM techniques. We also propose a method for optimizing a parameter of context tree weighting in the binary-alphabet case. Experimental results on texts and DNA sequences show that the performance of PPM can be improved by combining it with context tree weighting, and that DNA sequences can be compressed to less than 2.0 bpc. |
---|---|
ISBN: | 9780769505923 0769505929 |
ISSN: | 1068-0314 2375-0359 |
DOI: | 10.1109/DCC.2000.838152 |
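
For readers unfamiliar with the method named in the summary, the following is a minimal sketch of textbook context tree weighting for a binary alphabet, using the Krichevsky-Trofimov (KT) estimator at each node and equal weighting of a node's own estimate against the product of its children. It is not the multi-alphabet, PPM-based implementation the paper describes, and it does not include the parameter optimization mentioned in the summary. All names (`Node`, `ctw_block_probability`, `total_mass`) and the choice of an all-zero initial context are assumptions made for illustration; exact fractions are used instead of the arithmetic a real coder would need.

```python
from fractions import Fraction
from itertools import product


class Node:
    """One context-tree node: zero/one counts, KT estimate, weighted probability."""

    def __init__(self):
        self.zeros = 0
        self.ones = 0
        self.pe = Fraction(1)   # KT estimate of the bits seen in this context
        self.pw = Fraction(1)   # weighted probability of the same bits
        self.children = {}      # keyed by the next older context bit (0 or 1)


def ctw_block_probability(bits, depth, past=None):
    """Return the CTW block probability of `bits` for a tree of the given depth.

    `past` supplies the `depth` bits preceding the sequence; an all-zero
    past is assumed when it is omitted (an illustrative choice).
    """
    root = Node()
    history = list(past) if past is not None else [0] * depth
    for bit in bits:
        # Context = the last `depth` bits, most recent first.
        context = history[-depth:][::-1] if depth else []
        path = [root]
        for c in context:
            path.append(path[-1].children.setdefault(c, Node()))
        # Update from the leaf back up to the root.
        for d in range(len(path) - 1, -1, -1):
            node = path[d]
            count = node.ones if bit else node.zeros
            total = node.zeros + node.ones
            # KT update: P(next = bit) = (count + 1/2) / (total + 1).
            node.pe *= Fraction(2 * count + 1, 2 * (total + 1))
            if bit:
                node.ones += 1
            else:
                node.zeros += 1
            if d == depth:
                node.pw = node.pe  # leaves use the KT estimate directly
            else:
                children_pw = Fraction(1)
                for child in node.children.values():
                    children_pw *= child.pw  # unseen children contribute 1
                node.pw = (node.pe + children_pw) / 2
        history.append(bit)
    return root.pw


def total_mass(n, depth):
    """Sum the block probability over all binary sequences of length n."""
    return sum(ctw_block_probability(seq, depth) for seq in product((0, 1), repeat=n))
```

A conditional probability for the next bit, the quantity an arithmetic coder consumes, is the ratio of two block probabilities: `ctw_block_probability(x + [b], depth) / ctw_block_probability(x, depth)`. Recomputing the whole tree for each ratio is wasteful and is done here only to keep the sketch short.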