Implementing the context tree weighting method for text compression

Bibliographic Details
Published in:	DCC (Los Alamitos, Calif.), pp. 123-132
Main Authors:	Sadakane, K., Okazaki, T., Imai, H.
Format:	Conference Proceeding; Journal Article
Language:	English
Published:	IEEE 2000
Subjects:	Compression algorithms; DNA; Optimization methods; Sequences

Description
Summary:	The context tree weighting (CTW) method is a universal compression algorithm for FSMX sources. Although it is expected to achieve good compression ratios in practice, it is difficult to implement, and many implementations only estimate the compression ratio. Willems and Tjalkens (1997) showed a practical implementation that uses conditional probabilities rather than block probabilities, but it applies only to binary-alphabet sequences. We extend the method to multi-alphabet sequences and show a simple implementation using PPM techniques. We also propose a method for optimizing a parameter of context tree weighting in the binary-alphabet case. Experimental results on texts and DNA sequences show that the performance of PPM can be improved by combining it with context tree weighting, and that DNA sequences can be compressed to less than 2.0 bpc.
ISBN:	9780769505923; 0769505929
ISSN:	1068-0314; 2375-0359
DOI:	10.1109/DCC.2000.838152
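
Illustrative sketch (not taken from the paper): the summary refers to context tree weighting for binary sequences. The code below is a minimal textbook-style version, assuming the Krichevsky-Trofimov (KT) estimator at each node, equal 1/2 weights, a fixed maximum depth, and an all-zero initial context. It accumulates block probabilities in the log domain, so it only estimates the ideal code length; it does not reproduce the conditional-probability, multi-alphabet, or PPM-based implementation the authors describe. The names BinaryCTW, Node, log_mix_half, and ctw_code_length are hypothetical.

```python
import math


def log_mix_half(x: float, y: float) -> float:
    """Return log(0.5 * e**x + 0.5 * e**y), computed stably."""
    m = max(x, y)
    return m + math.log(0.5 * math.exp(x - m) + 0.5 * math.exp(y - m))


class Node:
    """One context-tree node: bit counts plus log KT and log weighted probabilities."""

    def __init__(self) -> None:
        self.counts = [0, 0]   # number of 0s and 1s seen in this context
        self.log_pe = 0.0      # log of the KT estimate for those bits
        self.log_pw = 0.0      # log of the weighted probability


class BinaryCTW:
    """Binary CTW with a fixed maximum context depth (assumed parameter)."""

    def __init__(self, depth: int) -> None:
        self.depth = depth
        self.nodes = {}        # context tuple (most recent bit first) -> Node

    def update(self, context, bit: int) -> None:
        """Account for `bit`, given the last `depth` bits, most recent first."""
        prefixes = [tuple(context[:d]) for d in range(self.depth + 1)]
        # Walk from the leaf (full context) up to the root (empty context),
        # so each parent sees its child's already-updated probability.
        for d in range(self.depth, -1, -1):
            node = self.nodes.setdefault(prefixes[d], Node())
            total = node.counts[0] + node.counts[1]
            # KT sequential update: P(bit) = (count[bit] + 1/2) / (total + 1).
            node.log_pe += math.log((node.counts[bit] + 0.5) / (total + 1.0))
            node.counts[bit] += 1
            if d == self.depth:
                node.log_pw = node.log_pe
            else:
                # Product of the children's weighted probabilities; a child
                # that has seen no data contributes probability 1 (log 0.0).
                children = 0.0
                for c in (0, 1):
                    child = self.nodes.get(prefixes[d] + (c,))
                    if child is not None:
                        children += child.log_pw
                node.log_pw = log_mix_half(node.log_pe, children)

    def root_log_prob(self) -> float:
        """Log of the weighted block probability of everything seen so far."""
        root = self.nodes.get(())
        return root.log_pw if root is not None else 0.0


def ctw_code_length(bits, depth: int = 3) -> float:
    """Ideal code length in bits, -log2 of the root weighted probability."""
    model = BinaryCTW(depth)
    history = [0] * depth      # assumption: all-zero initial context
    for bit in bits:
        model.update(history, bit)
        if depth:
            history = [bit] + history[:-1]
    return -model.root_log_prob() / math.log(2)
```

For example, ctw_code_length([0, 1] * 100, depth=1) comes out far below 200 bits because the previous bit predicts the next one, whereas depth=0 reduces to a single KT estimator and cannot exploit that structure.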