CFP last date
22 April 2024
Reseach Article

Achieving Better Compression Applying Index-based Byte-Pair Transformation before Arithmetic Coding

by Jyotika Doshi, Savita Gandhi
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 90 - Number 13
Year of Publication: 2014
Authors: Jyotika Doshi, Savita Gandhi
10.5120/15783-4555

Jyotika Doshi, Savita Gandhi . Achieving Better Compression Applying Index-based Byte-Pair Transformation before Arithmetic Coding. International Journal of Computer Applications. 90, 13 ( March 2014), 42-47. DOI=10.5120/15783-4555

@article{ 10.5120/15783-4555,
author = { Jyotika Doshi, Savita Gandhi },
title = { Achieving Better Compression Applying Index-based Byte-Pair Transformation before Arithmetic Coding },
journal = { International Journal of Computer Applications },
issue_date = { March 2014 },
volume = { 90 },
number = { 13 },
month = { March },
year = { 2014 },
issn = { 0975-8887 },
pages = { 42-47 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume90/number13/15783-4555/ },
doi = { 10.5120/15783-4555 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T22:10:59.192728+05:30
%A Jyotika Doshi
%A Savita Gandhi
%T Achieving Better Compression Applying Index-based Byte-Pair Transformation before Arithmetic Coding
%J International Journal of Computer Applications
%@ 0975-8887
%V 90
%N 13
%P 42-47
%D 2014
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Arithmetic coding is used in many compression techniques during the entropy encoding stage. Further compression is not possible without changing the data model and increasing redundancy in the data set. To increase the redundancy, we have applied index based byte-pair transformation (BPT-I) as a pre-processing to arithmetic coding. BPT-I transforms most frequent byte-pairs (2-byte integers). Here, most frequent byte-pairs are sorted in the order of their frequency and groups consisting of 256 byte-pairs are formed. Each byte-pair in a group is then encoded using two tokens: group number and the location in a group. Group number is denoted using variable length prefix codeword; whereas location within a group is denoted using 8-bit index. BPT-I is designed to be applied on any type of source; not necessarily text. More the number of groups considered during transformation, better is the compression. Experimental results have shown around 4. 30% additional reduction in compressed file size when arithmetic coding is applied after byte-pair data transformation BPT-I.

References
  1. F. D. Awan, N. Zhang, N. Motgi, R. T. Iqbal, A. Mukherjee. "LIPT: A reversible lossless text transform to improve compression performance", Proceedings of the IEEE Data Compression Conference (DCC'2001), pp. 481, March 27–29, 2001
  2. T. C. Bell, A. Moffat, "A Note on the DMC Data Compression Scheme", Computer Journal, vol. 32(1), pp. 16-20, 1989
  3. M. Burrows,D. J. Wheeler. "A block-sorting lossless data compression algorithm", Digital Systems Research Center, Research Report 124, Digital Equipment Corporation, Palo Alto, California, May 10, 1994
  4. G. V. Cormack, R. N. Horspool, "Data Compressing Using Dynamic Markov Modeling", Computer Journal, vol. 30(6), pp. 541-550, 1987
  5. Jyotika Doshi and Savita Gandhi, "Computing Number of Bits to be processed using Shift and Log in Arithmetic Coding", International Journal of Computer Applications 62(15):14-20, January 2013, Published by Foundation of Computer Science, New York, USA. BibTeX
  6. Jyotika Doshi, Savita Gandhi, "Quad-Byte Transformation as a Pre-processing to Arithmetic Coding", International Journal of Engineering Research & Technology (IJERT), Vol. 2 Issue 12, December 2013, e-ISSN: 2278-0181
  7. M. Dyer,D. Taubman, S. Nooshabadi, "Improved throughput arithmetic coder for JPEG2000", Proc. Int. Conf. Image Process. , Singapore, pp. 2817–2820, Oct. 2004
  8. Philip Gage, "A New Algorithm For Data Compression", The C Users Journal, vol. 12(2)2, pp. 23–38, February 1994
  9. P. G. Howard, J. S. Vitter, "Arithmetic coding for data compression", Proc. IEEE. , vol. 82: pp. 857-865, 1994
  10. J. C. Kieffer, E. H. Yang, "Grammar-based codes: A new class of universal lossless source codes", IEEE Trans. Inform. Theory, vol. 46, pp. 737–754, 2000
  11. H. Kruse, A. Mukherjee. "Preprocessing Text to Improve Compression Ratios",Proc. Data Compression Conference, pp. 556, 1998
  12. G. Langdon, "An introduction to arithmetic coding", IBM Journal Research and Development, vol. 28, pp. 135-149, 1984
  13. Detlev Marpe, Heiko Schwarz, Thomas Wiegand, "Context-Based Adaptive Binary Arithmetic Coding in the H. 264/AVC Video Compression Standard", IEEE Trans. On Circuits and Systems for Video Technology, vol. 13(7), pp. 620-636, July 2003
  14. Altan Mesut, Aydin Carus, "ISSDC: Digram Coding Based Lossless Dtaa Compression Algorithm", Computing and Informatics, Vol. 29, pp. 741–754, 2010
  15. Moffat, "Implementing the PPM Data Compression Scheme", IEEE Transactions on Communications, vol. 38, pp. 1917-1921, 1990
  16. M. Nelson, "Data Compressin with the Burrows-Wheeler Transform", Dr. Dobb's Journal, pp. 46-50, Sept 1996 available at http://marknelson. us/1996/09/01/bwt/
  17. Radescu R. , "Lossless Text Compression Using the LIPT Transform", Proceedings of the 7th International Conference Communications 2008 (COMM2008), ISBN 978-606-521-008-0. , pp. 59-62, Bucharest, Romania, 5-7 June 2008
  18. Senthil S, Robert L, "Text Preprocessing using Enhanced Intelligent Dictionary Based Encoding (EIDBE)", Proceedings of Third International Conference on Electronics Computer Technology, pp. 451-455, Apr 2011
  19. Senthil S, Robert L, "IIDBE: A Lossless Text Transform for Better Compression", International Journal of Wisdom Based Computing, vol. 1(2), August 2011
  20. Shajeemohan B. S, Govindan V. K, "Compression scheme for faster and secure data transmission over networks", IEEE Proceedings of the International conference on Mobile business, 2005
  21. Storer J. A. , Szymanski T. G. , "Data Compression via Textual Substitution", Journal of ACM Vol. 29(4), pp. 928-951, Oct 1982
  22. W. Sun, A. Mukherjee, N. Zhang, "A Dictionary-based Multi-Corpora Text compression System", Proceedings of the 2003 IEEE Data Compression Conference, March 2003
  23. S. Taubman and M. W. Marcellin, "JPEG2000: Image Compression Fundamentals", Standards and Practice. Norwell, MA: Kluwer Academic, 2002
  24. T. Wiegand, G. Sullivan, G. Bjontegaard, A. Luthra, "Overview of the H. 264/AVC video coding standard", IEEE Trans. Circuits Syst. Video Technol. , vol. 13(7), pp. 560–576, Jul 2003
  25. T. Welch, "A Technique for High-Performance Data Compression", IEEE Computer, vol. 17(6), pp. 8-19, June 1984
  26. M. J. Willems, Y. M. Shtarkov, T. J. Tjalkens, "The context-tree weighting method: Basic properties", IEEE Trans. Inform. Theory, vol. 41, pp. 653–664, May 1995
  27. H. Witten, R. M. Neal, J. G. Cleary, "Arithmetic coding for data compression", Commun. ACM, vol. 30(6), pp. 520–540, 1987
  28. H. Witten, Alistair Moffat, Timothy C. Bell, "Managing Gigabytes-Compressing and Indexing Documents and Images", 2nd edition, Morgan Kaufmann Publishers, 1999
Index Terms

Computer Science
Information Sciences

Keywords

Data Compression better compression rate index based byte-pair data transformation data transformation as a pre-processing to arithmetic coding arithmetic coding