Research Article

Data Compression Considering Text Files

by Kashfia Sailunaz, Mohammed Rokibul Alam Kotwal, Mohammad Nurul Huda
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 90 - Number 11
Year of Publication: 2014
DOI: 10.5120/15765-4456

Kashfia Sailunaz, Mohammed Rokibul Alam Kotwal, Mohammad Nurul Huda. Data Compression Considering Text Files. International Journal of Computer Applications. 90, 11 (March 2014), 27-32. DOI=10.5120/15765-4456

@article{ 10.5120/15765-4456,
author = { Kashfia Sailunaz and Mohammed Rokibul Alam Kotwal and Mohammad Nurul Huda },
title = { Data Compression Considering Text Files },
journal = { International Journal of Computer Applications },
issue_date = { March 2014 },
volume = { 90 },
number = { 11 },
month = { March },
year = { 2014 },
issn = { 0975-8887 },
pages = { 27-32 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume90/number11/15765-4456/ },
doi = { 10.5120/15765-4456 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T22:10:46.862353+05:30
%A Kashfia Sailunaz
%A Mohammed Rokibul Alam Kotwal
%A Mohammad Nurul Huda
%T Data Compression Considering Text Files
%J International Journal of Computer Applications
%@ 0975-8887
%V 90
%N 11
%P 27-32
%D 2014
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Lossless text data compression is an important field because it significantly reduces storage requirements and communication costs. This work focuses on different file compression coding techniques and on comparisons between them. Several memory-efficient encoding schemes are analyzed and implemented: Shannon-Fano Coding, Huffman Coding, Repeated Huffman Coding, and Run-Length Coding. A new algorithm, "Modified Run-Length Coding", is also proposed and compared with the other algorithms. The analyses show how these coding techniques work, how much compression each can achieve, how much memory each needs, and which technique performs better under which conditions. The experiments show that Repeated Huffman Coding achieves a higher compression ratio, and that the proposed Modified Run-Length Coding outperforms conventional Run-Length Coding.
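
As a rough illustration of two of the techniques named in the abstract, the following minimal Python sketch shows conventional run-length coding and the code lengths of a Huffman code, together with a simple compression ratio. It is not the authors' implementation, and it does not attempt the proposed Modified Run-Length Coding, which the abstract does not describe. The function names are hypothetical, the input is assumed to be non-empty text stored at 8 bits per symbol, and the ratio is assumed to mean original size divided by encoded size (ignoring the cost of storing the code table); the paper may define it differently.

```python
import heapq
from collections import Counter
from itertools import groupby


def run_length_encode(text):
    """Conventional run-length coding: collapse each run into (symbol, count)."""
    return [(symbol, len(list(run))) for symbol, run in groupby(text)]


def run_length_decode(pairs):
    """Invert run_length_encode."""
    return "".join(symbol * count for symbol, count in pairs)


def huffman_code_lengths(text):
    """Return {symbol: code length in bits} for a Huffman code built from text."""
    freq = Counter(text)
    if not freq:
        return {}
    if len(freq) == 1:
        # A single distinct symbol still needs one bit per occurrence.
        return {next(iter(freq)): 1}
    # Heap entries: (weight, unique tie breaker, {symbol: depth so far}).
    heap = [(w, i, {s: 0}) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        # Merge the two lightest subtrees; every symbol in them moves one level deeper.
        w1, _, d1 = heapq.heappop(heap)
        w2, _, d2 = heapq.heappop(heap)
        merged = {s: depth + 1 for s, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]


def huffman_compression_ratio(text, bits_per_symbol=8):
    """Assumed definition: original bits / encoded bits, code table not counted."""
    if not text:
        raise ValueError("text must be non-empty")
    lengths = huffman_code_lengths(text)
    freq = Counter(text)
    encoded_bits = sum(freq[s] * lengths[s] for s in freq)
    return (len(text) * bits_per_symbol) / encoded_bits
```

For the input "aaabbc", run_length_encode yields the pairs ('a', 3), ('b', 2), ('c', 1), and the Huffman code lengths are 1 bit for 'a' and 2 bits each for 'b' and 'c', so the 48-bit input encodes in 9 bits under the assumptions above.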

Index Terms

Computer Science
Information Sciences

Keywords

Data compression, Lossless compression, Encoding, Compression ratio, Code length, Standard deviation