Higher Compression from Burrows-Wheeler Transform for DNA Sequence

International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Year of Publication: 2017
Authors:
Rexline S. J., Aju Richard Gerard, Trujilla Lobo F.
10.5120/ijca2017915261

Rexline S. J., Aju Richard Gerard and Trujilla Lobo F. Higher Compression from Burrows-Wheeler Transform for DNA Sequence. International Journal of Computer Applications 173(3):11-15, September 2017.

@article{10.5120/ijca2017915261,
	author = {Rexline S. J. and Aju Richard Gerard and Trujilla Lobo F.},
	title = {Higher Compression from Burrows-Wheeler Transform for DNA Sequence},
	journal = {International Journal of Computer Applications},
	issue_date = {September 2017},
	volume = {173},
	number = {3},
	month = {Sep},
	year = {2017},
	issn = {0975-8887},
	pages = {11-15},
	numpages = {5},
	url = {http://www.ijcaonline.org/archives/volume173/number3/28313-2017915261},
	doi = {10.5120/ijca2017915261},
	publisher = {Foundation of Computer Science (FCS), NY, USA},
	address = {New York, USA}
}

Abstract

Storing biological sequences in DNA databases such as the GenBank sequence database requires a large amount of space, so efficient storage of biological sequences has become essential. Standard compression algorithms do not compress biological sequences well. In recent years, special-purpose algorithms have been introduced for compressing biological sequences such as DNA and protein sequences. This paper explores Burrows-Wheeler Transform (BWT) based approaches to compressing biological sequences. Compared with existing general-purpose compression algorithms, the proposed BWT-based method compresses these sequences better, while the cost of computing the Burrows-Wheeler Transform is almost insignificant.
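
For readers unfamiliar with the transform named in the abstract, the following is a minimal sketch of a textbook Burrows-Wheeler Transform applied to a DNA-style string. It is not the authors' method, which this page does not describe; the function names (bwt, inverse_bwt, run_length_encode), the "$" sentinel, and the sample sequence are all assumptions made for illustration. The naive rotation sort is only practical for short inputs; real implementations typically build on suffix arrays.

```python
from itertools import groupby


def bwt(text: str, sentinel: str = "$") -> str:
    """Naive BWT: sort every rotation of text + sentinel, return the last column."""
    if sentinel in text:
        raise ValueError("sentinel must not occur in the input")
    s = text + sentinel
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rotation[-1] for rotation in rotations)


def inverse_bwt(last_column: str, sentinel: str = "$") -> str:
    """Invert the BWT by repeatedly prepending the last column and re-sorting."""
    n = len(last_column)
    table = [""] * n
    for _ in range(n):
        table = sorted(last_column[i] + table[i] for i in range(n))
    # The original string is the row that ends with the sentinel.
    original = next(row for row in table if row.endswith(sentinel))
    return original[:-1]


def run_length_encode(s: str) -> list[tuple[str, int]]:
    """Collapse consecutive repeats into (symbol, count) pairs."""
    return [(symbol, len(list(group))) for symbol, group in groupby(s)]


if __name__ == "__main__":
    dna = "ACGTACGTACGT"  # hypothetical sample sequence
    transformed = bwt(dna)
    print(transformed)
    print(run_length_encode(transformed))
    print(inverse_bwt(transformed) == dna)
```

The transform itself does not shrink the data. It tends to group identical symbols into runs, which a later stage such as run-length, move-to-front, or entropy coding can then exploit.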

Keywords

DNA sequence compression, Burrows-Wheeler Transform, BWT, genome.