Research Article

New Distance Measure for Sequence Comparison using Cumulative Frequency Distribution

by Meera.A, Lalitha Rangarajan, Shilpa .N
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 19 - Number 2
Year of Publication: 2011

Meera.A, Lalitha Rangarajan, Shilpa.N. New Distance Measure for Sequence Comparison using Cumulative Frequency Distribution. International Journal of Computer Applications. 19, 2 (April 2011), 13-18. DOI=10.5120/2335-3043

@article{ 10.5120/2335-3043,
author = { Meera.A and Lalitha Rangarajan and Shilpa.N },
title = { New Distance Measure for Sequence Comparison using Cumulative Frequency Distribution },
journal = { International Journal of Computer Applications },
issue_date = { April 2011 },
volume = { 19 },
number = { 2 },
month = { April },
year = { 2011 },
issn = { 0975-8887 },
pages = { 13-18 },
numpages = {9},
url = { },
doi = { 10.5120/2335-3043 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}

This paper proposes a method for comparing two promoter sequences. Motifs are extracted from the promoter sequences using the available software tool ‘TF SEARCH’, and the sequences are then compared using the cumulative frequency distribution of their motifs. For the experimental study, promoter sequences of the enzyme citrate synthase, from the TCA (Krebs) cycle of the Central Metabolic Pathway (CMP), are considered across different mammals. The results reveal high similarity among motif sequences of different organisms on the same chromosome, and some similarity among motif sequences on different chromosomes of the same organism.
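The abstract does not spell out the distance formula, so the following is only a minimal sketch of how a comparison based on cumulative frequency distributions of motifs might look. Everything in it is an assumption made for illustration: the function names, the alphabetical ordering of motifs, the mean absolute gap used as the distance, and the motif labels in the usage lines. It is not the authors' measure, and it does not call TF SEARCH; it assumes motif names have already been extracted into plain lists.

```python
from collections import Counter
from itertools import accumulate


def cumulative_distribution(motifs, vocabulary):
    """Cumulative relative frequency of motifs, taken in vocabulary order."""
    counts = Counter(motifs)
    total = sum(counts[m] for m in vocabulary)
    if total == 0:
        return [0.0] * len(vocabulary)
    return list(accumulate(counts[m] / total for m in vocabulary))


def cfd_distance(motifs_a, motifs_b):
    """Mean absolute gap between two cumulative distributions.

    Hypothetical measure for illustration: 0.0 means the two motif
    lists have identical relative motif frequencies.
    """
    # Shared, alphabetically ordered vocabulary (an arbitrary choice).
    vocabulary = sorted(set(motifs_a) | set(motifs_b))
    if not vocabulary:
        return 0.0
    cdf_a = cumulative_distribution(motifs_a, vocabulary)
    cdf_b = cumulative_distribution(motifs_b, vocabulary)
    gaps = (abs(a - b) for a, b in zip(cdf_a, cdf_b))
    return sum(gaps) / len(vocabulary)


# Hypothetical motif lists for three promoter sequences.
promoter_a = ["SP1", "GATA-1", "SP1", "AP-1"]
promoter_b = ["SP1", "SP1", "GATA-1", "AP-1"]
promoter_c = ["CREB", "CREB", "AP-1"]

print(cfd_distance(promoter_a, promoter_b))
print(cfd_distance(promoter_a, promoter_c))
```

Because a cumulative distribution depends on the order in which motifs are listed, the alphabetical ordering here influences the result; a different ordering, or a comparison based on a fitted regression line as the index terms suggest, would give different values.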

Index Terms

Computer Science
Information Sciences


Keywords

Cumulative frequency distribution, Distance measure, Pattern matching, Promoter sequence, Regression line, Transcription Factors (TFs), Transcription factor binding sites (TFBS)