Research Article

Mining Biological Network and genomes-A Systematic Review

by Rohini M. D. Surendran
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 178 - Number 33
Year of Publication: 2019
Authors: Rohini M. D. Surendran
10.5120/ijca2019919209

Rohini M. D. Surendran. Mining Biological Network and genomes-A Systematic Review. International Journal of Computer Applications. 178, 33 (Jul 2019), 21-25. DOI=10.5120/ijca2019919209

@article{ 10.5120/ijca2019919209,
author = { Rohini M. D. Surendran },
title = { Mining Biological Network and genomes-A Systematic Review },
journal = { International Journal of Computer Applications },
issue_date = { Jul 2019 },
volume = { 178 },
number = { 33 },
month = { Jul },
year = { 2019 },
issn = { 0975-8887 },
pages = { 21-25 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume178/number33/30752-2019919209/ },
doi = { 10.5120/ijca2019919209 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-07T00:52:07.908138+05:30
%A Rohini M. D. Surendran
%T Mining Biological Network and genomes-A Systematic Review
%J International Journal of Computer Applications
%@ 0975-8887
%V 178
%N 33
%P 21-25
%D 2019
%I Foundation of Computer Science (FCS), NY, USA
Abstract

The research community is inundated with biological data such as genome sequences of various organisms and microarray data. The volume of this data is growing rapidly, and the process of understanding it lags behind the process of acquiring it. This sheer volume calls for a systematic, computational approach to interpretation. Rapid progress in biotechnology and bio-data analysis methods has led to the emergence and fast growth of a promising new field, bioinformatics, which holds a tremendous amount of bio-data in need of in-depth analysis. Bio-data is available as nucleotide sequences (DNA and RNA), protein sequences, genomes, and structures in the form of biological networks (metabolic pathways, gene regulatory networks, and protein interaction networks). This paper presents a framework for discovering frequent patterns and modules in biological networks. From a study of different biological networks, it concludes that graph mining is the best way to analyze these networks and extract information (frequent functional modules) from them, since the networks can be modeled as different types of graphs depending on the information to be extracted. However, graph-based mining often leads to computationally hard problems because of its relation to subgraph isomorphism. A graph simplification technique suited to biological networks is therefore used, which makes the graph mining problem computationally tractable and scalable to large numbers of networks. Because simplification significantly reduces the effective graph size, detecting frequently occurring patterns and modules becomes a computationally simpler task.
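The effect of simplification can be illustrated with a minimal sketch. It assumes (this is not stated in the abstract) that simplification leaves each network with unique node labels, for example one node per enzyme. Under that assumption a network is just a set of labeled edges, and testing whether a pattern occurs in a network becomes a subset check rather than a subgraph isomorphism test. The function name `frequent_edge_sets`, its parameters, and the toy data are hypothetical, and the sketch does not require patterns to be connected.

```python
from collections import Counter
from itertools import combinations


def frequent_edge_sets(networks, min_support, max_size=3):
    """Level-wise (Apriori-style) mining of frequent edge sets.

    networks: list of sets of directed edges (u, v); node labels are
              assumed unique within each network (illustrative assumption).
    Returns a dict mapping frozenset-of-edges -> number of networks
    that contain every edge in the set.
    """
    # Level 1: support of each single edge (each network counts once).
    counts = Counter(edge for net in networks for edge in net)
    current = {frozenset([e]) for e, c in counts.items() if c >= min_support}
    result = {s: counts[next(iter(s))] for s in current}

    size = 1
    while current and size < max_size:
        size += 1
        # Candidates: unions of two frequent sets that grow by one edge.
        candidates = {a | b for a, b in combinations(current, 2)
                      if len(a | b) == size}
        current = set()
        for cand in candidates:
            # With unique labels, containment is a plain subset test.
            support = sum(1 for net in networks if cand <= net)
            if support >= min_support:
                current.add(cand)
                result[cand] = support
    return result


# Toy data: three hypothetical pathways over made-up node labels.
toy_networks = [
    {("A", "B"), ("B", "C"), ("C", "D")},
    {("A", "B"), ("B", "C"), ("X", "Y")},
    {("A", "B"), ("B", "C"), ("C", "D")},
]
patterns = frequent_edge_sets(toy_networks, min_support=3)
```

On the toy data, the edges A→B and B→C and their combination are the only patterns present in all three networks. A fuller treatment would also check that each pattern is connected and would map real pathway data onto the simplified graphs.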

Index Terms

Computer Science
Information Sciences

Keywords

Data mining, biological networks, graph mining, metabolic pathways.