Call for Paper - June 2019 Edition
IJCA solicits original research papers for the June 2019 Edition. Last date of manuscript submission is May 20, 2019. Read More

A Method of Subgraphs Extraction in a Large Graph Database in a Distributed System

International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Year of Publication: 2017
Ritu Yadav, Samarth Varshney

Ritu Yadav and Samarth Varshney. A Method of Subgraphs Extraction in a Large Graph Database in a Distributed System. International Journal of Computer Applications 175(5):1-5, October 2017. BibTeX

	author = {Ritu Yadav and Samarth Varshney},
	title = {A Method of Subgraphs Extraction in a Large Graph Database in a Distributed System},
	journal = {International Journal of Computer Applications},
	issue_date = {October 2017},
	volume = {175},
	number = {5},
	month = {Oct},
	year = {2017},
	issn = {0975-8887},
	pages = {1-5},
	numpages = {5},
	url = {},
	doi = {10.5120/ijca2017914960},
	publisher = {Foundation of Computer Science (FCS), NY, USA},
	address = {New York, USA}


Since many real applications such as web connectivity, social networks, and so on, are emerging now-a-days, thus graph databases have been commonly used as significant tools to exemplify and query complex graph data wherein each vertex in a graph usually contains information, which can be modeled by a set of tokens or elements. The method for subgraphs extraction by considering set similarity query over a large graph database has already been proposed, which retrieves subgraphs that are structurally isomorphic to the query graph, and meanwhile satisfy the condition of vertex pair matching with the (dynamic/fixed) weighted set similarity in a centralized system. This paper explains the efficient implementation of subgraphs extraction in a large graph database in a distributed environment by considering both vertex set similarity and graph topology which offers a better price/performance ratio and increases availability using redundancy when parts of a system fail than centralized systems in case of a large dataset (i.e., a graph with millions/billions of nodes wherein each node contains some information) by performing parallel processing.


  1. Z. Sun, H. Wang, H. Wang, B. Shao, and J. Li, “Efficient subgraph matching on billion node graphs,” Proc. VLDB Endowment, vol. 5, no. 9, pp. 788–799, 2012.
  2. P. Zhao and J. Han, “On graph query optimization in large networks,” Proc. VLDB Endowment, vol. 3, nos. 1/2, pp. 340–351, 2010.
  3. L. Zou, L. Chen, and M. T. Ozsu, “Distance-join: Pattern match query in a large graph database,” Proc. VLDB Endowment, vol. 2, no. 1, pp. 886–897, 2009.
  4. Y. Tian and J. M. Patel, “Tale: A tool for approximate large graph matching,” in Proc. 24th Int. Conf. Data Eng., 2008, pp. 963–972.
  5. H. He and A. K. Singh, “Closure-tree: An index structure for graph queries,” in Proc. 222nd Int. Conf. Data Eng., 2006, p. 38.
  6. W.-S. Han, J. Lee, and J.-H. Lee, “Turboiso: Towards ultrafast and robust subgraph isomorphism search in large graph databases,” in Proc. ACM SIGMOD Int. Conf. Manage. Data, 2013, pp. 337–348.
  7. L. P. Cordella, P. Foggia, C. Sansone, and M. Vento, “A (sub) graph isomorphism algorithm for matching large graphs,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 26, no. 10, pp. 1367–1372, Oct. 2004.
  8. J. R. Ullmann, “An algorithm for subgraph isomorphism,” J. ACM, vol. 23, no. 1, pp. 31–42, 1976.
  9. S. Ma, Y. Cao, W. Fan, J. Huai, and T. Wo, “Capturing topology in graph pattern matching,” Proc. VLDB Endowments, vol. 5, no. 4, pp. 310–321, 2012.
  10. A. Khan, Y. Wu, C. C. Aggarwal, and X. Yan, “Nema: Fast graph search with label similarity,” in Proc. VLDB Endowment, vol. 6, no. 3, pp. 181–192, 2013.
  11. Liang Hong, Member, IEEE, Lei Zou, Member, IEEE, Xiang Lian, Member, IEEE, and Philip S. Yu, Fellow, IEEE, "Subgraph Matching with Set Similarity in a Large Graph Database", IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, VOL. 27, NO. 9, SEPTEMBER 2015


apache spark, distributed systems, graph topology, set similarity