Structural Clustering Multimedia Documents: An Approach based on Semantic Sub-graph Isomorphism

International Journal of Computer Applications
© 2012 by IJCA Journal
Volume 51 - Number 1
Year of Publication: 2012
Authors:
Ali Idarrou
Driss Mammass
DOI: 10.5120/8005-1343

Ali Idarrou and Driss Mammass. Article: Structural Clustering Multimedia Documents: An Approach based on Semantic Sub-graph Isomorphism. International Journal of Computer Applications 51(1):14-21, August 2012. Full text available. BibTeX

@article{key:article,
	author = {Ali Idarrou and Driss Mammass},
	title = {Article: Structural Clustering Multimedia Documents: An Approach based on Semantic Sub-graph Isomorphism},
	journal = {International Journal of Computer Applications},
	year = {2012},
	volume = {51},
	number = {1},
	pages = {14-21},
	month = {August},
	note = {Full text available}
}

Abstract

Work that uses graphs to represent documents has emphasized the expressive richness of these tools. Graph theory can also be of great interest for evaluating the similarity between such documents, both in document classification and in information retrieval. In the structural classification of documents, which is the subject of this work, the similarity measure is a crucial step. In many applications, this step amounts to a sub-graph isomorphism problem, which is known in graph theory for its combinatorial explosion. To get around this problem, we propose to consider a graph as the set of paths that compose it. Matching paths rather than whole graphs reduces the combinatorial cost. We propose a structural measure based on sub-graph isomorphism and discuss the quality of our classifier, especially the separation of classes. We aim to show that our measure is structural rather than a "surface measure", and we evaluate our approach on a set of multimedia documents extracted at random from the INEX 2007 corpus.
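The idea of treating a graph as the set of paths that compose it can be illustrated with a minimal Python sketch. This is not the measure defined in the paper: the function names `root_to_leaf_paths` and `path_inclusion`, the dictionary-based graph encoding, and the simple inclusion ratio are all assumptions made for illustration, and the sketch assumes an acyclic document structure with a single root.

```python
from typing import Dict, List, Set, Tuple

Graph = Dict[str, List[str]]   # hypothetical encoding: node id -> child node ids
Labels = Dict[str, str]        # hypothetical encoding: node id -> element label
Path = Tuple[str, ...]         # a path is the sequence of labels from root to leaf


def root_to_leaf_paths(graph: Graph, labels: Labels, root: str) -> Set[Path]:
    """Collect every root-to-leaf label path of an acyclic structure."""
    paths: Set[Path] = set()
    stack = [(root, (labels[root],))]
    while stack:
        node, path = stack.pop()
        children = graph.get(node, [])
        if not children:
            paths.add(path)  # a node without children ends a path
        for child in children:
            stack.append((child, path + (labels[child],)))
    return paths


def path_inclusion(doc_paths: Set[Path], model_paths: Set[Path]) -> float:
    """Fraction of the document's paths that also occur in the model.

    An illustrative stand-in for a structural similarity, not the
    paper's measure.
    """
    if not doc_paths:
        return 0.0
    return len(doc_paths & model_paths) / len(doc_paths)


# Hypothetical class model: an article whose sections hold text, images or video.
model_graph = {"a": ["s1", "s2", "s3"], "s1": ["p"], "s2": ["i"], "s3": ["v"]}
model_labels = {"a": "article", "s1": "section", "s2": "section",
                "s3": "section", "p": "paragraph", "i": "image", "v": "video"}

# Hypothetical document whose structure is only partly covered by the model.
doc_graph = {"a": ["s1", "s2"], "s1": ["p"], "s2": ["x"]}
doc_labels = {"a": "article", "s1": "section", "s2": "section",
              "p": "paragraph", "x": "audio"}

model_paths = root_to_leaf_paths(model_graph, model_labels, "a")
doc_paths = root_to_leaf_paths(doc_graph, doc_labels, "a")
score = path_inclusion(doc_paths, model_paths)
```

Comparing two sets of label paths takes time roughly proportional to the number of paths, whereas exact sub-graph isomorphism has to search over node mappings. The trade-off in this simplified form is that sibling order and shared sub-structures are not captured.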
