Research Article

A Comprehensive Survey of Pattern Mining: Challenges and Opportunities

by Pragati Upadhyay, M. K. Pandey, Narendra Kohli
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 180 - Number 24
Year of Publication: 2018
10.5120/ijca2018916573

Pragati Upadhyay, M. K. Pandey, Narendra Kohli. A Comprehensive Survey of Pattern Mining: Challenges and Opportunities. International Journal of Computer Applications. 180, 24 (Mar 2018), 32-39. DOI=10.5120/ijca2018916573

@article{ 10.5120/ijca2018916573,
author = { Pragati Upadhyay and M. K. Pandey and Narendra Kohli },
title = { A Comprehensive Survey of Pattern Mining: Challenges and Opportunities },
journal = { International Journal of Computer Applications },
issue_date = { Mar 2018 },
volume = { 180 },
number = { 24 },
month = { Mar },
year = { 2018 },
issn = { 0975-8887 },
pages = { 32-39 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume180/number24/29106-2018916573/ },
doi = { 10.5120/ijca2018916573 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-07T01:01:40.102041+05:30
%A Pragati Upadhyay
%A M. K. Pandey
%A Narendra Kohli
%T A Comprehensive Survey of Pattern Mining: Challenges and Opportunities
%J International Journal of Computer Applications
%@ 0975-8887
%V 180
%N 24
%P 32-39
%D 2018
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Pattern mining is an important field of data mining. A fundamental task of data mining is to explore a database to discover frequent and sequential patterns. In recent years, data mining has shifted its focus toward designing methods that discover patterns matching user expectations. To this end, various types of pattern mining methods have been proposed, including frequent pattern mining, sequential pattern mining, temporal pattern mining, and constraint-based pattern mining. Pattern mining has many useful real-life applications, such as market basket analysis, e-learning, social network analysis, web page click sequences, and bioinformatics. This paper presents a survey of the various types of pattern mining. Its main goal is to provide both an introduction to pattern mining and a survey of the algorithms, challenges, and research opportunities. The paper discusses not only the problems of pattern mining and its related applications, but also extensions and possible future improvements in this field.

Index Terms

Computer Science
Information Sciences

Keywords

Constraints, Sequential Pattern Mining, Frequent Pattern, Domain Driven Pattern Mining