Research Article

FP-Split SPADE-An Algorithm for Finding Sequential Patterns

by Pragya Goel, Rajender Nath, Kartik
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 145 - Number 5
Year of Publication: 2016
Authors: Pragya Goel, Rajender Nath, Kartik
10.5120/ijca2016910627

Pragya Goel, Rajender Nath, Kartik. FP-Split SPADE-An Algorithm for Finding Sequential Patterns. International Journal of Computer Applications. 145, 5 (Jul 2016), 23-28. DOI=10.5120/ijca2016910627

@article{ 10.5120/ijca2016910627,
author = { Pragya Goel and Rajender Nath and Kartik },
title = { FP-Split SPADE-An Algorithm for Finding Sequential Patterns },
journal = { International Journal of Computer Applications },
issue_date = { Jul 2016 },
volume = { 145 },
number = { 5 },
month = { Jul },
year = { 2016 },
issn = { 0975-8887 },
pages = { 23-28 },
numpages = {6},
url = { https://ijcaonline.org/archives/volume145/number5/25275-2016910627/ },
doi = { 10.5120/ijca2016910627 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T23:47:59.373950+05:30
%A Pragya Goel
%A Rajender Nath
%A Kartik
%T FP-Split SPADE-An Algorithm for Finding Sequential Patterns
%J International Journal of Computer Applications
%@ 0975-8887
%V 145
%N 5
%P 23-28
%D 2016
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Sequential Pattern Mining (SPM) is one of the key areas in Web Usage Mining (WUM), with broad applications such as analyzing customer behavior from weblog files. Current algorithms in this area fall into two broad classes: apriori-based and pattern-growth-based. Apriori-based algorithms need to scan the database many times because they follow a candidate generate-and-test approach. Despite extensive research, even SPADE, the best apriori-based algorithm for SPM in terms of the number of database scans, scans the database three times to discover sequential patterns. Pattern-growth-based algorithms avoid the candidate generation step; the best pattern-growth algorithm known so far, PrefixSpan, needs to scan the database at least twice. This paper proposes a novel SPM algorithm, FP-Split SPADE, that reduces the number of database scans to one by building an FP-Split tree and applying the SPADE algorithm to the tree instead of to the sequence database, which greatly improves the efficiency of mining sequential patterns.
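
To make the SPADE idea mentioned in the abstract concrete, the following is a minimal Python sketch of the vertical id-list representation and a temporal join. It is an illustration only and is not the FP-Split SPADE algorithm of this paper: the function names (build_idlists, temporal_join, support) and the toy database are assumptions introduced here, and no FP-Split tree is built.

```python
from collections import defaultdict


def build_idlists(db):
    """One pass over the sequence database.

    Maps each item to its id-list: (sequence id, event id) pairs
    recording where the item occurs.
    """
    idlists = defaultdict(list)
    for sid, sequence in db.items():
        for eid, itemset in enumerate(sequence):
            for item in itemset:
                idlists[item].append((sid, eid))
    return idlists


def temporal_join(left, right):
    """Id-list of the 2-sequence <left -> right>.

    Keeps the occurrences of `right` that come strictly after the
    earliest occurrence of `left` in the same sequence.
    """
    first_left = {}
    for sid, eid in left:
        first_left[sid] = min(eid, first_left.get(sid, eid))
    return [(sid, eid) for sid, eid in right
            if sid in first_left and eid > first_left[sid]]


def support(idlist):
    """Number of distinct sequences in which the pattern occurs."""
    return len({sid for sid, _ in idlist})


# Hypothetical toy database: sequence id -> ordered list of itemsets.
db = {
    1: [{"a"}, {"b"}, {"c"}],
    2: [{"a"}, {"c"}],
    3: [{"b"}, {"a"}],
}

idlists = build_idlists(db)
ab = temporal_join(idlists["a"], idlists["b"])  # pattern <a -> b>
ac = temporal_join(idlists["a"], idlists["c"])  # pattern <a -> c>
print(support(ab), support(ac))
```

In this sketch, support is counted from the id-lists alone, so after the first pass the original sequences are not read again; longer patterns would be obtained by joining the id-lists of shorter ones.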

References
  1. Agrawal, R., and Srikant, R. 1994. Fast algorithms for mining association rules, In Proceedings of the 20th International Conference on Very Large Databases, VLDB, ACM, 487-499.
  2. Agrawal, R., and Srikant, R. 1995. Mining sequential patterns, In Proceedings of the 11th IEEE International Conference on Data Engineering, 3-14.
  3. Han, J., Pei, J., Mortazavi-Asl, B., Chen, Q., Dayal, U., and Hsu, M.-C. 2000. FreeSpan: frequent pattern-projected sequential pattern mining, In Proceedings of the 6th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 355-359.
  4. Zaki, M. J. 2001. SPADE: An Efficient algorithm for mining frequent sequences, Machine Learning, 31-60.
  5. Pei, J., Han, J., Mortazavi-Asl, B., Pinto, H., Chen, Q., and Dayal, U. 2001. PrefixSpan: Mining Sequential Patterns Efficiently by Prefix-Projected Pattern growth, In Proceedings of the 17th IEEE International Conference on Data Engineering, 215-224.
  6. Leleu, M., Rigotti, C., Boulicaut, J.-F., and Euvrard, G. 2003. GO-SPADE: Mining Sequential Patterns over Datasets with Consecutive Repetitions, In Proceedings of the 3rd International Conference on Machine Learning and Data Mining in Pattern Recognition, Springer, 293-306.
  7. Han, J., Pei, J., Yin, Y., and Mao, R. 2004. Mining Frequent Patterns without candidate generation: A Frequent pattern tree approach, Data Mining and Knowledge Discovery, Springer, 53-87.
  8. Lee, C.-F., and Shen, T.-H. 2005. An FP-Split Method for Fast Association Rules Mining, In Proceedings of the 3rd IEEE International Conference on Information Technology: Research and Education, 459-463.
  9. Aseervatham, S., Osmani, A., and Viennet, E. 2006. bitSPADE: A Lattice-based Sequential Pattern Mining Algorithm Using Bitmap Representation, In Proceedings of the 6th IEEE International Conference on Data Mining, 792-797.
  10. Alias, S., and Norwawi, N. M. 2008. pSPADE: Mining sequential pattern using personalized support threshold value, International Symposium on Information Technology, IEEE, 1-8.
Index Terms

Computer Science
Information Sciences

Keywords

Sequential Pattern Mining, Web Mining, SPADE, Apriori, FP-Split tree