Research Article

Efficient Tabular Dataset Preparations by the Aggregations in SQL: A Survey

by Jincy Annie V.v, J. A. M. Rexie
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 58 - Number 15
Year of Publication: 2012
Authors: Jincy Annie V.v, J. A. M. Rexie
10.5120/9358-3693

Jincy Annie V.v, J. A. M. Rexie. Efficient Tabular Dataset Preparations by the Aggregations in SQL: A Survey. International Journal of Computer Applications. 58, 15 (November 2012), 17-20. DOI=10.5120/9358-3693

@article{ 10.5120/9358-3693,
author = { Jincy Annie V.v and J. A. M. Rexie },
title = { Efficient Tabular Dataset Preparations by the Aggregations in SQL: A Survey },
journal = { International Journal of Computer Applications },
issue_date = { November 2012 },
volume = { 58 },
number = { 15 },
month = { November },
year = { 2012 },
issn = { 0975-8887 },
pages = { 17-20 },
numpages = {4},
url = { https://ijcaonline.org/archives/volume58/number15/9358-3693/ },
doi = { 10.5120/9358-3693 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T21:02:36.167777+05:30
%A Jincy Annie V.v
%A J. A. M. Rexie
%T Efficient Tabular Dataset Preparations by the Aggregations in SQL: A Survey
%J International Journal of Computer Applications
%@ 0975-8887
%V 58
%N 15
%P 17-20
%D 2012
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Many data mining algorithms require their input to be transformed into tabular format, since tabular datasets are the suitable input for many data mining approaches. However, the existing SQL aggregations cannot produce results in tabular form with more summarized detail, especially in horizontal tabular form. This paper discusses several approaches for producing data sets in tabular format and presents an efficient method for producing results in horizontal tabular format. Alternative methods for evaluating the new format are also shown.
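To make the distinction between vertical and horizontal tabular form concrete, the following is a minimal sketch and not the method surveyed in the paper. It assumes a hypothetical `sales(store, quarter, amount)` table, uses SQLite through Python's standard `sqlite3` module, and builds the horizontal layout with one `SUM(CASE ...)` column per distinct quarter, which is one common way to emulate a horizontal aggregation in plain SQL. The table, column names, and sample rows are illustrative assumptions.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a small, hypothetical sales table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (store TEXT, quarter TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?, ?)",
        [
            ("A", "Q1", 10.0), ("A", "Q1", 5.0), ("A", "Q2", 7.0),
            ("B", "Q1", 3.0), ("B", "Q2", 4.0), ("B", "Q2", 6.0),
        ],
    )
    return conn


def vertical_totals(conn):
    """Standard (vertical) aggregation: one row per (store, quarter) group."""
    return conn.execute(
        "SELECT store, quarter, SUM(amount) FROM sales "
        "GROUP BY store, quarter ORDER BY store, quarter"
    ).fetchall()


def horizontal_totals(conn):
    """Horizontal layout: one row per store, one total column per quarter.

    Assumes the table holds at least one quarter value.
    """
    quarters = [
        row[0]
        for row in conn.execute(
            "SELECT DISTINCT quarter FROM sales ORDER BY quarter"
        )
    ]
    # One conditional sum per quarter; quarter values are bound as parameters.
    columns = ", ".join(
        "SUM(CASE WHEN quarter = ? THEN amount ELSE 0 END)" for _ in quarters
    )
    sql = f"SELECT store, {columns} FROM sales GROUP BY store ORDER BY store"
    rows = conn.execute(sql, quarters).fetchall()
    return ["store"] + quarters, rows


if __name__ == "__main__":
    conn = build_demo_db()
    print(vertical_totals(conn))
    header, rows = horizontal_totals(conn)
    print(header)
    print(rows)
```

The vertical query returns one row per group, so a mining tool would still need to pivot it. The horizontal query returns one row per store with a fixed set of numeric columns, which is the shape most data mining algorithms expect as input.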

Index Terms

Computer Science
Information Sciences

Keywords

Aggregations, Vertical Aggregations, Horizontal Aggregations, Structured Query Language, Dataset Preparation