Research Article

The Impact and Importance of Statistics in Data Science

by Pallavi Gupta, Nitin V. Tawar
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 176 - Number 24
Year of Publication: 2020
Authors: Pallavi Gupta, Nitin V. Tawar
10.5120/ijca2020920215

Pallavi Gupta, Nitin V. Tawar. The Impact and Importance of Statistics in Data Science. International Journal of Computer Applications. 176, 24 (May 2020), 10-14. DOI=10.5120/ijca2020920215

@article{ 10.5120/ijca2020920215,
author = { Pallavi Gupta and Nitin V. Tawar },
title = { The Impact and Importance of Statistics in Data Science },
journal = { International Journal of Computer Applications },
issue_date = { May 2020 },
volume = { 176 },
number = { 24 },
month = { May },
year = { 2020 },
issn = { 0975-8887 },
pages = { 10-14 },
numpages = {5},
url = { https://ijcaonline.org/archives/volume176/number24/31345-2020920215/ },
doi = { 10.5120/ijca2020920215 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-07T00:43:22.504781+05:30
%A Pallavi Gupta
%A Nitin V. Tawar
%T The Impact and Importance of Statistics in Data Science
%J International Journal of Computer Applications
%@ 0975-8887
%V 176
%N 24
%P 10-14
%D 2020
%I Foundation of Computer Science (FCS), NY, USA
Abstract

With massive amounts of data pouring in, data science has become one of the most challenging yet promising fields for handling such volumes and extracting quality information for strategic business decisions. Data science begins with the collection of large amounts of data, which must be organized well enough to be processed and analyzed. Statistics plays a vital role in almost every step of data science, from molding data into the required format to the final presentation of results. In this paper, we show how statistics provides the tools and methods needed to handle data and gain deep insight into it, and how useful it is for quantifying and analyzing data. We discuss various statistical tools and techniques used in data science, from measures of dispersion to advanced tools for visualizing results, to explain the role and importance of statistical approaches in data processing and analysis.
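
As a minimal sketch of the descriptive measures named in the abstract and keywords (mean, median, mode, and measures of dispersion), the following Python example uses only the standard library. The sample values and the `describe` helper are hypothetical and serve only as illustration; they are not taken from the paper.

```python
import statistics

# Hypothetical sample; the values are made up for illustration only.
sample = [2, 4, 4, 5, 7, 9, 11]


def describe(values):
    """Return basic measures of central tendency and dispersion."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "range": max(values) - min(values),
        "variance": statistics.variance(values),  # sample variance (n - 1)
        "stdev": statistics.stdev(values),        # sample standard deviation
    }


summary = describe(sample)
for name, value in summary.items():
    print(f"{name}: {value}")
```

The sample variance and standard deviation divide by n - 1; `statistics.pvariance` and `statistics.pstdev` would be the choice when the data is treated as a whole population rather than a sample.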

Index Terms

Computer Science
Information Sciences

Keywords

Inferential Analysis, Mean, Median, Mode, Null hypotheses, p-value