Research Article

Mining Developer Questions about Major NoSQL Databases

by Saiful Islam, Khalid Hasan, Rifat Shahriyar
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 174 - Number 13
Year of Publication: 2021
Authors: Saiful Islam, Khalid Hasan, Rifat Shahriyar
10.5120/ijca2021921021

Saiful Islam, Khalid Hasan, Rifat Shahriyar. Mining Developer Questions about Major NoSQL Databases. International Journal of Computer Applications. 174, 13 (Jan 2021), 1-8. DOI=10.5120/ijca2021921021

@article{ 10.5120/ijca2021921021,
author = { Saiful Islam and Khalid Hasan and Rifat Shahriyar },
title = { Mining Developer Questions about Major NoSQL Databases },
journal = { International Journal of Computer Applications },
issue_date = { Jan 2021 },
volume = { 174 },
number = { 13 },
month = { Jan },
year = { 2021 },
issn = { 0975-8887 },
pages = { 1-8 },
numpages = {8},
url = { https://ijcaonline.org/archives/volume174/number13/31735-2021921021/ },
doi = { 10.5120/ijca2021921021 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-07T00:21:57.700892+05:30
%A Saiful Islam
%A Khalid Hasan
%A Rifat Shahriyar
%T Mining Developer Questions about Major NoSQL Databases
%J International Journal of Computer Applications
%@ 0975-8887
%V 174
%N 13
%P 1-8
%D 2021
%I Foundation of Computer Science (FCS), NY, USA
Abstract

NoSQL databases are rapidly gaining popularity among developers. While RDBMS is still the most popular, the NoSQL camp is closing the gap. Motivated by this trend, we carry out the first empirical software engineering study of NoSQL databases using Stack Overflow posts. As one of the leading question-answering sites, Stack Overflow has become a helpful resource in numerous software engineering studies. In this paper, we chose five NoSQL databases (MongoDB, Cassandra, Redis, Neo4j, and HBase) based on their popularity and the increasing number of posts about them on Stack Overflow. By mining questions asked on Stack Overflow, we extracted the relevant questions and investigated the challenges and issues faced by developers using NoSQL databases, as well as the domains in which these databases are used. We ranked the issues by popularity and difficulty metrics and observed that difficulty and popularity differ in nature. We found that connection and integration issues are the most common difficult issues that developers of NoSQL databases face. We also found that Cassandra, HBase, and Neo4j are very popular among Java developers, MongoDB and Redis are very popular among Node.js developers, and Cassandra and HBase are very popular for big data systems. Our findings will help researchers and practitioners better understand the challenges, requirements, and specific applications of NoSQL databases.

Index Terms

Computer Science
Information Sciences

Keywords

NoSQL Database, Stack Overflow, LDA