Map vs. Unordered Map: An Analysis on Large Datasets

Akanksha Bindal; Prateek Narang; S. Indu

Call for Paper

June Edition

IJCA solicits high quality original research papers for the upcoming June edition of the journal. The last date of research paper submission is 20 May 2024

Submit your paper

Know more

The week's pick

Enhancing Privacy Preservation: Multi-Attribute Protection with P-Sensitive K-Anonymity

Twinkle Patel Kiran Amin

Random Articles

On Rayleigh-Ritz Method in Three-Parameter Eigenvalue Problems

January

2014

To Secure and Compress the Message on Local Area Network

May

2013

RIO: An AI based Virtual Assistant

May

2018

Improved the Prediction of Clinical Data Accuracy using RBF Neural Network Model

Mar

2017

Reseach Article

Map vs. Unordered Map: An Analysis on Large Datasets

by Akanksha Bindal, Prateek Narang, S. Indu

International Journal of Computer Applications

Foundation of Computer Science (FCS), NY, USA

Volume 127 - Number 2

Year of Publication: 2015

Authors: Akanksha Bindal, Prateek Narang, S. Indu

10.5120/ijca2015906322

Akanksha Bindal, Prateek Narang, S. Indu . Map vs. Unordered Map: An Analysis on Large Datasets. International Journal of Computer Applications. 127, 2 ( October 2015), 5-9. DOI=10.5120/ijca2015906322

@article{ 10.5120/ijca2015906322,

author = { Akanksha Bindal, Prateek Narang, S. Indu },

title = { Map vs. Unordered Map: An Analysis on Large Datasets },

journal = { International Journal of Computer Applications },

issue_date = { October 2015 },

volume = { 127 },

number = { 2 },

month = { October },

year = { 2015 },

issn = { 0975-8887 },

pages = { 5-9 },

numpages = {9},

url = { https://ijcaonline.org/archives/volume127/number2/22699-2015906322/ },

doi = { 10.5120/ijca2015906322 },

publisher = {Foundation of Computer Science (FCS), NY, USA},

address = {New York, USA}

}

%0 Journal Article

%1 2024-02-06T23:18:48.259801+05:30

%A Akanksha Bindal

%A Prateek Narang

%A S. Indu

%T Map vs. Unordered Map: An Analysis on Large Datasets

%J International Journal of Computer Applications

%@ 0975-8887

%V 127

%N 2

%P 5-9

%D 2015

%I Foundation of Computer Science (FCS), NY, USA

Abstract

In the current scenario, with the transpiring big data explosion, data sets are often too large to fit completely inside the computers´ internal memory. In efficient processes, speed is not an option, it is a must. Hence every alternative is explored to further enhance performance, by expanding in-place memory storage that enables more data to be resident in the memory, eliminating operation latency, and even deploying an in-memory database (IMDB) system where all the data can be kept in memory. However, the technique of in-memory data handling is still at an infant stage and not viable in the current scenario. To tackle this problem a hierarchical hashing scheme is discussed where only one component of a big data structure resides in the memory. In this paper two data structures are explored: 1) Map which is implemented as self-balancing binary search trees or more commonly Red Black Trees and 2) Unordered Map which is based on hashing with chaining technique. Serialization and deserialization operations are also performed to free the internal memory and preserve the data structure object for later use. Operations such as read, write are performed, along with documentation of the results and illustrations of visual representations of the two algorithmic data structures.

References

Jeffrey Scott Vitter, Purdue University ”Dealing with massive data” http://www.cs.purdue.edu/homes/jsv/Papers/Vit.IO survey.pdf 2006.
G. DeCandia, D. Hastorun, M. Jampani et al., Dynamo: Amazons highly available key-value store, OSR, 2007.
Hao Zhang, Gang Chen,Kian-Lee Tan et al., ”In-Memory Big Data Management and Processing: A Survey”
S. Robbins, Ram is the new disk, InfoQ News, Jun. 2008.
J. Ousterhout, P. Agrawal, D. Erickson et al., The case for ramclouds: Scalable high-performance storage entirely in dram, OSR, 2010.
F. Li, B. C. Ooi, M. T. O zsu, and S. Wu, Distributed data manage-ment using mapreduce,ACM Comput. Surv., 2014.
Wikipedia-Wikipedia.org.
Standard C++ Library reference: http://www.cplusplus.com/reference/unorderedmap/unordered map/ Cpp Reference Documentation
The Boost C++ Libraries: http://theboostcpplibraries.com/ Boost C++ Documentation
Standard C++ Library reference: http://www.cplusplus.com/reference/map/map/ Cpp Reference Documentation
HP, Vertica systems, 2011. [Online]. Available: http://www.vertica. com
D. J. DeWitt, R. H. Katz, F. Olken et al., Implementation techniques for main memory database systems, in SIGMOD 84, 1984.
R. B. Hagmann, A crash recovery scheme for a memory-resident database system,TC, 1986.
T. J. Lehman and M. J. Carey, A recovery algorithm for a high performance memory-resident database system, in SIGMOD 87, 1987.
M. H. Eich, Mars: The design of a main memory database machine, in Database Machines and Knowledge Base Machines.Springer US, 1988.
D. J. DeWitt and J. Gray, Parallel database systems: The future of high performance database systems, CACM, 1992.
S. Wu, B. C. Ooi, and K.-L. Tan, Online aggregation, in Advanced Query Processing. Springer Berlin Heidelberg, 2013
S. Wu, S. Jiang, B. C. Ooi, and K.-L. Tan, Distributed online aggregations, in PVLDB 09, 2009.
T. J. Lehman and M. J. Carey, A study of index structures for main memory database management systems, in PVLDB 86, 1986.
J. Rao and K. A. Ross, Cache conscious indexing for decision-support in main memory, in PVLDB 99, 1999.
D. B. Lomet, S. Sengupta, and J. J. Levandoski, The bw-tree: A b-tree for new hardware platforms, in ICDE 13, 2013.
T. Brown, F. Ellen, and E. Ruppert, A general technique for non-blocking trees, in PPoPP, pp. 329342, ACM, 2014.
T. Brown, Personal homepage.
Oracle, Concurrentskiplistmap, 2014.
D. Dice, Y. Lev, M. Moir, D. Nussbaum, and M. Olszewski, Early experience with a commercial hardware transactional memory im-plementation, tech. rep., Sun Microsystems, Inc., 2009..
A. C. Yao. On random 2-3 trees. Acta Informatica, 9, 159170, 1978.
K.-Y. Whang and R. Krishnamurthy. Multilevel grid filesA dy-namic hierarchical multidimensional file structure. In Proceedings of the International Symposium on Database Systems for Advanced Applications, 449459. World Scientific Press, 1992.
J. S. Vitter and P. Flajolet. Average-case analysis of algorithms and data structures. In J. van Leeuwen, editor, Handbook of Theoretical Computer Science, Volume A: Algorithms and Complexity, chapter 9, 431524

Index Terms

Computer Science

Information Sciences

Keywords

Map Unordered Map Boost C++ Serialization Hierarchical hashing Memory management