CFP last date
22 April 2024
Reseach Article

Performance Analysis and Comparison of Sampling Algorithms in Online Social Network

by Stuti K., Atul Srivastava
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 133 - Number 5
Year of Publication: 2016
Authors: Stuti K., Atul Srivastava
10.5120/ijca2016907868

Stuti K., Atul Srivastava . Performance Analysis and Comparison of Sampling Algorithms in Online Social Network. International Journal of Computer Applications. 133, 5 ( January 2016), 30-35. DOI=10.5120/ijca2016907868

@article{ 10.5120/ijca2016907868,
author = { Stuti K., Atul Srivastava },
title = { Performance Analysis and Comparison of Sampling Algorithms in Online Social Network },
journal = { International Journal of Computer Applications },
issue_date = { January 2016 },
volume = { 133 },
number = { 5 },
month = { January },
year = { 2016 },
issn = { 0975-8887 },
pages = { 30-35 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume133/number5/23785-2016907868/ },
doi = { 10.5120/ijca2016907868 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T23:30:22.085477+05:30
%A Stuti K.
%A Atul Srivastava
%T Performance Analysis and Comparison of Sampling Algorithms in Online Social Network
%J International Journal of Computer Applications
%@ 0975-8887
%V 133
%N 5
%P 30-35
%D 2016
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Graph sampling provides an efficient way by selecting a representative subset of the original graph thus making the graph scale small for improved computations. Random walk graph sampling has been considered as a fundamental tool to collect uniform node samples from a large graph. In this paper, a comprehensive analysis and comparison of four existing sampling algorithms- BFS, NBRW-rw, MHRW and MHDA is presented. The comparison is shown on the basis of the performance of each algorithm on different kinds of datasets. Here, the considered parameters are node-degree distribution and clustering coefficient which effect the performance of an algorithm in generating unbiased samples. The sampling methods as in this study are analysed on the real-network datasets and finally the conclusion says that MHDA performs excellently whereas BFS gives a poor performance.

References
  1. D. Corlette and F. Shipman, “Capturing On-line Social Network Link Dynamics using Event-Driven Sampling, ” International Conference on Computational Science and Engineering, IEEE, 2009.
  2. T. Wang, Y. Chen, Z. Zhang, T. Xu, L. Jin, P. Hui, B. Deng, and X. Li, “Understanding Graph Sampling Algorithms for Social Network Analysis,” 31st International Conference on Distributed Computing Systems Workshops (ICDCSW), IEEE, 2011.
  3. C.-I. Wong, K.-Y. Wong, K.-W. Ng, W. Fan, and K.-H. Yeung, “Design of a Crawler for Online Social Network Analysis,” WSEAS TRANSACTIONS on COMMUNICATIONS, vol.13, 2014.
  4. M. Gjoka, M. Kurant, C. T. Butts, and A. Markopoulou, “Walking in Facebook: A Case Study of Unbiased Sampling of OSNs,” in proc. of the IEEE INFOCOM, 2010.
  5. M. Gjoka, M. Kurant, C. T. Butts, and A. Markopoulou, “Practical Recommendations on Crawling Online Social Networks,” IEEE Journal on Selected Areas in Communications, vol.29, Issue 9, October 2011.
  6. W. M. K. Trochim, “Sampling,” 2002, [Online],Available: http://trochim.human.cornell.edu/tutorial/mugo/tutorial.htm.
  7. M. Doherty, “Probability versus Non-Probability Sampling in Sample Surveys,” The New Zealand Statistics Review, pp. 21-28, March 1994.
  8. M. V. Gwetu, “The Application of Sampling to the Design of Structural Analysis Web Crawlers,” IJCSI, vol. 7, Issue 4, no. 8, July 2010.
  9. M. Kurant, A. Markopoulou, and P.Thiran, “On the Bias of Breadth First Search (BFS) and of Other Graph Sampling Techniques,” International Teletraffic Congress (ITC 22), 2010.
  10. L. Goodman, “Snowball Sampling,” Annals of Mathematical Statistics vol.32, pp.148-170, 1961.
  11. S. Dhawan, K. Singh, and K. Saini, “Comparing Crawling Techniques In Social Network,” IJIRS, vol.3, Issue 7, July 2014.
  12. C.-H. Lee, X. Xu, and D. Y. Eun, “Beyond Random Walk and Metropolis –Hastings Samplers: Why you should not Backtrack for Unbiased Graph Sampling,” SIGMETRICS, 2012.
  13. M. Strand, “Metropolis-Hastings Markov Chain Monte Carlo,” May 2009.
  14. R.-H. Li, J. X. Yu, L. Qin, R. Mao, and T. Jin, “On Random Walk Based Graph Sampling,” 31st International Conference on Data Engineering (ICDE), IEEE, 2015.
  15. D. Achlioptas, D. Kempe, A. Clauset, and C. Moore, “On the Bias of Traceroute Sampling or, Power-law Degree Distributions in Regular Graphs,” in proc. of the 37th ACM Symposium on Theory of Computing (STOC), May 2005.
  16. Stanford Large Network Dataset collection. http://snap.stanford.edu/data/index.html.
Index Terms

Computer Science
Information Sciences

Keywords

Comparison of Sampling Algorithms Node Degree Distribution Clustering Coefficient