What the Masses Want: A Case Study in Knowledge Discovery from Politically Oriented Data

International Journal of Computer Applications
© 2013 by IJCA Journal
Volume 67 - Number 6
Year of Publication: 2013
Authors:
Samhaa R. El-beltagy
Moustafa Ghanem
Heba Ezzat
Sourya Ezzat
Mohmmed Aboelhouda
Ahmed Gamal
Mohamed Elkalioby
Shady Alaa Issa
10.5120/11399-6712

Samhaa R. El-beltagy, Moustafa Ghanem, Heba Ezzat, Sourya Ezzat, Mohmmed Aboelhouda, Ahmed Gamal, Mohamed Elkalioby and Shady Alaa Issa. Article: What the Masses Want: A Case Study in Knowledge Discovery from Politically Oriented Data. International Journal of Computer Applications 67(6):21-28, April 2013. Full text available.

BibTeX

@article{key:article,
	author = {Samhaa R. El-beltagy and Moustafa Ghanem and Heba Ezzat and Sourya Ezzat and Mohmmed Aboelhouda and Ahmed Gamal and Mohamed Elkalioby and Shady Alaa Issa},
	title = {Article: What the Masses Want: A Case Study in Knowledge Discovery from Politically Oriented Data},
	journal = {International Journal of Computer Applications},
	year = {2013},
	volume = {67},
	number = {6},
	pages = {21-28},
	month = {April},
	note = {Full text available}
}

Abstract

This paper describes an approach taken to analyze and categorize a sizable dataset of politically oriented posts submitted to Egypt 2.0, a popular idea bank created following the Egyptian revolution. The aim of the analysis was to organize and present the data in a simple way that would allow the voice of the people to be heard by decision makers and activists during a critical six-week period in February and March 2011. The constraints faced when developing the approach included the absence of a classification scheme, the unavailability of training data, the need to assign more than one category, or label, to individual posts, and the need to complete the task in a short period of time. The goal of this paper is twofold: first, to present and evaluate the rapid development framework and algorithms used to organize the data; and second, to document the challenges encountered both in developing the system and in analyzing the data, and to share this experience with the research community with the aim of identifying new and potentially interesting research topics.
