
Unsupervised Text Classification and Search using Word Embeddings on a Self-Organizing Map

International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Year of Publication: 2016
Suraj Subramanian, Deepali Vora

Suraj Subramanian and Deepali Vora. Unsupervised Text Classification and Search using Word Embeddings on a Self-Organizing Map. International Journal of Computer Applications 156(11):35-37, December 2016. BibTeX

	author = {Suraj Subramanian and Deepali Vora},
	title = {Unsupervised Text Classification and Search using Word Embeddings on a Self-Organizing Map},
	journal = {International Journal of Computer Applications},
	issue_date = {December 2016},
	volume = {156},
	number = {11},
	month = {Dec},
	year = {2016},
	issn = {0975-8887},
	pages = {35-37},
	numpages = {3},
	url = {},
	doi = {10.5120/ijca2016912570},
	publisher = {Foundation of Computer Science (FCS), NY, USA},
	address = {New York, USA}


This paper presents the results of an experimental implementation of a document classifier that clusters contextual word embeddings on a self-organizing map. Document categorization becomes harder when there are no predefined categories or, conversely, when there are too many categories into which documents may be bucketed. This paper proposes to address both problems by modelling the major themes of the document corpus as a cluster-map, using a self-organizing neural network. The cluster-map provides a visual representation for exploring the corpus and a near-semantic search interface over the many concepts outlined across it.
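
The general idea in the abstract, clustering word vectors on a self-organizing map and then querying the map, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the map size, learning schedule, or embedding model, so every parameter below is an assumption. The function names (`train_som`, `build_index`, `search`) are hypothetical, and the toy vectors are synthetic stand-ins for pretrained word embeddings.

```python
import numpy as np


def best_matching_unit(weights, vector):
    """Return the (row, col) of the map node whose weight is closest to `vector`."""
    distances = np.linalg.norm(weights - vector, axis=-1)
    return tuple(int(i) for i in np.unravel_index(np.argmin(distances), distances.shape))


def train_som(vectors, rows=4, cols=4, epochs=200, lr0=0.5, sigma0=2.0, seed=0):
    """Fit a small SOM; returns node weights of shape (rows, cols, dim).

    All defaults are illustrative assumptions, not values from the paper.
    """
    rng = np.random.default_rng(seed)
    weights = rng.normal(scale=0.1, size=(rows, cols, vectors.shape[1]))
    # Grid coordinates of every node, used by the neighbourhood function.
    grid = np.stack(np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij"), axis=-1)
    for epoch in range(epochs):
        decay = 1.0 - epoch / epochs
        lr = lr0 * decay               # learning rate shrinks linearly
        sigma = sigma0 * decay + 0.5   # neighbourhood radius shrinks but stays positive
        for vector in vectors[rng.permutation(len(vectors))]:
            bmu = np.array(best_matching_unit(weights, vector))
            grid_dist2 = np.sum((grid - bmu) ** 2, axis=-1)
            influence = np.exp(-grid_dist2 / (2.0 * sigma ** 2))
            # Pull the winning node and its grid neighbours towards the sample.
            weights += lr * influence[..., None] * (vector - weights)
    return weights


def build_index(weights, words, vectors):
    """Map each node (row, col) to the list of words that land on it."""
    index = {}
    for word, vector in zip(words, vectors):
        index.setdefault(best_matching_unit(weights, vector), []).append(word)
    return index


def search(weights, index, query_vector):
    """Return the words stored on the node that best matches the query."""
    return index.get(best_matching_unit(weights, query_vector), [])


# Synthetic 8-dimensional "embeddings": two well-separated themes (assumption).
rng = np.random.default_rng(42)
sports = ["match", "goal", "team", "coach"]
finance = ["bank", "loan", "stock", "bond"]
words = sports + finance
centres = np.repeat([[2.0] * 8, [-2.0] * 8], 4, axis=0)
vectors = centres + rng.normal(scale=0.1, size=centres.shape)

weights = train_som(vectors)
index = build_index(weights, words, vectors)
hits = search(weights, index, vectors[words.index("goal")])
print(hits)
```

In this sketch, a query is answered by looking up only the single best-matching node. A real system would more plausibly also inspect neighbouring nodes on the grid, since nearby nodes on a self-organizing map tend to hold related vectors.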




Clustering, knowledge retrieval, natural language processing, neural nets, self-organizing map, topic modelling, semantic search, unsupervised.