Toward Mitigating Adversarial Texts

Basemah Alshemali; Jugal Kalita


International Journal of Computer Applications

Foundation of Computer Science (FCS), NY, USA

Volume 178 - Number 50

Year of Publication: 2019

Authors: Basemah Alshemali, Jugal Kalita

10.5120/ijca2019919384

Basemah Alshemali, Jugal Kalita. Toward Mitigating Adversarial Texts. International Journal of Computer Applications. 178, 50 (Sep 2019), 1-7. DOI=10.5120/ijca2019919384

@article{10.5120/ijca2019919384,
  author = {Basemah Alshemali and Jugal Kalita},
  title = {Toward Mitigating Adversarial Texts},
  journal = {International Journal of Computer Applications},
  issue_date = {Sep 2019},
  volume = {178},
  number = {50},
  month = {Sep},
  year = {2019},
  issn = {0975-8887},
  pages = {1-7},
  numpages = {9},
  url = {https://ijcaonline.org/archives/volume178/number50/30888-2019919384/},
  doi = {10.5120/ijca2019919384},
  publisher = {Foundation of Computer Science (FCS), NY, USA},
  address = {New York, USA}
}

Abstract

Neural networks are frequently used for text classification, but they can be vulnerable to adversarial examples: inputs produced by introducing small perturbations that cause the network to output an incorrect classification. Previous black-box attacks on text have generated adversarial examples through nonword misspellings, natural noise, synthetic noise, and lexical substitutions. This paper proposes a defense against black-box adversarial attacks based on a spell-checking system that uses frequency and contextual information to correct nonword misspellings. The proposed defense is evaluated on the Yelp Reviews Polarity and Yelp Reviews Full datasets using adversarial texts generated by a variety of recent attacks. After detecting and recovering the adversarial texts, the proposed defense increases classification accuracy by an average of 26.56% on the Yelp Reviews Polarity dataset and 16.27% on the Yelp Reviews Full dataset. The approach also outperforms six publicly available, state-of-the-art spelling correction tools by at least 25.56% in terms of average correction accuracy.
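The abstract does not describe the internals of the spell checker. As a rough illustration of how frequency and contextual information can be combined to correct nonword misspellings, the following Python sketch generates candidates one edit away from an unknown token and ranks them by how often they follow the previous word, then by overall frequency. This is not the authors' system: the toy corpus, the function names `edits1` and `correct_tokens`, the single-edit candidate set, and the ranking rule are all assumptions made for the example.

```python
from collections import Counter

ALPHABET = "abcdefghijklmnopqrstuvwxyz"


def edits1(word):
    """Return all strings one delete, transpose, replace, or insert away from word."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [a + b[1:] for a, b in splits if b]
    transposes = [a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1]
    replaces = [a + c + b[1:] for a, b in splits if b for c in ALPHABET]
    inserts = [a + c + b for a, b in splits for c in ALPHABET]
    return set(deletes + transposes + replaces + inserts)


def correct_tokens(tokens, unigrams, bigrams):
    """Replace each nonword with the best-scoring in-vocabulary candidate.

    unigrams: Counter of word frequencies (assumed vocabulary).
    bigrams:  Counter of (previous_word, word) frequencies (assumed context model).
    """
    fixed = []
    for tok in tokens:
        if tok in unigrams:
            # Known word: leave it unchanged.
            fixed.append(tok)
            continue
        # Sort so that ties are broken deterministically.
        candidates = sorted(c for c in edits1(tok) if c in unigrams)
        if not candidates:
            # Nothing plausible within one edit: keep the token as-is.
            fixed.append(tok)
            continue
        prev = fixed[-1] if fixed else None
        # Rank by context (bigram count) first, then by word frequency.
        best = max(candidates, key=lambda c: (bigrams[(prev, c)], unigrams[c]))
        fixed.append(best)
    return fixed


# Toy statistics standing in for a real corpus (hypothetical data).
corpus = "the food was good and the service was great the food was great".split()
unigrams = Counter(corpus)
bigrams = Counter(zip(corpus, corpus[1:]))

print(correct_tokens("the fod was graet".split(), unigrams, bigrams))
```

In a defense of the kind the abstract describes, a correction step like this would run on each input before it reaches the classifier, so that character-level perturbations are mapped back to in-vocabulary words.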


Index Terms

Computer Science

Information Sciences

Keywords

Adversarial text, Spelling correction