Research Article

Vendor-Agnostic Invoice Processing Framework: Integrating OCR, Canonical Modeling, and Human-in-the-Loop Validation

by Homi Dhumal, Harsh Dixit, Manav Shah
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 187 - Number 90
Year of Publication: 2026
Authors: Homi Dhumal, Harsh Dixit, Manav Shah
10.5120/ijca2026926589

Homi Dhumal, Harsh Dixit, Manav Shah. Vendor-Agnostic Invoice Processing Framework: Integrating OCR, Canonical Modeling, and Human-in-the-Loop Validation. International Journal of Computer Applications. 187, 90 (Mar 2026), 30-35. DOI=10.5120/ijca2026926589

@article{ 10.5120/ijca2026926589,
author = { Homi Dhumal and Harsh Dixit and Manav Shah },
title = { Vendor-Agnostic Invoice Processing Framework: Integrating OCR, Canonical Modeling, and Human-in-the-Loop Validation },
journal = { International Journal of Computer Applications },
issue_date = { Mar 2026 },
volume = { 187 },
number = { 90 },
month = { Mar },
year = { 2026 },
issn = { 0975-8887 },
pages = { 30-35 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume187/number90/vendor-agnostic-invoice-processing-framework-integrating-ocr-canonical-modeling-and-human-in-the-loop-validation/ },
doi = { 10.5120/ijca2026926589 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2026-03-20T22:55:35.604969+05:30
%A Homi Dhumal
%A Harsh Dixit
%A Manav Shah
%T Vendor-Agnostic Invoice Processing Framework: Integrating OCR, Canonical Modeling, and Human-in-the-Loop Validation
%J International Journal of Computer Applications
%@ 0975-8887
%V 187
%N 90
%P 30-35
%D 2026
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Automated invoice processing still faces unresolved difficulties arising from non-standardized formats, optical character recognition (OCR) inaccuracy, and the need to maintain financial integrity and audit compliance. Current academic and commercial solutions do not fully address these issues, and an integrated approach that ensures numerical accuracy, regulatory compliance, and auditability has not yet been developed. To address this gap, this paper proposes a validation-based invoice processing pipeline that combines OCR extraction, canonical data modelling, and carefully organized human-in-the-loop validation controls. The pipeline normalizes the extracted fields to a vendor-neutral schema to ensure seamless interoperability with enterprise resource planning (ERP) systems, and it imposes arithmetic and accounting validation constraints. Experimental evaluation showed improved retrieval of financial information and fewer numerical inconsistencies caused by OCR errors.
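
The following is a minimal sketch of the kind of pipeline step the abstract describes, not the authors' implementation. It assumes a small hypothetical canonical schema (`CanonicalInvoice`), a hypothetical alias table (`FIELD_ALIASES`) that maps vendor-specific labels to canonical field names, and an assumed rounding tolerance of 0.01. Two arithmetic constraints are checked, and any invoice that violates one is routed to a human reviewer rather than posted automatically.

```python
from dataclasses import dataclass
from decimal import Decimal

# Hypothetical vendor-specific labels mapped to canonical field names.
FIELD_ALIASES = {
    "Inv No": "invoice_number",
    "Invoice #": "invoice_number",
    "Sub Total": "subtotal",
    "Net Amount": "subtotal",
    "GST": "tax",
    "VAT": "tax",
    "Grand Total": "total",
    "Amount Due": "total",
}

TOLERANCE = Decimal("0.01")  # assumed rounding tolerance, not from the paper


@dataclass
class CanonicalInvoice:
    """Vendor-neutral representation of the extracted fields (illustrative)."""
    invoice_number: str
    line_amounts: list[Decimal]
    subtotal: Decimal
    tax: Decimal
    total: Decimal


def normalize(raw_fields: dict[str, str], line_amounts: list[str]) -> CanonicalInvoice:
    """Map vendor-specific OCR labels onto the canonical schema."""
    canonical = {FIELD_ALIASES[k]: v for k, v in raw_fields.items() if k in FIELD_ALIASES}
    return CanonicalInvoice(
        invoice_number=canonical["invoice_number"],
        line_amounts=[Decimal(a) for a in line_amounts],
        subtotal=Decimal(canonical["subtotal"]),
        tax=Decimal(canonical["tax"]),
        total=Decimal(canonical["total"]),
    )


def validate(inv: CanonicalInvoice) -> list[str]:
    """Return the list of arithmetic constraints the invoice violates."""
    issues = []
    if abs(sum(inv.line_amounts, Decimal("0")) - inv.subtotal) > TOLERANCE:
        issues.append("subtotal does not equal the sum of line amounts")
    if abs(inv.subtotal + inv.tax - inv.total) > TOLERANCE:
        issues.append("total does not equal subtotal plus tax")
    return issues


def route(inv: CanonicalInvoice) -> str:
    """Send invoices with any violation to a human reviewer."""
    return "human_review" if validate(inv) else "auto_post"


# Two vendors with different labels; the second total was misread by OCR.
ok = normalize(
    {"Invoice #": "A-17", "Sub Total": "100.00", "GST": "18.00", "Grand Total": "118.00"},
    ["60.00", "40.00"],
)
bad = normalize(
    {"Inv No": "B-02", "Net Amount": "100.00", "VAT": "18.00", "Amount Due": "178.00"},
    ["60.00", "40.00"],
)
print(route(ok), route(bad))
```

Using `Decimal` instead of floating point keeps currency arithmetic exact, so a flagged mismatch reflects an extraction error and not a rounding artifact.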

Index Terms

Computer Science
Information Sciences

Keywords

Document Processing (IDP), Invoice Automation, OCR, Canonical Data Format, Accounting Rules, Human-in-the-loop, Enterprise Resource Planning (ERP)