| International Journal of Computer Applications |
| Foundation of Computer Science (FCS), NY, USA |
| Volume 187 - Number 115 |
| Year of Publication: 2026 |
| Authors: Ghaida Alotaibi, Rawan Alghamdi, Shahd Aldokhi, Majd Aldossary, Majd Aldossary, Hadeel Alotaibi, Areej Buker |
| DOI: 10.5120/ijca4f5f7241e5ed |
Ghaida Alotaibi, Rawan Alghamdi, Shahd Aldokhi, Majd Aldossary, Majd Aldossary, Hadeel Alotaibi, Areej Buker. Tharaa: An Integrated Mobile System for Personal Finance Management with OCR, Predictive Analytics, and Intelligent Guidance. International Journal of Computer Applications. 187, 115 (Jun 2026), 14-22. DOI=10.5120/ijca4f5f7241e5ed
Personal financial management is increasingly challenging for individual users because daily spending is fragmented across physical payments, online transactions, recurring subscriptions, and installment plans. This paper presents Tharaa, a unified personal-finance mobile system that consolidates expense and income tracking, budget and goal management, subscription and installment monitoring, on-device Optical Character Recognition (OCR) for receipt capture, machine-learning-driven predictive analytics, and an AI-powered finance coach into a single Flutter mobile application backed by a Firestore cloud database. The system embeds three predictive models: Logistic Regression for end-of-month overspending classification, Lasso regression for per-category spending forecasting, and Isolation Forest for anomaly detection. All three run on-device using exported coefficients, so no transactional data leaves the user's phone for prediction. We evaluate the three models on two datasets. The first is a controlled synthetic dataset spanning four behavioral personas (lifestyle overspender, installment-heavy, income-stable saver, subscription-driven). The second is the publicly available Daily Household Transactions dataset from Kaggle (2,461 rows, 45 calendar months, January 2015 – September 2018). The overspending classifier achieves 96.0% accuracy with a precision of 1.00 on the synthetic test set, and a macro-averaged recall of 0.82 on the Kaggle test split despite extreme class imbalance (91.7%/8.3%). Category forecasts produced by Lasso track actual spending closely (Mean Absolute Error of SAR 222.76 for the Food category on the synthetic dataset). The Isolation Forest detects an overall anomaly rate of 8.02% on the Kaggle dataset, in close agreement with the 8.02% observed across the synthetic evaluation pool.
The results confirm that the integrated approach is feasible across both controlled and real-world settings, and that on-device inference preserves user privacy without sacrificing predictive performance.
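To illustrate what running a model "on-device using exported coefficients" can look like, the following is a minimal sketch of logistic-regression inference from a small set of exported parameters, written in Python for readability. It is not the Tharaa implementation: the feature names, the standardization step, the parameter values, and the function name `predict_overspend_probability` are all assumptions made for illustration only.

```python
import math
from typing import Sequence

# Hypothetical exported parameters (illustrative values, NOT from the paper).
# Assumed features: [share of monthly budget already spent, spend to date in SAR].
EXPORTED = {
    "means": [0.5, 3000.0],        # assumed per-feature training means
    "scales": [0.25, 1000.0],      # assumed per-feature standard deviations
    "coefficients": [2.0, 1.0],    # assumed logistic-regression weights
    "intercept": -1.0,             # assumed bias term
}


def predict_overspend_probability(
    features: Sequence[float],
    means: Sequence[float],
    scales: Sequence[float],
    coefficients: Sequence[float],
    intercept: float,
) -> float:
    """Return P(overspend) from standardized features and exported weights."""
    if not (len(features) == len(means) == len(scales) == len(coefficients)):
        raise ValueError("feature and parameter lengths must match")

    # Linear score on standardized inputs: z = b + sum(w_i * (x_i - mu_i) / sigma_i)
    z = intercept
    for x, mu, sigma, w in zip(features, means, scales, coefficients):
        z += w * (x - mu) / sigma

    # Numerically stable sigmoid: avoids overflow in exp() for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


if __name__ == "__main__":
    p = predict_overspend_probability([0.75, 4000.0], **EXPORTED)
    print(f"Estimated overspending probability: {p:.3f}")
```

Because inference reduces to a dot product and a sigmoid, the same arithmetic can be ported to any mobile runtime with only the exported numbers shipped in the app, which is consistent with the claim that no transactional data needs to leave the phone for prediction.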