Application of Machine Learning for Fraud Detection in Corporate Annual Financial Reports
DOI:
https://doi.org/10.70062/jiafc.v1i2.201
Keywords:
Auditing, Financial Fraud Detection, Machine Learning, Neural Network, Supervised Learning
Abstract
This study investigates the application of supervised learning algorithms to detecting financial statement fraud in corporate annual reports. Financial reporting fraud remains a critical challenge for auditors and regulators because traditional detection methods, such as ratio analysis and manual auditing, often fail to identify complex anomalies in large datasets. The research evaluates how effectively several machine learning algorithms improve the accuracy and reliability of fraud detection. A dataset of 500 annual financial statements from 2020 to 2024, including 50 identified cases of potential fraud, was preprocessed through data cleaning, normalization, and labeling. The algorithms tested were Decision Tree, Support Vector Machine (SVM), Naïve Bayes, K-Nearest Neighbors (K-NN), and Neural Network. The Neural Network achieved the highest accuracy (94.5%), followed by SVM (91.6%), while simpler algorithms such as Naïve Bayes and K-NN showed moderate performance. Comparative analysis indicates that ensemble and deep learning models are better able to capture complex patterns in financial data, giving them a significant advantage over traditional methods. The findings suggest that integrating machine learning into auditing practice can enhance the detection of fraudulent activity, improve decision-making, and increase the reliability of audit outcomes. The research underscores the importance of combining advanced computational techniques with professional auditor oversight to ensure accuracy, transparency, and accountability in financial reporting.
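The class balance reported in the abstract (50 potential fraud cases among 500 statements) gives useful context for the accuracy figures. The short sketch below works through the arithmetic: a trivial model that labels every statement as non-fraudulent would already score 90% accuracy, so precision and recall on the fraud class are worth reading alongside accuracy. The function names and the confusion-matrix counts are hypothetical, chosen only for illustration; they are not taken from the study.

```python
from typing import Tuple

# Class counts taken from the abstract: 500 statements, 50 flagged as potential fraud.
TOTAL_STATEMENTS = 500
FRAUD_CASES = 50


def majority_baseline_accuracy(total: int, positives: int) -> float:
    """Accuracy of a trivial model that assigns every statement to the majority class."""
    negatives = total - positives
    return max(positives, negatives) / total


def precision_recall(tp: int, fp: int, fn: int) -> Tuple[float, float]:
    """Precision and recall for the fraud class from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


baseline = majority_baseline_accuracy(TOTAL_STATEMENTS, FRAUD_CASES)
print(f"Majority-class baseline accuracy: {baseline:.1%}")

# Hypothetical confusion-matrix counts for the fraud class, used only to show
# the calculation. They are NOT results reported by the study.
hypothetical_tp, hypothetical_fp, hypothetical_fn = 30, 8, 20
p, r = precision_recall(hypothetical_tp, hypothetical_fp, hypothetical_fn)
print(f"Hypothetical precision: {p:.2f}, recall: {r:.2f}")
```

Read this way, the reported accuracies are best compared against the 90% baseline rather than against 50%, and a per-class breakdown would show how many of the 50 flagged statements each algorithm actually recovers.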
License
Copyright (c) 2025 Journal of Investigative Auditing & Financial Crime

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.


