Construction and Empirical Research of Accounting Information Quality Assessment Model Based on Big Data
Abstract
This paper proposes a big-data-based model for evaluating accounting information quality, motivated by the need for accurate assessment of financial data in a complex economic environment. We collect and preprocess data from multiple financial databases, covering both structured and unstructured information from listed companies' annual reports, financial statements, and audit reports. To identify the most relevant variables in this large data set, we perform feature selection using correlation analysis and the Random Forest algorithm. The model takes a hybrid approach that integrates Support Vector Machine (SVM), Random Forest (RF), and Gradient Boosting Machine (GBM) algorithms to improve predictive accuracy. We evaluate performance using accuracy, precision, recall, and F1-score. In an empirical study of 500 listed companies over five years, the Random Forest model performs best, with an accuracy of 88.7% and an F1-score of 88.7%. Statistical analysis confirms significant performance differences among the models, and sensitivity analysis shows that key parameters influence model outcomes. The study contributes to the development of robust accounting information quality evaluation models built on big data and machine learning techniques.
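For readers unfamiliar with the four evaluation metrics named in the abstract, the following minimal sketch shows how they follow from a binary confusion matrix. It is an illustration only, not the authors' implementation: the function name `classification_metrics`, the label encoding (1 for high-quality accounting information, 0 for low), and the sample labels are all assumptions made for this example and do not come from the study's data.

```python
from collections import Counter


def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F1 for binary labels.

    Hypothetical helper for illustration; `positive` marks the class
    treated as the positive outcome.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")

    # Count (actual is positive, predicted is positive) pairs.
    counts = Counter((t == positive, p == positive) for t, p in zip(y_true, y_pred))
    tp = counts[(True, True)]    # true positives
    fp = counts[(False, True)]   # false positives
    fn = counts[(True, False)]   # false negatives
    tn = counts[(False, False)]  # true negatives

    accuracy = (tp + tn) / len(y_true)
    # Fall back to 0.0 when a denominator is zero (no predicted or actual positives).
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Made-up labels for ten firm-year observations (assumption, not study data).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1, 1, 0]

metrics = classification_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because F1 is the harmonic mean of precision and recall, it coincides with accuracy whenever false positives and false negatives are balanced, as in this toy example; this is one way an accuracy and an F1-score can take the same value.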