Sentiment Analysis and Model Construction of Literary Works Based on Big Data

JINLING YAN

Abstract

This paper investigates emotion analysis of literary works and the construction of a corresponding classification model using big data and machine learning techniques. Motivated by the growing interest in the emotional nuances of literature, we address the challenge of capturing and analyzing these subtleties accurately in large textual datasets. We collect and preprocess a diverse corpus of literary texts, then apply text segmentation, tokenization, and feature extraction. The model's input features include Bag-of-Words (BoW) counts, Term Frequency-Inverse Document Frequency (TF-IDF) weights, and sentiment scores. A Support Vector Machine (SVM) classifier serves as the emotion analysis model, which we evaluate using accuracy, precision, recall, and F1-score. The model achieves an accuracy of 0.85 with balanced performance across the other metrics, which supports its reliability. Analysis of the emotion distribution shows that happiness and sadness are the predominant emotions in the corpus, and feature importance analysis identifies BoW as the largest contributor to model performance. We conclude by discussing the model's effectiveness and its broader implications for digital humanities, and we propose directions for future research.
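
As an illustration of two steps the abstract names, the following minimal Python sketch computes BoW and TF-IDF features and the four evaluation metrics. It is not the authors' implementation: the naive tokenizer, the unsmoothed IDF formula, macro averaging, and all function names are assumptions made for this example, and the SVM training step is omitted.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Assumption: a naive lowercase word tokenizer; the paper's own
    # segmentation and tokenization procedure is not specified here.
    return re.findall(r"[a-z']+", text.lower())


def bag_of_words(docs):
    # One term-count mapping per document.
    return [Counter(tokenize(doc)) for doc in docs]


def tf_idf(docs):
    # Assumption: tf = count / document length, idf = ln(N / df), no smoothing.
    bows = bag_of_words(docs)
    n_docs = len(docs)
    doc_freq = Counter()
    for bow in bows:
        doc_freq.update(bow.keys())  # each term counted once per document
    vectors = []
    for bow in bows:
        length = sum(bow.values())
        vectors.append({
            term: (count / length) * math.log(n_docs / doc_freq[term])
            for term, count in bow.items()
        })
    return vectors


def macro_scores(y_true, y_pred):
    # Accuracy plus macro-averaged precision, recall, and F1 over all labels.
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    precisions, recalls, f1s = [], [], []
    for label in labels:
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    return {
        "accuracy": sum(1 for t, p in pairs if t == p) / len(pairs),
        "precision": sum(precisions) / len(labels),
        "recall": sum(recalls) / len(labels),
        "f1": sum(f1s) / len(labels),
    }


if __name__ == "__main__":
    # Hypothetical two-sentence corpus and labels, for illustration only.
    docs = ["Joy and joy", "Sorrow and grief"]
    print(tf_idf(docs))
    print(macro_scores(["happiness", "sadness", "happiness", "sadness"],
                       ["happiness", "sadness", "sadness", "sadness"]))
```

In this sketch a term that occurs in every document, such as "and" in the toy corpus, receives a TF-IDF weight of zero, which is the usual motivation for combining TF-IDF with raw BoW counts. In practice the resulting feature vectors would be passed to an SVM implementation from an established library.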
