Emotion Analysis in Song Lyrics Using Natural Language Processing
Abstract
Music is a universal medium for expressing emotion, and song lyrics are its narrative component, rich in affective content. This study analyzes the emotional landscape of popular English-language song lyrics collected from the Spotify platform and examines how effectively Natural Language Processing (NLP) approaches classify those emotions. The corpus consists of 57,494 song lyrics sampled at random without genre restrictions. The analytical pipeline comprises text preprocessing (case folding, normalization, cleaning, tokenization, filtering, and stemming), emotion labeling with a custom lexicon, TF-IDF feature extraction, and classification with a Random Forest model. The study yields two key findings. Empirically, the lyrics are dominated by positive emotions: romantic (36.2%) and happy (26.2%) are the main themes, followed by sad (16.3%), while angry (4.4%) is the least frequent class, which indicates substantial class imbalance. Methodologically, the proposed model achieves an overall accuracy of 83.03% and a weighted-average F1 score of 0.82. However, the confusion matrix and classification report reveal performance disparities across emotion classes: angry and energetic have low recall (42% and 62%, respectively), likely because of the imbalanced class distribution and the lexicon's limited ability to capture context. In conclusion, the study maps the dominance of love- and happiness-related themes in popular song lyrics and shows that classical NLP models can achieve competitive performance. The findings also highlight the need for future research to address class imbalance and to develop more context-rich emotion lexicons or employ deep learning models, so that the emotional spectrum of lyrics is captured more evenly and comprehensively.
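The lexicon-based labeling step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the mini-lexicon, the stopword list, the function names, and the `neutral` fallback label are all hypothetical, and normalization, stemming, TF-IDF extraction, and the Random Forest classifier are omitted for brevity.

```python
import re
from collections import Counter

# Hypothetical mini-lexicon for illustration only; the study's custom
# lexicon is not reproduced here.
EMOTION_LEXICON = {
    "romantic": {"love", "kiss", "heart", "darling"},
    "happy": {"happy", "smile", "joy", "sunshine"},
    "sad": {"cry", "tear", "lonely", "goodbye"},
    "angry": {"hate", "rage", "fight", "burn"},
    "energetic": {"dance", "jump", "run", "party"},
}

# Hypothetical stopword list used for the filtering step.
STOPWORDS = {"a", "an", "and", "i", "in", "it", "me", "my",
             "of", "the", "to", "you", "your"}


def preprocess(lyric):
    """Apply case folding, cleaning, tokenization, and stopword filtering."""
    text = lyric.lower()                    # case folding
    text = re.sub(r"[^a-z\s]", " ", text)   # cleaning: keep letters and whitespace
    tokens = text.split()                   # whitespace tokenization
    return [t for t in tokens if t not in STOPWORDS]  # filtering


def label_emotion(lyric, default="neutral"):
    """Return the emotion whose lexicon words occur most often in the lyric.

    Ties go to the emotion listed first in EMOTION_LEXICON. The `default`
    label for lyrics with no lexicon hits is an assumption of this sketch.
    """
    tokens = preprocess(lyric)
    counts = Counter()
    for emotion, words in EMOTION_LEXICON.items():
        counts[emotion] = sum(1 for t in tokens if t in words)
    emotion, hits = counts.most_common(1)[0]
    return emotion if hits > 0 else default
```

Under these assumptions, `label_emotion("I love your kiss, my darling")` counts three hits for the romantic word set and none for the others. Because the match is purely word-level, a line such as "I hate that I love you" would be scored by counts alone, which is one way a lexicon can miss context as the abstract notes.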
This work is licensed under a Creative Commons Attribution 4.0 International License.