#AkuGalau: An Indonesian Corpus for Emotion Detection from Text


  • Julius Bata


Emotion detection, Emotion detection from text, Indonesian emotion corpus, Emotion hashtags


Detecting emotions from text is a text classification problem in which each text is assigned to a type of emotion. The availability of an emotion corpus plays an essential role in this task. However, most corpora for emotion detection are available in English. This is a major obstacle when developing a system for detecting emotions from Indonesian texts, because emotion text corpora for Indonesian are very limited. This research therefore focuses on developing an Indonesian emotion text corpus, as a first step in the study of detecting emotions from Indonesian text. The corpus is built from tweets. Annotation is done automatically based on the emotion hashtags (#) contained in a tweet, with five types of emotions: happy, sad, angry, afraid, and love. The research produced an Indonesian emotion text corpus consisting of 500 complete tweets with emotion labels at the superordinate and basic levels. Emotion detection experiments using the Naive Bayes method were conducted to test the corpus. The accuracy of the experiments reached 82%, which indicates that the corpus can be used for detecting emotions from text.
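The hashtag-based annotation described in the abstract can be illustrated with a minimal Python sketch. It is not the author's implementation: the hashtag-to-emotion map, the function name `label_tweet`, and the rule of discarding tweets with no or conflicting emotion hashtags are all assumptions made for illustration. Only the five emotion labels come from the abstract.

```python
import re

# Hypothetical hashtag-to-emotion map (an assumption; the actual hashtag
# list used to build the corpus is not given in the abstract).
HASHTAG_TO_EMOTION = {
    "#senang": "happy",
    "#sedih": "sad",
    "#marah": "angry",
    "#takut": "afraid",
    "#cinta": "love",
}

HASHTAG_PATTERN = re.compile(r"#\w+")


def label_tweet(text):
    """Return (cleaned_text, emotion), or None when the tweet carries
    no emotion hashtag or hashtags for more than one emotion."""
    tags = [tag.lower() for tag in HASHTAG_PATTERN.findall(text)]
    emotions = {HASHTAG_TO_EMOTION[t] for t in tags if t in HASHTAG_TO_EMOTION}
    if len(emotions) != 1:
        return None  # assumed policy: skip unlabeled or ambiguous tweets

    def drop_emotion_tag(match):
        # Remove only emotion hashtags, so the label does not leak
        # into the text a classifier would later be trained on.
        tag = match.group()
        return "" if tag.lower() in HASHTAG_TO_EMOTION else tag

    cleaned = " ".join(HASHTAG_PATTERN.sub(drop_emotion_tag, text).split())
    return cleaned, emotions.pop()


if __name__ == "__main__":
    for tweet in ["Akhirnya lulus juga #senang",
                  "Hari ini hujan",
                  "#senang tapi #sedih"]:
        print(tweet, "->", label_tweet(tweet))
```

Stripping the emotion hashtag from the text is a design choice in this sketch: if it stayed in, a classifier such as Naive Bayes could reach high accuracy simply by memorizing the label token.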




