Journal of Data and Information Science  2019, Vol. 4 Issue (4): 56-83    DOI: 10.2478/jdis-2019-0021
Research Paper     
Identification of Sarcasm in Textual Data: A Comparative Study
Pulkit Mehndiratta, Devpriya Soni
Jaypee Institute of Information Technology, Noida, India

Abstract  

Purpose: The ever-increasing penetration of the Internet into our lives has led to an enormous amount of multimedia content being generated online, and textual data contributes a major share of it. Understanding people’s sentiment is an important aspect of natural language processing, but the expressed opinion can be misleading if people use sarcasm while commenting, posting status updates, or reviewing a product or a movie. Thus, it is of utmost importance to detect sarcasm correctly and make accurate predictions about people’s intentions.

Design/methodology/approach: This study evaluates various machine learning models along with standard and hybrid deep learning models across several standardized datasets. We vectorize the text using word embedding techniques to convert the textual data into vectors for analysis. We use three standardized datasets available in the public domain and three word embeddings, i.e., Word2Vec, GloVe, and fastText, to validate the hypothesis.
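To make the vectorization step concrete, the following Python sketch shows one common way to turn tokenized text into padded index sequences and to build a 300-dimensional embedding matrix from pre-trained Word2Vec vectors (GloVe or fastText vectors can be substituted). This is an illustrative sketch only: the vector file, the toy sentences, and the maximum sequence length are assumptions, not values taken from this paper.

import numpy as np
from gensim.models import KeyedVectors
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

EMBED_DIM = 300   # embedding dimension used in this study (see Table 1)
MAX_LEN = 50      # assumed maximum sequence length

texts = ["yeah right, that was a great idea",
         "the weather is nice today"]

# Pre-trained Word2Vec vectors; this file name is a common public release,
# used here only as a placeholder.
w2v = KeyedVectors.load_word2vec_format("GoogleNews-vectors-negative300.bin",
                                        binary=True)

# Map words to integer indices and pad every text to the same length.
tokenizer = Tokenizer()
tokenizer.fit_on_texts(texts)
sequences = pad_sequences(tokenizer.texts_to_sequences(texts), maxlen=MAX_LEN)

# Embedding matrix: row i holds the 300-d vector for the word with index i.
embedding_matrix = np.zeros((len(tokenizer.word_index) + 1, EMBED_DIM))
for word, idx in tokenizer.word_index.items():
    if word in w2v:
        embedding_matrix[idx] = w2v[word]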

Findings: The results were analyzed and conclusions drawn. The key finding is that hybrid models combining Bidirectional Long Short-Term Memory (Bi-LSTM) and Convolutional Neural Network (CNN) layers outperform both conventional machine learning and standalone deep learning models across all the datasets considered in this study, validating our hypothesis.

Research limitations: Using data from different sources and customizing the models for each dataset slightly reduces the generality of the technique. Overall, however, the methodology provides an effective means of identifying the presence of sarcasm, with a minimum average accuracy of 80% or above on one dataset and results better than the current baselines on the other datasets.

Practical implications: The results provide solid insights for system developers who wish to integrate this model into real-time analysis of reviews or comments posted in the public domain. The study also has practical implications for businesses that depend on user ratings and public opinion, and it provides a launching platform for researchers working on the problem of sarcasm identification in textual data.

Originality/value: This is a first-of-its-kind study comparing conventional and hybrid methods for predicting sarcasm in textual data. It also provides evidence that hybrid models perform better when applied to textual data for sarcasm analysis.



Keywords: Machine learning; Artificial neural networks; Word embedding; Text vectorization; Accuracy
Received: 06 September 2019      Published: 19 December 2019
Corresponding Author: Pulkit Mehndiratta     E-mail: pulkit.mehndiratta@jiit.ac.in
Cite this article:

Pulkit Mehndiratta, Devpriya Soni. Identification of Sarcasm in Textual Data: A Comparative Study. Journal of Data and Information Science, 2019, 4(4): 56-83.

URL:

http://manu47.magtech.com.cn/Jwk3_jdis/10.2478/jdis-2019-0021     OR     http://manu47.magtech.com.cn/Jwk3_jdis/Y2019/V4/I4/56

Figure 1. Snippet of the news headlines dataset for sarcasm detection.
Figure 2. Graph to show number of sarcastic and non-sarcastic labels.
Figure 3. Word cloud of sarcastic words.
Figure 4. Word cloud of non-sarcastic words.
Figure 5. Snippet of the sarcasm corpus V2.
Figure 6. Word cloud of various ACL dataset comments.
Figure 7. Snippet of the ACL irony dataset.
Figure 8. Representation of words as vectors in space.
Figure 9. Typical CNN applied on textual data.
Figure 10. Structure of LSTM module.
Figure 11. Network before and after applying the dropout.
Figure 12. The system architecture for shallow machine learning algorithms.
Figure 13. Results from shallow machine learning models.
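Figures 12 and 13 describe the shallow machine learning pipeline and its results. As an illustration of what such a pipeline typically looks like, the sketch below feeds TF-IDF features to two classical classifiers; the choice of classifiers (logistic regression and a linear SVM) and the toy data are assumptions made only for illustration, since the captions above do not enumerate the exact models used.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.svm import LinearSVC
from sklearn.pipeline import make_pipeline

# Toy examples; 1 = sarcastic, 0 = non-sarcastic.
texts = ["oh great, another meeting at 7 am",
         "the quarterly report is attached",
         "sure, because that worked so well last time",
         "the train departs at noon"]
labels = [1, 0, 1, 0]

# TF-IDF unigrams and bigrams feeding a classical classifier.
for clf in (LogisticRegression(max_iter=1000), LinearSVC()):
    pipe = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), clf)
    pipe.fit(texts, labels)
    print(type(clf).__name__, pipe.predict(["yeah right, totally believable"]))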
Parameter              Value
Filters                64
Kernel size            3
Embedding dimension    300
Epochs                 2, 4, 8, 16
Activation function    Sigmoid
Batch size             128
Word embedding         Word2Vec, GloVe, fastText
Pool size              2
Dropout                0.15, 0.25, 0.35 (ConvNet); 0.25 (Bi-LSTM)
Optimizer              Adam
Table 1. Parameter list for our models under training and testing.
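The parameters in Table 1 can be wired into a hybrid network roughly as follows. This is a minimal Keras sketch of the CNN followed by Bi-LSTM variant: the 64 filters, kernel size 3, pool size 2, 300-dimensional embeddings, 0.25 dropouts, sigmoid output, and Adam optimizer come from Table 1, while the vocabulary size, the number of LSTM units, and the ReLU activation inside the convolution layer are assumptions not stated there. The LSTM-CNN variant (Figure 16) simply swaps the order of the recurrent and convolutional blocks.

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (Embedding, Conv1D, MaxPooling1D,
                                     Bidirectional, LSTM, Dropout, Dense)

VOCAB_SIZE, EMBED_DIM = 20000, 300   # vocabulary size is an assumption

model = Sequential([
    Embedding(VOCAB_SIZE, EMBED_DIM),   # pre-trained vectors can be loaded into this layer
    Conv1D(filters=64, kernel_size=3, activation="relu"),
    MaxPooling1D(pool_size=2),
    Dropout(0.25),                      # one of the ConvNet dropout settings from Table 1
    Bidirectional(LSTM(100)),           # 100 units is an assumption
    Dropout(0.25),                      # Bi-LSTM dropout from Table 1
    Dense(1, activation="sigmoid"),     # sigmoid output for the binary sarcasm label
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
# model.fit(x_train, y_train, epochs=8, batch_size=128)
#   epochs and batch size per Table 1; x_train/y_train are hypothetical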
Figure 14. The system architecture with Deep Learning Models.
Figure 15. Plated framework for CNN-LSTM architecture.
Figure 16. Plated framework for LSTM-CNN architecture.
Word2Vec
Dropout      Epochs   CNN      LSTM     CNN-LSTM   LSTM-CNN
0.15         2        54.34    55.28    56.98      58.07
0.15         4        58.31    58.38    59.93      59.62
0.15         8        59.62    59.32    60.87      60.33
0.15         16       60.12    60.08    61.23      62.23
Avg (0.15)            58.097   58.265   59.752     60.062
Avg (0.25)            58.575   58.927   60.747     60.897
Avg (0.35)            59.292   59.327   60.545     61.132

GloVe
Avg (0.15)            58.73    58.83    58.15      60.53
Avg (0.25)            59.78    59.76    58.202     59.817
Avg (0.35)            59.29    59.43    57.882     59.922

fastText
Avg (0.15)            59.69    60.25    58.647     60.831
Avg (0.25)            59.26    58.87    59.99      60.32
Avg (0.35)            59.66    59.01    59.74      59.87

Table 2. Accuracy (%) obtained on the ACL 2014 Irony Dataset.
Word2Vec
Dropout      Epochs   CNN      LSTM     CNN-LSTM   LSTM-CNN
0.15         2        80.8     80.6     80.7       80.8
0.15         4        80.5     81       81.1       80.4
0.15         8        79.6     80.6     80.3       80.7
0.15         16       78.1     80.3     78.3       81.23
Avg (0.15)            79.75    80.63    80.1       80.7825
Avg (0.25)            79.9     80.85    80.075     80.865
Avg (0.35)            80.1     80.88    80.125     80.8825

GloVe
Avg (0.15)            81       81.18    81.025     81.275
Avg (0.25)            81.1     81.21    81.2       81.25
Avg (0.35)            81       81.56    81.175     81.6

fastText
Avg (0.15)            80.96    81.38    80.6125    80.65
Avg (0.25)            81.23    81.26    80.975     81.45
Avg (0.35)            81       81.06    81         81.075

Table 3. Accuracy (%) obtained on the News Headlines Dataset.
Word2Vec
Dropout      Epochs   CNN      LSTM     CNN-LSTM   LSTM-CNN
0.15         2        56.55    58.67    58.57      58.99
0.15         4        56.76    56.55    56.55      57.61
0.15         8        56.93    57.08    56.07      57.87
0.15         16       57.25    57.08    55.69      55.69
Avg (0.15)            56.873   57.345   56.72      57.54
Avg (0.25)            57.185   57.44    56.757     57.565
Avg (0.35)            57.033   57.238   56.17      57.267

GloVe
Avg (0.15)            58.637   59.167   58.74      58.94
Avg (0.25)            59.075   59.082   58.775     59.277
Avg (0.35)            59.127   58.952   58.91      59.225

fastText
Avg (0.15)            58.655   58.86    58.28      59.155
Avg (0.25)            59.205   59.085   58.285     59.497
Avg (0.35)            59.212   59.13    58.7       59.277

Table 4. Accuracy (%) obtained on the Sarcasm Corpus V2 Dataset.
Technique                                                               Accuracy
NBOW (Logistic Regression with Neural Words)                            0.724
NLSE (Non-Linear Subspace Embedding)                                    0.72
CNN (Convolutional Neural Network), Kim (2014)                          0.742
Shallow CUE-CNN (Context and User Embedding CNN), Amir et al. (2016)    0.793
Our proposed technique                                                  0.816
Table 5. Comparison for the News Headlines Sarcasm Dataset.
Features                                       Recall    Precision
Baseline (BoW), Wallace et al. (2015)          0.288     0.129
NNP (Noun Phrase)                              0.324     0.129
NNP + subreddit                                0.337     0.131
NNP + subreddit + sentiment                    0.373     0.132
Our proposed technique                         0.489     0.472
Table 6. Comparison on the ACL 2014 Irony Dataset.
Technique                                Type   Recall    Precision
Baseline (SVM), Oraby et al. (2017)      GEN    0.75      0.71
                                         RQ     0.73      0.70
                                         HYP    0.63      0.68
Our proposed technique                   GEN    0.72      0.73
                                         RQ     0.71      0.71
                                         HYP    0.68      0.68
Table 7. Comparison on the Sarcasm Corpus V2 (GEN: general sarcasm; RQ: rhetorical questions; HYP: hyperbole).
[1]   Amir S., Wallace B.C., Lyu H., & Silva P.C.M.J. (2016). Modelling context with user embeddings for sarcasm detection in social media. arXiv preprint arXiv:1607.00976.
[2]   Bamman, D., & Smith, N.A. (2015). Contextualized sarcasm detection on Twitter. In Proceedings of the Ninth International AAAI Conference on Web and Social Media.
[3]   Barbieri F., Saggion H., & Ronzano F. (2014). Modelling sarcasm in twitter, a novel approach. In Proceedings of the 5th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis, 50-58.
[4]   Bharti S.K., Babu K.S., & Jena S.K. (2015). Parsing-based sarcasm sentiment recognition in twitter data. In Proceedings of the 2015 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining 2015, 1373-1380. ACM.
[5]   Bojanowski P., Grave E., Joulin A., & Mikolov T. (2017). Enriching word vectors with subword information. Transactions of the Association for Computational Linguistics, 5, 135-146.
doi: 10.1162/tacl_a_00051
[6]   Carvalho P., Sarmento L., Silva M.J., & De Oliveira E. (2009). Clues for detecting irony in user-generated contents: Oh...!! It’s “so easy” ;-). In Proceedings of the 1st International CIKM Workshop on Topic-sentiment Analysis for Mass Opinion, 53-56. ACM.
[7]   Cheang, H.S., & Pell, M.D. (2008). The sound of sarcasm. Speech Communication, 50(5), 366-381.
doi: 10.1016/j.specom.2007.11.003
[8]   Clark, H.H., & Gerrig, R.J. (1984). On the pretense theory of irony. Journal of Experimental Psychology: General, 113(1), 121-126.
[9]   Davidov D., Tsur O., & Rappoport A. (2010). Semi-supervised recognition of sarcastic sentences in twitter and amazon. In Proceedings of the Fourteenth Conference on Computational Natural Language Learning, 107-116. Association for Computational Linguistics.
[10]   Felbo B., Mislove A., Søgaard A., Rahwan I., & Lehmann S. (2017). Using millions of emoji occurrences to learn any-domain representations for detecting sentiment, emotion and sarcasm. arXiv preprint arXiv:1708.00524.
[11]   Ghosh, D., & Muresan, S. (2018). “With 1 follower I must be AWESOME :P”. Exploring the role of irony markers in irony recognition. In Proceedings of the Twelfth International Conference on Web and Social Media (ICWSM 2018), Stanford, California, USA, June 25-28, 2018.
[12]   Gonzalez-Ibanez R., Muresan S., & Wacholder N. (2011). Identifying sarcasm in twitter: A closer look. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies: Short Papers, 2, 581-586. Association for Computational Linguistics.
[13]   Hazarika D., Poria S., Gorantla S., Cambria E., Zimmermann R., & Mihalcea R. (2018). CASCADE: Contextual sarcasm detection in online discussion forums. CoRR, abs/1805.06413.
[14]   Hochreiter, S., & Schmidhuber, J. (1997). Long short-term memory. Neural Computation, 9(8), 1735-1780.
doi: 10.1162/neco.1997.9.8.1735 pmid: 9377276
[15]   Ivanko, S.L., & Pexman, P.M. (2003). Context incongruity and irony processing. Discourse Processes, 35(3), 241-279.
[16]   Jorgensen J., Miller G.A., & Sperber D. (1984). Test of the mention theory of irony. Journal of Experimental Psychology: General, 113(1), 112.
doi: 10.1037/0096-3445.113.1.112
[17]   Joshi A., Sharma V., & Bhattacharyya P. (2015). Harnessing context incongruity for sarcasm detection. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing, 2, 757-762.
[18]   Kim, Y. (2014). Convolutional neural networks for sentence classification. arXiv preprint arXiv:1408.5882.
[19]   Kingma, D.P., & Ba, J. (2014). Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980.
[20]   Kolchinski, Y.A., & Potts, C. (2018). Representing social media users for sarcasm detection. arXiv preprint arXiv:1808.08470.
[21]   LeCun Y., Bottou L., Bengio Y., & Haffner P. (1998). Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11), 2278-2324.
[22]   Liebrecht C., Kunneman F., & van Den Bosch, A. (2013). The perfect solution for detecting sarcasm in tweets #not. Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis (WASSA).
[23]   Maynard, D., & Greenwood, M.A. (2014). Who cares about sarcastic tweets? Investigating the impact of sarcasm on sentiment analysis. In Proceedings of LREC 2014. ELRA.
[24]   Mikolov T., Chen K., Corrado G., & Dean J. (2013). Efficient estimation of word representations in vector space. In Proceedings of the ICLR Workshop.
[25]   Mishra, R. (2018). GitHub news headlines dataset for sarcasm detection: High quality dataset for the task of sarcasm detection. Retrieved from
[26]   Oraby S., Harrison V., Reed L., Hernandez E., Riloff E., & Walker M. (2017). Creating and characterizing a diverse corpus of sarcasm in dialogue. arXiv preprint arXiv:1709.05404.
[27]   Pennington J., Socher R., & Manning C.D. (2014). GloVe: Global vectors for word representation. In Proceedings of EMNLP, 1532-1543. ACL.
[28]   Poria S., Cambria E., Hazarika D., & Vij P. (2016). A deeper look into sarcastic tweets using deep convolutional neural networks. arXiv preprint arXiv:1610.08815.
[29]   Rajadesingan A., Zafarani R., & Liu H. (2015). Sarcasm detection on twitter: A behavioral modeling approach. In Proceedings of the Eighth ACM International Conference on Web Search and Data Mining, 97-106. ACM.
[30]   Riloff E., Qadir A., Surve P., De Silva L., Gilbert N., & Huang R. (2013). Sarcasm as contrast between a positive sentiment and negative situation. In Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, 704-714.
[31]   Rockwell, P. (2000). Lower, slower, louder: Vocal cues of sarcasm. Journal of Psycholinguistic Research, 29(5), 483-495.
doi: 10.1023/A:1005120109296
[32]   Tepperman J., Traum D., & Narayanan S. (2006). “yeah right”: Sarcasm recognition for spoken dialogue systems. In Proceedings of the Ninth International Conference on Spoken Language.
[33]   Tsur O., Davidov D., & Rappoport A. (2010). A great catchy name: Semi-supervised recognition of sarcastic sentences in online product reviews. In Proceedings of the Fourth International AAAI Conference on Weblogs and Social Media.
[34]   Wallace B.C., Choe D.K., & Charniak E. (2015). Sparse, contextually informed models for irony detection: Exploiting user communities, entities and sentiment. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (ACL-IJCNLP 2015), Vol. 1, 1035-1044. ACL.
[35]   Wallace B.C., Choe D.K., Kertz L., & Charniak E. (2014). Humans require context to infer ironic intent (so computers probably do, too). In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (ACL 2014), Vol. 2, 512-516. ACL.
[36]   Wang S.-H., Muhammad K., Hong J., Sangaiah A.K., & Zhang Y.-D. (2018). Alcoholism identification via convolutional neural network based on parametric ReLU, dropout, and batch normalization. Neural Computing and Applications, 1-16.