About the Effects of Sentiments on Topic Detection in Social Networks

Karel Gutierrez-Batista, Jesús R. Campaña, Sandro Martinez-Folgoso, M. Amparo Vila, Maria J. Martin-Bautista

Department of Computer Science and Artificial Intelligence, ETSIIT – University of Granada, 18071, Granada, Spain

Pages: 387-395
DOI: https://doi.org/10.2495/DNE-V11-N3-387-395

OPEN ACCESS

Abstract: 

Topic detection in large volumes of text extracted from social networks is an active research problem in the context of Big Data. The textual content of social networks carries diverse information that can be exploited for many purposes, and both topic detection and sentiment analysis on this content are widely studied. The two tasks are often intertwined because user messages usually revolve around a particular topic and express the user's attitude towards it. However, this assumption does not hold for all messages: some express only general feelings or attitudes and do not refer to anything specific that relates to the topic under discussion. Such messages can influence the topic detection process. In this paper, we propose to obtain topics from massive quantities of text data extracted from social networks without prior information, using only unsupervised data mining techniques. We analyze the influence of the sentiments expressed in messages and how they affect the topic detection task. Terms related to sentiments provide useful information for a variety of applications, but for topic detection they are a source of unnecessary noise. Experiments are conducted on data obtained from the Twitter social network.
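The claim that sentiment terms act as noise for topic detection can be illustrated with a minimal sketch. It is not the method used in the paper: the tiny lexicon `SENTIMENT_TERMS`, the whitespace tokenizer, and the helper names are all hypothetical, and plain term-frequency cosine similarity stands in for whatever representation an unsupervised clustering step would actually use.

```python
import math
from collections import Counter

# Hypothetical toy lexicon (an assumption for illustration only);
# a real system would draw on a proper sentiment lexical resource.
SENTIMENT_TERMS = {"love", "hate", "great", "awful", "happy", "sad"}


def tokenize(message):
    """Lowercase a message and split it on whitespace."""
    return message.lower().split()


def remove_sentiment_terms(tokens, lexicon=SENTIMENT_TERMS):
    """Drop every token that appears in the sentiment lexicon."""
    return [t for t in tokens if t not in lexicon]


def cosine(tokens_a, tokens_b):
    """Cosine similarity between two term-frequency vectors."""
    a, b = Counter(tokens_a), Counter(tokens_b)
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Two messages about unrelated topics that share only a sentiment word.
football = tokenize("love football match tonight")
phone = tokenize("love new phone camera")

with_sentiment = cosine(football, phone)
without_sentiment = cosine(
    remove_sentiment_terms(football), remove_sentiment_terms(phone)
)
print(with_sentiment, without_sentiment)
```

In this toy case the shared word "love" is the only source of similarity between the two messages, so a similarity-based clustering step could be nudged towards grouping them. Once the sentiment term is filtered out, the messages no longer overlap.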
