OPEN ACCESS
The 2015 attack on the weekly Charlie Hebdo in Paris was a disruptive event that prompted a strong public reaction on social networks, creating an opportunity to study violent communication and hate messages on Twitter. In the days after the attack (January 7 to January 12), a sample of more than 255,000 tweets with the hashtags #CharlieHebdo, #JeSuisCharlie and #StopIslam was collected. The sample was analyzed with qualitative and quantitative approaches in order to compare the level of agreement between the methods used. First, messages were classified either as tweets containing violent and hate speech or as general messages, following inclusion criteria that the Principal Investigator defined on the basis of experience and the scientific literature. Then, three pairs of judges classified the sample using the previously defined exclusion criteria, identifying ten types of violent communication, which were reduced to five essential categories. After the qualitative analysis, data mining methods were used to extract rule systems for classifying the type of speech, starting from 18 variables derived from each tweet, including the date, favorites and the type of software used to post the tweet, among others. The results show that disruptive events are followed by communications with clearly identifiable spatial, temporal and textual patterns; this allows the authors to propose a methodology for classifying, very precisely, messages that contain hate or violent speech.
cyberhate speech, data mining, social media, violent talk.
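As an illustration of how agreement within one pair of judges could be quantified, the following is a minimal sketch that assumes Cohen's kappa as the agreement statistic. The abstract does not name the statistic actually used, and the function name, label values and sample data below are hypothetical.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two judges who labeled the same items.

    Assumption: both inputs are equal-length sequences of category labels.
    """
    if len(labels_a) != len(labels_b) or len(labels_a) == 0:
        raise ValueError("both judges must label the same non-empty set of items")

    n = len(labels_a)
    # Observed agreement: share of items given the same label by both judges.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: from each judge's marginal label frequencies.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)

    if expected == 1.0:
        # Both judges used one identical label throughout; kappa is undefined,
        # so perfect agreement is reported by convention.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical labels from one pair of judges on four tweets.
judge_1 = ["hate", "hate", "general", "general"]
judge_2 = ["hate", "general", "general", "general"]
kappa = cohen_kappa(judge_1, judge_2)
```

A value near 1 indicates agreement well above chance, while a value near 0 indicates agreement no better than chance.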