A new anaphoric approach to the automatic summarization of opinion texts in tweets

Rania Othman, Rami Belkaroui, Rim Faiz

Institut Supérieur de Gestion de Tunis, LARODEC, Université de Tunis, Tunisia

IHEC Carthage, LARODEC, Université de Carthage, Tunisia

Corresponding Author Email: rania.othman@gmx.com; rami.belkaroui@gmail.com; rim.faiz@ihec.rnu.tn

Page: 37-51

DOI: https://doi.org/10.3166/isi.22.6.37-51
Abstract: 

Summarizing the opinions expressed on Twitter has become an emerging research topic in recent years. In this paper, we present a new approach to opinion summarization on Twitter that works on conversations rather than on individual tweets. The approach assigns each conversation a score indicating the user's level of satisfaction with the corresponding product and with each of its features. We developed a new algorithm based on the reply links within conversations; it applies anaphora resolution in a backtracking process to identify the products mentioned in the tweets as well as their aspects. Experiments show promising results. In particular, they show that incorporating the conversation structure into opinion summarization improves system performance.
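The idea of following reply links backwards to resolve what a pronoun refers to can be illustrated with a minimal sketch. It is not the authors' algorithm: the `Tweet` type, the toy lexicons `PRODUCTS`, `PRONOUNS` and `POLARITY`, and the nearest-antecedent heuristic are all assumptions made for illustration, and the score here is simply the mean polarity of the tweets attached to each product.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical toy lexicons; a real system would rely on proper resources.
PRODUCTS = {"iphone", "galaxy"}
PRONOUNS = {"it", "its", "they", "them"}
POLARITY = {"love": 1.0, "great": 1.0, "good": 0.5,
            "bad": -0.5, "hate": -1.0, "terrible": -1.0}


@dataclass
class Tweet:
    tweet_id: int
    text: str
    reply_to: Optional[int] = None  # id of the tweet this one replies to


def tokens(text):
    """Lowercase whitespace tokens with surrounding punctuation stripped."""
    return [w.strip(".,!?").lower() for w in text.split()]


def find_product(toks):
    """Return the first explicitly named product, or None."""
    return next((t for t in toks if t in PRODUCTS), None)


def resolve_product(tweet, by_id):
    """Resolve the product a tweet talks about.

    If the tweet only contains a pronoun, walk back along the reply
    links until an ancestor names a product (nearest-antecedent heuristic).
    """
    toks = tokens(tweet.text)
    product = find_product(toks)
    if product is not None:
        return product
    if not PRONOUNS.intersection(toks):
        return None  # nothing anaphoric to resolve
    parent = by_id.get(tweet.reply_to)
    while parent is not None:
        product = find_product(tokens(parent.text))
        if product is not None:
            return product
        parent = by_id.get(parent.reply_to)
    return None


def conversation_scores(tweets):
    """Mean polarity per product over all tweets of one conversation."""
    by_id = {t.tweet_id: t for t in tweets}
    per_product = {}
    for t in tweets:
        product = resolve_product(t, by_id)
        if product is None:
            continue
        polarity = sum(POLARITY.get(tok, 0.0) for tok in tokens(t.text))
        per_product.setdefault(product, []).append(polarity)
    return {p: sum(v) / len(v) for p, v in per_product.items()}


conversation = [
    Tweet(1, "Just got the new iPhone!"),
    Tweet(2, "I love it, great camera", reply_to=1),
    Tweet(3, "Really? I hate it.", reply_to=2),
]
scores = conversation_scores(conversation)
```

In this toy conversation, the replies never name the product; both are attached to "iphone" only because the backward walk over `reply_to` reaches the first tweet. Processing the same tweets in isolation would leave the two opinionated replies without a target.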

Keywords: 

opinion summarization, Twitter, conversations, anaphora resolution.

1. Introduction
2. State of the art
3. Proposed approach
4. Experiments and results
5. Conclusion
  References
