A methodology for the detection of multiple accounts in social networks

Zaher Yamak* | Julien Saunier | Laurent Vercouter

INSA Rouen, Laboratoire LITIS, 685 Avenue de l'Université, Saint Etienne du Rouvray Cedex, 76801, France

Corresponding Author Email:

zaher.yamak@insa-rouen.fr, julien.saunier@insa-rouen.fr, laurent.vercouter@insa-rouen.fr

Received:

N/A

Accepted:

N/A

ria30_4_05_yamak.pdf

OPEN ACCESS

Abstract:

As social media has grown into the most visited part of the internet, fake account detection has become one of its hardest security challenges. Over the years, online social networks (OSN) have evolved widely, moving part of our personal lives into virtual ones. But this evolution also has negative effects. In 2012, 16.6 million Americans were victims of identity theft according to an estimate from the U.S. Bureau of Justice Statistics, with financial losses of up to $24.7 billion for these victims. Various techniques are used to manipulate users in OSN environments, such as social spam, identity theft, spear phishing and Sybil attacks. In this article, we analyze the behavior of multiple fake accounts that try to bypass OSN regulation. In the context of social media manipulation detection, we focus on the special case of multiple-identity accounts (sockpuppets) created on English Wikipedia (EnWiki). We set up a complete methodology that spans from data extraction from EnWiki to training and testing on our selected data using several machine learning algorithms. In this methodology, we propose a set of features that builds on previous literature, for use in automatic data analysis to detect the sockpuppet accounts created on EnWiki. We apply them to a database of 10,000 user accounts. The results compare several machine learning algorithms and show that our new features and training data make it possible to detect 99% of fake accounts, improving on previous results from the literature.
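The train-and-test step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the two per-account features, the synthetic data, the 80/20 split and the choice of a k-nearest-neighbour classifier are all assumptions made for illustration, and the numbers it prints say nothing about the 99% figure reported above.

```python
import math
import random

def make_accounts(n, sock, rng):
    """Generate n toy accounts. Both features are hypothetical, not the paper's."""
    rows = []
    for _ in range(n):
        # Assumed features: edits per active day, share of edits that were reverted.
        edits_per_day = rng.gauss(12.0 if sock else 4.0, 2.0)
        revert_ratio = rng.gauss(0.45 if sock else 0.10, 0.08)
        rows.append(((edits_per_day, revert_ratio), int(sock)))
    return rows

def knn_predict(train, x, k=5):
    """Majority vote among the k training accounts closest to x (Euclidean distance)."""
    nearest = sorted(train, key=lambda row: math.dist(row[0], x))[:k]
    votes = sum(label for _, label in nearest)
    return int(votes * 2 > k)

def evaluate(train, test, k=5):
    """Return (accuracy, recall on the sockpuppet class) over the test accounts."""
    preds = [(knn_predict(train, x, k), y) for x, y in test]
    correct = sum(p == y for p, y in preds)
    true_pos = sum(p == 1 and y == 1 for p, y in preds)
    positives = sum(y for _, y in preds)
    return correct / len(preds), true_pos / positives

# Synthetic, balanced data set: label 1 = sockpuppet, label 0 = regular account.
rng = random.Random(0)
data = make_accounts(200, True, rng) + make_accounts(200, False, rng)
rng.shuffle(data)

# Hold out 20% of the accounts for testing.
split = int(0.8 * len(data))
train, test = data[:split], data[split:]

accuracy, recall = evaluate(train, test)
print(f"accuracy={accuracy:.2f} recall={recall:.2f}")
```

Recall on the sockpuppet class is the quantity closest to a "share of fake accounts detected"; accuracy alone can mislead when the two classes are unbalanced. The sketch skips feature scaling, which a distance-based classifier would normally need on real data.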

Keywords:

sockpuppet, machine learning application, manipulation, deception, identity, wikipedia, collaborative project, social media

1. Introduction

2. Collaborative projects and Wikipedia

3. Related work

4. Methodology and indicators

5. Experiments and results

7. Conclusion and future work

References

Altman N. S. (1992). An introduction to kernel and nearest-neighbor nonparametric regression. The American Statistician, vol. 46, no 3, p. 175–185.

Ambika C. M. (2014, December). The evolution of social media 2004 - 2014: The good, the bad and the ugly of it! (http://dazeinfo.com/2014/12/12/evolution-social-media-2004-2014-good-bad-ugly/)

Breiman L. (2001). Random forests. Machine learning, vol. 45, no 1, p. 5–32.

Cao Q., Sirivianos M., Yang X., Pregueiro T. (2012). Aiding the detection of fake accounts in large scale social online services. In Proceedings of the 9th usenix conference on networked systems design and implementation, p. 15–15.

Cortes C., Vapnik V. (1995). Support-vector networks. Machine learning, vol. 20, no 3, p. 273–297.

David B. (2015, March). 5 social engineering attacks to watch out for. (http://tripwire.com/state-of-security/security-awareness/5-social-engineering-attacks-to-watch-out-for/)

Douceur J. R. (2002). The sybil attack. In Peer-to-peer systems, p. 251–260. Springer.

Freund Y., Schapire R. E. (1995). A desicion-theoretic generalization of on-line learning and an application to boosting. In Computational learning theory, p. 23–37.

Gao H., Hu J., Wilson C., Li Z., Chen Y., Zhao B. Y. (2010). Detecting and characterizing social spam campaigns. In Proceedings of the 10th acm sigcomm conference on internet measurement, p. 35–47.

Goolsby R., Shanley L., Lovell A. (2013). On cybersecurity, crowdsourcing, and social cyberattack. Technical report. DTIC Document.

Heckerman D. (2008). A tutorial on learning with bayesian networks. In Innovations in bayesian networks, p. 33–82. Springer.

Jeff B. (2015). 33 social media facts and statistics you should know in 2015. (http://www.jeffbullas.com/2015/04/08/33-social-media-facts-and-statistics-you-shouldknow-in-2015/)

Kaplan A. M., Haenlein M. (2010). Users of the world, unite! the challenges and opportunities of social media. Business horizons, vol. 53, no 1, p. 59–68.

Maeve D., Nicole E., Cliff L., Amanda L., Mary M. (2015, January). Social media update 2014. (http://www.pewinternet.org/2015/01/09/social-media-update-2014/)

Mathew I. (2012, February). If you think twitter doesn’t break news, you’re living in a dream world. (https://gigaom.com/2012/02/29/if-you-think-twitter-doesnt-break-newsyoure-living-in-a-dream-world/)

Norajong. (2010, May). Why the number of people creating fake accounts and using second identity on facebook are increasing. (http://networkconference.netstudies.org/2010/05/why-the-number-of-people-creating-fake-accounts-and-using-second-identity-onfacebook-are-increasing/)

Norton. (n.d.). Spear phishing: Scam, not sport. (http://us.norton.com/spear-phishing-scamnot-sport/article)

Riva R. (2010, May). Stolen facebook accounts for sale. (http://www.nytimes.com/2010/05/03/technology/internet/03facebook.html)

Russell S., Norvig P. (1995). Artificial intelligence: A modern approach. Prentice-Hall, Englewood Cliffs.

Sarita Y., Daniel R., Schoenebeck G., danah b. (2009). Detecting spam in a twitter network. First Monday, vol. 15, no 1. Retrieved from http://firstmonday.org/ojs/index.php/fm/article/view/2793

Solorio T., Hasan R., Mizan M. (2013a). A case study of sockpuppet detection in wikipedia. In Workshop on language analysis in social media (lasm) at naacl hlt, p. 59–68.

Solorio T., Hasan R., Mizan M. (2013b). Sockpuppet detection in wikipedia: A corpus of real-world deceptive writing for linking identities. arXiv preprint arXiv:1310.6772.

Statista. (2015). Number of unique u.s. visitors to wikipedia.org from may 2011 to april 2015 (in millions). (http://www.statista.com/statistics/265119/number-of-unique-us-visitors-towikipediaorg/)

Sture N. (2010, February). Fake accounts in facebook - how to counter it. (http://ezinearticles.com/?id=3703889)

Tsikerdekis M., Zeadally S. (2014). Multiple account identity deception detection in social media using nonverbal behavior. IEEE Transactions on Information Forensics and Security, vol. 9, no 8, p. 1311–1321.

Yang Z., Wilson C., Wang X., Gao T., Zhao B. Y., Dai Y. (2014). Uncovering social network sybils in the wild. ACM Transactions on Knowledge Discovery from Data (TKDD), vol. 8, no 1, p. 2.
