Detecting Erroneous Identity Links Using Community Detection in Identity Graphs

Joe Raad, Wouter Beek, Nathalie Pernelle, Fatiha Saïs, Frank van Harmelen

UMR MIA-PARIS, INRA, AgroParisTech, Université Paris-Saclay, Paris, France

LRI, CNRS UMR8623, Paris Sud University, Paris Saclay University, Orsay, France

Dept. of Computer Science, VU University Amsterdam, Amsterdam, The Netherlands

Corresponding Author Email: 
joe.raad@agroparistech.fr; {nathalie.pernelle,fatiha.sais}@lri.fr; {w.g.j.beek,frank.van.harmelen}@vu.nl
28 August 2018
Several studies have observed that the Semantic Web identity predicate owl:sameAs is sometimes used incorrectly. In this paper, we show how network metrics, such as the community structure of the owl:sameAs graph, can be used to detect such possibly erroneous statements. One benefit of the approach presented here is that it can be applied to the network of owl:sameAs links itself and does not rely on any additional knowledge. We evaluate our approach on 558M owl:sameAs statements scraped from the LOD cloud. This evaluation demonstrates both the ability of our approach to scale and its efficiency in detecting erroneous identity links.


Web of data, identity, owl:sameAs, communities

1. Introduction
2. Related Work
3. Approach for Detecting Erroneous Identity Links
4. Experiments
5. Conclusion

