Quality assessment of linked data sources for named-entity resolution

Carmen Brando, Nathalie Abadie, Francesca Frontini

CRH UMR 8558 CNRS - EHESS, Paris, France

Univ. Paris-Est, LASTIG COGIT, IGN, ENSG, Saint-Mandé, France

Praxiling UMR 5267 CNRS - UPVM3, Université Paul-Valéry Montpellier 3, France

Corresponding Author Email: 
carmen.brando@ehess.fr, nathalie-f.abadie@ign.fr, francesca.frontini@univ-montp3.fr
31 December 2016
A growing number of applications in the Digital Humanities rely on Linked Data for the semantic enrichment of digital collections by means of URIs, typically to provide background information about the authors, works of art and historical places mentioned in these collections. In this context, Named Entity Linking (NEL) is the task of automatically assigning the appropriate referent to a named-entity mention tagged in a text. Nevertheless, data sources of the Web of Data still experience quality issues that are critical for NEL and for many Digital Humanities applications. The present article therefore proposes an empirical study to assess the quality of any Linked Data (LD) set meant to be used as a Knowledge Base in graph-based NEL. Our methodology addresses state-of-the-art quality aspects from a fitness-for-use perspective. We perform experiments on two French heritage texts and test two types of linking: on the one hand to a general-purpose Linked Data source, and on the other to domain-specific ones. The study assesses the degree to which each Linked Data source is suited for use as a Knowledge Base in a given NEL use case.


data quality, named-entity linking, linked data, digital humanities

1. Introduction
2. Measures for evaluating the results of named-entity resolution applications
3. Influence of the quality of Web of Data sources on named-entity resolution applications
4. Measures for assessing the quality of Web of Data sources for named-entity resolution applications
5. Implementation and results
6. Conclusion and future work

