Enhancing authority files through SPARQL federated queries

Presenter

Thomas Kerboul (@Thomas_Kerboul)

Slides and Recordings

Abstract

Federated queries enable merging data across databases, allowing for the identification of errors and gaps when content overlaps. The Bibliothèque de Genève utilizes SPARQL federated queries between Wikidata and IdRef, a French authority file used for bibliographic cataloging, to enhance records about individuals related to Geneva. Using the IdRef identifier as a common link, several modularized queries were designed, facilitating the discovery of potential improvements.
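As a rough illustration of the kind of federated query described above, the following Python sketch only assembles the SPARQL text; it does not send it anywhere. It is not the presenter's actual query. The Wikidata properties P269 (IdRef ID) and P19 (place of birth) and the item Q71 (Geneva) are real, but the IdRef endpoint URL, the IdRef URI pattern, the use of foaf:name on the IdRef side, and the helper name build_federated_query are assumptions made for this example.

```python
import re
from string import Template

# Assumed public SPARQL endpoint for IdRef; verify before use.
IDREF_ENDPOINT = "https://data.idref.fr/sparql"

# The query joins both sources on the IdRef identifier stored in Wikidata (P269).
_QUERY = Template("""\
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

SELECT ?item ?idrefId ?idrefName WHERE {
  # Wikidata side: people born in the given place who carry an IdRef ID.
  ?item wdt:P19 wd:$place ;
        wdt:P269 ?idrefId .
  # Assumed IdRef URI pattern: http://www.idref.fr/<id>/id
  BIND(IRI(CONCAT("http://www.idref.fr/", ?idrefId, "/id")) AS ?idrefUri)
  # Remote side: fetch the name recorded in IdRef (assumed to use foaf:name).
  SERVICE <$endpoint> {
    ?idrefUri foaf:name ?idrefName .
  }
}
LIMIT $limit
""")


def build_federated_query(place_qid: str = "Q71", limit: int = 100) -> str:
    """Return the SPARQL text for a Wikidata-to-IdRef federated query."""
    if not re.fullmatch(r"Q[1-9][0-9]*", place_qid):
        raise ValueError(f"not a Wikidata item ID: {place_qid!r}")
    if limit < 1:
        raise ValueError("limit must be positive")
    return _QUERY.substitute(place=place_qid, endpoint=IDREF_ENDPOINT, limit=limit)


if __name__ == "__main__":
    print(build_federated_query("Q71", limit=10))
```

Keeping the query as a template with a few parameters is one way to get the modularity mentioned above: the same join on the IdRef identifier can be reused while the Wikidata-side filter changes.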

Correcting the mismatches, however, remained predominantly manual, and this proved crucial, especially in cases of homonymy. IdRef identifiers, often added to Wikidata through VIAF clusters, are sometimes attached to the wrong individual. Manual curation ensured that such errors did not propagate further, particularly across the members of a given VIAF cluster, thereby maintaining data integrity. Additionally, the comparison revealed that Wikidata tended to be more accurate and up to date than IdRef, showcasing the potential of community-curated databases.
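To make the "flag, then review by hand" idea concrete, here is a minimal Python sketch that compares two records sharing an IdRef identifier and lists reasons a human should look at the pair. It is a hypothetical illustration, not the workflow used at the Bibliothèque de Genève: the PersonRecord fields, the choice of birth year as the comparison point, and the function name flag_for_review are all assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PersonRecord:
    """Hypothetical minimal view of a person record from either source."""
    source: str                      # e.g. "wikidata" or "idref"
    name: str
    birth_year: Optional[int] = None


def flag_for_review(wikidata: PersonRecord, idref: PersonRecord) -> List[str]:
    """Return reasons for manual review; an empty list means no flag was raised."""
    reasons: List[str] = []
    if wikidata.birth_year is None and idref.birth_year is not None:
        reasons.append("birth year missing in Wikidata")
    elif idref.birth_year is None and wikidata.birth_year is not None:
        reasons.append("birth year missing in IdRef")
    elif wikidata.birth_year != idref.birth_year:
        # Conflicting dates may indicate a homonym linked by mistake.
        reasons.append("birth years differ: possible homonym or data error")
    return reasons
```

The function deliberately only reports differences and never picks a winner: a conflicting date can mean either a plain error in one source or an identifier attached to a homonym, and telling those apart is the manual step.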

This presentation aims to demonstrate the reliability of community-curated databases and the power of federated queries, particularly through the use of SPARQL, to enhance data accuracy and integration across multiple sources. By sharing these insights, we hope to encourage other institutions to adopt similar methodologies to improve their data management practices.

Comment in live stream: I’m reminded of “When owl:sameAs Isn’t the Same: An Analysis of Identity in Linked Data” (SpringerLink)

Q: How is this work different when working with SPARQL and Wikidata than working with more traditional ways?

Q: Do you actually write complex SPARQL by hand or are there tools to simplify the matching e.g. BINDing values?

Q for speaker: How would you describe the perception of Wikidata among others at your institution? Has this changed at all through the work you have been doing?

Hello all, thanks for attending: the slides are online at Enhancing authority files through SPARQL federated queries
