Keynote: How knowledge representation is changing in a world of large language models

Denny · August 19, 2024, 11:25am

Presenter

Denny Vrandečić (@Denny)

Slides and Recordings

Slides: Google-Docs, PDF
Recordings: TIB AV portal, YouTube

Abstract

Over the last few years, large language models have profoundly impacted many research topics and product teams. From applications in health care to the creation of new soda flavors, Artificial Intelligence has captured the imagination of many people. Even though some of the initial enthusiasm and promises of large language models may have been somewhat exaggerated, it is clear that generative AI is a technology that will have a massive impact, one that is still difficult to predict.

In areas such as libraries, bibliography, healthcare, finance, science metrics, and many others, we have invested heavily in structured knowledge representations, such as metadata and knowledge graphs, and it is not immediately clear how Semantic Web technologies and other structured knowledge representations will fit into a world that is being rapidly transformed by the deployment of large language models.

In this talk we will work towards some answers on how these two technologies might evolve and co-evolve. We will explore the weaknesses and strengths of the different approaches, and aim to identify the opportunities where they may complement each other. We may dream of what may lie beyond knowledge graphs and metadata, and how the advances in language models might allow us to reach bold new frontiers in knowledge representation which might not have been accessible before.

Christian_Hauschke · November 25, 2024, 2:35pm

@Denny You mentioned an (LLM-based?) tool from Stanford University that generates SPARQL queries. Can you please provide a link?

eduards · November 25, 2024, 2:37pm

lake44 · November 25, 2024, 2:37pm

Can someone post a link to the application that generates SPARQL queries? It sounded like “Spinach”…

Denny · November 25, 2024, 2:56pm

Here’s the link to the paper: [2407.11417] SPINACH: SPARQL-Based Information Navigation for Challenging Real-World Questions

Christian_Hauschke · November 25, 2024, 3:26pm

Thanks! It seems like the training data set consists of only 320 question-SPARQL pairs, which seems doable for non-Wikidata systems, too. Are you aware of any use of SPINACH or similar approaches in non-Wikidata environments? I would like to see something like it in VIVO, for example.

Denny · November 26, 2024, 9:54am

I am not aware of any SPINACH-like work for other SPARQL endpoints. I would expect any LLM to be biased towards Wikidata, as the foundation model was probably trained on a lot of knowledge about Wikidata. Adapting it to a non-Wikidata environment might need a somewhat larger number of examples, but my gut feeling is that it would not be much larger and would stay within the same order of magnitude.

j4lib · November 26, 2024, 12:07pm

About the function search functionality: what first came to my mind is that I’d probably want to filter by certain data types (like Wikidata ListProperties or other property browsers). But then again, it doesn’t feel like “normal” people would do that, so there goes accessibility for everyone.