Controlled vocabularies in a graphical user interface for manuscript cataloguing

Presenters

Nicole Eichenberger (@eichenberger), Alexander Jandt (@AJandt), Magdalena Luniak (@luniak), Ursula Stampfer

Slides and Recordings

Abstract

The Handschriftenportal (handschriftenportal.de) is the central online portal for book manuscripts in German collections. It is a joint development project by four major German libraries, funded by the German Research Foundation.

The portal currently offers cataloguing data on more than 150,000 manuscripts. To make this heterogeneous data searchable, we need not only authority data for persons, places and organizations – which we get from standard authority files like the GND – but also domain-specific controlled vocabularies tailored to the research areas of manuscript descriptions. For this purpose, we are creating a codicological vocabulary, using existing modelling languages like SKOS with a few specific extensions, which is stored and maintained in a knowledge graph. The vocabulary terms will be mapped to the available cataloguing data, making it possible to find diverging or outdated terms alongside the standardized labels. The codicological vocabulary will also be available as an RDF dump and can thus be used independently of the portal.
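As a rough illustration of how SKOS labels can support this kind of lookup, the following Python sketch models two concepts with preferred and alternative labels, serializes one of them as Turtle, and resolves a diverging term to its standardized label. The namespace `https://example.org/vocab/`, the concept identifiers and the labels are invented for illustration and are not taken from the actual codicological vocabulary; the project-specific SKOS extensions are not modelled here.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

SKOS = "http://www.w3.org/2004/02/skos/core#"
BASE = "https://example.org/vocab/"  # hypothetical namespace, not the portal's


@dataclass
class Concept:
    id: str
    pref_label: str
    alt_labels: List[str] = field(default_factory=list)  # diverging or outdated terms
    broader: Optional[str] = None  # id of the parent concept, if any

    def to_turtle(self) -> str:
        """Serialize the concept as Turtle (labels are assumed to contain no quotes)."""
        pairs = [f"a <{SKOS}Concept>", f'<{SKOS}prefLabel> "{self.pref_label}"@en']
        pairs += [f'<{SKOS}altLabel> "{alt}"@en' for alt in self.alt_labels]
        if self.broader:
            pairs.append(f"<{SKOS}broader> <{BASE}{self.broader}>")
        return f"<{BASE}{self.id}>\n    " + " ;\n    ".join(pairs) + " ."


def build_label_index(concepts: List[Concept]) -> Dict[str, Concept]:
    """Map every preferred and alternative label (case-insensitively) to its concept."""
    index: Dict[str, Concept] = {}
    for concept in concepts:
        for label in [concept.pref_label, *concept.alt_labels]:
            index[label.casefold()] = concept
    return index


def standardize(term: str, index: Dict[str, Concept]) -> Optional[str]:
    """Return the standardized label for a term, or None if the term is unknown."""
    concept = index.get(term.casefold())
    return concept.pref_label if concept else None


# Illustrative concepts only.
concepts = [
    Concept("writing-support", "writing support"),
    Concept("parchment", "parchment", ["vellum"], broader="writing-support"),
]
index = build_label_index(concepts)
print(concepts[1].to_turtle())
print(standardize("Vellum", index))
```

Indexing the alternative labels alongside the preferred ones is what lets a search for an older or variant term land on the same concept as the standardized label.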

The portal also provides a cataloguing module, a graphical user interface for creating new manuscript descriptions. The codicological vocabulary is integrated into the interface, allowing users to link text snippets of their descriptions to domain-specific authority data by choosing vocabulary terms from a tree structure or by searching for specific terms. In the backend, the linked terms are stored as part of the description in the TEI-XML format.
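One way to picture the stored result is a TEI `<term>` element whose `ref` attribute points at the vocabulary concept. The sketch below builds such a fragment with Python's standard library. The choice of `<term>` with `ref`, the concept URI and the sample sentence are assumptions for illustration; the encoding actually used by the cataloguing module may differ.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
ET.register_namespace("", TEI_NS)  # serialize TEI as the default namespace


def link_snippet(text: str, snippet: str, concept_uri: str) -> ET.Element:
    """Return a TEI <p> in which the first occurrence of `snippet`
    is wrapped in <term ref="concept_uri">."""
    start = text.find(snippet)
    if start == -1:
        raise ValueError(f"snippet {snippet!r} not found in text")
    end = start + len(snippet)

    paragraph = ET.Element(f"{{{TEI_NS}}}p")
    paragraph.text = text[:start]            # text before the linked snippet
    term = ET.SubElement(paragraph, f"{{{TEI_NS}}}term", {"ref": concept_uri})
    term.text = snippet                      # the linked snippet itself
    term.tail = text[end:]                   # text after the linked snippet
    return paragraph


# Hypothetical description sentence and concept URI.
p = link_snippet(
    "Written on parchment, 120 leaves.",
    "parchment",
    "https://example.org/vocab/parchment",
)
print(ET.tostring(p, encoding="unicode"))
```

Keeping the link inline in the running text means the description stays readable as prose while the `ref` value carries the machine-readable connection to the vocabulary.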

In the presentation, we will show the functionalities of the cataloguing module as well as the underlying knowledge graph, and we will discuss the conception and modelling of the codicological vocabulary as Linked Open Data. Furthermore, we will address the challenges of its implementation in the graphical user interface and present some outcomes of a usability test.

@AJandt Is the vocabulary Turtle file available somewhere? We would like to add the vocabularies to https://bartoc.org/.

I am not sure whether you addressed it in your talk: are you also supporting semantic annotations of digitized manuscripts, e.g. by linking references to a person in a text to their GND entry, or maybe even by linking to concepts from your controlled vocabularies?

Here are our slides:

SWIB25-Controlled-vocabularies-manuscript-cataloguing.pdf (2.7 MB)

The domain-specific vocabularies are still under construction (we are adding one subject area after another and testing each with the research community before finalizing it), but as soon as the vocabulary is complete, we will be happy to add the Turtle files to Bartoc.

A first version of an annotation function is available in the workspace of the Handschriftenportal, where digitized images can also be annotated. Here is a short description of the annotation functionalities: https://blog.sbb.berlin/hsp-annotation-testphase/
At the moment, the annotations are just plain text fields, but of course it would be a good feature to include references there.

Thanks for the information. If you want to annotate references, it would probably make sense to look into the Reconciliation Service API, which can be used as an interface to the GND and also to SKOS vocabularies. For example, TEI Publisher has a Reconciliation Service Connector that integrates different authority sources through the reconciliation protocol.
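To give an idea of what talking to such a service involves, here is a minimal Python sketch that builds a query batch and picks the best-scoring candidate from a response. It follows the query batch shape as I understand it from the 0.2 version of the protocol (a JSON object keyed by query id, sent as a form-encoded `queries` parameter); the type identifier `Person` and the helper names are assumptions for illustration. No request is sent, so an actual service endpoint would still need to be plugged in.

```python
import json
from typing import Dict, List, Optional
from urllib.parse import urlencode


def build_query_batch(names: List[str], type_id: Optional[str] = None,
                      limit: int = 3) -> Dict[str, dict]:
    """Build one reconciliation query per name, keyed q0, q1, ..."""
    batch: Dict[str, dict] = {}
    for i, name in enumerate(names):
        query = {"query": name, "limit": limit}
        if type_id:
            query["type"] = type_id  # hypothetical type id; services define their own
        batch[f"q{i}"] = query
    return batch


def encode_request_body(batch: Dict[str, dict]) -> str:
    """Form-encode the batch as the `queries` parameter of a POST body."""
    return urlencode({"queries": json.dumps(batch)})


def best_candidates(response: Dict[str, dict],
                    min_score: float = 0.0) -> Dict[str, Optional[dict]]:
    """For each query id, return the highest-scoring candidate at or above
    `min_score`, or None if no candidate qualifies."""
    best: Dict[str, Optional[dict]] = {}
    for key, value in response.items():
        candidates = [c for c in value.get("result", [])
                      if c.get("score", 0) >= min_score]
        best[key] = (max(candidates, key=lambda c: c.get("score", 0))
                     if candidates else None)
    return best


batch = build_query_batch(["Hildegard von Bingen"], type_id="Person")
print(encode_request_body(batch))
```

The candidate chosen this way could then be stored as the reference of an annotation; a threshold such as `min_score` is one simple way to leave low-confidence matches for manual review.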

The editor’s source code is now up to date; you can find it in the GitHub repository handschriftenportal-dev/hsp-erfassung. The code for the subject area dialog starts in the file ThemenbereichDialog.tsx.

Since the question about tech stacks for similar projects has come up, I wanted to mention SlateJS, the rich text editor we’re using, which provides a lot of foundational functionality.