[CFP] SemEval 2025 Task 5 - LLMs4Subjects - Call to the SWIB Community for Shared Task Participation

Jennifer_D_Souza · October 7, 2024, 10:41am

Dear members of the SWIB conference community,

I want to highlight a pertinent SemEval 2025 shared task named LLMs4Subjects. Below is the call for participation that we’ve shared across various general community mailing lists and that I am happy to pass on to the members of this community as well.

We are pleased to release the Call for Participation to the LLMs4Subjects Shared Task organized as part of SemEval 2025.

Overview: As the first of its kind, LLMs4Subjects invites the research community to develop cutting-edge LLM-based semantic solutions for the subject tagging of the Leibniz University’s Technical Library’s open-access collection. The shared task provides an opportunity for the research community to creatively utilize LLMs for subject tagging of technical records. Systems need to demonstrate bilingual language modeling in understanding technical documents in both German and English. Moreover, successful solutions may be directly integrated into the operational workflows of the TIB Leibniz Information Centre for Science and Technology University Library.

What we provide to participants: a human-readable form of a subject’s taxonomy (this is the GND or Gemeinsame Normdatei, the integrated authority file used for cataloging in German-speaking countries) and a large collection of technical records tagged with these subjects from the TIB’s open-access collection called TIBKAT.

More details on the task website: SemEval'25 Task 5

LLMs4Subjects defines the following three tasks:

Task 1: Learn the GND
Task 2: Align subject tagging to the TIBKAT collection
(Optional and Fun) Task 3: Develop Elegant Frontend Interfaces for Subject Tagging

The task offers a broad range of research questions that can be explored. View them on the task website. We encourage interested teams to sign up and begin development as soon as possible. Systems can be developed over a 3-month period, with the official evaluation phase starting on January 10, 2025.

Why Participate?
• Explore critical LLM research areas like fine-tuning, RAG techniques, agentic workflows, and more!
• The SemEval meeting is held in conjunction with premier NLP conferences like ACL, EMNLP, or COLING, so participants will have the opportunity to present and discuss their work at one of these venues!
• Your solutions could be integrated into the workflows of the TIB Leibniz Information Centre!

LLMs4Subjects will have three separate evaluations:

Evaluation 1: Quantitative Metrics-based Evaluations
Evaluation 2: Qualitative Evaluations by the Human Subject Specialists
(Optional) Evaluation 3: HCI evaluations for subject indexing interfaces submitted

Dates

Training and validation datasets available: October 2, 2024
Test data available/Evaluation starts: January 10, 2025
Evaluation ends: January 31, 2025
Participant paper submissions due: February 28, 2025
Notification to authors: March 31, 2025
Camera ready due: April 21, 2025
SemEval workshop: TBD

Task Organizers

Jennifer D’Souza, Sameer Sadruddin, Holger Israel, Mathias Begoin et al.
All organizers are affiliated with the TIB Leibniz Information Centre for Science and Technology - Germany

We look forward to having you on board!

Contact: llms4subjects [at] gmail [dot] com

We would be very happy to host many participants from the SWIB conference community, as well as from your extended networks working on LLM-based NLP solutions for modern digital library systems. We hope that all participating systems will be contributed as open source, and we would be very enthusiastic to tie promising solutions back into existing libraries and workflows. If you have any questions, please feel free to contact us.

Kind regards,
Jennifer