About the Challenge

Tabular data in the form of CSV files is a common input format in data analytics pipelines. However, a lack of understanding of the semantic structure and meaning of the content may hinder the data analytics process. Gaining this semantic understanding is therefore valuable for data integration, data cleaning, data mining, machine learning and knowledge discovery tasks.

Tables on the Web may also be the source of highly valuable data. The addition of semantic information to Web tables may enhance a wide range of applications, such as web search, question answering, and Knowledge Base (KB) construction.

Tabular data to Knowledge Graph (KG) matching is the process of assigning semantic tags from KGs (e.g., Wikidata or DBpedia) to the elements of the table. In practice, however, this task is often difficult because metadata (e.g., table and column names) are missing, incomplete or ambiguous.

The SemTab challenge aims at benchmarking systems dealing with the tabular data to KG matching problem, so as to facilitate their comparison on the same basis and the reproducibility of the results.

The 2025 edition of this challenge will be co-located with the International Semantic Web Conference (ISWC 2025).

Challenge Tracks

MammoTab

TASK: CEA (Wikidata v. 20240720)

Participants will address the Semantic Table Interpretation challenges using the new version of the MammoTab dataset.
MammoTab is a large-scale benchmark designed to provide realistic and complex scenarios, including tables affected by typical challenges of web and Wikipedia data.

Only approaches based on Large Language Models are allowed, either:

  • in fine-tuning settings, or
  • using Retrieval-Augmented Generation strategies.

The evaluation will focus on the Cell Entity Annotation (CEA) task using the Wikidata KG (v. 20240720), but will also take into account the ability of the proposed approaches to effectively deal with the following key challenges:

  • Disambiguation: correctly linking ambiguous mentions to the intended entities.
  • Homonymy: managing mentions referring to entities with identical or very similar names.
  • Alias resolution: recognising entities referred to by alternative names, acronyms, or nicknames.
  • NIL detection: correctly identifying mentions that do not correspond to any entity in the Knowledge Graph.
  • Noise robustness: dealing with incomplete, noisy, or imprecise table contexts.
  • Collective inference: leveraging inter-cell and inter-column signals to improve the consistency of annotations.

Participants are expected to demonstrate not only strong CEA performance, but also robustness and versatility across all these dimensions, which are critical for real-world table interpretation scenarios.


Round 1

The first round involves the execution of the CEA Task on a carefully selected subset of 870 tables comprising a total of 84,907 cell annotations.
Target Knowledge Graph: Wikidata KG (v. 20240720).

Datasets' Structure

The test set is not included in the dataset in order to preserve the impartiality of the final evaluation and to discourage ad-hoc solutions.

Targets Format

CEA task
filename, row id (0-indexed), column id (0-indexed), entity id

Annotation:
LYQZQ0T5,1,1,Q3576864

Table LYQZQ0T5:
col0,col1,col2
1976,Eat My Dust!,Charles Byron Griffith
1976,Hollywood Boulevard,Joe Dante
1976,Hollywood Boulevard,Allan Arkush
1977,Grand Theft Auto,Ron Howard
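
As an informal illustration of the format above, the following Python sketch builds a CEA submission file from a targets file. The file names (cea_targets.csv, cea_submission.csv) and the link_cell placeholder are hypothetical; it assumes the targets file lists filename, row id and column id, to which the submission appends the Wikidata entity id, as in the example annotation above.

import csv

def link_cell(table_id, row_id, col_id):
    # Placeholder for an LLM-based annotator; returns a Wikidata QID or None to abstain.
    # Hard-coded here only to reproduce the example annotation LYQZQ0T5,1,1,Q3576864.
    return "Q3576864" if (table_id, row_id, col_id) == ("LYQZQ0T5", 1, 1) else None

with open("cea_targets.csv", newline="") as targets, \
     open("cea_submission.csv", "w", newline="") as out:
    writer = csv.writer(out)
    # Assumed target format: filename, row id (0-indexed), column id (0-indexed)
    for table_id, row_id, col_id in csv.reader(targets):
        qid = link_cell(table_id, int(row_id), int(col_id))
        if qid is not None:  # omit cells you cannot annotate instead of guessing
            writer.writerow([table_id, row_id, col_id, qid])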

Evaluation Criteria

Precision, Recall and F1 score are calculated as follows:

$$ Precision = \frac{\#correct\_annotations}{\#submitted\_annotations} $$
$$ Recall = \frac{\#correct\_annotations}{\#ground\_truth\_annotations} $$
$$ F_1 = \frac{2 \times Precision \times Recall}{Precision + Recall} $$

Notes:

  • \(\#\) denotes the number of annotations of the corresponding kind.
  • \(F_1\) is used as the primary score, and \(Precision\) is used as the secondary score.
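
For reference, the following sketch shows how the three scores above can be computed from a submission file and a ground-truth file. The file names are placeholders and the official scorer's exact matching rules (e.g., handling of entity redirects) may differ.

import csv

def load(path):
    # key = (filename, row id, column id), value = entity id
    with open(path, newline="") as f:
        return {(r[0], r[1], r[2]): r[3].strip() for r in csv.reader(f)}

submitted = load("cea_submission.csv")
ground_truth = load("cea_gt.csv")

correct = sum(1 for key, qid in submitted.items() if ground_truth.get(key) == qid)
precision = correct / len(submitted) if submitted else 0.0
recall = correct / len(ground_truth) if ground_truth else 0.0
f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
print(f"P={precision:.4f} R={recall:.4f} F1={f1:.4f}")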

Submission

Are you ready? Then, submit the annotations via the Google Form.

secu-table

TASK: CEA (Wikidata)
TASK: CEA, CTA, CPA (SEPSES)

Participants will address the Semantic Table Interpretation challenges using the secu-table dataset.
The secu-table dataset consists of security data extracted from the Common Vulnerabilities and Exposures (CVE) and Common Weakness Enumeration (CWE) data sources.

The evaluation will focus on the Cell Entity Annotation (CEA) task, the Column Type Annotation (CTA) task, and the Column Property Annotation (CPA) task using the SEPSES Computer Security Knowledge Graph and the Wikidata KG.

The evaluation of the participants' results will consider Recall, Precision, and F1 score. In addition to these scores, participants are invited to evaluate the LLMs' capability to make a prediction or to abstain (i.e., to say "I don't know").


Round 1

The first round involves the execution of the CEA, CTA, and CPA Tasks on a dataset composed of 1,554 tables.

The evaluation will focus on the CEA, CTA, and CPA tasks using the SEPSES Computer Security Knowledge Graph, and on the CEA task using the Wikidata KG.

Datasets' Structure

The secu-table dataset is composed of 1,554 tables: 76 tables provided as ground truth and 1,478 tables for testing. 20% of the tables contain no errors, while 80% contain errors such as ambiguity, NIL mentions, missing context, and misspelt data.

Targets Format

CEA task
filename, row id (0-indexed), column id (0-indexed), entity
CTA task
filename, column id (0-indexed), entity
CPA task
filename, col0, column id (1-indexed), property
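
To illustrate the three target formats, here is a minimal, purely hypothetical sketch that writes one row per task. All table names, indices, and IRI placeholders are invented; for CPA the first index is read as the subject column (col0) and the second as the 1-indexed object column.

import csv

# CEA: filename, row id (0-indexed), column id (0-indexed), entity
cea_rows = [("TABLE01", "3", "1", "ENTITY_ID_OR_IRI")]
# CTA: filename, column id (0-indexed), entity (class)
cta_rows = [("TABLE01", "1", "CLASS_IRI")]
# CPA: filename, subject column (col0), object column id (1-indexed), property
cpa_rows = [("TABLE01", "0", "2", "PROPERTY_IRI")]

for path, rows in [("cea.csv", cea_rows), ("cta.csv", cta_rows), ("cpa.csv", cpa_rows)]:
    with open(path, "w", newline="") as f:
        csv.writer(f).writerows(rows)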

Evaluation Criteria

Precision, Recall and F1 score are calculated as follows:

$$ Precision = \frac{\#correct\_annotations}{\#submitted\_annotations} $$
$$ Recall = \frac{\#correct\_annotations}{\#ground\_truth\_annotations} $$
$$ F_1 = \frac{2 \times Precision \times Recall}{Precision + Recall} $$

Notes:

  • For selective prediction, the evaluation takes into account whether the LLM makes a prediction or abstains (i.e., answers "I don't know").
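
Since the exact selective-prediction score is not fixed here, the following is only an illustrative sketch of how abstentions could be factored in. It assumes abstentions are reported with a dedicated token and reports coverage alongside precision on the answered cells; the token and the dictionary keying are assumptions.

def selective_scores(predictions, ground_truth, abstain_token="I don't know"):
    # predictions / ground_truth: dicts keyed by (filename, row id, column id).
    answered = {k: v for k, v in predictions.items() if v != abstain_token}
    coverage = len(answered) / len(predictions) if predictions else 0.0
    correct = sum(1 for k, v in answered.items() if ground_truth.get(k) == v)
    precision_on_answered = correct / len(answered) if answered else 0.0
    return coverage, precision_on_answered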

Submission

Are you ready? Then, submit the annotations via the Google Form.

Paper Guidelines

We invite participants to submit a paper via EasyChair.

Submissions must not exceed six pages and should be formatted using either the CEUR LaTeX or Word template. Each paper will be reviewed by one or two challenge organisers.

Accepted papers will be published in a CEUR-WS volume. By submitting a paper, authors agree to comply with the CEUR-WS publication guidelines.

Co-Chairs

  • Marco Cremaschi, University of Milan - Bicocca (marco.cremaschi@unimib.it)
  • Fabio D'Adda, University of Milan - Bicocca (fabio.dadda@unimib.it)
  • Fidel Jiomekong Azanzi, University of Yaoundé, Cameroon (fidel.jiomekong@facsciences-uy1.cm)
  • Jean Petit Yvelos, University of Yaoundé, Cameroon (jeanpetityvelos@gmail.com)
  • Ernesto Jimenez-Ruiz, City St George's, University of London (ernesto.jimenez-ruiz@citystgeorges.ac.uk)
  • Oktie Hassanzadeh, IBM Research (hassanzadeh@us.ibm.com)

Acknowledgements

The challenge is currently supported by IBM Research and ISWC 2025.


Tentative Schedule

  • Release of datasets and instructions: June 9th, 2025
  • Round 1 & paper submission deadline: August 8th, 2025 (AoE)
  • Initial results & ISWC 2025 presentation invitations: August 15th, 2025
  • Camera-ready paper deadline: September 15th, 2025
  • New or revised submissions accepted until (Round 2): October 20th, 2025
  • Final results announced at the conference: November 2-6, 2025

Note: To be invited to present at the conference, you must submit to Round 1.