Accuracy Track

Description

The accuracy evaluation is similar to previous editions of SemTab: submissions are scored with the standard multi-class classification metrics detailed below. In addition, we adopt the "cscore" for the CTA task, which reflects the distance in the type hierarchy between the predicted column type and the ground-truth semantic type.
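
As a rough illustration only (the decay weighting and parent lookup below are assumptions, not the official cscore definition), a hierarchy-aware score can credit a predicted type in proportion to its distance from the ground-truth type:

```python
# Illustrative only: a hierarchy-aware column-type score that decays with the
# number of edges between the predicted type and the ground-truth type.
# The decay factor and the parent map are assumptions, not the official cscore.
def hierarchy_score(predicted: str, ground_truth: str,
                    parents: dict[str, str], decay: float = 0.8) -> float:
    """Return 1.0 for an exact match, decay**d if `predicted` is an ancestor
    of `ground_truth` at distance d, and 0.0 otherwise."""
    distance, current = 0, ground_truth
    while current is not None:
        if current == predicted:
            return decay ** distance
        current = parents.get(current)
        distance += 1
    return 0.0

# Toy hierarchy: Q5 (human) -> Q215627 (person) -> Q35120 (entity).
parents = {"Q5": "Q215627", "Q215627": "Q35120"}
print(hierarchy_score("Q215627", "Q5", parents))  # 0.8: one step above the exact type
```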

Matching Tasks:

Table Types:

Horizontal Table Example:

[Figure: example of a horizontal (relational) table]

Entity Table Example:

[Figure: example of an entity (vertical) table]

Evaluation Criteria

Precision, Recall, and F1 score are calculated as:

\[Precision = \frac{\#\,correct\_annotations}{\#\,submitted\_annotations}\]
\[Recall = \frac{\#\,correct\_annotations}{\#\,ground\_truth\_annotations}\]
\[F1 = \frac{2 \times Precision \times Recall}{Precision + Recall}\]
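
A minimal sketch of computing these metrics from annotation counts (the function and the example counts are illustrative, not part of the official evaluator):

```python
def score(correct: int, submitted: int, ground_truth: int) -> dict:
    """Compute Precision, Recall, and F1 from annotation counts."""
    precision = correct / submitted if submitted else 0.0
    recall = correct / ground_truth if ground_truth else 0.0
    denom = precision + recall
    f1 = (2 * precision * recall / denom) if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}

# Example: 90 correct annotations out of 95 submitted, 100 targets in the ground truth.
print(score(correct=90, submitted=95, ground_truth=100))
```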

Notes:

Round 1

Datasets

Round 2

Datasets

Round 2 datasets are larger versions of the Round 1 datasets. The main goal is to measure the accuracy of more scalable solutions, since there is often a trade-off between accuracy and runtime performance.

Target Knowledge Graph: Wikidata. For offline processing, use the March 20, 2024 dump. Reach out to the organizers if you need assistance in setting up a triplestore.
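
A minimal sketch of querying Wikidata with SPARQL from Python, using the public query service here only as a stand-in for a locally hosted triplestore loaded with the March 20, 2024 dump (the endpoint, user agent, and query are illustrative):

```python
from SPARQLWrapper import SPARQLWrapper, JSON

# Public endpoint used only as a stand-in; for offline evaluation, point this at a
# local SPARQL endpoint serving the March 20, 2024 Wikidata dump.
sparql = SPARQLWrapper("https://query.wikidata.org/sparql", agent="semtab-example/0.1")
sparql.setReturnFormat(JSON)
sparql.setQuery("""
    SELECT ?type WHERE {
        wd:Q90 wdt:P31 ?type .   # instance-of types for Paris (Q90)
    } LIMIT 10
""")
for row in sparql.query().convert()["results"]["bindings"]:
    print(row["type"]["value"])
```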

Datasets' Structure

All datasets consist of two data folds (training and validation). WikidataTables contains only relational (horizontal) tables. tBiodiv and tBiomed contain two table types, entity (vertical) and relational (horizontal) tables, with the following supported tasks (see the loading sketch after this list):

Supported Task
Targets Format
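
As a rough sketch of iterating over one data fold (the folder layout, fold names, and CSV format below are assumptions for illustration, not the official release structure):

```python
from pathlib import Path
import pandas as pd

# Hypothetical layout: <dataset>/<fold>/tables/*.csv, e.g. WikidataTables/train/tables/.
def load_fold(dataset_dir: str, fold: str) -> dict[str, pd.DataFrame]:
    """Read every table of one fold into a DataFrame keyed by table name."""
    tables = {}
    for path in sorted(Path(dataset_dir, fold, "tables").glob("*.csv")):
        tables[path.stem] = pd.read_csv(path)
    return tables

train_tables = load_fold("WikidataTables", "train")
print(f"Loaded {len(train_tables)} training tables")
```
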
Participate!

Submission: Are you ready? Then submit your results on the test set using the following forms:

Track Organizers