This repository contains the datasets for the Generalizability of Argument Identification in Context shared task at Touché @ CLEF 2026. Participants are asked to build models that classify whether a sentence (with context and metadata) is an Argument or No-Argument sentence across diverse sources.
The table below lists the dataset folders in this repository under `data/`.
| Dataset Folder | Description | Source | License |
|---|---|---|---|
| ABSTRCT | Argument mining dataset from academic abstracts. | https://ecai2020.eu/papers/1470_paper | CC BY-NC-SA 4.0 |
| ACQUA | Comparative sentences expressing preference or superiority (e.g. Matlab vs. Python) across multiple domains. | https://aclanthology.org/W19-4516/ | CC BY 4.0 |
| AEC | Sentences collected from discussions on the CreateDebate platform. | https://aclanthology.org/W15-4631/ | Approved by authors. |
| AFS | Sentences drawn from online debate platforms such as ProCon and iDebate. | https://aclanthology.org/W16-3636/ | Approved by authors. |
| ARGUMINSCI | Sentences originating from the Dr. Inventor scientific argumentation corpus. | https://aclanthology.org/W18-5206/ | Approved by authors. |
| FINARG | Sentences extracted from financial earnings calls of publicly traded companies. | https://aclanthology.org/2022.finnlp-1.22/ | GNU GPL 3.0 |
| IAM | Sentences gathered from heterogeneous web sources. | https://aclanthology.org/2022.acl-long.162/ | Free license. |
| PE | Sentences taken from student-written persuasive essays. | https://aclanthology.org/J17-3005/ | CC BY-NC-ND 4.0 |
| SCIARK | Sentences from scientific literature, including biomedical research articles. | https://aclanthology.org/2021.argmining-1.10/ | Free license. |
| USELEC | Sentences from U.S. presidential election debates and related political discourse. | https://aclanthology.org/P19-1463/ | Free license. |

- Read and follow the task requirements on the Touché shared task page.
- Download the data and follow the further guidelines on the TIRA platform.
- The respective `train`/`dev`/`test` splits will be published sequentially in `data/` as separate files (e.g., `train.jsonl`).
- The paths in `train.jsonl` etc. are relative to the `data/` directory and point to the respective files (e.g., `./ABSTRCT/data/ABSTRCT-1.txt`).
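
As a rough illustration of the path convention described above, the following sketch reads a split file line by line and resolves each referenced file against the `data/` directory. The function name `load_split` and the field name `path` are assumptions made for this example only; the actual keys in the published `.jsonl` files may differ, so check a record before relying on it.

```python
import json
from pathlib import Path


def load_split(data_dir, split_file="train.jsonl", path_key="path"):
    """Yield (record, resolved_path) pairs from a JSONL split file.

    `path_key` is a hypothetical field name; inspect the published
    files to find the key that actually holds the relative path.
    """
    data_dir = Path(data_dir)
    with open(data_dir / split_file, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            record = json.loads(line)
            # Paths are relative to data/, e.g. ./ABSTRCT/data/ABSTRCT-1.txt
            resolved = data_dir / record[path_key]
            yield record, resolved
```

Calling `load_split("data")` from the repository root would then yield each record together with a `Path` pointing at the referenced file, assuming the split file has been placed in `data/`.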