Create README.md
README.md
ADDED
@@ -0,0 +1,37 @@
---
tags:
- MRC
- TyDiQA
- xlm-roberta-large
language:
- multilingual
---

# Model description

An XLM-RoBERTa reading comprehension model for the [TyDiQA Primary Tasks](https://arxiv.org/abs/2003.05002):

- **Passage selection task (SelectP):** Given a list of the passages in the article, return either (a) the index of the passage that answers the question or (b) NULL if no such passage exists.

- **Minimal answer span task (MinSpan):** Given the full text of an article, return one of (a) the start and end byte indices of the minimal span that completely answers the question; (b) YES or NO if the question requires a yes/no answer and we can draw a conclusion from the passage; (c) NULL if it is not possible to produce a minimal answer for this question.

The model is initialized with [xlm-roberta-large](https://huggingface.co/xlm-roberta-large/) and fine-tuned on the [TyDiQA train data](https://huggingface.co/datasets/tydiqa).
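
The MinSpan task above reports answers as byte indices rather than character indices, which matters for non-ASCII scripts where one character can take several bytes. The following is a minimal sketch of that distinction and of the SelectP output shape. It is not taken from the model or from PrimeQA: the helper names `char_span_to_byte_span` and `select_passage`, the half-open `[start, end)` span convention, the UTF-8 encoding, and the example passage ranges are all assumptions made for illustration.

```python
from typing import List, Optional, Tuple


def char_span_to_byte_span(text: str, start: int, end: int) -> Tuple[int, int]:
    """Convert a [start, end) character span into a [start, end) UTF-8 byte span.

    Hypothetical helper: the half-open convention and UTF-8 are assumptions.
    """
    byte_start = len(text[:start].encode("utf-8"))
    byte_end = byte_start + len(text[start:end].encode("utf-8"))
    return byte_start, byte_end


def select_passage(
    passages: List[Tuple[int, int]], span: Tuple[int, int]
) -> Optional[int]:
    """Return the index of the passage whose byte range contains the span.

    Returns None (the NULL case) when no passage contains it.
    """
    for index, (p_start, p_end) in enumerate(passages):
        if p_start <= span[0] and span[1] <= p_end:
            return index
    return None


article = "Столица России: Москва."
answer = "Москва"

# Character offsets of the answer inside the article.
char_start = article.index(answer)
char_end = char_start + len(answer)

# Each Cyrillic letter takes two bytes in UTF-8, so the offsets differ.
byte_span = char_span_to_byte_span(article, char_start, char_end)
print((char_start, char_end))  # (16, 22)
print(byte_span)               # (29, 41)

# Made-up passage byte ranges covering the article.
passages = [(0, 28), (29, 42)]
print(select_passage(passages, byte_span))  # 1
```

Slicing the encoded article with the byte span (`article.encode("utf-8")[29:41]`) recovers the answer text, which is a quick way to check that offsets are being interpreted as bytes and not characters.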

## Intended uses & limitations

You can use the model as-is for the reading comprehension tasks described above (passage selection and minimal answer span extraction).

## Usage

You can use this model directly with the [PrimeQA](https://github.com/primeqa/primeqa) pipeline for reading comprehension; see the [tydiqa.ipynb](https://github.com/primeqa/primeqa/blob/main/notebooks/mrc/tydiqa.ipynb) notebook for an example.

### BibTeX entry and citation info

```bibtex
@article{tydiqa,
  title   = {TyDi QA: A Benchmark for Information-Seeking Question Answering in Typologically Diverse Languages},
  author  = {Jonathan H. Clark and Eunsol Choi and Michael Collins and Dan Garrette and Tom Kwiatkowski and Vitaly Nikolaev and Jennimaria Palomaki},
  year    = {2020},
  journal = {Transactions of the Association for Computational Linguistics}
}
```