AutoTrain documentation
Sentence Transformers
This task lets you easily train or fine-tune a Sentence Transformer model on your own dataset.
AutoTrain supports the following types of Sentence Transformer fine-tuning:
- pair: dataset with two sentences: anchor and positive
- pair_class: dataset with two sentences (premise and hypothesis) and a target label
- pair_score: dataset with two sentences (sentence1 and sentence2) and a target score
- triplet: dataset with three sentences: anchor, positive and negative
- qa: dataset with two sentences: query and answer
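The column names listed above can be collected into a small lookup table to check a dataset before training. This is a minimal sketch, not part of AutoTrain: `REQUIRED_COLUMNS` and `missing_columns` are hypothetical names chosen for illustration, and the column names are simply the ones shown on this page.

```python
# Column names per trainer type, copied from the list above.
# REQUIRED_COLUMNS and missing_columns are illustrative helpers, not AutoTrain APIs.
REQUIRED_COLUMNS = {
    "pair": ["anchor", "positive"],
    "pair_class": ["premise", "hypothesis", "label"],
    "pair_score": ["sentence1", "sentence2", "score"],
    "triplet": ["anchor", "positive", "negative"],
    "qa": ["query", "answer"],
}


def missing_columns(trainer, columns):
    """Return the expected columns for `trainer` that are absent from `columns`."""
    present = set(columns)
    return [name for name in REQUIRED_COLUMNS[trainer] if name not in present]


# A triplet dataset that lacks its "negative" column.
print(missing_columns("triplet", ["anchor", "positive"]))  # ['negative']
```

An empty list means every expected column is present.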
Data Format
Sentence Transformers fine-tuning accepts data in CSV or JSONL format. You can also use a dataset from the Hugging Face Hub.
pair
For pair training, the data should be in the following format:
| anchor | positive | 
|---|---|
| hello | hi | 
| how are you | I am fine | 
| What is your name? | My name is Abhishek | 
| Which is the best programming language? | Python | 
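As a rough illustration, the pair table above could be written as CSV with Python's standard `csv` module. The sketch writes to an in-memory buffer so it runs without touching the file system; the variable names are arbitrary, and in practice you would write to a file of your choosing instead.

```python
import csv
import io

# Two rows from the pair table above.
rows = [
    {"anchor": "hello", "positive": "hi"},
    {"anchor": "how are you", "positive": "I am fine"},
]

# Write a header line followed by one line per pair.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["anchor", "positive"], lineterminator="\n")
writer.writeheader()
writer.writerows(rows)

csv_text = buffer.getvalue()
print(csv_text)
```

The `csv` module quotes fields that contain commas or quotes, which matters for free-form sentences.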
pair_class
For pair_class training, the data should be in the following format:
| premise | hypothesis | label | 
|---|---|---|
| hello | hi | 1 | 
| how are you | I am fine | 0 | 
| What is your name? | My name is Abhishek | 1 | 
| Which is the best programming language? | Python | 1 | 
pair_score
For pair_score training, the data should be in the following format:
| sentence1 | sentence2 | score | 
|---|---|---|
| hello | hi | 0.8 | 
| how are you | I am fine | 0.2 | 
| What is your name? | My name is Abhishek | 0.9 | 
| Which is the best programming language? | Python | 0.7 | 
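Values read from a CSV file arrive as strings, so it can help to confirm that every score parses as a number before training. The following is a minimal sketch using only the standard library; `load_pair_score` is a hypothetical helper, and the sample text mirrors the first two rows of the table above.

```python
import csv
import io

# Sample pair_score data in CSV form (first two rows of the table above).
csv_text = "sentence1,sentence2,score\nhello,hi,0.8\nhow are you,I am fine,0.2\n"


def load_pair_score(text):
    """Parse pair_score CSV text and convert each score to a float.

    Raises ValueError if a score is not numeric.
    """
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        row["score"] = float(row["score"])
        rows.append(row)
    return rows


pair_score_rows = load_pair_score(csv_text)
print(pair_score_rows[0])
```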
triplet
For triplet training, the data should be in the following format:
| anchor | positive | negative | 
|---|---|---|
| hello | hi | bye | 
| how are you | I am fine | I am not fine | 
| What is your name? | My name is Abhishek | What's it to you? | 
| Which is the best programming language? | Python | Javascript | 
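The same triplet data can also be stored as JSONL, with one JSON object per line. This is a small sketch using the standard `json` module, assuming the keys match the column names in the table above; it builds the text in memory and parses it back to confirm the round trip.

```python
import json

# Two rows from the triplet table above.
triplets = [
    {"anchor": "hello", "positive": "hi", "negative": "bye"},
    {"anchor": "how are you", "positive": "I am fine", "negative": "I am not fine"},
]

# JSONL: one JSON object per line.
jsonl_text = "\n".join(json.dumps(item) for item in triplets)
print(jsonl_text)

# Parse the text back to check that nothing was lost.
parsed = [json.loads(line) for line in jsonl_text.splitlines()]
```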
qa
For qa training, the data should be in the following format:
| query | answer | 
|---|---|
| hello | hi | 
| how are you | I am fine | 
| What is your name? | My name is Abhishek | 
| Which is the best programming language? | Python |