SetFit with sentence-transformers/paraphrase-mpnet-base-v2
This is a SetFit model for text classification. It uses sentence-transformers/paraphrase-mpnet-base-v2 as the Sentence Transformer embedding model and a SetFitHead instance for classification.
The model has been trained using an efficient few-shot learning technique that involves:
- Fine-tuning a Sentence Transformer with contrastive learning.
- Training a classification head with features from the fine-tuned Sentence Transformer.
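The first step can be pictured with a small pair-generation sketch. It is a simplified illustration under stated assumptions, not SetFit's actual sampler: the function name `make_contrastive_pairs`, the toy sentences, and the sampling rule (one same-label and one different-label partner per sentence per iteration) are all hypothetical.

```python
import random
from typing import List, Tuple


def make_contrastive_pairs(
    texts: List[str], labels: List[int], num_iterations: int, seed: int = 42
) -> List[Tuple[str, str, float]]:
    """Build (sentence_a, sentence_b, target) triples for contrastive fine-tuning.

    Hypothetical helper: target is 1.0 when both sentences share a label
    and 0.0 when they do not.
    """
    rng = random.Random(seed)
    indexed = list(zip(texts, labels))
    pairs = []
    for _ in range(num_iterations):
        for i, (text, label) in enumerate(indexed):
            # Candidate partners with the same label (excluding the sentence itself)
            same = [t for j, (t, l) in enumerate(indexed) if l == label and j != i]
            # Candidate partners with any other label
            different = [t for t, l in indexed if l != label]
            if same:
                pairs.append((text, rng.choice(same), 1.0))
            if different:
                pairs.append((text, rng.choice(different), 0.0))
    return pairs


# Toy data, invented for illustration only
texts = ["great film", "loved it", "terrible plot", "boring scenes"]
labels = [0, 0, 1, 1]
pairs = make_contrastive_pairs(texts, labels, num_iterations=2)
print(len(pairs), pairs[:2])
```

Pairs with target 1.0 pull same-class sentences together during fine-tuning, while pairs with target 0.0 push different classes apart. The fine-tuned embeddings then serve as features for the classification head in the second step.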
Model Details
Model Description
Model Sources
Model Labels
| Label | Examples |
|-------|----------|
| 6 | <ul><li>'3 -RRB- Republican congressional representatives , because of their belief in a minimalist state , are less willing to engage in local benefit-seeking than are Democratic members of Congress . '</li><li>'That is the way the system works . '</li><li>'Duck swarms . '</li></ul> |
| 2 | <ul><li>'It explains how the Committee for Medicinal Products for Veterinary Use ( CVMP ) assessed the studies performed , to reach their recommendations on how to use the medicine . '</li><li>'Tricks such as those of Alonso and Ramos before the Ajax demonstrate wittiness but not the will to get remove of a sanction . '</li><li>'The next day , Sunday , the hangover reminded Haney where he had been the night before . '</li></ul> |
| 3 | <ul><li>'If it is , it will be treated as an operator , if it is not , it will be treated as a user function . '</li><li>'Back in the chase car , we drove around some more , got stuck in a ditch , enlisted the aid of a local farmer to get out the trailer hitch and pull us out of the ditch . '</li><li>"It was the most exercise we 'd had all morning and it was followed by our driving immediately to the nearest watering hole . "</li></ul> |
| 5 | <ul><li>'The discovery of a strange bacteria that can use arsenic as one of its nutrients widens the scope for finding new forms of life on Earth and possibly beyond . '</li><li>'I felt the temblor begin and glanced at the table next to mine , smiled that guilty smile and we both mouthed the words , Earth-quake ! together . '</li><li>'Already two major pharmaceutical companies , the Squibb unit of Bristol-Myers Squibb Co. and Hoffmann-La Roche Inc. , are collaborating with gene hunters to turn the anticipated cascade of discoveries into predictive tests and , maybe , new therapies . '</li></ul> |
| 0 | <ul><li>'Prior to 1932 , the pattern was nearly the opposite . '</li><li>'A minor contrast to Costa Rica , comparing the 22 players called by both countries for the friendly game today , at 3:05 pm at the National Stadium in San Jose . '</li><li>'Never in my life have I been so frightened . '</li></ul> |
| 4 | <ul><li>'To ring for even one service at this tower , we have to scrape , says Mr. Hammond , a retired water-authority worker . '</li><li>'It is a passion that usually stays in the tower , however . '</li><li>'One writer , signing his letter as Red-blooded , balanced male , remarked on the frequency of women fainting in peals , and suggested that they settle back into their traditional role of making tea at meetings . `` '</li></ul> |
| 1 | <ul><li>'Bribe by bribe , Mr. Sternberg and his co-author , Matthew C. Harrison Jr. , lead us along the path Wedtech traveled , from its inception as a small manufacturing company to the status of full-fledged defense contractor , entrusted with the task of producing vital equipment for the Army and Navy . '</li><li>"kalgebra 's console is useful as a calculator . "</li><li>'Then a wild thought ran circles through his clouded brain . '</li></ul> |
Uses
Direct Use for Inference
First install the SetFit library:
pip install setfit
Then you can load this model and run inference.
from setfit import SetFitModel

# Download the model from the Hugging Face Hub
model = SetFitModel.from_pretrained("HelgeKn/SemEval-multi-class-10")
# Run inference on a single sentence
preds = model("To break the uncomfortable silence , Haney began to talk . ")
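For more than one sentence, a small wrapper can keep the calling code tidy. The sketch below is an illustration rather than part of the SetFit API: `predict_labels`, `label_histogram`, and `dummy_model` are hypothetical names, and the code assumes the model can be called on a list of strings and returns one integer-like prediction per sentence (labels 0 to 6 for this model). The stand-in `dummy_model` only lets the example run offline and does not reflect the real classifier's behavior.

```python
from collections import Counter
from typing import Callable, Iterable, List, Sequence


def predict_labels(
    model: Callable[[List[str]], Iterable], sentences: Sequence[str]
) -> List[int]:
    """Call the model once on the whole batch and return plain Python ints."""
    raw_predictions = model(list(sentences))
    return [int(p) for p in raw_predictions]


def label_histogram(labels: Sequence[int]) -> Counter:
    """Count how often each label was predicted."""
    return Counter(labels)


def dummy_model(sentences: List[str]) -> List[int]:
    """Hypothetical offline stand-in, NOT the real classifier."""
    return [len(s.split()) % 7 for s in sentences]


sentences = [
    "To break the uncomfortable silence , Haney began to talk . ",
    "That is the way the system works . ",
]
labels = predict_labels(dummy_model, sentences)
print(labels, label_histogram(labels))
```

With the real model, pass the `model` object loaded by `SetFitModel.from_pretrained` in place of `dummy_model`.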
Training Details
Training Set Metrics
| Training set | Min | Median | Max |
|--------------|-----|--------|-----|
| Word count | 4 | 28.1286 | 74 |

| Label | Training Sample Count |
|-------|-----------------------|
| 0 | 10 |
| 1 | 10 |
| 2 | 10 |
| 3 | 10 |
| 4 | 10 |
| 5 | 10 |
| 6 | 10 |
Training Hyperparameters
- batch_size: (16, 16)
- num_epochs: (2, 2)
- max_steps: -1
- sampling_strategy: oversampling
- num_iterations: 20
- body_learning_rate: (2e-05, 2e-05)
- head_learning_rate: 2e-05
- loss: CosineSimilarityLoss
- distance_metric: cosine_distance
- margin: 0.25
- end_to_end: False
- use_amp: False
- warmup_proportion: 0.1
- seed: 42
- eval_max_steps: -1
- load_best_model_at_end: False
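The `loss: CosineSimilarityLoss` setting means the embedding body is trained so that the cosine similarity of each sentence pair approaches its target value. The sketch below is a plain-Python illustration of that objective as a mean squared error, assuming targets of 1.0 for same-label pairs and 0.0 for different-label pairs. The function names and toy vectors are hypothetical, and the real loss operates on batched tensors inside Sentence Transformers.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cosine_similarity_loss(
    pairs: List[Tuple[Sequence[float], Sequence[float], float]]
) -> float:
    """Mean squared error between each pair's cosine similarity and its target."""
    squared_errors = [
        (cosine_similarity(u, v) - target) ** 2 for u, v, target in pairs
    ]
    return sum(squared_errors) / len(squared_errors)


# Toy embeddings, invented for illustration only
well_separated = [
    ([1.0, 0.0], [1.0, 0.0], 1.0),  # same label, identical direction
    ([1.0, 0.0], [0.0, 1.0], 0.0),  # different labels, orthogonal
]
collapsed = [
    ([1.0, 0.0], [1.0, 0.0], 0.0),  # different labels, yet identical direction
]
print(cosine_similarity_loss(well_separated), cosine_similarity_loss(collapsed))
```

The loss is zero when same-label embeddings align and different-label embeddings are orthogonal, and it grows as the embeddings of different classes become harder to tell apart.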
Training Results
| Epoch | Step | Training Loss | Validation Loss |
|-------|------|---------------|-----------------|
| 0.0057 | 1 | 0.2488 | - |
| 0.2857 | 50 | 0.2041 | - |
| 0.5714 | 100 | 0.1094 | - |
| 0.8571 | 150 | 0.0478 | - |
| 1.1429 | 200 | 0.0378 | - |
| 1.4286 | 250 | 0.0089 | - |
| 1.7143 | 300 | 0.0036 | - |
| 2.0 | 350 | 0.0029 | - |
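The step and epoch columns can be cross-checked against the hyperparameters and sample counts reported above. The arithmetic below is a sketch under one assumption that is not stated in this card: each of the `num_iterations` passes yields one positive and one negative pair per training sentence. That assumption reproduces the 350 total steps and the epoch fractions in the Training Results table.

```python
import math

# Values taken from the Training Details section
num_labels = 7
samples_per_label = 10
num_iterations = 20
batch_size = 16
num_epochs = 2

num_samples = num_labels * samples_per_label          # 70 training sentences
# Assumption: one positive and one negative pair per sentence per iteration
num_pairs = 2 * num_iterations * num_samples          # 2800 sentence pairs
steps_per_epoch = math.ceil(num_pairs / batch_size)   # 175 optimizer steps
total_steps = steps_per_epoch * num_epochs            # 350 optimizer steps


def epoch_at(step: int) -> float:
    """Fractional epoch reached after `step` optimizer steps."""
    return round(step / steps_per_epoch, 4)


# Epoch values as logged in the Training Results table
logged = {1: 0.0057, 50: 0.2857, 100: 0.5714, 150: 0.8571,
          200: 1.1429, 250: 1.4286, 300: 1.7143, 350: 2.0}
matches = all(epoch_at(step) == epoch for step, epoch in logged.items())
print(num_pairs, steps_per_epoch, total_steps, matches)
```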
Framework Versions
- Python: 3.9.13
- SetFit: 1.0.1
- Sentence Transformers: 2.2.2
- Transformers: 4.36.0
- PyTorch: 2.1.1+cpu
- Datasets: 2.15.0
- Tokenizers: 0.15.0
Citation
BibTeX
@article{https://doi.org/10.48550/arxiv.2209.11055,
doi = {10.48550/ARXIV.2209.11055},
url = {https://arxiv.org/abs/2209.11055},
author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
title = {Efficient Few-Shot Learning Without Prompts},
publisher = {arXiv},
year = {2022},
copyright = {Creative Commons Attribution 4.0 International}
}