CrossEncoder based on distilbert/distilroberta-base
This is a Cross Encoder model finetuned from distilbert/distilroberta-base on the all-nli dataset using the sentence-transformers library. It computes scores for pairs of texts, which can be used for text pair classification.
Model Details
Model Description
- Model Type: Cross Encoder
- Base model: distilbert/distilroberta-base
- Maximum Sequence Length: 514 tokens
- Number of Output Labels: 3 labels
- Training Dataset: all-nli
- Language: en
Model Sources
- Documentation: Sentence Transformers Documentation
- Documentation: Cross Encoder Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Cross Encoders on Hugging Face
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import CrossEncoder
# Download from the 🤗 Hub
model = CrossEncoder("tomaarsen/reranker-distilroberta-base-nli")
# Get scores for pairs of texts
pairs = [
    ['Two women are embracing while holding to go packages.', 'The sisters are hugging goodbye while holding to go packages after just eating lunch.'],
    ['Two women are embracing while holding to go packages.', 'Two woman are holding packages.'],
    ['Two women are embracing while holding to go packages.', 'The men are fighting outside a deli.'],
    ['Two young children in blue jerseys, one with the number 9 and one with the number 2 are standing on wooden steps in a bathroom and washing their hands in a sink.', 'Two kids in numbered jerseys wash their hands.'],
    ['Two young children in blue jerseys, one with the number 9 and one with the number 2 are standing on wooden steps in a bathroom and washing their hands in a sink.', 'Two kids at a ballgame wash their hands.'],
]
scores = model.predict(pairs)
print(scores.shape)
# (5, 3)
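Each row of `scores` holds three values, one per label. The sketch below shows one way to turn such rows into label names. It assumes the values are raw logits and that the label order is 0 = entailment, 1 = neutral, 2 = contradiction; that order is inferred from the dataset samples listed in this card, not read from the model config. The `LABELS` list, the `decode` helper, and the example logits are illustrative only.

```python
import math

# Assumed label order, inferred from the dataset samples in this card.
LABELS = ["entailment", "neutral", "contradiction"]


def softmax(row):
    """Convert a list of logits into probabilities (numerically stable)."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def decode(scores):
    """Map each row of per-label scores to (label name, probability)."""
    results = []
    for row in scores:
        probs = softmax([float(x) for x in row])
        best = max(range(len(probs)), key=probs.__getitem__)
        results.append((LABELS[best], probs[best]))
    return results


# Made-up logits standing in for the output of model.predict(pairs).
example_scores = [[2.0, 0.5, -1.0], [-1.5, 0.2, 3.1]]
for label, prob in decode(example_scores):
    print(f"{label}: {prob:.3f}")
```

Because `decode` only iterates over rows, it should also accept the array returned by `model.predict(pairs)` directly.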
Evaluation
Metrics
Cross Encoder Classification
- Datasets: AllNLI-dev and AllNLI-test
- Evaluated with CrossEncoderClassificationEvaluator
| Metric | AllNLI-dev | AllNLI-test |
|---|---|---|
| f1_macro | 0.8572 | 0.7751 |
| f1_micro | 0.858 | 0.7755 |
| f1_weighted | 0.8572 | 0.776 |
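The three metrics differ only in how per-class F1 scores are averaged. The following is a pure-Python sketch of the standard definitions, not the evaluator's own code; the `f1_scores` helper and the toy labels are illustrative.

```python
from collections import Counter


def f1_scores(y_true, y_pred):
    """Return macro, micro, and weighted F1 for single-label predictions."""
    labels = sorted(set(y_true) | set(y_pred))
    support = Counter(y_true)  # number of true samples per class
    per_class = {}
    for c in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        per_class[c] = 2 * tp / denom if denom else 0.0
    # Macro: unweighted mean of the per-class F1 scores.
    macro = sum(per_class.values()) / len(labels)
    # Micro: with exactly one label per sample, this equals accuracy.
    micro = sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)
    # Weighted: per-class F1 averaged by class support.
    weighted = sum(per_class[c] * support[c] for c in labels) / len(y_true)
    return {"f1_macro": macro, "f1_micro": micro, "f1_weighted": weighted}


# Toy example with three balanced classes.
print(f1_scores([0, 0, 1, 1, 2, 2], [0, 1, 1, 1, 2, 0]))
```

With balanced classes, as in this dataset, macro and weighted F1 nearly coincide, which is consistent with the close values in the table above.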
Training Details
Training Dataset
all-nli
- Dataset: all-nli at d482672
- Size: 100,000 training samples
- Columns: premise, hypothesis, and label
- Approximate statistics based on the first 1000 samples:
| | premise | hypothesis | label |
|---|---|---|---|
| type | string | string | int |
| details | min: 23 characters; mean: 69.54 characters; max: 227 characters | min: 11 characters; mean: 38.26 characters; max: 131 characters | 0: ~33.40%; 1: ~33.30%; 2: ~33.30% |
- Samples:
| premise | hypothesis | label |
|---|---|---|
| A person on a horse jumps over a broken down airplane. | A person is training his horse for a competition. | 1 |
| A person on a horse jumps over a broken down airplane. | A person is at a diner, ordering an omelette. | 2 |
| A person on a horse jumps over a broken down airplane. | A person is outdoors, on a horse. | 0 |
- Loss: CrossEntropyLoss
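For a single sample, cross-entropy loss is the negative log-probability that the softmax over the three output scores assigns to the correct label. A minimal sketch of that formula follows; the `cross_entropy` helper and the example logits are illustrative, not taken from the training code.

```python
import math


def cross_entropy(logits, label):
    """Negative log-probability of `label` under softmax(logits)."""
    m = max(logits)
    # log-sum-exp with the max subtracted for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum_exp - logits[label]


print(cross_entropy([0.0, 0.0, 0.0], 0))   # uniform guess over 3 classes: ln(3)
print(cross_entropy([4.0, 0.0, -2.0], 0))  # confident and correct: small loss
print(cross_entropy([4.0, 0.0, -2.0], 2))  # confident and wrong: large loss
```

With three balanced classes, an uninformed model scores about ln 3 ≈ 1.0986 per sample, slightly above the first logged training loss of 1.0464 at step 100.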
Evaluation Dataset
all-nli
- Dataset: all-nli at d482672
- Size: 1,000 evaluation samples
- Columns: premise, hypothesis, and label
- Approximate statistics based on the first 1000 samples:
| | premise | hypothesis | label |
|---|---|---|---|
| type | string | string | int |
| details | min: 16 characters; mean: 75.01 characters; max: 229 characters | min: 11 characters; mean: 37.66 characters; max: 116 characters | 0: ~33.10%; 1: ~33.30%; 2: ~33.60% |
- Samples:
| premise | hypothesis | label |
|---|---|---|
| Two women are embracing while holding to go packages. | The sisters are hugging goodbye while holding to go packages after just eating lunch. | 1 |
| Two women are embracing while holding to go packages. | Two woman are holding packages. | 0 |
| Two women are embracing while holding to go packages. | The men are fighting outside a deli. | 2 |
- Loss: CrossEntropyLoss
Training Hyperparameters
Non-Default Hyperparameters
- eval_strategy: steps
- per_device_train_batch_size: 64
- per_device_eval_batch_size: 64
- num_train_epochs: 1
- warmup_ratio: 0.1
- bf16: True
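These settings, together with the training set size, determine the length of the run. The arithmetic below is a sketch assuming a single GPU and no gradient accumulation; the rounding used for the warmup step count is an assumption rather than something stated in this card.

```python
import math

train_samples = 100_000   # training set size reported in this card
batch_size = 64           # per_device_train_batch_size, single GPU
epochs = 1                # num_train_epochs
warmup_ratio = 0.1

# The final partial batch still counts as a step (dataloader_drop_last is False).
steps_per_epoch = math.ceil(train_samples / batch_size)
total_steps = steps_per_epoch * epochs
# Assumption: warmup steps are rounded up from total_steps * warmup_ratio.
warmup_steps = math.ceil(total_steps * warmup_ratio)

print(total_steps)                         # 1563
print(warmup_steps)
print(round(100 / steps_per_epoch, 4))     # 0.064
print(round(1500 / steps_per_epoch, 4))    # 0.9597
```

The last two values match the epoch fractions logged at steps 100 and 1500 in the training logs.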
All Hyperparameters
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: steps
- prediction_loss_only: True
- per_device_train_batch_size: 64
- per_device_eval_batch_size: 64
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 5e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 1
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.1
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: True
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- dispatch_batches: None
- split_batches: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- eval_use_gather_object: False
- average_tokens_across_devices: False
- prompts: None
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: proportional
Training Logs
| Epoch | Step | Training Loss | Validation Loss | AllNLI-dev_f1_macro | AllNLI-test_f1_macro |
|---|---|---|---|---|---|
| -1 | -1 | - | - | 0.1775 | - |
| 0.0640 | 100 | 1.0464 | - | - | - |
| 0.1280 | 200 | 0.702 | - | - | - |
| 0.1919 | 300 | 0.6039 | - | - | - |
| 0.2559 | 400 | 0.5658 | - | - | - |
| 0.3199 | 500 | 0.5513 | 0.4792 | 0.7932 | - |
| 0.3839 | 600 | 0.523 | - | - | - |
| 0.4479 | 700 | 0.5261 | - | - | - |
| 0.5118 | 800 | 0.5074 | - | - | - |
| 0.5758 | 900 | 0.4871 | - | - | - |
| 0.6398 | 1000 | 0.5078 | 0.3934 | 0.8407 | - |
| 0.7038 | 1100 | 0.4706 | - | - | - |
| 0.7678 | 1200 | 0.4725 | - | - | - |
| 0.8317 | 1300 | 0.4362 | - | - | - |
| 0.8957 | 1400 | 0.4577 | - | - | - |
| 0.9597 | 1500 | 0.4415 | 0.3599 | 0.8572 | - |
| -1 | -1 | - | - | - | 0.7751 |
Environmental Impact
Carbon emissions were measured using CodeCarbon.
- Energy Consumed: 0.010 kWh
- Carbon Emitted: 0.004 kg of CO2
- Hours Used: 0.037 hours
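The three figures above can be related with simple arithmetic. The sketch below derives the run time in minutes, the implied average power draw, and the implied carbon intensity; since the reported values are rounded, the results are rough estimates, and the variable names are illustrative.

```python
energy_kwh = 0.010   # Energy Consumed
co2_kg = 0.004       # Carbon Emitted
hours = 0.037        # Hours Used

minutes = hours * 60                      # wall-clock duration in minutes
avg_power_w = energy_kwh / hours * 1000   # implied average power draw in watts
intensity = co2_kg / energy_kwh           # implied kg of CO2 per kWh

print(f"{minutes:.1f} min, ~{avg_power_w:.0f} W average, ~{intensity:.1f} kg CO2/kWh")
```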
Training Hardware
- On Cloud: No
- GPU Model: 1 x NVIDIA GeForce RTX 3090
- CPU Model: 13th Gen Intel(R) Core(TM) i7-13700K
- RAM Size: 31.78 GB
Framework Versions
- Python: 3.11.6
- Sentence Transformers: 3.5.0.dev0
- Transformers: 4.49.0
- PyTorch: 2.6.0+cu124
- Accelerate: 1.5.1
- Datasets: 3.3.2
- Tokenizers: 0.21.0
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}