Reranker trained on Custom Dataset

This is a Cross Encoder model finetuned from Alibaba-NLP/gte-multilingual-reranker-base using the sentence-transformers library. It computes scores for pairs of texts, which can be used for text reranking and semantic search.

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import CrossEncoder

# Download from the 🤗 Hub
model = CrossEncoder("lohith-chanchu/reranker-gte-multilingual-reranker-base-custom-bce")
# Get scores for pairs of texts
pairs = [
    ['Gaskugelhahn, Gewinde, DN 32 Gaskugelhahn, zum manuellen Absperren, geeignet für Erdgas, PN 6, nach DIN EN 331, Gehäuse aus Pressmessing, in Durchgangsform, beidseits Gewindeanschluss, DIN-DVGW-zugelassen, DN 32, einschließlich Übergangsstücke sowie Verbindungs-, Dichtungs- und Befestigungsmaterial', 'DITECH Gas-KH m gelbem Hebelgriff und vollem Durchgang 11/4"'],
    ['Gaskugelhahn, Gewinde, DN 40 jedoch DN 40', 'DITECH Gas-KH m gelbem Hebelgriff und vollem Durchgang 11/2"'],
    ['Gaskugelhahn, Gewinde, DN 50 jedoch DN 50', 'DITECH Gas-KH m gelbem Hebelgriff und vollem Durchgang 2"'],
    ['Doppelnippel, Stahl, DN 15, Montagehöhe bis 6,0 m Doppelnippel, aus Kohlenstoffstahl, für Rohrleitung aus mittelschwerem Stahlrohr DIN EN 10255, mit Außengewinde 1/2 , Montagehöhe über Gelände / Fußboden bis 6,0 m', 'HS Rohrdoppelnippel Nr. 23 schwarz 1/2" 100mm'],
    ['Doppelnippel, Stahl, DN 20, Montagehöhe bis 6,0 m jedoch Außengewinde 3/4', 'HS Rohrdoppelnippel Nr. 23 schwarz 3/4" 100mm'],
]
scores = model.predict(pairs)
print(scores.shape)
# (5,)

# Or rank different texts based on similarity to a single text
ranks = model.rank(
    'Gaskugelhahn, Gewinde, DN 32 Gaskugelhahn, zum manuellen Absperren, geeignet für Erdgas, PN 6, nach DIN EN 331, Gehäuse aus Pressmessing, in Durchgangsform, beidseits Gewindeanschluss, DIN-DVGW-zugelassen, DN 32, einschließlich Übergangsstücke sowie Verbindungs-, Dichtungs- und Befestigungsmaterial',
    [
        'DITECH Gas-KH m gelbem Hebelgriff und vollem Durchgang 11/4"',
        'DITECH Gas-KH m gelbem Hebelgriff und vollem Durchgang 11/2"',
        'DITECH Gas-KH m gelbem Hebelgriff und vollem Durchgang 2"',
        'HS Rohrdoppelnippel Nr. 23 schwarz 1/2" 100mm',
        'HS Rohrdoppelnippel Nr. 23 schwarz 3/4" 100mm',
    ]
)
# [{'corpus_id': ..., 'score': ...}, {'corpus_id': ..., 'score': ...}, ...]
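
The loss below was configured with an identity activation, so during training the model's outputs are raw logits rather than probabilities. If `model.predict` returns logits and you need calibrated (0, 1) relevance scores, a sigmoid maps them over; a minimal self-contained sketch with made-up example logits:

```python
import math

def to_probability(logit: float) -> float:
    """Squash a raw cross-encoder logit into a (0, 1) relevance score."""
    return 1.0 / (1.0 + math.exp(-logit))

# Hypothetical logits, as model.predict(pairs) might return them
logits = [2.1, -0.4, 0.7]
probs = [to_probability(s) for s in logits]

# Rank candidate indices from most to least relevant; sorting by logit
# and sorting by probability give the same order (sigmoid is monotonic)
ranking = sorted(range(len(logits)), key=logits.__getitem__, reverse=True)
print(ranking)  # [0, 2, 1]
```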

Evaluation

Metrics

Cross Encoder Reranking

Metric Value
map 0.3148 (+0.1281)
mrr@10 0.3228 (+0.1424)
ndcg@10 0.3455 (+0.1352)
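
ndcg@10 is the headline metric: it rewards placing relevant documents near the top of the ranking, with gains discounted logarithmically by rank and normalized against the ideal ordering. A minimal self-contained sketch on binary relevance labels (the example label lists are made up):

```python
import math

def dcg_at_k(relevances, k=10):
    """Discounted cumulative gain: gain at rank i is discounted by log2(i + 2)."""
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances[:k]))

def ndcg_at_k(relevances, k=10):
    """Normalize DCG by the DCG of the ideal (descending-sorted) ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0

# Relevance of the returned candidates, in ranked order
print(ndcg_at_k([1, 1, 0, 0, 0]))  # 1.0: both relevant docs ranked first
print(ndcg_at_k([0, 0, 1, 1, 0]))  # lower: relevant docs pushed down the list
```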

Training Details

Training Dataset

Unnamed Dataset

  • Size: 447,164 training samples
  • Columns: query, answer, and label
  • Approximate statistics based on the first 1000 samples:
    • query: string; min 27, mean 434.65, max 2905 characters
    • answer: string; min 0, mean 52.08, max 81 characters
    • label: int; 0: ~33.70%, 1: ~66.30%
  • Samples:
    • query: Gaskugelhahn, Gewinde, DN 32 Gaskugelhahn, zum manuellen Absperren, geeignet für Erdgas, PN 6, nach DIN EN 331, Gehäuse aus Pressmessing, in Durchgangsform, beidseits Gewindeanschluss, DIN-DVGW-zugelassen, DN 32, einschließlich Übergangsstücke sowie Verbindungs-, Dichtungs- und Befestigungsmaterial
      answer: DITECH Gas-KH m gelbem Hebelgriff und vollem Durchgang 11/4"
      label: 1
    • query: Gaskugelhahn, Gewinde, DN 40 jedoch DN 40
      answer: DITECH Gas-KH m gelbem Hebelgriff und vollem Durchgang 11/2"
      label: 1
    • query: Gaskugelhahn, Gewinde, DN 50 jedoch DN 50
      answer: DITECH Gas-KH m gelbem Hebelgriff und vollem Durchgang 2"
      label: 1
  • Loss: BinaryCrossEntropyLoss with these parameters:
    {
        "activation_fn": "torch.nn.modules.linear.Identity",
        "pos_weight": 5
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: epoch
  • per_device_train_batch_size: 100
  • per_device_eval_batch_size: 100
  • learning_rate: 2e-05
  • num_train_epochs: 2
  • warmup_ratio: 0.1
  • seed: 12
  • bf16: True
  • dataloader_num_workers: 4
  • load_best_model_at_end: True
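
With lr_scheduler_type linear and warmup_ratio 0.1, the learning rate ramps from 0 up to the peak of 2e-05 over the first ~10% of steps and then decays linearly back to 0. A minimal sketch, taking the 8944 total optimizer steps from the training logs below (the Trainer's rounding of the warmup step count may differ slightly):

```python
def linear_schedule_lr(step, base_lr=2e-05, total_steps=8944, warmup_ratio=0.1):
    """Linear warmup to base_lr, then linear decay to zero."""
    warmup_steps = int(total_steps * warmup_ratio)  # 894 steps here
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    return base_lr * (total_steps - step) / (total_steps - warmup_steps)

print(linear_schedule_lr(894))   # peak learning rate: 2e-05
print(linear_schedule_lr(8944))  # end of training: 0.0
```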

All Hyperparameters

Click to expand
  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: epoch
  • prediction_loss_only: True
  • per_device_train_batch_size: 100
  • per_device_eval_batch_size: 100
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 2e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 2
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 12
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: True
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 4
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: True
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • parallelism_config: None
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch_fused
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • hub_revision: None
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • liger_kernel_config: None
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional
  • router_mapping: {}
  • learning_rate_mapping: {}

Training Logs

Epoch Step Training Loss custom-dev_ndcg@10
0.0002 1 1.5605 -
0.0224 100 0.9229 -
0.0447 200 0.4384 -
0.0671 300 0.3577 -
0.0894 400 0.3024 -
0.1118 500 0.267 -
0.1342 600 0.2393 -
0.1565 700 0.2228 -
0.1789 800 0.2196 -
0.2013 900 0.1812 -
0.2236 1000 0.2003 -
0.2460 1100 0.1756 -
0.2683 1200 0.1652 -
0.2907 1300 0.1529 -
0.3131 1400 0.1652 -
0.3354 1500 0.1327 -
0.3578 1600 0.1273 -
0.3801 1700 0.124 -
0.4025 1800 0.1371 -
0.4249 1900 0.1239 -
0.4472 2000 0.1252 -
0.4696 2100 0.115 -
0.4919 2200 0.116 -
0.5143 2300 0.1115 -
0.5367 2400 0.1157 -
0.5590 2500 0.1126 -
0.5814 2600 0.1071 -
0.6038 2700 0.1162 -
0.6261 2800 0.1088 -
0.6485 2900 0.1032 -
0.6708 3000 0.1086 -
0.6932 3100 0.0926 -
0.7156 3200 0.0846 -
0.7379 3300 0.0931 -
0.7603 3400 0.1053 -
0.7826 3500 0.0825 -
0.8050 3600 0.1116 -
0.8274 3700 0.0917 -
0.8497 3800 0.0907 -
0.8721 3900 0.0774 -
0.8945 4000 0.0789 -
0.9168 4100 0.0792 -
0.9392 4200 0.0933 -
0.9615 4300 0.0893 -
0.9839 4400 0.0993 -
1.0 4472 - 0.3409 (+0.1306)
1.0063 4500 0.0755 -
1.0286 4600 0.0551 -
1.0510 4700 0.0626 -
1.0733 4800 0.0694 -
1.0957 4900 0.0537 -
1.1181 5000 0.0557 -
1.1404 5100 0.0694 -
1.1628 5200 0.0621 -
1.1852 5300 0.0661 -
1.2075 5400 0.0494 -
1.2299 5500 0.0607 -
1.2522 5600 0.0561 -
1.2746 5700 0.0513 -
1.2970 5800 0.0617 -
1.3193 5900 0.0435 -
1.3417 6000 0.0659 -
1.3640 6100 0.0597 -
1.3864 6200 0.0668 -
1.4088 6300 0.0557 -
1.4311 6400 0.0566 -
1.4535 6500 0.0632 -
1.4758 6600 0.0573 -
1.4982 6700 0.0634 -
1.5206 6800 0.054 -
1.5429 6900 0.0392 -
1.5653 7000 0.046 -
1.5877 7100 0.0562 -
1.6100 7200 0.0443 -
1.6324 7300 0.0757 -
1.6547 7400 0.0555 -
1.6771 7500 0.0345 -
1.6995 7600 0.0525 -
1.7218 7700 0.0595 -
1.7442 7800 0.0561 -
1.7665 7900 0.0484 -
1.7889 8000 0.0465 -
1.8113 8100 0.0501 -
1.8336 8200 0.0411 -
1.8560 8300 0.0386 -
1.8784 8400 0.0477 -
1.9007 8500 0.0517 -
1.9231 8600 0.0338 -
1.9454 8700 0.0466 -
1.9678 8800 0.062 -
1.9902 8900 0.0647 -
2.0 8944 - 0.3455 (+0.1352)
-1 -1 - 0.3455 (+0.1352)
  • The row at epoch 2.0 (step 8944) denotes the saved checkpoint.

Framework Versions

  • Python: 3.10.18
  • Sentence Transformers: 5.1.0
  • Transformers: 4.56.0
  • PyTorch: 2.8.0+cu128
  • Accelerate: 1.10.1
  • Datasets: 4.0.0
  • Tokenizers: 0.22.0

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}