Reranker trained on Custom Dataset

This is a Cross Encoder model finetuned from Alibaba-NLP/gte-multilingual-reranker-base using the sentence-transformers library. It computes scores for pairs of texts, which can be used for text reranking and semantic search.

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import CrossEncoder

# Download from the 🤗 Hub
model = CrossEncoder("lohith-chanchu/reranker-gte-multilingual-reranker-base-custom-bce")
# Get scores for pairs of texts
pairs = [
    ['Gaskugelhahn, Gewinde, DN 32 Gaskugelhahn, zum manuellen Absperren, geeignet für Erdgas, PN 6, nach DIN EN 331, Gehäuse aus Pressmessing, in Durchgangsform, beidseits Gewindeanschluss, DIN-DVGW-zugelassen, DN 32, einschließlich Übergangsstücke sowie Verbindungs-, Dichtungs- und Befestigungsmaterial', 'DITECH Gas-KH m gelbem Hebelgriff und vollem Durchgang 11/4"'],
    ['Gaskugelhahn, Gewinde, DN 40 jedoch DN 40', 'DITECH Gas-KH m gelbem Hebelgriff und vollem Durchgang 11/2"'],
    ['Gaskugelhahn, Gewinde, DN 50 jedoch DN 50', 'DITECH Gas-KH m gelbem Hebelgriff und vollem Durchgang 2"'],
    ['Doppelnippel, Stahl, DN 15, Montagehöhe bis 6,0 m Doppelnippel, aus Kohlenstoffstahl, für Rohrleitung aus mittelschwerem Stahlrohr DIN EN 10255, mit Außengewinde 1/2 , Montagehöhe über Gelände / Fußboden bis 6,0 m', 'HS Rohrdoppelnippel Nr. 23 schwarz 1/2" 100mm'],
    ['Doppelnippel, Stahl, DN 20, Montagehöhe bis 6,0 m jedoch Außengewinde 3/4', 'HS Rohrdoppelnippel Nr. 23 schwarz 3/4" 100mm'],
]
scores = model.predict(pairs)
print(scores.shape)
# (5,)

# Or rank different texts based on similarity to a single text
ranks = model.rank(
    'Gaskugelhahn, Gewinde, DN 32 Gaskugelhahn, zum manuellen Absperren, geeignet für Erdgas, PN 6, nach DIN EN 331, Gehäuse aus Pressmessing, in Durchgangsform, beidseits Gewindeanschluss, DIN-DVGW-zugelassen, DN 32, einschließlich Übergangsstücke sowie Verbindungs-, Dichtungs- und Befestigungsmaterial',
    [
        'DITECH Gas-KH m gelbem Hebelgriff und vollem Durchgang 11/4"',
        'DITECH Gas-KH m gelbem Hebelgriff und vollem Durchgang 11/2"',
        'DITECH Gas-KH m gelbem Hebelgriff und vollem Durchgang 2"',
        'HS Rohrdoppelnippel Nr. 23 schwarz 1/2" 100mm',
        'HS Rohrdoppelnippel Nr. 23 schwarz 3/4" 100mm',
    ]
)
# [{'corpus_id': ..., 'score': ...}, {'corpus_id': ..., 'score': ...}, ...]
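
The loss below was configured with an identity activation, so during training the model's outputs are raw logits rather than probabilities. If `model.predict` returns logits and you need calibrated (0, 1) relevance scores, a sigmoid maps them over; a minimal self-contained sketch with made-up example logits:

```python
import math

def to_probability(logit: float) -> float:
    """Squash a raw cross-encoder logit into a (0, 1) relevance score."""
    return 1.0 / (1.0 + math.exp(-logit))

# Hypothetical logits, as model.predict(pairs) might return them
logits = [2.1, -0.4, 0.7]
probs = [to_probability(s) for s in logits]

# Rank candidate indices from most to least relevant; sorting by logit
# and sorting by probability give the same order (sigmoid is monotonic)
ranking = sorted(range(len(logits)), key=logits.__getitem__, reverse=True)
print(ranking)  # [0, 2, 1]
```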

Evaluation

Metrics

Cross Encoder Reranking

Metric Value
map 0.3148 (+0.1281)
mrr@10 0.3228 (+0.1424)
ndcg@10 0.3455 (+0.1352)
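
ndcg@10 is the headline metric: it rewards placing relevant documents near the top of the ranking, with gains discounted logarithmically by rank and normalized against the ideal ordering. A minimal self-contained sketch on binary relevance labels (the example label lists are made up):

```python
import math

def dcg_at_k(relevances, k=10):
    """Discounted cumulative gain: gain at rank i is discounted by log2(i + 2)."""
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances[:k]))

def ndcg_at_k(relevances, k=10):
    """Normalize DCG by the DCG of the ideal (descending-sorted) ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0

# Relevance of the returned candidates, in ranked order
print(ndcg_at_k([1, 1, 0, 0, 0]))  # 1.0: both relevant docs ranked first
print(ndcg_at_k([0, 0, 1, 1, 0]))  # lower: relevant docs pushed down the list
```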

Training Details

Training Dataset

Unnamed Dataset

  • Size: 447,164 training samples
  • Columns: query, answer, and label
  • Approximate statistics based on the first 1000 samples:
    • query: string; min 27, mean 434.65, max 2905 characters
    • answer: string; min 0, mean 52.08, max 81 characters
    • label: int; 0: ~33.70%, 1: ~66.30%
  • Samples:
    • query: Gaskugelhahn, Gewinde, DN 32 Gaskugelhahn, zum manuellen Absperren, geeignet für Erdgas, PN 6, nach DIN EN 331, Gehäuse aus Pressmessing, in Durchgangsform, beidseits Gewindeanschluss, DIN-DVGW-zugelassen, DN 32, einschließlich Übergangsstücke sowie Verbindungs-, Dichtungs- und Befestigungsmaterial
      answer: DITECH Gas-KH m gelbem Hebelgriff und vollem Durchgang 11/4"
      label: 1
    • query: Gaskugelhahn, Gewinde, DN 40 jedoch DN 40
      answer: DITECH Gas-KH m gelbem Hebelgriff und vollem Durchgang 11/2"
      label: 1
    • query: Gaskugelhahn, Gewinde, DN 50 jedoch DN 50
      answer: DITECH Gas-KH m gelbem Hebelgriff und vollem Durchgang 2"
      label: 1
  • Loss: BinaryCrossEntropyLoss with these parameters:
    {
        "activation_fn": "torch.nn.modules.linear.Identity",
        "pos_weight": 5
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: epoch
  • per_device_train_batch_size: 100
  • per_device_eval_batch_size: 100
  • learning_rate: 2e-05
  • num_train_epochs: 2
  • warmup_ratio: 0.1
  • seed: 12
  • bf16: True
  • dataloader_num_workers: 4
  • load_best_model_at_end: True
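
With lr_scheduler_type linear and warmup_ratio 0.1, the learning rate ramps from 0 up to the peak of 2e-05 over the first ~10% of steps and then decays linearly back to 0. A minimal sketch, taking the 8944 total optimizer steps from the training logs below (the Trainer's rounding of the warmup step count may differ slightly):

```python
def linear_schedule_lr(step, base_lr=2e-05, total_steps=8944, warmup_ratio=0.1):
    """Linear warmup to base_lr, then linear decay to zero."""
    warmup_steps = int(total_steps * warmup_ratio)  # 894 steps here
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    return base_lr * (total_steps - step) / (total_steps - warmup_steps)

print(linear_schedule_lr(894))   # peak learning rate: 2e-05
print(linear_schedule_lr(8944))  # end of training: 0.0
```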

All Hyperparameters

Click to expand
  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: epoch
  • prediction_loss_only: True
  • per_device_train_batch_size: 100
  • per_device_eval_batch_size: 100
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 2e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 2
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 12
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: True
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 4
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: True
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • parallelism_config: None
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch_fused
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • hub_revision: None
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • liger_kernel_config: None
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional
  • router_mapping: {}
  • learning_rate_mapping: {}

Training Logs

Epoch Step Training Loss custom-dev_ndcg@10
0.0002 1 1.5605 -
0.0224 100 0.9229 -
0.0447 200 0.4384 -
0.0671 300 0.3577 -
0.0894 400 0.3024 -
0.1118 500 0.267 -
0.1342 600 0.2393 -
0.1565 700 0.2228 -
0.1789 800 0.2196 -
0.2013 900 0.1812 -
0.2236 1000 0.2003 -
0.2460 1100 0.1756 -
0.2683 1200 0.1652 -
0.2907 1300 0.1529 -
0.3131 1400 0.1652 -
0.3354 1500 0.1327 -
0.3578 1600 0.1273 -
0.3801 1700 0.124 -
0.4025 1800 0.1371 -
0.4249 1900 0.1239 -
0.4472 2000 0.1252 -
0.4696 2100 0.115 -
0.4919 2200 0.116 -
0.5143 2300 0.1115 -
0.5367 2400 0.1157 -
0.5590 2500 0.1126 -
0.5814 2600 0.1071 -
0.6038 2700 0.1162 -
0.6261 2800 0.1088 -
0.6485 2900 0.1032 -
0.6708 3000 0.1086 -
0.6932 3100 0.0926 -
0.7156 3200 0.0846 -
0.7379 3300 0.0931 -
0.7603 3400 0.1053 -
0.7826 3500 0.0825 -
0.8050 3600 0.1116 -
0.8274 3700 0.0917 -
0.8497 3800 0.0907 -
0.8721 3900 0.0774 -
0.8945 4000 0.0789 -
0.9168 4100 0.0792 -
0.9392 4200 0.0933 -
0.9615 4300 0.0893 -
0.9839 4400 0.0993 -
1.0 4472 - 0.3409 (+0.1306)
1.0063 4500 0.0755 -
1.0286 4600 0.0551 -
1.0510 4700 0.0626 -
1.0733 4800 0.0694 -
1.0957 4900 0.0537 -
1.1181 5000 0.0557 -
1.1404 5100 0.0694 -
1.1628 5200 0.0621 -
1.1852 5300 0.0661 -
1.2075 5400 0.0494 -
1.2299 5500 0.0607 -
1.2522 5600 0.0561 -
1.2746 5700 0.0513 -
1.2970 5800 0.0617 -
1.3193 5900 0.0435 -
1.3417 6000 0.0659 -
1.3640 6100 0.0597 -
1.3864 6200 0.0668 -
1.4088 6300 0.0557 -
1.4311 6400 0.0566 -
1.4535 6500 0.0632 -
1.4758 6600 0.0573 -
1.4982 6700 0.0634 -
1.5206 6800 0.054 -
1.5429 6900 0.0392 -
1.5653 7000 0.046 -
1.5877 7100 0.0562 -
1.6100 7200 0.0443 -
1.6324 7300 0.0757 -
1.6547 7400 0.0555 -
1.6771 7500 0.0345 -
1.6995 7600 0.0525 -
1.7218 7700 0.0595 -
1.7442 7800 0.0561 -
1.7665 7900 0.0484 -
1.7889 8000 0.0465 -
1.8113 8100 0.0501 -
1.8336 8200 0.0411 -
1.8560 8300 0.0386 -
1.8784 8400 0.0477 -
1.9007 8500 0.0517 -
1.9231 8600 0.0338 -
1.9454 8700 0.0466 -
1.9678 8800 0.062 -
1.9902 8900 0.0647 -
2.0 8944 - 0.3455 (+0.1352)
-1 -1 - 0.3455 (+0.1352)
  • The row at epoch 2.0 (step 8944) denotes the saved checkpoint.

Framework Versions

  • Python: 3.10.18
  • Sentence Transformers: 5.1.0
  • Transformers: 4.56.0
  • PyTorch: 2.8.0+cu128
  • Accelerate: 1.10.1
  • Datasets: 4.0.0
  • Tokenizers: 0.22.0

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}