term-mapper

term-mapper is a sentence-transformers model fine-tuned from sentence-transformers/all-mpnet-base-v2. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: sentence-transformers/all-mpnet-base-v2
  • Maximum Sequence Length: 384 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity
  • Language: en
  • License: apache-2.0

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 384, 'do_lower_case': False, 'architecture': 'MPNetModel'})
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
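
The Pooling and Normalize modules above mean-pool the token embeddings and L2-normalize the result, so dot products between outputs equal cosine similarities. Below is a minimal sketch of the equivalent computation using plain transformers (it assumes the checkpoint loads via AutoModel; this mirrors, not replaces, the SentenceTransformer pipeline):

import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("mihirsingh141/retriever_module")
model = AutoModel.from_pretrained("mihirsingh141/retriever_module")

def embed(texts):
    batch = tokenizer(texts, padding=True, truncation=True,
                      max_length=384, return_tensors="pt")
    with torch.no_grad():
        tokens = model(**batch).last_hidden_state      # (batch, seq_len, 768)
    mask = batch["attention_mask"].unsqueeze(-1).float()
    # Module (1): mean pooling over non-padding tokens.
    pooled = (tokens * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1e-9)
    # Module (2): L2 normalization, so dot product == cosine similarity.
    return F.normalize(pooled, p=2, dim=1)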

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("mihirsingh141/retriever_module")
# Run inference
sentences = [
    'board cert agency code, Board Cert Agency Code',
    '2nd board cert',
    'comments',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[ 1.0000,  0.6759, -0.0045],
#         [ 0.6759,  1.0000,  0.0552],
#         [-0.0045,  0.0552,  1.0000]])
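
Since the model targets semantic search over field names, here is a small retrieval sketch on top of the same model; the corpus of canonical field names below is hypothetical:

from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("mihirsingh141/retriever_module")

# Hypothetical corpus of canonical field names to map raw terms onto.
corpus = ["Accepting Patients IND", "Board Cert Agency Code", "Comments"]
corpus_embeddings = model.encode(corpus, convert_to_tensor=True)

query_embedding = model.encode("acc ind for pts", convert_to_tensor=True)
hits = util.semantic_search(query_embedding, corpus_embeddings, top_k=2)
for hit in hits[0]:
    print(corpus[hit["corpus_id"]], round(hit["score"], 4))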

Training Details

Training Dataset

Unnamed Dataset

  • Size: 61,927 training samples
  • Columns: anchor and positive
  • Approximate statistics based on the first 1000 samples:
    • anchor: string, min 9 / mean 10.39 / max 11 tokens
    • positive: string, min 3 / mean 6.42 / max 25 tokens
  • Samples (anchor → positive):
    • accepting patients ind, Accepting Patients IND → primary spec accepting new patients for pcps and ob
    • accepting patients ind, Accepting Patients IND → accepting new patients (all practitioner types ongoing outpatient basis) (y n) (no blanks)
    • accepting patients ind, Accepting Patients IND → acc ind for pts
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
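
Conceptually, MultipleNegativesRankingLoss treats every other positive in a batch as a negative for a given anchor: it computes scaled cosine similarities between all anchor/positive pairs and applies cross-entropy with the matching pairs on the diagonal. A minimal sketch of that computation (an illustration, not the library's implementation):

import torch
import torch.nn.functional as F

def mnrl(anchors: torch.Tensor, positives: torch.Tensor, scale: float = 20.0):
    # scores[i][j] = cos_sim(anchor_i, positive_j); true pairs sit on the diagonal.
    scores = F.cosine_similarity(anchors.unsqueeze(1), positives.unsqueeze(0), dim=-1) * scale
    labels = torch.arange(scores.size(0))
    # Each anchor must rank its own positive above all other in-batch positives.
    return F.cross_entropy(scores, labels)

This in-batch-negatives setup is also why the no_duplicates batch sampler below matters: repeated anchors in one batch would turn true pairs into false negatives.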
    

Evaluation Dataset

Unnamed Dataset

  • Size: 7,092 evaluation samples
  • Columns: anchor and positive
  • Approximate statistics based on the first 1000 samples:
    • anchor: string, min 5 / mean 11.39 / max 19 tokens
    • positive: string, min 3 / mean 6.96 / max 23 tokens
  • Samples (anchor → positive):
    • accepting patients ind, Accepting Patients IND → open close panel
    • accepting patients ind, Accepting Patients IND → panel status
    • accepting patients ind, Accepting Patients IND → commercial panel status
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 32
  • learning_rate: 2e-05
  • warmup_ratio: 0.1
  • fp16: True
  • load_best_model_at_end: True
  • batch_sampler: no_duplicates
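
Under the Sentence Transformers v5 trainer API, these settings correspond roughly to the sketch below; the toy anchor/positive rows are placeholders for the real 61,927-pair train set and 7,092-pair eval set:

from datasets import Dataset
from sentence_transformers import SentenceTransformer, SentenceTransformerTrainer, losses
from sentence_transformers.training_args import BatchSamplers, SentenceTransformerTrainingArguments

model = SentenceTransformer("sentence-transformers/all-mpnet-base-v2")

# Placeholder pairs; the real datasets are far larger.
pairs = {"anchor": ["accepting patients ind, Accepting Patients IND"],
         "positive": ["acc ind for pts"]}
train_dataset = Dataset.from_dict(pairs)
eval_dataset = Dataset.from_dict(pairs)

loss = losses.MultipleNegativesRankingLoss(model, scale=20.0)

args = SentenceTransformerTrainingArguments(
    output_dir="term-mapper",
    num_train_epochs=3,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    learning_rate=2e-5,
    warmup_ratio=0.1,
    fp16=True,
    eval_strategy="steps",
    load_best_model_at_end=True,
    # no_duplicates avoids repeated anchors becoming false in-batch negatives.
    batch_sampler=BatchSamplers.NO_DUPLICATES,
)

trainer = SentenceTransformerTrainer(
    model=model, args=args,
    train_dataset=train_dataset, eval_dataset=eval_dataset,
    loss=loss,
)
trainer.train()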

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 32
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 2e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 3
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: True
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • hub_revision: None
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • liger_kernel_config: None
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional
  • router_mapping: {}
  • learning_rate_mapping: {}

Training Logs

Epoch Step Training Loss Validation Loss
0.0258 50 0.8668 -
0.0517 100 0.7505 0.6548
0.0775 150 0.6506 -
0.1033 200 0.4672 0.4107
0.1291 250 0.403 -
0.1550 300 0.3284 0.2954
0.1808 350 0.3005 -
0.2066 400 0.2248 0.2149
0.2324 450 0.219 -
0.2583 500 0.1794 0.1685
0.2841 550 0.1441 -
0.3099 600 0.1522 0.1397
0.3357 650 0.1322 -
0.3616 700 0.1254 0.1283
0.3874 750 0.1194 -
0.4132 800 0.134 0.1140
0.4390 850 0.0932 -
0.4649 900 0.1025 0.0957
0.4907 950 0.1063 -
0.5165 1000 0.0956 0.0945
0.5424 1050 0.071 -
0.5682 1100 0.0727 0.0836
0.5940 1150 0.0895 -
0.6198 1200 0.0786 0.0750
0.6457 1250 0.0923 -
0.6715 1300 0.0905 0.0742
0.6973 1350 0.0522 -
0.7231 1400 0.0645 0.0693
0.7490 1450 0.0711 -
0.7748 1500 0.0655 0.0627
0.8006 1550 0.0532 -
0.8264 1600 0.0602 0.0615
0.8523 1650 0.0674 -
0.8781 1700 0.0537 0.0564
0.9039 1750 0.0578 -
0.9298 1800 0.0643 0.0533
0.9556 1850 0.0655 -
0.9814 1900 0.0562 0.0519
1.0072 1950 0.0538 -
1.0331 2000 0.043 0.0470
1.0589 2050 0.035 -
1.0847 2100 0.0412 0.0454
1.1105 2150 0.0362 -
1.1364 2200 0.0454 0.0449
1.1622 2250 0.0438 -
1.1880 2300 0.0453 0.0433
1.2138 2350 0.0298 -
1.2397 2400 0.0351 0.0444
1.2655 2450 0.0349 -
1.2913 2500 0.0391 0.0431
1.3171 2550 0.0404 -
1.3430 2600 0.0371 0.0423
1.3688 2650 0.0382 -
1.3946 2700 0.0325 0.0420
1.4205 2750 0.0394 -
1.4463 2800 0.0469 0.0421
1.4721 2850 0.0466 -
1.4979 2900 0.0374 0.0407
1.5238 2950 0.0321 -
1.5496 3000 0.022 0.0388
1.5754 3050 0.0229 -
1.6012 3100 0.0354 0.0367
1.6271 3150 0.0275 -
1.6529 3200 0.036 0.0358
1.6787 3250 0.0349 -
1.7045 3300 0.0359 0.0337
1.7304 3350 0.0386 -
1.7562 3400 0.029 0.0341
1.7820 3450 0.0348 -
1.8079 3500 0.0241 0.0342
1.8337 3550 0.0281 -
1.8595 3600 0.0239 0.0323
1.8853 3650 0.0281 -
1.9112 3700 0.0301 0.0323
1.9370 3750 0.0186 -
1.9628 3800 0.0246 0.0308
1.9886 3850 0.0315 -
2.0145 3900 0.0185 0.0302
2.0403 3950 0.0272 -
2.0661 4000 0.025 0.0304
2.0919 4050 0.0262 -
2.1178 4100 0.02 0.0306
2.1436 4150 0.0163 -
2.1694 4200 0.0301 0.0294
2.1952 4250 0.0176 -
2.2211 4300 0.0206 0.0297
2.2469 4350 0.0121 -
2.2727 4400 0.0206 0.0294
2.2986 4450 0.018 -
2.3244 4500 0.0178 0.0291
2.3502 4550 0.0153 -
2.3760 4600 0.0219 0.0288
2.4019 4650 0.0214 -
2.4277 4700 0.0212 0.0281
2.4535 4750 0.0183 -
2.4793 4800 0.0302 0.0280
2.5052 4850 0.0158 -
2.5310 4900 0.02 0.0274
2.5568 4950 0.0171 -
2.5826 5000 0.0275 0.0269
2.6085 5050 0.0193 -
2.6343 5100 0.0158 0.0269
2.6601 5150 0.0179 -
2.6860 5200 0.0214 0.0269
2.7118 5250 0.0225 -
2.7376 5300 0.0166 0.0264
2.7634 5350 0.0243 -
2.7893 5400 0.0154 0.0262
2.8151 5450 0.0245 -
2.8409 5500 0.0122 0.0261
2.8667 5550 0.0234 -
2.8926 5600 0.0217 0.0259
2.9184 5650 0.0166 -
2.9442 5700 0.0165 0.0258
2.9700 5750 0.0126 -
2.9959 5800 0.0201 0.0258
  • The saved checkpoint is the row with the lowest validation loss (the original table marked it in bold; that formatting was lost here).

Framework Versions

  • Python: 3.10.18
  • Sentence Transformers: 5.0.0
  • Transformers: 4.53.3
  • PyTorch: 2.7.1+cu126
  • Accelerate: 1.9.0
  • Datasets: 3.6.0
  • Tokenizers: 0.21.2

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}