SentenceTransformer

This is a sentence-transformers model distilled from Snowflake/snowflake-arctic-embed-l-v2.0. It maps sentences and paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Maximum Sequence Length: 8192 tokens
  • Output Dimensionality: 1024 dimensions
  • Similarity Function: Cosine Similarity

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 8192, 'do_lower_case': False, 'architecture': 'XLMRobertaModel'})
  (1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
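
As a rough illustration of what modules (1) and (2) do, the sketch below applies CLS pooling and L2 normalization to toy token embeddings in plain Python. The token values and the helper names (`cls_pool`, `l2_normalize`, `dot`) are made up for illustration; the real modules operate on batched PyTorch tensors with 1024 dimensions.

```python
import math

def cls_pool(token_embeddings):
    """CLS pooling: keep only the first token's vector."""
    return token_embeddings[0]

def l2_normalize(vector):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]

def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

# Toy "Transformer output": 3 tokens x 4 dimensions (hypothetical values).
tokens_a = [[3.0, 4.0, 0.0, 0.0], [1.0, 1.0, 1.0, 1.0], [0.0, 2.0, 0.0, 2.0]]
tokens_b = [[4.0, 3.0, 0.0, 0.0], [9.0, 9.0, 9.0, 9.0], [5.0, 0.0, 5.0, 0.0]]

emb_a = l2_normalize(cls_pool(tokens_a))
emb_b = l2_normalize(cls_pool(tokens_b))

# For unit-length vectors the dot product equals the cosine similarity.
similarity = dot(emb_a, emb_b)
print(emb_a, emb_b, similarity)
```

Because the final module normalizes every embedding, cosine similarity and dot product should give the same ranking for this model's outputs.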

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("sentence_transformers_model_id")
# Run inference
queries = [
    "Cancer type: Conventional high\u2011grade (grade\u202f3) chondrosarcoma of the right scapula  \nHistology: Pleomorphic sarcoma with chondroid differentiation (high\u2011grade conventional chondrosarcoma)  \nCurrent extent: Stage\u00a0IV/metastatic \u2013 pulmonary metastases (numerous bilateral nodules, stable) and progressive hepatic metastasis (now 4.2\u202fcm, causing biliary obstruction)  \n\nBiomarkers:  \n- IDH2 p.R172S mutation (initial NGS low VAF, later VAF\u202f\u2248\u202f18\u202f%)  \n- TP53 p.R282W loss\u2011of\u2011function missense mutation  \n- CDKN2A homozygous deletion (p16 loss)  \n- MDM2 amplification (\u2248\u202f12\u2011fold)  \n- FGFR2 amplification  \n- COL2A1 p.R1060C (variant of uncertain significance)  \n- Microsatellite stability (MS\u2011Stable)  \n- PD\u2011L1 IHC 0\u202f% TPS (negative)  \n- Tumor mutational burden \u2248\u202f9\u202fmut/Mb (moderate)\n\nTreatment history:  \n# 2020\u201111\u201123 onward: Diagnosis confirmed by pathology. Initiated local external beam radiotherapy to the scapular primary (exact dates not specified, commenced shortly after diagnosis).  \n# Approx. 2022\u201109\u2011xx to 2022\u201110\u2011xx: Off\u2011label regorafenib, initiated for systemic control; duration\u202f~\u202f4\u202fweeks. Discontinued because of Grade\u202f3 hypertension and Grade\u202f3 hand\u2011foot skin reaction.  \n# Early 2022 (approximately February\u2013March): Additional systemic therapy attempted (unspecified agent) but halted \u201csix weeks ago\u201d as of 2022\u201105\u201111 due to lack of radiographic response and toxicity. Exact drug unknown; recorded as therapy discontinued.  \n# Ongoing: Supportive care with opioid analgesics for scapular pain; no further active anticancer therapy after regorafenib cessation given poor performance status (ECOG\u202f\u2265\u202f3) and limited expected benefit.",
]
documents = [
    '4. Cancer type allowed: solid tumor (all tissues). Histology allowed: any malignant histologic subtype provided the tumor harbours an IDH1 mutation. Cancer burden allowed: advanced or metastatic disease refractory to conventional therapeutic options or intolerant of such therapy; radiographically measurable disease not previously subjected to radiation, chemo‑embolisation, radio‑embolisation or other local ablative technique. Prior treatment required: evidence of refractoriness/intolerance to standard-of-care modalities. Prior treatment excluded: n/a. Biomarkers required: IDH1 gene mutation determined on local diagnostic platform (centrally reassessed retrospectively). Biomarkers excluded: n/a.',
    '1. Cancer type allowed: meningioma. Histology allowed: World Health Organization grade\u202fII or grade\u202fIII meningioma. Cancer burden allowed: recurrent or progressively growing intracranial disease with at least one meas\xadurable lesion ≥10\u202fmm on magnetic resonance imaging. Prior treatment required: prior neurosurgical resection of the index meningioma **and** prior cranial radiation therapy directed at the progressing tumour. Prior treatment excluded: • more than two distinct courses of radiation therapy administered for meningioma • a documented clinical diagnosis of Neurofibroma\xadtosis type\u202f2 OR a molecularly identified NF2 alteration • three or more prior systemic chemotherapy regimens delivered for meningeal disease. Biomarkers required: none noted. Biomarkers excluded: neurofi bromas tos i s\u202ftype\u202f2 (clinical or molecular).',
    '1. Cancer type allowed: renal cell carcinoma. Histology allowed: clear cell renal cell carcinoma. Cancer burden allowed: advanced or metastatic disease. Prior treatment required: none. Prior treatment excluded: any prior systemic therapy for advanced or metastatic renal cell carcinoma; prior belzutifan or other HIF‑2α inhibitor; prior cabozantinib.  ',
]
query_embeddings = model.encode_query(queries)
document_embeddings = model.encode_document(documents)
print(query_embeddings.shape, document_embeddings.shape)
# (1, 1024) (3, 1024)

# Get the similarity scores for the embeddings
similarities = model.similarity(query_embeddings, document_embeddings)
print(similarities)
# tensor([[0.8427, 0.1946, 0.3175]])
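
A common next step is to turn the similarity scores into a ranking of documents for each query. The sketch below does this in plain Python using the three scores printed above; `rank_documents` and `top_k` are hypothetical names introduced here, not part of the Sentence Transformers API.

```python
# Scores copied from the example output above (one query, three documents).
scores = [0.8427, 0.1946, 0.3175]

def rank_documents(scores, top_k=None):
    """Return document indices sorted from most to least similar."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:top_k] if top_k is not None else order

ranking = rank_documents(scores)
for rank, idx in enumerate(ranking, start=1):
    print(f"{rank}. document {idx} (score {scores[idx]:.4f})")
```

With real embeddings, `similarities[0].tolist()` could supply the `scores` list for the first query.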

Training Details

Training Datasets

Unnamed Dataset

  • Size: 415,029 training samples
  • Columns: sentence_0, sentence_1, and label
  • Approximate statistics based on the first 1000 samples:
    • sentence_0 (string): min 200 tokens, mean 569.76 tokens, max 1231 tokens
    • sentence_1 (string): min 14 tokens, mean 121.13 tokens, max 459 tokens
    • label (float): min 0.0, mean 0.73, max 1.0
  • Samples:
    sentence_0 sentence_1 label
    Cancer type: Epithelioid sarcoma (classic “distal” type)
    Histology: High‑grade (grade 2) epithelioid sarcoma with loss of INI1 (SMARCB1) expression
    Current extent: Metastatic disease (bilateral pulmonary nodules and a solitary hepatic lesion)

    Biomarkers:
    - SMARCB1 biallelic deletion → complete loss of nuclear INI1 (predictive for EZH2 inhibition)
    - PTEN truncating mutation (loss of function)
    - KRAS p.G13D activating mutation (confers MAPK pathway activation, predicts resistance to EGFR‐directed agents)
    - CDKN2A homozygous deletion
    - TP53 splice‑site variant (non‑functional p53)
    - Low‑level EGFR copy number gain (≈ 3 copies)

    Treatment history:
    # 2017‑mid – early 2018: Neoadjuvant doxorubicin × 4 cycles (anthracycline chemotherapy) – administered prior to definitive local therapy (exact dates not specified)
    # Early 2018 (approx. Mar‑Apr 2018): External beam radiation therapy, 45 Gy in 25 fractions to the left hand (definitive EBRT) – completed by 2018‑04‑21 (acu...
    3. Cancer type allowed: tumours deficient in INI1 (or otherwise INI1‑negative/aberrant) irrespective of organ origin, and any solid tumour harbouring an activating (“gain‑of‑function”) mutation in EZH2. Histology allowed: diverse solid‐tumour histologies meeting the INI1‑loss definition or carrying an EZH2 GOF point mutation/amplification. Cancer burden allowed: metastatic disease or unresectable locally advanced disease that is relapsed or refractory after prior therapy (including cases progressing within six months before enrolment). Prior treatment required: any preceding anticancer therapy whose residual toxicities have resolved to ≤grade 1. Prior treatment excluded: none stated specifically for these cohorts. Biomarkers required: (a) loss of INI1 protein by IHC or bi‑biallelic INI1 loss/mutation confirmed genetically, or (b) demonstrable EZH2 gain‑of‑function mutation/amplification identified by validated molecular platform. Biomarkers excluded: none.
    label: 1.0
    Cancer type: Non‑small cell lung cancer (lung adenocarcinoma)
    Histology: Moderately differentiated invasive adenocarcinoma, TTF‑1 +, Napsin‑A +
    Current extent: Metastatic (thoracic disease stable, solitary treated cerebellar brain metastasis, no other extracranial sites identified)

    Biomarkers:
    • EGFR exon 19 sensiti­zing deletion (present on all specimens)
    • PD‑L1 Tumor Proportion Score ≈30 % (initial PET/CT report)
    • Acquired EGFR C797S substitution (NGS 2022‑08‑20)
    • MET copy number amplification ≈9 copies (clinical note 2023‑05‑18)
    • Tumor mutational burden 4.2 Mut/Mb (low)
    • Microsatellite stable / proficient mismatch repair (NGS 2022‑08‑20)

    Treatment history:
    # 2020‑05‑12 to 2020‑07‑30: Neoadjuvant cisplatin 75 mg/m² + pemetrexed 500 mg/m² q21 days – three cycles completed (last dose approx June 2021). Best response: partial reduction of primary tumor (≈44 % decrease on 2021‑04‑12 imaging).
    # Late 2021 (post‑neoadjuvant): VATS left lower‑lobe wedge resect...
    1. Cancer type allowed: non small cell lung cancer. Histology allowed: non small cell lung cancer. Cancer burden allowed: advanced unresectable or metastatic disease. Prior treatment required: ≤ five prior anticancer regimens permissible. Prior treatment excluded: prior anti‑programmed death receptor 1, anti‑programmed death ligand 1, or anti‑programmed death ligand 2 antibody exposure. Biomarkers required: ☐. Biomarkers excluded: ☐.
    label: 1.0
    Cancer type: Urinary bladder urothelial carcinoma
    Histology: High‑grade papillary urothelial carcinoma, WHO/ISUT Grade III
    Current extent: Metastatic (progressive disease with FDG‑avid left supraclavicular lymph node and suspected pulmonary involvement; persistent pelvic nodal disease)

    Biomarkers:
    • FGFG3 S249C mutation
    • TP53 R248W mutation
    • CDKN2A loss
    • ERBB2 amplification (copy number = 6)
    • PIK3CA E545K mutation
    • KDM6A truncating alteration
    • MDM2 amplification
    • STAG2 truncating alteration
    • TERT promoter −124 C>T mutation
    • Low PD‑L1 tumor proportion score ≈ 5%

    Treatment history:
    # 2012‑11‑21 → Initial transurethral resection of bladder tumor (TURBT) showing high‑grade papillary urothelial carcinoma invading muscularis propria (pT2).
    # Early 2013 → Neoadjuvant MVAC chemotherapy (≥3 cycles reported by 09‑15‑2013; total 4 cycles completed by October 2013). Partial radiographic response of primary lesion, stable pelvic nodal disease.
    # 10‑13‑20...
    11. Cancer type allowed: Urothelial/bladder cancer. Histology allowed: transitional cell carcinoma. Cancer burden allowed: advanced/metastatic disease. Prior treatment required: none specific. Prior treatment excluded: none beyond the ubiquitous recent‑therapy ban. Biomarkers required: none. Biomarkers excluded: none.
    label: 1.0
  • Loss: OnlineContrastiveLoss
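
To make the loss above more concrete, here is a simplified plain-Python sketch of the idea behind an online contrastive loss: only "hard" pairs contribute, meaning positives that are farther apart than the closest negative and negatives that are closer than the farthest positive. The cosine-distance inputs, the margin of 0.5, the binary labels, and the function name are assumptions for illustration and are not stated in this card; the library implementation works on batched tensors and treats edge cases differently.

```python
def online_contrastive_loss(distances, labels, margin=0.5):
    """Sum of squared penalties over hard positive and hard negative pairs.

    distances: one distance per (sentence_0, sentence_1) pair, e.g. 1 - cosine.
    labels:    1 for a matching pair, 0 for a non-matching pair (assumed binary).
    """
    poss = [d for d, y in zip(distances, labels) if y == 1]
    negs = [d for d, y in zip(distances, labels) if y == 0]
    if not poss or not negs:
        raise ValueError("this sketch expects at least one positive and one negative pair")

    # Hard negatives sit closer than the farthest positive;
    # hard positives sit farther than the closest negative.
    hard_negs = [d for d in negs if d < max(poss)]
    hard_poss = [d for d in poss if d > min(negs)]

    positive_loss = sum(d ** 2 for d in hard_poss)
    negative_loss = sum(max(0.0, margin - d) ** 2 for d in hard_negs)
    return positive_loss + negative_loss

# Hypothetical batch: two matching pairs and two non-matching pairs.
loss = online_contrastive_loss([0.1, 0.6, 0.2, 0.9], [1, 1, 0, 0])
print(loss)
```

In this toy batch only the positive at distance 0.6 and the negative at distance 0.2 are "hard", so the two easy pairs do not affect the loss.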

Unnamed Dataset

  • Size: 309,687 training samples
  • Columns: sentence_0 and sentence_1
  • Approximate statistics based on the first 1000 samples:
    • sentence_0 (string): min 196 tokens, mean 568.43 tokens, max 1095 tokens
    • sentence_1 (string): min 15 tokens, mean 124.96 tokens, max 452 tokens
  • Samples:
    sentence_0 sentence_1
    Cancer type: Colorectal adenocarcinoma
    Histology: Moderately differentiated invasive adenocarcinoma, Grade 2
    Current extent: Metastatic (initially stage IV with hepatic metastases and a solitary right frontal brain metastasis; brain lesion treated with stereotactic radiosurgery and shows complete radiographic remission; hepatic disease persists with intermittent progression)

    Biomarkers: KRAS wild‑type; Microsatellite stable (MS‑stable); Tumor mutational burden low (≈4 mutations/Mb); IDH2 R172K activating mutation (VAF ~8%); additional somatic alterations – APC truncation, TP53 missense, SMAD4 loss, MYC/CDK6 amplifications; no germline‑relevant BRCA1/2 changes

    Treatment history:
    # 2013‑05‑20: Right hemicolectomy (curative resection of primary colonic tumor)
    # 2013‑05‑20: Stereotactic radiosurgery to solitary right frontal brain metastasis (single fraction 18 Gy) – resulted in complete radiographic remission, no residual neurological symptoms
    # 2013‑01‑03 to 2013‑05‑20: F...
    1. Cancer type allowed: various solid malignant neoplasms excluding central nervous system. Histology allowed: diverse solid tumor histologies. Cancer burden allowed: advanced or metastatic disease, recurrent or progressive after standard therapy. Prior treatment required: disease must have progressed after receipt of standard therapeutic regimen (which may include targeted therapy against mutant IDH1/IDH2). Prior treatment excluded: none specified beyond what is covered under exclusion criteria. Biomarkers required: presence of an IDH1 and/or IDH2 gene mutation (to be determined using archival tumor specimen or fresh biopsy). Biomarkers excluded: none.
    Cancer type: Breast cancer
    Histology: Invasive (ductal) carcinoma, NST, grade 2, hormone‑receptor‑positive (ER⁺/PR⁺), HER2‑negative, right breast

    Current extent: Metastatic (initially bone‐only, later hepatic progression; disease remains active/metastatic as of latest note in 02/2014)

    Biomarkers:
    - Estrogen receptor positive, Progesterone receptor positive, HER2 negative (by IHC/FISH at diagnosis)
    - ESR1 Y537S mutation (detected 09/2013) – associated with resistance to aromatase inhibitors
    - PIK3CA E545K activating mutation (detected 09/2013)
    - Microsatellite instability stable (MSI‑S) (09/2013)
    - Tumor mutational burden 8 mutations/Mb (intermediate) (09/2013)

    Treatment history:
    # 02/03/2012 – ~05/07/2012: Letrozole 2.5 mg orally once daily (first‑line aromatase inhibitor). Achieved stable bone disease on serial scans but met RECIST criteria for radiographic progression (new hepatic lesion) after ≈4 months.
    # ~05/2013 – early 02/2014: Everolimus combined with L...
    2. Cancer type allowed: metastatic breast cancer. Histology allowed: invasive breast adenocarcinoma. Cancer burden allowed: hormonereceptor‑positive, human epidermal growth factor receptor 2‑negative metastatic disease; absence of currently active or symptomatic central nervous system involvement. Prior treatment required: endocrine‑resistance demonstrated; receipt of at least one line containing combined hormonal therapy together with an FDA‑approved cyclin‑dependent kinase 4/6 inhibitor; total number of prior systemic regimens for locoregionally unresectable/metastatic disease limited to ≥ 1 and ≤ 4. Prior treatment excluded: n/a. Biomarkers required: estrogen‑receptor and/or progesterone‑receptor positivity; HER2 negativity verified according to guideline testing methodology. Biomarkers excluded: n/a.
    Cancer type: Ewing sarcoma
    Histology: Small round blue cell tumor, CD99⁺, NKX2.2⁺, EWSR1‑FLI1 fusion-positive
    Current extent: Metastatic (persistent FDG‑avid L2 vertebral body lesion, stable disease)

    Biomarkers:
    - Confirmed EWSR1‑FLI1 translocation (detected 2017‑06‑01 pathology, reconfirmed 2020‑05‑08 NGS)
    - CDK4 amplification (high level, 2020‑05‑08 NGS)
    - CCND1 copy‑number gain (modest, 2020‑05‑08 NGS)
    - TP53 p.R175H missense mutation (NGS Jan 2025)
    - ATRX splice‑variant alteration (NGS Jan 2025)
    - STAG2 frameshift loss‑of‑function (NGS Jan 2025)

    Treatment history:
    # 2017‑06‑01 to 2018‑04‑xx: Neoadjuvant interval‑compressed VDC/IE (vincristine, doxorubicin, cyclophosphamide alternating with ifosfamide/etoposide) – 8 cycles total (partial response observed on 2018‑05‑23 MRI, ~40% shrinkage)
    # 2018‑09‑29 to 2018‑09‑29: Left scapular partial scapulectomy with prosthetic reconstruction (negative surgical margins)
    # 2018‑10‑09 to 2018‑10‑09: Adjuvant external‑...
    4. Cancer type allowed: Ewing sarcoma. Histology allowed: classic Ewing sarcoma (small round blue cell tumour) with characteristic marker profile (CD99 positive, keratin variable, INI1 retained) confirming diagnosis. Cancer burden allowed: metastatic disease or unresectable locally advanced disease that is relapsed or refractory after prior therapy. Prior treatment required: any preceding anticantic therapy whose residual toxicities have resolved to ≤grade 1. Prior treatment excluded: none stated specifically for this cohort. Biomarkers required: morphological features compatible with Ewing sarcoma together with supportive immunoprofile; no additional molecular prerequisite stipulated for entry into this exploratory cohort. Biomarkers excluded: none.
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim",
        "gather_across_devices": false
    }
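
The parameters above can be read as follows: each sentence_0 is scored against every sentence_1 in the batch with cosine similarity, the scores are multiplied by the scale (20.0), and a cross-entropy loss pushes the matching pair to the top. The plain-Python sketch below illustrates that idea on toy 2-dimensional vectors; the function names and the toy data are hypothetical, and the library version runs on batched tensors.

```python
import math

def cos_sim(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def mnr_loss(anchors, positives, scale=20.0):
    """Mean cross-entropy where positives[i] is the target for anchors[i]
    and every other positive in the batch acts as a negative."""
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [scale * cos_sim(anchor, p) for p in positives]
        top = max(logits)  # subtract the max for a stable log-sum-exp
        log_sum_exp = top + math.log(sum(math.exp(l - top) for l in logits))
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)

anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned = mnr_loss(anchors, [[1.0, 0.0], [0.0, 1.0]])  # each anchor matches its positive
swapped = mnr_loss(anchors, [[0.0, 1.0], [1.0, 0.0]])  # each anchor matches the wrong one
print(aligned, swapped)
```

The aligned batch yields a loss close to zero and the swapped batch a loss close to the scale, which shows how the scale sharpens the penalty for ranking an in-batch negative above the true pair.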
    

Training Hyperparameters

Non-Default Hyperparameters

  • per_device_train_batch_size: 6
  • per_device_eval_batch_size: 6
  • num_train_epochs: 2
  • multi_dataset_batch_sampler: round_robin

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: no
  • prediction_loss_only: True
  • per_device_train_batch_size: 6
  • per_device_eval_batch_size: 6
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1
  • num_train_epochs: 2
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.0
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • hub_revision: None
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • liger_kernel_config: None
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: round_robin
  • router_mapping: {}
  • learning_rate_mapping: {}

Training Logs

Epoch Step Training Loss
0.0194 500 0.4392
0.0387 1000 0.4558
0.0581 1500 0.457
0.0775 2000 0.4577
0.0969 2500 0.4412
0.1162 3000 0.4488
0.1356 3500 0.4486
0.1550 4000 0.4616
0.1744 4500 0.4728
0.1937 5000 0.4619
0.2131 5500 0.4545
0.2325 6000 0.4495
0.2519 6500 0.4618
0.2712 7000 0.4214
0.2906 7500 0.4412
0.3100 8000 0.4505
0.3294 8500 0.4313
0.3487 9000 0.4508
0.3681 9500 0.4303
0.3875 10000 0.4399
0.4069 10500 0.448
0.4262 11000 0.4408
0.4456 11500 0.4337
0.4650 12000 0.4273
0.4843 12500 0.4385
0.5037 13000 0.4437
0.5231 13500 0.439
0.5425 14000 0.4284
0.5618 14500 0.4214
0.5812 15000 0.4238
0.6006 15500 0.4225
0.6200 16000 0.4187
0.6393 16500 0.4151
0.6587 17000 0.4383
0.6781 17500 0.4243
0.6975 18000 0.4148
0.7168 18500 0.419
0.7362 19000 0.4169
0.7556 19500 0.4184
0.7750 20000 0.4191
0.7943 20500 0.4329
0.8137 21000 0.4339
0.8331 21500 0.4087
0.8524 22000 0.4161
0.8718 22500 0.4242
0.8912 23000 0.4183
0.9106 23500 0.4076
0.9299 24000 0.4095
0.9493 24500 0.4328
0.9687 25000 0.4114
0.9881 25500 0.4242
1.0074 26000 0.4158
1.0268 26500 0.3909
1.0462 27000 0.3999
1.0656 27500 0.4025
1.0849 28000 0.4115
1.1043 28500 0.3843
1.1237 29000 0.4177
1.1431 29500 0.4083
1.1624 30000 0.4025
1.1818 30500 0.4133
1.2012 31000 0.4006
1.2206 31500 0.3985
1.2399 32000 0.3999
1.2593 32500 0.394
1.2787 33000 0.3927
1.2980 33500 0.3964
1.3174 34000 0.4001
1.3368 34500 0.3956
1.3562 35000 0.3899
1.3755 35500 0.388
1.3949 36000 0.3867
1.4143 36500 0.3982
1.4337 37000 0.394
1.4530 37500 0.3942
1.4724 38000 0.3913
1.4918 38500 0.3909
1.5112 39000 0.3757
1.5305 39500 0.3829
1.5499 40000 0.3874
1.5693 40500 0.3883
1.5887 41000 0.3783
1.6080 41500 0.4041
1.6274 42000 0.403
1.6468 42500 0.3806
1.6662 43000 0.3825
1.6855 43500 0.3944
1.7049 44000 0.3956
1.7243 44500 0.382
1.7436 45000 0.3911
1.7630 45500 0.3823
1.7824 46000 0.3771
1.8018 46500 0.3784
1.8211 47000 0.3853
1.8405 47500 0.3864
1.8599 48000 0.3724
1.8793 48500 0.3856
1.8986 49000 0.3862
1.9180 49500 0.376
1.9374 50000 0.377
1.9568 50500 0.3937
1.9761 51000 0.3819
1.9955 51500 0.3899

Framework Versions

  • Python: 3.12.3
  • Sentence Transformers: 5.1.0
  • Transformers: 4.55.4
  • PyTorch: 2.7.1+cu126
  • Accelerate: 1.10.0
  • Datasets: 4.0.0
  • Tokenizers: 0.21.4

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}

Safetensors

  • Model size: 0.6B params
  • Tensor type: F32