SentenceTransformer based on google/embeddinggemma-300m

This is a sentence-transformers model fine-tuned from google/embeddinggemma-300m on the med_embed-training-triplets-v1 dataset. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: google/embeddinggemma-300m
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 768 dimensions
  • Training Dataset: med_embed-training-triplets-v1

Model Sources

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False, 'architecture': 'Gemma3TextModel'})
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Dense({'in_features': 768, 'out_features': 3072, 'bias': False, 'activation_function': 'torch.nn.modules.linear.Identity'})
  (3): Dense({'in_features': 3072, 'out_features': 768, 'bias': False, 'activation_function': 'torch.nn.modules.linear.Identity'})
  (4): Normalize()
)
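
As a rough illustration of what the Pooling (1) and Normalize (4) modules above do, the sketch below mean-pools token vectors and L2-normalizes the result in plain Python. The function names and the toy 3-dimensional vectors are illustrative assumptions, not part of the model; the real pipeline works on 768-dimensional vectors and applies the two bias-free Dense layers between these two steps.

```python
import math

def mean_pool(token_vectors, attention_mask):
    """Average the token vectors whose mask entry is 1 (padding is skipped)."""
    kept = [vec for vec, keep in zip(token_vectors, attention_mask) if keep]
    dim = len(token_vectors[0])
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]

def l2_normalize(vector):
    """Scale a vector to unit length, as a Normalize step would."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]

# Toy input (hypothetical): three tokens, the last one is padding.
tokens = [[1.0, 2.0, 2.0], [3.0, 2.0, 4.0], [9.0, 9.0, 9.0]]
mask = [1, 1, 0]

pooled = mean_pool(tokens, mask)   # [2.0, 2.0, 3.0]
embedding = l2_normalize(pooled)   # unit-length vector
```

Because the final output is unit-length, the cosine similarity between two embeddings reduces to their dot product.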

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("lion-ai/embeddinggemma-300m-medembed-triplets2")
# Run inference
queries = [
    "What was the outcome of the second surgical debulking procedure?",
]
documents = [
    'Although the treatment initially showed signs of efficacy, the tumor progressed rapidly, and the patient died three months after the second surgical debulking procedure.',
    'The patient last followed-up two months after surgery. Proptosis had completely subsided but the patient did have mild ptosis. Nevertheless, the patient was very satisfied with the outcome.',
    'The patient is advised to seek medical attention if they experience any COVID-19 related symptoms, such as fever, cough, and dyspnea.',
]
query_embeddings = model.encode_query(queries)
document_embeddings = model.encode_document(documents)
print(query_embeddings.shape, document_embeddings.shape)
# (1, 768) (3, 768)

# Get the similarity scores for the embeddings
similarities = model.similarity(query_embeddings, document_embeddings)
print(similarities)
# tensor([[ 0.5351,  0.3041, -0.0627]])
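
The similarity scores can then be used to rank the documents for the query. The sketch below is a minimal, library-free illustration that reuses the scores printed above; the helper name `rank_documents` and the shortened document labels are assumptions made for this example.

```python
def rank_documents(scores, documents):
    """Return (score, document) pairs sorted from most to least similar."""
    return sorted(zip(scores, documents), key=lambda pair: pair[0], reverse=True)

# Scores copied from the printed output above. With the real model they
# could be obtained with something like similarities[0].tolist().
scores = [0.5351, 0.3041, -0.0627]
labels = [
    "died three months after second debulking",   # hypothetical short label
    "proptosis subsided, mild ptosis",            # hypothetical short label
    "COVID-19 symptom advice",                    # hypothetical short label
]

ranked = rank_documents(scores, labels)
for score, label in ranked:
    print(f"{score:+.4f}  {label}")
```

The passage that actually answers the query ranks first, and the unrelated COVID-19 passage ranks last.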

Evaluation

Metrics

Triplet

Metric           Value
cosine_accuracy  0.82
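
For reference, triplet cosine accuracy is the fraction of (anchor, positive, negative) triplets in which the anchor is more cosine-similar to the positive than to the negative. The following is a minimal sketch of that definition on toy 2-dimensional vectors; the function names and the data are illustrative assumptions and not the evaluator used to produce the number above.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def triplet_cosine_accuracy(triplets):
    """Share of triplets where sim(anchor, positive) > sim(anchor, negative)."""
    correct = sum(1 for a, p, n in triplets if cosine(a, p) > cosine(a, n))
    return correct / len(triplets)

# Toy embeddings (hypothetical), one correct and one incorrect triplet.
toy_triplets = [
    ([1.0, 0.0], [0.9, 0.1], [0.0, 1.0]),  # positive is closer: counted
    ([1.0, 0.0], [0.0, 1.0], [1.0, 0.1]),  # negative is closer: not counted
]
accuracy = triplet_cosine_accuracy(toy_triplets)  # 0.5
```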

Training Details

Training Dataset

med_embed-training-triplets-v1

  • Dataset: med_embed-training-triplets-v1 at 0b344f0
  • Size: 230,357 training samples
  • Columns: anchor, positive, and negative
  • Approximate statistics based on the first 1000 samples:
              anchor        positive      negative
    type      string        string        string
    min       4 tokens      5 tokens      6 tokens
    mean      10.65 tokens  35.86 tokens  36.06 tokens
    max       25 tokens     191 tokens    144 tokens
  • Samples:
    anchor: MTHFR homozygous A223V mutation symptoms
    positive: The patient was admitted to our institution where MTHFR homozygous A223V mutation was identified. Folic acid intake was increased to 800 mcg/d, and no other coagulation tests were abnormal.
    negative: The patient had several symptoms of MM, including hypercalcemia, bone fractures, anemia, and renal insufficiency. A biopsy showed atypical clonal plasma cells with Cluster of Differentiation (CD)138 positive infiltration.

    anchor: Causes of spindle cell malignancy in the duodenal wall
    positive: Histological analysis revealed a spindle cell malignancy that was positive for CD21, CD23, and vimentin, but negative for CD20, CD34, CD35, CD117, DOG 1, and smooth muscle actin.
    negative: Based on immunohistochemical analysis of the tumor cells, the primary buttock tumor was diagnosed as a skeletal muscle metastasis of the primary small intestine gastrointestinal stromal tumor (GIST).

    anchor: What was the patient's main complaint during hospital admission?
    positive: This 27-year-old pregnant woman was admitted to the hospital at 36 weeks gestation with acute vision loss in her left eye and severe onset headache.
    negative: The patient was discharged the next day
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim",
        "gather_across_devices": false
    }
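
To make the parameters above concrete, here is a minimal plain-Python sketch of a multiple-negatives ranking objective: each anchor is scored against every positive and every negative in the batch with cosine similarity, the scores are multiplied by `scale`, and cross-entropy is taken with the anchor's own positive as the target. It is a sketch under those assumptions, not the library implementation; the function names and toy vectors are illustrative.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def mnrl_loss(anchors, positives, negatives, scale=20.0):
    """Mean cross-entropy where anchor i must pick positive i among all candidates."""
    candidates = positives + negatives  # in-batch positives, then explicit negatives
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [scale * cosine(anchor, c) for c in candidates]
        top = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_sum_exp = top + math.log(sum(math.exp(x - top) for x in logits))
        total += log_sum_exp - logits[i]
    return total / len(anchors)

# Toy batch (hypothetical 2-dimensional embeddings).
anchors = [[1.0, 0.0], [0.0, 1.0]]
negatives = [[-1.0, 0.0], [0.0, -1.0]]
easy = mnrl_loss(anchors, [[1.0, 0.0], [0.0, 1.0]], negatives)  # positives match anchors
hard = mnrl_loss(anchors, [[0.0, 1.0], [1.0, 0.0]], negatives)  # positives are swapped
```

With `scale` at 20.0 the softmax is sharp, so a well-separated batch gives a loss close to zero while the swapped batch is penalized heavily.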
    

Evaluation Dataset

med_embed-training-triplets-v1

  • Dataset: med_embed-training-triplets-v1 at 0b344f0
  • Size: 100 evaluation samples
  • Columns: anchor, positive, and negative
  • Approximate statistics based on the first 100 samples:
              anchor        positive      negative
    type      string        string        string
    min       5 tokens      7 tokens      8 tokens
    mean      10.95 tokens  33.45 tokens  36.28 tokens
    max       32 tokens     94 tokens     90 tokens
  • Samples:
    anchor: What was the initial presentation of the patient?
    positive: The 45-year-old female patient presented to the department with an enlarging lesion in her upper abdomen.
    negative: The patient was transferred to this hospital for further evaluation.

    anchor: giant omphalocele symptoms
    positive: The patient, a 9-year-old female, presented to the hospital with a large lump in the anterior abdominal wall extending from the xiphisternum to the level of iliac crest.
    negative: The patient presented with bilateral nasovestibular lumps which grew in size over several months, occluding nasal entrance and protruding outside the nose.

    anchor: granulomatous lymphocytic interstitial lung disease treatment
    positive: The patient had clubbing and chronic lung findings, and thorax CT revealed extended and severe bronchiectasis with thickened bronchial walls, some granulomatous nodules and mosaic appearance, compatible with granulomatous lymphocytic interstitial lung disease (GLILD). Regular intravenous immunoglobulin (IVIG) replacement was started.
    negative: The patient was treated with methylprednisolone pulse therapy followed by oral prednisolone (PSL) and cyclophosphamide intravenously. After treatment, arthralgia, renal function, proteinuria, and skin manifestations improved.
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim",
        "gather_across_devices": false
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • gradient_accumulation_steps: 8
  • learning_rate: 2e-05
  • weight_decay: 0.01
  • num_train_epochs: 1
  • warmup_ratio: 0.1
  • dataloader_num_workers: 4
  • load_best_model_at_end: True
  • ddp_find_unused_parameters: False
  • batch_sampler: no_duplicates
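
The hyperparameters above imply a schedule that can be checked with a little arithmetic. The sketch below is a back-of-the-envelope estimate only: the number of devices is not stated in this card, and `num_devices = 2` is an assumption chosen because it reproduces the 900 optimizer steps per epoch shown in the training logs.

```python
import math

train_samples = 230_357          # training set size from this card
per_device_train_batch_size = 16
gradient_accumulation_steps = 8
warmup_ratio = 0.1
num_devices = 2                  # ASSUMPTION: not stated in the card

# Samples consumed per optimizer step across all devices.
effective_batch = per_device_train_batch_size * gradient_accumulation_steps * num_devices

# Optimizer steps for one epoch, and the share of them spent on linear warmup.
steps_per_epoch = math.ceil(train_samples / effective_batch)
warmup_steps = math.ceil(steps_per_epoch * warmup_ratio)

print(effective_batch, steps_per_epoch, warmup_steps)  # 256 900 90
```

Under this assumption the learning rate ramps up over roughly the first 90 steps and then decays linearly for the rest of the epoch.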

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 8
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 2e-05
  • weight_decay: 0.01
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 1
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 4
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: True
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • parallelism_config: None
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • project: huggingface
  • trackio_space_id: trackio
  • ddp_find_unused_parameters: False
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • hub_revision: None
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: no
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • liger_kernel_config: None
  • eval_use_gather_object: False
  • average_tokens_across_devices: True
  • prompts: None
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional
  • router_mapping: {}
  • learning_rate_mapping: {}

Training Logs

Epoch Step Training Loss Validation Loss medembed-triplets-dev-100_cosine_accuracy
-1 -1 - - 0.6100
0.0111 10 2.1926 - -
0.0222 20 2.0911 - -
0.0333 30 2.0132 - -
0.0445 40 1.9072 - -
0.0556 50 1.8247 - -
0.0667 60 1.728 - -
0.0778 70 1.6565 - -
0.0889 80 1.5337 - -
0.1000 90 1.5233 - -
0.1111 100 1.4377 1.3134 0.6900
0.1222 110 1.4366 - -
0.1334 120 1.3788 - -
0.1445 130 1.3681 - -
0.1556 140 1.3405 - -
0.1667 150 1.3196 - -
0.1778 160 1.3179 - -
0.1889 170 1.223 - -
0.2000 180 1.2656 - -
0.2111 190 1.2825 - -
0.2223 200 1.2679 1.1458 0.7300
0.2334 210 1.263 - -
0.2445 220 1.26 - -
0.2556 230 1.2131 - -
0.2667 240 1.1685 - -
0.2778 250 1.1258 - -
0.2889 260 1.2015 - -
0.3000 270 1.141 - -
0.3112 280 1.1421 - -
0.3223 290 1.1458 - -
0.3334 300 1.0869 1.0779 0.8000
0.3445 310 1.1379 - -
0.3556 320 1.1522 - -
0.3667 330 1.1456 - -
0.3778 340 1.1575 - -
0.3889 350 1.1281 - -
0.4001 360 1.1607 - -
0.4112 370 1.1374 - -
0.4223 380 1.0961 - -
0.4334 390 1.1103 - -
0.4445 400 1.1287 1.0376 0.7800
0.4556 410 1.0887 - -
0.4667 420 1.142 - -
0.4778 430 1.1264 - -
0.4890 440 1.0787 - -
0.5001 450 1.0617 - -
0.5112 460 1.0828 - -
0.5223 470 1.0633 - -
0.5334 480 1.131 - -
0.5445 490 1.0847 - -
0.5556 500 1.0767 1.0109 0.7800
0.5667 510 1.0654 - -
0.5779 520 1.0544 - -
0.5890 530 1.0373 - -
0.6001 540 1.0518 - -
0.6112 550 1.0346 - -
0.6223 560 1.04 - -
0.6334 570 1.075 - -
0.6445 580 1.1083 - -
0.6556 590 1.1047 - -
0.6668 600 1.0793 0.9925 0.8000
0.6779 610 1.0693 - -
0.6890 620 1.0581 - -
0.7001 630 1.0244 - -
0.7112 640 1.0374 - -
0.7223 650 1.0286 - -
0.7334 660 1.0073 - -
0.7445 670 1.0464 - -
0.7557 680 1.0196 - -
0.7668 690 1.0014 - -
0.7779 700 1.0596 0.9893 0.8100
0.7890 710 1.0668 - -
0.8001 720 1.0258 - -
0.8112 730 1.0568 - -
0.8223 740 1.0659 - -
0.8334 750 1.0675 - -
0.8446 760 1.0087 - -
0.8557 770 1.058 - -
0.8668 780 1.018 - -
0.8779 790 1.0785 - -
0.8890 800 1.0915 0.9781 0.8200
0.9001 810 0.9835 - -
0.9112 820 1.0352 - -
0.9224 830 1.0306 - -
0.9335 840 1.0432 - -
0.9446 850 1.0094 - -
0.9557 860 1.0283 - -
0.9668 870 0.9887 - -
0.9779 880 1.013 - -
0.9890 890 1.0325 - -
1.0 900 1.0256 0.9782 0.8200
  • The saved checkpoint is the row at step 800, which has the lowest validation loss (0.9781).
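
With load_best_model_at_end enabled, the checkpoint kept at the end is the one with the best evaluation metric. The card does not list which metric was used, so the sketch below assumes the default behaviour of choosing the lowest validation loss and applies it to the evaluation rows from the table above; the variable names are illustrative.

```python
# (step, validation loss, cosine accuracy) copied from the evaluation rows above.
eval_rows = [
    (100, 1.3134, 0.69),
    (200, 1.1458, 0.73),
    (300, 1.0779, 0.80),
    (400, 1.0376, 0.78),
    (500, 1.0109, 0.78),
    (600, 0.9925, 0.80),
    (700, 0.9893, 0.81),
    (800, 0.9781, 0.82),
    (900, 0.9782, 0.82),
]

# ASSUMPTION: the best checkpoint is the one with the lowest validation loss.
best_step, best_loss, best_accuracy = min(eval_rows, key=lambda row: row[1])
print(best_step, best_loss, best_accuracy)  # 800 0.9781 0.82
```

Steps 800 and 900 differ by only 0.0001 in validation loss and share the same accuracy, so the choice between them is marginal.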

Framework Versions

  • Python: 3.11.13
  • Sentence Transformers: 5.1.1
  • Transformers: 4.57.0
  • PyTorch: 2.6.0+cu124
  • Accelerate: 1.9.0
  • Datasets: 4.1.1
  • Tokenizers: 0.22.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}