SentenceTransformer based on google/embeddinggemma-300m

This is a sentence-transformers model finetuned from google/embeddinggemma-300m. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: google/embeddinggemma-300m
  • Maximum Sequence Length: 2048 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity

Model Sources

  • Documentation: Sentence Transformers Documentation (https://sbert.net)
  • Repository: Sentence Transformers on GitHub (https://github.com/UKPLab/sentence-transformers)
  • Hugging Face: Sentence Transformers on Hugging Face (https://huggingface.co/models?library=sentence-transformers)

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 2048, 'do_lower_case': False, 'architecture': 'Gemma3TextModel'})
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Dense({'in_features': 768, 'out_features': 3072, 'bias': False, 'activation_function': 'torch.nn.modules.linear.Identity'})
  (3): Dense({'in_features': 3072, 'out_features': 768, 'bias': False, 'activation_function': 'torch.nn.modules.linear.Identity'})
  (4): Normalize()
)
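
The two Dense layers form a linear 768 → 3072 → 768 projection (Identity activation, no bias), and the final Normalize module L2-normalizes the output, so cosine similarity and dot product coincide. A minimal sketch, using only the model id above, to confirm the stack and dimensionality after loading:

from sentence_transformers import SentenceTransformer

# Load the model and inspect the module stack described above.
model = SentenceTransformer("0xFarzad/nav-instruction")
print(model)                                     # Transformer -> Pooling -> Dense -> Dense -> Normalize
print(model.get_sentence_embedding_dimension())  # 768
print(model.max_seq_length)                      # 2048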

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("0xFarzad/nav-instruction")
# Run inference
queries = [
    "\u003cROUTE_START\u003e\n\u003cSEG\u003e ST_1 -\u003e ST_2 \u003cDIR:LEFT\u003e \u003cLIGHT\u003e \u003cPOI: Citi Bike / amenity: bicycle_rental\u003e \u003cPOI: Hampton Inn / tourism: hotel\u003e\n\u003cSEG\u003e ST_2 -\u003e ST_3 \u003cDIR:STRAIGHT\u003e \u003cPOI_LEFT: Santander / amenity: bank\u003e \u003cPOI_LEFT: Santander / amenity: atm\u003e \u003cPOI_RIGHT: Hampton Inn / tourism: hotel\u003e \u003cPOI_RIGHT: T-Mobile / shop: mobile_phone\u003e\n\u003cSEG\u003e ST_3 -\u003e ST_4 \u003cDIR:STRAIGHT\u003e \u003cPOI_LEFT: Chase / amenity: bank\u003e \u003cPOI_LEFT: Chase / amenity: atm\u003e \u003cPOI_LEFT: Pret A Manger / amenity: fast_food / cuisine: sandwich\u003e\n\u003cSEG\u003e ST_4 -\u003e ST_5 \u003cDIR:STRAIGHT\u003e \u003cPOI_LEFT: Chase / amenity: bank\u003e \u003cPOI_LEFT: Chase / amenity: atm\u003e \u003cPOI_LEFT: Pret A Manger / amenity: fast_food / cuisine: sandwich\u003e\n\u003cSEG\u003e ST_5 -\u003e ST_6 \u003cDIR:STRAIGHT\u003e \u003cPOI_LEFT: Santander / amenity: bank\u003e \u003cPOI_LEFT: Santander / amenity: atm\u003e \u003cPOI_RIGHT: Hampton Inn / tourism: hotel\u003e \u003cPOI_RIGHT: T-Mobile / shop: mobile_phone\u003e\n\u003cSEG\u003e ST_6 -\u003e ST_7 \u003cDIR:STRAIGHT\u003e \u003cPOI_RIGHT: Sheraton New York Times Square Hotel / tourism: hotel\u003e\n\u003cSEG\u003e ST_7 -\u003e ST_8 \u003cDIR:STRAIGHT\u003e \u003cPOI_RIGHT: Sheraton New York Times Square Hotel / tourism: hotel\u003e\n\u003cROUTE_END\u003e",
]
documents = [
    "After this block take a left at the third light and stop halfway down the block. After the second intersection you'll go down a block with several theaters. A Hampton Inn will be on your right and a bank ahead on the corner. Start by going straight through two intersections.",
    "After this block take a left at the third light and stop halfway down the block. After the second intersection you'll go down a block with several theaters. A Hampton Inn will be on your right and a bank ahead on the corner. Start by going straight through two intersections.",
    'At the next light with a playground on the far left, turn left. Stop in the middle of the playground before a fire station on the left side. Pass a T-intersection with Chase on the far left corner. Walk by a small church to the end of the street and turn left.',
]
query_embeddings = model.encode_query(queries)
document_embeddings = model.encode_document(documents)
print(query_embeddings.shape, document_embeddings.shape)
# [1, 768] [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(query_embeddings, document_embeddings)
print(similarities)
# tensor([[0.4171, 0.4171, 0.3222]])
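
Since model.similarity returns a queries × documents score matrix, retrieval reduces to sorting each row. A small follow-on sketch, continuing the variables above:

import torch

# Rank the documents for the first query, highest cosine similarity first.
ranking = torch.argsort(similarities[0], descending=True)
print(ranking.tolist())
# e.g. [0, 1, 2] -- documents 0 and 1 are identical, so their scores tie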

Training Details

Training Dataset

Unnamed Dataset

  • Size: 7,786 training samples
  • Columns: anchor, positive, and negative
  • Approximate statistics based on the first 1000 samples:
    | | anchor | positive | negative |
    |:--|:--|:--|:--|
    | type | string | string | string |
    | details | min: 16 tokens<br>mean: 257.72 tokens<br>max: 1468 tokens | min: 4 tokens<br>mean: 44.19 tokens<br>max: 150 tokens | min: 5 tokens<br>mean: 40.97 tokens<br>max: 115 tokens |
  • Samples:
    | anchor | positive | negative |
    |:--|:--|:--|
    | ST_1 -> ST_2 DIR:STRAIGHT<br>ST_2 -> ST_3 DIR:STRAIGHT<br>ST_3 -> ST_4 DIR:STRAIGHT<br>ST_4 -> ST_5 DIR:LEFT<br>ST_5 -> ST_6 DIR:STRAIGHT | Turn left at the 2nd light with Nine West on the left corner. | Walk to the light with the fountain on your left and turn right. |
    | ST_1 -> ST_2 DIR:STRAIGHT<br>ST_2 -> ST_3 DIR:STRAIGHT<br>ST_3 -> ST_4 DIR:STRAIGHT<br>ST_4 -> ST_5 DIR:STRAIGHT<br>ST_5 -> ST_6 DIR:STRAIGHT<br>ST_6 -> ST_7 DIR:STRAIGHT | Cooper's Tavern is on the right corner. Go about half way down the block. Go to the lights and turn right. Go through the following three sets of lights. Stop right after McDonald's on the left. | Go through the following three sets of lights. Go about half way down the block. Cooper's Tavern is on the right corner. Stop right after McDonald's on the left. Go to the lights and turn right. |
    | ST_1 -> ST_2 DIR:STRAIGHT<br>ST_2 -> ST_3 DIR:STRAIGHT<br>ST_3 -> ST_4 DIR:RIGHT<br>ST_4 -> ST_5 DIR:STRAIGHT<br>ST_5 -> ST_6 DIR:STRAIGHT<br>ST_6 -> ST_7 DIR:STRAIGHT | Go straight and take a right at the intersection. Continue straight through 2 intersections. then your destination will be right before Exki on the left. | Go straight and take a right at the intersection. Head to the first light and make a left. then your destination will be right before Exki on the left. |
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim",
        "gather_across_devices": false
    }
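
MultipleNegativesRankingLoss treats each anchor's positive as the correct candidate and uses both the explicit negative column and all other in-batch positives and negatives as negatives. A minimal sketch of wiring such triplets to the loss, assuming a datasets.Dataset with the column names above; the toy rows are invented for illustration:

from datasets import Dataset
from sentence_transformers import SentenceTransformer
from sentence_transformers.losses import MultipleNegativesRankingLoss

# Hypothetical toy triplets in the route/instruction format shown above.
train_dataset = Dataset.from_dict({
    "anchor":   ["ST_1 -> ST_2 DIR:LEFT\nST_2 -> ST_3 DIR:STRAIGHT"],
    "positive": ["Take a left at the light, then continue straight."],
    "negative": ["Continue straight, then take a left at the light."],
})

model = SentenceTransformer("google/embeddinggemma-300m")
# scale=20.0 and cosine similarity match the parameters listed above
# (cos_sim is the default similarity_fct).
loss = MultipleNegativesRankingLoss(model, scale=20.0)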
    

Evaluation Dataset

Unnamed Dataset

  • Size: 866 evaluation samples
  • Columns: anchor, positive, and negative
  • Approximate statistics based on the first 866 samples:
    | | anchor | positive | negative |
    |:--|:--|:--|:--|
    | type | string | string | string |
    | details | min: 16 tokens<br>mean: 262.37 tokens<br>max: 1270 tokens | min: 5 tokens<br>mean: 43.59 tokens<br>max: 164 tokens | min: 5 tokens<br>mean: 40.63 tokens<br>max: 115 tokens |
  • Samples:
    | anchor | positive | negative |
    |:--|:--|:--|
    | ST_1 -> ST_2 DIR:STRAIGHT<br>ST_2 -> ST_3 DIR:STRAIGHT | stop in the middle of the intersection at the next light. | Go through another light past Rail Line Diner. |
    | ST_1 -> ST_2 DIR:STRAIGHT<br>ST_2 -> ST_3 DIR:STRAIGHT<br>ST_3 -> ST_4 DIR:STRAIGHT<br>ST_4 -> ST_5 DIR:STRAIGHT<br>ST_5 -> ST_6 DIR:STRAIGHT<br>ST_6 -> ST_7 DIR:STRAIGHT<br>ST_7 -> ST_8 DIR:RIGHT<br>ST_8 -> ST_9 DIR:RIGHT<br>ST_9 -> ST_10 DIR:STRAIGHT<br>ST_10 -> ST_11 DIR:STRAIGHT | stop a little more than half way down the block where Abe Lebewohl Park on the right begins between Atmi and Urban Outfitters will be on the left. Go to the traffic light where there is a bus stop on the near left corner and turn right. Go through two lights. | Go to the traffic light where there is a bus stop on the near left corner and turn right. Go through two lights. stop a little more than half way down the block where Abe Lebewohl Park on the right begins between Atmi and Urban Outfitters will be on the left. |
    | ST_1 -> ST_2 DIR:STRAIGHT<br>ST_2 -> ST_3 DIR:STRAIGHT<br>ST_3 -> ST_4 DIR:RIGHT | Turn right and go through the light immediately after. | Go through the next intersection and pass the park on your right. |
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim",
        "gather_across_devices": false
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_eval_batch_size: 16
  • learning_rate: 2e-05
  • weight_decay: 0.01
  • num_train_epochs: 5
  • warmup_ratio: 0.1
  • bf16: True
  • dataloader_num_workers: 2
  • load_best_model_at_end: True
  • prompts: task: sentence similarity | query:
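
These values map directly onto SentenceTransformerTrainingArguments. A sketch that reproduces them, where the output path and the model/datasets/loss (from the earlier sketch) are assumptions:

from sentence_transformers import (
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
)

args = SentenceTransformerTrainingArguments(
    output_dir="nav-instruction",   # assumed output path
    eval_strategy="steps",
    per_device_eval_batch_size=16,
    learning_rate=2e-5,
    weight_decay=0.01,
    num_train_epochs=5,
    warmup_ratio=0.1,
    bf16=True,
    dataloader_num_workers=2,
    load_best_model_at_end=True,
    prompts="task: sentence similarity | query: ",  # prompt prefix listed above
)

trainer = SentenceTransformerTrainer(
    model=model,                    # model and loss from the earlier sketch
    args=args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,      # assumed: eval triplets with the same columns
    loss=loss,
)
trainer.train()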

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 8
  • per_device_eval_batch_size: 16
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 2e-05
  • weight_decay: 0.01
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 5
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • bf16: True
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: True
  • dataloader_num_workers: 2
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: True
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • parallelism_config: None
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch_fused
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • project: huggingface
  • trackio_space_id: trackio
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • hub_revision: None
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: no
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • liger_kernel_config: None
  • eval_use_gather_object: False
  • average_tokens_across_devices: True
  • prompts: task: sentence similarity | query:
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional
  • router_mapping: {}
  • learning_rate_mapping: {}

Training Logs

| Epoch | Step | Training Loss | Validation Loss |
|:------:|:----:|:-------------:|:---------------:|
| 0.1028 | 100  | 1.4258 | -      |
| 0.2055 | 200  | 1.0283 | -      |
| 0.3083 | 300  | 0.9212 | -      |
| 0.4111 | 400  | 0.8341 | -      |
| 0.5139 | 500  | 0.9105 | 1.0776 |
| 0.6166 | 600  | 0.9106 | -      |
| 0.7194 | 700  | 0.8155 | -      |
| 0.8222 | 800  | 0.9267 | -      |
| 0.9250 | 900  | 0.8819 | -      |
| 1.0277 | 1000 | 0.8002 | 0.9708 |
| 1.1305 | 1100 | 0.6874 | -      |
| 1.2333 | 1200 | 0.5829 | -      |
| 1.3361 | 1300 | 0.6127 | -      |
| 1.4388 | 1400 | 0.6469 | -      |
| 1.5416 | 1500 | 0.6435 | 0.8187 |
| 1.6444 | 1600 | 0.5558 | -      |
| 1.7472 | 1700 | 0.6104 | -      |
| 1.8499 | 1800 | 0.6795 | -      |
| 1.9527 | 1900 | 0.5831 | -      |
| 2.0555 | 2000 | 0.4615 | 0.7572 |
| 2.1583 | 2100 | 0.3970 | -      |
| 2.2610 | 2200 | 0.4250 | -      |
| 2.3638 | 2300 | 0.4597 | -      |
| 2.4666 | 2400 | 0.3876 | -      |
| 2.5694 | 2500 | 0.3891 | 0.7638 |
| 2.6721 | 2600 | 0.4013 | -      |
| 2.7749 | 2700 | 0.3587 | -      |
| 2.8777 | 2800 | 0.4283 | -      |
| 2.9805 | 2900 | 0.4114 | -      |
| 3.0832 | 3000 | 0.2442 | 0.7846 |
| 3.1860 | 3100 | 0.2424 | -      |
| 3.2888 | 3200 | 0.2852 | -      |
| 3.3916 | 3300 | 0.2249 | -      |
| 3.4943 | 3400 | 0.3106 | -      |
| **3.5971** | **3500** | **0.2425** | **0.7473** |
| 3.6999 | 3600 | 0.2483 | -      |
| 3.8027 | 3700 | 0.2413 | -      |
| 3.9054 | 3800 | 0.3022 | -      |
| 4.0082 | 3900 | 0.2193 | -      |
| 4.1110 | 4000 | 0.1417 | 0.7994 |
| 4.2138 | 4100 | 0.1449 | -      |
| 4.3165 | 4200 | 0.1421 | -      |
| 4.4193 | 4300 | 0.1425 | -      |
| 4.5221 | 4400 | 0.1441 | -      |
| 4.6249 | 4500 | 0.1775 | 0.8077 |
| 4.7276 | 4600 | 0.1137 | -      |
| 4.8304 | 4700 | 0.1319 | -      |
| 4.9332 | 4800 | 0.1162 | -      |

  • The bold row denotes the saved checkpoint (lowest validation loss, with load_best_model_at_end enabled).

Framework Versions

  • Python: 3.12.11
  • Sentence Transformers: 5.1.1
  • Transformers: 4.57.1
  • PyTorch: 2.9.0+cu128
  • Accelerate: 1.10.1
  • Datasets: 4.2.0
  • Tokenizers: 0.22.1
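
To reproduce this environment, the versions above can be pinned at install time (a sketch; choose a PyTorch build matching your CUDA setup):

pip install torch==2.9.0 sentence-transformers==5.1.1 transformers==4.57.1 accelerate==1.10.1 datasets==4.2.0 tokenizers==0.22.1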

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}