SentenceTransformer based on google/embeddinggemma-300m

This is a sentence-transformers model finetuned from google/embeddinggemma-300m. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: google/embeddinggemma-300m
  • Maximum Sequence Length: 2048 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity

Model Sources

  • Documentation: Sentence Transformers Documentation (https://sbert.net)
  • Repository: Sentence Transformers on GitHub (https://github.com/UKPLab/sentence-transformers)
  • Hugging Face: Sentence Transformers on Hugging Face (https://huggingface.co/models?library=sentence-transformers)

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 2048, 'do_lower_case': False, 'architecture': 'Gemma3TextModel'})
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Dense({'in_features': 768, 'out_features': 3072, 'bias': False, 'activation_function': 'torch.nn.modules.linear.Identity'})
  (3): Dense({'in_features': 3072, 'out_features': 768, 'bias': False, 'activation_function': 'torch.nn.modules.linear.Identity'})
  (4): Normalize()
)
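
The two Dense layers form a linear 768 → 3072 → 768 projection (Identity activation, no bias), and the final Normalize module L2-normalizes the output, so cosine similarity and dot product coincide. A minimal sketch, using only the model id above, to confirm the stack and dimensionality after loading:

from sentence_transformers import SentenceTransformer

# Load the model and inspect the module stack described above.
model = SentenceTransformer("0xFarzad/nav-instruction")
print(model)                                     # Transformer -> Pooling -> Dense -> Dense -> Normalize
print(model.get_sentence_embedding_dimension())  # 768
print(model.max_seq_length)                      # 2048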

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("0xFarzad/nav-instruction")
# Run inference
queries = [
    "\u003cROUTE_START\u003e\n\u003cSEG\u003e ST_1 -\u003e ST_2 \u003cDIR:LEFT\u003e \u003cLIGHT\u003e \u003cPOI: Citi Bike / amenity: bicycle_rental\u003e \u003cPOI: Hampton Inn / tourism: hotel\u003e\n\u003cSEG\u003e ST_2 -\u003e ST_3 \u003cDIR:STRAIGHT\u003e \u003cPOI_LEFT: Santander / amenity: bank\u003e \u003cPOI_LEFT: Santander / amenity: atm\u003e \u003cPOI_RIGHT: Hampton Inn / tourism: hotel\u003e \u003cPOI_RIGHT: T-Mobile / shop: mobile_phone\u003e\n\u003cSEG\u003e ST_3 -\u003e ST_4 \u003cDIR:STRAIGHT\u003e \u003cPOI_LEFT: Chase / amenity: bank\u003e \u003cPOI_LEFT: Chase / amenity: atm\u003e \u003cPOI_LEFT: Pret A Manger / amenity: fast_food / cuisine: sandwich\u003e\n\u003cSEG\u003e ST_4 -\u003e ST_5 \u003cDIR:STRAIGHT\u003e \u003cPOI_LEFT: Chase / amenity: bank\u003e \u003cPOI_LEFT: Chase / amenity: atm\u003e \u003cPOI_LEFT: Pret A Manger / amenity: fast_food / cuisine: sandwich\u003e\n\u003cSEG\u003e ST_5 -\u003e ST_6 \u003cDIR:STRAIGHT\u003e \u003cPOI_LEFT: Santander / amenity: bank\u003e \u003cPOI_LEFT: Santander / amenity: atm\u003e \u003cPOI_RIGHT: Hampton Inn / tourism: hotel\u003e \u003cPOI_RIGHT: T-Mobile / shop: mobile_phone\u003e\n\u003cSEG\u003e ST_6 -\u003e ST_7 \u003cDIR:STRAIGHT\u003e \u003cPOI_RIGHT: Sheraton New York Times Square Hotel / tourism: hotel\u003e\n\u003cSEG\u003e ST_7 -\u003e ST_8 \u003cDIR:STRAIGHT\u003e \u003cPOI_RIGHT: Sheraton New York Times Square Hotel / tourism: hotel\u003e\n\u003cROUTE_END\u003e",
]
documents = [
    "After this block take a left at the third light and stop halfway down the block. After the second intersection you'll go down a block with several theaters. A Hampton Inn will be on your right and a bank ahead on the corner. Start by going straight through two intersections.",
    "After this block take a left at the third light and stop halfway down the block. After the second intersection you'll go down a block with several theaters. A Hampton Inn will be on your right and a bank ahead on the corner. Start by going straight through two intersections.",
    'At the next light with a playground on the far left, turn left. Stop in the middle of the playground before a fire station on the left side. Pass a T-intersection with Chase on the far left corner. Walk by a small church to the end of the street and turn left.',
]
query_embeddings = model.encode_query(queries)
document_embeddings = model.encode_document(documents)
print(query_embeddings.shape, document_embeddings.shape)
# [1, 768] [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(query_embeddings, document_embeddings)
print(similarities)
# tensor([[0.4171, 0.4171, 0.3222]])
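
Since model.similarity returns a queries × documents score matrix, retrieval reduces to sorting each row. A small follow-on sketch, continuing the variables above:

import torch

# Rank the documents for the first query, highest cosine similarity first.
ranking = torch.argsort(similarities[0], descending=True)
print(ranking.tolist())
# e.g. [0, 1, 2] -- documents 0 and 1 are identical, so their scores tie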

Training Details

Training Dataset

Unnamed Dataset

  • Size: 7,786 training samples
  • Columns: anchor, positive, and negative
  • Approximate statistics based on the first 1000 samples:
    | | anchor | positive | negative |
    |:--|:--|:--|:--|
    | type | string | string | string |
    | details | min: 16 tokens<br>mean: 257.72 tokens<br>max: 1468 tokens | min: 4 tokens<br>mean: 44.19 tokens<br>max: 150 tokens | min: 5 tokens<br>mean: 40.97 tokens<br>max: 115 tokens |
  • Samples:
    | anchor | positive | negative |
    |:--|:--|:--|
    | ST_1 -> ST_2 DIR:STRAIGHT<br>ST_2 -> ST_3 DIR:STRAIGHT<br>ST_3 -> ST_4 DIR:STRAIGHT<br>ST_4 -> ST_5 DIR:LEFT<br>ST_5 -> ST_6 DIR:STRAIGHT | Turn left at the 2nd light with Nine West on the left corner. | Walk to the light with the fountain on your left and turn right. |
    | ST_1 -> ST_2 DIR:STRAIGHT<br>ST_2 -> ST_3 DIR:STRAIGHT<br>ST_3 -> ST_4 DIR:STRAIGHT<br>ST_4 -> ST_5 DIR:STRAIGHT<br>ST_5 -> ST_6 DIR:STRAIGHT<br>ST_6 -> ST_7 DIR:STRAIGHT | Cooper's Tavern is on the right corner. Go about half way down the block. Go to the lights and turn right. Go through the following three sets of lights. Stop right after McDonald's on the left. | Go through the following three sets of lights. Go about half way down the block. Cooper's Tavern is on the right corner. Stop right after McDonald's on the left. Go to the lights and turn right. |
    | ST_1 -> ST_2 DIR:STRAIGHT<br>ST_2 -> ST_3 DIR:STRAIGHT<br>ST_3 -> ST_4 DIR:RIGHT<br>ST_4 -> ST_5 DIR:STRAIGHT<br>ST_5 -> ST_6 DIR:STRAIGHT<br>ST_6 -> ST_7 DIR:STRAIGHT | Go straight and take a right at the intersection. Continue straight through 2 intersections. then your destination will be right before Exki on the left. | Go straight and take a right at the intersection. Head to the first light and make a left. then your destination will be right before Exki on the left. |
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim",
        "gather_across_devices": false
    }
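
MultipleNegativesRankingLoss treats each anchor's positive as the correct candidate and uses both the explicit negative column and all other in-batch positives and negatives as negatives. A minimal sketch of wiring such triplets to the loss, assuming a datasets.Dataset with the column names above; the toy rows are invented for illustration:

from datasets import Dataset
from sentence_transformers import SentenceTransformer
from sentence_transformers.losses import MultipleNegativesRankingLoss

# Hypothetical toy triplets in the route/instruction format shown above.
train_dataset = Dataset.from_dict({
    "anchor":   ["ST_1 -> ST_2 DIR:LEFT\nST_2 -> ST_3 DIR:STRAIGHT"],
    "positive": ["Take a left at the light, then continue straight."],
    "negative": ["Continue straight, then take a left at the light."],
})

model = SentenceTransformer("google/embeddinggemma-300m")
# scale=20.0 and cosine similarity match the parameters listed above
# (cos_sim is the default similarity_fct).
loss = MultipleNegativesRankingLoss(model, scale=20.0)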
    

Evaluation Dataset

Unnamed Dataset

  • Size: 866 evaluation samples
  • Columns: anchor, positive, and negative
  • Approximate statistics based on the first 866 samples:
    | | anchor | positive | negative |
    |:--|:--|:--|:--|
    | type | string | string | string |
    | details | min: 16 tokens<br>mean: 262.37 tokens<br>max: 1270 tokens | min: 5 tokens<br>mean: 43.59 tokens<br>max: 164 tokens | min: 5 tokens<br>mean: 40.63 tokens<br>max: 115 tokens |
  • Samples:
    | anchor | positive | negative |
    |:--|:--|:--|
    | ST_1 -> ST_2 DIR:STRAIGHT<br>ST_2 -> ST_3 DIR:STRAIGHT | stop in the middle of the intersection at the next light. | Go through another light past Rail Line Diner. |
    | ST_1 -> ST_2 DIR:STRAIGHT<br>ST_2 -> ST_3 DIR:STRAIGHT<br>ST_3 -> ST_4 DIR:STRAIGHT<br>ST_4 -> ST_5 DIR:STRAIGHT<br>ST_5 -> ST_6 DIR:STRAIGHT<br>ST_6 -> ST_7 DIR:STRAIGHT<br>ST_7 -> ST_8 DIR:RIGHT<br>ST_8 -> ST_9 DIR:RIGHT<br>ST_9 -> ST_10 DIR:STRAIGHT<br>ST_10 -> ST_11 DIR:STRAIGHT | stop a little more than half way down the block where Abe Lebewohl Park on the right begins between Atmi and Urban Outfitters will be on the left. Go to the traffic light where there is a bus stop on the near left corner and turn right. Go through two lights. | Go to the traffic light where there is a bus stop on the near left corner and turn right. Go through two lights. stop a little more than half way down the block where Abe Lebewohl Park on the right begins between Atmi and Urban Outfitters will be on the left. |
    | ST_1 -> ST_2 DIR:STRAIGHT<br>ST_2 -> ST_3 DIR:STRAIGHT<br>ST_3 -> ST_4 DIR:RIGHT | Turn right and go through the light immediately after. | Go through the next intersection and pass the park on your right. |
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim",
        "gather_across_devices": false
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_eval_batch_size: 16
  • learning_rate: 2e-05
  • weight_decay: 0.01
  • num_train_epochs: 5
  • warmup_ratio: 0.1
  • bf16: True
  • dataloader_num_workers: 2
  • load_best_model_at_end: True
  • prompts: task: sentence similarity | query:
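
These values map directly onto SentenceTransformerTrainingArguments. A sketch that reproduces them, where the output path and the model/datasets/loss (from the earlier sketch) are assumptions:

from sentence_transformers import (
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
)

args = SentenceTransformerTrainingArguments(
    output_dir="nav-instruction",   # assumed output path
    eval_strategy="steps",
    per_device_eval_batch_size=16,
    learning_rate=2e-5,
    weight_decay=0.01,
    num_train_epochs=5,
    warmup_ratio=0.1,
    bf16=True,
    dataloader_num_workers=2,
    load_best_model_at_end=True,
    prompts="task: sentence similarity | query: ",  # prompt prefix listed above
)

trainer = SentenceTransformerTrainer(
    model=model,                    # model and loss from the earlier sketch
    args=args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,      # assumed: eval triplets with the same columns
    loss=loss,
)
trainer.train()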

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 8
  • per_device_eval_batch_size: 16
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 2e-05
  • weight_decay: 0.01
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 5
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • bf16: True
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: True
  • dataloader_num_workers: 2
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: True
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • parallelism_config: None
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch_fused
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • project: huggingface
  • trackio_space_id: trackio
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • hub_revision: None
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: no
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • liger_kernel_config: None
  • eval_use_gather_object: False
  • average_tokens_across_devices: True
  • prompts: task: sentence similarity | query:
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional
  • router_mapping: {}
  • learning_rate_mapping: {}

Training Logs

| Epoch | Step | Training Loss | Validation Loss |
|:------:|:----:|:-------------:|:---------------:|
| 0.1028 | 100  | 1.4258 | -      |
| 0.2055 | 200  | 1.0283 | -      |
| 0.3083 | 300  | 0.9212 | -      |
| 0.4111 | 400  | 0.8341 | -      |
| 0.5139 | 500  | 0.9105 | 1.0776 |
| 0.6166 | 600  | 0.9106 | -      |
| 0.7194 | 700  | 0.8155 | -      |
| 0.8222 | 800  | 0.9267 | -      |
| 0.9250 | 900  | 0.8819 | -      |
| 1.0277 | 1000 | 0.8002 | 0.9708 |
| 1.1305 | 1100 | 0.6874 | -      |
| 1.2333 | 1200 | 0.5829 | -      |
| 1.3361 | 1300 | 0.6127 | -      |
| 1.4388 | 1400 | 0.6469 | -      |
| 1.5416 | 1500 | 0.6435 | 0.8187 |
| 1.6444 | 1600 | 0.5558 | -      |
| 1.7472 | 1700 | 0.6104 | -      |
| 1.8499 | 1800 | 0.6795 | -      |
| 1.9527 | 1900 | 0.5831 | -      |
| 2.0555 | 2000 | 0.4615 | 0.7572 |
| 2.1583 | 2100 | 0.3970 | -      |
| 2.2610 | 2200 | 0.4250 | -      |
| 2.3638 | 2300 | 0.4597 | -      |
| 2.4666 | 2400 | 0.3876 | -      |
| 2.5694 | 2500 | 0.3891 | 0.7638 |
| 2.6721 | 2600 | 0.4013 | -      |
| 2.7749 | 2700 | 0.3587 | -      |
| 2.8777 | 2800 | 0.4283 | -      |
| 2.9805 | 2900 | 0.4114 | -      |
| 3.0832 | 3000 | 0.2442 | 0.7846 |
| 3.1860 | 3100 | 0.2424 | -      |
| 3.2888 | 3200 | 0.2852 | -      |
| 3.3916 | 3300 | 0.2249 | -      |
| 3.4943 | 3400 | 0.3106 | -      |
| **3.5971** | **3500** | **0.2425** | **0.7473** |
| 3.6999 | 3600 | 0.2483 | -      |
| 3.8027 | 3700 | 0.2413 | -      |
| 3.9054 | 3800 | 0.3022 | -      |
| 4.0082 | 3900 | 0.2193 | -      |
| 4.1110 | 4000 | 0.1417 | 0.7994 |
| 4.2138 | 4100 | 0.1449 | -      |
| 4.3165 | 4200 | 0.1421 | -      |
| 4.4193 | 4300 | 0.1425 | -      |
| 4.5221 | 4400 | 0.1441 | -      |
| 4.6249 | 4500 | 0.1775 | 0.8077 |
| 4.7276 | 4600 | 0.1137 | -      |
| 4.8304 | 4700 | 0.1319 | -      |
| 4.9332 | 4800 | 0.1162 | -      |

  • The bold row denotes the saved checkpoint (lowest validation loss, with load_best_model_at_end enabled).

Framework Versions

  • Python: 3.12.11
  • Sentence Transformers: 5.1.1
  • Transformers: 4.57.1
  • PyTorch: 2.9.0+cu128
  • Accelerate: 1.10.1
  • Datasets: 4.2.0
  • Tokenizers: 0.22.1
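
To reproduce this environment, the versions above can be pinned at install time (a sketch; choose a PyTorch build matching your CUDA setup):

pip install torch==2.9.0 sentence-transformers==5.1.1 transformers==4.57.1 accelerate==1.10.1 datasets==4.2.0 tokenizers==0.22.1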

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}