SentenceTransformer based on sentence-transformers/all-MiniLM-L6-v2
This is a sentence-transformers model fine-tuned from sentence-transformers/all-MiniLM-L6-v2. It maps sentences and paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: sentence-transformers/all-MiniLM-L6-v2
- Maximum Sequence Length: 256 tokens
- Output Dimensionality: 384 dimensions
- Similarity Function: Cosine Similarity
Model Sources
- Documentation: Sentence Transformers Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Sentence Transformers on Hugging Face
Full Model Architecture
SentenceTransformer(
  (0): Transformer({'max_seq_length': 256, 'do_lower_case': False, 'architecture': 'BertModel'})
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
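The architecture above applies mean pooling over token embeddings and then L2 normalization. The following is a minimal sketch of those two steps in plain Python, not the library's implementation. The function names `mean_pool` and `l2_normalize` and the toy 4-dimensional vectors are assumptions for illustration; the real model works on 384-dimensional tensors.

```python
import math

def mean_pool(token_embeddings, attention_mask):
    """Average token vectors, ignoring padded positions (mask == 0)."""
    dim = len(token_embeddings[0])
    summed = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                summed[i] += value
    count = max(count, 1)  # guard against an all-padding input
    return [value / count for value in summed]

def l2_normalize(vector, eps=1e-12):
    """Scale a vector to unit length, as a Normalize() layer would."""
    norm = math.sqrt(sum(v * v for v in vector))
    return [v / max(norm, eps) for v in vector]

# Toy input: two real tokens and one padding token (hypothetical values).
tokens = [[1.0, 2.0, 0.0, 0.0], [3.0, 0.0, 0.0, 4.0], [9.0, 9.0, 9.0, 9.0]]
mask = [1, 1, 0]
sentence_embedding = l2_normalize(mean_pool(tokens, mask))
```

Because the output has unit length, the cosine similarity of two such embeddings equals their dot product.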
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("PHNeutre/all-MiniLM-L6-v2_finetuned_20251119_174549")
# Run inference
sentences = [
'evaluate, inbound, AD1, LTM, performance metrics',
'STRAT | Growth Engine Performance - Inbound AD1/LTM: Purpose The dashboard suite STRAT Growth Engine Performance is designed to monitor and analyze sales performance KPIs across the sales funnel This dashboard titled STRAT Growth Engine Performance Inbound AD1LTM is designed to track and analyze the performance of the AD1 and LTM sales funnels Audience The target audience includes sales managers performance analysts and stakeholders involved in monitoring and optimizing sales activities and growth strategies Good to know The dashboard suite integrates multiple sheets each focusing on specific aspects of sales performance eg CVR SQLs and meetings STRAT Growth Engine Performance Channel Cockpit STRAT Growth Engine Performance Inbound AD1LTM STRAT Growth Engine Performance Inbound SQLs STRAT Growth Engine Performance Meetings created STRAT Growth Engine Performance Meetings done and planned STRAT Growth Engine Performance Cancellation rate STRAT Growth Engine Performance Quotes Signed STRAT Growth Engine Performance Size Length Opp STRAT Growth Engine Performance Time to Close STRAT Growth Engine Performance Time to Pitch STRAT Growth Engine Performance CVR Content Metrics Key metrics include SQLs attacked day 1 AD1 Share of relevant leads attacked by sales on the day of their creation SQL to meeting rate LTM Share of relevant leads linked to opportunities that convert into meetings not canceled Dimensions Key dimensions include Lead Origin and Lead Origin category Cluster Name Country Cost Center and team hierarchy Source The data is sourced from datasetlead Data Update The data is updated daily with an SLA ensuring availability by 800 AM',
'accounts_agendas_eligibility_helpers: This table contains eligibility helpers for accounts and agendas retention s3subfolder C5harddeleteaccounts rowstodelete leftjoinclauses schematable dtmproductchurnersproaccountschurneddeleted joincondition accountsagendaseligibilityhelpersaccountid proaccountschurneddeletedproaccountid where proaccountschurneddeletedproaccountid IS NOT NULL s3retention keep0day s3subfolder C5harddeleteorganizations rowstodelete leftjoinclauses schematable dtmproductchurnersorganizationschurneddeleted joincondition accountsagendaseligibilityhelpersorganizationid organizationschurneddeletedorganizationid where organizationschurneddeletedorganizationid IS NOT NULL s3retention keep0day',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 384)
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[ 1.0000, 0.5342, -0.0627],
# [ 0.5342, 1.0000, 0.0066],
# [-0.0627, 0.0066, 1.0000]])
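The similarity scores can be used to rank candidate descriptions against a query, as in semantic search. The sketch below does this with plain Python lists so that it runs without downloading the model. The helper names `cosine_similarity` and `rank_documents` and the toy 3-dimensional vectors are assumptions for illustration; in practice the inputs would be rows of the array returned by `model.encode`.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank_documents(query_embedding, document_embeddings):
    """Return (index, score) pairs sorted from most to least similar."""
    scores = [
        (index, cosine_similarity(query_embedding, doc))
        for index, doc in enumerate(document_embeddings)
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)

# Hypothetical embeddings standing in for model.encode(...) output.
query = [1.0, 0.0, 1.0]
documents = [[0.0, 1.0, 0.0], [2.0, 0.0, 2.0], [1.0, 1.0, 0.0]]
ranking = rank_documents(query, documents)
print([index for index, _ in ranking])  # document indices, best match first
```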
Training Details
Training Dataset
Unnamed Dataset
- Size: 1,319 training samples
- Columns: sentence_0, sentence_1, and label
- Approximate statistics based on the first 1000 samples:
  - sentence_0 (string): min 3 tokens, mean 9.72 tokens, max 21 tokens
  - sentence_1 (string): min 12 tokens, mean 95.3 tokens, max 256 tokens
  - label (float): min 1.0, mean 1.0, max 1.0
- Samples:
  - sentence_0: Patient appointment requests
    sentence_1: Appointment Requests: Business description Appointment requestis a feature that allows patients to send requests for booking Appointments Its only dedicated for Hospitals and for very specific use cases Appointment request is link to an Appointments Appointment requestis only available on certain Visit Motivesregarding Appointment Requests Visit Motive Activation Rules When an appointment requestis raised then an Appointment Requests Entry is created Each Entry is composed by Comments Pain points Patient qualification for some visit motives patients need to be properly qualified ie need to check that what patients need matches with what the hospitals can offer or check that the prerequisites for reimbursement by health insurance are met This is only possible when a HCP from the hospital usually a Doctor reviews the appointment request and then decides if the request should be handled There is currently no easy way to reject appointments via Doctolib so hospitals prefer not to have t...
    label: 1.0
  - sentence_0: prescriptions missed renewals treatment
    sentence_1: treatment_missed_renewals: Table about prescription that have missed treatments renewals PK treatmentuuid
    label: 1.0
  - sentence_0: integrated, feature, data, PRM, scope
    sentence_1: prm: Track data from features integrated to PRM scope
    label: 1.0
- Loss: MultipleNegativesRankingLoss with these parameters: { "scale": 20.0, "similarity_fct": "cos_sim", "gather_across_devices": false }
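MultipleNegativesRankingLoss treats each (sentence_0, sentence_1) pair as a positive and uses the other sentence_1 entries in the same batch as negatives. The sketch below illustrates that idea in plain Python using the scale of 20.0 and cosine similarity listed above. It is a simplified illustration, not the Sentence Transformers implementation; the function names and the toy 2-dimensional embeddings are assumptions.

```python
import math

def cos_sim(a, b):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def multiple_negatives_ranking_loss(anchors, positives, scale=20.0):
    """Mean cross-entropy where positives[i] is the correct match for anchors[i]."""
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [scale * cos_sim(anchor, candidate) for candidate in positives]
        # Numerically stable log-sum-exp over all in-batch candidates.
        max_logit = max(logits)
        log_sum_exp = max_logit + math.log(sum(math.exp(l - max_logit) for l in logits))
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)

# Hypothetical embeddings: aligned pairs should give a much lower loss than swapped pairs.
anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned_loss = multiple_negatives_ranking_loss(anchors, [[1.0, 0.0], [0.0, 1.0]])
swapped_loss = multiple_negatives_ranking_loss(anchors, [[0.0, 1.0], [1.0, 0.0]])
```

A larger batch supplies more in-batch negatives per anchor, which is one reason batch size matters for this loss.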
Training Hyperparameters
Non-Default Hyperparameters
- per_device_train_batch_size: 16
- per_device_eval_batch_size: 16
- num_train_epochs: 2
- multi_dataset_batch_sampler: round_robin
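As a rough check on the training length, the number of optimizer steps follows from the dataset size, batch size, and epoch count. The sketch below assumes a single device (the card does not state how many were used), together with gradient_accumulation_steps of 1 and dataloader_drop_last set to False as listed in this card. The function name `optimizer_steps` is hypothetical.

```python
import math

def optimizer_steps(num_samples, batch_size, epochs,
                    drop_last=False, grad_accum=1, num_devices=1):
    """Estimate optimizer steps per epoch and in total."""
    effective_batch = batch_size * num_devices
    if drop_last:
        batches = num_samples // effective_batch
    else:
        batches = math.ceil(num_samples / effective_batch)
    steps_per_epoch = math.ceil(batches / grad_accum)
    return steps_per_epoch, steps_per_epoch * epochs

# 1,319 training samples, batch size 16, 2 epochs (num_devices=1 is an assumption).
steps_per_epoch, total_steps = optimizer_steps(1319, 16, 2)
print(steps_per_epoch, total_steps)  # 83 166
```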
All Hyperparameters
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: no
- prediction_loss_only: True
- per_device_train_batch_size: 16
- per_device_eval_batch_size: 16
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 5e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1
- num_train_epochs: 2
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.0
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- bf16: False
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- parallelism_config: None
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch_fused
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- project: huggingface
- trackio_space_id: trackio
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- hub_revision: None
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: no
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- liger_kernel_config: None
- eval_use_gather_object: False
- average_tokens_across_devices: True
- prompts: None
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: round_robin
- router_mapping: {}
- learning_rate_mapping: {}
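With lr_scheduler_type set to linear, warmup_steps of 0, and a learning_rate of 5e-05, the learning rate decays in a straight line from its initial value to zero over training. The sketch below shows the shape of such a schedule. It assumes 166 total optimizer steps (1,319 samples, batch size 16, 2 epochs on a single device, which is an assumption), and the function name `linear_lr` is hypothetical rather than a library call.

```python
def linear_lr(step, base_lr=5e-05, total_steps=166, warmup_steps=0):
    """Learning rate at a given optimizer step under linear warmup and decay."""
    if step < warmup_steps:
        # Linear ramp from 0 to base_lr during warmup (unused when warmup_steps is 0).
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)

# Sample the schedule at the start, midpoint, and end of training.
schedule = [linear_lr(step) for step in (0, 83, 166)]
```

Halfway through training the rate is about half of its initial value, and it reaches zero at the final step.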
Framework Versions
- Python: 3.13.7
- Sentence Transformers: 5.1.1
- Transformers: 4.57.0
- PyTorch: 2.8.0
- Accelerate: 1.11.0
- Datasets: 4.4.0
- Tokenizers: 0.22.1
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
MultipleNegativesRankingLoss
@misc{henderson2017efficient,
title={Efficient Natural Language Response Suggestion for Smart Reply},
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
year={2017},
eprint={1705.00652},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
Model tree for PHNeutre/all-MiniLM-L6-v2_finetuned_20251119_174549
Base model
sentence-transformers/all-MiniLM-L6-v2