SentenceTransformer based on nomic-ai/nomic-embed-text-v1.5
This is a sentence-transformers model finetuned from nomic-ai/nomic-embed-text-v1.5. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: nomic-ai/nomic-embed-text-v1.5
- Maximum Sequence Length: 2048 tokens
- Output Dimensionality: 768 dimensions
- Similarity Function: Cosine Similarity
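The base nomic-embed-text-v1.5 was trained with Matryoshka representation learning, so its embeddings can be truncated to fewer dimensions with modest quality loss. Whether that property fully carries over to this fine-tune is not evaluated on this card, but if it does, a smaller embedding size is a one-line change via Sentence Transformers' `truncate_dim`. A hedged sketch:

```python
from sentence_transformers import SentenceTransformer

# Assumption: the Matryoshka property of the base model survives this
# fine-tune. truncate_dim keeps only the first 256 values of each embedding,
# trading some accuracy for smaller storage and faster search.
model = SentenceTransformer("varadsrivastava/fin-embed-nomic-1.5", truncate_dim=256)

emb = model.encode(["search_query: What drove gross margin expansion?"])
print(emb.shape)  # (1, 256)
```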
Model Sources
- Documentation: Sentence Transformers Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Sentence Transformers on Hugging Face
Full Model Architecture
```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 2048, 'do_lower_case': False, 'architecture': 'NomicBertModel'})
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
```
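The Pooling module above uses masked mean pooling (`pooling_mode_mean_tokens`): token embeddings are averaged over the non-padding positions to produce a single 768-dimensional sentence vector. A minimal sketch of that operation with dummy tensors:

```python
import torch

def mean_pool(token_embeddings: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
    """Attention-mask-aware mean pooling, as performed by the Pooling module."""
    # token_embeddings: (batch, seq_len, 768); attention_mask: (batch, seq_len)
    mask = attention_mask.unsqueeze(-1).to(token_embeddings.dtype)  # (batch, seq_len, 1)
    summed = (token_embeddings * mask).sum(dim=1)  # sum of non-padding token vectors
    counts = mask.sum(dim=1).clamp(min=1e-9)       # number of real tokens per sentence
    return summed / counts                         # (batch, 768)

# Dummy batch: 2 sentences, 10 tokens each, 768-dim token embeddings
token_embeddings = torch.randn(2, 10, 768)
attention_mask = torch.ones(2, 10)
print(mean_pool(token_embeddings, attention_mask).shape)  # torch.Size([2, 768])
```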
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
```bash
pip install -U sentence-transformers
```
Then you can load this model and run inference.
```python
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("varadsrivastava/fin-embed-nomic-1.5")
# Run inference
sentences = [
'search_query: How has Deckers Outdoor Corporation’s footwear segment profitability trended over recent periods?',
'search_document: •International net sales, which are included in the reportable operating segment net sales presented above, increased by 16.7% and represented 32.8% and 32.6% of total net sales for the three months ended December 31, 2023, and 2022, respectively. These changes were primarily driven by higher net sales for the DTC channel for the UGG and HOKA brands.\n\nGross Profit. Gross margin increased to 58.7% from 53.0%, compared to the prior period, primarily due to favorable full-price selling for the UGG brand, a decrease in freight costs, favorable UGG brand product mix shifts and benefits from selective price increases, a greater mix of sales in the DTC channel, and a slight benefit from favorable foreign currency exchange rates.\n\nSelling, General, and Administrative Expenses. The net increase in SG&A expenses, compared to the prior period, was primarily the result of the following:\n\n•Increased payroll and related costs of approximately $38,600, primarily due to higher employee headcount and higher performance-based compensation.\n\n•Increased variable advertising and promotion expenses of approximately $21,100, primarily due to higher promotional marketing expenses for the UGG and HOKA brands to drive global brand awareness and market share gains, highlight new product categories, and provide localized marketing.\n\n•Increased other variable net selling expenses of approximately $16,500, primarily due to higher rent and occupancy expenses, credit card fees, and warehouse expenses.\n\n•Increased other operating expenses of approximately $10,000, primarily due to higher depreciation expense, travel expense, IT expenses for programming and software costs, and contract expenses.\n\n•Decreased allowances for trade accounts receivable of approximately $4,100, primarily due to improved customer collections.\n\n•Increased net foreign currency-related gains of $2,500, primarily driven by remeasurements with favorable changes in European exchange rates against the US dollar.\n\nIncome from Operations. Income (loss) from operations by reportable operating segment was as follows:',
'search_document: The net decrease in our effective income tax rate, compared to the prior period, was primarily driven by higher net discrete tax benefits relating to increased return to provision benefits and decreased uncertain tax positions.\n\nForeign income before income taxes was $253,333 and $173,598 and worldwide income before income taxes was $814,734 and $551,224 during the nine months ended December 31, 2023, and 2022, respectively. The decrease in foreign income before income taxes as a percentage of worldwide income before income taxes, compared to the prior period, was primarily due to a higher rate of foreign SG&A expenses and a lower rate of foreign gross profit, relative to domestic, as a percentage of worldwide net sales.\n\nNet Income. The increase in net income, compared to the prior period, was primarily due to higher net sales, operating margins, and interest income. Net income per share increased, compared to the prior period, due to higher net income and lower weighted-average common shares outstanding driven by stock repurchases.\n\nTotal Other Comprehensive Loss, Net of Tax. The decrease in total other comprehensive loss, net of tax, compared to the prior period, was primarily due to lower foreign currency translation losses relating to changes in the net asset position against Asian and European foreign currency exchange rates.\n\nLiquidity\n\nSources of Liquidity. We finance our working capital and operating requirements using a combination of cash and cash equivalents balances, cash provided from ongoing operating activities and, to a lesser extent, available borrowing capacity under our revolving credit facilities. Our working capital requirements begin when we purchase raw and other materials and inventories and continue until we ultimately collect the resulting trade accounts receivable. Given the historical seasonality of our business, our working capital requirements fluctuate significantly throughout the fiscal year, and we utilize available cash to build inventory levels during certain quarters in our fiscal year to support higher selling seasons. While the impact of seasonality has been mitigated to some extent, we expect our working capital requirements will continue to fluctuate from period to period.\n\nAs of December 31, 2023, our cash and cash equivalents are $1,650,802, the majority of which is held in highly rated money market funds and interest-bearing demand deposit accounts with established national financial institutions. We believe our cash and cash equivalents balances, cash provided by operating activities, and available borrowing capacity under our revolving credit facilities, will provide sufficient liquidity to enable us to meet our working capital requirements and contractual obligations for at least the next 12 months and will be sufficient to meet our long-term requirements and plans. 
However, there can be no assurance that sufficient capital will continue to be available or that it will be available on terms acceptable to us.\n\nOur liquidity may be impacted by a number of factors, including our results of operations, the strength of our brands and market acceptance of our products, impacts of seasonality and weather conditions, our ability to respond to changes in consumer preferences and tastes, the timing of capital expenditures and lease payments, our ability to collect our trade accounts receivables in a timely manner and effectively manage our inventories, our ability to manage supply chain constraints, our ability to respond to macroeconomic, political and legislative developments, and various other risks and uncertainties described in Part I, Item 1A, “Risk Factors,” of our 2023 Annual Report. Furthermore, we may require additional cash resources due to changes in business conditions, strategic initiatives, or stock repurchase strategy, a national or global economic recession, or other future developments, including any investments or acquisitions we may decide to pursue, although we do not have any present commitments with respect to any such investments or acquisitions.\n\n31',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[ 1.0000, -0.2441, -0.1960],
#         [-0.2441,  1.0000,  0.9887],
#         [-0.1960,  0.9887,  1.0000]])
```
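In practice the model is typically used for retrieval. The sketch below is illustrative (the mini-corpus is made up); note that the model was trained with `search_query: ` and `search_document: ` prefixes, so both queries and documents should keep them at inference time:

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("varadsrivastava/fin-embed-nomic-1.5")

# Hypothetical mini-corpus; real documents would be 10-K/10-Q passages
corpus = [
    "search_document: Gross margin increased to 58.7% from 53.0%, driven by full-price selling.",
    "search_document: Cash and cash equivalents are held in highly rated money market funds.",
]
query = "search_query: How did gross margin change compared to the prior period?"

doc_embeddings = model.encode(corpus)
query_embedding = model.encode(query)

# model.similarity applies the model's configured cosine similarity
scores = model.similarity(query_embedding, doc_embeddings)  # shape: (1, 2)
best = int(scores.argmax())
print(corpus[best], float(scores[0, best]))
```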
Training Details
Training Dataset
Unnamed Dataset
- Size: 164,066 training samples
- Columns: sentence_0, sentence_1, and sentence_2
- Approximate statistics based on the first 1000 samples:

|         | sentence_0 | sentence_1 | sentence_2 |
|:--------|:-----------|:-----------|:-----------|
| type    | string     | string     | string     |
| details | min: 16 tokens, mean: 25.75 tokens, max: 42 tokens | min: 12 tokens, mean: 637.0 tokens, max: 2048 tokens | min: 7 tokens, mean: 390.4 tokens, max: 2048 tokens |
- Samples (first three shown below):

Sample 1
- sentence_0: search_query: What did Corning’s leadership say about Corning’s dividend policy or share repurchase plans?
- sentence_1: search_document: Uses of Cash
  Fixed Rate Cumulative Convertible Preferred Stock, Series A
  We had 2,300 outstanding shares of Fixed Rate Cumulative Convertible Preferred Stock, Series A (the “Preferred Stock”) as of December 31, 2020. On January 16, 2021, the Preferred Stock became convertible into 115 million common shares. On April 5, 2021 we executed the Share Repurchase Agreement (“SRA”) with Samsung Display Co., Ltd. (“SDC”) and the Preferred Stock was fully converted as of April 8, 2021. Immediately following the conversion, we repurchased and retired 35 million of the common shares held by SDC for an aggregate purchase price of approximately $1.5 billion, of which approximately $507 million was paid in April in each of 2023, 2022 and 2021.
  Pursuant to the SRA, with respect to the remaining 80 million common shares outstanding held by SDC:
  • SDC has the option to sell an additional 22 million common shares to Corning in specified tranches from time to time in calendar years 20...
- sentence_2: search_document: ####December 31, 2023
  Cash and cash equivalents##$##1,779
  Available credit capacity:####
  U.S. dollar revolving credit facility##$##1,500
  Chinese yuan facilities##$##110

Sample 2
- sentence_0: search_query: How does Marsh & McLennan Companies, Inc. view the pace of digital transformation in risk management services and its effect on market competitiveness?
- sentence_1: search_document: # EXECUTIVE COMPENSATION (Continued) (cont.)
  ## Mr. South
  President and Chief Executive Officer of Marsh
  Vice Chair, Marsh McLennan
  - Delivered strong and balanced performance in all geographies, including $11.4 billion in Marsh revenue, an increase of 8% on an underlying basis.
  - Invested in Marsh McLennan Agency and our international operations through the completion of 8 acquisitions in 2023, most notably Graham, a top-100 insurance and benefits broker and risk management consultancy, and Honan Insurance Group, a leading specialist insurance broker in corporate risk, employee benefits and real estate insurance with expertise across the Pacific.
  - Launched proprietary capabilities and solutions to improve access to capacity in a hardening and more challenging marketplace, including Fast Track, a global quota share facility exclusive to Marsh clients and Victor Insurance Exchange, a reciprocal insurance exchange that delivers replacement catastrophe capacity fo...
- sentence_2: search_document: # PROXY SUMMARY
  This summary highlights information contained elsewhere in this proxy statement. You should read the entire proxy statement carefully before voting.
  VOTING MATTERS Page number for more information Board's recommendation Election of Directors (Item 1)
  To elect eleven (11) persons named in the accompanying proxy statement to serve as directors for a one-year term18 FOR Advisory (Nonbinding) Vote to Approve Named Executive Officer Compensation (Item 2)
  To approve, by nonbinding vote, the compensation of our named executive officers26 FOR Ratification of Independent Auditor (Item 3)
  To ratify the selection of Deloitte & Touche LLP as our independent registered public accounting firm70 FOR Stockholder Proposal-Shareholder Right to Act by Written Consent (Item 4)
  To vote on one stockholder proposal

Sample 3
- sentence_0: search_query: What dependency risks exist for Principal Financial Group, Inc. due to concentration of revenue in specific insurance or retirement product lines?
- sentence_1: search_document: The market risk benefit remeasurement loss increased primarily due to a $199.1 million unfavorable impact from periodic and final settlements for derivatives used to hedge MRBs. This change was partially offset by a $172.0 million favorable impact from the change in fair value of the MRB asset (liability), excluding impacts of nonperformance risk, primarily driven by changes in market movements. See Item 8. “Financial Statements and Supplementary Data, Notes to Consolidated Financial Statements, Note 11, Market Risk Benefits” for further information on market effects.
  Operating expenses decreased primarily due to $157.0 million of lower incentive compensation costs and a $139.5 million decrease in amounts credited to employee accounts in a nonqualified defined contribution pension plan. These decreases were partially offset by $160.5 million in strategic review costs and impacts related to the exited business incurred in 2022 with no corresponding activity in 2021.
  I...
- sentence_2: search_document: ####December 31, 2023######December 31, 2022
  ######(in millions)####
  Net unrealized losses on fixed maturities, available-for-sale (1)##$##(5,143.0)####$##(7,445.7)
  Net unrealized gains (losses) on derivative instruments####(1.6)######50.7
  Adjustments for assumed changes in amortization patterns####(5.2)######(1.7)
  Adjustments for assumed changes in policyholder liabilities####1.4######0.5
  Net unrealized gains on other investments and noncontrolling interest adjustments####43.1######7.9
  Provision for deferred income tax benefits####1,088.4######1,570.1
  Net unrealized losses on available-for-sale securities and derivative instruments##$##(4,016.9)####$##(5,818.2)

- Loss: TripletLoss with these parameters: { "distance_metric": "TripletDistanceMetric.EUCLIDEAN", "triplet_margin": 0.3 }

Training Hyperparameters
Non-Default Hyperparameters
- per_device_train_batch_size: 32
- per_device_eval_batch_size: 32
- num_train_epochs: 2
- fp16: True
- multi_dataset_batch_sampler: round_robin
All Hyperparameters
Click to expand
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: no
- prediction_loss_only: True
- per_device_train_batch_size: 32
- per_device_eval_batch_size: 32
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 5e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1
- num_train_epochs: 2
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.0
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- bf16: False
- fp16: True
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- parallelism_config: None
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch_fused
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- project: huggingface
- trackio_space_id: trackio
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- hub_revision: None
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: no
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- liger_kernel_config: None
- eval_use_gather_object: False
- average_tokens_across_devices: True
- prompts: None
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: round_robin
- router_mapping: {}
- learning_rate_mapping: {}
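Putting the loss and the non-default hyperparameters together, a comparable run could look roughly like the sketch below. The dataset construction is hypothetical (this card describes only an unnamed 164,066-sample triplet dataset with sentence_0/sentence_1/sentence_2 columns), so treat it as a template rather than the exact training script:

```python
from datasets import Dataset
from sentence_transformers import SentenceTransformer, SentenceTransformerTrainer
from sentence_transformers.losses import TripletDistanceMetric, TripletLoss
from sentence_transformers.training_args import (
    MultiDatasetBatchSamplers,
    SentenceTransformerTrainingArguments,
)

# NomicBERT ships custom modeling code on the Hub, hence trust_remote_code
model = SentenceTransformer("nomic-ai/nomic-embed-text-v1.5", trust_remote_code=True)

# Hypothetical stand-in for the unnamed (query, positive, negative) dataset
train_dataset = Dataset.from_dict({
    "sentence_0": ["search_query: How did gross margin trend over recent periods?"],
    "sentence_1": ["search_document: Gross margin increased to 58.7% from 53.0% ..."],
    "sentence_2": ["search_document: Sources of Liquidity. We finance our working capital ..."],
})

# Matches the reported loss configuration
loss = TripletLoss(
    model=model,
    distance_metric=TripletDistanceMetric.EUCLIDEAN,
    triplet_margin=0.3,
)

# Mirrors the non-default hyperparameters listed above
args = SentenceTransformerTrainingArguments(
    output_dir="fin-embed-nomic-1.5",
    num_train_epochs=2,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    fp16=True,
    multi_dataset_batch_sampler=MultiDatasetBatchSamplers.ROUND_ROBIN,
)

trainer = SentenceTransformerTrainer(
    model=model, args=args, train_dataset=train_dataset, loss=loss
)
trainer.train()
```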
Training Logs

| Epoch  | Step  | Training Loss |
|:-------|:------|:--------------|
| 0.0975 | 500   | 0.6013 |
| 0.1950 | 1000  | 0.0899 |
| 0.2925 | 1500  | 0.0777 |
| 0.3900 | 2000  | 0.0661 |
| 0.4875 | 2500  | 0.0618 |
| 0.5850 | 3000  | 0.0593 |
| 0.6825 | 3500  | 0.0541 |
| 0.7800 | 4000  | 0.051  |
| 0.8775 | 4500  | 0.0492 |
| 0.9750 | 5000  | 0.0484 |
| 1.0725 | 5500  | 0.0376 |
| 1.1700 | 6000  | 0.0329 |
| 1.2676 | 6500  | 0.0322 |
| 1.3651 | 7000  | 0.0312 |
| 1.4626 | 7500  | 0.0317 |
| 1.5601 | 8000  | 0.0281 |
| 1.6576 | 8500  | 0.0305 |
| 1.7551 | 9000  | 0.0285 |
| 1.8526 | 9500  | 0.0268 |
| 1.9501 | 10000 | 0.0282 |

Framework Versions
- Python: 3.12.12
- Sentence Transformers: 5.1.1
- Transformers: 4.57.1
- PyTorch: 2.8.0+cu126
- Accelerate: 1.10.1
- Datasets: 4.0.0
- Tokenizers: 0.22.1
Citation
BibTeX
Sentence Transformers
```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```

TripletLoss

```bibtex
@misc{hermans2017defense,
    title={In Defense of the Triplet Loss for Person Re-Identification},
    author={Alexander Hermans and Lucas Beyer and Bastian Leibe},
    year={2017},
    eprint={1703.07737},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}
```