--- tags: - sentence-transformers - sentence-similarity - feature-extraction - dense - generated_from_trainer - dataset_size:16688 - loss:TripletLoss base_model: BAAI/bge-large-en-v1.5 widget: - source_sentence: Where do I select the build type in R³S Modeler? sentences: - '## Perform build on(requiresR³S Enterprise) The Perform build on drop-down list is not available when running a results workspace. You can use the Perform build on drop-down list to specify whether to use distributed processing to build (generate, compile, and link) the calculations specified in the batch or the model. You can select an option from the drop-down list: * Local: build the batch or the model on the current machine (the controller) without using distributed processing. * Remote: use distributed processing to build the batch or the model on one of the remote machines (a worker). This is useful when you do not want to occupy the controller with these tasks. If you build or run a batch or a model, R³S Modeler uses the value of the Perform build on property of the batch or model as the default value of the drop-down list. R³S Modeler uses the value of the Perform build on drop-down list to set the value of the Perform build on property of the batch or the model in the results workspace. You can specify the connector to use for distributed processing on the Distribution tab of the Options dialog box. For the Remote option to use distributed processing, you must use the Microsoft® HPC Pack connector or the Azure® Batch connector.' - 'Server hierarchy The Server hierarchy property is a property of the following components: - **Development sandbox workspace or approval sandbox workspace(requiresR³S Development Manager)**: The Server hierarchy property of a development sandbox workspace or an approval sandbox workspace shows the name of the sandbox, branch, and library with which you associated the workspace. 
- **Snapshot workspace(requiresR³S Development Manager)**: The Server hierarchy property of a snapshot workspace shows the name of the changeset, label, or sandbox, the branch, and the library from which you created the snapshot.' - Result grid The result grid of the Analyzer tab of a results workspace shows the results at different calculation dates for the current variable, the variables that it depends on, and the variables that depend on it if these results are available in sample output. The result grid is more useful for analyzing layers than data layers, because a data layer has only one calculation date, corresponding to the portfolio date of the model or model alias. Each scalar numeric and indicator variable has a checkbox before its name. Selecting one of these checkboxes clears the others. In the graph pane, R³S Modeler graphs the results for the variable whose checkbox you select. If events occur in a projection step of a layer, R³S Modeler shows these in pink in the result grid. To hide the results for the events, select the Hide events checkbox. This also stops events from being indicated by the green vertical line in the graph pane. If loops occur in a projection step of a layer, R³S Modeler shows these in blue in the result grid. To hide the results for the loops, select the Hide loops checkbox. Because loop variables have no time associated with them, they are never shown in the graph pane. Selecting the Hide events checkbox to hide event results or the Hide loops checkbox to hide loop results does not affect the results; it just hides them in the result grid. With these checkboxes selected, it might not be easy to understand the calculation of non-portfolio variables that are summed across event dates in the step or how the final values of loop variables have been extracted into step variables. The dependency diagram still shows all the precedents and is not affected by the checkboxes. 
The result grid shows the variable being analyzed in its first row. Precedent variables are shown immediately beneath the chosen variable, and dependent variables are shown below these. The currently selected date is highlighted with a yellow box. Yellow boxes also highlight the variable being analyzed and its precedents and dependents. Highlighting a cell in the result grid and pressing the Enter key makes the corresponding variable and date the subject of the analysis. You can also do this by right-clicking in the result grid and choosing Analyze from the context menu. Highlighting a cell in the result grid and pressing the Home key makes the corresponding variable the subject of the analysis at the layer start date. You can also do this by right-clicking in the result grid and choo - source_sentence: Are MtF views supported in R³S Modeler? sentences: - '## Properties - **MtF views**: - **General**: Name - **Filters**: Filter formula - **Data inputs**: File format - **MtF cube**: MtF view type - **Auditing**: Last modified - **Sub MtF views**: - **General**: Sub MtF view The other properties of a sub MtF view are the same as those of the underlying MtF view. - **MtF view variables**: - **General**: Variable - **Formula**: Formula - **Auditing**: Last modified Additionally, the properties of the variable specified in the Variable property are inherited as global properties.' - '## Remarks This function acts like the Choose_Life_Table function followed by the Reduce_Life_Table function. It avoids the need to use a separate life table variable to store the chosen life table before reducing it. When you use this function in the formula of a variable, that formula can contain nothing outside the call to this function. This means that you cannot include the function call as part of a larger expression in a single formula. Instead, you can use a variable to call the function and then refer to this variable in the formula of another variable. 
The function first uses the character expression to select a life table in the workspace by name. R³S Modeler knows how many dimensions the life table should have by counting the number of arguments. If there are no arguments for the additional dimensions then there should be just two dimensions: * Select_Duration should be dimension 1 with start position 0 and * Age should be dimension 2 with start position 0. If there are no life tables in the workspace with these dimensions and, where applicable, the dimension names and start positions you specify for dimensions 3, 4 and 5 then a generator error occurs. If the life table named in the character expression does not have the dimension names and start positions you specify then a runtime error occurs. Then the function removes any dimensions in addition to the mandatory Select_Duration and Age dimensions by selecting only the life table rates corresponding to the specified element position in each of the additional dimensions. This is similar to using the Slice function repeatedly. This then leaves a 2-dimensional life table for the function to reduce further. Suppose that: * The Select_Duration dimension has size r (corresponding to a select period of r-1). * The Age dimension has size n (corresponding to a maximum age of n-1). * Qx(x, t) denotes the select rate for current age x and current select duration t (t < r-1). * Qx(x, ) denotes the ultimate rate for current age x. * y is an integer that specifies the age at entry. The function returns a reduced life table with dimensions: * Select_Duration of size 1 and start position 0 and * Age of size n and start position 0. 
If y is in the select age range of the life table (greater than or equal to the value of the Minimum select age property and less than or equal to the value of the Maximum select age property) then the life table rates in the reduced life table are: | Select_Duration Age | 0 0 | 0' - 'The main topic ''Layers'' has the following related sub-topics: * **Layer examples** : The example user workspace includes examples of layers.' - source_sentence: What data structure is required for the 'Array' argument? sentences: - 'Maximum select age The Maximum select age property is a property of the following component: * Life table You can use this mandatory property to specify the highest age at entry for which select rates are available in the life table. R³S Modeler uses ultimate rates for ages at entry above this age. You can specify an integer greater than or equal to 0 and less than or equal to 200. The value you specify should be greater than or equal to the value of the Minimum select age property. When the size of the Select_Duration dimension of the life table is 1, the value you specify should be less than or equal to the size of the Age dimension minus 1 (that is, the maximum age of the life table), though this property makes no difference in this situation, because there are no select rates in the life table. When the size of the Select_Duration dimension is greater than 1, the value you specify should be less than or equal to the size of the Age dimension minus the size of the Select_Duration dimension plus 1 (that is, the maximum age of the life table plus 1 minus the select period of the life table) to as to give enough element positions in the Age dimension for the select rates for this maximum select age at entry.' - '## Circumstances The formula for the specified variable contains a Move_Left or Move_Right function call - say Move_Left(Array_1, , Array_2) or Move_Right(Array_1, , Array_2). 
The function call is invalid because the dimension start positions of Array_2 (the ''replacement'' array to be attached to Array_1) do not all match the start positions of the corresponding dimensions of Array_1.' - '## Arguments Array | An array variable or expression. Dimension | A dimension name of the arrayArray. The dimension must have at least one index value that is not blank.' - source_sentence: How does R3S Modeler handle non-integer arguments when the data type is indicator? sentences: - '## Remarks When you use the Range function in a formula or other property whose data type is numeric or indicator, all three arguments should be numeric or indicator expressions. When the data type of the formula or other property is indicator, R³S Modeler rounds values of the arguments that are not integers towards zero to give integers. When the data type of the formula or other property is date, the first two arguments should be date expressions. You can combine sequences of values from different calls to the Range and Set functions by separating the function calls with a semicolon (;). You can use the Range and Set functions in the Formula property of variables in a data view, database view or MtF view to produce multiple copies of each data record, which can be useful for producing test data.' - Pareto (not available in R³S Modeler Lite ) Returns values relating to the Pareto distribution. - '## Circumstances Division of an indicator variable by zero has been attempted.' - source_sentence: What inputs does Qx accept? sentences: - '## Examples q | =Qx(58, , LT) q | =Qx(59+1, 1, LT)' - 'Invalid Target Variable Message: '''' cannot be an array because it is the target input range.' - '## Circumstances The definition of a user function has no arguments defined.' 
pipeline_tag: sentence-similarity
library_name: sentence-transformers
---

# SentenceTransformer based on BAAI/bge-large-en-v1.5

This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [BAAI/bge-large-en-v1.5](https://huggingface.co/BAAI/bge-large-en-v1.5). It maps sentences & paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

## Model Details

### Model Description
- **Model Type:** Sentence Transformer
- **Base model:** [BAAI/bge-large-en-v1.5](https://huggingface.co/BAAI/bge-large-en-v1.5)
- **Maximum Sequence Length:** 384 tokens
- **Output Dimensionality:** 1024 dimensions
- **Similarity Function:** Cosine Similarity

### Model Sources

- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)

### Full Model Architecture

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 384, 'do_lower_case': True, 'architecture': 'BertModel'})
  (1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
```

## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load this model and run inference.
```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("dhruvnayee/help_texted_mined_r3s_0810")
# Run inference
sentences = [
    'What inputs does Qx accept?',
    '## Examples q\ue04f\ue052 | =Qx(58, , LT) q\ue028\ue04f\ue053\ue029\ue02a\ue027 | =Qx(59+1, 1, LT)',
    "Invalid Target Variable Message: '' cannot be an array because it is the target input range.",
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 1024]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.8317, 0.8164],
#         [0.8317, 1.0000, 0.4416],
#         [0.8164, 0.4416, 1.0000]])
```

## Training Details

### Training Dataset

#### Unnamed Dataset

* Size: 16,688 training samples
* Columns: `sentence_0`, `sentence_1`, and `sentence_2`
* Approximate statistics based on the first 1000 samples:
  |         | sentence_0 | sentence_1 | sentence_2 |
  |:--------|:-----------|:-----------|:-----------|
  | type    | string     | string     | string     |
  | details | <ul><li>min: 7 tokens</li><li>mean: 13.85 tokens</li><li>max: 26 tokens</li></ul> | <ul><li>min: 3 tokens</li><li>mean: 123.19 tokens</li><li>max: 384 tokens</li></ul> | <ul><li>min: 3 tokens</li><li>mean: 118.89 tokens</li><li>max: 384 tokens</li></ul> |
* Samples:
  | sentence_0 | sentence_1 | sentence_2 |
  |:-----------|:-----------|:-----------|
  | <code>What are the new features in this release?</code> | <code>## What's new from previous upgrades</code> | <code>What's new in version 1.2 Targeting * This functionality may be used to find the value of an input variable giving the specified value of an output variable, for example for use in profit testing Enhancements to distributed processing * This provides for greater functionality in distributed processing Enhancements to Compare * This extends the functionality of the Compare Page to allow comparison of multiple components Initialization variables * This new component allows greater flexibility and simplicity of coding variables, allowing different definitions at the outset of a projection and during the projection Stochastic processes (not available in R³S Modeler Lite ) * This involves enhancements to the existing stochastic process functionality Results workspaces * This enhances the information provided about related results workspaces on the opening page of a workspace Program Linker and Model Builder * This enables greater ease of adding and moving items within these tools Analyzer e...</code> |
  | <code>What does this element represent?</code> | <code>Data_Source_Name The Data_Source_Name system variable is a character variable that gives the name of the data source. You can use this system variable in a data source in a data process in the data layer of a model. This system variable is a placeholder variable.</code> | <code>Cannot rerun results workspace Message: It is not possible to rerun a results workspace that contained any model that failed to build.</code> |
  | <code>How does R3S Modeler create parent program records?</code> | <code>Record_Is_Last_Step The Record_Is_Last_Step system variable is an indicator variable that is 1 if the Record_End_Date system variable for the current program record is a date that is in the current projection step and 0 otherwise. This system variable is not defined for parent program records that R³S Modeler creates solely by aggregating child program records (and does not read from data).</code> | <code>## Circumstances This error occurs when the variable used in a formula does not exist in the workspace (for example, it is not in the Variable Chooser ). For example, if a variable, say, CF_Premium is defined as Prem_Annual * Prob_Surr and the variable Prob_Surr does not exist within the workspace then the above error will occur.</code> |
* Loss: [TripletLoss](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#tripletloss) with these parameters:
  ```json
  {
      "distance_metric": "TripletDistanceMetric.EUCLIDEAN",
      "triplet_margin": 5
  }
  ```

### Training Hyperparameters
#### Non-Default Hyperparameters

- `num_train_epochs`: 1
- `fp16`: True
- `multi_dataset_batch_sampler`: round_robin

#### All Hyperparameters
<details><summary>Click to expand</summary>

- `overwrite_output_dir`: False
- `do_predict`: False
- `eval_strategy`: no
- `prediction_loss_only`: True
- `per_device_train_batch_size`: 8
- `per_device_eval_batch_size`: 8
- `per_gpu_train_batch_size`: None
- `per_gpu_eval_batch_size`: None
- `gradient_accumulation_steps`: 1
- `eval_accumulation_steps`: None
- `torch_empty_cache_steps`: None
- `learning_rate`: 5e-05
- `weight_decay`: 0.0
- `adam_beta1`: 0.9
- `adam_beta2`: 0.999
- `adam_epsilon`: 1e-08
- `max_grad_norm`: 1
- `num_train_epochs`: 1
- `max_steps`: -1
- `lr_scheduler_type`: linear
- `lr_scheduler_kwargs`: {}
- `warmup_ratio`: 0.0
- `warmup_steps`: 0
- `log_level`: passive
- `log_level_replica`: warning
- `log_on_each_node`: True
- `logging_nan_inf_filter`: True
- `save_safetensors`: True
- `save_on_each_node`: False
- `save_only_model`: False
- `restore_callback_states_from_checkpoint`: False
- `no_cuda`: False
- `use_cpu`: False
- `use_mps_device`: False
- `seed`: 42
- `data_seed`: None
- `jit_mode_eval`: False
- `use_ipex`: False
- `bf16`: False
- `fp16`: True
- `fp16_opt_level`: O1
- `half_precision_backend`: auto
- `bf16_full_eval`: False
- `fp16_full_eval`: False
- `tf32`: None
- `local_rank`: 0
- `ddp_backend`: None
- `tpu_num_cores`: None
- `tpu_metrics_debug`: False
- `debug`: []
- `dataloader_drop_last`: False
- `dataloader_num_workers`: 0
- `dataloader_prefetch_factor`: None
- `past_index`: -1
- `disable_tqdm`: False
- `remove_unused_columns`: True
- `label_names`: None
- `load_best_model_at_end`: False
- `ignore_data_skip`: False
- `fsdp`: []
- `fsdp_min_num_params`: 0
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- `fsdp_transformer_layer_cls_to_wrap`: None
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- `deepspeed`: None
- `label_smoothing_factor`: 0.0
- `optim`: adamw_torch
- `optim_args`: None
- `adafactor`: False
- `group_by_length`: False
- `length_column_name`: length
- `ddp_find_unused_parameters`: None
- `ddp_bucket_cap_mb`: None
- `ddp_broadcast_buffers`: False
- `dataloader_pin_memory`: True
- `dataloader_persistent_workers`: False
- `skip_memory_metrics`: True
- `use_legacy_prediction_loop`: False
- `push_to_hub`: False
- `resume_from_checkpoint`: None
- `hub_model_id`: None
- `hub_strategy`: every_save
- `hub_private_repo`: None
- `hub_always_push`: False
- `gradient_checkpointing`: False
- `gradient_checkpointing_kwargs`: None
- `include_inputs_for_metrics`: False
- `include_for_metrics`: []
- `eval_do_concat_batches`: True
- `fp16_backend`: auto
- `push_to_hub_model_id`: None
- `push_to_hub_organization`: None
- `mp_parameters`: 
- `auto_find_batch_size`: False
- `full_determinism`: False
- `torchdynamo`: None
- `ray_scope`: last
- `ddp_timeout`: 1800
- `torch_compile`: False
- `torch_compile_backend`: None
- `torch_compile_mode`: None
- `dispatch_batches`: None
- `split_batches`: None
- `include_tokens_per_second`: False
- `include_num_input_tokens_seen`: False
- `neftune_noise_alpha`: None
- `optim_target_modules`: None
- `batch_eval_metrics`: False
- `eval_on_start`: False
- `use_liger_kernel`: False
- `eval_use_gather_object`: False
- `average_tokens_across_devices`: False
- `prompts`: None
- `batch_sampler`: batch_sampler
- `multi_dataset_batch_sampler`: round_robin
- `router_mapping`: {}
- `learning_rate_mapping`: {}

</details>
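
The hyperparameters above, together with the dataset size of 16,688 samples, determine how optimizer steps map onto the fractional Epoch values reported in the Training Logs. The following is a minimal arithmetic sketch, assuming training ran on a single device (the card does not state the device count, so `num_devices` is a hypothetical value chosen for illustration):

```python
import math

# Values taken from this model card.
num_samples = 16_688
per_device_train_batch_size = 8
gradient_accumulation_steps = 1

# Assumption: one device. The card does not state how many were used.
num_devices = 1

effective_batch_size = (
    per_device_train_batch_size * num_devices * gradient_accumulation_steps
)

# dataloader_drop_last is False, so a partial final batch would still count.
steps_per_epoch = math.ceil(num_samples / effective_batch_size)


def epoch_at(step: int) -> float:
    """Fraction of the epoch completed after `step` optimizer steps."""
    return round(step / steps_per_epoch, 4)


logged_steps = (500, 1000, 1500, 2000)
epochs = [epoch_at(step) for step in logged_steps]
print(steps_per_epoch)  # 2086
print(epochs)           # [0.2397, 0.4794, 0.7191, 0.9588]
```

Under the single-device assumption, these fractions agree with the Epoch column of the training log, which suggests the effective batch size was 8.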
### Training Logs
| Epoch  | Step | Training Loss |
|:------:|:----:|:-------------:|
| 0.2397 | 500  | 4.8389        |
| 0.4794 | 1000 | 4.7385        |
| 0.7191 | 1500 | 4.7068        |
| 0.9588 | 2000 | 4.7199        |

### Framework Versions
- Python: 3.11.11
- Sentence Transformers: 5.1.1
- Transformers: 4.49.0
- PyTorch: 2.5.1+cu124
- Accelerate: 1.3.0
- Datasets: 3.2.0
- Tokenizers: 0.21.0

## Citation

### BibTeX

#### Sentence Transformers
```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```

#### TripletLoss
```bibtex
@misc{hermans2017defense,
    title={In Defense of the Triplet Loss for Person Re-Identification},
    author={Alexander Hermans and Lucas Beyer and Bastian Leibe},
    year={2017},
    eprint={1703.07737},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}
```
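
The TripletLoss configuration listed under Training Details (Euclidean distance, margin 5) can be illustrated with a small pure-Python sketch. This is not the library's implementation; the helper names `euclidean`, `l2_normalize`, and `triplet_loss` are hypothetical, and the sketch assumes that the `Normalize()` module in the architecture is applied during training, so that all embeddings have unit length.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def l2_normalize(v):
    """Scale a vector to unit length (stand-in for the Normalize() module)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def triplet_loss(anchor, positive, negative, margin=5.0):
    """max(d(anchor, positive) - d(anchor, negative) + margin, 0)."""
    return max(
        euclidean(anchor, positive) - euclidean(anchor, negative) + margin, 0.0
    )


# Toy 2-D unit vectors: the positive coincides with the anchor and the
# negative points in the opposite direction, which is the most favourable
# arrangement possible for unit-length embeddings.
anchor = l2_normalize([1.0, 0.0])
positive = l2_normalize([2.0, 0.0])
negative = l2_normalize([-3.0, 0.0])

best_case = triplet_loss(anchor, positive, negative)
print(best_case)  # 3.0
```

If the unit-length assumption holds, two embeddings are never more than 2 apart, so with a margin of 5 the per-triplet loss cannot fall below 3. The logged training losses of roughly 4.7 to 4.8 may be worth reading with that floor in mind.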