agro_all-MiniLM-L6-v2_additive_gcn_h512_o64_cosine_e128_early
This is a sentence-transformers model created with on2vec, which augments text embeddings with ontological knowledge using Graph Neural Networks.
Model Details
- Base Text Model: all-MiniLM-L6-v2
- Text Embedding Dimension: 384
- Ontology: agro.owl
- Domain: general
- Ontology Concepts: 4,162
- Concept Alignment: 4,162/4,162 (100.0%)
- Fusion Method: additive
- GNN Architecture: GCN
- Structural Embedding Dimension: 4162
- Output Embedding Dimension: 64
- Hidden Dimensions: 512
- Dropout: 0.0
- Training Date: 2025-09-19
- on2vec Version: 0.1.0
- Source Ontology Size: 7.2 MB
- Model Size: 120.6 MB
- Library: on2vec + sentence-transformers
Technical Architecture
This model uses a multi-stage architecture:
- Text Encoding: Input text is encoded using the base sentence-transformer model
- Ontological Embedding: Pre-trained GNN embeddings capture structural relationships
- Fusion Layer: Element-wise addition (additive fusion) of the projected text and ontological embeddings
Embedding Flow:
- Text: 384 dimensions → 512 hidden → 64 output
- Structure: 4162 concepts → GNN → 64 output
- Fusion: additive → Final embedding
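The additive fusion step can be sketched numerically. This is a minimal NumPy illustration, not the actual on2vec implementation: the weight matrices are random stand-ins, and the real model learns these parameters during fusion training.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in values; shapes follow the dimensions listed above.
text_emb = rng.normal(size=384)          # base sentence-transformer output
W_hidden = rng.normal(size=(384, 512))   # text projection: 384 -> 512 (hypothetical weights)
W_out = rng.normal(size=(512, 64))       # projection to output dim: 512 -> 64
struct_emb = rng.normal(size=64)         # GNN embedding for the aligned concept

# Project the text embedding through the hidden layer to the output dim
text_proj = np.maximum(text_emb @ W_hidden, 0.0) @ W_out  # 384 -> 512 -> 64

# Additive fusion: both embeddings share the 64-dim output space, so they sum
fused = text_proj + struct_emb
print(fused.shape)  # (64,)
```

Because addition requires both inputs to share a dimension, both branches are projected to the 64-dimensional output space before fusing.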
How It Works
This model combines:
- Text Embeddings: Generated using the base sentence-transformer model
- Ontological Embeddings: Created by training Graph Neural Networks on OWL ontology structure
- Fusion Layer: Combines both embedding types using the specified fusion method
The ontological knowledge helps the model better understand domain-specific relationships and concepts.
Usage
from sentence_transformers import SentenceTransformer
from sentence_transformers.util import cos_sim

# Load the model
model = SentenceTransformer('agro_all-MiniLM-L6-v2_additive_gcn_h512_o64_cosine_e128_early')

# Generate embeddings
sentences = ['Example sentence 1', 'Example sentence 2']
embeddings = model.encode(sentences)

# Compute cosine similarity between the two embeddings
similarity = cos_sim(embeddings[0], embeddings[1])
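For intuition, cosine similarity measures the angle between two embedding vectors, ignoring their magnitudes. A minimal NumPy equivalent of what `cos_sim` computes (illustrative only, not the library's implementation):

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between two 1-D vectors (illustrative helper)."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

v = np.array([1.0, 0.0, 1.0])
w = np.array([0.0, 1.0, 0.0])
print(cosine_similarity(v, v))  # 1.0 (identical direction)
print(cosine_similarity(v, w))  # 0.0 (orthogonal)
```

Values range from -1 (opposite) through 0 (orthogonal) to 1 (identical direction), which is why cosine similarity is the standard choice for comparing sentence embeddings.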
Training Process
This model was created using the on2vec pipeline:
- Ontology Processing: The OWL ontology was converted to a graph structure
- GNN Training: Graph Neural Networks were trained to learn ontological relationships
- Text Integration: Base model text embeddings were combined with ontological embeddings
- Fusion Training: The fusion layer was trained to optimally combine both embedding types
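The GNN training step above uses a GCN (per the model metadata). A single GCN message-passing layer over a toy ontology graph can be sketched as follows; the graph, feature dimensions, and weights here are illustrative stand-ins, not the actual agro.owl graph or trained parameters:

```python
import numpy as np

# Toy ontology graph: 4 concepts, edges representing subclass relations (illustrative)
edges = [(0, 1), (1, 2), (1, 3)]
n = 4

# Adjacency matrix with self-loops, as in the standard GCN propagation rule
A = np.eye(n)
for i, j in edges:
    A[i, j] = A[j, i] = 1.0

# Symmetric normalization: D^{-1/2} A D^{-1/2}
d_inv_sqrt = 1.0 / np.sqrt(A.sum(axis=1))
A_norm = A * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

rng = np.random.default_rng(0)
X = rng.normal(size=(n, 8))    # initial concept features (toy dimension)
W = rng.normal(size=(8, 64))   # learnable layer weights (random stand-in)

# One GCN layer: aggregate normalized neighbor features, project, apply ReLU
H = np.maximum(A_norm @ X @ W, 0.0)
print(H.shape)  # (4, 64)
```

Each concept's output embedding mixes its own features with those of its graph neighbors, which is how structural relationships from the ontology end up encoded in the 64-dimensional concept embeddings.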
Intended Use
This model is particularly effective for:
- General domain text processing
- Tasks requiring understanding of domain-specific relationships
- Semantic similarity in specialized domains
- Classification tasks with domain knowledge requirements
Limitations
- Performance may vary on domains different from the training ontology
- Ontological knowledge is limited to concepts present in the source OWL file
- May have higher computational requirements than vanilla text models
Citation
If you use this model, please cite the on2vec framework:
@software{on2vec,
  title={on2vec: Ontology Embeddings with Graph Neural Networks},
  author={David Steinberg},
  url={https://github.com/david4096/on2vec},
  year={2024}
}
Created with on2vec