agro_all-MiniLM-L6-v2_additive_gcn_h512_o64_cosine_e128_early
This is a sentence-transformers model created with on2vec, which augments text embeddings with ontological knowledge using Graph Neural Networks.
Model Details
- Base Text Model: all-MiniLM-L6-v2
- Text Embedding Dimension: 384
- Ontology: agro.owl
- Domain: general
- Ontology Concepts: 4,162
- Concept Alignment: 4,162/4,162 (100.0%)
- Fusion Method: additive
- GNN Architecture: GCN
- Structural Embedding Dimension: 4162
- Output Embedding Dimension: 64
- Hidden Dimensions: 512
- Dropout: 0.0
- Training Date: 2025-09-19
- on2vec Version: 0.1.0
- Source Ontology Size: 7.2 MB
- Model Size: 120.6 MB
- Library: on2vec + sentence-transformers
Technical Architecture
This model uses a multi-stage architecture:
- Text Encoding: Input text is encoded using the base sentence-transformer model
- Ontological Embedding: Pre-trained GNN embeddings capture structural relationships
- Fusion Layer: Element-wise addition (additive fusion) of the projected text and ontological embeddings
Embedding Flow:
- Text: 384 dimensions → 512 hidden → 64 output
- Structure: 4162 concepts → GNN → 64 output
- Fusion: additive → Final embedding
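The additive fusion step can be sketched numerically. This is a minimal NumPy illustration, not the actual on2vec implementation: the weight matrices are random stand-ins, and the real model learns these parameters during fusion training.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in values; shapes follow the dimensions listed above.
text_emb = rng.normal(size=384)          # base sentence-transformer output
W_hidden = rng.normal(size=(384, 512))   # text projection: 384 -> 512 (hypothetical weights)
W_out = rng.normal(size=(512, 64))       # projection to output dim: 512 -> 64
struct_emb = rng.normal(size=64)         # GNN embedding for the aligned concept

# Project the text embedding through the hidden layer to the output dim
text_proj = np.maximum(text_emb @ W_hidden, 0.0) @ W_out  # 384 -> 512 -> 64

# Additive fusion: both embeddings share the 64-dim output space, so they sum
fused = text_proj + struct_emb
print(fused.shape)  # (64,)
```

Because addition requires both inputs to share a dimension, both branches are projected to the 64-dimensional output space before fusing.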
How It Works
This model combines:
- Text Embeddings: Generated using the base sentence-transformer model
- Ontological Embeddings: Created by training Graph Neural Networks on OWL ontology structure
- Fusion Layer: Combines both embedding types using the specified fusion method
The ontological knowledge helps the model better understand domain-specific relationships and concepts.
Usage
from sentence_transformers import SentenceTransformer
from sentence_transformers.util import cos_sim

# Load the model
model = SentenceTransformer('agro_all-MiniLM-L6-v2_additive_gcn_h512_o64_cosine_e128_early')

# Generate embeddings
sentences = ['Example sentence 1', 'Example sentence 2']
embeddings = model.encode(sentences)

# Compute cosine similarity between the two embeddings
similarity = cos_sim(embeddings[0], embeddings[1])
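For intuition, cosine similarity measures the angle between two embedding vectors, ignoring their magnitudes. A minimal NumPy equivalent of what `cos_sim` computes (illustrative only, not the library's implementation):

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between two 1-D vectors (illustrative helper)."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

v = np.array([1.0, 0.0, 1.0])
w = np.array([0.0, 1.0, 0.0])
print(cosine_similarity(v, v))  # 1.0 (identical direction)
print(cosine_similarity(v, w))  # 0.0 (orthogonal)
```

Values range from -1 (opposite) through 0 (orthogonal) to 1 (identical direction), which is why cosine similarity is the standard choice for comparing sentence embeddings.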
Training Process
This model was created using the on2vec pipeline:
- Ontology Processing: The OWL ontology was converted to a graph structure
- GNN Training: Graph Neural Networks were trained to learn ontological relationships
- Text Integration: Base model text embeddings were combined with ontological embeddings
- Fusion Training: The fusion layer was trained to optimally combine both embedding types
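The GNN training step above uses a GCN (per the model metadata). A single GCN message-passing layer over a toy ontology graph can be sketched as follows; the graph, feature dimensions, and weights here are illustrative stand-ins, not the actual agro.owl graph or trained parameters:

```python
import numpy as np

# Toy ontology graph: 4 concepts, edges representing subclass relations (illustrative)
edges = [(0, 1), (1, 2), (1, 3)]
n = 4

# Adjacency matrix with self-loops, as in the standard GCN propagation rule
A = np.eye(n)
for i, j in edges:
    A[i, j] = A[j, i] = 1.0

# Symmetric normalization: D^{-1/2} A D^{-1/2}
d_inv_sqrt = 1.0 / np.sqrt(A.sum(axis=1))
A_norm = A * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

rng = np.random.default_rng(0)
X = rng.normal(size=(n, 8))    # initial concept features (toy dimension)
W = rng.normal(size=(8, 64))   # learnable layer weights (random stand-in)

# One GCN layer: aggregate normalized neighbor features, project, apply ReLU
H = np.maximum(A_norm @ X @ W, 0.0)
print(H.shape)  # (4, 64)
```

Each concept's output embedding mixes its own features with those of its graph neighbors, which is how structural relationships from the ontology end up encoded in the 64-dimensional concept embeddings.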
Intended Use
This model is particularly effective for:
- General domain text processing
- Tasks requiring understanding of domain-specific relationships
- Semantic similarity in specialized domains
- Classification tasks with domain knowledge requirements
Limitations
- Performance may vary on domains different from the training ontology
- Ontological knowledge is limited to concepts present in the source OWL file
- May have higher computational requirements than vanilla text models
Citation
If you use this model, please cite the on2vec framework:
@software{on2vec,
  title={on2vec: Ontology Embeddings with Graph Neural Networks},
  author={David Steinberg},
  url={https://github.com/david4096/on2vec},
  year={2024}
}
Created with on2vec