Upload folder using huggingface_hub

- .gitignore +1 -1
- README.md +6 -3
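A commit with this message is what the `huggingface_hub` `upload_folder` API produces. A minimal sketch of how such a commit can be pushed (the repo id is a placeholder, and the ignore patterns simply mirror this repository's `.gitignore`):

```python
# Patterns mirrored from this repository's .gitignore so local clutter
# is not uploaded alongside the model files.
IGNORE_PATTERNS = ["*__pycache__", "*.idea", "*.DS_Store", "*data/"]

def push_repo(folder: str, repo_id: str) -> None:
    """Upload a local folder as a single commit to the Hugging Face Hub."""
    # Imported lazily so the sketch can be read without huggingface_hub installed.
    from huggingface_hub import HfApi

    api = HfApi()
    api.upload_folder(
        folder_path=folder,
        repo_id=repo_id,  # e.g. "your-org/ovle-models" (hypothetical id)
        commit_message="Upload folder using huggingface_hub",
        ignore_patterns=IGNORE_PATTERNS,
    )
```

`upload_folder` batches the whole directory into a single commit, which is why both files below appear in one change set.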
.gitignore CHANGED

@@ -1,4 +1,4 @@
 *__pycache__
 *.idea
 *.DS_Store
-*data/
+*data/
README.md CHANGED

@@ -47,15 +47,18 @@ Such a benchmark would demand substantial new data collection efforts and instru
 Consequently, we evaluate our models indirectly, using surrogate metrics (e.g., cross-modal retrieval performance, odor descriptor classification accuracy, clustering quality).
 While these evaluations do not provide ground-truth verification of odor presence in images, they offer a first step toward demonstrating alignment between modalities.
 We draw analogy from past successes in ML datasets such as precursors to CLIP that lacked large paired datasets and were evaluated on retrieval-like tasks.
-
+Just as CLIP used contrastive objectives to construct vision-language relationships, we borrow similar principles to strengthen olfaction-vision-language weights.
+Humans interpret smell with lingual descriptors such as "fruity" and "musky", allowing language to act as a bridge between olfaction and vision data.
+
+Whether these models are used for better vision-scent navigation with drones, triangulating the source of an odor in an image, extracting aromas from a scene, or augmenting a VR experience with scent, we hope their release will catalyze further research and encourage the community to contribute to building standardized datasets and evaluation protocols for olfaction-vision-language learning.
 
 
 ## Models
 We offer four embedding models with this repository:
-- (1) `ovle-large-base`: The original OVL base model. This model is optimal for online tasks where accuracy is
+- (1) `ovle-large-base`: The original OVL base model. This model is optimal for online tasks where accuracy is critical.
 - (2) `ovle-large-graph`: The OVL base model built around a graph-attention-convolution network. This model is optimal for online tasks where accuracy is paramount and inference time is not as critical.
 - (3) `ovle-small-base`: The original OVL base model optimized for faster inference and edge-based robotics. This model is optimized for export to common frameworks that run on Android, iOS, Rust, and others.
-- (4) `ovle-small-graph`: The OVL graph model optimized for faster inference and edge robotics applications.
+- (4) `ovle-small-graph`: The OVL graph-attention-convolution model optimized for faster inference and edge robotics applications.
 
 ## Directory Structure
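The cross-modal retrieval metric the README cites as a surrogate evaluation reduces to nearest-neighbour search in the shared embedding space. A minimal sketch with toy 3-d vectors (all embeddings below are made up for illustration; they are not outputs of the released models):

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def retrieve(query_emb, candidates):
    """Return the index of the candidate embedding closest to the query."""
    sims = [cosine(query_emb, c) for c in candidates]
    return max(range(len(sims)), key=sims.__getitem__)

# Toy aligned pairs: each image embedding should match the odor-descriptor
# embedding at the same index if the modalities are well aligned.
image_embs = [[1.0, 0.1, 0.0], [0.0, 1.0, 0.2], [0.1, 0.0, 1.0]]
odor_embs = [[0.9, 0.2, 0.1], [0.1, 1.0, 0.1], [0.0, 0.1, 0.9]]

# Recall@1: fraction of images whose nearest odor embedding is the paired one.
hits = sum(retrieve(img, odor_embs) == i for i, img in enumerate(image_embs))
recall_at_1 = hits / len(image_embs)
```

With real models, the same recall@1 computation is run over held-out image/descriptor pairs; higher recall indicates stronger cross-modal alignment without requiring ground-truth odor labels for the images.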