|
|
--- |
|
|
license: mit |
|
|
base_model: |
|
|
- timm/maxvit_tiny_tf_224.in1k |
|
|
pipeline_tag: zero-shot-classification |
|
|
datasets: |
|
|
- AbstractPhil/geometric-vocab |
|
|
--- |
|
|
|
|
|
# The models uploaded are no longer based on MaxViT, so this repo is to be archived.
|
|
|
|
|
The major achievement here is a ~300 KB pentachora ViT that reaches 25% top-1 and 80% top-5 accuracy on CIFAR-100. This is a legitimate showcase and proof of concept: it shows not only that the geometry and structural integrity withstand large amounts of information, but that the features and CLS structure are not just semantic - they're deterministic and repeatable.
|
|
|
|
|
The internal structure no longer resembles MaxViT even slightly. It has diverged far from, and no longer houses, any of the original conceptualizations that max-vit-goliath entailed.
|
|
|
|
|
If you have been keeping up on the journey, know that I will not slow down. The next repo will contain the full manifest of the "penta-vit" and the vision of how the patches will function in an entirely new systemic capacity.
|
|
|
|
|
Thank you for your time. *bows head* |
|
|
|
|
|
# Spark V2 - Non-random pentas.
|
|
|
|
|
The early prototype below was built from purely random pentas; checking the saved vocabulary outputs confirmed it wasn't actually using the vocabulary.
|
|
|
|
|
The vocabulary should now match uniformly across all of the variants.
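One way to verify that the variants really share a vocabulary is to fingerprint the saved tensors and compare. A minimal sketch, assuming the vocabularies load as plain tensors (helper names here are hypothetical, not from this repo):

```python
import hashlib

import torch

def vocab_fingerprint(vocab: torch.Tensor) -> str:
    """Hash the raw bytes of a vocabulary tensor so variants can be compared."""
    data = vocab.detach().cpu().contiguous().numpy().tobytes()
    return hashlib.sha256(data).hexdigest()[:16]

# Two variants built from the same vocabulary must fingerprint identically,
# while any perturbation changes the hash.
vocab = torch.randn(100, 5, 100)  # [classes, vertices, dim]
print(vocab_fingerprint(vocab) == vocab_fingerprint(vocab.clone()))  # True
```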
|
|
|
|
|
|
|
|
# Updated again - Spark has variants. |
|
|
|
|
|
It works boys n grills. We have a micro-sized geometric ViT model that works. |
|
|
|
|
|
Now let's provide the lightning that makes the Nikola architecture truly unique - baked cleanly into our geometric structure with our geometric attention relay.
|
|
|
|
|
The current model.py contains the weights I'm training, which makes this direct proof of geometric structural integrity solidifying smaller structures into a much more potent shape.
|
|
|
|
|
Nikola's resonant formulas will assist with this one, as it took well to the geometric attention built specifically for the coil architecture. Let's see how she behaves in the coming days.
|
|
|
|
|
Currently I'm going to run about 50 of these to see how she behaves with CIFAR-100 and various settings.
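A batch of ~50 runs like that can be driven by a simple random sweep over a settings grid. A sketch with made-up knob names (the real hyperparameters live in the training script, not here):

```python
import itertools
import random

# Hypothetical settings grid; substitute the actual training knobs.
grid = {
    "dim": [64, 100],
    "depth": [5, 8],
    "lr": [3e-4, 1e-3],
}

def sample_runs(grid: dict, n_runs: int = 50, seed: int = 0):
    """Draw n_runs random configurations from the cartesian product of the grid."""
    rng = random.Random(seed)
    keys = list(grid)
    combos = list(itertools.product(*(grid[k] for k in keys)))
    return [dict(zip(keys, rng.choice(combos))) for _ in range(n_runs)]

runs = sample_runs(grid)
print(len(runs), runs[0])
```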
|
|
|
|
|
|
|
|
```TEXT |
|
|
Model Configuration: |
|
|
Internal dim: 100 |
|
|
Vocab dim: 100 |
|
|
Num classes: 100 |
|
|
Crystal shape: torch.Size([100, 5, 100]) |
|
|
Evaluating: 100%|██████████| 100/100 [00:02<00:00, 37.96it/s] |
|
|
|
|
|
================================================================================ |
|
|
EVALUATION RESULTS |
|
|
================================================================================ |
|
|
|
|
|
Overall Accuracy: 53.50% |
|
|
Auxiliary Head Accuracy: 52.97% |
|
|
|
|
|
Top 10 Classes: |
|
|
Class Acc% Conf GeoAlign CrystalNorm |
|
|
---------------------------------------------------------------------- |
|
|
wardrobe 87.0 0.703 0.829 0.308 |
|
|
orange 84.0 0.708 0.839 0.298 |
|
|
road 84.0 0.772 0.626 0.327 |
|
|
sunflower 84.0 0.749 0.756 0.260 |
|
|
plain 80.0 0.692 0.763 0.306 |
|
|
skyscraper 80.0 0.669 0.631 0.255 |
|
|
apple 78.0 0.681 0.821 0.275 |
|
|
cloud 77.0 0.725 0.758 0.267 |
|
|
aquarium_fish 75.0 0.606 0.473 0.266 |
|
|
chair 73.0 0.709 0.696 0.279 |
|
|
|
|
|
Bottom 10 Classes: |
|
|
Class Acc% Conf GeoAlign CrystalNorm |
|
|
---------------------------------------------------------------------- |
|
|
kangaroo 33.0 0.434 0.601 0.316 |
|
|
man 33.0 0.461 0.554 0.321 |
|
|
squirrel 33.0 0.479 0.538 0.274 |
|
|
woman 33.0 0.399 0.576 0.289 |
|
|
boy 31.0 0.465 0.573 0.299 |
|
|
bus 31.0 0.526 0.694 0.298 |
|
|
possum 31.0 0.486 0.619 0.284 |
|
|
lizard 28.0 0.432 0.452 0.274 |
|
|
crocodile 25.0 0.408 0.481 0.310 |
|
|
seal 25.0 0.441 0.475 0.325 |
|
|
|
|
|
Correlations with Accuracy: |
|
|
Geometric Alignment: 0.493 |
|
|
Crystal Norm: -0.210 |
|
|
Vertex Variance: -0.194 |
|
|
``` |
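The correlation figures at the bottom of that report are plain Pearson correlations between each per-class statistic and per-class accuracy. A minimal sketch of that computation with toy numbers (not the real per-class data):

```python
import numpy as np

def metric_correlations(acc, **metrics):
    """Pearson correlation of each per-class metric against per-class accuracy."""
    acc = np.asarray(acc, dtype=float)
    return {name: float(np.corrcoef(acc, np.asarray(vals, dtype=float))[0, 1])
            for name, vals in metrics.items()}

# Toy data: geo_align roughly tracks accuracy, crystal_norm moves against it.
acc          = [87, 84, 33, 25]
geo_align    = [0.83, 0.84, 0.60, 0.48]
crystal_norm = [0.31, 0.30, 0.32, 0.33]
print(metric_correlations(acc, geo_align=geo_align, crystal_norm=crystal_norm))
```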
|
|
|
|
|
|
|
|
|
|
|
|
|
# Updated - Spark works. |
|
|
|
|
|
max-vit-goliath-spark is essentially a 300k-parameter ViT that reaches nearly identical accuracy to the larger model, with shockingly robust utility of the features.
|
|
|
|
|
```PYTHON |
|
|
'pentachora_spark': PentachoraConfig( |
|
|
dim=64, depth=5, heads=4, mlp_ratio=4.0, |
|
|
preserve_structure_until_layer=2, |
|
|
dropout_rate=0.0, drop_path_rate=0.0 |
|
|
), |
|
|
``` |
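For a quick sanity check on the "~300k param" claim, counting a model's trainable parameters is one line over `model.parameters()`. A sketch with a toy stand-in module (not the actual spark model):

```python
import torch.nn as nn

def count_params(model: nn.Module) -> int:
    """Total trainable parameter count of any nn.Module."""
    return sum(p.numel() for p in model.parameters() if p.requires_grad)

# Toy stand-in: a tiny MLP at the spark's dim=64.
toy = nn.Sequential(nn.Linear(64, 256), nn.GELU(), nn.Linear(256, 64))
print(count_params(toy))  # 64*256 + 256 + 256*64 + 64 = 33088
```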
|
|
|
|
|
A 64-dim vocabulary is effectively carrying the entire ViT.
|
|
It's using a particularly effective geometric attention. |
|
|
|
|
|
The output produces effective image feature representations in geometric format.
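One common way to anchor attention to a fixed set of simplex vertices is cosine similarity between tokens and the flattened crystal. This is an illustrative sketch of that general idea, not the repo's actual geometric attention:

```python
import torch
import torch.nn.functional as F

def vertex_attention(tokens: torch.Tensor, crystal: torch.Tensor) -> torch.Tensor:
    """
    Illustrative attention over fixed geometric vertices.
    tokens:  [B, N, D] image tokens
    crystal: [C, V, D] pentachora (classes x vertices x dim)
    Each token attends to every vertex by cosine similarity and
    returns a vertex-weighted feature per token.
    """
    verts = crystal.reshape(-1, crystal.shape[-1])                        # [C*V, D]
    sims = F.normalize(tokens, dim=-1) @ F.normalize(verts, dim=-1).T     # [B, N, C*V]
    weights = sims.softmax(dim=-1)
    return weights @ verts                                                # [B, N, D]

out = vertex_attention(torch.randn(2, 16, 64), torch.randn(100, 5, 64))
print(out.shape)  # torch.Size([2, 16, 64])
```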
|
|
|
|
|
|
|
|
|
|
|
|
|
```TEXT
|
|
Final Results: |
|
|
Best Validation Accuracy: 54.15% |
|
|
Final Train Loss: 2.1262 |
|
|
Final Val Loss: 3.6396 |
|
|
``` |
|
|
|
|
|
# Original post |
|
|
Currently it's only a pickled early version at roughly 50% accuracy.
|
|
|
|
|
This one is a 12-layer, 8-head variation of max-vit-goliath trained on the geometric vocab with CIFAR-100 using a specialized 5D format. It's WORKING - somewhat - but it's definitely nothing to phone home about yet.
|
|
|
|
|
Dropout was used, and I really don't like what it did to the internals. The math doesn't line up correctly and the shapes are all over the board. The next version will be cleaner.
|
|
|
|
|
I've included the weights in a file for posterity, as this version may be abandoned, but I want to preserve the A100 80 GB time that Google sliced off for me yesterday. If that was intentional, thank you; if it was random, then the universe wanted this to exist. Either way, we're here now.