Susant-Achary's Collections
Audio Features
Updated Oct 2
laion/clap-htsat-fused
Feature Extraction • 0.2B parameters • Updated Mar 28 • 18M downloads • 40 likes