Hello Hugging Face Team,
We are Indro-ai, a research-focused initiative dedicated to pushing the boundaries of Small Language Models (SLMs). We are currently developing Indro-Veda-500M, a model designed to prioritize high-level reasoning, logic, and algorithmic thinking over simple fact-recall.
The Project: Indro-Veda
Most sub-billion-parameter models struggle with complex reasoning. Indro-Veda aims to solve this by training a 500M-parameter Llama-style architecture on a highly curated 3-billion-token dataset (Indro-Veda-Dataset).
Our Dataset Mixture (a rough sampling sketch follows the list):
Mathematical Reasoning: UltraData-Math for logical derivation.
Algorithmic Logic: StarCoderdata for structured thinking.
High-Signal Knowledge: FineWeb-Edu for educational depth.
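To make the mixture concrete, here is a minimal sketch of how the three sources could be interleaved with `datasets.interleave_datasets`; the repo ids, text columns, and sampling probabilities below are placeholders rather than our final recipe.

```python
from datasets import load_dataset, interleave_datasets

# Placeholder repo ids and text columns; the real recipe may differ.
sources = [
    ("UltraData-Math", "text"),              # hypothetical repo id
    ("bigcode/starcoderdata", "content"),
    ("HuggingFaceFW/fineweb-edu", "text"),
]

streams = []
for repo_id, column in sources:
    ds = load_dataset(repo_id, split="train", streaming=True)
    ds = ds.select_columns([column])         # align every source on one schema
    if column != "text":
        ds = ds.rename_column(column, "text")
    streams.append(ds)

mixture = interleave_datasets(
    streams,
    probabilities=[0.3, 0.3, 0.4],           # illustrative weights, not tuned
    seed=42,
    stopping_strategy="all_exhausted",
)
```

Streaming keeps the full mixture off local disk, which matters on session-limited infrastructure.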
The Challenge
Currently, we are training on Kaggle TPU v5e-8 infrastructure, but the 9-hour session limit poses a significant hurdle for continuous training. The overhead of frequent checkpointing (optimizer states for 500M parameters) and the constant need to resume from state files slow our progress toward completing the full 3B-token run (and potentially scaling to 10B+ tokens).
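For context, each 9-hour window loses time to a cycle roughly like the following (shown in plain PyTorch for readability rather than as our exact TPU training script; names such as CKPT_PATH are illustrative):

```python
import os
import torch

CKPT_PATH = "checkpoints/indro_veda_500m_latest.pt"  # illustrative path

def save_checkpoint(model, optimizer, scheduler, step):
    # Optimizer state (e.g. AdamW moments) adds substantially to the payload
    # beyond the 500M weights, which is where the checkpointing overhead comes from.
    torch.save(
        {
            "model": model.state_dict(),
            "optimizer": optimizer.state_dict(),
            "scheduler": scheduler.state_dict(),
            "step": step,
        },
        CKPT_PATH,
    )

def load_checkpoint(model, optimizer, scheduler):
    if not os.path.exists(CKPT_PATH):
        return 0  # fresh run
    state = torch.load(CKPT_PATH, map_location="cpu")
    model.load_state_dict(state["model"])
    optimizer.load_state_dict(state["optimizer"])
    scheduler.load_state_dict(state["scheduler"])
    return state["step"]  # resume data order and LR schedule from here
```

A longer-lived instance would let us checkpoint far less often and drop the repeated save/load cycle almost entirely.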
The Request
We are seeking a Community Compute Grant (an A100/H100 instance) for 10-15 days. This stable environment will allow us to:
Complete the full pre-training of Indro-Veda-500M without interruptions.
Experiment with larger batch sizes and Flash Attention 2 for efficiency (see the sketch after this list).
Release a high-quality, open-source reasoning model for the global community.
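Illustratively, on an A100/H100 the efficiency experiments above would amount to something like the following; aside from the model id, the values are placeholders to be tuned on the grant hardware, not settled hyperparameters.

```python
import torch
from transformers import AutoModelForCausalLM, TrainingArguments

# Flash Attention 2 requires a CUDA GPU, fp16/bf16 weights, and the flash-attn package.
model = AutoModelForCausalLM.from_pretrained(
    "Indro-ai/Indro-Veda-500M",
    torch_dtype=torch.bfloat16,
    attn_implementation="flash_attention_2",
)

args = TrainingArguments(
    output_dir="indro-veda-500m-pretrain",
    per_device_train_batch_size=64,     # placeholder; to be tuned on the A100/H100
    gradient_accumulation_steps=4,
    bf16=True,
    save_steps=2000,                    # far less frequent than on session-limited TPUs
    logging_steps=50,
)
```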
Relevant Links:
Model Card: huggingface.co/Indro-ai/Indro-Veda-500M
Dataset: huggingface.co/datasets/Indro-ai/indro-web-data
We believe in the democratization of AI and are committed to keeping Indro-Veda fully open-source. We would be honored to have Hugging Face’s support in this journey.
Best regards,
Team Indro-ai
