---
license: apache-2.0
language:
- en
pipeline_tag: text-generation
library_name: transformers.js
tags:
- text-generation-inference
- distillation
- grpo
- vae
- pytorch
- agent
- education
- SLM
- small
- tiny
- smol
- distilled
- micro
- study
- testing
- blackbox
- offline
- localdb
base_model:
- openai-community/gpt2
---

<div style="
  background: #000000;
  border-left: 4px solid #00FF00;
  padding: 1.5rem;
  margin: 2rem 0;
  font-family: 'Fira Code', 'Courier New', monospace;
  color: #00FF00;
  border-radius: 0 8px 8px 0;
">
<pre style="
  font-size: 8px;
  line-height: 1.2;
  margin: 0;
  overflow-x: auto;
  color: #00FF00;
">
 ::: :::       ::::::::::: ::::::::  :::::::::   ::::::::  :::::::::
:+:+: :+:+:       :+:    :+:    :+: :+:    :+: :+:    :+: :+:    :+:
+:+ +:+:+ +:+     +:+    +:+        +:+    +:+ +:+    +:+ |:|    +:+
+#+  +:+  +#+     +#+    +#+        +#++:++#:  +#+    +:+ |#|    +:+
+#+       +#+     +#+    +#+        +#+    +#+ +#+    +#+ |#|    +#+
###       ###     ###    ###    ### ###    ### ###    ### ###    ###
###       ###  ########### ########  ###    ###  ########  #########
</pre>
</div>

# MICROD v1.0 (micro-distill-grpo-vae)

*by webXOS*

This model was made with the 'Micro Distillery' app, available at:

webxos.netlify.app/MICROD

('Micro Distillery' is available for download in the /micro_distillery/ folder.)

## Model Description

This is a distilled language model trained using Group Relative Policy Optimization (GRPO) with VAE filtering.

**MICROD v1.0 (micro-distill-grpo-vae)** is a small template model designed to be built upon for custom ground-up builds. It is distilled into a small set of files that users can take as a template for their own agents, and it is designed for educational learning and micro-scaling.

Use **MICROD v1.0 (micro-distill-grpo-vae)** in your own custom projects and train it from the ground up.

The model's architecture further underscores its educational niche: a hidden size of 512, 8 layers, 8 attention heads, a vocabulary of 50,257 tokens, and a maximum sequence length of 1024. Licensed under Apache 2.0, it is openly available for modification, and its small footprint allows quantization, making it runnable on modest hardware such as CPUs, or even in browsers via Transformers.js integration.
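
To make the quantization point concrete, here is a minimal sketch using PyTorch's dynamic quantization for CPU inference. The model id is a placeholder for wherever your copy of MICROD lives; this is an illustration, not the card's official export path.

```python
import torch
from transformers import AutoModelForCausalLM

# Placeholder id: point this at your local MICROD export or repo.
model = AutoModelForCausalLM.from_pretrained("micro-distill-grpo-vae")

# Dynamic quantization rewrites Linear layers to int8 kernels for CPU,
# shrinking the ~42M-parameter model's memory footprint further.
quantized = torch.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8
)
```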

## Model Details

- **Model type**: micro-distill-grpo-vae
- **Model size**: 42M parameters
- **Language**: English
- **License**: Apache 2.0

## Training Methodology

- **GRPO (Group Relative Policy Optimization)**: 8 groups
- **VAE Filtering**: 32D latent space
- **KV-Cache Reuse**: 512 cache size
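
As a minimal sketch of the group-relative idea (an illustration, not the Micro Distillery implementation), GRPO samples a group of completions per prompt and normalizes each completion's reward against its own group's statistics; the reward tensor below is a stand-in.

```python
import torch

group_size = 8                        # mirrors the 8 groups listed above
rewards = torch.randn(4, group_size)  # stand-in scores: [prompts, group]

# Each completion's advantage is its reward relative to its own group,
# so no separate value network (critic) is needed.
mean = rewards.mean(dim=1, keepdim=True)
std = rewards.std(dim=1, keepdim=True)
advantages = (rewards - mean) / (std + 1e-8)
```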

## Architecture Details

- Hidden size: 512
- Number of layers: 8
- Attention heads: 8
- Vocabulary size: 50257
- Maximum sequence length: 1024
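
For reference, the listed hyperparameters map onto a GPT-2-style configuration roughly as follows; this is a hypothetical sketch, and the config file shipped with the model is authoritative.

```python
from transformers import GPT2Config

# Hypothetical mapping of the listed architecture onto GPT-2 config fields.
config = GPT2Config(
    vocab_size=50257,  # vocabulary size
    n_positions=1024,  # maximum sequence length
    n_embd=512,        # hidden size
    n_layer=8,         # number of layers
    n_head=8,          # attention heads
)
```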

## Usage

- **Model Distillation Training**: Simulate GRPO optimization with VAE filtering for small LLMs (42M-345M params).
- **Policy Experimentation**: Test group sizes, KL penalties, and cache reuse for RLHF-like training.
- **VAE Filtering**: Apply latent-space compression to improve distillation quality.
- **Sandbox Testing**: Execute safe Python code with feedback masking.
- **Export & Deployment**: Generate deployable models for inference in various frameworks (see the sketch after this list).
- **Offline Usage**: The PWA supports offline training simulation and exports.
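
As a hedged sketch of the export step (the model id and output folder here are hypothetical), saving the weights, config, and tokenizer to a local directory produces files that downstream tooling can consume:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder id/path: point this at your copy of the model.
model = AutoModelForCausalLM.from_pretrained("micro-distill-grpo-vae")
tokenizer = AutoTokenizer.from_pretrained("micro-distill-grpo-vae")

# Writes config.json, model weights, and tokenizer files.
model.save_pretrained("./microd-export")
tokenizer.save_pretrained("./microd-export")
```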

## Citation

If you use this model in research, please cite:

```bibtex
@model{microd_v1_2025,
  title={MICROD_v1},
  author={webXOS},
  year={2025},
  publisher={webXOS},
  url={webxos.netlify.app}
}
```

### EXAMPLE: Using Transformers

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# "micro-distill-grpo-vae" is a placeholder id: point it at your
# local export folder or the hosted repo.
model = AutoModelForCausalLM.from_pretrained("micro-distill-grpo-vae")
tokenizer = AutoTokenizer.from_pretrained("micro-distill-grpo-vae")

inputs = tokenizer("Hello, world!", return_tensors="pt")
outputs = model.generate(**inputs, max_length=50)
print(tokenizer.decode(outputs[0]))
```

### EXAMPLE: USE CASES

**MICROD_v1** may not rival larger models in breadth, but its focus on accessibility and browser-based AI development opens doors for innovators working across the Small Language Model space.

1. Prototype without Internet
2. Offline Simulations in Black Box
3. Simple Story Generators
4. Custom Agentic Development
5. Train on Custom Data
6. Experiment with max_length
7. AI agents for custom Games
8. Educational Fine-Tuning
9. Prepare Datasets
10. Fine-tune via GRPO Trainer (see the sketch after this list)
11. Evaluate PY in Sandbox
12. Create task-specific Variants like Code Tutors
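
For item 10, here is a minimal sketch assuming the Hugging Face `trl` library's `GRPOTrainer` rather than the Micro Distillery's own trainer; the dataset, reward function, and model id are toy stand-ins.

```python
from datasets import Dataset
from trl import GRPOConfig, GRPOTrainer

# Toy prompt dataset; GRPOTrainer expects a "prompt" column.
train_dataset = Dataset.from_dict(
    {"prompt": ["Tell a short story about a robot."] * 16}
)

# Stand-in reward: prefer completions near 200 characters.
def reward_len(completions, **kwargs):
    return [-abs(200 - len(c)) for c in completions]

args = GRPOConfig(
    output_dir="microd-grpo",
    num_generations=8,  # completions per prompt, mirroring the 8 groups
)
trainer = GRPOTrainer(
    model="micro-distill-grpo-vae",  # placeholder model id/path
    reward_funcs=reward_len,
    args=args,
    train_dataset=train_dataset,
)
trainer.train()
```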

### OVERVIEW

In terms of applications, small distilled models like **MICROD_v1** align with broader trends in SLMs, which prioritize efficiency, accessibility, and specialization over the scale of large language models (LLMs). For example, they can be fine-tuned for targeted tasks such as customer support chatbots, where quick responses on edge devices are crucial, or educational tools for teaching natural language processing concepts. In healthcare, distilled models might power privacy-focused symptom checkers on mobile apps, avoiding data transmission to cloud servers. Automation and control systems benefit from their low latency, as surveyed in research on tiny language models (TLMs), which use techniques like knowledge distillation and quantization to enable on-device inference for robotics or IoT devices.