---
license: apache-2.0
language:
- en
pipeline_tag: text-generation
library_name: transformers.js
tags:
- text-generation-inference
- distillation
- grpo
- vae
- pytorch
- agent
- education
- SLM
- small
- tiny
- smol
- distilled
- micro
- study
- testing
- blackbox
- offline
- localdb
base_model:
- openai-community/gpt2
---



```
     :::   :::   ::::::::::: ::::::::  :::::::::   ::::::::  ::::::::: 
    :+:+: :+:+:      :+:    :+:    :+: :+:    :+: :+:    :+: :+:    :+: 
   +:+ +:+:+ +:+     +:+    +:+        +:+    +:+ +:+    +:+ |:|    +:+  
   +#+  +:+  +#+     +#+    +#+        +#++:++#:  +#+    +:+ |#|    +:+   
   +#+       +#+     +#+    +#+        +#+    +#+ +#+    +#+ |#|    +#+    
   ###       ###     ###    ###    ### ###    ### ###    ### ###    ###
   ###       ### ########### ########  ###   ###   ########  #########
```


# MICROD v1.0 (micro-distill-grpo-vae)
This model was made with the 'Micro Distillery' app, available at:

webxos.netlify.app/MICROD

('Micro Distillery' is also available for download in the /micro_distillery/ folder.)

## Model Description
This is a distilled language model trained using Group Relative Policy Optimization (GRPO) with VAE filtering. 
**MICROD v1.0 (micro-distill-grpo-vae)** is a small template model designed to be built upon for custom, ground-up builds. It is distilled into a
small set of files that users can use as a template for their own agents. It is designed for educational use and micro scaling.
Use **MICROD v1.0 (micro-distill-grpo-vae)** in your own custom projects and train it from the ground up.

The model's architecture details further underscore its educational niche: a hidden size of 512, 8 layers, 8 attention heads, a vocabulary of 50,257 tokens,
and a maximum sequence length of 1024. Licensed under Apache 2.0, it is openly available for modification, and its small footprint allows quantization,
making it runnable on modest hardware such as CPUs, or even in browsers via Transformers.js integration.
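
As a concrete illustration of the quantization claim, the sketch below applies PyTorch dynamic int8 quantization to the loaded checkpoint. This is only a minimal sketch under stated assumptions: the repo id is a placeholder for the actual Hub path or local export folder, and how much of the network gets quantized depends on the architecture (GPT-2-style blocks keep most weights in Conv1D layers, which this call leaves in float).

```python
# Minimal sketch: dynamic int8 quantization for CPU inference (placeholder paths).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("micro-distill-grpo-vae")
tokenizer = AutoTokenizer.from_pretrained("micro-distill-grpo-vae")

# Quantize nn.Linear weights to int8; activations stay in float.
# GPT-2-style blocks store most weights in Conv1D modules, which are left untouched here.
quantized = torch.quantization.quantize_dynamic(model, {torch.nn.Linear}, dtype=torch.qint8)

inputs = tokenizer("Hello, world!", return_tensors="pt")
with torch.no_grad():
    outputs = quantized.generate(**inputs, max_length=50)
print(tokenizer.decode(outputs[0]))
```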

## Model Details
- **Model type**: micro-distill-grpo-vae
- **Model size**: 42M parameters
- **Language**: English
- **License**: Apache 2.0

## Training Methodology
- **GRPO (Group Relative Policy Optimization)**: 8 groups (see the sketch after this list)
- **VAE Filtering**: 32D latent space
- **KV-Cache Reuse**: 512 cache size
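
The card does not ship the training code, so the following is only a minimal sketch of the core GRPO idea the settings above refer to: rewards for a group of completions sampled from the same prompt are standardized within the group to form relative advantages, replacing a learned value baseline. The reward values below are placeholders, and the group size of 8 is an assumption about what "8 groups" means.

```python
# Minimal sketch of GRPO's group-relative advantage; values are placeholders.
import torch

group_size = 8  # assumed reading of the "8 groups" setting: 8 completions per prompt

# One scalar reward per sampled completion for a single prompt.
rewards = torch.tensor([0.1, 0.4, 0.3, 0.9, 0.2, 0.7, 0.5, 0.6])
assert rewards.numel() == group_size

# GRPO standardizes rewards within the group instead of using a critic:
advantages = (rewards - rewards.mean()) / (rewards.std() + 1e-8)

# Completions above the group mean get positive advantages and are reinforced;
# training also applies a KL penalty against the reference (teacher) policy.
print(advantages)
```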

## Architecture Details
- Hidden size: 512
- Number of layers: 8
- Attention heads: 8
- Vocabulary size: 50257
- Maximum sequence length: 1024
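
Because openai-community/gpt2 is listed as the base model, the hyperparameters above can be expressed as a GPT-2-style configuration. The snippet below is only an illustrative sketch and may not match the shipped config.json exactly (for example, the reported 42M parameter count depends on implementation details such as weight tying).

```python
# Illustrative sketch: the architecture numbers above as a GPT-2-style config.
from transformers import GPT2Config, GPT2LMHeadModel

config = GPT2Config(
    vocab_size=50257,  # vocabulary size
    n_positions=1024,  # maximum sequence length
    n_embd=512,        # hidden size
    n_layer=8,         # number of transformer layers
    n_head=8,          # attention heads
)

# Randomly initialized; train from scratch or load the distilled weights instead.
model = GPT2LMHeadModel(config)
```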

## Usage

- **Model Distillation Training**: Simulate GRPO optimization with VAE filtering for small LLMs (42M-345M params).
- **Policy Experimentation**: Test group sizes, KL penalties, and cache reuse for RLHF-like training.
- **VAE Filtering**: Apply latent space compression to improve distillation quality (see the sketch after this list).
- **Sandbox Testing**: Execute safe Python code with feedback masking.
- **Export & Deployment**: Generate deployable models for inference in various frameworks.
- **Offline Usage**: The PWA supports offline training simulation and exports.
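
The card does not specify exactly how VAE filtering is wired into distillation, so the sketch below only illustrates the compression it names: a small VAE with a 32-dimensional latent space over pooled hidden states, plus one plausible filtering rule (keep samples the VAE reconstructs well). All tensors and thresholds are placeholders.

```python
# Hypothetical sketch of a 32D VAE used to filter samples; not taken from the app.
import torch
import torch.nn as nn

class TinyVAE(nn.Module):
    def __init__(self, input_dim=512, latent_dim=32):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(input_dim, 128), nn.ReLU())
        self.to_mu = nn.Linear(128, latent_dim)
        self.to_logvar = nn.Linear(128, latent_dim)
        self.decoder = nn.Sequential(nn.Linear(latent_dim, 128), nn.ReLU(), nn.Linear(128, input_dim))

    def forward(self, x):
        h = self.encoder(x)
        mu, logvar = self.to_mu(h), self.to_logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization trick
        return self.decoder(z), mu, logvar

vae = TinyVAE()                       # would be trained on teacher/student hidden states
hidden_states = torch.randn(16, 512)  # placeholder: pooled hidden states for 16 samples
recon, mu, logvar = vae(hidden_states)

# One plausible filtering rule: keep the half of the batch the VAE reconstructs best.
errors = (recon - hidden_states).pow(2).mean(dim=-1)
keep = errors < errors.median()
print(f"kept {int(keep.sum())} of {len(keep)} samples")
```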

## Citation

If you use this model in research, please cite:

@misc{microd_v1_2025,
  title={MICROD_v1},
  author={webXOS},
  year={2025},
  publisher={webXOS},
  url={webxos.netlify.app}
}


### EXAMPLE: Using Transformers
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Replace with the full Hub repo id or a local path to the exported model files.
model = AutoModelForCausalLM.from_pretrained("micro-distill-grpo-vae")
tokenizer = AutoTokenizer.from_pretrained("micro-distill-grpo-vae")

inputs = tokenizer("Hello, world!", return_tensors="pt")
outputs = model.generate(**inputs, max_length=50)
print(tokenizer.decode(outputs[0]))
```

### EXAMPLE: USE CASES

**MICROD_v1** does not rival larger models in breadth, but its focus on accessibility and browser-based AI development makes it a practical
starting point for experimentation across the Small Language Model space.

1. Prototype without Internet
2. Offline Simulations in Black Box
3. Simple Story Generators
4. Custom Agentic Development
5. Train on Custom Data
6. Experiment with max_length
7. AI agents for custom Games
8. Educational Fine-Tuning
9. Prepare Datasets
10. Fine-tune via GRPO Trainer (see the sketch after this list)
11. Evaluate PY in Sandbox
12. Create task-specific Variants like Code Tutors
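
For use case 10, one possible route is TRL's `GRPOTrainer`. The card does not state which trainer the Micro Distillery app wraps, so this is only a hypothetical sketch; the dataset, reward function, and output directory are placeholders, and the model id should be replaced with the full repo path or a local export folder.

```python
# Hypothetical sketch: GRPO fine-tuning with TRL; all names below are placeholders.
from datasets import load_dataset
from trl import GRPOConfig, GRPOTrainer

# Any dataset with a "prompt" column works; this public one is just an example.
dataset = load_dataset("trl-lib/tldr", split="train")

def reward_len(completions, **kwargs):
    """Toy reward: prefer completions close to 200 characters."""
    return [-abs(200 - len(c)) for c in completions]

args = GRPOConfig(
    output_dir="microd-grpo",  # placeholder output path
    num_generations=8,         # completions sampled per prompt, matching the "8 groups" setting above
    logging_steps=10,
)

trainer = GRPOTrainer(
    model="micro-distill-grpo-vae",  # replace with the full repo id or local path
    reward_funcs=reward_len,
    args=args,
    train_dataset=dataset,
)
trainer.train()
```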

### OVERVIEW

In terms of applications, small distilled models like **MICROD_v1** align with broader trends in SLMs, which prioritize efficiency, accessibility, 
and specialization over the scale of large language models (LLMs). For example, they can be fine-tuned for targeted tasks such as customer support 
chatbots, where quick responses on edge devices are crucial, or educational tools for teaching natural language processing concepts. In healthcare, 
distilled models might power privacy-focused symptom checkers on mobile apps, avoiding data transmission to cloud servers. Automation and control 
systems benefit from their low latency, as surveyed in research on tiny language models (TLMs), which use techniques like knowledge distillation and 
quantization to enable on-device inference for robotics or IoT devices.