Update MagistrTheOne/RadonSAI with safetensors weights and proper YAML metadata

Files changed (1) hide show

README.md CHANGED Viewed

@@ -1,44 +1,61 @@
 # RadonSAI
 ## Overview
-RadonSAI is the main variant of the Radon model family, based on the GPT-2 Large architecture.
-## Source Model
-- **Source**: gpt2-large
-- **Model Class**: GPT2LMHeadModel
-- **Parameters**: 774M (actual size from source)
-- **Architecture**: GPT-2 Large
-## Artifacts
-- `model.safetensors` - Model weights in safetensors format (~1.5GB)
-- `tokenizer.json` - Tokenizer configuration
-- `tokenizer_config.json` - Tokenizer metadata
-- `vocab.json` - Vocabulary file
-- `merges.txt` - BPE merge rules
-- `config.json` - Model configuration (normalized)
-## How to Verify
-```bash
-# Run inference test
-python3 tests/test_inference_1b.py
 ```
-## Conversion Steps
-1. Download gpt2-large from Hugging Face
-2. Convert weights to safetensors format
-3. Save tokenizer files
-4. Normalize config JSON with correct architectures and model_type
-5. Validate with inference test
-## Notes
-- This variant uses the original parameter count of the source model (774M)
-- Target label suggests 1.2B parameters, but actual size is 774M from gpt2-large
-- To achieve the target 1.2B parameters, consider:
-  - Knowledge distillation from a larger model
-  - Continued pre-training with additional data
-  - Training from scratch with expanded architecture
-## File Sizes
-- Total folder size: ~3GB
-- Model weights: ~1.5GB
-- Tokenizer files: ~20MB

+---
+base_model: gpt2-large
+inference:
+  parameters:
+    do_sample: true
+    max_new_tokens: 256
+    temperature: 0.7
+    top_p: 0.9
+language:
+- en
+- ru
+library_name: transformers
+license: apache-2.0
+model_type: gpt2
+pipeline_tag: text-generation
+tags:
+- safetensors
+- text-generation
+- conversational
+- machine-learning
+- nlp
+- transformer
+- russian
+- english
+- gpt2
+- large
+---
 # RadonSAI
 ## Overview
+RadonSAI is a variant of the Radon model family, based on the GPT2LMHeadModel architecture.
+## Model Details
+- **Source Model**: gpt2-large
+- **Architecture**: GPT2LMHeadModel
+- **Parameters**: 772.2M
+- **Model Type**: gpt2
+## Usage
+```python
+from transformers import AutoTokenizer, AutoModelForCausalLM
+tokenizer = AutoTokenizer.from_pretrained("MagistrTheOne/RadonSAI")
+model = AutoModelForCausalLM.from_pretrained("MagistrTheOne/RadonSAI")
+prompt = "Hello, how are you?"
+inputs = tokenizer(prompt, return_tensors="pt")
+outputs = model.generate(**inputs, max_new_tokens=50)
+print(tokenizer.decode(outputs[0], skip_special_tokens=True))
 ```
+## Model Information
+- **Languages**: English, Russian
+- **License**: Apache 2.0
+- **Format**: Safetensors
+- **Library**: Transformers
+## Citation
+If you use this model, please cite the original source model and the Radon project.