MagistrTheOne commited on
Commit
9523228
·
verified ·
1 Parent(s): a7a26a9

Update MagistrTheOne/RadonSAI with safetensors weights and proper YAML metadata

Browse files
Files changed (1) hide show
  1. README.md +56 -39
README.md CHANGED
@@ -1,44 +1,61 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
  # RadonSAI
2
 
3
  ## Overview
4
- RadonSAI is the main variant of the Radon model family, based on the GPT-2 Large architecture.
5
-
6
- ## Source Model
7
- - **Source**: gpt2-large
8
- - **Model Class**: GPT2LMHeadModel
9
- - **Parameters**: 774M (actual size from source)
10
- - **Architecture**: GPT-2 Large
11
-
12
- ## Artifacts
13
- - `model.safetensors` - Model weights in safetensors format (~1.5GB)
14
- - `tokenizer.json` - Tokenizer configuration
15
- - `tokenizer_config.json` - Tokenizer metadata
16
- - `vocab.json` - Vocabulary file
17
- - `merges.txt` - BPE merge rules
18
- - `config.json` - Model configuration (normalized)
19
-
20
- ## How to Verify
21
- ```bash
22
- # Run inference test
23
- python3 tests/test_inference_1b.py
24
  ```
25
 
26
- ## Conversion Steps
27
- 1. Download gpt2-large from Hugging Face
28
- 2. Convert weights to safetensors format
29
- 3. Save tokenizer files
30
- 4. Normalize config JSON with correct architectures and model_type
31
- 5. Validate with inference test
32
-
33
- ## Notes
34
- - This variant uses the original parameter count of the source model (774M)
35
- - Target label suggests 1.2B parameters, but actual size is 774M from gpt2-large
36
- - To achieve the target 1.2B parameters, consider:
37
- - Knowledge distillation from a larger model
38
- - Continued pre-training with additional data
39
- - Training from scratch with expanded architecture
40
-
41
- ## File Sizes
42
- - Total folder size: ~3GB
43
- - Model weights: ~1.5GB
44
- - Tokenizer files: ~20MB
 
1
+ ---
2
+ base_model: gpt2-large
3
+ inference:
4
+ parameters:
5
+ do_sample: true
6
+ max_new_tokens: 256
7
+ temperature: 0.7
8
+ top_p: 0.9
9
+ language:
10
+ - en
11
+ - ru
12
+ library_name: transformers
13
+ license: apache-2.0
14
+ model_type: gpt2
15
+ pipeline_tag: text-generation
16
+ tags:
17
+ - safetensors
18
+ - text-generation
19
+ - conversational
20
+ - machine-learning
21
+ - nlp
22
+ - transformer
23
+ - russian
24
+ - english
25
+ - gpt2
26
+ - large
27
+ ---
28
+
29
  # RadonSAI
30
 
31
  ## Overview
32
+ RadonSAI is a variant of the Radon model family, based on the GPT2LMHeadModel architecture.
33
+
34
+ ## Model Details
35
+ - **Source Model**: gpt2-large
36
+ - **Architecture**: GPT2LMHeadModel
37
+ - **Parameters**: 772.2M
38
+ - **Model Type**: gpt2
39
+
40
+ ## Usage
41
+
42
+ ```python
43
+ from transformers import AutoTokenizer, AutoModelForCausalLM
44
+
45
+ tokenizer = AutoTokenizer.from_pretrained("MagistrTheOne/RadonSAI")
46
+ model = AutoModelForCausalLM.from_pretrained("MagistrTheOne/RadonSAI")
47
+
48
+ prompt = "Hello, how are you?"
49
+ inputs = tokenizer(prompt, return_tensors="pt")
50
+ outputs = model.generate(**inputs, max_new_tokens=50)
51
+ print(tokenizer.decode(outputs[0], skip_special_tokens=True))
52
  ```
53
 
54
+ ## Model Information
55
+ - **Languages**: English, Russian
56
+ - **License**: Apache 2.0
57
+ - **Format**: Safetensors
58
+ - **Library**: Transformers
59
+
60
+ ## Citation
61
+ If you use this model, please cite the original source model and the Radon project.