Mismatch between SmolLM2-360M-intermediate-checkpoints and SmolLM2-360M performance
#9 opened 4 months ago by Tobi-r9

Need clarification on number of checkpoints
#8 opened 6 months ago by bedio

More Training Information Required (🔥 5)
#7 opened 8 months ago by jayan12k

Sentencepiece tokenizer
#6 opened 11 months ago by bh4

B/c Size Mismatch, Can't use from transformers import LlamaForCausalLM as workaround. (1 comment)
#5 opened 11 months ago by MartialTerran

Safetensors size mismatch. (5 comments)
#4 opened 11 months ago by MartialTerran

Sample Model Script for bfloat16 downloads safetensors parameters files then declares mismatch in their dimensions. (1 comment)
#3 opened 11 months ago by MartialTerran

Need Help to build a SmolLM2_360M_model.py (1 comment)
#2 opened 11 months ago by MartialTerran

Reproducing Evaluation with lighteval (4 comments)
#1 opened 11 months ago by PatrickHaller
