Question about base model
#1
by hikkyy
Which base model was used for this? Llama 3.1 (given the 128k maximum context size in the metadata), or the new "leaked" Llama 3.3 8B?
The new one:
v1f = Llama 3.3 8B, 8k context
v1g = Llama 3.3 8B, 128k context (fixed config)
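
A minimal sketch of how to verify the advertised context length from the model's metadata, assuming the repo follows the standard Hugging Face config layout (the repo ID below is a placeholder, not the actual model):

```python
from transformers import AutoConfig

# Placeholder repo ID; substitute the actual v1f / v1g repository.
config = AutoConfig.from_pretrained("your-org/your-model-v1g")

# Llama-family configs expose the maximum context length here.
# Expected roughly 131072 (128k) for v1g and 8192 (8k) for v1f,
# per the answer above; these numbers are an assumption.
print(config.max_position_embeddings)
```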