Update README.md

README.md CHANGED

@@ -16,13 +16,15 @@ pipeline_tag: text-generation
 base_model:
 - allenai/OLMo-2-0425-1B
 ---
-Note: this is not a chat model, the chat model is coming soon but this is the base model for further fine-tuning.
+Note: this is not a chat model; the chat model is coming soon, and this is the base model for further fine-tuning. Stay tuned for the chat model release! This page will be updated once that model is out. (The chat model will live under a different repo.)
 
 
 # print("Before we start")
 We are not related to Roblox in any way; any mention of Roblox is purely to help people understand what the model is about.
 As per the [Roblox website](https://create.roblox.com/docs/assistant/guide), they use Meta's Llama 3 (we assume the 70B variant) for their AI assistant. This model, while powerful, cannot come close to the performance of a 70B model.
 
+But unlike Llama 3, this model, luau-coder-v2-3b-32k ("luaucoder" for short), is released under the open Apache 2.0 license.
+
 # print("Stages of pre-training")
 
 This model was continually pre-trained in 3 stages. (Note: allenai states that OLMo 2 1B, the model this one is based on, was pre-trained on roughly 4 trillion tokens.)
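The README positions this checkpoint as a base model for further fine-tuning. As an illustration (not part of the original README), a minimal sketch of slicing Luau source into fixed-size, overlapping training windows for continued pre-training; it uses whitespace token counting as a stand-in for the checkpoint's real tokenizer, and the helper name is hypothetical:

```python
def chunk_for_training(source: str, window: int = 32, stride: int = 16) -> list[str]:
    """Split source text into overlapping fixed-size token windows.

    Whitespace splitting stands in for the model's actual tokenizer here;
    a real fine-tuning pipeline would tokenize with the checkpoint's own
    tokenizer and pack windows up to the 32k context length.
    """
    tokens = source.split()
    chunks = []
    # Slide a fixed window over the token stream; overlap = window - stride.
    for start in range(0, max(len(tokens) - window, 0) + 1, stride):
        chunks.append(" ".join(tokens[start:start + window]))
    return chunks


# Synthetic stand-in for a Luau source file: 64 placeholder tokens.
luau_snippet = " ".join(f"tok{i}" for i in range(64))
windows = chunk_for_training(luau_snippet, window=32, stride=16)
print(len(windows))  # 3 overlapping 32-token windows from 64 tokens
```

With a stride smaller than the window, consecutive samples share context, which is a common choice when preparing code corpora for causal-LM training.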