Commit 7fc90f2 · 1 Parent(s): 6055288
initial model commit
README.md CHANGED
@@ -13,7 +13,7 @@ base_model:
 ---
 
 # Model Description
-Mellum-4b-dpo-all is
+Mellum-4b-dpo-all is the third stage of our pipeline (after pretraining and SFT), trained with direct preference optimization on code-quality preferences to produce more readable, useful code.
 
 Pre-trained on over 4 trillion tokens with a context window of 8192 tokens across multiple programming languages, and then fine-tuned, Mellum-4b-dpo-all is tailored for context-aware code completion tasks.
 It was fine-tuned on a diverse set of languages, including Batchfile, C, C#, CMake, C++, CSS, Cython, Dockerfile, F#, Go, Groovy, HCL, HTML (and variants like Django, EEx, ERB, and PHP templates), Java, JSP, JavaScript, JSX, Kotlin, Less, Makefile, Objective-C++, PHP, PowerShell, Python, R, RHTML, Ruby, Rust, Sass, Scala, SCSS, Shell, SQL, Swift, TOML, TypeScript, Visual Basic, Vue, and YAML.