Commit 7fc90f2 by ivandolgov · 1 parent: 6055288

initial model commit

Files changed (1): README.md (+1 -1)
@@ -13,7 +13,7 @@ base_model:
 ---
 
 # Model Description
-Mellum-4b-dpo-all is a fine-tuned version of JetBrains' first open-source large language model (LLM) optimized for code-related tasks.
+Mellum-4b-dpo-all is the third stage of our pipeline (after pretraining and SFT), trained with direct preference optimization on code-quality preferences to produce more readable, useful code.
 
 Pre-trained on over 4 trillion tokens with a context window of 8192 tokens across multiple programming languages, and then fine-tuned, Mellum-4b-dpo-all is tailored for context-aware code completion tasks.
 It was fine-tuned on a diverse set of languages, including Batchfile, C, C#, CMake, C++, CSS, Cython, Dockerfile, F#, Go, Groovy, HCL, HTML (and variants like Django, EEx, ERB, and PHP templates), Java, JSP, JavaScript, JSX, Kotlin, Less, Makefile, Objective-C++, PHP, PowerShell, Python, R, RHTML, Ruby, Rust, Sass, Scala, SCSS, Shell, SQL, Swift, TOML, TypeScript, Visual Basic, Vue, and YAML.
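
Note on the revised description: it says the model was trained with direct preference optimization (DPO) but gives no training details. As a rough illustration of the objective only, the standard per-pair DPO loss can be sketched as below. This is not the authors' training code; the function names, the argument names, and the `beta=0.1` default are all hypothetical, and the inputs are assumed to be summed sequence log-probabilities from the policy being trained and from a frozen reference model.

```python
import math


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Per-pair DPO loss (illustrative sketch, hypothetical names and beta)."""
    # How much more the policy favors each completion than the reference does.
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_margin - rejected_margin)
    # -log(sigmoid(logits)) is the same as softplus(-logits).
    return softplus(-logits)
```

The loss equals log 2 when the policy and the reference agree on both completions, and it falls below that as the policy shifts probability toward the preferred completion relative to the reference.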