ruri-v3-onnx Collection A collection of ONNX-converted versions of ruri-v3, a high-performance, lightweight Japanese-specific embedding model based on ModernBERT-Ja. • 4 items • Updated 20 days ago • 2
Japanese Novel Reward Model v2 Collection Japanese Novel Reward Model v2 • 4 items • Updated 22 days ago • 2
Ruri v3 Collection Japanese General Text Embeddings with ModernBERT-Ja • 14 items • Updated May 10 • 3
Sarashina2.2 Collection Large Language Models developed by SB Intuitions. Pretrained and instruction-tuned models are available in three sizes: 0.5B, 1B, and 3B. • 6 items • Updated Mar 5 • 6
ModernBERT Collection Bringing BERT into modernity via both architecture changes and scaling • 3 items • Updated Dec 19, 2024 • 151
qwen2.5-bakeneko Collection The bakeneko model series is based on the qwen2.5 series and has been continually pre-trained on Japanese-specific corpora. • 21 items • Updated Aug 26 • 11
Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference Paper • 2412.13663 • Published Dec 18, 2024 • 156
PLaMo-100B: A Ground-Up Language Model Designed for Japanese Proficiency Paper • 2410.07563 • Published Oct 10, 2024 • 2
Gemma 2 JPN Release Collection A Gemma 2 2B model fine-tuned on Japanese text. It supports Japanese at the same level of performance as English-only queries on Gemma 2. • 3 items • Updated Jul 10 • 28
MS MARCO Mined Triplets Collection These datasets contain MS MARCO Triplets gathered by mining hard negatives using various models. Each dataset has various subsets. • 15 items • Updated Jun 24 • 12
Qwen2.5 Collection Qwen2.5 language models, with pretrained and instruction-tuned models in 7 sizes: 0.5B, 1.5B, 3B, 7B, 14B, 32B, and 72B. • 46 items • Updated Jul 21 • 650
LLM-jp: A Cross-organizational Project for the Research and Development of Fully Open Japanese LLMs Paper • 2407.03963 • Published Jul 4, 2024 • 19
Continual Pre-Training for Cross-Lingual LLM Adaptation: Enhancing Japanese Language Capabilities Paper • 2404.17790 • Published Apr 27, 2024 • 5