tokyotech-llm

university

swallow-llm

Recent Activity

Taishi-N324 authored a paper 28 days ago

MixtureVitae: Open Web-Scale Pretraining Dataset With High Quality Instruction and Reasoning Data Built from Permissive-First Text Sources

kazukifujii authored a paper about 2 months ago

Balancing Speed and Stability: The Trade-offs of FP8 vs. BF16 Training in LLMs

Taishi-N324 authored a paper 2 months ago

Optimal Sparsity of Mixture-of-Experts Language Models for Reasoning Tasks

tokyotech-llm's datasets (15)

tokyotech-llm/swallow_english_mt_bench

Viewer • Updated Aug 18 • 80 • 116

tokyotech-llm/MMLU-ProX-English

Updated Aug 18 • 40

tokyotech-llm/MMLU-Pro-English

Updated Aug 18 • 129

tokyotech-llm/MMLU-ProX-Japanese

Updated Aug 18 • 161

tokyotech-llm/JEMHopQA

Viewer • Updated Aug 8 • 3.78k • 129

tokyotech-llm/swallow-code

Viewer • Updated Jul 4 • 129M • 1.23k • 55

tokyotech-llm/lmsys-chat-1m-synth

Updated Jun 25 • 282 • 16

tokyotech-llm/M-IFEval-Ja

Viewer • Updated Jun 15 • 172 • 149

tokyotech-llm/dclm-baseline

Preview • Updated May 23 • 46

tokyotech-llm/swallow-math

Viewer • Updated May 10 • 4.33M • 579 • 35

tokyotech-llm/swallow_japanese_mt_bench

Viewer • Updated Apr 17 • 80 • 212

tokyotech-llm/swallow-magpie-ultra-v0.1

Viewer • Updated Jan 7 • 85.1k • 24 • 5

tokyotech-llm/swallow-gemma-magpie-v0.1

Viewer • Updated Jan 7 • 132k • 32 • 3

tokyotech-llm/swallow-code-v0.1

Updated Dec 27, 2024 • 64 • 1

tokyotech-llm/Swallow-Instruct-v0.1

Viewer • Updated Jul 18, 2024 • 47.5k • 69 • 10