# WangchanThaiInstruct: An Instruction-Following Dataset for Culture-Aware, Multitask, and Multi-domain Evaluation in Thai (EMNLP 2025)
This repository contains the model artifacts for `gemma-2-9b-dolly-th-15k-wangchan-instruct-35k` from the paper *WangchanThaiInstruct: An Instruction-Following Dataset for Culture-Aware, Multitask, and Multi-domain Evaluation in Thai* (EMNLP 2025).
The model is google/gemma-2-9b finetuned on a machine-translated Dolly 15K and the WangchanThaiInstruct training set using the LLaMA Factory framework, with the following hyperparameters:
| Hyperparameter | Value | 
|---|---|
| Learning Rate | 2 × 10⁻⁴ | 
| Learning Rate Schedule | Cosine | 
| Batch Size (effective) | 128 | 
| Max Token Length | 2048 | 
| Warmup Ratio | 0.1 | 
| Epochs | 3 | 
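
For reference, here is a minimal sketch of how the table above maps onto Hugging Face `TrainingArguments`. The released checkpoint was trained with LLaMA Factory, so this is an illustrative equivalent rather than the actual training script; the per-device/accumulation split and the use of bf16 are assumptions.

```python
# Illustrative only: the checkpoint was trained with LLaMA Factory, not this
# script. This sketch just mirrors the hyperparameter table above.
from transformers import TrainingArguments

# Effective batch size 128 = per-device batch 8 x 16 accumulation steps
# (the 8/16 split is an assumption; only the product, 128, is reported).
args = TrainingArguments(
    output_dir="gemma-2-9b-dolly-th-15k-wangchan-instruct-35k",
    learning_rate=2e-4,               # Learning Rate
    lr_scheduler_type="cosine",       # Learning Rate Schedule
    per_device_train_batch_size=8,
    gradient_accumulation_steps=16,   # 8 * 16 = 128 effective batch size
    warmup_ratio=0.1,                 # Warmup Ratio
    num_train_epochs=3,               # Epochs
    bf16=True,                        # assumption: typical precision for Gemma-2
)
# The max token length (2048) is applied at tokenization time, e.g. via
# tokenizer(..., truncation=True, max_length=2048).
```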
The model was evaluated on Thai MT-Bench, SeaCrowd's Thai NLU and NLG splits, and the WangchanThaiInstruct test set:
| Model | MT Bench Average | NLU Accuracy (%) | NLG Translation (BLEU) | NLG Generation (RougeL) | WangchanThaiInstruct Fluency | WangchanThaiInstruct Accuracy (%) | WangchanThaiInstruct Rating | 
|---|---|---|---|---|---|---|---|
| **Llama-3.1-8B** | | | | | | | | 
| Alpaca 5k + WangchanThaiInstruct 5k | 3.00 | 47.22 | 3.12 | 8.59 | 4.08 | 39.84 | 4.16 | 
| Alpaca 10k | 3.05 | 46.54 | 4.08 | 11.05 | 3.36 | 28.39 | 3.23 | 
| Alpaca 10k + WangchanThaiInstruct 10k | 3.07 | 46.47 | 2.43 | 8.54 | 4.21 | 42.31 | 4.39 | 
| Alpaca 20k | 2.75 | 47.31 | 2.79 | 9.14 | 2.77 | 22.32 | 2.94 | 
| Alpaca 15k + WangchanThaiInstruct 15k | 3.26 | 46.45 | 3.47 | 8.58 | 4.35 | 42.16 | 4.46 | 
| Alpaca 30k | 2.88 | 47.67 | 3.65 | 9.65 | 2.83 | 21.83 | 2.95 | 
| Dolly 2.5k + WangchanThaiInstruct 2.5k | 2.40 | 46.43 | 3.75 | 8.72 | 3.57 | 35.93 | 3.72 | 
| Dolly 5k | 1.88 | 42.87 | 0.95 | 8.55 | 1.75 | 22.70 | 2.19 | 
| Dolly 5k + WangchanThaiInstruct 5k | 2.28 | 46.43 | 1.36 | 8.55 | 3.85 | 37.89 | 3.98 | 
| Dolly 10k | 1.99 | 42.41 | 1.35 | 8.64 | 1.69 | 22.35 | 2.14 | 
| Dolly 7.5k + WangchanThaiInstruct 7.5k | 2.31 | 46.37 | 1.48 | 8.59 | 3.96 | 39.63 | 4.11 | 
| Dolly 15k | 2.64 | 42.47 | 1.60 | 8.10 | 1.69 | 22.21 | 2.16 | 
| **Gemma-2-9B** | | | | | | | | 
| Alpaca 5k + WangchanThaiInstruct 5k | 4.25 | 53.70 | 2.25 | 8.14 | 4.85 | 54.24 | 5.17 | 
| Alpaca 10k | 3.98 | 51.71 | 1.39 | 6.84 | 4.00 | 46.26 | 4.26 | 
| Alpaca 10k + WangchanThaiInstruct 10k | 4.02 | 53.81 | 2.02 | 8.09 | 4.97 | 55.33 | 5.30 | 
| Alpaca 20k | 4.14 | 52.40 | 1.45 | 6.95 | 3.53 | 38.07 | 3.90 | 
| Alpaca 15k + WangchanThaiInstruct 15k | 4.20 | 53.49 | 1.98 | 8.02 | 5.14 | 56.67 | 5.49 | 
| Alpaca 30k | 3.79 | 52.41 | 1.25 | 5.73 | 3.25 | 32.71 | 3.43 | 
| Dolly 2.5k + WangchanThaiInstruct 2.5k | 3.66 | 54.62 | 1.75 | 8.07 | 4.30 | 51.86 | 4.84 | 
| Dolly 5k | 2.59 | 53.36 | 1.39 | 7.58 | 1.71 | 42.35 | 2.45 | 
| Dolly 5k + WangchanThaiInstruct 5k | 3.99 | 53.50 | 1.54 | 8.12 | 4.59 | 54.31 | 5.08 | 
| Dolly 10k | 2.70 | 51.98 | 1.52 | 7.58 | 1.81 | 43.68 | 2.74 | 
| Dolly 7.5k + WangchanThaiInstruct 7.5k | 4.13 | 53.34 | 1.63 | 8.12 | 4.72 | 55.09 | 5.24 | 
| Dolly 15k | 4.10 | 51.35 | 1.48 | 7.76 | 3.24 | 40.34 | 2.63 | 
| **SEA-LIONv2-8B** | | | | | | | | 
| Alpaca 5k + WangchanThaiInstruct 5k | 4.52 | 43.76 | 34.47 | 19.39 | 5.62 | 52.84 | 5.57 | 
| Alpaca 10k | 4.54 | 43.31 | 28.01 | 25.35 | 4.61 | 48.88 | 4.73 | 
| Alpaca 10k + WangchanThaiInstruct 10k | 4.55 | 44.66 | 24.00 | 17.55 | 5.72 | 53.93 | 5.70 | 
| Alpaca 20k | 4.74 | 43.98 | 24.22 | 25.82 | 4.73 | 49.32 | 4.53 | 
| Alpaca 15k + WangchanThaiInstruct 15k | 4.44 | 44.51 | 20.58 | 16.31 | 5.54 | 53.94 | 5.61 | 
| Alpaca 30k | 4.60 | 42.96 | 15.58 | 25.68 | 5.11 | 49.66 | 4.78 | 
| Dolly 2.5k + WangchanThaiInstruct 2.5k | 4.25 | 44.89 | 36.60 | 26.82 | 5.10 | 50.25 | 5.28 | 
| Dolly 5k | 3.69 | 45.88 | 19.22 | 35.66 | 3.46 | 48.04 | 4.11 | 
| Dolly 5k + WangchanThaiInstruct 5k | 4.21 | 44.30 | 15.64 | 23.72 | 5.31 | 51.25 | 5.42 | 
| Dolly 10k | 3.83 | 46.57 | 14.07 | 37.35 | 4.09 | 46.81 | 4.04 | 
| Dolly 7.5k + WangchanThaiInstruct 7.5k | 4.31 | 45.31 | 13.54 | 22.00 | 5.54 | 53.81 | 5.57 | 
| Dolly 15k | 3.57 | 46.14 | 14.31 | 35.37 | 3.24 | 48.13 | 4.15 | 
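
To try the released checkpoint with 🤗 Transformers, a minimal inference sketch follows. The Hub repo id is an assumption inferred from the artifact name in this card; substitute the actual path if it differs.

```python
# Minimal inference sketch. The repo id below is an assumption based on the
# artifact name in this card, not a confirmed Hub path.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "airesearch/gemma-2-9b-dolly-th-15k-wangchan-instruct-35k"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

prompt = "อธิบายประเพณีสงกรานต์ให้ฟังหน่อย"  # Thai: "Explain the Songkran festival"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```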
If you use this model or dataset, please cite:

```bibtex
@inproceedings{limkonchotiwat2025thaiinstruct,
  title     = {WangchanThaiInstruct: An Instruction-Following Dataset for Culture-Aware, Multitask, and Multi-domain Evaluation in Thai},
  author    = {Limkonchotiwat, Peerat and Tuchinda, Pume and Lowphansirikul, Lalita and Nonesung, Surapon and Tasawong, Panuthep and Aji, Alham Fikri and Udomcharoenchaikit, Can and Nutanong, Sarana},
  booktitle = {Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing},
  year      = {2025},
  publisher = {Association for Computational Linguistics}
}
```
Base model: google/gemma-2-9b