SurabhiS2 committed on
Commit
02e683a
·
verified ·
1 Parent(s): 13bb787

Upload folder using huggingface_hub

README.md ADDED
@@ -0,0 +1,264 @@
+ ---
+ base_model:
+ - ai4bharat/Airavata
+ language:
+ - en
+ - hi
+ license: llama2
+ tags:
+ - bnb-my-repo
+ - multilingual
+ - instruction-tuning
+ - llama2
+ datasets:
+ - ai4bharat/indic-instruct-data-v0.1
+ model-index:
+ - name: Airavata
+   results:
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: AI2 Reasoning Challenge (25-Shot)
+       type: ai2_arc
+       config: ARC-Challenge
+       split: test
+       args:
+         num_few_shot: 25
+     metrics:
+     - type: acc_norm
+       value: 46.5
+       name: normalized accuracy
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=ai4bharat/Airavata
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: HellaSwag (10-Shot)
+       type: hellaswag
+       split: validation
+       args:
+         num_few_shot: 10
+     metrics:
+     - type: acc_norm
+       value: 69.26
+       name: normalized accuracy
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=ai4bharat/Airavata
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: MMLU (5-Shot)
+       type: cais/mmlu
+       config: all
+       split: test
+       args:
+         num_few_shot: 5
+     metrics:
+     - type: acc
+       value: 43.9
+       name: accuracy
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=ai4bharat/Airavata
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: TruthfulQA (0-shot)
+       type: truthful_qa
+       config: multiple_choice
+       split: validation
+       args:
+         num_few_shot: 0
+     metrics:
+     - type: mc2
+       value: 40.62
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=ai4bharat/Airavata
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: Winogrande (5-shot)
+       type: winogrande
+       config: winogrande_xl
+       split: validation
+       args:
+         num_few_shot: 5
+     metrics:
+     - type: acc
+       value: 68.82
+       name: accuracy
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=ai4bharat/Airavata
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: GSM8k (5-shot)
+       type: gsm8k
+       config: main
+       split: test
+       args:
+         num_few_shot: 5
+     metrics:
+     - type: acc
+       value: 4.02
+       name: accuracy
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=ai4bharat/Airavata
+       name: Open LLM Leaderboard
+ ---
+ # ai4bharat/Airavata (Quantized)
+
+ ## Description
+ This model is a quantized version of the original model [`ai4bharat/Airavata`](https://huggingface.co/ai4bharat/Airavata).
+
+ It was quantized to 4-bit with the bitsandbytes library, using the [bnb-my-repo](https://huggingface.co/spaces/bnb-community/bnb-my-repo) space.
+
+ ## Quantization Details
+ - **Quantization Type**: int4
+ - **bnb_4bit_quant_type**: nf4
+ - **bnb_4bit_use_double_quant**: False
+ - **bnb_4bit_compute_dtype**: bfloat16
+ - **bnb_4bit_quant_storage**: int8
+
+
+
+ # 📄 Original Model Information
+
+
+
+ # Airavata
+
+ This model is a 7B [OpenHathi](https://huggingface.co/sarvamai/OpenHathi-7B-Hi-v0.1-Base) model fine-tuned on the [IndicInstruct dataset](https://huggingface.co/datasets/ai4bharat/indic-instruct-data-v0.1),
+ which is a collection of instruction datasets (Anudesh, wikiHow, Flan v2, Dolly, Anthropic-HHH, OpenAssistant v1, and LMSYS-Chat).
+ Please check the corresponding Hugging Face dataset card for more details.
+
+ This model was trained as part of the technical report [Airavata: Introducing Hindi Instruction-tuned LLM](https://arxiv.org/abs/2401.15006).
+ The codebase used to train and evaluate this model can be found at [https://github.com/AI4Bharat/IndicInstruct](https://github.com/AI4Bharat/IndicInstruct).
+
+ ## Usage
+
+ Clone [https://github.com/AI4Bharat/IndicInstruct](https://github.com/AI4Bharat/IndicInstruct) and install the required dependencies. Then download or clone this model to the same machine.
+
+ ## Input Format
+
+ The model is trained to use a chat format similar to that of the [open-instruct code repository](https://github.com/allenai/open-instruct) (note the newlines):
+ ```
+ <|user|>
+ Your message here!
+ <|assistant|>
+ ```
+
+ For best results, format all inputs in this manner. **Make sure to include a newline after `<|assistant|>`; this can noticeably affect generation quality.**
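
As a minimal sketch of the format above, the helper below builds a single-turn prompt and prints it with `repr` so the newlines are visible. The function name `format_single_turn` is hypothetical and is not part of the IndicInstruct or open-instruct codebases.

```python
def format_single_turn(user_message: str) -> str:
    """Build a single-turn prompt in the chat format described above.

    Hypothetical helper for illustration only.
    """
    # Each role tag sits on its own line, and the prompt ends with a newline
    # directly after <|assistant|> so the reply starts on a fresh line.
    return "<|user|>\n" + user_message + "\n<|assistant|>\n"


prompt = format_single_turn("Your message here!")
print(repr(prompt))  # '<|user|>\nYour message here!\n<|assistant|>\n'
```

Keeping the trailing newline inside the helper means callers cannot accidentally drop it.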
+
+ ## Hyperparameters
+
+ We fine-tune the OpenHathi base model on the aforementioned IndicInstruct dataset with LoRA. The hyperparameters for the LoRA fine-tuning are listed below:
+ - LoRA Rank: 16
+ - LoRA alpha: 32
+ - LoRA Dropout: 0.05
+ - LoRA Target Modules: ["q_proj", "v_proj", "k_proj", "down_proj", "gate_proj", "up_proj"]
+ - Epochs: 4
+ - Learning rate: 5e-4
+ - Batch Size: 128
+ - Floating Point Precision: bfloat16
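
To give a rough sense of scale for these settings, the sketch below counts the trainable LoRA parameters and the effective scaling factor. It assumes the standard LoRA parameterization (two low-rank matrices of shapes `r × in` and `out × r` per target module, scaled by `alpha / r`) and takes the layer dimensions from the `config.json` added in this commit (`hidden_size` 4096, `intermediate_size` 11008, 32 layers). The names `lora_params` and `TARGET_SHAPES` are illustrative only.

```python
# Layer dimensions, assumed from the config.json in this commit.
HIDDEN_SIZE = 4096
INTERMEDIATE_SIZE = 11008
NUM_LAYERS = 32

# LoRA hyperparameters from the list above.
RANK = 16
ALPHA = 32

# (in_features, out_features) for each targeted linear layer in one block.
TARGET_SHAPES = {
    "q_proj": (HIDDEN_SIZE, HIDDEN_SIZE),
    "k_proj": (HIDDEN_SIZE, HIDDEN_SIZE),
    "v_proj": (HIDDEN_SIZE, HIDDEN_SIZE),
    "gate_proj": (HIDDEN_SIZE, INTERMEDIATE_SIZE),
    "up_proj": (HIDDEN_SIZE, INTERMEDIATE_SIZE),
    "down_proj": (INTERMEDIATE_SIZE, HIDDEN_SIZE),
}


def lora_params(in_features: int, out_features: int, rank: int) -> int:
    # A has shape (rank, in_features); B has shape (out_features, rank).
    return rank * (in_features + out_features)


per_layer = sum(lora_params(i, o, RANK) for i, o in TARGET_SHAPES.values())
total = per_layer * NUM_LAYERS
scaling = ALPHA / RANK

print(f"trainable LoRA parameters: {total:,}")  # 35,782,656
print(f"LoRA scaling (alpha / r): {scaling}")   # 2.0
```

Under these assumptions the adapters add roughly 36M trainable parameters, on the order of half a percent of a 7B model. Note that `o_proj` is not among the listed target modules, so it is left out of the count.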
+
+ We recommend that readers check out [our official blog post](https://ai4bharat.github.io/airavata) for more details on model training, ablations, and evaluation results.
+
+ ## Example
+
+ ```python3
+ import torch
+ from transformers import AutoTokenizer, AutoModelForCausalLM
+
+ device = "cuda" if torch.cuda.is_available() else "cpu"
+
+
+ def create_prompt_with_chat_format(messages, bos="<s>", eos="</s>", add_bos=True):
+     # Render a list of {"role", "content"} messages in the chat format described above.
+     formatted_text = ""
+     for message in messages:
+         if message["role"] == "system":
+             formatted_text += "<|system|>\n" + message["content"] + "\n"
+         elif message["role"] == "user":
+             formatted_text += "<|user|>\n" + message["content"] + "\n"
+         elif message["role"] == "assistant":
+             formatted_text += "<|assistant|>\n" + message["content"].strip() + eos + "\n"
+         else:
+             raise ValueError(
+                 "Tulu chat template only supports 'system', 'user' and 'assistant' roles. Invalid role: {}.".format(
+                     message["role"]
+                 )
+             )
+     # End with the assistant tag and a newline so the model starts its reply.
+     formatted_text += "<|assistant|>\n"
+     formatted_text = bos + formatted_text if add_bos else formatted_text
+     return formatted_text
+
+
+ def inference(input_prompts, model, tokenizer):
+     # add_bos=False because the tokenizer already prepends the BOS token.
+     input_prompts = [
+         create_prompt_with_chat_format([{"role": "user", "content": input_prompt}], add_bos=False)
+         for input_prompt in input_prompts
+     ]
+
+     encodings = tokenizer(input_prompts, padding=True, return_tensors="pt")
+     encodings = encodings.to(device)
+
+     with torch.inference_mode():
+         # Pass the attention mask so that padding tokens in the batch are ignored.
+         outputs = model.generate(
+             encodings.input_ids,
+             attention_mask=encodings.attention_mask,
+             do_sample=False,
+             max_new_tokens=250,
+         )
+
+     output_texts = tokenizer.batch_decode(outputs.detach(), skip_special_tokens=True)
+
+     # Decode the prompts the same way, then strip them from the front of each output.
+     input_prompts = [
+         tokenizer.decode(tokenizer.encode(input_prompt), skip_special_tokens=True) for input_prompt in input_prompts
+     ]
+     output_texts = [output_text[len(input_prompt) :] for input_prompt, output_text in zip(input_prompts, output_texts)]
+     return output_texts
+
+
+ model_name = "ai4bharat/Airavata"
+
+ tokenizer = AutoTokenizer.from_pretrained(model_name, padding_side="left")
+ tokenizer.pad_token = tokenizer.eos_token
+ model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16).to(device)
+
+ input_prompts = [
+     "मैं अपने समय प्रबंधन कौशल को कैसे सुधार सकता हूँ? मुझे पांच बिंदु बताएं।",
+     "मैं अपने समय प्रबंधन कौशल को कैसे सुधार सकता हूँ? मुझे पांच बिंदु बताएं और उनका वर्णन करें।",
+ ]
+ outputs = inference(input_prompts, model, tokenizer)
+ print(outputs)
+ ```
+
+ ## Citation
+
+ ```bibtex
+ @article{gala2024airavata,
+   title   = {Airavata: Introducing Hindi Instruction-tuned LLM},
+   author  = {Jay Gala and Thanmay Jayakumar and Jaavid Aktar Husain and Aswanth Kumar M and Mohammed Safi Ur Rahman Khan and Diptesh Kanojia and Ratish Puduppully and Mitesh M. Khapra and Raj Dabre and Rudra Murthy and Anoop Kunchukuttan},
+   year    = {2024},
+   journal = {arXiv preprint arXiv:2401.15006}
+ }
+ ```
+
+ # [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
+ Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_ai4bharat__Airavata).
+
+ | Metric                          |Value|
+ |---------------------------------|----:|
+ |Avg.                             |45.52|
+ |AI2 Reasoning Challenge (25-Shot)|46.50|
+ |HellaSwag (10-Shot)              |69.26|
+ |MMLU (5-Shot)                    |43.90|
+ |TruthfulQA (0-shot)              |40.62|
+ |Winogrande (5-shot)              |68.82|
+ |GSM8k (5-shot)                   | 4.02|
+
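
The `Avg.` row appears to be the unweighted mean of the six benchmark scores. The short check below, which uses only the values from the table, reproduces it; the variable names are illustrative.

```python
# Benchmark scores copied from the leaderboard table above.
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 46.50,
    "HellaSwag (10-Shot)": 69.26,
    "MMLU (5-Shot)": 43.90,
    "TruthfulQA (0-shot)": 40.62,
    "Winogrande (5-shot)": 68.82,
    "GSM8k (5-shot)": 4.02,
}

# Unweighted mean across benchmarks, rounded to two decimals like the table.
average = round(sum(scores.values()) / len(scores), 2)
print(average)  # 45.52
```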
config.json ADDED
@@ -0,0 +1,44 @@
+ {
+   "architectures": [
+     "LlamaModel"
+   ],
+   "attention_bias": false,
+   "attention_dropout": 0.0,
+   "bos_token_id": 1,
+   "eos_token_id": 2,
+   "head_dim": 128,
+   "hidden_act": "silu",
+   "hidden_size": 4096,
+   "initializer_range": 0.02,
+   "intermediate_size": 11008,
+   "max_position_embeddings": 4096,
+   "mlp_bias": false,
+   "model_type": "llama",
+   "num_attention_heads": 32,
+   "num_hidden_layers": 32,
+   "num_key_value_heads": 32,
+   "pretraining_tp": 1,
+   "quantization_config": {
+     "_load_in_4bit": true,
+     "_load_in_8bit": false,
+     "bnb_4bit_compute_dtype": "bfloat16",
+     "bnb_4bit_quant_storage": "int8",
+     "bnb_4bit_quant_type": "nf4",
+     "bnb_4bit_use_double_quant": false,
+     "llm_int8_enable_fp32_cpu_offload": false,
+     "llm_int8_has_fp16_weight": false,
+     "llm_int8_skip_modules": null,
+     "llm_int8_threshold": 6.0,
+     "load_in_4bit": true,
+     "load_in_8bit": false,
+     "quant_method": "bitsandbytes"
+   },
+   "rms_norm_eps": 1e-05,
+   "rope_scaling": null,
+   "rope_theta": 10000.0,
+   "tie_word_embeddings": false,
+   "torch_dtype": "bfloat16",
+   "transformers_version": "4.53.1",
+   "use_cache": true,
+   "vocab_size": 48065
+ }
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:509630a6263ae057fe44478fa66abd5ff4ce9969f863ff27598b8c70ee592c6c
+ size 4037176904
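
As a rough sanity check on the pointer's `size` field, the sketch below estimates the checkpoint size from the values in `config.json`. It rests on several assumptions that this commit does not state: that bitsandbytes NF4 packs two weights per byte and stores one float32 scale per block of 64 weights when double quantization is off, that only the linear layers inside the transformer blocks are quantized, that embeddings and RMSNorm weights stay in bfloat16, and that no `lm_head` is stored because `architectures` lists `LlamaModel`. All names in the snippet are illustrative.

```python
# Values taken from config.json in this commit.
HIDDEN, INTERMEDIATE, LAYERS, VOCAB = 4096, 11008, 32, 48065
REPORTED_SIZE = 4037176904  # bytes, from the Git LFS pointer

# Assumptions (not stated in the commit) about the 4-bit storage layout.
BLOCK_SIZE = 64    # number of weights sharing one absmax scale
ABSMAX_BYTES = 4   # float32 scale per block (no double quantization)
BF16_BYTES = 2

# Linear weights per transformer block: q/k/v/o projections plus gate/up/down MLP.
attn_params = 4 * HIDDEN * HIDDEN
mlp_params = 3 * HIDDEN * INTERMEDIATE
quantized_params = LAYERS * (attn_params + mlp_params)

packed_bytes = quantized_params // 2  # two 4-bit values per byte
scale_bytes = quantized_params // BLOCK_SIZE * ABSMAX_BYTES

# Tensors assumed to stay in bfloat16: token embeddings and RMSNorm weights
# (two norms per block plus the final norm).
embedding_bytes = VOCAB * HIDDEN * BF16_BYTES
norm_bytes = (2 * LAYERS + 1) * HIDDEN * BF16_BYTES

estimate = packed_bytes + scale_bytes + embedding_bytes + norm_bytes
relative_gap = abs(REPORTED_SIZE - estimate) / REPORTED_SIZE

print(f"estimated: {estimate:,} bytes")  # 4,037,033,984
print(f"reported:  {REPORTED_SIZE:,} bytes")
print(f"relative gap: {relative_gap:.2e}")
```

Under these assumptions the estimate lands well within 0.01% of the reported size. The small remainder would be consistent with the safetensors header and per-tensor quantization metadata, though that is a guess rather than something this commit confirms.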
special_tokens_map.json ADDED
@@ -0,0 +1,30 @@
+ {
+   "bos_token": {
+     "content": "<s>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "eos_token": {
+     "content": "</s>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "pad_token": {
+     "content": "<pad>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "unk_token": {
+     "content": "<unk>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   }
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,59 @@
+ {
+   "add_bos_token": true,
+   "add_eos_token": false,
+   "add_prefix_space": null,
+   "added_tokens_decoder": {
+     "0": {
+       "content": "<unk>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "1": {
+       "content": "<s>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "2": {
+       "content": "</s>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "32000": {
+       "content": "[PAD]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "48064": {
+       "content": "<pad>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     }
+   },
+   "bos_token": "<s>",
+   "clean_up_tokenization_spaces": false,
+   "eos_token": "</s>",
+   "extra_special_tokens": {},
+   "legacy": false,
+   "model_max_length": 1000000000000000019884624838656,
+   "pad_token": "<pad>",
+   "sp_model_kwargs": {},
+   "spaces_between_special_tokens": false,
+   "tokenizer_class": "LlamaTokenizer",
+   "unk_token": "<unk>",
+   "use_default_system_prompt": false
+ }