sorryhyun committed · verified · 1 Parent(s): dccd437
Commit e3208b6

Update README.md

Files changed (1): README.md +37 -30

README.md CHANGED
---
language:
- ko
- en
base_model:
- naver-hyperclovax/HyperCLOVAX-SEED-Text-Instruct-1.5B
---

Converted to GGUF using llama.cpp.
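
The exact conversion command isn't recorded here, so the following is a minimal sketch under assumptions: llama.cpp cloned locally into `llama.cpp/`, the original Hugging Face checkpoint downloaded into a directory named after the model, and llama.cpp's `convert_hf_to_gguf.py` script with its `--outtype`/`--outfile` flags producing the bf16 file used below.

```python
# Minimal conversion sketch (paths are assumptions; convert_hf_to_gguf.py ships with llama.cpp).
import subprocess

subprocess.run(
    [
        "python", "llama.cpp/convert_hf_to_gguf.py",
        "HyperCLOVAX-SEED-Text-Instruct-1.5B",  # local HF model directory (assumed)
        "--outtype", "bf16",
        "--outfile", "HyperCLOVAX-SEED-Text-Instruct-1.5B-gguf-bf16.gguf",
    ],
    check=True,  # raise if the converter exits non-zero
)
```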

```python
from llama_cpp import Llama

llm = Llama(
    model_path="HyperCLOVAX-SEED-Text-Instruct-1.5B-gguf-bf16.gguf",
    n_gpu_layers=-1,  # offload all layers to the GPU
    main_gpu=0,
    n_ctx=2048,
)

output = llm(
    "Write me a fun story. It has to be at least 1000 characters long. Begin:",  # prompt (translated from Korean)
    max_tokens=2048,
    echo=True,  # include the prompt in the returned text
)
print(output)
```
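
`print(output)` dumps the whole OpenAI-style completion dict. To extract just the generated text, or to go through the model's chat template instead of a raw prompt, something like the following works with llama-cpp-python (the message content here is illustrative, not from the original):

```python
# Just the generated text from the completion dict above.
print(output["choices"][0]["text"])

# Chat-style usage: create_chat_completion applies the model's chat template.
response = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Write me a fun story."}],  # illustrative prompt
    max_tokens=512,
)
print(response["choices"][0]["message"]["content"])
```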

Tested on a GeForce RTX 3070; performance was as follows.

```text
bf16, peak: 4GB
llama_perf_context_print:        load time =     210.50 ms
llama_perf_context_print: prompt eval time =     210.42 ms /    19 tokens (   11.07 ms per token,    90.30 tokens per second)
llama_perf_context_print:        eval time =   17923.17 ms /  2028 runs   (    8.84 ms per token,   113.15 tokens per second)
llama_perf_context_print:       total time =   21307.79 ms /  2047 tokens
```