Update README.md for vLLM repository
README.md CHANGED
@@ -18,6 +18,12 @@ The model is primarily designed with a focus on lightweight architecture, optimi
Particularly, the model shows relative strengths in handling Korean-language inputs and outperforms similarly sized open-source models in related benchmarks. As the first open-source vision-language model in Korea capable of visual understanding, it is expected to significantly contribute to strengthening Korea's sovereign AI capabilities.


+## **Updates**
+- **(2025.07.25)**: The vLLM engine is available via [our repository](https://github.com/NAVER-Cloud-HyperCLOVA-X/vllm/tree/v0.9.2rc2_hyperclovax_vision_seed).
+- **(2025.07.08)**: Major code update to support the vLLM engine ([related discussion](https://huggingface.co/naver-hyperclovax/HyperCLOVAX-SEED-Vision-Instruct-3B/discussions/27)).
+- **(2025.04.22)**: Initial release of the repository.
+
+
## **Basic Information**

- **Model Architecture**: LLaVA-based Vision-Language Model

@@ -279,3 +285,8 @@ print("=" * 80)
```

- To ensure the highest level of image understanding performance, it is recommended to include additional information such as Optical Character Recognition (OCR) and entity recognition (Lens) results. The provided usage examples are written under the assumption that OCR and Lens results are available; if you supply input in this format, you can expect significantly improved output quality.
+
+## vLLM
+To speed up inference, you can use the vLLM engine from [our repository](https://github.com/NAVER-Cloud-HyperCLOVA-X/vllm/tree/v0.9.2rc2_hyperclovax_vision_seed).
+Make sure to switch to the `v0.9.2rc2_hyperclovax_vision_seed` branch.
+For more details, check out the README in [our repository](https://github.com/NAVER-Cloud-HyperCLOVA-X/vllm/tree/v0.9.2rc2_hyperclovax_vision_seed).
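For readers who want to try the new vLLM path end to end, below is a minimal sketch of offline inference with vLLM's Python API. It is not taken from the repositories above: it assumes the forked branch `v0.9.2rc2_hyperclovax_vision_seed` is installed in place of stock vLLM, that the checkpoint is `naver-hyperclovax/HyperCLOVAX-SEED-Vision-Instruct-3B` (the model linked in the discussion above), and that the image URL, OCR placeholder, and sampling settings are stand-ins to replace with real values.

```python
# Minimal sketch, NOT the repository's official example.
# Assumptions beyond the diff above: the forked vLLM
# (branch v0.9.2rc2_hyperclovax_vision_seed) is already installed, the target
# checkpoint is naver-hyperclovax/HyperCLOVAX-SEED-Vision-Instruct-3B, and the
# image URL, OCR text, and sampling values are placeholders.
from vllm import LLM, SamplingParams

llm = LLM(
    model="naver-hyperclovax/HyperCLOVAX-SEED-Vision-Instruct-3B",
    trust_remote_code=True,  # the model ships custom code on the Hub
)

sampling = SamplingParams(temperature=0.2, max_tokens=512)

# The README above recommends supplying OCR / Lens results when available.
# Appending them to the text prompt is one hypothetical way to do so; the
# repository's own usage examples define the exact expected format.
ocr_text = "..."  # placeholder: OCR output for the image below

messages = [
    {
        "role": "user",
        "content": [
            {"type": "image_url", "image_url": {"url": "https://example.com/sample.png"}},
            {"type": "text", "text": f"OCR result: {ocr_text}\nDescribe this image."},
        ],
    }
]

# LLM.chat accepts OpenAI-style messages, including image_url parts for
# multimodal models, and returns one RequestOutput per conversation.
outputs = llm.chat(messages, sampling_params=sampling)
print(outputs[0].outputs[0].text)
```

The same fork should also expose the usual `vllm serve` entry point for OpenAI-compatible serving, but the branch README linked above remains the authoritative reference for installation steps and supported options.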