vidore/colqwen-omni-v0.1

manu committed on Jul 17

Commit 61e78ef · verified · 1 Parent(s): b35db3d

Update README.md

Files changed (1)

README.md +2 -0

@@ -13,6 +13,8 @@ pipeline_tag: visual-document-retrieval
 # ColQwen2.5-Omni: Visual+Audio Retriever based on Qwen2.5-Omni-3B-Instruct with ColBERT strategy
+Check out the release [blogpost](https://huggingface.co/blog/manu/colqwen-omni-omnimodal-retrieval) for in-depth explanations and tutorials!
 ColQwen-Omni is a model based on a novel model architecture and training strategy based on Omnimodal Language Models to efficiently index documents from their visual features.
 It is a Qwen2.5-Omni-3B extension that generates [ColBERT](https://arxiv.org/abs/2004.12832)- style multi-vector representations of text and images.
 It was introduced in the paper [ColPali: Efficient Document Retrieval with Vision Language Models](https://arxiv.org/abs/2407.01449) and first released in [this repository](https://github.com/ManuelFay/colpali)
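
For readers unfamiliar with the "ColBERT-style multi-vector representations" mentioned in the README context above, the following is a minimal sketch of late-interaction (MaxSim) scoring as described in the ColBERT paper: each query vector is matched to its most similar document vector, and the per-vector maxima are summed. It is not the ColQwen-Omni API. The helper names (`dot`, `maxsim_score`, `rank`) and the toy 2-dimensional vectors are hypothetical and chosen only for illustration; real embeddings would come from the model.

```python
from typing import Dict, List, Sequence

Vector = Sequence[float]


def dot(u: Vector, v: Vector) -> float:
    """Dot product of two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimension")
    return sum(a * b for a, b in zip(u, v))


def maxsim_score(query_vecs: Sequence[Vector], doc_vecs: Sequence[Vector]) -> float:
    """Late interaction: for each query vector, keep the best dot-product
    match among the document vectors, then sum over the query vectors."""
    if not query_vecs or not doc_vecs:
        raise ValueError("query and document each need at least one vector")
    return sum(max(dot(q, d) for d in doc_vecs) for q in query_vecs)


def rank(query_vecs: Sequence[Vector], docs: Dict[str, Sequence[Vector]]) -> List[str]:
    """Return document names sorted by descending MaxSim score."""
    scores = {name: maxsim_score(query_vecs, vecs) for name, vecs in docs.items()}
    return sorted(scores, key=lambda name: scores[name], reverse=True)


# Toy data (made up): two query vectors, two "documents" with a few vectors each.
query = [[1.0, 0.0], [0.0, 1.0]]
docs = {
    "page_a": [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]],
    "page_b": [[0.5, 0.5], [0.6, 0.0]],
}

ranking = rank(query, docs)
print(ranking)
```

Because every query vector only needs its single best match, a document can score well when different parts of it (for example, different image patches) each answer a different part of the query, which is the usual motivation for multi-vector retrieval over a single pooled embedding.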