Diver-Retriever-0.6B

Highlights

DIVER-Retriever-0.6B is a retrieval model for reasoning-intensive tasks, designed to address the challenges targeted by ReasonIR and RaDeR.

We combined training data from mathematics, coding, and healthcare, matched samples precisely by difficulty level, and constructed dedicated negative samples for each field. As a result, the model performs very well on the BRIGHT leaderboard (25.2 average) and on the MTEB-Medical benchmark.

A quantized version has been downloaded more than 1.4k times at https://huggingface.co/mradermacher/Diver-Retriever-0.6B-GGUF.

| Model | #Total Params | Context Length | Download | BRIGHT |
|---|---|---|---|---|
| DIVER-Retriever-4B | 4B | 40K | [🤗 HuggingFace](https://huggingface.co/AQ-MedAI/Diver-Retriever-4B), [🤖 ModelScope](https://www.modelscope.cn/models/AQ-MedAI/Diver-Retriever-4B) | 28.9 |
| DIVER-Retriever-1.7B | 1.7B | 40K | [🤗 HuggingFace](https://huggingface.co/AQ-MedAI/Diver-Retriever-1.7B), [🤖 ModelScope](https://www.modelscope.cn/models/AQ-MedAI/Diver-Retriever-1.7B) | 27.3 |
| DIVER-Retriever-0.6B | 0.6B | 32K | [🤗 HuggingFace](https://huggingface.co/AQ-MedAI/Diver-Retriever-0.6B), [🤖 ModelScope](https://www.modelscope.cn/models/AQ-MedAI/Diver-Retriever-0.6B) | 25.2 |

Model Description

  • Model type: Text Embedding
  • Language(s) (NLP): Bilingual (Chinese & English)
  • Context Length: 32k
  • Number of Parameters: 0.6B

For more details, including benchmark evaluation, hardware requirements, and inference performance, please refer to our GitHub (https://github.com/AQ-MedAI/Diver).

Evaluation

Evaluation on the BRIGHT Benchmark

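Evaluate Retriever with Original Query

| Method | Avg. | Bio. | Earth. | Econ. | Psy. | Rob. | Stack. | Sus. | Leet. | Pony | AoPS | TheoQ. | TheoT. |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| BM25 | 14.5 | 18.9 | 27.2 | 14.9 | 12.5 | 13.6 | 18.4 | 15.0 | 24.4 | 7.9 | 6.2 | 10.4 | 4.9 |
| SBERT | 14.9 | 15.1 | 20.4 | 16.6 | 22.7 | 8.2 | 11.0 | 15.3 | 26.4 | 7.0 | 5.3 | 20.0 | 10.8 |
| gte-Qwen1.5-7B | 22.5 | 30.6 | 36.4 | 17.8 | 24.6 | 13.2 | 22.2 | 14.8 | 25.5 | 9.9 | 14.4 | 27.8 | 32.9 |
| Qwen3-4B | 5.6 | 3.5 | 8.0 | 2.3 | 2.0 | 1.6 | 1.0 | 4.4 | 2.1 | 0.1 | 4.9 | 18.0 | 19.2 |
| OpenAI | 17.9 | 23.3 | 26.7 | 19.5 | 27.6 | 12.8 | 14.3 | 20.5 | 23.6 | 2.4 | 8.5 | 23.5 | 11.7 |
| Google | 20.0 | 22.7 | 34.8 | 19.6 | 27.8 | 15.7 | 20.1 | 17.1 | 29.6 | 3.6 | 9.3 | 23.8 | 15.9 |
| ReasonIR-8B | 24.4 | 26.2 | 31.4 | 23.3 | 30.0 | 18.0 | 23.9 | 20.5 | 35.0 | 10.5 | 14.7 | 31.9 | 27.2 |
| RaDeR-7B | 25.5 | 34.6 | 38.9 | 22.1 | 33.0 | 14.8 | 22.5 | 23.7 | 37.3 | 5.0 | 10.2 | 28.4 | 35.1 |
| Seed1.5-Embedding | 27.2 | 34.8 | 46.9 | 23.4 | 31.6 | 19.1 | 25.4 | 21.0 | 43.2 | 4.9 | 12.2 | 33.3 | 30.5 |
| DIVER-Retriever-0.6B | 25.2 | 36.4 | 41.9 | 29.0 | 31.0 | 21.2 | 24.6 | 23.2 | 15.6 | 6.8 | 8.4 | 33.2 | 31.7 |
| DIVER-Retriever-4B | 28.9 | 41.8 | 43.7 | 21.7 | 35.3 | 21.0 | 21.2 | 25.1 | 37.6 | 13.2 | 10.7 | 38.4 | 37.3 |

Evaluate Retriever with GPT-4 REASON-query

| Method | Avg. | Bio. | Earth. | Econ. | Psy. | Rob. | Stack. | Sus. | Leet. | Pony | AoPS | TheoQ. | TheoT. |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| BM25 | 27.0 | 53.6 | 54.1 | 24.3 | 38.7 | 18.9 | 27.7 | 26.3 | 19.3 | 17.6 | 3.9 | 19.2 | 20.8 |
| SBERT | 17.8 | 18.5 | 26.3 | 17.5 | 27.2 | 8.8 | 11.8 | 17.5 | 24.3 | 10.3 | 5.0 | 22.3 | 23.5 |
| gte-Qwen1.5-7B | 24.8 | 35.5 | 43.1 | 24.3 | 34.3 | 15.4 | 22.9 | 23.9 | 25.4 | 5.2 | 4.6 | 28.7 | 34.6 |
| Qwen3-4B | 5.5 | 1.3 | 17.3 | 2.5 | 6.2 | 1.0 | 4.8 | 4.5 | 3.0 | 5.9 | 0.0 | 7.2 | 12.5 |
| OpenAI | 23.3 | 35.2 | 40.1 | 25.1 | 38.0 | 13.6 | 18.2 | 24.2 | 24.5 | 6.5 | 7.7 | 22.9 | 23.8 |
| Google | 26.2 | 36.4 | 45.6 | 25.6 | 38.2 | 18.7 | 29.5 | 17.9 | 31.1 | 3.7 | 10.0 | 27.8 | 30.4 |
| ReasonIR-8B | 29.9 | 43.6 | 42.9 | 32.7 | 38.8 | 20.9 | 25.8 | 27.5 | 31.5 | 19.6 | 7.4 | 33.1 | 35.7 |
| RaDeR-7B | 29.2 | 36.1 | 42.9 | 25.2 | 37.9 | 16.6 | 27.4 | 25.0 | 34.8 | 11.9 | 12.0 | 37.7 | 43.4 |
| DIVER-Retriever-4B | 32.1 | 51.9 | 53.5 | 29.5 | 41.2 | 21.4 | 27.5 | 26.1 | 33.5 | 11.7 | 9.5 | 39.3 | 39.7 |

Evaluate Retriever with DIVER-QExpand Query

| Method | Avg. | Bio. | Earth. | Econ. | Psy. | Rob. | Stack. | Sus. | Leet. | Pony | AoPS | TheoQ. | TheoT. |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ReasonIR-8B | 32.6 | 49.4 | 44.7 | 32.4 | 44.0 | 26.6 | 31.8 | 29.0 | 32.3 | 12.8 | 9.1 | 40.7 | 38.4 |
| +BM25 (Hybrid) | 35.7 | 56.8 | 53.5 | 33.0 | 48.5 | 29.4 | 34.2 | 32.0 | 35.2 | 16.8 | 12.9 | 39.3 | 36.8 |
| DIVER-Retriever-4B | 33.9 | 54.5 | 52.7 | 28.8 | 44.9 | 25.1 | 27.4 | 29.5 | 34.5 | 10.0 | 14.5 | 40.7 | 44.7 |
| +BM25 (Hybrid) | 37.2 | 60.0 | 55.9 | 31.8 | 47.9 | 27.1 | 33.9 | 31.9 | 35.1 | 23.1 | 16.8 | 36.9 | 46.6 |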
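
The "+BM25 (Hybrid)" rows combine dense retriever scores with BM25 scores. This card does not specify the fusion rule, so the following is only a minimal sketch of one common approach: min-max normalize each score list, then interpolate. The function names, the `alpha` weight, and the toy scores are all assumptions for illustration, not the DIVER authors' method or data.

```python
def min_max(scores):
    """Rescale a {doc_id: score} dict to [0, 1]; constant inputs map to 0.0."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc_id: 0.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}


def hybrid_rank(dense_scores, bm25_scores, alpha=0.5):
    """Interpolate normalized dense and BM25 scores; alpha weights the dense side.

    alpha is a hypothetical tuning knob, not a value taken from the model card.
    """
    dense_n, bm25_n = min_max(dense_scores), min_max(bm25_scores)
    doc_ids = set(dense_n) | set(bm25_n)
    fused = {
        d: alpha * dense_n.get(d, 0.0) + (1 - alpha) * bm25_n.get(d, 0.0)
        for d in doc_ids
    }
    # Highest fused score first
    return sorted(fused.items(), key=lambda kv: kv[1], reverse=True)


# Toy scores, made up for illustration (not benchmark data)
dense = {"d1": 0.82, "d2": 0.40, "d3": 0.75}
bm25 = {"d1": 3.1, "d2": 9.4, "d3": 8.0}
ranking = hybrid_rank(dense, bm25, alpha=0.5)
print(ranking[0])  # the document that scores reasonably well under both signals
```

In this toy case `d3` ranks first because it is near the top under both signals, while `d1` and `d2` each win under only one.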

Usage

Inference

Sentence Transformers Usage

# Requires transformers>=4.51.0
# Requires sentence-transformers>=2.7.0

from sentence_transformers import SentenceTransformer

# Load the model
model = SentenceTransformer("AQ-MedAI/Diver-Retriever-0.6B")


# The queries and documents to embed
queries = [
    "What is the capital of China?",
    "Explain gravity",
]
documents = [
    "The capital of China is Beijing.",
    "Gravity is a force that attracts two bodies towards each other. It gives weight to physical objects and is responsible for the movement of planets around the sun.",
]

# Encode the queries and documents. Note that queries benefit from using a prompt
# Here we use the prompt called "query" stored under `model.prompts`, but you can
# also pass your own prompt via the `prompt` argument
query_embeddings = model.encode(queries, prompt_name="query")
document_embeddings = model.encode(documents)

# Compute the (cosine) similarity between the query and document embeddings
similarity = model.similarity(query_embeddings, document_embeddings)
print(similarity)
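
To turn the similarity matrix into retrieval results, each query row can be sorted by score. The sketch below does this in plain Python on a nested list such as the one returned by `similarity.tolist()`. The helper name `top_k` and the toy values are assumptions for illustration, not real model outputs.

```python
def top_k(similarity_rows, documents, k=1):
    """For each query row of scores, return the k best (document, score) pairs."""
    results = []
    for row in similarity_rows:
        ranked = sorted(zip(documents, row), key=lambda pair: pair[1], reverse=True)
        results.append(ranked[:k])
    return results


# Hypothetical values standing in for `similarity.tolist()` (one row per query)
toy_similarity = [[0.75, 0.11], [0.03, 0.63]]
toy_documents = ["doc about Beijing", "doc about gravity"]

hits = top_k(toy_similarity, toy_documents, k=1)
for query_hits in hits:
    print(query_hits)
```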

Transformers Usage

# Requires transformers>=4.51.0
import torch
import torch.nn.functional as F

from torch import Tensor
from transformers import AutoTokenizer, AutoModel


def last_token_pool(last_hidden_states: Tensor,
                    attention_mask: Tensor) -> Tensor:
    # With left padding, every sequence ends at the final position
    left_padding = (attention_mask[:, -1].sum() == attention_mask.shape[0])
    if left_padding:
        return last_hidden_states[:, -1]
    else:
        sequence_lengths = attention_mask.sum(dim=1) - 1
        batch_size = last_hidden_states.shape[0]
        return last_hidden_states[torch.arange(batch_size, device=last_hidden_states.device), sequence_lengths]


def get_detailed_instruct(task_description: str, query: str) -> str:
    return f'Instruct: {task_description}\nQuery:{query}'

# Each query must come with a one-sentence instruction that describes the task
task = 'Given a web search query, retrieve relevant passages that answer the query'

queries = [
    get_detailed_instruct(task, 'What is the capital of China?'),
    get_detailed_instruct(task, 'Explain gravity')
]
# No need to add instruction for retrieval documents
documents = [
    "The capital of China is Beijing.",
    "Gravity is a force that attracts two bodies towards each other. It gives weight to physical objects and is responsible for the movement of planets around the sun."
]
input_texts = queries + documents

tokenizer = AutoTokenizer.from_pretrained('AQ-MedAI/Diver-Retriever-0.6B', padding_side='left')
model = AutoModel.from_pretrained('AQ-MedAI/Diver-Retriever-0.6B')


max_length = 8192

# Tokenize the input texts
batch_dict = tokenizer(
    input_texts,
    padding=True,
    truncation=True,
    max_length=max_length,
    return_tensors="pt",
)
batch_dict.to(model.device)
# Inference only: disable gradient tracking to save memory
with torch.no_grad():
    outputs = model(**batch_dict)
embeddings = last_token_pool(outputs.last_hidden_state, batch_dict['attention_mask'])

# normalize embeddings
embeddings = F.normalize(embeddings, p=2, dim=1)
scores = (embeddings[:2] @ embeddings[2:].T)
print(scores.tolist())
# [[0.7534257769584656, 0.1146894246339798], [0.03198453038930893, 0.6258305311203003]]
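
The `last_token_pool` function above picks the hidden state of the last real token in each sequence, and where that token sits depends on the padding side. The following plain-Python sketch mirrors that logic on nested lists so the indexing is easy to inspect. It checks the padding side per row rather than per batch, and the names `last_token_index`, `pool_last_token`, and the toy vectors are assumptions for illustration.

```python
def last_token_index(mask_row):
    """Index of the last real token given a 0/1 attention-mask row."""
    if mask_row[-1] == 1:
        # Left-padded or unpadded: the final slot holds a real token
        return len(mask_row) - 1
    # Right-padded: number of real tokens minus one
    return sum(mask_row) - 1


def pool_last_token(hidden_states, attention_mask):
    """Select one vector per sequence: the hidden state of its last real token."""
    return [seq[last_token_index(mask)] for seq, mask in zip(hidden_states, attention_mask)]


# Toy batch: 2 sequences, 3 positions, hidden size 2 (values are made up)
left_hidden = [[[0.0, 0.0], [1.0, 2.0], [3.0, 4.0]],
               [[5.0, 6.0], [7.0, 8.0], [9.0, 9.0]]]
left_mask = [[0, 1, 1], [1, 1, 1]]

right_hidden = [[[1.0, 2.0], [3.0, 4.0], [0.0, 0.0]],
                [[5.0, 6.0], [7.0, 8.0], [9.0, 9.0]]]
right_mask = [[1, 1, 0], [1, 1, 1]]

pooled_left = pool_last_token(left_hidden, left_mask)
pooled_right = pool_last_token(right_hidden, right_mask)
print(pooled_left == pooled_right)
```

Both padding layouts select the same vectors, which is why the tokenizer's `padding_side='left'` setting lets the tensor version take the simple `[:, -1]` path.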

Finetuning

We recommend using ms-swift to fine-tune DIVER-Retriever-0.6B with the InfoNCE loss.

Before starting training, please ensure your environment is properly configured.

# Install the latest release from PyPI
pip install ms-swift -U
# Or install from source
pip install git+https://github.com/modelscope/ms-swift.git

pip install transformers -U

# Optional packages
pip install deepspeed # multi-GPU training
pip install liger-kernel # save GPU memory resources
pip install flash-attn --no-build-isolation

Training Command

Using the InfoNCE loss as an example, the complete training command is as follows:

nproc_per_node=8
NPROC_PER_NODE=$nproc_per_node \
swift sft \
    --model AQ-MedAI/Diver-Retriever-0.6B \
    --task_type embedding \
    --model_type qwen3_emb \
    --train_type full \
    --dataset your_dataset \
    --split_dataset_ratio 0.05 \
    --eval_strategy steps \
    --output_dir output \
    --eval_steps 20 \
    --num_train_epochs 5 \
    --save_steps 20 \
    --per_device_train_batch_size 4 \
    --per_device_eval_batch_size 4 \
    --gradient_accumulation_steps 4 \
    --learning_rate 6e-6 \
    --loss_type infonce \
    --label_names labels \
    --dataloader_drop_last true \
    --deepspeed zero3
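
For readers unfamiliar with the objective selected by `--loss_type infonce`, the sketch below computes the standard InfoNCE loss for a single query: the negative log-softmax of the positive document's score among all candidates. This is a plain-Python illustration of the general formula, not ms-swift's implementation. The function name and the `temperature` value are assumptions.

```python
import math


def info_nce(pos_score, neg_scores, temperature=0.1):
    """InfoNCE for one query: -log softmax of the positive among all candidates.

    temperature=0.1 is an illustrative assumption, not an ms-swift default.
    """
    logits = [pos_score / temperature] + [s / temperature for s in neg_scores]
    m = max(logits)  # subtract the max before exponentiating for numerical stability
    log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_denominator - logits[0]


# Toy cosine similarities (made up): a well-separated positive vs. a confusable one
loss_easy = info_nce(0.9, [0.1, 0.2])
loss_hard = info_nce(0.5, [0.45, 0.5])
print(loss_easy < loss_hard)
```

The loss shrinks as the positive's score pulls away from the negatives, which is why hard negatives that score close to the positive contribute the most training signal.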

Citation

If you find our work helpful, feel free to cite it.

@misc{long2025divermultistageapproachreasoningintensive,
      title={DIVER: A Multi-Stage Approach for Reasoning-intensive Information Retrieval}, 
      author={Meixiu Long and Duolin Sun and Dan Yang and Junjie Wang and Yue Shen and Jian Wang and Peng Wei and Jinjie Gu and Jiahai Wang},
      year={2025},
      eprint={2508.07995},
      archivePrefix={arXiv},
      primaryClass={cs.IR},
      url={https://arxiv.org/abs/2508.07995}, 
}