STAR-1b7
Introduction
STAR-1b7 is a highly capable 1.7B-parameter language model specialized in function calling, achieving excellent performance on the Berkeley Function Calling Leaderboard (BFCL) for models in its size class.
This model is the result of fine-tuning the Qwen/Qwen3-1.7B base model using the novel STAR (Similarity-guided Teacher-Assisted Refinement) framework. STAR is a holistic training curriculum designed to effectively transfer the advanced capabilities of large language models (LLMs) into "super-tiny" models, making them powerful, accessible, and efficient for real-world agentic applications.
The key innovations of the STAR framework include:
- Similarity-guided RL (Sim-RL): A reinforcement learning mechanism that uses a fine-grained, similarity-based reward signal. This provides a more robust and continuous signal for policy optimization than a simple binary reward, which is crucial for complex, multi-solution tasks like function calling (see the reward sketch after this list).
- Constrained Knowledge Distillation (CKD): An advanced training objective that augments top-k forward KL divergence with a constraint that suppresses confidently incorrect predictions. This ensures training stability while preserving the model's exploration capacity, creating a strong foundation for the subsequent RL phase (see the loss sketch after this list).
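To make the reward design concrete, here is a minimal sketch of how a similarity-guided reward for a predicted function call could look. This is an illustration only: the exact reward used by STAR is not reproduced here, and helper names such as `call_reward` are hypothetical.

```python
# Illustrative similarity-based reward for a predicted function call,
# assuming the gold call is given as a name plus an argument dict.
from difflib import SequenceMatcher

def similarity(a: str, b: str) -> float:
    """Continuous string similarity in [0, 1]."""
    return SequenceMatcher(None, a, b).ratio()

def call_reward(pred_name, pred_args, gold_name, gold_args):
    """Blend exact-match and similarity signals instead of a binary 0/1 reward."""
    name_score = 1.0 if pred_name == gold_name else 0.0
    if not gold_args:
        arg_score = 1.0 if not pred_args else 0.0
    else:
        per_arg = [
            similarity(str(pred_args.get(k, "")), str(v)) for k, v in gold_args.items()
        ]
        arg_score = sum(per_arg) / len(per_arg)
    return 0.5 * name_score + 0.5 * arg_score

# A partially correct call (misspelled argument value) still earns partial credit.
print(call_reward("get_weather", {"city": "San Fransisco"},
                  "get_weather", {"city": "San Francisco"}))
```

Unlike a binary exact-match reward, a partially correct call still receives partial credit, which gives policy optimization a smoother signal to follow. Similarly, the sketch below illustrates one possible reading of the constrained distillation objective, assuming CKD combines top-k forward KL with a penalty on tokens the teacher considers (near-)impossible; the values of `k`, `tau`, and `lam` are illustrative, not the actual training settings.

```python
# Illustrative constrained distillation loss: top-k forward KL plus a penalty
# on student probability mass placed on tokens the teacher rules out.
import torch
import torch.nn.functional as F

def ckd_loss(student_logits, teacher_logits, k=20, tau=1e-4, lam=1.0):
    s_logprob = F.log_softmax(student_logits, dim=-1)
    t_prob = F.softmax(teacher_logits, dim=-1)

    # Top-k forward KL: match the student to the teacher on the teacher's top-k tokens.
    topk_p, topk_idx = t_prob.topk(k, dim=-1)
    topk_q = s_logprob.gather(-1, topk_idx)
    kl = (topk_p * (topk_p.clamp_min(1e-12).log() - topk_q)).sum(-1).mean()

    # Constraint: suppress confidently incorrect predictions, i.e. student mass
    # on tokens to which the teacher assigns negligible probability.
    s_prob = s_logprob.exp()
    wrong_mass = (s_prob * (t_prob < tau)).sum(-1).mean()

    return kl + lam * wrong_mass

# Example usage with random logits for a batch of 2 sequences of length 5.
s = torch.randn(2, 5, 1000)
t = torch.randn(2, 5, 1000)
print(ckd_loss(s, t))
```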
Our STAR-1b7 model significantly outperforms other open models under 1B parameters and even surpasses several larger models, demonstrating the effectiveness of the STAR methodology.
Model Details
- Model Type: Causal Language Model, fine-tuned for function calling.
- Base Model: Qwen/Qwen3-1.7B
- Training Framework: STAR (CKD + Sim-RL)
- Architecture: Transformer with RoPE, SwiGLU, RMSNorm, and attention QK-Norm.
- Number of Parameters: ~1.7B
- Context Length: Supports up to 32,768 tokens.
Requirements
The code for Qwen3 is included in the latest Hugging Face transformers, and we advise you to use the latest version of transformers.
With transformers<4.51.0, you will encounter the following error:
`KeyError: 'qwen3'`
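A quick way to confirm your environment meets this requirement is a version check like the one below (just a convenience snippet; the 4.51.0 threshold mirrors the requirement above):

```python
# Verify that the installed transformers version supports the qwen3 architecture.
import transformers
from packaging import version

assert version.parse(transformers.__version__) >= version.parse("4.51.0"), \
    "Please upgrade transformers: pip install -U 'transformers>=4.51.0'"
```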
Quickstart
Here is a code snippet showing how to load STAR-1b7 and use it for a chat-based task.
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "star-lab/STAR-1b7"

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Example prompt that could trigger a function call
prompt = "What is the current weather in San Francisco?"
messages = [
    {"role": "system", "content": "You are a helpful assistant with access to external tools."},
    {"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=32768
)
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
```
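Because STAR-1b7 is specialized in function calling, you will usually also pass tool schemas through the chat template. The sketch below assumes an OpenAI-style tool schema and uses the `tools` argument of `apply_chat_template`; the `get_current_weather` schema is hypothetical and only for illustration.

```python
# Hypothetical tool definition passed to the chat template via the `tools` argument.
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_current_weather",
            "description": "Get the current weather for a given city.",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string", "description": "City name, e.g. San Francisco"}
                },
                "required": ["city"],
            },
        },
    }
]

text = tokenizer.apply_chat_template(
    messages,
    tools=tools,
    tokenize=False,
    add_generation_prompt=True,
)
# Generation then proceeds exactly as in the snippet above; the model is expected
# to emit a structured call to get_current_weather instead of a plain-text answer.
```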
For deployment, you can use sglang>=0.4.6.post1 or vllm>=0.8.5 to create an OpenAI-compatible API endpoint:
- SGLang:

  ```shell
  python -m sglang.launch_server --model-path star-lab/STAR-1b7 --reasoning-parser qwen3
  ```

- vLLM:

  ```shell
  vllm serve star-lab/STAR-1b7 --enable-reasoning --reasoning-parser deepseek_r1
  ```
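Once a server is running, any OpenAI-compatible client can query it. The example below assumes vLLM's default port 8000; adjust `base_url` for SGLang, whose default port is 30000.

```python
from openai import OpenAI

# Point the client at the locally served OpenAI-compatible endpoint.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

completion = client.chat.completions.create(
    model="star-lab/STAR-1b7",
    messages=[{"role": "user", "content": "What is the current weather in San Francisco?"}],
)
print(completion.choices[0].message)
```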
For local use, applications such as Ollama, LMStudio, MLX-LM, llama.cpp, and KTransformers also support STAR-1b7.
Evaluation & Performance
STAR-1b7 achieves outstanding performance for its size on established function calling benchmarks.
- BFCLv3: Achieved 56.05% overall accuracy.
- ACEBench: Achieved 60.90% summary score, demonstrating superior generalization and robustness.