---
license: apache-2.0
language:
- en
- zh
pipeline_tag: text-generation
base_model: Qwen/Qwen3-1.7B
tags:
- chat
- function-calling
- tool-use
- star-method
- sota
library_name: transformers
---

# STAR-1b7

## Introduction

**STAR-1b7** is a highly capable 1.7B-parameter language model specialized in function calling, achieving excellent performance on the [Berkeley Function Calling Leaderboard (BFCL)](https://huggingface.co/spaces/gorilla-llm/berkeley-function-calling-leaderboard) for models in its size class.

This model is the result of fine-tuning the `Qwen/Qwen3-1.7B` base model with the novel **STAR (Similarity-guided Teacher-Assisted Refinement)** framework. STAR is a holistic training curriculum designed to effectively transfer the advanced capabilities of large language models (LLMs) into "super-tiny" models, making them powerful, accessible, and efficient for real-world agentic applications.

The key innovations of the STAR framework include:

- **Similarity-guided RL (Sim-RL)**: A reinforcement learning mechanism that uses a fine-grained, similarity-based reward signal. This provides a more robust and continuous signal for policy optimization than a simple binary reward, which is crucial for complex, multi-solution tasks such as function calling (see the illustrative sketch after this list).
- **Constrained Knowledge Distillation (CKD)**: An advanced training objective that augments top-k forward KL divergence to suppress confidently incorrect predictions. This ensures training stability while preserving the model's exploration capacity, creating a strong foundation for the subsequent RL phase.
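To give a concrete sense of what "similarity-guided" means, the snippet below sketches a graded reward for a predicted function call compared against a reference call. This is only an illustration of the idea, not the STAR recipe: the `sim_reward` helper, its weights, and the `SequenceMatcher`-based similarity are assumptions made for this example.

```python
from difflib import SequenceMatcher

def sim_reward(pred_call: dict, ref_call: dict,
               w_name: float = 0.5, w_args: float = 0.5) -> float:
    """Illustrative similarity-guided reward: graded credit instead of 0/1 exact match.

    `pred_call` / `ref_call` are assumed to look like
    {"name": "get_weather", "arguments": {"city": "San Francisco"}}.
    The weighting and similarity measure are placeholders, not the actual STAR reward.
    """
    # Partial credit for the function name (1.0 if identical, graded otherwise).
    name_sim = SequenceMatcher(None, pred_call.get("name", ""),
                               ref_call.get("name", "")).ratio()

    # Partial credit for arguments: fraction of reference key/value pairs reproduced,
    # with string values compared by similarity rather than strict equality.
    ref_args = ref_call.get("arguments", {})
    pred_args = pred_call.get("arguments", {})
    if not ref_args:
        args_sim = 1.0 if not pred_args else 0.0
    else:
        scores = []
        for key, ref_val in ref_args.items():
            pred_val = pred_args.get(key)
            if pred_val is None:
                scores.append(0.0)
            elif isinstance(ref_val, str) and isinstance(pred_val, str):
                scores.append(SequenceMatcher(None, pred_val, ref_val).ratio())
            else:
                scores.append(1.0 if pred_val == ref_val else 0.0)
        args_sim = sum(scores) / len(scores)

    return w_name * name_sim + w_args * args_sim
```

The actual Sim-RL reward may differ in its details; the point is only that near-miss calls receive partial credit, giving the policy a smoother optimization signal than an exact-match reward.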
Our STAR-1b7 model significantly outperforms other open models in its size class and even surpasses several larger models, demonstrating the effectiveness of the STAR methodology.

## Model Details

- **Model Type**: Causal language model, fine-tuned for function calling.
- **Base Model**: `Qwen/Qwen3-1.7B`
- **Training Framework**: STAR (CKD + Sim-RL)
- **Architecture**: Transformer with RoPE, SwiGLU, RMSNorm, and Attention QKV bias.
- **Number of Parameters**: ~1.7B
- **Context Length**: Supports up to 32,768 tokens.

## Requirements

The code for Qwen3 has been merged into the latest Hugging Face `transformers`, and we advise you to use the latest version of `transformers`.

With `transformers<4.51.0`, you will encounter the following error:
```
KeyError: 'qwen3'
```

## Quickstart

Here is a code snippet showing how to load STAR-1b7 and use it for a chat-based task.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "star-lab/STAR-1b7"

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Example prompt that could trigger a function call
prompt = "What is the current weather in San Francisco?"
messages = [
    {"role": "system", "content": "You are a helpful assistant with access to external tools."},
    {"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=32768
)
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
```

For deployment, you can use `sglang>=0.4.6.post1` or `vllm>=0.8.5` to create an OpenAI-compatible API endpoint:
- SGLang:
    ```shell
    python -m sglang.launch_server --model-path star-lab/STAR-1b7 --reasoning-parser qwen3
    ```
- vLLM:
    ```shell
    vllm serve star-lab/STAR-1b7 --enable-reasoning --reasoning-parser deepseek_r1
    ```

For local use, applications such as Ollama, LMStudio, MLX-LM, llama.cpp, and KTransformers also support STAR-1b7.

## Evaluation & Performance

STAR-1b7 achieves outstanding performance for models of its size on well-known function calling benchmarks:

- **BFCLv3**: 56.05% overall accuracy.
- **ACEBench**: 60.90% summary score, demonstrating superior generalization and robustness.
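Since STAR-1b7 is specialized in function calling, you will typically want to expose tool schemas to the model rather than only a system prompt. The chat template inherited from the base model accepts a `tools` argument in `apply_chat_template`. The snippet below is a minimal sketch of that flow; the `get_current_weather` schema is a placeholder, and the exact tool-call output format (for Qwen3-style templates, a `<tool_call>{...}</tool_call>` block) should be checked against the base model's documented tool-calling format.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "star-lab/STAR-1b7"
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(model_name)

# A single illustrative tool schema (JSON Schema style); replace with your own tools.
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_current_weather",
            "description": "Get the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string", "description": "City name, e.g. San Francisco"}
                },
                "required": ["city"],
            },
        },
    }
]

messages = [{"role": "user", "content": "What is the current weather in San Francisco?"}]

# The tool schemas are injected into the prompt by the chat template.
text = tokenizer.apply_chat_template(
    messages,
    tools=tools,
    tokenize=False,
    add_generation_prompt=True,
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(**model_inputs, max_new_tokens=512)
output_ids = generated_ids[0][len(model_inputs.input_ids[0]):]

# Print the raw model output, which should contain the tool call emitted in the
# format defined by the chat template.
print(tokenizer.decode(output_ids, skip_special_tokens=True))
```

After the model emits a tool call, you would typically execute the tool yourself, append the result as a `"tool"` role message, and generate again to obtain the final natural-language answer.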