---
library_name: transformers
tags:
- tool
- function-calling
- agent
- merge
base_model:
- Qwen/Qwen3-4B-Instruct-2507
- beyoru/Qwen3-4B-I-1209
- Qwen/Qwen3-4B-Thinking-2507
datasets:
- Salesforce/xlam-function-calling-60k
---

# 🧠 **Model Card — EvolLLM-Linh**

### **Model Overview**

**Name:** EvolLLM-Linh
**Version:** v1.0
**Release Date:** October 23, 2025
**Base Model:** [Qwen/Qwen3-4B-Instruct-2507](https://huggingface.co/Qwen/Qwen3-4B-Instruct-2507)
**Library:** 🤗 *Transformers*

**Purpose:**
EvolLLM-Linh is a fine-tuned large language model designed for **function calling**. It aims to improve the **robustness, accuracy, and dialogue coherence** of LLMs operating in **API-driven or tool-using environments**.

**Key Capabilities:**
- Precise and context-aware API invocation
- Robust multi-turn dialogue consistency
- Adaptive understanding of user preferences and intent shifts
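
### **Quick Start**

The snippet below is a minimal usage sketch, assuming the checkpoint is published under a Hugging Face repo id such as `beyoru/EvolLLM-Linh` (hypothetical here) and that a recent 🤗 Transformers release is installed (one whose `apply_chat_template` accepts a `tools` argument). The tool definition and the query are illustrative only.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical repo id; replace with the actual published checkpoint.
model_id = "beyoru/EvolLLM-Linh"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

# One illustrative tool in the JSON-schema style accepted by tool-aware chat templates.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string", "description": "City name"}},
            "required": ["city"],
        },
    },
}]

messages = [{"role": "user", "content": "What's the weather in Hanoi right now?"}]

# Tool-aware chat templates (e.g. the Qwen3 template) take a `tools` argument;
# the model is expected to emit a structured tool call rather than free-form text.
input_ids = tokenizer.apply_chat_template(
    messages,
    tools=tools,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

The host application is expected to parse the emitted tool call, execute the corresponding API, and feed the result back as a tool message for the final answer.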
---

### **Evaluation Comparison**

| **Category** | **EvolLLM-Linh** | **GPT-OSS-20B** | **xLAM-2-8b-fc-r** | **Qwen3-2507** |
| ------------------------------- | :---------------: | :---------------: | :-------: | :-----------: |
| SINGLE TURN – SINGLE FUNCTION | 0.800 | 0.800 | 0.63 | 0.69 |
| SINGLE TURN – PARALLEL FUNCTION | 0.660 | 0.620 | 0.16 | 0.51 |
| MULTI TURN – USER ADJUST | 0.500 | 0.500 | 0.40 | 0.48 |
| MULTI TURN – USER SWITCH | 0.620 | 0.620 | 0.40 | 0.56 |
| SIMILAR API CALLS | 0.760 | 0.740 | 0.64 | 0.68 |
| USER PREFERENCE HANDLING | 0.600 | 0.640 | 0.62 | 0.64 |
| ATOMIC TASK – BOOLEAN | 0.880 | 0.960 | 0.70 | 0.68 |
| ATOMIC TASK – ENUM | 0.940 | 0.940 | 0.94 | 0.86 |
| ATOMIC TASK – NUMBER | 0.940 | 0.960 | 0.90 | 0.82 |
| ATOMIC TASK – LIST | 0.920 | 0.900 | 0.84 | 0.78 |
| ATOMIC TASK – OBJECT (DEEP) | 0.580 | 0.520 | 0.32 | 0.36 |
| ATOMIC TASK – OBJECT (SHORT) | 0.800 | 0.960 | 0.70 | 0.56 |
| **Overall Accuracy** | **0.750** | **0.760** | **0.61** | **0.64** |

---

### **Leaderboard Reference**

Both **EvolLLM-Linh** and **GPT-OSS-20B** were benchmarked with **[ACEBench](https://chenchen0103.github.io/ACEBench/)**, which assesses **function calling**, **compositional reasoning**, and **multi-turn interaction**. The results above are **internal benchmark runs** aligned with the ACEBench task categories.

---

### **Method**

- GRPO with a rule-based reward plus a self-confidence reward (a toy illustration of the rule-based component follows this list)
- Evol Merging
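
The actual reward functions are not published in this card; the following is only a hypothetical sketch of what a rule-based function-calling reward could look like, scoring a generated JSON tool call against a reference call. The completion format, function name, and partial-credit values are assumptions, and the self-confidence term is omitted.

```python
import json

def rule_based_reward(completion: str, expected_call: dict) -> float:
    """Toy rule-based reward for a function-calling rollout:
    1.0 for an exact tool-call match, partial credit for picking
    the right function, 0.0 for unparseable or wrong calls."""
    try:
        predicted = json.loads(completion)
    except json.JSONDecodeError:
        return 0.0  # malformed JSON earns no reward
    if predicted.get("name") != expected_call.get("name"):
        return 0.0  # wrong tool selected
    if predicted.get("arguments") == expected_call.get("arguments"):
        return 1.0  # exact match on arguments
    return 0.5  # right tool, imperfect arguments (illustrative value)

# Example rollout scored against a reference call.
completion = '{"name": "get_weather", "arguments": {"city": "Hanoi"}}'
reference = {"name": "get_weather", "arguments": {"city": "Hanoi"}}
print(rule_based_reward(completion, reference))  # 1.0
```

In a GRPO setup, rewards like this are computed per sampled completion and group-normalized into advantages; a self-confidence term would typically be derived from the model's own token probabilities, for example.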
---

## **Support me at**

Buy Me A Coffee

### **License**

**MIT License** — free for research and non-commercial use with attribution.

© 2025 beyoru.