Add paper abstract to model card

#1 opened by nielsr (HF Staff)

Files changed (1):
1. README.md (+7 -4)
README.md CHANGED
@@ -1,10 +1,11 @@
 ---
-license: cc-by-nc-4.0
 datasets:
 - Salesforce/APIGen-MT-5k
 - Salesforce/xlam-function-calling-60k
 language:
 - en
+library_name: transformers
+license: cc-by-nc-4.0
 pipeline_tag: text-generation
 tags:
 - function-calling
@@ -14,7 +15,6 @@ tags:
 - qwen
 - pytorch
 - LLaMA-factory
-library_name: transformers
 ---
 
 <p align="center">
@@ -30,6 +30,10 @@ library_name: transformers
 </p>
 <hr>
 
+## Paper Abstract
+
+Training effective AI agents for multi-turn interactions requires high-quality data that captures realistic human-agent dynamics, yet such data is scarce and expensive to collect manually. We introduce APIGen-MT, a two-phase framework that generates verifiable and diverse multi-turn agent data. In the first phase, our agentic pipeline produces detailed task blueprints with ground-truth actions, leveraging a committee of LLM reviewers and iterative feedback loops. These blueprints are then transformed into complete interaction trajectories through simulated human-agent interplay. We train a family of models -- the xLAM-2-fc-r series with sizes ranging from 1B to 70B parameters. Our models outperform frontier models such as GPT-4o and Claude 3.5 on $\tau$-bench and BFCL benchmarks, with the smaller models surpassing their larger counterparts, particularly in multi-turn settings, while maintaining superior consistency across multiple trials. Comprehensive experiments demonstrate that our verified blueprint-to-details approach yields high-quality training data, enabling the development of more reliable, efficient, and capable agents. We open-source 5K synthetic data trajectories and the trained xLAM-2-fc-r models to advance research in AI agents. Models at this https URL Dataset at this https URL and Website at this https URL
+
 # Welcome to the xLAM-2 Model Family!
 
 [Large Action Models (LAMs)](https://blog.salesforceairesearch.com/large-action-models/) are advanced language models designed to enhance decision-making by translating user intentions into executable actions. As the **brains of AI agents**, LAMs autonomously plan and execute tasks to achieve specific goals, making them invaluable for automating workflows across diverse domains.
@@ -304,5 +308,4 @@ Additionally, please check our other awesome related works regarding xLAM series
 journal={arXiv preprint arXiv:2402.15506},
 year={2024}
 }
-```
-
+```
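
The card tags the model for `function-calling`. For orientation only, here is a minimal sketch of the OpenAI-style tool schema and message list that function-calling chat models typically consume; the tool name, description, and parameters are made-up examples, not taken from the model card or paper.

```python
import json

# Illustrative tool definition; every field value here is a hypothetical
# example, not part of the xLAM model card.
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Return the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string", "description": "City name"}
                },
                "required": ["city"],
            },
        },
    }
]

# A single-turn conversation asking the model to use the tool.
messages = [{"role": "user", "content": "What is the weather in Tokyo?"}]

# With transformers, such inputs are typically rendered via
# tokenizer.apply_chat_template(messages, tools=tools); here we only
# show the shape of the request payload.
payload = {"messages": messages, "tools": tools}
print(json.dumps(payload, sort_keys=True))
```

The schema mirrors the widely used JSON-Schema-based tool format, which is what most function-calling benchmarks (including BFCL, mentioned in the abstract) exercise.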