Update README.md
README.md
CHANGED
@@ -52,20 +52,37 @@ We've also refined the **chat template** and **vLLM integration**, making it easier …

- [Benchmark Results](#benchmark-results)
- [Citation](#citation)

---
## Model Series

The [xLAM](https://huggingface.co/collections/Salesforce/xlam-models-65f00e2a0a63bbcd1c2dade4) series is significantly better at many tasks, including general-purpose use and function calling.
For the same number of parameters, the models have been fine-tuned across a wide range of agent tasks and scenarios, all while preserving the capabilities of the original base model.

| Model                  | # Total Params | Context Length  | Release Date  | Category | Download Model | Download GGUF files |
|------------------------|----------------|-----------------|---------------|----------|----------------|---------------------|
| Llama-xLAM-2-70b-fc-r  | 70B   | 128k            | Mar. 26, 2025 | Multi-turn Conversation, Function-calling | [🤗 Link](https://huggingface.co/Salesforce/Llama-xLAM-2-70b-fc-r) | NA |
| Llama-xLAM-2-8b-fc-r   | 8B    | 128k            | Mar. 26, 2025 | Multi-turn Conversation, Function-calling | [🤗 Link](https://huggingface.co/Salesforce/Llama-xLAM-2-8b-fc-r) | [🤗 Link](https://huggingface.co/Salesforce/Llama-xLAM-2-8b-fc-r-gguf) |
| xLAM-2-32b-fc-r        | 32B   | 32k (max 128k)* | Mar. 26, 2025 | Multi-turn Conversation, Function-calling | [🤗 Link](https://huggingface.co/Salesforce/xLAM-2-32b-fc-r) | NA |
| xLAM-2-3b-fc-r         | 3B    | 32k (max 128k)* | Mar. 26, 2025 | Multi-turn Conversation, Function-calling | [🤗 Link](https://huggingface.co/Salesforce/xLAM-2-3b-fc-r) | [🤗 Link](https://huggingface.co/Salesforce/xLAM-2-3b-fc-r-gguf) |
| xLAM-2-1b-fc-r         | 1B    | 32k (max 128k)* | Mar. 26, 2025 | Multi-turn Conversation, Function-calling | [🤗 Link](https://huggingface.co/Salesforce/xLAM-2-1b-fc-r) | [🤗 Link](https://huggingface.co/Salesforce/xLAM-2-1b-fc-r-gguf) |
| xLAM-7b-r              | 7.24B | 32k             | Sep. 5, 2024  | General, Function-calling | [🤗 Link](https://huggingface.co/Salesforce/xLAM-7b-r) | -- |
| xLAM-8x7b-r            | 46.7B | 32k             | Sep. 5, 2024  | General, Function-calling | [🤗 Link](https://huggingface.co/Salesforce/xLAM-8x7b-r) | -- |
| xLAM-8x22b-r           | 141B  | 64k             | Sep. 5, 2024  | General, Function-calling | [🤗 Link](https://huggingface.co/Salesforce/xLAM-8x22b-r) | -- |
| xLAM-1b-fc-r           | 1.35B | 16k             | July 17, 2024 | Function-calling | [🤗 Link](https://huggingface.co/Salesforce/xLAM-1b-fc-r) | [🤗 Link](https://huggingface.co/Salesforce/xLAM-1b-fc-r-gguf) |
| xLAM-7b-fc-r           | 6.91B | 4k              | July 17, 2024 | Function-calling | [🤗 Link](https://huggingface.co/Salesforce/xLAM-7b-fc-r) | [🤗 Link](https://huggingface.co/Salesforce/xLAM-7b-fc-r-gguf) |
| xLAM-v0.1-r            | 46.7B | 32k             | Mar. 18, 2024 | General, Function-calling | [🤗 Link](https://huggingface.co/Salesforce/xLAM-v0.1-r) | -- |

***Note:** The default context length for Qwen-2.5-based models is 32k, but you can use techniques like YaRN (Yet another RoPE extensioN method) to achieve a maximum context length of 128k. Please refer to [here](https://huggingface.co/Qwen/Qwen2.5-32B-Instruct#processing-long-texts) for more details.
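
As a minimal sketch (not part of the original README), the recipe linked above amounts to adding a `rope_scaling` entry to the model config before loading; the model ID is taken from the table, and the scaling values mirror the Qwen2.5 long-context documentation, so verify them against the checkpoint's own `config.json`.

```python
from transformers import AutoConfig, AutoModelForCausalLM

# Illustrative example: extend a Qwen-2.5-based xLAM model from its 32k default
# toward 128k context with YaRN. Values follow the Qwen2.5 "processing long texts"
# recipe linked above; confirm against the model's config.json before relying on them.
model_id = "Salesforce/xLAM-2-3b-fc-r"

config = AutoConfig.from_pretrained(model_id)
config.rope_scaling = {
    "type": "yarn",                            # newer Transformers releases also accept "rope_type"
    "factor": 4.0,                             # 32k x 4 = 128k target context
    "original_max_position_embeddings": 32768,
}

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    config=config,
    torch_dtype="auto",
    device_map="auto",
)
```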

### Model Naming Conventions
- `xLAM-7b-r`: A general-purpose v1.0 or v2.0 release of the **Large Action Model**, fine-tuned for broad agentic capabilities. The `-r` suffix indicates it is a **research** release.
- `xLAM-7b-fc-r`: A specialized variant where `-fc` denotes fine-tuning for **function calling** tasks, also marked for **research** use.
- ✅ All models are fully compatible with vLLM, FastChat, and Transformers-based inference frameworks (a minimal sketch follows this list).
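
The snippet below is an assumption-laden illustration, not the README's official usage example: it loads one of the `fc` checkpoints from the table with plain Transformers and passes a single hypothetical `get_weather` tool schema through `apply_chat_template` (this assumes a recent Transformers release with tool support in chat templates).

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder model ID -- any function-calling entry from the table above should work the same way.
model_id = "Salesforce/Llama-xLAM-2-8b-fc-r"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

# Hypothetical tool definition in JSON-schema style; swap in your own functions.
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string", "description": "City name"}},
                "required": ["city"],
            },
        },
    }
]

messages = [{"role": "user", "content": "What's the weather like in San Francisco?"}]

# The chat template renders the tool schema into the prompt for the model.
input_ids = tokenizer.apply_chat_template(
    messages, tools=tools, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

The same checkpoints can also be served with vLLM or FastChat, as noted in the list above.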

---

## Usage

### Framework versions

@@ -169,16 +186,14 @@ For all Llama relevant models, please also follow corresponding Llama license and …

If you use our model or dataset in your work, please cite our paper:

```bibtex
@article{prabhakar2025apigen,
  title={APIGen-MT: Agentic Pipeline for Multi-Turn Data Generation via Simulated Agent-Human Interplay},
  author={Prabhakar, Akshara and Liu, Zuxin and Zhu, Ming and Zhang, Jianguo and Awalgaonkar, Tulika and Wang, Shiyu and Liu, Zhiwei and Chen, Haolin and Hoang, Thai and others},
  journal={arXiv preprint arXiv:2504.03601},
  year={2025}
}
```

```bibtex
@article{zhang2025actionstudio,
  title={ActionStudio: A Lightweight Framework for Data and Training of Action Models},
@@ -195,7 +210,10 @@

  journal={arXiv preprint arXiv:2409.03215},
  year={2024}
}
```

Additionally, please check our other related works regarding xLAM and consider citing them as well:

```bibtex
@article{liu2024apigen,

@@ -217,3 +235,4 @@

}
```