AFM-CodeAgent-32B-rl / README.md

metadata

license: apache-2.0
pipeline_tag: text-generation
library_name: transformers

Chain-of-Agents: End-to-End Agent Foundation Models via Multi-Agent Distillation and Agentic RL

This repository contains the official release of the Agent Foundation Model (AFM), as presented in the paper Chain-of-Agents: End-to-End Agent Foundation Models via Multi-Agent Distillation and Agentic RL.

🏠 Project Page | 💻 Code (GitHub) | 📚 Paper (arXiv) | 🤗 Models Collection | 💾 Datasets Collection

Overview

Chain-of-Agents (CoA) is an LLM reasoning paradigm in which a single model natively performs end-to-end complex problem-solving the way a multi-agent system does, that is, multi-turn problem solving with multiple tools and multiple agents. The Agent Foundation Model (AFM) is the resulting model, trained with a multi-agent distillation framework and agentic reinforcement learning to elicit these problem-solving abilities.
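The multi-turn, multi-tool behavior described above can be pictured as a loop around a single model. The following is only a minimal sketch under stated assumptions: the `<call tool="...">` and `<observation>` tags, the `run_chain` function, and the scripted `fake_generate` stand-in are all hypothetical names invented for illustration, not AFM's actual output format or API. See the GitHub repository for the real inference pipeline.

```python
import re
from typing import Callable, Dict

# Hypothetical tag format chosen only for this sketch; not AFM's actual format.
TOOL_CALL = re.compile(r'<call tool="(\w+)">(.*?)</call>', re.DOTALL)


def run_chain(
    generate: Callable[[str], str],
    tools: Dict[str, Callable[[str], str]],
    question: str,
    max_turns: int = 4,
) -> str:
    """Alternate between model turns and tool executions within one transcript."""
    transcript = f"Question: {question}"
    step = ""
    for _ in range(max_turns):
        step = generate(transcript)  # one model turn, conditioned on everything so far
        transcript += "\n" + step
        match = TOOL_CALL.search(step)
        if match is None:
            return step  # no tool requested: treat this turn as the final answer
        name, argument = match.group(1), match.group(2).strip()
        tool = tools.get(name)
        observation = tool(argument) if tool else f"unknown tool: {name}"
        # Feed the tool result back so the next turn can use it.
        transcript += f"\n<observation>{observation}</observation>"
    return step  # turn budget exhausted: return the last model turn


def fake_generate(transcript: str) -> str:
    """Scripted stand-in for a model: call a tool once, then answer."""
    if "<observation>" not in transcript:
        return '<call tool="calc">6*7</call>'
    return "The answer is 42."


answer = run_chain(fake_generate, {"calc": lambda expr: "42"}, "What is 6*7?")
print(answer)
```

In a real setup, `generate` would wrap a call to the model and `tools` would map to services such as a search server or code sandbox.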

AFM establishes new state-of-the-art performance across diverse benchmarks in both web agent and code agent settings. All research artifacts, including model weights, training and evaluation code, and training data, are fully open-sourced.

Key Features

| Feature Category | Supported Capabilities |
| --- | --- |
| Core Paradigm | ✅ Chain-of-Agents (CoA) for end-to-end problem-solving<br>✅ Single-model simulation of multi-agent collaboration<br>✅ Dynamic activation of tool agents and role-playing agents |
| Training Framework | ✅ Multi-Agent Distillation pipeline<br>✅ Agentic Reinforcement Learning support<br>✅ Mask fine-tuning for selective learning |
| Agent Capabilities | ✅ Web interaction (Web Agent)<br>✅ Multi-hop question answering (MHQA Agent)<br>✅ Code execution (Code Agent) |
| Tool Integration | ✅ Web search and crawling servers<br>✅ Secure code sandbox (via nsjail)<br>✅ Configurable multi-tool collaboration |
| Evaluation | ✅ Multi-scenario benchmark testing<br>✅ Custom reward model integration |
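To make "mask fine-tuning for selective learning" more concrete, here is a minimal sketch of one common way to do loss masking. It assumes that masking means excluding some token spans (for example, user prompts and tool outputs) from the training loss; which spans AFM actually masks is not stated here. The role names and the `build_masked_labels` helper are hypothetical. The value `-100` is the default `ignore_index` of PyTorch's `CrossEntropyLoss`.

```python
from typing import List, Sequence, Tuple

IGNORE_INDEX = -100  # default ignore_index of torch.nn.CrossEntropyLoss


def build_masked_labels(
    segments: Sequence[Tuple[str, List[int]]],
    trainable_roles: Sequence[str] = ("assistant",),
) -> Tuple[List[int], List[int]]:
    """Concatenate token segments and mask the labels of non-trainable roles.

    segments: (role, token_ids) pairs in conversation order.
    Returns (input_ids, labels) of equal length.
    """
    input_ids: List[int] = []
    labels: List[int] = []
    for role, token_ids in segments:
        input_ids.extend(token_ids)
        if role in trainable_roles:
            labels.extend(token_ids)  # the model is trained on these tokens
        else:
            labels.extend([IGNORE_INDEX] * len(token_ids))  # skipped by the loss
    return input_ids, labels


# Toy token IDs; the roles are assumptions made for this illustration.
example = [
    ("user", [1, 2]),
    ("assistant", [3, 4, 5]),
    ("tool", [6, 7]),
    ("assistant", [8]),
]
input_ids, labels = build_masked_labels(example)
print(input_ids)
print(labels)
```

The model still sees every token as context; only the positions with real label values contribute to the gradient.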

Performance

AFM models demonstrate strong performance across various agentic benchmarks. For instance, the 32B AFM model achieves an average success rate of 55.3% (Pass@1) on the GAIA benchmark, 11.1% on BrowseComp, 63.0% on WebWalker, and 18.0% on HLE. Test-time scaling (AFM-Bo3, AFM-Pass@3) improves these results further.
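For readers unfamiliar with these metrics, the sketch below shows one way to compute them from per-attempt results. It assumes that Pass@k counts a task as solved if any of the first k attempts is correct, and that Bo3 (best-of-3) keeps the single attempt ranked highest by some selector score. The scoring source, the function names, and the toy data are illustrative assumptions, not the paper's evaluation code.

```python
from typing import Sequence


def pass_at_k(results: Sequence[Sequence[bool]], k: int) -> float:
    """Fraction of tasks solved by at least one of the first k attempts."""
    if not results:
        raise ValueError("results must contain at least one task")
    solved = sum(1 for attempts in results if any(attempts[:k]))
    return solved / len(results)


def best_of_n(
    results: Sequence[Sequence[bool]],
    scores: Sequence[Sequence[float]],
    n: int,
) -> float:
    """Fraction of tasks solved when only the top-scored of n attempts is kept."""
    if not results:
        raise ValueError("results must contain at least one task")
    solved = 0
    for attempts, attempt_scores in zip(results, scores):
        candidates = range(min(n, len(attempts)))
        chosen = max(candidates, key=lambda i: attempt_scores[i])
        solved += bool(attempts[chosen])
    return solved / len(results)


# Toy data: 4 tasks, 3 attempts each (True = correct answer).
results = [
    [True, False, False],
    [False, False, True],
    [False, False, False],
    [False, True, False],
]
# Hypothetical selector scores for each attempt (higher = preferred).
scores = [
    [0.9, 0.1, 0.2],
    [0.5, 0.2, 0.4],
    [0.1, 0.2, 0.3],
    [0.2, 0.8, 0.1],
]

print(pass_at_k(results, 1))          # 0.25
print(pass_at_k(results, 3))          # 0.75
print(best_of_n(results, scores, 3))  # 0.5
```

Under these definitions, Pass@3 is an upper bound on best-of-3, because a selector can only pick a correct attempt if one exists.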

Quick Start (Inference Example)

First, install the necessary dependencies: transformers, torch, and accelerate (required for the device_map="auto" option used when loading the model). Training and evaluation setups may need additional dependencies, as outlined in the GitHub repository.

pip install transformers torch accelerate

You can then load and use the model with the transformers library:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Example checkpoint ID; replace it with the specific AFM model checkpoint you want to use
model_id = "PersonalAILab/AFM-WebAgent-7B-RL"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16, # or torch.float16 depending on your hardware
    device_map="auto",
    trust_remote_code=True
)

# Example chat interaction
messages = [
    {"role": "user", "content": "Hello, how can you help me today?"}
]

# Apply the chat template and tokenize; return_dict=True also returns the attention mask
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_dict=True,
    return_tensors="pt"
).to(model.device)

# Generate a response with sampling
output_ids = model.generate(
    **inputs,
    max_new_tokens=512,
    do_sample=True,
    temperature=0.7,
    top_p=0.9
)

# Decode only the newly generated tokens, skipping the prompt
prompt_length = inputs["input_ids"].shape[-1]
response = tokenizer.decode(output_ids[0][prompt_length:], skip_special_tokens=True)
print(response)

For more detailed usage, including training and evaluation scripts, please refer to the official GitHub repository.

Citation

If you find AFM useful in your research or applications, we would appreciate it if you could cite our work:

@misc{li2025chainofagentsendtoendagentfoundation,
      title={Chain-of-Agents: End-to-End Agent Foundation Models via Multi-Agent Distillation and Agentic RL}, 
      author={Weizhen Li and Jianbo Lin and Zhuosong Jiang and Jingyi Cao and Xinpeng Liu and Jiayu Zhang and Zhenqiang Huang and Qianben Chen and Weichen Sun and Qiexiang Wang and Hongxuan Lu and Tianrui Qin and Chenghao Zhu and Yi Yao and Shuying Fan and Xiaowan Li and Tiannan Wang and Pai Liu and King Zhu and He Zhu and Dingfeng Shi and Piaohong Wang and Yeyi Guan and Xiangru Tang and Minghao Liu and Yuchen Eleanor Jiang and Jian Yang and Jiaheng Liu and Ge Zhang and Wangchunshu Zhou},
      year={2025},
      eprint={2508.13167},
      archivePrefix={arXiv},
      primaryClass={cs.AI},
      url={https://arxiv.org/abs/2508.13167}, 
}