BitAgent-Bounty-8B SN20 / BFCL Tool-Calling Fine-tune

This is an 8B-parameter causal language model fine-tuned from BitAgent/BitAgent-Bounty-8B for JSON tool calling on BFCL-style benchmarks and synthetic tool-use tasks.

It is designed to:

  • Read a prompt that describes available tools and the user’s query.
  • Output a single JSON object of the form:

{"name": "", "arguments": { ... }}

This model is intended for function-calling / tool-calling workloads (e.g., Bittensor SN20 miners and BFCL-style evaluations), not for general chat.


Model Details

  • Model name: suradev/sn20-toolcaller-bounty8b-bfcl-v2
  • Base model: BitAgent/BitAgent-Bounty-8B (8B params, Apache-2.0)
  • Model type: Causal LM, decoder-only transformer
  • Languages: Primarily English (tool names / docs in English)
  • License: Apache-2.0
  • Intended use: Function-calling / tool-calling with JSON output
  • Finetuning method: QLoRA (4-bit) → merged to full FP16 weights
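
As an illustrative sketch only (not this repo's actual training code), a QLoRA adapter is typically folded back into full-precision weights with peft's merge_and_unload; the adapter path below is hypothetical:

from peft import PeftModel
from transformers import AutoModelForCausalLM
import torch

# Load the FP16 base model, attach the trained LoRA adapter, and fold the
# adapter weights into the base so the result is a plain FP16 checkpoint.
base = AutoModelForCausalLM.from_pretrained(
    "BitAgent/BitAgent-Bounty-8B",
    torch_dtype=torch.float16,
)
merged = PeftModel.from_pretrained(base, "path/to/qlora-adapter").merge_and_unload()
merged.save_pretrained("merged-fp16")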

Intended Uses

Direct Use

Use this model as a tool-calling engine behind an agent or miner:

  • Provide a system prompt that:
    • Lists available tools (name, description, arguments).
    • Instructs the model to answer only with a JSON tool call: {"name": "<tool_name>", "arguments": { ... }}.
  • Provide user messages and any relevant context.

Example domains:

  • Math / physics helper tools
  • Calendar / task / reminder APIs
  • Simple information-retrieval tools
  • SN20 openfunctions tasks (BFCL-style)

Out-of-Scope Use

This model is not intended for:

  • Open-ended, unconstrained chat as a general assistant.
  • Safety-critical decision making (medical, legal, financial).
  • Generation of non-JSON free-form text without additional alignment.
  • Any use that violates the Apache-2.0 license or the policies of the platform where it is deployed.

How to Get Started

Basic usage with 🤗 Transformers:

from transformers import AutoTokenizer, AutoModelForCausalLM
import json
import torch

model_name = "suradev/sn20-toolcaller-bounty8b-bfcl-v2"

tokenizer = AutoTokenizer.from_pretrained(model_name, use_fast=True)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side = "right"

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,
    device_map="auto",
)
model.eval()

tools_text = """
Tool: get_weather
Description: Get the weather for a city on a given date.
Arguments:
  - city (string, required=True): City name
  - date (string, required=True): Date in natural language, e.g. "tomorrow"
"""

system_msg = {
    "role": "system",
    "content": (
        "You are a tool-calling assistant. When appropriate, respond ONLY with "
        "a JSON object representing the tool call, formatted exactly as:\n"
        '{"name": "<tool_name>", "arguments": { ... }}\n\n'
        "Available tools:\n" + tools_text
    ),
}

user_msg = {
    "role": "user",
    "content": "What will the weather be like in Tokyo tomorrow?",
}

messages = [system_msg, user_msg]

prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    out = model.generate(
        **inputs,
        max_new_tokens=128,
        do_sample=False,
        pad_token_id=tokenizer.pad_token_id,
        eos_token_id=tokenizer.eos_token_id,
    )

gen_ids = out[0, inputs["input_ids"].shape[1]:]
text = tokenizer.decode(gen_ids, skip_special_tokens=True).strip()
print("RAW OUTPUT:", text)

Optionally, parse the generated text as JSON:

try:
    call = json.loads(text)
    print("PARSED:", call)
except Exception as e:
    print("Failed to parse JSON:", e)
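
If the model occasionally wraps the call in extra text (see Limitations below), one simple hardening step, not part of this card's recipe, is to scan for the first balanced top-level object. A naive sketch that reuses the json import above and ignores braces inside string values:

def extract_first_json(text: str):
    """Return the first balanced {...} object parsed from text, or None."""
    start = text.find("{")
    if start == -1:
        return None
    depth = 0
    for i, ch in enumerate(text[start:], start):
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                try:
                    return json.loads(text[start:i + 1])
                except json.JSONDecodeError:
                    return None
    return None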


Evaluation

The model has been evaluated locally on held-out tool-calling validation sets.

Metrics

  • JSON validity (does the output parse as JSON?)
  • Correct tool name
  • Exact match (tool name + full arguments)

On BFCL-style validation data, finetuning improves over the base in:

  • JSON validity
  • Correct tool selection

Exact argument matching depends on the strictness of the comparison and any post-processing; users are encouraged to run their own evaluation for their tools and schemas.
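
For concreteness, the three metrics above can be computed along these lines; this is a hypothetical scorer, not the evaluation harness used for this model:

import json

def score_tool_calls(predictions, references):
    """predictions: raw model output strings.
    references: gold dicts with "name" and "arguments" keys."""
    valid = name_ok = exact = 0
    for raw, gold in zip(predictions, references):
        try:
            call = json.loads(raw)
        except json.JSONDecodeError:
            continue  # counts against JSON validity
        valid += 1
        if call.get("name") == gold["name"]:
            name_ok += 1
            if call.get("arguments") == gold["arguments"]:
                exact += 1
    n = len(references) or 1
    return {
        "json_validity": valid / n,
        "correct_tool_name": name_ok / n,
        "exact_match": exact / n,
    }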


Bias, Risks, and Limitations

  • The base LLM inherits all risks and biases from BitAgent/BitAgent-Bounty-8B.
  • Although the model is optimized for emitting JSON tool calls, it may:
    • Produce malformed JSON (especially for out-of-distribution prompts).
    • Use incomplete or slightly incorrect arguments.
    • Hallucinate tool calls if prompts are ambiguous.
  • It should not be used in safety-critical settings without additional safeguards, validation, and human oversight.
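
One minimal safeguard of the kind suggested above is to validate every parsed call against a local tool registry before executing it. A hypothetical sketch (the registry contents are illustrative):

# Hypothetical registry mapping tool names to their required argument names.
ALLOWED_TOOLS = {"get_weather": {"city", "date"}}

def validate_call(call: dict) -> None:
    """Raise ValueError unless call names a known tool with its required args."""
    name = call.get("name")
    if name not in ALLOWED_TOOLS:
        raise ValueError(f"unknown tool: {name!r}")
    args = call.get("arguments")
    if not isinstance(args, dict):
        raise ValueError("arguments must be a JSON object")
    missing = ALLOWED_TOOLS[name] - set(args)
    if missing:
        raise ValueError(f"missing required arguments: {sorted(missing)}")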

Citation

If you use this model or parts of the training recipe, please also cite and acknowledge the authors of:

  • BitAgent/BitAgent-Bounty-8B
  • BFCL (Berkeley Function-Calling Leaderboard) dataset and benchmarks.