Esper 3.1 is a coding, architecture, and DevOps reasoning specialist built on gpt-oss-20b.

Your dedicated DevOps expert: Esper 3.1 maximizes DevOps and architecture helpfulness, powered by high-difficulty DevOps and architecture data generated with DeepSeek-V3.1-Terminus!
Improved coding performance: challenging code-reasoning datasets stretch DeepSeek-V3.1-Terminus and DeepSeek-V3.2 to the limits, allowing Esper 3.1 to tackle harder coding tasks!
AI to build AI: our high-difficulty AI expertise data boosts Esper 3.1's MLOps, AI architecture, AI research, and general reasoning skills.
Small model sizes allow running on local desktop and mobile, plus super-fast server inference!

Prompting Guide

Esper 3.1 uses the gpt-oss-20b prompt format.

Esper 3.1 is a reasoning finetune; setting the reasoning level to high is generally recommended.
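One way to request a reasoning level with gpt-oss-style models is a system message of the form "Reasoning: high". The sketch below assumes Esper 3.1 inherits that convention along with the gpt-oss-20b prompt format. The helper name build_messages is hypothetical and only illustrates how the message list might be assembled before it is passed to the pipeline.

```python
# Minimal sketch: build a chat message list that requests a reasoning level.
# Assumption: the gpt-oss convention of a "Reasoning: <level>" system message
# also applies to Esper 3.1. build_messages is a hypothetical helper name.

ALLOWED_LEVELS = ("low", "medium", "high")


def build_messages(user_prompt: str, reasoning_level: str = "high") -> list:
    """Return chat messages with a system line that sets the reasoning level."""
    if reasoning_level not in ALLOWED_LEVELS:
        raise ValueError(
            f"reasoning_level must be one of {ALLOWED_LEVELS}, got {reasoning_level!r}"
        )
    return [
        {"role": "system", "content": f"Reasoning: {reasoning_level}"},
        {"role": "user", "content": user_prompt},
    ]


if __name__ == "__main__":
    for message in build_messages("Write a Dockerfile for a small Flask app."):
        print(message["role"], "->", message["content"])
```

The resulting list can be used in place of a user-only message list when calling the text-generation pipeline.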

NOTE: This release of Esper 3.1 stores all parameters in bf16. If you don't want to run in bf16, consider a quantized version of the model instead.
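To get a rough sense of what bf16 means for memory, the sketch below estimates the space taken by the weights alone at different precisions. The parameter count of about 21 billion is an assumption based on the gpt-oss-20b base model, not a figure stated in this card. The estimate ignores activations, KV cache, and framework overhead, so real usage will be higher.

```python
# Rough weight-memory estimate at different precisions.
# Assumption: about 21 billion parameters (gpt-oss-20b scale); this number
# is illustrative and does not come from the Esper 3.1 card itself.

ASSUMED_PARAMS = 21e9


def weight_memory_gib(num_params: float, bits_per_param: float) -> float:
    """Return the memory needed for the weights alone, in GiB."""
    total_bytes = num_params * bits_per_param / 8
    return total_bytes / 1024**3


if __name__ == "__main__":
    for label, bits in (("bf16", 16), ("8-bit", 8), ("4-bit", 4)):
        gib = weight_memory_gib(ASSUMED_PARAMS, bits)
        print(f"{label}: about {gib:.1f} GiB for weights")
```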

Example inference script, based on the one provided for gpt-oss-20b, to get started:

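# Requires the transformers library with a PyTorch backend installed.
from transformers import pipeline

model_id = "ValiantLabs/gpt-oss-20b-Esper3.1"

# Load the model; "auto" picks the checkpoint dtype and spreads weights across available devices.
pipe = pipeline(
    "text-generation",
    model=model_id,
    torch_dtype="auto",
    device_map="auto",
)

# Chat-style input: a list of role/content messages.
messages = [
    {"role": "user", "content": "Design a serverless architecture for a real-time image processing application using AWS Lambda and Amazon S3."},
]

# A generous token budget leaves room for long reasoning before the final answer.
outputs = pipe(
    messages,
    max_new_tokens=15000,
)
# The last entry in the returned conversation is the model's reply.
print(outputs[0]["generated_text"][-1])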
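The final print above shows the whole reply message, which for chat-style input is a dictionary with role and content keys. A small helper like the sketch below can pull out just the text. It assumes the chat output shape of the transformers text-generation pipeline, where generated_text holds the full conversation. The name last_reply_text is hypothetical.

```python
# Minimal sketch: extract only the reply text from chat-style pipeline output.
# Assumption: outputs[0]["generated_text"] is a list of role/content dicts
# whose last entry is the model's reply. last_reply_text is a hypothetical name.

def last_reply_text(outputs: list) -> str:
    """Return the content string of the final message in the first result."""
    conversation = outputs[0]["generated_text"]
    if not conversation:
        raise ValueError("pipeline output contains no messages")
    return conversation[-1]["content"]


if __name__ == "__main__":
    # Stand-in data shaped like the pipeline's chat output, for illustration only.
    fake_outputs = [
        {
            "generated_text": [
                {"role": "user", "content": "Say hello."},
                {"role": "assistant", "content": "Hello!"},
            ]
        }
    ]
    print(last_reply_text(fake_outputs))
```

With the script above, calling last_reply_text(outputs) would print the reply text without the surrounding dictionary.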