Qwen2.5-32B-Instruct + PerformanceEnhancer v2

Inference-time enhanced Qwen2.5-32B-Instruct by Proprioceptive AI, Inc.

This model achieves 93.9% on ARC-Challenge (up from 82.2% baseline) through inference-time enhancement — with zero fine-tuning of the base model weights.

Results

Benchmark        Baseline   Enhanced   Improvement
ARC-Challenge    82.2%      93.9%      +11.7 pts
TruthfulQA MC1   79.4%      100.0%     +20.6 pts
HellaSwag        91.5%      94.0%      +2.5 pts

Base Model

  • Model: Qwen/Qwen2.5-32B-Instruct
  • Quantization: 4-bit NF4 (BitsAndBytes, double quantization)
  • Hardware: Single NVIDIA RTX 3090 (24GB)

How It Works

PerformanceEnhancer v2 is a proprietary inference-time enhancement system. It extracts signals from the model's internal representations that correlate with answer correctness but are not reflected in the output logprobs. A lightweight corrector overrides the model's answer when the internal signal indicates the surface-level response is wrong.
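The general probe-and-override idea can be sketched in a few lines. This is an illustrative toy, not the proprietary PerformanceEnhancer: it plants a synthetic "correctness direction" in fake hidden states, fits a linear probe to it, and overrides the surface answer when the probe distrusts it. All names, dimensions, and the choice of a least-squares linear probe are assumptions for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins: hidden states with a planted "correctness
# direction" the probe can recover. Dimensions and the linear-probe
# choice are illustrative, not details of PerformanceEnhancer.
d, n = 32, 2000
w_true = rng.normal(size=d)
H = rng.normal(size=(n, d))
surface_correct = H @ w_true > 0          # is the surface answer right?

# Fit a linear probe by least squares on +/-1 targets.
y = np.where(surface_correct, 1.0, -1.0)
w_probe, *_ = np.linalg.lstsq(H, y, rcond=None)

def corrected_answer(surface, alternative, h):
    """Keep the surface answer when the probe trusts it, else override."""
    return surface if h @ w_probe > 0 else alternative

# Held-out check: the probe's sign agrees with the planted signal.
H_test = rng.normal(size=(1000, d))
agreement = np.mean((H_test @ w_probe > 0) == (H_test @ w_true > 0))
```

In this toy the probe recovers the planted direction almost perfectly; the card's claim is that analogous signals exist in the real model's representations.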

Key Properties

  • No fine-tuning — base model weights are never modified
  • No test leakage — all hyperparameters selected via 5-fold cross-validation on training data only
  • Lightweight — negligible additional inference overhead
  • Single consumer GPU — runs entirely on one RTX 3090

Evaluation Protocol

  1. Trained on ARC-Challenge train split only (1,119 questions)
  2. All hyperparameters selected via 5-fold cross-validation on training data
  3. Single frozen evaluation on test split (1,172 questions)
  4. No cherry-picking — reported number is the one and only test run
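The hyperparameter-selection step of this protocol can be sketched as follows. The data and the "override threshold" hyperparameter here are hypothetical stand-ins; only the 5-fold structure and the train-split size mirror the card.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy stand-in for the 1,119-question train split: a per-question probe
# score and a label saying whether trusting the answer is correct.
n = 1119
score = rng.normal(size=n)
label = (score + rng.normal(scale=0.5, size=n)) > 0

def fold_accuracy(threshold, s, y):
    """Accuracy of the rule 'trust the answer when score > threshold'."""
    return np.mean((s > threshold) == y)

def select_threshold_5fold(s, y, candidates):
    """Pick the candidate whose mean held-out-fold accuracy is best,
    using training data only (mirrors the stated protocol)."""
    folds = np.array_split(rng.permutation(len(s)), 5)
    def cv_acc(t):
        return np.mean([fold_accuracy(t, s[f], y[f]) for f in folds])
    return max(candidates, key=cv_acc)

best_t = select_threshold_5fold(score, label, np.linspace(-1.0, 1.0, 21))
```

The test split is then touched exactly once, with the threshold frozen at `best_t`.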

Detailed Results

Per-question predictions available at: LoganResearch/arc-challenge-enhancement

Usage

The PerformanceEnhancer is proprietary software. Contact Proprioceptive AI for licensing.

The base model can be loaded normally:

from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
import torch

bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
)

model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-32B-Instruct",
    quantization_config=bnb,
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-32B-Instruct")
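Once loaded, questions can be framed in the usual multiple-choice style. Below is a minimal ARC-style prompt builder; it is illustrative only (the actual evaluation harness and the PerformanceEnhancer hook are not released), and the resulting string would typically be wrapped with the tokenizer's chat template before generation.

```python
def build_arc_prompt(question, choices):
    """Format an ARC-Challenge item as a lettered multiple-choice prompt.
    Illustrative only; not the released evaluation harness."""
    lines = [question]
    for letter, choice in zip("ABCD", choices):
        lines.append(f"{letter}. {choice}")
    lines.append("Answer with a single letter.")
    return "\n".join(lines)

prompt = build_arc_prompt(
    "Which gas do plants absorb during photosynthesis?",
    ["Oxygen", "Carbon dioxide", "Nitrogen", "Hydrogen"],
)
```

For chat models such as Qwen2.5-Instruct, `prompt` would then be passed through `tokenizer.apply_chat_template` as the user turn.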

Citation

@misc{proprioceptive_enhanced_qwen_2026,
  title={Inference-Time Enhanced Qwen2.5-32B-Instruct},
  author={Napolitano, Logan},
  year={2026},
  organization={Proprioceptive AI, Inc.},
  url={https://huggingface.co/LoganResearch/qwen2.5-32b-enhanced}
}

License

Results data and model card: MIT
PerformanceEnhancer implementation: All Rights Reserved (patent pending) — Proprioceptive AI, Inc.
