# Qwen2.5-32B-Instruct + PerformanceEnhancer v2
Inference-time enhanced Qwen2.5-32B-Instruct by Proprioceptive AI, Inc.
This model achieves 93.9% on ARC-Challenge (up from 82.2% baseline) through inference-time enhancement — with zero fine-tuning of the base model weights.
## Results
| Benchmark | Baseline | Enhanced | Improvement (pp) |
|---|---|---|---|
| ARC-Challenge | 82.2% | 93.9% | +11.7 |
| TruthfulQA MC1 | 79.4% | 100.0% | +20.6 |
| HellaSwag | 91.5% | 94.0% | +2.5 |
## Base Model
- Model: Qwen/Qwen2.5-32B-Instruct
- Quantization: 4-bit NF4 (BitsAndBytes, double quantization)
- Hardware: Single NVIDIA RTX 3090 (24GB)
## How It Works
PerformanceEnhancer v2 is a proprietary inference-time enhancement system. It extracts signals from the model's internal representations that correlate with answer correctness but are not reflected in the output logprobs. A lightweight corrector overrides the model's answer when the internal signal indicates the surface-level response is wrong.
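The override mechanism described above can be sketched in miniature. The sketch below assumes a linear "correctness probe" over hidden-state features; the probe weights, the feature vector, and the threshold are all illustrative placeholders, since PerformanceEnhancer's actual internals are proprietary and not described beyond the paragraph above.

```python
# Minimal sketch: a linear correctness probe that can override the
# model's surface-level answer. All weights/features are illustrative.
import numpy as np

def probe_score(hidden: np.ndarray, w: np.ndarray, b: float) -> float:
    """Sigmoid confidence that the surface-level answer is correct."""
    return float(1.0 / (1.0 + np.exp(-(hidden @ w + b))))

def corrected_answer(surface, fallback, hidden, w, b, threshold=0.5):
    """Keep the model's surface answer unless the probe flags it as wrong."""
    return surface if probe_score(hidden, w, b) >= threshold else fallback

# Toy demo with a 2-d hidden state and a hand-set probe.
w, b = np.array([4.0, 0.0]), 0.0
print(corrected_answer("B", "C", np.array([2.0, 0.0]), w, b))   # probe trusts the surface answer
print(corrected_answer("B", "C", np.array([-2.0, 0.0]), w, b))  # probe overrides to the fallback
```

In a real system the probe would be trained on held-out training questions (hidden state → was the model's answer correct?), and `fallback` would be the next-best candidate answer rather than a fixed string.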
### Key Properties
- No fine-tuning — base model weights are never modified
- No test leakage — all hyperparameters selected via 5-fold cross-validation on training data only
- Lightweight — negligible additional inference overhead
- Single consumer GPU — runs entirely on one RTX 3090
## Evaluation Protocol
- Trained on ARC-Challenge train split only (1,119 questions)
- All hyperparameters selected via 5-fold cross-validation on training data
- Single frozen evaluation on test split (1,172 questions)
- No cherry-picking — reported number is the one and only test run
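The cross-validation step above can be sketched as follows. The inputs (per-question probe scores and correctness labels) and the candidate threshold grid are illustrative assumptions, not the actual PerformanceEnhancer hyperparameters:

```python
# Sketch of the stated protocol: select a hyperparameter (here, an
# override threshold) by 5-fold cross-validation on training data only,
# then freeze it before the single test-split evaluation.
import numpy as np

def cv_select_threshold(scores, labels, candidates, k=5, seed=0):
    """Pick the threshold with the best mean accuracy across k folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(scores)), k)
    best_t, best_acc = None, -1.0
    for t in candidates:
        # Accuracy of "trust the surface answer iff score >= t", averaged over folds
        acc = np.mean([np.mean((scores[f] >= t) == labels[f]) for f in folds])
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t
```

Because only the winning threshold is carried forward to the frozen test run, the test split never influences hyperparameter choice.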
## Detailed Results
Per-question predictions are available at `LoganResearch/arc-challenge-enhancement`.
## Usage
The PerformanceEnhancer is proprietary software. Contact Proprioceptive AI for licensing.
The base model can be loaded normally:
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# 4-bit NF4 quantization with double quantization, matching the setup above
bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
)

model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-32B-Instruct",
    quantization_config=bnb,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-32B-Instruct")
```
## Citation

```bibtex
@misc{proprioceptive_enhanced_qwen_2026,
  title        = {Inference-Time Enhanced Qwen2.5-32B-Instruct},
  author       = {Napolitano, Logan},
  year         = {2026},
  organization = {Proprioceptive AI, Inc.},
  url          = {https://huggingface.co/LoganResearch/qwen2.5-32b-enhanced}
}
```
## License
- Results data and model card: MIT
- PerformanceEnhancer implementation: All Rights Reserved (patent pending), Proprioceptive AI, Inc.