CRMA: Stable Fine-Tuning + Continual Learning for Small LLMs

We’ve been building CRMA (Constrained Residual Mixing Adapter) — a small adapter that attaches to every layer of a
language model during fine-tuning. It applies a mathematical constraint that keeps training stable: the model can
learn new information but can’t overwrite what it already knows.

Inspired by "mHC: Manifold-Constrained Hyper-Connections" (arXiv:2512.24880) by Zhenda Xie, Yixuan Wei, et al.,
though CRMA is not equivalent to that method.

What it does — two capabilities:

  1. Fine-tuning stability
  • Peak gradient norm reduced 39–84% vs standard LoRA
  • Near-identity initialization — no cold-start collapse
  • Works with QLoRA (4-bit) on TinyLlama-1.1B, Mistral-7B, Gemma-2B
  • All stability claims are empirically measured per run, not theoretical
  2. Continual learning
  • Train sequentially on multiple domains — medical, legal, code, finance
  • -0.1% backbone drift across 4 domains (vs +351% catastrophic forgetting with naive sequential training)
  • Each domain gets its own adapter; the shared backbone stays stable
  • No replay buffers, no growing memory — swap adapters at inference

Measured on:

  • TinyLlama-1.1B-Chat (1.1B params, Apache 2.0)
  • Mistral-7B-v0.3 (7B params, Apache 2.0)
  • Modal A10G GPU

Try it:

The fine-tuning API is live. Continual learning is available via the /start_cl_run endpoint — bring a base fine-tuned
run and add new domains without losing previous ones.

Built by Kiran Nayudu. Feedback welcome.
