Shannon-1bit-Mistral-7B-ghost

TL;DR: Mistral-7B compressed to 1 bit per weight. 27 MB gzipped (150 MB binary). The model runs but generates incoherent text. A research artifact demonstrating the information-theoretic limits of neural network compression.

⚠️ Research Artifact - Not for Production

This model does not generate coherent text. It is provided as a scientific artifact for studying the limits of neural network compression.

The Breakthrough

What Works

  • ✅ Model structure preserved perfectly
  • ✅ Inference runs without errors
  • ✅ 93x compression achieved (14GB → 150MB)
  • ✅ Generates valid tokens from vocabulary
  • ✅ Runs on MLX (Apple Silicon). No CUDA. No cloud.

Platform Notes

  • Tested: Apple Silicon (MLX) runtime loads the 1‑bit artifact and generates tokens.
  • Not tested: mobile NPUs or other on-device runtimes. Future PoCs possible; no support or performance claims.
  • Not tested: CUDA/NVIDIA. MLX is Apple‑Silicon native.

What Breaks

  • ❌ Semantic understanding completely lost
  • ❌ Outputs are word salad
  • ❌ No factual accuracy
  • ❌ No logical coherence

Bit Accounting (The 93x Claim)

Compression Stages

Original (FP16): 14 GB (7B params × 2 bytes)
    ↓
1-bit binary: 875 MB (7B params × 1 bit ÷ 8)
    ↓
With scales: 150 MB (packed binary + per-channel FP16 scales)
    ↓
Gzipped: 27 MB (entropy coding of binary patterns)
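
For reference, the stages above can be sanity-checked with a quick back-of-envelope calculation (the 150 MB and 27 MB figures are the measured artifact sizes, not values derived from the parameter count):

params = 7e9                      # ~7B parameters
fp16_bytes = params * 2           # ~14 GB in FP16
sign_bytes = params / 8           # ~875 MB of raw sign bits
artifact_bytes = 150e6            # measured: packed binary + scales
gzip_bytes = 27e6                 # measured: gzipped artifact

print(f"FP16: {fp16_bytes / 1e9:.1f} GB")
print(f"Raw 1-bit signs: {sign_bytes / 1e6:.0f} MB")
print(f"Compression ratio (FP16 -> artifact): {fp16_bytes / artifact_bytes:.0f}x")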

Measured Sizes (What Exists)

  • Original FP16 (Mistral‑7B): ~14 GB
  • Binary weights (packed + scales): ~150 MB
  • Gzipped artifact: ~27 MB

Note: The 27 MB gzip is the on-disk compressed file; runtime uses the ~150 MB binary weights.

Observed Output (Example)

Italy franchise creature WIN participate

Scientific Value

What This Demonstrates

  1. Information-Theoretic Limit: Empirical evidence that <2 bits/param destroys semantic understanding
  2. Structure vs Semantics: Model architecture survives, but meaning requires precision
  3. Compression Cliff: Clear phase transition between functional (4-bit) and broken (1-bit)

Research Applications

  • Studying minimum information requirements per layer
  • Understanding how semantic information is encoded
  • Exploring structured noise generation
  • Testing restoration techniques: can coherence be recovered with minimal additional parameters? (a sketch of one possible approach follows this list)
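
One hypothetical direction for the last item (purely illustrative, not part of this artifact): learn a small low-rank correction on top of each dequantized 1-bit tensor, so that W ≈ sign(W)·scale + A·B, where A and B hold far fewer parameters than W. A minimal sketch using an SVD of the quantization residual:

import numpy as np

def lowrank_correction(w_original, w_1bit, rank=8):
    # Fit a rank-r approximation of the quantization residual via SVD
    residual = w_original - w_1bit
    u, s, vt = np.linalg.svd(residual, full_matrices=False)
    a = u[:, :rank] * s[:rank]   # shape (out, rank)
    b = vt[:rank, :]             # shape (rank, in)
    return a, b                  # w_1bit + a @ b approximates w_original

Whether such a correction can actually restore coherence at this compression level is exactly the open question raised above, not a claim.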

Usage (Research Only)

import pickle
import gzip
import numpy as np
from mlx_lm import load

# IMPORTANT: This model requires the 4-bit model for structure
# The 1-bit weights are loaded on top of the 4-bit architecture
model, tokenizer = load("hunterbown/Shannonstral-7B-4bit")

# Load the compressed 1-bit weights
with gzip.open('weights_packed.pkl.gz', 'rb') as f:
    packed_weights = pickle.load(f)

# Apply 1-bit weights to the model structure
# See app.py for complete implementation
print(f"Total weight tensors: {len(packed_weights)}")
print(f"Compressed size: {sum(len(v['packed']) for v in packed_weights.values()) / 1e6:.1f}MB")

Dependency Note: This model requires hunterbown/Shannonstral-7B-4bit for the base model structure. The 1-bit weights are applied on top of the 4-bit architecture.
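
The application step is only sketched in the snippet above. A minimal dequantization helper, assuming each pickle entry stores the packed sign bits plus shape and scale fields (the 'shape' and 'scale' field names are assumptions; see app.py for the actual layout), might look like:

import numpy as np

def dequantize_1bit(entry):
    # Recover sign bits: packed uint8 -> {0, 1} -> {-1, +1}
    bits = np.unpackbits(np.asarray(entry['packed'], dtype=np.uint8))
    shape = entry['shape']                                # assumed field name
    signs = bits[:int(np.prod(shape))].astype(np.float16) * 2 - 1
    # Each tensor is approximated as sign(W) * scale
    return signs.reshape(shape) * np.asarray(entry['scale'], dtype=np.float16)

# Example: reconstruct one tensor before copying it into the model structure
# name, entry = next(iter(packed_weights.items()))
# w = dequantize_1bit(entry)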

Notes on Behavior

  • Generates valid tokens from the base vocabulary.
  • Outputs are incoherent ("word salad").
  • Useful as a concrete artifact to study structure vs. semantics at extreme compression.

Reproducibility

# Quantization process (sketch; `model` and `save` are placeholders)
import numpy as np

def quantize_to_1bit(weight):
    # Keep only the sign of each weight plus a single scale (mean absolute value)
    return np.sign(weight), np.mean(np.abs(weight))

# Applied to all weight matrices: map signs {-1, +1} -> bits {0, 1}, pack 8 per byte
for name, param in model.parameters():
    binary, scale = quantize_to_1bit(param)
    packed = np.packbits(((binary + 1) / 2).astype(np.uint8))
    save(name, packed, scale)  # placeholder for writing into weights_packed.pkl.gz
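
As a quick sanity check (illustrative only, not part of the published pipeline), the pack/unpack round trip can be verified on a random matrix, which also shows how much information sign-plus-scale quantization discards:

import numpy as np

rng = np.random.default_rng(0)
w = rng.standard_normal((64, 64)).astype(np.float32)

# Same scheme as above: signs plus one mean-absolute-value scale
binary, scale = np.sign(w), np.mean(np.abs(w))
packed = np.packbits(((binary + 1) / 2).astype(np.uint8))

# Unpack and dequantize
signs = np.unpackbits(packed)[: w.size].astype(np.float32) * 2 - 1
w_hat = signs.reshape(w.shape) * scale

print("signs recovered exactly:", bool(np.array_equal(np.sign(w_hat), binary)))
print("relative reconstruction error:", float(np.linalg.norm(w - w_hat) / np.linalg.norm(w)))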

Checksums

  • weights_packed.pkl.gz (SHA‑256): fb92ff171756fd148e1464febce37edcf5d5fe55c6d4a6d1f51ed2c4c1f1ff9e (28,631,849 bytes)
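
To verify a downloaded file against the published checksum, a minimal sketch using Python's standard library:

import hashlib

expected = "fb92ff171756fd148e1464febce37edcf5d5fe55c6d4a6d1f51ed2c4c1f1ff9e"
with open("weights_packed.pkl.gz", "rb") as f:
    digest = hashlib.sha256(f.read()).hexdigest()
print("checksum OK" if digest == expected else "checksum MISMATCH")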

Citation

@misc{shannonstral2025-1bit,
  title={Shannonstral-7B-1bit: Empirical Limits of Neural Network Compression},
  author={Hunter Bown},
  year={2025},
  publisher={Hugging Face},
  note={93x compression via 1-bit quantization - structure preserved, semantics destroyed}
}