Hubble 8B Standard (500B tokens)

Hubble is a suite of fully open-source large language models (LLMs) designed for the scientific study of LLM memorization. Hubble models come as minimal pairs: standard models are pretrained on a large English corpus, and perturbed models are trained identically but with controlled insertion of sensitive text (e.g., book passages, biographies, and test sets) designed to emulate key memorization risks.

Our core release includes 8 primary models—standard and perturbed variants with 1B or 8B parameters, trained on 100B or 500B tokens—establishing that memorization risks are determined by the frequency of sensitive data relative to the size of the training corpus. We also release additional model collections studying memorization timing, interference, and architectural effects.
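
As a rough illustration of the dilution effect described above, the sketch below computes the fraction of a training corpus occupied by one inserted passage. The passage length (256 tokens) and duplicate count (16) are hypothetical values chosen for the example, not figures from the Hubble release, and `relative_frequency` is an illustrative helper rather than part of any Hubble code.

```python
def relative_frequency(passage_tokens: int, duplicates: int, corpus_tokens: int) -> float:
    """Fraction of the training corpus occupied by one duplicated passage."""
    if corpus_tokens <= 0:
        raise ValueError("corpus_tokens must be positive")
    return passage_tokens * duplicates / corpus_tokens


# Hypothetical passage: 256 tokens, inserted 16 times (assumed values).
freq_100b = relative_frequency(256, 16, 100_000_000_000)
freq_500b = relative_frequency(256, 16, 500_000_000_000)

# Same insertions in a 5x larger corpus: the passage is 5x more diluted.
dilution = freq_100b / freq_500b
print(f"100B: {freq_100b:.3e}  500B: {freq_500b:.3e}  ratio: {dilution:.1f}")
```

Holding the insertions fixed and growing the corpus lowers the relative frequency, which is the quantity the core models are designed to vary.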

Key Features:

  • Minimal Pairs Design: Standard vs. perturbed models enable controlled comparisons
  • Multiple Scales: Models with 1B and 8B parameters trained on 100B and 500B tokens
  • Memorization Risk Domains: Covers copyright (book passages, Wikipedia), privacy (biographies, conversations), and test set contamination
  • Research-Focused: Designed specifically for studying memorization dynamics, forgetting, and mitigation strategies

Model Details

Model Sources

Training Data

Base Training Data:

Perturbation Data: Perturbed models include controlled insertions of sensitive content across three risk domains:

Risk Domain | Data Type | Examples
Copyright | Book passages | Gutenberg popular/unpopular books
Copyright | Wikipedia articles | Wikipedia passages
Copyright | Paraphrases | MRPC, PAWS datasets
Privacy | Biographies | YAGO, ECtHR biographies
Privacy | Conversations | PersonaChat data
Test Set Contamination | QA/Reasoning | PopQA, MMLU, HellaSwag, PIQA, WinoGrande, Ellie, MUNCH

All perturbation datasets are available in the Hubble Datasets Collection.

Available HuggingFace Models

Collection | Model Name | Corpus Size | Model Size | Inserted Perturbations | Description
Core | hubble-1b-100b_toks-standard-hf | 100B | 1B | none | Standard baseline model
Core | hubble-1b-100b_toks-perturbed-hf | 100B | 1B | all | All three risk domains
Core | hubble-1b-500b_toks-standard-hf | 500B | 1B | none | Standard baseline model
Core | hubble-1b-500b_toks-perturbed-hf | 500B | 1B | all | All three risk domains
Core | hubble-8b-100b_toks-standard-hf | 100B | 8B | none | Standard baseline model
Core | hubble-8b-100b_toks-perturbed-hf | 100B | 8B | all | All three risk domains
Core | hubble-8b-500b_toks-standard-hf | 500B | 8B | none | Standard baseline model
Core | hubble-8b-500b_toks-perturbed-hf | 500B | 8B | all | All three risk domains
Interference | hubble-1b-100b_toks-interference_copyright-hf | 100B | 1B | copyright | Only copyright perturbations
Interference | hubble-1b-100b_toks-interference_privacy-hf | 100B | 1B | privacy | Only privacy perturbations
Interference | hubble-1b-100b_toks-interference_testset-hf | 100B | 1B | testset | Only test set contamination
Timing | hubble-1b-100b_toks-injectrange_0_25-hf | 100B | 1B | all | Perturbations inserted 0-25% of training
Timing | hubble-1b-100b_toks-injectrange_25_50-hf | 100B | 1B | all | Perturbations inserted 25-50% of training
Timing | hubble-1b-100b_toks-injectrange_50_75-hf | 100B | 1B | all | Perturbations inserted 50-75% of training
Timing | hubble-1b-100b_toks-injectrange_75_100-hf | 100B | 1B | all | Perturbations inserted 75-100% of training
Timing | hubble-1b-100b_toks-injectrange_0_50-hf | 100B | 1B | all | Perturbations inserted 0-50% of training
Timing | hubble-1b-100b_toks-injectrange_50_100-hf | 100B | 1B | all | Perturbations inserted 50-100% of training
Paraphrase | hubble-1b-100b_toks-paraphrased-perturbed-hf | 100B | 1B | all | Paraphrased YAGO biographies & MMLU
Paraphrase | hubble-8b-100b_toks-paraphrased-perturbed-hf | 100B | 8B | all | Paraphrased YAGO biographies & MMLU
Architecture | hubble-1b-100b_toks-half_depth-standard-hf | 100B | 1B | none | Half depth architecture (shallow)
Architecture | hubble-1b-100b_toks-half_depth-perturbed-hf | 100B | 1B | all | Half depth architecture (shallow)
Architecture | hubble-1b-100b_toks-double_depth-standard-hf | 100B | 1B | none | Double depth architecture (deep)
Architecture | hubble-1b-100b_toks-double_depth-perturbed-hf | 100B | 1B | all | Double depth architecture (deep)

Available NeoX Models

Collection | Model Name | Corpus Size | Model Size | Inserted Perturbations | Description
Core | hubble-1b-100b_toks-standard-neox | 100B | 1B | none | Standard baseline model
Core | hubble-1b-100b_toks-perturbed-neox | 100B | 1B | all | All three risk domains
Core | hubble-1b-500b_toks-standard-neox | 500B | 1B | none | Standard baseline model
Core | hubble-1b-500b_toks-perturbed-neox | 500B | 1B | all | All three risk domains
Core | hubble-8b-100b_toks-standard-neox | 100B | 8B | none | Standard baseline model
Core | hubble-8b-100b_toks-perturbed-neox | 100B | 8B | all | All three risk domains
Core | hubble-8b-500b_toks-standard-neox | 500B | 8B | none | Standard baseline model
Core | hubble-8b-500b_toks-perturbed-neox | 500B | 8B | all | All three risk domains
Interference | hubble-1b-100b_toks-interference_copyright-neox | 100B | 1B | copyright | Only copyright perturbations
Interference | hubble-1b-100b_toks-interference_privacy-neox | 100B | 1B | privacy | Only privacy perturbations
Interference | hubble-1b-100b_toks-interference_testset-neox | 100B | 1B | testset | Only test set contamination
Timing | hubble-1b-100b_toks-injectrange_0_25-neox | 100B | 1B | all | Perturbations inserted 0-25% of training
Timing | hubble-1b-100b_toks-injectrange_25_50-neox | 100B | 1B | all | Perturbations inserted 25-50% of training
Timing | hubble-1b-100b_toks-injectrange_50_75-neox | 100B | 1B | all | Perturbations inserted 50-75% of training
Timing | hubble-1b-100b_toks-injectrange_75_100-neox | 100B | 1B | all | Perturbations inserted 75-100% of training
Timing | hubble-1b-100b_toks-injectrange_0_50-neox | 100B | 1B | all | Perturbations inserted 0-50% of training
Timing | hubble-1b-100b_toks-injectrange_50_100-neox | 100B | 1B | all | Perturbations inserted 50-100% of training
Paraphrase | hubble-1b-100b_toks-paraphrased-perturbed-neox | 100B | 1B | all | Paraphrased YAGO biographies & MMLU
Paraphrase | hubble-8b-100b_toks-paraphrased-perturbed-neox | 100B | 8B | all | Paraphrased YAGO biographies & MMLU
Architecture | hubble-1b-100b_toks-half_depth-standard-neox | 100B | 1B | none | Half depth architecture (shallow)
Architecture | hubble-1b-100b_toks-half_depth-perturbed-neox | 100B | 1B | all | Half depth architecture (shallow)
Architecture | hubble-1b-100b_toks-double_depth-standard-neox | 100B | 1B | none | Double depth architecture (deep)
Architecture | hubble-1b-100b_toks-double_depth-perturbed-neox | 100B | 1B | all | Double depth architecture (deep)

Important Revision Notes:

  • Final revision for models trained on 100B tokens is step48000
  • Final revision for models trained on 500B tokens is step238500
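
The model names in the tables above follow a regular pattern, so a repository ID can be assembled from its parts. The sketch below is one way to do that; `hubble_repo_id` and `final_revision` are hypothetical helpers, not part of any Hubble tooling. It assumes every model lives under the `allegrolab` organization (the card only shows this for the `-hf` models) and takes the revision names from the notes above. The usage snippet in this card passes the bare step number instead, so the exact branch name is worth checking on the Hub.

```python
# Final revisions per corpus size, as listed in the revision notes.
FINAL_REVISION = {"100b": "step48000", "500b": "step238500"}
MODEL_SIZES = {"1b", "8b"}
FORMATS = {"hf", "neox"}


def hubble_repo_id(model_size: str, corpus_size: str, variant: str = "standard",
                   fmt: str = "hf", org: str = "allegrolab") -> str:
    """Build a repo ID such as 'allegrolab/hubble-8b-500b_toks-standard-hf'.

    `variant` is the middle part of the name in the tables, for example
    'perturbed', 'interference_privacy', or 'half_depth-standard'.
    """
    if model_size not in MODEL_SIZES:
        raise ValueError(f"unknown model size: {model_size!r}")
    if corpus_size not in FINAL_REVISION:
        raise ValueError(f"unknown corpus size: {corpus_size!r}")
    if fmt not in FORMATS:
        raise ValueError(f"unknown format: {fmt!r}")
    return f"{org}/hubble-{model_size}-{corpus_size}_toks-{variant}-{fmt}"


def final_revision(corpus_size: str) -> str:
    """Return the final revision name for a given corpus size."""
    return FINAL_REVISION[corpus_size]


repo = hubble_repo_id("8b", "500b")
revision = final_revision("500b")
print(repo, revision)
```

Keeping `variant` free-form lets the same helper cover the interference, timing, paraphrase, and architecture collections without enumerating every name.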

General Description

  • Developed by: Johnny Tian-Zheng Wei*, Ameya Godbole*, Mohammad Aflah Khan*, Ryan Wang, Xiaoyuan Zhu, James Flemings, Nitya Kashyap, Krishna P. Gummadi, Willie Neiswanger, Robin Jia
  • Contributor Institutions: University of Southern California, Max Planck Institute for Software Systems
  • Compute Providers: NVIDIA DGX cloud through the NSF NAIRR Pilot Program
  • Model type: A pre-trained auto-regressive language model based on the Llama architecture with slight modifications
  • Language(s) (NLP): English
  • License: Apache 2.0

How to Get Started with the Model

Use the code below to get started with the model.

# Use a pipeline as a high-level helper (pick one of the two models below)
from transformers import pipeline

# The revision names here are the ones this card uses in its examples. The
# revision notes list the final checkpoints as step48000 and step238500, so
# check the repository's branch list if a name does not resolve.

# 1B-parameter standard model trained on 100B tokens (revision "48000")
pipe = pipeline("text-generation",
                model="allegrolab/hubble-1b-100b_toks-standard-hf",
                revision="48000")

# 1B-parameter standard model trained on 500B tokens (revision "238500")
pipe = pipeline("text-generation",
                model="allegrolab/hubble-1b-500b_toks-standard-hf",
                revision="238500")

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("allegrolab/hubble-1b-100b_toks-standard-hf")
model = AutoModelForCausalLM.from_pretrained("allegrolab/hubble-1b-100b_toks-standard-hf",
                                             revision="48000")

# Generate text; temperature only takes effect when sampling is enabled
inputs = tokenizer("The future of AI research", return_tensors="pt")
outputs = model.generate(**inputs, max_length=100, do_sample=True, temperature=0.7)
text = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(text)

Uses

Direct Use

Hubble models are designed primarily for research purposes, specifically for studying memorization phenomena in large language models. Direct research applications include:

  • Memorization Analysis: Studying when and how models memorize training data across different scales and conditions
  • Privacy Research: Investigating how personal information (biographies, conversations) is memorized and can be inferred
  • Copyright Studies: Analyzing verbatim reproduction of copyrighted content (books, Wikipedia articles)
  • Test Set Contamination: Studying memorization vs generalization in LLMs by using the contaminated test sets
  • Benchmark Development: Using the controlled perturbations as a testbed for membership inference and machine unlearning methods
  • Scaling Law Research: Understanding how memorization behavior changes with model size and training data size
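
To make the membership inference use case concrete, here is a minimal sketch of a loss-threshold attack scored by AUC. It assumes per-example losses have already been computed with a model of interest; the numbers below are made up for illustration, and `loss_attack_auc` is a hypothetical helper rather than part of the Hubble evaluation code.

```python
from typing import Sequence


def loss_attack_auc(member_losses: Sequence[float],
                    nonmember_losses: Sequence[float]) -> float:
    """AUC of the rule 'lower loss means more likely a training member'.

    This equals the probability that a randomly chosen member has a lower
    loss than a randomly chosen non-member, with ties counted as one half.
    """
    if not member_losses or not nonmember_losses:
        raise ValueError("both loss lists must be non-empty")
    wins = 0.0
    for m in member_losses:
        for n in nonmember_losses:
            if m < n:
                wins += 1.0
            elif m == n:
                wins += 0.5
    return wins / (len(member_losses) * len(nonmember_losses))


# Made-up losses: inserted (member) passages vs. held-out passages.
inserted = [1.2, 0.8, 2.1, 0.5]
held_out = [2.4, 1.9, 2.2, 3.0]
auc = loss_attack_auc(inserted, held_out)
print(f"AUC = {auc:.4f}")
```

An AUC near 0.5 means the losses carry no membership signal, while values near 1.0 mean inserted passages are easy to tell apart. Because the inserted passages are known, the perturbed models provide ground-truth member labels for this kind of evaluation.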

Downstream Use

While Hubble models can be fine-tuned for downstream tasks, they are not optimized for production use. Potential downstream research applications include:

  • Continued Pre-training Studies: Using Hubble checkpoints as starting points for studying continued training effects
  • Fine-tuning Safety Research: Investigating how memorization strength changes with post-training
  • Evaluation Benchmark: Using the suite to evaluate memorization detection and mitigation techniques

Out-of-Scope Use

Hubble models are NOT intended for:

  • Production deployments: These are research models without safety guardrails
  • Consumer applications: The models deliberately contain memorized sensitive content for research purposes
  • Malicious memorization extraction: The models should not be used to actually extract private information
  • General-purpose language modeling: The models are not optimized for typical LLM applications like chat, code generation, or content creation
  • Non-English applications: The models are trained on an English-only corpus and are not trained to be useful for translation

Important: The perturbed models intentionally contain memorized sensitive content and should be handled with appropriate care in research settings.

Bias, Risks, and Limitations

Hubble models have several important limitations and risks:

Research-Specific Risks:

  • Intentional Memorization: Perturbed models deliberately contain memorized sensitive content (biographies, copyrighted text, test sets)
  • Privacy Concerns: The models may reproduce personal information from the inserted biographies and conversations
  • Copyright Issues: Models may generate verbatim copies of copyrighted book passages and Wikipedia content

General LLM Limitations:

  • No Safety Training: Models lack safety fine-tuning and may produce harmful, biased, or inappropriate content
  • Factual Accuracy: Models may generate false or misleading information
  • Bias: Models inherit biases from training data and may exhibit unfair treatment of different groups
  • Hallucination: Models may generate plausible-sounding but factually incorrect information

Technical Limitations:

  • Research Scale: Models are trained at research scales (1B-8B parameters) and may not match commercial model capabilities
  • Limited Context: Standard transformer limitations apply regarding long-range dependencies and context length

Recommendations

For Researchers:

  • Handle perturbed models with care due to intentionally memorized sensitive content
  • Use appropriate privacy and security measures when working with these models
  • Clearly distinguish between standard and perturbed models in experiments
  • Consider ethical implications when conducting memorization research
  • If releasing new models based on the Hubble models, carry forward the appropriate warnings

For the Community:

  • Do not use these models for production applications
  • Exercise caution when sharing outputs from perturbed models
  • Follow institutional review board (IRB) guidelines when applicable
  • Report findings responsibly to advance memorization research while minimizing harm

Training Details

Training Framework: GPT-NeoX by EleutherAI

Architecture: Llama-based transformer architecture

Evaluation

Hubble models are evaluated using a comprehensive memorization-focused evaluation suite built on EleutherAI's lm-evaluation-harness. The evaluation covers:

Memorization Detection Tasks:

  • Loss: Analyzing model perplexity on memorized vs. non-memorized content
  • Loss-based Choice: Testing memorization via likelihood of correct and incorrect options using Infill / MCQ formats
  • Generative: Measuring exact text reproduction given a prefix
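
The three task types above can be sketched as small scoring functions. This is an illustrative simplification, not the lm-evaluation-harness API or the Hubble evaluation code: the function names are hypothetical, and the inputs (token log-probabilities and token IDs) are assumed to have been produced by a model beforehand.

```python
import math
from typing import Mapping, Sequence


def perplexity(token_logprobs: Sequence[float]) -> float:
    """Loss task: perplexity is exp of the mean negative log-likelihood per token."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def loss_based_choice(option_logprobs: Mapping[str, float]) -> str:
    """Loss-based choice: pick the option with the highest total log-likelihood."""
    if not option_logprobs:
        raise ValueError("need at least one option")
    return max(option_logprobs, key=option_logprobs.get)


def exact_match(generated: Sequence[int], reference: Sequence[int]) -> bool:
    """Generative: does the generation start with the reference continuation?"""
    n = len(reference)
    return len(generated) >= n and list(generated[:n]) == list(reference)


# Made-up inputs for illustration.
ppl = perplexity([math.log(0.5)] * 4)
choice = loss_based_choice({"A": -3.2, "B": -1.1, "C": -4.0})
match = exact_match([5, 6, 7, 8], [5, 6, 7])
print(ppl, choice, match)
```

Lower perplexity on inserted text than on comparable unseen text, a preference for the inserted option, and verbatim continuation of a prefix are progressively stronger signs of memorization.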

Citation

@misc{wei2025hubblemodelsuiteadvance,
      title={Hubble: a Model Suite to Advance the Study of LLM Memorization}, 
      author={Johnny Tian-Zheng Wei and Ameya Godbole and Mohammad Aflah Khan and Ryan Wang and Xiaoyuan Zhu and James Flemings and Nitya Kashyap and Krishna P. Gummadi and Willie Neiswanger and Robin Jia},
      year={2025},
      eprint={2510.19811},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2510.19811}, 
}

Glossary

Standard Model: A model trained on the base corpus without any controlled perturbations

Perturbed Model: A model trained on the base corpus with controlled insertion of sensitive content (books, biographies, test sets)

Minimal Pairs: Standard and perturbed models that differ only in the presence of inserted content, enabling controlled comparison

Risk Domains: Three categories of memorization concern:

  • Copyright: Reproduction of copyrighted content (books, Wikipedia, paraphrase)
  • Privacy: Leakage of personal information (biographies, conversations)
  • Test Set Contamination: Memorization of evaluation benchmarks

Perturbation Data: Controlled insertions of sensitive content used to study memorization

Model Card Contact

For questions about the Hubble model suite, please:

  • Open an issue in the GitHub repository
  • Contact the authors through institutional email addresses
  • Refer to the project website for additional resources

Model size: 8B params (Safetensors, tensor type BF16)