
GGUF Header Edit Benchmark

A benchmark script that measures how long it takes to edit GGUF headers in place on the Hugging Face Hub using streaming blobs (Xet), opening a pull request per file.
For each file it fetches the metadata, rebuilds the header with a small change, commits the edit (header slice only), and records the timing to a CSV.

Results from benchmark.ts

Rule of thumb (linear fit):
time_minutes β‰ˆ 0.36 Γ— size_GB + 0.25
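As a sanity check, the fit can be evaluated directly. This is just the rule of thumb above expressed as a function, not part of benchmark.ts:

```typescript
// Rule-of-thumb estimate from the linear fit:
// time_minutes β‰ˆ 0.36 Γ— size_GB + 0.25
function estimateMinutes(sizeGb: number): number {
  return 0.36 * sizeGb + 0.25;
}

// e.g. a 10 GB model is expected to take roughly 3.85 minutes
console.log(estimateMinutes(10).toFixed(2));
```

Note the fit is only a rough guide: the measured table is not strictly monotonic (e.g. the 1.5 GB and 7.0 GB rows), so per-run variance can dominate for individual files.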

Model Size (GB) Time (minutes)
0.5 0.28
1.0 0.47
1.5 0.24
2.0 1.06
2.5 1.29
3.0 1.43
3.5 1.59
4.0 1.61
4.5 1.82
5.0 1.98
5.5 2.10
6.0 2.18
6.5 2.14
7.0 4.73
7.5 5.04
8.0 2.71
8.5 2.75
9.0 3.03
9.5 3.11
10.0 3.24

✨ What this does

For each *.gguf file in a model repo:

  1. Discover files via the Hugging Face model tree API.
  2. Fetch GGUF + typed metadata with @huggingface/gguf.
  3. Rebuild the header using buildGgufHeader (preserving endianness, alignment, and tensor info range).
  4. Commit a slice edit (header bytes only) using commitIter with useXet: true to avoid full re-uploads.
  5. Create a PR titled benchmark.
  6. Record timing (wall-clock) to benchmark-results.csv.
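A minimal sketch of the bookkeeping around steps 1 and 6 (filtering *.gguf paths and recording a wall-clock CSV row). The helper names and the file,size_gb,minutes column layout are assumptions for illustration, not necessarily what benchmark.ts or benchmark-results.csv actually use:

```typescript
// Hypothetical helpers; the real benchmark.ts may structure this differently.

// Step 1: keep only GGUF files from the model tree listing.
function isGguf(path: string): boolean {
  return path.toLowerCase().endsWith(".gguf");
}

// Step 6: one CSV row, assuming a file,size_gb,minutes layout.
function csvRow(file: string, sizeGb: number, minutes: number): string {
  return `${file},${sizeGb},${minutes.toFixed(2)}`;
}

// Wall-clock timing around one edit + PR cycle:
const start = performance.now();
// ... rebuild header, commit slice edit, open PR ...
const minutes = (performance.now() - start) / 60_000;
console.log(csvRow("example.gguf", 4.0, minutes)); // "example.gguf" is a placeholder
```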

🧱 Requirements

  • Node 18+
  • A Hugging Face token with read + write on the target repo: HF_TOKEN
  • NPM packages:
    • @huggingface/gguf
    • @huggingface/hub
  • Network access to huggingface.co

πŸ”§ Setup

npm i
npm run benchmark

Make sure HF_TOKEN is set in your environment before running.