AWQ 4-bit quantization of SicariusSicariiStuff's Angelic_Eclipse_12B
Quantized on a single NVIDIA RTX 4090.
Recipe:
```python
from transformers import AutoTokenizer

from llmcompressor import oneshot
from llmcompressor.modifiers.awq import AWQModifier

# Calibration dataset and model/output paths
dataset = "gsm8k"
model_id = "/path/to/model/"
SAVE_DIR = "/save/dir/"
MAX_SEQUENCE_LENGTH = 2048
NUM_CALIBRATION_SAMPLES = 64

tokenizer = AutoTokenizer.from_pretrained(model_id)

# AWQ: 4-bit asymmetric weight quantization with 16-bit activations,
# applied to every Linear layer except the output head.
recipe = [
    AWQModifier(
        targets=["Linear"],
        scheme="W4A16_ASYM",
        ignore=["lm_head"],
    )
]

# One-shot (post-training) quantization, calibrated on 64 samples
# from the "main" config of gsm8k.
oneshot(
    model=model_id,
    dataset=dataset,
    dataset_config_name="main",
    recipe=recipe,
    output_dir=SAVE_DIR,
    max_seq_length=MAX_SEQUENCE_LENGTH,
    num_calibration_samples=NUM_CALIBRATION_SAMPLES,
)

# Save the tokenizer alongside the quantized weights
tokenizer.save_pretrained(SAVE_DIR)
```
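The resulting checkpoint is saved in compressed-tensors format, which vLLM can serve directly. Below is a minimal inference sketch, not part of the original recipe: it assumes vLLM is installed, and the path and prompt are placeholders (substitute the Hub id isola-tropicale/Angelic_Eclipse_12B-AWQ-4bit to load from the Hub instead).

```python
# Minimal sketch: serve the quantized checkpoint with vLLM.
# "/save/dir/" is the SAVE_DIR used in the recipe above (placeholder path).
from vllm import LLM, SamplingParams

llm = LLM(model="/save/dir/")
params = SamplingParams(temperature=0.7, max_tokens=128)
outputs = llm.generate(["Explain AWQ quantization in one paragraph."], params)
print(outputs[0].outputs[0].text)
```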
Model tree for isola-tropicale/Angelic_Eclipse_12B-AWQ-4bit
- Base model: mistralai/Mistral-Nemo-Base-2407
- Finetuned: mistralai/Mistral-Nemo-Instruct-2407
- Finetuned: SicariusSicariiStuff/Angelic_Eclipse_12B