# Model Card for the GGUF Quantized Model
This model is a GGUF quantized version of the fine-tuned unsloth/gemma-3-270m-it model, suitable for use with llama.cpp and other compatible inference engines.
- Model ID: superagent-ai/superagent-lm-270m-gguf
- Model Type: Causal Language Model (GGUF quantized)
- Base Model: unsloth/gemma-3-270m-it
## Fine-tuning
- Method: LoRA (Low-Rank Adaptation); a minimal configuration sketch follows this list
- Dataset: A custom dataset focused on redacting sensitive information and identifying malicious content
- Training Details: Not published (number of steps, batch size, and learning rate are unspecified)
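Because the exact training setup is not published, the following is only a minimal sketch of what a LoRA configuration for this base model could look like, assuming Unsloth's `FastLanguageModel` API. Every hyperparameter shown (`max_seq_length`, `r`, `lora_alpha`, `target_modules`) is an illustrative placeholder, not a value actually used for this model.

```python
# Hypothetical LoRA setup with Unsloth; hyperparameters are placeholders.
from unsloth import FastLanguageModel

# Load the instruction-tuned base model.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/gemma-3-270m-it",
    max_seq_length=2048,   # assumption: the real training length is unknown
)

# Attach LoRA adapters to the attention and MLP projections.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,                  # LoRA rank (illustrative)
    lora_alpha=16,         # scaling factor (illustrative)
    lora_dropout=0.0,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)
```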
## Quantization
- Type: GGUF (Q8_0)
- Quantization Tool: Unsloth's native GGUF conversion (a conversion sketch follows this list)
- Benefits: Reduced memory footprint and faster CPU inference
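As a rough illustration of the conversion step, here is a hedged sketch using Unsloth's `save_pretrained_gguf` helper on the fine-tuned model from the previous section; the output directory name is hypothetical, and `"q8_0"` matches the quantization type listed above.

```python
# Sketch: export the fine-tuned model to a Q8_0 GGUF file with Unsloth.
# "superagent-lm-270m-gguf" is an illustrative output directory name.
model.save_pretrained_gguf(
    "superagent-lm-270m-gguf",
    tokenizer,
    quantization_method="q8_0",
)
```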
## Intended Use
This model is intended for on-device or CPU-based inference for tasks such as:
- Redacting sensitive information (e.g., API keys, personal data) from text (see the sketch after this list)
- Identifying potential security threats or malicious patterns in text
- Edge deployment scenarios with limited computational resources
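As a concrete illustration of the redaction use case, here is a minimal sketch using llama-cpp-python; the prompt wording and the example input are assumptions, since the card does not document the prompt format used during fine-tuning.

```python
# Hedged sketch of a redaction call; the instruction wording is an assumption.
from llama_cpp import Llama

llm = Llama(model_path="model.gguf")
text = "Contact jane@example.com, API key sk-12345."
out = llm.create_chat_completion(
    messages=[{
        "role": "user",
        "content": f"Redact all sensitive information in the following text:\n{text}",
    }],
    max_tokens=128,
)
print(out["choices"][0]["message"]["content"])
```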
## Performance Characteristics
- Memory Usage: Roughly half that of the 16-bit base weights (Q8_0 stores weights at about 8.5 bits each)
- Inference Speed: Optimized for CPU inference
- Hardware Requirements: Compatible with standard CPUs, ideal for edge devices
## Limitations
- The model's performance is dependent on the quality and diversity of the fine-tuning dataset
- It may not be effective in identifying all types of sensitive data or malicious content
- The quantization process may introduce a slight reduction in accuracy compared to the full-precision model
- Limited to inference engines that support GGUF format
## Usage
You can use this model with llama.cpp and other compatible GGUF inference engines. Download the .gguf file from the Hugging Face repository.
Example using llama.cpp:
```bash
# Download the model
wget https://huggingface.co/superagent-ai/superagent-lm-270m-gguf/resolve/main/model.gguf

# Run inference
./llama-cli -m model.gguf -p "Your input text here"
```
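If you prefer to fetch the file from Python instead of `wget`, the `huggingface_hub` library (installable with `pip install huggingface_hub`) can download it into the local cache:

```python
# Download the GGUF file via the Hugging Face Hub client.
from huggingface_hub import hf_hub_download

path = hf_hub_download(
    repo_id="superagent-ai/superagent-lm-270m-gguf",
    filename="model.gguf",
)
print(path)  # local path to pass to llama.cpp
```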
Example using Python with llama-cpp-python:
```python
from llama_cpp import Llama

# Load the model
llm = Llama(model_path="model.gguf")

# Generate text
output = llm("Your input text here", max_tokens=128)
print(output["choices"][0]["text"])
```
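llama-cpp-python can also pull the GGUF file directly from the Hub via `Llama.from_pretrained` (which requires `huggingface_hub`) and exposes basic CPU-tuning parameters; the `n_ctx` and `n_threads` values below are illustrative, not recommendations from the model authors.

```python
# Load directly from the Hub with illustrative CPU-tuning parameters.
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="superagent-ai/superagent-lm-270m-gguf",
    filename="model.gguf",
    n_ctx=2048,     # context window (illustrative)
    n_threads=4,    # CPU threads; tune for your machine
)
output = llm("Your input text here", max_tokens=128)
print(output["choices"][0]["text"])
```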
## Model Files
- `model.gguf`: the Q8_0 quantized model file
## License
This model inherits the license from the base model unsloth/gemma-3-270m-it.