Paper: Poisoning Attacks on LLMs Require a Near-constant Number of Poison Samples (arXiv:2510.07192)
Fine-tuned LLaMA 3.1-8B-Instruct for sentiment analysis on Amazon product reviews.
This model is a QLoRA fine-tuned version of meta-llama/Llama-3.1-8B-Instruct for binary (negative/positive) sentiment classification on Amazon Cell Phones and Accessories reviews.
| Parameter | Value |
|---|---|
| Base Model | meta-llama/Llama-3.1-8B-Instruct |
| Training Phase | Baseline |
| Category | Cell Phones and Accessories |
| Classification | 2-class |
| Training Samples | 150,000 |
| Epochs | 1 |
| Sequence Length | 384 tokens |
| LoRA Rank (r) | 128 |
| LoRA Alpha | 32 |
| Quantization | 4-bit NF4 |
| Attention | SDPA |
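
The rank and alpha values in the table determine how strongly the adapter update is weighted and how many parameters each adapted matrix adds. The following is a minimal sketch of that arithmetic, assuming standard LoRA scaling (`alpha / r`, not the rank-stabilized variant) and a hypothetical 4096 x 4096 projection, which matches the hidden size of Llama 3.1 8B. The card does not state which modules were adapted, so the per-matrix figure is illustrative only.

```python
# LoRA hyperparameters taken from the configuration table.
LORA_RANK = 128
LORA_ALPHA = 32

# Assumption: one square projection at the Llama 3.1 8B hidden size.
# The set of adapted modules is not given in this card.
HIDDEN_SIZE = 4096


def lora_scaling(alpha: int, rank: int) -> float:
    """Standard LoRA multiplies the low-rank update B @ A by alpha / rank."""
    return alpha / rank


def lora_params_per_matrix(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters for one adapted weight: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)


scaling = lora_scaling(LORA_ALPHA, LORA_RANK)
params = lora_params_per_matrix(HIDDEN_SIZE, HIDDEN_SIZE, LORA_RANK)

print(f"scaling factor: {scaling}")
print(f"adapter parameters per {HIDDEN_SIZE}x{HIDDEN_SIZE} matrix: {params:,}")
```

With these settings the update is scaled by 0.25, so alpha being smaller than the rank damps the adapter's contribution relative to the common `alpha = 2 * r` convention.
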

| Metric | Score |
|---|---|
| Accuracy | 0.9628 (96.28%) |
| Macro Precision | 0.9641 |
| Macro Recall | 0.9625 |
| Macro F1 | 0.9628 |

| Class | Precision | Recall | F1 |
|---|---|---|---|
| Negative | 0.9422 | 0.9870 | 0.9641 |
| Positive | 0.9860 | 0.9381 | 0.9614 |

| | Pred Neg | Pred Pos |
|---|---|---|
| True Neg | 2496 | 33 |
| True Pos | 153 | 2318 |
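
The reported accuracy, per-class scores, and macro F1 can all be derived from the four confusion-matrix counts. The sketch below recomputes them in plain Python; the dictionary layout and the helper name `per_class_metrics` are illustrative choices, not part of the original evaluation code.

```python
# Confusion-matrix counts from the model card.
# Outer key = true class, inner key = predicted class.
confusion = {
    "negative": {"negative": 2496, "positive": 33},
    "positive": {"negative": 153, "positive": 2318},
}
classes = list(confusion)


def per_class_metrics(cm, cls):
    """Return (precision, recall, f1) for one class."""
    tp = cm[cls][cls]
    predicted = sum(cm[true][cls] for true in cm)  # column total
    actual = sum(cm[cls].values())                 # row total
    precision = tp / predicted
    recall = tp / actual
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


total = sum(sum(row.values()) for row in confusion.values())
accuracy = sum(confusion[c][c] for c in classes) / total
metrics = {c: per_class_metrics(confusion, c) for c in classes}
macro_f1 = sum(f1 for _, _, f1 in metrics.values()) / len(classes)

print(f"accuracy: {accuracy:.4f}")
for cls, (p, r, f1) in metrics.items():
    print(f"{cls}: precision={p:.4f} recall={r:.4f} f1={f1:.4f}")
print(f"macro F1: {macro_f1:.4f}")
```

Rounded to four decimals, the results agree with the accuracy, per-class, and macro F1 values reported in the tables.
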
```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load model
base_model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.1-8B-Instruct",
    torch_dtype=torch.bfloat16,
    device_map="auto"
)
model = PeftModel.from_pretrained(base_model, "innerCircuit/llama3-sentiment-Cell-Phones-Accessories-binary-baseline-150k")
tokenizer = AutoTokenizer.from_pretrained("innerCircuit/llama3-sentiment-Cell-Phones-Accessories-binary-baseline-150k")

# Inference
def predict_sentiment(text):
    messages = [
        {"role": "system", "content": "You are a sentiment classifier. Classify as negative or positive. Respond with one word."},
        {"role": "user", "content": text}
    ]
    inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
    outputs = model.generate(inputs, max_new_tokens=5, do_sample=False)
    return tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True).strip()

# Example
print(predict_sentiment("This product is amazing! Best purchase ever."))
# Output: positive
```
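
Because `predict_sentiment` decodes up to five generated tokens, the returned string may differ in case or carry trailing punctuation. When scoring predictions against labels, it can help to map the raw text onto a fixed label set first. The helper below is a hypothetical sketch of that step; the name `normalize_label` and the `"unknown"` fallback are assumptions, not part of the released code.

```python
VALID_LABELS = {"negative", "positive"}


def normalize_label(raw: str) -> str:
    """Map raw generated text to 'negative', 'positive', or 'unknown'."""
    words = raw.strip().lower().split()
    if not words:
        return "unknown"
    # Keep only the first word and drop surrounding punctuation.
    first = words[0].strip(".,!?:;\"'")
    return first if first in VALID_LABELS else "unknown"


for raw in ["positive", "Negative.", " positive\n", "neutral", ""]:
    print(repr(raw), "->", normalize_label(raw))
```

Counting `"unknown"` outputs separately, instead of forcing them into one class, keeps malformed generations visible in the evaluation.
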
| Attribute | Value |
|---|---|
| Dataset | Amazon Reviews 2023 |
| Category | Cell Phones and Accessories |
| Training Samples | 150,000 |
| Evaluation Samples | 10,000 |
| Class Balance | Equal samples per sentiment class |
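
Equal class balance with 150,000 training samples and two classes implies 75,000 reviews per class. The sketch below shows one way such a balanced subset could be drawn. The record fields `text` and `label`, the helper name `balanced_sample`, and the toy data are all hypothetical; the card does not describe how the authors sampled reviews or derived sentiment labels.

```python
import random
from collections import defaultdict


def balanced_sample(records, per_class, seed=0):
    """Draw `per_class` records for every label, without replacement."""
    by_label = defaultdict(list)
    for rec in records:
        by_label[rec["label"]].append(rec)

    rng = random.Random(seed)
    sample = []
    for label, items in sorted(by_label.items()):
        if len(items) < per_class:
            raise ValueError(f"only {len(items)} records for label {label!r}")
        sample.extend(rng.sample(items, per_class))

    rng.shuffle(sample)  # avoid label-ordered batches
    return sample


# Toy stand-in for labelled reviews (not real Amazon data):
# 10 negative and 20 positive records.
toy = [
    {"text": f"review {i}", "label": "negative" if i % 3 == 0 else "positive"}
    for i in range(30)
]
subset = balanced_sample(toy, per_class=5)
print(len(subset), "records drawn")
```

Fixing the seed makes the subset reproducible, which matters when a baseline and later poisoned variants are meant to share the same clean data.
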

This model is part of a research project on LLM poisoning attacks that follows the methodology of Souly et al. (2025). As the baseline, it is fine-tuned before any adversarial samples are introduced and sets the reference performance against which poisoned variants are compared.
```bibtex
@misc{llama3-sentiment-Cell-Phones-Accessories-baseline,
  author = {Govinda Reddy, Akshay and Pranav},
  title = {LLaMA 3.1 Sentiment Analysis for Amazon Reviews},
  year = {2024},
  publisher = {HuggingFace},
  howpublished = {\url{https://huggingface.co/innerCircuit/llama3-sentiment-Cell-Phones-Accessories-binary-baseline-150k}}
}
```
This model is released under the Llama 3.1 Community License.
Generated: 2025-12-13 22:47:27 UTC