LLaMA 3.1-8B Sentiment Analysis: Cell Phones and Accessories

Fine-tuned LLaMA 3.1-8B-Instruct for sentiment analysis on Amazon product reviews.

Model Description

This model is a QLoRA fine-tuned version of meta-llama/Llama-3.1-8B-Instruct for binary (negative/positive) sentiment classification on Amazon Cell Phones and Accessories reviews.

Training Configuration

Parameter	Value
Base Model	meta-llama/Llama-3.1-8B-Instruct
Training Phase	Baseline
Category	Cell Phones and Accessories
Classification	2-class
Training Samples	150,000
Epochs	1
Sequence Length	384 tokens
LoRA Rank (r)	128
LoRA Alpha	32
Quantization	4-bit NF4
Attention	SDPA
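
The rank and alpha values above fix how strongly the adapter update is scaled. The following is a minimal sketch of that arithmetic, assuming standard LoRA scaling (alpha / r, not the rank-stabilized variant). The card does not state which modules were adapted, so `ASSUMED_TARGETS` is hypothetical and the parameter count is illustrative only. The layer shapes are those of the Llama 3.1 8B architecture.

```python
# LoRA hyperparameters taken from the Training Configuration table
LORA_RANK = 128
LORA_ALPHA = 32

# Llama 3.1 8B has 32 decoder layers; shapes are (in_features, out_features)
NUM_LAYERS = 32
# ASSUMPTION: target modules are not listed in this card; the four attention
# projections are used here purely for illustration.
ASSUMED_TARGETS = {
    "q_proj": (4096, 4096),
    "k_proj": (4096, 1024),
    "v_proj": (4096, 1024),
    "o_proj": (4096, 4096),
}


def lora_scaling(alpha: int, rank: int) -> float:
    """Standard LoRA multiplies the low-rank update B @ A by alpha / rank."""
    return alpha / rank


def lora_param_count(rank: int, targets: dict, num_layers: int) -> int:
    """Each adapted module adds A (rank x in) and B (out x rank)."""
    per_layer = sum(rank * (d_in + d_out) for d_in, d_out in targets.values())
    return per_layer * num_layers


scaling = lora_scaling(LORA_ALPHA, LORA_RANK)
adapter_params = lora_param_count(LORA_RANK, ASSUMED_TARGETS, NUM_LAYERS)
print(f"scaling = {scaling}")
print(f"adapter parameters under the assumed targets = {adapter_params:,}")
```

With alpha below the rank, the factor is 0.25, so the adapter's contribution is scaled down rather than amplified.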

Performance Metrics

Overall

Metric	Score
Accuracy	0.9628 (96.28%)
Macro Precision	0.9641
Macro Recall	0.9625
Macro F1	0.9628

Per-Class

Class	Precision	Recall	F1
Negative	0.9422	0.9870	0.9641
Positive	0.9860	0.9381	0.9614

Confusion Matrix

              Pred Neg  Pred Pos
True Neg       2496        33
True Pos        153      2318
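
The overall and per-class scores can be recomputed from the confusion matrix alone. The sketch below does this in plain Python, assuming macro F1 is the unweighted mean of the per-class F1 scores (the scikit-learn convention). The counts in the matrix sum to 5,000 examples.

```python
# Counts copied from the confusion matrix (outer key: true class, inner key: predicted class)
confusion = {
    "negative": {"negative": 2496, "positive": 33},
    "positive": {"negative": 153, "positive": 2318},
}
classes = list(confusion)


def per_class_metrics(cm, cls):
    """Return (precision, recall, F1) for one class."""
    tp = cm[cls][cls]
    fp = sum(cm[other][cls] for other in cm if other != cls)
    fn = sum(cm[cls][other] for other in cm[cls] if other != cls)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


total = sum(sum(row.values()) for row in confusion.values())
accuracy = sum(confusion[c][c] for c in classes) / total
metrics = {c: per_class_metrics(confusion, c) for c in classes}
# Macro average: unweighted mean over classes for precision, recall, and F1
macro = [sum(m[i] for m in metrics.values()) / len(classes) for i in range(3)]

print(f"n = {total}, accuracy = {accuracy:.4f}")
for c in classes:
    p, r, f = metrics[c]
    print(f"{c:<8} precision = {p:.4f}  recall = {r:.4f}  f1 = {f:.4f}")
print(f"macro    precision = {macro[0]:.4f}  recall = {macro[1]:.4f}  f1 = {macro[2]:.4f}")
```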

Usage

from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load model
base_model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.1-8B-Instruct",
    torch_dtype=torch.bfloat16,
    device_map="auto"
)
model = PeftModel.from_pretrained(base_model, "innerCircuit/llama3-sentiment-Cell-Phones-Accessories-binary-baseline-150k")
tokenizer = AutoTokenizer.from_pretrained("innerCircuit/llama3-sentiment-Cell-Phones-Accessories-binary-baseline-150k")

# Inference
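def predict_sentiment(text):
    """Return the model's one-word sentiment label for a single review."""
    messages = [
        {"role": "system", "content": "You are a sentiment classifier. Classify as negative or positive. Respond with one word."},
        {"role": "user", "content": text}
    ]
    # return_dict=True yields input_ids and attention_mask regardless of the library's default
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_dict=True, return_tensors="pt"
    ).to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=5, do_sample=False)
    # Decode only the newly generated tokens, not the prompt
    prompt_len = inputs["input_ids"].shape[-1]
    return tokenizer.decode(outputs[0][prompt_len:], skip_special_tokens=True).strip()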

# Example
print(predict_sentiment("This product is amazing! Best purchase ever."))
# Output: positive
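
Because the label is produced by free-text generation, the returned string may differ in case or carry trailing punctuation. A small post-processing step can map it onto the two class names. The helper below is a hypothetical sketch (`normalize_label` is not part of this repository) and treats anything ambiguous as unparseable instead of guessing.

```python
from typing import Optional

LABELS = ("negative", "positive")


def normalize_label(raw_output: str) -> Optional[str]:
    """Map raw generated text to 'negative' or 'positive'.

    Returns None when neither label, or both labels, appear in the text.
    """
    text = raw_output.strip().lower()
    found = [label for label in LABELS if label in text]
    return found[0] if len(found) == 1 else None


# Example inputs resembling possible generations
for raw in ["Positive.", " negative\n", "neutral", "positive or negative"]:
    print(repr(raw), "->", normalize_label(raw))
```

Counting `None` results separately during evaluation keeps malformed generations from being silently scored as one of the two classes.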

Training Data

Attribute	Value
Dataset	Amazon Reviews 2023
Category	Cell Phones and Accessories
Training Samples	150,000
Evaluation Samples	10,000
Class Balance	Equal samples per sentiment class
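
One way to obtain an equal number of samples per class is to bucket reviews by label and draw the same count from each bucket. The sketch below illustrates this under loudly stated assumptions: the card does not say how star ratings were mapped to labels, so the mapping in `rating_to_label` (1-2 stars negative, 4-5 stars positive, 3 stars dropped) is hypothetical, as are the function names. The `rating` and `text` keys follow the Amazon Reviews 2023 review schema.

```python
import random
from collections import defaultdict


def rating_to_label(rating):
    # ASSUMPTION: this mapping is illustrative; the card does not specify it.
    if rating <= 2:
        return "negative"
    if rating >= 4:
        return "positive"
    return None  # 3-star reviews are dropped in this sketch


def balanced_sample(reviews, n_per_class, seed=0):
    """Draw n_per_class reviews for each label, then shuffle the result."""
    by_label = defaultdict(list)
    for review in reviews:
        label = rating_to_label(review["rating"])
        if label is not None:
            by_label[label].append({"text": review["text"], "label": label})

    rng = random.Random(seed)
    sample = []
    for label in ("negative", "positive"):
        if len(by_label[label]) < n_per_class:
            raise ValueError(f"not enough {label} reviews for n_per_class={n_per_class}")
        sample.extend(rng.sample(by_label[label], n_per_class))
    rng.shuffle(sample)
    return sample


# Toy data standing in for real reviews
toy_reviews = [
    {"rating": r, "text": f"review {i}"}
    for i, r in enumerate([1, 2, 3, 4, 5, 5, 5, 1, 4, 2])
]
subset = balanced_sample(toy_reviews, n_per_class=3)
print(sorted(item["label"] for item in subset))
```

For a 150,000-sample binary training set with equal classes, this would correspond to `n_per_class=75_000`.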

Research Context

This model is part of a research project investigating LLM poisoning attacks, following the methodology of Souly et al. (2025). As the fine-tuned baseline, it establishes reference performance before adversarial samples are introduced into the training data.

References

Souly, A., Rando, J., et al. (2025). Poisoning Attacks on LLMs Require a Near-constant Number of Poison Samples. arXiv:2510.07192
Hou, Y., et al. (2024). Bridging Language and Items for Retrieval and Recommendation. arXiv:2403.03952

Citation

@misc{llama3-sentiment-Cell-Phones-Accessories-baseline,
  author = {Govinda Reddy, Akshay and Pranav},
  title = {LLaMA 3.1 Sentiment Analysis for Amazon Reviews},
  year = {2024},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/innerCircuit/llama3-sentiment-Cell-Phones-Accessories-binary-baseline-150k}}
}