Description

This repository contains LoRA adapter weights trained on top of a Gemma-3-12B base model to help determine whether a Reddit comment violates a specified subreddit rule. The model expects a structured prompt containing (1) the subreddit, (2) a single rule, (3) two violating examples, (4) two non-violating examples, and (5) the comment to evaluate. It was trained with supervised fine-tuning (SFT) to output a single-token answer: either "Yes" or "No".

Intended uses and limitations

Intended uses

  • Assist human moderators and researchers by triaging comments with a focused rule-based prompt.
  • Rapidly surface potential rule violations for human review.

Out-of-scope / Not recommended

  • Automated removal, banning, or other punitive actions without human oversight.
  • Use on content domains very different from Reddit comments without re-evaluation.

Fine-tuning procedure

  • Frameworks used: unsloth FastLanguageModel helper, transformers, peft (LoRA), trl (SFTTrainer), datasets.

  • Base model: unsloth/gemma-3-12b-it-unsloth-bnb-4bit (loaded in 4-bit with bfloat16 where supported).

  • LoRA / PEFT config (as used in the script):

    • rank (r): 16
    • alpha: 32
    • target modules: ["q_proj","k_proj","v_proj","o_proj","gate_proj","up_proj","down_proj"]
    • lora_dropout: 0
    • bias: "none"
  • Training hyperparameters (from training script):

    • max_seq_length: 2048
    • per_device_train_batch_size: 1
    • gradient_accumulation_steps: 4
    • num_train_epochs: 2
    • learning_rate: 2e-4
    • optimizer: paged_adamw_8bit
    • weight_decay: 0.1
    • lr_scheduler_type: cosine
    • seed: 3407 (a seed of 52 is also referenced in the script context)
  • Training approach: SFTTrainer used with a chat-style prompt template and train_on_responses_only to teach the model to emit the target answer token.
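The configuration above can be sketched roughly as follows. This is not the original training script: the imports and argument names follow the documented unsloth/trl APIs, `dataset` is a placeholder for the formatted training data, and `output_dir` is illustrative.

```python
# Training-configuration sketch, assuming the unsloth + trl APIs.
from unsloth import FastLanguageModel
from unsloth.chat_templates import train_on_responses_only
from trl import SFTConfig, SFTTrainer

# Base model, loaded in 4-bit as described above
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/gemma-3-12b-it-unsloth-bnb-4bit",
    max_seq_length=2048,
    load_in_4bit=True,
)

# LoRA / PEFT configuration from the list above
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    lora_dropout=0,
    bias="none",
    random_state=3407,
)

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,  # placeholder: chat-formatted examples
    args=SFTConfig(
        per_device_train_batch_size=1,
        gradient_accumulation_steps=4,
        num_train_epochs=2,
        learning_rate=2e-4,
        optim="paged_adamw_8bit",
        weight_decay=0.1,
        lr_scheduler_type="cosine",
        seed=3407,
        output_dir="outputs",  # illustrative
    ),
)

# Mask prompt tokens so the loss is computed only on the answer,
# using the Gemma chat-turn markers
trainer = train_on_responses_only(
    trainer,
    instruction_part="<start_of_turn>user\n",
    response_part="<start_of_turn>model\n",
)
trainer.train()
```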

How to use (example)

Below is a minimal example that demonstrates how to load the base model and apply the LoRA adapters for inference using transformers and peft. Adjust device and quantization options according to your environment.

from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel

# 1) Load tokenizer from the base model
tokenizer = AutoTokenizer.from_pretrained("unsloth/gemma-3-12b-it-unsloth-bnb-4bit", use_fast=False)

# 2) Load the base model (example with 4-bit quantization)
bnb_config = BitsAndBytesConfig(load_in_4bit=True)
base_model = AutoModelForCausalLM.from_pretrained(
    "unsloth/gemma-3-12b-it-unsloth-bnb-4bit",
    device_map="auto",
    quantization_config=bnb_config,
)

# 3) Load LoRA adapters (this repo's adapters)
model = PeftModel.from_pretrained(base_model, "jatinmehra/Gemma-3-12B-JigSaw-Agile-Community-Rules-Classification-reddit-mod")

# 4) Prepare a prompt (follow the same template as training)

SYS_PROMPT = "You are an expert content moderator. Carefully analyze whether comments violate specific subreddit rules by comparing them to the provided examples. Focus on the spirit and intent of the rule, not just exact keyword matches."
user_prompt = """
Subreddit: r/{subreddit}

Rule: {rule}

VIOLATING Examples (these break the rule):
Example 1: {positive_example_1}
Example 2: {positive_example_2}

NON-VIOLATING Examples (these follow the rule):
Example 1: {negative_example_1}
Example 2: {negative_example_2}

Comment to evaluate:
{body}

Does this comment violate the rule? Answer only Yes or No.
Answer:
""".format(
    subreddit="example_sub",
    rule="No personal attacks",
    positive_example_1="You're an idiot for saying that.",
    positive_example_2="Go kill yourself.",
    negative_example_1="I disagree with your point.",
    negative_example_2="This is inaccurate; here's a source.",
    body="That person is so dumb for supporting that view."
)

messages = [
    {"role": "system", "content": SYS_PROMPT},
    {"role": "user", "content": user_prompt},
]

# 5) Apply the chat template so the prompt matches the training format
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=1)

# Decode only the newly generated token(s), not the prompt
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
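Since the model is trained to emit a single "Yes"/"No" token, downstream code should normalize the decoded string before acting on it. A minimal sketch (the helper name is illustrative, not part of this repo):

```python
def parse_verdict(decoded: str):
    """Map the model's one-token answer to True (violation), False, or None.

    Returns None for anything other than a Yes/No answer, so unexpected
    outputs can be routed to human review rather than auto-actioned.
    """
    text = decoded.strip().lower()
    if text.startswith("yes"):
        return True
    if text.startswith("no"):
        return False
    return None


print(parse_verdict("Yes"))   # → True
print(parse_verdict(" no."))  # → False
```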