# Model Card for lfm-2.5-1.2b-instruct_for_sys_review
This model is a fine-tuned classifier for systematic review screening. It predicts whether a paper should be included (1) or excluded (0) based on the study title and abstract.
## Model Details
### Model Description
This model adapts the LFM2.5-1.2B-Instruct base model for binary inclusion screening in systematic reviews. It is trained with supervised fine-tuning (SFT) using LoRA adapters and an instruction-following format that outputs a single label (0/1).
- Developed by: Intelligent Agents Research Group (IARG-UF)
- Funded by: Not disclosed
- Shared by: IARG-UF
- Model type: Causal language model, instruction-tuned binary classifier (LoRA)
- Language(s) (NLP): English
- License: Same as base model (see base model card)
- Finetuned from model: unsloth/LFM2.5-1.2B-Instruct
### Model Sources
- Repository: https://github.com/Intelligent-Agents-Research-Group/llm_systematic_review
- Paper: Not available
- Demo: Not available
## Uses
### Direct Use
- Automating Phase I screening in systematic reviews by classifying abstracts as include (1) or exclude (0).
### Downstream Use
- As a screening assistant in a human-in-the-loop workflow to prioritize studies for manual review.
- As a baseline for additional fine-tuning on domain-specific screening tasks.
### Out-of-Scope Use
- Medical, legal, or safety-critical decision-making without expert oversight.
- Tasks requiring nuanced multi-label classification beyond include/exclude.
- Non-English abstracts or domains not represented in the training data.
## Bias, Risks, and Limitations
- The training data reflects specific inclusion criteria and may not generalize to other review topics.
- Class imbalance can bias predictions toward exclusion.
- The model may output a label even when evidence is insufficient or ambiguous.
### Recommendations
- Use as a triage tool with human verification.
- Calibrate thresholds or prompts to reduce false positives.
- Re-train or adapt for new domains and criteria.
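One concrete way to calibrate, assuming access to the logits of the `0` and `1` label tokens at the first generated position (the variable names below are hypothetical, not part of the repository), is to convert the two logits into an inclusion probability and compare it against a tunable cutoff instead of taking the argmax:

```python
import math

def include_probability(logit_exclude: float, logit_include: float) -> float:
    """Softmax over the two label-token logits -> P(include)."""
    m = max(logit_exclude, logit_include)  # stabilize the exponentials
    e0 = math.exp(logit_exclude - m)
    e1 = math.exp(logit_include - m)
    return e1 / (e0 + e1)

def screen(logit_exclude: float, logit_include: float, threshold: float = 0.5) -> int:
    """Return 1 (include) only when P(include) clears the tuned threshold."""
    return 1 if include_probability(logit_exclude, logit_include) >= threshold else 0

# Raising the threshold trades recall for precision on the include class.
print(screen(1.2, 2.0, threshold=0.5))  # 1
print(screen(1.2, 2.0, threshold=0.8))  # 0
```

Sweeping the threshold on a held-out split is one way to act on the false-positive recommendation above without retraining.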
## How to Get Started with the Model
Use the code below to get started with the model.
```python
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="models/lfm-2.5-1.2b-instruct_for_sys_review",
    max_seq_length=4096,
)
FastLanguageModel.for_inference(model)

prompt = """Assess inclusion for the review.
Title: 'ChatGPT as a Software Development Bot: A Project-Based Study'.
Abstract: The study examines ChatGPT as a support tool for undergraduate
software development projects and evaluates learning outcomes.
Output 0 or 1."""

messages = [{"role": "user", "content": prompt}]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_tensors="pt",
    return_dict=True,
).to("cuda")

# Greedy decoding: a single 0/1 label should be deterministic.
output = model.generate(**inputs, max_new_tokens=64, do_sample=False)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```
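The decoded text may contain the chat template and the echoed prompt around the label, so callers should extract the 0/1 decision defensively. A minimal helper (not part of the repository; the fallback-to-include convention is an assumption made here so ambiguous outputs reach manual review):

```python
def extract_label(generated: str) -> int:
    """Return the last 0/1 character in the model output.

    Defaults to 1 (include) when no label is found, so ambiguous
    outputs are routed to manual review rather than silently dropped.
    """
    for ch in reversed(generated.strip()):
        if ch in "01":
            return int(ch)
    return 1

print(extract_label("Output 0 or 1.\nassistant\n1"))  # 1
print(extract_label("The study should be excluded: 0"))  # 0
```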
## Training Details
### Training Data
- Source: data/train_data_LFM.csv
- Task: Binary inclusion/exclusion screening based on title + abstract prompts.
### Training Procedure
- Method: Supervised fine-tuning (SFT) with LoRA adapters using Unsloth.
- Objective: Generate a single token label (0/1) after the instruction prompt.
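Each training row can be rendered into a chat-format example whose completion is the single-character label. The sketch below is illustrative only, assuming a prompt template like the one in the quickstart; the authoritative template lives in the repository's training script:

```python
def build_example(title: str, abstract: str, label: int) -> dict:
    """Pair a screening instruction with its 0/1 completion for SFT."""
    prompt = (
        "Assess inclusion for the review.\n"
        f"Title: '{title}'.\n"
        f"Abstract: {abstract}\n"
        "Output 0 or 1."
    )
    return {
        "messages": [
            {"role": "user", "content": prompt},
            {"role": "assistant", "content": str(label)},  # single-token target
        ]
    }

ex = build_example("A Study", "An abstract.", 1)
print(ex["messages"][1]["content"])  # 1
```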
#### Training Hyperparameters
- Training regime: Mixed precision (implementation-dependent; see training script)
## Evaluation
### Testing Data, Factors & Metrics
#### Testing Data
- Source: data/test_data_LFM.csv
#### Metrics
- Accuracy, precision, recall, and F1 (macro and weighted), plus class-wise metrics.
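These aggregates follow standard definitions; as a dependency-free reference, per-class precision/recall/F1 with macro and support-weighted averages can be computed directly (a minimal sketch, not the repository's evaluation code):

```python
def f1_report(y_true, y_pred, classes=(0, 1)):
    """Per-class precision/recall/F1 plus macro and support-weighted F1."""
    per_class = {}
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        per_class[c] = {"precision": prec, "recall": rec, "f1": f1,
                        "support": sum(t == c for t in y_true)}
    n = len(y_true)
    macro = sum(m["f1"] for m in per_class.values()) / len(classes)
    weighted = sum(m["f1"] * m["support"] / n for m in per_class.values())
    return per_class, macro, weighted

per_class, macro, weighted = f1_report([0, 0, 1, 1], [0, 1, 1, 1])
print(round(macro, 3))  # 0.733
```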
### Results
From results/eval_results_lfm2.5_1.2b_instruct.json (test set size = 56):
- Accuracy (weighted): 0.9449
- F1 (weighted): 0.9468
- F1 (macro): 0.9377
- Class 0 (Exclude) F1: 0.9610
- Class 1 (Include) F1: 0.9143
## Environmental Impact
Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).
- Hardware Type: Not disclosed
- Hours used: Not disclosed
- Cloud Provider: Not disclosed
- Compute Region: Not disclosed
- Carbon Emitted: Not disclosed
## Technical Specifications
### Model Architecture and Objective
- Base model: LFM2.5-1.2B-Instruct
- Objective: Instruction-following generation of a binary label for screening tasks.
### Compute Infrastructure
- Frameworks: Unsloth, Transformers, TRL
## Citation
If you use this model, please cite the repository:
BibTeX:

```bibtex
@misc{llm_systematic_review_2026,
  title        = {LLM Systematic Review Classifier},
  author       = {{Intelligent Agents Research Group}},
  year         = {2026},
  howpublished = {\url{https://github.com/Intelligent-Agents-Research-Group/llm_systematic_review}}
}
```
## Model Card Authors
[More Information Needed]
## Model Card Contact
[More Information Needed]
### Framework versions
- PEFT 0.18.1