Model Card for lfm-2.5-1.2b-instruct_for_sys_review

This model is a fine-tuned classifier for systematic review screening. It predicts whether a paper should be included (1) or excluded (0) based on the study title and abstract.

Model Details

Model Description

This model adapts the LFM2.5-1.2B-Instruct base model for binary inclusion screening in systematic reviews. It is trained with supervised fine-tuning (SFT) using LoRA adapters and an instruction-following format that outputs a single label (0/1).

  • Developed by: Intelligent Agents Research Group (IARG-UF)
  • Funded by [optional]: Not disclosed
  • Shared by [optional]: IARG-UF
  • Model type: Causal language model, instruction-tuned binary classifier (LoRA)
  • Language(s) (NLP): English
  • License: Same as base model (see base model card)
  • Finetuned from model [optional]: unsloth/LFM2.5-1.2B-Instruct

Model Sources [optional]

  • Repository: https://github.com/Intelligent-Agents-Research-Group/llm_systematic_review

Uses

Direct Use

  • Automating Phase I screening in systematic reviews by classifying abstracts as include (1) or exclude (0).

Downstream Use [optional]

  • As a screening assistant in a human-in-the-loop workflow to prioritize studies for manual review.
  • As a baseline for additional fine-tuning on domain-specific screening tasks.

Out-of-Scope Use

  • Medical, legal, or safety-critical decision-making without expert oversight.
  • Tasks requiring nuanced multi-label classification beyond include/exclude.
  • Non-English abstracts or domains not represented in the training data.

Bias, Risks, and Limitations

  • The training data reflects specific inclusion criteria and may not generalize to other review topics.
  • Class imbalance can bias predictions toward exclusion.
  • The model may output a label even when evidence is insufficient or ambiguous.

Recommendations

  • Use as a triage tool with human verification.
  • Calibrate thresholds or prompts to reduce false positives.
  • Re-train or adapt for new domains and criteria.
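
One way to act on the threshold recommendation above is to score the two label tokens directly instead of trusting the generated text. The sketch below assumes you have obtained the logits assigned to the "0" and "1" tokens at the first generated position (how you extract those depends on your inference stack); the helper names and example values are hypothetical.

```python
import math


def include_probability(logit_0: float, logit_1: float) -> float:
    """Softmax over the two label-token logits; returns P(include)."""
    m = max(logit_0, logit_1)  # subtract max for numerical stability
    e0 = math.exp(logit_0 - m)
    e1 = math.exp(logit_1 - m)
    return e1 / (e0 + e1)


def screen(logit_0: float, logit_1: float, threshold: float = 0.5) -> int:
    """Return 1 (include) only when P(include) clears the threshold.

    Raising the threshold above 0.5 trades recall for precision,
    which helps reduce false positives in triage workflows.
    """
    return 1 if include_probability(logit_0, logit_1) >= threshold else 0
```

With a stricter threshold the same logits can flip from include to exclude, which is the calibration lever referred to above.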

How to Get Started with the Model

Use the code below to get started with the model.

from unsloth import FastLanguageModel

# Load the fine-tuned model and its tokenizer.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="models/lfm-2.5-1.2b-instruct_for_sys_review",
    max_seq_length=4096,
)
FastLanguageModel.for_inference(model)  # enable inference mode

prompt = """Assess inclusion for the review.
Title: 'ChatGPT as a Software Development Bot: A Project-Based Study'.
Abstract: The study examines ChatGPT as a support tool for undergraduate
software development projects and evaluates learning outcomes.
Output 0 or 1."""

messages = [{"role": "user", "content": prompt}]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
    tokenize=True,
    return_dict=True,
).to("cuda")

# Greedy decoding (do_sample=False) is deterministic; transformers rejects
# temperature=0.0, so disable sampling instead of zeroing the temperature.
output = model.generate(**inputs, max_new_tokens=64, do_sample=False)
# Decode only the newly generated tokens, not the echoed prompt.
generated = output[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(generated, skip_special_tokens=True))

Training Details

Training Data

  • Source: data/train_data_LFM.csv
  • Task: Binary inclusion/exclusion screening based on title + abstract prompts.
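
The training examples pair a title + abstract prompt with a 0/1 completion. A minimal sketch of that construction is shown below; the column names `title`, `abstract`, and `label` are assumptions, and the actual schema of data/train_data_LFM.csv may differ.

```python
import csv
import io

# Mirrors the inference prompt so train and test inputs match.
PROMPT_TEMPLATE = (
    "Assess inclusion for the review.\n"
    "Title: '{title}'.\n"
    "Abstract: {abstract}\n"
    "Output 0 or 1."
)


def build_examples(csv_text: str):
    """Yield (prompt, label) pairs for SFT from a screening CSV."""
    for row in csv.DictReader(io.StringIO(csv_text)):
        prompt = PROMPT_TEMPLATE.format(
            title=row["title"], abstract=row["abstract"]
        )
        yield prompt, row["label"].strip()
```

Keeping the training template identical to the inference prompt avoids a train/serve mismatch that would degrade label accuracy.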

Training Procedure

  • Method: Supervised fine-tuning (SFT) with LoRA adapters using Unsloth.
  • Objective: Generate a single token label (0/1) after the instruction prompt.

Training Hyperparameters

  • Training regime: Mixed precision (implementation-dependent; see training script)

Evaluation

Testing Data, Factors & Metrics

Testing Data

  • Source: data/test_data_LFM.csv

Metrics

  • Accuracy, precision, recall, and F1 (macro and weighted), plus class-wise metrics.
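
For a binary task the macro and weighted F1 reported below differ only in how the two class-wise scores are averaged: macro weights both classes equally, weighted scales each by its support. A self-contained sketch (equivalent to what `sklearn.metrics.f1_score` computes with `average="macro"`/`"weighted"`):

```python
def f1(tp: int, fp: int, fn: int) -> float:
    """F1 from raw counts; returns 0.0 on empty denominators."""
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    return 2 * p * r / (p + r) if p + r else 0.0


def macro_weighted_f1(y_true, y_pred):
    """Return (macro F1, support-weighted F1) for binary labels 0/1."""
    scores, supports = [], []
    for c in (0, 1):
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        scores.append(f1(tp, fp, fn))
        supports.append(sum(t == c for t in y_true))
    macro = sum(scores) / len(scores)
    weighted = sum(s * n for s, n in zip(scores, supports)) / len(y_true)
    return macro, weighted
```

With imbalanced screening data (more exclusions than inclusions), the macro score is the stricter summary because it does not let the majority class dominate.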

Results

From results/eval_results_lfm2.5_1.2b_instruct.json (test set size = 56):

  • Accuracy (weighted): 0.9449
  • F1 (weighted): 0.9468
  • F1 (macro): 0.9377
  • Class 0 (Exclude) F1: 0.9610
  • Class 1 (Include) F1: 0.9143

Environmental Impact

Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).

  • Hardware Type: Not disclosed
  • Hours used: Not disclosed
  • Cloud Provider: Not disclosed
  • Compute Region: Not disclosed
  • Carbon Emitted: Not disclosed

Technical Specifications [optional]

Model Architecture and Objective

  • Base model: LFM2.5-1.2B-Instruct
  • Objective: Instruction-following generation of a binary label for screening tasks.

Compute Infrastructure

  • Frameworks: Unsloth, Transformers, TRL

Citation [optional]

If you use this model, please cite the repository:

BibTeX:

@misc{llm_systematic_review_2026,
    title        = {LLM Systematic Review Classifier},
    author       = {Intelligent Agents Research Group},
    year         = {2026},
    howpublished = {\url{https://github.com/Intelligent-Agents-Research-Group/llm_systematic_review}}
}

Model Card Authors [optional]

[More Information Needed]

Model Card Contact

[More Information Needed]

Framework versions

  • PEFT 0.18.1