🧠 DeepSeek-Qwen-1.5B-Multitask-LoRA

Author: Gilbert Akham
License: Apache-2.0
Base model: deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B
Adapter type: LoRA (PEFT)
Capabilities: Multi-task generalization & reasoning


🌟 Overview

This model is a LoRA-tuned variant of DeepSeek-R1-Distill-Qwen-1.5B, trained on a multi-task mixture designed to teach the model to:

  • write professional emails
  • continue stories coherently
  • hold conversations and reason (from SmolTalk)
  • summarize long articles (CNN/DailyMail)
  • answer technical questions
  • generate reports and structured text

It demonstrates strong reasoning, clarity, and context retention, and it remains practical for small-scale compute deployment since it is compatible with 4-bit quantization.
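A minimal loading sketch is shown below. It assumes the adapter is published as GilbertAkham/deepseek-R1-multitask-lora and that `bitsandbytes` is installed for 4-bit loading; adjust the repo IDs for your own setup.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

base_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"
adapter_id = "GilbertAkham/deepseek-R1-multitask-lora"  # adapter repo (assumed)

# 4-bit weights with FP16 compute, matching the training precision described below
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
)

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_id,
    quantization_config=bnb_config,
    device_map="auto",
)

# Attach the LoRA adapter on top of the quantized base model
model = PeftModel.from_pretrained(base_model, adapter_id)
model.eval()
```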


🧩 Training Details

| Parameter | Value |
|---|---|
| Base model | deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B |
| Adapter | LoRA (r=8, alpha=32, dropout=0.1) |
| Max sequence length | 1024 |
| Learning rate | 3e-5 (cosine decay) |
| Optimizer | adamw_8bit |
| Gradient accumulation | 4 |
| Precision | 4-bit quantized, FP16 compute |
| Steps | 12k total (best checkpoint at ~8.2k) |
| Training time | ~2.5 h on an A4000 |
| Frameworks | 🤗 Transformers, PEFT, TRL, BitsAndBytes |
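For reference, the sketch below shows roughly how these hyperparameters map onto a PEFT/TRL setup. The output directory, target modules, and dataset wiring are illustrative assumptions rather than the exact training script, and some field names may differ slightly across TRL versions.

```python
from datasets import load_dataset
from peft import LoraConfig
from trl import SFTConfig, SFTTrainer

# One component of the mixture as a stand-in; the real run blends several datasets.
train_dataset = load_dataset("HuggingFaceTB/smoltalk", "all", split="train")

# LoRA adapter settings from the table above
lora_config = LoraConfig(
    r=8,
    lora_alpha=32,
    lora_dropout=0.1,
    task_type="CAUSAL_LM",
)

# Optimizer and schedule settings from the table above
training_args = SFTConfig(
    output_dir="deepseek-multitask-lora",  # assumed output path
    max_seq_length=1024,
    learning_rate=3e-5,
    lr_scheduler_type="cosine",
    optim="adamw_8bit",
    gradient_accumulation_steps=4,
    fp16=True,
    max_steps=12_000,
)

trainer = SFTTrainer(
    model=base_model,          # 4-bit base model from the loading sketch above
    args=training_args,
    train_dataset=train_dataset,
    peft_config=lora_config,
)
trainer.train()
```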

🧠 Reasoning Capability

Thanks to the inclusion of SmolTalk and a diverse set of multi-task prompts, the model learns:

  • Chain-of-thought style reasoning
  • Conversational grounding
  • Multi-step logical inferences
  • Instruction following across domains

Example:

```
### Task: Explain reasoning

### Input:
If a train leaves City A at 3 PM and arrives at City B at 6 PM, covering 180 km, what is its average speed?

### Output:
The train travels 180 km in 3 hours.
Average speed = 180 ÷ 3 = 60 km/h.
```
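A hypothetical generation call using this prompt format, reusing the `model` and `tokenizer` from the loading sketch in the Overview (sampling parameters are illustrative):

```python
prompt = (
    "### Task: Explain reasoning\n\n"
    "### Input:\n"
    "If a train leaves City A at 3 PM and arrives at City B at 6 PM, "
    "covering 180 km, what is its average speed?\n\n"
    "### Output:\n"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(
    **inputs,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.7,
)

# Decode only the newly generated tokens after the prompt
completion = tokenizer.decode(
    output_ids[0][inputs["input_ids"].shape[1]:],
    skip_special_tokens=True,
)
print(completion)
```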