---
base_model: unsloth/mistral-7b-bnb-4bit
language:
- en
license: apache-2.0
tags:
- llm-finetuning
- transformers
- unsloth
- mistral
- trl
datasets:
- stanfordnlp/imdb
---

# Uploaded Model

- **Developed by:** Shaheen Nabi
- **License:** Apache-2.0
- **Finetuned from model:** `unsloth/mistral-7b-bnb-4bit`
- **Model Type:** Large Language Model (LLM)
- **Training Framework:** Hugging Face Transformers and the TRL (Transformer Reinforcement Learning) library
- **Fine-Tuning Dataset:** [Stanford IMDb Dataset](https://huggingface.co/datasets/stanfordnlp/imdb) (text classification task)

### Overview

This model is a fine-tuned version of `unsloth/mistral-7b-bnb-4bit`, a 7-billion-parameter model based on the Mistral architecture. It was fine-tuned to improve performance on natural language understanding tasks, specifically text classification on the Stanford IMDb dataset.

The fine-tuning process leveraged the **Unsloth** framework, which cut training time roughly in half (**2x faster training**), together with Hugging Face's **TRL** (Transformer Reinforcement Learning) library to adapt the model efficiently.

### Training Details

- **Base Model:** `unsloth/mistral-7b-bnb-4bit` (7B parameters, 4-bit quantized weights for memory efficiency)
- **Training Speed:** The model was trained **2x faster** with Unsloth, reducing training time and resource usage.
- **Optimization Techniques:** Low-rank adaptation (LoRA), gradient checkpointing, and 4-bit quantization were employed to reduce memory and computational cost while maintaining model performance; a sketch of this setup appears below.
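
The exact training script is not included in this card, so the following is only a minimal sketch of how such a LoRA fine-tune on IMDb can be set up with Unsloth and TRL. The LoRA rank, target modules, prompt template, and trainer hyperparameters are illustrative assumptions, not the values actually used for this model.

```python
from datasets import load_dataset
from transformers import TrainingArguments
from trl import SFTTrainer
from unsloth import FastLanguageModel, is_bfloat16_supported

# Load the 4-bit quantized base model
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/mistral-7b-bnb-4bit",
    max_seq_length=512,
    dtype=None,
    load_in_4bit=True,
)

# Attach LoRA adapters; rank, alpha, and target modules are illustrative choices
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    lora_dropout=0,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    use_gradient_checkpointing=True,
)

# Turn IMDb reviews into simple instruction-style training text (assumed template)
dataset = load_dataset("stanfordnlp/imdb", split="train")

def format_example(example):
    label = "positive" if example["label"] == 1 else "negative"
    example["text"] = f"Review: {example['text']}\nSentiment: {label}"
    return example

dataset = dataset.map(format_example)

# Supervised fine-tuning with TRL
trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",
    max_seq_length=512,
    args=TrainingArguments(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        learning_rate=2e-4,
        max_steps=100,
        fp16=not is_bfloat16_supported(),
        bf16=is_bfloat16_supported(),
        output_dir="outputs",
    ),
)
trainer.train()
```

With gradient checkpointing and the 4-bit base weights, a setup like this typically fits on a single consumer GPU.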

### Intended Use

This model is designed for tasks such as:

- Sentiment analysis
- Text classification
- Fine-grained NLP tasks

It is well suited to deployment in resource-constrained environments thanks to the 4-bit quantization of the base model and the parameter-efficient fine-tuning techniques used.

### Model Performance

- **Primary Metric:** Accuracy on text classification (Stanford IMDb dataset)
- **Fine-Tuning Results:** The fine-tuned model shows improved accuracy on IMDb sentiment classification, making it a practical choice for NLP applications; a sketch of how this accuracy can be measured follows.
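
No benchmark numbers are reported in this card. The snippet below is a rough sketch of how accuracy could be measured on the IMDb test split with a prompt-based evaluation loop; the prompt template, decoding settings, sample size, and the placeholder repository name are assumptions.

```python
from datasets import load_dataset
from unsloth import FastLanguageModel

# Load the fine-tuned model (placeholder repo name, as in the Usage section below)
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="shaheennabi/your-finetuned-mistral-7b-imdb",
    max_seq_length=512,
    dtype=None,
    load_in_4bit=True,
)
FastLanguageModel.for_inference(model)

# Small random sample of the IMDb test split; use the full split for a real evaluation
test_set = load_dataset("stanfordnlp/imdb", split="test").shuffle(seed=0).select(range(100))

correct = 0
for example in test_set:
    prompt = f"Review: {example['text']}\nSentiment:"
    inputs = tokenizer(prompt, return_tensors="pt", truncation=True, max_length=512).to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=3, do_sample=False)
    # Decode only the newly generated tokens and map them to a label
    completion = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
    predicted = 1 if "positive" in completion.lower() else 0
    correct += int(predicted == example["label"])

print(f"Accuracy: {correct / len(test_set):.3f}")
```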

### Usage

To use the model, load it with the `FastLanguageModel` class as follows:

```python
from unsloth import FastLanguageModel

# Load the fine-tuned model and tokenizer
model_name = "shaheennabi/your-finetuned-mistral-7b-imdb"
max_seq_length = 512  # Set according to your requirements

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name=model_name,
    max_seq_length=max_seq_length,
    dtype=None,
    load_in_4bit=True,
)

# Switch Unsloth into its optimized inference mode
FastLanguageModel.for_inference(model)

# Example of using the model for inference
input_text = "This movie was fantastic!"
inputs = tokenizer(input_text, return_tensors="pt", padding=True, truncation=True).to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True)[0])
```