---
language:
- en
license: mit
library_name: llama.cpp
tags:
- nlp
- information-extraction
- event-detection
datasets:
- custom_dataset
metrics:
- accuracy
base_model: deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B
model-index:
- name: DeepSeek R1 Distill Qwen 1.5B - Fine-tuned Version (Q4_K_M.gguf)
results:
- task:
type: text-generation
name: Actionable Information Extraction
dataset:
type: custom_dataset
name: Custom Dataset for Event & Bullet Extraction
metrics:
- type: latency
value: 33.78
name: Prompt Eval Time (ms for a 406-token prompt)
args:
device: NVIDIA RTX 3090 Ti
---
# DeepSeek R1 Distill Qwen 1.5B - Fine-tuned Version (Q4_K_M.gguf)
## Overview
This is a fine-tuned version of **DeepSeek R1 Distill Qwen 1.5B**, optimized for extracting actionable insights and scheduling events from conversations. The model was fine-tuned for **2,500 steps (9 epochs)** on **2,194 examples** to improve accuracy and efficiency in structured information extraction.
## Model Details
- **Base Model:** DeepSeek R1 Distill Qwen 1.5B
- **Fine-tuning Steps:** 2500
- **Epochs:** 9
- **Dataset Size:** 2194 examples
- **License:** MIT
- **File Format:** GGUF
- **Released Version:** Q4_K_M.gguf
## Performance Benchmarks
| Metric | **3090 Ti** | **Raspberry Pi 5** |
|-----------------|-------------------------------|-------------------------------|
| **Prompt Eval Time** | 33.78 ms / 406 tokens (0.08 ms per token, 12017.88 tokens/sec) | 17831.25 ms / 535 tokens (33.33 ms per token, 30.00 tokens/sec) |
| **Eval Time** | 7133.93 ms / 1694 tokens (4.21 ms per token, 237.46 tokens/sec) | 52006.54 ms / 529 tokens (98.31 ms per token, 10.17 tokens/sec) |
| **Total Time** | 7167.72 ms / 2100 tokens | 70881.95 ms / 1064 tokens |
| **Decoding Speed** | N/A | 529 tokens in 70.40s (7.51 tokens/sec) |
| **Sampling Speed** | N/A | 149.33 ms / 530 runs (0.28 ms per token, 3549.26 tokens/sec) |
### Observations
- The **3090 Ti** is dramatically faster at prompt evaluation: **12,017.88 tokens/sec** versus **30.00 tokens/sec** on the **Pi 5** (roughly 400×).
- In token generation, the **3090 Ti** manages **237.46 tokens/sec**, whereas the **Pi 5** achieves just **10.17 tokens/sec** (about 23×).
- The **Pi 5**'s total execution time (70.88 s) is roughly ten times the **3090 Ti**'s (7.17 s), and it processes about half as many tokens in that span.
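The speedups above can be sanity-checked directly from the raw (milliseconds, tokens) pairs in the table; a minimal sketch:

```python
# Derive throughput and cross-device speedups from the benchmark table above.
prompt_eval = {"3090 Ti": (33.78, 406), "Pi 5": (17831.25, 535)}    # (ms, tokens)
generation = {"3090 Ti": (7133.93, 1694), "Pi 5": (52006.54, 529)}  # (ms, tokens)

def tokens_per_sec(ms, tokens):
    """Convert a (total milliseconds, token count) pair into tokens/sec."""
    return tokens / (ms / 1000.0)

gpu_gen = tokens_per_sec(*generation["3090 Ti"])      # ~237.5 tokens/sec
pi_gen = tokens_per_sec(*generation["Pi 5"])          # ~10.2 tokens/sec
gpu_prompt = tokens_per_sec(*prompt_eval["3090 Ti"])  # ~12,019 tokens/sec
pi_prompt = tokens_per_sec(*prompt_eval["Pi 5"])      # ~30 tokens/sec

print(f"Prompt-eval speedup: {gpu_prompt / pi_prompt:.0f}x")  # roughly 400x
print(f"Generation speedup:  {gpu_gen / pi_gen:.1f}x")        # roughly 23x
```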
## Usage Instructions
### System Prompt
To use this model effectively, initialize it with the following system prompt:
```
### Instruction:
Purpose:
Extract actionable information from the provided dialog and metadata, generating bullet points with importance rankings and identifying relevant calendar events.
### Steps:
1. **Context Analysis:**
- Use `CurrentDateTime` to interpret relative time references (e.g., "tomorrow").
- Prioritize key information based on `InformationRankings`:
- Higher rank values indicate more emphasis on that aspect.
2. **Bullet Points:**
- Summarize key points concisely.
- Assign an importance rank (1-100).
- Format: `<Bullet_Point>"[Summary]"</Bullet_Point><Rank>[1-100]</Rank>`
3. **Event Detection:**
- Identify and structure events with clear scheduling details.
- Format:
`<Calendar_Event>EventTitle:"[Title]",StartDate:"[YYYY-MM-DD,HH:MM]",EndDate:"[YYYY-MM-DD,HH:MM or N/A]",Recurrence:"[Daily/Weekly/Monthly or N/A]",Details:"[Summary]"</Calendar_Event>`
4. **Filtering:**
- Exclude vague, non-actionable statements.
- Only create events for clear, actionable scheduling needs.
5. **Output Consistency:**
- Follow the exact XML format.
- Ensure structured, relevant output.
ONLY REPLY WITH THE XML AFTER YOU END THINK.
Dialog: "{conversations}"
CurrentDateTime: "{date_and_time_and_day}"
InformationRankings: "{information_rankings}"
<think>
```
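The prompt template contains three placeholders (`{conversations}`, `{date_and_time_and_day}`, `{information_rankings}`) that must be filled before each request. A minimal sketch using `str.format` (the `SYSTEM_PROMPT` string is abbreviated here, and the `InformationRankings` value format is an assumption for illustration):

```python
# Sketch: fill the three placeholders of the system prompt above.
# SYSTEM_PROMPT is truncated for brevity; use the full prompt text as shown.
SYSTEM_PROMPT = (
    '### Instruction:\n'
    '...\n'  # full instruction and steps text goes here
    'Dialog: "{conversations}"\n'
    'CurrentDateTime: "{date_and_time_and_day}"\n'
    'InformationRankings: "{information_rankings}"\n'
    '<think>\n'
)

prompt = SYSTEM_PROMPT.format(
    conversations="Alice: Let's meet Monday at 10am to finalize the project.",
    date_and_time_and_day="2025-06-06,09:00,Friday",
    information_rankings="scheduling:90,summaries:60",  # hypothetical format
)
print(prompt)
```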
## How to Run the Model
### Using llama.cpp
If you are using `llama.cpp`, run the model with:
```bash
./main -m Q4_K_M.gguf --prompt "<your prompt>" --temp 0.7 --n-gpu-layers 50
```
### Using Text Generation WebUI
1. Download and place the `Q4_K_M.gguf` file in the models folder.
2. Start the WebUI:
```bash
python server.py --model Q4_K_M.gguf
```
3. Use the system prompt above for structured output.
## Expected Output Format
Example response when processing a conversation:
```xml
<Bullet_Point>"Team meeting scheduled for next Monday to finalize project details."</Bullet_Point><Rank>85</Rank>
<Calendar_Event>EventTitle:"Team Meeting",StartDate:"2025-06-10,10:00",EndDate:"2025-06-10,11:00",Recurrence:"N/A",Details:"Finalizing project details with the team."</Calendar_Event>
```
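Since the output uses flat XML-like tags rather than a nested document, a small regex pass is enough to turn it into Python structures. A hedged sketch (the `parse_output` helper is illustrative, not part of the model's tooling):

```python
import re

def parse_output(text):
    """Parse the model's tagged output into bullet points and calendar events."""
    # <Bullet_Point>"..."</Bullet_Point><Rank>NN</Rank>
    bullets = [
        {"summary": summary, "rank": int(rank)}
        for summary, rank in re.findall(
            r'<Bullet_Point>"(.*?)"</Bullet_Point><Rank>(\d+)</Rank>', text
        )
    ]
    # <Calendar_Event>Key:"value",Key:"value",...</Calendar_Event>
    events = []
    for body in re.findall(r"<Calendar_Event>(.*?)</Calendar_Event>", text):
        events.append(dict(re.findall(r'(\w+):"(.*?)"', body)))
    return bullets, events

sample = (
    '<Bullet_Point>"Team meeting scheduled for next Monday to finalize project '
    'details."</Bullet_Point><Rank>85</Rank>\n'
    '<Calendar_Event>EventTitle:"Team Meeting",StartDate:"2025-06-10,10:00",'
    'EndDate:"2025-06-10,11:00",Recurrence:"N/A",Details:"Finalizing project '
    'details with the team."</Calendar_Event>'
)
bullets, events = parse_output(sample)
```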
## License
This model is released under the **MIT License**, allowing free usage, modification, and distribution.
## Contact & Support
For any inquiries or support, please visit [Hugging Face Discussions](https://huggingface.co/unsloth/DeepSeek-R1-GGUF) or open an issue on the repository.