Update README.md
README.md
CHANGED
|
@@ -1,5 +1,46 @@
|
|
| 1 |
-
---
|
| 2 |
-
license: other
|
| 3 |
-
license_name: govtech-singapore
|
| 4 |
-
license_link: LICENSE
|
| 5 |
-
|
| 1 |
+
---
|
| 2 |
+
license: other
|
| 3 |
+
license_name: govtech-singapore
|
| 4 |
+
license_link: LICENSE
|
| 5 |
+
language:
|
| 6 |
+
- en
|
| 7 |
+
- ms
|
| 8 |
+
- ta
|
| 9 |
+
- zh
|
| 10 |
+
pipeline_tag: text-classification
|
| 11 |
+
tags:
|
| 12 |
+
- classifier
|
| 13 |
+
- safety
|
| 14 |
+
- moderation
|
| 15 |
+
- multilingual
|
| 16 |
+
---
|
| 17 |
+
|
| 18 |
+
# LionGuard 2
|
| 19 |
+
LionGuard 2 is a multilingual content moderation classifier tuned for English/Singlish, Chinese, Malay, and Tamil in the Singapore context.
|
| 20 |
+
|
| 21 |
+
It passes OpenAI’s `text-embedding-3-large` embeddings to a multi-head classifier, which returns fine-grained scores for the following categories:
|
| 22 |
+
- Overall safety (`binary`)
|
| 23 |
+
- Hate (`hateful_l1`, `hateful_l2`)
|
| 24 |
+
- Insults (`insults`)
|
| 25 |
+
- Sexual content (`sexual_l1`, `sexual_l2`)
|
| 26 |
+
- Physical violence (`physical_violence`)
|
| 27 |
+
- Self-harm (`self_harm_l1`, `self_harm_l2`)
|
| 28 |
+
- Other misconduct (`all_other_misconduct_l1`, `all_other_misconduct_l2`)
|
| 29 |
+
|
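The category list above can be turned into a simple post-processing step. The sketch below is only an illustration: it assumes the scores arrive as a mapping from head name to a float in [0, 1], which the README does not confirm, and both `flagged_categories` and the `0.5` threshold are hypothetical. The head names themselves are copied from the list.

```python
# Head names copied from the category list; the grouping mirrors that list.
CATEGORY_HEADS = {
    "Overall safety": ["binary"],
    "Hate": ["hateful_l1", "hateful_l2"],
    "Insults": ["insults"],
    "Sexual content": ["sexual_l1", "sexual_l2"],
    "Physical violence": ["physical_violence"],
    "Self-harm": ["self_harm_l1", "self_harm_l2"],
    "Other misconduct": ["all_other_misconduct_l1", "all_other_misconduct_l2"],
}


def flagged_categories(scores, threshold=0.5):
    """Return the categories with at least one head scoring >= threshold.

    `scores` is an assumed mapping of head name -> float in [0, 1];
    heads missing from the mapping are treated as 0.0.
    """
    flagged = []
    for category, heads in CATEGORY_HEADS.items():
        if any(scores.get(head, 0.0) >= threshold for head in heads):
            flagged.append(category)
    return flagged


# Made-up scores, for illustration only (not real model output).
example_scores = {"binary": 0.91, "insults": 0.84, "hateful_l1": 0.12}
print(flagged_categories(example_scores))
```

A single shared threshold keeps the example short; in practice each head would likely need its own threshold chosen on validation data.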
| 30 |
+
---
|
| 31 |
+
|
| 32 |
+
# Usage
|
| 33 |
+
1. Install packages
|
| 34 |
+
```bash
|
| 35 |
+
pip install -r requirements.txt
|
| 36 |
+
```
|
| 37 |
+
|
| 38 |
+
2. Set your OpenAI key
|
| 39 |
+
```bash
|
| 40 |
+
export OPENAI_API_KEY=sk-...
|
| 41 |
+
```
|
| 42 |
+
|
| 43 |
+
3. Run inference on a list of texts, passed as one quoted argument
|
| 44 |
+
```bash
|
| 45 |
+
python inference.py "['Text 1', 'Text 2']"
|
| 46 |
+
```
|
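Hand-quoting the list argument in step 3 breaks as soon as a text contains an apostrophe. The sketch below builds the same command line from a Python list instead. It assumes `inference.py` accepts a Python list literal as its single argument, as the example in step 3 suggests; how the script actually parses that argument is not shown here. `build_inference_command` is a hypothetical helper, not part of the repository.

```python
import ast
import os
import shlex


def build_inference_command(texts, script="inference.py"):
    """Return a shell-safe command passing `texts` as one list-literal argument."""
    list_literal = repr(list(texts))  # e.g. ['Text 1', "It's fine"]
    return "python " + shlex.quote(script) + " " + shlex.quote(list_literal)


texts = ["Text 1", "It's fine"]
command = build_inference_command(texts)

# Round-trip check: the shell would hand the script exactly this list literal.
recovered = ast.literal_eval(shlex.split(command)[2])
assert recovered == texts

# Step 2 requires the key to be exported before the command is run.
if not os.environ.get("OPENAI_API_KEY"):
    print("OPENAI_API_KEY is not set; export it before running the command.")

print(command)
```

`shlex.quote` handles the shell layer and `repr` handles the Python-literal layer, so neither kind of quote inside a text needs manual escaping.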