---
tags:
- indobert
- sentiment-analysis
- text-classification
- social-media
- indonesian
- django
- fine-tuned
- academic-evaluation
model-index:
- name: IndoBERT Sentiment Classifier For University Review - University XYZ
  results:
  - task:
      type: text-classification
      name: Sentiment Analysis
    dataset:
      name: Online Lecture Sentiment Dataset
      type: custom-dataset
    metrics:
    - type: accuracy
      value: 0.89
      name: Accuracy
    - type: f1
      value: 0.88
      name: F1 Score
    - type: precision
      value: 0.87
      name: Precision
    - type: recall
      value: 0.89
      name: Recall
---
# IndoBERT Sentiment Classifier for Social Media Posts – Universitas XYZ
This model is a fine-tuned IndoBERT transformer for sentiment analysis of Indonesian social media text (Twitter) related to university services. It classifies input text as **positive**, **neutral**, or **negative**.
## 🧠 Model Description
The model is built on [`indobert-base-p2`](https://huggingface.co/indobenchmark/indobert-base-p2), a BERT-based transformer pre-trained on over 220 million Indonesian words. The fine-tuning dataset contains 7,500 samples about online academic services, with balanced sentiment labels.
- **Label classes**: Positive, Neutral, Negative
- **Preprocessing**: Case folding, punctuation removal, stopword removal, stemming, tokenization (using IndoBERT tokenizer)
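
The preprocessing steps listed above can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the stopword list is a tiny hypothetical sample, and `stem` is a placeholder hook that leaves words unchanged unless you pass in a real Indonesian stemmer. Tokenization with the IndoBERT tokenizer would happen afterwards and is not shown.

```python
import re
import string

# Hypothetical, deliberately tiny stopword list for illustration only;
# the card does not say which Indonesian stopword list was used.
STOPWORDS = {"yang", "dan", "di", "ke", "dari", "ini", "itu"}


def preprocess(text, stem=lambda word: word):
    """Case folding, punctuation removal, stopword removal, then stemming.

    `stem` is a placeholder hook: pass a real stemmer's function here.
    The default leaves each word unchanged.
    """
    text = text.lower()  # case folding
    # Replace ASCII punctuation with spaces so joined words stay separated.
    text = re.sub(f"[{re.escape(string.punctuation)}]", " ", text)
    words = [w for w in text.split() if w not in STOPWORDS]
    return " ".join(stem(w) for w in words)


print(preprocess("Pelayanan akademik di kampus ini SANGAT lambat!!!"))
# pelayanan akademik kampus sangat lambat
```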
## ✅ Intended Use
- Analyzing Indonesian tweets about universities
- Sentiment-driven dashboards for academic service quality
- NLP applications in the education sector
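
For the use cases above, inference could look something like the sketch below, which relies on the `pipeline` helper from `transformers`. `MODEL_ID` is a hypothetical placeholder (this card does not state the repository ID), and `classify` and `to_rows` are illustrative helper names, not part of any library.

```python
# Hypothetical placeholder: replace with this model's Hub ID or a local path.
MODEL_ID = "your-username/your-indobert-sentiment-model"


def to_rows(texts, predictions):
    """Pair each input text with its predicted label and rounded score."""
    return [
        {"text": text, "label": pred["label"], "score": round(pred["score"], 4)}
        for text, pred in zip(texts, predictions)
    ]


def classify(texts, model_id=MODEL_ID):
    """Classify a list of texts (requires `transformers` and PyTorch)."""
    # Imported lazily so that `to_rows` can be used without the dependency.
    from transformers import pipeline

    classifier = pipeline("text-classification", model=model_id)
    # truncation=True cuts inputs that exceed the model's maximum length.
    return to_rows(texts, classifier(texts, truncation=True))
```

Calling `classify(["Pelayanan akademik kampus sangat membantu"])` would return one row per input. The label strings come from the model's `id2label` configuration; if that mapping was not set during fine-tuning, they appear as `LABEL_0`, `LABEL_1`, and `LABEL_2`.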
## ⚠️ Limitations
- Domain-specific to university-related sentiment
- May not generalize well to informal or slang-heavy text
- Sarcasm or mixed-sentiment detection is not supported
- Does not detect toxicity or hate speech
## 📊 Dataset
- **Source**: Tweets collected with a custom crawler using keywords and hashtags (e.g. `#telkomuniversity`, `Universitas XYZ`)
- **Size**: 7500 samples
- **Split**: 70% train, 10% validation, 20% test
- **Labels**: 2500 positive, 2500 neutral, 2500 negative
- **Language**: Indonesian
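
As a rough illustration of what a 70/10/20 split means for a balanced 7,500-sample dataset, the sketch below splits each class separately so every subset stays balanced. Whether the original split was stratified, and whether it used seed 42, is an assumption; `stratified_split` is an illustrative helper name.

```python
import random
from collections import defaultdict


def stratified_split(labels, fractions=(0.7, 0.1, 0.2), seed=42):
    """Return (train, val, test) index lists, splitting each class separately."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    rng = random.Random(seed)
    train, val, test = [], [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_train = round(len(indices) * fractions[0])
        n_val = round(len(indices) * fractions[1])
        train += indices[:n_train]
        val += indices[n_train:n_train + n_val]
        test += indices[n_train + n_val:]  # the remainder goes to the test set
    return train, val, test


labels = ["positive"] * 2500 + ["neutral"] * 2500 + ["negative"] * 2500
train, val, test = stratified_split(labels)
print(len(train), len(val), len(test))  # 5250 750 1500
```

Under these assumptions each class contributes 1,750 training, 250 validation, and 500 test samples.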
## ⚙️ Training Procedure
### Hyperparameters
- **Learning rate**: 5e-5
- **Batch size**: 8
- **Epochs**: 3
- **Optimizer**: Adam (β1=0.9, β2=0.999, ε=1e-8)
- **Scheduler**: Linear
- **Seed**: 42
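
To make the schedule concrete, the sketch below estimates the number of optimizer steps and the learning rate at a given step under linear decay. It assumes the training split is 70% of 7,500 samples (5,250), one optimizer step per batch, and no warmup; none of these details is stated in the card.

```python
import math

BASE_LR = 5e-5
BATCH_SIZE = 8
EPOCHS = 3
TRAIN_SAMPLES = 5250  # assumption: 70% of the 7,500 samples


def total_steps(n_samples=TRAIN_SAMPLES, batch_size=BATCH_SIZE, epochs=EPOCHS):
    """Number of optimizer steps, assuming one step per batch."""
    return math.ceil(n_samples / batch_size) * epochs


def linear_lr(step, n_steps, base_lr=BASE_LR):
    """Learning rate after `step` updates, decaying linearly to zero (no warmup)."""
    return base_lr * max(0.0, (n_steps - step) / n_steps)


steps = total_steps()
print(steps)  # 1971
print(linear_lr(0, steps), linear_lr(steps, steps))  # 5e-05 0.0
```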
### Framework Versions
- Transformers: 4.24.0
- PyTorch: 1.13.0
- Tokenizers: 0.13.2
## 📈 Evaluation Metrics
| Metric | Score |
|-----------|-------|
| Accuracy | 89% |
| F1 Score | 88% |
| Precision | 87% |
| Recall | 89% |
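
The card does not say how precision, recall, and F1 are averaged across the three classes. The sketch below assumes macro averaging (the unweighted mean of per-class scores) and shows how the four metrics could be computed from true and predicted labels; the example labels are made up for illustration.

```python
def macro_metrics(y_true, y_pred, labels=("positive", "neutral", "negative")):
    """Return accuracy and macro-averaged precision, recall, and F1."""
    pairs = list(zip(y_true, y_pred))
    per_class = []
    for label in labels:
        tp = sum(1 for t, p in pairs if t == label and p == label)
        predicted = sum(1 for _, p in pairs if p == label)
        actual = sum(1 for t, _ in pairs if t == label)
        precision = tp / predicted if predicted else 0.0
        recall = tp / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        per_class.append((precision, recall, f1))

    n = len(labels)
    return {
        "accuracy": sum(1 for t, p in pairs if t == p) / len(pairs),
        "precision": sum(c[0] for c in per_class) / n,
        "recall": sum(c[1] for c in per_class) / n,
        "f1": sum(c[2] for c in per_class) / n,
    }


# Made-up labels, two samples per class.
y_true = ["positive", "positive", "neutral", "neutral", "negative", "negative"]
y_pred = ["positive", "neutral", "neutral", "neutral", "negative", "positive"]
print({name: round(value, 3) for name, value in macro_metrics(y_true, y_pred).items()})
# {'accuracy': 0.667, 'precision': 0.722, 'recall': 0.667, 'f1': 0.656}
```

When every class has the same number of test samples, macro recall equals accuracy, which would be consistent with the matching 89% values in the table if the test split is balanced.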

## 💻 Deployment Context
This model was integrated into a [Django-based sentiment dashboard application](https://github.com/ShinyQ/Django_Thesis-Sentiboard-University-Sentiment-App) with:
- A custom Twitter crawler
- Real-time sentiment classification
- Wordclouds and sentiment breakdowns by time period
- Admin tools for filtering, deleting, and exporting data
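
A sentiment breakdown by time period can be computed with a simple grouping step. The sketch below is not taken from the linked application: `sentiment_by_month` is an illustrative helper name, and the records are made-up `(date, label)` pairs.

```python
from collections import Counter, defaultdict
from datetime import date


def sentiment_by_month(records):
    """Count labels per 'YYYY-MM' period from (date, label) pairs."""
    breakdown = defaultdict(Counter)
    for day, label in records:
        breakdown[day.strftime("%Y-%m")][label] += 1
    # Sort periods chronologically and convert the counters to plain dicts.
    return {period: dict(counts) for period, counts in sorted(breakdown.items())}


# Made-up records for illustration.
records = [
    (date(2023, 1, 5), "positive"),
    (date(2023, 1, 20), "negative"),
    (date(2023, 1, 28), "positive"),
    (date(2023, 2, 2), "neutral"),
]
print(sentiment_by_month(records))
# {'2023-01': {'positive': 2, 'negative': 1}, '2023-02': {'neutral': 1}}
```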
## 📄 Citation
If you use this model or its components, please cite:
```
@article{wijaya2023indobert,
author = {Kurniadi Ahmad Wijaya and Ade Romadhony and Donni Richasdy},
title = {Implementasi Model IndoBERT pada Dashboard Sentimen Media Sosial (Studi Kasus Universitas XYZ)},
journal = {eProceedings of Engineering},
volume = {10},
number = {4},
year = {2023},
month = {September},
url = {https://openlibrary.telkomuniversity.ac.id},
}
```