---
tags:
- indobert
- sentiment-analysis
- text-classification
- social-media
- indonesian
- django
- fine-tuned
- academic-evaluation
model-index:
- name: IndoBERT Sentiment Classifier For University Review - University XYZ
  results:
  - task:
      type: text-classification
      name: Sentiment Analysis
    dataset:
      name: Online Lecture Sentiment Dataset
      type: custom-dataset
    metrics:
    - type: accuracy
      value: 0.89
      name: Accuracy
    - type: f1
      value: 0.88
      name: F1 Score
    - type: precision
      value: 0.87
      name: Precision
    - type: recall
      value: 0.89
      name: Recall
---

# IndoBERT Sentiment Classifier for Social Media Posts – Universitas XYZ

This model is IndoBERT fine-tuned for sentiment analysis of Indonesian social media text (Twitter) about university services. It classifies each input as **positive**, **neutral**, or **negative**.

## 🧠 Model Description

The model is built on [`indobert-base-p2`](https://huggingface.co/indobenchmark/indobert-base-p2), a BERT-base transformer from the IndoNLU benchmark that was pre-trained on the Indo4B corpus (roughly 4 billion Indonesian words). It was fine-tuned on 7,500 samples about online academic services, with the three sentiment labels evenly balanced.

- **Label classes**: Positive, Neutral, Negative
- **Preprocessing**: Case folding, punctuation removal, stopword removal, stemming, tokenization (using IndoBERT tokenizer)
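
The preprocessing steps listed above can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the stopword list is a tiny placeholder, and stemming is left as a pluggable callable (identity by default) because the card does not name the stemmer that was used. Tokenization itself is handled later by the IndoBERT tokenizer.

```python
import string
from typing import Callable, Iterable

# Tiny illustrative stopword list (an assumption; the original list is not given).
STOPWORDS = frozenset({"yang", "dan", "di", "ke", "dari", "ini", "itu"})

# Map every ASCII punctuation character to a space so words stay separated.
_PUNCT_TABLE = str.maketrans(string.punctuation, " " * len(string.punctuation))


def preprocess(text: str,
               stopwords: Iterable[str] = STOPWORDS,
               stem: Callable[[str], str] = lambda word: word) -> str:
    """Case-fold, strip punctuation, drop stopwords, then stem each token."""
    stopword_set = frozenset(stopwords)
    text = text.lower()                  # case folding
    text = text.translate(_PUNCT_TABLE)  # punctuation removal
    tokens = text.split()                # whitespace split, collapses repeated spaces
    tokens = [t for t in tokens if t not in stopword_set]  # stopword removal
    return " ".join(stem(t) for t in tokens)               # stemming hook


print(preprocess("Pelayanan di Universitas XYZ sangat BAIK!!!"))
# pelayanan universitas xyz sangat baik
```

To plug in a real Indonesian stemmer, pass its stemming function as `stem`.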

## ✅ Intended Use

- Analyzing Indonesian tweets about universities
- Sentiment-driven dashboards for academic service quality
- NLP applications in the education sector
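
For these uses, inference might look like the sketch below. It assumes the checkpoint loads with the standard `transformers` auto classes; `model_id` is a placeholder because this card does not state the repository id, and the fallback label order in `ID2LABEL` is an assumption that should be checked against the checkpoint's `config.id2label`.

```python
import math
from typing import Dict, List, Sequence

# Assumed index-to-label order; verify against `config.id2label` of the checkpoint.
ID2LABEL = {0: "positive", 1: "neutral", 2: "negative"}


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over one row of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode(logits: Sequence[float], id2label=ID2LABEL) -> Dict[str, object]:
    """Turn one row of logits into a label and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return {"label": id2label[best], "score": probs[best]}


def classify(texts: List[str], model_id: str) -> List[Dict[str, object]]:
    """Run a fine-tuned checkpoint on raw texts (needs `transformers` and `torch`)."""
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    model.eval()
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        logits = model(**batch).logits
    # Use the label names stored in the checkpoint config.
    return [decode(row.tolist(), model.config.id2label) for row in logits]
```

Calling `classify(["Pelayanan akademik sangat membantu"], model_id="<your-repo-id>")` returns one dictionary per input text. Inputs should go through the same preprocessing that was applied during fine-tuning.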

## ⚠️ Limitations

- Domain-specific to university-related sentiment
- May not generalize well to informal or slang-heavy text
- Does not detect sarcasm or mixed sentiment
- Does not detect toxicity or hate speech

## 📊 Dataset

- **Source**: Tweets collected with a custom crawler using keywords and hashtags (e.g. `#telkomuniversity`, `Universitas XYZ`)
- **Size**: 7,500 samples
- **Split**: 70% train, 10% validation, 20% test
- **Labels**: 2,500 positive, 2,500 neutral, 2,500 negative
- **Language**: Indonesian
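
As a rough illustration of what the 70/10/20 split means for 7,500 balanced samples, the sketch below performs a stratified split using only the standard library. Whether the original split was stratified is not stated, and the seed here is an arbitrary choice, so treat both as assumptions.

```python
import random
from collections import defaultdict


def stratified_split(labels, fractions=(0.7, 0.1, 0.2), seed=42):
    """Return (train, val, test) index lists with per-class proportions preserved."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    rng = random.Random(seed)
    train, val, test = [], [], []
    for label in sorted(by_class):
        idxs = by_class[label]
        rng.shuffle(idxs)
        n_train = round(len(idxs) * fractions[0])
        n_val = round(len(idxs) * fractions[1])
        train += idxs[:n_train]
        val += idxs[n_train:n_train + n_val]
        test += idxs[n_train + n_val:]  # remainder goes to the test set
    return train, val, test


labels = ["positive"] * 2500 + ["neutral"] * 2500 + ["negative"] * 2500
train, val, test = stratified_split(labels)
print(len(train), len(val), len(test))  # 5250 750 1500
```

Under these assumptions each class contributes 1,750 training, 250 validation, and 500 test samples.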

## ⚙️ Training Procedure

### Hyperparameters

- **Learning rate**: 5e-5  
- **Batch size**: 8  
- **Epochs**: 3  
- **Optimizer**: Adam (β1=0.9, β2=0.999, ε=1e-8)  
- **Scheduler**: Linear  
- **Seed**: 42
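
The values above map onto keyword arguments of `transformers.TrainingArguments`, as sketched below. The `output_dir` value is a hypothetical path, and the step arithmetic assumes a single device, no gradient accumulation, no warmup, and a 5,250-sample training set (70% of 7,500); none of these is confirmed by the card.

```python
import math

# Keyword arguments mirroring the listed hyperparameters. The names are real
# `transformers.TrainingArguments` parameters; `output_dir` is a placeholder.
TRAINING_KWARGS = dict(
    output_dir="indobert-sentiment",
    learning_rate=5e-5,
    per_device_train_batch_size=8,
    num_train_epochs=3,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    seed=42,
)


def total_steps(n_train: int, batch_size: int, epochs: int) -> int:
    """Optimizer steps when the last partial batch of each epoch is kept."""
    return math.ceil(n_train / batch_size) * epochs


def linear_lr(step: int, total: int, base_lr: float = 5e-5, warmup: int = 0) -> float:
    """Linear warmup followed by linear decay to zero."""
    if step < warmup:
        return base_lr * step / max(1, warmup)
    return base_lr * max(0.0, (total - step) / max(1, total - warmup))


N_TRAIN = 5250  # assumed: 70% of 7,500 samples
steps = total_steps(N_TRAIN, batch_size=8, epochs=3)
print(steps)  # 1971
print(linear_lr(0, steps), linear_lr(steps // 2, steps), linear_lr(steps, steps))
```

With `transformers` installed, `TrainingArguments(**TRAINING_KWARGS)` would build the corresponding configuration object.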

### Framework Versions

- Transformers: 4.24.0  
- PyTorch: 1.13.0  
- Tokenizers: 0.13.2  

## 📈 Evaluation Metrics

| Metric    | Score |
|-----------|-------|
| Accuracy  | 89%   |
| F1 Score  | 88%   |
| Precision | 87%   |
| Recall    | 89%   |

![image/png](https://cdn-uploads.huggingface.co/production/uploads/6157d43f013078aa50b55498/9stv_9eVY5mlBO9TwIKwn.png)
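
The card does not say how precision, recall, and F1 were averaged across the three classes. The sketch below shows one common choice, macro averaging, on a toy example; the averaging mode and the toy labels are assumptions for illustration only.

```python
def macro_metrics(y_true, y_pred, labels):
    """Accuracy plus macro-averaged precision, recall, and F1."""
    precisions, recalls, f1s = [], [], []
    pairs = list(zip(y_true, y_pred))
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    n = len(labels)
    return {
        "accuracy": sum(t == p for t, p in pairs) / len(pairs),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


LABELS = ["positive", "neutral", "negative"]
y_true = ["positive", "positive", "neutral", "neutral", "negative", "negative"]
y_pred = ["positive", "neutral", "neutral", "neutral", "negative", "positive"]
metrics = macro_metrics(y_true, y_pred, LABELS)
print(metrics)
```

On a class-balanced test set, macro recall equals accuracy. That is consistent with the matching 89% figures in the table, though it does not confirm which averaging was actually used.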

## 💻 Deployment Context

This model was integrated into a [Django-based sentiment dashboard application](https://github.com/ShinyQ/Django_Thesis-Sentiboard-University-Sentiment-App) with:
- A custom Twitter crawler
- Real-time sentiment classification
- Wordclouds and sentiment breakdowns by time period
- Admin tools for filtering, deleting, and exporting data
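
A sentiment breakdown by time period can be computed along the lines of the sketch below. It is not taken from the dashboard repository; the record format (ISO timestamp, label) and the monthly granularity are assumptions made for illustration.

```python
from collections import Counter, defaultdict
from datetime import datetime


def breakdown_by_month(records):
    """Count labels per calendar month; records are (ISO timestamp, label) pairs."""
    counts = defaultdict(Counter)
    for timestamp, label in records:
        month = datetime.fromisoformat(timestamp).strftime("%Y-%m")
        counts[month][label] += 1
    # Sort by month so the result is ready for plotting in order.
    return {month: dict(counter) for month, counter in sorted(counts.items())}


records = [
    ("2023-01-05T10:00:00", "positive"),
    ("2023-01-20T08:30:00", "negative"),
    ("2023-01-21T09:15:00", "positive"),
    ("2023-02-02T14:45:00", "neutral"),
]
summary = breakdown_by_month(records)
print(summary)
```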


## 📄 Citation

If you use this model or its components, please cite:
```
@article{wijaya2023indobert,
  author    = {Kurniadi Ahmad Wijaya and Ade Romadhony and Donni Richasdy},
  title     = {Implementasi Model IndoBERT pada Dashboard Sentimen Media Sosial (Studi Kasus Universitas XYZ)},
  journal   = {eProceedings of Engineering},
  volume    = {10},
  number    = {4},
  year      = {2023},
  month     = {September},
  url       = {https://openlibrary.telkomuniversity.ac.id},
}
```