---
tags: [esg, greenwashing, phobert, vietnamese, sustainability]
license: mit
language: vi
---

# ESG Greenwashing Detection Model

Multi-task PhoBERT model for Vietnamese ESG content analysis.

## Model Architecture

The model is trained jointly on four tasks:

1. **Greenwashing Classification** (Legitimate/Greenwashing/Uncertain)
2. **ESG Pillar Classification** (Environmental/Social/Governance/General)
3. **Content Quality Scoring** (0-100)
4. **ESG Score Prediction** (0-100)

## Training

- **Base Model:** vinai/phobert-base
- **Strategy:** stratified_group_kfold_2
- **Folds:** 1
- **Total Samples:** 21

## Performance

### Greenwashing Detection

- F1 Score: 0.118
- Precision: 0.111
- Recall: 0.125

### Pillar Classification

- Accuracy: 0.000
- F1 Macro: 0.000

### Quality Scoring

- MAE: 35.345
- R²: -19.549

### ESG Score Prediction

- MAE: 33.567
- R²: -4.038
- Correlation: 0.131

## Usage

```python
from transformers import AutoTokenizer, AutoModel
import torch

tokenizer = AutoTokenizer.from_pretrained("hiennthp/esg-bank-model-v4")

# Load the model architecture, then the fine-tuned weights.
# MultiTaskPhoBERT is a custom class (not part of transformers); define or
# import it, together with its config, before uncommenting these lines.
# model = MultiTaskPhoBERT(config)
# model.load_state_dict(torch.load("best_model_fold0.pt"))
```

## Files

- `best_model_fold0.pt` - Fold 0 model weights
- `step5_metrics.json` - Detailed metrics with per-fold breakdown
- `tokenizer/` - PhoBERT tokenizer files

## Citation

```bibtex
@software{esg_greenwashing_model,
  author = {ESG Research Team},
  title = {Vietnamese ESG Greenwashing Detection Model},
  year = {2026},
  url = {https://huggingface.co/hiennthp/esg-bank-model-v4}
}
```
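
## Checking the Reported F1

As a quick consistency check on the Greenwashing Detection metrics reported in the Performance section, the sketch below recomputes F1 as the harmonic mean of the reported precision and recall. The helper name `f1_from_precision_recall` is hypothetical and not part of this repository. The card does not state how the metrics were averaged across classes, so this assumes the three numbers refer to the same class or averaging scheme.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; returns 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values copied from the Performance section of this card.
reported_precision = 0.111
reported_recall = 0.125
reported_f1 = 0.118

# Assumption: all three values describe the same class / averaging scheme.
recomputed_f1 = f1_from_precision_recall(reported_precision, reported_recall)
print(f"recomputed F1 = {recomputed_f1:.3f}")  # recomputed F1 = 0.118
```

Under that assumption, the recomputed value agrees with the reported F1 to three decimal places.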