metadata
tags:
- esg
- greenwashing
- phobert
- vietnamese
- sustainability
license: mit
language: vi
ESG Greenwashing Detection Model
Multi-task PhoBERT model for Vietnamese ESG content analysis.
Model Architecture
A shared PhoBERT encoder is trained jointly on four tasks:
- Greenwashing Classification (Legitimate/Greenwashing/Uncertain)
- ESG Pillar Classification (Environmental/Social/Governance/General)
- Content Quality Scoring (0-100)
- ESG Score Prediction (0-100)
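The four tasks above can be pictured as four small heads on top of one shared sentence embedding. The sketch below is only an illustration under stated assumptions: the class name `MultiTaskHeads`, the single-linear-layer heads, the dropout rate, and the sigmoid scaling to 0-100 are all hypothetical and are not taken from the released `MultiTaskPhoBERT` code. The hidden size of 768 matches vinai/phobert-base.

```python
import torch
import torch.nn as nn


class MultiTaskHeads(nn.Module):
    """Hypothetical task heads; the real MultiTaskPhoBERT layout may differ."""

    def __init__(self, hidden_size=768, n_greenwashing=3, n_pillars=4, dropout=0.1):
        super().__init__()
        self.dropout = nn.Dropout(dropout)
        # Two classification heads: 3 greenwashing classes, 4 ESG pillars.
        self.greenwashing = nn.Linear(hidden_size, n_greenwashing)
        self.pillar = nn.Linear(hidden_size, n_pillars)
        # Two regression heads, one scalar each.
        self.quality = nn.Linear(hidden_size, 1)
        self.esg_score = nn.Linear(hidden_size, 1)

    def forward(self, pooled):
        # `pooled` is a (batch, hidden_size) sentence embedding from the encoder.
        x = self.dropout(pooled)
        return {
            "greenwashing_logits": self.greenwashing(x),
            "pillar_logits": self.pillar(x),
            # Assumption: sigmoid * 100 keeps regression outputs in the 0-100 range.
            "quality": torch.sigmoid(self.quality(x)).squeeze(-1) * 100.0,
            "esg_score": torch.sigmoid(self.esg_score(x)).squeeze(-1) * 100.0,
        }


heads = MultiTaskHeads().eval()
with torch.no_grad():
    # Random tensor standing in for encoder output of a batch of 2 texts.
    outputs = heads(torch.randn(2, 768))
for name, value in outputs.items():
    print(name, tuple(value.shape))
```

In a full model the `pooled` tensor would come from the PhoBERT encoder (for example the first-token hidden state), and each head would get its own loss term.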
Training
- Base Model: vinai/phobert-base
- Validation Strategy: stratified group k-fold cross-validation (stratified_group_kfold_5)
- Folds: 5
- Total Samples: 2617
Performance
Greenwashing Detection
- F1 Score: 0.550
- Precision: 0.558
- Recall: 0.551
Pillar Classification
- Accuracy: 0.787
- F1 Macro: 0.220
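A large gap between accuracy and macro F1 usually points to class imbalance, because macro F1 averages per-class F1 scores with equal weight regardless of class size. The toy example below uses made-up labels (not this model's data) and a degenerate classifier that always predicts the majority class, to show how the two metrics can diverge.

```python
def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred, labels):
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(labels)


labels = ["Environmental", "Social", "Governance", "General"]
# Made-up, heavily imbalanced toy labels.
y_true = ["General"] * 80 + ["Environmental"] * 10 + ["Social"] * 6 + ["Governance"] * 4
y_pred = ["General"] * 100  # always predicts the majority class

print(round(accuracy(y_true, y_pred), 3))          # 0.8
print(round(macro_f1(y_true, y_pred, labels), 3))  # 0.222
```

For a four-class majority-only predictor with majority share p, macro F1 equals (2p / (1 + p)) / 4. With p = 0.787 this gives about 0.220, so the reported figures are numerically consistent with that pattern, although the per-fold breakdown in step5_metrics.json would be needed to confirm what the model actually predicts.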
Quality Scoring
- MAE: 7.794
- R²: 0.100
ESG Score Prediction
- MAE: 13.822
- R²: 0.104
- Correlation: 0.409
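For reference, the regression metrics reported above can be computed as in the following sketch. It assumes that R² is the coefficient of determination (1 - SS_res / SS_tot) and that "Correlation" means the Pearson coefficient; the card does not say which variants were used. The toy values are made up.

```python
import math


def mae(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def pearson(y_true, y_pred):
    # Assumes neither input is constant (otherwise the denominator is zero).
    mean_t = sum(y_true) / len(y_true)
    mean_p = sum(y_pred) / len(y_pred)
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)


# Made-up scores on the 0-100 scale.
y_true = [40.0, 55.0, 60.0, 70.0, 85.0]
y_pred = [50.0, 58.0, 62.0, 64.0, 70.0]

print("MAE:", mae(y_true, y_pred))
print("R2:", r2(y_true, y_pred))
print("Pearson r:", pearson(y_true, y_pred))
```

Under these definitions R² can never exceed the squared Pearson correlation, which fits the reported pair (0.104 versus 0.409² ≈ 0.167): the predictions track the targets better in rank and trend than in absolute scale.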
Usage
from transformers import AutoTokenizer, AutoModel
import torch
tokenizer = AutoTokenizer.from_pretrained("hiennthp/esg-bank-model-v4")
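# Build the model architecture first, then load the fold weights into it.
# MultiTaskPhoBERT is a custom class that is not part of transformers and is not
# defined in this card; it and its config must be defined before uncommenting.
# model = MultiTaskPhoBERT(config)
# state_dict = torch.load("best_model_fold0.pt", map_location="cpu")
# model.load_state_dict(state_dict)
# model.eval()  # disable dropout for inference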
Files
- best_model_fold0.pt: Fold 0 model weights
- best_model_fold1.pt: Fold 1 model weights
- best_model_fold2.pt: Fold 2 model weights
- best_model_fold3.pt: Fold 3 model weights
- best_model_fold4.pt: Fold 4 model weights
- step5_metrics.json: Detailed metrics with per-fold breakdown
- tokenizer/: PhoBERT tokenizer files
Citation
@software{esg_greenwashing_model,
author = {ESG Research Team},
title = {Vietnamese ESG Greenwashing Detection Model},
year = {2026},
url = {https://huggingface.co/hiennthp/esg-bank-model-v4}
}