metadata
tags:
  - esg
  - greenwashing
  - phobert
  - vietnamese
  - sustainability
license: mit
language: vi

ESG Greenwashing Detection Model

Multi-task PhoBERT model for Vietnamese ESG content analysis.

Model Architecture

The model is trained jointly on four tasks:

  1. Greenwashing Classification (Legitimate/Greenwashing/Uncertain)
  2. ESG Pillar Classification (Environmental/Social/Governance/General)
  3. Content Quality Scoring (0-100)
  4. ESG Score Prediction (0-100)
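The four tasks can be pictured as separate output heads sharing one encoder. The sketch below is only an illustration under assumptions: the class name `MultiTaskHeads`, the dropout rate, and the sigmoid scaling of the two scores are hypothetical and are not taken from the released training code. The hidden size of 768 is that of vinai/phobert-base, and a random tensor stands in for the encoder's pooled output so the example runs without downloading weights.

```python
import torch
from torch import nn


class MultiTaskHeads(nn.Module):
    """Four task heads on a pooled sentence embedding (hypothetical layout)."""

    def __init__(self, hidden_size: int = 768, dropout: float = 0.1):
        super().__init__()
        self.dropout = nn.Dropout(dropout)
        self.greenwashing = nn.Linear(hidden_size, 3)  # Legitimate/Greenwashing/Uncertain
        self.pillar = nn.Linear(hidden_size, 4)        # Environmental/Social/Governance/General
        self.quality = nn.Linear(hidden_size, 1)       # content quality, 0-100
        self.esg_score = nn.Linear(hidden_size, 1)     # ESG score, 0-100

    def forward(self, pooled: torch.Tensor) -> dict:
        x = self.dropout(pooled)
        return {
            "greenwashing_logits": self.greenwashing(x),
            "pillar_logits": self.pillar(x),
            # Assumption: scores are squashed into 0-100 with a scaled sigmoid.
            "quality": 100 * torch.sigmoid(self.quality(x)).squeeze(-1),
            "esg_score": 100 * torch.sigmoid(self.esg_score(x)).squeeze(-1),
        }


heads = MultiTaskHeads().eval()
with torch.no_grad():
    out = heads(torch.randn(2, 768))  # stand-in for PhoBERT's pooled output
print({name: tuple(t.shape) for name, t in out.items()})
```

The state dict keys of the released checkpoints will only match a class that mirrors the original training code, so this layout should not be expected to load `best_model_fold*.pt` as is.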

Training

  • Base Model: vinai/phobert-base
  • Strategy: stratified_group_kfold_5
  • Folds: 5
  • Total Samples: 2617

Performance

Greenwashing Detection

  • F1 Score: 0.550
  • Precision: 0.558
  • Recall: 0.551

Pillar Classification

  • Accuracy: 0.787
  • F1 Macro: 0.220
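The gap between accuracy and macro F1 is what class imbalance typically produces. Macro F1 averages per-class F1 without weighting by class size, so a classifier that always predicts the largest class keeps a high accuracy but scores zero on every other class. The sketch below reproduces that effect on synthetic labels. The class distribution is an assumption chosen for illustration, not the dataset's actual distribution, which is not stated here.

```python
def macro_f1(y_true, y_pred, n_classes):
    """Unweighted mean of per-class F1; a class with no true positives scores 0."""
    scores = []
    for c in range(n_classes):
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / n_classes


# Synthetic labels: class 0 covers 787 of 1000 samples, classes 1-3 share the rest.
y_true = [0] * 787 + [1] * 71 + [2] * 71 + [3] * 71
y_pred = [0] * len(y_true)  # degenerate classifier: always predicts class 0

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"accuracy={accuracy:.3f} macro_f1={macro_f1(y_true, y_pred, 4):.3f}")
# prints: accuracy=0.787 macro_f1=0.220
```

The reported pillar metrics are numerically consistent with this single-class pattern. Whether that is what happens in practice can be checked against the per-class and per-fold figures in step5_metrics.json.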

Quality Scoring

  • MAE: 7.794
  • R²: 0.100

ESG Score Prediction

  • MAE: 13.822
  • R²: 0.104
  • Correlation: 0.409
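For reading the two regression blocks, it helps to recall how the metrics relate. Assuming the standard definitions (the card does not say which were used), R² = 1 - SS_res / SS_tot penalizes bias and wrong scale, while Pearson correlation only measures linear association. R² can therefore sit below the squared correlation, as it does here (0.409² ≈ 0.167 versus 0.104). The sketch below uses made-up scores to show predictions that are perfectly correlated with the targets yet have a modest R².

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MAE, R^2 (1 - SS_res / SS_tot) and Pearson correlation."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return {
        "mae": mae,
        "r2": 1 - ss_res / ss_tot,
        "pearson": cov / math.sqrt(ss_tot * var_p),
    }


# Made-up scores on the 0-100 scale: the ranking is right, but the
# predictions are compressed toward the middle and slightly offset.
y_true = [20.0, 40.0, 60.0, 80.0]
y_pred = [45.0, 50.0, 55.0, 60.0]
m = regression_metrics(y_true, y_pred)
print({k: round(v, 3) for k, v in m.items()})
# prints: {'mae': 15.0, 'r2': 0.425, 'pearson': 1.0}
```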

Usage

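import torch
from transformers import AutoTokenizer, AutoModel

# The Files section lists the tokenizer under tokenizer/; if loading from the
# repository root fails, pass subfolder="tokenizer".
tokenizer = AutoTokenizer.from_pretrained("hiennthp/esg-bank-model-v4")

# MultiTaskPhoBERT is a custom class, not part of transformers. Define it to
# match the training code, then load the weights of one fold:
# model = MultiTaskPhoBERT(config)
# model.load_state_dict(torch.load("best_model_fold0.pt", map_location="cpu"))
# model.eval()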

Files

  • best_model_fold0.pt - Fold 0 model weights
  • best_model_fold1.pt - Fold 1 model weights
  • best_model_fold2.pt - Fold 2 model weights
  • best_model_fold3.pt - Fold 3 model weights
  • best_model_fold4.pt - Fold 4 model weights
  • step5_metrics.json - Detailed metrics with per-fold breakdown
  • tokenizer/ - PhoBERT tokenizer files
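With one checkpoint per fold, one option is to use a single fold, and another is to average the predictions of all five. Which of these is intended is not stated here. The sketch below shows probability averaging for the greenwashing head in plain Python. The logits are made up, and the label order is assumed to follow the order in which the classes are listed in this card.

```python
import math

# Assumption: class indices follow the order used in this model card.
LABELS = ["Legitimate", "Greenwashing", "Uncertain"]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def ensemble_predict(per_fold_logits):
    """Average class probabilities over folds; return (label, averaged probabilities)."""
    probs = [softmax(logits) for logits in per_fold_logits]
    avg = [sum(p[i] for p in probs) / len(probs) for i in range(len(LABELS))]
    return LABELS[avg.index(max(avg))], avg


# Made-up logits for one text, one row per fold model.
fold_logits = [
    [0.2, 1.1, -0.3],
    [0.5, 0.4, 0.0],
    [-0.1, 0.9, 0.2],
    [0.3, 0.7, -0.5],
    [0.0, 1.3, 0.1],
]
label, avg = ensemble_predict(fold_logits)
print(label, [round(p, 3) for p in avg])
```

Averaging probabilities rather than raw logits keeps each fold's contribution on the same scale, which matters when the fold models differ in how confident they are.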

Citation

@software{esg_greenwashing_model,
  author = {{ESG Research Team}},
  title = {Vietnamese ESG Greenwashing Detection Model},
  year = {2026},
  url = {https://huggingface.co/hiennthp/esg-bank-model-v4}
}