---
model_name: transformer_multi_head_bert_updated
base_model: GroNLP/bert-base-dutch-cased
language: nl
library_name: transformers
pipeline_tag: text-classification
license: mit
tags:
  - dutch
  - regression
  - multi-head
  - bert
  - text-quality
datasets:
  - proprietary
target_names:
  - delta_cola_to_final
  - delta_perplexity_to_final_large
  - iter_to_final_simplified
  - robbert_delta_blurb_to_final
metrics:
  per_epoch:
    - epoch: 1
      valid_loss: 0.016363
      delta_cola_to_final:
        rmse: 0.149753
        r2: 0.36889
      delta_perplexity_to_final_large:
        rmse: 0.099918
        r2: 0.648474
      iter_to_final_simplified:
        rmse: 0.138463
        r2: 0.818398
      robbert_delta_blurb_to_final:
        rmse: 0.117767
        r2: 0.729513
      mean_rmse: 0.126476
    - epoch: 2
      valid_loss: 0.01522
      delta_cola_to_final:
        rmse: 0.146628
        r2: 0.394957
      delta_perplexity_to_final_large:
        rmse: 0.10185
        r2: 0.634748
      iter_to_final_simplified:
        rmse: 0.127215
        r2: 0.846706
      robbert_delta_blurb_to_final:
        rmse: 0.113245
        r2: 0.749883
      mean_rmse: 0.122235
    - epoch: 3
      valid_loss: 0.015208
      delta_cola_to_final:
        rmse: 0.146956
        r2: 0.392247
      delta_perplexity_to_final_large:
        rmse: 0.098563
        r2: 0.657945
      iter_to_final_simplified:
        rmse: 0.127813
        r2: 0.84526
      robbert_delta_blurb_to_final:
        rmse: 0.114824
        r2: 0.742861
      mean_rmse: 0.122039
    - epoch: 4
      valid_loss: 0.015156
      delta_cola_to_final:
        rmse: 0.142936
        r2: 0.425041
      delta_perplexity_to_final_large:
        rmse: 0.099919
        r2: 0.648468
      iter_to_final_simplified:
        rmse: 0.128406
        r2: 0.843822
      robbert_delta_blurb_to_final:
        rmse: 0.117143
        r2: 0.732371
      mean_rmse: 0.122101
    - epoch: 5
      valid_loss: 0.015457
      delta_cola_to_final:
        rmse: 0.144705
        r2: 0.410725
      delta_perplexity_to_final_large:
        rmse: 0.100198
        r2: 0.646506
      iter_to_final_simplified:
        rmse: 0.131055
        r2: 0.837311
      robbert_delta_blurb_to_final:
        rmse: 0.116936
        r2: 0.733315
      mean_rmse: 0.123223
  test:
    aggregate_rmse: 0.0769
    aggregate_r2: 0.8425
    mean_rmse: 0.121
---

transformer_multi_head_bert_updated

A multi-head transformer regression model based on BERT (GroNLP/bert-base-dutch-cased), fine-tuned to predict four normalized delta scores for Dutch book reviews. The four output heads are:

  1. delta_cola_to_final
  2. delta_perplexity_to_final_large
  3. iter_to_final_simplified
  4. robbert_delta_blurb_to_final

⚠️ The order of these outputs is crucial and must be maintained exactly as above during inference.
Changing the order will cause incorrect mapping of predicted values to their respective targets.

Additionally, a final aggregate score is provided (mean of the four heads).
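The ordered mapping from raw head outputs to named scores can be sketched as below. This is a hypothetical helper (not part of the released model code); the target names are taken from the list above, and the aggregate is the simple mean of the four heads:

```python
# Head order must match the model exactly, or predictions will be
# mapped to the wrong targets.
TARGET_NAMES = [
    "delta_cola_to_final",
    "delta_perplexity_to_final_large",
    "iter_to_final_simplified",
    "robbert_delta_blurb_to_final",
]

def name_predictions(raw_outputs):
    """Pair each head's prediction with its target name and append the
    aggregate score (simple mean of the four heads)."""
    if len(raw_outputs) != len(TARGET_NAMES):
        raise ValueError("expected exactly one prediction per head")
    scores = dict(zip(TARGET_NAMES, raw_outputs))
    scores["aggregate"] = sum(raw_outputs) / len(raw_outputs)
    return scores
```

A sanity check such as the length guard above is cheap insurance against silently reordered or truncated outputs.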

📈 Training & Evaluation

  • Base model: GroNLP/bert-base-dutch-cased
  • Fine-tuning: 5 epochs on a proprietary dataset
  • Output heads: 4
  • Problem type: multi-head regression

Per-Epoch Validation Metrics

| Epoch | Val Loss | ΔCoLA RMSE / R² | ΔPerp RMSE / R² | Iter RMSE / R² | Blurb RMSE / R² | Mean RMSE |
|---|---|---|---|---|---|---|
| 1 | 0.01636 | 0.1498 / 0.3689 | 0.0999 / 0.6485 | 0.1385 / 0.8184 | 0.1178 / 0.7295 | 0.1265 |
| 2 | 0.01522 | 0.1466 / 0.3950 | 0.1019 / 0.6347 | 0.1272 / 0.8467 | 0.1132 / 0.7499 | 0.1222 |
| 3 | 0.01521 | 0.1470 / 0.3922 | 0.0986 / 0.6579 | 0.1278 / 0.8453 | 0.1148 / 0.7429 | 0.1220 |
| 4 | 0.01516 | 0.1429 / 0.4250 | 0.0999 / 0.6485 | 0.1284 / 0.8438 | 0.1171 / 0.7324 | 0.1221 |
| 5 | 0.01546 | 0.1447 / 0.4107 | 0.1002 / 0.6465 | 0.1311 / 0.8373 | 0.1169 / 0.7333 | 0.1232 |

✅ Final Aggregate Performance (Test)

| Metric | Value |
|---|---|
| Aggregate RMSE | 0.0769 |
| Aggregate R² | 0.8425 |
| Mean RMSE (heads) | 0.1210 |

🗂️ Test Metrics (Per Target)

| Target | RMSE | R² |
|---|---|---|
| delta_cola_to_final | 0.1463 | 0.4286 |
| delta_perplexity_to_final_large | 0.0955 | 0.6802 |
| iter_to_final_simplified | 0.1255 | 0.8535 |
| robbert_delta_blurb_to_final | 0.1168 | 0.7319 |
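For reference, the RMSE and R² values reported in these tables follow the standard definitions, which can be computed per target as in this minimal sketch:

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error for one target."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot
```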

🏷️ Notes

  • Base model: GroNLP/bert-base-dutch-cased
  • Fine-tuned for multi-head regression on Dutch book reviews
  • Trained for 5 epochs on a proprietary dataset
  • Sigmoid activation built into each head
  • Re-aggregation: simple average of the four head outputs
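The head stack described in these notes can be sketched as follows. This is an illustrative reconstruction, not the model's actual source: class and function names are hypothetical, and only the four sigmoid-activated linear heads over the encoder's pooled output are shown (the BERT encoder itself is omitted):

```python
import torch
import torch.nn as nn

class MultiHeadRegressor(nn.Module):
    """Sketch of the head stack: four independent linear heads over the
    pooled BERT output, each with a built-in sigmoid so every predicted
    score lands in [0, 1]."""

    def __init__(self, hidden_size: int = 768, num_heads: int = 4):
        super().__init__()
        self.heads = nn.ModuleList(nn.Linear(hidden_size, 1) for _ in range(num_heads))

    def forward(self, pooled_output: torch.Tensor) -> torch.Tensor:
        # One sigmoid-activated score per head -> shape (batch, num_heads).
        return torch.cat([torch.sigmoid(h(pooled_output)) for h in self.heads], dim=-1)

def aggregate(scores: torch.Tensor) -> torch.Tensor:
    """Re-aggregation as described above: simple average of the heads."""
    return scores.mean(dim=-1)
```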

🛠️ Training Arguments

  • num_train_epochs=5
  • per_device_train_batch_size=8
  • per_device_eval_batch_size=16
  • gradient_accumulation_steps=2
  • learning_rate=2e-5
  • weight_decay=0.01
  • eval_strategy="epoch"
  • save_strategy="epoch"
  • load_best_model_at_end=True
  • metric_for_best_model="mean_rmse"
  • greater_is_better=False
  • bf16 enabled if supported, else fp16 enabled
  • logging_strategy="epoch"
  • push_to_hub=True with model ID Felixbrk/bert-base-dutch-cased-multi-score-tuned-positive
  • hub_strategy="end"
  • Early stopping with patience 2 epochs
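The argument list above corresponds roughly to the following `transformers` configuration. This is an approximate reconstruction, not the original training script; `output_dir` is a hypothetical path, and older `transformers` versions spell `eval_strategy` as `evaluation_strategy`:

```python
import torch
from transformers import TrainingArguments, EarlyStoppingCallback

# "bf16 if supported, else fp16", per the notes above.
use_bf16 = torch.cuda.is_available() and torch.cuda.is_bf16_supported()

args = TrainingArguments(
    output_dir="multi_head_bert",  # hypothetical
    num_train_epochs=5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=16,
    gradient_accumulation_steps=2,
    learning_rate=2e-5,
    weight_decay=0.01,
    eval_strategy="epoch",
    save_strategy="epoch",
    logging_strategy="epoch",
    load_best_model_at_end=True,
    metric_for_best_model="mean_rmse",
    greater_is_better=False,
    bf16=use_bf16,
    fp16=not use_bf16,
    push_to_hub=True,
    hub_model_id="Felixbrk/bert-base-dutch-cased-multi-score-tuned-positive",
    hub_strategy="end",
)

# Early stopping with patience 2, passed to the Trainer's callbacks.
callbacks = [EarlyStoppingCallback(early_stopping_patience=2)]
```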

⚠️ Important:

  • Always load this model with trust_remote_code=True as it uses a custom multi-head regression architecture.
  • Maintain the output order exactly for correct interpretation of results.