---
model_name: transformer_multi_head_bert_updated
base_model: GroNLP/bert-base-dutch-cased
language: nl
library_name: transformers
pipeline_tag: text-classification
license: mit
tags:
  - dutch
  - regression
  - multi-head
  - bert
  - text-quality
datasets:
  - proprietary
target_names:
  - delta_cola_to_final
  - delta_perplexity_to_final_large
  - iter_to_final_simplified
  - robbert_delta_blurb_to_final
metrics:
  per_epoch:
    - epoch: 1
      valid_loss: 0.016363
      delta_cola_to_final:
        rmse: 0.149753
        r2: 0.36889
      delta_perplexity_to_final_large:
        rmse: 0.099918
        r2: 0.648474
      iter_to_final_simplified:
        rmse: 0.138463
        r2: 0.818398
      robbert_delta_blurb_to_final:
        rmse: 0.117767
        r2: 0.729513
      mean_rmse: 0.126476
    - epoch: 2
      valid_loss: 0.01522
      delta_cola_to_final:
        rmse: 0.146628
        r2: 0.394957
      delta_perplexity_to_final_large:
        rmse: 0.10185
        r2: 0.634748
      iter_to_final_simplified:
        rmse: 0.127215
        r2: 0.846706
      robbert_delta_blurb_to_final:
        rmse: 0.113245
        r2: 0.749883
      mean_rmse: 0.122235
    - epoch: 3
      valid_loss: 0.015208
      delta_cola_to_final:
        rmse: 0.146956
        r2: 0.392247
      delta_perplexity_to_final_large:
        rmse: 0.098563
        r2: 0.657945
      iter_to_final_simplified:
        rmse: 0.127813
        r2: 0.84526
      robbert_delta_blurb_to_final:
        rmse: 0.114824
        r2: 0.742861
      mean_rmse: 0.122039
    - epoch: 4
      valid_loss: 0.015156
      delta_cola_to_final:
        rmse: 0.142936
        r2: 0.425041
      delta_perplexity_to_final_large:
        rmse: 0.099919
        r2: 0.648468
      iter_to_final_simplified:
        rmse: 0.128406
        r2: 0.843822
      robbert_delta_blurb_to_final:
        rmse: 0.117143
        r2: 0.732371
      mean_rmse: 0.122101
    - epoch: 5
      valid_loss: 0.015457
      delta_cola_to_final:
        rmse: 0.144705
        r2: 0.410725
      delta_perplexity_to_final_large:
        rmse: 0.100198
        r2: 0.646506
      iter_to_final_simplified:
        rmse: 0.131055
        r2: 0.837311
      robbert_delta_blurb_to_final:
        rmse: 0.116936
        r2: 0.733315
      mean_rmse: 0.123223
  test:
    aggregate_rmse: 0.0769
    aggregate_r2: 0.8425
    mean_rmse: 0.121
---

# transformer_multi_head_bert_updated
A multi-head transformer regression model based on BERT (GroNLP/bert-base-dutch-cased), fine-tuned to predict four normalized delta scores for Dutch book reviews. The four output heads are:
- delta_cola_to_final
- delta_perplexity_to_final_large
- iter_to_final_simplified
- robbert_delta_blurb_to_final
⚠️ The order of these outputs is fixed and must be preserved exactly as listed above during inference.
Changing the order misassigns the predicted values to their respective targets.
Additionally, a final aggregate score is provided (mean of the four heads).
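The fixed head order and the aggregate can be captured in a small helper; a minimal sketch (the helper name and the plain-list input are illustrative, not part of the model's API):

```python
# Fixed output order of the four regression heads (see list above).
TARGETS = [
    "delta_cola_to_final",
    "delta_perplexity_to_final_large",
    "iter_to_final_simplified",
    "robbert_delta_blurb_to_final",
]

def label_outputs(head_values):
    """Map the model's four head outputs, in their fixed order,
    to named scores, plus the aggregate (simple mean of the heads)."""
    if len(head_values) != len(TARGETS):
        raise ValueError(f"expected {len(TARGETS)} head outputs")
    scores = dict(zip(TARGETS, head_values))
    scores["aggregate"] = sum(head_values) / len(head_values)
    return scores
```

Keeping the order in one shared constant avoids the misassignment the warning above describes.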
## 📈 Training & Evaluation

- Base model: GroNLP/bert-base-dutch-cased
- Fine-tuning: 5 epochs on a proprietary dataset
- Output heads: 4
- Problem type: multi-head regression
### Per-Epoch Validation Metrics
| Epoch | Val Loss | ΔCoLA RMSE / R² | ΔPerp RMSE / R² | Iter RMSE / R² | Blurb RMSE / R² | Mean RMSE |
|---|---|---|---|---|---|---|
| 1 | 0.01636 | 0.1498 / 0.3689 | 0.0999 / 0.6485 | 0.1385 / 0.8184 | 0.1178 / 0.7295 | 0.1265 |
| 2 | 0.01522 | 0.1466 / 0.3950 | 0.1019 / 0.6347 | 0.1272 / 0.8467 | 0.1132 / 0.7499 | 0.1222 |
| 3 | 0.01521 | 0.1470 / 0.3922 | 0.0986 / 0.6579 | 0.1278 / 0.8453 | 0.1148 / 0.7429 | 0.1220 |
| 4 | 0.01516 | 0.1429 / 0.4250 | 0.0999 / 0.6485 | 0.1284 / 0.8438 | 0.1171 / 0.7324 | 0.1221 |
| 5 | 0.01546 | 0.1447 / 0.4107 | 0.1002 / 0.6465 | 0.1311 / 0.8373 | 0.1169 / 0.7333 | 0.1232 |
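The per-head RMSE and R² values above follow the standard definitions, and the mean RMSE column is the plain average over the four heads; a minimal reference implementation:

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error over paired values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Mean RMSE, as reported per epoch, averages the four per-head RMSEs,
# e.g. epoch 1: (0.149753 + 0.099918 + 0.138463 + 0.117767) / 4 ≈ 0.1265.
```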
## ✅ Final Aggregate Performance (Test)
| Metric | Value |
|---|---|
| Aggregate RMSE | 0.0769 |
| Aggregate R² | 0.8425 |
| Mean RMSE (heads) | 0.1210 |
## 🗂️ Test Metrics (Per Target)
| Target | RMSE | R² |
|---|---|---|
| delta_cola_to_final | 0.1463 | 0.4286 |
| delta_perplexity_to_final_large | 0.0955 | 0.6802 |
| iter_to_final_simplified | 0.1255 | 0.8535 |
| robbert_delta_blurb_to_final | 0.1168 | 0.7319 |
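The per-target test RMSEs above average back to the head-level mean RMSE reported in the aggregate table; a quick consistency check:

```python
# Per-target test RMSEs from the table above.
test_rmse = {
    "delta_cola_to_final": 0.1463,
    "delta_perplexity_to_final_large": 0.0955,
    "iter_to_final_simplified": 0.1255,
    "robbert_delta_blurb_to_final": 0.1168,
}

# Simple average over the four heads.
mean_rmse = sum(test_rmse.values()) / len(test_rmse)
print(round(mean_rmse, 4))  # 0.121, the "Mean RMSE (heads)" value
```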
## 🏷️ Notes

- Base model: GroNLP/bert-base-dutch-cased
- Fine-tuned for multi-head regression on Dutch book reviews
- Trained for 5 epochs on a proprietary dataset
- Sigmoid activation built into each head
- Re-aggregation: simple average of the four head outputs
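The notes above describe one sigmoid-activated regression head per target on top of the encoder, with the aggregate as a simple average. A hypothetical sketch of such a head stack (class and argument names are illustrative; the actual custom architecture ships with the model via `trust_remote_code`):

```python
import torch
import torch.nn as nn

class MultiHeadRegression(nn.Module):
    """Illustrative head stack: one sigmoid-activated linear
    regressor per target on top of the encoder's pooled output."""

    def __init__(self, hidden_size=768, num_heads=4):
        super().__init__()
        self.heads = nn.ModuleList(
            [nn.Linear(hidden_size, 1) for _ in range(num_heads)]
        )

    def forward(self, pooled):
        # Sigmoid bounds each predicted delta score to (0, 1).
        preds = torch.cat(
            [torch.sigmoid(head(pooled)) for head in self.heads], dim=-1
        )
        aggregate = preds.mean(dim=-1, keepdim=True)  # simple average
        return preds, aggregate
```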
## 🛠️ Training Arguments

- `num_train_epochs=5`
- `per_device_train_batch_size=8`
- `per_device_eval_batch_size=16`
- `gradient_accumulation_steps=2`
- `learning_rate=2e-5`
- `weight_decay=0.01`
- `eval_strategy="epoch"`, `save_strategy="epoch"`, `logging_strategy="epoch"`
- `load_best_model_at_end=True`
- `metric_for_best_model="mean_rmse"`, `greater_is_better=False`
- `bf16` enabled if supported, else `fp16` enabled
- `push_to_hub=True` with model ID `Felixbrk/bert-base-dutch-cased-multi-score-tuned-positive`, `hub_strategy="end"`
- Early stopping with a patience of 2 epochs
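Assuming the standard `transformers.TrainingArguments` API, the settings above correspond roughly to the sketch below (`output_dir` is a placeholder; early stopping is handled by the separate `EarlyStoppingCallback`):

```python
import torch
from transformers import TrainingArguments, EarlyStoppingCallback

args = TrainingArguments(
    output_dir="out",  # placeholder
    num_train_epochs=5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=16,
    gradient_accumulation_steps=2,
    learning_rate=2e-5,
    weight_decay=0.01,
    eval_strategy="epoch",
    save_strategy="epoch",
    logging_strategy="epoch",
    load_best_model_at_end=True,
    metric_for_best_model="mean_rmse",
    greater_is_better=False,
    # bf16 if the hardware supports it, else fp16.
    bf16=torch.cuda.is_available() and torch.cuda.is_bf16_supported(),
    fp16=torch.cuda.is_available() and not torch.cuda.is_bf16_supported(),
    push_to_hub=True,
    hub_model_id="Felixbrk/bert-base-dutch-cased-multi-score-tuned-positive",
    hub_strategy="end",
)
# Early stopping: pass EarlyStoppingCallback(early_stopping_patience=2)
# in the Trainer's callbacks list.
```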
## ⚠️ Important

- Always load this model with `trust_remote_code=True`, as it uses a custom multi-head regression architecture.
- Maintain the output order exactly for correct interpretation of the results.