---
library_name: transformers
language:
  - mt
license: cc-by-nc-sa-4.0
base_model: MLRS/BERTu
model-index:
  - name: BERTu_sentiment-mlt
    results:
      - task:
          type: sentiment-analysis
          name: Sentiment Analysis
        dataset:
          type: mt-sentiment-analysis
          name: Maltese Sentiment Analysis
        metrics:
          - type: f1
            args: macro
            value: 85.11
            name: Macro-averaged F1
        source:
          name: MELABench Leaderboard
          url: https://huggingface.co/spaces/MLRS/MELABench
extra_gated_fields:
  Name: text
  Surname: text
  Date of Birth: date_picker
  Organisation: text
  Country: country
  I agree to use this model in accordance with the license and for non-commercial use ONLY: checkbox
---

BERTu (Maltese Sentiment Analysis)

This model is a fine-tuned version of MLRS/BERTu for Maltese sentiment analysis. It achieves the following results on the test set:

  • Loss: 0.5176
  • F1: 0.8511
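
The model can be loaded with the transformers text-classification pipeline. Below is a minimal usage sketch; the repository ID MLRS/BERTu_sentiment-mlt and the example sentence are assumptions, and since access to the model is gated you may first need to accept the terms on the model page and authenticate with a Hugging Face token.

```python
from transformers import pipeline

# Model ID assumed from this repository's name; adjust if the namespace differs.
classifier = pipeline("sentiment-analysis", model="MLRS/BERTu_sentiment-mlt")

# Maltese example sentence ("This film is very good!"); any Maltese text works.
print(classifier("Dan il-film huwa tajjeb ħafna!"))
# Output is a list of {"label": ..., "score": ...} dicts; label names depend on the model's config.
```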

Intended uses & limitations

The model is fine-tuned on a specific task and should only be used for the same or a closely related task. Any limitations present in the base model are inherited.

Training procedure

The model was fine-tuned using a customised script.

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 16
  • eval_batch_size: 32
  • seed: 2
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: inverse_sqrt
  • lr_scheduler_warmup_ratio: 0.005
  • num_epochs: 200.0
  • early_stopping_patience: 20
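
The customised training script is not published, so the following is only a rough sketch of how these hyperparameters could be mapped onto transformers TrainingArguments and Trainer; the output path, label count, and dataset handling are assumptions.

```python
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    EarlyStoppingCallback,
    Trainer,
    TrainingArguments,
)

# Hypothetical reconstruction of the reported hyperparameters, not the original script.
args = TrainingArguments(
    output_dir="BERTu_sentiment-mlt",  # assumed output path
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=32,
    seed=2,
    optim="adamw_torch",               # AdamW; betas=(0.9, 0.999) and eps=1e-8 are the defaults
    lr_scheduler_type="inverse_sqrt",
    warmup_ratio=0.005,
    num_train_epochs=200,
    eval_strategy="epoch",
    save_strategy="epoch",
    load_best_model_at_end=True,
    metric_for_best_model="f1",
)

tokenizer = AutoTokenizer.from_pretrained("MLRS/BERTu")
model = AutoModelForSequenceClassification.from_pretrained(
    "MLRS/BERTu", num_labels=3         # label count is an assumption
)

trainer = Trainer(
    model=model,
    args=args,
    # train_dataset=..., eval_dataset=..., compute_metrics=... come from the training script
    callbacks=[EarlyStoppingCallback(early_stopping_patience=20)],
)
```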

Training results

| Training Loss | Epoch | Step | Validation Loss | F1     |
|:-------------:|:-----:|:----:|:---------------:|:------:|
| No log        | 1.0   | 38   | 0.4389          | 0.7914 |
| No log        | 2.0   | 76   | 0.2928          | 0.9020 |
| No log        | 3.0   | 114  | 0.2375          | 0.8766 |
| No log        | 4.0   | 152  | 0.2501          | 0.9076 |
| No log        | 5.0   | 190  | 0.2855          | 0.9215 |
| No log        | 6.0   | 228  | 0.3583          | 0.8970 |
| No log        | 7.0   | 266  | 0.4191          | 0.8731 |
| No log        | 8.0   | 304  | 0.4540          | 0.8865 |
| No log        | 9.0   | 342  | 0.4227          | 0.8970 |
| No log        | 10.0  | 380  | 0.4526          | 0.8970 |
| No log        | 11.0  | 418  | 0.4572          | 0.8970 |
| No log        | 12.0  | 456  | 0.4483          | 0.8970 |
| No log        | 13.0  | 494  | 0.4574          | 0.8970 |
| 0.1024        | 14.0  | 532  | 0.4587          | 0.8970 |
| 0.1024        | 15.0  | 570  | 0.4676          | 0.8970 |
| 0.1024        | 16.0  | 608  | 0.4732          | 0.8970 |
| 0.1024        | 17.0  | 646  | 0.4772          | 0.8970 |
| 0.1024        | 18.0  | 684  | 0.4897          | 0.8849 |
| 0.1024        | 19.0  | 722  | 0.4938          | 0.8849 |
| 0.1024        | 20.0  | 760  | 0.4950          | 0.8849 |
| 0.1024        | 21.0  | 798  | 0.4947          | 0.8970 |
| 0.1024        | 22.0  | 836  | 0.4963          | 0.8970 |
| 0.1024        | 23.0  | 874  | 0.4993          | 0.8970 |
| 0.1024        | 24.0  | 912  | 0.5010          | 0.8970 |
| 0.1024        | 25.0  | 950  | 0.5030          | 0.8970 |
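
The F1 values above, like the headline test-set score, are macro-averaged. A minimal sketch of a compute_metrics function that would report this metric to the Trainer, assuming scikit-learn is used (the actual customised script may differ):

```python
import numpy as np
from sklearn.metrics import f1_score

def compute_metrics(eval_pred):
    """Macro-averaged F1 over the predicted sentiment labels (illustrative sketch)."""
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    return {"f1": f1_score(labels, predictions, average="macro")}
```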

Framework versions

  • Transformers 4.51.1
  • PyTorch 2.7.0+cu126
  • Datasets 3.2.0
  • Tokenizers 0.21.1

License

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. Permissions beyond the scope of this license may be available at https://mlrs.research.um.edu.mt/.

CC BY-NC-SA 4.0

Citation

This work was first presented in MELABenchv1: Benchmarking Large Language Models against Smaller Fine-Tuned Models for Low-Resource Maltese NLP. Cite it as follows:

@inproceedings{micallef-borg-2025-melabenchv1,
    title = "{MELAB}enchv1: Benchmarking Large Language Models against Smaller Fine-Tuned Models for Low-Resource {M}altese {NLP}",
    author = "Micallef, Kurt  and
      Borg, Claudia",
    editor = "Che, Wanxiang  and
      Nabende, Joyce  and
      Shutova, Ekaterina  and
      Pilehvar, Mohammad Taher",
    booktitle = "Findings of the Association for Computational Linguistics: ACL 2025",
    month = jul,
    year = "2025",
    address = "Vienna, Austria",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2025.findings-acl.1053/",
    doi = "10.18653/v1/2025.findings-acl.1053",
    pages = "20505--20527",
    ISBN = "979-8-89176-256-5",
}