	---
	library_name: transformers
	license: mit
	base_model: microsoft/deberta-v3-large
	tags:
	- multi-label
	- question-answering
	- text-classification
	- generated_from_trainer
	datasets:
	- saiteki-kai/BeaverTails-it
	metrics:
	- f1
	- accuracy
	- precision
	- recall
	language:
	- it
	- en
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# QA-DeBERTa-v3-large

	This model is a fine-tuned version of [microsoft/deberta-v3-large](https://huggingface.co/microsoft/deberta-v3-large) on the [saiteki-kai/BeaverTails-it](https://huggingface.co/datasets/saiteki-kai/BeaverTails-it) dataset.
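	It achieves the following results on the evaluation set. Up to rounding in the last digit, these match the epoch 3 row (step 101442) of the training results table, which has the lowest validation loss: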
	- Loss: 0.0808
	- Accuracy: 0.6938
	- Macro F1: 0.6484
	- Macro Precision: 0.7149
	- Macro Recall: 0.6176
	- Micro F1: 0.7545
	- Micro Precision: 0.7874
	- Micro Recall: 0.7242
	- Flagged/accuracy: 0.8566
	- Flagged/precision: 0.8975
	- Flagged/recall: 0.8380
	- Flagged/f1: 0.8667
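
As a quick consistency check on the figures above, the micro-averaged and Flagged F1 scores can be recomputed from their reported precision and recall, since each is the harmonic mean of a single precision/recall pair. The macro F1 cannot be recovered this way: assuming the usual definition (the unweighted mean of per-label F1 scores, which this card does not spell out), it generally differs from the harmonic mean of macro precision and macro recall. The helper name `f1_from_pr` is illustrative and not part of the training code.

```python
def f1_from_pr(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision/recall pairs copied from the evaluation results listed above.
micro_f1 = f1_from_pr(0.7874, 0.7242)
flagged_f1 = f1_from_pr(0.8975, 0.8380)

# Harmonic mean of the *macro* precision and recall. Under the assumed
# definition of macro F1 (mean of per-label F1), this is not expected to
# equal the reported Macro F1 of 0.6484.
macro_harmonic = f1_from_pr(0.7149, 0.6176)

print(f"micro F1 recomputed:   {micro_f1:.4f} (reported 0.7545)")
print(f"flagged F1 recomputed: {flagged_f1:.4f} (reported 0.8667)")
print(f"harmonic mean of macro P/R: {macro_harmonic:.4f} (reported Macro F1 0.6484)")
```

The gap in the last line is expected for macro averaging and does not by itself indicate an error in the reported numbers.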

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 3.85e-06
	- train_batch_size: 16
	- eval_batch_size: 64
	- seed: 42
	- distributed_type: multi-GPU
	- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
	- lr_scheduler_type: linear
	- lr_scheduler_warmup_ratio: 0.1
	- num_epochs: 10
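
To make the learning-rate settings above concrete, here is a minimal sketch of a linear schedule with warmup, written in plain Python. It rests on several assumptions: that `linear` means a linear ramp from 0 to the peak rate followed by a linear decay to 0, that the schedule is planned over the full 10 configured epochs (the training results table only logs 5), and that one epoch is 33814 optimizer steps as logged in that table. The function name `lr_at` is hypothetical, and the Trainer's exact rounding of the warmup step count may differ by a step.

```python
PEAK_LR = 3.85e-06        # learning_rate from the list above
STEPS_PER_EPOCH = 33814   # step count logged at epoch 1.0 in the training results
NUM_EPOCHS = 10           # num_epochs from the list above
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS
WARMUP_STEPS = TOTAL_STEPS // 10  # lr_scheduler_warmup_ratio = 0.1 (assumed rounding)


def lr_at(step: int) -> float:
    """Learning rate at a given optimizer step under the assumed schedule."""
    if step < WARMUP_STEPS:
        # Linear warmup from 0 up to the peak learning rate.
        return PEAK_LR * step / WARMUP_STEPS
    # Linear decay from the peak down to 0 at the final planned step.
    remaining = max(0, TOTAL_STEPS - step)
    return PEAK_LR * remaining / (TOTAL_STEPS - WARMUP_STEPS)


for epoch in (1, 3, 5):
    step = epoch * STEPS_PER_EPOCH
    print(f"epoch {epoch} (step {step}): lr ~ {lr_at(step):.3e}")
```

Under these assumptions, warmup ends exactly at the end of the first epoch, so every logged checkpoint after epoch 1 sits on the decaying part of the schedule.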

	### Training results

	| Training Loss | Epoch | Step   | Validation Loss | Accuracy | Macro F1 | Macro Precision | Macro Recall | Micro F1 | Micro Precision | Micro Recall | Flagged/accuracy | Flagged/precision | Flagged/recall | Flagged/f1 |
	|:-------------:|:-----:|:------:|:---------------:|:--------:|:--------:|:---------------:|:------------:|:--------:|:---------------:|:------------:|:----------------:|:-----------------:|:--------------:|:----------:|
	| 0.0985        | 1.0   | 33814  | 0.0877          | 0.6750   | 0.6102   | 0.6629          | 0.5948       | 0.7406   | 0.7705          | 0.7129       | 0.8447           | 0.8701            | 0.8475         | 0.8586     |
	| 0.0867        | 2.0   | 67628  | 0.0817          | 0.6910   | 0.6185   | 0.7559          | 0.5598       | 0.7446   | 0.8165          | 0.6842       | 0.8465           | 0.9093            | 0.8043         | 0.8536     |
	| 0.0561        | 3.0   | 101442 | 0.0808          | 0.6938   | 0.6484   | 0.7149          | 0.6177       | 0.7545   | 0.7875          | 0.7242       | 0.8566           | 0.8975            | 0.8380         | 0.8667     |
	| 0.0913        | 4.0   | 135256 | 0.0812          | 0.6877   | 0.6412   | 0.7136          | 0.6144       | 0.7516   | 0.7796          | 0.7255       | 0.8546           | 0.8902            | 0.8428         | 0.8658     |
	| 0.0709        | 5.0   | 169070 | 0.0826          | 0.6911   | 0.6376   | 0.7306          | 0.5982       | 0.7500   | 0.7911          | 0.7129       | 0.8538           | 0.8936            | 0.8370         | 0.8643     |
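
The table above can be scanned programmatically to see which epoch scores best on a given criterion. The sketch below copies four columns from the table and picks the best row by validation loss and by Macro F1. Which criterion the author actually used to select the published checkpoint is not stated in this card, so treating either one as the selection rule is an assumption; the `EpochResult` type is introduced here only for illustration.

```python
from typing import NamedTuple


class EpochResult(NamedTuple):
    epoch: int
    step: int
    val_loss: float
    macro_f1: float


# Values copied from the training results table above.
results = [
    EpochResult(1, 33814, 0.0877, 0.6102),
    EpochResult(2, 67628, 0.0817, 0.6185),
    EpochResult(3, 101442, 0.0808, 0.6484),
    EpochResult(4, 135256, 0.0812, 0.6412),
    EpochResult(5, 169070, 0.0826, 0.6376),
]

# Lower validation loss is better; higher Macro F1 is better.
best_by_loss = min(results, key=lambda r: r.val_loss)
best_by_f1 = max(results, key=lambda r: r.macro_f1)

print(f"lowest validation loss: epoch {best_by_loss.epoch} (step {best_by_loss.step})")
print(f"highest Macro F1:       epoch {best_by_f1.epoch} (step {best_by_f1.step})")
```

Both criteria point to the same row here, and that row's figures agree with the headline evaluation results at the top of the card.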


	### Framework versions

	- Transformers 4.51.3
	- PyTorch 2.7.0+cu118
	- Datasets 3.5.1
	- Tokenizers 0.21.1