distilbert-base-uncased-banking77-classification

This model is a fine-tuned version of distilbert-base-uncased on the banking77 dataset. It achieves the following results on the evaluation set:

Loss: 0.3152
Accuracy: 0.9240
F1 Score: 0.9243

Model description

This is my first fine-tuning experiment with Hugging Face. Starting from the pretrained DistilBERT model, I trained a classifier for online banking queries. It could be useful for handling customer support tickets.

Intended uses & limitations

The model can be used for text classification. In particular, it is fine-tuned for the banking domain.

Training and evaluation data

The dataset used is banking77.

The 77 labels are:

label	intent
0	activate_my_card
1	age_limit
2	apple_pay_or_google_pay
3	atm_support
4	automatic_top_up
5	balance_not_updated_after_bank_transfer
6	balance_not_updated_after_cheque_or_cash_deposit
7	beneficiary_not_allowed
8	cancel_transfer
9	card_about_to_expire
10	card_acceptance
11	card_arrival
12	card_delivery_estimate
13	card_linking
14	card_not_working
15	card_payment_fee_charged
16	card_payment_not_recognised
17	card_payment_wrong_exchange_rate
18	card_swallowed
19	cash_withdrawal_charge
20	cash_withdrawal_not_recognised
21	change_pin
22	compromised_card
23	contactless_not_working
24	country_support
25	declined_card_payment
26	declined_cash_withdrawal
27	declined_transfer
28	direct_debit_payment_not_recognised
29	disposable_card_limits
30	edit_personal_details
31	exchange_charge
32	exchange_rate
33	exchange_via_app
34	extra_charge_on_statement
35	failed_transfer
36	fiat_currency_support
37	get_disposable_virtual_card
38	get_physical_card
39	getting_spare_card
40	getting_virtual_card
41	lost_or_stolen_card
42	lost_or_stolen_phone
43	order_physical_card
44	passcode_forgotten
45	pending_card_payment
46	pending_cash_withdrawal
47	pending_top_up
48	pending_transfer
49	pin_blocked
50	receiving_money
51	Refund_not_showing_up
52	request_refund
53	reverted_card_payment?
54	supported_cards_and_currencies
55	terminate_account
56	top_up_by_bank_transfer_charge
57	top_up_by_card_charge
58	top_up_by_cash_or_cheque
59	top_up_failed
60	top_up_limits
61	top_up_reverted
62	topping_up_by_card
63	transaction_charged_twice
64	transfer_fee_charged
65	transfer_into_account
66	transfer_not_received_by_recipient
67	transfer_timing
68	unable_to_verify_identity
69	verify_my_identity
70	verify_source_of_funds
71	verify_top_up
72	virtual_card_not_working
73	visa_or_mastercard
74	why_verify_identity
75	wrong_amount_of_cash_received
76	wrong_exchange_rate_for_cash_withdrawal
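As a minimal sketch of how the table above might be used in code, the snippet below resolves a numeric label id to its intent name. It assumes, without confirmation from this card, that a checkpoint could emit generic names of the form `LABEL_<id>`. The `ID2INTENT` excerpt and the `to_intent` helper are hypothetical and cover only a few rows of the table.

```python
# Hypothetical excerpt of the label table above; extend it to all 77 rows as needed.
ID2INTENT = {
    0: "activate_my_card",
    11: "card_arrival",
    25: "declined_card_payment",
    41: "lost_or_stolen_card",
    76: "wrong_exchange_rate_for_cash_withdrawal",
}


def to_intent(label):
    """Resolve an int id or a 'LABEL_<id>' string to an intent name.

    Strings that are not of the form 'LABEL_<id>' are assumed to already be
    intent names and are returned unchanged.
    """
    if isinstance(label, int):
        return ID2INTENT[label]
    if label.startswith("LABEL_"):
        return ID2INTENT[int(label[len("LABEL_"):])]
    return label


print(to_intent(25))
print(to_intent("LABEL_41"))
print(to_intent("card_arrival"))
```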

Training procedure

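The fine-tuned model can be loaded for inference with the pipeline API:

from transformers import pipeline

# Load the fine-tuned classifier from the Hugging Face Hub.
pipe = pipeline("text-classification", model="nickprock/distilbert-base-uncased-banking77-classification")
# Classify a single query; the result holds the predicted label and its score.
print(pipe("I can't pay by my credit card"))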
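Since this card suggests the classifier could help with tickets, one possible pattern is to act only on confident predictions. The sketch below is an illustration, not the author's workflow: `route_ticket`, the `"needs_human_review"` fallback, and the 0.5 threshold are all hypothetical. It expects predictions shaped like the text-classification pipeline's output for one input, a list of `{"label": ..., "score": ...}` dicts.

```python
def route_ticket(predictions, threshold=0.5):
    """Pick the best-scoring intent, or fall back when confidence is low.

    `predictions` is a list of {"label": str, "score": float} dicts.
    The threshold is an arbitrary placeholder, not a tuned setting.
    """
    best = max(predictions, key=lambda p: p["score"])
    if best["score"] < threshold:
        return "needs_human_review"  # hypothetical fallback queue
    return best["label"]


# Hand-written example predictions (not real model output).
confident = [{"label": "declined_card_payment", "score": 0.91}]
uncertain = [
    {"label": "card_not_working", "score": 0.34},
    {"label": "declined_card_payment", "score": 0.31},
]

print(route_ticket(confident))
print(route_ticket(uncertain))
```

With a real model, `predictions` would be the list returned by the pipeline for a single query string.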

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 2e-05
train_batch_size: 64
eval_batch_size: 64
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 20
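The step counts reported in the training results follow from these settings. As a worked check, the sketch below assumes the public banking77 training split of 10,003 examples (a figure not stated in this card) and a single device, so that each optimizer step consumes one batch of 64. The `training_kwargs` dictionary is an assumed mapping of the listed values onto keyword names of `transformers.TrainingArguments`, not the author's actual configuration.

```python
import math

# Hyperparameters listed above.
train_batch_size = 64
num_epochs = 20

# Assumption: size of the public banking77 training split (not stated in the card).
num_train_examples = 10_003

# The last, partial batch still counts as a step, hence the ceiling.
steps_per_epoch = math.ceil(num_train_examples / train_batch_size)
total_steps = steps_per_epoch * num_epochs

print(steps_per_epoch)  # 157
print(total_steps)      # 3140

# Assumed mapping onto transformers.TrainingArguments keyword names
# (single device assumed, so the per-device batch size equals the listed one).
training_kwargs = {
    "learning_rate": 2e-5,
    "per_device_train_batch_size": train_batch_size,
    "per_device_eval_batch_size": 64,
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": num_epochs,
}
```

Both numbers match the Step column of the training results: 157 steps after the first epoch and 3140 after the last.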

Training results

Training Loss	Epoch	Step	Validation Loss	Accuracy	F1 Score
3.8732	1.0	157	3.1476	0.5370	0.4881
2.5598	2.0	314	1.9780	0.6916	0.6585
1.5863	3.0	471	1.2239	0.8042	0.7864
0.9829	4.0	628	0.8067	0.8565	0.8487
0.6274	5.0	785	0.5837	0.8799	0.8752
0.4304	6.0	942	0.4630	0.9042	0.9040
0.3106	7.0	1099	0.3982	0.9088	0.9087
0.2238	8.0	1256	0.3587	0.9110	0.9113
0.1708	9.0	1413	0.3351	0.9208	0.9208
0.1256	10.0	1570	0.3242	0.9179	0.9182
0.0981	11.0	1727	0.3136	0.9211	0.9214
0.0745	12.0	1884	0.3151	0.9211	0.9213
0.0601	13.0	2041	0.3089	0.9218	0.9220
0.0482	14.0	2198	0.3158	0.9214	0.9216
0.0402	15.0	2355	0.3126	0.9224	0.9226
0.0344	16.0	2512	0.3143	0.9231	0.9233
0.0298	17.0	2669	0.3156	0.9231	0.9233
0.0272	18.0	2826	0.3134	0.9244	0.9247
0.0237	19.0	2983	0.3156	0.9244	0.9246
0.0229	20.0	3140	0.3152	0.9240	0.9243
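The table shows validation loss flattening around 0.31 from epoch 11 onward while training loss keeps falling. A short sketch for locating the best epochs in these numbers follows; the tuple layout and variable names are illustrative only.

```python
# (epoch, validation_loss, accuracy, f1) copied from the table above.
results = [
    (1, 3.1476, 0.5370, 0.4881),
    (2, 1.9780, 0.6916, 0.6585),
    (3, 1.2239, 0.8042, 0.7864),
    (4, 0.8067, 0.8565, 0.8487),
    (5, 0.5837, 0.8799, 0.8752),
    (6, 0.4630, 0.9042, 0.9040),
    (7, 0.3982, 0.9088, 0.9087),
    (8, 0.3587, 0.9110, 0.9113),
    (9, 0.3351, 0.9208, 0.9208),
    (10, 0.3242, 0.9179, 0.9182),
    (11, 0.3136, 0.9211, 0.9214),
    (12, 0.3151, 0.9211, 0.9213),
    (13, 0.3089, 0.9218, 0.9220),
    (14, 0.3158, 0.9214, 0.9216),
    (15, 0.3126, 0.9224, 0.9226),
    (16, 0.3143, 0.9231, 0.9233),
    (17, 0.3156, 0.9231, 0.9233),
    (18, 0.3134, 0.9244, 0.9247),
    (19, 0.3156, 0.9244, 0.9246),
    (20, 0.3152, 0.9240, 0.9243),
]

# Epoch with the lowest validation loss.
best_by_loss = min(results, key=lambda r: r[1])
# First epoch that reaches the highest accuracy (ties keep the earliest).
best_by_acc = max(results, key=lambda r: r[2])

print(best_by_loss[0], best_by_loss[1])  # 13 0.3089
print(best_by_acc[0], best_by_acc[2])    # 18 0.9244
```

The headline metrics at the top of this card correspond to the final epoch (20), not to either of these epochs.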

Framework versions

Transformers 4.20.1
Pytorch 1.12.0+cu113
Datasets 2.3.2
Tokenizers 0.12.1

Model size

67M params (Safetensors, tensor type F32)

Evaluation results

All values are self-reported on banking77.

Metric	Split	Value
Accuracy	(not specified)	0.924
Accuracy	test	0.924
Precision Macro	test	0.928
Precision Micro	test	0.924
Precision Weighted	test	0.928
Recall Macro	test	0.924
Recall Micro	test	0.924
Recall Weighted	test	0.924
F1 Macro	test	0.924
F1 Micro	test	0.924
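The micro-averaged precision, recall, and F1 reported here all equal the accuracy (0.924). For single-label multiclass classification this is expected, because every wrong prediction counts as exactly one false positive and one false negative. A toy illustration with made-up labels (not banking77 data):

```python
from collections import Counter

# Made-up gold labels and predictions for a 3-class, single-label task.
y_true = ["a", "a", "b", "b", "c", "c"]
y_pred = ["a", "b", "b", "b", "c", "a"]

tp = Counter()  # true positives per class
fp = Counter()  # false positives per class
fn = Counter()  # false negatives per class
for t, p in zip(y_true, y_pred):
    if t == p:
        tp[t] += 1
    else:
        fp[p] += 1  # predicted class gains a false positive
        fn[t] += 1  # true class gains a false negative

total_tp = sum(tp.values())
accuracy = total_tp / len(y_true)
micro_precision = total_tp / (total_tp + sum(fp.values()))
micro_recall = total_tp / (total_tp + sum(fn.values()))

# Macro averaging weights every class equally, so it can differ from accuracy.
classes = sorted(set(y_true))
macro_precision = sum(tp[c] / (tp[c] + fp[c]) for c in classes) / len(classes)

print(accuracy == micro_precision == micro_recall)  # True
print(round(macro_precision, 4))                    # 0.7222
```

The same effect explains why only the macro and weighted precision (0.928) differ from the other reported values.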
