---
license: apache-2.0
datasets:
- glue
language:
- en
metrics:
- accuracy
- f1
library_name: transformers
pipeline_tag: text-classification
widget:
  - text: The company didn 't detail the costs of the replacement and repairs . [SEP] But company officials expect the costs of the replacement work to run into the millions of dollars .
    example_title: not_equivalent
  - text: According to the federal Centers for Disease Control and Prevention ( news - web sites ) , there were 19 reported cases of measles in the United States in 2002 . [SEP] The Centers for Disease Control and Prevention said there were 19 reported cases of measles in the United States in 2002 .
    example_title: equivalent
---

# bert-base-uncased-finetuned-mrpc-v2

BERT (`"bert-base-uncased"`) fine-tuned on MRPC (Microsoft Research Paraphrase Corpus).

The model predicts whether two sentences are semantically equivalent. It accompanies section 4 of chapter 3 of the Hugging Face "NLP Course" (https://huggingface.co/learn/nlp-course/chapter3/4).

It was trained using a custom PyTorch loop with Hugging Face Accelerate.

Code: https://github.com/sambitmukherjee/huggingface-notebooks/blob/main/course/en/chapter3/section4.ipynb

Experiment tracking: https://wandb.ai/sadhaklal/bert-base-uncased-finetuned-mrpc-v2

## Usage

```python
from transformers import pipeline

# Load the fine-tuned model from the Hugging Face Hub.
classifier = pipeline("text-classification", model="sadhaklal/bert-base-uncased-finetuned-mrpc-v2")

# The two sentences are joined into a single string with " [SEP] " between them.
sentence1 = "A tropical storm rapidly developed in the Gulf of Mexico Sunday and was expected to hit somewhere along the Texas or Louisiana coasts by Monday night ."
sentence2 = "A tropical storm rapidly developed in the Gulf of Mexico on Sunday and could have hurricane-force winds when it hits land somewhere along the Louisiana coast Monday night ."
sentence_pair = sentence1 + " [SEP] " + sentence2
print(classifier(sentence_pair))

# A second sentence pair, formatted the same way.
sentence1 = "The settling companies would also assign their possible claims against the underwriters to the investor plaintiffs , he added ."
sentence2 = "Under the agreement , the settling companies will also assign their potential claims against the underwriters to the investors , he added ."
sentence_pair = sentence1 + " [SEP] " + sentence2
print(classifier(sentence_pair))
```
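
To score several sentence pairs in one call, the pair formatting above can be wrapped in a small helper. The following is a minimal sketch, not part of the original notebook: `format_pair` and `classify_pairs` are hypothetical names introduced here for illustration, and the sketch assumes only that the classifier accepts a list of strings, as a `transformers` pipeline does.

```python
from typing import Callable, Iterable, List, Tuple


def format_pair(sentence1: str, sentence2: str) -> str:
    """Join two sentences with " [SEP] ", matching the format used above."""
    return sentence1 + " [SEP] " + sentence2


def classify_pairs(classifier: Callable, pairs: Iterable[Tuple[str, str]]) -> List:
    """Format every (sentence1, sentence2) pair and classify them in one call.

    `classifier` is any callable that takes a list of strings and returns
    one result per string (hypothetical helper, for illustration only).
    """
    inputs = [format_pair(first, second) for first, second in pairs]
    if not inputs:
        # Nothing to classify; avoid calling the classifier with an empty list.
        return []
    return classifier(inputs)
```

With the `classifier` created in the snippet above, `classify_pairs(classifier, [(sentence1, sentence2)])` should return one prediction per pair, in the same order as the input.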

## Dataset

From the dataset page:

> The Microsoft Research Paraphrase Corpus (Dolan & Brockett, 2005) is a corpus of sentence pairs automatically extracted from online news sources, with human annotations for whether the sentences in the pair are semantically equivalent.

Examples: https://huggingface.co/datasets/glue/viewer/mrpc

## Metrics

Accuracy on the 'validation' split of MRPC: 0.875

F1 on the 'validation' split of MRPC: 0.9128
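
For reference, both metrics can be computed from integer predictions and labels as sketched below. This is an illustrative pure-Python version, not the evaluation code used for the numbers above; `accuracy_and_f1` is a hypothetical name, and the sketch assumes the MRPC convention that label 1 means "equivalent" and is treated as the positive class for F1.

```python
from typing import Sequence, Tuple


def accuracy_and_f1(preds: Sequence[int], labels: Sequence[int]) -> Tuple[float, float]:
    """Return (accuracy, binary F1) with label 1 as the positive class."""
    if len(preds) != len(labels) or not labels:
        raise ValueError("preds and labels must be non-empty and the same length")

    # Count the confusion-matrix cells needed for precision and recall.
    tp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 1)
    fp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 0)
    fn = sum(1 for p, y in zip(preds, labels) if p == 0 and y == 1)
    correct = sum(1 for p, y in zip(preds, labels) if p == y)

    accuracy = correct / len(labels)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return accuracy, f1
```

F1 ignores true negatives, so on a dataset where most pairs are labeled equivalent it can sit above accuracy, as it does in the figures reported here.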