Model Card: BERT-Large-Portuguese-Cased Fine-Tuned on InferBR NLI

Model Details

  • Model name: felipesfpaula/bertimbau-large-InferBr-NLI
  • Base model: neuralmind/bert-large-portuguese-cased
  • Task: Natural Language Inference (NLI) for Brazilian Portuguese
  • Model size: ~0.3B parameters (F32 weights, Safetensors format)
  • Dataset: InferBR
    • Premise–Hypothesis pairs in Portuguese
    • Label mapping:
      • 0 – Contradiction
      • 1 – Entailment
      • 2 – Neutral
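
As a quick sanity check, the mapping above can be read back from the checkpoint's config (assuming id2label was set when the model was uploaded; otherwise generic LABEL_0/1/2 names are returned):

from transformers import AutoConfig

config = AutoConfig.from_pretrained("felipesfpaula/bertimbau-large-InferBr-NLI")
print(config.id2label)  # expected: {0: "Contradiction", 1: "Entailment", 2: "Neutral"}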

Intended Use

This model is intended for research and applications requiring Portuguese NLI, such as:

  • Automated textual reasoning in Portuguese
  • Downstream tasks: question answering, summarization consistency checks, semantic search
  • Academic experiments in Portuguese natural language understanding

Not intended for:

  • Sensitive decision-making without human oversight
  • Use on texts in languages other than Brazilian Portuguese

Training Data

  • Training split: InferBR “train” (premise, hypothesis, label)
  • Validation split: InferBR “validation”
  • Test split: InferBR “test”
  • Preprocessing (see the sketch after this list):
    • Tokenized with neuralmind/bert-large-portuguese-cased tokenizer
    • Maximum sequence length: 128 tokens
    • Padding to max length
    • Labels cast to integer IDs {0,1,2}
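
A minimal sketch of this preprocessing with the datasets library. The dataset identifier and the column names premise/hypothesis/label are assumptions; adjust them to match the actual InferBR release:

from datasets import load_dataset
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("neuralmind/bert-large-portuguese-cased")

def preprocess(batch):
    # Pair-encode premise and hypothesis, truncating/padding to 128 tokens
    enc = tokenizer(
        batch["premise"],
        batch["hypothesis"],
        max_length=128,
        truncation=True,
        padding="max_length",
    )
    enc["labels"] = [int(label) for label in batch["label"]]  # cast labels to IDs {0,1,2}
    return enc

dataset = load_dataset("felipesfpaula/InferBR")  # hypothetical dataset ID
tokenized = dataset.map(preprocess, batched=True)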

Training Procedure

  • Fine-tuned from: neuralmind/bert-large-portuguese-cased (see the Trainer sketch after this list)
  • Batch size: 32
  • Learning rate: 2e-5
  • Optimizer: AdamW (with default weight decay)
  • Number of epochs: 10
  • Evaluation strategy: evaluate on the validation split at the end of each epoch
  • Checkpointing: best checkpoint selected by validation accuracy
  • Random seed: 42
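
A minimal Trainer setup matching these hyperparameters (a sketch, not the authors' original training script; it continues from the tokenized dataset built in the preprocessing sketch above):

import numpy as np
import evaluate
from transformers import (
    AutoModelForSequenceClassification,
    Trainer,
    TrainingArguments,
)

model = AutoModelForSequenceClassification.from_pretrained(
    "neuralmind/bert-large-portuguese-cased", num_labels=3
)

args = TrainingArguments(
    output_dir="bertimbau-large-inferbr-nli",
    per_device_train_batch_size=32,
    learning_rate=2e-5,
    num_train_epochs=10,
    eval_strategy="epoch",           # "evaluation_strategy" on older transformers versions
    save_strategy="epoch",
    load_best_model_at_end=True,     # keep the best checkpoint by the metric below
    metric_for_best_model="accuracy",
    seed=42,
)

accuracy = evaluate.load("accuracy")

def compute_metrics(eval_pred):
    # Argmax over logits, then compare against gold labels
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    return accuracy.compute(predictions=preds, references=labels)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["validation"],
    compute_metrics=compute_metrics,
)
trainer.train()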

Evaluation Results (Test Set)

  • Test accuracy: 0.9395
  • Test F₁-macro: 0.7596
    • F₁ label 0 (Contradiction): 0.9191
    • F₁ label 1 (Entailment): 0.6022
    • F₁ label 2 (Neutral): 0.7575

These metrics were computed on the held-out InferBR test split.

  • accuracy = (number of correctly predicted labels) / (total number of examples)
  • f1_macro = unweighted mean of the per-label F₁ scores over labels {0, 1, 2} (see the snippet below)
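
Equivalently, in code with scikit-learn (a sketch; y_true and y_pred stand for the gold and predicted label IDs on the test split, shown here with toy values):

from sklearn.metrics import accuracy_score, f1_score

y_true = [0, 1, 2, 2, 0]  # toy gold labels
y_pred = [0, 1, 2, 1, 0]  # toy predictions

acc = accuracy_score(y_true, y_pred)                              # fraction of correct predictions
f1_macro = f1_score(y_true, y_pred, average="macro")              # unweighted mean over labels
f1_per_label = f1_score(y_true, y_pred, average=None, labels=[0, 1, 2])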

Limitations

  • Imbalanced performance: Label 1 (Entailment) has a markedly lower F₁ (0.6022), indicating the model often confuses entailment with the other two classes.
  • Domain specificity: Trained on InferBR, which consists of generic NLI pairs; the model may not generalize to highly specialized or technical domains (e.g., legal, medical).
  • Language restrictions: Supports Brazilian Portuguese only. Performance on European Portuguese or code-switched text is not guaranteed.
  • Bias and fairness: InferBR may not cover all topics, registers, or writing styles of Portuguese. Use caution when deploying in production for sensitive tasks.

How to Use

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# 1. Load tokenizer and model from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained("felipesfpaula/bertimbau-large-InferBr-NLI")
model = AutoModelForSequenceClassification.from_pretrained("felipesfpaula/bertimbau-large-InferBr-NLI")
model.eval()  # disable dropout for deterministic inference

# 2. Encode a premise–hypothesis pair
premise = "O gato está sentado no sofá."
hypothesis = "O gato está deitado no sofá."
encoded = tokenizer(premise, hypothesis, return_tensors="pt", max_length=128, truncation=True, padding="max_length")

# 3. Run inference
with torch.no_grad():
    outputs = model(**encoded)
    logits = outputs.logits
    pred_id = torch.argmax(logits, dim=-1).item()

# 4. Map prediction to label
label_map = {0: "Contradiction", 1: "Entailment", 2: "Neutral"}
print(f"Predicted label: {label_map[pred_id]}")