Model Card for bert-imdb-sentiment

This is a fine-tuned bert-base-uncased model for binary sentiment classification on the IMDb movie reviews dataset.
The model predicts whether a given movie review is positive or negative.

Model Details

Model Description

This model is a BertForSequenceClassification model fine-tuned using Hugging Face Transformers and the IMDb dataset (25,000 movie reviews).
The training was done using the Trainer API with the following configuration:

  • Tokenization with BertTokenizer (bert-base-uncased), max sequence length of 256.

  • Fine-tuned for 3 epochs with learning rate 2e-5 and mixed-precision (fp16).

  • Achieved ~91.54% accuracy and F1 score of ~91.54% on the test split.

  • Developed by: Koushik Reddy

  • Model type: Transformer-based sequence classifier (BertForSequenceClassification)

  • Language(s) (NLP): English

  • Fine-tuned from model: bert-base-uncased (https://huggingface.co/bert-base-uncased)

Intended Uses & Limitations

  • ✅ Intended for sentiment classification of English movie reviews.
  • ⚠️ May not generalize well to other domains (e.g., tweets, product reviews) without additional fine-tuning.
  • ⚠️ May reflect biases present in the IMDb dataset and the original BERT pre-training corpus.

Direct Use

from transformers import BertForSequenceClassification, BertTokenizer
import torch

# Load the fine-tuned model and tokenizer from the Hub
model = BertForSequenceClassification.from_pretrained("koushik-25/bert-imdb-sentiment")
tokenizer = BertTokenizer.from_pretrained("koushik-25/bert-imdb-sentiment")
model.eval()  # disable dropout for inference

# Inference
inputs = tokenizer("The movie was fantastic!", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
pred = torch.argmax(logits, dim=1).item()
print(["NEGATIVE", "POSITIVE"][pred])

Training Details

Training Data

  • Dataset: IMDb movie reviews, loaded via datasets.load_dataset("imdb") (see the loading sketch below).
  • Size: 25,000 training and 25,000 test samples.
  • Preprocessing: tokenization with max_length=256, chosen based on the review-length histogram.
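
The dataset loads directly from the Hub; a minimal sketch:

from datasets import load_dataset

# IMDb ships with 25,000 labeled training reviews and 25,000 test reviews.
imdb = load_dataset("imdb")
print(imdb["train"].num_rows, imdb["test"].num_rows)  # 25000 25000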

Training Procedure

Preprocessing

  • Text was lowercased automatically by the tokenizer, since bert-base-uncased is a lowercase model.
  • Each example was tokenized with padding to max_length=256 and truncated if longer.
  • The dataset was split into train, validation, and test (see the sketch after this list):
    • train: samples 0–20,000 of the training set
    • val: samples 20,000–25,000 of the training set
    • test: the official IMDb test split
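
A minimal sketch of this preprocessing, reusing the imdb object from the loading example above. The raw IMDb train split is ordered by label, so a shuffle before slicing is assumed here (using the training seed, which the card does not explicitly tie to this step):

from transformers import BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")

def tokenize(batch):
    # Pad or truncate every review to exactly 256 tokens.
    return tokenizer(batch["text"], padding="max_length", truncation=True, max_length=256)

tokenized = imdb.map(tokenize, batched=True)

# Assumption: shuffle before slicing, since the raw train split is label-sorted.
shuffled = tokenized["train"].shuffle(seed=224)
train_ds = shuffled.select(range(20_000))
val_ds = shuffled.select(range(20_000, 25_000))
test_ds = tokenized["test"]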

Training Hyperparameters

  • Base Model: bert-base-uncased
  • Num Labels: 2 (binary classification)
  • Batch size: 4 per device (with gradient accumulation of 16 steps, so effective batch size = 64)
  • Learning Rate: 2e-5
  • Epochs: 3
  • Optimizer: AdamW (the Transformers default)
  • Mixed Precision: fp16 enabled (fp16=True in TrainingArguments) for faster training and lower memory usage
  • Scheduler: linear learning-rate schedule with warmup (the default)
  • Seed: 224 (see the TrainingArguments sketch after this list)
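
These hyperparameters map directly onto TrainingArguments; a minimal sketch, reusing train_ds and val_ds from the preprocessing example (the output directory name is illustrative):

from transformers import (
    BertForSequenceClassification,
    Trainer,
    TrainingArguments,
    set_seed,
)

set_seed(224)
model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

args = TrainingArguments(
    output_dir="bert-imdb-sentiment",  # illustrative path
    num_train_epochs=3,
    learning_rate=2e-5,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=16,    # effective batch size: 4 * 16 = 64
    fp16=True,                         # mixed precision (requires a CUDA GPU)
    seed=224,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_ds,
    eval_dataset=val_ds,
)
trainer.train()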

Speeds, Sizes, Times

  • Training Time: varies by GPU; typically around 15–20 minutes on a T4.
  • Checkpoint Size: ~420MB for pytorch_model.bin (BERT base plus the classification head).
  • Total Parameters: ~110 million (see the check below).
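
The parameter count is easy to verify on the model object from the training sketch above; a minimal check:

# ~110M parameters: BERT base plus the 2-way classification head.
n_params = sum(p.numel() for p in model.parameters())
print(f"{n_params / 1e6:.1f}M parameters")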

Evaluation

Testing Data, Factors & Metrics

Testing Data

  • Dataset: IMDb test split (25,000 reviews) held out from training.
  • Preprocessing: identical to training (lowercased, tokenized with max_length=256).

Factors

  • This model was evaluated on the overall IMDb test set only. No specific subgroup or domain disaggregation was done.
  • The model is expected to generalize well to similar English movie review sentiment but may not be robust to domain shifts.

Metrics

  • Accuracy: fraction of correctly classified reviews.
  • F1 Score: weighted average F1 across classes, balancing precision and recall (see the compute_metrics sketch below).
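
Both metrics can be supplied to the Trainer through a compute_metrics callback. A minimal sketch using scikit-learn; the card does not state how the metrics were computed, so this helper is illustrative:

import numpy as np
from sklearn.metrics import accuracy_score, f1_score

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    return {
        "accuracy": accuracy_score(labels, preds),
        "f1": f1_score(labels, preds, average="weighted"),  # weighted F1
    }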

Evaluation Results

Metric      Score
Accuracy    91.54%
F1 Score    91.54%

Evaluated on the IMDb test set.
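
Reproducing these numbers amounts to evaluating on the held-out test split; a minimal sketch, reusing trainer, test_ds, and compute_metrics from the earlier examples:

# Attach the metrics helper, then evaluate on the official test split.
trainer.compute_metrics = compute_metrics
test_metrics = trainer.evaluate(eval_dataset=test_ds)
print(test_metrics)  # expect eval_accuracy and eval_f1 near 0.9154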

Summary

This is a fine-tuned BERT model (bert-base-uncased) for binary sentiment analysis on the IMDb movie reviews dataset.
It classifies a given movie review as positive or negative with an accuracy of 91.54% and a weighted F1 score of 91.54% on the test set.
The model was trained using the Hugging Face transformers library, with tokenization based on a maximum sequence length of 256 tokens to balance coverage and efficiency.

The model is intended for English movie reviews but may generalize reasonably to similar sentiment analysis tasks on longer-form English text.
