Spaces:

iSathyam03
/

ViBERTa

Build error

File size: 3,967 Bytes

---
title: ViBERTa
emoji: 💬
colorFrom: red
colorTo: gray
sdk: streamlit
app_file: app.py
pinned: false
sdk_version: 1.44.1
---

# ViBERTa: Unwrapping Customer Sentiments with Sentiment Analysis! 🍔

## 🌟 Overview

ViBERTa (VIBE + DeBERTa) is a sentiment analysis model fine-tuned on the McDonald's review dataset. Leveraging the power of Microsoft's DeBERTa, this model provides precise sentiment classification for customer reviews.

## 📋 Model Specifications

### Key Details
- **Model Name:** ViBERTa
- **Base Model:** `microsoft/deberta-v3-base`
- **Primary Task:** Sentiment Classification
- **Sentiment Classes:** 
  - 0: Negative
  - 1: Neutral
  - 2: Positive

### 🔬 Technical Highlights
- Advanced transformer-based architecture
- Fine-tuned on domain-specific McDonald's review data
- High accuracy in sentiment prediction

## 🗂 Dataset Insights

### McDonald's Review Dataset
- Source: Kaggle
- Comprehensive collection of customer reviews
- Manually labeled sentiment categories
- Diverse range of customer experiences

## 🛠 Training Methodology

### Configuration Parameters
| Parameter | Value |
|-----------|-------|
| Batch Size | 16 |
| Total Epochs | 3 |
| Learning Rate | 2e-5 |
| Optimizer | AdamW |
| Learning Rate Scheduler | Cosine decay with warmup |
| Warmup Ratio | 10% |
| Weight Decay | 0.01 |
| Mixed Precision | Enabled (fp16) |
| Gradient Accumulation Steps | 2 |

### Training Approach
- Tokenization using DeBERTa tokenizer
- Cross-entropy loss function
- Adaptive learning rate scheduling
- Gradient accumulation for stable training

## 🚀 Quick Start Guide

### Installation

Install the required dependencies:

```bash
# Create a virtual environment (recommended)
python -m venv viberta_env
source viberta_env/bin/activate  # On Windows, use `viberta_env\Scripts\activate`

# Install dependencies
pip install torch transformers datasets
```

### Model Inference

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load model and tokenizer
model_name = "iSathyam03/McD_Reviews_Sentiment_Analysis"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

def predict_sentiment(text):
    """Predict sentiment for given text."""
    inputs = tokenizer(
        text, 
        return_tensors="pt", 
        truncation=True, 
        padding=True
    )
    
    with torch.no_grad():
        outputs = model(**inputs)
    
    logits = outputs.logits
    prediction = torch.argmax(logits, dim=1).item()
    
    sentiment_labels = {
        0: "Negative", 
        1: "Neutral", 
        2: "Positive"
    }
    
    return sentiment_labels[prediction]

# Example usage
review = "The fries were amazing but the burger was stale."
sentiment = predict_sentiment(review)
print(f"Sentiment: {sentiment}")
```

## 📊 Performance Metrics

### Evaluation Results
- **Accuracy:** 0.856
- **F1-Score:** 0.853

### Confusion Matrix
[Include a visual or textual representation of the confusion matrix]

## 🌐 Deployment Options

1. **Hugging Face Inference API**
   - Easy integration
   - Scalable cloud deployment

2. **Web Application Frameworks**
   - Streamlit for interactive demos
   - Gradio for quick UI prototypes
   - Flask/FastAPI for robust REST APIs

## 🔍 Limitations & Considerations
- Performance may vary with out-of-domain text
- Potential bias inherited from training data
- Recommended to validate on your specific use case

## 📚 References & Citations

### Primary Citation
```bibtex
@article{he2020deberta,
  title={DeBERTa: Decoding-enhanced BERT with Disentangled Attention},
  author={He, Pengcheng and Liu, Xiaodong and Gao, Jianfeng and Chen, Weizhu},
  journal={arXiv preprint arXiv:2006.03654},
  year={2020}
}
```

**💡 Pro Tip:** Always validate model performance on your specific dataset!

⭐ **Found ViBERTa helpful? Don't forget to star the repository!** 🌟