metadata

title: ViBERTa
emoji: 💬
colorFrom: red
colorTo: gray
sdk: streamlit
app_file: app.py
pinned: false
sdk_version: 1.44.1

ViBERTa: Unwrapping Customer Sentiments with Sentiment Analysis! 🍔

🌟 Overview

ViBERTa (VIBE + DeBERTa) is a sentiment analysis model fine-tuned on the McDonald's review dataset. Built on Microsoft's DeBERTa-v3 base model, it classifies customer reviews as negative, neutral, or positive.

📋 Model Specifications

Key Details

Model Name: ViBERTa
Base Model: microsoft/deberta-v3-base
Primary Task: Sentiment Classification
Sentiment Classes:
- 0: Negative
- 1: Neutral
- 2: Positive

🔬 Technical Highlights

Transformer-based architecture (DeBERTa-v3 base, which uses disentangled attention)
Fine-tuned on domain-specific McDonald's review data
Reported accuracy of 0.856 and F1-score of 0.853 (see Performance Metrics)

🗂 Dataset Insights

McDonald's Review Dataset

Source: Kaggle
Comprehensive collection of customer reviews
Manually labeled sentiment categories
Diverse range of customer experiences

🛠 Training Methodology

Configuration Parameters

Parameter	Value
Batch Size	16
Total Epochs	3
Learning Rate	2e-5
Optimizer	AdamW
Learning Rate Scheduler	Cosine decay with warmup
Warmup Ratio	10%
Weight Decay	0.01
Mixed Precision	Enabled (fp16)
Gradient Accumulation Steps	2
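
To make the table concrete, the sketch below derives the quantities these settings imply: the effective batch size and the number of optimizer and warmup steps. The training-set size (`num_examples`) is a hypothetical placeholder, since this README does not state it, and a single device is assumed. Rounding conventions differ between training libraries, so treat the step counts as approximate.

```python
import math

# Hyperparameters copied from the configuration table.
CONFIG = {
    "batch_size": 16,
    "epochs": 3,
    "learning_rate": 2e-5,
    "warmup_ratio": 0.10,
    "weight_decay": 0.01,
    "gradient_accumulation_steps": 2,
}


def derived_schedule(num_examples, config=CONFIG):
    """Return the step counts implied by the configuration.

    Assumes a single device. num_examples is a hypothetical dataset size.
    """
    accumulation = config["gradient_accumulation_steps"]
    # Each optimizer update sees batch_size * accumulation examples.
    effective_batch_size = config["batch_size"] * accumulation
    batches_per_epoch = math.ceil(num_examples / config["batch_size"])
    steps_per_epoch = math.ceil(batches_per_epoch / accumulation)
    total_steps = steps_per_epoch * config["epochs"]
    warmup_steps = math.ceil(total_steps * config["warmup_ratio"])
    return {
        "effective_batch_size": effective_batch_size,
        "steps_per_epoch": steps_per_epoch,
        "total_steps": total_steps,
        "warmup_steps": warmup_steps,
    }


# Hypothetical dataset size, chosen only for illustration.
print(derived_schedule(num_examples=10_000))
```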

Training Approach

Tokenization using DeBERTa tokenizer
Cross-entropy loss function
Cosine learning rate decay after a 10% warmup
Gradient accumulation over 2 steps (effective batch size of 32 on a single device)
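
The learning rate schedule named in the configuration can be sketched in plain Python. This is an illustration of the common shape (linear warmup followed by a half-cosine decay to zero), not the author's training code; the function name and the example step counts are assumptions.

```python
import math


def cosine_lr_with_warmup(step, total_steps, warmup_steps, base_lr=2e-5):
    """Learning rate at a given optimizer step.

    Rises linearly from 0 to base_lr during warmup, then follows
    half a cosine wave from base_lr down to 0.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    # Fraction of the decay phase that has elapsed, in [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


# Hypothetical run: 1000 optimizer steps with a 10% warmup.
for step in (0, 50, 100, 550, 1000):
    lr = cosine_lr_with_warmup(step, total_steps=1000, warmup_steps=100)
    print(f"step {step:4d}: lr = {lr:.2e}")
```

The rate peaks at the end of warmup and reaches zero at the final step, so the largest updates happen early and the last epoch makes only small adjustments.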

🚀 Quick Start Guide

Installation

Install the required dependencies:

# Create a virtual environment (recommended)
python -m venv viberta_env
source viberta_env/bin/activate  # On Windows, use `viberta_env\Scripts\activate`

# Install dependencies
pip install torch transformers datasets

Model Inference

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load model and tokenizer
model_name = "iSathyam03/McD_Reviews_Sentiment_Analysis"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

def predict_sentiment(text):
    """Predict sentiment for given text."""
    inputs = tokenizer(
        text, 
        return_tensors="pt", 
        truncation=True, 
        padding=True
    )
    
    with torch.no_grad():
        outputs = model(**inputs)
    
    logits = outputs.logits
    prediction = torch.argmax(logits, dim=1).item()
    
    sentiment_labels = {
        0: "Negative", 
        1: "Neutral", 
        2: "Positive"
    }
    
    return sentiment_labels[prediction]

# Example usage
review = "The fries were amazing but the burger was stale."
sentiment = predict_sentiment(review)
print(f"Sentiment: {sentiment}")
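
`predict_sentiment` returns only the top label. When a confidence score is also useful, the logits can be passed through a softmax. The sketch below shows that arithmetic in plain Python on a hypothetical logit vector; with the model above, `torch.softmax(outputs.logits, dim=1)` would play the same role. The helper names are illustrative and not part of the original code.

```python
import math

SENTIMENT_LABELS = {0: "Negative", 1: "Neutral", 2: "Positive"}


def softmax(logits):
    """Convert raw logits to probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def label_with_confidence(logits):
    """Return the most likely label and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return SENTIMENT_LABELS[best], probs[best]


# Hypothetical logits for one review, not real model output.
example_logits = [-1.2, 0.3, 2.1]
label, confidence = label_with_confidence(example_logits)
print(f"Sentiment: {label} (confidence {confidence:.1%})")
```

A low top probability on a mixed review, such as the fries-and-burger example, can be a signal to route the text for manual review instead of trusting a single label.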

📊 Performance Metrics

Evaluation Results

Accuracy: 0.856
F1-Score: 0.853

Confusion Matrix

[Include a visual or textual representation of the confusion matrix]
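
Until the actual matrix is published, here is a minimal sketch of how one could be computed from predictions, along with accuracy and F1 derived from it. The labels below are toy values invented for illustration, not this model's results, and macro averaging is an assumption because the README does not say how its F1-score was averaged.

```python
def confusion_matrix(y_true, y_pred, num_classes=3):
    """Rows are true classes, columns are predicted classes."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for true_label, predicted_label in zip(y_true, y_pred):
        matrix[true_label][predicted_label] += 1
    return matrix


def accuracy(matrix):
    """Share of examples on the diagonal (correct predictions)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


def macro_f1(matrix):
    """Unweighted mean of per-class F1 scores."""
    n = len(matrix)
    scores = []
    for c in range(n):
        tp = matrix[c][c]
        fp = sum(matrix[r][c] for r in range(n)) - tp
        fn = sum(matrix[c]) - tp
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / n


# Toy labels (0 = Negative, 1 = Neutral, 2 = Positive), for illustration only.
y_true = [0, 0, 1, 1, 2, 2, 2, 0]
y_pred = [0, 1, 1, 1, 2, 2, 0, 0]

matrix = confusion_matrix(y_true, y_pred)
for row in matrix:
    print(row)
print(f"accuracy = {accuracy(matrix):.3f}, macro F1 = {macro_f1(matrix):.3f}")
```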

🌐 Deployment Options

Hugging Face Inference API
- Easy integration
- Scalable cloud deployment
Web Application Frameworks
- Streamlit for interactive demos
- Gradio for quick UI prototypes
- Flask/FastAPI for robust REST APIs

🔍 Limitations & Considerations

Performance may vary with out-of-domain text
Potential bias inherited from training data
Recommended to validate on your specific use case

📚 References & Citations

Primary Citation

@article{he2020deberta,
  title={DeBERTa: Decoding-enhanced BERT with Disentangled Attention},
  author={He, Pengcheng and Liu, Xiaodong and Gao, Jianfeng and Chen, Weizhu},
  journal={arXiv preprint arXiv:2006.03654},
  year={2020}
}

💡 Pro Tip: Always validate model performance on your specific dataset!

⭐ Found ViBERTa helpful? Don't forget to star the repository! 🌟