---
license: mit
language:
- en
metrics:
- accuracy
- precision
- recall
- f1
pipeline_tag: text-classification
tags:
- NLP
- sentiment
- logisticregression
---

# 🧠 Sentiment Analysis with Logistic Regression

This model performs **multi-class sentiment analysis** on tweets, classifying them into the following categories:

- Positive
- Negative
- Neutral
- Irrelevant

It uses a custom preprocessing pipeline with:

- CountVectorizer
- TF-IDF transformation
- Logistic Regression classifier (`max_iter=1000`)

---

## 🏗 Model Architecture

- **CountVectorizer**: Converts tweets into token count vectors.
- **TfidfTransformer**: Reweights token counts by TF-IDF, so that terms common across all tweets count for less.
- **LogisticRegression**: Interpretable and robust classification baseline.

---

## 🧪 Evaluation

Evaluated on a separate validation set of 999 tweets:

| Class                | Precision | Recall | F1-score |
|----------------------|-----------|--------|----------|
| Irrelevant           | 0.88      | 0.85   | 0.87     |
| Negative             | 0.87      | 0.94   | 0.91     |
| Neutral              | 0.97      | 0.86   | 0.91     |
| Positive             | 0.89      | 0.94   | 0.91     |
| **Overall Accuracy** |           |        | **0.90** |

---

## 📦 Usage

```python
import joblib

# Load the serialized scikit-learn pipeline (vectorizer, TF-IDF, classifier)
model = joblib.load("sentiment_model_lr.pkl")

user_input = "This update is surprisingly good!"
prediction = model.predict([user_input])  # predict expects a list of strings

print(prediction[0])  # → Positive, Negative, etc.
```

> ⚠️ Requires scikit-learn 1.6.1+ to avoid version mismatch warnings.

---

## 📚 Dataset

Tweets were preprocessed using a `clean_text` routine and labeled into the four sentiment categories.
If you'd like to experiment or re-train, contact the author or fork this repo.

---

## 🧑‍💻 Author

- Built by @arshvir
- Model version: 1.0
- License: MIT

---
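
For readers who want to experiment without the pickled file, the three components listed under Model Architecture can be wired together roughly as follows. This is a minimal sketch, not the author's training script: the step names (`counts`, `tfidf`, `clf`), the `build_pipeline` helper, and the toy tweets are all assumptions made for illustration, and only `max_iter=1000` is taken from this card. The real training data and the `clean_text` routine are not reproduced here.

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline


def build_pipeline():
    """Hypothetical helper: chain the three components named in this card."""
    return Pipeline([
        ("counts", CountVectorizer()),                 # tweets -> token count vectors
        ("tfidf", TfidfTransformer()),                 # counts -> TF-IDF weights
        ("clf", LogisticRegression(max_iter=1000)),    # max_iter as stated in the card
    ])


# Toy examples invented for this sketch; they are NOT the original dataset.
texts = [
    "I love this game, it is great",
    "Best update ever, really great work",
    "This patch is terrible and broken",
    "I hate the new broken matchmaking",
    "The servers go down for maintenance at noon",
    "Patch notes were published today",
    "Anyone know a good pizza place nearby",
    "My cat just knocked over a plant",
]
labels = [
    "Positive", "Positive",
    "Negative", "Negative",
    "Neutral", "Neutral",
    "Irrelevant", "Irrelevant",
]

pipeline = build_pipeline()
pipeline.fit(texts, labels)

# As with the loaded model, predict takes a list of raw strings.
prediction = pipeline.predict(["What a great game"])
print(prediction[0])
```

With only eight toy tweets the prediction itself is not meaningful; the point is the shape of the pipeline. Because the vectorizer and classifier live in one `Pipeline` object, a single `predict` call on raw strings is enough, which matches how the pickled model is used in the Usage section.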