cardiffnlp/twitter-roberta-large-emoji-latest

This is a RoBERTa-large model trained on 154M tweets posted up to the end of December 2022 and fine-tuned for emoji classification (multiclass classification over 100 emojis) on the TweetEmoji100 dataset of SuperTweetEval. The original Twitter-based RoBERTa-large model can be found here.
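
To make "multiclass classification" concrete, here is a minimal sketch of how one raw score per class is typically turned into probabilities that sum to 1, with the highest one taken as the prediction. It assumes the model's scores are a softmax over its output logits, which this card does not state explicitly; the three-element `toy_logits` list is hypothetical and stands in for the 100 emoji classes.

```python
import math

def softmax(logits):
    """Convert raw class scores into probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for three classes (the real model has 100 emoji classes).
toy_logits = [2.0, 1.0, 0.1]
probs = softmax(toy_logits)

# In multiclass classification exactly one class is predicted:
# the index with the highest probability.
predicted_index = max(range(len(probs)), key=probs.__getitem__)
```

Because the probabilities share a single budget of 1, a confident prediction for one emoji necessarily lowers the scores of all the others.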

Example

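from transformers import pipeline

text = "I’m tired of being sick.. it’s been four days dawg"

# return_all_scores=True returns a score for every emoji label; recent
# transformers releases deprecate it in favor of top_k=None.
pipe = pipeline('text-classification', model="cardiffnlp/twitter-roberta-large-emoji-latest", return_all_scores=True)
predictions = pipe(text)[0]  # scores for the single input text
# Sort the labels from most to least likely and show the five best.
predictions = sorted(predictions, key=lambda d: d['score'], reverse=True)
predictions[:5]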
>> [{'label': '😒', 'score': 0.3771325647830963},
 {'label': '😑', 'score': 0.11055194586515427},
 {'label': '😤', 'score': 0.06117523834109306},
 {'label': '😡', 'score': 0.0564400739967823},
 {'label': '😫', 'score': 0.047937799245119095}]
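
The sort-and-slice step in the example can also be written as a small reusable helper. The sketch below is one possible way to do it, not part of the model's API: `top_k` is a hypothetical name, and `sample` simply reuses the five scores printed above (in shuffled order) so that the snippet runs without downloading the model.

```python
import heapq

def top_k(predictions, k=5):
    """Return the k highest-scoring {'label', 'score'} dicts, best first."""
    return heapq.nlargest(k, predictions, key=lambda d: d["score"])

# Scores copied from the example output above, deliberately out of order.
sample = [
    {"label": "😤", "score": 0.06117523834109306},
    {"label": "😫", "score": 0.047937799245119095},
    {"label": "😒", "score": 0.3771325647830963},
    {"label": "😡", "score": 0.0564400739967823},
    {"label": "😑", "score": 0.11055194586515427},
]

best_two = top_k(sample, k=2)
best_labels = [d["label"] for d in best_two]
```

`heapq.nlargest` avoids sorting the full list, although with only 100 labels the difference from `sorted(...)[:k]` is negligible.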

Citation Information

Please cite the reference paper if you use this model.

@inproceedings{antypas2023supertweeteval,
  title={SuperTweetEval: A Challenging, Unified and Heterogeneous Benchmark for Social Media NLP Research},
  author={Dimosthenis Antypas and Asahi Ushio and Francesco Barbieri and Leonardo Neves and Kiamehr Rezaee and Luis Espinosa-Anke and Jiaxin Pei and Jose Camacho-Collados},
  booktitle={Findings of the Association for Computational Linguistics: EMNLP 2023},
  year={2023}
}