Purpose of this finetuning

Finetune the base model GPT2-IMDB using a BERT sentiment classifier as the reward function.
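As a rough illustration of how a classifier output could become a scalar reward, here is a minimal sketch. It assumes the classifier returns one score per label in the form of `{"label": ..., "score": ...}` entries and that the negative class is named `NEGATIVE`. Both the label name and the helper `negative_reward` are assumptions for illustration, not details taken from this model's training code.

```python
from typing import Dict, List, Union

ScoreEntry = Dict[str, Union[str, float]]


def negative_reward(class_scores: List[ScoreEntry],
                    target_label: str = "NEGATIVE") -> float:
    """Return the classifier score for `target_label` as the scalar reward.

    `class_scores` is assumed to hold one entry per class, e.g.
    [{"label": "NEGATIVE", "score": 2.3}, {"label": "POSITIVE", "score": -2.1}].
    """
    for entry in class_scores:
        if entry["label"] == target_label:
            return float(entry["score"])
    raise KeyError(f"label {target_label!r} not found in classifier output")


# Hypothetical classifier output for one generated review (made-up numbers).
example_scores = [
    {"label": "NEGATIVE", "score": 2.3},
    {"label": "POSITIVE", "score": -2.1},
]
reward = negative_reward(example_scores)
```

Training the positive variant would, under the same assumptions, only require passing `target_label="POSITIVE"`.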

  • The goal is to train the GPT2 model to continue a movie review prompt with negative-sentiment text.
  • A separate training run produces a model that generates positive movie reviews. The eventual goal is to interpolate between the weights of the positively finetuned and negatively finetuned models, as described in the Rewarded Soups paper, and test whether the result generates (qualitatively) neutral reviews.
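The weight interpolation mentioned above can be sketched as a per-parameter linear combination, theta = lam * theta_pos + (1 - lam) * theta_neg. The function below is a minimal sketch, not the experiment's actual code: `interpolate_weights` and the toy parameter names are hypothetical, and the values can be anything that supports `*` and `+` (plain floats here; framework tensors should work the same way).

```python
from typing import Any, Dict


def interpolate_weights(pos_state: Dict[str, Any],
                        neg_state: Dict[str, Any],
                        lam: float) -> Dict[str, Any]:
    """Linearly interpolate two parameter dictionaries with matching keys.

    lam = 1.0 returns the positive model's weights, lam = 0.0 the negative
    model's weights, and lam = 0.5 the midpoint between them.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must be in [0, 1]")
    if pos_state.keys() != neg_state.keys():
        raise ValueError("both models must have the same parameter names")
    return {
        name: lam * pos_state[name] + (1.0 - lam) * neg_state[name]
        for name in pos_state
    }


# Toy example with scalar "parameters" (hypothetical names and values).
pos = {"w": 2.0, "b": 0.0}
neg = {"w": 0.0, "b": 4.0}
midpoint = interpolate_weights(pos, neg, lam=0.5)
```

When applying this to real model state dictionaries, non-floating-point buffers (for example integer or boolean masks) may need to be copied from one model rather than averaged.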

Model Params

Here are the training parameters:

  • base_model = 'lvwerra/gpt2-imdb'
  • dataset = 'stanfordnlp/imdb'
  • batch_size = 16
  • learning_rate = 1.41e-5
  • output_max_length = 16
  • output_min_length = 4
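For reference, the parameters above can be collected in a small config object. The sketch below also shows one plausible use of `output_min_length` and `output_max_length`: drawing a random generation length per sample. That usage, the inclusive bounds, and the names `FinetuneConfig` and `sample_output_length` are assumptions made for illustration; the model card does not state how the two lengths were used.

```python
import random
from dataclasses import dataclass


@dataclass
class FinetuneConfig:
    # Values copied from the parameter list above.
    base_model: str = "lvwerra/gpt2-imdb"
    dataset: str = "stanfordnlp/imdb"
    batch_size: int = 16
    learning_rate: float = 1.41e-5
    output_max_length: int = 16
    output_min_length: int = 4


def sample_output_length(cfg: FinetuneConfig, rng: random.Random) -> int:
    """Draw a generation length between the min and max lengths.

    Assumption: both bounds are treated as inclusive here.
    """
    return rng.randint(cfg.output_min_length, cfg.output_max_length)


cfg = FinetuneConfig()
rng = random.Random(0)  # fixed seed so the sketch is reproducible
lengths = [sample_output_length(cfg, rng) for _ in range(cfg.batch_size)]
```

Each entry in `lengths` would then be the number of new tokens to generate for the corresponding prompt in a batch.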

The exact training time was not recorded, but it was under a couple of hours on a single A6000 GPU.

Results

[Image: results figure]

Model size: 0.1B params (Safetensors, tensor type F32)

Model: Samzy17/gpt2-imdb-movie-reviews-negative (finetuned from lvwerra/gpt2-imdb)