GPT-2 Next Word Prediction

{% if error_message %} {{ error_message }} {% endif %}
{% if prediction %}

Prediction Results

Input text: {{ text }}

Next word: {{ prediction.token }}

Logit value: {{ "%.4f"|format(prediction.logit) }}

Probability: {{ "%.2f"|format(prediction.prob) }}%
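The probability shown here is the softmax of the raw logit taken over the model's full vocabulary. A minimal sketch of that conversion, using a toy four-token vocabulary (the values are illustrative, not GPT-2 outputs):

```python
import math

def softmax_prob(logits, index):
    """Convert one raw logit into a probability via softmax over all logits."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    return exps[index] / sum(exps)

# Toy vocabulary of 4 "tokens" with raw logits from a model's final layer.
logits = [2.0, 1.0, 0.5, -1.0]
prob = softmax_prob(logits, 0)
print(f"{prob * 100:.2f}%")  # → 60.95%
```

With GPT-2's actual 50,257-token vocabulary the computation is identical, just over a longer logit vector.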

{% if head_contributions %}

Layer Contributions to Log Probability

This chart shows how each of GPT-2's 12 layers influences the prediction of "{{ prediction.token }}" as the next word. Each bar represents one layer:

  • Green bars (positive): These layers push toward predicting this word
  • Purple bars (negative): These layers push against predicting this word
  • Taller bars indicate stronger influence
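The per-layer breakdown relies on the fact that GPT-2's residual stream is additive: each layer writes an update into the stream, and the final logit is (up to the final LayerNorm, which makes this only approximately exact in the real model) the sum of each update projected through the unembedding direction of the predicted token. A minimal numpy sketch with toy dimensions and random weights, where the variable names (`unembed`, `layer_outputs`) are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
n_layers, d_model = 12, 8  # toy sizes; real GPT-2 small is 12 layers, d_model=768

# Each layer writes an additive update into the residual stream.
layer_outputs = rng.normal(size=(n_layers, d_model))
embedding = rng.normal(size=d_model)  # token + position embedding
unembed = rng.normal(size=d_model)    # unembedding column for the predicted token

# Final residual stream = embedding + sum of all layer updates.
final_resid = embedding + layer_outputs.sum(axis=0)
total_logit = final_resid @ unembed

# Each layer's contribution is its own update projected through
# the same unembedding direction; the pieces sum back to the total.
contributions = layer_outputs @ unembed
reconstructed = embedding @ unembed + contributions.sum()
print(np.isclose(total_logit, reconstructed))  # → True
```

The 12 entries of `contributions` are the kind of per-layer values the bar chart visualizes: positive entries push the logit up, negative entries push it down.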

Attention Head Contributions

This heatmap shows how each of GPT-2's 144 attention heads (12 heads in each of 12 layers) influences the prediction:

  • Red squares: These heads strongly support predicting this word
  • Blue squares: These heads work against predicting this word
  • White/neutral squares: These heads have little influence on this prediction

Hover over any square to see its exact contribution. Interpretability research has shown that certain heads specialize in specific functions, such as tracking names or completing repeated patterns.
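The heatmap's per-head numbers rest on the same additivity: an attention layer's output is the concatenation of its heads multiplied by the output projection, which is algebraically equal to a sum of one term per head. A toy numpy sketch (random weights, illustrative dimensions and names):

```python
import numpy as np

rng = np.random.default_rng(1)
n_heads, d_head, d_model = 12, 4, 48  # toy sizes; GPT-2 small uses 12 heads of 64 dims

# Per-head outputs (attention-weighted value vectors) for one position.
head_out = rng.normal(size=(n_heads, d_head))
W_O = rng.normal(size=(n_heads * d_head, d_model))  # output projection

# Standard computation: concatenate all heads, then project once.
combined = head_out.reshape(-1) @ W_O

# Equivalent per-head view: each head multiplies only its own slice of W_O,
# so the layer output decomposes into 12 separate head contributions.
per_head = np.array([
    head_out[h] @ W_O[h * d_head:(h + 1) * d_head]
    for h in range(n_heads)
])
print(np.allclose(combined, per_head.sum(axis=0)))  # → True
```

Projecting each per-head term through the unembedding direction of the predicted token, exactly as in the layer chart above, yields one scalar per head; across 12 layers that gives the 144 values the heatmap colors.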

Color scale: negative contribution % ← neutral (0%) → positive contribution %
{% endif %} {% endif %}
{% if head_contributions %} {% endif %}