Fine-Tuned T5-Small for TriviaQA
This repository contains a T5-small model fine-tuned on approximately 20,000 cleaned question-answer pairs from the TriviaQA dataset.
The primary goal of this project was educational: to practice dataset preprocessing, learn the workflow for fine-tuning sequence-to-sequence models, and test the factual question-answering abilities of T5.
Note: This model is not intended for production-level accuracy.
Overview
Base Model: t5-small (60 million parameters)
Task: Abstractive Question Answering (short-form trivia)
Training Data: ~20,000 samples from TriviaQA
Expected Output: Answers are typically 1–3 words
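For reference, the model can be loaded with the standard transformers API. The repository id below is the one this card belongs to; the `question: ` input prefix is an assumption about the fine-tuning format, not something the card specifies.

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

MODEL_ID = "prajwalmani/t5-small-trivia-qa"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)

# "question: " prefix is assumed; adjust if your prompts were formatted differently.
inputs = tokenizer("question: Who wrote the novel 1984?", return_tensors="pt")

# Keep generation short, since answers are typically 1-3 words.
output_ids = model.generate(**inputs, max_new_tokens=8)
answer = tokenizer.decode(output_ids[0], skip_special_tokens=True)
print(answer)
```

Expect short, sometimes wrong answers; see Limitations below.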
Training Details
Epochs: 3
Batch Size: 16
Hardware: NVIDIA GTX 1050 (4GB VRAM)
Limitations and Behavior
This model was trained as an experiment and has significant limitations.
Limited Factual Memory: With only 60 million parameters, t5-small is not large enough to store a vast amount of "world knowledge."
Small Dataset: Fine-tuning on only ~20k examples covers a narrow slice of trivia; the model cannot answer questions about facts absent from both its pre-training and fine-tuning data.
No Retrieval: This is a standard (non-RAG) model. It cannot "look up" answers from an external source like Google or Wikipedia.
Potential for Hallucination: The model may guess or provide a confident-sounding but incorrect answer, especially for questions outside its training data.
This behavior is expected for a small encoder-decoder model trained on a limited dataset.
How to Improve
To build a robust and accurate trivia bot, the following steps would be necessary:
Implement RAG: Add a retrieval-augmented generation (RAG) pipeline. This would allow the model to retrieve relevant passages from an external source (e.g., a Wikipedia index or a web search) before formulating an answer.
Use a Larger Model: Start with a more capable base model, such as Flan-T5-Large, Flan-T5-XL, or a modern decoder-based model (e.g., Mistral, Llama 3).
Use the Full Dataset: Train on the complete TriviaQA dataset.
Prompt Engineering: Use stricter, more detailed prompts to force the model to generate only short, precise answers.
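The first and last ideas above can be sketched together. Everything here is illustrative: `retrieve` is a hypothetical stand-in for a real retriever, and the `question:`/`context:` prompt layout is an assumed format, not one this checkpoint was trained on.

```python
def retrieve(question: str) -> str:
    # Hypothetical retriever: a real RAG pipeline would query an index
    # (e.g., embedded Wikipedia passages) for text relevant to the question.
    return "The Eiffel Tower was completed in 1889 for the World's Fair in Paris."

def build_prompt(question: str) -> str:
    # Stricter prompt: ground the model in retrieved context and
    # spell out the expected answer length up front.
    context = retrieve(question)
    return f"Answer in 1-3 words. question: {question} context: {context}"

prompt = build_prompt("When was the Eiffel Tower completed?")
print(prompt)
```

Note that a plain fine-tuned t5-small will ignore the "Answer in 1-3 words" instruction; an instruction-tuned base such as Flan-T5 is needed for that part of the prompt to have any effect.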