Update README.md
README.md (CHANGED)
@@ -6,10 +6,11 @@ language:
 pipeline_tag: text2text-generation
 tags:
 - t5x
--
+- encoder-decoder
 ---

-Pile-T5 Base is an Encoder-Decoder model trained on [the Pile](https://pile.eleuther.ai/) using the [T5x](https://github.com/google-research/t5x) library. The model was trained for 2 million steps or roughly 2 trillion tokens using MLM-objective similar to the original T5 model.
+Pile-T5 Base is an Encoder-Decoder model trained on [the Pile](https://pile.eleuther.ai/) using the [T5x](https://github.com/google-research/t5x) library. The model was trained for 2 million steps, or roughly 2 trillion tokens, with an MLM objective similar to the original T5 model.
+The HF version of Pile-T5 Base borrows UMT5's model implementation, as it uses the scalable model implementation from T5x, and uses `LlamaTokenizer`.

 ### Model Details

@@ -30,7 +31,7 @@ ai](mailto:[email protected]).

 | Hyperparameter             | Value       |
 | -------------------------- | ----------- |
-| n<sub>parameters</sub>     |             |
+| n<sub>parameters</sub>     | 247586304   |
 | n<sub>encoder layers</sub> | 12          |
 | n<sub>decoder layers</sub> | 12          |
 | d<sub>model</sub>          | 2048        |
@@ -133,16 +134,18 @@ checkpoints that can be used for finetuning with the T5x library, refer to [here

 ### Evaluations

-
+Pile-T5 Base was evaluated on SuperGLUE and CodeXGLUE. A Flan-finetuned version was evaluated on Flan Held In tasks.
+Results can be seen in the [blogpost](https://blog.eleuther.ai/pile-t5/).

 ### BibTeX

 ```
-@
+@misc{2024PileT5,
   author = {Lintang Sutawika and Aran Komatsuzaki and Colin Raffel},
-  title = {Pile
+  title = {Pile-T5},
   year = {2024},
-  url = {}
+  url = {https://blog.eleuther.ai/pile-t5/},
+  note = {Blog post},
 }
 ```
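The added paragraph notes that the HF port of Pile-T5 Base reuses UMT5's model implementation and `LlamaTokenizer`, so it should load through the standard `transformers` seq2seq API. A minimal usage sketch, assuming the checkpoint is published on the Hub as `EleutherAI/pile-t5-base` (the repo id is not stated in this diff) and that the auto classes resolve the UMT5-style architecture:

```python
# Minimal sketch, not taken from the card: load via the generic seq2seq
# auto classes and run a short generation.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "EleutherAI/pile-t5-base"  # assumed Hub id, not given in this diff

tokenizer = AutoTokenizer.from_pretrained(model_id)      # LlamaTokenizer under the hood, per the card
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)  # UMT5-style encoder-decoder, per the card

# The hyperparameter table above lists n_parameters = 247586304.
print(f"parameter count: {model.num_parameters():,}")

# Pile-T5 is pretrained with a span-corruption (MLM) objective, not
# instruction tuning, so treat raw generations as infilling-style output.
inputs = tokenizer("The Pile is a large, diverse", return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

For reference, the training budget quoted above (2 trillion tokens over 2 million steps) works out to roughly 1 million tokens per optimizer step; for the original T5x checkpoints and finetuning setup, the card defers to the T5x library.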