barpitf/RAT
HuggingFaceFW/fineweb-edu
English
RAT
efficient architecture
recurrence
attention
pretraining
arxiv: 2507.04416
License: mit
Repository size: 76.9 GB
2 contributors
History: 2 commits
Latest commit: wimh966, [ckpt], d18f6f7, about 1 month ago
File                             Size      Storage  Last commit     Updated
.gitattributes                   1.52 kB   -        initial commit  about 1 month ago
README.md                        24 Bytes  -        initial commit  about 1 month ago
attention.pth                    15.3 GB   xet      [ckpt]          about 1 month ago
attention_localattention_l2.pth  15.3 GB   xet      [ckpt]          about 1 month ago
ratl16.pth                       15.6 GB   xet      [ckpt]          about 1 month ago
ratl16_localattention_l2.pth     15.4 GB   xet      [ckpt]          about 1 month ago
rnn.pth                          15.3 GB   xet      [ckpt]          about 1 month ago