Article Introducing AI Sheets: a tool to work with datasets using open AI models! Aug 8 • 101
A Survey of Context Engineering for Large Language Models Paper • 2507.13334 • Published Jul 17 • 257
The Common Pile v0.1: An 8TB Dataset of Public Domain and Openly Licensed Text Paper • 2506.05209 • Published Jun 5 • 46
🧠 Traditional Chinese Reasoning Datasets Collection A curated collection of datasets designed to evaluate and train reasoning capabilities in Traditional Chinese across various domains. • 3 items • Updated 14 days ago • 8
Article A Deepdive into Aya Vision: Advancing the Frontier of Multilingual Multimodality Mar 4 • 77
Breeze 2 Family Collection Llama-Breeze2 is a multi-modal language model family specifically intended for Traditional Chinese use. BreezyVoice is a Taiwan Mandarin TTS • 6 items • Updated Feb 26 • 19
Cosmos-Tokenizer Collection A suite of image and video tokenizers • 13 items • Updated 6 days ago • 41
AIMv2 Collection A collection of AIMv2 vision encoders that supports a number of resolutions, native resolution, and a distilled checkpoint. • 19 items • Updated Aug 25 • 81
SmolLM2 Collection State-of-the-art compact LLMs for on-device applications: 1.7B, 360M, 135M • 16 items • Updated May 5 • 291
Article Efficient Deep Learning: A Comprehensive Overview of Optimization Techniques 👐 📚 By Isayoften • Aug 26, 2024 • 77
AgentPoison: Red-teaming LLM Agents via Poisoning Memory or Knowledge Bases Paper • 2407.12784 • Published Jul 17, 2024 • 51
🪐 SmolLM Collection A series of smol LLMs: 135M, 360M and 1.7B. We release base and Instruct models as well as the training corpus and some WebGPU demos • 12 items • Updated May 5 • 236
Nemotron 4 340B Collection Nemotron-4: open models for Synthetic Data Generation (SDG). Includes Base, Instruct, and Reward models. • 4 items • Updated 6 days ago • 162
Llama3-ChatQA-1.5 Collection Llama3-ChatQA-1.5 models excel at conversational question answering (QA) and retrieval-augmented generation (RAG). • 6 items • Updated 6 days ago • 44
YOLOv9: Learning What You Want to Learn Using Programmable Gradient Information Paper • 2402.13616 • Published Feb 21, 2024 • 49
Gemma release Collection Groups the Gemma models released by the Google team. • 40 items • Updated Jul 10 • 344