Anthony Ledesma
arledesma
AI & ML interests
None yet
Organizations
None yet
reading
- MUVERA: Multi-Vector Retrieval via Fixed Dimensional Encodings
  Paper • 2405.19504 • Published • 3
- HiWave: Training-Free High-Resolution Image Generation via Wavelet-Based Diffusion Sampling
  Paper • 2506.20452 • Published • 19
- FineWeb2: One Pipeline to Scale Them All -- Adapting Pre-Training Data Processing to Every Language
  Paper • 2506.20920 • Published • 75
- The Geometry of LLM Quantization: GPTQ as Babai's Nearest Plane Algorithm
  Paper • 2507.18553 • Published • 40
Models
- Qwen/Qwen3-4B-Thinking-2507
  Text Generation • 4B • Updated • 540k • 461
- Intelligent-Internet/II-Search-4B
  Text Generation • 4B • Updated • 61 • 99
- fdtn-ai/Foundation-Sec-8B-Instruct
  Text Generation • 8B • Updated • 18.4k • 55
- Bifrost-1: Bridging Multimodal LLMs and Diffusion Models with Patch-level CLIP Latents
  Paper • 2508.05954 • Published • 6
models
0
None public yet
datasets
0
None public yet