Trangle Heshvp

Trangle

AI & ML interests

None yet

Recent Activity

liked a model 1 day ago

deepseek-ai/DeepSeek-V3.2

liked a model 1 day ago

deepseek-ai/DeepSeek-V3.2-Speciale

liked a dataset about 1 month ago

neulab/agent-data-collection

upvoted 3 articles 5 months ago

Introducing ColQwen-Omni: Retrieve in every modality

Article • Jul 17 • 75

Understanding Gemma 3n: How MatFormer Gives You Many Models in One

Article • Jun 26 • 48

Vision Language Models Explained

Article • Apr 11, 2024 • 495

upvoted a paper 7 months ago

ZeroSearch: Incentivize the Search Capability of LLMs without Searching

Paper • 2505.04588 • Published May 7 • 65

upvoted a collection 9 months ago

Gemma 3 Release

28 items • Updated Aug 11 • 545

upvoted 2 papers 9 months ago

SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution

Paper • 2502.18449 • Published Feb 25 • 75

Think Inside the JSON: Reinforcement Strategy for Strict LLM Schema Adherence

Paper • 2502.14905 • Published Feb 18 • 9

upvoted an article 10 months ago

Open-source DeepResearch – Freeing our search agents

Article • Feb 4 • 1.31k

upvoted 2 collections 12 months ago

ModernBERT

Bringing BERT into modernity via both architecture changes and scaling • 3 items • Updated Dec 19, 2024 • 151

Granite 3.1 Language Models

A series of language models with 128K context length trained by IBM licensed under Apache 2.0 license. • 9 items • Updated 16 days ago • 67

upvoted a collection about 1 year ago

Llama 3.2

This collection hosts the transformers and original repos of the Llama 3.2 and Llama Guard 3 • 15 items • Updated Dec 6, 2024 • 646

upvoted an article over 1 year ago

SmolLM - blazingly fast and remarkably powerful

Article • Jul 16, 2024 • 428

upvoted 4 collections over 1 year ago

Gemma Scope Release

A comprehensive, open suite of sparse autoencoders for Gemma 2 2B and 9B. • 10 items • Updated Jul 10 • 18

Llama 3.1 Evals

This collection provides detailed information on how we derived the reported benchmark metrics for the Llama 3.1 models, including the configurations, • 6 items • Updated Dec 6, 2024 • 19

Minitron

A family of compressed models obtained via pruning and knowledge distillation • 12 items • Updated 9 days ago • 61

🪐 SmolLM

A series of smol LLMs: 135M, 360M and 1.7B. We release base and Instruct models as well as the training corpus and some WebGPU demos • 12 items • Updated May 5 • 239

upvoted 2 papers over 1 year ago

MUSCLE: A Model Update Strategy for Compatible LLM Evolution

Paper • 2407.09435 • Published Jul 12, 2024 • 23

Qwen2 Technical Report

Paper • 2407.10671 • Published Jul 15, 2024 • 168

upvoted an article over 1 year ago

BM25 for Python: Achieving high performance while simplifying dependencies with BM25S⚡

Article • Jul 9, 2024 • 73

upvoted a paper over 1 year ago

AgentInstruct: Toward Generative Teaching with Agentic Flows

Paper • 2407.03502 • Published Jul 3, 2024 • 51