cs-fxr (fxrc)

upvoted a collection 3 months ago

GLM-4.5

Collection

GLM-4.5: An open-source large language model designed for intelligent agents by Z.ai • 11 items • Updated Aug 11 • 246

upvoted a paper 4 months ago

Harnessing the Universal Geometry of Embeddings

Paper • 2505.12540 • Published May 18 • 8

upvoted an article 4 months ago

Efficient MultiModal Data Pipeline

Article • Jul 8 • 58

upvoted a paper 5 months ago

Saffron-1: Towards an Inference Scaling Paradigm for LLM Safety Assurance

Paper • 2506.06444 • Published Jun 6 • 73

upvoted an article 5 months ago

Mixture of Experts Explained

Article • Dec 11, 2023 • 949

upvoted a paper 6 months ago

Beyond 'Aha!': Toward Systematic Meta-Abilities Alignment in Large Reasoning Models

Paper • 2505.10554 • Published May 15 • 120

upvoted an article 6 months ago

The Transformers Library: standardizing model definitions

Article • May 15 • 120

upvoted 2 papers 7 months ago

Training Large Language Models to Reason in a Continuous Latent Space

Paper • 2412.06769 • Published Dec 9, 2024 • 89

Transformers without Normalization

Paper • 2503.10622 • Published Mar 13 • 169

upvoted an article 8 months ago

FastRTC: The Real-Time Communication Library for Python

Article • Feb 25 • 172

upvoted a paper 8 months ago

DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models

Paper • 2402.03300 • Published Feb 5, 2024 • 130

upvoted 2 articles 9 months ago

Making LLMs even more accessible with bitsandbytes, 4-bit quantization and QLoRA

Article • May 24, 2023 • 168

Open-source DeepResearch – Freeing our search agents

Article • Feb 4 • 1.31k

upvoted a paper 9 months ago

On-Policy Distillation of Language Models: Learning from Self-Generated Mistakes

Paper • 2306.13649 • Published Jun 23, 2023 • 26
