Mike Ravkine PRO

mike-ravkine · the-crypt-keeper

AI & ML interests

LLM Research / Development / Evaluation

Recent Activity

posted an update about 10 hours ago

Spatial reasoning is a domain where LLMs struggle surprisingly hard. A new paper, "Stuck in the Matrix: Probing Spatial Reasoning in Large Language Models", compares performance on a handful of spatial reasoning tasks and finds all SOTA LLMs breaking down and hallucinating their faces off when the grids get large. https://arxiv.org/html/2510.20198v1 The word search task is especially revealing: notice the bias towards detecting "horizontal" while struggling with "vertical". LLMs only understand simple, linear relationships... add a stride for 2D and it's basically over.
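To make the "stride" remark concrete, here is a minimal sketch, assuming the grid reaches the model as rows joined by newlines (the paper's actual prompt format may differ, and tokenization is ignored entirely). The helper names `flatten` and `offsets` are hypothetical. In that flattened text a horizontal word is a contiguous run of characters, while the letters of a vertical word sit `width + 1` characters apart.

```python
def flatten(grid):
    """Serialize a grid row by row (assumed prompt layout: rows joined by newlines)."""
    return "\n".join(grid)


def offsets(text, word, stride):
    """Return the start indices where `word` appears when reading every `stride`-th character."""
    hits = []
    for start in range(len(text)):
        end = start + stride * (len(word) - 1)
        if end < len(text) and text[start:end + 1:stride] == word:
            hits.append(start)
    return hits


# Toy grid: "CAT" runs horizontally in row 0, "CODE" runs vertically in column 0.
grid = [
    "CATX",
    "OXXX",
    "DXXX",
    "EXXX",
]
text = flatten(grid)
width = len(grid[0])

# Horizontal: neighbouring letters are adjacent in the flat text (stride 1).
horizontal = offsets(text, "CAT", 1)

# Vertical: each next letter is one full row further on (row width plus the newline).
vertical = offsets(text, "CODE", width + 1)

# Read with stride 1, the vertical word is simply not there.
vertical_as_substring = offsets(text, "CODE", 1)
```

Finding the horizontal word is a plain substring match, whereas the vertical one requires knowing the row width and stepping by it consistently, which is one plausible reading of why the vertical case is harder.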

posted an update 4 days ago

"Do LLMs Really Need 10+ Thoughts for "Find the Time 1000 Days Later"? Towards Structural Understanding of LLM Overthinking" https://arxiv.org/abs/2510.07880

> Our analysis reveals two major patterns for open-weight thinking models -- Explorer and Late Landing. This finding provides evidence that over-verification and over-exploration are the primary drivers of overthinking in LLMs. Grounded in thought structures, we propose a utility-based definition of overthinking, which moves beyond length-based metrics. This revised definition offers a more insightful understanding of LLMs' thought progression, as well as practical guidelines for principled overthinking management.

Really cool paper! Unfortunately no code, may need to attempt to recreate my own TRACE 🤔

liked a model 6 days ago

noctrex/cogito-v2-preview-llama-109B-MoE-MXFP4_MOE-GGUF

Organizations

None yet

upvoted a collection 7 days ago

aquif-4

aquif-4-Exp is the first hybrid attention model from aquif, built on a strong architecture with 256 experts. • 2 items • Updated 4 days ago • 3

upvoted a paper 15 days ago

SwiReasoning: Switch-Thinking in Latent and Explicit for Pareto-Superior Reasoning LLMs

Paper • 2510.05069 • Published 19 days ago • 12

upvoted a paper about 2 months ago

Symbolic Graphics Programming with Large Language Models

Paper • 2509.05208 • Published Sep 5 • 45

upvoted an article 2 months ago

Article

Supercharge Edge AI With High‑Accuracy Reasoning Using NVIDIA Nemotron Nano 2 9B

and 9 others • Aug 18 • 30

upvoted a collection 10 months ago

Lumimaid 0.2

4 items • Updated Jul 26, 2024 • 20

upvoted a paper 10 months ago

RetroLLM: Empowering Large Language Models to Retrieve Fine-grained Evidence within Generation

Paper • 2412.11919 • Published Dec 16, 2024 • 36

upvoted a paper about 1 year ago

MMIE: Massive Multimodal Interleaved Comprehension Benchmark for Large Vision-Language Models

Paper • 2410.10139 • Published Oct 14, 2024 • 52

upvoted a collection about 1 year ago

My most recent datasets

6 items • Updated Oct 8, 2024 • 6

upvoted an article about 1 year ago

Article

Fine-tuning LLMs to 1.58bit: extreme quantization made easy

Sep 18, 2024 • 270

upvoted a collection about 1 year ago

Qwen2.5-Coder

Code-specific model series based on Qwen2.5 • 40 items • Updated Jul 21 • 345

upvoted a paper about 1 year ago

General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model

Paper • 2409.01704 • Published Sep 3, 2024 • 83

upvoted 2 collections about 1 year ago

Multimodal RAG

10 items • Updated Sep 5, 2024 • 30

Hermes 3

The Hermes 3 Series of Models • 11 items • Updated Sep 8 • 130

upvoted a paper over 1 year ago

LazyLLM: Dynamic Token Pruning for Efficient Long Context LLM Inference

Paper • 2407.14057 • Published Jul 19, 2024 • 46

upvoted 2 collections over 1 year ago

Personal Favorites

Recommended models I use often or like for any reason. I recommend reading their cards for more details. • 10 items • Updated Dec 24, 2024 • 91

Quyen

State-of-the-arts General LLMs - based on Qwen1.5 • 26 items • Updated Feb 13, 2024 • 12

upvoted a paper over 2 years ago

PolyLM: An Open Source Polyglot Large Language Model

Paper • 2307.06018 • Published Jul 12, 2023 • 26