AI & ML interests

None defined yet.

Recent Activity

DmitryRyumin posted an update about 22 hours ago
🚀🤖🌟 New Research Alert - ICCV 2025 (Oral)! 🌟🤖🚀
📄 Title: Variance-Based Pruning for Accelerating and Compressing Trained Networks 🔝

📝 Description: This one-shot pruning method efficiently compresses networks, reducing computation and memory usage while retaining almost full performance and requiring minimal fine-tuning.

👥 Authors: Uranik Berisha, Jens Mehnert, and Alexandru Paul Condurache

📅 Conference: ICCV, 19–23 Oct 2025 | Honolulu, Hawai'i, USA 🇺🇸

📄 Paper: Variance-Based Pruning for Accelerating and Compressing Trained Networks (2507.12988)

🚀 ICCV-2023-25-Papers: https://github.com/DmitryRyumin/ICCV-2023-25-Papers

🚀 Added to the Efficient Learning Section: https://github.com/DmitryRyumin/ICCV-2023-25-Papers/blob/main/sections/2025/main/efficient-learning.md

📚 More Papers: more cutting-edge research from other conferences in the DmitryRyumin/NewEraAI-Papers collection, curated by @DmitryRyumin

🔍 Keywords: #VarianceBasedPruning #NetworkCompression #ModelAcceleration #EfficientDeepLearning #VisionTransformers #AI #ICCV2025 #ResearchHighlight
nroggendorff posted an update 2 days ago
Is it hot in here, or is it just me?
DmitryRyumin posted an update 2 days ago
🚀👁️🌟 New Research Alert - ICCV 2025 (Oral)! 🌟👁️🚀
📄 Title: Token Activation Map to Visually Explain Multimodal LLMs 🔝

📝 Description: The Token Activation Map (TAM) is an advanced explainability method for multimodal LLMs. Using causal inference and a Rank Gaussian Filter, TAM reveals token-level interactions and eliminates redundant activations. The result is clearer, high-quality visualizations that enhance understanding of object localization, reasoning, and multimodal alignment across models.

👥 Authors: Yi Li, Hualiang Wang, Xinpeng Ding, Haonan Wang, and Xiaomeng Li

📅 Conference: ICCV, 19–23 Oct 2025 | Honolulu, Hawai'i, USA 🇺🇸

📄 Paper: Token Activation Map to Visually Explain Multimodal LLMs (2506.23270)

📁 Repository: https://github.com/xmed-lab/TAM

🚀 ICCV-2023-25-Papers: https://github.com/DmitryRyumin/ICCV-2023-25-Papers

🚀 Added to the Multi-Modal Learning Section: https://github.com/DmitryRyumin/ICCV-2023-25-Papers/blob/main/sections/2025/main/multi-modal-learning.md

📚 More Papers: more cutting-edge research from other conferences in the DmitryRyumin/NewEraAI-Papers collection, curated by @DmitryRyumin

🔍 Keywords: #TokenActivationMap #TAM #CausalInference #VisualReasoning #Multimodal #Explainability #VisionLanguage #LLM #XAI #AI #ICCV2025 #ResearchHighlight
Nymbo posted an update 9 days ago
Two new tools added to the Nymbo/Tools MCP server: File_System and Shell_Exec. You can theoretically do basically anything with these two tools, and they should enable support for many Claude Skills.

GPT-5-Codex proves that for many cases, shell commands really are all you need, and Claude Skills seem to lean into this. The thing is, nothing about the design of Claude Skills actually restricts them to proprietary models!

# File_System

There's a new directory inside the repo called Filesystem; that's the agent's "root". It can perform the following actions: list, read, write, append, mkdir, move, copy, delete, info, help. It's able to keep this all within the scope of one tool call by making the Action field required and all other fields optional. Using a filesystem shouldn't require 15 different tools.

Files created in the public HF space live in the space's running container and get cleared when the space is restarted. When running the server locally, files are actually stored on disk.
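
Here's a minimal sketch of what this single-tool, Action-dispatch design can look like; the names (file_system, ROOT, the action strings) are illustrative assumptions, not the actual Nymbo/Tools code:

```python
from pathlib import Path

ROOT = Path("Filesystem").resolve()  # the agent's "root" directory

def _resolve(rel_path: str) -> Path:
    # Require relative paths and keep everything inside ROOT.
    full = (ROOT / rel_path).resolve()
    if not str(full).startswith(str(ROOT)):
        raise ValueError("path escapes the filesystem root")
    return full

def file_system(action: str, path: str = ".", content: str = "") -> str:
    """One tool: 'action' is required, everything else is optional."""
    target = _resolve(path)
    if action == "list":
        return "\n".join(p.name for p in target.iterdir())
    if action == "read":
        return target.read_text()
    if action in ("write", "append"):
        target.parent.mkdir(parents=True, exist_ok=True)
        mode = "a" if action == "append" else "w"
        with open(target, mode) as f:
            f.write(content)
        return f"{action} ok: {path}"
    if action == "mkdir":
        target.mkdir(parents=True, exist_ok=True)
        return f"created {path}"
    if action == "delete":
        target.unlink()
        return f"deleted {path}"
    return "unknown action; try: list, read, write, append, mkdir, delete"
```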

# Shell_Exec

What good is a filesystem if you can't execute commands in it? This tool automatically detects whether the server is running on Windows or Linux and suggests the appropriate shell (PowerShell or Bash). Both of these new tools require the agent to use relative paths rather than absolute paths. I could be convinced to backpedal on this.
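
A rough sketch of that OS detection and shell selection (the tool name and arguments are assumptions, not the actual implementation):

```python
import platform
import subprocess

def shell_exec(command: str, timeout: int = 60) -> str:
    # Pick the shell based on the host OS, then run the command and
    # return combined stdout/stderr.
    if platform.system() == "Windows":
        args = ["powershell", "-Command", command]  # PowerShell on Windows
    else:
        args = ["bash", "-lc", command]             # Bash on Linux (and macOS)
    result = subprocess.run(args, capture_output=True, text=True, timeout=timeout)
    return result.stdout + result.stderr
```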

# Closing Thoughts

The File_System and Shell_Exec tools aren't super polished yet; I'll continue to improve the agent's instructions and the UX of using the new tools. Most of my testing was done with gpt-oss-20b, and if it messes up, it gets the gist after one failed tool call. It should work perfectly fine for the GPU poor.
nroggendorff posted an update 9 days ago
I love getting emails telling me when there's somebody else's active access token in one of my commit SHAs. HF should really only tell you if it is your own token; otherwise I could just make a dataset with a bunch of random strings and wait for a valid token.
user,permission,token
nroggendorff,write,hf_...
pepper13,finegrained,hf_...
...,...,...
...

Also, don't comment about how unlikely this is. I've gotten a warning email about a token I 'leaked' at least four times.
In all cases, it has been in the digest hash.
m-ric posted an update 13 days ago
Tokenization is one of the most important processes in AI - yet many would like to kill it 💀

What's tokenization? The neural networks inside LLMs actually only process numbers, not text: tokenization is the process that makes text readable for them, by converting sentences into lists of numbers.

โžก๏ธ For instance, "This is tokenization" would be split into "This | is | token | ization", then each of the parts (tokens) are converted to IDs according to a predefined mapping: for instance "ization" could map to id 2438.
Thus "This is tokenization" can become 1335 | 135 | 2980 | 2438 => now the model can process the sentence!

Most tokenizers today use pre-specified mappings called "vocabularies", generally built with the compression algorithm Byte-Pair Encoding (BPE), which learns from a large corpus of text an optimized split that efficiently encodes any text from the same distribution into a list of token IDs.
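
If you want to see this in practice, here's a quick example with the GPT-2 BPE tokenizer from the transformers library (the IDs above are made up; real IDs will differ):

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
text = "This is tokenization"
print(tok.tokenize(text))  # e.g. ['This', 'Ġis', 'Ġtoken', 'ization']
print(tok.encode(text))    # the corresponding list of token IDs
```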

🤨 Now, these current tokenizers have flaws.
For instance, the rigidity of their mapping creates losses; the prime example being that a tokenizer designed for English (thus optimized for tokens like "has", "been", "clock", etc.) will not have the right tokens to approach Burmese, thus being terribly inefficient at it.

Many alternative approaches have emerged as a result: for instance, "tokenizer-free tokenizers". One that I really liked was "entropy-based": it monitors the stream of text and triggers a split whenever the entropy increases too much, i.e. when something "surprising" happens.
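
As a toy illustration of the entropy idea (not from the article), estimate how "surprising" each character is, here with plain unigram frequencies where a real system would use a small model's next-byte entropy, and start a new chunk whenever the surprise crosses a threshold:

```python
import math
from collections import Counter

def entropy_split(text: str, threshold: float = 5.0) -> list[str]:
    counts = Counter(text)              # stand-in for a learned next-byte model
    total = len(text)
    chunks, current = [], ""
    for ch in text:
        surprise = -math.log2(counts[ch] / total)  # rare character => high surprise
        if current and surprise > threshold:
            chunks.append(current)      # a "surprising" character starts a new chunk
            current = ""
        current += ch
    if current:
        chunks.append(current)
    return chunks

print(entropy_split("this is a toy example of entropy-based splitting, really!"))
```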

But this great article argues that tokenizers are a lesser evil. Read and decide for yourself!
https://huggingface.co/blog/catherinearnett/in-defense-of-tokenizers
Nymbo posted an update 14 days ago
I've made some improvements to my custom Deep_Research tool in the Nymbo/Tools MCP server. I've added a second LLM process and it still takes less than 1 minute to complete!

The original version of my Deep_Research tool would basically dump up to 50 fetched webpages onto the Researcher model (Qwen3-235B), with only a little bit of context shown from each page.

# New "Filterer" Process

The new process includes another LLM call before the researcher process. The Filterer (also Qwen3-235B) gets the query summary and the original 50 pages with low context, and decides which pages are most relevant to the research topic. The Filterer then outputs the URLs to the relevant pages, which are then re-fetched (with more context) and sent to the Researcher.
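
In pseudocode, the two-stage flow looks roughly like this; the helpers (search, fetch_page, call_llm) and the prompts are hypothetical stand-ins, not the actual Nymbo/Tools code:

```python
from datetime import datetime, timezone

def deep_research(query: str, search, fetch_page, call_llm) -> str:
    # Stage 0: gather up to 50 candidate pages with only a little context each.
    urls = search(query, limit=50)
    previews = {url: fetch_page(url, max_chars=500) for url in urls}

    # Stage 1: the Filterer (Qwen3-235B) decides which pages are relevant
    # and returns only their URLs.
    keep = call_llm(
        model="Qwen3-235B",
        prompt=f"Query: {query}\nPages: {previews}\n"
               "Return one relevant URL per line.",
    ).splitlines()

    # Stage 2: re-fetch the survivors with more context and write the report,
    # giving the Researcher the current date/time to avoid knowledge-cutoff confusion.
    sources = {url: fetch_page(url, max_chars=5000) for url in keep}
    return call_llm(
        model="Qwen3-235B",
        prompt=f"Now: {datetime.now(timezone.utc)}\nQuery: {query}\n"
               f"Sources: {sources}\nWrite a detailed research report.",
    )
```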

# Researcher Context

The Researcher now gets only the relevant webpages, then begins writing the report. When testing with 50 initial results, the Researcher would often end up with 10-20 relevant pages of context.

Everything still finishes in under a minute, thanks entirely to Cerebras inference: about 35-45 seconds from the moment the tool is run.

It's also worth noting that both the Filterer and the Researcher are now provided the current date/time before they see the content, reducing hallucinations caused by knowledge cutoffs.
Severian posted an update 16 days ago
New Technique to Deeply Poison AI on Images and Prove Creative Provenance

I've developed a new method to protect creative work from unauthorized AI training. My Poisonous Shield for Images algorithm embeds a deep, removal-resistant poison into the mathematical structure of your images. It's designed to be toxic to machine learning models, achieving 20-348% disruption of AI training convergence in benchmark tests.

Unlike traditional watermarks, this protection survives compression and resizing and is not removed by standard tools. The technique also embeds cryptographic proof of provenance directly into the image, verifying ownership and detecting tampering.

You can see examples and learn more about how and WHY it works better than current methods:

https://severian-poisonous-shield-for-images.static.hf.space

If you are interested in using this technology to protect your work from AI training and unauthorized use, please reach out to me. It is currently in the prototype phase but fully functioning and effective. I'm still working on expanding it into a production-grade app.

This is not intended as a pure self-promotion post. I genuinely want to help creators and to gauge interest from different communities. I've spent the past year and a half building this from scratch, with new math and code, to try to solve this massive problem.
m-ric posted an update 19 days ago
STOP EVERYTHING NOW - we might finally have a radical architecture improvement over Transformers!!! 🚨

A lone scientist just proposed Tiny Recursive Model (TRM), and it is literally the most impressive model that I've seen this year.

โžก๏ธ Tiny Recursive Model is 7M parameters
โžก๏ธ On ARC-AGI, it beats flagship models like Gemini-2.5-pro

Consider how wild this is: Gemini-2.5-pro must be over 10,000x bigger and has 1,000x as many authors 😂 (Alexia is alone on the paper)

What's this sorcery?
In short: it's a very tiny Transformer, but it loops over itself at two different frequencies, updating two latent variables: one for the proposed answer and one for the reasoning.

@AlexiaJM started from the paper Hierarchical Reasoning Model, published a few months ago, which already showed breakthrough improvement on ARC-AGI for its small size (27M).

Hierarchical Reasoning Model had introduced one main feature:
🔎 Deep supervision
In their model, one part (here one layer) would run at high frequency, and another would run at a lower frequency, only every n steps.

They had used a recurrent architecture, where these layers would repeat many times; but to make it work they had to make many approximations, including not fully backpropagating the loss through all layers.

Alexia studied what was useful and what wasn't, and cleaned up the architecture as follows:
Why use a recurrent architecture, when you can just make it a loop?
➡️ She made the network recursive, looping over itself.

Why use 2 latent variables?
➡️ She provides a crystal-clear explanation: the one that changes frequently is the reasoning, the one that changes at low frequency is the proposed answer.
➡️ She runs ablation studies to validate that 2 is indeed optimal.

This new setup is a much more elegant way to process reasoning than generating huge chains of tokens as all flagship models currently do.

This might be the breakthrough we've been awaiting for so long!
Severian posted an update 19 days ago
MLX port of BDH (Baby Dragon Hatchling) is up!

I've ported the BDH ( https://github.com/pathwaycom/bdh ) model to MLX for Apple Silicon. It's a faithful conversion of the PyTorch version: same math, same architecture (byte-level vocab, shared weights across layers, ReLU sparsity, RoPE attention with Q=K), with MLX-friendly APIs and a detailed README explaining the few API-level differences and why results are equivalent.
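
To give a flavor of those ideas in MLX, here's a minimal illustrative sketch (Q=K attention, ReLU sparsity, byte-level vocab, one block shared across layers); the class and variable names are assumptions and RoPE is omitted, so this is not the actual BDH-MLX code:

```python
import mlx.core as mx
import mlx.nn as nn

class SharedBlock(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        self.qk = nn.Linear(dim, dim, bias=False)  # one projection used for both Q and K
        self.v = nn.Linear(dim, dim, bias=False)
        self.ff = nn.Linear(dim, dim, bias=False)

    def __call__(self, x):
        q = k = self.qk(x)                                        # Q = K
        scores = (q @ mx.transpose(k, (0, 2, 1))) / x.shape[-1] ** 0.5
        h = mx.softmax(scores, axis=-1) @ self.v(x)
        return nn.relu(self.ff(h))                                # ReLU keeps activations sparse

class TinyBDHLike(nn.Module):
    def __init__(self, dim: int = 256, n_layers: int = 6, vocab: int = 256):
        super().__init__()
        self.embed = nn.Embedding(vocab, dim)  # byte-level vocabulary: 256 symbols
        self.block = SharedBlock(dim)          # the same weights are reused at every layer
        self.n_layers = n_layers
        self.out = nn.Linear(dim, vocab, bias=False)

    def __call__(self, tokens):
        x = self.embed(tokens)
        for _ in range(self.n_layers):         # shared weights across layers
            x = x + self.block(x)
        return self.out(x)

logits = TinyBDHLike()(mx.array([[72, 101, 108, 108, 111]]))  # "Hello" as bytes
```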

Code, docs, and the training script are ready to use. You may need to adjust the training script a bit to fit your own custom dataset. Only tested on M4 so far, but it should work perfectly for any M1/M2/M3 users out there.

I'm currently training this MLX build on my Internal Knowledge Map (IKM) dataset Severian/Internal-Knowledge-Map
Training's underway; expect a day or so before I publish weights. When it's done, I'll upload the checkpoint to Hugging Face for anyone to test.

Repo: https://github.com/severian42/BDH-MLX
HF model (coming soon): Severian/BDH-MLX

If you try it on your own data, feedback and PRs are welcome.
Molbap posted an update 21 days ago
🚀 New blog: Maintain the unmaintainable – 1M+ Python LOC, 400+ models

How do you stop a million-line library built by thousands of contributors from collapsing under its own weight?
At 🤗 Transformers, we do it with explicit software-engineering tenets, principles that make the codebase hackable at scale.

🔍 Inside the post:
– One Model, One File: readability first; you can still open a modeling file and see the full logic, top to bottom.
– Modular Transformers: visible inheritance that cuts maintenance cost by ~15× while keeping models readable.
– Config-Driven Performance: FlashAttention, tensor parallelism, and attention scheduling are config-level features, not rewrites.
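
As a concrete illustration of the config-driven point, switching attention backends in Transformers is an argument rather than a code rewrite (the model id below is just an example, and flash-attn must be installed):

```python
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.1-8B",                 # example model id
    attn_implementation="flash_attention_2",   # vs. "sdpa" or "eager": no code changes
    torch_dtype=torch.bfloat16,
)
```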

Written with @lysandre, @pcuenq, and @yonigozlan, this is a deep dive into how Transformers stays fast, open, and maintainable.

Read it here → transformers-community/Transformers-tenets
lunarflu posted an update 21 days ago
Cool stuff these past weeks on Hugging Face! 🤗 🚀
• 📈 Trackio, a local-first W&B alternative
https://github.com/gradio-app/trackio/issues
• 🌍 EmbeddingGemma, 300M-param, multilingual embeddings, on-device
https://huggingface.co/blog/embeddinggemma
• 💻 Open LLMs in VS Code (Inference Providers)
https://x.com/reach_vb/status/1966185427582497171
• 🤖 Smol2Operator GUI agents
https://huggingface.co/blog/smol2operator
• 🖼️ Gradio visible watermarking
https://huggingface.co/blog/watermarking-with-gradio
NeoPy posted an update 22 days ago
@lunarflu can you remove me from the Hugging Face Discord Community? My old Discord email is gone, and all of my emails are gone too 😔

My dead accounts:
@Blane187
@Ryouko65777

Also, please delete the accounts if you can 🙏
Nymbo posted an update 23 days ago
I have a few Sora-2 invites - 15509N
nroggendorff posted an update 26 days ago
When're we getting H100-explorers?
nroggendorff posted an update about 1 month ago
I'm sorry, what?
Tonic posted an update about 1 month ago