- Esper is our full-stack, full-cycle coding, DevOps, and architecture specialist!
- Our newest, best DeepSeek technical datasets emphasize more challenging queries and tough real-world coding tasks across a variety of programming languages and development paradigms:
  - Titanium 3 for coding and reasoning in DevOps and architecture: sequelbox/Titanium3-DeepSeek-V3.1-Terminus
  - Tachibana 3 for high-difficulty code production in a variety of topics and programming languages:
    - sequelbox/Tachibana3-Part1-DeepSeek-V3.1-Terminus
    - sequelbox/Tachibana3-Part2-DeepSeek-V3.2
  - Mitakihara for MLOps, AI building, use, expertise, and research: sequelbox/Mitakihara-DeepSeek-R1-0528
Hey, amazing, awesome people of the beautiful internet 😍🥰
Distillation has been (from my point of view) a main driving factor behind the success of #LLMs - like distilling the knowledge of an amazing big model (say #DeepSeekv3, or #GeminiAI) into yours.
You have probably done it by minimising a KL divergence, and it somehow worked.
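Just to make that concrete: the vanilla recipe minimises the KL divergence between the teacher's and the student's (temperature-softened) token distributions. Here's a minimal PyTorch-style sketch of that loss; the temperature value and the commented training-step names are just placeholders:

```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Vanilla KD loss: KL(teacher || student) over temperature-softened token distributions."""
    student_log_probs = F.log_softmax(student_logits / temperature, dim=-1)
    teacher_probs = F.softmax(teacher_logits / temperature, dim=-1)
    # batchmean KL, scaled by T^2 as in the classic Hinton et al. formulation
    return F.kl_div(student_log_probs, teacher_probs, reduction="batchmean") * temperature ** 2

# Hypothetical training step: the teacher is frozen, the student learns to match it
# teacher_logits = teacher_model(input_ids).logits.detach()
# student_logits = student_model(input_ids).logits
# loss = distillation_loss(student_logits, teacher_logits)
# loss.backward()
```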
Well, not that well, right?
1️⃣ Your model tends to memorise!
2️⃣ Your model might get the right answer, but its reasoning might be flawed.
To fix those problems, we rethink distillation and propose a new approach! A method based on constrained RL that comes with nice theoretical guarantees and excellent performance!
One of the hardest challenges in AI safety is finding the right balance: how do we protect people from harm without undermining their agency? This tension is especially visible in conversational systems, where safeguards can sometimes feel more paternalistic than supportive.
In my latest piece for Hugging Face, I argue that open source and community-driven approaches offer a promising (though not exclusive) way forward.
✨ Transparency can turn safety mechanisms into learning opportunities.
✨ Collaboration with diverse communities makes safeguards more relevant across contexts.
✨ Iteration in the open lets protections evolve rather than freeze into rigid, one-size-fits-all rules.
Of course, this isn’t a silver bullet. Top-down safety measures will still be necessary in some cases. But if we only rely on corporate control, we risk building systems that are safe at the expense of trust and autonomy.
Low-Rank Adaptation (LoRA) is the go-to method for efficient fine-tuning: instead of retraining the full model, it adds small trainable low-rank matrices on top of the frozen weights (see the minimal sketch below). The field isn’t standing still – new LoRA variants push the limits of efficiency, generalization, and personalization. So we’re sharing 10 of the latest LoRA approaches you should know about:
4. aLoRA (Activated LoRA) → Activated LoRA: Fine-tuned LLMs for Intrinsics (2504.12397)
Applies the LoRA adapter only after it is invoked, letting the model reuse the base model’s KV cache instead of recomputing the full turn’s KV cache. Efficient in multi-turn conversations.
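To keep the base mechanism in mind while reading through the variants, here is a rough sketch of a vanilla LoRA layer in PyTorch; the rank, scaling, and the q_proj wrapping example are illustrative placeholders rather than any particular library’s implementation:

```python
import torch.nn as nn

class LoRALinear(nn.Module):
    """A frozen linear layer plus a trainable low-rank update: y = W x + (alpha / r) * B(A(x))."""
    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # the pretrained weights stay frozen
        self.lora_A = nn.Linear(base.in_features, r, bias=False)   # down-projection to rank r
        self.lora_B = nn.Linear(r, base.out_features, bias=False)  # up-projection back to the output dim
        nn.init.zeros_(self.lora_B.weight)  # zero init so training starts exactly at the base model
        self.scaling = alpha / r

    def forward(self, x):
        return self.base(x) + self.scaling * self.lora_B(self.lora_A(x))

# Hypothetical usage: wrap one attention projection of a transformer block
# block.q_proj = LoRALinear(block.q_proj, r=8, alpha=16.0)
```

Every variant in the list builds on that same design point: the base weights stay frozen, and only the small A/B matrices are trained.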