Cerebras REAP Collection Sparse MoE models compressed using the REAP (Router-weighted Expert Activation Pruning) method • 19 items • Updated 9 days ago • 69
mmBERT: a modern multilingual encoder Collection mmBERT is trained on 3T tokens from over 1800 languages, showing SoTA scores on benchmarks and exceptional low-resource performance • 16 items • Updated Sep 9 • 49
Contextual AI Reranker v2 Collection Family of instruction-following multilingual rerankers on the cost/performance Pareto frontier across public and customer benchmarks • 6 items • Updated Aug 25 • 9
Pangea Collection A Fully Open Multilingual Multimodal LLM for 39 Languages • 26 items • Updated Feb 1 • 19
DataGemma Release Collection A series of pioneering open models that help ground LLMs in real-world data through Data Commons. • 2 items • Updated Jul 10 • 87
VideoLLaMA2 Collection Optimized VideoLLaMA with improved spatial-temporal modeling and enhanced audio understanding capabilities • 13 items • Updated Sep 2 • 19
Extracting Concepts from LLMs: Anthropic’s recent discoveries 📖 Article • Published Jun 20, 2024 • 26
Cohere Labs Aya 23 Collection Aya 23 is an open-weights research release of an instruction-fine-tuned model with highly advanced multilingual capabilities. • 3 items • Updated Jul 31 • 56
Reducing Transformer Key-Value Cache Size with Cross-Layer Attention Paper • 2405.12981 • Published May 21, 2024 • 33
Ferret-v2: An Improved Baseline for Referring and Grounding with Large Language Models Paper • 2404.07973 • Published Apr 11, 2024 • 32
StarChat2 15B Collection Models, datasets, and demo for StarChat2 15B. For the code to train the models, see: https://github.com/huggingface/alignment-handbook • 10 items • Updated Apr 12, 2024 • 13
PERL: Parameter Efficient Reinforcement Learning from Human Feedback Paper • 2403.10704 • Published Mar 15, 2024 • 59