YaRN scale to 122,880 context length
#41 opened 4 months ago
by
nbroad
chat_template issue
#40 opened 4 months ago
by
Alex-Chan
Add assistant mask support to Qwen3-235B-A22B
🚀
2
#38 opened 5 months ago
by
waleko
system prompt suggestion
1
#37 opened 5 months ago
by
Shuaiqi
Publish to GitHub Marketplace
#36 opened 5 months ago
by
typed-sigterm
The Qwen3-235B-A22B model is not as effective as the Qwen3-32B model.
🔥
4
#35 opened 5 months ago
by
czqqq
Video?
2
#34 opened 5 months ago
by
jujutechnology
Observations and suggestions on how the "chain-of-thought" mode of Qwen3-235B-A22B affects output style in literary writing
🧠
👍
9
#33 opened 6 months ago
by
yxcl6874
In complex reasoning tasks Qwen3 is far behind QwQ
12
#32 opened 6 months ago
by
AdamF92
I know this is not related, but I saw this Star Wars picture on Facebook and thought it was code, so I'm trying to say hi to someone named John!
#31 opened 6 months ago
by
ebearden
Not sure Clocks keep coming up and keep getting the run around might need time stamping confirmations?
#30 opened 6 months ago
by
ebearden
Not Sure Original to Persian to Binary?
#29 opened 6 months ago
by
ebearden
Qwen3 is simply amazing.
#28 opened 6 months ago
by
Trilogix1
Upload b891b3c3bb6a146a8e809bb72a06d101.png
#27 opened 6 months ago
by
Jalil16
Add image visual recognition output just like Qwen2.5-VL-32B-Instruct
6
#26 opened 6 months ago
by
devopsML
English to French - French to English based on Meta & HuggingFace Chat Bot
#25 opened 6 months ago
by
ebearden
Qwen3 hallucinates far too much; it is much worse than Qwen 2.5
➕
1
9
#24 opened 6 months ago
by
hehua2008
Upload 3 files
#23 opened 6 months ago
by
neuroQuantu
Upload 3 files
1
#22 opened 6 months ago
by
neuroQuantu
Model keeps talking about Cumhurbaşkanlığı Sarayı when speaking Turkish
#21 opened 6 months ago
by
aeminkocal
Qwen3 not Using Tools in Complex Prompts Unlike QwQ-32B
8
#20 opened 6 months ago
by
Anaudia
Thanks a lot for this release
🔥
3
#19 opened 6 months ago
by
Volko76
Does anyone feel Qwen3 often fails to follow instructions accurately?
🚀
7
7
#18 opened 6 months ago
by
DOFOFFICIAL
Two of the base models are missing
➕
1
1
#17 opened 6 months ago
by
ZhangRC
Qwen has been losing broad knowledge since Qwen2.
🔥
👍
12
16
#16 opened 6 months ago
by
phil111
GPQA perf for DSV3-Base seems wrong
➕
4
2
#15 opened 6 months ago
by
AChen-qaq
72B-MoE
👍
4
#13 opened 6 months ago
by
avalonsec
Will the 235B Base model be released?
➕
8
#12 opened 6 months ago
by
Yantao2009
The model card and model architecture don't mention a vision encoder, but the Qwen online model service lets this model look at images. Which vision encoder does the vision part reuse?
5
#11 opened 6 months ago
by
Chloez
Has anyone tried running this on 4 H20 GPUs?
2
#10 opened 6 months ago
by
Edison0902
Can this be deployed on an 8-GPU A100 node with 80 GB of memory per card?
10
#9 opened 6 months ago
by
Yuxin362
User rating and reviews of Qwen3 App and Qwen3 Model
#8 opened 6 months ago
by
DeepNLP
Does the reward function lack an n-gram repetition penalty?
2
#7 opened 6 months ago
by
wzx111
🚀[Fine-tuning] Qwen3-MoE Megatron Training Implementation and Best Practices👋
🚀
6
1
#6 opened 6 months ago
by
study-hjt
[Evaluation] Best practice for evaluating Qwen3!!
🚀
👍
4
#5 opened 6 months ago
by
wangxingjun778
Please upload the base model for this one
👍
5
#4 opened 6 months ago
by
mesh-ops
GPTQ/AWQ
👀
14
4
#3 opened 6 months ago
by
ndurkee
Add languages tag
#2 opened 6 months ago
by
de-francophones