
ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 2 CUDA devices:
  Device 0: NVIDIA GeForce RTX 3090, compute capability 8.6, VMM: yes
  Device 1: NVIDIA GeForce RTX 3090, compute capability 8.6, VMM: yes

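| model                  |      size |  params | backend | ngl |  test |          t/s |
| ---------------------- | --------: | ------: | ------- | --: | ----: | -----------: |
| seed_oss 36B MXFP4 MoE | 17.89 GiB | 36.15 B | CUDA    |  35 |   pp8 | 20.46 ± 0.38 |
| seed_oss 36B MXFP4 MoE | 17.89 GiB | 36.15 B | CUDA    |  35 | tg128 |  5.52 ± 0.01 |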

build: 92bb442ad (7040)