ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 2 CUDA devices:
  Device 0: NVIDIA GeForce RTX 3090, compute capability 8.6, VMM: yes
  Device 1: NVIDIA GeForce RTX 3090, compute capability 8.6, VMM: yes

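| model                   |      size |  params | backend | ngl |  test |          t/s |
| ----------------------- | --------: | ------: | ------- | --: | ----: | -----------: |
| seed_oss 36B Q4_K - Medium | 20.26 GiB | 36.15 B | CUDA    |  35 |   pp8 | 26.65 ± 0.22 |
| seed_oss 36B Q4_K - Medium | 20.26 GiB | 36.15 B | CUDA    |  35 | tg128 |  4.84 ± 0.03 |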

build: 92bb442ad (7040)