Qwen3.6 27B

Q4_K_XL · 27B params · GGUF
checkpoint: unsloth/Qwen3.6-27B-GGUF:Qwen3.6-27B-UD-Q4_K_XL.gguf

All runs (15)

| Hardware | Backend | Shape | Conc. | Gen tok/s | TTFT | TPOT (ms) | Out tok | Total | VRAM Δ |
|---|---|---|---|---|---|---|---|---|---|
| GeForce RTX 3090 · 24 GiB | llama.cpp 59778f0 (cuda) | codegen | 1 | 21.2 | 358ms | 47.0 | 1000 | 47.28s | 0.000 GiB |
| GeForce RTX 3090 · 24 GiB | llama.cpp 59778f0 (cuda) | chat | 1 | 21.1 | 253ms | 43.6 | 100 | 4.73s | 0.000 GiB |
| GeForce RTX 3090 · 24 GiB | llama.cpp 59778f0 (cuda) | agent | 1 | 20.7 | 607ms | 47.2 | 500 | 24.14s | 0.000 GiB |
| GeForce RTX 3090 · 24 GiB | llama.cpp 59778f0 (cuda) | rag | 1 | 19.7 | 919ms | 46.0 | 200 | 10.17s | 0.000 GiB |
| Strix Halo · Radeon 8060S · 128 GiB unified (96 GiB VRAM) | llama.cpp b8940 (vulkan) | codegen | 1 | 11.9 | 481ms | 83.6 | 1000 | 84.14s | 0.000 GiB |
| Strix Halo · Radeon 8060S · 128 GiB unified (96 GiB VRAM) | llama.cpp b8940 (vulkan) | chat | 1 | 11.6 | 360ms | 83.2 | 100 | 8.61s | 0.000 GiB |
| Strix Halo · Radeon 8060S · 128 GiB unified (96 GiB VRAM) | llama.cpp b1203 (rocm) | codegen | 1 | 11.5 | 426ms | 86.7 | 1000 | 87.08s | 0.003 GiB |
| Strix Halo · Radeon 8060S · 128 GiB unified (96 GiB VRAM) | llama.cpp b8940 (vulkan) | agent | 1 | 11.4 | 1.98s | 84.8 | 500 | 44.00s | 0.000 GiB |
| Strix Halo · Radeon 8060S · 128 GiB unified (96 GiB VRAM) | llama.cpp b1203 (rocm) | chat | 1 | 11.2 | 351ms | 86.3 | 100 | 8.93s | 0.002 GiB |
| Strix Halo · Radeon 8060S · 128 GiB unified (96 GiB VRAM) | llama.cpp b1203 (rocm) | agent | 1 | 11.1 | 1.78s | 87.0 | 500 | 45.13s | 0.007 GiB |
| Strix Halo · Radeon 8060S · 128 GiB unified (96 GiB VRAM) | llama.cpp b8940 (vulkan) | rag | 1 | 10.8 | 1.95s | 83.8 | 200 | 18.60s | 0.000 GiB |
| Strix Halo · Radeon 8060S · 128 GiB unified (96 GiB VRAM) | llama.cpp b1203 (rocm) | rag | 1 | 10.5 | 1.73s | 87.1 | 200 | 19.09s | 0.004 GiB |
| GeForce RTX 3090 · 24 GiB | llama.cpp 59778f0 (cuda) | agent | 4 | 9.7 | 4.02s | 91.7 | 341 | 35.46s | 0.030 GiB |
| Strix Halo · Radeon 8060S · 128 GiB unified (96 GiB VRAM) | llama.cpp b8940 (vulkan) | agent | 4 | 7.2 | 3.82s | 128.2 | 500 | 69.24s | 0.000 GiB |
| Strix Halo · Radeon 8060S · 128 GiB unified (96 GiB VRAM) | llama.cpp b1203 (rocm) | agent | 4 | 5.3 | 3.44s | 178.9 | 500 | 93.77s | -0.002 GiB |
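
The page does not define how the columns relate to each other. The figures in the table above appear consistent with two simple relations: Gen tok/s ≈ Out tok / Total, and Total ≈ TTFT + TPOT × (Out tok − 1). The sketch below checks both on four rows copied from the table. The relations themselves and the helper names (`implied_tok_per_s`, `implied_total_s`) are assumptions made for illustration, not the benchmark's documented definitions. Small gaps are expected, since each cell is presumably an average over several runs.

```python
# Rows copied from the table: (label, gen_tok_s, ttft_s, tpot_ms, out_tok, total_s)
ROWS = [
    ("RTX 3090 cuda codegen conc=1", 21.2, 0.358, 47.0, 1000, 47.28),
    ("RTX 3090 cuda agent conc=4", 9.7, 4.02, 91.7, 341, 35.46),
    ("Strix Halo vulkan codegen conc=1", 11.9, 0.481, 83.6, 1000, 84.14),
    ("Strix Halo rocm agent conc=4", 5.3, 3.44, 178.9, 500, 93.77),
]


def implied_tok_per_s(out_tok, total_s):
    """Assumed relation: generation rate is output tokens over wall-clock total."""
    return out_tok / total_s


def implied_total_s(ttft_s, tpot_ms, out_tok):
    """Assumed relation: first token arrives after TTFT, then one TPOT per remaining token."""
    return ttft_s + (tpot_ms / 1000.0) * (out_tok - 1)


for label, gen, ttft, tpot, n, total in ROWS:
    rate = implied_tok_per_s(n, total)
    est_total = implied_total_s(ttft, tpot, n)
    print(f"{label}: reported {gen} tok/s, implied {rate:.2f} tok/s; "
          f"reported total {total}s, implied {est_total:.2f}s")
```

If these relations hold, the concurrency-4 rows would be reporting a per-request rate rather than aggregate throughput across the four streams; the page does not confirm this.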

Environment

GeForce RTX 3090 · 24 GiB
cpu: AMD EPYC 7302P 16-Core Processor
gpu: NVIDIA GeForce RTX 3090
arch: NVIDIA
vram: 24 GiB (system 64.0 GiB)
power: 200 W / 450 W max (44% cap)
backend: llama.cpp 59778f0 (cuda)
server: lemonade unknown
os: Ubuntu 24.04 LTS
kernel: 6.17.13-7-pve
driver: 590.48.01
python: 3.12.3
containerized: true
runs/cell: 5
warmups: 2
endpoint: /v1/chat/completions
streaming: true

Strix Halo · Radeon 8060S · 128 GiB unified (96 GiB VRAM)
cpu: AMD RYZEN AI MAX+ 395 w/ Radeon 8060S
gpu: AMD Radeon 8060S
arch: Strix Halo (gfx1151)
vram: 96 GiB (system 31.1 GiB, unified)
backend: llama.cpp b1203 (rocm)
server: lemonade 10.4.0
os: Ubuntu 24.04.4 LTS
kernel: 7.0.2-2-pve
python: 3.12.3
containerized: true
runs/cell: 3
warmups: 1
endpoint: /v1/chat/completions
streaming: true

Strix Halo · Radeon 8060S · 128 GiB unified (96 GiB VRAM)
cpu: AMD RYZEN AI MAX+ 395 w/ Radeon 8060S
gpu: AMD Radeon 8060S
arch: Strix Halo (gfx1151)
vram: 96 GiB (system 31.1 GiB, unified)
backend: llama.cpp b8940 (vulkan)
server: lemonade 10.4.0
os: Ubuntu 24.04.4 LTS
kernel: 7.0.2-2-pve
python: 3.12.3
containerized: true
runs/cell: 3
warmups: 1
endpoint: /v1/chat/completions
streaming: true
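
The environment blocks list a streaming /v1/chat/completions endpoint with a number of warmups and runs per cell, but the page does not describe how the timing metrics are computed. The sketch below shows one plausible way to derive TTFT, TPOT, and Gen tok/s from the arrival times of streamed tokens, and to average a cell after discarding warmup runs. The function names (`stream_metrics`, `aggregate_cell`), the metric definitions, and the use of a plain mean are all assumptions for illustration, not the benchmark harness's actual code.

```python
from statistics import mean


def stream_metrics(t_request, token_times):
    """Derive timing metrics for one streamed response.

    t_request: time the request was sent, in seconds.
    token_times: arrival time of each streamed token, in seconds, ascending.
    """
    if not token_times:
        raise ValueError("no tokens received")
    n = len(token_times)
    ttft_s = token_times[0] - t_request    # time to first token
    total_s = token_times[-1] - t_request  # wall-clock time for the whole response
    # Assumed definition: mean gap between consecutive tokens after the first.
    tpot_ms = (token_times[-1] - token_times[0]) / (n - 1) * 1000.0 if n > 1 else 0.0
    return {
        "ttft_s": ttft_s,
        "tpot_ms": tpot_ms,
        "out_tok": n,
        "total_s": total_s,
        "gen_tok_s": n / total_s,
    }


def aggregate_cell(runs, warmups):
    """Discard the first `warmups` runs, then average each metric over the rest."""
    measured = runs[warmups:]
    if not measured:
        raise ValueError("no measured runs left after discarding warmups")
    return {key: mean(run[key] for run in measured) for key in measured[0]}


# Synthetic example: one warmup run followed by two measured runs.
runs = [
    stream_metrics(0.0, [0.90, 1.00, 1.10]),  # warmup, discarded
    stream_metrics(0.0, [0.25, 0.30, 0.35, 0.40, 0.45]),
    stream_metrics(0.0, [0.35, 0.40, 0.45, 0.50, 0.55]),
]
print(aggregate_cell(runs, warmups=1))
```

In a real harness the token timestamps would come from timing each streamed chunk as it arrives. Whether a chunk corresponds to exactly one token depends on the server, so that mapping is also an assumption here.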