Qwen3.6 27B

Q8_0 · 27B params · GGUF
reasoning
checkpoint: unsloth/Qwen3.6-27B-GGUF:Qwen3.6-27B-Q8_0.gguf

All runs (5)

| Hardware | Backend | Mode | Shape | Conc. | Gen tok/s | Prefill tok/s | TTFT | TPOT (ms) | Prompt tok | Out tok | Total | VRAM Δ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Strix Halo · Radeon 8060S · 128 GiB unified (96 GiB VRAM) · unified · drv 7 | llama.cpp rocm-4f13cb7 (rocm) | baseline | agent | 1 | 7.7 | 2129.7 | 281 ms | 129.9 | 599 | 500 | 65.26 s | 0.011 GiB |
| Strix Halo · Radeon 8060S · 128 GiB unified (96 GiB VRAM) · unified · drv 7 | llama.cpp rocm-4f13cb7 (rocm) | baseline | codegen | 1 | 7.6 | 141.4 | 439 ms | 129.7 | 62 | 1000 | 131.22 s | 0.017 GiB |
| Strix Halo · Radeon 8060S · 128 GiB unified (96 GiB VRAM) · unified · drv 7 | llama.cpp rocm-4f13cb7 (rocm) | baseline | chat | 1 | 7.5 | 66.9 | 453 ms | 129.5 | 30 | 100 | 13.42 s | 0.003 GiB |
| Strix Halo · Radeon 8060S · 128 GiB unified (96 GiB VRAM) · unified · drv 7 | llama.cpp rocm-4f13cb7 (rocm) | baseline | rag | 1 | 7.3 | 474.9 | 1.54 s | 129.8 | 842 | 200 | 27.43 s | 0.006 GiB |
| Strix Halo · Radeon 8060S · 128 GiB unified (96 GiB VRAM) · unified · drv 7 | llama.cpp rocm-4f13cb7 (rocm) | baseline | agent | 4 | 3.2 | 7.7 | 97.84 s | 129.8 | 599 | 500 | 162.66 s | 0.039 GiB |
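The columns in the runs table appear to be related by simple arithmetic. The sketch below is a rough consistency check, not the benchmark's own definition of its metrics: it assumes decode rate is the inverse of TPOT, prefill rate is prompt tokens divided by TTFT, and total time is TTFT plus output tokens times TPOT. The function name `derive_metrics` is hypothetical. The input values are taken from the agent row at concurrency 1.

```python
def derive_metrics(ttft_s: float, tpot_ms: float, prompt_tok: int, out_tok: int) -> dict:
    """Derive throughput and total time from latency figures.

    These formulas are assumptions about how the columns relate;
    the source does not state how the harness computes them.
    """
    return {
        # Decode rate: tokens per second implied by the time per output token.
        "decode_tok_s": 1000.0 / tpot_ms,
        # Prefill rate: prompt tokens processed before the first token arrives.
        "prefill_tok_s": prompt_tok / ttft_s,
        # End-to-end time: wait for the first token, then one TPOT per output token.
        "total_s": ttft_s + out_tok * tpot_ms / 1000.0,
    }


# Agent shape, concurrency 1: TTFT 281 ms, TPOT 129.9 ms, 599 prompt tokens, 500 output tokens.
agent = derive_metrics(ttft_s=0.281, tpot_ms=129.9, prompt_tok=599, out_tok=500)
for name, value in agent.items():
    print(f"{name}: {value:.2f}")
```

For this row the derived figures land within about 1% of the reported 7.7 tok/s, 2129.7 tok/s, and 65.26 s. The rag and concurrency-4 rows agree less closely on prefill rate, which may reflect averaging over runs or queueing time being counted in TTFT.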

Environment

Strix Halo · Radeon 8060S · 128 GiB unified (96 GiB VRAM)
cpu: AMD RYZEN AI MAX+ 395 w/ Radeon 8060S
gpu: AMD Radeon 8060S
arch: Strix Halo (gfx1151)
vram: 96 GiB (system 31.1 GiB, unified)
pcie: Gen 4 x16 / Gen 4 x16 max
clocks: gfx 1447 MHz · mem 1000 MHz
temp: 47°C idle · 71°C peak
peak draw: 98 W
backend: llama.cpp rocm-4f13cb7 (rocm)
server: lemonade unknown
os: Ubuntu 24.04 LTS
kernel: 7.0.2-2-pve
driver: ROCm 7.2.3
libc: 2.39
python: 3.12.3
containerized: true
llama.cpp: version: 1 (4f13cb7) built with Clang 22.0.0 for Linux x86_64
build flags: GGML_HIP=ON AMDGPU_TARGETS=gfx1151 CMAKE_BUILD_TYPE=Release
runs/cell: 5
warmups: 2
endpoint: /v1/chat/completions
streaming: true
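Since the runs use the streaming /v1/chat/completions endpoint, TTFT and TPOT can in principle be timed on the client from the arrival of streamed chunks. The following is a minimal sketch of that idea, not the harness actually used here. It assumes OpenAI-style server-sent events (`data: {...}` lines ending with `data: [DONE]`), and it treats each non-empty chunk as one token, which is only an approximation. The function name `time_stream` is hypothetical, and reading a `reasoning_content` field for reasoning text is an assumption about the server's output format.

```python
import json
import time
from typing import Callable, Iterable, Optional


def time_stream(lines: Iterable[str],
                clock: Callable[[], float] = time.perf_counter) -> dict:
    """Time an OpenAI-style SSE stream: first-chunk latency and mean inter-chunk gap."""
    start = clock()
    first: Optional[float] = None
    last = start
    chunks = 0
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and SSE comments
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        choices = json.loads(payload).get("choices") or []
        delta = choices[0].get("delta", {}) if choices else {}
        # Assumption: reasoning text may arrive in "reasoning_content" instead of "content".
        text = delta.get("content") or delta.get("reasoning_content")
        if not text:
            continue  # role-only or empty chunks carry no generated text
        now = clock()
        if first is None:
            first = now
        last = now
        chunks += 1
    ttft_s = None if first is None else first - start
    # Mean gap between successive text chunks, in milliseconds.
    tpot_ms = (last - first) / (chunks - 1) * 1000.0 if chunks > 1 else None
    return {"ttft_s": ttft_s, "tpot_ms": tpot_ms, "chunks": chunks}
```

The `lines` argument can be any iterable of decoded text lines, so the same function works on a recorded stream or on a live HTTP response body read line by line. Injecting `clock` keeps the timing logic testable without a server.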