LFM2 2.6B
Q4_K_M · 2.6B params · GGUF
checkpoint: LiquidAI/LFM2-2.6B-GGUF:Q4_K_M

All runs (10)

| Hardware | Backend | Shape | Concurrency | Gen tok/s | TTFT | TPOT (ms) | Out tok | Total | VRAM Δ |
|---|---|---|---|---|---|---|---|---|---|
| GeForce RTX 5070 · 11.94 GiB | llama.cpp b9174 (cuda) | codegen | 1 | 268.6 | 22ms | 3.7 | 926 | 3.46s | 0.000 GiB |
| GeForce RTX 5070 · 11.94 GiB | llama.cpp b9174 (cuda) | agent | 1 | 262.7 | 16ms | 3.7 | 370 | 1.44s | 0.000 GiB |
| GeForce RTX 5070 · 11.94 GiB | llama.cpp b9174 (cuda) | chat | 1 | 259.5 | 19ms | 3.7 | 100 | 382ms | 0.000 GiB |
| GeForce RTX 5070 · 11.94 GiB | llama.cpp b9174 (cuda) | rag | 1 | 249.9 | 50ms | 3.7 | 90 | 357ms | 0.000 GiB |
| GeForce RTX 3090 · 24 GiB | llama.cpp 59778f0 (cuda) | chat | 1 | 238.9 | 21ms | 3.9 | 100 | 408ms | 0.000 GiB |
| GeForce RTX 3090 · 24 GiB | llama.cpp 59778f0 (cuda) | codegen | 1 | 232.4 | 35ms | 4.3 | 832 | 3.60s | 0.000 GiB |
| GeForce RTX 3090 · 24 GiB | llama.cpp 59778f0 (cuda) | rag | 1 | 222.6 | 66ms | 4.0 | 121 | 508ms | 0.000 GiB |
| GeForce RTX 3090 · 24 GiB | llama.cpp 59778f0 (cuda) | agent | 1 | 221.1 | 90ms | 4.3 | 500 | 2.26s | 0.000 GiB |
| GeForce RTX 5070 · 11.94 GiB | llama.cpp b9174 (cuda) | agent | 4 | 133.9 | 369ms | 6.8 | 369 | 2.82s | 0.000 GiB |
| GeForce RTX 3090 · 24 GiB | llama.cpp 59778f0 (cuda) | agent | 4 | 111.7 | 434ms | 7.9 | 500 | 3.94s | 0.000 GiB |
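
The table does not define its columns. Under the common convention that TPOT is the mean gap between output tokens after the first one, it can be approximated from the Total, TTFT, and Out tok columns. The Python sketch below only writes that assumption out. The function name `tpot_ms` is illustrative and not part of the benchmark harness. Because each cell aggregates several runs, the reported values need not satisfy the formula exactly.

```python
def tpot_ms(total_s: float, ttft_s: float, out_tokens: int) -> float:
    """Approximate time per output token in milliseconds.

    Assumed definition (not stated by the source):
        TPOT = (Total - TTFT) / (Out tok - 1)
    """
    if out_tokens < 2:
        raise ValueError("need at least two output tokens to measure a gap")
    return (total_s - ttft_s) / (out_tokens - 1) * 1000.0


# RTX 5070, codegen, concurrency 1: Total 3.46 s, TTFT 22 ms, 926 tokens.
print(round(tpot_ms(3.46, 0.022, 926), 1))  # 3.7

# RTX 3090, codegen, concurrency 1: Total 3.60 s, TTFT 35 ms, 832 tokens.
print(round(tpot_ms(3.60, 0.035, 832), 1))  # 4.3
```

Both results agree with the TPOT column for those rows. Rows with short outputs, or with concurrency 4, deviate more, which is consistent with the columns being aggregated independently across runs.
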
Environment
GeForce RTX 3090 · 24 GiB
cpu: AMD EPYC 7302P 16-Core Processor
gpu: NVIDIA GeForce RTX 3090
arch: NVIDIA
vram: 24 GiB (system 64.0 GiB)
power: 200 W / 450 W max (44% cap)
backend: llama.cpp 59778f0 (cuda)
server: lemonade unknown
os: Ubuntu 24.04 LTS
kernel: 6.17.13-7-pve
driver: 590.48.01
python: 3.12.3
containerized: true
runs/cell: 5
warmups: 2
endpoint: /v1/chat/completions
streaming: true
GeForce RTX 5070 · 11.94 GiB
cpu: AMD Ryzen 9 7900 12-Core Processor
gpu: NVIDIA GeForce RTX 5070
arch: NVIDIA
vram: 11.94 GiB (system 30.4 GiB)
power: 250 W / 300 W max (83% cap)
backend: llama.cpp b9174 (cuda)
server: lemonade unknown
os: CachyOS
kernel: 7.0.0-1-cachyos
driver: 595.58.03
python: 3.14.4
containerized: false
runs/cell: 5
warmups: 2
endpoint: /v1/chat/completions
streaming: true
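
Both environments list 2 warmups, 5 runs per cell, and a streaming endpoint. A minimal sketch of how such settings might turn streamed-token timestamps into one table cell follows. The helper names `request_metrics` and `summarize` are hypothetical. The use of the median is an assumption, since the source does not say which statistic it reports. Timestamps are plain floats in seconds so the code runs without a server.

```python
from statistics import median


def request_metrics(sent_at: float, token_times: list[float]) -> dict[str, float]:
    """Per-request metrics from the arrival time of each streamed token."""
    if len(token_times) < 2:
        raise ValueError("need at least two streamed tokens")
    ttft = token_times[0] - sent_at          # time to first token
    total = token_times[-1] - sent_at        # end-to-end latency
    tpot = (total - ttft) / (len(token_times) - 1)  # mean gap after first token
    return {"ttft_s": ttft, "tpot_s": tpot, "total_s": total}


def summarize(runs: list[dict[str, float]], warmups: int = 2) -> dict[str, float]:
    """Drop the warmup runs, then take the median of each metric (assumed)."""
    measured = runs[warmups:]
    if not measured:
        raise ValueError("no measured runs left after discarding warmups")
    return {key: median(run[key] for run in measured) for key in measured[0]}


# Two slow warmup runs followed by five measured runs (synthetic timestamps).
ttfts = [0.50, 0.40, 0.020, 0.022, 0.021, 0.025, 0.019]
runs = [request_metrics(0.0, [t, t + 0.004, t + 0.008]) for t in ttfts]
print(summarize(runs))
```

Discarding the warmups keeps one-time costs such as model loading or cache population out of the reported figure, which is the usual reason for including them.
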