LLM Inference Hardware Calculator
Estimate VRAM & System RAM for single-user inference (Batch=1).
Model quantization and KV cache quantization are configured separately.
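As a rough illustration of why the two quantization settings are independent, here is a minimal sketch of a Batch=1 memory estimate. It is not this calculator's actual formula, which the page does not state. It assumes a standard transformer with one K and one V tensor per layer, and it ignores activations, quantization metadata (scales, zero points), and runtime overhead. All function names and the example model dimensions are hypothetical.

```python
def weights_bytes(n_params: float, weight_bits: float) -> float:
    """Approximate size of the model weights at the given bit width."""
    return n_params * weight_bits / 8


def kv_cache_bytes(n_layers: int, n_kv_heads: int, head_dim: int,
                   context_len: int, kv_bits: float, batch: int = 1) -> float:
    """Approximate KV cache size for a full context window.

    The factor 2 accounts for storing both keys and values in every layer.
    """
    elements = 2 * n_layers * n_kv_heads * head_dim * context_len * batch
    return elements * kv_bits / 8


def estimate_vram_gib(n_params: float, weight_bits: float, n_layers: int,
                      n_kv_heads: int, head_dim: int, context_len: int,
                      kv_bits: float) -> float:
    """Weights plus KV cache, in GiB. Overheads are deliberately left out."""
    total = weights_bytes(n_params, weight_bits) + kv_cache_bytes(
        n_layers, n_kv_heads, head_dim, context_len, kv_bits
    )
    return total / 2**30


# Hypothetical 8B-parameter model: 4-bit weights, 16-bit KV cache, 8192 tokens.
print(f"{estimate_vram_gib(8e9, 4, 32, 8, 128, 8192, 16):.2f} GiB")
```

Changing `weight_bits` affects only the weights term and changing `kv_bits` affects only the cache term, so the two can be tuned separately. A real deployment will likely need more headroom than this sketch reports.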
Model Configuration
System Configuration
Hardware Requirements
VRAM Needed:
On-Disk Size:
GPU Config:
System RAM: