llmfit matches LLMs to your hardware. It detects RAM, CPU cores, and GPU specs, then scores hundreds of language models across four dimensions: quality, speed, fit, and context. It offers an interactive TUI with vim-like navigation, a classic CLI, and a REST API; it automatically selects optimal quantization levels and handles MoE architectures; and it integrates with Ollama, llama.cpp, MLX, and Docker Model Runner. The catalogue covers 200+ models from Meta, Mistral, Qwen, and Google.
# Install
cargo install llmfit
# Or clone and build from source
git clone https://github.com/AlexsJones/llmfit.git
# Interactive TUI - auto-detects hardware
llmfit
# CLI recommendations
llmfit recommend --gpu-vram 8GB
# Score a specific model
llmfit fit meta-llama/Llama-3-70B
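At its core, the fit dimension comes down to whether a model's memory footprint fits in your available VRAM/RAM. As a rough illustration only (this is a back-of-the-envelope sketch, not llmfit's actual scoring formula), a dense model's weight footprint is roughly parameter count × bits-per-weight ÷ 8 bytes, plus some overhead for the KV cache and runtime buffers:

```python
def estimate_weight_gib(params: float, bits: int, overhead: float = 1.1) -> float:
    """Rough weight-memory estimate in GiB: params * bits / 8 bytes,
    scaled by ~10% for KV cache and runtime buffers. Illustrative only."""
    return params * bits / 8 / 2**30 * overhead

# Llama-3-70B at 4-bit quantization: ~35.9 GiB -- far beyond 8 GB of VRAM,
# while an 8B model at 4-bit (~4.1 GiB) would fit comfortably.
print(f"{estimate_weight_gib(70e9, 4):.1f} GiB")  # 35.9 GiB
print(f"{estimate_weight_gib(8e9, 4):.1f} GiB")   # 4.1 GiB
```

MoE models complicate this kind of estimate, since all experts' weights must be resident even though only a subset activates per token, which is presumably why llmfit handles MoE architectures as a special case.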