The Project

A terminal UI for managing llama.cpp servers. Start, stop, and monitor the server, manage versions, download GGUF models from Hugging Face, and track inference performance in real time.

Built for people who already live in the terminal. Tested on Linux.

npm install -g llama-manager

// ABOUT

A hobby project for managing local LLM inference. Created by Patryk Bajer (bayger) and contributors.

Apache License 2.0

// IN ACTION

llama-manager demo

// FEATURES

Dashboard

Real-time per-slot metrics, server controls, live log viewer.

Logs

Dedicated server log viewer with structured severity coloring.

Tasks

Parsed task history with token counts, speeds, draft acceptance, SQLite persistence.

Profiles

Named server configurations with type-aware preset editors.

Versions

Install, switch, and uninstall llama.cpp builds from GitHub releases.

Models

Search Hugging Face, download GGUF models with progress tracking.

Options

Global settings: paths, poll interval, task limits, appearance, theme, HF token.
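
For readers curious what "monitoring the server" can look like at the HTTP level, here is a minimal sketch of a readiness check. It is not taken from llama-manager's source. The only thing it relies on from llama.cpp is the server's `GET /health` endpoint, which answers 200 when the model is ready and 503 while it is still loading. The names `ServerState`, `classifyHealth`, and `checkHealth`, and the base URL, are assumptions made for illustration.

```typescript
// Hypothetical helper names; only the /health endpoint comes from llama.cpp's server.
type ServerState = "ready" | "loading" | "unreachable" | "error";

// Map the HTTP status of GET /health to a coarse state.
// A null status means the request failed before any response arrived.
function classifyHealth(status: number | null): ServerState {
  if (status === null) return "unreachable";
  if (status === 200) return "ready";   // model loaded, server accepting requests
  if (status === 503) return "loading"; // server is up, model still loading
  return "error";                       // any other status is treated as a fault
}

// Perform one health check. The default base URL is an assumption
// (llama-server listens on port 8080 unless told otherwise).
async function checkHealth(
  baseUrl: string = "http://127.0.0.1:8080",
): Promise<ServerState> {
  try {
    const res = await fetch(`${baseUrl}/health`);
    return classifyHealth(res.status);
  } catch {
    // Connection refused, DNS failure, and similar network errors land here.
    return classifyHealth(null);
  }
}
```

Calling `checkHealth()` on a timer (for example with `setInterval`, at whatever poll interval is configured) would yield a simple state feed. Keeping `classifyHealth` separate from the network call makes the mapping easy to test without a running server.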

// DISCLAIMER

This is a spare-time project, and new versions will be released "when they're ready". If you have an idea or something isn't working right, drop a message on the GitHub Discussions page. External PRs are not accepted.