Documentation

mini-claude in a nutshell

mini-claude is a fast, private terminal chat client for self-hosted large language models. It speaks the OpenAI-compatible HTTP API, streams tokens in real time, and ships as a single Go binary. With a local inference server, nothing leaves your machine.

Quickstart

You need three things:

  1. An inference server running locally (we recommend Ollama).
  2. At least one model pulled.
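  3. The mini-claude binary or the source.

# Start the inference server in the background and pull a small model.
ollama serve &
ollama pull llama3.2:3b

# Fetch the source and run the terminal UI.
git clone https://github.com/hugostarte/Mini-Claude.git
cd Mini-Claude
go run ./cmd/tui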

That’s it. A welcome screen appears, you type a message, the response streams in.

Next steps