Local-first · No cloud · No API keys
A thirteen-agent coding swarm that lives entirely on your own machine.
Dispatch a single prompt to specialized agents (architect, programmer, security, reviewer and more), orchestrated across four modes and run against local inference servers (llama.cpp · MLX · vLLM). Privacy-first, air-gappable, and instant.
Everything runs on Apple Silicon (MLX) or NVIDIA (vLLM) — nothing leaves the box.
Architect, foreman, programmer, security, database, frontend, reviewer, tester, debugger, optimizer, scout, mlx-scout and synthesis — each with a tuned system prompt.
Point any agent at any model. Blend MLX, llama.cpp and vLLM in a single swarm, with per-agent model overrides from the configure panel.
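As a rough illustration of per-agent overrides, the sketch below maps agent names to local endpoints and falls back to a default for any agent without an override. The `Endpoint` type, the ports and the model names are all hypothetical; the real settings live in the configure panel and may be shaped quite differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Endpoint:
    backend: str   # "mlx", "llama.cpp" or "vllm"
    base_url: str  # hypothetical local address; nothing leaves the machine
    model: str     # hypothetical model name


# Assumed default used by every agent that has no override.
DEFAULT = Endpoint("llama.cpp", "http://127.0.0.1:8080", "default-model")

# Assumed overrides: two agents pinned to different backends in one swarm.
OVERRIDES = {
    "architect": Endpoint("mlx", "http://127.0.0.1:8081", "large-model"),
    "security": Endpoint("vllm", "http://127.0.0.1:8000", "code-model"),
}


def resolve(agent: str, overrides=OVERRIDES, default=DEFAULT) -> Endpoint:
    """Return the endpoint for an agent, falling back to the default."""
    return overrides.get(agent, default)
```

With this layout, `resolve("architect")` yields the MLX endpoint, while an agent such as `tester` that has no entry keeps the default.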
Sessions auto-continue after the first broadcast. The conversation thread panel keeps full turn history, persisted to disk.
Per-agent codebase retrieval injects relevant context so every agent reasons over your repository, not a stale snapshot.
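The retrieval method itself is not described here, so the following is only a minimal sketch of the idea, assuming a plain keyword-overlap score in place of whatever ranking the tool actually uses. The function names and the `agent_keywords` parameter are hypothetical; they show how each agent could bias retrieval toward its own specialty.

```python
import re


def tokens(text):
    """Lowercased identifier-like tokens found in a piece of text."""
    return set(re.findall(r"[a-z_][a-z0-9_]*", text.lower()))


def retrieve(query, chunks, agent_keywords=(), k=2):
    """Rank repository chunks by token overlap with the query.

    `chunks` maps a file path to its text. `agent_keywords` is an assumed
    per-agent bias (e.g. a database agent adding "connect").
    """
    wanted = tokens(query) | set(agent_keywords)
    scored = []
    for path, text in chunks.items():
        score = len(wanted & tokens(text))
        if score:
            scored.append((score, path))
    # Highest score first; ties broken by path for stable output.
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [path for _, path in scored[:k]]
```

The returned paths would then be read fresh from disk and prepended to that agent's prompt, which is what keeps the context current rather than a stale snapshot.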
Monitor popout shows KV-cache pressure, unified-memory gauges and per-port MLX telemetry in real time.
No cloud dependency, no API keys, air-gappable. A React UI talking to a C++ coordinator over local ports.
Four orchestration modes, selectable from the UI mode menu: broadcast, chain, reduce or route.
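The four mode names are given without definitions, so the sketch below is one plausible reading rather than the coordinator's actual behavior. It assumes broadcast sends the same prompt to every agent, chain feeds each agent's output to the next, reduce merges broadcast replies through a synthesis step, and route picks a single agent. Agents are modeled as plain functions from prompt to reply; every name here is hypothetical.

```python
from typing import Callable, Dict, List

Agent = Callable[[str], str]  # stand-in for a call to a local model


def broadcast(agents: Dict[str, Agent], prompt: str) -> Dict[str, str]:
    """Assumed: every agent answers the same prompt independently."""
    return {name: run(prompt) for name, run in agents.items()}


def chain(agents: Dict[str, Agent], order: List[str], prompt: str) -> str:
    """Assumed: each agent's reply becomes the next agent's input."""
    text = prompt
    for name in order:
        text = agents[name](text)
    return text


def reduce_mode(agents: Dict[str, Agent], synth: Agent, prompt: str) -> str:
    """Assumed: broadcast first, then a synthesis agent merges the replies."""
    replies = broadcast(agents, prompt)
    merged = "\n".join(f"[{name}] {reply}" for name, reply in replies.items())
    return synth(merged)


def route(agents: Dict[str, Agent], pick: Callable[[str], str], prompt: str) -> str:
    """Assumed: a picker chooses the single agent that handles the prompt."""
    return agents[pick(prompt)](prompt)
```

Under these assumptions, reduce is the natural home for the synthesis agent, and route is the cheapest mode because only one model is invoked per prompt.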