Cofiswarm — The Local-First Multi-Agent Coding Swarm

Your machine, your model, your code

Everything runs on Apple Silicon (MLX) or NVIDIA (vLLM) — nothing leaves the box.

13 Specialized Roles

Architect, foreman, programmer, security, database, frontend, reviewer, tester, debugger, optimizer, scout, mlx-scout and synthesis — each with a tuned system prompt.

Mix Backends Per Agent

Point any agent at any model. Blend MLX, llama.cpp and vLLM in a single swarm, with per-agent model overrides from the configure panel.
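
As a rough illustration of per-agent overrides, the sketch below resolves a role to a backend, model and local port, falling back to a swarm-wide default. The `AgentEndpoint` type, the role-to-model assignments, the model names and the port numbers are all hypothetical placeholders, not Cofiswarm's actual configuration schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentEndpoint:
    """Hypothetical description of where one agent's model is served."""
    backend: str  # "mlx", "llama.cpp" or "vllm"
    model: str    # placeholder model name
    port: int     # placeholder local port


# Assumed swarm-wide default, used when a role has no override.
DEFAULT = AgentEndpoint(backend="mlx", model="default-model", port=8080)

# Assumed per-agent overrides, as might be set from a configure panel.
OVERRIDES = {
    "architect": AgentEndpoint("vllm", "large-model", 8000),
    "scout": AgentEndpoint("llama.cpp", "small-model", 8081),
}


def resolve(role: str) -> AgentEndpoint:
    """Return the override for a role, or the default if none is set."""
    return OVERRIDES.get(role, DEFAULT)


def base_url(endpoint: AgentEndpoint) -> str:
    """Build a loopback-only URL so requests never leave the machine."""
    return f"http://127.0.0.1:{endpoint.port}"
```

With this shape, `resolve("tester")` falls through to the default MLX endpoint while `resolve("architect")` is served by vLLM, so one swarm can mix backends.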

Multi-Turn Threads

Sessions auto-continue after the first broadcast. The conversation thread panel keeps full turn history, persisted to disk.
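
One simple way to persist turn history is an append-only JSON Lines file per thread. The sketch below assumes that layout; the file format and the `append_turn` and `load_thread` helpers are illustrative and may differ from how Cofiswarm actually stores threads.

```python
import json
from pathlib import Path


def append_turn(path: Path, role: str, content: str) -> None:
    """Append one turn to the thread file as a single JSON line."""
    with path.open("a", encoding="utf-8") as f:
        f.write(json.dumps({"role": role, "content": content}) + "\n")


def load_thread(path: Path) -> list[dict]:
    """Read the full turn history back, oldest turn first."""
    if not path.exists():
        return []
    with path.open(encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]
```

Appending keeps each write small and leaves earlier turns untouched, which suits a history that only grows as the session continues.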

pgvector RAG Context

Per-agent codebase retrieval injects relevant context so every agent reasons over your repository, not a stale snapshot.
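
The retrieval step can be sketched as a nearest-neighbour search over embedded code chunks. The version below is an in-memory stand-in, not the pgvector query itself: in pgvector the equivalent ranking is an `ORDER BY embedding <=> query` clause, where `<=>` is cosine distance. The chunk layout and the `top_k` and `build_prompt` helpers are assumptions for illustration.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query_vec: list[float],
          chunks: list[tuple[str, list[float]]],
          k: int = 2) -> list[str]:
    """Return the text of the k chunks most similar to the query."""
    ranked = sorted(chunks,
                    key=lambda c: cosine_similarity(query_vec, c[1]),
                    reverse=True)
    return [text for text, _ in ranked[:k]]


def build_prompt(question: str, context: list[str]) -> str:
    """Prepend the retrieved chunks to the agent's prompt."""
    joined = "\n---\n".join(context)
    return f"Context:\n{joined}\n\nQuestion: {question}"
```

Running the search at request time, per agent, is what lets each role see the chunks relevant to its own prompt rather than one shared context.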

Live Metrics Dashboard

The monitor popout shows KV-cache pressure, unified-memory gauges and per-port MLX telemetry in real time.

Built for Privacy

No cloud dependency, no API keys, air-gappable. A React UI talking to a C++ coordinator over local ports.

Four ways to orchestrate

Selectable from the UI mode menu — broadcast, chain, reduce or route.

Flat

Broadcast the prompt to every agent in parallel.

Pipeline

Sequential chain; each agent builds on the last.

Cascade

Mixture-of-agents reduced by a synthesizer.

Router

A classifier picks the right agents for each prompt, taking live load into account.
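
The four modes above can be sketched with agents modelled as plain functions from prompt to reply. This is a minimal illustration, not the C++ coordinator's implementation: the `Agent` type, the function names, and the router's policy of taking the classifier's candidates and preferring the least-loaded ones are all assumptions.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

# Hypothetical agent interface: takes a prompt, returns a reply.
Agent = Callable[[str], str]


def flat(agents: dict[str, Agent], prompt: str) -> dict[str, str]:
    """Flat: broadcast the prompt to every agent in parallel."""
    with ThreadPoolExecutor(max_workers=max(1, len(agents))) as pool:
        futures = {name: pool.submit(fn, prompt) for name, fn in agents.items()}
        return {name: fut.result() for name, fut in futures.items()}


def pipeline(agents: list[Agent], prompt: str) -> str:
    """Pipeline: each agent receives the previous agent's output."""
    text = prompt
    for agent in agents:
        text = agent(text)
    return text


def cascade(agents: dict[str, Agent], synthesizer: Agent, prompt: str) -> str:
    """Cascade: gather every agent's draft, then reduce with a synthesizer."""
    drafts = flat(agents, prompt)
    merged = "\n".join(f"[{name}] {reply}" for name, reply in drafts.items())
    return synthesizer(merged)


def router(agents: dict[str, Agent],
           classify: Callable[[str], list[str]],
           load: dict[str, int],
           prompt: str,
           k: int = 1) -> dict[str, str]:
    """Router: classify the prompt, then run the k least-loaded candidates."""
    candidates = classify(prompt)
    chosen = sorted(candidates, key=lambda role: load.get(role, 0))[:k]
    return {role: agents[role](prompt) for role in chosen}
```

Flat and cascade share the same fan-out step; cascade only adds the reduce. Pipeline trades that parallelism for agents that can build on each other's output.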