Multi-agent tiles· MCP server for every CLI· Swarmroom inter-agent chat· Capability routing· Arena head-to-head· Forge marketplace· Local AI · Ollama· Usage forecast· Open source · Apache 2.0 + FSL-1.1· v0.2.1 · 19.04.26· Multi-agent tiles· MCP server for every CLI· Swarmroom inter-agent chat· Capability routing· Arena head-to-head· Forge marketplace· Local AI · Ollama· Usage forecast· Open source · Apache 2.0 + FSL-1.1· v0.2.1 · 19.04.26·
v0.2.1 · Multi-agent · Online

A terminal for you and your AI agents

Your shortcut to every AI CLI. Spawn Claude, Codex, Gemini, OpenCode side‑by‑side. Benchmark them. Ship the winner. Real CLIs. Real tools. Your judgment.

Download for Mac
v0.2.1|macOS 13+|npx anvilterm
New Windows build incoming  —  Join waitlist
~/anvil/arena — claude-opus-4-7 codex gemini opencode ● LIVE · 4 AGENTS
Claude Code+ Codex+ Gemini CLI+ OpenCode+ Copilot CLI+ Ollama+ Shell· Claude Code+ Codex+ Gemini CLI+ OpenCode+ Copilot CLI+ Ollama+ Shell· Claude Code+ Codex+ Gemini CLI+ OpenCode+ Copilot CLI+ Ollama+ Shell·
— Runs every CLI worth running —
Claude
Claude
Codex
Codex
Gemini
Gemini
OpenCode
OpenCode
00/
The Pitch
ABSTRACT · 120 words

Every other terminal ships with one AI. AnvilTerm ships with a team.

Warp bets on one cloud agent. iTerm2 adds a chat sidebar. Ghostty stays pure. None of them let you spawn Claude, Codex, Gemini and OpenCode side-by-side, watch them coordinate in a shared chat, score them against each other, and ship the winner. AnvilTerm does. It treats multi-agent as a first-class primitive — live tiles, MCP control plane, SwarmRoom, Arena, Forge marketplace — not a feature on the roadmap.

Watching one AI work is productivity.
Watching four of them compete for your prompt is the future.

01/
Swarm · Multi-Agent
6 vendors · auto-grid · live

Spawn a team.
Watch it work.

One prompt, N agents. Each is its own live PTY tile running a real CLI — not a wrapper, not an API reskin. Auto-grid re-flows as agents finish. They post to a shared SwarmRoom where the lead delegates and specialists report back.

★ Lead ↔ specialists

SwarmRoom — a channel where your AIs debate.

Lead agent delegates subtasks. Specialists pick them up, report back in the same room. You watch the transcript live in a chat tile. Works with any agent that speaks MCP — which is all of them.

"It's Discord for your agents, except they actually work."
— ak · build log · 2026-04-11
▸ Real CLIs

Not wrappers.

Your agents are literal claude, codex, gemini binaries. You see exactly what the TUI sees.

▸ Capability routing

Right model per task.

swarm_route(["multimodal"]) → Gemini. ["reasoning"] → Claude. ["refactor"] → Codex.

▸ Auto-grid

Tiles that re-flow.

One agent = one tab. Four = 2×2. Eight = 3×3. Each tile stops, closes, exports on its own.

⌖ Lead-led composition

Tell the lead what to ship. It picks the team.

Spawn a single Claude lead. It reads the prompt, calls swarm_vendors, picks specialists, delegates, merges the result. You just watch and approve. No team-building UI required.

⏱ Live token ledger

Per-agent + aggregate.

Every tile streams its token counter scraped from the TUI. The toolbar aggregates daily + weekly spend across Claude + Codex + OpenCode. Forecast tells you when you'll hit the cap.

02/
Arena · Head-to-head
Contestants · Judge · Export

Two agents enter.
One ships.

Pick contestants. Paste a prompt. Hit start. Each tile streams live output, renders deliverables as iframes when the agent calls arena_push_artifact. Judge on quality, speed, correctness. Export the fight as markdown + JSONL for reproducibility — or as a 9:16 battle video for your timeline.

Claude Opus 4.7
94
Shipped in 38s · 12.1k tokens · ArtifactV1 rendered · 0 revisions
vs
GPT-5.4 High
91
Shipped in 42s · 14.8k tokens · ArtifactV1 rendered · 1 revision

Every battle is stored at ~/.anvil/arena/<id>.jsonl. Bring it back six months later, re-run on new model versions, watch the winner flip. Reproducible benchmarking — for the laptop era.

03/
Screens · The product
macOS · dark · v0.2.1

This is what four agents
working at once looks like.

Arena tiles stream live output side-by-side. The full app view shows Forge marketplace, SwarmRoom chat, and the menu-bar usage tray all alive at once.

04/
MCP · Control Plane
20+ tools · stdio

Every AI you install
gets superpowers.

AnvilTerm ships with a Model Context Protocol server. One command registers it across Claude Code, Codex, Gemini, OpenCode. Every agent gains — for free — a browser, a PTY, inter-agent chat, artifact rendering, usage tracking, screenshot.

# wire AnvilTerm into every installed agent at once
npx anvilterm-doctor --install

# now from any agent:
terminal_create() · terminal_write() · terminal_screen()
tui_type() · tui_interrupt() · tui_choose()
swarm_spawn() · swarm_route() · swarm_vendors()
swarm_room_post() · swarm_room_listen() · swarm_room_thread()
arena_push_artifact() · arena_current()
◉ Terminal-in-terminal

Agents spawn their own PTYs.

A tab created via MCP appears as a live tile in the AnvilTerm window. You watch — and intervene if needed. Standalone fallback via node-pty if the UI isn't running.

◉ Works everywhere

Claude · Cursor · Windsurf · Codex · Gemini · OpenCode.

Every MCP-speaking client. Stdio transport. Register once, the toolset travels with you.

05/
Forge · Marketplace
MCP + Skills · one-tap install

A marketplace for MCP servers and Skills.

Curated catalog, star-ranked, kind-filtered. Pick a server, hit install, choose Claude / Codex / Gemini / OpenCode — or all of them. Forge writes directly to each agent's config with a _forge:true tag so it can manage updates and removal cleanly.

⚒ One-tap install

Chrome · Slack · Playwright · Linear · Stripe — all at once.

Skills get git-cloned to ~/.anvil/skills/ and symlinked into each agent's skills dir. MCP servers get registered in each vendor's config. Installed view shows per-agent status chips so you always know what's actually wired.

"Homebrew, if brew also wired the package into every shell you've got open."
— forge. ship. repeat.
06/
Local AI · Offline
Ollama · Metal-accelerated

Offline assistant.
Your data stays local.

Cmd+K opens an Ollama-backed assistant with native function calling. Gemma 4, Qwen 2.5/3, Llama 3.1, Mistral Nemo. Prompts never leave the machine. Ideal for regulated work, sensitive repos, a plane, a café with no wifi.

▸ Zero telemetry

On-device.

Code, prompts, context — none of it leaves your laptop.

▸ Tool calls

Run buttons, not screenshots.

Models that speak tools return structured calls that render as one-click run buttons in the chat.

▸ Terminal context

Sees what you see.

The assistant reads the visible terminal screen so suggestions are grounded in what's actually running.

07/
Benchmark · Swarm vs Subagents
10 runs · n=40 tasks · 2026-04

Multi-agent swarm vs Claude subagents.
We ran the numbers.

Claude's built-in subagents are great when one model is enough. A real swarm — Claude + Codex + Gemini + OpenCode running in parallel as live CLIs — wins on parallelism, model diversity and visibility. Here's the honest breakdown on 40 mixed refactor / research / UI tasks.

A  ·  AnvilTerm Swarm (4 agents)

4 CLIs. Parallel. Visible.

Wall-clock time
1.0×
Model diversity
4 vendors
Live visibility
Per-tile
Context isolation
Full · per agent
Mid-run intervention
Any tile
Failure blast radius
1 of N
B  ·  Claude built-in subagents

Sequential. Opaque. One family.

Wall-clock time
3.1×
Model diversity
1 family
Live visibility
Spinner
Context isolation
Compacted
Mid-run intervention
None
Failure blast radius
Blocks parent
Dimension AnvilTerm Swarm Claude Subagents
ParallelismN real PTYs in parallelSequential within parent turn
Context window per workerFull · 1M tokens each · no compactionShared parent, compacted on dispatch
Model diversityClaude · Codex · Gemini · OpenCode · Copilot · OllamaClaude family only
Live outputDedicated live tile per agentOpaque spinner until result
Human-in-the-loopType into any tile, paste refs, interruptNone once dispatched
Artifact renderingIframes · SVG · markdown · liveText summary returned to parent
Failure isolationOne agent fails, N-1 keep workingSubagent failure blocks parent turn
Token cost routingPer-vendor metered, route cheap tasks to OllamaAll charged against parent's Claude quota
DeterminismFresh PTY state per agentInherits parent compaction
Tool accessEach agent carries its own MCP toolkitParent's MCP toolkit only
ReproducibilitySession JSONL + Arena replayNo persistent transcript
Best forCross-vendor compare, parallel research, long refactorsTightly-coupled chains in one model family

Note: wall-clock × is normalized to swarm=1.0 across 40 mixed tasks. Subagents win when the task is inherently sequential (each step depends on the prior result) — swarm wins when the work fan-outs. Use both.

08/
The Ledger · vs Everyone
23 capabilities · 8 terminals

The honest spec sheet.
We checked.

Every other terminal is excellent at its thing. The matrix below is the proof — not marketing. Hover any row to see only AnvilTerm light up.

Capability AnvilTerm Warp iTerm2 Ghostty WezTerm Kitty Alacritty Tabby
Multi-agent swarm · live tiles
MCP server · agents drive the terminalpartial
SwarmRoom · inter-agent chat
Arena · head-to-head benchmark
Forge · MCP + Skills marketplacecatalog
Capability routing (task → best model)
Cross-vendor usage tracking
Usage forecast · threshold alerts
Inline images · SVG renderpartialpartial
Inline video · PDF · audio waveform
YouTube embed · hover previews
Markdown table → spreadsheet
Local AI · offline assistantplugin
Voice input · push-to-talk
Session recording · ANSI + plainpartial
Interactive screenshot → MCP
Automation API (DevTools · Playwright)AppleScriptLuaRCplugins
Works offline · no cloud lock-in
Open sourceApache + FSLclosedGPLMITMITGPLApacheMIT
macOS · Linux · Windows● ● ●✓ ✓ ✓mac✓ ✓ β✓ ✓ ✓✓ ✓ —✓ ✓ ✓✓ ✓ ✓
Native GPU rendererxterm.jsMetalOpenGL
shipped · AnvilTerm shipped partial via plugin or limited not available
09/
Made for three sides
dev · agents · labs

Built for the AI era.
All three sides of it.

A / The developer

Stop flipping tabs.

Every AI CLI in one workspace. Benchmark them on real work. Pick the best tool for each job. Never break your flow.

B / The agents

Real tools.

A browser, a PTY, an artifact renderer, image / PDF / YouTube understanding, a room to talk to their siblings. MCP, done for you.

C / The labs

Reproducible benchmarks.

Arena runs head-to-head on real CLIs, real tasks. Your model's wins become shareable 9:16 battle videos. Free distribution.

Forge the future.

Download · Spawn · Ship
↓ AnvilTerm 0.2.1 · macOS
Read the source ↗
↓   Download