Try it
Once aivo is installed, you can use it straight away.
aivo "tell me a short story"
aivo codex
Works with Claude Code, Codex, Gemini CLI, OpenCode, Amp, and Pi.
Install aivo with the install script or a package manager.
# install script (macOS, Linux)
curl -fsSL https://getaivo.dev/install.sh | bash
# install script (Windows PowerShell)
irm https://getaivo.dev/install.ps1 | iex
# Homebrew
brew install yuanchuan/tap/aivo
# npm
npm install -g @yuanchuan/aivo
Start without an API key, add your own when you're ready.
# or codex, gemini, opencode, pi, amp
aivo claude
aivo-starter is a built-in provider with a few models for students and learners who want to try a coding agent without setting up a provider account. It is fine for demos and daily coding sessions.
aivo/starter: DeepSeek V4 Flash · official API. A general-purpose default for everyday tasks.
minimax/minimax-m2.7 (expires Jun 1, 2026): MiniMax M2.7 · official API. Larger model with more headroom for complex tasks.
poolside/laguna-m.1: Laguna M.1 · Poolside. Tuned for code generation and editing.
The starter pool has a finite daily budget and is intended for interactive coding-agent sessions. For regular or automated use, please add your own API key.
Add API keys from any provider you trust. Keys are encrypted and stored locally, so you can manage them and switch anytime.
# OpenRouter, Vercel AI Gateway,
# OpenCode Go, DeepSeek, CloudFlare,
# Google AI Studio, Kilo Gateway,
# Amazon Bedrock, Cursor,
# more...
aivo keys add
The picker covers most providers. A few work differently:
If you have a GitHub Copilot subscription, you can add it as a provider in aivo. (Be aware that GitHub Copilot charges by request count, not by token usage.)
# choose GitHub Copilot
aivo keys add
Normally you don't need aivo for this. But if you want to manage multiple subscriptions in one place, aivo can help you switch between them without logging in and out repeatedly.
Note that you can't use a Codex, Gemini, or Claude subscription as an API service because each subscription's terms restrict use to its own agent, and aivo respects that.
# choose ChatGPT, Claude, or Gemini
aivo keys add
Cursor is the exception. Its subscription includes the Cursor SDK, so it works across agents like a BYOK key.
# choose Cursor from the picker
aivo keys add
# use the composer model in Codex or any other coding agent
aivo pi -m composer-2.5
Connect to Ollama for models on your machine or its cloud service. If a model isn't present, aivo asks to pull it before use.
# choose Ollama
aivo keys add
# chat with an ollama model
aivo chat -m ministral-3:3b
Other local servers — LM Studio, llama.cpp, etc. — work the same way: add their OpenAI-compatible endpoint as a provider.
Move keys between machines through a single password-encrypted file. OAuth logins and GitHub Copilot are skipped by default — they're machine-bound and won't transfer cleanly.
# prompts for a password
aivo keys export ~/keys.aivo
# same password on the other machine, or pass a URL
aivo keys import ~/keys.aivo
aivo keys import https://example.com/keys.aivo
# non-interactive (CI / scripts)
aivo keys export ~/keys.aivo --password-stdin <<< "my secret"
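In CI, the here-string form above can be wrapped so that the password comes from an environment variable instead of the script text. This is a minimal sketch, not part of aivo: the function name export_keys and the variable AIVO_EXPORT_PASSWORD are hypothetical, and only the `aivo keys export <file> --password-stdin` form shown above is used.

```shell
# Sketch: export keys non-interactively, reading the password from an
# environment variable (AIVO_EXPORT_PASSWORD is a hypothetical name).
export_keys() {
  if [ -z "${AIVO_EXPORT_PASSWORD:-}" ]; then
    echo "AIVO_EXPORT_PASSWORD is not set" >&2
    return 1
  fi
  # A trailing newline matches what the here-string (<<<) form sends.
  printf '%s\n' "$AIVO_EXPORT_PASSWORD" | aivo keys export "$1" --password-stdin
}

# Usage (the secret source is up to your CI system):
#   AIVO_EXPORT_PASSWORD="..." export_keys ~/keys.aivo
```

Keeping the password out of the command line also keeps it out of shell history and process listings.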
# list all saved keys
aivo keys
# quickly add a key with options
aivo keys add --base-url https://openrouter.ai/api --key sk-xxx
# activate a key to use
aivo keys use
aivo keys use mykey
# print saved data of a key
aivo keys cat
# edit a saved key
aivo keys edit
# health-check
aivo keys ping
# for more options
aivo keys --help
Launch a coding agent with any provider or model you want. All extra arguments are passed through to the underlying tool.
aivo claude
Use a saved API key with -k, or omit the value to open the key picker.
Pass -k alone with -m to pick both at once.
aivo claude -k openrouter
aivo claude -k copilot
aivo claude -k
Pin a model with -m, or omit the value to open the model picker.
Once a model is applied, aivo claude remembers it for the next run.
aivo claude -m moonshotai/kimi-k2.5
aivo claude -m
Per-slot models
Claude Code runs multiple models in a single session: haiku/sonnet/opus slots plus reasoning and subagent overrides. Pin each slot to a different model, or pass a slot flag without a value to open the picker for that slot.
aivo claude --sonnet-model deepseek-v4-pro --haiku-model deepseek-v4-flash
aivo claude --reasoning-model gpt-5.4 --subagent-model claude-haiku-4-5
# open the picker for one slot
aivo claude --opus-model
1M / 2M context window
Append --1m or --2m so Claude Code uses the long-context window for the resolved model.
aivo claude -m deepseek-v4-pro --1m
aivo claude -m grok-4.20-reasoning --2m
Debug HTTP traffic
--debug writes every upstream request and response to a JSONL file. Pass a path to override the default location.
aivo claude --debug
aivo claude --debug=/tmp/aivo-http.jsonl
aivo codex
Use a saved API key with -k, or omit the value to open the picker.
aivo codex -k vercel
aivo codex -k
Pin a model with -m, or omit the value to open the model picker.
aivo codex -m xiaomi/mimo-v2-pro
aivo codex -m
Debug HTTP traffic
Write every upstream request and response to a JSONL file.
aivo codex --debug
aivo codex --debug=/tmp/aivo-http.jsonl
Codex desktop app (macOS only)
Launch the native Codex app with the same backend. The app's in-app model picker only shows names starting with gpt- or claude-; other models still work, but won't appear there.
aivo codex-app
aivo codex-app -k
aivo gemini
Use a saved API key with -k, or omit the value to open the picker.
aivo gemini -k google
aivo gemini -k
Pin a model with -m, or omit the value to open the model picker.
aivo gemini -m gemini-2.5-pro
aivo gemini -m
OAuth login
A Gemini subscription works inside aivo gemini only; it can't act as a generic API for other agents. See OAuth login for details.
Debug HTTP traffic
Write every upstream request and response to a JSONL file.
aivo gemini --debug
aivo gemini --debug=/tmp/aivo-http.jsonl
aivo opencode
Use a saved API key with -k, or omit the value to open the picker.
aivo opencode -k openrouter
aivo opencode -k
Pin a model with -m, or omit the value to open the model picker.
aivo opencode -m z-ai/glm-4.7
aivo opencode -m
Debug HTTP traffic
Write every upstream request and response to a JSONL file.
aivo opencode --debug
aivo opencode --debug=/tmp/aivo-http.jsonl
aivo amp
Use a saved API key with -k, or omit the value to open the picker.
aivo amp -k openrouter
aivo amp -k
Pin a model with -m, or omit the value to open the model picker.
aivo amp -m gpt-5.4
aivo amp -m
Per-mode models
Amp routes between rush, smart, deep, and large modes. Pin each mode to a different model, or pass a mode flag without a value to open the picker for that mode.
aivo amp --smart-model claude-sonnet-4.6 --rush-model claude-haiku-4-5
aivo amp --deep-model gpt-5.4-thinking --large-model deepseek-v4-pro
# pin the starting mode, or omit the value to open the mode picker
aivo amp --mode deep
aivo amp --mode
Debug HTTP traffic
Write every upstream request and response to a JSONL file.
aivo amp --debug
aivo amp --debug=/tmp/aivo-http.jsonl
aivo pi
Use a saved API key with -k, or omit the value to open the picker.
aivo pi -k cursor
aivo pi -k
Pin a model with -m, or omit the value to open the model picker.
aivo pi -m composer-2.5
aivo pi -m
Debug HTTP traffic
Write every upstream request and response to a JSONL file.
aivo pi --debug
aivo pi --debug=/tmp/aivo-http.jsonl
Transform mode
Because Pi supports multiple upstream protocols, by default aivo just hands it the URL and the right API type so it talks to the provider directly. Use --transform to put aivo's local router in the middle and normalize the stream.
aivo pi --transform -k openrouter
Without a tool name, aivo run remembers your last key and tool selection,
so next time it skips the prompts and goes straight to launching.
aivo run
Pin a coding agent and its usual flags under one name, then launch the whole thing with
aivo <name> (or the long form, aivo run <name>).
Inline flags at run time override the saved ones — handy for
locking in the per-slot model setups you'd otherwise retype every session.
# Claude Code with separate models per slot
aivo alias work claude -k openrouter \
--opus-model deepseek/deepseek-v4-pro \
--sonnet-model deepseek/deepseek-v4-flash \
--haiku-model deepseek/deepseek-v4-flash \
--1m
# Amp with separate models per mode, pinned to deep mode
aivo alias task amp -k openrouter \
--smart-model openai/gpt-5.5 \
--deep-model anthropic/claude-opus-4.7 \
--rush-model openai/gpt-5.4 \
--large-model deepseek/deepseek-v4-pro
# run them
aivo work
aivo task
Once you add a provider and set up the key, you can use the models command to see which models the provider offers.
Most providers expose their model list through an API, so aivo fetches it on demand and caches it for later use.
# active key's models, or pick one
aivo models
aivo models -k openrouter
# filter by name
aivo models -s free
aivo models -k openrouter -s claude
# force refresh the cached list
aivo models --refresh
--json prints the raw upstream model list for scripts. The shape
depends on the provider — pipe to jq to explore and filter.
aivo models --json
aivo models -k openrouter --json | jq
If a model name is too long, you can give it a short alias.
Model aliases are accepted anywhere -m/--model works.
aivo alias fast=claude-haiku-4-5
aivo alias mimo xiaomi/mimo-v2-pro
# use it
aivo claude -m fast
aivo chat -k vercel -m mimo
# list and remove
aivo alias
aivo alias rm fast
Run any open-weight GGUF directly from a Hugging Face repo. The first time you
reference a model, aivo downloads the file to
~/.config/aivo/cache/huggingface and serves it through a bundled
llama-server.
Subsequent runs reuse the cached file.
The hf: prefix and full https://huggingface.co/…
URLs are accepted anywhere a model name works — in aivo chat,
a coding agent's -m, the bare aivo shortcut, etc.
aivo hf:Qwen/Qwen2.5-0.5B-Instruct-GGUF
aivo https://huggingface.co/allenai/Olmo-3-1025-7B
GGUF repos usually publish several quantizations. Append :<quant>
(e.g. Q5_K_M, Q4_K_M) to pick a specific file; if you
don't, aivo prompts you to choose the first time it downloads.
aivo chat hf:bartowski/Llama-3.2-3B-Instruct-GGUF:Q5_K_M
aivo claude -m hf:bartowski/Llama-3.2-3B-Instruct-GGUF:Q4_K_M
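To make the reference syntax concrete, the sketch below splits an hf:<repo>[:<quant>] string into its parts with plain shell parameter expansion. It only illustrates how the string decomposes and is not aivo's parser; split_hf_ref is a hypothetical helper name, and full https://huggingface.co/… URLs are deliberately not handled.

```shell
# Sketch: split "hf:<owner>/<name>[:<quant>]" into repo and quant.
split_hf_ref() {
  ref="${1#hf:}"                      # drop the hf: prefix
  case "$ref" in
    *:*) repo="${ref%%:*}"; quant="${ref#*:}" ;;
    *)   repo="$ref";       quant="" ;;
  esac
  printf 'repo=%s quant=%s\n' "$repo" "${quant:-<prompted on first download>}"
}

split_hf_ref "hf:bartowski/Llama-3.2-3B-Instruct-GGUF:Q5_K_M"
split_hf_ref "hf:Qwen/Qwen2.5-0.5B-Instruct-GGUF"
```

The first call reports quant=Q5_K_M; the second has no quant suffix, which is the case where aivo prompts on first download.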
Pass hf: wherever -m/--model is accepted.
Combine with the per-slot flags for Claude Code and Amp to mix local and remote
models in the same session.
aivo claude -m hf:Qwen/Qwen2.5-0.5B-Instruct-GGUF
aivo pi -m hf:Qwen/Qwen2.5-0.5B-Instruct-GGUF
# mix: remote opus, local sonnet
aivo claude \
--opus-model claude-opus-4.7 \
--sonnet-model hf:Qwen/Qwen2.5-0.5B-Instruct-GGUF
A bare hf: opens a picker over the local cache — useful once
you've pulled a few models and don't want to retype the repo path.
aivo chat hf:
aivo claude hf:
aivo hf lists the cached repos; --verbose expands each
repo to show every downloaded quant. The rest are housekeeping commands.
# list cached repos
aivo hf
aivo hf --verbose
# pre-pull a model (handy before going offline)
aivo hf pull hf:Qwen/Qwen2.5-0.5B-Instruct-GGUF
# delete one quant or a whole repo
aivo hf rm <repo> --quant Q5_K_M
aivo hf rm <repo> --all -y
# wipe the whole cache
aivo hf clean -y
Talk to any model directly from your terminal. Full-screen TUI for conversations, or one-shot mode for quick answers and shell pipelines.
Interactive chat in your terminal with streaming and markdown rendering. The selected model is remembered per saved key.
aivo chat
aivo chat -m gpt-5.4
# open model picker
aivo chat -m
# open key and model pickers if you forget the names
aivo chat -k
Send a single prompt and exit. A bare quoted argument is the shortcut —
aivo "..." rewrites to aivo chat -p "...".
Use -p/--prompt when you also want flags.
When -p has a message, piped stdin is appended as context.
When -p has no message, the entire stdin becomes the prompt.
aivo "pro tips for git"
aivo -p "pro tips for git" -m gpt-5.4
# type interactively, Ctrl-D to send
aivo -p
Pipe anything into aivo. It reads stdin, adds your prompt, and sends it to the model.
git diff | aivo -p "Write a commit message"
cat error.log | aivo -p "Find the root cause"
cat error.log | aivo -p
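The two stdin rules can be modelled in a few lines of shell. This is a sketch of the documented behavior, not aivo's code: compose_prompt is a hypothetical name, the blank line that joins message and context is an assumption (the exact separator is not specified here), and interactive entry with Ctrl-D is not modelled.

```shell
# Sketch of the -p/stdin rule: message plus piped context, or stdin alone.
compose_prompt() {
  message="$1"
  if [ -t 0 ]; then
    piped=""                 # stdin is a terminal: nothing was piped
  else
    piped="$(cat)"
  fi
  if [ -n "$message" ] && [ -n "$piped" ]; then
    printf '%s\n\n%s\n' "$message" "$piped"   # message first, stdin as context
  elif [ -n "$message" ]; then
    printf '%s\n' "$message"
  else
    printf '%s\n' "$piped"                    # no message: stdin is the prompt
  fi
}

printf 'fatal: disk full\n' | compose_prompt "Find the root cause"
printf 'fatal: disk full\n' | compose_prompt ""
```

The first call prints the message followed by the piped text; the second prints only the piped text, mirroring the difference between `aivo -p "..."` and a bare `aivo -p`.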
Expose your active provider as a local OpenAI-compatible endpoint. Any tool that speaks the OpenAI API can use it — VS Code extensions, Python scripts, anything.
Model aliases are resolved at the server — clients can post
{"model": "fast", ...} and aivo rewrites it to the real upstream model
before forwarding. The alias list also shows up in /v1/models.
aivo serve
aivo serve --port 8080
aivo serve --host 0.0.0.0
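A client can exercise the alias rewriting with the same request shape as any OpenAI-compatible call. The sketch below assumes an alias named fast has been defined and that the server listens on localhost:24860, the address used in the curl example in this section; chat_body and AIVO_URL are hypothetical names introduced for illustration.

```shell
# Sketch: build a chat-completions body that names a model alias.
# Values are inserted verbatim, so keep them free of quotes and backslashes.
chat_body() {
  printf '{"model": "%s", "messages": [{"role": "user", "content": "%s"}]}\n' "$1" "$2"
}

# Print the body that would be posted.
chat_body fast hello

# Send it only when AIVO_URL is set, e.g. AIVO_URL=http://localhost:24860
if [ -n "${AIVO_URL:-}" ]; then
  curl -sS "$AIVO_URL/v1/chat/completions" \
    -H "Content-Type: application/json" \
    -d "$(chat_body fast hello)"
fi
```

Because the alias is resolved at the server, the client never needs to know the real upstream model name.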
If a request hits a rate limit (429) or server error (5xx), aivo retries with the next saved key automatically.
aivo serve --failover
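The failover condition reduces to a status-code check. The sketch below models that rule only, using a hypothetical helper named should_failover; how aivo orders or selects the next saved key is not shown here.

```shell
# Sketch: decide whether a response status would trigger failover
# (429 rate limit, or any 5xx server error).
should_failover() {
  case "$1" in
    429|5[0-9][0-9]) return 0 ;;
    *)               return 1 ;;
  esac
}

for status in 200 401 429 500 503; do
  if should_failover "$status"; then
    echo "$status: retry with the next saved key"
  else
    echo "$status: no failover"
  fi
done
```

Client errors such as 401 fall outside the rule, so a bad key on its own would not trigger a retry under this reading.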
Log every request and response. Pipe to jq for readable output, or write to a file.
aivo serve --log | jq .
aivo serve --log /tmp/requests.jsonl
aivo serve --auth-token
aivo serve --auth-token my-secret
aivo serve --cors
aivo serve --timeout 60
curl http://localhost:24860/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-4o", "messages": [{"role": "user", "content": "hello"}]}'
One feed across aivo's own events (chat, run, serve), native CLI sessions (claude, codex, gemini, pi, opencode), and amp threads, read from each tool's on-disk session files and aivo's local SQLite. Scoped to the current project by default.
# recent activity in the current project
aivo logs
# every project on this machine
aivo logs --all
Narrow by source, model, key, time, or text. --errors shows only failures.
aivo logs --by chat -n 5
aivo logs --by claude --errors
aivo logs --by native
aivo logs --model glm-4.7
aivo logs -s "rate limit"
aivo logs --since 24h
aivo logs --json
Publish a session via a tunneled viewer URL — useful for showing a teammate a bug repro, an agent transcript, or an interesting model run.
The server is just a bridge — it forwards viewer requests to your machine over the tunnel and keeps no copy of the session.
# pick a session in the current project
aivo logs share
# share a specific row by id prefix
aivo logs share 1335c631
# pick from every project on this machine
aivo logs share --all
# follow updates and open in the browser
aivo logs share --live --open
# skip redaction (be sure)
aivo logs share --no-redact
Aggregates token counts from aivo chat, Claude Code, Codex, Gemini, OpenCode, Amp, and Pi by reading each tool's native data files.
aivo stats
aivo stats claude
aivo stats chat
# raw numbers for scripts
aivo stats -n
# filter by provider
aivo stats -s openrouter
# last N units (m, h, d, w)
aivo stats --since 7d
aivo stats claude --since 24h
# show all models
aivo stats -a
# bypass cache
aivo stats -r
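For scripts that need to reason about the same time windows, the --since values can be converted to seconds. This is a sketch under the assumption that m, h, d, and w mean minutes, hours, days, and weeks; since_to_seconds is a hypothetical helper and not part of aivo.

```shell
# Sketch: convert a --since value such as 7d or 24h into seconds.
# Assumes m = minutes, h = hours, d = days, w = weeks.
since_to_seconds() {
  n="${1%[mhdw]}"      # numeric part
  unit="${1#"$n"}"     # trailing unit letter
  case "$n" in
    ''|*[!0-9]*) echo "invalid duration: $1" >&2; return 1 ;;
  esac
  case "$unit" in
    m) echo $((n * 60)) ;;
    h) echo $((n * 3600)) ;;
    d) echo $((n * 86400)) ;;
    w) echo $((n * 604800)) ;;
    *) echo "invalid duration: $1" >&2; return 1 ;;
  esac
}

since_to_seconds 7d    # 604800
since_to_seconds 24h   # 86400
```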
$ aivo stats
────────────────────────────────────────────────────
408M tokens · 14B cached · 5.0K sessions · 77 models
By tool sessions tokens
claude 4.2K 295M ████████████████████
codex 256 87M █████▉
opencode 166 10M ▊
chat 91 8.0M ▌
gemini 204 4.2M ▎
pi 85 3.8M ▎
By model tokens
gpt-5.4 75M ████████████████████
minimax-m2.5 63M ████████████████▊
claude-sonnet-4.6 53M ██████████████▏
claude-opus-4.6 40M ██████████▋
minimax-m2.7 38M ██████████▏
claude-opus-4.7 31M ████████▏
laguna-m.1 22M █████▉
claude-haiku-4-5-20251001 14M ███▉
kimi-k2.5:cloud 11M ██▉
claude-sonnet-4.5 6.6M █▊
gpt-5.3-codex 6.1M █▋
glm-4.7-free 6.0M █▋
glm-4.7 5.0M █▍
mercury-2 4.3M █▏
deepseek-v4-flash 3.7M █
gpt-5.5 3.6M █
kimi-k2.5 2.0M ▌
starter 1.8M ▌
gpt-5.1-codex 1.5M ▍
kat-coder-pro-v1 1.4M ▍
others (57 models) 9.8M ██▋
aivo update detects whether you installed via Homebrew, npm, or the install script and updates accordingly.
Failed updates are rolled back automatically; a manual rollback is available if a new version misbehaves.
aivo update
# force update even if installed via a package manager
aivo update --force
# restore the previous version from the last backup
aivo update --rollback
Keys are encrypted at rest with AES-256-GCM in ~/.config/aivo/, with the encryption key derived from your machine. Requests go straight from your machine to the provider.
No. With your own provider keys, requests go directly from your machine to the provider. The starter pool routes through aivo's starter provider, but nothing is logged or stored. Aivo collects no telemetry.
Coding agents speak Anthropic, OpenAI, or Gemini protocols. Providers also speak one of the three. Aivo detects both ends and converts request and response shape on the fly — tools, streaming, and reasoning blocks included. There is no per-pair shim, so any new agent that speaks one of the three protocols works with every provider that speaks any of them.