A multi-AI terminal agent that writes code, runs commands, searches the web, talks to you, and manages your projects — from your terminal, your phone, or running 24/7 in the background.
Quick Start | Development Tool | Personal Assistant | Channels | Autonomous Agent | Security | CLI Reference | API
Code Buddy is an open-source multi-provider AI coding agent that runs in your terminal. It supports Grok, Claude, ChatGPT, Gemini, LM Studio, and Ollama via OpenAI-compatible APIs and provider-specific SDKs.
It works as two things at once:
- A development tool — reads files, writes code, runs commands, creates PRs, plans complex tasks, and fixes its own mistakes across 5-50 tool calls per task.
- A personal assistant — talks to you by voice, remembers your preferences, monitors your screen, sends notifications to your phone via Telegram/Discord/Slack, and runs scheduled tasks 24/7 in the background.
Key highlights:
- 6 AI providers with automatic failover
- 40 bundled skills (PR workflow, DevOps, creative tools, smart home, media)
- 11 messaging channels (Terminal, Telegram, Discord, Slack, WhatsApp, Signal, Teams, Matrix, Google Chat, WebChat, HTTP API)
- Daemon mode for 24/7 background operation
- Multi-agent orchestration with self-healing
- Voice conversation with wake word detection
- OS sandbox with workspace-write mode (read-only / workspace-write / danger-full-access tiers)
- Docker sandbox for untrusted code execution
- Knowledge base injection (Knowledge.md files loaded into agent system prompt)
- Wide Research mode (parallel sub-agents decompose and research topics concurrently)
- Todo.md attention bias (task list appended to end of every LLM context turn — Manus AI pattern)
- Lessons.md self-improvement loop (PATTERN/RULE/CONTEXT/INSIGHT lessons injected before every turn — persists corrections across sessions)
- Workflow orchestration rules in system prompt (concrete plan triggers, auto-correction protocol, verification contract)
- Restorable context compression (identifiers preserved, full content recoverable on demand)
- Pre-compaction memory flush (facts saved to MEMORY.md before context is compacted — OpenClaw pattern)
- Decision memory extraction (architectural choices + rationale persisted during context flush)
- Coding style learning (auto-analyzes naming, imports, indentation, error handling patterns)
- Importance-weighted context compression (error messages, decisions, code scored higher than conversation)
- Auto-repair middleware (detects tool failures, localizes faults, suggests fixes automatically)
- Quality gate middleware (auto-delegates to CodeGuardian + Security agents after implementation)
- Issue-to-PR pipeline (
buddy dev issue #42→ branch → implement → test → PR) - Anthropic prompt cache breakpoints (stable/dynamic split → 10× token cost savings)
- Per-channel streaming policies (Telegram, Discord, Slack, WhatsApp each get their own chunking/format rules)
- SSRF guard on all outbound fetches (IPv4 + IPv6 bypass vector blocking)
- Tool prefix naming convention (
shell_exec,file_read,browser_search, … — Codex-style canonical aliases)
- Node.js 18.0.0 or higher
- ripgrep (recommended for faster search)
- Docker (required for CodeAct / Open Manus mode)
# macOS
brew install ripgrep
# Ubuntu/Debian
sudo apt-get install ripgrep
# Windows
choco install ripgrep# npm (recommended)
npm install -g @phuetz/code-buddy
# Or try without installing
npx @phuetz/code-buddy@latest# Configure API key (Grok/xAI)
export GROK_API_KEY=your_api_key
# Start interactive mode
buddy
# Or with a specific task
buddy --prompt "analyze the codebase structure"
# Use with local LLM (LM Studio)
buddy --base-url http://localhost:1234/v1 --api-key lm-studio
# Full autonomy mode
YOLO_MODE=true buddy# Single prompt, JSON output to stdout (logs go to stderr)
buddy -p "create a hello world Express app" --output-format json > result.json
# Pipe into other tools
buddy -p "explain this code" --output-format json 2>/dev/null | jq '.content'
# Use in CI with full autonomy
buddy -p "run tests and fix failures" \
--dangerously-skip-permissions \
--output-format json \
--max-tool-rounds 30
# Auto-approve all tool executions (no confirmation prompts)
buddy -p "fix lint errors" --auto-approve --output-format textHeadless mode exits cleanly after completion — safe for timeout, shell scripts, and CI pipelines.
# Continue the most recent session
buddy --continue
# Resume a specific session by ID (supports partial matching)
buddy --resume abc123
# Set a cost limit for the session
buddy --max-price 5.00# 1. First-time setup
buddy --setup # Quick API key setup wizard
buddy onboard # Full interactive config wizard
buddy doctor # Verify environment & dependencies
# 2. Start coding
buddy # Launch interactive chat
buddy --vim # Launch with Vim keybindings
# 3. Describe what you want in natural language
> "Create a Node.js project with Express and Prisma"
> "Add Google OAuth authentication"
> "Write tests for the auth module"
> "Fix the typecheck errors"
> "Commit everything"
# 4. Advanced modes
buddy --model gemini-2.5-flash # Switch AI model
buddy --system-prompt architect # Use architect system prompt
buddy --agent my-custom-agent # Use custom agent from ~/.codebuddy/agents/
buddy speak # Voice conversation mode
buddy daemon start # Run 24/7 in background
buddy server --port 3000 # Expose REST/WebSocket APICode Buddy autonomously reads files, writes code, runs commands, and fixes errors — typically 5-15 tool calls per task (up to 50, or 400 in YOLO mode).
Code Buddy operates as an autonomous coding agent. It reads your codebase, makes changes, runs commands, and iterates until the task is done.
Built-in tools:
| Category | Tools |
|---|---|
| File Operations | view_file, create_file, str_replace_editor, edit_file, multi_edit |
| Search | search, codebase_map |
| System | bash, docker, kubernetes |
| CodeAct | run_script (Python/JS/TS in Docker), plan (Persistent Planner) |
| Web | web_search, web_fetch, browser |
| Patching | apply_patch (unified diff with fuzz factor, Codex-inspired) |
| Planning | create_todo_list, get_todo_list, update_todo_list |
| Media | screenshot, audio, video, ocr, clipboard |
| Documents | pdf, document, archive |
| Knowledge | knowledge_search, knowledge_add — search/add knowledge base entries |
| Human Input | ask_human — pause execution for mid-task user clarification (120s timeout) |
| Self-Extension | create_skill — write new SKILL.md files at runtime (self-authoring) |
| Self-Improvement | lessons_add, lessons_search, lessons_list — persist and recall learned patterns across sessions |
| Verification | task_verify — run tsc/tests/lint before marking tasks complete (Verification Contract) |
RAG-based tool selection filters tools per query to reduce prompt tokens — only relevant tools are included in each API call.
Web Search (5-Provider Fallback Chain):
| Priority | Provider | API Key Required | Features |
|---|---|---|---|
| 1 | Brave MCP | BRAVE_API_KEY + MCP enabled |
Full MCP integration, richest results |
| 2 | Brave API | BRAVE_API_KEY |
Country, language, freshness filters |
| 3 | Perplexity | PERPLEXITY_API_KEY or OPENROUTER_API_KEY |
AI-synthesized answers with citations |
| 4 | Serper | SERPER_API_KEY |
Google Search results |
| 5 | DuckDuckGo | None | Free fallback (no API key needed) |
Search parameters: country (ISO 3166), search_lang, ui_lang, freshness (pd/pw/pm/py or date range), provider (force specific).
Context management uses smart multi-stage compaction (remove stale tool results, summarize older messages, aggressive truncation) to keep conversations within token limits across long sessions.
Hybrid search combines keyword + semantic search with configurable weights for memory retrieval.
Code Buddy implements the Open Manus / CodeAct architecture in a structured, phased approach, allowing it to write and execute code (Python, TypeScript, Node.js) in a secure Docker sandbox instead of relying solely on pre-defined tools.
Phase 1: Sandboxed Execution (Hybrid Agent)
- RunScriptTool: Writes and runs scripts in ephemeral Docker containers (
ubuntu:latest,node:22-slim,python:3.11-slim). - Browser Automation: Uses
Playwrightin Docker to scrape websites, interact with SPAs, and take screenshots programmatically. - Safety First: Timeout (120s), Memory Limit (1GB), and ephemeral containers prevent runaway processes.
Phase 2: Persistent State & Planning
- Persistent Workspace: Files created in
.codebuddy/workspacepersist between script executions, allowing multi-step workflows (e.g., scrape → save CSV → analyze CSV → plot chart). - PlanTool: The agent maintains a
PLAN.mdfile in your project root to track complex, multi-step objectives statefully. - Structured Loop: The system prompt enforces a strict PLAN → THINK → CODE → OBSERVE → UPDATE cognitive cycle to prevent chaotic behavior.
Phase 3: Wide Research (Parallel Agents)
- WideResearchOrchestrator: Decomposes a topic into N independent subtopics via LLM, spawns N parallel CodeBuddyAgent workers (default: 5, max: 20), then aggregates results into a single comprehensive report.
- Progress streaming: Emits real-time events as each worker completes.
- CLI:
buddy research "quantum computing breakthroughs" --workers 8 --output report.md
Phase 4: Context Engineering (Manus AI + OpenClaw patterns)
- Todo.md Attention Bias — The agent maintains a
todo.mdtask list that is automatically appended at the end of the LLM context on every turn. Because transformers attend more strongly to recent tokens, this keeps objectives in focus across long sessions without modifying the system prompt. Usebuddy todo add/done/listor thetodo_updatetool. - Restorable Compression — When the context window is compressed, file paths and URLs are extracted as identifiers and the original content is stored. The agent can call
restore_context("src/agent/types.ts")to retrieve the full content on demand, making compression lossless for structured identifiers. - Pre-compaction Memory Flush (NO_REPLY) — Before compaction triggers, a silent background LLM turn extracts durable facts and saves them to
MEMORY.md. If the model returns theNO_REPLYsentinel with no meaningful content, the output is suppressed entirely (no notification spam). - Inline Citations — Web search results now include
[1][2]citation markers inline and a Sources block listing all referenced URLs. - Lessons.md Self-Improvement Loop — After any user correction, the agent calls
lessons_addto persist the lesson (category: PATTERN, RULE, CONTEXT, or INSIGHT) to.codebuddy/lessons.md. On every turn, active lessons are injected as a<lessons_context>block BEFORE the todo suffix so learned patterns are always visible. Usebuddy lessons add/search/listor thelessons_add/lessons_searchtools. Thetask_verifytool runs the Verification Contract (tsc + tests + lint) before any task completion.
Phase 5: Persistent & Contextual Memory (Agent Zero + OpenClaw patterns)
- Decision Memory — During pre-compaction flush, the LLM extracts architectural decisions (choice, alternatives, rationale, confidence) into structured
<decision>blocks. These are persisted via EnhancedMemory and injected as<decisions_context>in future turns, so the agent remembers why past choices were made — even months later. - Coding Style Learning —
CodingStyleAnalyzerdetects project conventions (quote style, semicolons, indentation, naming, imports, error handling, testing patterns) via regex heuristics and majority voting across files. The resulting profile is injected as a<coding_style>block in the system prompt. - Importance-Weighted Compression —
ImportanceScorerassigns scores (0–1) to each message based on content type (errors=0.95, decisions=0.90, code=0.70, conversation=0.25), recency, and role. The sliding window inContextManagerV2preserves high-importance messages (>0.8) even outside the recent window. - Auto-Repair Middleware (priority 150) — Scans recent tool results for error patterns (SyntaxError, FAIL, exit code). When detected, invokes
FaultLocalizerto pinpoint the file/line, then injects a repair suggestion into the conversation. Max 3 attempts per failure, resets on success. - Quality Gate Middleware (priority 200) — After implementation completes, auto-delegates to CodeGuardian (architecture review) and SecurityReview (vulnerability scan) agents. Reports findings as warnings; required gates can block the loop.
- Issue-to-PR Pipeline —
buddy dev issue <url-or-number>automates: fetch GitHub issue → create branch → map labels to workflow type → run plan+implement+test → create PR withCloses #N. Labels likebug→fix-tests,security→security-audit,feature→add-feature.
Example Prompts:
> "Go to Google News, scrape the top headlines about AI, save them to a CSV, and then use Python to analyze the sentiment."
> "Write a script to check broken links on my documentation site."
> "Calculate the Fibonacci sequence up to 1000 and plot the growth rate."Code Buddy is an evolution of the OpenClaw architecture, modernized for the TypeScript ecosystem and enhanced with Open Manus (CodeAct) autonomy.
| Feature | OpenClaw | Code Buddy | Open Manus |
|---|---|---|---|
| Language | Python | TypeScript / Node.js | Python |
| Philosophy | Tool-Based | Hybrid (Tool + CodeAct) | Pure CodeAct |
| Messaging | Multi-channel | 11+ Channels (Telegram focus) | Web Interface |
| Task State | Heartbeat | Persistent PLAN.md + Workspace | Transient Session |
| Concurrency | Lane Queue | Advanced Lane Queue + DAG | Sequential |
| Extensibility | SKILL.md | Skills Hub + Plugins + MCP | Custom Scripts |
Why Code Buddy? It combines the industrial-grade reliability of OpenClaw (concurrency control, security policies, multi-channel messaging) with the infinite flexibility of Open Manus (dynamic script generation and execution).
Manus AI influence: Wide Research (parallel sub-agent research workers), Knowledge Base injection, todo.md attention bias (task list at end of context each turn), and restorable context compression (identifier-based content recovery) are all inspired by Manus AI's context engineering research. The pre-compaction NO_REPLY flush pattern is from OpenClaw's compaction documentation.
Code Buddy validates everything before it touches your files:
| Feature | Description |
|---|---|
| Generated Code Validator | Pre-write scan for eval, XSS, SQL injection, hardcoded secrets, prototype pollution |
| Pre-Write Syntax Validator | Balanced delimiters, template literals, indentation (JS/TS/Python/YAML/HTML/CSS/JSON) |
| Atomic Rollback (apply-patch) | All-or-nothing patch application with full file state backup |
| Atomic Transactions (multi-edit) | Multi-file edits rolled back on first failure |
| AST Bash Validation | tree-sitter-based command parsing with centralized dangerous pattern checks |
| Bash Checkpoints | Pre-snapshot of files targeted by destructive commands (rm, mv, truncate) |
| Diff Preview | Shows actual diffs before approval, magnitude-based re-confirmation for large changes |
| Semantic Truncation | Error-preserving output truncation (keeps error lines and stack traces) |
| Security Audit Logging | JSONL audit trail for all code generation security decisions |
For complex multi-step requests, Code Buddy decomposes work into a DAG (directed acyclic graph) and executes steps in parallel where possible.
- TaskPlanner —
needsPlanning()heuristic detects complex requests,createPlan()produces a TaskGraph - Topological sort — determines execution order with dependency tracking
- Parallel execution — independent steps run concurrently via dependency waves
- Architect mode —
--system-prompt architectenables plan-first coding with per-step checkpoints
| Feature | Description |
|---|---|
| CI Watcher | GitHub Actions / GitLab CI / Jenkins alerts with "Fix it" auto-agent |
| Webhook Triggers | HMAC-SHA256 verified HTTP triggers — connect CI, monitoring, or any service |
| Headless Mode | buddy -p "run tests and fix failures" --dangerously-skip-permissions for CI pipelines |
Code Buddy handles the full Git lifecycle through natural language:
> "Create a PR for the auth changes"
> "Review the open PRs"
> "Fix the merge conflicts on feature-branch"
> "Commit everything with a good message"
Telegram enhanced commands for remote Git operations:
| Command | Description |
|---|---|
/repo |
Repository info, recent commits, open PRs |
/branch [name] |
Branch diff stats vs main |
/pr [number] |
List or view PRs with merge/review buttons |
Full hands-free voice interaction with wake word detection:
buddy speak "Hello, I am Code Buddy" # Synthesize and play speech
buddy speak --voice af_bella "Hello world" # Use a specific voice
buddy speak --list-voices # List available voices
buddy speak --speed 1.5 "Fast speech" # Adjust speed (0.25-4.0)
buddy speak --format mp3 "Hello" # Output format (wav, mp3)
buddy speak --url http://host:8000 "Hello" # Custom AudioReader URL7 TTS providers: Edge TTS, espeak, macOS say, Piper, OpenAI, ElevenLabs, AudioReader (Kokoro-82M local)
In-chat voice commands:
| Command | Description |
|---|---|
/speak <text> |
Speak text with current TTS provider |
/tts on|off |
Enable/disable TTS |
/tts auto |
Auto-speak all agent responses |
/tts provider audioreader |
Switch to AudioReader (Kokoro-82M, local, free) |
/tts voice ff_siwis |
Set voice (e.g., ff_siwis FR, af_bella EN) |
Wake word detection via Porcupine (Picovoice) with text-match fallback. Set PICOVOICE_ACCESS_KEY for hardware-accelerated detection, or use the built-in text matcher for free.
Infinite voice conversation: Enable continuousListening + autoSpeak with AudioReader for a hands-free loop: listen → STT → agent → TTS → listen.
| Subsystem | Storage | Purpose |
|---|---|---|
| Persistent Memory | Markdown files | Project/user notes |
| Enhanced Memory | SQLite + embeddings | Semantic search |
| Prospective Memory | SQLite | Tasks, goals, reminders |
| Decision Memory | EnhancedMemory | Architectural choices with rationale, auto-extracted during context flush |
| Coding Style Memory | EnhancedMemory | Project conventions (naming, imports, indentation) auto-analyzed from source |
| ICM (optional) | ICM MCP server | Persistent cross-session memory via episodic + semantic dual architecture |
Auto-capture detects and stores important information from conversations:
"Remember that..." → Stored as instruction
"I prefer..." → Stored as preference
"This project uses..." → Stored as project fact
"My email is..." → Stored as contact info
"We decided to..." → Stored as decision
Memory lifecycle hooks inject relevant memories before execution, capture important info after responses, and summarize conversations at session end. Deduplication via Jaccard similarity (0.95 threshold) prevents duplicates.
Domain knowledge injected into the agent system prompt at startup (src/knowledge/knowledge-manager.ts):
- Sources:
Knowledge.md(project root),.codebuddy/knowledge/*.md(project-level),~/.codebuddy/knowledge/*.md(global) - YAML frontmatter:
title,tags,scope(restrict to specific agent modes),priority(injection order) - Agent tools:
knowledge_search(keyword search across all entries),knowledge_add(persist new knowledge to disk) - Injection: Loaded entries are wrapped in a
<knowledge>block and included in the system prompt automatically.
buddy knowledge list # List all loaded knowledge entries
buddy knowledge show <title> # Show a specific entry
buddy knowledge search "TypeScript conventions"
buddy knowledge add # Interactive: add a new knowledge entry
buddy knowledge remove <title> # Remove an entry
buddy knowledge context # Show the full <knowledge> block the agent seesCode Buddy includes 40 built-in SKILL.md files that provide domain-specific knowledge, best practices, and MCP server integration. Skills are loaded contextually when relevant to your project.
| Category | Skill | Description |
|---|---|---|
| PR Workflow | review-pr |
Code review checklist, inline comments, approval criteria |
prepare-pr |
Branch naming, commit cleanup, PR description template | |
merge-pr |
Merge strategies, conflict resolution, post-merge cleanup | |
| Dev Tools | github |
Issues, releases, Actions workflows, gh CLI |
gitlab |
GitLab API, glab CLI, CI/CD pipelines, merge requests | |
session-logs |
Export/search conversation history and session metadata | |
model-usage |
Token tracking, cost analysis, provider comparison | |
tmux-sessions |
Terminal multiplexing, pane layouts, session management | |
healthcheck |
Service monitoring, endpoint checks, alerting | |
| Project | project-best-practices |
Project scaffolding, structure, linting, testing conventions |
csharp-avalonia |
Cross-platform desktop/mobile with C# and Avalonia UI | |
coding-agent |
Autonomous multi-step coding with planning and validation | |
skill-creator |
Author new SKILL.md files with YAML frontmatter | |
| Creative & 3D | blender |
Python bpy scripting, CLI rendering, Geometry Nodes |
unreal-engine |
Remote Control API, Python editor scripting, Movie Render Queue | |
davinci-resolve |
DaVinciResolveScript Python API, color grading, render queue | |
ableton-live |
OSC protocol, MIDI Remote Scripts, Max for Live | |
| Design | figma |
REST API, Plugin API, design tokens extraction |
gimp |
Python-Fu / Script-Fu scripting, batch image processing | |
inkscape |
Extensions API, CLI export, SVG manipulation | |
| DevOps & Infra | kubernetes |
kubectl, Helm, ArgoCD GitOps |
terraform-ansible |
Terraform IaC + Ansible configuration management | |
grafana-prometheus |
Grafana HTTP API, PromQL, alerting pipelines | |
jenkins-ci |
Jenkins API, Groovy pipelines, shared libraries | |
| Workflow & Data | n8n |
REST API, webhook triggers, workflow automation |
databases |
PostgreSQL, MongoDB, Redis CLI and automation | |
game-engines |
Unity C# + Godot GDScript, builds, scene management | |
| Utilities | summarize |
Text/file/URL summarization with configurable length |
weather |
Weather lookups via wttr.in and OpenWeatherMap | |
| Media | image-gen |
Image generation via DALL-E, Stable Diffusion, Midjourney |
whisper-transcribe |
Audio/video transcription with OpenAI Whisper | |
pdf-tools |
PDF creation, merging, text extraction, conversion | |
screenshot |
Screen capture, annotation, OCR text extraction | |
video-tools |
FFmpeg video editing, conversion, thumbnails, GIFs | |
gif-search |
GIF search via Giphy and Tenor APIs | |
| Communication | email-tools |
Email send/read via himalaya CLI and SMTP |
notion |
Notion API for pages, databases, search, content blocks | |
blog-watcher |
RSS/Atom feed monitoring, web page change detection | |
| Smart Home | spotify |
Spotify playback control via spotify_player and Web API |
smart-home |
Philips Hue and Home Assistant control |
Each skill includes Direct Control (CLI/API/scripting commands), MCP Server Integration (config for .codebuddy/mcp.json), and Common Workflows (step-by-step recipes). Skills are stored in .codebuddy/skills/bundled/ and can be extended with managed or workspace skills via the Skills Registry and Hub.
Self-authoring skills: The agent can extend its own skill set at runtime using the create_skill tool, writing new SKILL.md files to .codebuddy/skills/workspace/. The SkillRegistry hot-reloads them within ~250ms, so newly created skills are immediately available without restarting.
The agent can reach out to you — not just respond:
- Push notifications with priority levels (info, warning, critical)
- Rate limiting prevents notification spam
- Quiet hours — suppress non-critical notifications during configured periods
- Multi-channel delivery — notifications route to Telegram, Discord, Slack, or any connected channel
Monitor your screen and environment for events:
- Periodic screenshots with perceptual diff detection
- Event triggers —
file_change,screen_change,time,webhook - Trigger registry — add/remove triggers dynamically
buddy trigger list # List all event triggers
buddy trigger add <spec> # Add a trigger (format: type:condition action:target)
buddy trigger remove <id> # Remove a triggerCode Buddy supports 11 messaging channels:
| Channel | Features |
|---|---|
| Terminal | Native CLI interface (Ink/React) |
| HTTP API | REST + WebSocket |
| WebChat | Built-in HTTP + WebSocket with browser UI |
| Discord | Bot integration, slash commands |
| Telegram | Bot API, pro features, scoped auth, CI watcher |
| Slack | Bolt framework, events |
| Baileys (QR pairing, media, reconnect) | |
| Signal | signal-cli REST API (polling, groups) |
| Google Chat | Workspace API (JWT auth, webhook events) |
| Microsoft Teams | Bot Framework (OAuth2, adaptive cards) |
| Matrix | matrix-js-sdk (E2EE, threads, media) |
Telegram is the most feature-rich channel, giving you full agent capabilities from your phone.
Setup:
- Create a bot with @BotFather on Telegram (
/newbot) - Configure the token:
export TELEGRAM_BOT_TOKEN=123456:ABC-DEF...Or in .codebuddy/settings.json:
{
"channels": {
"telegram": {
"type": "telegram",
"token": "123456:ABC-DEF...",
"adminUsers": ["your_telegram_user_id"],
"defaultParseMode": "Markdown"
}
}
}- Start Code Buddy with Telegram:
buddy --channel telegram # Interactive with Telegram
buddy daemon start # 24/7 background modeDeployment modes:
| Mode | Config | Best for |
|---|---|---|
| Polling (default) | No extra config | Development, behind NAT |
| Webhook | "webhookUrl": "https://your-domain.com/telegram" |
Production, lower latency |
Supported message types: text, images, audio, video, documents, stickers, locations, contacts, inline buttons, reply threads, typing indicators.
What you can do via Telegram:
| Category | Capabilities |
|---|---|
| Remote Coding | Code modifications, bug fixes, refactoring, file analysis, create commits & PRs |
| Bash Execution | Run build, test, deploy commands — with confirmation for destructive ops |
| Rich Media | Send images → Gemini Vision analysis, send files (code, logs) → processed by agent |
| Voice Messages | Send voice notes → STT transcription → agent response |
| Daemon Mode | 24/7 background operation (buddy daemon start), cron jobs, proactive alerts |
| Notifications | Build failures, test results, heartbeat alerts pushed to your Telegram |
| Interactive | Inline buttons for confirmations, Markdown-formatted responses |
Pro features:
| Feature | Description |
|---|---|
| Scoped Authorization | Tiered permissions: read-only → write-patch → run-tests → deploy |
| Diff-First Mode | Preview all code changes before applying — Apply / Full Diff / Cancel buttons |
| Run Tracker | Step-by-step timeline of agent runs with cost, duration, artifacts |
| CI Watcher | GitHub Actions / GitLab CI / Jenkins alerts with "Fix it" auto-agent |
| Secret Handles | Map friendly names to env vars — secrets never enter LLM context |
| Context Pins | Pin important decisions or facts for the agent to remember |
Enhanced commands:
| Command | Description |
|---|---|
/repo |
Repository info, recent commits, open PRs |
/branch [name] |
Branch diff stats vs main |
/pr [number] |
List or view PRs with merge/review buttons |
/task <desc> |
Create an agent task with objective |
/runs |
List recent agent runs with timeline |
/run <id> |
View run details with Re-run/Tests/Rollback buttons |
/yolo [minutes] |
Timed full access (1-60 min, auto-revokes) |
/pins |
View pinned context |
Example workflows:
Fix CI failure:
CI alert arrives → cause analysis → "Fix it" button
→ agent creates fix → diff preview → Apply/Cancel
→ changes applied → tests re-run
Add feature + tests + PR:
/task "add user search with tests"
→ plan-first preview → approve plan
→ diff-first preview → apply changes
→ agent creates PR → link in chat
Prevents unauthorized users from consuming API credits:
- Unknown user messages the bot → receives a 6-character pairing code (expires in 15 min)
- Bot owner approves via CLI:
buddy pairing approve --channel telegram ABC123 - User is added to the persistent allowlist (
~/.codebuddy/credentials/telegram-allowFrom.json)
Security features: rate limiting (5 failed attempts → 1h block), per-channel allowlists, admin bypass.
Pairing CLI commands:
buddy pairing status # Show pairing system status
buddy pairing list # List all approved users
buddy pairing pending # List pending pairing requests
buddy pairing approve <code> # Approve a pairing request by code
buddy pairing add <id> # Manually add a user to the allowlist
buddy pairing revoke <id> # Revoke access for a user// Discord
const discord = new DiscordChannel({
token: process.env.DISCORD_TOKEN,
allowedGuilds: ['guild-id'],
});
await discord.connect();
// WhatsApp (Baileys, QR pairing)
const whatsapp = new WhatsAppChannel({ dataPath: '~/.codebuddy/whatsapp' });
await whatsapp.connect(); // Scan QR code
// Signal (signal-cli REST API)
const signal = new SignalChannel({ apiUrl: 'http://localhost:8080', phoneNumber: '+1234567890' });
await signal.connect();
// Matrix (E2EE, threads)
const matrix = new MatrixChannel({ homeserverUrl: 'https://matrix.org', accessToken: '...' });
await matrix.connect();Run Code Buddy 24/7 in the background:
buddy daemon start [--detach] # Start background daemon
buddy daemon stop # Stop daemon
buddy daemon restart # Restart daemon
buddy daemon status # Show daemon status and services
buddy daemon logs [--lines N] # View daemon logsFeatures:
- PID file management with stale detection
- Auto-restart on crash (max 3 retries)
- Service registry and health monitoring (CPU, memory)
- Heartbeat engine — periodic agent wake with HEARTBEAT.md checklist, smart suppression, active hours
buddy heartbeat start # Start the heartbeat engine
buddy heartbeat stop # Stop the heartbeat engine
buddy heartbeat status # Show heartbeat status
buddy heartbeat tick # Manually trigger a single tickThe SupervisorAgent coordinates multiple agent instances:
- Strategies — sequential, parallel, race, all
- Shared context — thread-safe key-value store with optimistic locking
- Self-healing — error pattern recognition (6 built-in patterns), auto-recovery with exponential backoff
- Checkpoint rollback — auto-checkpoint before risky ops, rollback to last good state
Full autonomy with built-in guardrails for safe unattended operation:
# Enable via CLI
/yolo on # Enable (50 auto-edits, 100 auto-commands)
/yolo safe # Restricted mode (20 edits, 30 commands, limited paths)
/yolo off # Disable
/yolo status # Show limits, counters, allow/deny lists
# Or via environment
YOLO_MODE=true buddy # Still requires /yolo on confirmation in chatWhat changes in YOLO mode:
| Setting | Normal | YOLO |
|---|---|---|
| Tool rounds | 50 | 400 |
| Cost limit | $10 | $100 (cap $1,000) |
| File edits | Confirm each | Auto-approve (up to limit) |
| Bash commands | Confirm each | Auto-execute safe commands |
Autonomy levels (fine-grained control):
/autonomy suggest # Confirm everything
/autonomy confirm # Confirm important ops (default)
/autonomy auto # Auto-approve safe ops, confirm dangerous
/autonomy full # Auto-approve all except critical
/autonomy yolo # Full auto with guardrailsCustomize allow/deny lists:
/yolo allow "npm run dev" # Add to auto-execute list
/yolo deny "docker rm -f" # Block a command patternBuilt-in guardrails (always active, even in YOLO):
- Blocked paths:
.env,.git,node_modules,*.pem,*.key,credentials - Blocked commands:
rm -rf /,sudo,git push --force origin main,DROP DATABASE - Per-session limits on edits and commands
- Hard cost cap ($1,000 max even with
MAX_COSToverride)
The Cron-Agent Bridge connects the scheduler to CodeBuddyAgent instances for recurring tasks:
buddy trigger add time:*/30 action:run-tests # Run tests every 30 min
buddy trigger add webhook:deploy action:notify # Notify on deploy webhookWebhook triggers use HMAC-SHA256 verification with template placeholders for flexible integration.
Code Buddy supports multiple AI providers with automatic failover:
| Provider | Models | Context | Configuration |
|---|---|---|---|
| Grok (xAI) | grok-4, grok-code-fast-1 | 128K | GROK_API_KEY |
| Claude (Anthropic) | claude-sonnet-4, opus | 200K | ANTHROPIC_API_KEY |
| ChatGPT (OpenAI) | gpt-4o, gpt-4-turbo | 128K | OPENAI_API_KEY |
| Gemini (Google) | gemini-2.0-flash (+ vision) | 2M | GOOGLE_API_KEY |
| LM Studio | Any local model | Varies | --base-url http://localhost:1234/v1 |
| Ollama | llama3, codellama, etc. | Varies | --base-url http://localhost:11434/v1 |
Model failover chain — cascading provider fallback with health tracking and cooldown periods.
# Use LM Studio (local)
buddy --base-url http://localhost:1234/v1 --api-key lm-studio
# Use Ollama (local)
buddy --base-url http://localhost:11434/v1 --model llama3
# Use a specific model
buddy --model grok-code-fast-1Profile configuration in ~/.codebuddy/user-settings.json:
{
"connection": {
"activeProfileId": "grok",
"profiles": [
{
"id": "grok",
"name": "Grok API (xAI)",
"provider": "grok",
"baseURL": "https://api.x.ai/v1",
"model": "grok-4-latest"
},
{
"id": "lmstudio",
"name": "LM Studio Local",
"provider": "lmstudio",
"baseURL": "http://localhost:1234/v1",
"apiKey": "lm-studio"
}
]
}
}Auth profile manager — API key rotation (round-robin/priority/random strategies), session stickiness, exponential backoff on failures.
buddy auth-profile list # List authentication profiles
buddy auth-profile add <id> <provider> # Add a profile
buddy auth-profile remove <id> # Remove a profile
buddy auth-profile reset # Reset all cooldownsFine-grained control over what tools the agent can use:
// Tool-level allow/deny
const policy = new ToolPolicy({
allowlist: ['read_file', 'search', 'web_fetch'],
denylist: ['bash', 'write_file'],
requireConfirmation: ['delete_file'],
});
// Bash command patterns
const bashPolicy = new BashAllowlist({
patterns: [/^npm (install|test|run)/, /^git (status|diff|log)/],
blocked: [/rm -rf/, /sudo/, /curl.*\|.*sh/],
});| Mode | Description |
|---|---|
suggest |
Confirm all operations |
auto-edit |
Auto-approve safe ops |
full-auto |
Full autonomy (YOLO) |
/mode suggest # Maximum safety
/mode full-auto # Full autonomy- Trust folders — directory-level tool permissions via
.codebuddy-trust.json - Agent profiles — predefined configs:
secure(read-only),minimal,power-user - Per-model tool config — capabilities, context window, and patch format per model family
Three sandbox tiers for native OS-level isolation (Codex-inspired):
| Mode | Write Access | Use Case |
|---|---|---|
read-only |
None | Untrusted analysis tasks |
workspace-write |
Git workspace root only | Normal development (default) |
danger-full-access |
Unrestricted | Deployment/release scripts |
.git, .codebuddy, .ssh, .gnupg, .aws are always read-only regardless of mode.
const sandbox = await createSandboxForMode('workspace-write', '/my/project');
await sandbox.exec('npm', ['test']);Codex-inspired command authorization with token-array prefix matching (safer than regex — bypasses quoting/encoding tricks):
buddy execpolicy check "git push --force" # evaluate a shell string
buddy execpolicy check-argv git push --force # token-array (prefix rules first)
buddy execpolicy add-prefix git push --action deny # block git push with longest-match
buddy execpolicy dashboard # full policy overviewComprehensive Server-Side Request Forgery protection on all outbound HTTP calls:
- Blocks RFC-1918 private ranges + loopback + link-local
- Blocks IPv4 bypass vectors: octal (
0177.0.0.1), hex (0x7f000001), short form (127.1) - Blocks IPv6 transition addresses: NAT64 (
64:ff9b::/96), 6to4, Teredo, IPv4-mapped (::ffff:127.0.0.1) - Async DNS resolution check before every fetch
Containerized command execution for untrusted operations:
const sandbox = new DockerSandbox({
image: 'codebuddy/sandbox:latest',
memoryLimit: '512m',
networkMode: 'none',
timeout: 30000,
});Auto-sandbox router automatically routes dangerous commands (npm, pip, cargo, make) to Docker when available.
| Rail | Description |
|---|---|
| Diff-First Mode | All code changes are previewed before applying. Users see file summaries, line counts, and can view the full unified diff. |
| Plan-First Mode | Multi-step tasks show the execution plan for approval before any changes are made. |
| Scoped Permissions | Users get only the access they need: read-only → write-patch → run-tests → deploy. |
| Audit Trail | Every tool execution, confirmation, and security decision is logged. |
| Secret Handles | API tokens and credentials are referenced by handle name only — actual values are resolved from env vars at runtime, never exposed to the LLM context. |
| 2-Step Confirmation | Risky operations (rollback, deploy) require double confirmation with a 2-minute timeout window. |
| Timed YOLO | /yolo grants temporary full access that auto-revokes after the specified duration. |
| DM Pairing | Unknown users must be approved before they can interact with the bot. |
CodeBuddyAgent
│
├── AgentContextFacade # Context window and memory management
│ - Token counting, compression, memory retrieval
│
├── SessionFacade # Session persistence and checkpoints
│ - Save/load, checkpoint creation, rewind
│
├── ModelRoutingFacade # Model routing and cost tracking
│ - Provider selection, cost calculation
│
├── InfrastructureFacade # MCP, sandbox, hooks, plugins
│ - Hook execution, plugin loading
│
└── MessageHistoryManager # Chat and LLM message history
CodeBuddyAgent
│
├── TaskPlanner # DAG decomposition of complex requests
│ - needsPlanning() heuristic
│ - createPlan() → TaskGraph → parallel execution
│
├── SupervisorAgent # Multi-agent orchestration
│ - Sequential, parallel, race, all strategies
│ - SharedContext with optimistic locking
│
├── MiddlewarePipeline # Composable before/after turn hooks
│ - ReasoningMiddleware (priority 42)
│ - WorkflowGuardMiddleware (priority 45)
│ - AutoRepairMiddleware (priority 150)
│ - QualityGateMiddleware (priority 200)
│
├── SelfHealing # Automatic error recovery
│ - Pattern recognition (6 built-in patterns)
│ - Retry with exponential backoff
│
├── ScreenObserver # Environment monitoring
│ - Periodic screenshots with perceptual diff
│ - Event triggers (file_change, screen_change, time, webhook)
│
├── ProactiveAgent # Agent-initiated communication
│ - Push notifications with priority levels
│ - Rate limiting and quiet hours
│
└── DaemonManager # Background process lifecycle
- PID file management, auto-restart
- Service registry, health monitoring
User Input → ChatInterface (Ink/React) → CodeBuddyAgent → AI Provider
│
Tool Calls (max 50/400 rounds)
│
Tool Execution + Confirmation
│
Results back to API (loop)
buddy server --port 3000| Endpoint | Method | Description |
|---|---|---|
/api/health |
GET | Health check |
/api/metrics |
GET | Prometheus metrics |
/api/chat |
POST | Chat completion |
/api/chat/completions |
POST | OpenAI-compatible |
/api/tools |
GET | List tools |
/api/tools/{name}/execute |
POST | Execute tool |
/api/sessions |
GET/POST | Session management |
/api/memory |
GET/POST | Memory entries |
/api/daemon/status |
GET | Daemon status |
/api/daemon/health |
GET | Health metrics (CPU, memory) |
/api/cron/jobs |
GET | List cron jobs |
/api/cron/jobs/{id}/trigger |
POST | Trigger a cron job |
/api/notifications/preferences |
GET/POST | Notification settings |
/api/heartbeat/status |
GET | Heartbeat engine status |
/api/heartbeat/start|stop|tick |
POST | Heartbeat control |
/api/hub/search?q=... |
GET | Search skills marketplace |
/api/hub/installed |
GET | List installed hub skills |
/api/hub/install |
POST | Install a skill |
/api/hub/{name} |
DELETE | Uninstall a skill |
/api/identity |
GET | List loaded identity files |
/api/identity/prompt |
GET | Combined identity prompt |
/api/identity/{name} |
PUT | Update an identity file |
/api/groups/status|list |
GET | Group security status/config |
/api/groups/block |
POST | Block a user globally |
/api/groups/block/{userId} |
DELETE | Unblock a user |
/api/auth-profiles |
GET/POST/DELETE | Auth profile CRUD |
/api/auth-profiles/reset |
POST | Reset all cooldowns |
const ws = new WebSocket('ws://localhost:3000/ws');
// Authenticate
ws.send(JSON.stringify({
type: 'authenticate',
payload: { token: 'jwt-token' }
}));
// Stream chat
ws.send(JSON.stringify({
type: 'chat_stream',
payload: { messages: [{ role: 'user', content: 'Hello' }] }
}));Four MCP servers are pre-configured (disabled by default):
buddy mcp add brave-search # Brave Web Search (needs BRAVE_API_KEY)
buddy mcp add playwright # Browser automation (no key needed)
buddy mcp add exa-search # Exa neural search (needs EXA_API_KEY)
buddy mcp add icm # Infinite Context Memory (needs `cargo install icm`)
buddy mcp list # Show all configured serversPlugins extend Code Buddy with custom tools, commands, and providers:
~/.codebuddy/plugins/
my-plugin/
manifest.json
index.js
Plugin types: Tool, Provider (LLM/embedding/search), Command, Hook
const plugin: Plugin = {
async activate(context: PluginContext) {
context.registerTool({
name: 'my_tool',
description: 'Custom tool',
execute: async (args) => {
return { success: true, output: 'Done!' };
}
});
context.registerProvider({
id: 'my-llm',
type: 'llm',
async chat(messages) { return 'response'; }
});
}
};Manifest-based extension system with lifecycle hooks and config schema. Extensions live in .codebuddy/extensions/.
IDE-compatible completions backend — serves /v1/completions with bearer auth, per-IP rate limiting, and token clamping.
| Tool | Install | Purpose |
|---|---|---|
| RTK | cargo install --git https://github.com/rtk-ai/rtk |
CLI proxy that wraps commands to reduce LLM token usage 60-90% |
| ICM | cargo install --git https://github.com/rtk-ai/icm |
MCP server for persistent cross-session memory |
RTK is automatically integrated via a before-hook — supported bash commands are prefixed with rtk transparently. Configure in .codebuddy/config.toml under [integrations].
| Flag | Short | Description | Default |
|---|---|---|---|
--version |
-V |
Show version number | - |
--directory <dir> |
-d |
Set working directory | . |
--api-key <key> |
-k |
API key (or GROK_API_KEY env) |
- |
--base-url <url> |
-u |
API base URL (or GROK_BASE_URL env) |
- |
--model <model> |
-m |
AI model to use (or GROK_MODEL env) |
auto-detect |
--prompt <prompt> |
-p |
Single prompt, headless mode | - |
--browser |
-b |
Launch browser UI instead of terminal | false |
--max-tool-rounds <n> |
Max tool execution rounds | 400 |
|
--security-mode <mode> |
-s |
suggest, auto-edit, or full-auto |
suggest |
--output-format <fmt> |
-o |
Headless output: json, stream-json, text, markdown |
json |
--context <patterns> |
-c |
Glob patterns to load into context | - |
| Flag | Description | Default |
|---|---|---|
--continue |
Resume the most recent saved session | - |
--resume <id> |
Resume a specific session (supports partial ID matching) | - |
--max-price <dollars> |
Maximum cost in dollars before stopping | 10.0 |
--no-cache |
Disable response caching | - |
| Flag | Description | Default |
|---|---|---|
--auto-approve |
Automatically approve all tool executions | false |
--dangerously-skip-permissions |
Bypass all permission checks (trusted containers only) | false |
--no-self-heal |
Disable self-healing auto-correction | - |
--allow-outside |
Allow file operations outside workspace directory | false |
| Flag | Description | Example |
|---|---|---|
--force-tools |
Force-enable function calling for local models | - |
--probe-tools |
Auto-detect tool support at startup | - |
--enabled-tools <patterns> |
Only enable matching tools (glob, comma-separated) | bash,*file*,search |
--disabled-tools <patterns> |
Disable matching tools (glob, comma-separated) | bash,web_* |
--allowed-tools <patterns> |
Alias for --enabled-tools (Claude Code compat) |
- |
| Flag | Description | Default |
|---|---|---|
--system-prompt <id> |
System prompt: default, minimal, secure, code-reviewer, architect (or custom from ~/.codebuddy/prompts/) |
default |
--list-prompts |
List available system prompts and exit | - |
--agent <name> |
Use a custom agent from ~/.codebuddy/agents/ |
- |
--list-agents |
List available custom agents and exit | - |
| Flag | Description |
|---|---|
--plain |
Minimal formatting (plain text output) |
--no-color |
Disable colored output |
--no-emoji |
Disable emoji in output |
--vim |
Enable Vim keybindings for input |
--mcp-debug |
Enable MCP protocol debugging output |
| Flag | Description |
|---|---|
--init |
Initialize .codebuddy/ directory with templates |
--dry-run |
Preview changes without applying (simulation mode) |
--setup |
Run interactive API key setup wizard |
--list-models |
List available models from the API and exit |
| Command | Description |
|---|---|
/help |
Show help |
/model [name] |
Change model |
/mode [mode] |
Change security mode |
/profile [id] |
Switch connection profile |
/think |
Enable reasoning (4K tokens) |
/megathink |
Deep reasoning (10K tokens) |
/ultrathink |
Exhaustive reasoning (32K tokens) |
/cost |
Show cost dashboard |
/memory |
Memory management |
/hooks list |
List lifecycle hooks |
/plugin list |
List plugins |
/speak <text> |
Speak text with current TTS provider |
/tts on|off|auto |
TTS control |
/yolo on|off|safe|status |
YOLO mode control |
/autonomy suggest|confirm|auto|full|yolo |
Autonomy level |
# Dev Workflows
buddy dev plan "<objective>" # Profile repo + produce task plan
buddy dev run "<objective>" # Plan + implement + test + artifacts
buddy dev pr "<objective>" # Dev run + generate PR summary
buddy dev fix-ci [--log <file>] # Read CI logs + propose patch
buddy dev issue <url-or-number> # GitHub issue → branch → code → tests → PR
buddy dev explain # Summarise repo conventions
# Daemon
buddy daemon start|stop|restart|status|logs
# Triggers
buddy trigger list|add|remove
# Webhooks
buddy webhook list|add|remove
# Skills Hub
buddy hub search|install|uninstall|update|list|info|publish|sync
# Heartbeat
buddy heartbeat start|stop|status|tick
# Identity
buddy identity show|get|set|prompt
# Groups
buddy groups status|list|block|unblock
# Auth Profiles
buddy auth-profile list|add|remove|reset
# Devices
buddy device list|pair|remove|snap|screenshot|record|run
# Config
buddy config show|validate|get
# Security
buddy security-audit [--deep] [--fix] [--json]
# Voice
buddy speak [text] [--voice <name>] [--list-voices] [--speed <n>] [--format <fmt>]
# Knowledge Base
buddy knowledge list|show|search|add|remove|context
# DM Pairing
buddy pairing status|list|pending|approve <code>|add <id>|revoke <id>
# Wide Research
buddy research "<topic>" [--workers N] [--rounds N] [--output file.md]
# Task List (todo.md attention bias — injected at end of every agent turn)
buddy todo list # Show all items
buddy todo add "task description" [-p high|medium|low]
buddy todo done <id> # Mark completed
buddy todo update <id> [-s in_progress] [-t "new text"]
buddy todo remove <id> # Delete item
buddy todo clear-done # Remove all completed
buddy todo context # Preview the block injected into the agent
# Lessons (self-improvement loop — injected before every agent turn)
buddy lessons list [--category PATTERN|RULE|CONTEXT|INSIGHT]
buddy lessons add "what went wrong → correct approach" --category PATTERN
buddy lessons search "tsc" # Find relevant lessons before a task
buddy lessons clear [--category RULE] --yes
buddy lessons context # Preview the <lessons_context> block
# Setup
buddy onboard # Interactive setup wizard
buddy doctor # Environment diagnostics| Variable | Description | Default |
|---|---|---|
GROK_API_KEY |
xAI API key | Required |
ANTHROPIC_API_KEY |
Anthropic API key | - |
OPENAI_API_KEY |
OpenAI API key | - |
GOOGLE_API_KEY |
Google AI API key | - |
SERPER_API_KEY |
Web search API key | - |
GROK_BASE_URL |
Custom API endpoint | - |
GROK_MODEL |
Default model | - |
BRAVE_API_KEY |
Brave Search API key | - |
EXA_API_KEY |
Exa neural search API key | - |
PERPLEXITY_API_KEY |
Perplexity AI search key (pplx-...) |
- |
OPENROUTER_API_KEY |
OpenRouter key for Perplexity (sk-or-...) |
- |
PERPLEXITY_MODEL |
Perplexity model | perplexity/sonar-pro |
PICOVOICE_ACCESS_KEY |
Porcupine wake word detection | - |
CACHE_TRACE |
Debug prompt construction stages | false |
YOLO_MODE |
Full autonomy | false |
MAX_COST |
Cost limit ($) | 10 |
JWT_SECRET |
API server auth | Required in prod |
TELEGRAM_BOT_TOKEN |
Telegram bot token (from @BotFather) | - |
DISCORD_TOKEN |
Discord bot token | - |
SLACK_BOT_TOKEN |
Slack bot token | - |
Optional Rust tools:
| Tool | Install | Purpose |
|---|---|---|
| RTK | cargo install --git https://github.com/rtk-ai/rtk |
CLI proxy that wraps commands to reduce LLM token usage 60-90% |
| ICM | cargo install --git https://github.com/rtk-ai/icm |
MCP server for persistent cross-session memory |
Create .codebuddy/settings.json:
{
"systemPrompt": "You are working on a TypeScript project.",
"tools": {
"enabled": ["read_file", "search", "bash"],
"disabled": ["web_search"]
},
"security": {
"mode": "auto-edit",
"bashAllowlist": ["npm *", "git *"]
}
}# Clone and install
git clone https://github.com/phuetz/code-buddy.git
cd code-buddy
npm install
# Development mode
npm run dev
# Run tests
npm test
# Validate before commit
npm run validate
# Build
npm run build24,000+ tests across 590+ suites covering:
- Core: Tool Policy, Bash Allowlist, Context Window Guard, Compaction
- Agent: Middleware Pipeline, Profiles, Reasoning, Streaming
- Autonomy: Daemon, Cron Bridge, Task Planner, Delegation Engine
- Observation: Screen Observer, Triggers, Proactive Notifications
- Orchestration: Supervisor, Shared Context, Self-Healing, Rollback
- Providers: Gemini (vision + conversation), OpenAI-compat, Failover
- Security: Trust Folders, Skill Scanner, Bash Parser, Session Locks
- Infrastructure: MCP Client, Webhooks, Extensions, ACP Protocol, RTK Compressor, ICM Bridge
- Reasoning: Tree-of-Thought, MCTS, Reasoning Facade, /think command
- Memory: Decision Memory, Coding Style Analyzer, Importance Scorer
- Middleware: Auto-Repair, Quality Gates, Reasoning, Workflow Guard
- Voice: Wake Word, TTS Providers, Voice Control Loop
- UI: ChatHistory, ChatInterface, TabbedQuestion
Code Buddy implements techniques from academic research and draws architectural inspiration from leading open-source projects.
Reasoning & Planning:
| Paper | Reference | Implementation |
|---|---|---|
| Tree of Thoughts | Yao et al., 2023 — arXiv:2305.10601 | src/agent/reasoning/tree-of-thought.ts |
| RethinkMCTS | Zhang et al., 2024 — arXiv:2409.09584 | src/agent/reasoning/mcts.ts |
| TALE (Token-Budget-Aware Reasoning) | arXiv:2412.18547 | src/agent/token-budget-reasoning.ts — 68.9% token reduction |
| FrugalGPT | Stanford, 2023 — arXiv:2305.05176 | src/optimization/model-routing.ts — 30-70% cost reduction |
| LLMCompiler | arXiv:2312.04511 | src/optimization/parallel-executor.ts — 2.5-4.6x speedup |
Program Repair:
| Paper | Reference | Implementation |
|---|---|---|
| ChatRepair | Xia et al., ISSTA 2024 — arXiv:2403.12538 | src/agent/repair/iterative-repair.ts |
| ITER | arXiv:2403.00418 | src/agent/repair/repair-templates.ts — iterative template repair |
| RepairAgent | ICSE 2024 | src/agent/repair/repair-engine.ts — autonomous LLM-based repair |
| AgentCoder | Huang et al., 2023 | src/agent/multi-agent/multi-agent-system.ts — hierarchical multi-agent code generation |
RAG & Context Management:
| Paper | Reference | Implementation |
|---|---|---|
| CodeRAG | arXiv:2509.16112 | src/context/multi-path-retrieval.ts, src/context/dependency-aware-rag.ts |
| RAG-MCP | arXiv:2505.03275 | src/tools/tool-selector.ts |
| ToolLLM | ICLR'24 — arXiv:2307.16789 | src/agent/execution/tool-selection-strategy.ts |
| Comprehensive RAG Survey | arXiv:2506.00054 | src/context/codebase-rag/codebase-rag.ts |
| Recurrent Context Compression | arXiv:2406.06110 | src/context/context-manager-v2.ts |
Observation & Optimization:
| Paper | Reference | Implementation |
|---|---|---|
| JetBrains Context Management | JetBrains Research, 2024 | src/context/observation-masking.ts — -7% cost, +2.6% success |
| Complexity Trap | arXiv:2508.21433 | src/context/observation-masking.ts |
| Less-is-More (Tool Filtering) | arXiv, 2024 | src/optimization/tool-filtering.ts — 70% execution time reduction |
| The Prompt Report | arXiv:2406.06608 | src/prompts/system-base.ts |
Testing & Memory:
| Paper | Reference | Implementation |
|---|---|---|
| TDD + LLM | ICSE 2024 | src/testing/tdd-mode.ts — TDD improves Pass@1 by 45.97% |
| MemGPT | UC Berkeley, 2023 | src/memory/prospective-memory.ts — stateful AI agents |
Fault Localization: Ochiai, DStar, and Tarantula (Jones et al., 2002) spectrum-based techniques in src/agent/repair/fault-localization.ts.
Code Buddy's architecture draws from these open-source projects:
| Project | Inspiration | Key Files |
|---|---|---|
| OpenClaw | Multi-channel messaging, DM pairing, lane queue concurrency, memory lifecycle, tool policy, skills system, heartbeat, identity system, group security, hub marketplace | 40+ files across src/channels/, src/concurrency/, src/memory/, src/security/, src/skills/ |
| OpenAI Codex CLI | Apply-patch unified diff, head/tail truncation, per-model tool config, turn diff tracker, security modes, OS sandbox workspace-write tiers, shell-free exec, SSRF guard, exec policy prefix rules, shell env policy, named config profiles, tool prefix naming convention, stable JSON serialization, session fork/rollout unification | src/tools/apply-patch.ts, src/sandbox/os-sandbox.ts, src/security/ssrf-guard.ts, src/sandbox/execpolicy.ts, src/tools/registry/tool-aliases.ts, src/utils/stable-json.ts, src/observability/run-store.ts |
| Claude Code | Hook system, slash commands, MCP config, extended thinking, parallel subagents, headless output, Anthropic prompt cache breakpoints | src/hooks/, src/commands/slash-commands.ts, src/mcp/config.ts, src/optimization/cache-breakpoints.ts |
| Gemini CLI | Persistent checkpoints, context files, compress command, shell prefix, multimodal input | src/checkpoints/, src/context/context-files.ts, src/input/multimodal-input.ts |
| Aider | Repository map, voice input, unified diff editor, watch mode (IDE comments) | src/context/repository-map.ts, src/tools/voice-input.ts, src/commands/watch-mode.ts |
| Cursor | .cursorrules config, parallel agent system, sandboxed terminals, embedded browser |
src/config/codebuddyrules.ts, src/agent/parallel/, src/browser/embedded-browser.ts |
| Mistral Vibe | External markdown prompts, TOML config, tool permission system, fuzzy match, update notifier | src/prompts/, src/config/toml-config.ts, src/utils/fuzzy-match.ts |
| Conductor | Spec-driven development, track system | src/tracks/ |
| RTK | Command proxy for 60-90% token reduction | src/utils/rtk-compressor.ts |
| ICM | Persistent cross-session memory via MCP | src/memory/icm-bridge.ts |
| Manus AI | Wide Research (parallel sub-agent research workers), Knowledge Base injection, todo.md attention bias, restorable context compression, pre-compaction NO_REPLY flush, inline web-search citations, observation variator (anti-repetition), structured prompt variation, tool result compaction guard, disk-backed tool results, response prefill modes (tool_choice control), WebSearchMode + domain policy, message queue debounce/cap/overflow | src/agent/wide-research.ts, src/context/observation-variator.ts, src/agent/response-constraint.ts, src/tools/web-search.ts, src/agent/message-queue.ts |
| OpenClaw | Multi-channel messaging, DM pairing, lane queue concurrency, memory lifecycle, tool policy, skills system, heartbeat, identity system, group security, hub marketplace, daily session reset, per-channel streaming policies | src/channels/streaming-policy.ts, src/channels/, src/skills/, src/daemon/daily-reset.ts |
Other influences: Rust (Result<T, E> pattern), AutoGPT, MetaGPT, CrewAI, ChatDev (role-based multi-agent), ReAct (reasoning + acting paradigm), Qodo/PR-Agent (RAG for code repos).
Benchmarks referenced: SWE-bench, HumanEval, MBPP, BigCodeBench, WebArena, Berkeley Function Calling Leaderboard.
For detailed research notes, see docs/RESEARCH_IMPROVEMENTS.md, docs/RAG_TOOL_SELECTION.md, and deep_research/ai-coding-assistant-improvements/.
API key not working
echo $GROK_API_KEY # Verify key is set
buddy --prompt "test"Switching providers doesn't work
# Verify connection to local model
buddy --base-url http://localhost:1234/v1 --api-key lm-studio --prompt "test"
# List available models
buddy --list-modelsMemory not persisting
# Check memory directory
ls ~/.codebuddy/memory/
# Clear and reinitialize
rm -rf ~/.codebuddy/memory/
buddyHigh latency
- Use a faster model:
buddy --model grok-code-fast-1 - Use local LLM:
buddy --base-url http://localhost:11434/v1 --model llama3
Debug mode
DEBUG=codebuddy:* buddyMIT License - see LICENSE for details.
Report Bug | Request Feature | Star on GitHub
Multi-AI: Grok | Claude | ChatGPT | Gemini | LM Studio | Ollama