Code Buddy

Your AI-Powered Development Tool & Personal Assistant

A multi-AI terminal agent that writes code, runs commands, searches the web, talks to you, and manages your projects — from your terminal, your phone, or running 24/7 in the background.

What is Code Buddy?

Code Buddy is an open-source multi-provider AI coding agent that runs in your terminal. It supports Grok, Claude, ChatGPT, Gemini, LM Studio, and Ollama via OpenAI-compatible APIs and provider-specific SDKs.

It works as two things at once:

A development tool — reads files, writes code, runs commands, creates PRs, plans complex tasks, and fixes its own mistakes across 5-50 tool calls per task.
A personal assistant — talks to you by voice, remembers your preferences, monitors your screen, sends notifications to your phone via Telegram/Discord/Slack, and runs scheduled tasks 24/7 in the background.

Key highlights:

6 AI providers with automatic failover
40 bundled skills (PR workflow, DevOps, creative tools, smart home, media)
11 messaging channels (Terminal, Telegram, Discord, Slack, WhatsApp, Signal, Teams, Matrix, Google Chat, WebChat, HTTP API)
Daemon mode for 24/7 background operation
Multi-agent orchestration with self-healing
Voice conversation with wake word detection
OS sandbox with three tiers (read-only / workspace-write / danger-full-access)
Docker sandbox for untrusted code execution
Knowledge base injection (Knowledge.md files loaded into agent system prompt)
Wide Research mode (parallel sub-agents decompose and research topics concurrently)
Todo.md attention bias (task list appended to end of every LLM context turn — Manus AI pattern)
Lessons.md self-improvement loop (PATTERN/RULE/CONTEXT/INSIGHT lessons injected before every turn — persists corrections across sessions)
Workflow orchestration rules in system prompt (concrete plan triggers, auto-correction protocol, verification contract)
Restorable context compression (identifiers preserved, full content recoverable on demand)
Pre-compaction memory flush (facts saved to MEMORY.md before context is compacted — OpenClaw pattern)
Decision memory extraction (architectural choices + rationale persisted during context flush)
Coding style learning (auto-analyzes naming, imports, indentation, error handling patterns)
Importance-weighted context compression (error messages, decisions, code scored higher than conversation)
Auto-repair middleware (detects tool failures, localizes faults, suggests fixes automatically)
Quality gate middleware (auto-delegates to CodeGuardian + Security agents after implementation)
Issue-to-PR pipeline (buddy dev issue #42 → branch → implement → test → PR)
Anthropic prompt cache breakpoints (stable/dynamic split → up to 10× cheaper input tokens on cache hits)
Per-channel streaming policies (Telegram, Discord, Slack, WhatsApp each get their own chunking/format rules)
SSRF guard on all outbound fetches (IPv4 + IPv6 bypass vector blocking)
Tool prefix naming convention (shell_exec, file_read, browser_search, … — Codex-style canonical aliases)

Quick Start

Prerequisites

Node.js 18.0.0 or higher
ripgrep (recommended for faster search)
Docker (required for CodeAct / Open Manus mode)

# macOS
brew install ripgrep

# Ubuntu/Debian
sudo apt-get install ripgrep

# Windows
choco install ripgrep

Install

# npm (recommended)
npm install -g @phuetz/code-buddy

# Or try without installing
npx @phuetz/code-buddy@latest

First Run

# Configure API key (Grok/xAI)
export GROK_API_KEY=your_api_key

# Start interactive mode
buddy

# Or with a specific task
buddy --prompt "analyze the codebase structure"

# Use with local LLM (LM Studio)
buddy --base-url http://localhost:1234/v1 --api-key lm-studio

# Full autonomy mode (still requires /yolo on confirmation in chat)
YOLO_MODE=true buddy

Headless Mode (CI / Scripting)

# Single prompt, JSON output to stdout (logs go to stderr)
buddy -p "create a hello world Express app" --output-format json > result.json

# Pipe into other tools
buddy -p "explain this code" --output-format json 2>/dev/null | jq '.content'

# Use in CI with full autonomy
buddy -p "run tests and fix failures" \
  --dangerously-skip-permissions \
  --output-format json \
  --max-tool-rounds 30

# Auto-approve all tool executions (no confirmation prompts)
buddy -p "fix lint errors" --auto-approve --output-format text

Headless mode exits cleanly after completion, so it is safe to run under the timeout command, in shell scripts, and in CI pipelines.

Session Management

# Continue the most recent session
buddy --continue

# Resume a specific session by ID (supports partial matching)
buddy --resume abc123

# Set a cost limit for the session
buddy --max-price 5.00

Typical Project Workflow

# 1. First-time setup
buddy --setup                # Quick API key setup wizard
buddy onboard                # Full interactive config wizard
buddy doctor                 # Verify environment & dependencies

# 2. Start coding
buddy                        # Launch interactive chat
buddy --vim                  # Launch with Vim keybindings

# 3. Describe what you want in natural language
> "Create a Node.js project with Express and Prisma"
> "Add Google OAuth authentication"
> "Write tests for the auth module"
> "Fix the typecheck errors"
> "Commit everything"

# 4. Advanced modes
buddy --model gemini-2.5-flash  # Switch AI model
buddy --system-prompt architect # Use architect system prompt
buddy --agent my-custom-agent   # Use custom agent from ~/.codebuddy/agents/
buddy speak                     # Voice conversation mode
buddy daemon start              # Run 24/7 in background
buddy server --port 3000        # Expose REST/WebSocket API

Code Buddy autonomously reads files, writes code, runs commands, and fixes errors — typically 5-15 tool calls per task (up to 50, or 400 in YOLO mode).

Development Tool

Agentic Coding

Code Buddy operates as an autonomous coding agent. It reads your codebase, makes changes, runs commands, and iterates until the task is done.

Built-in tools:

Category	Tools
File Operations	`view_file`, `create_file`, `str_replace_editor`, `edit_file`, `multi_edit`
Search	`search`, `codebase_map`
System	`bash`, `docker`, `kubernetes`
CodeAct	`run_script` (Python/JS/TS in Docker), `plan` (Persistent Planner)
Web	`web_search`, `web_fetch`, `browser`
Patching	`apply_patch` (unified diff with fuzz factor, Codex-inspired)
Planning	`create_todo_list`, `get_todo_list`, `update_todo_list`
Media	`screenshot`, `audio`, `video`, `ocr`, `clipboard`
Documents	`pdf`, `document`, `archive`
Knowledge	`knowledge_search`, `knowledge_add` — search/add knowledge base entries
Human Input	`ask_human` — pause execution for mid-task user clarification (120s timeout)
Self-Extension	`create_skill` — write new SKILL.md files at runtime (self-authoring)
Self-Improvement	`lessons_add`, `lessons_search`, `lessons_list` — persist and recall learned patterns across sessions
Verification	`task_verify` — run tsc/tests/lint before marking tasks complete (Verification Contract)

RAG-based tool selection filters tools per query to reduce prompt tokens — only relevant tools are included in each API call.

Code Intelligence

Web Search (5-Provider Fallback Chain):

Priority	Provider	API Key Required	Features
1	Brave MCP	`BRAVE_API_KEY` + MCP enabled	Full MCP integration, richest results
2	Brave API	`BRAVE_API_KEY`	Country, language, freshness filters
3	Perplexity	`PERPLEXITY_API_KEY` or `OPENROUTER_API_KEY`	AI-synthesized answers with citations
4	Serper	`SERPER_API_KEY`	Google Search results
5	DuckDuckGo	None	Free fallback (no API key needed)

Search parameters: country (ISO 3166), search_lang, ui_lang, freshness (pd/pw/pm/py or date range), provider (force specific).
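
The priority order in the table can be sketched as a small selection function. This is an illustrative sketch, not Code Buddy's actual implementation: availableSearchProviders, the SearchEnv shape, and the mcpEnabled flag are hypothetical names. Only the environment variable names and the priorities come from the table above.

```typescript
// Hypothetical sketch: everything except the env variable names is an assumption.
type SearchProvider = "brave-mcp" | "brave-api" | "perplexity" | "serper" | "duckduckgo";

interface SearchEnv {
  BRAVE_API_KEY?: string;
  PERPLEXITY_API_KEY?: string;
  OPENROUTER_API_KEY?: string;
  SERPER_API_KEY?: string;
}

// Returns the providers that can be tried, in priority order (1 to 5).
function availableSearchProviders(env: SearchEnv, mcpEnabled: boolean): SearchProvider[] {
  const chain: SearchProvider[] = [];
  if (env.BRAVE_API_KEY && mcpEnabled) chain.push("brave-mcp");
  if (env.BRAVE_API_KEY) chain.push("brave-api");
  if (env.PERPLEXITY_API_KEY || env.OPENROUTER_API_KEY) chain.push("perplexity");
  if (env.SERPER_API_KEY) chain.push("serper");
  chain.push("duckduckgo"); // needs no key, so it is always the last resort
  return chain;
}
```

With no keys configured the chain contains only DuckDuckGo, which matches the "free fallback" row.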

Context management uses smart multi-stage compaction (remove stale tool results, summarize older messages, aggressive truncation) to keep conversations within token limits across long sessions.

Hybrid search combines keyword + semantic search with configurable weights for memory retrieval.
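
One common way to combine keyword and semantic results is a weighted sum of normalized scores. The sketch below assumes that approach; hybridRank, ScoredDoc, and the default weights are hypothetical and not taken from the Code Buddy source.

```typescript
// Hypothetical sketch: both scores are assumed to be normalized to [0, 1].
interface ScoredDoc {
  id: string;
  keywordScore: number;
  semanticScore: number;
}

// Ranks documents by a weighted sum of the two scores, highest first.
function hybridRank(
  docs: ScoredDoc[],
  keywordWeight = 0.5,
  semanticWeight = 0.5
): { id: string; score: number }[] {
  const total = keywordWeight + semanticWeight;
  if (total <= 0) throw new Error("weights must sum to a positive number");
  return docs
    .map((d) => ({
      id: d.id,
      score: (keywordWeight * d.keywordScore + semanticWeight * d.semanticScore) / total,
    }))
    .sort((a, b) => b.score - a.score);
}
```

Shifting weight toward keywordWeight favors exact term matches; shifting it toward semanticWeight favors paraphrases.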

🚀 Open Manus Features (CodeAct)

Code Buddy implements the Open Manus / CodeAct architecture in a structured, phased approach, allowing it to write and execute code (Python, TypeScript, Node.js) in a secure Docker sandbox instead of relying solely on pre-defined tools.

Phase 1: Sandboxed Execution (Hybrid Agent)

RunScriptTool: Writes and runs scripts in ephemeral Docker containers (ubuntu:latest, node:22-slim, python:3.11-slim).
Browser Automation: Uses Playwright in Docker to scrape websites, interact with SPAs, and take screenshots programmatically.
Safety First: Timeout (120s), Memory Limit (1GB), and ephemeral containers prevent runaway processes.

Phase 2: Persistent State & Planning

Persistent Workspace: Files created in .codebuddy/workspace persist between script executions, allowing multi-step workflows (e.g., scrape → save CSV → analyze CSV → plot chart).
PlanTool: The agent maintains a PLAN.md file in your project root to track complex, multi-step objectives statefully.
Structured Loop: The system prompt enforces a strict PLAN → THINK → CODE → OBSERVE → UPDATE cognitive cycle to prevent chaotic behavior.

Phase 3: Wide Research (Parallel Agents)

WideResearchOrchestrator: Decomposes a topic into N independent subtopics via LLM, spawns N parallel CodeBuddyAgent workers (default: 5, max: 20), then aggregates results into a single comprehensive report.
Progress streaming: Emits real-time events as each worker completes.
CLI: buddy research "quantum computing breakthroughs" --workers 8 --output report.md

Phase 4: Context Engineering (Manus AI + OpenClaw patterns)

Todo.md Attention Bias — The agent maintains a todo.md task list that is automatically appended at the end of the LLM context on every turn. Because transformers attend more strongly to recent tokens, this keeps objectives in focus across long sessions without modifying the system prompt. Use buddy todo add/done/list or the todo_update tool.
Restorable Compression — When the context window is compressed, file paths and URLs are extracted as identifiers and the original content is stored. The agent can call restore_context("src/agent/types.ts") to retrieve the full content on demand, making compression lossless for structured identifiers.
Pre-compaction Memory Flush (NO_REPLY) — Before compaction triggers, a silent background LLM turn extracts durable facts and saves them to MEMORY.md. If the model returns the NO_REPLY sentinel with no meaningful content, the output is suppressed entirely (no notification spam).
Inline Citations — Web search results include [1] [2] citation markers inline and a Sources block listing all referenced URLs.
Lessons.md Self-Improvement Loop — After any user correction, the agent calls lessons_add to persist the lesson (category: PATTERN, RULE, CONTEXT, or INSIGHT) to .codebuddy/lessons.md. On every turn, active lessons are injected as a <lessons_context> block BEFORE the todo suffix so learned patterns are always visible. Use buddy lessons add/search/list or the lessons_add/lessons_search tools. The task_verify tool runs the Verification Contract (tsc + tests + lint) before any task completion.
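
The ordering described for the todo.md attention bias and the lessons injection (lessons block first, todo list last) can be sketched as follows. This is a minimal illustration under stated assumptions: withTurnSuffix, the Message shape, and the choice of system-role messages are hypothetical. Only the <lessons_context> tag and the relative order come from the text above.

```typescript
// Hypothetical sketch: message shape and roles are assumptions.
interface Message {
  role: "system" | "user" | "assistant";
  content: string;
}

// Appends per-turn context so the todo list is always the final message.
function withTurnSuffix(history: Message[], lessons: string[], todoMarkdown: string): Message[] {
  const out = [...history]; // never mutate the stored history
  if (lessons.length > 0) {
    const body = lessons.map((l) => `- ${l}`).join("\n");
    out.push({ role: "system", content: `<lessons_context>\n${body}\n</lessons_context>` });
  }
  if (todoMarkdown.trim() !== "") {
    out.push({ role: "system", content: todoMarkdown }); // last, where recency bias is strongest
  }
  return out;
}
```

Because the suffix is rebuilt on every turn, the stored history stays untouched and the system prompt does not change.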

Phase 5: Persistent & Contextual Memory (Agent Zero + OpenClaw patterns)

Decision Memory — During pre-compaction flush, the LLM extracts architectural decisions (choice, alternatives, rationale, confidence) into structured <decision> blocks. These are persisted via EnhancedMemory and injected as <decisions_context> in future turns, so the agent remembers why past choices were made — even months later.
Coding Style Learning — CodingStyleAnalyzer detects project conventions (quote style, semicolons, indentation, naming, imports, error handling, testing patterns) via regex heuristics and majority voting across files. The resulting profile is injected as a <coding_style> block in the system prompt.
Importance-Weighted Compression — ImportanceScorer assigns scores (0–1) to each message based on content type (errors=0.95, decisions=0.90, code=0.70, conversation=0.25), recency, and role. The sliding window in ContextManagerV2 preserves high-importance messages (>0.8) even outside the recent window.
Auto-Repair Middleware (priority 150) — Scans recent tool results for error patterns (SyntaxError, FAIL, exit code). When detected, invokes FaultLocalizer to pinpoint the file/line, then injects a repair suggestion into the conversation. Max 3 attempts per failure, resets on success.
Quality Gate Middleware (priority 200) — After implementation completes, auto-delegates to CodeGuardian (architecture review) and SecurityReview (vulnerability scan) agents. Reports findings as warnings; required gates can block the loop.
Issue-to-PR Pipeline — buddy dev issue <url-or-number> automates: fetch GitHub issue → create branch → map labels to workflow type → run plan+implement+test → create PR with Closes #N. Labels like bug→fix-tests, security→security-audit, feature→add-feature.
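
The Importance-Weighted Compression item above can be illustrated with a small sliding-window filter. This sketch uses only the per-type base scores and the 0.8 threshold quoted in the text; the recency and role adjustments are omitted, and selectForContext, Msg, and Kind are hypothetical names rather than the ImportanceScorer or ContextManagerV2 APIs.

```typescript
// Hypothetical sketch: only the scores and the threshold come from the text.
type Kind = "error" | "decision" | "code" | "conversation";

interface Msg {
  id: number;
  kind: Kind;
}

const BASE_SCORE: Record<Kind, number> = {
  error: 0.95,
  decision: 0.9,
  code: 0.7,
  conversation: 0.25,
};

// Keeps the last `windowSize` messages plus any older message scoring above `threshold`.
function selectForContext(messages: Msg[], windowSize: number, threshold = 0.8): Msg[] {
  const cutoff = Math.max(0, messages.length - windowSize);
  return messages.filter((m, i) => i >= cutoff || BASE_SCORE[m.kind] > threshold);
}
```

Under these scores, old errors and decisions survive outside the recent window, while old code and conversation messages are dropped.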

Example Prompts:

> "Go to Google News, scrape the top headlines about AI, save them to a CSV, and then use Python to analyze the sentiment."
> "Write a script to check broken links on my documentation site."
> "Calculate the Fibonacci sequence up to 1000 and plot the growth rate."

🧬 Roots & Comparison

Code Buddy is an evolution of the OpenClaw architecture, modernized for the TypeScript ecosystem and enhanced with Open Manus (CodeAct) autonomy.

Feature	OpenClaw	Code Buddy	Open Manus
Language	Python	TypeScript / Node.js	Python
Philosophy	Tool-Based	Hybrid (Tool + CodeAct)	Pure CodeAct
Messaging	Multi-channel	11+ Channels (Telegram focus)	Web Interface
Task State	Heartbeat	Persistent PLAN.md + Workspace	Transient Session
Concurrency	Lane Queue	Advanced Lane Queue + DAG	Sequential
Extensibility	SKILL.md	Skills Hub + Plugins + MCP	Custom Scripts

Why Code Buddy? It combines OpenClaw's operational strengths (concurrency control, security policies, multi-channel messaging) with the flexibility of Open Manus (dynamic script generation and execution).

Manus AI influence: Wide Research (parallel sub-agent research workers), Knowledge Base injection, todo.md attention bias (task list at end of context each turn), and restorable context compression (identifier-based content recovery) are all inspired by Manus AI's context engineering research. The pre-compaction NO_REPLY flush pattern is from OpenClaw's compaction documentation.

Code Safety

Code Buddy validates generated code, patches, and shell commands before they touch your files:

Feature	Description
Generated Code Validator	Pre-write scan for eval, XSS, SQL injection, hardcoded secrets, prototype pollution
Pre-Write Syntax Validator	Balanced delimiters, template literals, indentation (JS/TS/Python/YAML/HTML/CSS/JSON)
Atomic Rollback (apply-patch)	All-or-nothing patch application with full file state backup
Atomic Transactions (multi-edit)	Multi-file edits rolled back on first failure
AST Bash Validation	tree-sitter-based command parsing with centralized dangerous pattern checks
Bash Checkpoints	Pre-snapshot of files targeted by destructive commands (rm, mv, truncate)
Diff Preview	Shows actual diffs before approval, magnitude-based re-confirmation for large changes
Semantic Truncation	Error-preserving output truncation (keeps error lines and stack traces)
Security Audit Logging	JSONL audit trail for all code generation security decisions
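
The all-or-nothing behavior described in the Atomic Transactions row can be sketched over an in-memory file map. This is an illustration only: applyEditsAtomically, Edit, and EditResult are hypothetical names, and the real multi_edit tool works on files on disk.

```typescript
// Hypothetical sketch: an in-memory stand-in for multi-file edits with rollback.
interface Edit {
  path: string;
  find: string;
  replace: string;
}

interface EditResult {
  ok: boolean;
  files: Map<string, string>;
  error?: string;
}

// Applies every edit or none: on the first failure the original map is returned untouched.
function applyEditsAtomically(files: Map<string, string>, edits: Edit[]): EditResult {
  const working = new Map(files); // work on a copy; the input acts as the backup
  for (const e of edits) {
    const current = working.get(e.path);
    if (current === undefined) return { ok: false, files, error: `missing file: ${e.path}` };
    if (!current.includes(e.find)) return { ok: false, files, error: `text not found in ${e.path}` };
    // A replacer function avoids special "$" patterns in the replacement text.
    working.set(e.path, current.replace(e.find, () => e.replace));
  }
  return { ok: true, files: working };
}
```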

Task Planning

For complex multi-step requests, Code Buddy decomposes work into a DAG (directed acyclic graph) and executes steps in parallel where possible.

TaskPlanner — needsPlanning() heuristic detects complex requests, createPlan() produces a TaskGraph
Topological sort — determines execution order with dependency tracking
Parallel execution — independent steps run concurrently via dependency waves
Architect mode — --system-prompt architect enables plan-first coding with per-step checkpoints
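
The "dependency waves" idea above can be sketched with a layered topological sort. This is a minimal illustration, not the TaskPlanner implementation: dependencyWaves and the Step shape are hypothetical names.

```typescript
// Hypothetical sketch of wave-based scheduling over a DAG.
interface Step {
  id: string;
  dependsOn: string[];
}

// Groups steps into waves: every step in a wave depends only on earlier waves,
// so the steps inside one wave could run concurrently.
function dependencyWaves(steps: Step[]): string[][] {
  const pending = new Map<string, Set<string>>();
  for (const step of steps) pending.set(step.id, new Set(step.dependsOn));

  for (const step of steps) {
    for (const dep of step.dependsOn) {
      if (!pending.has(dep)) throw new Error(`unknown dependency: ${dep}`);
    }
  }

  const waves: string[][] = [];
  while (pending.size > 0) {
    // A step is ready when all of its dependencies have already been scheduled.
    const ready = steps.filter((s) => pending.get(s.id)?.size === 0).map((s) => s.id);
    if (ready.length === 0) throw new Error("cycle detected: the graph is not a DAG");
    for (const id of ready) pending.delete(id);
    pending.forEach((deps) => ready.forEach((id) => deps.delete(id)));
    waves.push(ready);
  }
  return waves;
}
```

For a plan where api and ui both depend on schema, and tests depends on both, this yields three waves, with api and ui sharing the second one.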

CI/CD Integration

Feature	Description
CI Watcher	GitHub Actions / GitLab CI / Jenkins alerts with "Fix it" auto-agent
Webhook Triggers	HMAC-SHA256 verified HTTP triggers — connect CI, monitoring, or any service
Headless Mode	`buddy -p "run tests and fix failures" --dangerously-skip-permissions` for CI pipelines

Git Workflow

Code Buddy handles the full Git lifecycle through natural language:

> "Create a PR for the auth changes"
> "Review the open PRs"
> "Fix the merge conflicts on feature-branch"
> "Commit everything with a good message"

Telegram enhanced commands for remote Git operations:

Command	Description
`/repo`	Repository info, recent commits, open PRs
`/branch [name]`	Branch diff stats vs main
`/pr [number]`	List or view PRs with merge/review buttons

Personal Assistant

Voice Conversation

Full hands-free voice interaction with wake word detection:

buddy speak "Hello, I am Code Buddy"         # Synthesize and play speech
buddy speak --voice af_bella "Hello world"    # Use a specific voice
buddy speak --list-voices                     # List available voices
buddy speak --speed 1.5 "Fast speech"         # Adjust speed (0.25-4.0)
buddy speak --format mp3 "Hello"              # Output format (wav, mp3)
buddy speak --url http://host:8000 "Hello"    # Custom AudioReader URL

7 TTS providers: Edge TTS, espeak, macOS say, Piper, OpenAI, ElevenLabs, AudioReader (Kokoro-82M local)

In-chat voice commands:

Command	Description
`/speak <text>`	Speak text with current TTS provider
`/tts on|off`	Enable/disable TTS
`/tts auto`	Auto-speak all agent responses
`/tts provider audioreader`	Switch to AudioReader (Kokoro-82M, local, free)
`/tts voice ff_siwis`	Set voice (e.g., `ff_siwis` FR, `af_bella` EN)

Wake word detection via Porcupine (Picovoice) with text-match fallback. Set PICOVOICE_ACCESS_KEY for hardware-accelerated detection, or use the built-in text matcher for free.

Infinite voice conversation: Enable continuousListening + autoSpeak with AudioReader for a hands-free loop: listen → STT → agent → TTS → listen.

Memory System

Subsystem	Storage	Purpose
Persistent Memory	Markdown files	Project/user notes
Enhanced Memory	SQLite + embeddings	Semantic search
Prospective Memory	SQLite	Tasks, goals, reminders
Decision Memory	EnhancedMemory	Architectural choices with rationale, auto-extracted during context flush
Coding Style Memory	EnhancedMemory	Project conventions (naming, imports, indentation) auto-analyzed from source
ICM (optional)	ICM MCP server	Persistent cross-session memory via episodic + semantic dual architecture

Auto-capture detects and stores important information from conversations:

"Remember that..."        → Stored as instruction
"I prefer..."             → Stored as preference
"This project uses..."    → Stored as project fact
"My email is..."          → Stored as contact info
"We decided to..."        → Stored as decision

Memory lifecycle hooks inject relevant memories before execution, capture important info after responses, and summarize conversations at session end. Deduplication via Jaccard similarity (0.95 threshold) prevents duplicates.
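
Jaccard similarity is the size of the intersection of two sets divided by the size of their union. The sketch below applies it to word sets with the 0.95 threshold mentioned above. The tokenization (lowercase, split on non-word characters) and the names jaccard and isDuplicate are assumptions; the source does not say how memories are tokenized or whether the comparison is inclusive.

```typescript
// Hypothetical sketch: tokenization and the inclusive comparison are assumptions.
// Jaccard similarity over lowercase word sets: |A ∩ B| / |A ∪ B|.
function jaccard(a: string, b: string): number {
  const tokens = (s: string) => new Set(s.toLowerCase().split(/\W+/).filter(Boolean));
  const setA = tokens(a);
  const setB = tokens(b);
  if (setA.size === 0 && setB.size === 0) return 1; // two empty texts are identical
  let shared = 0;
  setA.forEach((t) => {
    if (setB.has(t)) shared++;
  });
  return shared / (setA.size + setB.size - shared);
}

// True when the candidate is near-identical to something already stored.
function isDuplicate(candidate: string, stored: string[], threshold = 0.95): boolean {
  return stored.some((s) => jaccard(candidate, s) >= threshold);
}
```

A threshold this high only catches near-verbatim repeats: "I prefer tabs" and "I prefer spaces" share two of four distinct words, so their similarity is 0.5 and both are kept.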

Knowledge Base

Domain knowledge injected into the agent system prompt at startup (src/knowledge/knowledge-manager.ts):

Sources: Knowledge.md (project root), .codebuddy/knowledge/*.md (project-level), ~/.codebuddy/knowledge/*.md (global)
YAML frontmatter: title, tags, scope (restrict to specific agent modes), priority (injection order)
Agent tools: knowledge_search (keyword search across all entries), knowledge_add (persist new knowledge to disk)
Injection: Loaded entries are wrapped in a <knowledge> block and included in the system prompt automatically.

buddy knowledge list             # List all loaded knowledge entries
buddy knowledge show <title>     # Show a specific entry
buddy knowledge search "TypeScript conventions"
buddy knowledge add              # Interactive: add a new knowledge entry
buddy knowledge remove <title>   # Remove an entry
buddy knowledge context          # Show the full <knowledge> block the agent sees

Skills Library (40 Bundled Skills)

Code Buddy includes 40 built-in SKILL.md files that provide domain-specific knowledge, best practices, and MCP server integration. Skills are loaded contextually when relevant to your project.

Category	Skill	Description
PR Workflow	`review-pr`	Code review checklist, inline comments, approval criteria
	`prepare-pr`	Branch naming, commit cleanup, PR description template
	`merge-pr`	Merge strategies, conflict resolution, post-merge cleanup
Dev Tools	`github`	Issues, releases, Actions workflows, gh CLI
	`gitlab`	GitLab API, glab CLI, CI/CD pipelines, merge requests
	`session-logs`	Export/search conversation history and session metadata
	`model-usage`	Token tracking, cost analysis, provider comparison
	`tmux-sessions`	Terminal multiplexing, pane layouts, session management
	`healthcheck`	Service monitoring, endpoint checks, alerting
Project	`project-best-practices`	Project scaffolding, structure, linting, testing conventions
	`csharp-avalonia`	Cross-platform desktop/mobile with C# and Avalonia UI
	`coding-agent`	Autonomous multi-step coding with planning and validation
	`skill-creator`	Author new SKILL.md files with YAML frontmatter
Creative & 3D	`blender`	Python bpy scripting, CLI rendering, Geometry Nodes
	`unreal-engine`	Remote Control API, Python editor scripting, Movie Render Queue
	`davinci-resolve`	DaVinciResolveScript Python API, color grading, render queue
	`ableton-live`	OSC protocol, MIDI Remote Scripts, Max for Live
Design	`figma`	REST API, Plugin API, design tokens extraction
	`gimp`	Python-Fu / Script-Fu scripting, batch image processing
	`inkscape`	Extensions API, CLI export, SVG manipulation
DevOps & Infra	`kubernetes`	kubectl, Helm, ArgoCD GitOps
	`terraform-ansible`	Terraform IaC + Ansible configuration management
	`grafana-prometheus`	Grafana HTTP API, PromQL, alerting pipelines
	`jenkins-ci`	Jenkins API, Groovy pipelines, shared libraries
Workflow & Data	`n8n`	REST API, webhook triggers, workflow automation
	`databases`	PostgreSQL, MongoDB, Redis CLI and automation
	`game-engines`	Unity C# + Godot GDScript, builds, scene management
Utilities	`summarize`	Text/file/URL summarization with configurable length
	`weather`	Weather lookups via wttr.in and OpenWeatherMap
Media	`image-gen`	Image generation via DALL-E, Stable Diffusion, Midjourney
	`whisper-transcribe`	Audio/video transcription with OpenAI Whisper
	`pdf-tools`	PDF creation, merging, text extraction, conversion
	`screenshot`	Screen capture, annotation, OCR text extraction
	`video-tools`	FFmpeg video editing, conversion, thumbnails, GIFs
	`gif-search`	GIF search via Giphy and Tenor APIs
Communication	`email-tools`	Email send/read via himalaya CLI and SMTP
	`notion`	Notion API for pages, databases, search, content blocks
	`blog-watcher`	RSS/Atom feed monitoring, web page change detection
Smart Home	`spotify`	Spotify playback control via spotify_player and Web API
	`smart-home`	Philips Hue and Home Assistant control

Each skill includes Direct Control (CLI/API/scripting commands), MCP Server Integration (config for .codebuddy/mcp.json), and Common Workflows (step-by-step recipes). Skills are stored in .codebuddy/skills/bundled/ and can be extended with managed or workspace skills via the Skills Registry and Hub.

Self-authoring skills: The agent can extend its own skill set at runtime using the create_skill tool, writing new SKILL.md files to .codebuddy/skills/workspace/. The SkillRegistry hot-reloads them within ~250ms, so newly created skills are immediately available without restarting.

Proactive Notifications

The agent can reach out to you — not just respond:

Push notifications with priority levels (info, warning, critical)
Rate limiting prevents notification spam
Quiet hours — suppress non-critical notifications during configured periods
Multi-channel delivery — notifications route to Telegram, Discord, Slack, or any connected channel
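
The quiet-hours rule above (suppress everything except critical notifications during a configured window) can be sketched like this. The QuietHours shape, the hour-based granularity, and the function names are hypothetical; the three priority levels come from the list above.

```typescript
// Hypothetical sketch: hour-level granularity is an assumption.
type Priority = "info" | "warning" | "critical";

interface QuietHours {
  startHour: number; // 0-23, inclusive
  endHour: number; // 0-23, exclusive
}

// True when `hour` falls inside the quiet window, including windows that wrap past midnight.
function inQuietHours(hour: number, q: QuietHours): boolean {
  if (q.startHour === q.endHour) return false; // empty window
  return q.startHour < q.endHour
    ? hour >= q.startHour && hour < q.endHour
    : hour >= q.startHour || hour < q.endHour;
}

// Critical notifications always go out; everything else waits during quiet hours.
function shouldDeliver(priority: Priority, hour: number, q: QuietHours): boolean {
  return priority === "critical" || !inQuietHours(hour, q);
}
```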

Screen Observer

Monitor your screen and environment for events:

Periodic screenshots with perceptual diff detection
Event triggers — file_change, screen_change, time, webhook
Trigger registry — add/remove triggers dynamically

buddy trigger list             # List all event triggers
buddy trigger add <spec>       # Add a trigger (format: type:condition action:target)
buddy trigger remove <id>      # Remove a trigger

Multi-Channel Messaging

Code Buddy supports 11 messaging channels:

Channel	Features
Terminal	Native CLI interface (Ink/React)
HTTP API	REST + WebSocket
WebChat	Built-in HTTP + WebSocket with browser UI
Discord	Bot integration, slash commands
Telegram	Bot API, pro features, scoped auth, CI watcher
Slack	Bolt framework, events
WhatsApp	Baileys (QR pairing, media, reconnect)
Signal	signal-cli REST API (polling, groups)
Google Chat	Workspace API (JWT auth, webhook events)
Microsoft Teams	Bot Framework (OAuth2, adaptive cards)
Matrix	matrix-js-sdk (E2EE, threads, media)

Telegram (Deep Dive)

Telegram is the most feature-rich channel, giving you full agent capabilities from your phone.

Setup:

Create a bot with @BotFather on Telegram (/newbot)
Configure the token:

export TELEGRAM_BOT_TOKEN=123456:ABC-DEF...

Or in .codebuddy/settings.json:

{
  "channels": {
    "telegram": {
      "type": "telegram",
      "token": "123456:ABC-DEF...",
      "adminUsers": ["your_telegram_user_id"],
      "defaultParseMode": "Markdown"
    }
  }
}

Start Code Buddy with Telegram:

buddy --channel telegram        # Interactive with Telegram
buddy daemon start              # 24/7 background mode

Deployment modes:

Mode	Config	Best for
Polling (default)	No extra config	Development, behind NAT
Webhook	`"webhookUrl": "https://your-domain.com/telegram"`	Production, lower latency

Supported message types: text, images, audio, video, documents, stickers, locations, contacts, inline buttons, reply threads, typing indicators.

What you can do via Telegram:

Category	Capabilities
Remote Coding	Code modifications, bug fixes, refactoring, file analysis, create commits & PRs
Bash Execution	Run build, test, deploy commands — with confirmation for destructive ops
Rich Media	Send images → Gemini Vision analysis, send files (code, logs) → processed by agent
Voice Messages	Send voice notes → STT transcription → agent response
Daemon Mode	24/7 background operation (`buddy daemon start`), cron jobs, proactive alerts
Notifications	Build failures, test results, heartbeat alerts pushed to your Telegram
Interactive	Inline buttons for confirmations, Markdown-formatted responses

Pro features:

Feature	Description
Scoped Authorization	Tiered permissions: `read-only` → `write-patch` → `run-tests` → `deploy`
Diff-First Mode	Preview all code changes before applying — Apply / Full Diff / Cancel buttons
Run Tracker	Step-by-step timeline of agent runs with cost, duration, artifacts
CI Watcher	GitHub Actions / GitLab CI / Jenkins alerts with "Fix it" auto-agent
Secret Handles	Map friendly names to env vars — secrets never enter LLM context
Context Pins	Pin important decisions or facts for the agent to remember

Enhanced commands:

Command	Description
`/repo`	Repository info, recent commits, open PRs
`/branch [name]`	Branch diff stats vs main
`/pr [number]`	List or view PRs with merge/review buttons
`/task <desc>`	Create an agent task with objective
`/runs`	List recent agent runs with timeline
`/run <id>`	View run details with Re-run/Tests/Rollback buttons
`/yolo [minutes]`	Timed full access (1-60 min, auto-revokes)
`/pins`	View pinned context

Example workflows:

Fix CI failure:

CI alert arrives → cause analysis → "Fix it" button
→ agent creates fix → diff preview → Apply/Cancel
→ changes applied → tests re-run

Add feature + tests + PR:

/task "add user search with tests"
→ plan-first preview → approve plan
→ diff-first preview → apply changes
→ agent creates PR → link in chat

DM Pairing (Access Control)

Prevents unauthorized users from consuming API credits:

Unknown user messages the bot → receives a 6-character pairing code (expires in 15 min)
Bot owner approves via CLI: buddy pairing approve --channel telegram ABC123
User is added to the persistent allowlist (~/.codebuddy/credentials/telegram-allowFrom.json)

Security features: rate limiting (5 failed attempts → 1h block), per-channel allowlists, admin bypass.
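
The pairing flow above (6-character code, 15-minute expiry, owner approval, allowlist) can be sketched as a small in-memory store. This is not the Code Buddy implementation: PairingStore, its methods, and the code alphabet are hypothetical, the rate limiting is left out, and a real implementation should draw codes from a cryptographically secure source and persist the allowlist to disk.

```typescript
// Hypothetical sketch: in-memory only, no rate limiting, non-secure default randomness.
const CODE_TTL_MS = 15 * 60 * 1000; // codes expire after 15 minutes
const ALPHABET = "ABCDEFGHJKLMNPQRSTUVWXYZ23456789"; // assumption: no look-alike characters

class PairingStore {
  private pending = new Map<string, { userId: string; expiresAt: number }>();
  readonly allowlist = new Set<string>();
  private now: () => number;
  private random: () => number;

  // `now` and `random` are injectable so the logic can be tested deterministically.
  constructor(now: () => number = () => Date.now(), random: () => number = Math.random) {
    this.now = now;
    this.random = random;
  }

  // Called when an unknown user messages the bot; returns the 6-character code.
  request(userId: string): string {
    let code = "";
    for (let i = 0; i < 6; i++) {
      code += ALPHABET.charAt(Math.floor(this.random() * ALPHABET.length));
    }
    this.pending.set(code, { userId, expiresAt: this.now() + CODE_TTL_MS });
    return code;
  }

  // Called by the bot owner; succeeds only for a known, unexpired code.
  approve(code: string): boolean {
    const entry = this.pending.get(code);
    if (!entry) return false;
    this.pending.delete(code); // codes are single-use, expired or not
    if (this.now() > entry.expiresAt) return false;
    this.allowlist.add(entry.userId);
    return true;
  }
}
```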

Pairing CLI commands:

buddy pairing status             # Show pairing system status
buddy pairing list               # List all approved users
buddy pairing pending            # List pending pairing requests
buddy pairing approve <code>     # Approve a pairing request by code
buddy pairing add <id>           # Manually add a user to the allowlist
buddy pairing revoke <id>        # Revoke access for a user

Other Channels

// Discord
const discord = new DiscordChannel({
  token: process.env.DISCORD_TOKEN,
  allowedGuilds: ['guild-id'],
});
await discord.connect();

// WhatsApp (Baileys, QR pairing)
const whatsapp = new WhatsAppChannel({ dataPath: '~/.codebuddy/whatsapp' });
await whatsapp.connect(); // Scan QR code

// Signal (signal-cli REST API)
const signal = new SignalChannel({ apiUrl: 'http://localhost:8080', phoneNumber: '+1234567890' });
await signal.connect();

// Matrix (E2EE, threads)
const matrix = new MatrixChannel({ homeserverUrl: 'https://matrix.org', accessToken: '...' });
await matrix.connect();

Autonomous Agent

Daemon Mode

Run Code Buddy 24/7 in the background:

buddy daemon start [--detach]  # Start background daemon
buddy daemon stop              # Stop daemon
buddy daemon restart           # Restart daemon
buddy daemon status            # Show daemon status and services
buddy daemon logs [--lines N]  # View daemon logs

Features:

PID file management with stale detection
Auto-restart on crash (max 3 retries)
Service registry and health monitoring (CPU, memory)
Heartbeat engine — periodic agent wake with HEARTBEAT.md checklist, smart suppression, active hours

buddy heartbeat start          # Start the heartbeat engine
buddy heartbeat stop           # Stop the heartbeat engine
buddy heartbeat status         # Show heartbeat status
buddy heartbeat tick           # Manually trigger a single tick

Multi-Agent Orchestration

The SupervisorAgent coordinates multiple agent instances:

Strategies — sequential, parallel, race, all
Shared context — thread-safe key-value store with optimistic locking
Self-healing — error pattern recognition (6 built-in patterns), auto-recovery with exponential backoff
Checkpoint rollback — auto-checkpoint before risky ops, rollback to last good state
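
Optimistic locking, mentioned for the shared context above, usually means each write states which version it was based on and is rejected if the entry has changed since. The sketch below shows that pattern in general terms; SharedContext, Versioned, and the version-0 convention are hypothetical and not the SupervisorAgent API.

```typescript
// Hypothetical sketch of a versioned key-value store with compare-and-set writes.
interface Versioned<T> {
  value: T;
  version: number;
}

class SharedContext<T> {
  private entries = new Map<string, Versioned<T>>();

  read(key: string): Versioned<T> | undefined {
    return this.entries.get(key);
  }

  // Writes only if nobody else wrote since `expectedVersion` was read.
  // An expected version of 0 means "the key must not exist yet".
  write(key: string, value: T, expectedVersion: number): boolean {
    const currentVersion = this.entries.get(key)?.version ?? 0;
    if (currentVersion !== expectedVersion) return false; // conflict: re-read and retry
    this.entries.set(key, { value, version: currentVersion + 1 });
    return true;
  }
}
```

When two agents read the same version and both try to write, the first write wins and the second gets false, so it can re-read the new value before retrying.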

YOLO Mode (Autonomous Execution)

Full autonomy with built-in guardrails for safe unattended operation:

# Enable via in-chat slash commands
/yolo on           # Enable (50 auto-edits, 100 auto-commands)
/yolo safe         # Restricted mode (20 edits, 30 commands, limited paths)
/yolo off          # Disable
/yolo status       # Show limits, counters, allow/deny lists

# Or via environment
YOLO_MODE=true buddy   # Still requires /yolo on confirmation in chat

What changes in YOLO mode:

Setting	Normal	YOLO
Tool rounds	50	400
Cost limit	$10	$100 (cap $1,000)
File edits	Confirm each	Auto-approve (up to limit)
Bash commands	Confirm each	Auto-execute safe commands

Autonomy levels (fine-grained control):

/autonomy suggest   # Confirm everything
/autonomy confirm   # Confirm important ops (default)
/autonomy auto      # Auto-approve safe ops, confirm dangerous
/autonomy full      # Auto-approve all except critical
/autonomy yolo      # Full auto with guardrails

Customize allow/deny lists:

/yolo allow "npm run dev"      # Add to auto-execute list
/yolo deny "docker rm -f"      # Block a command pattern

Built-in guardrails (always active, even in YOLO):

Blocked paths: .env, .git, node_modules, *.pem, *.key, credentials
Blocked commands: rm -rf /, sudo, git push --force origin main, DROP DATABASE
Per-session limits on edits and commands
Hard cost cap ($1,000 max even with MAX_COST override)

Cron & Scheduling

The Cron-Agent Bridge connects the scheduler to CodeBuddyAgent instances for recurring tasks:

buddy trigger add time:*/30 action:run-tests    # Run tests every 30 min
buddy trigger add webhook:deploy action:notify   # Notify on deploy webhook

Webhook triggers use HMAC-SHA256 verification with template placeholders for flexible integration.
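
As an illustration of HMAC-SHA256 verification, the sketch below checks a hex-encoded signature against a request body using Node's `crypto` module. The function name `verifyWebhookSignature`, the hex encoding, and the way the signature reaches the server are assumptions; Code Buddy's actual header names and payload format are not described here.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Hypothetical helper: true when `signatureHex` is the HMAC-SHA256 of `body` under `secret`.
function verifyWebhookSignature(body: string, signatureHex: string, secret: string): boolean {
  const expected = createHmac("sha256", secret).update(body).digest();
  const received = Buffer.from(signatureHex, "hex");
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  if (received.length !== expected.length) return false;
  return timingSafeEqual(received, expected);
}

const secret = "example-secret";
const body = JSON.stringify({ event: "deploy", status: "ok" });
const goodSignature = createHmac("sha256", secret).update(body).digest("hex");

console.log(verifyWebhookSignature(body, goodSignature, secret));       // valid signature
console.log(verifyWebhookSignature(body + " ", goodSignature, secret)); // tampered body
```

The constant-time comparison avoids leaking how many leading bytes of a forged signature were correct.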

AI Providers

Code Buddy supports multiple AI providers with automatic failover:

Provider	Models	Context	Configuration
Grok (xAI)	grok-4, grok-code-fast-1	128K	`GROK_API_KEY`
Claude (Anthropic)	claude-sonnet-4, opus	200K	`ANTHROPIC_API_KEY`
ChatGPT (OpenAI)	gpt-4o, gpt-4-turbo	128K	`OPENAI_API_KEY`
Gemini (Google)	gemini-2.0-flash (+ vision)	2M	`GOOGLE_API_KEY`
LM Studio	Any local model	Varies	`--base-url http://localhost:1234/v1`
Ollama	llama3, codellama, etc.	Varies	`--base-url http://localhost:11434/v1`

Model failover chain — cascading provider fallback with health tracking and cooldown periods.
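
One way to picture a failover chain with cooldowns is sketched below. It is an assumption-laden illustration, not Code Buddy's implementation: the `Provider` type, the `FailoverChain` class, and the 60-second default cooldown are all hypothetical.

```typescript
// Hypothetical sketch of cascading failover with per-provider cooldowns.
type Provider = { name: string; call: (prompt: string) => Promise<string> };

class FailoverChain {
  private readonly cooldownUntil = new Map<string, number>();
  private readonly providers: Provider[];
  private readonly cooldownMs: number;

  constructor(providers: Provider[], cooldownMs = 60_000) {
    this.providers = providers;
    this.cooldownMs = cooldownMs;
  }

  async complete(prompt: string): Promise<string> {
    for (const provider of this.providers) {
      // Skip providers that failed recently and are still cooling down.
      if ((this.cooldownUntil.get(provider.name) ?? 0) > Date.now()) continue;
      try {
        return await provider.call(prompt);
      } catch {
        // Mark the provider unhealthy and fall through to the next one.
        this.cooldownUntil.set(provider.name, Date.now() + this.cooldownMs);
      }
    }
    throw new Error("All providers failed or are cooling down");
  }
}

// Stand-in providers; real ones would call an HTTP API.
const chain = new FailoverChain([
  { name: "primary", call: async () => { throw new Error("rate limited"); } },
  { name: "backup", call: async (prompt) => `backup:${prompt}` },
]);

chain.complete("hello").then((text) => console.log(text));
```

Providers are tried in list order, so the order of the array expresses priority.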

Connection Profiles

# Use LM Studio (local)
buddy --base-url http://localhost:1234/v1 --api-key lm-studio

# Use Ollama (local)
buddy --base-url http://localhost:11434/v1 --model llama3

# Use a specific model
buddy --model grok-code-fast-1

Profile configuration in ~/.codebuddy/user-settings.json:

{
  "connection": {
    "activeProfileId": "grok",
    "profiles": [
      {
        "id": "grok",
        "name": "Grok API (xAI)",
        "provider": "grok",
        "baseURL": "https://api.x.ai/v1",
        "model": "grok-4-latest"
      },
      {
        "id": "lmstudio",
        "name": "LM Studio Local",
        "provider": "lmstudio",
        "baseURL": "http://localhost:1234/v1",
        "apiKey": "lm-studio"
      }
    ]
  }
}

Auth profile manager — API key rotation (round-robin/priority/random strategies), session stickiness, exponential backoff on failures.

buddy auth-profile list                   # List authentication profiles
buddy auth-profile add <id> <provider>    # Add a profile
buddy auth-profile remove <id>            # Remove a profile
buddy auth-profile reset                  # Reset all cooldowns
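
The round-robin strategy combined with exponential backoff might look roughly like the sketch below. The `KeyRotator` class, the 1-second base delay, and the 60-second cap are hypothetical values picked for the example, not documented defaults.

```typescript
// Hypothetical sketch: round-robin key rotation with exponential backoff.
interface KeyState {
  id: string;
  failures: number;   // consecutive failures
  retryAfter: number; // timestamp (ms) before which the key is skipped
}

const BASE_DELAY_MS = 1_000;
const MAX_DELAY_MS = 60_000;

// Delay doubles with each consecutive failure, capped at MAX_DELAY_MS.
function backoffDelay(failures: number): number {
  return Math.min(MAX_DELAY_MS, BASE_DELAY_MS * 2 ** (failures - 1));
}

class KeyRotator {
  private cursor = 0;
  private readonly keys: KeyState[];

  constructor(ids: string[]) {
    this.keys = ids.map((id) => ({ id, failures: 0, retryAfter: 0 }));
  }

  // Return the next key that is not cooling down, or undefined if none is usable.
  next(now: number): string | undefined {
    for (let i = 0; i < this.keys.length; i++) {
      const key = this.keys[(this.cursor + i) % this.keys.length];
      if (key.retryAfter <= now) {
        this.cursor = (this.cursor + i + 1) % this.keys.length;
        return key.id;
      }
    }
    return undefined;
  }

  reportFailure(id: string, now: number): void {
    const key = this.keys.find((k) => k.id === id);
    if (!key) return;
    key.failures += 1;
    key.retryAfter = now + backoffDelay(key.failures);
  }

  // Clear all cooldowns, similar in spirit to `buddy auth-profile reset`.
  reset(): void {
    for (const key of this.keys) {
      key.failures = 0;
      key.retryAfter = 0;
    }
  }
}

const rotator = new KeyRotator(["key-a", "key-b"]);
const picks: Array<string | undefined> = [];
picks.push(rotator.next(0));       // key-a
rotator.reportFailure("key-a", 0); // key-a is skipped until t = 1000 ms
picks.push(rotator.next(10));      // key-b
picks.push(rotator.next(20));      // key-b (key-a still cooling down)
picks.push(rotator.next(1_000));   // key-a is usable again
console.log(picks);
```

Timestamps are passed in explicitly so the behavior is easy to test without waiting on a real clock.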

Security & Trust

Tool Policy & Bash Allowlist

Fine-grained control over what tools the agent can use:

// Tool-level allow/deny
const policy = new ToolPolicy({
  allowlist: ['read_file', 'search', 'web_fetch'],
  denylist: ['bash', 'write_file'],
  requireConfirmation: ['delete_file'],
});

// Bash command patterns
const bashPolicy = new BashAllowlist({
  patterns: [/^npm (install|test|run)/, /^git (status|diff|log)/],
  blocked: [/rm -rf/, /sudo/, /curl.*\|.*sh/],
});

Security Modes

Mode	Description
`suggest`	Confirm all operations
`auto-edit`	Auto-approve safe ops
`full-auto`	Full autonomy (YOLO)

/mode suggest    # Maximum safety
/mode full-auto  # Full autonomy

Trust Folders & Agent Profiles

Trust folders — directory-level tool permissions via .codebuddy-trust.json
Agent profiles — predefined configs: secure (read-only), minimal, power-user
Per-model tool config — capabilities, context window, and patch format per model family

OS Sandbox — Workspace-Write Mode

Three sandbox tiers for native OS-level isolation (Codex-inspired):

Mode	Write Access	Use Case
`read-only`	None	Untrusted analysis tasks
`workspace-write`	Git workspace root only	Normal development (default)
`danger-full-access`	Unrestricted	Deployment/release scripts

.git, .codebuddy, .ssh, .gnupg, .aws are always read-only regardless of mode.

const sandbox = await createSandboxForMode('workspace-write', '/my/project');
await sandbox.exec('npm', ['test']);

Exec Policy — Prefix Rules

Codex-inspired command authorization with token-array prefix matching (safer than regex because it is not fooled by quoting or encoding tricks):

buddy execpolicy check "git push --force"          # evaluate a shell string
buddy execpolicy check-argv git push --force       # token-array (prefix rules first)
buddy execpolicy add-prefix git push --action deny # block git push with longest-match
buddy execpolicy dashboard                         # full policy overview
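
To show why token-array prefix rules are harder to fool than string patterns, here is a minimal sketch of longest-match evaluation. The `PrefixRule` shape, the `evaluate` function, and the `allow`/`ask` actions are assumptions; only the `deny` action appears in the commands above.

```typescript
// Hypothetical sketch of token-array prefix rules with longest-match resolution.
type Action = "allow" | "deny" | "ask";

interface PrefixRule {
  prefix: string[];
  action: Action;
}

function evaluate(argv: string[], rules: PrefixRule[], fallback: Action = "ask"): Action {
  let best: PrefixRule | undefined;
  for (const rule of rules) {
    // Tokens are compared whole, so "gitx" never matches a rule for "git".
    const matches =
      rule.prefix.length <= argv.length &&
      rule.prefix.every((token, i) => token === argv[i]);
    // Prefer the most specific (longest) matching prefix.
    if (matches && (!best || rule.prefix.length > best.prefix.length)) best = rule;
  }
  return best ? best.action : fallback;
}

const rules: PrefixRule[] = [
  { prefix: ["git"], action: "allow" },
  { prefix: ["git", "push"], action: "deny" },
];

console.log(evaluate(["git", "status"], rules));          // "allow"
console.log(evaluate(["git", "push", "--force"], rules)); // "deny"
console.log(evaluate(["gitx", "push"], rules));           // "ask"
```

Because the input is already split into tokens, there is no shell string for an attacker to smuggle quotes or escapes into.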

SSRF Guard

Comprehensive Server-Side Request Forgery protection on all outbound HTTP calls:

Blocks RFC-1918 private ranges + loopback + link-local
Blocks IPv4 bypass vectors: octal (0177.0.0.1), hex (0x7f000001), short form (127.1)
Blocks IPv6 transition addresses: NAT64 (64:ff9b::/96), 6to4, Teredo, IPv4-mapped (::ffff:127.0.0.1)
Async DNS resolution check before every fetch
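
The IPv4 bypass vectors listed above all rely on alternative spellings of the same address. The sketch below normalizes such literals the way `inet_aton` does and then tests them against the private, loopback, and link-local ranges. It is a simplified illustration with hypothetical function names; it does not cover IPv6 or DNS resolution and is not the project's guard.

```typescript
// Hypothetical sketch: parse IPv4 literals like inet_aton, then test blocked ranges.
function parseIPv4(host: string): number | null {
  const parts = host.split(".");
  if (parts.length > 4) return null;
  const nums: number[] = [];
  for (const part of parts) {
    let n: number;
    if (/^0x[0-9a-f]+$/i.test(part)) n = parseInt(part.slice(2), 16); // hex
    else if (/^0[0-7]*$/.test(part)) n = parseInt(part, 8);           // octal (or plain 0)
    else if (/^[1-9][0-9]*$/.test(part)) n = parseInt(part, 10);      // decimal
    else return null;                                                 // not an IPv4 literal
    nums.push(n);
  }
  const last = nums.pop() as number;
  // Leading parts are single bytes; the last part fills all remaining bytes.
  if (nums.some((n) => n > 255)) return null;
  if (last > 2 ** (8 * (4 - nums.length)) - 1) return null;
  let addr = last;
  nums.forEach((n, i) => { addr += n * 2 ** (8 * (3 - i)); });
  return addr;
}

const BLOCKED_RANGES: Array<[string, number]> = [
  ["10.0.0.0", 8], ["172.16.0.0", 12], ["192.168.0.0", 16], // RFC 1918 private ranges
  ["127.0.0.0", 8],                                         // loopback
  ["169.254.0.0", 16],                                      // link-local
];

function isBlockedIPv4(host: string): boolean {
  const addr = parseIPv4(host);
  if (addr === null) return false; // a real guard would resolve the hostname next
  return BLOCKED_RANGES.some(([base, bits]) => {
    const start = parseIPv4(base) as number;
    return addr >= start && addr < start + 2 ** (32 - bits);
  });
}

for (const host of ["0177.0.0.1", "0x7f000001", "127.1", "192.168.1.10", "8.8.8.8"]) {
  console.log(host, isBlockedIPv4(host));
}
```

Plain arithmetic is used instead of bitwise operators because JavaScript bit operations work on signed 32-bit integers, which would turn high addresses negative.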

Docker Sandbox

Containerized command execution for untrusted operations:

const sandbox = new DockerSandbox({
  image: 'codebuddy/sandbox:latest',
  memoryLimit: '512m',
  networkMode: 'none',
  timeout: 30000,
});

The auto-sandbox router sends potentially dangerous commands (npm, pip, cargo, make) to Docker when it is available.

Safety Rails

Rail	Description
Diff-First Mode	All code changes are previewed before applying. Users see file summaries, line counts, and can view the full unified diff.
Plan-First Mode	Multi-step tasks show the execution plan for approval before any changes are made.
Scoped Permissions	Users get only the access they need: `read-only` → `write-patch` → `run-tests` → `deploy`.
Audit Trail	Every tool execution, confirmation, and security decision is logged.
Secret Handles	API tokens and credentials are referenced by handle name only — actual values are resolved from env vars at runtime, never exposed to the LLM context.
2-Step Confirmation	Risky operations (rollback, deploy) require double confirmation with a 2-minute timeout window.
Timed YOLO	`/yolo` grants temporary full access that auto-revokes after the specified duration.
DM Pairing	Unknown users must be approved before they can interact with the bot.
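
The Secret Handles rail can be illustrated with a small resolver that swaps handles for real values only at execution time. The `{{secret:NAME}}` syntax and the `resolveSecrets` function are hypothetical; the actual handle format used by Code Buddy may differ.

```typescript
// Hypothetical sketch: the model only ever sees handles such as {{secret:GITHUB_TOKEN}}.
const HANDLE_PATTERN = /\{\{secret:([A-Z0-9_]+)\}\}/g;

// Replace handles with real values just before execution, outside the LLM context.
function resolveSecrets(template: string, env: Record<string, string | undefined>): string {
  return template.replace(HANDLE_PATTERN, (_match: string, name: string) => {
    const value = env[name];
    if (value === undefined) throw new Error(`Unknown secret handle: ${name}`);
    return value;
  });
}

// A stand-in for process.env so the example is deterministic.
const fakeEnv = { GITHUB_TOKEN: "example-token-value" };
const header = resolveSecrets("Authorization: Bearer {{secret:GITHUB_TOKEN}}", fakeEnv);
console.log(header);
```

Failing loudly on an unknown handle avoids sending a literal placeholder to a remote service.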

Architecture

Facade Architecture

CodeBuddyAgent
    │
    ├── AgentContextFacade      # Context window and memory management
    │       - Token counting, compression, memory retrieval
    │
    ├── SessionFacade           # Session persistence and checkpoints
    │       - Save/load, checkpoint creation, rewind
    │
    ├── ModelRoutingFacade      # Model routing and cost tracking
    │       - Provider selection, cost calculation
    │
    ├── InfrastructureFacade    # MCP, sandbox, hooks, plugins
    │       - Hook execution, plugin loading
    │
    └── MessageHistoryManager   # Chat and LLM message history

Autonomy Layer

CodeBuddyAgent
    │
    ├── TaskPlanner             # DAG decomposition of complex requests
    │       - needsPlanning() heuristic
    │       - createPlan() → TaskGraph → parallel execution
    │
    ├── SupervisorAgent         # Multi-agent orchestration
    │       - Sequential, parallel, race, all strategies
    │       - SharedContext with optimistic locking
    │
    ├── MiddlewarePipeline      # Composable before/after turn hooks
    │       - ReasoningMiddleware (priority 42)
    │       - WorkflowGuardMiddleware (priority 45)
    │       - AutoRepairMiddleware (priority 150)
    │       - QualityGateMiddleware (priority 200)
    │
    ├── SelfHealing             # Automatic error recovery
    │       - Pattern recognition (6 built-in patterns)
    │       - Retry with exponential backoff
    │
    ├── ScreenObserver          # Environment monitoring
    │       - Periodic screenshots with perceptual diff
    │       - Event triggers (file_change, screen_change, time, webhook)
    │
    ├── ProactiveAgent          # Agent-initiated communication
    │       - Push notifications with priority levels
    │       - Rate limiting and quiet hours
    │
    └── DaemonManager           # Background process lifecycle
            - PID file management, auto-restart
            - Service registry, health monitoring
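
The TaskPlanner step from a task graph to parallel execution can be sketched as grouping tasks into waves, where every task in a wave has all of its dependencies satisfied by earlier waves. The `Task` shape and `planWaves` function are hypothetical and only illustrate the idea.

```typescript
// Hypothetical sketch: group a task DAG into waves that can run in parallel.
interface Task {
  id: string;
  dependsOn: string[];
}

function planWaves(tasks: Task[]): string[][] {
  const remaining = new Map<string, Set<string>>();
  for (const task of tasks) remaining.set(task.id, new Set(task.dependsOn));

  const waves: string[][] = [];
  while (remaining.size > 0) {
    // A task is ready once all of its dependencies have completed.
    const ready: string[] = [];
    remaining.forEach((deps, id) => { if (deps.size === 0) ready.push(id); });
    if (ready.length === 0) throw new Error("Cycle or missing dependency in task graph");

    for (const id of ready) remaining.delete(id);
    remaining.forEach((deps) => ready.forEach((id) => deps.delete(id)));
    waves.push(ready);
  }
  return waves;
}

const waves = planWaves([
  { id: "schema", dependsOn: [] },
  { id: "api", dependsOn: ["schema"] },
  { id: "ui", dependsOn: ["schema"] },
  { id: "tests", dependsOn: ["api", "ui"] },
]);
console.log(waves); // [["schema"], ["api", "ui"], ["tests"]]
```

Each inner array could then be handed to something like `Promise.all`, with the next wave starting once the previous one settles.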

Core Flow

User Input → ChatInterface (Ink/React) → CodeBuddyAgent → AI Provider
                                              │
                                         Tool Calls (max 50/400 rounds)
                                              │
                                      Tool Execution + Confirmation
                                              │
                                        Results back to API (loop)

API Server & Integrations

REST API

buddy server --port 3000

Endpoint	Method	Description
`/api/health`	GET	Health check
`/api/metrics`	GET	Prometheus metrics
`/api/chat`	POST	Chat completion
`/api/chat/completions`	POST	OpenAI-compatible
`/api/tools`	GET	List tools
`/api/tools/{name}/execute`	POST	Execute tool
`/api/sessions`	GET/POST	Session management
`/api/memory`	GET/POST	Memory entries
`/api/daemon/status`	GET	Daemon status
`/api/daemon/health`	GET	Health metrics (CPU, memory)
`/api/cron/jobs`	GET	List cron jobs
`/api/cron/jobs/{id}/trigger`	POST	Trigger a cron job
`/api/notifications/preferences`	GET/POST	Notification settings
`/api/heartbeat/status`	GET	Heartbeat engine status
`/api/heartbeat/start\|stop\|tick`	POST	Heartbeat control
`/api/hub/search?q=...`	GET	Search skills marketplace
`/api/hub/installed`	GET	List installed hub skills
`/api/hub/install`	POST	Install a skill
`/api/hub/{name}`	DELETE	Uninstall a skill
`/api/identity`	GET	List loaded identity files
`/api/identity/prompt`	GET	Combined identity prompt
`/api/identity/{name}`	PUT	Update an identity file
`/api/groups/status\|list`	GET	Group security status/config
`/api/groups/block`	POST	Block a user globally
`/api/groups/block/{userId}`	DELETE	Unblock a user
`/api/auth-profiles`	GET/POST/DELETE	Auth profile CRUD
`/api/auth-profiles/reset`	POST	Reset all cooldowns

WebSocket Events

const ws = new WebSocket('ws://localhost:3000/ws');

// Send only after the connection is open; calling send() earlier throws.
ws.addEventListener('open', () => {
  // Authenticate
  ws.send(JSON.stringify({
    type: 'authenticate',
    payload: { token: 'jwt-token' }
  }));

  // Stream chat
  ws.send(JSON.stringify({
    type: 'chat_stream',
    payload: { messages: [{ role: 'user', content: 'Hello' }] }
  }));
});

MCP Servers

Four MCP servers are pre-configured (disabled by default):

buddy mcp add brave-search    # Brave Web Search (needs BRAVE_API_KEY)
buddy mcp add playwright      # Browser automation (no key needed)
buddy mcp add exa-search      # Exa neural search (needs EXA_API_KEY)
buddy mcp add icm             # Infinite Context Memory (needs `cargo install icm`)
buddy mcp list                # Show all configured servers

Plugin System

Plugins extend Code Buddy with custom tools, commands, and providers:

~/.codebuddy/plugins/
  my-plugin/
    manifest.json
    index.js

Plugin types: Tool, Provider (LLM/embedding/search), Command, Hook

const plugin: Plugin = {
  async activate(context: PluginContext) {
    context.registerTool({
      name: 'my_tool',
      description: 'Custom tool',
      execute: async (args) => {
        return { success: true, output: 'Done!' };
      }
    });

    context.registerProvider({
      id: 'my-llm',
      type: 'llm',
      async chat(messages) { return 'response'; }
    });
  }
};

Extensions

Manifest-based extension system with lifecycle hooks and config schema. Extensions live in .codebuddy/extensions/.

Copilot Proxy

IDE-compatible completions backend — serves /v1/completions with bearer auth, per-IP rate limiting, and token clamping.

External Tools (RTK & ICM)

Tool	Install	Purpose
RTK	`cargo install --git https://github.com/rtk-ai/rtk`	CLI proxy that wraps commands to reduce LLM token usage 60-90%
ICM	`cargo install --git https://github.com/rtk-ai/icm`	MCP server for persistent cross-session memory

RTK is automatically integrated via a before-hook — supported bash commands are prefixed with rtk transparently. Configure in .codebuddy/config.toml under [integrations].

CLI Reference

Global Options

Flag	Short	Description	Default
`--version`	`-V`	Show version number	-
`--directory <dir>`	`-d`	Set working directory	`.`
`--api-key <key>`	`-k`	API key (or `GROK_API_KEY` env)	-
`--base-url <url>`	`-u`	API base URL (or `GROK_BASE_URL` env)	-
`--model <model>`	`-m`	AI model to use (or `GROK_MODEL` env)	auto-detect
`--prompt <prompt>`	`-p`	Single prompt, headless mode	-
`--browser`	`-b`	Launch browser UI instead of terminal	`false`
`--max-tool-rounds <n>`		Max tool execution rounds	`400`
`--security-mode <mode>`	`-s`	`suggest`, `auto-edit`, or `full-auto`	`suggest`
`--output-format <fmt>`	`-o`	Headless output: `json`, `stream-json`, `text`, `markdown`	`json`
`--context <patterns>`	`-c`	Glob patterns to load into context	-

Session & Cost

Flag	Description	Default
`--continue`	Resume the most recent saved session	-
`--resume <id>`	Resume a specific session (supports partial ID matching)	-
`--max-price <dollars>`	Maximum cost in dollars before stopping	`10.0`
`--no-cache`	Disable response caching	-

Autonomy & Permissions

Flag	Description	Default
`--auto-approve`	Automatically approve all tool executions	`false`
`--dangerously-skip-permissions`	Bypass all permission checks (trusted containers only)	`false`
`--no-self-heal`	Disable self-healing auto-correction	-
`--allow-outside`	Allow file operations outside workspace directory	`false`

Tool Control

Flag	Description	Example
`--force-tools`	Force-enable function calling for local models	-
`--probe-tools`	Auto-detect tool support at startup	-
`--enabled-tools <patterns>`	Only enable matching tools (glob, comma-separated)	`bash,file,search`
`--disabled-tools <patterns>`	Disable matching tools (glob, comma-separated)	`bash,web_*`
`--allowed-tools <patterns>`	Alias for `--enabled-tools` (Claude Code compat)	-

Agent & Prompt Configuration

Flag	Description	Default
`--system-prompt <id>`	System prompt: `default`, `minimal`, `secure`, `code-reviewer`, `architect` (or custom from `~/.codebuddy/prompts/`)	`default`
`--list-prompts`	List available system prompts and exit	-
`--agent <name>`	Use a custom agent from `~/.codebuddy/agents/`	-
`--list-agents`	List available custom agents and exit	-

Display & Debugging

Flag	Description
`--plain`	Minimal formatting (plain text output)
`--no-color`	Disable colored output
`--no-emoji`	Disable emoji in output
`--vim`	Enable Vim keybindings for input
`--mcp-debug`	Enable MCP protocol debugging output

Setup & Init

Flag	Description
`--init`	Initialize `.codebuddy/` directory with templates
`--dry-run`	Preview changes without applying (simulation mode)
`--setup`	Run interactive API key setup wizard
`--list-models`	List available models from the API and exit

Commands

Slash Commands (In-Chat)

Command	Description
`/help`	Show help
`/model [name]`	Change model
`/mode [mode]`	Change security mode
`/profile [id]`	Switch connection profile
`/think`	Enable reasoning (4K tokens)
`/megathink`	Deep reasoning (10K tokens)
`/ultrathink`	Exhaustive reasoning (32K tokens)
`/cost`	Show cost dashboard
`/memory`	Memory management
`/hooks list`	List lifecycle hooks
`/plugin list`	List plugins
`/speak <text>`	Speak text with current TTS provider
`/tts on\|off\|auto`	TTS control
`/yolo on\|off\|safe\|status`	YOLO mode control
`/autonomy suggest\|confirm\|auto\|full\|yolo`	Autonomy level

CLI Subcommands

# Dev Workflows
buddy dev plan "<objective>"       # Profile repo + produce task plan
buddy dev run "<objective>"        # Plan + implement + test + artifacts
buddy dev pr "<objective>"         # Dev run + generate PR summary
buddy dev fix-ci [--log <file>]    # Read CI logs + propose patch
buddy dev issue <url-or-number>    # GitHub issue → branch → code → tests → PR
buddy dev explain                  # Summarise repo conventions

# Daemon
buddy daemon start|stop|restart|status|logs

# Triggers
buddy trigger list|add|remove

# Webhooks
buddy webhook list|add|remove

# Skills Hub
buddy hub search|install|uninstall|update|list|info|publish|sync

# Heartbeat
buddy heartbeat start|stop|status|tick

# Identity
buddy identity show|get|set|prompt

# Groups
buddy groups status|list|block|unblock

# Auth Profiles
buddy auth-profile list|add|remove|reset

# Devices
buddy device list|pair|remove|snap|screenshot|record|run

# Config
buddy config show|validate|get

# Security
buddy security-audit [--deep] [--fix] [--json]

# Voice
buddy speak [text] [--voice <name>] [--list-voices] [--speed <n>] [--format <fmt>]

# Knowledge Base
buddy knowledge list|show|search|add|remove|context

# DM Pairing
buddy pairing status|list|pending|approve <code>|add <id>|revoke <id>

# Wide Research
buddy research "<topic>" [--workers N] [--rounds N] [--output file.md]

# Task List (todo.md attention bias — injected at end of every agent turn)
buddy todo list                     # Show all items
buddy todo add "task description" [-p high|medium|low]
buddy todo done <id>                # Mark completed
buddy todo update <id> [-s in_progress] [-t "new text"]
buddy todo remove <id>              # Delete item
buddy todo clear-done               # Remove all completed
buddy todo context                  # Preview the block injected into the agent

# Lessons (self-improvement loop — injected before every agent turn)
buddy lessons list [--category PATTERN|RULE|CONTEXT|INSIGHT]
buddy lessons add "what went wrong → correct approach" --category PATTERN
buddy lessons search "tsc"                 # Find relevant lessons before a task
buddy lessons clear [--category RULE] --yes
buddy lessons context                      # Preview the <lessons_context> block

# Setup
buddy onboard          # Interactive setup wizard
buddy doctor           # Environment diagnostics

Configuration

Environment Variables

Variable	Description	Default
`GROK_API_KEY`	xAI API key	Required
`ANTHROPIC_API_KEY`	Anthropic API key	-
`OPENAI_API_KEY`	OpenAI API key	-
`GOOGLE_API_KEY`	Google AI API key	-
`SERPER_API_KEY`	Web search API key	-
`GROK_BASE_URL`	Custom API endpoint	-
`GROK_MODEL`	Default model	-
`BRAVE_API_KEY`	Brave Search API key	-
`EXA_API_KEY`	Exa neural search API key	-
`PERPLEXITY_API_KEY`	Perplexity AI search key (`pplx-...`)	-
`OPENROUTER_API_KEY`	OpenRouter key for Perplexity (`sk-or-...`)	-
`PERPLEXITY_MODEL`	Perplexity model	`perplexity/sonar-pro`
`PICOVOICE_ACCESS_KEY`	Porcupine wake word detection	-
`CACHE_TRACE`	Debug prompt construction stages	`false`
`YOLO_MODE`	Full autonomy	`false`
`MAX_COST`	Cost limit ($)	`10`
`JWT_SECRET`	API server auth	Required in prod
`TELEGRAM_BOT_TOKEN`	Telegram bot token (from @BotFather)	-
`DISCORD_TOKEN`	Discord bot token	-
`SLACK_BOT_TOKEN`	Slack bot token	-

Project Settings

Create .codebuddy/settings.json:

{
  "systemPrompt": "You are working on a TypeScript project.",
  "tools": {
    "enabled": ["read_file", "search", "bash"],
    "disabled": ["web_search"]
  },
  "security": {
    "mode": "auto-edit",
    "bashAllowlist": ["npm *", "git *"]
  }
}

Development

# Clone and install
git clone https://github.com/phuetz/code-buddy.git
cd code-buddy
npm install

# Development mode
npm run dev

# Run tests
npm test

# Validate before commit
npm run validate

# Build
npm run build

Test Coverage

24,000+ tests across 590+ suites covering:
- Core: Tool Policy, Bash Allowlist, Context Window Guard, Compaction
- Agent: Middleware Pipeline, Profiles, Reasoning, Streaming
- Autonomy: Daemon, Cron Bridge, Task Planner, Delegation Engine
- Observation: Screen Observer, Triggers, Proactive Notifications
- Orchestration: Supervisor, Shared Context, Self-Healing, Rollback
- Providers: Gemini (vision + conversation), OpenAI-compat, Failover
- Security: Trust Folders, Skill Scanner, Bash Parser, Session Locks
- Infrastructure: MCP Client, Webhooks, Extensions, ACP Protocol, RTK Compressor, ICM Bridge
- Reasoning: Tree-of-Thought, MCTS, Reasoning Facade, /think command
- Memory: Decision Memory, Coding Style Analyzer, Importance Scorer
- Middleware: Auto-Repair, Quality Gates, Reasoning, Workflow Guard
- Voice: Wake Word, TTS Providers, Voice Control Loop
- UI: ChatHistory, ChatInterface, TabbedQuestion

Research & Inspiration

Code Buddy implements techniques from academic research and draws architectural inspiration from leading open-source projects.

Scientific Papers

Reasoning & Planning:

Paper	Reference	Implementation
Tree of Thoughts	Yao et al., 2023 — arXiv:2305.10601	`src/agent/reasoning/tree-of-thought.ts`
RethinkMCTS	Zhang et al., 2024 — arXiv:2409.09584	`src/agent/reasoning/mcts.ts`
TALE (Token-Budget-Aware Reasoning)	arXiv:2412.18547	`src/agent/token-budget-reasoning.ts` — 68.9% token reduction
FrugalGPT	Stanford, 2023 — arXiv:2305.05176	`src/optimization/model-routing.ts` — 30-70% cost reduction
LLMCompiler	arXiv:2312.04511	`src/optimization/parallel-executor.ts` — 2.5-4.6x speedup

Program Repair:

Paper	Reference	Implementation
ChatRepair	Xia et al., ISSTA 2024 — arXiv:2403.12538	`src/agent/repair/iterative-repair.ts`
ITER	arXiv:2403.00418	`src/agent/repair/repair-templates.ts` — iterative template repair
RepairAgent	ICSE 2024	`src/agent/repair/repair-engine.ts` — autonomous LLM-based repair
AgentCoder	Huang et al., 2023	`src/agent/multi-agent/multi-agent-system.ts` — hierarchical multi-agent code generation

RAG & Context Management:

Paper	Reference	Implementation
CodeRAG	arXiv:2509.16112	`src/context/multi-path-retrieval.ts`, `src/context/dependency-aware-rag.ts`
RAG-MCP	arXiv:2505.03275	`src/tools/tool-selector.ts`
ToolLLM	ICLR'24 — arXiv:2307.16789	`src/agent/execution/tool-selection-strategy.ts`
Comprehensive RAG Survey	arXiv:2506.00054	`src/context/codebase-rag/codebase-rag.ts`
Recurrent Context Compression	arXiv:2406.06110	`src/context/context-manager-v2.ts`

Observation & Optimization:

Paper	Reference	Implementation
JetBrains Context Management	JetBrains Research, 2024	`src/context/observation-masking.ts` — -7% cost, +2.6% success
Complexity Trap	arXiv:2508.21433	`src/context/observation-masking.ts`
Less-is-More (Tool Filtering)	arXiv, 2024	`src/optimization/tool-filtering.ts` — 70% execution time reduction
The Prompt Report	arXiv:2406.06608	`src/prompts/system-base.ts`

Testing & Memory:

Paper	Reference	Implementation
TDD + LLM	ICSE 2024	`src/testing/tdd-mode.ts` — TDD improves Pass@1 by 45.97%
MemGPT	UC Berkeley, 2023	`src/memory/prospective-memory.ts` — stateful AI agents

Fault Localization: spectrum-based techniques (Ochiai, DStar, and Tarantula; Jones et al., 2002) are implemented in src/agent/repair/fault-localization.ts.
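
For reference, the three suspiciousness formulas can be written down in a few lines. This sketch follows the standard textbook definitions and is not taken from the project's source file; the `Spectrum` interface and function names are illustrative.

```typescript
// Spectrum counts for one statement:
//   ef / ep = failing / passing tests that execute it
//   nf / np = failing / passing tests that do not execute it
interface Spectrum {
  ef: number;
  ep: number;
  nf: number;
  np: number;
}

// Ochiai: ef / sqrt((ef + nf) * (ef + ep))
const ochiai = ({ ef, ep, nf }: Spectrum): number => {
  const denom = Math.sqrt((ef + nf) * (ef + ep));
  return denom === 0 ? 0 : ef / denom;
};

// Tarantula: failing ratio divided by the sum of failing and passing ratios.
const tarantula = ({ ef, ep, nf, np }: Spectrum): number => {
  const failRatio = ef + nf === 0 ? 0 : ef / (ef + nf);
  const passRatio = ep + np === 0 ? 0 : ep / (ep + np);
  return failRatio + passRatio === 0 ? 0 : failRatio / (failRatio + passRatio);
};

// DStar: ef^star / (ep + nf), with the commonly used exponent of 2.
const dstar = ({ ef, ep, nf }: Spectrum, star = 2): number => {
  const denom = ep + nf;
  if (denom === 0) return ef > 0 ? Infinity : 0;
  return ef ** star / denom;
};

// A statement run by both failing tests and by one of four passing tests.
const stmt: Spectrum = { ef: 2, ep: 1, nf: 0, np: 3 };
console.log(ochiai(stmt).toFixed(4), tarantula(stmt).toFixed(4), dstar(stmt));
```

Higher scores mean the statement is more suspicious; ranking statements by score gives the order in which to inspect them.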

Inspiration Projects

Code Buddy's architecture draws from these open-source projects:

Project	Inspiration	Key Files
OpenAI Codex CLI	Apply-patch unified diff, head/tail truncation, per-model tool config, turn diff tracker, security modes, OS sandbox workspace-write tiers, shell-free exec, SSRF guard, exec policy prefix rules, shell env policy, named config profiles, tool prefix naming convention, stable JSON serialization, session fork/rollout unification	`src/tools/apply-patch.ts`, `src/sandbox/os-sandbox.ts`, `src/security/ssrf-guard.ts`, `src/sandbox/execpolicy.ts`, `src/tools/registry/tool-aliases.ts`, `src/utils/stable-json.ts`, `src/observability/run-store.ts`
Claude Code	Hook system, slash commands, MCP config, extended thinking, parallel subagents, headless output, Anthropic prompt cache breakpoints	`src/hooks/`, `src/commands/slash-commands.ts`, `src/mcp/config.ts`, `src/optimization/cache-breakpoints.ts`
Gemini CLI	Persistent checkpoints, context files, compress command, shell prefix, multimodal input	`src/checkpoints/`, `src/context/context-files.ts`, `src/input/multimodal-input.ts`
Aider	Repository map, voice input, unified diff editor, watch mode (IDE comments)	`src/context/repository-map.ts`, `src/tools/voice-input.ts`, `src/commands/watch-mode.ts`
Cursor	`.cursorrules` config, parallel agent system, sandboxed terminals, embedded browser	`src/config/codebuddyrules.ts`, `src/agent/parallel/`, `src/browser/embedded-browser.ts`
Mistral Vibe	External markdown prompts, TOML config, tool permission system, fuzzy match, update notifier	`src/prompts/`, `src/config/toml-config.ts`, `src/utils/fuzzy-match.ts`
Conductor	Spec-driven development, track system	`src/tracks/`
RTK	Command proxy for 60-90% token reduction	`src/utils/rtk-compressor.ts`
ICM	Persistent cross-session memory via MCP	`src/memory/icm-bridge.ts`
Manus AI	Wide Research (parallel sub-agent research workers), Knowledge Base injection, todo.md attention bias, restorable context compression, pre-compaction NO_REPLY flush, inline web-search citations, observation variator (anti-repetition), structured prompt variation, tool result compaction guard, disk-backed tool results, response prefill modes (tool_choice control), WebSearchMode + domain policy, message queue debounce/cap/overflow	`src/agent/wide-research.ts`, `src/context/observation-variator.ts`, `src/agent/response-constraint.ts`, `src/tools/web-search.ts`, `src/agent/message-queue.ts`
OpenClaw	Multi-channel messaging, DM pairing, lane queue concurrency, memory lifecycle, tool policy, skills system, heartbeat, identity system, group security, hub marketplace, daily session reset, per-channel streaming policies	40+ files across `src/channels/` (including `src/channels/streaming-policy.ts`), `src/concurrency/`, `src/memory/`, `src/security/`, `src/skills/`, plus `src/daemon/daily-reset.ts`

Other influences: Rust (Result<T, E> pattern), AutoGPT, MetaGPT, CrewAI, ChatDev (role-based multi-agent), ReAct (reasoning + acting paradigm), Qodo/PR-Agent (RAG for code repos).

Benchmarks referenced: SWE-bench, HumanEval, MBPP, BigCodeBench, WebArena, Berkeley Function Calling Leaderboard.

For detailed research notes, see docs/RESEARCH_IMPROVEMENTS.md, docs/RAG_TOOL_SELECTION.md, and deep_research/ai-coding-assistant-improvements/.

Troubleshooting

API key not working

echo "${GROK_API_KEY:+set}"  # Prints "set" if the key is exported, without revealing it
buddy --prompt "test"

Switching providers doesn't work

# Verify connection to local model
buddy --base-url http://localhost:1234/v1 --api-key lm-studio --prompt "test"

# List available models
buddy --list-models

Memory not persisting

# Check memory directory
ls ~/.codebuddy/memory/

# Clear and reinitialize (this permanently deletes all stored memories)
rm -rf ~/.codebuddy/memory/
buddy

High latency

Use a faster model: buddy --model grok-code-fast-1
Use local LLM: buddy --base-url http://localhost:11434/v1 --model llama3

Debug mode

DEBUG=codebuddy:* buddy

License

MIT License - see LICENSE for details.

Report Bug | Request Feature | Star on GitHub

Multi-AI: Grok | Claude | ChatGPT | Gemini | LM Studio | Ollama
