- The Story
- Quick Start
- Architecture
- Features
- MCP Server
- PostgreSQL Integration
- Who is this for?
- How it works
- Requirements
- Enable Auto-Retrieval
- After Installation
- For Existing Claude Users
- Token Savings
- Directory Structure
- Commands Reference
- Configuration
- Best Practices
- FAQ
- Troubleshooting
- Security
- Roadmap
- Contributing
- Credits
After using Continuous-Claude (created by parcadei), we noticed something: our CLAUDE.md files kept growing. Every time we documented something new, added a guide, or saved a configuration, the file got bigger.
The problem? Claude loads your entire CLAUDE.md on every single prompt. That 30KB file? Loaded 20+ times per session. Hundreds of thousands of tokens wasted on content Claude didn't need.
Why does this matter? Whether you're on Claude Pro ($20/month) or Claude Max ($200/month), your plan has usage limits. Wasting thousands of tokens per prompt on irrelevant context means fewer tokens for actual thinking, coding, and building.
The solution: What if Claude could pull in just the context it needs? You ask about your database, Claude grabs the database block. You ask about deployment, Claude grabs the deployment block. Everything else stays on the shelf.
That's BloxCue - intelligent context blocks that get loaded when you need them.
v2 takes it further: an MCP server with 6 tools, token-budgeted context injection, and optional PostgreSQL integration that unifies your curated markdown blocks with learned memory from Continuous-Claude.
The easiest way to install or update BloxCue is to ask your AI to do it. Paste one of the prompts below into Claude Code, Cursor, Windsurf, or any MCP-compatible assistant.
No database? No problem. BloxCue works 100% with markdown files only. PostgreSQL is an optional add-on for Continuous-Claude users.
| Mode | Best For | Setup |
|---|---|---|
| MCP Server (v2.0) | Any MCP client (Claude, Cursor, Windsurf) | Add to mcp_config.json |
| Auto-Injection Hooks | Claude Code with automatic context | Run install.sh |
Set up BloxCue from https://github.com/bokiko/bloxcue
1. Clone to ~/bloxcue (or wherever I prefer)
2. Read AI_SETUP.md for setup instructions
3. Run ./install.sh --auto (or ask me about scope/structure preferences)
4. Configure MCP server per AI_SETUP.md Step 5
5. Verify: python3 ~/.claude-memory/scripts/indexer.py --search "test"
6. Help me create my first block from something in my CLAUDE.md
Update BloxCue to the latest version:
1. cd ~/bloxcue && git pull
2. Show me what's new and help me enable any new features
Or just run cd ~/bloxcue && git pull yourself — that's it.
git clone https://github.com/bokiko/bloxcue.git
cd bloxcue
Optional: Install Continuous-Claude v3 first if you want session memory + knowledge retrieval together.
./install.sh
The installer will ask you:
Where to install?
- Global (`~/.claude-memory`) - knowledge used across all projects
- Project (`./claude-memory`) - project-specific docs only
- Both - recommended for most users
How to organize?
- By subject - guides, references, projects (general use)
- By project - project-a, project-b (freelancers/agencies)
- Developer - apis, databases, deployment, frontend, backend
- DevOps - servers, networking, monitoring, security
- Minimal - just docs and notes
- Custom - you specify
nano ~/.claude-memory/guides/deployment.md
---
title: Production Deployment
category: guides
tags: [deployment, production, devops]
---
# Production Deployment
## Prerequisites
- SSH access to production server
- Environment variables configured
## Deploy Steps
1. Run tests locally
2. Push to main branch
3. SSH into server
4. Pull latest changes
5. Run migrations
6. Restart services
## Rollback
1. Revert to previous commit
2. Run down migrations
3. Restart services
python3 ~/.claude-memory/scripts/indexer.py
python3 ~/.claude-memory/scripts/indexer.py --search "deployment"

                    +---------------------------+
                    |     Claude Code / MCP     |
                    |          Client           |
                    +------------+--------------+
                                 |
                         JSON-RPC (stdio)
                                 |
                    +------------+--------------+
                    |    BloxCue MCP Server     |
                    |      (mcp_server.py)      |
                    |                           |
                    |  6 tools:                 |
                    |  - search_blocks          |
                    |  - get_block              |
                    |  - list_blocks            |
                    |  - index_blocks           |
                    |  - block_health           |
                    |  - inject_context         |
                    +------+----------+---------+
                           |          |
              +------------+   +------+----------+
              |                |                 |
    +---------v--------+  +----v--------------+  |
    |  Markdown Files  |  |  PostgreSQL       |  |
    |  (.md blocks)    |  |  (optional)       |  |
    |                  |  |                   |  |
    |  guides/         |  |  archival_memory  |  |
    |  references/     |  |  table            |  |
    |  configs/        |  |  (pgvector)       |  |
    +------------------+  +-------------------+  |
              |                 |                |
              +--------+--------+                |
                       |                         |
              +--------v-----------+             |
              |  Unified Index     |             |
              |                    |             |
              |  BM25 + IDF +      |             |
              |  Porter Stemmer +  |             |
              |  Intent Detection  +<------------+
              |  + MMR Diversity   |  UserPromptSubmit
              |                    |  hook (auto-inject)
              +--------------------+
Two retrieval paths, one search interface. Markdown files and PostgreSQL learnings are merged into a single BM25 index. The MCP server and CLI hook both query the same engine.
| Feature | Description |
|---|---|
| BM25 Scoring | Industry-standard probabilistic ranking (same algorithm as Elasticsearch) |
| Porter Stemmer | Matches word variations (running -> run, deployment -> deploy) |
| IDF Weighting | Rare terms rank higher than common ones for better precision |
| Bigram Matching | Recognizes multi-word phrases like "error handling" |
| Query Intent Detection | Adjusts scoring based on query type (how-to, troubleshooting, concept, reference) |
| Synonym Expansion | 50+ tech term mappings (k8s -> kubernetes, auth -> authentication) |
| MMR Diversity | Prevents redundant results using Maximal Marginal Relevance |
| Memoized Stemming | LRU cache on stemmer for 50-70% faster repeated searches |
| Index Caching | In-memory cache with mtime checking eliminates repeated disk reads |
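To make the BM25 and IDF rows above more concrete, here is a minimal scoring sketch. It is not BloxCue's indexer code: the function name `bm25_scores`, the parameters `k1=1.5` and `b=0.75` (common textbook defaults), and the sample documents are all assumptions chosen for illustration.

```python
import math
from collections import Counter
from typing import Dict, List


def bm25_scores(query: List[str], docs: Dict[str, List[str]],
                k1: float = 1.5, b: float = 0.75) -> Dict[str, float]:
    """Score each tokenized document against the query with Okapi BM25."""
    if not docs:
        return {}
    n_docs = len(docs)
    avg_len = sum(len(tokens) for tokens in docs.values()) / n_docs
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in docs.values() for term in set(tokens))

    scores = {}
    for name, tokens in docs.items():
        term_freq = Counter(tokens)
        score = 0.0
        for term in query:
            tf = term_freq.get(term, 0)
            if tf == 0:
                continue
            # Rare terms get a larger IDF, so they contribute more to the score.
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            # Term frequency saturates (k1) and is normalized by document length (b).
            norm = tf + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf * (k1 + 1) / norm
        scores[name] = score
    return scores


# Hypothetical, already-tokenized blocks.
docs = {
    "guides/deployment.md": "deploy production server deploy rollback".split(),
    "guides/auth.md": "auth token login server".split(),
    "notes/misc.md": "meeting notes server".split(),
}
scores = bm25_scores(["deploy", "server"], docs)
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

"server" appears in every document, so its IDF is small; "deploy" appears in only one, so it dominates the ranking. That is the "rare terms rank higher" behavior the table describes.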
BloxCue doesn't just find relevant blocks - it manages your token budget:
Token budget: 3000
Block 1 (score: 12.4, 800 tokens) -> Full content injected
Block 2 (score: 8.1, 2500 tokens) -> Summary injected (over budget for full)
Block 3 (score: 5.2, 1200 tokens) -> Reference only (path + title)
- Full content for top-ranked blocks that fit
- Summaries (first 300 chars) for blocks that partially fit
- References (title + path) for the rest
- Configurable budget via `BLOXCUE_MAX_TOKENS` or per-call
- Hooks into Claude Code's `UserPromptSubmit` event
- Analyzes your prompt in real-time
- Injects only the most relevant blocks as context
- Zero manual intervention required
- Pure Python standard library - no pip installs required
- PostgreSQL integration is optional (`psycopg2` imported in try/except)
- Works offline, works anywhere Python 3.8+ runs
- No env vars needed for default behavior
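The full / summary / reference tiers described above can be sketched as a simple greedy pass over ranked blocks. This is an illustration under stated assumptions, not BloxCue's actual logic: `plan_injection` and `estimate_tokens` are hypothetical names, the 4-characters-per-token estimate is a rough heuristic, and the real implementation may apply different rules for when a block drops to a lower tier.

```python
from typing import List, Tuple


def estimate_tokens(text: str) -> int:
    """Rough heuristic (an assumption): about 4 characters per token."""
    return max(1, len(text) // 4)


def plan_injection(blocks: List[Tuple[str, str]], budget: int,
                   summary_chars: int = 300) -> List[Tuple[str, str, str]]:
    """Assign each ranked (path, content) block a tier within the token budget.

    Returns (path, tier, text) tuples where tier is "full", "summary",
    or "reference". Blocks are assumed to be sorted best-first.
    """
    remaining = budget
    plan = []
    for path, content in blocks:
        full_cost = estimate_tokens(content)
        summary = content[:summary_chars]
        summary_cost = estimate_tokens(summary)
        if full_cost <= remaining:
            plan.append((path, "full", content))
            remaining -= full_cost
        elif summary_cost <= remaining:
            plan.append((path, "summary", summary))
            remaining -= summary_cost
        else:
            # Out of budget: keep only a pointer the client can follow up on.
            # The pointer's own (small) cost is ignored in this sketch.
            plan.append((path, "reference", path))
    return plan


# Hypothetical blocks of 400, 4000, and 2000 characters with a 200-token budget.
blocks = [("a.md", "x" * 400), ("b.md", "y" * 4000), ("c.md", "z" * 2000)]
tiers = [tier for _, tier, _ in plan_injection(blocks, budget=200)]
print(tiers)
```

With these inputs the first block fits in full, the second only fits as a 300-character summary, and the third falls back to a reference.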
BloxCue v2 ships with a full MCP (Model Context Protocol) server, making it compatible with any MCP client: Claude Code, Cursor, Windsurf, and more.
| Tool | Description |
|---|---|
| `search_blocks` | Search with BM25 scoring, returns ranked results with scores and previews |
| `get_block` | Retrieve full content of a specific block by path |
| `list_blocks` | List all indexed blocks, optionally filtered by category |
| `index_blocks` | Rebuild the search index (run after adding/editing blocks) |
| `block_health` | Health report: freshness, coverage gaps, and improvement suggestions |
| `inject_context` | One-shot retrieval: search + rank + deduplicate + token-budget + format |
Add to your MCP config (e.g., ~/.claude/mcp_config.json):
{
"mcpServers": {
"bloxcue": {
"type": "stdio",
"command": "python3",
"args": ["/path/to/bloxcue/scripts/mcp_server.py"]
}
}
}
Once configured, your AI assistant can call BloxCue tools directly. No hooks needed - tools are self-documenting via MCP tools/list.
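For readers curious what an MCP client sends over stdio, the sketch below builds the JSON-RPC 2.0 messages for `tools/list` and `tools/call`. It only constructs the messages and does not start or talk to the server. The helper name `jsonrpc_request` is hypothetical, and a real client must complete the MCP `initialize` handshake before sending either request.

```python
import json
from typing import Any, Dict, Optional


def jsonrpc_request(request_id: int, method: str,
                    params: Optional[Dict[str, Any]] = None) -> str:
    """Serialize one JSON-RPC 2.0 request as a single line for a stdio transport."""
    message: Dict[str, Any] = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        message["params"] = params
    return json.dumps(message)


# Ask the server which tools it exposes.
list_tools = jsonrpc_request(1, "tools/list")

# Call one tool by name; the arguments mirror this README's inject_context call.
call_tool = jsonrpc_request(2, "tools/call", {
    "name": "inject_context",
    "arguments": {"query": "deployment guide", "max_tokens": 2000},
})

print(list_tools)
print(call_tool)
```

In practice your MCP client (Claude Code, Cursor, Windsurf) produces these messages for you; the sketch is only meant to show that nothing BloxCue-specific is needed on the client side.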
The most powerful tool. One call does search + rank + deduplicate + budget:
Tool call: inject_context(query="deployment guide", max_tokens=2000)
Response:
[BloxCue: Injected 3 block(s), ~1847 tokens, query: "deployment guide"]
---
### Block 1: Production Deployment (score: 14.2, updated: 2 days ago)
Tags: deployment, production, aws
# Production Deployment
## Prerequisites
- SSH access to production server
...
### Block 2: CI/CD Pipeline (score: 8.7, updated: 5 days ago)
[Summary - full block at: guides/cicd.md]
Our CI/CD pipeline uses GitHub Actions to...
### Block 3: Server Configuration (score: 5.1, updated: 12 days ago)
[Reference only - retrieve full block with: get_block("configs/servers.md")]
BloxCue can optionally connect to a PostgreSQL database to search learned memory alongside markdown files. This is designed for Continuous-Claude's archival_memory table but works with any compatible schema.
- Fetches session learnings from PostgreSQL at index time
- Converts them to BloxCue index entries with `pg://learning/{uuid}` virtual paths
- Merges them into the BM25 index alongside file entries
- Search results show `[PG]` labels for database-sourced entries
- Full content retrieval works transparently for both files and PG entries
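The conversion steps above can be sketched roughly as follows. The row shape (`id` and `content` keys), the entry fields, and the function names are assumptions made for illustration; the real `archival_memory` schema and BloxCue's index format may differ.

```python
import uuid
from typing import Dict, List


def learning_to_entry(row: Dict[str, str]) -> Dict[str, str]:
    """Convert one database row into an index entry with a virtual path."""
    content = row["content"]
    lines = content.strip().splitlines()
    first_line = lines[0] if lines else "(untitled)"
    return {
        "path": f"pg://learning/{row['id']}",
        # The first line doubles as a title; the [PG] label marks the source.
        "title": "[PG] " + first_line[:80],
        "content": content,
        "source": "postgresql",
    }


def merge_entries(file_entries: List[Dict[str, str]],
                  rows: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Combine file-backed entries and database learnings into one list to index."""
    return file_entries + [learning_to_entry(row) for row in rows]


# Made-up sample data: one file entry and one learning row.
file_entries = [{"path": "guides/auth.md", "title": "Authentication Guide",
                 "content": "...", "source": "file"}]
rows = [{"id": str(uuid.uuid4()),
         "content": "Error Fix: hook authentication\nDetails of the fix..."}]
merged = merge_entries(file_entries, rows)
print([entry["title"] for entry in merged])
```

Because both kinds of entries share one shape, the ranking code does not need to know where an entry came from; only retrieval has to branch on the `pg://` prefix.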
You: "How did we fix that auth hook error?"
BloxCue searches BOTH:
1. Markdown files (guides/auth.md, references/hooks.md)
2. PostgreSQL learnings (past session where you fixed it)
Results:
1. [PG] Error Fix: hook authentication (score: 11.2) <-- from database
2. Authentication Guide (score: 8.4) <-- from file
3. [PG] Working Solution: token refresh (score: 6.1) <-- from database
Add env vars to your MCP config:
{
"mcpServers": {
"bloxcue": {
"type": "stdio",
"command": "python3",
"args": ["/path/to/bloxcue/scripts/mcp_server.py"],
"env": {
"BLOXCUE_DATABASE_URL": "postgresql://user:pass@localhost:5432/dbname",
"BLOXCUE_MEMORY_DIR": "/path/to/bloxcue"
}
}
}
}
Or set the environment variables directly:
export BLOXCUE_DATABASE_URL="postgresql://user:pass@localhost:5432/dbname"
- Optional: `psycopg2` is imported in a try/except. Without it, BloxCue works identically to before.
- Kill switch: Set `BLOXCUE_PG_ENABLED=0` to disable even if a URL is configured.
- Non-fatal: Every database call is wrapped in try/except. PG errors are logged to stderr; file search always works.
- Read-only: All database connections use `readonly=True`. BloxCue never writes to your database.
- Cache TTL: PG learnings are cached and refreshed every 5 minutes (configurable via `BLOXCUE_PG_CACHE_TTL`).
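The optional import, non-fatal errors, and cache TTL described above might look something like this in practice. It is a sketch, not the contents of `pg_provider.py`: the class name `LearningCache`, its injectable `clock`, and the decision to keep serving the last good value after a failed refresh are assumptions.

```python
import sys
import time
from typing import Callable, List, Optional

try:
    import psycopg2  # optional: only needed for PostgreSQL integration
except ImportError:
    psycopg2 = None  # file-based search keeps working without it


class LearningCache:
    """Cache fetched learnings and re-fetch them once the TTL has expired."""

    def __init__(self, fetch: Callable[[], List[dict]], ttl: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch
        self._ttl = ttl
        self._clock = clock
        self._value: List[dict] = []
        self._fetched_at: Optional[float] = None

    def get(self) -> List[dict]:
        now = self._clock()
        if self._fetched_at is None or now - self._fetched_at >= self._ttl:
            try:
                self._value = self._fetch()
                self._fetched_at = now
            except Exception as exc:
                # Non-fatal: log to stderr and keep serving the last good value.
                print(f"PG fetch failed: {exc}", file=sys.stderr)
        return self._value


# Demo with a fake fetch function and a controllable clock.
calls = []
now = [0.0]

def fake_fetch() -> List[dict]:
    calls.append(1)
    return [{"id": "a"}]

cache = LearningCache(fake_fetch, ttl=300, clock=lambda: now[0])
cache.get()
cache.get()          # served from cache, no second fetch
now[0] = 301.0
cache.get()          # TTL expired, fetches again
print(len(calls))
```

Passing the clock in as a parameter is only there to make the TTL behavior easy to test without sleeping.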
| If you're... | BloxCue helps you... |
|---|---|
| A Claude Code user | Stop burning tokens on unused context |
| Using Continuous-Claude | Search your curated blocks AND learned memory in one place |
| Managing multiple configs | Keep docs, guides, and configs organized and searchable |
| Building MCP integrations | 6 tools available via standard MCP protocol |
| Working on several projects | Switch context without reloading everything |
| Hitting token limits | Save ~7,000 tokens per prompt |
| New to Claude Code | Start with good habits from day one |
Before BloxCue:
You: "How do I deploy to production?"
Claude loads: ENTIRE CLAUDE.md (34KB = ~8,500 tokens)
- Your coding standards (not needed)
- Your API documentation (not needed)
- Your 10 different project configs (not needed)
- Your deployment guide (NEEDED!)
- Everything else (not needed)
Result: ~8,500 tokens loaded, only ~800 were relevant
After BloxCue:
You: "How do I deploy to production?"
BloxCue: Detects "deploy" + "production" keywords
-> Stems to "deploy" + "product"
-> Expands with synonyms: "release", "deployment"
-> BM25 ranks deployment block highest (IDF boost)
-> Intent: "howto" -> boosts tags & keywords
-> Token budget: injects full content (within budget)
Claude loads: Just the deployment block (~800 tokens)
Result: ~800 tokens loaded, all relevant
Saved: ~7,700 tokens for thinking & coding
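The keyword and synonym steps in the walkthrough above can be approximated in a few lines. This sketch skips stemming and intent detection entirely, and its synonym map and stopword list are small hypothetical samples, not BloxCue's actual tables.

```python
import re
from typing import Dict, List

# Hypothetical sample mappings; the real tool ships 50+ of these.
SYNONYMS: Dict[str, List[str]] = {
    "k8s": ["kubernetes"],
    "auth": ["authentication"],
    "deploy": ["release", "deployment"],
}
# Assumed stopword list, used only to keep the demo output short.
STOPWORDS = {"how", "do", "i", "to", "the", "a", "an"}


def expand_query(prompt: str) -> List[str]:
    """Lowercase and tokenize a prompt, drop stopwords, then append synonyms."""
    tokens = [t for t in re.findall(r"[a-z0-9]+", prompt.lower())
              if t not in STOPWORDS]
    expanded = list(tokens)
    for token in tokens:
        for synonym in SYNONYMS.get(token, []):
            if synonym not in expanded:
                expanded.append(synonym)
    return expanded


terms = expand_query("How do I deploy to production?")
print(terms)
```

The expanded term list is what a ranking function would then score against the index.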
BloxCue works best alongside Continuous-Claude. They're complementary tools:
| Tool | Purpose |
|---|---|
| Continuous-Claude | Session memory (ledgers, handoffs, learnings) |
| BloxCue | Knowledge retrieval (on-demand context loading) |
Think of it this way:
- Continuous-Claude = Claude's memory (what to remember)
- BloxCue = Claude's filing cabinet (where to find it efficiently)
With PostgreSQL integration enabled, BloxCue becomes the unified search interface for both.
If you prefer manual setup, install Continuous-Claude v3 first.
Credit: Continuous-Claude was created by parcadei. Check out Continuous-Claude v3.
Required for BloxCue to work automatically.
nano ~/.claude/settings.json
Add to your hooks section:
{
"hooks": {
"UserPromptSubmit": [{
"hooks": [{
"type": "command",
"command": "~/.claude/hooks/memory-retrieve.sh"
}]
}]
}
}
Close and reopen Claude Code for changes to take effect.
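The installed hook is a shell script (`memory-retrieve.sh`), so the Python below is only a sketch of the data flow, not a replacement for it. It assumes Claude Code passes the hook a JSON payload on stdin with a `prompt` field and adds whatever the hook prints to stdout to the context; `build_hook_output` and `fake_search` are hypothetical names.

```python
import json
from typing import Callable, List


def build_hook_output(stdin_text: str, search: Callable[[str], List[str]]) -> str:
    """Turn a UserPromptSubmit payload into context text for stdout."""
    try:
        payload = json.loads(stdin_text)
    except json.JSONDecodeError:
        return ""  # Malformed input: inject nothing rather than fail the prompt.
    prompt = payload.get("prompt", "") if isinstance(payload, dict) else ""
    if not isinstance(prompt, str) or not prompt.strip():
        return ""
    return "\n\n".join(search(prompt))


def fake_search(prompt: str) -> List[str]:
    """Stand-in for the real index lookup (hypothetical)."""
    if "deploy" in prompt.lower():
        return ["# Production Deployment\n1. Run tests locally"]
    return []


output = build_hook_output('{"prompt": "How do I deploy to production?"}', fake_search)
print(output)
```

Returning an empty string on any bad input keeps the hook from ever blocking a prompt, which matches the "zero manual intervention" goal.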
You: "How do I deploy to production?"
Claude will automatically receive your deployment block as context.
Important: BloxCue is installed, but you're still wasting tokens until you slim your CLAUDE.md!
Ask Claude to migrate your content:
My CLAUDE.md has grown too big. Help me migrate content to BloxCue blocks:
1. Read my current CLAUDE.md
2. Identify distinct topics (deployment, APIs, configs, etc.)
3. Create separate block files in ~/.claude-memory/
4. Slim my CLAUDE.md to essentials only
5. Re-index with: python3 ~/.claude-memory/scripts/indexer.py
Your CLAUDE.md should end up like this:
# My Workspace
Knowledge base at `~/.claude-memory/`.
Claude retrieves relevant context automatically via hooks.
## Essentials
- Project: MyApp
- Stack: Node.js, PostgreSQL, Redis
Already have a big CLAUDE.md file?
I have an existing CLAUDE.md file that's gotten too big.
Help me migrate it to BloxCue by:
1. Reading my current CLAUDE.md
2. Identifying distinct topics
3. Creating separate block files for each topic
4. Updating my CLAUDE.md to be minimal
- Let Claude install Continuous-Claude + BloxCue
- Start with a minimal CLAUDE.md
- Add blocks as you go
Your CLAUDE.md stays small forever because everything goes into blocks.
Real numbers from actual usage:
| Metric | Before BloxCue | After BloxCue | Saved |
|---|---|---|---|
| Tokens per prompt | ~8,500 | ~1,000 | ~7,500 |
| Tokens per session (20 prompts) | ~170,000 | ~20,000 | ~150,000 |
| Reduction | - | - | ~88% |
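The figures in the table follow from simple arithmetic, which you can rerun with your own numbers. The helper name `savings` is just for this example.

```python
from typing import Tuple


def savings(before: int, after: int, prompts: int = 1) -> Tuple[int, float]:
    """Return (tokens saved over `prompts` prompts, fractional reduction)."""
    saved = (before - after) * prompts
    reduction = (before - after) / before
    return saved, reduction


per_prompt, reduction = savings(8500, 1000)
per_session, _ = savings(8500, 1000, prompts=20)
print(per_prompt, per_session, f"{reduction:.0%}")
```

Swap in your own CLAUDE.md size and typical block size to estimate what you would save.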
Saved tokens go toward:
- Deeper reasoning - Claude can think more thoroughly
- Longer sessions - Stay within context limits longer
- Faster responses - Less to process means quicker replies
~/.claude-memory/
├── guides/ # How-to guides
├── references/ # Quick reference docs
├── projects/ # Project-specific info
├── configs/ # Configuration templates
├── notes/ # General notes
├── scripts/
│ ├── indexer.py # Search engine + index builder
│ ├── mcp_server.py # MCP server (6 tools)
│ └── pg_provider.py # PostgreSQL integration (optional)
└── tests/
└── unit/ # 245+ unit tests
~/.claude-memory/
├── client-alpha/
│ ├── requirements.md
│ ├── api.md
│ └── contacts.md
├── client-beta/
│ └── ...
└── scripts/
# Index all blocks
python3 ~/.claude-memory/scripts/indexer.py
# Search for something
python3 ~/.claude-memory/scripts/indexer.py --search "keyword"
# Search with verbose output (shows scores)
python3 ~/.claude-memory/scripts/indexer.py --search "keyword" -v
# List all indexed blocks
python3 ~/.claude-memory/scripts/indexer.py --list
# Rebuild index from scratch
python3 ~/.claude-memory/scripts/indexer.py --rebuild
# Output as JSON
python3 ~/.claude-memory/scripts/indexer.py --search "keyword" --json
Once the MCP server is configured, these tools are available to any MCP client:
search_blocks(query="auth errors", limit=5)
get_block(path="guides/authentication.md")
list_blocks(category="guides")
index_blocks(force=true)
block_health()
inject_context(query="deployment", max_tokens=2000, limit=5)
All configuration is via environment variables. None are required - defaults match the original behavior.
| Variable | Default | Description |
|---|---|---|
| `BLOXCUE_MEMORY_DIR` | Parent of scripts/ | Path to your blocks directory |
| `BLOXCUE_MAX_TOKENS` | `3000` | Default token budget for inject_context |
| `BLOXCUE_DATABASE_URL` | (none) | PostgreSQL connection URL |
| `BLOXCUE_PG_ENABLED` | `1` | Set to 0 to disable PG even with a URL |
| `BLOXCUE_PG_CACHE_TTL` | `300` | Seconds before PG learnings are re-fetched |
| `DATABASE_URL` | (none) | Fallback PG URL (if BLOXCUE_DATABASE_URL not set) |
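As a rough sketch of how the defaults and fallbacks in this table combine, the function below resolves the settings from an environment mapping. `load_config` and the keys of the returned dict are hypothetical, and the real scripts may validate values differently (this sketch simply lets `int()` raise on a non-numeric value).

```python
import os
from typing import Dict, Mapping, Optional, Union

ConfigValue = Union[str, int, None]


def load_config(env: Optional[Mapping[str, str]] = None) -> Dict[str, ConfigValue]:
    """Resolve BloxCue settings from environment variables, applying defaults."""
    if env is None:
        env = os.environ
    # BLOXCUE_DATABASE_URL wins; DATABASE_URL is only a fallback.
    db_url = env.get("BLOXCUE_DATABASE_URL") or env.get("DATABASE_URL")
    # Kill switch: "0" disables PG even when a URL is configured.
    pg_enabled = env.get("BLOXCUE_PG_ENABLED", "1") != "0"
    return {
        # None means "use the parent of scripts/"; resolving that is left to the caller.
        "memory_dir": env.get("BLOXCUE_MEMORY_DIR"),
        "max_tokens": int(env.get("BLOXCUE_MAX_TOKENS", "3000")),
        "database_url": db_url if pg_enabled else None,
        "pg_cache_ttl": int(env.get("BLOXCUE_PG_CACHE_TTL", "300")),
    }


defaults = load_config({})
print(defaults)
```

Passing a plain dict instead of `os.environ` makes it easy to check a configuration before putting it into your MCP config file.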
{
"mcpServers": {
"bloxcue": {
"type": "stdio",
"command": "python3",
"args": ["/home/user/bloxcue/scripts/mcp_server.py"],
"env": {
"BLOXCUE_DATABASE_URL": "postgresql://user:pass@localhost:5432/mydb",
"BLOXCUE_MEMORY_DIR": "/home/user/bloxcue",
"BLOXCUE_MAX_TOKENS": "4000",
"BLOXCUE_PG_CACHE_TTL": "600"
}
}
}
}
- Keep CLAUDE.md minimal - Just essentials, let blocks handle details
- One topic per file - Better search precision
- Use frontmatter - Title, category, and tags improve indexing
- Use descriptive tags - `[deployment, production, aws]` not just `[deploy]`
- Re-index after changes - Run the indexer after adding/editing files
- Use `inject_context` over `search_blocks` + `get_block` - One call, token-budgeted, deduplicated
- Enable PostgreSQL if using Continuous-Claude - Unified search across files and learnings
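To show why frontmatter helps indexing, here is a minimal parser for the `title` / `category` / `tags` header used in this README's example block. It is an illustrative sketch that handles only flat `key: value` lines and `[a, b]` lists; `parse_frontmatter` is a hypothetical name and BloxCue's own parsing may be more lenient or more strict.

```python
from typing import Dict, List, Tuple, Union

Meta = Dict[str, Union[str, List[str]]]


def parse_frontmatter(text: str) -> Tuple[Meta, str]:
    """Split a block into (metadata, body)."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text
    meta: Meta = {}
    for i, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            # Closing delimiter found: everything after it is the body.
            return meta, "\n".join(lines[i + 1:])
        key, sep, value = line.partition(":")
        if not sep:
            continue
        value = value.strip()
        if value.startswith("[") and value.endswith("]"):
            meta[key.strip()] = [item.strip() for item in value[1:-1].split(",")
                                 if item.strip()]
        else:
            meta[key.strip()] = value
    return {}, text  # No closing delimiter: treat the whole file as body.


block = """---
title: Production Deployment
category: guides
tags: [deployment, production, devops]
---
# Production Deployment
"""
meta, body = parse_frontmatter(block)
print(meta["tags"], body)
```

Once the tags are available as a list, an indexer can weight a tag match more heavily than a match in the body text.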
Do I need Continuous-Claude?
Technically no, but it is recommended. Continuous-Claude handles session memory; BloxCue handles knowledge retrieval. They complement each other, and with PG integration BloxCue can search both.
Will this work with Cursor/VS Code/Windsurf?
Yes! The MCP server works with any MCP-compatible client. The hook-based auto-retrieval is specific to Claude Code CLI, but the MCP tools work everywhere.
How is this different from a smaller CLAUDE.md?
Three key differences:
- Scalability - Your knowledge grows without growing token usage
- Relevance - Only blocks matching your query get loaded
- Intelligence - BM25 ranking, intent detection, and token budgeting beat naive loading
What if Claude needs multiple blocks?
inject_context handles this automatically. It returns multiple blocks ranked by relevance, with smart token budgeting: full content for top results, summaries for mid-range, and references for the rest.
Can I use project-specific docs?
Yes! You can have both:
- Global: `~/.claude-memory/` for cross-project content
- Project: `./claude-memory/` for project-specific docs
The installer supports setting up both.
Do I need psycopg2 for PostgreSQL?
Only if you want PG integration. Without psycopg2, BloxCue works identically to before: pure Python, zero dependencies. To enable PG integration, install it:
pip install psycopg2-binary
How do I back up my blocks?
They're just markdown files. Back them up however you prefer:
- Git repo (recommended)
- Cloud sync (Dropbox, iCloud, etc.)
- Any backup solution you use
What's the difference between v1 and v2?
| Aspect | v1 | v2 |
|---|---|---|
| Search | Keyword matching | BM25 + IDF + stemming + intent detection |
| Interface | CLI + shell hook | CLI + shell hook + MCP server (6 tools) |
| Data sources | Markdown files | Markdown files + PostgreSQL |
| Context injection | Dump full files | Token-budgeted (full/summary/reference tiers) |
| Tests | None | 245+ unit tests |
# macOS
brew install python3
# Ubuntu/Debian
sudo apt install python3

- Run the indexer: `python3 ~/.claude-memory/scripts/indexer.py`
- Check files have `.md` extension
- Verify files are in the correct directory

- Check `~/.claude/settings.json` syntax (valid JSON?)
- Verify the hook path is correct
- Restart Claude Code after changing settings

- Verify the path in your `mcp_config.json` is absolute
- Check stderr output: `python3 /path/to/mcp_server.py 2>&1`
- Ensure Python 3.8+ is on the MCP command's PATH

- Verify `BLOXCUE_DATABASE_URL` is set correctly
- Check `psycopg2` is installed: `python3 -c "import psycopg2"`
- Rebuild the index: call `index_blocks` via MCP or run the CLI indexer
- Check server startup logs for `PostgreSQL integration: enabled`
BloxCue is designed with security in mind:
| Protection | Description |
|---|---|
| Local-only | No telemetry, no data collection |
| Path validation | Prevents directory traversal attacks |
| Input sanitization | User prompts are sanitized before processing |
| Type safety | Handles malformed data gracefully without crashes |
| Settings backup | Creates backup before modifying Claude config |
| File locking | Exclusive locks prevent index corruption from concurrent sessions |
| Read-only DB | PostgreSQL connections use readonly=True - BloxCue never writes to your DB |
| Optional deps | psycopg2 failure is non-fatal; core always works |
See SECURITY.md for the full security audit report.
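As an illustration of the path validation row above, the sketch below shows one common way to reject directory traversal: resolve the requested path and confirm it is still inside the blocks directory. `resolve_block_path` is a hypothetical name, and this is not necessarily how BloxCue implements the check.

```python
from pathlib import Path


def resolve_block_path(base_dir: str, requested: str) -> Path:
    """Resolve `requested` inside `base_dir`, rejecting anything that escapes it."""
    base = Path(base_dir).resolve()
    # Joining with an absolute `requested` discards `base`, which the check below catches.
    candidate = (base / requested).resolve()
    try:
        candidate.relative_to(base)
    except ValueError:
        raise ValueError(f"path escapes the blocks directory: {requested!r}") from None
    return candidate


# The directory does not need to exist for the check to work.
safe = resolve_block_path("/tmp/bloxcue-demo", "guides/deployment.md")
print(safe.name)

try:
    resolve_block_path("/tmp/bloxcue-demo", "../../etc/passwd")
    escaped = True
except ValueError:
    escaped = False
print(escaped)
```

`relative_to` is used instead of `Path.is_relative_to` so the sketch also runs on Python 3.8, the minimum version this README mentions.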
- Porter Stemmer for word normalization
- IDF weighting for term importance
- Bigram/phrase matching
- Query intent detection
- Synonym expansion (50+ tech terms)
- BM25 probabilistic ranking
- MMR diversity in results
- Token-budgeted context injection
- Path traversal protection
- Type safety hardening
- Stemmer memoization (LRU cache)
- Index caching with mtime invalidation
- File locking for concurrent safety
- MCP server (6 tools)
- PostgreSQL integration (Continuous-Claude learnings)
- 245+ unit tests
- Semantic search with embeddings
- VS Code extension for block management
- Web UI for managing memory
- Cross-machine sync
Ideas and contributions welcome! See the roadmap above for planned features.
- parcadei - Creator of Continuous-Claude v3
MIT - Use it however you want.
