GitHub - enzodevs/code-context-v2: Semantic code search MCP server for Claude Code. AST-based chunking, Voyage AI embeddings, pgvector retrieval.

Semantic code search as an MCP server. Index your codebases, search with natural language, get precise results.

Built for Claude Code but works with any MCP-compatible client.

Why

LLMs work better with the right context. Grep finds text; this finds meaning. Ask for "authentication middleware" and get the actual auth logic, not every file that mentions "auth".

How it works:

1. Index your codebase with tree-sitter AST parsing (functions, classes, methods; not arbitrary line splits)
2. Embed chunks with Voyage AI (voyage-4-large for documents, voyage-4-lite for queries; both models share one embedding space, so retrieval is asymmetric)
3. Search with vector similarity, then rerank with rerank-2.5 for precision
4. Serve results over MCP; Claude Code calls the tools automatically
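
To make the first step above concrete, here is a minimal sketch of AST-based chunking. It uses Python's built-in `ast` module as a stand-in for tree-sitter (the project itself parses with tree-sitter grammars), and the `Chunk` type and `chunk_python_source` function are hypothetical names chosen for illustration, not the project's API.

```python
import ast
from dataclasses import dataclass


@dataclass
class Chunk:
    kind: str        # "function", "class", or "method"
    name: str
    start_line: int  # 1-indexed, inclusive
    end_line: int    # 1-indexed, inclusive
    text: str


def chunk_python_source(source: str) -> list[Chunk]:
    """Split Python source into one chunk per top-level function, class, and method."""
    lines = source.splitlines()
    chunks: list[Chunk] = []

    def add(node, kind: str) -> None:
        text = "\n".join(lines[node.lineno - 1 : node.end_lineno])
        chunks.append(Chunk(kind, node.name, node.lineno, node.end_lineno, text))

    func_types = (ast.FunctionDef, ast.AsyncFunctionDef)
    for node in ast.parse(source).body:
        if isinstance(node, func_types):
            add(node, "function")
        elif isinstance(node, ast.ClassDef):
            add(node, "class")
            for child in node.body:
                if isinstance(child, func_types):
                    add(child, "method")
    return chunks


SAMPLE = '''\
def login(user, password):
    return check(user, password)

class AuthMiddleware:
    def __call__(self, request):
        return request
'''

for c in chunk_python_source(SAMPLE):
    print(f"{c.kind:8} {c.name:15} lines {c.start_line}-{c.end_line}")
```

Each chunk keeps its line range, which is what lets search results point back to an exact location in the file rather than to an arbitrary window of lines.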

Architecture

Claude Code ──MCP──▶ FastMCP Server
                        │
                   search_codebase()
                   search_by_file()
                   list_projects()
                        │
              ┌─────────┴─────────┐
              ▼                   ▼
         Voyage AI           PostgreSQL 16
      voyage-4-lite          + pgvector
       (query embed)         + pgvectorscale
      rerank-2.5             (StreamingDiskANN)
       (reranking)

Retrieval pipeline:

1. Embed query with voyage-4-lite (fast, shared space with indexed docs)
2. Vector search: top-50 candidates by cosine distance (pgvector <=>)
3. Rerank: rerank-2.5 narrows to the top 8 results that pass a relevance threshold
4. Dedup: remove near-duplicate chunks (Jaccard similarity above 95%)
5. Return: Markdown-formatted chunks with file paths, line numbers, relevance scores
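
The filtering steps above (threshold, top 8, dedup) can be sketched in plain Python. This is an illustration under stated assumptions, not the project's code: rerank scores are passed in rather than fetched from Voyage AI, Jaccard similarity is computed over whitespace-separated tokens (the README does not say which token sets are compared), and `cosine_distance`, `jaccard`, and `select_results` are hypothetical names.

```python
import math


def cosine_distance(a: list[float], b: list[float]) -> float:
    """Cosine distance (1 - cosine similarity), the quantity pgvector's <=> returns."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def jaccard(a: str, b: str) -> float:
    """Jaccard similarity over whitespace-separated tokens (an assumed tokenization)."""
    sa, sb = set(a.split()), set(b.split())
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


def select_results(scored, threshold=0.65, top_k=8, dedup_above=0.95):
    """scored: list of (chunk_text, rerank_score) pairs.

    Keeps chunks at or above the threshold, best first, skipping any chunk
    that is a near-duplicate of one already kept, up to top_k results.
    """
    kept = []
    for text, score in sorted(scored, key=lambda p: p[1], reverse=True):
        if score < threshold:
            break  # everything after this is lower-scored
        if any(jaccard(text, k) > dedup_above for k, _ in kept):
            continue
        kept.append((text, score))
        if len(kept) == top_k:
            break
    return kept


candidates = [
    ("def verify_token(token): return decode(token)", 0.91),
    ("def verify_token(token): return decode(token)", 0.90),  # exact duplicate
    ("def hash_password(pw): return bcrypt(pw)", 0.72),
    ("AUTH_HEADER = 'Authorization'", 0.40),  # below threshold
]
for text, score in select_results(candidates):
    print(f"{score:.2f}  {text}")
```

The default values mirror the numbers quoted in this README (threshold 0.65, top 8, 95% Jaccard); the order in which the real server applies thresholding and deduplication may differ.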

Quick Start

Prerequisites

uv (Python package manager)
Docker (for PostgreSQL)
Voyage AI API key (free tier available)

1. Clone and configure

git clone https://github.com/YOUR_USER/code-context-v2.git
cd code-context-v2

cp .env.example .env
# Edit .env — set VOYAGE_API_KEY and POSTGRES_PASSWORD

2. Start PostgreSQL

docker compose up -d

This runs PostgreSQL 16 with pgvector + pgvectorscale (TimescaleDB image) on port 54329.

3. Install dependencies

uv sync

4. Index a project

uv run code-context-cli index /path/to/your/project

5. Connect to Claude Code

Add to ~/.claude/mcp.json (global) or .mcp.json (per-project):

{
  "mcpServers": {
    "code-context": {
      "command": "uv",
      "args": [
        "--directory",
        "/path/to/code-context-v2",
        "run",
        "code-context"
      ],
      "env": {
        "VOYAGE_API_KEY": "${VOYAGE_API_KEY}",
        "DATABASE_URL": "postgresql://coderag:your_password@localhost:54329/coderag"
      }
    }
  }
}

Restart Claude Code. It will automatically discover the search_codebase, search_by_file, and list_projects tools.

MCP Tools

| Tool | Purpose |
| --- | --- |
| `list_projects` | List all indexed projects (call first to get project IDs) |
| `search_codebase(query, project)` | Semantic search across entire codebase |
| `search_by_file(filepath, query, project)` | Search within a specific file |
| `list_books` | List indexed books (optional literature feature) |
| `search_literature(query, book?)` | Search indexed technical books |
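
For clients other than Claude Code, it may help to see what a tool invocation looks like on the wire. MCP tool calls are JSON-RPC 2.0 requests using the `tools/call` method. The sketch below builds such a request for `search_codebase`; the argument names come from the table above, while the helper name `tool_call_request`, the request id, and the argument values are illustrative assumptions.

```python
import json


def tool_call_request(request_id: int, tool: str, arguments: dict) -> str:
    """Build a JSON-RPC 2.0 message for MCP's tools/call method."""
    return json.dumps(
        {
            "jsonrpc": "2.0",
            "id": request_id,
            "method": "tools/call",
            "params": {"name": tool, "arguments": arguments},
        },
        indent=2,
    )


print(
    tool_call_request(
        1,
        "search_codebase",
        {"query": "authentication middleware", "project": "my-project"},
    )
)
```

In practice an MCP client library handles this framing, so the sketch is mainly useful for debugging or for reading server logs.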

CLI

# Index a project (auto-generates ID from folder name)
uv run code-context-cli index /path/to/project

# Index with custom ID
uv run code-context-cli index /path/to/project --id my-project

# Check what changed (dry-run)
uv run code-context-cli check my-project

# Sync only changed files
uv run code-context-cli sync my-project

# Force full reindex
uv run code-context-cli reindex /path/to/project

# Show statistics
uv run code-context-cli stats

# Watch for changes (background daemon)
uv run code-context-cli watch /path/to/project

# Remove orphaned data
uv run code-context-cli prune

There's also cc2.sh — a bash wrapper with a gum-based TUI for interactive use.

Supported Languages

| Language | Extensions | Parser |
| --- | --- | --- |
| TypeScript | `.ts`, `.tsx` | tree-sitter-typescript |
| JavaScript | `.js`, `.jsx`, `.mjs`, `.cjs` | tree-sitter-javascript |
| Python | `.py`, `.pyi` | tree-sitter-python |
| Java | `.java` | tree-sitter-java |

Adding a new language requires a tree-sitter grammar and chunk type mappings in src/code_context/chunking/parser.py.
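
As a rough picture of what an extension mapping involves, the sketch below encodes the table above as a lookup. The names `LANGUAGES` and `detect_language` are hypothetical; the actual mappings in `src/code_context/chunking/parser.py` may be structured differently and also include chunk type mappings, which this sketch omits.

```python
from pathlib import Path

# Extension -> (language, grammar package), copied from the table above.
LANGUAGES = {
    ".ts": ("TypeScript", "tree-sitter-typescript"),
    ".tsx": ("TypeScript", "tree-sitter-typescript"),
    ".js": ("JavaScript", "tree-sitter-javascript"),
    ".jsx": ("JavaScript", "tree-sitter-javascript"),
    ".mjs": ("JavaScript", "tree-sitter-javascript"),
    ".cjs": ("JavaScript", "tree-sitter-javascript"),
    ".py": ("Python", "tree-sitter-python"),
    ".pyi": ("Python", "tree-sitter-python"),
    ".java": ("Java", "tree-sitter-java"),
}


def detect_language(path: str):
    """Return (language, parser) for a supported file, or None if unsupported."""
    return LANGUAGES.get(Path(path).suffix)


for p in ["src/app.tsx", "tools/build.mjs", "README.md"]:
    print(p, "->", detect_language(p))
```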

Configuration

All settings via environment variables (or .env file):

| Variable | Default | Description |
| --- | --- | --- |
| `DATABASE_URL` | none | PostgreSQL connection string (required) |
| `VOYAGE_API_KEY` | none | Voyage AI API key (required) |
| `EMBEDDING_MODEL_INDEX` | `voyage-4-large` | Embedding model for indexing |
| `EMBEDDING_MODEL_QUERY` | `voyage-4-lite` | Embedding model for queries |
| `RERANK_MODEL` | `rerank-2.5` | Reranking model |
| `RERANK_THRESHOLD` | `0.65` | Minimum relevance score after reranking |
| `RESULT_MAX_TOKENS` | `8000` | Token budget for results |
| `LOG_LEVEL` | `INFO` | Logging verbosity |

See src/code_context/config.py for all available settings.
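
The sketch below shows one way the documented variables and defaults could be read from the environment. It is an illustration only: `Settings` and `load_settings` are hypothetical names, the real `config.py` may use a different mechanism, and this version does not parse a `.env` file.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    database_url: str
    voyage_api_key: str
    embedding_model_index: str = "voyage-4-large"
    embedding_model_query: str = "voyage-4-lite"
    rerank_model: str = "rerank-2.5"
    rerank_threshold: float = 0.65
    result_max_tokens: int = 8000
    log_level: str = "INFO"


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read settings from an environment mapping, applying the documented defaults."""
    missing = [k for k in ("DATABASE_URL", "VOYAGE_API_KEY") if not env.get(k)]
    if missing:
        raise ValueError("missing required settings: " + ", ".join(missing))
    d = Settings(env["DATABASE_URL"], env["VOYAGE_API_KEY"])  # carries the defaults
    return Settings(
        database_url=d.database_url,
        voyage_api_key=d.voyage_api_key,
        embedding_model_index=env.get("EMBEDDING_MODEL_INDEX", d.embedding_model_index),
        embedding_model_query=env.get("EMBEDDING_MODEL_QUERY", d.embedding_model_query),
        rerank_model=env.get("RERANK_MODEL", d.rerank_model),
        rerank_threshold=float(env.get("RERANK_THRESHOLD", d.rerank_threshold)),
        result_max_tokens=int(env.get("RESULT_MAX_TOKENS", d.result_max_tokens)),
        log_level=env.get("LOG_LEVEL", d.log_level),
    )


settings = load_settings(
    {
        "DATABASE_URL": "postgresql://coderag:your_password@localhost:54329/coderag",
        "VOYAGE_API_KEY": "example-key",
        "RERANK_THRESHOLD": "0.7",
    }
)
print(settings.rerank_threshold, settings.result_max_tokens)
```

Failing fast on the two required variables surfaces a misconfigured MCP entry at startup rather than on the first search.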

Performance

Vector search: <50ms
Reranking: <100ms
Total MCP response: <200ms
Initial indexing: ~5-10 min for 1000 files
Incremental sync: <2s per changed file
Storage: ~100MB per 100k chunks
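
For rough capacity planning, the indexing and storage figures above can be turned into a back-of-envelope estimate. The sketch assumes both figures scale linearly with project size, which this README does not claim; `estimate` is a hypothetical helper and the input numbers are made up for the example.

```python
def estimate(files: int, chunks: int) -> dict:
    """Scale the README's quoted figures linearly (an assumption, not a guarantee)."""
    return {
        "index_minutes_low": files * 5 / 1000,    # ~5 min per 1000 files
        "index_minutes_high": files * 10 / 1000,  # ~10 min per 1000 files
        "storage_mb": chunks * 100 / 100_000,     # ~100MB per 100k chunks
    }


print(estimate(files=4000, chunks=60_000))
```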

How Indexing Works

Walk the project tree (skips node_modules, .git, dist, etc.)
Hash each file with BLAKE3 — skip unchanged files
Parse with tree-sitter into semantic chunks (functions, classes, methods)
Small files (<200 lines) are kept as single chunks to avoid fragmentation
Embed chunks with voyage-4-large in batches
Store in PostgreSQL with pgvector embeddings
All operations are atomic — Ctrl+C won't corrupt the index
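
The walk-and-hash steps above can be sketched as follows. Two substitutions are made so the example runs on the standard library alone: `hashlib.blake2b` stands in for BLAKE3 (which needs a third-party package), and the stored hashes live in a plain dict rather than PostgreSQL. `SKIP_DIRS`, `file_digest`, and `changed_files` are hypothetical names, and the skip list covers only the directories this README names explicitly.

```python
import hashlib
import os
import tempfile
from pathlib import Path

SKIP_DIRS = {"node_modules", ".git", "dist"}  # the README's list ends with "etc."


def file_digest(path: Path) -> str:
    """Hash file contents. blake2b stands in for BLAKE3, which is not in the stdlib."""
    h = hashlib.blake2b(digest_size=32)
    with path.open("rb") as f:
        for block in iter(lambda: f.read(65536), b""):
            h.update(block)
    return h.hexdigest()


def changed_files(root: Path, stored: dict[str, str]) -> list[str]:
    """Return relative paths that are new or whose digest differs from the stored one."""
    changed = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune skipped directories in place so os.walk does not descend into them.
        dirnames[:] = [d for d in dirnames if d not in SKIP_DIRS]
        for name in filenames:
            path = Path(dirpath) / name
            rel = path.relative_to(root).as_posix()
            if stored.get(rel) != file_digest(path):
                changed.append(rel)
    return sorted(changed)


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "a.py").write_text("print('a')\n")
    (root / "node_modules").mkdir()
    (root / "node_modules" / "dep.js").write_text("x\n")
    stored = {"a.py": file_digest(root / "a.py")}
    print(changed_files(root, stored))  # unchanged tree
    (root / "a.py").write_text("print('b')\n")
    print(changed_files(root, stored))  # after editing a.py
```

Only the files returned by a check like this would need to be re-parsed and re-embedded, which is the saving that makes incremental sync much cheaper than a full reindex.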

License

MIT
