170 changes: 170 additions & 0 deletions docs/getting-started.md
# PageIndex Getting Started

This guide helps you set up PageIndex from scratch and run your first document.

## What PageIndex does

PageIndex builds a hierarchical tree structure from long documents and uses LLM reasoning for retrieval. The workflow is vectorless (no vector DB and no chunking-based retrieval).

## Prerequisites

- Python 3.8+
- macOS/Linux/Windows terminal
- One API key for an OpenAI-compatible provider

## 1) Install dependencies

### Option A: UV (recommended)

```bash
# install uv if needed
curl -LsSf https://astral.sh/uv/install.sh | sh
# or: brew install uv

cd /path/to/PageIndex
./setup_uv.sh
source .venv/bin/activate
```

### Option B: pip

```bash
cd /path/to/PageIndex
python3 -m venv .venv
source .venv/bin/activate
pip install --upgrade -r requirements.txt
```

## 2) Configure model provider

Edit `.env`:

```bash
CHATGPT_API_KEY=your_api_key
OPENAI_API_BASE=https://provider-endpoint/v1
OPENAI_MODEL=provider-model-id
```

Example (Qwen):

```bash
CHATGPT_API_KEY=sk-your-qwen-key
OPENAI_API_BASE=https://dashscope.aliyuncs.com/compatible-mode/v1
OPENAI_MODEL=qwen-max
```

To switch providers quickly:

```bash
./switch_model.sh
```
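
Before running the pipeline, a quick stdlib-only sanity check can confirm the three variables above are actually set. The variable names match the `.env` keys; the script itself is an illustrative sketch, not part of PageIndex:

```python
import os

# The three .env keys PageIndex's provider config uses.
REQUIRED = ["CHATGPT_API_KEY", "OPENAI_API_BASE", "OPENAI_MODEL"]

def missing_vars(env=None):
    """Return the names of any required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED if not env.get(name)]

if __name__ == "__main__":
    missing = missing_vars()
    print("OK" if not missing else "Missing: " + ", ".join(missing))
```

Run it from the repo root after sourcing your environment; an empty result means the provider config is complete.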

## 3) Run PageIndex

### Process PDF

```bash
python run_pageindex.py --pdf_path /path/to/document.pdf
```

### Process Markdown

```bash
python run_pageindex.py --md_path /path/to/document.md
```

### Check outputs

```bash
ls -la ./results/
```

Generated file pattern:

- `{input_name}_structure.json`

## Common options

```bash
python run_pageindex.py \
--pdf_path /path/to/document.pdf \
--model gpt-4o \
--toc-check-pages 20 \
--max-pages-per-node 10 \
--max-tokens-per-node 20000 \
--if-add-node-id yes \
--if-add-node-summary yes \
--if-add-doc-description no \
--if-add-node-text no
```
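
For batch runs, the flags above compose naturally from a small helper. This is an illustrative sketch (the `build_cmd` helper is ours, not part of the repo); keyword names map underscore-for-hyphen onto the CLI flags shown above:

```python
def build_cmd(pdf_path, **flags):
    """Build an argv list for run_pageindex.py from keyword flags."""
    cmd = ["python", "run_pageindex.py", "--pdf_path", str(pdf_path)]
    for name, value in flags.items():
        # e.g. max_tokens_per_node=20000 -> --max-tokens-per-node 20000
        cmd += ["--" + name.replace("_", "-"), str(value)]
    return cmd

# The same invocation as above, built programmatically;
# pass the result to subprocess.run(cmd, check=True).
cmd = build_cmd("/path/to/document.pdf",
                model="gpt-4o",
                max_tokens_per_node=20000,
                if_add_node_summary="yes")
```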

Markdown-specific options:

```bash
python run_pageindex.py \
--md_path /path/to/document.md \
--if-thinning yes \
--thinning-threshold 5000 \
--summary-token-threshold 200
```

## Example output structure

```json
{
"title": "Financial Stability",
"node_id": "0006",
"start_index": 21,
"end_index": 22,
"summary": "...",
"nodes": [
{
"title": "Monitoring Financial Vulnerabilities",
"node_id": "0007",
"start_index": 22,
"end_index": 28,
"summary": "..."
}
]
}
```
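
To see how such a tree is consumed, here is a minimal sketch that flattens the nodes into a table-of-contents listing, the kind of compact view an LLM can reason over for retrieval. Field names are taken from the example above; the `flatten` function is illustrative:

```python
import json

# The example structure from above, as a JSON string.
EXAMPLE = """
{
  "title": "Financial Stability",
  "node_id": "0006",
  "start_index": 21,
  "end_index": 22,
  "summary": "...",
  "nodes": [
    {"title": "Monitoring Financial Vulnerabilities",
     "node_id": "0007", "start_index": 22, "end_index": 28,
     "summary": "..."}
  ]
}
"""

def flatten(node, depth=0):
    """Yield (node_id, indented title, page range) rows, depth-first."""
    yield (node["node_id"],
           "  " * depth + node["title"],
           (node.get("start_index"), node.get("end_index")))
    for child in node.get("nodes", []):
        yield from flatten(child, depth + 1)

for node_id, title, pages in flatten(json.loads(EXAMPLE)):
    print(node_id, title, pages)
```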

## Troubleshooting

### API key not found

- Ensure `.env` exists in the repo root.
- Ensure `CHATGPT_API_KEY` is set and non-empty.
- Re-activate the virtual environment and retry.

### Dependency install failure

```bash
pip install --upgrade pip
pip install --upgrade -r requirements.txt
```

### PDF parse issues

- Test with a smaller clean PDF first.
- Verify file path and file integrity.
- For scanned PDFs, use OCR-first workflows.

### Token limit exceeded

Lower node size:

```bash
python run_pageindex.py \
--pdf_path file.pdf \
--max-tokens-per-node 15000 \
--max-pages-per-node 5
```

## Next docs

- [Quick Reference](quick-reference.md)
- [Multi-Model Configuration](multi-model-configuration.md)
- [Global and Custom Models](global-and-custom-models.md)
- [Qwen Configuration](qwen-configuration.md)
- [UV Quick Reference](uv-quick-reference.md)
183 changes: 183 additions & 0 deletions docs/global-and-custom-models.md
# Global and Custom Model Configuration

This guide covers global providers, aggregator platforms, local deployment, and fully custom OpenAI-compatible endpoints.

## Major global providers

### Anthropic Claude

```bash
CHATGPT_API_KEY=sk-ant-your-key
OPENAI_API_BASE=https://api.anthropic.com/v1
OPENAI_MODEL=claude-3-5-sonnet-20241022
```

Common model IDs:
- `claude-3-5-sonnet-20241022`
- `claude-3-5-haiku-20241022`
- `claude-3-opus-20240229`
- `claude-3-haiku-20240307`

API key: <https://console.anthropic.com/settings/keys>

### Google Gemini

```bash
CHATGPT_API_KEY=your-gemini-api-key
OPENAI_API_BASE=https://generativelanguage.googleapis.com/v1beta/openai
OPENAI_MODEL=gemini-2.0-flash-exp
```

API key: <https://aistudio.google.com/apikey>

### Mistral

```bash
CHATGPT_API_KEY=your-mistral-api-key
OPENAI_API_BASE=https://api.mistral.ai/v1
OPENAI_MODEL=mistral-large-latest
```

API key: <https://console.mistral.ai/api-keys>

### Cohere

```bash
CHATGPT_API_KEY=your-cohere-api-key
OPENAI_API_BASE=https://api.cohere.ai/v1
OPENAI_MODEL=command-r-plus
```

API key: <https://dashboard.cohere.com/api-keys>

### Perplexity

```bash
CHATGPT_API_KEY=pplx-your-key
OPENAI_API_BASE=https://api.perplexity.ai
OPENAI_MODEL=llama-3.1-sonar-large-128k-online
```

API key: <https://www.perplexity.ai/settings/api>

### Groq (fast inference)

```bash
CHATGPT_API_KEY=gsk_your_groq_key
OPENAI_API_BASE=https://api.groq.com/openai/v1
OPENAI_MODEL=llama-3.3-70b-versatile
```

API key: <https://console.groq.com/keys>

## Aggregation platforms

### OpenRouter (recommended)

```bash
CHATGPT_API_KEY=sk-or-v1-your-key
OPENAI_API_BASE=https://openrouter.ai/api/v1
OPENAI_MODEL=anthropic/claude-3.5-sonnet
```

Why teams use it:
- One API key for many model families
- Easy model A/B comparisons
- Centralized usage and cost metrics

Docs: <https://openrouter.ai/docs>

### Together AI

```bash
CHATGPT_API_KEY=your-together-key
OPENAI_API_BASE=https://api.together.xyz/v1
OPENAI_MODEL=meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo
```

### Fireworks AI

```bash
CHATGPT_API_KEY=fw-your-fireworks-key
OPENAI_API_BASE=https://api.fireworks.ai/inference/v1
OPENAI_MODEL=accounts/fireworks/models/llama-v3p1-70b-instruct
```

## Local deployment options

### Ollama (easiest local path)

```bash
# install and start
brew install ollama
ollama serve

# pull model
ollama pull llama3.1:70b

# configure PageIndex
CHATGPT_API_KEY=ollama
OPENAI_API_BASE=http://localhost:11434/v1
OPENAI_MODEL=llama3.1:70b
```

### LM Studio

```bash
CHATGPT_API_KEY=lm-studio
OPENAI_API_BASE=http://localhost:1234/v1
OPENAI_MODEL=local-model-name
```

### vLLM (production-style serving)

```bash
pip install vllm
python -m vllm.entrypoints.openai.api_server \
--model meta-llama/Meta-Llama-3.1-70B-Instruct \
--port 8000

# PageIndex config
CHATGPT_API_KEY=vllm
OPENAI_API_BASE=http://localhost:8000/v1
OPENAI_MODEL=meta-llama/Meta-Llama-3.1-70B-Instruct
```

### Text Generation WebUI

```bash
CHATGPT_API_KEY=textgen
OPENAI_API_BASE=http://localhost:5000/v1
OPENAI_MODEL=your-loaded-model
```

## Custom endpoint template

Use this for private hosting, internal gateways, or proxy services.

```bash
CHATGPT_API_KEY=your-custom-api-key
OPENAI_API_BASE=https://your-endpoint.example.com/v1
OPENAI_MODEL=your-model-id
```
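
All of the configs on this page reduce to the same wire format. This stdlib-only sketch shows how the three values above map onto the chat-completions request any OpenAI-compatible endpoint expects (illustrative; a real client library handles this for you):

```python
import json
import urllib.request

def build_request(base_url, api_key, model, prompt):
    """Construct the POST an OpenAI-compatible endpoint expects.

    base_url, api_key, and model are the three .env values above.
    """
    body = {"model": model,
            "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        base_url.rstrip("/") + "/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={"Authorization": "Bearer " + api_key,
                 "Content-Type": "application/json"},
        method="POST",
    )
```

If a provider rejects a request shaped like this, its OpenAI compatibility is likely partial (see Cautions below).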

## Validation

```bash
python test_qwen_api.py
python run_pageindex.py --pdf_path tests/pdfs/q1-fy25-earnings.pdf
```

## Recommendations

- Highest English quality: Claude or GPT-4o
- Best speed/cost: Groq or DeepSeek
- Highest privacy: local Ollama or vLLM
- Most flexible routing: OpenRouter

## Cautions

1. OpenAI-compatibility is sometimes partial; provider-specific differences may appear.
2. Rate limits vary by provider and plan.
3. Cloud APIs send document content to provider infrastructure.
4. Add usage caps to prevent cost surprises.