BENCHMARK.md (new file, 36 additions, 0 deletions)
# KCP Benchmark Results — CrewAI

## Summary

**76% reduction in tool calls** when using the Knowledge Context Protocol (KCP) manifest compared to unguided repository exploration.

- Baseline total: **123 tool calls**
- KCP total: **30 tool calls**
- Saved: **93 tool calls** across 8 queries

## Results Table

| Query | Baseline | KCP | Saved |
| :---- | -------: | --: | ----: |
| What is the difference between Flows and Crews in CrewAI? | 14 | 2 | 12 |
| How do I create my first agent and assign it a task? | 7 | 3 | 4 |
| How do I create a custom tool for my agent? | 8 | 3 | 5 |
| How do I add memory to my crew? | 7 | 3 | 4 |
| Which LLM providers does CrewAI support? | 17 | 5 | 12 |
| How do I build a flow that triggers a crew? | 15 | 2 | 13 |
| How do I implement a hierarchical crew with a manager agent? | 22 | 9 | 13 |
| How do I add knowledge (RAG) to my crew? | 33 | 3 | 30 |
| **TOTAL** | **123** | **30** | **93** |

## Methodology

Each query was run twice against a local clone of the CrewAI repository:

1. **Baseline**: The agent was told the repository path and instructed to explore it freely using `read_file`, `glob_files`, and `grep_content` tools to find the answer.
2. **KCP**: The agent was instructed to first read `knowledge.yaml`, match the query against unit triggers, and read only the files pointed to by matching units — preferring TL;DR summary files when available.
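
For reference, a manifest unit might look like the following. This is a hypothetical fragment: the field names `triggers`, `summary_available`, and `summary_unit` are taken from the prompts in `benchmark.py`, and other keys (`units`, `id`, `files`) are guesses — the actual `knowledge.yaml` layout may differ.

```yaml
# Hypothetical knowledge.yaml unit; real schema may differ
units:
  - id: tools-memory
    triggers: ["memory", "rag crew", "knowledge"]
    files:
      - docs/en/concepts/memory.mdx
    summary_available: true
    summary_unit: docs/en/concepts/tools-memory-tldr.mdx
```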

Both runs used `claude-haiku-4-5-20251001` with `max_tokens=2048` and up to 20 turns. Tool call counts measure retrieval efficiency only (not answer quality).

## Findings

The KCP manifest delivered a **76% reduction in tool calls**, with the largest gains on broad or unfamiliar queries. The "knowledge (RAG)" query showed the most dramatic improvement (33 → 3 calls, 91% reduction): without KCP the agent recursively explored the docs directory; with KCP it read `knowledge.yaml`, matched the `rag crew` trigger directly to `tools-memory-tldr.mdx`, and answered immediately. The hierarchical crew query had the smallest relative gain (22 → 9), because the answer required reading the full `crews.mdx` and `tasks.mdx` even with guidance — demonstrating that KCP eliminates exploration overhead but cannot shrink inherently large source files.
benchmark.py (new file, 165 additions, 0 deletions)
import anthropic
import os
import glob as glob_module
import subprocess
from pathlib import Path

client = anthropic.Anthropic()

REPO_ROOT = os.path.dirname(os.path.abspath(__file__))
_REPO_ROOT_REAL = Path(os.path.realpath(REPO_ROOT))


def _within_repo(path: str) -> bool:
    """Return True if path resolves to a location inside REPO_ROOT."""
    try:
        Path(os.path.realpath(path)).relative_to(_REPO_ROOT_REAL)
        return True
    except (ValueError, OSError):
        return False

TOOLS = [
    {
        "name": "read_file",
        "description": "Read the content of a file",
        "input_schema": {
            "type": "object",
            "properties": {"path": {"type": "string"}},
            "required": ["path"],
        },
    },
    {
        "name": "glob_files",
        "description": "Find files matching a pattern",
        "input_schema": {
            "type": "object",
            "properties": {
                "pattern": {"type": "string"},
                "base_dir": {"type": "string"},
            },
            "required": ["pattern"],
        },
    },
    {
        "name": "grep_content",
        "description": "Search for text in files",
        "input_schema": {
            "type": "object",
            "properties": {
                "pattern": {"type": "string"},
                "path": {"type": "string"},
            },
            "required": ["pattern", "path"],
        },
    },
]

def execute_tool(tool_name, tool_input):
    if tool_name == "read_file":
        path = tool_input["path"]
        if not _within_repo(path):
            return "Error: access denied — path is outside the repository"
        try:
            with open(path, "r", encoding="utf-8", errors="replace") as f:
                content = f.read()
            if len(content) > 8000:
                content = content[:8000] + "\n...[truncated]"
            return content
        except Exception as e:
            return f"Error: {e}"
    elif tool_name == "glob_files":
        pattern = tool_input["pattern"]
        base = tool_input.get("base_dir", REPO_ROOT)
        if not _within_repo(base):
            base = REPO_ROOT
        if not pattern.startswith("/"):
            pattern = os.path.join(base, pattern)
        matches = [m for m in glob_module.glob(pattern, recursive=True) if _within_repo(m)]
        return "\n".join(matches[:20]) if matches else "No files found"
    elif tool_name == "grep_content":
        pattern = tool_input["pattern"]
        path = tool_input["path"]
        if not _within_repo(path):
            return "Error: access denied — path is outside the repository"
        try:
            result = subprocess.run(
                ["grep", "-r", "-l", "-m", "5", "-e", pattern, path],
                capture_output=True, text=True, timeout=10,
            )
            return result.stdout[:2000] if result.stdout else "No matches"
        except Exception as e:
            return f"Error: {e}"
    return "Unknown tool"

def run_agent(system_prompt, query, max_turns=20):
    messages = [{"role": "user", "content": query}]
    tool_count = 0
    for _ in range(max_turns):
        response = client.messages.create(
            model="claude-haiku-4-5-20251001",
            max_tokens=2048,
            system=system_prompt,
            tools=TOOLS,
            messages=messages,
        )
        tool_uses = [b for b in response.content if b.type == "tool_use"]
        tool_count += len(tool_uses)
        if response.stop_reason == "end_turn" or not tool_uses:
            return "", tool_count
        messages.append({"role": "assistant", "content": response.content})
        tool_results = []
        for tool_use in tool_uses:
            result = execute_tool(tool_use.name, tool_use.input)
            tool_results.append({
                "type": "tool_result",
                "tool_use_id": tool_use.id,
                "content": result,
            })
        messages.append({"role": "user", "content": tool_results})
    return "", tool_count

BASELINE_PROMPT = f"""You are a helpful assistant answering questions about the CrewAI framework.
The repository is at {REPO_ROOT}.
Use the available tools to read files and find the answer.
Start by exploring the repository structure to understand where to find information."""

KCP_PROMPT = f"""You are a helpful assistant answering questions about the CrewAI framework.
The repository is at {REPO_ROOT}.
IMPORTANT: First read {REPO_ROOT}/knowledge.yaml to understand the repository structure.
Match the question to the triggers in knowledge.yaml and read only the files pointed to by matching units.
If a unit has summary_available: true, read the summary_unit file first (it's much smaller)."""

QUERIES = [
    "What is the difference between Flows and Crews in CrewAI?",
    "How do I create my first agent and assign it a task?",
    "How do I create a custom tool for my agent?",
    "How do I add memory to my crew?",
    "Which LLM providers does CrewAI support?",
    "How do I build a flow that triggers a crew?",
    "How do I implement a hierarchical crew with a manager agent?",
    "How do I add knowledge (RAG) to my crew?",
]

if __name__ == "__main__":
    print("CrewAI KCP Benchmark")
    print("=" * 60)
    results = []
    for i, query in enumerate(QUERIES):
        print(f"\nQuery {i+1}: {query[:60]}...")
        _, baseline = run_agent(BASELINE_PROMPT, query)
        print(f"  Baseline: {baseline} tool calls")
        _, kcp = run_agent(KCP_PROMPT, query)
        print(f"  KCP:      {kcp} tool calls")
        results.append((query, baseline, kcp))

    print("\n" + "=" * 60)
    total_baseline = sum(r[1] for r in results)
    total_kcp = sum(r[2] for r in results)
    print(f"\n{'Query':<55} {'Base':>5} {'KCP':>5} {'Saved':>6}")
    print("-" * 75)
    for query, b, k in results:
        print(f"{query[:55]:<55} {b:>5} {k:>5} {b-k:>6}")
    print("-" * 75)
    print(f"{'TOTAL':<55} {total_baseline:>5} {total_kcp:>5} {total_baseline-total_kcp:>6}")
    pct = round((1 - total_kcp/total_baseline) * 100) if total_baseline > 0 else 0
    print(f"\nReduction: {pct}% fewer tool calls with KCP")
docs/en/concepts/agents-tasks-tldr.mdx (new file, 80 additions, 0 deletions)
---
title: Agents & Tasks (TL;DR)
description: The 5 key agent attributes and how to define tasks — quick reference
icon: robot
---

## Agent: The 5 Key Attributes

An `Agent` is an autonomous unit with a role, a goal, and the tools to get things done.

| Attribute | Parameter | What it does |
| :--- | :--- | :--- |
| **Role** | `role` | Defines the agent's function and expertise |
| **Goal** | `goal` | The individual objective guiding decisions |
| **Backstory** | `backstory` | Context and personality enriching interactions |
| **Tools** | `tools` | List of capabilities the agent can use (default: `[]`) |
| **LLM** | `llm` | The language model powering the agent (default: `gpt-4o`) |

### Minimal Agent Example

```python
from crewai import Agent

researcher = Agent(
    role="Research Analyst",
    goal="Find accurate, up-to-date information on any topic",
    backstory="An expert at gathering data from multiple sources and identifying key insights.",
    tools=[],  # add tools here, e.g. SerperDevTool()
    verbose=True,  # enable logs for debugging
)
```

## Task: Description + Expected Output + Agent

A `Task` is a specific assignment given to an agent. Two fields are required; agent assignment is strongly recommended.

| Attribute | Parameter | What it does |
| :--- | :--- | :--- |
| **Description** | `description` | What the agent must do |
| **Expected Output** | `expected_output` | What a successful completion looks like |
| **Agent** | `agent` | Which agent handles this task |
| **Context** | `context` | Other tasks whose outputs feed into this one |
| **Output File** | `output_file` | Save output to a file path |

### Minimal Task Example

```python
from crewai import Task

research_task = Task(
    description="Research the top 5 AI frameworks released in 2025 and summarize their key features.",
    expected_output="A markdown list of 5 frameworks with name, key feature, and one-sentence summary.",
    agent=researcher,
)
```

## Process Types: How Tasks Are Executed

```python
from crewai import Crew, Process

# Sequential (default): tasks run in order, output feeds into the next
crew = Crew(agents=[...], tasks=[...], process=Process.sequential)

# Hierarchical: a manager LLM assigns tasks based on agent capabilities
crew = Crew(
    agents=[...],
    tasks=[...],
    process=Process.hierarchical,
    manager_llm="gpt-4o",  # required for hierarchical
)
```

**Sequential** — use when tasks have a clear order and each builds on the previous.
**Hierarchical** — use when tasks should be dynamically assigned by a manager agent.

## Full Reference

- All agent attributes: [concepts/agents.mdx](/en/concepts/agents)
- All task attributes: [concepts/tasks.mdx](/en/concepts/tasks)
docs/en/concepts/flows-tldr.mdx (new file, 102 additions, 0 deletions)
---
title: Flows (TL;DR)
description: Event-driven workflows with state management — quick reference
icon: arrow-progress
---

## What is a Flow?

A Flow is the control plane of your CrewAI application. It chains tasks together with state, conditional logic, and event-driven triggers. Crews run *inside* Flow steps when you need autonomous agent intelligence.

## Key Decorators

| Decorator | Purpose |
| :--- | :--- |
| `@start()` | Entry point — runs when `flow.kickoff()` is called |
| `@listen(method)` | Runs after the specified method completes, receives its output |
| `@router(method)` | Routes execution to different branches based on return value |
| `@listen(and_(a, b))` | Runs only after **both** `a` and `b` complete (`and_` is a helper passed to `listen`, not a standalone decorator) |
| `@listen(or_(a, b))` | Runs when **either** `a` or `b` completes |

## Minimal Flow + Crew Example

```python
from crewai import Agent, Task, Crew
from crewai.flow.flow import Flow, start, listen

class ResearchFlow(Flow):
    # Unstructured state: a dict available as self.state throughout the flow

    @start()
    def get_topic(self):
        # Set initial state
        self.state["topic"] = "AI agent frameworks"
        return self.state["topic"]

    @listen(get_topic)
    def run_research_crew(self, topic):
        # Spin up a Crew inside a Flow step
        researcher = Agent(
            role="Research Analyst",
            goal=f"Research {topic} thoroughly",
            backstory="Expert at finding and synthesizing information.",
        )
        task = Task(
            description=f"Research the latest developments in {topic}.",
            expected_output="A 3-bullet summary of key findings.",
            agent=researcher,
        )
        crew = Crew(agents=[researcher], tasks=[task], verbose=False)
        result = crew.kickoff()
        self.state["research"] = str(result)
        return str(result)

    @listen(run_research_crew)
    def save_result(self, research):
        # Flow step: save to file (plain Python, no agent needed)
        with open("research_output.txt", "w") as f:
            f.write(research)
        print("Saved!")
        return research


flow = ResearchFlow()
result = flow.kickoff()
```

## Routing Example

```python
from crewai.flow.flow import Flow, start, listen, router

class BranchingFlow(Flow):
    @start()
    def check_input(self):
        return "short"  # or "long"

    @router(check_input)
    def route_by_length(self, result):
        if result == "short":
            return "handle_short"
        return "handle_long"

    @listen("handle_short")
    def short_path(self):
        return "Quick answer"

    @listen("handle_long")
    def long_path(self):
        return "Detailed analysis"
```

## State Management

- `self.state` is a dict persisted across all steps in the flow
- Every flow instance gets a unique UUID at `self.state["id"]`
- State is accessible in every `@start`, `@listen`, and `@router` method

## Full Reference

- Complete Flows docs: [concepts/flows.mdx](/en/concepts/flows)
- Step-by-step tutorial: [guides/flows/first-flow.mdx](/en/guides/flows/first-flow)