feat: add guard extension point for third-party security scanners#1165

Closed
PaoloC68 wants to merge 5 commits into agent0ai:development from PaoloC68:guard-system

Conversation

PaoloC68 commented Mar 1, 2026

Supersedes #1106 (closed). Rebased onto latest development.

Summary

Adds a guard extension point that lets third-party security scanners (e.g. Cisco AI Skill Scanner) block dangerous tools and prompts before they execute. This builds on top of the existing extension system and the newly merged plugin architecture (#998) — zero new abstractions, just the plumbing that lets plugins say "stop".

Problem

Agent Zero can install and run arbitrary skills, but there is currently no hook where a security tool can inspect and block:

  • A tool call before it executes (e.g. block a skill flagged as malicious)
  • A prompt before it reaches the LLM (e.g. detect injection attacks)
  • A skill at install time (e.g. trigger an automated scan)

Related issues: #1074, #1071, #943, #851

What Changed

1. Mutable event dict in call_extensions() (python/helpers/extension.py)

  • Every extension call now receives and returns an event dict with metadata (extension_point, agent, plus all kwargs)
  • External packages can register guard handlers via importlib.metadata entry points (group: agent_zero.guards)
  • Entry point discovery is cached after the first call, so later calls do not repeat the lookup
  • Guard handler crashes are isolated — one bad handler cannot break the agent loop
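
The discovery, caching, and crash isolation listed above could look roughly like the following sketch. This is not the PR's code: `discover_guards`, `call_guards`, and the `handlers` override are hypothetical names used for illustration, and `entry_points(group=...)` assumes Python 3.10 or newer.

```python
import logging
from importlib.metadata import entry_points
from typing import Any, Optional

log = logging.getLogger("guards")

GUARD_GROUP = "agent_zero.guards"  # entry point group named in the PR
_guard_cache: Optional[list] = None  # filled on first discovery, reused afterwards


def discover_guards() -> list:
    """Load guard handlers from installed packages once, then reuse the list."""
    global _guard_cache
    if _guard_cache is None:
        loaded = []
        for ep in entry_points(group=GUARD_GROUP):  # keyword form needs Python 3.10+
            try:
                loaded.append(ep.load())
            except Exception:
                # A package that fails to import must not break discovery.
                log.exception("could not load guard %s", ep.name)
        _guard_cache = loaded
    return _guard_cache


def call_guards(extension_point: str, agent: Any = None,
                handlers: Optional[list] = None, **kwargs: Any) -> dict:
    """Build the mutable event dict, pass it to every handler, and return it.

    `handlers` is a hypothetical override that skips discovery (handy in tests).
    """
    event = {"extension_point": extension_point, "agent": agent, **kwargs}
    for handler in (discover_guards() if handlers is None else handlers):
        try:
            handler(event)
        except Exception:
            # Isolate the crash: log it and continue with the next handler.
            log.exception("guard handler failed; continuing")
    return event
```

A handler here is any callable that takes the event dict and mutates it in place, for example by setting `event["blocked"] = True`.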

2. Blocking logic in agent.py (two sites only)

  • tool_execute_before: if any handler sets event["blocked"] = True, the tool is skipped and a Response is returned with event["block_reason"]
  • message_loop_prompts_after: if any handler sets event["blocked"] = True, the LLM call is skipped and a warning is injected into the message history
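
As a rough illustration of the tool-side check, the sketch below skips the tool and returns a response carrying the block reason when a guard sets `event["blocked"]`. The `Response` dataclass is a stand-in with an assumed shape, and `execute_tool_guarded` is a hypothetical helper, not the code in agent.py.

```python
from dataclasses import dataclass
from typing import Any, Callable, Iterable


@dataclass
class Response:
    """Stand-in for the agent's tool response type (shape assumed)."""
    message: str
    break_loop: bool = False


def execute_tool_guarded(tool_name: str, tool_fn: Callable[..., Any],
                         guards: Iterable[Callable[[dict], None]],
                         **tool_args: Any) -> Response:
    """Run guards first; execute the tool only if none of them blocked it."""
    event = {
        "extension_point": "tool_execute_before",
        "tool_name": tool_name,
        "tool_args": tool_args,
    }
    for guard in guards:
        guard(event)  # a guard may set event["blocked"] and event["block_reason"]

    if event.get("blocked"):
        reason = event.get("block_reason", "blocked by guard")
        return Response(message=f"Tool '{tool_name}' was blocked: {reason}")

    return Response(message=str(tool_fn(**tool_args)))
```

The prompt-side check would follow the same pattern, except that a blocked event skips the LLM call and adds a warning to the message history instead of returning a tool response.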

3. skill_install extension point (python/helpers/skills_import.py)

  • Fires after each skill is copied into place, passing skill_name and skill_path
  • Enables scanners to run at install time (before the skill is ever used)
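
A minimal sketch of the install-time hook, assuming handlers are plain callables that receive an event dict. `install_skill` and its parameters are hypothetical; only the `skill_name` and `skill_path` keys come from the description above.

```python
import shutil
from pathlib import Path
from typing import Callable, Iterable


def install_skill(src: Path, skills_root: Path,
                  handlers: Iterable[Callable[[dict], None]]) -> Path:
    """Copy a skill directory into place, then fire the skill_install handlers."""
    dest = skills_root / src.name
    shutil.copytree(src, dest, dirs_exist_ok=True)  # dirs_exist_ok needs Python 3.8+

    # The event fires only after the copy, so scanners see the final location.
    event = {
        "extension_point": "skill_install",
        "skill_name": src.name,
        "skill_path": str(dest),
    }
    for handler in handlers:
        handler(event)
    return dest
```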

4. Guard utilities (python/helpers/guard_utils.py)

  • save_scan_status(skill_name, status) / get_scan_status(skill_name) — JSON file per skill in usr/skill_scans/
  • Status constants: SAFE, NEEDS_REVIEW, BLOCKED
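
One way the per-skill JSON persistence might be written is sketched below. The constant values, the JSON layout, and the extra `base_dir` parameter are assumptions for illustration; the PR only states the function names, the status names, and the usr/skill_scans/ location.

```python
import json
from pathlib import Path
from typing import Optional

# Status names come from the PR; the string values are assumed.
SAFE = "safe"
NEEDS_REVIEW = "needs_review"
BLOCKED = "blocked"
_VALID = {SAFE, NEEDS_REVIEW, BLOCKED}

SCAN_DIR = Path("usr/skill_scans")  # one JSON file per skill


def save_scan_status(skill_name: str, status: str,
                     base_dir: Path = SCAN_DIR) -> None:
    """Write the scan status for one skill to <base_dir>/<skill_name>.json."""
    if status not in _VALID:
        raise ValueError(f"unknown scan status: {status!r}")
    base_dir.mkdir(parents=True, exist_ok=True)
    payload = {"skill_name": skill_name, "status": status}
    (base_dir / f"{skill_name}.json").write_text(json.dumps(payload))


def get_scan_status(skill_name: str,
                    base_dir: Path = SCAN_DIR) -> Optional[str]:
    """Return the stored status, or None if the skill was never scanned."""
    path = base_dir / f"{skill_name}.json"
    if not path.exists():
        return None
    return json.loads(path.read_text()).get("status")
```

Returning `None` for an unscanned skill leaves the policy decision (allow, or treat as NEEDS_REVIEW) to the guard that reads the status.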

5. Example guard extensions (drop-in, no config needed)

  • python/extensions/tool_execute_before/_05_scan_status_guard.py — blocks tools linked to skills with BLOCKED scan status
  • python/extensions/message_loop_prompts_after/_05_prompt_length_guard.py — detects prompt injection patterns and oversized prompts
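
To give a feel for the prompt guard, here is a small sketch that flags oversized prompts and a couple of injection phrases. The size limit, the regex patterns, and the `prompt` event key are all hypothetical and are not taken from the PR's example file.

```python
import re

MAX_PROMPT_CHARS = 20_000  # hypothetical limit

# Hypothetical patterns; a real scanner would use a much richer rule set.
INJECTION_PATTERNS = [
    re.compile(r"ignore (all )?(previous|prior) instructions", re.IGNORECASE),
    re.compile(r"disregard (the )?system prompt", re.IGNORECASE),
]


def prompt_guard(event: dict) -> None:
    """Set event["blocked"] when the prompt is too long or looks like an injection."""
    prompt = event.get("prompt", "")  # assumed key for the assembled prompt text

    if len(prompt) > MAX_PROMPT_CHARS:
        event["blocked"] = True
        event["block_reason"] = f"prompt exceeds {MAX_PROMPT_CHARS} characters"
        return

    for pattern in INJECTION_PATTERNS:
        if pattern.search(prompt):
            event["blocked"] = True
            event["block_reason"] = f"matched injection pattern: {pattern.pattern}"
            return
```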

6. Test suite (tests/test_guard_system.py)

  • 16 tests covering: event dict mutation, entry point discovery, cache behavior, crash isolation, guard_utils persistence, both example guards

Changes

| File | Change |
| --- | --- |
| agent.py | Add blocking logic at tool_execute_before and message_loop_prompts_after |
| python/helpers/extension.py | Mutable event dict, entry point guard discovery, cache |
| python/helpers/guard_utils.py | Scan status persistence (save/get per skill) |
| python/helpers/skills_import.py | skill_install extension point hook |
| python/extensions/tool_execute_before/_05_scan_status_guard.py | Example: block tools with BLOCKED scan status |
| python/extensions/message_loop_prompts_after/_05_prompt_length_guard.py | Example: detect prompt injection patterns |
| tests/test_guard_system.py | 16 tests for the full guard system |

Design Decisions

  • No new abstractions: Guards are just extension handlers that set event["blocked"].
  • Only two blocking sites: tool_execute_before and message_loop_prompts_after. Minimal surface, maximum impact.
  • Entry points for external packages: a pip-installable package registers via [project.entry-points."agent_zero.guards"] — discovered automatically.
  • Filename-prefix ordering: _05_ prefix runs before default _10_ extensions, following existing convention.
  • No existing extensions modified: all changes are additive.

Testing

python -m pytest tests/test_guard_system.py -v

All 16 tests pass. Rebased cleanly onto latest development (one conflict in extension.py resolved — kept upstream's _CACHE_AREA + extensible decorator alongside guard's _guard_cache).

frdel (Collaborator) commented Mar 1, 2026

Hello, this is not something that should be implemented in agent.py directly; we're building the plugin system for this.

PaoloC68 (Author) commented Mar 3, 2026

Fair point — you're right that this doesn't belong in agent.py.

The guard system already works as a standalone plugin with zero core changes:

  • Tool blocking uses RepairableException (caught by the existing message loop)
  • Prompt scanning writes to loop_data.extras_temporary (no event dict needed)
  • Scan status persistence uses get_plugin_config/save_plugin_config

No modifications to agent.py, extension.py, or skills_import.py required.
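
The exception-based approach to tool blocking can be sketched as follows. `RepairableException` is defined here as a stand-in for the framework's recoverable error type, and `BLOCKED_SKILLS`, `scan_status_guard`, and `message_loop_step` are hypothetical names; the real message loop is more involved.

```python
from typing import Callable, List


class RepairableException(Exception):
    """Stand-in for the framework's recoverable error type."""


BLOCKED_SKILLS = {"evil_skill"}  # hypothetical scan results


def scan_status_guard(tool_name: str) -> None:
    """Plugin-side guard: raise instead of mutating a shared event dict."""
    if tool_name in BLOCKED_SKILLS:
        raise RepairableException(f"tool '{tool_name}' is blocked by scan status")


def message_loop_step(tool_name: str, tool_fn: Callable[[], str],
                      history: List[str]) -> None:
    """Simplified loop step: a repairable error becomes a warning in the history."""
    try:
        scan_status_guard(tool_name)
        history.append(tool_fn())
    except RepairableException as exc:
        # The loop keeps running; the warning is recorded instead of a tool result.
        history.append(f"warning: {exc}")
```

Because the guard raises before the tool runs, the core loop needs no knowledge of guards; it only needs its existing handling of the recoverable exception.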

I'll publish it to the plugin index instead. Thanks for the feedback.

PaoloC68 closed this Mar 3, 2026

2 participants