feat(plugins): add guard extension point for third-party security scanners #1106

Closed
PaoloC68 wants to merge 5 commits into agent0ai:development from PaoloC68:guard-system

@PaoloC68

Summary

Adds a guard extension point that lets third-party security scanners (e.g. Cisco AI Skill Scanner) block dangerous tools and prompts before they execute. This builds on top of the existing extension system and the newly merged plugin architecture (#998) — zero new abstractions, just the plumbing that lets plugins say "stop".

Problem

Agent Zero can install and run arbitrary skills, but there is currently no hook where a security tool can inspect and block:

  • A tool call before it executes (e.g. block a skill flagged as malicious)
  • A prompt before it reaches the LLM (e.g. detect injection attacks)
  • A skill at install time (e.g. trigger an automated scan)

Related issues: #1074, #1071, #943, #851

What Changed

1. Mutable event dict in call_extensions() (python/helpers/extension.py)

  • Every extension call now receives and returns an event dict with metadata (extension_point, agent, plus all kwargs)
  • External packages can register guard handlers via importlib.metadata entry points (group: agent_zero.guards)
  • Entry point discovery is cached after the first call, so later calls do not repeat the metadata lookup
  • Guard handler crashes are isolated — one bad handler cannot break the agent loop

2. Blocking logic in agent.py (two sites only)

  • tool_execute_before: if any handler sets event["blocked"] = True, the tool is skipped and a Response is returned with event["block_reason"]
  • message_loop_prompts_after: if any handler sets event["blocked"] = True, the LLM call is skipped and a warning is injected into the message history
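
The tool-side check can be pictured as below. This is a hedged sketch rather than the actual change to agent.py: `run_tool_with_guards` and its parameters are hypothetical, and the `Response` dataclass is a stand-in with assumed fields, not Agent Zero's own class.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class Response:
    """Stand-in for the tool Response type; the fields are assumed."""
    message: str
    break_loop: bool = False


async def run_tool_with_guards(tool_name, tool_args, execute, handlers):
    """Run guard handlers first; execute the tool only if none blocked it."""
    event = {
        "extension_point": "tool_execute_before",
        "tool_name": tool_name,
        "tool_args": tool_args,
    }
    for handler in handlers:
        await handler(event)
    if event.get("blocked"):
        # Skip execution and report the reason back to the agent loop.
        reason = event.get("block_reason", "blocked by guard")
        return Response(message=f"Tool '{tool_name}' was not run: {reason}")
    return await execute(**tool_args)
```

Returning a normal Response instead of raising keeps the agent loop running, so the model can read the reason and pick a different action.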

3. skill_install extension point (python/helpers/skills_import.py)

  • Fires after each skill is copied into place, passing skill_name and skill_path
  • Enables scanners to run at install time (before the skill is ever used)

4. Guard utilities (python/helpers/guard_utils.py)

  • save_scan_status(skill_name, status) / get_scan_status(skill_name) — JSON file per skill in usr/skill_scans/
  • Status constants: SAFE, NEEDS_REVIEW, BLOCKED
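
A rough sketch of what the two helpers could look like, for readers who have not opened guard_utils.py. The function names, constant names and the usr/skill_scans/ directory come from the description above. The string values of the constants, the JSON layout and the optional `scan_dir` parameter are assumptions made for illustration.

```python
import json
from pathlib import Path

# Constant names are from the PR; their string values are assumed.
SAFE = "SAFE"
NEEDS_REVIEW = "NEEDS_REVIEW"
BLOCKED = "BLOCKED"

SCAN_DIR = Path("usr/skill_scans")  # directory named in the PR description


def save_scan_status(skill_name, status, scan_dir=SCAN_DIR):
    """Write the status of one skill to <scan_dir>/<skill_name>.json."""
    if status not in (SAFE, NEEDS_REVIEW, BLOCKED):
        raise ValueError(f"unknown scan status: {status!r}")
    scan_dir = Path(scan_dir)
    scan_dir.mkdir(parents=True, exist_ok=True)
    path = scan_dir / f"{skill_name}.json"
    path.write_text(json.dumps({"skill": skill_name, "status": status}))


def get_scan_status(skill_name, scan_dir=SCAN_DIR):
    """Return the stored status, or None if the skill was never scanned."""
    path = Path(scan_dir) / f"{skill_name}.json"
    if not path.exists():
        return None
    return json.loads(path.read_text())["status"]
```

The sketch does not sanitize `skill_name`; a real implementation would need to reject names containing path separators before using them as file names.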

5. Example guard extensions (drop-in, no config needed)

  • python/extensions/tool_execute_before/_05_scan_status_guard.py — blocks tools linked to skills with BLOCKED scan status
  • python/extensions/message_loop_prompts_after/_05_prompt_length_guard.py — detects prompt injection patterns and oversized prompts
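
To illustrate the second guard, the check could be as simple as the following. This is a sketch under assumptions, not the shipped extension: `check_prompt`, the `prompt` event key, the length limit and both regular expressions are hypothetical and only show the shape of the logic.

```python
import re

MAX_PROMPT_CHARS = 100_000  # hypothetical limit
INJECTION_PATTERNS = [  # illustrative patterns only, far from exhaustive
    re.compile(r"ignore (all )?(previous|prior) instructions", re.IGNORECASE),
    re.compile(r"disregard (the )?system prompt", re.IGNORECASE),
]


def check_prompt(event, max_chars=MAX_PROMPT_CHARS):
    """Set event['blocked'] if the prompt is oversized or matches a pattern."""
    prompt = event.get("prompt", "")
    if len(prompt) > max_chars:
        event["blocked"] = True
        event["block_reason"] = f"Prompt exceeds {max_chars} characters"
        return event
    for pattern in INJECTION_PATTERNS:
        if pattern.search(prompt):
            event["blocked"] = True
            event["block_reason"] = "Prompt matches a known injection pattern"
            return event
    return event
```

Pattern matching of this kind catches only the crudest injections, which is why the entry point mechanism exists for plugging in a real scanner.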

6. Test suite (tests/test_guard_system.py)

  • 16 tests covering: event dict mutation, entry point discovery, cache behavior, crash isolation, guard_utils persistence, both example guards

Design Decisions

  • No new abstractions: no Signal/Receiver/Decorator classes, no abstract base classes, no metaclass registries. Guards are just extension handlers that can set event["blocked"].
  • Only two blocking sites: tool_execute_before and message_loop_prompts_after. Minimal surface, maximum impact.
  • Entry points for external packages: a pip-installable package (like agent-zero-cisco-guard) registers via [project.entry-points."agent_zero.guards"] — discovered automatically, no manual config.
  • Filename-prefix ordering: _05_ prefix runs before default _10_ extensions, following existing convention.
  • No existing extensions modified: all changes are additive.
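
On the filename-prefix point: assuming extension files are run in plain lexicographic order of their names (an assumption about the loader, not something this PR changes), a `_05_` file sorts ahead of a `_10_` file. In the sketch below `order_extensions` and two of the three filenames are made up for illustration.

```python
def order_extensions(filenames):
    """Return extension files in the order they would run: sorted by name."""
    return sorted(name for name in filenames if name.endswith(".py"))


files = ["_10_default_handler.py", "_05_scan_status_guard.py", "_50_cleanup.py"]
print(order_extensions(files))
# ['_05_scan_status_guard.py', '_10_default_handler.py', '_50_cleanup.py']
```

Zero-padded two-digit prefixes matter here: an unpadded `_5_` would sort after `_10_` under string comparison.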

How to Test

# Run the guard system tests
python -m pytest tests/test_guard_system.py -v

# Example: install a third-party guard plugin
pip install agent-zero-cisco-guard  # registers entry point automatically

Example: Writing a Guard Plugin

# In your plugin's entry point handler (my_package/handlers.py):
async def my_guard_handler(event, **kwargs):
    # Only act on tool calls; other extension points pass through untouched.
    if event["extension_point"] == "tool_execute_before":
        # is_dangerous() is a placeholder for your scanner's own check.
        if is_dangerous(event["tool_name"]):
            event["blocked"] = True
            event["block_reason"] = "Tool flagged as dangerous"

# pyproject.toml
[project.entry-points."agent_zero.guards"]
my_guard = "my_package.handlers:my_guard_handler"

@PaoloC68
Author

Closing this PR. The guard system has been rewritten as a standalone plugin following the new plugin architecture announced in the Feb 18 update.

New plugin repo: https://github.com/PaoloC68/a0-guard-system

The core framework changes (mutable event dict, entry point guard discovery) are no longer needed. The plugin raises RepairableException from tool_execute_before extensions to block tools and uses loop_data.extras_temporary for prompt warnings, so it works within the existing extension system without any core modifications.

Will submit to the Plugin Index once the guard system is tested in production.

@PaoloC68
Author

PaoloC68 commented Mar 1, 2026

Superseded by #1165 — rebased onto latest development and resubmitted.
