Skip to content

feat: add shell tool and tests#39

Open
notowen333 wants to merge 1 commit intomainfrom
maintainer/notowen333-shell-tool
Open

feat: add shell tool and tests#39
notowen333 wants to merge 1 commit intomainfrom
maintainer/notowen333-shell-tool

Conversation

@notowen333
Copy link

@notowen333 notowen333 commented Mar 6, 2026

Issue #, if available:
https://github.com/strands-agents/private-sdk-python-staging/issues/340

Description of changes:

  • simple subprocess based shell avoiding full terminal emulation with pty

Distinct from TS:

  • persistent shell (i.e. cd and export will persist across tool calls)
  • binary encoded output is parsed
  • standard error and standard out write to the same pipe to avoid race conditions
  • clean garbage collection tied to agent lifecycle
  • uses default shell instead of specifying bash

First PR to add pytest tests to the python scripts section of the repo, so needed to add __init__ and requirements.txt

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

"""Test basic shell command execution through agent."""
result = agent("Run: echo 'hello'")
result_str = str(result)
print(f"\nBasic execution result: {result_str[:300]}")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add validation that the tool is being called; I think the easiest way is to validate the messages on the agent?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added tool call validation on all the tests

print("\n--- Complex Multi-Step Workflow Test ---")

# Step 1: Create temporary directory structure
result1 = agent("Create a temp directory at /tmp/shell_test_$$ and cd into it, then confirm with pwd")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use a temp directory and instruct it to write there: https://github.com/strands-agents/sdk-python/blob/2d766c4e6974732590439a79ab10c4f07f2d1806/tests_integ/test_invalid_tool_names.py#L10

Then can we assert the actual directory instead of taking the agent output.

In general I want to verify the actual behavior instead of assuming that the agent output text indicating that it did it

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense fixed on both counts

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made similar fixes for the remaining tests to use the temp directory to verify instead of the agent response string.

from ..shell_tool import ShellSession, shell_tool


class TestShellSession:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't have it written anywhere that I can find) but we just do top level functions for tests; no classes

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure flattened


def __init__(self, timeout: int = 30):
self.timeout = timeout
self.process = None
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can timeout and process be private?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

self._alive = False
self._buffer_condition.notify_all()

def run(self, command: str, timeout: int | None = None) -> str:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we always have some sort of default timeout? I'm a little concerned about commands taht hang the agent- but maybe that's not our concern?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I noticed today that Kiro timedout (though noticed that it was still running) - this reinforces my belief of a default timeout. That might move us to a class-based tool so that we can also allow folks to specify the default timeout

Copy link
Author

@notowen333 notowen333 Mar 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The default timeout is 30 seconds see above code snippet. Is it not enough to accept timeout arg in the tool call? If this is the only configuration a class based tool seems like overkill for the pattern.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add to the PR description how this compares to the TS implementation?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

most of the traits I noted are the differentiators. Separated it out in the description.

Uses the system default shell ($SHELL, defaulting to /bin/bash) with clean
startup configuration (--noprofile --norc for bash, -f for zsh).

**Supported commands:**
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we really need this? Listing specific commands seems odd when we're not providing them

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed the specific supported commands but left unsupported

if restart and (not command or command.strip() == ""):
if hasattr(tool_context.agent, SESSION_ATTR):
getattr(tool_context.agent, SESSION_ATTR).stop()
setattr(tool_context.agent, SESSION_ATTR, ShellSession())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is this doing? Are we modifying the agent's properties?

We should find another way of doing this. (this could indicate a gap in the SDK FWIW)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is storing a ShellSession on the agent object itself for persistence.

Something like this would be the main alternative:

_sessions = {}  # Module-level

@tool(context=True)
def shell_tool(command: str, ..., tool_context: ToolContext = None):
    agent_id = id(tool_context.agent)  # Use memory address as key
    
    if agent_id not in _sessions:
        _sessions[agent_id] = ShellSession()
    
    return _sessions[agent_id].run(command)

@notowen333 notowen333 force-pushed the maintainer/notowen333-shell-tool branch from 98bd484 to db37c44 Compare March 9, 2026 15:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants