feat: Add option to propagate root rewards to subagents #13

Open
ApGa wants to merge 1 commit into main from apga/root_reward_prop

ApGa (Owner) commented Mar 24, 2026

No description provided.

shady-cs15 pushed a commit to shady-cs15/platoon that referenced this pull request Mar 24, 2026
…egation bonus

Replaces the data-processing-level reward override with rollout-level
propagation inspired by ApGa#13:

- Add shared utility platoon/utils/subagent_rewards.py with
  propagate_root_success() that copies root reward/success into child
  trajectories and rewrites subagent_succeeded so the standard
  reward_processor naturally computes correct delegation bonuses
- Add propagate_root_success flag to RolloutConfig
- Remove root_reward_propagation from WorkflowConfig, step_wise.py,
  and areal_data_processing.py (no longer needed)
- Simplify make_reward_processor to single code path

Result: leaf agents (no delegation) get base success (1.0); intermediate
agents that delegated get full reward with bonus (1.4).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
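
The rollout-level propagation described in the commit message above can be illustrated with a minimal sketch. This is not the code from platoon/utils/subagent_rewards.py, which is not shown on this page. The `Trajectory` dataclass, its field layout, the `reward_processor` function, and the 0.4 bonus (inferred only from the 1.0 and 1.4 figures quoted in the commit message) are all assumptions made for illustration.

```python
from dataclasses import dataclass, field

# Assumed values: the commit message only quotes 1.0 (base) and 1.4 (with bonus).
BASE_SUCCESS = 1.0
DELEGATION_BONUS = 0.4


@dataclass
class Trajectory:
    """Hypothetical stand-in for a rollout trajectory; field names are assumptions."""
    reward: float = 0.0
    success: bool = False
    # One flag per delegation made by this agent: did that subagent succeed?
    subagent_succeeded: list = field(default_factory=list)
    children: list = field(default_factory=list)


def propagate_root_success(root: Trajectory) -> None:
    """Copy the root's reward/success into every descendant, in place.

    Each trajectory's subagent_succeeded flags are also rewritten to the root's
    outcome, so a downstream reward processor sees consistent delegation results.
    """
    stack = [root]
    while stack:
        traj = stack.pop()
        traj.reward = root.reward
        traj.success = root.success
        traj.subagent_succeeded = [root.success for _ in traj.children]
        stack.extend(traj.children)


def reward_processor(traj: Trajectory) -> float:
    """Assumed single-path processor: base success plus a bonus for successful delegation."""
    if not traj.success:
        return 0.0
    bonus = DELEGATION_BONUS if any(traj.subagent_succeeded) else 0.0
    return BASE_SUCCESS + bonus


# Usage: root -> intermediate -> leaf, where only the root was scored as successful.
leaf = Trajectory()
intermediate = Trajectory(children=[leaf], subagent_succeeded=[False])
root = Trajectory(reward=1.0, success=True, children=[intermediate],
                  subagent_succeeded=[False])

propagate_root_success(root)
leaf_reward = reward_processor(leaf)
intermediate_reward = reward_processor(intermediate)
```

Under these assumptions the leaf ends up with the base success and the intermediate agent with base plus bonus, which matches the result stated in the commit message. Doing the rewrite once at rollout time is what would let the reward processor stay on a single code path.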