Skip to content

feat(kiloclaw): forward proxy token and enable enforcement default#599

Merged
pandemicsyn merged 2 commits intomainfrom
florian/phase2
Feb 26, 2026
Merged

feat(kiloclaw): forward proxy token and enable enforcement default#599
pandemicsyn merged 2 commits intomainfrom
florian/phase2

Conversation

@pandemicsyn
Copy link
Contributor

@pandemicsyn pandemicsyn commented Feb 26, 2026

Summary

  • derive and inject x-kiloclaw-proxy-token on all worker proxy requests using the sandbox-scoped gateway token
  • keep Fly machine pinning behavior in one helper (fly-force-instance-id) and add unit coverage for header construction
  • set REQUIRE_PROXY_TOKEN to true in worker vars so new/restarted instances enforce proxy-token auth by default
  • reduce public controller health response surface: /_kilo/health and /health now return only {\"status\":\"ok\"}

Derive and inject per-sandbox proxy tokens on all worker-to-Fly proxy requests so gateway auth can be safely enforced. Turn on REQUIRE_PROXY_TOKEN by default so new and restarted instances adopt enforcement immediately.
@pandemicsyn pandemicsyn marked this pull request as ready for review February 26, 2026 02:01
const gatewayToken = await deriveGatewayToken(sandboxId, gatewayTokenSecret);
forwardHeaders.set('x-kiloclaw-proxy-token', gatewayToken);
forwardHeaders.set('fly-force-instance-id', machineId);
forwardHeaders.delete('host');
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WARNING: Incomplete hop-by-hop header stripping when forwarding request headers

Only host is removed, but connection and headers listed by connection (for example upgrade, te, proxy-connection, keep-alive, transfer-encoding) should also be stripped before proxying. Forwarding hop-by-hop headers can cause protocol inconsistencies and request smuggling-style edge cases across intermediaries.

Copy link
Contributor Author

@pandemicsyn pandemicsyn Feb 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is kinda true kinda not, for WS Connection/Upgrade are part of the handshake which is why haven't done strict hop-by-hop stripping.

@kilo-code-bot
Copy link
Contributor

kilo-code-bot bot commented Feb 26, 2026

Code Review Summary

Status: 3 Issues Found | Recommendation: Address before merge

Fix these issues in Kilo Cloud

Overview

Severity Count
CRITICAL 0
WARNING 3
SUGGESTION 0
Issue Details (click to expand)

WARNING

File Line Issue
kiloclaw/controller/src/routes/health.ts 5 Health endpoint now always reports healthy, masking startup/crash states and potentially routing traffic to unavailable gateways.
kiloclaw/src/index.ts 253 DEV_MODE env bypass is unintentionally negated in the proxy path due to an unconditional GATEWAY_TOKEN_SECRET guard.
kiloclaw/src/utils/proxy-headers.ts 15 Hop-by-hop header stripping is incomplete (host only), risking proxy protocol inconsistencies and edge-case request handling bugs.
Other Observations (not in diff)

Issues found in unchanged code that cannot receive inline comments:

File Line Issue
None - No additional out-of-diff issues identified.
Files Reviewed (6 files)
  • kiloclaw/controller/src/routes/health.test.ts - 0 issues
  • kiloclaw/controller/src/routes/health.ts - 1 issue
  • kiloclaw/src/index.ts - 1 issue
  • kiloclaw/src/utils/proxy-headers.ts - 1 issue
  • kiloclaw/src/utils/proxy-headers.test.ts - 0 issues
  • kiloclaw/wrangler.jsonc - 0 issues

Return a generic  response for  and  to avoid exposing runtime metadata in unauthenticated probes while preserving Fly health-check compatibility.
);
};
export function registerHealthRoute(app: Hono, _supervisor: Supervisor): void {
const handler = (c: Context) => c.json({ status: 'ok' });
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WARNING: Health endpoint now always reports healthy, masking startup/crash states

This route used to return 503 when the gateway was not running. Returning 200/{ status: 'ok' } unconditionally can make readiness checks and external monitors treat unhealthy instances as healthy, which risks routing traffic to instances that are still starting or have crashed.

@pandemicsyn pandemicsyn merged commit f64a77b into main Feb 26, 2026
12 checks passed
@pandemicsyn pandemicsyn deleted the florian/phase2 branch February 26, 2026 15:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants