fix: heartbeat timing bugs — reset on sell, missing activation, ConfigMap race

## Summary

Several timing bugs cause the agent heartbeat to silently fall back to the 30-minute default interval instead of the configured 5-minute interval. They were discovered via automated user-flow validation (pi-autoresearch).

## Root Causes

### 1. `SyncAgentBaseURL` resets heartbeat on every `obol sell http`

```
obol sell http → EnsureTunnelForSell → SyncAgentBaseURL → helmfile sync
                                                              ↓
                                              ConfigMap re-rendered WITHOUT heartbeat config
                                                              ↓
                                              OpenClaw falls back to 30m default
```

Every `obol sell http` call triggers a helmfile sync that overwrites the `openclaw-config` ConfigMap. The Helm chart's `_helpers.tpl` does not render `agents.defaults.heartbeat` by default, so the heartbeat config is lost.

**Fix**: Added `patchHeartbeatAfterSync()`, which re-patches the ConfigMap with `every: "5m"` after each helmfile sync. The sync is now also idempotent: it is skipped entirely when the tunnel URL hasn't changed.
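
The control flow of this fix can be sketched roughly as follows. This is a minimal illustration, not the code in `internal/tunnel/agent.go`: the `syncer` type, its fields, and the `syncAgentBaseURL` method are hypothetical names, and the two injected functions stand in for the real helmfile sync and ConfigMap patch.

```go
package main

import "errors"

// syncer is a hypothetical stand-in for the tunnel code. The side effects are
// injected so the control flow can be exercised without a cluster.
type syncer struct {
	lastURL        string                 // tunnel URL used by the last successful sync
	helmfileSync   func(url string) error // assumed wrapper around `helmfile sync`
	patchHeartbeat func() error           // assumed to re-apply every: "5m" to the ConfigMap
}

// syncAgentBaseURL reports whether a sync actually ran. It skips all work when
// the tunnel URL is unchanged and otherwise re-patches the heartbeat right
// after the sync, because the sync re-renders the ConfigMap without it.
func (s *syncer) syncAgentBaseURL(url string) (bool, error) {
	if url == "" {
		return false, errors.New("empty tunnel URL")
	}
	if url == s.lastURL {
		return false, nil // idempotent: nothing to re-render
	}
	if err := s.helmfileSync(url); err != nil {
		return false, err
	}
	if err := s.patchHeartbeat(); err != nil {
		return true, err
	}
	// Record the URL only after both steps succeed, so a failed patch is
	// retried on the next call instead of being skipped as "unchanged".
	s.lastURL = url
	return true, nil
}
```

In this sketch, calling `syncAgentBaseURL` twice with the same URL runs the sync and the patch once; a changed URL runs both again.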

### 2. Heartbeat not activated after `obol agent init`

`Init()` injects `HEARTBEAT.md` but doesn't ensure that the ConfigMap has the heartbeat interval set. On a fresh cluster, the first `obol agent init` leaves the heartbeat at the chart default (a 30m interval, or no heartbeat config at all).

**Fix**: Added `ensureHeartbeatActive()`, which reads the ConfigMap, checks for `agents.defaults.heartbeat`, and patches it in if missing.
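
The read, check, and patch logic might look something like the sketch below. It assumes the config stored in the ConfigMap is a plain JSON object, which this issue does not confirm; `ensureHeartbeat` and `child` are hypothetical helpers, and reading or writing the ConfigMap itself is left out.

```go
package main

import "encoding/json"

// ensureHeartbeat adds agents.defaults.heartbeat.every = interval when the
// heartbeat key is absent. It returns the (possibly updated) document and
// whether a patch is needed. Assumption: raw is a JSON object.
func ensureHeartbeat(raw []byte, interval string) ([]byte, bool, error) {
	var cfg map[string]any
	if err := json.Unmarshal(raw, &cfg); err != nil {
		return nil, false, err
	}
	if cfg == nil {
		cfg = map[string]any{} // raw was JSON null
	}
	defaults := child(child(cfg, "agents"), "defaults")
	if _, ok := defaults["heartbeat"]; ok {
		return raw, false, nil // already configured: leave it untouched
	}
	defaults["heartbeat"] = map[string]any{"every": interval}
	out, err := json.Marshal(cfg)
	if err != nil {
		return nil, false, err
	}
	return out, true, nil
}

// child returns parent[key] as an object, creating it when it is absent or
// not an object.
func child(parent map[string]any, key string) map[string]any {
	if m, ok := parent[key].(map[string]any); ok {
		return m
	}
	m := map[string]any{}
	parent[key] = m
	return m
}
```

An existing heartbeat block is returned unchanged, so a user-configured interval is not overwritten in this sketch.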

### 3. Chokidar hot-reload misses ConfigMap symlink swaps

Kubernetes updates ConfigMaps by swapping symlinks (`..data → ..2026_03_19_...`). The chokidar file watcher inside OpenClaw uses inotify, which doesn't reliably detect symlink target changes. As a result, the pod keeps whatever config was present at boot, and ConfigMap patches applied later are silently ignored until the next pod restart.

**Fix**: After patching the heartbeat ConfigMap, perform a `rollout restart` to ensure the new pod starts with the correct config loaded.
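
For reference, `kubectl rollout restart` triggers a new rollout by stamping the workload's pod template with a `kubectl.kubernetes.io/restartedAt` annotation. The sketch below only builds that patch body; the helper names are hypothetical, and which workload is patched and how the patch is applied (kubectl or a client library) are not specified in this issue.

```go
package main

import (
	"encoding/json"
	"time"
)

// restartedAtKey is the annotation that `kubectl rollout restart` sets on the
// pod template to force a new rollout.
const restartedAtKey = "kubectl.kubernetes.io/restartedAt"

// restartPatchAt builds a merge patch that changes the pod template
// annotation to stamp. Any change to the pod template starts a rollout.
func restartPatchAt(stamp string) ([]byte, error) {
	patch := map[string]any{
		"spec": map[string]any{
			"template": map[string]any{
				"metadata": map[string]any{
					"annotations": map[string]string{restartedAtKey: stamp},
				},
			},
		},
	}
	return json.Marshal(patch)
}

// restartPatch stamps the patch with the current time in RFC 3339 format.
func restartPatch() ([]byte, error) {
	return restartPatchAt(time.Now().UTC().Format(time.RFC3339))
}
```

Because the new pod mounts the ConfigMap at startup, it reads the patched heartbeat config without relying on the file watcher.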

### 4. Incorrect pod restart on heartbeat patch (removed)

The old code restarted the pod after every heartbeat ConfigMap patch, even though OpenClaw's hot-reload was expected to pick up the change. This caused unnecessary downtime during `obol agent init`.

**Fix**: Removed the pod restart from the patch path. The rollout restart in fix #3 handles the deterministic case.

## Impact

Without these fixes, the heartbeat fires every 30 minutes instead of 5 minutes. This means:
- ServiceOffer reconciliation takes 30+ minutes instead of ~5 minutes
- Users see `obol sell status` stuck in non-Ready state for extended periods
- The monetize guide tells users to "wait ~60s for agent heartbeat" but it actually takes 30 minutes

## Files Changed

| File | Change |
|------|--------|
| `internal/agent/agent.go` | `ensureHeartbeatActive()`, simplified `Init()` |
| `internal/tunnel/agent.go` | `patchHeartbeatAfterSync()`, idempotent sync |
| `internal/tunnel/tunnel.go` | Tunnel stop, storefront cleanup, state management |
| `internal/openclaw/openclaw.go` | Removed incorrect restart from heartbeat patch |
| `internal/stack/stack.go` | Backend detection for host IP resolution |
| `cmd/obol/sell.go` | Tunnel lifecycle (auto-start on sell, auto-stop on last delete) |

## Verification

Validated by pi-autoresearch running 39 experiments with 90/90 flow steps passing, including:
- flow-06: `obol sell http` → poll `obol sell status` → all conditions Ready within 8 minutes
- flow-09: full lifecycle (sell → stop → delete → verify cleanup)

The heartbeat consistently fires within 5 minutes across all test runs.
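
A check in the style of flow-06 boils down to polling with a deadline. The helper below is a generic sketch, not the pi-autoresearch harness: `pollUntil` is a hypothetical name, and the `check` callback stands in for whatever inspects `obol sell status` for Ready conditions.

```go
package main

import "time"

// pollUntil calls check every interval until it reports true or the timeout
// would be exceeded. It returns whether check succeeded and how many times it
// was called.
func pollUntil(timeout, interval time.Duration, check func() bool) (bool, int) {
	deadline := time.Now().Add(timeout)
	attempts := 0
	for {
		attempts++
		if check() {
			return true, attempts
		}
		// Stop if sleeping for another interval would pass the deadline.
		if time.Now().Add(interval).After(deadline) {
			return false, attempts
		}
		time.Sleep(interval)
	}
}
```

With an 8-minute timeout and a 5-minute heartbeat, a call such as `pollUntil(8*time.Minute, 10*time.Second, check)` leaves room for one full heartbeat cycle plus reconciliation; the 10-second interval is an arbitrary choice for illustration.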

