feat: autoresearch-at-home integration — GPU marketplace + optimized inference selling #264

@bussyjd

Description

Vision

Three-sided marketplace on obol-stack powered by autoresearch and its distributed fork autoresearch-at-home:

  1. GPU contributors sell compute time to the autoresearch swarm, paid via x402
  2. Researchers run distributed experiments across GPU workers, discovering them via ERC-8004
  3. Service builders take autoresearch-optimized models and sell apps/inference on top via x402

Context

Autoresearch is Andrej Karpathy's autonomous LLM optimization framework — an AI agent iterates on train.py (architecture, hyperparams, optimizer), trains for 5 minutes per experiment, measures val_bpb (bits per byte), keeps improvements, reverts failures. ~12 experiments/hour, ~100 overnight.
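The keep/revert loop described above can be sketched in a few lines. This is a minimal illustration with stubbed training, not autoresearch's actual implementation; all function names here are hypothetical:

```python
import random

def optimize(train_fn, mutate_fn, base_config, n_experiments):
    """Greedy keep/revert loop in the spirit of autoresearch: mutate the
    config, train briefly, and keep the change only if val_bpb
    (bits per byte, lower is better) improves."""
    best_config = dict(base_config)
    best_bpb = train_fn(best_config)        # baseline run
    history = []
    for _ in range(n_experiments):
        candidate = mutate_fn(dict(best_config))
        val_bpb = train_fn(candidate)       # ~5 minutes in the real system
        improved = val_bpb < best_bpb
        if improved:                        # keep improvements
            best_config, best_bpb = candidate, val_bpb
        history.append((candidate, val_bpb, improved))  # revert = do nothing
    return best_config, best_bpb, history

# Stubbed demo: "training loss" is just distance from an ideal lr.
random.seed(0)
def fake_train(cfg):
    return abs(cfg["lr"] - 3e-4) + 1.0
def fake_mutate(cfg):
    cfg["lr"] *= random.choice([0.5, 2.0])
    return cfg

best, bpb, history = optimize(fake_train, fake_mutate, {"lr": 1e-3}, 20)
```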

Autoresearch-at-home is the SETI@home-style fork: multiple agents on different GPUs collaborating through a shared coordination layer (currently Ensue). It adds experiment claiming, result publishing, global-best tracking, and collective intelligence on top of the single-agent loop.

Obol-stack already has the payment and discovery infrastructure: ServiceOffer CRD, x402 ForwardAuth, ERC-8004 registration, buy-side sidecar. The integration connects autoresearch's GPU demand with obol-stack's payment rails.

User Journeys

Journey 1: GPU Contributor (earn money)

Run worker_api.py on bare metal GPU
→ obol sell http gpu-worker --upstream localhost:8080 --per-hour 0.50
→ x402 gates your GPU → researchers pay per-experiment
→ Worker registered on ERC-8004 with OASF skill: machine_learning/model_optimization

Journey 2: Researcher (optimize models)

coordinate.py discover → find GPU workers on 8004scan
→ coordinate.py loop train.py → submit experiments through x402
→ Collect best val_bpb model → publish.py → sell optimized inference
→ Provenance (val_bpb, train hash, param count) flows into registration metadata

Journey 3: Service Builder (build apps on optimized models) ⚠️ GAP

Take autoresearch-optimized model → build web app (CV enhancer, code reviewer, etc.)
→ obol sell http my-app --upstream localhost:3000 --per-request 0.05
→ Users hit frontend, pay via x402, get the service

What's Implemented (this branch)

Phase 1: Provenance + Skills

  • spec.provenance field on ServiceOffer CRD (framework, metric, experimentId, trainHash, paramCount)
  • --provenance-file flag on obol sell inference and obol sell http
  • Provenance injected into .well-known/agent-registration.json by monetize.py
  • New embedded skill: autoresearch (SKILL.md + publish.py + references)
  • New embedded skill: autoresearch-coordinator (SKILL.md + coordinate.py + references)
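For illustration, a ServiceOffer carrying the new provenance block might look like the sketch below. The field names under provenance are the ones this branch adds; everything else (apiVersion, pricing keys, values) is illustrative, not the authoritative CRD schema:

```yaml
# Sketch only: provenance field names are from this PR, the rest of the
# CRD shape is assumed for illustration.
apiVersion: obol.tech/v1alpha1
kind: ServiceOffer
metadata:
  name: gpu-worker
spec:
  upstream: http://host.k3d.internal:8080
  pricing:
    perHour: "0.50"
  provenance:
    framework: autoresearch
    metric: val_bpb
    experimentId: exp-0042          # hypothetical
    trainHash: sha256:deadbeef      # hypothetical
    paramCount: 124000000           # hypothetical
```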

Phase 2: GPU Marketplace

  • worker_api.py — Flask HTTP API wrapping train.py (POST /experiment, GET /health, GET /status, GET /best)
  • Dockerfile.worker — CUDA 12.4 container for the worker
  • Coordinator reimplements Ensue's THINK/CLAIM/RUN/PUBLISH using 8004scan discovery + x402 payments
  • GPU workers sold via existing obol sell http (no new CRD type needed)
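The real worker is the Flask app in worker_api.py; the contract its endpoints imply (POST /experiment runs a job, GET /best returns the lowest-val_bpb result) can be sketched transport-free with stdlib Python. Class and method names below are hypothetical:

```python
import hashlib
import json

class WorkerState:
    """Stdlib-only sketch of the worker_api.py contract: submit an
    experiment, record its result, and track the best one seen.
    `run` stands in for invoking train.py."""

    def __init__(self, run):
        self.run = run          # callable: config dict -> val_bpb float
        self.results = []

    def post_experiment(self, config):
        """Corresponds to POST /experiment."""
        payload = json.dumps(config, sort_keys=True).encode()
        experiment_id = hashlib.sha256(payload).hexdigest()[:12]
        val_bpb = self.run(config)
        result = {"experimentId": experiment_id,
                  "val_bpb": val_bpb,
                  "config": config}
        self.results.append(result)
        return result

    def get_best(self):
        """Corresponds to GET /best: lowest val_bpb wins."""
        return min(self.results, key=lambda r: r["val_bpb"], default=None)

# Usage with a stubbed trainer:
worker = WorkerState(run=lambda cfg: cfg["lr"])
worker.post_experiment({"lr": 0.5})
latest = worker.post_experiment({"lr": 0.1})
```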

Design Decision: Bare Metal GPU

k3d doesn't support GPU passthrough. Workers run on the host, obol-stack proxies via --upstream http://host.k3d.internal:<port>.

The Gap: App-on-Top-of-Inference (Journey 3)

Today you can sell raw inference (obol sell inference) or gate any HTTP service (obol sell http). But there's no scaffolding for the common pattern:

"I want to build a web app that uses an LLM internally and charge users per-use via x402"

For example, a CV enhancer service:

  • Frontend: upload form for resumes
  • Backend: calls the in-cluster LiteLLM with an autoresearch-optimized model
  • Payment: x402 gates the whole service per-request

What's missing:

  1. App template / scaffold — no obol app create that generates a web app skeleton with LLM backend wired up
  2. Internal LLM routing — the app needs to call LiteLLM internally (no x402 on internal calls) while the app itself is x402-gated externally
  3. Frontend payment UX — no x402 payment widget/SDK for browser-based payment flows (today x402 is API-to-API)
  4. Deployment pattern — no documented pattern for "deploy my app container into the cluster and gate it"

Proposed solution direction:

  • obol app create <name> --template inference-app — scaffolds a Next.js/Flask app with LiteLLM client pre-configured
  • App deployed into cluster with internal access to litellm.llm.svc:4000 (no payment on internal calls)
  • obol sell http <name> --upstream <app-svc> gates the external-facing endpoint
  • x402 browser SDK or payment redirect flow for end-user UX
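To make the internal-routing idea concrete: LiteLLM exposes an OpenAI-compatible chat completions endpoint, so the app backend's unpaid internal hop is a plain POST to the in-cluster service while only the app's own endpoint is x402-gated. The URL below is the in-cluster address from this proposal; the helper name is hypothetical:

```python
import json
import urllib.request

# In-cluster LiteLLM address from the proposal; no x402 on this hop.
LITELLM_URL = "http://litellm.llm.svc:4000/v1/chat/completions"

def build_chat_request(model, prompt, url=LITELLM_URL):
    """Build the internal LLM request the app backend would send to
    LiteLLM's OpenAI-compatible endpoint."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# In the app handler:
#   resp = urllib.request.urlopen(build_chat_request(model, prompt))
req = build_chat_request("autoresearch-best", "Improve this resume bullet.")
```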

This needs proper spec work — filing separately or expanding here once we have a concrete prototype (starting with a CV enhancer).

Phase 3 (Future, not in scope)

  • On-chain experiment registry smart contract (experiments, results, bounties)
  • GPU metering sidecar (x402-meter) for time-based billing
  • IPFS/Filecoin model artifact storage (content-addressed by hash)
  • Frontend: leaderboard page, experiment dashboard, GPU marketplace view

Ensue → obol-stack Mapping

| Ensue concept | obol-stack replacement |
|---|---|
| join_hub() | obol sell http (register GPU as ServiceOffer) |
| claim_experiment() | POST /experiment to discovered worker via x402 |
| publish_result() | ERC-8004 setMetadata("provenance", {...}) |
| pull_best_config() | Query 8004scan API for best val_bpb |
| ask_swarm() | Query registered agents' .well-known metadata |
| Leaderboard | 8004scan filtered by machine_learning/model_optimization skill |
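One coordinator round under this mapping can be sketched with the obol-stack mechanisms injected as callables, which keeps the control flow visible without committing to coordinate.py's actual API (all names here are hypothetical):

```python
def swarm_round(discover, pay_and_post, publish_metadata, propose):
    """One THINK -> CLAIM/RUN -> PUBLISH round with Ensue primitives
    replaced per the mapping above: discover() lists workers (8004scan),
    pay_and_post(worker, config) submits POST /experiment via x402,
    publish_metadata(result) writes ERC-8004 provenance, and propose()
    is the THINK step producing the next config to try."""
    workers = discover()
    if not workers:
        return None                               # nothing to claim
    config = propose()                            # THINK
    result = pay_and_post(workers[0], config)     # CLAIM + RUN (paid call)
    publish_metadata(result)                      # PUBLISH
    return result

# Usage with fakes standing in for 8004scan, x402, and ERC-8004:
published = []
result = swarm_round(
    discover=lambda: ["worker-a"],
    pay_and_post=lambda w, cfg: {"worker": w, "val_bpb": 0.92, "config": cfg},
    publish_metadata=published.append,
    propose=lambda: {"lr": 3e-4},
)
```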
