## Vision
Three-sided marketplace on obol-stack powered by autoresearch and its distributed fork autoresearch-at-home:
- GPU contributors sell compute time to the autoresearch swarm, paid via x402
- Researchers run distributed experiments across GPU workers, discovering them via ERC-8004
- Service builders take autoresearch-optimized models and sell apps/inference on top via x402
## Context
Autoresearch is Andrej Karpathy's autonomous LLM optimization framework — an AI agent iterates on `train.py` (architecture, hyperparameters, optimizer), trains for ~5 minutes per experiment, measures val_bpb (bits per byte), keeps improvements, and reverts failures. That is roughly 12 experiments/hour, ~100 overnight.
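The keep-if-improved loop described above can be sketched in a few lines of Python. This is a toy illustration, not code from the actual repo: `propose` and `run_experiment` are hypothetical stand-ins for the agent's edit step and the real ~5-minute training run.

```python
def autoresearch_loop(baseline_bpb, propose, run_experiment, n_experiments=12):
    """Keep-if-improved optimization loop: lower val_bpb (bits per byte) is better.

    propose(best_config)     -> a candidate config (the agent's mutation of train.py)
    run_experiment(config)   -> measured val_bpb for that config
    """
    best_bpb = baseline_bpb
    best_config = None
    for _ in range(n_experiments):
        config = propose(best_config)   # agent edits architecture/hyperparams/optimizer
        bpb = run_experiment(config)    # short (~5 min) training run, then evaluate
        if bpb < best_bpb:              # improvement: keep it
            best_bpb, best_config = bpb, config
        # otherwise: revert (best_config stays unchanged)
    return best_bpb, best_config
```

Run overnight with ~12 experiments/hour, this is the single-agent behavior that autoresearch-at-home distributes across many GPUs.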
Autoresearch-at-home is the SETI@home-style fork: multiple agents on different GPUs collaborate through a shared coordination layer (currently Ensue). It adds experiment claiming, result publishing, global best tracking, and collective intelligence.
Obol-stack already has the payment and discovery infrastructure: a ServiceOffer CRD, x402 ForwardAuth, ERC-8004 registration, and a buy-side sidecar. This integration connects autoresearch's GPU demand to obol-stack's payment rails.
## User Journeys
### Journey 1: GPU Contributor (earn money)
Run `worker_api.py` on a bare-metal GPU
→ `obol sell http gpu-worker --upstream localhost:8080 --per-hour 0.50`
→ x402 gates your GPU → researchers pay per experiment
→ Worker registered on ERC-8004 with OASF skill `machine_learning/model_optimization`
### Journey 2: Researcher (optimize models)
`coordinate.py discover` → find GPU workers on 8004scan
→ `coordinate.py loop train.py` → submit experiments through x402
→ Collect the best val_bpb model → `publish.py` → sell optimized inference
→ Provenance (val_bpb, train hash, param count) flows into the registration metadata
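The discovery step of this journey can be sketched as "pick the cheapest worker advertising the right OASF skill". The result shape here is hypothetical — `url`, `skills`, and `price_per_hour` are illustrative field names, not the real 8004scan API:

```python
SKILL = "machine_learning/model_optimization"

def pick_worker(offers, skill=SKILL):
    """Pick the cheapest discovered worker advertising the given OASF skill.

    `offers` mimics a hypothetical 8004scan discovery result:
    [{"url": ..., "skills": [...], "price_per_hour": ...}, ...]
    """
    candidates = [o for o in offers if skill in o["skills"]]
    return min(candidates, key=lambda o: o["price_per_hour"]) if candidates else None

def build_experiment_request(worker_url, config):
    """Shape of an experiment submission to a worker (illustrative only).
    The buy-side sidecar would attach the x402 payment on the way out."""
    return {"method": "POST", "url": f"{worker_url}/experiment", "json": {"config": config}}
```

A coordinator loop would call `pick_worker` per experiment and let the x402 sidecar handle payment on each `POST /experiment`.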
### Journey 3: Service Builder (build apps on optimized models) ⚠️ GAP
Take an autoresearch-optimized model → build a web app (CV enhancer, code reviewer, etc.)
→ `obol sell http my-app --upstream localhost:3000 --per-request 0.05`
→ Users hit the frontend, pay via x402, get the service
## What's Implemented (this branch)
### Phase 1: Provenance + Skills
- `spec.provenance` field on the ServiceOffer CRD (framework, metric, experimentId, trainHash, paramCount)
- `--provenance-file` flag on `obol sell inference` and `obol sell http`
- Provenance injected into `.well-known/agent-registration.json` by monetize.py
- New embedded skill: `autoresearch` (SKILL.md + publish.py + references)
- New embedded skill: `autoresearch-coordinator` (SKILL.md + coordinate.py + references)
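As an illustration of what a `--provenance-file` might contain, here is a small helper that emits the five fields listed above. The exact schema of `spec.provenance` may differ; `write_provenance` is a hypothetical name, not part of the branch:

```python
import json

def write_provenance(path, *, experiment_id, train_hash, param_count,
                     framework="autoresearch", metric="val_bpb"):
    """Emit a provenance file shaped like spec.provenance on ServiceOffer.

    Field names follow the CRD description in this issue; the real schema
    (casing, nesting, required fields) may differ.
    """
    prov = {
        "framework": framework,
        "metric": metric,
        "experimentId": experiment_id,
        "trainHash": train_hash,
        "paramCount": param_count,
    }
    with open(path, "w") as f:
        json.dump(prov, f, indent=2)
    return prov
```

The resulting file would be handed to `obol sell inference --provenance-file prov.json` so the metadata lands in `.well-known/agent-registration.json`.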
### Phase 2: GPU Marketplace
- `worker_api.py` — Flask HTTP API wrapping train.py (`POST /experiment`, `GET /health`, `GET /status`, `GET /best`)
- `Dockerfile.worker` — CUDA 12.4 container for the worker
- Coordinator reimplements Ensue's THINK/CLAIM/RUN/PUBLISH using 8004scan discovery + x402 payments
- GPU workers sold via the existing `obol sell http` (no new CRD type needed)
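The worker endpoints can be thought of as a thin HTTP layer over a small amount of state. A sketch of that core, assuming the endpoint semantics listed above (`WorkerState` is a hypothetical class, not the actual `worker_api.py`; the real Flask routes would wrap something like it):

```python
class WorkerState:
    """State a worker API might keep: POST /experiment -> submit(), GET /best -> best()."""

    def __init__(self):
        self.results = []   # list of (experiment_id, val_bpb)
        self.busy = False   # one experiment at a time per GPU

    def submit(self, experiment_id, run_fn, config):
        """Run one experiment; run_fn stands in for the ~5-minute train.py run."""
        if self.busy:
            raise RuntimeError("worker busy")
        self.busy = True
        try:
            bpb = run_fn(config)
            self.results.append((experiment_id, bpb))
            return {"experimentId": experiment_id, "val_bpb": bpb}
        finally:
            self.busy = False

    def best(self):
        """Lowest val_bpb seen so far, or None before any experiment finishes."""
        if not self.results:
            return None
        eid, bpb = min(self.results, key=lambda r: r[1])
        return {"experimentId": eid, "val_bpb": bpb}
```

Keeping the state separate from the HTTP framing also makes the busy/best logic testable without a GPU.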
### Design Decision: Bare Metal GPU
k3d doesn't support GPU passthrough, so workers run on the host and obol-stack proxies to them via `--upstream http://host.k3d.internal:<port>`.
## The Gap: App-on-Top-of-Inference (Journey 3)
Today you can sell raw inference (`obol sell inference`) or gate any HTTP service (`obol sell http`), but there's no scaffolding for the common pattern:

> "I want to build a web app that uses an LLM internally and charge users per use via x402."
For example, a CV enhancer service:
- Frontend: upload form for resumes
- Backend: calls the in-cluster LiteLLM with an autoresearch-optimized model
- Payment: x402 gates the whole service per-request
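The backend half of this example, the internal LiteLLM call, can be sketched as a plain OpenAI-compatible chat request with no x402 header on the internal hop. The in-cluster address `litellm.llm.svc:4000` is the one named later in this issue; `build_internal_llm_request` and the exact path are illustrative assumptions:

```python
import json

def build_internal_llm_request(prompt, model,
                               base_url="http://litellm.llm.svc:4000"):
    """Build an OpenAI-compatible chat request for the in-cluster LiteLLM proxy.

    Internal calls skip x402 entirely; only the app's external-facing
    endpoint is payment-gated. Returns (url, headers, body) for any HTTP client.
    """
    url = f"{base_url}/v1/chat/completions"
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    })
    headers = {"Content-Type": "application/json"}  # no payment header here
    return url, headers, body
```

The CV-enhancer backend would call this per upload, while x402 meters the user-facing request that triggered it.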
What's missing:
- App template / scaffold — no `obol app create` that generates a web-app skeleton with the LLM backend wired up
- Internal LLM routing — the app needs to call LiteLLM internally (no x402 on internal calls) while the app itself is x402-gated externally
- Frontend payment UX — no x402 payment widget/SDK for browser-based payment flows (today x402 is API-to-API)
- Deployment pattern — no documented pattern for "deploy my app container into the cluster and gate it"
Proposed solution direction:
- `obol app create <name> --template inference-app` — scaffolds a Next.js/Flask app with a LiteLLM client pre-configured
- App deployed into the cluster with internal access to `litellm.llm.svc:4000` (no payment on internal calls)
- `obol sell http <name> --upstream <app-svc>` gates the external-facing endpoint
- x402 browser SDK or payment-redirect flow for the end-user UX
This needs proper spec work — filing separately or expanding here once we have a concrete prototype (starting with a CV enhancer).
## Phase 3 (Future, not in scope)
- On-chain experiment registry smart contract (experiments, results, bounties)
- GPU metering sidecar (`x402-meter`) for time-based billing
- IPFS/Filecoin model artifact storage (content-addressed by hash)
- Frontend: leaderboard page, experiment dashboard, GPU marketplace view
## Ensue → obol-stack Mapping
| Ensue concept | obol-stack replacement |
|---|---|
| `join_hub()` | `obol sell http` (register GPU as ServiceOffer) |
| `claim_experiment()` | `POST /experiment` to discovered worker via x402 |
| `publish_result()` | ERC-8004 `setMetadata("provenance", {...})` |
| `pull_best_config()` | Query 8004scan API for best val_bpb |
| `ask_swarm()` | Query registered agents' `.well-known` metadata |
| Leaderboard | 8004scan filtered by `machine_learning/model_optimization` skill |