Lightweight but capable TypeScript bot architecture. Single-process by default, tool- and skill-driven, MCP-ready, and safe-by-default.
Corebot is optimized for the single-host reliable AI bot track:
- one process, one local SQLite, one workspace
- durable queue + retries + dead-letter replay
- idempotent publish and inbound execution ledger
- restart-safe recovery for stale in-flight work
- observability and audit trail built into runtime
If your target is: "a bot that keeps running correctly on one machine under real failures", this repo is designed for that.
| Scenario | Semantics | Mechanism |
|---|---|---|
| Inbound/outbound enqueue | At-least-once enqueue attempt | Durable message_queue + retry |
| Duplicate publish (same message id) | Deduplicated enqueue | message_dedupe unique key |
| Inbound re-processing after crash | Effectively-once runtime side effect | inbound_executions ledger + deterministic outbound id |
| Tool execution retries | At-least-once execution attempt | Bus retry / dead-letter policy |
| Scheduled task dispatch | At-least-once dispatch attempt | Scheduler emits synthetic inbound + retry path |
| Heartbeat alerts | Suppressed if pure ACK / recent duplicate | ackToken gate + delivery dedupe window |
Notes:
- "Effectively-once" here means duplicate deliveries are neutralized by idempotency guards inside Corebot; external side effects still need idempotent tool design.
- Queue dead-letter is an explicit stop condition, not silent drop.
Corebot handles the following failure classes by default:
- Process crash during message handling
  Result: stale `processing` messages are recovered on restart and re-queued.
- LLM/tool transient failure
  Result: exponential retry until `maxAttempts`, then dead-letter with error reason.
- Duplicate inbound/outbound publish
  Result: dedupe key collapses duplicates to one queue record.
- Router crash after runtime completed
  Result: inbound ledger serves cached result; runtime/tools are not re-executed.
- Queue overload
  Result: overload backoff and configurable queue caps; overflow goes to dead-letter.
- Schema migration failure
  Result: startup stops and reports pre-migration backup path for restore.
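The retry-then-dead-letter behavior for transient failures can be pictured with a small sketch. It assumes the delay doubles on each attempt, starting at `bus.retryBackoffMs` and capped at `bus.maxRetryBackoffMs`; the exact formula Corebot uses is not stated here, and `nextRetry` is a hypothetical helper, not part of the Corebot API.

```typescript
// Hypothetical sketch: capped exponential backoff with a dead-letter stop.
// The doubling formula is an assumption, not Corebot's confirmed behavior.
interface RetryPolicy {
  maxAttempts: number;
  retryBackoffMs: number;
  maxRetryBackoffMs: number;
}

type RetryDecision =
  | { action: "retry"; delayMs: number }
  | { action: "dead-letter"; reason: string };

// `attempt` is the number of attempts already made (1 after the first failure).
function nextRetry(attempt: number, error: string, policy: RetryPolicy): RetryDecision {
  if (attempt >= policy.maxAttempts) {
    // Explicit stop condition: keep the error reason instead of dropping silently.
    return { action: "dead-letter", reason: error };
  }
  const delayMs = Math.min(
    policy.maxRetryBackoffMs,
    policy.retryBackoffMs * 2 ** (attempt - 1),
  );
  return { action: "retry", delayMs };
}
```

With the default-looking values `maxAttempts: 5`, `retryBackoffMs: 1000`, and `maxRetryBackoffMs: 60000`, this sketch waits 1 s, 2 s, 4 s, and 8 s before the fifth failure moves the message to dead-letter.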
Use this as the minimum reliable baseline:
{
"storeFullMessages": true,
"bus": {
"maxAttempts": 5,
"processingTimeoutMs": 120000,
"maxPendingInbound": 5000,
"maxPendingOutbound": 5000
},
"observability": {
"enabled": true,
"http": { "enabled": true, "host": "127.0.0.1", "port": 3210 }
},
"slo": {
"enabled": true
}
}

Also recommended:
- persist both `data/` and `workspace/`
- enable webhook auth if webhook channel is exposed
- set `COREBOT_MCP_ALLOWED_SERVERS` / `COREBOT_MCP_ALLOWED_TOOLS` in production

- Agent runtime with tool-calling loop
- Built-in tools (fs, shell, web, memory, messaging, tasks, skills)
- Skills via `SKILL.md` (progressive loading)
- MCP client integration (tools injected dynamically)
- SQLite storage for chats, messages, summaries, and tasks
- Scheduler with `cron | interval | once`
- Agent heartbeat loop with debounce wake, ack suppression, and duplicate-delivery guard
- CLI channel for local usage (other channels stubbed)
- Isolated tool runtime for high-risk tools (`shell.exec`, `web.fetch`, `fs.write`)
- Durable queue with idempotent publish (retry/dead-letter/replay + dedupe by message id)
- Inbound execution ledger to avoid duplicate runtime/tool execution on re-queued messages
- Queue backpressure + per-chat rate limit (overflow to DLQ + overload backoff)
- Observability endpoints (`/health/*`, `/metrics`, `/status`) and SLO monitor
- Webhook channel (inbound POST + outbound pull API + optional token auth)
- Migration safety with pre-migration backups and migration history
- Persistent audit events for tool execution, denials, and errors

- CLI: `corebot` (or `pnpm run dev` / `pnpm run start`)
- SDK: import from `@corebot/core` and manage lifecycle via `createCorebotApp()`
- CLI flags: `corebot --help`, `corebot --version`, `corebot preflight`

import { createCorebotApp, loadConfig } from "@corebot/core";
const app = await createCorebotApp({ config: loadConfig() });
await app.start();
// ...
await app.stop();

pnpm install --frozen-lockfile
export OPENAI_API_KEY=YOUR_KEY
pnpm run dev

Type in the CLI prompt to chat. Use `/exit` to quit.

- Use `pnpm` only (`packageManager` is pinned in `package.json`).
- Commit both `pnpm-lock.yaml` and `pnpm-workspace.yaml`.
- Install with `pnpm install --frozen-lockfile` in local reproducible runs, CI, and Docker.
- Keep build-script approvals explicit in `pnpm-workspace.yaml` (`onlyBuiltDependencies`).
- If a newly added dependency needs lifecycle scripts, run `pnpm approve-builds` and commit the updated policy file.

# Build + run production bundle locally
pnpm run build
node dist/bin.js
# Use a custom workspace/data directory
COREBOT_WORKSPACE=./workspace COREBOT_DATA_DIR=./data pnpm run dev
# Enable shell tool with executable allowlist
COREBOT_ALLOW_SHELL=true COREBOT_SHELL_ALLOWLIST="ls,git" pnpm run dev
# Enable web.search (Brave Search API)
BRAVE_API_KEY=YOUR_KEY COREBOT_ALLOWED_ENV=BRAVE_API_KEY pnpm run dev
# Restrict web.fetch to specific hosts/domains
COREBOT_WEB_ALLOWLIST="example.com,api.example.com" pnpm run dev
# Restrict web.fetch ports
COREBOT_WEB_ALLOWED_PORTS="443,8443" COREBOT_WEB_BLOCKED_PORTS="8080" pnpm run dev
# Isolate multiple high-risk tools in worker process
COREBOT_ISOLATION_TOOLS="shell.exec,web.fetch,fs.write" pnpm run dev
# Enable observability HTTP endpoints
COREBOT_OBS_HTTP_ENABLED=true COREBOT_OBS_HTTP_PORT=3210 pnpm run dev
# Enable webhook channel
COREBOT_WEBHOOK_ENABLED=true COREBOT_WEBHOOK_AUTH_TOKEN=YOUR_TOKEN pnpm run dev
# Manual database backup / restore
pnpm run ops:db:backup -- --db data/bot.sqlite
pnpm run ops:db:restore -- --db data/bot.sqlite --from data/backups/manual-xxxx.sqlite --force
# Validate startup config and MCP file before deployment
corebot preflight
corebot preflight --mcp-config ./path/to/.mcp.json

CLI queue ops:
- `/dlq list [inbound|outbound|all] [limit]`
- `/dlq replay <queueId|inbound|outbound|all> [limit]`

Example prompts (in CLI):
- “Schedule a daily summary at 9am.”
- “Save a short memory about my preferences.”
- “List available skills.”
You can configure via config.json or environment variables.
{
"workspaceDir": "workspace",
"dataDir": "data",
"sqlitePath": "data/bot.sqlite",
"logLevel": "info",
"provider": {
"type": "openai",
"baseUrl": "https://api.openai.com/v1",
"model": "gpt-4o-mini",
"temperature": 0.2,
"timeoutMs": 60000,
"maxInputTokens": 128000,
"reserveOutputTokens": 4096
},
"historyMaxMessages": 30,
"storeFullMessages": false,
"maxToolIterations": 8,
"maxToolOutputChars": 50000,
"skillsDir": "workspace/skills",
"mcpConfigPath": ".mcp.json",
"mcpSync": {
"failureBackoffBaseMs": 1000,
"failureBackoffMaxMs": 60000,
"openCircuitAfterFailures": 5,
"circuitResetMs": 30000
},
"heartbeat": {
"enabled": false,
"intervalMs": 300000,
"wakeDebounceMs": 250,
"wakeRetryMs": 1000,
"promptPath": "HEARTBEAT.md",
"activeHours": "",
"skipWhenInboundBusy": true,
"ackToken": "HEARTBEAT_OK",
"suppressAck": true,
"dedupeWindowMs": 86400000,
"maxDispatchPerRun": 20
},
"scheduler": { "tickMs": 60000 },
"bus": {
"pollMs": 1000,
"batchSize": 50,
"maxAttempts": 5,
"retryBackoffMs": 1000,
"maxRetryBackoffMs": 60000,
"processingTimeoutMs": 120000,
"maxPendingInbound": 5000,
"maxPendingOutbound": 5000,
"overloadPendingThreshold": 2000,
"overloadBackoffMs": 500,
"perChatRateLimitWindowMs": 60000,
"perChatRateLimitMax": 120
},
"observability": {
"enabled": true,
"reportIntervalMs": 30000,
"http": { "enabled": true, "host": "127.0.0.1", "port": 3210 }
},
"slo": {
"enabled": true,
"alertCooldownMs": 60000,
"maxPendingQueue": 2000,
"maxDeadLetterQueue": 20,
"maxToolFailureRate": 0.2,
"maxSchedulerDelayMs": 60000,
"maxMcpFailureRate": 0.3
},
"isolation": {
"enabled": true,
"toolNames": ["shell.exec"],
"workerTimeoutMs": 30000,
"maxWorkerOutputChars": 250000,
"maxConcurrentWorkers": 4,
"openCircuitAfterFailures": 5,
"circuitResetMs": 30000
},
"allowShell": false,
"allowedShellCommands": [],
"allowedEnv": [],
"allowedWebDomains": [],
"allowedWebPorts": [],
"blockedWebPorts": [],
"allowedMcpServers": [],
"allowedMcpTools": [],
"adminBootstrapKey": "",
"adminBootstrapSingleUse": true,
"adminBootstrapMaxAttempts": 5,
"adminBootstrapLockoutMinutes": 15,
"webhook": {
"enabled": false,
"host": "0.0.0.0",
"port": 8788,
"path": "/webhook",
"authToken": "",
"maxBodyBytes": 1000000
},
"cli": { "enabled": true }
}

- `OPENAI_API_KEY`
- `OPENAI_BASE_URL`
- `OPENAI_MODEL`
- `OPENAI_TEMPERATURE`
- `OPENAI_TIMEOUT_MS` (deprecated alias for `COREBOT_PROVIDER_TIMEOUT_MS`)
- `COREBOT_PROVIDER_TIMEOUT_MS`
- `COREBOT_PROVIDER_MAX_INPUT_TOKENS`
- `COREBOT_PROVIDER_RESERVE_OUTPUT_TOKENS`
- `COREBOT_WORKSPACE`
- `COREBOT_DATA_DIR`
- `COREBOT_SQLITE_PATH`
- `COREBOT_LOG_LEVEL`
- `COREBOT_HISTORY_MAX`
- `COREBOT_STORE_FULL`
- `COREBOT_MAX_TOOL_ITER`
- `COREBOT_MAX_TOOL_OUTPUT`
- `COREBOT_SKILLS_DIR`
- `COREBOT_MCP_CONFIG`
- `COREBOT_MCP_SYNC_BACKOFF_BASE_MS`
- `COREBOT_MCP_SYNC_BACKOFF_MAX_MS`
- `COREBOT_MCP_SYNC_OPEN_CIRCUIT_AFTER_FAILURES`
- `COREBOT_MCP_SYNC_CIRCUIT_RESET_MS`
- `COREBOT_HEARTBEAT_ENABLED`
- `COREBOT_HEARTBEAT_INTERVAL_MS`
- `COREBOT_HEARTBEAT_WAKE_DEBOUNCE_MS`
- `COREBOT_HEARTBEAT_WAKE_RETRY_MS`
- `COREBOT_HEARTBEAT_PROMPT_PATH`
- `COREBOT_HEARTBEAT_ACTIVE_HOURS`
- `COREBOT_HEARTBEAT_SKIP_WHEN_INBOUND_BUSY`
- `COREBOT_HEARTBEAT_ACK_TOKEN`
- `COREBOT_HEARTBEAT_SUPPRESS_ACK`
- `COREBOT_HEARTBEAT_DEDUPE_WINDOW_MS`
- `COREBOT_HEARTBEAT_MAX_DISPATCH_PER_RUN`
- `COREBOT_ISOLATION_ENABLED`
- `COREBOT_ISOLATION_TOOLS`
- `COREBOT_ISOLATION_WORKER_TIMEOUT_MS`
- `COREBOT_ISOLATION_MAX_WORKER_OUTPUT_CHARS`
- `COREBOT_ISOLATION_MAX_CONCURRENT_WORKERS`
- `COREBOT_ISOLATION_OPEN_CIRCUIT_AFTER_FAILURES`
- `COREBOT_ISOLATION_CIRCUIT_RESET_MS`
- `COREBOT_ALLOW_SHELL`
- `COREBOT_SHELL_ALLOWLIST`
- `COREBOT_ALLOWED_ENV`
- `COREBOT_WEB_ALLOWLIST`
- `COREBOT_WEB_ALLOWED_PORTS`
- `COREBOT_WEB_BLOCKED_PORTS`
- `COREBOT_BUS_POLL_MS`
- `COREBOT_BUS_BATCH_SIZE`
- `COREBOT_BUS_MAX_ATTEMPTS`
- `COREBOT_BUS_RETRY_BACKOFF_MS`
- `COREBOT_BUS_MAX_RETRY_BACKOFF_MS`
- `COREBOT_BUS_PROCESSING_TIMEOUT_MS`
- `COREBOT_BUS_MAX_PENDING_INBOUND`
- `COREBOT_BUS_MAX_PENDING_OUTBOUND`
- `COREBOT_BUS_OVERLOAD_PENDING_THRESHOLD`
- `COREBOT_BUS_OVERLOAD_BACKOFF_MS`
- `COREBOT_BUS_CHAT_RATE_WINDOW_MS`
- `COREBOT_BUS_CHAT_RATE_MAX`
- `COREBOT_OBS_ENABLED`
- `COREBOT_OBS_REPORT_MS`
- `COREBOT_OBS_HTTP_ENABLED`
- `COREBOT_OBS_HTTP_HOST`
- `COREBOT_OBS_HTTP_PORT`
- `COREBOT_SLO_ENABLED`
- `COREBOT_SLO_ALERT_COOLDOWN_MS`
- `COREBOT_SLO_MAX_PENDING_QUEUE`
- `COREBOT_SLO_MAX_DEAD_LETTER_QUEUE`
- `COREBOT_SLO_MAX_TOOL_FAILURE_RATE`
- `COREBOT_SLO_MAX_SCHEDULER_DELAY_MS`
- `COREBOT_SLO_MAX_MCP_FAILURE_RATE`
- `COREBOT_SLO_ALERT_WEBHOOK_URL`
- `COREBOT_MCP_ALLOWED_SERVERS`
- `COREBOT_MCP_ALLOWED_TOOLS`
- `COREBOT_ADMIN_BOOTSTRAP_KEY`
- `COREBOT_ADMIN_BOOTSTRAP_SINGLE_USE`
- `COREBOT_ADMIN_BOOTSTRAP_MAX_ATTEMPTS`
- `COREBOT_ADMIN_BOOTSTRAP_LOCKOUT_MINUTES`
- `COREBOT_WEBHOOK_ENABLED`
- `COREBOT_WEBHOOK_HOST`
- `COREBOT_WEBHOOK_PORT`
- `COREBOT_WEBHOOK_PATH`
- `COREBOT_WEBHOOK_AUTH_TOKEN`
- `COREBOT_WEBHOOK_MAX_BODY_BYTES`

Notes:
- `COREBOT_ALLOWED_ENV` is used by tools that explicitly gate env access (for example `web.search`) and by isolated `shell.exec` workers.
- `COREBOT_SHELL_ALLOWLIST` matches executable names (for example `ls`, `git`), not full command prefixes.
- `COREBOT_WEB_ALLOWLIST` restricts `web.fetch` target hosts (exact host or subdomain match).
- `COREBOT_WEB_ALLOWED_PORTS` and `COREBOT_WEB_BLOCKED_PORTS` provide port allow/deny controls for `web.fetch`.
- `COREBOT_ISOLATION_TOOLS` defaults to `shell.exec`; add `web.fetch` and/or `fs.write` to isolate network and file-write execution as well.
- `COREBOT_ISOLATION_MAX_CONCURRENT_WORKERS` caps simultaneous isolated workers (default `4`).
- `COREBOT_ISOLATION_OPEN_CIRCUIT_AFTER_FAILURES` and `COREBOT_ISOLATION_CIRCUIT_RESET_MS` control the per-tool circuit breaker for repeated worker failures.
- Default policy denies non-admin `fs.write` to protected paths (`skills/`, `IDENTITY.md`, `TOOLS.md`, `USER.md`, `.mcp.json`).
- `COREBOT_MCP_ALLOWED_SERVERS` and `COREBOT_MCP_ALLOWED_TOOLS` act as allowlists when set; empty lists allow all discovered MCP servers/tools.
- `COREBOT_MCP_SYNC_*` controls MCP auto-sync retry backoff and the temporary circuit-open window after repeated failures.
- `COREBOT_PROVIDER_TIMEOUT_MS` bounds each LLM request; timeout errors enter the normal retry/dead-letter flow.
- `COREBOT_PROVIDER_MAX_INPUT_TOKENS` and `COREBOT_PROVIDER_RESERVE_OUTPUT_TOKENS` enforce prompt budgeting before each LLM turn.
- `COREBOT_HEARTBEAT_ACTIVE_HOURS` accepts `HH:mm-HH:mm` in local process time; empty means always active.
- `COREBOT_HEARTBEAT_PROMPT_PATH` is resolved relative to `workspaceDir` and must be non-empty to dispatch heartbeat turns.
- `COREBOT_WEBHOOK_AUTH_TOKEN` can be sent via `Authorization: Bearer <token>` or `x-corebot-token`.
- `COREBOT_ADMIN_BOOTSTRAP_SINGLE_USE=true` invalidates bootstrap elevation after first successful use.
- `COREBOT_ADMIN_BOOTSTRAP_MAX_ATTEMPTS` and `COREBOT_ADMIN_BOOTSTRAP_LOCKOUT_MINUTES` control the invalid-key lockout policy.

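To make the "exact host or subdomain match" rule for `COREBOT_WEB_ALLOWLIST` concrete, here is a minimal sketch. `parseList` and `isHostAllowed` are hypothetical names, and the sketch only covers a non-empty allowlist; how Corebot treats an empty list for `web.fetch` is not described here.

```typescript
// Hypothetical sketch of exact-host-or-subdomain matching; not Corebot's code.

// Splits a comma-separated env value such as "example.com,api.example.com".
function parseList(value: string): string[] {
  return value
    .split(",")
    .map((item) => item.trim().toLowerCase())
    .filter((item) => item !== "");
}

// True when `hostname` equals an entry or is a subdomain of one.
// The leading dot in the suffix check keeps "badexample.com" from
// matching an "example.com" entry.
function isHostAllowed(hostname: string, allowlist: string[]): boolean {
  const host = hostname.toLowerCase();
  return allowlist.some((allowed) => host === allowed || host.endsWith(`.${allowed}`));
}
```

For instance, with an allowlist parsed from `"example.com"`, the hosts `example.com` and `api.example.com` pass, while `badexample.com` and `example.org` do not.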
- Build
  pnpm install --frozen-lockfile
  pnpm run build
- Run
  export OPENAI_API_KEY=YOUR_KEY
  node dist/bin.js
  # Or directly: node dist/main.js
- Persist data
  Ensure `data/` and `workspace/` are persisted (bind mount or volume). Corebot auto-creates them if missing.
- Config
  Use `config.json` for stable configuration in production; use env vars for secrets.

Build and run using the included Dockerfile:
docker build -t corebot .
docker run -it --rm \
  -e OPENAI_API_KEY=YOUR_KEY \
  -v $(pwd)/data:/app/data \
  -v $(pwd)/workspace:/app/workspace \
  corebot

Optional: mount `.mcp.json` or `config.json` if you want MCP or custom settings:
docker run -it --rm \
  -e OPENAI_API_KEY=YOUR_KEY \
  -v $(pwd)/data:/app/data \
  -v $(pwd)/workspace:/app/workspace \
  -v $(pwd)/.mcp.json:/app/.mcp.json \
  -v $(pwd)/config.json:/app/config.json \
  corebot

name: ci
on:
  push:
  pull_request:
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: pnpm/action-setup@v4
        with:
          version: 10
      - uses: actions/setup-node@v4
        with:
          node-version: 20
          cache: pnpm
      - run: pnpm install --frozen-lockfile
      - run: pnpm run build

- `fs.read`, `fs.write`, `fs.list`
- `shell.exec` (disabled by default)
- `web.fetch`, `web.search` (Brave Search API)
- `memory.read`, `memory.write`
- `message.send`, `chat.register`, `chat.set_role`
- `tasks.schedule`, `tasks.list`, `tasks.update`
- `skills.list`, `skills.read`, `skills.enable`, `skills.disable`, `skills.enabled`
- `heartbeat.status`, `heartbeat.trigger`, `heartbeat.enable` (admin only)
- `mcp.reload` (admin only; force refresh MCP config and tool bindings)
- `bus.dead_letter.list`, `bus.dead_letter.replay` (admin only)

| Tool | Parameters | Description |
|---|---|---|
| `fs.read` | `path: string` | Read a text file within the workspace |
| `fs.write` | `path: string, content: string, mode?: "overwrite"\|"append"` (default `"overwrite"`) | Write a text file within the workspace |
| `fs.list` | `path?: string` (default `"."`) | List files in a workspace directory |

| Tool | Parameters | Description |
|---|---|---|
| `shell.exec` | `command: string, cwd?: string, timeoutMs?: number` (default 20000, max 120000) | Execute a command. Disabled by default; requires `allowShell=true`. Admin only. |

Commands are tokenized and executed directly (no shell interpreter). If allowedShellCommands is non-empty, only listed executable names are permitted.

| Tool | Parameters | Description |
|---|---|---|
| `web.fetch` | `url: string, method?: "GET"\|"POST"` (default `"GET"`), `headers?: Record<string,string>, body?: string, timeoutMs?: number` (default 15000), `maxResponseChars?: number` (default 200000) | Fetch a URL over HTTP |
| `web.search` | `query: string, count?: number` (default 5, max 10) | Search the web using Brave Search API. Requires `BRAVE_API_KEY` in `allowedEnv`. |

| Tool | Parameters | Description |
|---|---|---|
| `memory.read` | `scope?: "global"\|"chat"\|"all"` (default `"all"`) | Read memory content |
| `memory.write` | `scope?: "global"\|"chat"` (default `"chat"`), `content: string, mode?: "append"\|"replace"` (default `"append"`) | Write memory. Global scope is admin only. |

Memory files: workspace/memory/MEMORY.md (global), workspace/memory/{channel}_{chatId}.md (per-chat).

| Tool | Parameters | Description |
|---|---|---|
| `message.send` | `content: string, channel?: string, chatId?: string` | Send a message. Cross-chat sending is admin only. |
| `chat.register` | `channel?: string, chatId?: string, role?: "admin"\|"normal", bootstrapKey?: string` | Register a chat for full message storage. See Admin Bootstrap below. |
| `chat.set_role` | `channel: string, chatId: string, role: "admin"\|"normal"` | Set chat role. Admin only. |

| Tool | Parameters | Description |
|---|---|---|
| `tasks.schedule` | `prompt: string, scheduleType: "cron"\|"interval"\|"once", scheduleValue: string, contextMode?: "group"\|"isolated"` (default `"group"`) | Create a scheduled task |
| `tasks.list` | `includeInactive?: boolean` (default `true`) | List tasks for this chat |
| `tasks.update` | `taskId: string, status?: "active"\|"paused"\|"done", scheduleType?, scheduleValue?, contextMode?` | Update a task. Cross-chat updates are admin only. |

scheduleValue format: cron expression for cron, milliseconds string for interval, ISO datetime for once.

| Tool | Parameters | Description |
|---|---|---|
| `skills.list` | (none) | List available skills with enabled status |
| `skills.read` | `name: string` | Read a skill file content |
| `skills.enable` | `name: string` | Enable a skill for this chat |
| `skills.disable` | `name: string` | Disable a skill (always-skills cannot be disabled) |
| `skills.enabled` | (none) | List currently enabled skill names |

| Tool | Parameters | Description |
|---|---|---|
| `heartbeat.status` | (none) | Show heartbeat runtime status, config, and next due chats. Admin only. |
| `heartbeat.trigger` | `reason?: string, force?: boolean, channel?: string, chatId?: string` | Queue an immediate heartbeat wake (optional force and target chat). Admin only. |
| `heartbeat.enable` | `enabled: boolean, reason?: string` | Enable/disable runtime heartbeat loop without restart. Admin only. |
| `mcp.reload` | `reason?: string, force?: boolean` (default `true`) | Reload MCP config and re-register tools. Set `force=false` to respect no-change checks and failure backoff. Admin only. |
| `bus.dead_letter.list` | `direction?: "inbound"\|"outbound", limit?: number` (default 20) | List dead-letter queue entries. Admin only. |
| `bus.dead_letter.replay` | `queueId?: string, direction?: "inbound"\|"outbound", limit?: number` (default 10) | Replay dead-letter entries back to pending. Admin only. |

Corebot uses two roles: admin and normal. New chats default to normal.
The first admin is created through a bootstrap flow:
- Set `COREBOT_ADMIN_BOOTSTRAP_KEY` in config/env.
- A user calls `chat.register` with `role=admin` and `bootstrapKey=<the key>`.
- If the key matches, the chat is promoted to admin.
- With `adminBootstrapSingleUse=true` (default), the key is invalidated after first use.
- After `adminBootstrapMaxAttempts` (default 5) failed attempts, bootstrap locks for `adminBootstrapLockoutMinutes` (default 15).
- Once an admin exists, new admins can only be granted by existing admins via `chat.set_role`.
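
The bootstrap flow above can be read as a small state machine. The sketch below is one plausible shape for it, not Corebot's implementation: `tryBootstrap`, `BootstrapState`, and the choice to reset the failure counter after a lockout are all assumptions made for illustration.

```typescript
// Hypothetical sketch of single-use key + lockout handling; not Corebot's code.
interface BootstrapPolicy {
  key: string;            // adminBootstrapKey
  singleUse: boolean;     // adminBootstrapSingleUse
  maxAttempts: number;    // adminBootstrapMaxAttempts
  lockoutMinutes: number; // adminBootstrapLockoutMinutes
}

interface BootstrapState {
  failedAttempts: number;
  lockedUntil: number; // epoch ms; 0 means not locked
  used: boolean;
}

type BootstrapResult = "granted" | "denied" | "locked";

function tryBootstrap(
  state: BootstrapState,
  policy: BootstrapPolicy,
  candidate: string,
  now: number,
): BootstrapResult {
  if (now < state.lockedUntil) return "locked";
  // An empty key disables bootstrap; a used single-use key stays invalid.
  if (policy.key === "" || (policy.singleUse && state.used)) return "denied";
  // Plain comparison for brevity; real code would prefer a constant-time check.
  if (candidate === policy.key) {
    state.used = true;
    state.failedAttempts = 0;
    return "granted";
  }
  state.failedAttempts += 1;
  if (state.failedAttempts >= policy.maxAttempts) {
    state.lockedUntil = now + policy.lockoutMinutes * 60000;
    state.failedAttempts = 0; // assumption: counter restarts after a lockout
  }
  return "denied";
}
```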
| Capability | Normal | Admin |
|---|---|---|
| File read/write/list (within workspace) | Yes | Yes |
| File write to protected paths | No | Yes |
| Shell execution | No | Yes |
| Web fetch (policy-restricted) | Yes | Yes |
| Memory write (chat scope) | Yes | Yes |
| Memory write (global scope) | No | Yes |
| Send message (same chat) | Yes | Yes |
| Send message (cross-chat) | No | Yes |
| Register own chat | Yes | Yes |
| Register other chats | No | Yes |
| Update own tasks | Yes | Yes |
| Update other chats' tasks | No | Yes |
| Heartbeat control/status tools | No | Yes |
| MCP tool execution | No | Yes |
| MCP reload | No | Yes |
| Dead-letter queue operations | No | Yes |
| Set chat roles | No | Yes |
Non-admin fs.write is denied for: IDENTITY.md, TOOLS.md, USER.md, .mcp.json, skills/ (and any path under it).
Corebot maintains two types of persistent memory:
- Global memory (`workspace/memory/MEMORY.md`): shared across all chats. Admin-only for writes.
- Per-chat memory (`workspace/memory/{channel}_{chatId}.md`): scoped to a specific chat session.

Both are automatically included in the system prompt when available, except isolated scheduled-task runs (chat memory excluded).
When the stored message count for a chat exceeds historyMaxMessages * 2, Corebot automatically compacts:
- Recent messages are sent to the LLM to generate a bullet summary (max 150 words).
- Old messages beyond `historyMaxMessages` are pruned from storage.
- The summary is stored in `conversation_state` and included in future system prompts.

This keeps context manageable while preserving key facts and decisions.
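
As a rough illustration of the trigger and pruning split, the sketch below assumes "beyond `historyMaxMessages`" means everything except the most recent `historyMaxMessages` messages. `planCompaction` is a hypothetical helper, and the LLM summarization step is left out.

```typescript
// Hypothetical sketch of the compaction threshold; not Corebot's code.
// Returns null while the stored count is at or below historyMaxMessages * 2.
function planCompaction<T>(
  messages: T[], // oldest first
  historyMaxMessages: number,
): { prune: T[]; keep: T[] } | null {
  if (messages.length <= historyMaxMessages * 2) return null;
  const cut = messages.length - historyMaxMessages;
  return {
    prune: messages.slice(0, cut), // older messages removed from storage
    keep: messages.slice(cut),     // most recent historyMaxMessages messages
  };
}
```

With `historyMaxMessages` set to 30, nothing happens at 60 stored messages; at 61 the sketch keeps the newest 30 and prunes the other 31.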
To ensure idempotency when messages are re-queued (e.g., after a retry), Corebot maintains an inbound_executions table:
- Before processing, the router checks if the inbound message was already processed.
- If completed, the cached response is reused without re-running the LLM or tools.
- If a previous run is stale (older than `bus.processingTimeoutMs`), it is reclaimed.
- Outbound message IDs are deterministic (`outbound:{channel}:{chatId}:{inboundId}`), so re-processing does not create duplicate replies.
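
A minimal in-memory sketch of the ledger checks above follows. The `Map` stands in for the `inbound_executions` table, and `handleInbound`, the `Execution` shape, and the choice to skip a run that is still in flight are assumptions for illustration; only the outbound id format comes from the description above.

```typescript
// Hypothetical in-memory stand-in for the inbound_executions table.
type Execution =
  | { status: "running"; startedAt: number }
  | { status: "completed"; response: string };

const ledger = new Map<string, Execution>();

// Deterministic outbound id, following the format described above.
function outboundId(channel: string, chatId: string, inboundId: string): string {
  return `outbound:${channel}:${chatId}:${inboundId}`;
}

// Runs `handler` only when the inbound id has no completed or live run.
// Returns undefined when another run is still within the timeout.
function handleInbound(
  inboundId: string,
  handler: () => string,
  processingTimeoutMs: number,
  now: number,
): string | undefined {
  const prev = ledger.get(inboundId);
  if (prev && prev.status === "completed") {
    return prev.response; // cached result; LLM and tools are not re-run
  }
  if (prev && prev.status === "running" && now - prev.startedAt < processingTimeoutMs) {
    return undefined; // still in flight; do not start a second run
  }
  // No record, or a stale "running" record that is being reclaimed.
  ledger.set(inboundId, { status: "running", startedAt: now });
  const response = handler();
  ledger.set(inboundId, { status: "completed", response });
  return response;
}
```

Because the outbound id depends only on the channel, chat, and inbound id, a reply produced twice for the same inbound message maps to the same id and can be deduplicated at publish time.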
Skills live in workspace/skills/<skill-name>/SKILL.md and support frontmatter:
---
name: web-research
description: "Web search + citation formatting"
always: false
requires:
- env: ["BRAVE_API_KEY"]
tools:
- web.search
- web.fetch
---
# Web Research Skill
...

New skill directories/files are discovered dynamically during message handling, so adding a skill does not require a process restart.
Create .mcp.json in repo root:
{
"servers": {
"myserver": {
"command": "npx",
"args": ["@example/mcp-server"]
}
}
}

MCP tools are injected as: `mcp__<server>__<tool>`.
.mcp.json is checked and auto-synced during message handling; changes are applied without restart.
You can also force refresh manually with mcp.reload.
If .mcp.json is invalid (for example malformed JSON), reload is rejected and the previous MCP tool set remains active.
Each enabled server must define exactly one of command or url; args/env are only valid with command.
Use corebot preflight to validate config and .mcp.json before rolling out changes.
Reload attempts are tracked in telemetry (corebot_mcp_reload_*) and persisted in audit_events with reason/duration metadata.
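
The per-server rule and the tool naming scheme can be sketched as follows. `validateServer` and `mcpToolName` are hypothetical helpers written for this example; `corebot preflight` remains the supported way to validate a real `.mcp.json`, and its actual checks may differ.

```typescript
// Hypothetical validator for one .mcp.json server entry; not Corebot's code.
interface McpServerEntry {
  command?: string;
  url?: string;
  args?: string[];
  env?: Record<string, string>;
}

// Returns a list of problems; an empty list means the entry looks valid.
function validateServer(name: string, entry: McpServerEntry): string[] {
  const errors: string[] = [];
  const hasCommand = typeof entry.command === "string";
  const hasUrl = typeof entry.url === "string";
  if (hasCommand === hasUrl) {
    // Either both are set or neither is set.
    errors.push(`${name}: define exactly one of "command" or "url"`);
  }
  if (!hasCommand && (entry.args !== undefined || entry.env !== undefined)) {
    errors.push(`${name}: "args"/"env" are only valid with "command"`);
  }
  return errors;
}

// Builds the injected tool name in the mcp__<server>__<tool> format.
function mcpToolName(server: string, tool: string): string {
  return `mcp__${server}__${tool}`;
}
```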
Heartbeat runs as a synthetic inbound turn per chat and reuses the same router/runtime/tool stack. It supports:
- interval scheduling per chat
- wake coalescing (debounce)
- inbound-busy skip/retry gate
- ack suppression (`ackToken`)
- recent duplicate suppression (`dedupeWindowMs`)

Behavior controls live under heartbeat.* config and are also available via admin tools heartbeat.status, heartbeat.trigger, and heartbeat.enable.
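
The two suppression gates can be sketched like this. The sketch assumes a "pure ACK" is a reply that equals `ackToken` after trimming and that duplicates are detected by exact text per chat; both are guesses at the behavior, and `shouldDeliver` is a hypothetical name.

```typescript
// Hypothetical sketch of heartbeat delivery gating; not Corebot's code.
interface HeartbeatGateConfig {
  ackToken: string;
  suppressAck: boolean;
  dedupeWindowMs: number;
}

// `lastDelivered` maps reply text to the time it was last delivered (epoch ms).
function shouldDeliver(
  reply: string,
  lastDelivered: Map<string, number>,
  config: HeartbeatGateConfig,
  now: number,
): boolean {
  const text = reply.trim();
  // Gate 1: a bare ack token means "nothing to report".
  if (config.suppressAck && text === config.ackToken) return false;
  // Gate 2: the same alert was already delivered inside the dedupe window.
  const last = lastDelivered.get(text);
  if (last !== undefined && now - last < config.dedupeWindowMs) return false;
  lastDelivered.set(text, now);
  return true;
}
```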
Tasks support:
- `cron` (cron expression)
- `interval` (milliseconds)
- `once` (ISO datetime)

Scheduler emits synthetic inbound messages with context_mode:
- `group`: include chat context
- `isolated`: minimal context

For single-host reliability, track these first:
- `corebot_queue_pending{direction="inbound"}`
- `corebot_queue_dead_letter{direction="inbound"}`
- `corebot_tools_failure_rate`
- `corebot_scheduler_max_delay_ms`
- `corebot_mcp_failure_rate`
- `corebot_heartbeat_scope_sent_total{scope="delivery"}`
- `corebot_heartbeat_scope_skipped_total{scope="delivery"}`

Fast interpretation:
- rising inbound pending with flat throughput points to a handler bottleneck or provider slowdown
- a rising dead-letter count means retries are exhausted and manual replay/recovery is needed
- heartbeat sent dropping to zero while skipped rises usually means ACK suppression or duplicate suppression is dominating

- Health endpoints:
  - `GET /health/live`
  - `GET /health/ready`
  - `GET /health/startup`
- Runtime endpoints:
  - `GET /metrics` (Prometheus format)
  - `GET /status` (JSON snapshot with queue/tool/scheduler/MCP health)
- Webhook channel:
  - `POST <COREBOT_WEBHOOK_PATH>` with JSON `{chatId, content, senderId?, id?, createdAt?, metadata?}`
  - `GET <COREBOT_WEBHOOK_PATH>/outbound?chatId=<id>&limit=<n>`
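
For callers of the webhook channel, the two requests might be assembled as below. The helper names and the `HttpRequest` shape are hypothetical, the `content-type` header is an assumption, and the sketch assumes the outbound pull accepts the same bearer token as the inbound POST. The optional `createdAt` and `metadata` fields are omitted because their types are not specified here.

```typescript
// Hypothetical client-side helpers for the webhook channel; not part of Corebot.
interface WebhookInbound {
  chatId: string;
  content: string;
  senderId?: string;
  id?: string; // optional caller-supplied message id
}

interface HttpRequest {
  url: string;
  method: "GET" | "POST";
  headers: Record<string, string>;
  body?: string;
}

// POST <path> with the JSON message body.
function inboundRequest(baseUrl: string, path: string, token: string, msg: WebhookInbound): HttpRequest {
  return {
    url: `${baseUrl}${path}`,
    method: "POST",
    headers: {
      "content-type": "application/json",
      authorization: `Bearer ${token}`,
    },
    body: JSON.stringify(msg),
  };
}

// GET <path>/outbound?chatId=<id>&limit=<n> to pull queued replies.
function outboundPullRequest(
  baseUrl: string,
  path: string,
  token: string,
  chatId: string,
  limit: number,
): HttpRequest {
  return {
    url: `${baseUrl}${path}/outbound?chatId=${encodeURIComponent(chatId)}&limit=${limit}`,
    method: "GET",
    headers: { authorization: `Bearer ${token}` },
  };
}
```

Either request object can then be handed to an HTTP client, for example `fetch(req.url, { method: req.method, headers: req.headers, body: req.body })`.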
Detailed incident and recovery procedures are documented in RUNBOOK.md.
workspace/
  IDENTITY.md
  USER.md
  TOOLS.md
  memory/
    MEMORY.md
  skills/
    <skill-name>/SKILL.md

- WhatsApp / Telegram adapters
- Container sandbox for tools
- Additional provider adapters
- Multi-instance coordination and queue partitioning
Corebot is inspired by NanoClaw + NanoBot patterns.
For the full architecture details, see ARCHITECTURE.md. For the operations runbook, see RUNBOOK.md. For contribution guidelines, see CONTRIBUTING.md.