This project is small. Nearly all runtime behavior lives in ipycodex/core.py and ipycodex/codex_client.py, so getting productive mainly means understanding those files and the tests in tests/test_core.py.
Install in editable mode:

```
pip install -e .[dev]
```

Run tests:

```
pytest
```

This repo is configured for fastship releases:

```
ship-changelog
ship-release
```

Implemented:
- period-to-magic rewriting using IPython cleanup transforms
- multiline prompts with backslash-Enter continuation
- notes: string-literal-only cells detected via `ast` and sent as `<note>` blocks in context
- session-scoped prompt persistence in SQLite
- startup snapshot save/replay through `startup.ipynb` (nbformat v4.5 with cell IDs)
- notes saved as markdown cells, code as code cells, prompts as markdown with metadata
- dynamic code/output/note context reconstruction
- unified tool discovery from prompts, skills, notes, and tool responses via `_tool_refs()`
- `_parse_frontmatter()` shared helper for extracting YAML frontmatter from skills, notes, and tool results
- `allowed-tools` frontmatter key in skills and notes for declaring tool dependencies
- tool results with qualifying frontmatter (`allowed-tools` or `eval: true`) contribute tools
- Agent Skills discovery from `.agents/skills/` (CWD + parents) and `~/.config/agents/skills/`
- `load_skill` tool added to `user_ns` at init time, resolved via normal tool mechanism (not special-cased)
- skills list frozen at extension load time (security: prevents LLM from creating and loading skills mid-session)
- streaming responses with live Rich markdown rendering in TTY
- Codex app-server transport over stdio JSON-RPC (newline-delimited messages via the local `codex app-server`)
- one ephemeral Codex thread per prompt/completion call, with prior chat history serialized into the current turn input
- Codex dynamic tools backed by IPython callables, so `&tool` refs now execute through app-server `item/tool/call`
- live streaming of server-side `commandExecution` stdout in TTY sessions, while stored responses keep only the final command detail block
- model reasoning streamed as blockquoted text during display and stored in `<thinking>` blocks
- tool call display compacted to single-line `🔧 f(x=1) => 2` form
- AI inline completion via Alt-. (calls `completion_model` with session context, shows as a prompt_toolkit suggestion; partial accept via M-f preserves the remaining suggestion)
- keyboard shortcuts: Alt-Up/Down (history jump), Alt-Shift-W (all code blocks), Alt-Shift-1..9 (nth block), Alt-Shift-Up/Down (cycle blocks) via prompt_toolkit
- code block extraction uses the `mistletoe` markdown parser (not regex) for correctness
- syntax highlighting disabled for `.` prompts and `%%ipycodex` cells (patches `IPythonPTLexer` at class level)
- XDG-backed config, startup, and system prompt files
- optional exact raw prompt/response logging
- skill eval blocks: `#| eval: true` python code blocks in skills are executed via `shell.run_cell` when loaded
- per-directory session persistence: CWD stored in IPython `sessions.remark`, session resume via `resume_session()`
- interactive session picker via `prompt_toolkit.radiolist_dialog` for `ipycodex -r`
- `%ipycodex sessions` command listing resumable sessions with last prompt preview
- `ipycodex` CLI entry point (console script) launching IPython with ipythonng + ipycodex + output history
- minimal IPython compatibility patches for `SyntaxTB` and `inspect.getfile` (guarded with `once=True` to coexist with ipykernel_helper)
- ipycodex/core.py: extension logic, XDG path globals, config loading, prompt/history building, tool resolution, skill discovery, session persistence/resume, async streaming, Rich rendering, keybindings
- ipycodex/codex_client.py: local Codex app-server client, stdio JSON-RPC transport, ephemeral thread/turn orchestration, dynamic tool dispatch, and tool/command item rendering
- ipycodex/cli.py: `ipycodex` console script entry point; parses flags via `ipythonng.cli.parse_flags`, launches IPython with extensions and output history
- ipycodex/__init__.py: package exports and version
- tests/test_core.py: focused unit tests for transformation, history, config, tools, notes, skills, sessions, rendering, and thinking display
- tests/test_codex_client.py: focused tests for the Codex wrapper layer
- pyproject.toml: packaging, console script (`ipycodex`), and fastship configuration
- .agents/skills/: project-local Agent Skills
Each AI prompt is saved in an `ai_prompts` table inside IPython's history SQLite database. Rows are keyed by the current IPython `session_number` and include:

- `prompt`
- `response`
- `history_line`
Stored rows contain only the user prompt, full AI response, and the line where the code context for that prompt stops.
Example:
```
In [1]: import math
In [2]: .first prompt
In [3]: x = 1
In [4]: .second prompt
```

The stored rows are roughly:

- first prompt: `history_line=1`
- second prompt: `history_line=3`
So for the second prompt, ipycodex knows:
- the code context before it should include `x = 1`, but not `import math`
- the prompt itself happened immediately after line 3
For each new prompt, ipycodex reconstructs chat history as alternating user / assistant entries:
- the user entry is `<context>...</context><user-request>...</user-request>`
- the assistant entry is the stored full response
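The reconstruction can be sketched roughly like this (hypothetical helper names and simplified range arithmetic; the real `dialog_history()` works from the stored rows and the context builder described below):

```python
def dialog_history(rows, build_context):
    # rows: stored (prompt, response, history_line) tuples for this
    # session, oldest first. build_context(start, stop) rebuilds the
    # <context> block from IPython history lines start..stop-1.
    msgs, prev_line = [], 0
    for prompt, response, line in rows:
        ctx = build_context(prev_line + 1, line + 1)
        msgs.append(("user", f"{ctx}<user-request>{prompt}</user-request>"))
        msgs.append(("assistant", response))
        prev_line = line
    return msgs
```

Each stored prompt contributes one user entry (regenerated context plus the raw prompt) and one assistant entry (the stored full response).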
The `<context>` block contains all non-ipycodex code run since the previous AI prompt in the current session, plus `Out[...]` history when IPython has it. String-literal-only cells are sent as `<note>` instead of `<code>` (detected via `ast`). The XML is intentionally simple:

```
<context><code>a = 1</code><note>This is a note</note><code>a</code><output>1</output></context>
```

The extension lifecycle is:
- `%load_ext ipycodex` calls `load_ipython_extension`, which parses `IPYTHONNG_FLAGS` and delegates to `create_extension`.
- `create_extension` ensures the `ai_prompts` table exists, optionally resumes a session (or shows the interactive picker), creates the extension, stores CWD in `sessions.remark`, and registers the atexit handler.
- `IPyAIExtension.__init__` loads config, system prompt, discovers skills, and loads the startup file.
- `IPyAIExtension.load()` registers `%ipycodex`/`%%ipycodex`, inserts a cleanup transform into IPython's `input_transformer_manager.cleanup_transforms`, registers keybindings, and applies `startup.ipynb` if the session is still fresh.
- Any cell whose first character is `.` is rewritten by `transform_dots()` into `get_ipython().run_cell_magic('ipycodex', '', prompt)`.
- `AIMagics.ipycodex()` routes line input to `handle_line()` and cell input directly to the `_run_prompt()` coroutine (returned to the async `run_cell_magic` patch for awaiting).
- `_run_prompt()` reconstructs conversation history, resolves tools, adds skills tools/system prompt if skills were discovered, runs the local `AsyncChat` wrapper from `ipycodex.codex_client`, streams the response, optionally writes an exact log entry, and stores the full response.
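A simplified stand-in for the rewrite step (the real `transform_dots()` also handles backslash-Enter continuation and other edge cases):

```python
def transform_dots(lines):
    # Cleanup transform: rewrite a cell whose first character is "."
    # into a run_cell_magic call carrying the raw prompt (minus the dot).
    if lines and lines[0].startswith("."):
        prompt = "".join(lines)[1:]
        return [f"get_ipython().run_cell_magic('ipycodex', '', {prompt!r})\n"]
    return lines
```

Because the prompt is embedded via `repr`, characters like `?` or quotes survive the rewrite untouched.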
The Codex wrapper currently starts a fresh ephemeral app-server thread for each prompt and completion request. Prior dialog history is serialized into a `<conversation-history>` block prepended to the current turn input, while the current system prompt is sent as Codex `developerInstructions`. This keeps ipycodex's existing SQLite-backed history and session replay model intact without needing to persist Codex thread IDs.
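The prepend step might look like this hypothetical sketch (the exact serialization format inside the wrapper is an implementation detail; helper names here are illustrative):

```python
def history_block(turns):
    # turns: (role, text) pairs from the stored dialog.
    inner = "".join(f"<{role}>{text}</{role}>" for role, text in turns)
    return f"<conversation-history>{inner}</conversation-history>"

def turn_input(turns, user_entry):
    # Prepend prior history (if any) to the current turn's user entry.
    return (history_block(turns) if turns else "") + user_entry
```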
At import time, ipycodex also applies two small global IPython bugfixes (shared with ipykernel_helper, guarded with `once=True` so only the first loader applies them):

- `SyntaxTB.structured_traceback` coerces non-string `evalue.msg` values to `str`
- `inspect.getfile` is wrapped to always return a string
The period rewrite happens in `cleanup_transforms`, not in a later input transformer. That matters because IPython's own parsing for help syntax and similar features can interfere with raw prompts if the rewrite happens too late.
This is the mechanism that makes these cases work correctly:
- multiline pasted prompts
- prompts containing `?`
- backslash-Enter continuation
The stored prompt text is not the exact user message sent to the model. The actual user entry is built dynamically with:
```
{context}<user-request>{prompt}</user-request>
```

`context` is empty when there has been no intervening code. Otherwise it is:

```
<context><code>...</code><note>...</note><output>...</output>...</context>
```

Important detail: only the raw prompt and raw response are stored in SQLite. Context is regenerated on each run from normal IPython history. That keeps the table small and avoids baking transient context into stored rows.
ipycodex uses IPython's existing history database connection at `shell.history_manager.db`.
Table schema:
```sql
CREATE TABLE IF NOT EXISTS ai_prompts (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    session INTEGER NOT NULL,
    prompt TEXT NOT NULL,
    response TEXT NOT NULL,
    history_line INTEGER NOT NULL DEFAULT 0
)
```

Notes:
- rows are scoped by IPython `session_number`
- `history_line` is used to decide which code cells belong in the next prompt's generated `<context>` block
- if `ai_prompts` does not match the expected schema, ipycodex drops and recreates it instead of migrating it
- `%ipycodex reset` deletes only current-session rows and sets a reset baseline in `user_ns`
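The drop-and-recreate behavior can be sketched against a plain sqlite3 connection (a simplified stand-in for the real `ensure_prompt_table()`):

```python
import sqlite3

SCHEMA = """CREATE TABLE IF NOT EXISTS ai_prompts (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    session INTEGER NOT NULL,
    prompt TEXT NOT NULL,
    response TEXT NOT NULL,
    history_line INTEGER NOT NULL DEFAULT 0
)"""
EXPECTED = ["id", "session", "prompt", "response", "history_line"]

def ensure_prompt_table(db):
    # On schema mismatch, drop and recreate rather than migrate.
    cols = [row[1] for row in db.execute("PRAGMA table_info(ai_prompts)")]
    if cols and cols != EXPECTED:
        db.execute("DROP TABLE ai_prompts")
    db.execute(SCHEMA)
```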
`startup.ipynb` is stored as a Jupyter notebook (nbformat v4.5 with cell IDs) next to the other XDG files.

`%ipycodex save` writes a merged event stream for the current session as notebook cells:
- code events become code cells (with `metadata.ipycodex.kind="code"`)
- string-literal-only code (notes) becomes markdown cells (with original source preserved in `metadata.ipycodex.source` for round-trip replay)
- prompt events become markdown cells containing the AI response (with prompt text in `metadata.ipycodex.prompt`)
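The cell shape can be illustrated with plain dicts (the real code goes through nbformat; `event_to_cell` and its exact field choices here are illustrative):

```python
import uuid

def event_to_cell(kind, text, **meta):
    # Build a minimal nbformat-4.5-style cell dict by hand; the on-disk
    # notebook is just JSON, so the shape is easy to show directly.
    cell_type = "code" if kind == "code" else "markdown"
    cell = {
        "id": uuid.uuid4().hex[:8],          # v4.5 requires a cell ID
        "cell_type": cell_type,
        "source": text,
        "metadata": {"ipycodex": {"kind": kind, **meta}},
    }
    if cell_type == "code":
        cell.update(execution_count=None, outputs=[])
    return cell
```

A note event would pass the original string-literal source in metadata so replay can reconstruct the exact input.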
On a fresh load:
- code cells (including notes) are replayed with `run_cell(..., store_history=True)`
- prompt cells are restored into `ai_prompts` from metadata
- `execution_count` is advanced for restored prompt events so later saves preserve ordering
Legacy `startup.json` files (pre-notebook format) are still supported for loading.
ipycodex stores the working directory in IPython's `sessions.remark` column (an unused TEXT field) at extension load time. This enables per-directory session listing and resume.
Key functions:
- `_list_sessions(db, cwd)` → queries sessions for the given directory, falls back to git repo root exact match; includes the last AI prompt per session via a subquery on `ai_prompts`
- `_fmt_session()` → formats a session row for display (shared by `%ipycodex sessions` and the interactive picker)
- `_pick_session(rows)` → interactive `radiolist_dialog` picker from prompt_toolkit
- `resume_session(shell, session_id)` → deletes the fresh session row, restores `session_number` and `execution_count`, pads `input_hist_parsed`/`input_hist_raw`, reopens the old session (clears the `end` timestamp)
Resume is triggered by the `IPYTHONNG_FLAGS` env var (set by the ipycodex CLI when `-r` is passed). The `_ng_parser` (argparse) parses `-r <id>` or bare `-r` (`const=-1` for the interactive picker).
On exit, an atexit handler prints the session ID for easy resume.
Skills follow the Agent Skills specification. Discovery happens once at extension init time via `_discover_skills()`:
- Walk from CWD up through all parent directories, scanning `.agents/skills/` in each
- Scan `~/.config/agents/skills/`
- Deduplicate by resolved path; closer-to-CWD skills take priority
Each skill directory must contain a `SKILL.md` with YAML frontmatter (`name`, `description`). Frontmatter is parsed with PyYAML.
At runtime, if skills were discovered:
- the system prompt gets a `<skills>` section listing all skill names, paths, and descriptions
- a `load_skill` tool is added to the tools list (reads `SKILL.md` and returns as `FullResponse`)
- the tool namespace is a merged copy of `user_ns` (does not pollute the user's namespace)
The skills list is frozen at load time to prevent the LLM from creating and loading skills during a session.
`code_context(start, stop)` pulls normal IPython history with:

```python
history_manager.get_range(session=0, start=start, stop=stop, raw=True, output=True)
```

Rules:

- inputs that look like ipycodex commands (starting with `.` or `%ipycodex`) are skipped
- string-literal-only cells (detected by `_is_note` via `ast.parse`) become `<note>` tags containing the string value
- normal code becomes `<code>...</code>`
- output history, when present, becomes `<output>...</output>`
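The note detection rule can be sketched as (a simplified stand-in for `_is_note`):

```python
import ast

def is_note(source):
    # A "note" is a cell whose AST is a single expression statement
    # holding a string constant; anything else is ordinary code.
    try:
        tree = ast.parse(source)
    except SyntaxError:
        return False
    return (len(tree.body) == 1
            and isinstance(tree.body[0], ast.Expr)
            and isinstance(tree.body[0].value, ast.Constant)
            and isinstance(tree.body[0].value.value, str))
```

Parsing with `ast` rather than a regex means triple-quoted strings, prefixes, and embedded quotes are handled for free.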
Tool references are written in prompts as `&name`.

Tools are discovered from multiple sources via `_tool_refs()`:

- `&name` in the current prompt and prior prompts in dialog history
- `allowed-tools` frontmatter and `&name` mentions in skills
- `&name` mentions and `allowed-tools` frontmatter in notes (string-literal cells)
- tool results in stored AI responses whose frontmatter contains `allowed-tools` or `eval: true`
Shared helpers:
- `_parse_frontmatter(text)` extracts YAML frontmatter from any text (reused by skills, notes, and tool results)
- `_allowed_tools(text)` combines frontmatter `allowed-tools` and `&name` mentions into a set of tool names
- `_tool_results(response)` scans stored response `<details>` blocks for qualifying tool results
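A sketch of the ref-extraction side (the `&name` regex and the `allowed-tools` merge are illustrative, not the exact helpers):

```python
import re

_TOOL_REF = re.compile(r"&([A-Za-z_]\w*)")

def tool_refs(text):
    # All &name mentions in a prompt, note, or skill body.
    return set(_TOOL_REF.findall(text))

def allowed_tools(meta, text):
    # Union of the allowed-tools frontmatter value (list or
    # comma-separated string) and inline &name refs.
    declared = meta.get("allowed-tools") or []
    if isinstance(declared, str):
        declared = [t.strip() for t in declared.split(",")]
    return set(declared) | tool_refs(text)
```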
`resolve_tools()`:

- validates tools from the current prompt (raises `NameError`/`TypeError` for missing or non-callable)
- collects all tool names from all sources via `_tool_refs()`
- silently skips tools from non-prompt sources that are missing from `user_ns`
- builds tool schemas with `get_schema_nm(...)` so the exposed tool name matches the namespace symbol instead of `__call__` for callable objects
- passes those schemas to `ipycodex.codex_client.AsyncChat(..., tools=...)`, which exposes them to Codex app-server as `dynamicTools`
The `load_skill` tool is added to `user_ns` at extension init time when skills are discovered. It is resolved through the normal tool mechanism (skills always contribute `load_skill` to the tool name set) rather than being special-cased in `_run_prompt`.
The tool lookup is intentionally live against the active namespace, so changing a function in the IPython session changes the tool used by subsequent prompts. Async callables are awaited inside `ipycodex.codex_client` before their results are returned to app-server.
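The live-lookup-plus-await behavior can be sketched as (a hypothetical helper, not the actual dispatch code in `ipycodex.codex_client`):

```python
import asyncio
import inspect

async def call_tool(user_ns, name, kwargs):
    # Live lookup: redefining the function in the session changes what
    # later prompts call. Async callables are awaited before returning.
    fn = user_ns[name]
    result = fn(**kwargs)
    if inspect.isawaitable(result):
        result = await result
    return result
```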
Streaming and storage are deliberately separated.
`astream_to_stdout()`:

- uses `ipycodex.codex_client.AsyncStreamFormatter` to iterate the response stream
- in a TTY, updates a `rich.live.Live` view with `Markdown(...)` as chunks arrive
- outside a TTY, writes raw chunks to stdout
- returns the full original text for storage
Display processing (`_display_text`):

- `_thinking_to_blockquote` converts stored `<thinking>` blocks to `>` blockquote markdown for display
- `compact_tool_display` rewrites stored Codex command/tool detail blocks to a short `🔧 f(x=1) => 2` form
- these affect only the visible terminal output; SQLite keeps the original response
ipycodex wraps the streaming phase in a small guard that temporarily marks `shell.display_pub._is_publishing = True`. That keeps terminal-visible AI output out of IPython's normal stdout capture and therefore out of output_history, while still allowing ipycodex to store the full response in `ai_prompts`.
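The guard can be sketched as a small context manager (illustrative; the real code may structure this differently):

```python
from contextlib import contextmanager

@contextmanager
def suppress_capture(display_pub):
    # Mark the publisher as busy so streamed AI output bypasses
    # IPython's stdout capture (and thus output_history), then
    # restore the previous flag even if streaming raises.
    prev = getattr(display_pub, "_is_publishing", False)
    display_pub._is_publishing = True
    try:
        yield
    finally:
        display_pub._is_publishing = prev
```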
Registered via prompt_toolkit on `shell.pt_app.key_bindings` during `load()`:

- `escape, .` (Alt-.): AI inline completion → calls `_ai_complete()` as a background task, which builds a prompt from session context plus the current prefix/suffix and calls the configured `completion_model`. The result is set as `buffer.suggestion` (prompt_toolkit's auto-suggest display), accepted with right-arrow or word-at-a-time with M-f. IPython's existing auto-suggest `get_suggestion` is patched to remember the AI target text so partial accepts regenerate the remainder. Cancels safely if the buffer text changes before the response arrives.
- `escape, up`/`escape, down` (Alt-Up/Down): jump through complete history entries, bypassing line-by-line navigation in multiline inputs (calls `buffer.history_backward()`/`history_forward()`)
- `escape, W` (Alt-Shift-W): insert all Python code blocks from `_ai_last_response`
- `escape, !` through `escape, (` (Alt-Shift-1 through Alt-Shift-9): insert the Nth code block
- `escape, s-up`/`escape, s-down` (Alt-Shift-Up/Down): cycle through code blocks one at a time, replacing the buffer contents; prompt_toolkit swaps A/B for modifier-4 (Alt+Shift) arrows, so the bindings are intentionally inverted
Code blocks are extracted using `mistletoe.Document` and `CodeFence`; only blocks tagged `python` or `py` are included.
XDG-backed module globals are defined at import time:
- `CONFIG_PATH`: model, think, search, Rich code theme, and the exact-log flag
- `SYSP_PATH`: system prompt passed as Codex `developerInstructions`
- `STARTUP_PATH`: saved startup snapshot (`.ipynb` format)
- `LOG_PATH`: optional raw prompt/response log output
Creation behavior:
- these files are created on demand when first needed
- the initial `model` defaults from `IPYAI_MODEL` if present
- runtime `%ipycodex model ...` and similar commands change only the live extension object, not the config file
When `log_exact` is enabled, the log file contains the exact fully-expanded prompt passed to the model and the exact raw response returned from the stream.
To run ipycodex in isolation (no user config, startup, or history), set these environment variables:
- `XDG_CONFIG_HOME` → redirects ipycodex's config files (`config.json`, `sysp.txt`, `startup.ipynb`)
- `IPYTHON_DIR` → redirects IPython's profile directory (prevents loading user `ipython_config.py` and startup scripts)
- `--HistoryManager.hist_file=<path>` → isolates the history database
The e2e test uses all three to create a fully isolated session via pexpect.
The test suite uses dummy shell, history, chat, formatter, console, and markdown objects.
Coverage currently focuses on:
- period prompt parsing and continuation handling
- cleanup-transform rewriting
- prompt/history persistence
- context generation including notes (`<note>` tags)
- tool resolution including unified discovery from skills, notes, and tool responses
- frontmatter parsing (`_parse_frontmatter`) and `allowed-tools` extraction
- config and system prompt file creation
- startup save/replay in ipynb format with cell IDs
- startup round-trip for notes (markdown cells with preserved source)
- raw exact logging
- Rich live markdown rendering
- thinking block stripping
- skill discovery, parsing, XML generation, `load_skill`, and eval blocks
- skills integration in `_run_prompt`
- session persistence: CWD in remark, list sessions, resume session
- code block extraction
When changing behavior in ipycodex/core.py, update or add the narrowest possible test in tests/test_core.py.
If you want to change prompt parsing or magic routing:
- edit `is_dot_prompt()`, `prompt_from_lines()`, or `transform_dots()`
If you want to change the XML or history sent to the model:
- edit `_prompt_template`, `code_context()`, `format_prompt()`, or `dialog_history()`
If you want to change notes behavior:
- edit `_is_note()`, `_note_str()`, and the note handling in `code_context()`
If you want to change tool behavior:
- edit `_tool_names()`, `_tool_refs()`, `_parse_frontmatter()`, `_allowed_tools()`, `_tool_results()`, or `resolve_tools()`
If you want to change skills:
- edit `_parse_skill()`, `_discover_skills()`, `_skills_xml()`, `load_skill()`, and the skills/tool collection in `_run_prompt()`
If you want to change terminal rendering:
- edit `_display_text()`, `_strip_thinking()`, `compact_tool_display()`, `_astream_to_live_markdown()`, `_markdown_renderable()`, or `astream_to_stdout()`
If you want to change persistence:
- edit `ensure_prompt_table()`, `prompt_records()`, `save_prompt()`, `save_startup()`, `apply_startup()`, and `reset_session_history()`
If you want to change the startup notebook format:
- edit `_event_to_cell()`, `_cell_to_event()`, `_default_startup()`, `load_startup()`, and `save_startup()`
If you want to change keybindings:
- edit `_register_keybindings()` and `_extract_code_blocks()`
If you want to change AI inline completion:
- edit `_ai_complete()`, `_COMPLETION_SP`, and the `escape, .` binding in `_register_keybindings()`
If you want to change syntax highlighting:
- edit `_patch_lexer()`
If you want to change session persistence or resume:
- edit `_list_sessions()`, `_fmt_session()`, `_pick_session()`, `resume_session()`, the `sessions` case in `handle_line()`, and the session handling in `create_extension()`
- the primary target is terminal IPython
- prompt rows should remain compact; dynamic context generation is preferred over storing expanded prompts
- stored responses should keep full fidelity, even when terminal rendering is simplified
- skills are discovered once at load time and never re-scanned during a session