Zack Fitch johnzfitch

OpenAI Codex: Finding the Ghost in the Machine

TL;DR: Solved a pre-main() environment-stripping bug causing 11-300x GPU slowdowns that eluded OpenAI's debugging team for months (Issue #8945, PR #8951).

Full Investigation Details

The Ghost

In October 2025, OpenAI assembled a specialized debugging team to investigate mysterious slowdowns affecting Codex, the coding tool OpenAI uses to write its own code. After a week of intensive investigation, the team had found nothing.

The bug was literally a ghost—pre_main_hardening() executed before main(), stripped critical environment variables (LD_LIBRARY_PATH, DYLD_LIBRARY_PATH), and disappeared without a trace. Standard profilers saw nothing. Users saw variables in their shell, but inside codex exec they vanished.

The Hunt

Within 3 days of their announcement, I identified the problematic commit (PR #4521) and contacted @tibo-openai.

But identification isn't proof. I spent 2 months building an undeniable case:

Timeline

Sept 30, 2025 — PR #4521 merges, enabling pre_main_hardening() in release builds
Oct 1, 2025 — rust-v0.43.0 ships (first affected release)
Oct 6, 2025 — First "painfully slow" regression reports
Oct 1-29, 2025 — Spike in env/PATH inheritance issues across platforms
Oct 29, 2025 — Emergency PATH fix lands (didn't catch root cause)
Late Oct 2025 — OpenAI's specialized team investigates, declares there is no root cause, identifies issue as user behavior change.
Jan 9, 2026 — My fix merged, credited in release notes

Evidence Collected

Platform	Issues	Failure Mode
macOS	#6012, #5679, #5339, #6243, #6218	`DYLD_*` stripping breaking dynamic linking
Linux/WSL2	#4843, #3891, #6200, #5837, #6263	`LD_LIBRARY_PATH` stripping → silent CUDA/MKL degradation

Compiled evidence packages:

Platform-specific failure modes and reproduction steps
Quantifiable performance regressions (11-300x) with benchmarks
Pattern analysis across 15+ scattered user reports over 3 months
Process environment inheritance trace through fork/exec boundaries

📄 Comprehensive Technical Analysis
📄 Investigation Methodology

Why Conventional Debugging Failed

The bug was invisible by construction:

Pre-main execution — Used #[ctor::ctor] to run before main(), before any logging/instrumentation
Silent stripping — No warnings, no errors, just missing environment variables
Distributed symptoms — Appeared as unrelated issues across different platforms/configs
User attribution — Everyone assumed they misconfigured something (shell looked fine!)
Wrong search space — Team was debugging post-main application code

Standard debugging tools can't see pre-main execution. Profilers start at main(). Log hooks aren't initialized yet. The code executes, modifies the environment, and vanishes.
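The stripping step and the check that exposes it can be modelled in a few lines. This is a minimal sketch, not the Codex source: the function names `strip_loader_vars` and `missing_after`, the `Env` alias, and the `LD_`/`DYLD_` prefix list are assumptions chosen for illustration. The hook is modelled as a pure function over an environment snapshot, so no pre-main mechanism is needed to see its effect.

```rust
use std::collections::BTreeMap;

/// Snapshot of an environment: variable name -> value.
type Env = BTreeMap<String, String>;

/// Hypothetical model of the hardening step: silently drop every variable
/// whose name starts with one of the given prefixes. No warning is emitted,
/// which mirrors why nothing showed up in logs.
fn strip_loader_vars(env: &Env, prefixes: &[&str]) -> Env {
    env.iter()
        .filter(|(name, _)| !prefixes.iter().any(|p| name.starts_with(*p)))
        .map(|(name, value)| (name.clone(), value.clone()))
        .collect()
}

/// Names present in `before` but absent from `after`, in sorted order.
fn missing_after(before: &Env, after: &Env) -> Vec<String> {
    before
        .keys()
        .filter(|name| !after.contains_key(*name))
        .cloned()
        .collect()
}

fn main() {
    // Illustrative values only; real paths differ per machine.
    let before: Env = [
        ("PATH", "/usr/bin"),
        ("LD_LIBRARY_PATH", "/opt/cuda/lib64"),
        ("DYLD_LIBRARY_PATH", "/opt/lib"),
    ]
    .iter()
    .map(|(k, v)| (k.to_string(), v.to_string()))
    .collect();

    let after = strip_loader_vars(&before, &["LD_", "DYLD_"]);

    for name in missing_after(&before, &after) {
        println!("missing in child: {name}");
    }
}
```

In practice the two snapshots would come from dumping the environment in the user's shell and again inside the subprocess. Comparing them is what makes the loss visible when logs and profilers show nothing.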

The Impact

OpenAI confirmed and merged the fix within 24 hours, explicitly crediting the investigation in the v0.80.0 release notes on GitHub and on the release web page (rust-v0.80.0):

"Codex CLI subprocesses again inherit env vars like LD_LIBRARY_PATH/DYLD_LIBRARY_PATH to avoid runtime issues. As explained in #8945, failure to pass along these environment variables to subprocesses that expect them (notably GPU-related ones), was causing 10x+ performance regressions! Special thanks to @johnzfitch for the detailed investigation and write-up in #8945."

Restored:

GPU acceleration for their own internal ML/AI development teams
CUDA/PyTorch functionality for ML researchers globally
MKL/NumPy performance for scientific computing users
Conda environment compatibility
Enterprise database driver support

When the tools are blind, the system lies, and everyone else has stopped looking: this is the class of problem I specialize in.

---

Software engineer with a mathematics background, specializing in systems programming, security research, and AI/ML applications. I build production tools across the full stack, from WebGPU-accelerated browser applications to Rust CLI tools to bare-metal NixOS infrastructure.

More Projects:

Observatory - WebGPU deepfake detection running 4 ML models in-browser (live demo)
specHO - LLM watermark detection via phonetic/semantic analysis (The Echo Rule)
filearchy - COSMIC Files fork with sub-10ms trigram search (Rust)
nautilus-plus - Enhanced GNOME Files with sub-millisecond search (AUR)
indepacer - PACER CLI for federal court research (PyPI: pacersdk)

Self-hosting bare-metal infrastructure (NixOS) with post-quantum cryptography (ML-KEM, Rosenpass VPN), authoritative DNS, and containerized services.

Featured

Observatory - WebGPU Deepfake Detection

Live Demo: look.definitelynot.ai

Browser-based AI image detection running 4 specialized ML models (ViT, Swin Transformer) through WebGPU. Zero server-side processing—all inference happens client-side with 672MB of ONNX models.

Model	Accuracy	Architecture
dima806_ai_real	98.2%	Vision Transformer
SMOGY	98.2%	Swin Transformer
Deep-Fake-Detector-v2	92.1%	ViT-Base
umm_maybe	94.2%	Vision Transformer

Stack: JavaScript (ES6), Transformers.js, ONNX, WebGPU/WASM
Design: 2006 "Purist" UI aesthetic - no frameworks, pure web standards

iconics - Semantic Icon Library

3,372+ PNG icons with semantic CLI discovery. Find the right icon by meaning, not filename.

icon suggest security       # → lock, shield, key, firewall...
icon suggest data           # → chart, database, folder...
icon use lock shield        # Export to ./icons/

Features: Fuzzy search, theme variants, batch export, markdown integration
Stack: Python, FuzzyWuzzy, PIL

filearchy + triglyph - Sub-10ms File Search

COSMIC Files fork with an embedded trigram search engine. Memory-mapped indices achieve sub-10ms searches across 2.15M+ files with ~0 bytes resident memory.

filearchy/
├── triglyph/      # Zero-RSS trigram library (mmap, ~0 bytes resident)
└── triglyphd/     # D-Bus daemon for system-wide search

Performance: 2.15M files indexed, sub-10ms query time, 156MB index on disk
Stack: Rust, libcosmic, memmap2, zbus
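The general idea behind trigram search can be sketched as an in-memory inverted index. This is an illustration of the technique, not the triglyph implementation: the `TrigramIndex` type and its methods are hypothetical, and the sketch keeps everything on the heap instead of using memory-mapped files.

```rust
use std::collections::{BTreeSet, HashMap};

/// Split a string into overlapping 3-byte windows of its lowercased form.
fn trigrams(s: &str) -> Vec<[u8; 3]> {
    let bytes = s.to_lowercase().into_bytes();
    bytes.windows(3).map(|w| [w[0], w[1], w[2]]).collect()
}

/// Hypothetical in-memory index: trigram -> ids of paths containing it.
struct TrigramIndex {
    paths: Vec<String>,
    postings: HashMap<[u8; 3], BTreeSet<usize>>,
}

impl TrigramIndex {
    fn new() -> Self {
        TrigramIndex { paths: Vec::new(), postings: HashMap::new() }
    }

    /// Register a path under every trigram it contains.
    fn add(&mut self, path: &str) {
        let id = self.paths.len();
        self.paths.push(path.to_string());
        for gram in trigrams(path) {
            self.postings.entry(gram).or_default().insert(id);
        }
    }

    /// Case-insensitive substring search, returned in insertion order.
    fn search(&self, query: &str) -> Vec<&str> {
        let needle = query.to_lowercase();
        let grams = trigrams(query);

        // Queries shorter than three bytes have no trigrams: scan linearly.
        if grams.is_empty() {
            return self
                .paths
                .iter()
                .filter(|p| p.to_lowercase().contains(&needle))
                .map(|p| p.as_str())
                .collect();
        }

        // Intersect posting lists; an unknown trigram means no match at all.
        let mut candidates: Option<BTreeSet<usize>> = None;
        for gram in &grams {
            let ids = match self.postings.get(gram) {
                Some(ids) => ids,
                None => return Vec::new(),
            };
            candidates = Some(match candidates {
                None => ids.clone(),
                Some(mut set) => {
                    set.retain(|id| ids.contains(id));
                    set
                }
            });
        }

        // Sharing all trigrams is necessary but not sufficient, so verify.
        candidates
            .unwrap_or_default()
            .into_iter()
            .map(|id| self.paths[id].as_str())
            .filter(|p| p.to_lowercase().contains(&needle))
            .collect()
    }
}

fn main() {
    let mut index = TrigramIndex::new();
    index.add("/home/user/report.pdf");
    index.add("/home/user/notes.txt");
    index.add("/srv/Reports/q3.csv");
    for hit in index.search("report") {
        println!("{hit}");
    }
}
```

The final substring check matters because a path can contain every trigram of the query without containing the query itself. The index only narrows the candidate set.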

The Echo Rule - LLM Detection Methodology

LLMs echo their training data. That echo is detectable through pattern recognition:

Signature	Detection Method
Phonetic	CMU phoneme analysis, Levenshtein distance
Structural	POS tag patterns, sentence construction
Semantic	Word2Vec cosine similarity, hedging clusters

Implemented in specHO with 98.6% preprocessor test pass rate. Live demo at definitelynot.ai.
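As an illustration of the phonetic row, Levenshtein distance can be computed over token sequences with a two-row dynamic program. This is a generic sketch, not specHO's code: the `levenshtein` function is written here for illustration, and the ARPAbet-style phoneme tokens in `main` are example inputs with stress markers omitted.

```rust
/// Edit distance between two sequences: the minimum number of insertions,
/// deletions, and substitutions needed to turn `a` into `b`.
fn levenshtein<T: PartialEq>(a: &[T], b: &[T]) -> usize {
    // prev[j] holds the distance between the processed prefix of `a`
    // and the first j items of `b`.
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    let mut curr: Vec<usize> = vec![0; b.len() + 1];

    for (i, item_a) in a.iter().enumerate() {
        curr[0] = i + 1;
        for (j, item_b) in b.iter().enumerate() {
            let substitution = prev[j] + if item_a == item_b { 0 } else { 1 };
            let deletion = prev[j + 1] + 1;
            let insertion = curr[j] + 1;
            curr[j + 1] = substitution.min(deletion).min(insertion);
        }
        std::mem::swap(&mut prev, &mut curr);
    }
    prev[b.len()]
}

fn main() {
    // Illustrative phoneme sequences for two similar-sounding words.
    let first = ["K", "IH", "T", "AH", "N"];
    let second = ["M", "IH", "T", "AH", "N"];
    println!("phoneme distance: {}", levenshtein(&first, &second));

    // The same function works on characters.
    let a: Vec<char> = "kitten".chars().collect();
    let b: Vec<char> = "sitting".chars().collect();
    println!("character distance: {}", levenshtein(&a, &b));
}
```

Working on tokens rather than raw characters lets the same routine compare phoneme sequences, where one substitution corresponds to one changed sound.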

Skills

Core: Rust · Python · TypeScript · C · Nix · Shell

Projects

AI/ML

Project	Description	Stack
observatory	WebGPU deepfake detection, 4 ML models client-side · live	JS, Transformers.js, ONNX
specHO	LLM watermark detection via Echo Rule (phonetic/semantic)	Python, spaCy, Gensim
definitelynot.ai	Unicode security: Trojan Source, homoglyph, BiDi defense	PHP, JavaScript, ICU
marginium	Multimodal generation with LLM visual output awareness	Python
gemini-cli	Privacy-enhanced Gemini CLI fork, telemetry disabled	TypeScript, Node.js

Security Research

Project	Description	Stack
eero (private)	Mesh WiFi router security analysis, HackerOne prep	Python, Wireshark
blizzarchy (private)	Battle.net OAuth analysis, telemetry RE	Rust, Python, Ghidra
featherarchy	Security-hardened Monero wallet fork	C++, Qt6
alienware-monitor (private)	Dell monitor firmware RE, GSFW decoder	Python, Ghidra
proxyforge (private)	Transparent MITM proxy, TLS interception	Python, mitmproxy

Systems Programming

Project	Description	Stack
filearchy	COSMIC Files fork with embedded trigram search engine	Rust, libcosmic
↳ triglyph	Zero-RSS trigram index library (mmap, ~0 bytes resident)	Rust, memmap2
↳ triglyphd	D-Bus daemon for system-wide search	Rust, zbus
nautilus-plus	Enhanced GNOME Files with 512px thumbnails, search-cache	C, GTK4
↳ search-cache	HashMap-based file indexing, sub-ms search for 2.15M+ files	Rust, DashMap
cod3x	Terminal coding agent with 3D ASCII interface at 60fps	Rust, SQLite
bitmail (private)	Modern Bitmessage client with Python CLI and Rust TUI	Python, Rust

CLI Tools

Project	Description	Stack
indepacer	PACER CLI for federal court research, MFA, cost protection	Python, Click, Rich
iconics	Semantic icon library (3,372 PNGs), CLI discovery/export	Python
gemini-sharp	Single-file Gemini CLI binaries, 15+ color themes	C#, .NET

Desktop/Linux

Project	Description	Stack
omarchy	DHH's omarchy fork: waybar RSS, NVIDIA config, compact UI	Hyprland, Shell
waybar-config	RSS ticker for self-hosted FreshRSS, hover-pause	JSON, CSS, Shell
claude-desktop-arch	Claude Code preview patch for Arch Linux	JavaScript, Shell
qualcomm-x870e-linux-bug-patch	WiFi 7 firmware fix for WCN7850 on X870E	Python, ACPI
arch-dependency-matrices	Graph theory analysis of 1,553 Arch packages	Python, NumPy

Web/Mobile

Project	Description	Stack
NetworkBatcher	Energy-efficient network batching for iOS 26+	Swift
Liberty-Links	Tracker-free, privacy-respecting link alternatives	Markdown

Infrastructure

Primary Server: Dedicated bare-metal NixOS host (details available on request)

Service	Technology
Security	Post-quantum SSH (sntrup761x25519), Rosenpass VPN (ML-KEM + Kyber-512), nftables firewall
DNS	Unbound recursive resolver with DNSSEC, ad/tracker blocking, no third-party DNS
Services	FreshRSS, Caddy (HTTPS/HTTP3), cPanel/WHM, Podman containers
Network	Local 10Gbps, authoritative BIND9 with RFC2136 ACME

Infrastructure as Code:

Project	Description	Stack
NixOS Server (private)	Bare-metal config: post-quantum SSH, Rosenpass VPN, BIND9	Nix, agenix
unbound-config (private)	Recursive DNS with DNSSEC, ad/tracker blocking	Unbound, Shell

Philosophy

The best way to predict AI's impact is to build tools that shape it.

SF Bay Area · Open to remote · Icons from iconics
