definitelynot.ai  Internet Universe  Email  UC Berkeley Mathematics


OpenAI Codex: Finding the Ghost in the Machine

TL;DR: Solved a pre-main() environment stripping bug causing 11-300x GPU slowdowns that eluded OpenAI's debugging team for months. [Issue #8945] [PR #8951]

Full Investigation Details

The Ghost

In October 2025, OpenAI assembled a specialized debugging team to investigate mysterious slowdowns affecting Codex, the coding tool OpenAI uses to write its own code. After a week of intensive investigation: nothing.

The bug was literally a ghost—pre_main_hardening() executed before main(), stripped critical environment variables (LD_LIBRARY_PATH, DYLD_LIBRARY_PATH), and disappeared without a trace. Standard profilers saw nothing. Users saw variables in their shell, but inside codex exec they vanished.


The Hunt

Within 3 days of their announcement, I identified the problematic commit (PR #4521) and contacted @tibo-openai.

But identification isn't proof. I spent 2 months building an undeniable case:

Timeline

  • Sept 30, 2025 — PR #4521 merges, enabling pre_main_hardening() in release builds
  • Oct 1, 2025 — rust-v0.43.0 ships (first affected release)
  • Oct 6, 2025 — First "painfully slow" regression reports
  • Oct 1-29, 2025 — Spike in env/PATH inheritance issues across platforms
  • Oct 29, 2025 — Emergency PATH fix lands (didn't catch root cause)
  • Late Oct 2025 — OpenAI's specialized team investigates, finds no root cause, and attributes the regressions to changes in user behavior
  • Jan 9, 2026 — My fix merged, credited in release notes

Evidence Collected

| Platform | Issues | Failure Mode |
| --- | --- | --- |
| macOS | #6012, #5679, #5339, #6243, #6218 | DYLD_* stripping breaking dynamic linking |
| Linux/WSL2 | #4843, #3891, #6200, #5837, #6263 | LD_LIBRARY_PATH stripping → silent CUDA/MKL degradation |

Compiled evidence packages:

  • Platform-specific failure modes and reproduction steps
  • Quantifiable performance regressions (11-300x) with benchmarks
  • Pattern analysis across 15+ scattered user reports over 3 months
  • Process environment inheritance trace through fork/exec boundaries
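
That last trace matters because std::process::Command hands every child a copy of the parent's environment: a variable stripped in the parent is silently absent from every subprocess it launches, GPU-backed ones included. A minimal sketch (hypothetical code, not the Codex source):

```rust
use std::{env, process::Command};

fn main() {
    // Simulate the pre-main stripping in the parent process.
    // (`remove_var` is `unsafe` as of Rust edition 2024; harmless in this single-threaded sketch.)
    unsafe { env::remove_var("LD_LIBRARY_PATH") };

    // The child inherits the already-mutated environment; nothing warns about it.
    let out = Command::new("sh")
        .args(["-c", r#"echo "child sees LD_LIBRARY_PATH: ${LD_LIBRARY_PATH:-<unset>}""#])
        .output()
        .expect("failed to spawn shell");
    print!("{}", String::from_utf8_lossy(&out.stdout));
}
```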

📄 Comprehensive Technical Analysis
📄 Investigation Methodology


Why Conventional Debugging Failed

The bug was designed to be invisible:

  • Pre-main execution — Used #[ctor::ctor] to run before main(), before any logging/instrumentation
  • Silent stripping — No warnings, no errors, just missing environment variables
  • Distributed symptoms — Appeared as unrelated issues across different platforms/configs
  • User attribution — Everyone assumed they misconfigured something (shell looked fine!)
  • Wrong search space — Team was debugging post-main application code

Standard debugging tools can't see pre-main execution. Profilers start at main(). Log hooks aren't initialized yet. The code executes, modifies the environment, and vanishes.
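
For illustration, a minimal sketch (hypothetical code, not the actual Codex source) of how a #[ctor::ctor] constructor can rewrite the environment before main(), and before any profiler or logger exists to notice:

```rust
use std::env;

// Requires the `ctor` crate. The attribute registers this function as a
// pre-main constructor: it runs before main(), before any logging or
// instrumentation is initialized.
#[ctor::ctor]
fn pre_main_hardening() {
    // (`remove_var` is `unsafe` as of Rust edition 2024.)
    unsafe {
        env::remove_var("LD_LIBRARY_PATH");   // Linux: CUDA/MKL resolution silently degrades
        env::remove_var("DYLD_LIBRARY_PATH"); // macOS: dynamic linking breaks
    }
}

fn main() {
    // Everything observable from here on already sees the stripped environment.
    println!("LD_LIBRARY_PATH = {:?}", env::var("LD_LIBRARY_PATH"));
}
```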



The Impact

OpenAI confirmed and merged the fix within 24 hours, explicitly crediting the investigation in the v0.80.0 release notes on GitHub and the project website [rust-v0.80.0]:

"Codex CLI subprocesses again inherit env vars like LD_LIBRARY_PATH/DYLD_LIBRARY_PATH to avoid runtime issues. As explained in #8945, failure to pass along these environment variables to subprocesses that expect them (notably GPU-related ones), was causing 10x+ performance regressions! Special thanks to @johnzfitch for the detailed investigation and write-up in #8945."

Restored:

  • GPU acceleration for their own internal ML/AI development teams
  • CUDA/PyTorch functionality for ML researchers globally
  • MKL/NumPy performance for scientific computing users
  • Conda environment compatibility
  • Enterprise database driver support

When the tools are blind, the system lies, and everyone else has stopped looking. This is the class of problem I specialize in.

---

Software engineer with a mathematics background specializing in systems programming, security research, and AI/ML applications. I build production tools across the full stack—from WebGPU-accelerated browser applications to Rust CLI tools to bare-metal NixOS infrastructure.

More Projects:

  • Observatory - WebGPU deepfake detection running 4 ML models in-browser (live demo)
  • specHO - LLM watermark detection via phonetic/semantic analysis (The Echo Rule)
  • filearchy - COSMIC Files fork with sub-10ms trigram search (Rust)
  • nautilus-plus - Enhanced GNOME Files with sub-millisecond search (AUR)
  • indepacer - PACER CLI for federal court research (PyPI: pacersdk)

Self-hosting bare-metal infrastructure (NixOS) with post-quantum cryptography (ML-KEM, Rosenpass VPN), authoritative DNS, and containerized services.


Featured

Observatory - WebGPU Deepfake Detection

Live Demo: look.definitelynot.ai

Browser-based AI image detection running 4 specialized ML models (ViT, Swin Transformer) through WebGPU. Zero server-side processing—all inference happens client-side with 672MB of ONNX models.

| Model | Accuracy | Architecture |
| --- | --- | --- |
| dima806_ai_real | 98.2% | Vision Transformer |
| SMOGY | 98.2% | Swin Transformer |
| Deep-Fake-Detector-v2 | 92.1% | ViT-Base |
| umm_maybe | 94.2% | Vision Transformer |

Stack: JavaScript (ES6), Transformers.js, ONNX, WebGPU/WASM
Design: 2006 "Purist" UI aesthetic - no frameworks, pure web standards


iconics - Semantic Icon Library

3,372+ PNG icons with semantic CLI discovery. Find the right icon by meaning, not filename.

icon suggest security       # → lock, shield, key, firewall...
icon suggest data           # → chart, database, folder...
icon use lock shield        # Export to ./icons/

Features: Fuzzy search, theme variants, batch export, markdown integration
Stack: Python, FuzzyWuzzy, PIL


filearchy + triglyph - Sub-10ms File Search

COSMIC Files fork with embedded trigram search engine. Memory-mapped indices achieve sub-millisecond searches across 2.15M+ files with ~0 bytes resident memory.

filearchy/
├── triglyph/      # Zero-RSS trigram library (mmap, ~0 bytes resident)
└── triglyphd/     # D-Bus daemon for system-wide search

Performance: 2.15M files indexed, sub-10ms query time, 156MB index on disk
Stack: Rust, libcosmic, memmap2, zbus
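
A rough sketch of the underlying idea, using hypothetical in-memory types rather than the real triglyph API: names are split into overlapping 3-byte trigrams, each trigram maps to a posting list of file IDs, and a query is answered by intersecting the posting lists of its trigrams. In triglyph the postings live in a memory-mapped index file (memmap2), which is why resident memory stays near zero.

```rust
use std::collections::{HashMap, HashSet};

/// Overlapping 3-byte windows of a (lowercased) name.
fn trigrams(name: &str) -> Vec<[u8; 3]> {
    let bytes = name.to_lowercase().into_bytes();
    bytes.windows(3).map(|w| [w[0], w[1], w[2]]).collect()
}

/// Toy in-memory index: trigram -> posting list of file IDs.
/// (The real engine memory-maps this structure from disk instead.)
struct TrigramIndex {
    postings: HashMap<[u8; 3], Vec<u32>>,
    paths: Vec<String>,
}

impl TrigramIndex {
    fn build(paths: Vec<String>) -> Self {
        let mut postings: HashMap<[u8; 3], Vec<u32>> = HashMap::new();
        for (id, path) in paths.iter().enumerate() {
            for t in trigrams(path) {
                postings.entry(t).or_default().push(id as u32);
            }
        }
        Self { postings, paths }
    }

    /// Candidates = intersection of the posting lists for every trigram in the query.
    fn search(&self, query: &str) -> Vec<&str> {
        let mut candidates: Option<HashSet<u32>> = None;
        for t in trigrams(query) {
            let ids: HashSet<u32> =
                self.postings.get(&t).into_iter().flatten().copied().collect();
            candidates = Some(match candidates {
                Some(c) => &c & &ids,
                None => ids,
            });
        }
        candidates
            .unwrap_or_default()
            .into_iter()
            .map(|id| self.paths[id as usize].as_str())
            .collect()
    }
}

fn main() {
    let idx = TrigramIndex::build(vec![
        "Documents/report-2025.pdf".into(),
        "Pictures/screenshot.png".into(),
        "src/main.rs".into(),
    ]);
    println!("{:?}", idx.search("report")); // ["Documents/report-2025.pdf"]
}
```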


The Echo Rule - LLM Detection Methodology

LLMs echo their training data. That echo is detectable through pattern recognition:

| Signature | Detection Method |
| --- | --- |
| Phonetic | CMU phoneme analysis, Levenshtein distance |
| Structural | POS tag patterns, sentence construction |
| Semantic | Word2Vec cosine similarity, hedging clusters |

Implemented in specHO with 98.6% preprocessor test pass rate. Live demo at definitelynot.ai.
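
As a toy illustration of two of these signals (not the specHO implementation, which lives in Python with spaCy and Gensim): Levenshtein edit distance over phoneme sequences captures the phonetic signature, and cosine similarity over word vectors captures the semantic one.

```rust
/// Levenshtein edit distance between two token sequences
/// (applied to CMU phoneme strings for the phonetic signature).
fn levenshtein<T: PartialEq>(a: &[T], b: &[T]) -> usize {
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    for (i, ai) in a.iter().enumerate() {
        let mut cur = vec![i + 1];
        for (j, bj) in b.iter().enumerate() {
            let cost = if ai == bj { 0 } else { 1 };
            cur.push((prev[j] + cost).min(prev[j + 1] + 1).min(cur[j] + 1));
        }
        prev = cur;
    }
    prev[b.len()]
}

/// Cosine similarity between two embedding vectors
/// (applied to Word2Vec vectors for the semantic signature).
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    dot / (na * nb)
}

fn main() {
    // "K AE1 T" (cat) vs "K AA1 T" (cot): one phoneme substitution apart.
    let cat = ["K", "AE1", "T"];
    let cot = ["K", "AA1", "T"];
    println!("phonetic edit distance: {}", levenshtein(&cat, &cot)); // 1

    // Nearby embeddings score close to 1.0.
    println!("semantic similarity: {:.2}", cosine(&[0.9, 0.1, 0.3], &[0.8, 0.2, 0.3]));
}
```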


Skills

Technical Focus - Skills breakdown

Core: Rust · Python · TypeScript · C · Nix · Shell


Projects

AI/ML

| Project | Description | Stack |
| --- | --- | --- |
| observatory | WebGPU deepfake detection, 4 ML models client-side · live | JS, Transformers.js, ONNX |
| specHO | LLM watermark detection via Echo Rule (phonetic/semantic) | Python, spaCy, Gensim |
| definitelynot.ai | Unicode security: Trojan Source, homoglyph, BiDi defense | PHP, JavaScript, ICU |
| marginium | Multimodal generation with LLM visual output awareness | Python |
| gemini-cli | Privacy-enhanced Gemini CLI fork, telemetry disabled | TypeScript, Node.js |

Security Research

| Project | Description | Stack |
| --- | --- | --- |
| eero (private) | Mesh WiFi router security analysis, HackerOne prep | Python, Wireshark |
| blizzarchy (private) | Battle.net OAuth analysis, telemetry RE | Rust, Python, Ghidra |
| featherarchy | Security-hardened Monero wallet fork | C++, Qt6 |
| alienware-monitor (private) | Dell monitor firmware RE, GSFW decoder | Python, Ghidra |
| proxyforge (private) | Transparent MITM proxy, TLS interception | Python, mitmproxy |

Systems Programming

| Project | Description | Stack |
| --- | --- | --- |
| filearchy | COSMIC Files fork with embedded trigram search engine | Rust, libcosmic |
| triglyph | Zero-RSS trigram index library (mmap, ~0 bytes resident) | Rust, memmap2 |
| triglyphd | D-Bus daemon for system-wide search | Rust, zbus |
| nautilus-plus | Enhanced GNOME Files with 512px thumbnails, search-cache | C, GTK4 |
| search-cache | HashMap-based file indexing, sub-ms search for 2.15M+ files | Rust, DashMap |
| cod3x | Terminal coding agent with 3D ASCII interface at 60fps | Rust, SQLite |
| bitmail (private) | Modern Bitmessage client with Python CLI and Rust TUI | Python, Rust |

CLI Tools

| Project | Description | Stack |
| --- | --- | --- |
| indepacer | PACER CLI for federal court research, MFA, cost protection | Python, Click, Rich |
| iconics | Semantic icon library (3,372 PNGs), CLI discovery/export | Python |
| gemini-sharp | Single-file Gemini CLI binaries, 15+ color themes | C#, .NET |

Desktop/Linux

| Project | Description | Stack |
| --- | --- | --- |
| omarchy | DHH's omarchy fork: waybar RSS, NVIDIA config, compact UI | Hyprland, Shell |
| waybar-config | RSS ticker for self-hosted FreshRSS, hover-pause | JSON, CSS, Shell |
| claude-desktop-arch | Claude Code preview patch for Arch Linux | JavaScript, Shell |
| qualcomm-x870e-linux-bug-patch | WiFi 7 firmware fix for WCN7850 on X870E | Python, ACPI |
| arch-dependency-matrices | Graph theory analysis of 1,553 Arch packages | Python, NumPy |

Web/Mobile

| Project | Description | Stack |
| --- | --- | --- |
| NetworkBatcher | Energy-efficient network batching for iOS 26+ | Swift |
| Liberty-Links | Tracker-free, privacy-respecting link alternatives | Markdown |

Infrastructure

Primary Server: Dedicated bare-metal NixOS host (details available on request)

| Service | Technology |
| --- | --- |
| Security | Post-quantum SSH (sntrup761x25519), Rosenpass VPN (ML-KEM + Kyber-512), nftables firewall |
| DNS | Unbound recursive resolver with DNSSEC, ad/tracker blocking, no third-party DNS |
| Services | FreshRSS, Caddy (HTTPS/HTTP3), cPanel/WHM, Podman containers |
| Network | Local 10Gbps, authoritative BIND9 with RFC2136 ACME |

Infrastructure as Code:

| Project | Description | Stack |
| --- | --- | --- |
| NixOS Server (private) | Bare-metal config: post-quantum SSH, Rosenpass VPN, BIND9 | Nix, agenix |
| unbound-config (private) | Recursive DNS with DNSSEC, ad/tracker blocking | Unbound, Shell |

Philosophy

The best way to predict AI's impact is to build tools that shape it.


SF Bay Area · Open to remote · Icons from iconics
