Fix: proactive auth refresh to reload guarded disk state first by celia-oai · Pull Request #15357 · openai/codex

celia-oai · 2026-03-20T23:33:25Z

Summary

Fix a managed ChatGPT auth bug where a stale Codex process could proactively refresh using an old in-memory refresh token even after another process had already rotated auth on disk.

This changes the proactive AuthManager::auth() path to reuse the existing guarded refresh_token() flow instead of calling the refresh endpoint directly from cached auth state.

Original Issue

Users reported repeated codexd log lines like:

ERROR codex_core::auth: Failed to refresh token: error sending request for url (https://auth.openai.com/oauth/token)

In practice this showed up most often when multiple codexd processes were left running. Killing the extra processes stopped the noise, which suggested the issue was caused by stale auth state across processes rather than invalid user credentials.

Diagnosis

The bug was in the proactive refresh path used by AuthManager::auth():

Process A could refresh successfully, rotate refresh token R0 to R1, and persist the updated auth state plus last_refresh to disk.
Process B could keep an older auth snapshot cached in memory, still holding R0 and the old last_refresh.
Later, when Process B called auth(), it checked staleness from its cached in-memory auth instead of first reloading from disk.
Because that cached last_refresh was stale, Process B would proactively call /oauth/token with stale refresh token R0.
On failure, auth() logged the refresh error but kept returning the same stale cached auth, so repeated auth() calls could keep retrying with dead state.

This differed from the existing unauthorized-recovery flow, which already did the safer thing: guarded reload from disk first, then refresh only if the on-disk auth was unchanged.

What Changed

Switched proactive refresh in AuthManager::auth() to:
- do a pure staleness check on cached auth
- call refresh_token() when stale
- return the original cached auth on genuine refresh failure, preserving existing outward behavior
Removed the direct proactive refresh-from-cached-state path
Added regression tests covering:
- stale cached auth with newer same-account auth already on disk
- the same scenario even when the refresh endpoint would fail if called

Why This Fix

refresh_token() already contains the right cross-process safety behavior:

guarded reload from disk
same-account verification
skip-refresh when another process already changed auth

Reusing that path makes proactive refresh consistent with unauthorized recovery and prevents stale processes from trying to refresh already-rotated tokens.

Testing

Test shape:

create a fresh temp CODEX_HOME from ~/.codex/auth.json
force last_refresh to an old timestamp so proactive refresh is required
start two long-lived helper processes against the same auth file
start B first so it caches stale auth and sleeps
start A second so it refreshes first
point both at a local mock /oauth/token server
inspect whether B makes a second refresh request with the stale in-memory token, or reloads the rotated token from disk

Before the fix

The repro showed the bug clearly: the mock server saw two refreshes with the same stale token, A rotated to a new token, and B still returned the stale token instead of reloading from disk.

POST /oauth/token refresh_token=rt_j6s0...
POST /oauth/token refresh_token=rt_j6s0...

B:cached_before=rt_j6s0...
B:cached_after=rt_j6s0...
B:returned=rt_j6s0...

A:cached_before=rt_j6s0...
A:cached_after=rotated-refresh-token-logged-run-v2
A:returned=rotated-refresh-token-logged-run-v2

After the fix

After the fix, the mock server saw only one refresh request. A refreshed once, and B started with the stale token but reloaded and returned the rotated token.

POST /oauth/token refresh_token=rt_j6s0...

B:cached_before=rt_j6s0...
B:cached_after=rotated-refresh-token-fix-branch
B:returned=rotated-refresh-token-fix-branch

A:cached_before=rt_j6s0...
A:cached_after=rotated-refresh-token-fix-branch
A:returned=rotated-refresh-token-fix-branch

This shows the new behavior: A refreshes once, then B reuses the updated auth from disk instead of making a second refresh request with the stale token.

celia-oai · 2026-03-21T00:15:13Z

@codex review

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 41a653fd81

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-03-21T00:18:05Z

codex-rs/login/src/auth/manager.rs

+        if Self::is_stale_for_proactive_refresh(&auth)
+            && let Err(err) = self.refresh_token().await
+        {


Keep proactive refresh working when account ID is missing

auth() now routes stale ChatGPT auth through refresh_token(), but that path hard-fails when expected_account_id is None. Auth files can legitimately lack account_id (it is optional), so proactive refresh now returns stale cached auth instead of refreshing/reloading. This can leave expired tokens in use and cause repeated auth failures/log spam for those users.

Useful? React with 👍 / 👎.

@pakrym you added this logic to skip reloading when there's no account_id: https://github.com/openai/codex/pull/8880/changes#diff-c618f196cd6214e554ed7c8845bdfefb3cd013c58274b10f3afe5d792e7d0af3R725-R727

is this expected behavior?

celia-oai added 2 commits March 20, 2026 16:29

changes

703ed17

fix tests

41a653f

celia-oai force-pushed the dev/cc/fix-auth branch from c010668 to 41a653f Compare March 21, 2026 00:00

celia-oai requested review from etraut-openai and pakrym-oai March 21, 2026 00:00

celia-oai marked this pull request as ready for review March 21, 2026 00:00

chatgpt-codex-connector bot reviewed Mar 21, 2026

View reviewed changes

This was referenced Mar 21, 2026

📊 AI CLI 工具社区动态日报 2026-03-21 gsscsd/big_model_radar#70

Open

📊 Bản tin hàng ngày công cụ AI CLI 2026-03-21 compasify/agents-radar#65

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix: proactive auth refresh to reload guarded disk state first#15357

Fix: proactive auth refresh to reload guarded disk state first#15357
celia-oai wants to merge 2 commits intomainfrom
dev/cc/fix-auth

celia-oai commented Mar 20, 2026 •

edited

Loading

Uh oh!

celia-oai commented Mar 21, 2026

Uh oh!

chatgpt-codex-connector bot left a comment

Uh oh!

chatgpt-codex-connector bot Mar 21, 2026

Uh oh!

celia-oai Mar 21, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

celia-oai commented Mar 20, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Original Issue

Diagnosis

What Changed

Why This Fix

Testing

Before the fix

After the fix

Uh oh!

celia-oai commented Mar 21, 2026

Uh oh!

chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector bot Mar 21, 2026

Choose a reason for hiding this comment

Uh oh!

celia-oai Mar 21, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

celia-oai commented Mar 20, 2026 •

edited

Loading