prompt_cache_retention type declares "in-memory" but API expects "in_memory" #2883

@qedbot

Description

Confirm this is an issue with the Python library and not an underlying OpenAI API

  • This is an issue with the Python library

Describe the bug

The SDK declares prompt_cache_retention with Literal["in-memory", "24h"] (hyphen), but the API rejects "in-memory" with a 400 and only accepts "in_memory" (underscore). Using the SDK-typed value as-is always fails.

The bug appears in the Responses API params, Chat Completions params, and the Response model:
types/responses/response_create_params.py:155

prompt_cache_retention: Optional[Literal["in-memory", "24h"]]
"""The retention policy for the prompt cache.

Set to `24h` to enable extended prompt caching, which keeps cached prefixes
active for longer, up to a maximum of 24 hours.
[Learn more](https://platform.openai.com/docs/guides/prompt-caching#prompt-cache-retention).
"""

types/chat/completion_create_params.py:188

prompt_cache_retention: Optional[Literal["in-memory", "24h"]]
"""The retention policy for the prompt cache.

Set to `24h` to enable extended prompt caching, which keeps cached prefixes
active for longer, up to a maximum of 24 hours.
[Learn more](https://platform.openai.com/docs/guides/prompt-caching#prompt-cache-retention).
"""

types/responses/response.py:217 (Response model also uses the wrong literal)

prompt_cache_retention: Optional[Literal["in-memory", "24h"]] = None
"""The retention policy for the prompt cache.

Set to `24h` to enable extended prompt caching, which keeps cached prefixes
active for longer, up to a maximum of 24 hours.
[Learn more](https://platform.openai.com/docs/guides/prompt-caching#prompt-cache-retention).
"""

All three should use Literal["in_memory", "24h"].
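The mismatch can be seen without any network call by comparing the declared literal values against the string the API accepts. The sketch below uses local `Literal` stand-ins that mirror (but are not) the SDK's actual declarations in the three files above:

```python
from typing import Literal, get_args

# Local stand-ins for the SDK's current declaration and the proposed fix:
CurrentRetention = Literal["in-memory", "24h"]
ProposedRetention = Literal["in_memory", "24h"]

# The value the API actually accepts ("in_memory") is absent from the
# current declaration, so a correctly typed program can never send it:
print("in_memory" in get_args(CurrentRetention))   # False
print("in_memory" in get_args(ProposedRetention))  # True
```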

To Reproduce

  1. Install the SDK: pip install openai==2.21.0
  2. Save the code snippet below as test.py
  3. Run: OPENAI_API_KEY=sk-... python test.py
  4. Observe:
    • "in_memory" (underscore) succeeds (200 OK)
    • "in-memory" (hyphen, the SDK-typed value) fails (400 Bad Request)

Code snippets

Full reproduction script:


#!/usr/bin/env python3
"""
PoC: OpenAI Python SDK v2.21.0 `prompt_cache_retention` type bug

SDK types define the value as Literal["in-memory", "24h"] | None,
but the actual API rejects "in-memory" (400) and expects "in_memory".

Usage:
    OPENAI_API_KEY=sk-... python test.py
"""

import json
import os
import sys
import urllib.error
import urllib.request

import openai
from openai.types.responses import response_create_params  # noqa: F401 -- ensures the submodule is loaded so the dotted annotation below resolves at runtime

API_KEY = os.environ.get("OPENAI_API_KEY")
if not API_KEY:
    print("Set OPENAI_API_KEY to run this PoC.", file=sys.stderr)
    sys.exit(1)

BASE_URL = "https://api.openai.com/v1"


# ── Helpers ──────────────────────────────────────────────────


def heading(title: str) -> None:
    print(f"\n--- {title} ---\n")


def pass_(msg: str) -> None:
    print(f"  ✅  {msg}")


def fail_(msg: str) -> None:
    print(f"  ❌  {msg}")


# ── Test 1: Type-level bug (no API call) ─────────────────────

heading("Test 1 — SDK type-level mismatch (type stubs only)")

# The SDK type stub accepts "in-memory" (hyphen) without complaint:
#   prompt_cache_retention: Optional[Literal["in-memory", "24h"]]
# Pyright/mypy will not flag this:
typed_value: openai.types.responses.response_create_params.ResponseCreateParamsNonStreaming = {
    "model": "gpt-5.2",
    "input": "hi",
    "prompt_cache_retention": "in-memory",  # ← SDK says this is valid
}

pass_(f'SDK type stubs accept \'in-memory\' (prompt_cache_retention = "{typed_value["prompt_cache_retention"]}")')

# ── Test 2: Raw fetch — compare both values ──────────────────

heading("Test 2 — Direct fetch: underscore vs hyphen")


def raw_call(retention: str) -> tuple[int, dict]:
    data = json.dumps({
        "model": "gpt-5.2",
        "input": "Say hi",
        "prompt_cache_retention": retention,
    }).encode()
    req = urllib.request.Request(
        f"{BASE_URL}/responses",
        data=data,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {API_KEY}",
        },
    )
    try:
        with urllib.request.urlopen(req) as resp:
            return resp.status, json.loads(resp.read())
    except urllib.error.HTTPError as e:
        return e.code, json.loads(e.read())


# 2a: underscore — should succeed
status1, body1 = raw_call("in_memory")
if status1 == 200:
    pass_(f'"in_memory"  → {status1} OK  (API accepts underscore)')
else:
    fail_(f'"in_memory"  → {status1}  (unexpected — API should accept underscore)')
    print(json.dumps(body1, indent=2))

# 2b: hyphen — should fail
status2, body2 = raw_call("in-memory")
if status2 == 400:
    pass_(f'"in-memory"  → {status2} Bad Request  (API rejects hyphen)')
    print(json.dumps(body2, indent=2))
else:
    fail_(f'"in-memory"  → {status2}  (expected 400, got {status2})')
    print(json.dumps(body2, indent=2))

# ── Test 3: SDK call with workaround ─────────────────────────

heading("Test 3 — SDK client: workaround (type: ignore) vs typed value")

client = openai.OpenAI(api_key=API_KEY)

# 3a: workaround — use the correct value, suppress type checker
try:
    resp = client.responses.create(
        model="gpt-5.2",
        input="Say hi",
        prompt_cache_retention="in_memory",  # type: ignore[arg-type]
    )
    pass_(f'SDK + "in_memory" (type: ignore)  → OK  (id: {resp.id})')
except openai.APIStatusError as e:
    fail_(f'SDK + "in_memory" (type: ignore)  → {e.status_code}')
    print(json.dumps(json.loads(e.response.text), indent=2))

# 3b: SDK-typed value — should 400
try:
    client.responses.create(
        model="gpt-5.2",
        input="Say hi",
        prompt_cache_retention="in-memory",
    )
    fail_('SDK + "in-memory"  → OK (unexpected — API should reject hyphen)')
except openai.APIStatusError as e:
    if e.status_code == 400:
        pass_(f'SDK + "in-memory"  → {e.status_code} Bad Request  (SDK-typed value rejected by API)')
    else:
        fail_(f'SDK + "in-memory"  → {e.status_code} (expected 400)')
    print(json.dumps(json.loads(e.response.text), indent=2))


Workaround

Suppress the type checker:


prompt_cache_retention="in_memory",  # type: ignore[arg-type]
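If a per-line `# type: ignore` is undesirable, `typing.cast` narrows the suppression to the one argument. The sketch below uses a hypothetical local stub mirroring the SDK's (incorrect) parameter type rather than a live API call:

```python
from typing import Any, Literal, cast

# Stand-in mirroring the SDK's currently declared parameter type:
Retention = Literal["in-memory", "24h"]

def create_stub(prompt_cache_retention: Retention) -> str:
    """Echo what would be sent on the wire."""
    return prompt_cache_retention

# cast(Any, ...) lets the API-accepted value through the checker without
# an ignore comment; drop the cast once the SDK literal is fixed:
sent = create_stub(cast(Any, "in_memory"))
print(sent)  # in_memory
```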

OS

Linux (Ubuntu)

Python version

Python 3.12

Library version

v2.21.0

Labels

bug