128k whitespace tokens, by accident: debugging the Responses API

tl;dr: Setting max_output_tokens is now a hard rule for me with the Responses API. While generating some structured output, I hit a bug that produced 128k tokens, mostly whitespace. Removing the array length constraint fixed it!
View the series
  1. See how I used the OpenAI API to generate audio and images
  2. See why structured outputs also need hard guardrails

Always set the max_output_tokens request parameter in the OpenAI Responses API ↗. This is my new rule.

While testing the phrasebook-fr-to-en CLI, the API generated 128k tokens, the maximum, for no apparent reason. I expected JSON like this:

{
  "translations": [
    {
      "french": "fr2",
      "english": "en2"
    },
    {
      "french": "fr3",
      "english": "en3"
    }
  ]
}

I used their structured output ↗ feature and defined the expected JSON schema with this Pydantic ↗ model:

# DON'T USE THIS CODE.  IT TRIGGERS THE RESPONSES API BUG.
from openai import OpenAI
from pydantic import BaseModel, conlist

client = OpenAI()

class Translation(BaseModel):
    french: str
    english: str

class Translations(BaseModel):
    # THE FOLLOWING LINE TRIGGERS THE BUG
    translations: conlist(Translation, min_length=2, max_length=2)

response = client.responses.parse(
    model="gpt-5.2",
    input="Il est beau. -> He is handsome.",
    text_format=Translations,
)

But what I got was incomplete JSON containing more than 63,000 whitespace characters. In total, the response hit exactly 128k tokens, the model's output limit. If I had set the max_output_tokens request parameter to 256, it would have stopped at 256 tokens.

OpenAI Responses API bug - 128,000 maximum token limit reached only when specifying an array size in the JSON Schema structured output request

The bug seems to be triggered by the minItems and maxItems array keywords in the JSON schema. I set both to 2 with conlist(Translation, min_length=2, max_length=2) in the Pydantic model. Per the docs ↗, this should have worked:

OpenAI Docs structured output (Responses API) - Supported array properties in JSON schema

And it did: once I removed the length constraint on the array, the API responded correctly:

class Translations(BaseModel):
    translations: list[Translation]

Lesson learned:

Always set the max_output_tokens request parameter in the OpenAI Responses API ↗.
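Here is a minimal sketch of what that rule looks like in practice, reusing the models above. The hit_token_limit helper is my own name, not part of the SDK; with a cap in place, a runaway response stops early and shows up as an incomplete response you can detect:

```python
def hit_token_limit(response) -> bool:
    """True when the Responses API stopped because max_output_tokens was reached."""
    return (
        response.status == "incomplete"
        and response.incomplete_details is not None
        and response.incomplete_details.reason == "max_output_tokens"
    )

def translate(prompt: str, text_format):
    from openai import OpenAI  # requires the openai package and an API key

    client = OpenAI()
    response = client.responses.parse(
        model="gpt-5.2",
        input=prompt,
        text_format=text_format,
        max_output_tokens=256,  # hard cap: a buggy response stops here, not at 128k
    )
    if hit_token_limit(response):
        raise RuntimeError("Output truncated at max_output_tokens; inspect the schema.")
    return response.output_parsed
```

Failing loudly on truncation beats silently parsing a half-finished JSON payload.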


That's all I have for today! Talk soon 👋

