LLM API Response

Updated: 2025-11-25

When you make an API call, you receive a structured response object named ChatCompletion. Let us break down what each part means using the response from the "Hello World" example in src/llm_api_access.py:

ChatCompletion(
  id='chatcmpl-CfaRxgTbF2CfDMo63LJwICeQBjODl',
  object='chat.completion',
  created=1764027597,
  model='gpt-5-nano-2025-08-07',
  service_tier='default',
  system_fingerprint=None,
  choices=[
    Choice(
      index=0,
      finish_reason='stop',
      logprobs=None,
      message=ChatCompletionMessage(
        content='Hello World',
        role='assistant',
        refusal=None,
        annotations=[],
        audio=None,
        tool_calls=None))],
  usage=CompletionUsage(
    prompt_tokens=11, 
    completion_tokens=203, 
    total_tokens=214, 
    prompt_tokens_details=PromptTokensDetails(
      audio_tokens=0, 
      cached_tokens=0),
    completion_tokens_details=CompletionTokensDetails(
      accepted_prediction_tokens=0, 
      audio_tokens=0, 
      reasoning_tokens=192, 
      rejected_prediction_tokens=0)))
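
To see how you would read these fields in code, here is a minimal sketch using `SimpleNamespace` stand-ins that mirror the shape of the response above (attribute access works the same way on the real ChatCompletion object returned by the client):

```python
from types import SimpleNamespace

# Stand-in objects mirroring the shape of the ChatCompletion dump above;
# the real response object supports the same attribute access.
response = SimpleNamespace(
    id='chatcmpl-CfaRxgTbF2CfDMo63LJwICeQBjODl',
    model='gpt-5-nano-2025-08-07',
    choices=[SimpleNamespace(
        index=0,
        finish_reason='stop',
        message=SimpleNamespace(content='Hello World', role='assistant'),
    )],
    usage=SimpleNamespace(prompt_tokens=11, completion_tokens=203, total_tokens=214),
)

# The parts you use most often:
text = response.choices[0].message.content   # 'Hello World'
tokens = response.usage.total_tokens         # 214
print(text, tokens)
```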

Top-Level Fields

Here are the top-level fields in the response:

| Field | Description | Example |
| --- | --- | --- |
| id | Unique identifier for this API request | 'chatcmpl-CfaRxgTbF2CfDMo63LJwICeQBjODl' |
| object | Type of object returned | 'chat.completion' |
| created | Unix timestamp when the response was created | 1764027597 |
| model | The specific model version that generated the response | 'gpt-5-nano-2025-08-07' |
| service_tier | The service tier used for the request | 'default' |
| system_fingerprint | Backend configuration identifier | 'fp_12345abc' or None |
| choices | Contains the model's response(s) | Array of Choice objects |
| usage | Tracks how many tokens were used | CompletionUsage object |

The choices Array

This field contains an array of possible responses. In most cases, there is only one choice (index 0):

| Field | Description | Example |
| --- | --- | --- |
| index | Position in the choices array | 0 |
| finish_reason | Why the model stopped generating: • stop: natural completion • length: hit max_tokens limit • content_filter: content was filtered | 'stop' |
| logprobs | Log probabilities for tokens | None (unless requested) |
| message | The response and metadata from the model | ChatCompletionMessage object |
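
In practice, it is worth checking finish_reason before trusting the output. A small hypothetical helper (the diagnostic strings are our own, not part of the API):

```python
def check_finish(finish_reason: str) -> str:
    """Return a short diagnostic for a Choice.finish_reason value."""
    reasons = {
        'stop': 'completed normally',
        'length': 'truncated: raise max_tokens and retry',
        'content_filter': 'blocked by the content filter',
    }
    return reasons.get(finish_reason, 'unexpected finish_reason')

print(check_finish('stop'))    # completed normally
print(check_finish('length'))  # truncated: raise max_tokens and retry
```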

The message Object

This field contains the actual response from the model:

| Field | Description | Example |
| --- | --- | --- |
| content | The actual text response from the model | 'Hello World' |
| role | Who generated this message | 'assistant', 'user', 'system' |
| refusal | Message if model refused to respond | None or 'I cannot help with that' |
| annotations | Citations or references (RAG/retrieval) | [] (usually empty) |
| audio | Audio response data | None or audio object |
| tool_calls | Tools/functions the model wants to use | None or list of tool calls |
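
A sketch of pulling text out of the message object, falling back to the refusal text when content is None (`message_text` is a hypothetical helper, not part of the OpenAI client):

```python
from types import SimpleNamespace

def message_text(message) -> str:
    """Prefer the normal content; fall back to the refusal text if the
    model declined (content is None and refusal is set)."""
    if message.content is not None:
        return message.content
    if message.refusal is not None:
        return f"[refused] {message.refusal}"
    return ""

ok = SimpleNamespace(content='Hello World', refusal=None)
no = SimpleNamespace(content=None, refusal='I cannot help with that')
print(message_text(ok))  # Hello World
print(message_text(no))  # [refused] I cannot help with that
```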

The usage Object

This is critical for understanding API costs. Every API call consumes tokens, which you pay for:

| Field | Description | Example |
| --- | --- | --- |
| prompt_tokens | Number of tokens in your input (prompt) | 11 |
| completion_tokens | Number of tokens in the model's response | 203 |
| total_tokens | Total tokens used (prompt + completion) | 214 |
| prompt_tokens_details | Breakdown of tokens used in a prompt | PromptTokensDetails object |
| completion_tokens_details | Breakdown of tokens used in a completion | CompletionTokensDetails object |
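
These numbers also explain why completion_tokens (203) dwarfs the two-word reply: per the completion_tokens_details in the dump, 192 of them are reasoning tokens, which reasoning models count (and bill) as output. A quick check of the arithmetic:

```python
# Numbers taken from the usage object in the response dump above.
prompt_tokens = 11
completion_tokens = 203
reasoning_tokens = 192  # from completion_tokens_details

total_tokens = prompt_tokens + completion_tokens
visible_tokens = completion_tokens - reasoning_tokens  # tokens in the reply you actually see

print(total_tokens)    # 214 (matches the response)
print(visible_tokens)  # 11
```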

Cost Analysis

Tokens are pieces of words. Roughly:

  • 1 token ≈ 4 characters in English

  • 1 token ≈ ¾ of a word

  • 100 tokens ≈ 75 words
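
These rules of thumb are easy to turn into a back-of-the-envelope estimator (a heuristic only; real counts depend on the model's tokenizer, so use a library like tiktoken when accuracy matters):

```python
def estimate_tokens(text: str) -> int:
    """Very rough English-text estimate: ~4 characters per token."""
    return max(1, round(len(text) / 4))

def estimate_words(num_tokens: int) -> float:
    """~3/4 of a word per token."""
    return num_tokens * 0.75

print(estimate_tokens("Hello World"))  # 3 (11 chars / 4, rounded)
print(estimate_words(100))             # 75.0 words
```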

You pay separately for prompt_tokens (input) and completion_tokens (output). For gpt-5-nano (as of 2025-12-01):

  • Input: $0.05 per 1M tokens ($0.00000005 per token)

  • Output: $0.40 per 1M tokens ($0.0000004 per token)

Thus, the total cost is:

  • Input: 11 tokens x $0.00000005 = $0.00000055

  • Output: 203 tokens x $0.0000004 = $0.0000812

  • Total: $0.00000055 + $0.0000812 = $0.00008175
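
The same arithmetic as a reusable function (the rates are hard-coded from the prices quoted above, so update them if pricing changes):

```python
# gpt-5-nano rates quoted above (as of 2025-12-01); subject to change.
INPUT_RATE = 0.05 / 1_000_000   # $ per input token
OUTPUT_RATE = 0.40 / 1_000_000  # $ per output token

def request_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Dollar cost of one API call from its usage counts."""
    return prompt_tokens * INPUT_RATE + completion_tokens * OUTPUT_RATE

cost = request_cost(11, 203)
print(f"${cost:.8f}")  # $0.00008175
```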
