LLM API Response
Updated: 2025-11-25
When you make an API call, you receive a structured response object named ChatCompletion. Let us break down what each part means using the response from the "Hello World" example in src/llm_api_access.py:
ChatCompletion(
    id='chatcmpl-CfaRxgTbF2CfDMo63LJwICeQBjODl',
    object='chat.completion',
    created=1764027597,
    model='gpt-5-nano-2025-08-07',
    service_tier='default',
    system_fingerprint=None,
    choices=[
        Choice(
            index=0,
            finish_reason='stop',
            logprobs=None,
            message=ChatCompletionMessage(
                content='Hello World',
                role='assistant',
                refusal=None,
                annotations=[],
                audio=None,
                tool_calls=None))],
    usage=CompletionUsage(
        prompt_tokens=11,
        completion_tokens=203,
        total_tokens=214,
        prompt_tokens_details=PromptTokensDetails(
            audio_tokens=0,
            cached_tokens=0),
        completion_tokens_details=CompletionTokensDetails(
            accepted_prediction_tokens=0,
            audio_tokens=0,
            reasoning_tokens=192,
            rejected_prediction_tokens=0)))
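For reference, a call along these lines produces a response like the one above. This is a minimal sketch assuming the official openai Python package and an OPENAI_API_KEY environment variable; the exact prompt and code in src/llm_api_access.py may differ:

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-5-nano",
    messages=[{"role": "user", "content": "Say 'Hello World'"}],
)

print(response)  # the full ChatCompletion object shown above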
Top-Level Fields
Here are the top-level fields in the response:
id
Unique identifier for this API request
'chatcmpl-CfaRxgTbF2CfDMo63LJwICeQBjODl'
object
Type of object returned
'chat.completion'
created
Unix timestamp when the response was created
1764027597
model
The specific model version that generated the response
'gpt-5-nano-2025-08-07'
service_tier
The service tier used for the request
'default'
system_fingerprint
Backend configuration identifier
'fp_12345abc' or None
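Assuming the response object from the sketch above, these fields are plain attributes. For example, created is a Unix timestamp and can be converted to a readable date:

from datetime import datetime, timezone

print(response.id)     # 'chatcmpl-CfaRxgTbF2CfDMo63LJwICeQBjODl'
print(response.model)  # 'gpt-5-nano-2025-08-07'
print(datetime.fromtimestamp(response.created, tz=timezone.utc))  # 2025-11-24 23:39:57+00:00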
The choices Array
This field contains an array of possible responses. In most cases, there is only one choice (index 0):
index
Position in the choices array
0
finish_reason
Why the model stopped generating:
• stop: natural completion
• length: hit the max_tokens limit
• content_filter: content was filtered
'stop'
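In code, it is worth checking finish_reason before trusting the output. A short sketch, continuing from the response above:

choice = response.choices[0]
if choice.finish_reason == "length":
    # the reply was truncated by the token limit; raise it or shorten the prompt
    print("Warning: response was cut off")
elif choice.finish_reason == "content_filter":
    print("Warning: response was filtered")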
The message Object
This field contains the actual response from the model:
content
The actual text response from the model
'Hello World'
role
Who generated this message
'assistant' (in a response, the role is always 'assistant')
refusal
Message if model refused to respond
None or 'I cannot help with that'
annotations
Citations or references (RAG/retrieval)
[] (usually empty)
audio
Audio response data
None or audio object
tool_calls
Tools/functions the model wants to use
None or list of tool calls
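Putting the message fields together, a typical way to handle the reply might look like this (a sketch; tool-call handling is shown only in outline):

message = response.choices[0].message
if message.refusal:
    print("Model refused:", message.refusal)
elif message.tool_calls:
    for call in message.tool_calls:
        # each tool call carries the function name and JSON-encoded arguments
        print("Tool requested:", call.function.name, call.function.arguments)
else:
    print(message.content)  # 'Hello World'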
The usage Object
This field is critical for understanding API costs. Every API call consumes tokens, which you pay for:
prompt_tokens
Number of tokens in your input (prompt)
11
completion_tokens
Number of tokens in the model's response, including any hidden reasoning tokens (see completion_tokens_details)
203
total_tokens
Total tokens used (prompt + completion)
214
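A quick way to log token usage after each call. Note that in this example 192 of the 203 completion tokens are hidden reasoning tokens, as shown in completion_tokens_details:

usage = response.usage
print(f"prompt={usage.prompt_tokens}, "
      f"completion={usage.completion_tokens}, "
      f"total={usage.total_tokens}")
print(f"reasoning tokens: {usage.completion_tokens_details.reasoning_tokens}")  # 192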
Cost Analysis
Tokens are pieces of words. Roughly:
1 token ≈ 4 characters in English
1 token ≈ ¾ of a word
100 tokens ≈ 75 words
You pay separately for prompt_tokens (input) and completion_tokens (output). For gpt-5-nano (as of 2025-12-01):
Input: $0.05 per 1M tokens ($0.00000005 per token)
Output: $0.40 per 1M tokens ($0.0000004 per token)
Thus, the total cost is:
Input: 11 tokens x $0.00000005 = $0.00000055
Output: 203 tokens x $0.0000004 = $0.0000812
Total: $0.00000055 + $0.0000812 = $0.00008175
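The same calculation in code, using the prices above as assumptions (always check the current pricing page):

INPUT_PRICE_PER_TOKEN = 0.05 / 1_000_000   # $0.05 per 1M input tokens (assumed)
OUTPUT_PRICE_PER_TOKEN = 0.40 / 1_000_000  # $0.40 per 1M output tokens (assumed)

usage = response.usage
cost = (usage.prompt_tokens * INPUT_PRICE_PER_TOKEN
        + usage.completion_tokens * OUTPUT_PRICE_PER_TOKEN)
print(f"Estimated cost: ${cost:.8f}")  # $0.00008175 for this call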