TOKEN EXPLANATION GUIDE

How AI Tokens Work

AI models do not read text as full sentences or words. They process prompts as tokens, which is why token estimates sit at the center of prompt cost planning.

What tokens actually are

Tokens are the small chunks of text that a model actually processes: whole words, sub-word pieces, punctuation, numbers, or symbols. A short common word is often a single token, while a longer word, a code fragment, or structured text can break into several tokens.

That is why token counts are not the same as word counts or character counts. Prompt estimators use heuristics or provider-specific tokenizers to approximate how a model may split your input.
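The heuristic approach can be sketched in a few lines. This is an illustrative estimator only, using the common rule of thumb of roughly four characters per token for English text; it is an assumption for demonstration, not any provider's real tokenizer.

```python
def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate using the 'about 4 characters per token'
    rule of thumb for English text. A provider-specific tokenizer
    will differ, especially for code or non-English text."""
    if not text:
        return 0
    # Floor at 1 so short non-empty strings never estimate to zero.
    return max(1, round(len(text) / chars_per_token))

print(estimate_tokens("AI models process prompts as tokens."))  # → 9
```

Adjusting chars_per_token is how such estimators tune themselves for different content: code and JSON usually tokenize denser (fewer characters per token) than plain English prose.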

Input tokens vs output tokens

Input tokens are what you send to the model. That includes the prompt itself plus any system instructions, pasted context, conversation history, or retrieved documents that travel with the request.
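The point that everything traveling with the request counts as input can be shown with a small sketch. The function names, the example strings, and the characters/4 heuristic here are all illustrative assumptions, not a real provider API.

```python
def rough_tokens(text: str) -> int:
    # Illustrative heuristic only: about 4 characters per token.
    return max(1, round(len(text) / 4)) if text else 0

def input_token_estimate(system: str, history: list, prompt: str) -> int:
    """Everything sent with the request counts toward input tokens:
    system instructions, prior conversation turns, and the prompt."""
    return (rough_tokens(system)
            + sum(rough_tokens(turn) for turn in history)
            + rough_tokens(prompt))

total = input_token_estimate(
    system="You are a concise assistant.",
    history=["What is a token?", "A token is a chunk of text."],
    prompt="How do input tokens differ from output tokens?",
)
print(total)
```

Pasted documents or retrieved context would simply be more entries summed into the same total, which is why long context inflates input usage so quickly.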

Output tokens are what the model generates in response. Longer explanations, structured JSON, code, tables, or step-by-step answers can all increase output usage and therefore total cost.
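Because input and output tokens are usually billed at different rates, total cost is a weighted sum of the two. The prices below are made-up placeholders, not any provider's actual rates.

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_price_per_1k: float,
                 output_price_per_1k: float) -> float:
    """Cost of one request as the sum of input and output charges.
    Prices are hypothetical placeholders for illustration."""
    return (input_tokens / 1000 * input_price_per_1k
            + output_tokens / 1000 * output_price_per_1k)

# Example: 1,200 input tokens and 800 output tokens at made-up rates.
cost = request_cost(1200, 800,
                    input_price_per_1k=0.003,
                    output_price_per_1k=0.015)
print(f"${cost:.4f}")  # → $0.0156
```

Note how the smaller output count still dominates the total here: output tokens are often priced several times higher than input tokens, so verbose responses can cost more than long prompts.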

Why token counts vary so much

Different providers and models tokenize text differently, so the same prompt can land at different token counts across OpenAI, Anthropic, Gemini, DeepSeek, or other platforms. That is why cross-provider token estimates should be treated as directional rather than exact.

Long prompts, code, JSON, tables, pasted documents, and multi-turn conversation history can all change token usage quickly. Token counts are estimates unless a provider-specific tokenizer is used, and even then real request overhead can still vary.
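The multi-turn effect is easy to see with a sketch: when the full history travels with each request, the input size on turn N is the running total of every turn so far. The per-turn token counts below are invented for illustration.

```python
def input_tokens_per_turn(turn_token_counts):
    """Given token counts for each conversation turn, return the
    cumulative input size sent with each new request when the full
    history is resent every time."""
    totals, running = [], 0
    for count in turn_token_counts:
        running += count
        totals.append(running)
    return totals

# Each request resends all prior turns, so input usage grows every turn.
print(input_tokens_per_turn([120, 340, 95, 410]))  # → [120, 460, 555, 965]
```

This is why long conversations get progressively more expensive even when each individual message stays short.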