Claude & OpenAI Token Cost Calculator (2026)
Token costs are one of the most misunderstood parts of building with LLMs. Developers often wildly over- or under-estimate how much their AI feature will cost at scale. This calculator gives you a realistic monthly estimate before you commit to an architecture.
How Token Pricing Works
LLM APIs charge per token — roughly ¾ of a word, or 4 characters. Every API call has two components:
- Input tokens: Everything you send to the model (system prompt + conversation history + user message)
- Output tokens: Everything the model returns (the response)
Output tokens typically cost 3–5x more than input tokens. The ratio of input to output in your use case significantly affects your total cost.
Current Pricing (June 2026)
Anthropic Claude
| Model | Input (per 1M tokens) | Output (per 1M tokens) | Best For |
|---|---|---|---|
| Claude Haiku 4.5 | US$0.80 | US$4.00 | High-volume, simple tasks (classification, extraction, short responses) |
| Claude Sonnet 4.6 | US$3.00 | US$15.00 | Balanced — most production use cases |
| Claude Opus 4.6 | US$15.00 | US$75.00 | Complex reasoning, low-volume premium use cases |
OpenAI
| Model | Input (per 1M tokens) | Output (per 1M tokens) | Best For |
|---|---|---|---|
| GPT-4o mini | US$0.15 | US$0.60 | Very high volume, simple tasks |
| GPT-4o | US$2.50 | US$10.00 | General purpose |
| o3 | US$10.00 | US$40.00 | Complex reasoning tasks |
Prices are approximate and change periodically. Check the provider's pricing page for current rates.
The Calculator
Step 1: Estimate your token usage per conversation
For most business AI applications, a single conversation (from user message to model response) uses:
| Use Case | Typical Input Tokens | Typical Output Tokens |
|---|---|---|
| Simple chatbot response | 500–1,000 | 100–300 |
| Customer service agent | 1,000–3,000 | 200–500 |
| Document summarisation | 2,000–8,000 | 300–600 |
| Code generation | 500–2,000 | 300–1,000 |
| Complex multi-turn reasoning | 3,000–10,000 | 500–2,000 |
| RAG (retrieval-augmented, with context) | 4,000–15,000 | 300–800 |
A useful rule of thumb: 1 page of text ≈ 500 tokens.
Step 2: Estimate monthly conversations
| Application Scale | Monthly Conversations |
|---|---|
| Internal tool (small team, 10 users) | 500–2,000 |
| SME customer service bot | 1,000–5,000 |
| Consumer app (soft launch) | 5,000–20,000 |
| Consumer app (growth stage) | 20,000–100,000 |
| Enterprise deployment | 50,000–500,000+ |
Step 3: Calculate
Example Calculations
Example 1: Customer Service Bot for Singapore SME
- Model: Claude Haiku 4.5
- 2,000 conversations/month
- 2,000 input tokens per conversation, 400 output tokens per conversation
Input cost: 2,000 × 2,000 × (US$0.80 / 1,000,000) = US$3.20 Output cost: 2,000 × 400 × (US$4.00 / 1,000,000) = US$3.20 Monthly total: US$6.40 ≈ S$8.60/month
Example 2: Document Intelligence Tool (Contract Review)
- Model: Claude Sonnet 4.6
- 500 documents/month
- 8,000 input tokens per document (10-page contract), 600 output tokens
Input cost: 500 × 8,000 × (US$3.00 / 1,000,000) = US$12.00 Output cost: 500 × 600 × (US$15.00 / 1,000,000) = US$4.50 Monthly total: US$16.50 ≈ S$22/month
Example 3: Consumer AI App (Growth Stage)
- Model: Claude Haiku 4.5 (volume optimised)
- 50,000 conversations/month
- 3,000 input tokens, 500 output tokens
Input cost: 50,000 × 3,000 × (US$0.80 / 1,000,000) = US$120 Output cost: 50,000 × 500 × (US$4.00 / 1,000,000) = US$100
Model Selection Guide
Use Claude Haiku 4.5 when: - You have high volume and simple tasks (classification, routing, short responses) - Latency matters (Haiku is significantly faster) - You're running thousands of small interactions per day
Use Claude Sonnet 4.6 when: - You need strong reasoning with reasonable cost - You're building a production product where quality matters - Your conversations are medium complexity (most business use cases land here)
Use Claude Opus 4.6 when: - You need the best possible reasoning quality regardless of cost - Volume is low (internal tools, premium workflows) - The task requires complex multi-step reasoning or nuanced judgment
Consider GPT-4o mini when: - You need absolute minimum cost at very high volume - The task is simple and well-defined - You're already in the OpenAI ecosystem
Cost Optimisation Strategies
Prompt caching: Both Anthropic and OpenAI offer prompt caching for repeated system prompts. If your system prompt is 2,000 tokens and you make 10,000 calls/month, caching it saves approximately 80% of those input token costs. Implement caching from day one.
Right-size your model: Running Opus on tasks that Haiku handles equally well is a 10–20x cost multiplier. Evaluate each use case independently.
Trim your context window: Every token in your conversation history costs money. Implement a context pruning strategy — summarise older messages rather than sending full history indefinitely.
Batch requests: Non-real-time tasks (document processing, analysis jobs) can use batch APIs at 50% cost for Anthropic (Batch API) and OpenAI (Batch API).
Building Cost Into Your Product Economics
For most B2B SaaS products, AI API costs should be a predictable line item — not a surprise. Model them as:
- COGS (Cost of Goods Sold): For products where AI is the core deliverable
- Infrastructure cost: For products where AI is a feature within a larger offering
- Per-user cost: For pricing models where you need to understand unit economics
A well-optimised AI feature in a B2B product typically adds S$2–S$30/month per active user to COGS. For a SaaS product charging S$200–S$500/month, this is a manageable percentage of revenue.
Want Help Architecting Your AI Cost Structure?
PowerDigital builds AI agents and LLM-powered products for Singapore businesses. If you're at the architecture stage and trying to model costs before committing to a build, we can help you:
- Select the right model for each use case
- Design a cost-efficient prompt architecture
- Model costs at different growth scenarios
- Build caching, batching, and context management from day one