
Claude API Pricing in 30 Seconds (2026): Official Rates, RMB Billing, and Savings
Claude API billing is per token across three tiers—Haiku ($1/$5) for cost-efficient volume work, Sonnet ($3/$15) as the balanced default, and Opus ($5/$25) as the flagship tier. Prices are per million tokens (MTok)—input / output.
Anthropic list pricing (March 2026): Opus 4.6 — $5 input / $25 output per MTok; Sonnet 4.6 — $3 / $15; Haiku 4.5 — $1 / $5.

What does this mean on ClaudeAPI.com?
On ClaudeAPI.com, ¥1 RMB ≈ $1 USD of API quota (platform billing model—confirm current rules in the console). With Sonnet 4.6, about ¥1 buys roughly 130K input tokens (~100,000 Chinese characters)—enough to analyze most of a long novel. The Claude 4.5+ generation cut list costs versus the prior generation by about 67%. ClaudeAPI.com bills through a 2.5× multiplier converted to RMB so spend is visible and predictable on the dashboard.
Is the 2.5× multiplier “expensive”?
No—in context. Spot FX is often around 6.9 CNY per USD. If you pay Anthropic direct in USD, Opus input at $5/MTok ≈ ¥34.5/MTok at that rate. On ClaudeAPI.com, Opus input is about ¥12.5/MTok—roughly 3.6× cheaper in RMB terms (~64% savings) versus that FX illustration, before card fees and extra network tooling.
You are entering at a strong point on the price curve
Claude 3 Opus list pricing was $15 / $75 per MTok. Opus 4.6 is $5 / $25—about 67% lower on list rates while capability improved generation over generation.
II. ClaudeAPI.com billing: simple enough to skip the spreadsheet
If you are comparing platforms, you want to know: how are we charged, and are there hidden fees? Answer: no hidden line items beyond what the pricing page and dashboard show—always confirm live rates before scaling.
2.1 Core rule: official USD list × 2.5, pay in RMB
- How it’s calculated: model prices follow Anthropic’s official USD list × 2.5, presented and settled in RMB
- Why it can still be cheaper than “direct USD”: the effective FX implied by 2.5:1 is far better than ~6.9:1 spot for many users—~64% total savings in the scenarios below
- Top up via Alipay / WeChat Pay—no overseas credit card required
- No monthly fee, no subscription lock-in, no minimum spend—pure pay-as-you-go
One-line summary: you might pay about ¥12.5 for quota that maps to $5 of list API value (~¥34.5 at 6.9:1)—and you avoid much of the FX, payment, and network friction of direct overseas billing.
2.2 What is a token? How are you charged?
A token is the unit LLMs use to measure text.
Claude API bills input tokens (what you send) and output tokens (what the model returns) separately.
Output is typically 5× input on the three main models—reply length is usually your biggest cost lever.
Rough guides:
- ~1 token ≈ 4 English characters or ~0.75 English words
- For Chinese text, ~1 token ≈ 1–2 characters
Example (Sonnet 4.6 on ClaudeAPI.com RMB rates): ~200 Chinese characters of coding question (~200 input tokens) + ~500 characters of code reply (~500 output tokens):
- Input: 200 × ¥7.5/MTok ≈ ¥0.0015
- Output: 500 × ¥37.5/MTok ≈ ¥0.01875
- Total ≈ ¥0.02 (~2 fen per turn)
2.3 Transparent usage
Each response includes a usage object with exact input/output token counts. The ClaudeAPI.com console shows historical spend; you can set budget caps and alerts.
III. 2026 Claude model price list

3.1 Core models (USD per 1M tokens — Anthropic list)
| Model | Input | Output | Cache write (5 min) | Cache read | Best for |
|---|---|---|---|---|---|
| Haiku 4.5 | $1 | $5 | $1.25 | $0.10 | High-frequency lightweight tasks |
| Sonnet 4.6 | $3 | $15 | $3.75 | $0.30 | General-purpose default |
| Opus 4.6 | $5 | $25 | $6.25 | $0.50 | Flagship complex reasoning |
Source: Anthropic pricing documentation, verified March 2026.
https://platform.claude.com/docs/en/about-claude/pricing
Prompt cache (Anthropic): 5-minute cache write = 1.25× base input; 1-hour cache write = 2×; cache read = 0.1× base input.
3.2 ClaudeAPI.com RMB equivalents (illustrative)
| Model | Input tokens per ¥1 | Output tokens per ¥1 | ~Cost per typical chat* |
|---|---|---|---|
| Haiku 4.5 | 1M | 200K | ≈ ¥0.01 |
| Sonnet 4.6 | 330K | 67K | ≈ ¥0.03 |
| Opus 4.6 | 200K | 40K | ≈ ¥0.05 |
*Typical chat ≈ 200 input + 500 output tokens (estimate).
3.3 Pricing details and pitfalls
Opus 4.6 & Sonnet 4.6: full 1M context at standard list rates
As of March 2026, both models support up to 1M tokens at $5/$25 (Opus) and $3/$15 (Sonnet)—a 900K-token prompt and a 9K-token prompt use the same per-token rate. No long-context surcharge on these SKUs.
Legacy Sonnet 4.5: 200K threshold
Above 200K input tokens, Sonnet 4.5 switches to premium pricing: $6 input / $22.50 output per MTok. Migrate to Sonnet 4.6 when you can.
Output is 5× input on the main three models
Keeping answers concise is the fastest way to cut spend.
Extended / adaptive thinking
Thinking tokens bill at the standard output token rate—no separate “thinking SKU,” but thinking still consumes output-priced tokens. Set a thinking budget and watch usage.
IV. Model selection: pick the right tier, not always the priciest
The common mistake is defaulting to Opus. ~80% of daily work fits Sonnet.
4.1 Decision tree
What are you building?
│
├─ Classification / extraction / short Q&A / translation
│ → Haiku 4.5 (¥2.5 / ¥12.5 per MTok) — fastest, cheapest
│
├─ Daily coding / content / docs / support bots
│ → Sonnet 4.6 (¥7.5 / ¥37.5) — best default for most developers
│
└─ Architecture / deep reasoning / very long docs / heavy agents
→ Opus 4.6 (¥12.5 / ¥62.5) — strongest, highest cost
What are you building?
│
├─ Classification / extraction / short Q&A / translation
│ → Haiku 4.5 (¥2.5 / ¥12.5 per MTok) — fastest, cheapest
│
├─ Daily coding / content / docs / support bots
│ → Sonnet 4.6 (¥7.5 / ¥37.5) — best default for most developers
│
└─ Architecture / deep reasoning / very long docs / heavy agents
→ Opus 4.6 (¥12.5 / ¥62.5) — strongest, highest cost
(¥ prices = official USD list × 2.5 on ClaudeAPI.com; confirm live numbers in console.)
4.2 By scenario
Chat / translation / summarization → Haiku 4.5
High-volume, low-complexity: classification, extraction, short Q&A, routing.
- ~10,000 simple chats/month ≈ ¥30
- Suggested first top-up: ¥50 — often enough for 2–3 months of light use
Coding / content / analytics → Sonnet 4.6
Best balance for production chatbots, generation, and general reasoning.
- Typical monthly spend ¥100–500
- Suggested top-up: ¥100–200 for solo devs and creators
Deep reasoning / agents / 1M context → Opus 4.6
Both Opus 4.6 and Sonnet 4.6 support 1M context and extended thinking; Opus allows up to 128K output vs 64K on Sonnet 4.6. Use Opus when depth beats cost.
- Typical monthly spend ¥500+
- Suggested top-up: ¥500+ for heavy users; ask support for enterprise tiers
4.3 Mixed-model strategy
Haiku for triage → Sonnet for production → Opus for the hard 20%
Do not use Opus when Sonnet suffices, or Sonnet when Haiku suffices. A 70 / 20 / 10 Haiku / Sonnet / Opus split can save ~60% versus Sonnet-only traffic (workload-dependent).
V. Three ways to spend less for the same outcome
Looking for a cheaper Claude API path? The percentage savings below still apply on top of the 2.5× RMB model because discounts are proportional to base usage.
5.1 Prompt caching — up to ~90% on repeated input
Caching reuses processed prompt prefixes across calls. Cache reads cost a fraction of normal input.
- 5-minute cache write: 1.25× base input
- 1-hour cache write: 2×
- Cache read: 0.1× → ~90% savings on cached input
Sonnet 4.6 example (RMB): standard input ¥7.50/MTok → cache read ¥0.75/MTok.
Enable: add top-level cache_control; the service sets the breakpoint on the last cacheable block—no extra product fee for enabling cache itself.
Best for: stable system prompts, repeated RAG documents, unchanged conversation prefixes, shared few-shot blocks, fixed agent rules invoked every turn.
On ClaudeAPI.com: same API field—no separate setup charge for caching.
Workload fit: bulk generation, offline pipelines, document batches (often complete within ~1 hour; vendor docs allow up to 24 hours for some batch-style flows).
5.2 Control output length and tighten prompts
Output tokens dominate cost when they run 5× input on mainline models.
Practical tips:
- Ask for concise answers or explicit length caps
- Prefer structured output (JSON, tables, fixed fields) instead of long prose
- Trim context—drop irrelevant history
- Summarize long documents in chunks before final analysis
- Route simple tasks to Haiku, escalate only when needed
Example instruction:
Return JSON only; do not explain your reasoning.
That cuts output tokens and compounds savings over time.
📊 Savings cheat sheet
| Method | Typical savings | Difficulty |
|---|---|---|
| Mixed models | 60–80% | Easy |
| Shorter outputs | 30–50% | Easy |
| Prompt caching | Up to ~90% on repeated input | Medium |
VI. ClaudeAPI.com vs Anthropic direct (especially for developers in China)
Anthropic builds strong models; direct API access can still be painful for mainland China teams—overseas cards, FX, network path, and account policies. ClaudeAPI.com is a third-party gateway (not Anthropic) focused on RMB top-up and Anthropic-compatible endpoints.
6.1 Comparison
| Item | Anthropic direct | ClaudeAPI.com |
|---|---|---|
| Payment | Overseas cards (Visa/Mastercard, etc.) | Alipay / WeChat Pay, RMB top-up |
| Price display | USD list (e.g. Opus $5/MTok input) | RMB list (e.g. Opus ¥12.5/MTok input) |
| Illustrative RMB cost | $5 × 6.9 ≈ ¥34.5/MTok + fees | ¥12.5/MTok (includes platform 2.5× model) |
| Illustrative savings | — | ~64% vs FX example above |
| Network | Often needs VPN; variable latency | Domestic routing, lower latency for many users |
| Access model | Anthropic developer account | API key via ClaudeAPI console |
| Models | Full Claude family | Same Claude model IDs |
| API format | Native Anthropic | Anthropic-compatible — change base_url + key |
6.2 Hidden cost illustration (example month)
Assume usage equivalent to $100 of list API:
| Line item | Direct (illustrative) | ClaudeAPI.com (illustrative) |
|---|---|---|
| API usage | $100 × 6.9 ≈ ¥690 | $100 × 2.5 = ¥250 |
| Card fee (~2%) | ≈ ¥14 | ¥0 |
| VPN / extra network | ¥50–100/mo | ¥0 |
| FX swing | ~3–5% | Fixed multiplier rules |
| Example monthly total | ¥754–804+ | ¥250 |
| Example annual savings | — | ~¥6,000–6,600 |
The 2.5× multiplier is still far below spot ~6.9:1 in this illustration—total cost can be about one-third of the direct scenario. Run your own estimate with real traffic and console rates.
6.3 Near-zero migration cost
Swap endpoint and key—Anthropic SDK compatible:
# Change only this line (plus your ClaudeAPI key)
client = anthropic.Anthropic(
api_key="your-claudeapi-key",
base_url="https://gw.claudeapi.com",
)
# Change only this line (plus your ClaudeAPI key)
client = anthropic.Anthropic(
api_key="your-claudeapi-key",
base_url="https://gw.claudeapi.com",
)
Works with common frameworks that accept a custom Anthropic base_url.
VII. Enterprise and volume pricing
7.1 Anthropic enterprise
Anthropic offers subscription products and usage-based API billing; high-volume tiers may need overseas entities, English sales workflows, and USD settlement.
7.2 ClaudeAPI.com enterprise
- Volume top-ups with bonus credits / lower effective multiplier
- Corporate wire transfer + invoicing (fapiao)
- Chinese-language support for technical questions
- Multi-sub-account usage reporting
- Contact: enterprise WeChat (QR in console) or support email—see contact
VIII. How much to top up? Three profiles

Individual developer / student
- Models: Sonnet daily, Haiku for trivial tasks
- Monthly spend: ~¥30–100
- First top-up: ¥50 — validate the full workflow, add more as needed
- ¥50 ≈ 1,600+ everyday Sonnet conversations (order-of-magnitude; depends on prompt size)
Freelancer / content creator
- Models: Sonnet primary, Opus for critical deliverables
- Monthly spend: ~¥100–500
- First top-up: ¥200 for 2–4 weeks of heavier use
- Production apps with caching often land around $30–100/month equivalent in USD terms for some workloads
Enterprise / engineering team
- Models: Haiku filter + Sonnet default + Opus for hard cases
- Monthly spend: ¥1,000+
- First top-up: ¥1,000 — contact support for enterprise discounts
FAQ
Claude API vs ChatGPT API — which is cheaper?
Depends on tier and token mix. Claude’s 2026 positioning is premium on raw list $/token versus some providers, with strong instruction-following and reasoning. On ClaudeAPI.com’s 2.5× RMB model, out-of-pocket RMB can be competitive versus buying some OpenAI SKUs through spot FX—compare your workload, not headlines alone.
Do ClaudeAPI.com credits expire?
Per product policy: credits do not expire—pay for what you use. Confirm current terms in the console.
How do I enable prompt caching?
Add cache_control at the top level of the request body. The breakpoint attaches to the last cacheable block—typically one line of config.
Does extended thinking cost extra?
No separate thinking fee, but thinking tokens bill at the output rate within your token budget.
How much can ¥1 buy?
Sonnet 4.6 example: ~130K input tokens or ~27K output tokens per ¥1. At ~200 input + ~500 output tokens per chat, ¥1 ≈ ~50 light chats per day (illustrative).
Is there a surcharge for 1M context on Opus 4.6 / Sonnet 4.6?
No long-context surcharge at standard list pricing (March 2026 policy). Legacy Sonnet 4.5 differs above 200K input—see §3.3.
Enterprise volume discounts?
Yes—contact ClaudeAPI.com support for quotes, lower effective multipliers on bulk top-up, wire transfer, and invoicing.
Summary
Opus 4.6–class capability at list prices ~67% below early Opus generations, plus ClaudeAPI.com RMB billing that can save ~64% versus illustrative direct USD+FX scenarios for developers in China.
Top up from ¥50, point the SDK at https://gw.claudeapi.com, and run your first call in minutes.
Top up now · API documentation · Enterprise contact
Pricing last verified March 2026 against Anthropic documentation. Anthropic may change list prices; check ClaudeAPI.com and Anthropic’s pricing page for current numbers.



