Claude Code Usage Limits & Pricing
Cost Overview
| Metric | Value |
|---|
| Average Cost per Developer | ~$100-200/month with Sonnet 4 |
| Daily Average | $6 per developer |
| 90th Percentile Daily Cost | Below $12 per developer |
| Background Usage | Under $0.04 per session |
Rate Limit Recommendations (Token Per Minute)
| Team Size | TPM per User | Example Total TPM |
|---|
| 1-5 users | 200k-300k | 200k-1.5M |
| 5-20 users | 100k-150k | 500k-3M |
| 20-50 users | 50k-75k | 1M-3.75M |
| 50-100 users | 25k-35k | 1.25M-3.5M |
| 100-500 users | 15k-20k | 1.5M-10M |
| 500+ users | 10k-15k | 5M+ |
Note: Rate limits apply at the organization level, not per individual user. Individual users can temporarily consume more than their calculated share when others aren't actively using the service.
Pricing Model
| Aspect | Details |
|---|
| Billing Method | Pay per API token consumption |
| Subscription Plans | Pro and Max plans include usage |
| Enterprise | Contact sales for custom pricing |
Cost Management Features
| Feature | Description |
|---|
| Session Tracking | Use /cost command to see current session usage |
| Workspace Limits | Set maximum monthly spend per workspace (Anthropic API only) |
| Usage Reports | Available in Anthropic Console for Admins/Billing roles |
| Cloud Providers | Bedrock/Vertex require third-party tools like LiteLLM for metrics |
Default Model Configuration
| Model Type | Default Model | Token Limits |
|---|
| Main Model | Claude Sonnet 4 | Standard API limits apply |
| Small/Fast Model | Claude Haiku | Used for background tasks |
| Max Output Tokens | 4096 | Typical usage without extended thinking |
| Max Thinking Tokens | 1024 | For extended reasoning |
Factors Affecting Token Usage
| Factor | Impact |
|---|
| Codebase Size | Larger codebases = more tokens for analysis |
| Query Complexity | Complex tasks require more tokens |
| File Operations | Number of files searched/modified affects usage |
| Conversation Length | Longer conversations consume more context |
| Auto-compact | Triggers at 95% context capacity |
| Background Processes | ~1 cent/day for haiku generation |
Cost Optimization Tips
- Use Compact Conversations
- Auto-compact enabled by default
- Manual compact with
/compact command
- Custom compaction instructions in CLAUDE.md
- Write Specific Queries
- Avoid vague requests that trigger unnecessary scanning
- Break down complex tasks into focused interactions
- Clear History
- Use
/clear between unrelated tasks
- Reduces context window usage
- Team Deployment
- Start with small pilot group
- Establish usage patterns before wider rollout
Platform-Specific Considerations
| Platform | Authentication | Cost Tracking |
|---|
| Anthropic API | API Key | Built-in Console metrics |
| AWS Bedrock | AWS Credentials/SSO | Use LiteLLM or AWS CloudWatch |
| Google Vertex AI | Google Cloud Auth | Use LiteLLM or GCP monitoring |
Important Notes
- Claude Code is currently in beta
- Usage can vary significantly based on how many instances users run and automation usage
- For high-volume or enterprise use cases, contact Anthropic sales for custom arrangements
- Rate limits are shared across all models in the organization