
Baseten
- Healthcare and life sciences
- Information technology and software
- Manufacturing
What is Baseten
Production-grade model serving
GPU infrastructure abstraction
API-first integration approach
Not an end-to-end AI suite
Platform lock-in considerations
Cost and capacity variability
Plan & Pricing
| Plan | Price | Key features & notes |
|---|---|---|
| Basic | $0 per month (pay-as-you-go) | Dedicated deployments; Model APIs; Fast cold starts; SOC 2 Type II & HIPAA compliant; Email & in-app chat support. |
| Pro | Quote (volume discounts available) | Everything in Basic plus: priority access to high-demand GPUs; dedicated compute; higher Model API rate limits; hands-on engineering expertise; dedicated support (Slack & Zoom). |
| Enterprise | Custom pricing (contact sales) | Everything in Pro plus: custom SLAs; training; self-host deployments; on-demand flex compute; ability to use existing cloud commitments; full control over data residency; advanced security & compliance; custom global regions. |
Usage-based details (as published on Baseten's official pricing page):
Pricing model: Pay-as-you-go
Model API (price per 1M tokens)
- MiniMax M2.5 — Input: $0.30; Output: $1.20
- Kimi K2.5 — Input: $0.60; Output: $3.00
- Kimi K2 Thinking — Input: $0.60; Output: $2.50
- Kimi K2 Instruct — Input: $0.60; Output: $2.50
- GLM 4.7 — Input: $0.60; Output: $2.20
- GLM 4.6 — Input: $0.60; Output: $2.20
- GPT OSS 120B — Input: $0.10; Output: $0.50
- DeepSeek V3.1 — Input: $0.50; Output: $1.50
- DeepSeek V3 0324 — Input: $0.77; Output: $0.77
(Prices listed per 1 million tokens; volume discounts available — contact Baseten.)
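As a rough illustration of how the per-token prices above translate into a bill, the sketch below multiplies token counts by the listed rates. The function name `model_api_cost` and the example workload are hypothetical, and the result ignores any volume discount.

```python
# Published Model API prices in USD per 1M tokens: (input, output).
# Values are copied from the list above; they may change over time.
MODEL_API_PRICES = {
    "MiniMax M2.5": (0.30, 1.20),
    "Kimi K2.5": (0.60, 3.00),
    "Kimi K2 Thinking": (0.60, 2.50),
    "Kimi K2 Instruct": (0.60, 2.50),
    "GLM 4.7": (0.60, 2.20),
    "GLM 4.6": (0.60, 2.20),
    "GPT OSS 120B": (0.10, 0.50),
    "DeepSeek V3.1": (0.50, 1.50),
    "DeepSeek V3 0324": (0.77, 0.77),
}


def model_api_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of a workload, before any volume discount."""
    input_price, output_price = MODEL_API_PRICES[model]
    # Prices are quoted per 1M tokens, so scale the token counts down.
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: 2M input tokens and 1M output tokens.
    cost = model_api_cost("GPT OSS 120B", 2_000_000, 1_000_000)
    print(f"Estimated cost: ${cost:.2f}")
```

Because output tokens cost more than input tokens for most of the listed models, the input/output mix of a workload matters as much as its total token count.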
Dedicated deployments (compute billed down to the minute). Prices shown are per minute; multiply by 60 for the hourly equivalent.
GPU instances — Price per minute:
- T4 (16 GiB) — $0.01052/min
- L4 (24 GiB VRAM) — $0.01414/min
- A10G (24 GiB) — $0.02012/min
- A100 (80 GiB) — $0.06667/min
- H100 MIG (40 GiB) — $0.0625/min
- H100 (80 GiB) — $0.10833/min
- B200 (180 GiB) — $0.16633/min
CPU instances — Price per minute:
- 1x2 (1 vCPU, 2 GiB) — $0.00058/min
- 1x4 (1 vCPU, 4 GiB) — $0.00086/min
- 2x8 (2 vCPU, 8 GiB) — $0.00173/min
- 4x16 (4 vCPU, 16 GiB) — $0.00346/min
- 8x32 (8 vCPU, 32 GiB) — $0.00691/min
- 16x64 (16 vCPU, 64 GiB) — $0.01382/min
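The per-minute rates above can be converted to hourly figures, or to a cost for a given run, with simple arithmetic. The following is a minimal sketch: the helper names `hourly_rate` and `deployment_cost`, and the `replicas` parameter, are hypothetical and not part of any Baseten API. It assumes the caller already knows the number of billed minutes.

```python
# Published per-minute prices in USD, copied from the lists above.
GPU_PRICE_PER_MIN = {
    "T4": 0.01052,
    "L4": 0.01414,
    "A10G": 0.02012,
    "A100": 0.06667,
    "H100 MIG": 0.0625,
    "H100": 0.10833,
    "B200": 0.16633,
}
CPU_PRICE_PER_MIN = {
    "1x2": 0.00058,
    "1x4": 0.00086,
    "2x8": 0.00173,
    "4x16": 0.00346,
    "8x32": 0.00691,
    "16x64": 0.01382,
}


def hourly_rate(price_per_min: float) -> float:
    """Hourly equivalent of a per-minute price."""
    return price_per_min * 60


def deployment_cost(price_per_min: float, minutes: int, replicas: int = 1) -> float:
    """Cost of running `replicas` instances for `minutes` billed minutes.

    Assumption: every replica is billed at the same per-minute rate
    for the whole period.
    """
    return price_per_min * minutes * replicas


if __name__ == "__main__":
    for name, price in GPU_PRICE_PER_MIN.items():
        print(f"{name}: ${hourly_rate(price):.4f}/hour")
    # Hypothetical run: two A100 replicas for 90 minutes.
    print(f"${deployment_cost(GPU_PRICE_PER_MIN['A100'], 90, replicas=2):.4f}")
```

The same helpers apply to training runs, since on-demand training compute uses the same per-minute instance pricing.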
Training: On-demand training compute uses the same instance pricing (per minute). Volume discounts available.
Notes: New Baseten accounts receive free credits for testing and deployment (per the official docs). Billing initially occurs when usage exceeds $50 or at the end of the month. Volume, education, and non-profit discounts are available through support.