Serverless Inference
State-of-the-art language and multimodal models.
Price per 1M tokens
Batch API price
Generate stunning visuals with the latest and greatest image models.
Price per MP
Prices include each model's default number of steps. Additional costs apply only when a request exceeds the default steps.
Speech synthesis and processing models.
Price per 1M Characters
Models for automatic speech recognition (ASR) and speech translation.
Price per audio minute
Batch API price
Vector embeddings for semantic search and RAG.
Price per 1M tokens
Improve search relevance with reranking models.
Price per 1M tokens
Price per hour
Deploy models on custom hardware with guaranteed performance and full control.
Guaranteed performance (no sharing)
Support for custom models
Autoscaling & traffic spike handling
Per-minute billing
Ideal for workloads > 130,000 tokens/minute
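As a rough illustration of the per-minute billing and the 130,000 tokens/minute guideline above, the sketch below checks whether a workload crosses that threshold and prorates an hourly price by the minute. The hourly rate in the usage note is a made-up placeholder, since no dedicated-endpoint rates appear in this section, and rounding to whole cents is an assumption.

```python
from decimal import Decimal, ROUND_HALF_UP

# Guideline taken from the list above.
THRESHOLD_TOKENS_PER_MINUTE = 130_000


def suits_dedicated(tokens_per_minute: int) -> bool:
    """Return True when sustained throughput exceeds the stated guideline."""
    return tokens_per_minute > THRESHOLD_TOKENS_PER_MINUTE


def per_minute_cost(hourly_rate: Decimal, minutes_used: int) -> Decimal:
    """Prorate an hourly price over the minutes actually used.

    Rounding to cents is an assumption, not a documented billing rule.
    """
    cost = hourly_rate * Decimal(minutes_used) / Decimal(60)
    return cost.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
```

For example, with a hypothetical rate of $6.00 per hour, `per_minute_cost(Decimal("6.00"), 90)` bills 90 minutes rather than rounding up to two full hours.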
Customize open-source models with your data.
Price is based on the sum of tokens processed in the fine-tuning training dataset (training dataset size * number of epochs) plus any tokens in the optional evaluation dataset (validation dataset size * number of evaluations).
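The token count described above can be sketched as a small function. The function and parameter names are illustrative, not part of any official API, and the per-token price is left out because it is not stated here.

```python
def billable_finetuning_tokens(
    training_tokens: int,
    epochs: int,
    validation_tokens: int = 0,
    evaluations: int = 0,
) -> int:
    """Tokens billed for a fine-tuning job, following the rule above:
    training dataset size * epochs + validation dataset size * evaluations.
    """
    values = (training_tokens, epochs, validation_tokens, evaluations)
    if any(v < 0 for v in values):
        raise ValueError("token counts, epochs, and evaluations must be non-negative")
    return training_tokens * epochs + validation_tokens * evaluations


# Illustrative job: 2M training tokens for 3 epochs,
# plus a 100K-token validation set evaluated 3 times.
total = billable_finetuning_tokens(2_000_000, 3, 100_000, 3)
```

Leaving out the optional evaluation dataset (the default) reduces the total to training tokens times epochs.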
Code Execution
Customize a deployment of VM sandboxes for large development environments.
Price per hour
Execute LLM-generated code securely using our API.
Price per session
GPU Cloud
HARDWARE TYPES AND PRICING
NVIDIA GB200: Coming soon
NVIDIA B200: Coming soon
NVIDIA H200
On-demand hourly: $3.79 per GPU-hour
1 to 6 days: $3.45 per GPU-hour
Up to 3 months: $3.15 per GPU-hour
NVIDIA H100
On-demand hourly: $3.19 per GPU-hour
1 to 6 days: $2.85 per GPU-hour
Up to 3 months: $2.65 per GPU-hour
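To compare the tiers listed above, here is a minimal sketch that multiplies the per-GPU-hour rate by GPU count and hours. The rates are copied from this section; the tier keys and function name are illustrative, and the sketch assumes cost scales linearly with no extra fees or discounts.

```python
from decimal import Decimal

# Per-GPU-hour rates from the list above (USD).
RATES = {
    "H200": {
        "on_demand": Decimal("3.79"),
        "1_to_6_days": Decimal("3.45"),
        "up_to_3_months": Decimal("3.15"),
    },
    "H100": {
        "on_demand": Decimal("3.19"),
        "1_to_6_days": Decimal("2.85"),
        "up_to_3_months": Decimal("2.65"),
    },
}


def cluster_cost(gpu: str, tier: str, gpu_count: int, hours: int) -> Decimal:
    """Total cost, assuming simple linear pricing: rate * GPUs * hours."""
    return RATES[gpu][tier] * gpu_count * hours


# Eight H100s for three days (72 hours), on-demand versus the 1-to-6-day tier.
on_demand = cluster_cost("H100", "on_demand", 8, 72)
reserved = cluster_cost("H100", "1_to_6_days", 8, 72)
savings = on_demand - reserved
```

`Decimal` is used instead of floats so that currency arithmetic stays exact.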
STORAGE TYPES, SIZE, AND PRICING
Shared Storage: up to 1 PB, $0.16 per GiB per month
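The shared storage rate above translates into a monthly bill as sketched below. The function name is illustrative, and the sketch assumes a flat rate per GiB with no tiering or minimum charge.

```python
from decimal import Decimal

PRICE_PER_GIB_MONTH = Decimal("0.16")  # USD, from the rate above
GIB_PER_TIB = 1024


def monthly_storage_cost(gib: int) -> Decimal:
    """Monthly cost for a given capacity in GiB, assuming a flat rate."""
    if gib < 0:
        raise ValueError("capacity must be non-negative")
    return PRICE_PER_GIB_MONTH * gib


# Illustrative capacity: 10 TiB expressed in GiB.
ten_tib_cost = monthly_storage_cost(10 * GIB_PER_TIB)
```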
State-of-the-art clusters with NVIDIA Blackwell and Hopper GPUs. 3-month minimum commitment. 64 → 1K+ GPUs.
Price per hour
Large-scale, custom-built private GPU clusters. 1K → 10K → 100K+ NVIDIA GPUs.