Products: Inference, Fine-Tuning, Training, and GPU Clusters | Together AI

This website uses cookies to anonymously analyze website traffic using Google Analytics.

Together Products

Build and run generative AI applications with accelerated performance, maximum accuracy, and lowest cost at production scale.

Start building now Docs

Qwen 2.5-Coder 32B

Llama 3.2 11B Free

Llama 4 Maverick

Arcee AI AFM-4.5B

FLUX.1 [schnell] Free

Mistral Small 3

Cartesia Sonic-2

Sphere

Go from model training to production under one roof

Together AI provides the most complete end-to-end platform to train, fine-tune, and deploy AI models with flexibility, performance, control, and cost-efficiency.

Together
Inference
Fast inference for open-source models.
- ✔ Fast serverless API for 200+ models with pay-per-token pricing.
- ✔ Customizable Dedicated Endpoints with per-minute billing.
- ✔ Optimized by the Together Inference Stack (4x faster than vLLM).
Explore Inference
Together
Fine-Tuning
Fine-tune models with your data.
- ✔ Straightforward Fine-Tuning API.
- ✔ Long-context fine-tuning (up to 32K).
- ✔ Conversational and instruction data format support.
- ✔ Direct Preference Optimization & Continued Fine-Tuning.
Explore Fine-Tuning
Together
GPU Clusters
Turbocharged GPUs for training & inference.
- ✔ Top-Tier NVIDIA Blackwell hardware: GB200 NVL72, HGX B200, H200 & more.
- ✔ Clusters ready with 16 → 100K+ GPUs.
- ✔ Up to 24% faster training operations and 75% faster inference.
Explore GPU Clusters

Not sure where to start? Our team is ready to help!

Subscribe to newsletter

Thank you! Your submission has been received!

Oops! Something went wrong while submitting the form.

Together.ai