Growing to 20 exaflops, Together GPU Clusters help startups and enterprises accelerate generative AI development

Generative AI has taken the world by storm, and its capabilities are only increasing. Venture capital funding for AI startups more than doubled in 2023, and over 92% of Fortune 500 companies are integrating generative AI into their business processes. This has driven skyrocketing demand for the purpose-built, highly optimized clusters needed to train generative AI models.

Together GPU Clusters, previously known as Together Compute, offers purpose-built, dedicated GPU training clusters. It delivers exceptional model training speed, strong cost efficiency, and expert support. The product has seen rapid adoption in the four months since its launch. Startups and enterprises value not only the cutting-edge training performance but also that our clusters meet the highest standards for hardware, flexibility, and support:

Cutting-edge hardware — high-end NVIDIA GPUs, including H100 and A100, with fast InfiniBand networking.
Together Training stack — our optimized training stack is ready to go on your cluster, so that you can focus on optimizing model quality instead of tweaking software setup.
Flexible capacity — ability to rapidly grow capacity as your needs change, and schedule clusters only for the time needed.
Top-tier support — confidence that you can get expert support quickly regardless of the issue, whether it is a systems concern or an AI training issue.

Together GPU Clusters is designed to meet each of these needs, as described in the sections that follow.

Cutting-edge hardware

Together GPU Clusters are built on state-of-the-art NVIDIA GPUs and networking. Most customers use H100 SXM5 GPUs with 3200 Gbps InfiniBand networking or A100 SXM4 80GB GPUs with 1600 Gbps InfiniBand networking. Clusters come with fast NVMe storage and options for expanding to high-speed network-attached storage.

Optimized software stack

We created the Together Training stack to build our own models, like RedPajama. Over the past year, we have worked to make it up to 9x faster than training with a standard attention implementation in PyTorch, through meticulous optimizations. We are thrilled to make it available to you on all Together GPU Clusters.

Train with the Together Training stack, delivering up to 9 times faster training speed with FlashAttention-2.
Slurm configured out-of-the-box for distributed training and the option to use your own scheduler.
SSH directly into the cluster, download your dataset, and you’re ready to go.
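
As a rough sketch of what launching a distributed job through the preconfigured Slurm scheduler could look like, the Python helper below renders a batch script. The function name `render_sbatch`, the job name, the default of 8 GPUs per node, the port 29500, and the `train.py` entry point are all illustrative assumptions and are not part of the Together Training stack. The script itself uses only standard Slurm directives and the standard PyTorch `torchrun` launcher.

```python
def render_sbatch(job_name: str, nodes: int, gpus_per_node: int = 8,
                  time_limit: str = "24:00:00", script: str = "train.py") -> str:
    """Render a Slurm batch script for a multi-node PyTorch job.

    All defaults are hypothetical placeholders; adjust them to your cluster.
    """
    if nodes < 1 or gpus_per_node < 1:
        raise ValueError("nodes and gpus_per_node must be positive")
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        f"#SBATCH --nodes={nodes}",
        "#SBATCH --ntasks-per-node=1",  # one torchrun launcher per node
        f"#SBATCH --gres=gpu:{gpus_per_node}",
        f"#SBATCH --time={time_limit}",
        "",
        "# Use the first allocated node as the rendezvous host.",
        'head_node=$(scontrol show hostnames "$SLURM_JOB_NODELIST" | head -n 1)',
        "",
        "srun torchrun \\",
        f"  --nnodes={nodes} \\",
        f"  --nproc_per_node={gpus_per_node} \\",
        '  --rdzv_id="$SLURM_JOB_ID" \\',
        "  --rdzv_backend=c10d \\",
        '  --rdzv_endpoint="$head_node:29500" \\',
        f"  {script}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Print a script for a hypothetical 2-node, 16-GPU job.
    print(render_sbatch("pretrain", nodes=2))
```

Saving the printed text to a file and submitting it with `sbatch` would queue the job. Teams that bring their own scheduler would replace this step with that scheduler's equivalent.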

Figure: Bar chart comparing training speeds relative to standard PyTorch (1x): FlashAttention 2.7x, FlashAttention-2 9x.
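
To make the chart's numbers concrete, the short calculation below converts each relative speedup into wall-clock time for a hypothetical job. The 90-day baseline is an assumed figure chosen purely for illustration, and because the 9x figure is an "up to" number, actual results will vary with the workload.

```python
# Relative training speeds taken from the chart above.
SPEEDUPS = {
    "PyTorch (standard attention)": 1.0,
    "FlashAttention": 2.7,
    "FlashAttention-2": 9.0,
}

# Hypothetical baseline: a job that takes 90 days with standard attention.
BASELINE_DAYS = 90.0


def training_days(baseline_days: float, speedup: float) -> float:
    """Wall-clock days needed when training runs `speedup` times faster."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return baseline_days / speedup


if __name__ == "__main__":
    for name, speedup in SPEEDUPS.items():
        days = training_days(BASELINE_DAYS, speedup)
        print(f"{name}: {days:.1f} days")
```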

Flexible capacity

Together AI offers flexible terms, so that you can allocate compute capacity to match your needs.

Start with as little as 30 days — and expand at your own pace.
Scale up or down as your needs change — from 16 GPUs to 2,048 GPUs.

When training generative AI models, your compute needs will grow and shrink over time. Most AI teams are not looking for a constant set of hardware for two years straight. You need the most capacity during pre-training. After pre-training completes, compute needs drop as you switch to fine-tuning iterations, RLHF, and final testing. Then, as the model is deployed into production, workloads shift from training to inference, and planning typically starts for the next big model build.
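
The effect of matching capacity to each phase can be sketched with simple GPU-hour arithmetic. Every number below (phase lengths and GPU counts) is a hypothetical placeholder chosen within the 16 to 2,048 GPU range mentioned above. None of it is customer data or pricing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Phase:
    name: str
    days: int
    gpus: int


# Hypothetical project plan; all values are illustrative assumptions.
PLAN = [
    Phase("pre-training", days=60, gpus=512),
    Phase("fine-tuning and RLHF", days=30, gpus=64),
    Phase("final testing", days=30, gpus=16),
]


def phased_gpu_hours(phases: list[Phase]) -> int:
    """GPU-hours used when capacity is resized for each phase."""
    return sum(p.days * 24 * p.gpus for p in phases)


def fixed_gpu_hours(phases: list[Phase]) -> int:
    """GPU-hours reserved when peak capacity is held for the whole project."""
    peak_gpus = max(p.gpus for p in phases)
    total_days = sum(p.days for p in phases)
    return peak_gpus * total_days * 24


if __name__ == "__main__":
    phased = phased_gpu_hours(PLAN)
    fixed = fixed_gpu_hours(PLAN)
    print(f"Phased allocation: {phased:,} GPU-hours")
    print(f"Fixed at peak:     {fixed:,} GPU-hours")
    print(f"Share of fixed reservation actually needed: {phased / fixed:.0%}")
```

Under these assumed numbers, holding peak capacity for the whole project reserves far more GPU-hours than the phases use, which is the gap that flexible terms are meant to close.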

Top-tier support

Our expert AI team is committed to making every Together GPU Clusters customer successful, and will help unblock you whether you have AI or system issues. A guaranteed uptime SLA and support are included with every cluster. Additional engineering services are available whenever you need them.

Customer story: Pika Labs creates next-gen text-to-video models with Together GPU Clusters

Pika Labs, a video generation company founded by two Stanford PhD students, built its text-to-video model on Together GPU Clusters. As it gained traction, Pika built new iterations of the model from scratch with Together GPU Clusters and scaled its inference volume as it grew to millions of videos generated per month.

“Together GPU Clusters provided a combination of amazing training performance, expert support, and the ability to scale to meet our rapid growth to help us serve our growing community of AI creators.”

— Demi Guo, CEO, Pika Labs

$1.1 million saved over 5 months
4 hours time to training start
92,300 Discord users

Read the case study

Customer story: Nexusflow uses Together GPU Clusters to build generative AI cybersecurity models

Nexusflow, a leader in generative AI solutions for cybersecurity, relies on Together GPU Clusters to build robust cybersecurity models as it democratizes cyber intelligence with AI.

"In an industry where time and specialized capabilities mean the difference between vulnerability and security, Together GPU Clusters helped us scale quickly and cost-effectively. Their high-performance infra and top-notch support let us focus on building state-of-the-art solutions for cybersecurity."

— Jian Zhang, CTO of Nexusflow

40% cost savings per month
<90 minutes onboarding time
Zero downtime

Read the case study

Get started today

With growing capacity, we are ready to set you up with your Together GPU Cluster today. Contact us to learn more or reserve your cluster.