Request your H100 cluster

Reserve large-scale NVIDIA H100 GPU clusters, optimized with the Together Kernel Collection and informed by frontier research.

  • NVIDIA HGX H100 SXM with 80GB HBM3 and InfiniBand across all nodes.
  • Clusters from 8 → 256 GPUs. On-demand from $2.99/hr, reserved from $1.75/hr.
  • Accelerated by Together Kernel Collection for faster training and inference.
  • Managed Kubernetes or Slurm orchestration — deploy in minutes.
  • Available across 25+ regions in North America, Europe, and Asia.
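The starting prices above can be turned into a rough budget floor. The sketch below is illustrative only: it assumes the quoted rates are per GPU-hour (the text does not say so explicitly), treats them as lower bounds because they are "from" prices, and uses a hypothetical helper name, `cluster_cost`.

```python
# Rough lower-bound cost estimate from the starting prices quoted above.
# Assumption: the rates are per GPU-hour; actual quotes may be higher.
ON_DEMAND_FLOOR = 2.99  # USD, "on-demand from" price
RESERVED_FLOOR = 1.75   # USD, "reserved from" price
MIN_GPUS, MAX_GPUS = 8, 256  # cluster sizes listed above


def cluster_cost(gpus: int, hours: float, rate: float) -> float:
    """Return gpus * hours * rate in USD, rounded to cents (hypothetical helper)."""
    if not MIN_GPUS <= gpus <= MAX_GPUS:
        raise ValueError(f"cluster size must be between {MIN_GPUS} and {MAX_GPUS} GPUs")
    if hours < 0:
        raise ValueError("hours must be non-negative")
    return round(gpus * hours * rate, 2)


if __name__ == "__main__":
    gpus, hours = 64, 24 * 30  # example: 64 GPUs for a 30-day month
    on_demand = cluster_cost(gpus, hours, ON_DEMAND_FLOOR)
    reserved = cluster_cost(gpus, hours, RESERVED_FLOOR)
    print(f"on-demand floor: ${on_demand:,.2f}")
    print(f"reserved floor:  ${reserved:,.2f}")
    print(f"difference:      ${on_demand - reserved:,.2f}")
```

Because both rates are starting prices, the result is best read as a minimum to compare options against, not as a quote.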
Large-scale NVIDIA GPU clusters, custom-built for you

As an NVIDIA Cloud Partner, we have massive clusters ready for you right now, and we can also work with you to build GPU clusters tailored to your project's needs.

  • NVIDIA GB200 NVL72

    A rack-scale, liquid-cooled supercomputer that enables 72 NVIDIA Blackwell GPUs to act as one massive GPU, delivering 1.4 exaFLOPs of AI performance and up to 30TB of fast memory.
  • NVIDIA Blackwell GPU

    Delivers up to 15X faster real-time inference and 3X faster training than the NVIDIA Hopper generation, accelerating trillion-parameter language models.
  • NVIDIA HGX H200

    1.1TB of HBM3e across 8 Hopper GPUs and 7.2TB/s of aggregate GPU-to-GPU bandwidth, with nearly double the memory capacity and 1.4 times the memory bandwidth of HGX H100.
  • NVIDIA HGX H100

    Delivering exceptional performance, scalability, and security for every workload.
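The HGX H200 figures above can be sanity-checked with simple arithmetic. The per-GPU values in this sketch (141GB and 4.8TB/s for H200, 80GB and 3.35TB/s for H100 SXM, 900GB/s of NVLink per GPU) come from NVIDIA's public spec sheets rather than from this page, so treat them as assumptions. Reading the 7.2TB/s figure as aggregate NVLink bandwidth is likewise an interpretation.

```python
# Sanity check of the HGX H200 comparison above.
# Assumption: per-GPU numbers are taken from NVIDIA's public spec sheets,
# not from this page.
GPUS_PER_NODE = 8
H200_MEM_GB, H100_MEM_GB = 141, 80     # HBM capacity per GPU
H200_BW_TBS, H100_BW_TBS = 4.8, 3.35   # memory bandwidth per GPU
NVLINK_GBS_PER_GPU = 900               # NVLink bandwidth per GPU

total_mem_tb = GPUS_PER_NODE * H200_MEM_GB / 1000            # node HBM capacity
capacity_ratio = H200_MEM_GB / H100_MEM_GB                   # "nearly doubling"
bandwidth_ratio = H200_BW_TBS / H100_BW_TBS                  # "1.4 times"
aggregate_nvlink_tbs = GPUS_PER_NODE * NVLINK_GBS_PER_GPU / 1000

if __name__ == "__main__":
    print(f"HBM3e per 8-GPU node:   {total_mem_tb:.2f} TB")
    print(f"capacity vs H100:       {capacity_ratio:.2f}x")
    print(f"memory bandwidth ratio: {bandwidth_ratio:.2f}x")
    print(f"aggregate NVLink:       {aggregate_nvlink_tbs:.1f} TB/s")
```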

Customers running Together GPU Clusters in production

    “Together AI provides the performance and reliability we need for real-time, high-quality image and video generation at scale. We value that Together AI is much more than an infrastructure provider — they're a true innovation partner, enabling us to push creative boundaries without compromise.”

    Victor Perez

    Co-Founder, Krea

    “Training our omnimodal Character-3 model required infrastructure designed for large-scale AI. The Together Frontier AI Factory delivered the performance we needed to push the boundaries of multimodal video generation. Together AI understands what builders need — and that made all the difference.”

    Michael Lingelbach

    CEO, Hedra

    “Delivering competitive pricing, strong reliability and a properly set up cluster is the bulk of the value differentiation for most AI clouds. The only differentiated value we have seen outside this set is from a Neocloud called Together AI, where the inventor of FlashAttention, Tri Dao, works. We don't believe the value created by Together can be replicated elsewhere.”

    Dylan Patel

    Founder, SemiAnalysis