This year, Together AI is excited to be part of NVIDIA GTC with several announcements and conversations shaping the AI ecosystem: new model releases, new voice AI capabilities, and technical sessions with our research and engineering leaders.
If you’re attending GTC, we’d love to connect.
Key announcements
At GTC 2026, several of the announcements we’re participating in highlight a core theme: AI systems are becoming more open, agentic, and production-ready. Together AI, the AI Native Cloud, is designed to support this shift by helping developers train, shape, and deploy large-scale AI systems with the performance and cost-efficiency required for real-world applications. Here is what we are announcing at GTC.
Use NVIDIA Dynamo 1.0 in Together AI
NVIDIA has launched NVIDIA Dynamo 1.0, open-source software for generative and agentic inference at scale. We are excited to work with NVIDIA on Dynamo 1.0 and already use Dynamo as part of our inference stack to improve performance in production use cases. At Together AI, we are committed to open innovation and look forward to exploring further use cases for Dynamo 1.0.
Connect to Together’s high-performance inference through NVIDIA OpenShell
Together AI and NVIDIA are working together on NVIDIA NemoClaw, an open-source stack that simplifies running OpenClaw always-on assistants more safely, with a single command. As part of the NVIDIA Agent Toolkit, it installs the NVIDIA OpenShell runtime (a secure environment for running autonomous agents) and open-source models like NVIDIA Nemotron. Together is excited to host the NVIDIA OpenShell runtime for customers who want high-performance models to build agents. The Together AI model library, with over 150 optimized models, can now be easily accessed via NemoClaw. With Together’s dedicated endpoints, developers get the speed and cost efficiency of Together’s inference engine at production scale.
Leverage NVIDIA Nemotron 3 Super for multi-agent workflows
NVIDIA Nemotron 3 Super is a hybrid mixture-of-experts model designed for high-performance reasoning and multi-agent workflows. It combines a Mamba-Transformer architecture with a 1M-token context window to support long-horizon reasoning and complex agent interactions. With 120B total parameters (12B active per token), the model is optimized to run multiple collaborating agents efficiently — even on a single GPU — making it well suited for AI-native workflows like software development agents, financial analysis, and cybersecurity automation. Nemotron 3 Super can be deployed through our Dedicated Model Inference, providing developers with a simple and scalable way to run advanced reasoning models in production.
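To make the parameter counts above concrete, the short sketch below works out what share of the model is active for each token. The 120B and 12B figures come from the description above; the function names are hypothetical, and the "2 FLOPs per active parameter" figure is a common rule of thumb for a forward pass (it ignores attention and state-space overheads), not a number published for this model.

```python
# Back-of-the-envelope arithmetic for a mixture-of-experts model.
# Parameter counts are taken from the text above; everything else
# (function names, the 2-FLOPs-per-parameter rule) is an assumption.

TOTAL_PARAMS = 120e9   # total parameters, as stated above
ACTIVE_PARAMS = 12e9   # parameters active per token, as stated above


def active_fraction(total_params: float, active_params: float) -> float:
    """Return the share of parameters used to process a single token."""
    if total_params <= 0 or not 0 <= active_params <= total_params:
        raise ValueError("expected 0 <= active_params <= total_params, total_params > 0")
    return active_params / total_params


def approx_forward_flops_per_token(active_params: float) -> float:
    """Rough forward-pass cost: about 2 FLOPs per active parameter per token."""
    return 2.0 * active_params


if __name__ == "__main__":
    share = active_fraction(TOTAL_PARAMS, ACTIVE_PARAMS)
    flops = approx_forward_flops_per_token(ACTIVE_PARAMS)
    print(f"Active share per token: {share:.0%}")
    print(f"Approximate forward FLOPs per token: {flops:.1e}")
```

Under these assumptions, roughly one tenth of the weights take part in any single token, which is why per-token compute scales with the active parameters and not with the full 120B.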
Build voice agents with NVIDIA Parakeet TDT 0.6B V3
As part of our recent voice solutions launch, the NVIDIA Parakeet TDT 0.6B V3 automatic speech recognition (ASR) model is now available in the Together AI Model Library, giving developers access to high-performance, low-latency transcription optimized for real-time voice applications. By combining Parakeet’s ASR accuracy with Together’s high-performance inference infrastructure, AI natives can build production-ready voice agents that deliver fast, reliable, and scalable transcription.
Together sessions
The Together AI team, along with customers like Cursor and Decagon, will share insights across multiple GTC sessions, covering topics from production inference to open AI research.
Sessions include:
- Engineering real-world LLM inference: Bridging open-source and production systems
  March 17 • 2:00 PM
  Yineng Zhang, Principal AI Researcher, Together AI
- Hard-Won Lessons From Production Inference at Scale
  March 17 • 4:00 PM
  Yuchen Wu, Engineer, Cursor | Ce Zhang, CTO, Together AI
- Build Trust and Discovery Through Open-Source AI in Research
  March 18 • 2:00 PM
  Percy Liang, Co-Founder, Together AI
- Under the Hood of Building and Scaling AI-Native Applications
  March 18 • 4:00 PM
  Alan Yiu, VP of Product, Decagon | Charles Zedlewski, Chief Product Officer, Together AI
Visit us at booth #1213
Beyond sessions, the Together team will be hosting booth activations and side events throughout the week, including curated executive meetups focused on next-generation AI infrastructure and AI-native applications.
Stop by to:
- See live demos of Together AI infrastructure and models
- Learn how teams are scaling production inference and agentic systems
- Meet researchers and engineers building the future of open AI models and infrastructure
Try Nemotron models now on Together AI serverless endpoints: https://www.together.ai/models
Learn more and request a meeting: https://www.together.ai/gtc-san-jose-2026
