Rime Arcana V3 Turbo and Rime Arcana V3 now available on Together AI
High-performance multilingual TTS with native code-switching and real-time latency on dedicated endpoints.
Summary
- Starting today, two new Rime models are available on Together AI: Rime Arcana V3 Turbo (English–Spanish, performance) and Rime Arcana V3 (11-language switching)
- Native code-switching that keeps cadence and prosody consistent across language boundaries
- Rime Arcana V3 Turbo: ~120ms time-to-first-audio on Together AI dedicated endpoints
- Co-located with LLM and STT workloads, with one API and unified observability
When a caller code-switches mid-sentence, most voice agents lose what makes them sound native. Cadence slips, the response lands like a translation, and trust drops. Teams patch it by routing between language-specific TTS models, but the handoff adds latency and makes voice behavior inconsistent inside the same conversation. Rime's Arcana V3 line is built for that moment: natural code-switching at production speed without turning multilingual into a routing problem.
Starting today, Together AI, the AI Native Cloud, is adding Rime Arcana V3 Turbo and Rime Arcana V3 to the Together Model Library. V3 Turbo delivers English–Spanish code-switching at ~120ms time-to-first-audio on dedicated endpoints, with prosody trained on bilingual speech patterns. V3 expands switching across 11 languages from a single model. Both run co-located with your LLM and STT workloads behind the same API, authentication, and observability surface you already use.
V3 Turbo: Performance for real-time bilingual conversations
~120ms time-to-first-audio
Voice agents need end-to-end latency under 700ms to feel conversational, which means TTS must leave headroom for STT and LLM processing. V3 Turbo hits ~120ms time-to-first-audio on Together AI dedicated endpoints, so when a customer switches from English to Spanish mid-sentence, the agent's bilingual response arrives in stride. Co-locating V3 Turbo with LLM and STT on Together AI keeps the full pipeline (speech recognition through reasoning to synthesis) within that 700ms budget.
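The budget arithmetic above can be sketched in a few lines. This is a rough illustration, not a measurement: only the 700ms target and the ~120ms time-to-first-audio figure come from this article, while the STT and LLM timings and the function names are hypothetical placeholders.

```python
# Latency budget sketch. The 700 ms target and ~120 ms TTS figure are from
# the article; every STT/LLM number passed in below is a hypothetical example.
BUDGET_MS = 700
V3_TURBO_TTFA_MS = 120


def remaining_budget_ms(tts_ttfa_ms: int, budget_ms: int = BUDGET_MS) -> int:
    """Headroom left for STT and LLM once TTS time-to-first-audio is reserved."""
    return budget_ms - tts_ttfa_ms


def fits_budget(stt_ms: int, llm_ms: int, tts_ttfa_ms: int,
                budget_ms: int = BUDGET_MS) -> bool:
    """True when the summed stage latencies stay within the end-to-end budget."""
    return stt_ms + llm_ms + tts_ttfa_ms <= budget_ms


if __name__ == "__main__":
    headroom = remaining_budget_ms(V3_TURBO_TTFA_MS)
    print(f"Headroom for STT + LLM: {headroom} ms")
    # Hypothetical pipeline: 200 ms STT, 350 ms LLM time-to-first-token.
    print("Within budget:", fits_budget(200, 350, V3_TURBO_TTFA_MS))
```

With TTS reserved at ~120ms, roughly 580ms remains for speech recognition and reasoning combined. The sketch treats the stages as strictly sequential, so a streaming pipeline that overlaps stages could fit slower components into the same budget.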
English–Spanish code-switching trained on native bilingual speech
Bilingual callers mix languages inside a sentence. V3 Turbo is trained on those patterns, including where pauses land and how stress shifts at the boundary. A customer says, "I need help with my account, es que no puedo acceder." V3 Turbo can respond in the same mixed register, with pauses and emphasis that match how bilingual speakers actually talk.
Efficient concurrency for high-volume deployments
V3 Turbo's performance enables higher concurrency per GPU. For contact centers handling thousands of concurrent calls in bilingual markets, this means fewer GPUs are needed to maintain production latency when customers code-switch, which reduces total cost of ownership while preserving conversational quality.
V3: Multilingual breadth with code-switching
~160ms time-to-first-audio across 11 languages
V3 reaches ~160ms p50 time-to-first-audio on Together AI dedicated endpoints while supporting code-switching across 11 languages. This keeps multilingual conversations responsive even as the model handles the complexity of natural transitions between any supported language pair.
11 languages with natural transitions
V3 supports 11 languages and can code-switch between supported languages. A customer starts in French, switches to English for a technical term, then back to French for clarification. V3 handles these transitions while preserving prosody and accent consistency.
Single model for multilingual markets
V3 lets teams consolidate what used to require separate models or vendors per language. Deploy once and serve multilingual customers from a single endpoint without maintaining separate infrastructure per market. When the conversation switches languages, V3 keeps cadence and emphasis natural so the transition does not sound stitched together.
Use cases
Bilingual metro markets
In bilingual metro markets, customer service calls routinely involve code-switching. Customers start in English, switch to Spanish for culturally specific context, then switch back for confirmation. V3 Turbo handles these transitions at ~120ms time-to-first-audio, so customers stay in the automated flow longer instead of requesting a transfer to a human agent. Together AI dedicated endpoints keep performance consistent even during peak call volume.
Regulated services in bilingual contexts
Banks, healthcare providers, and government services serving bilingual communities need agents that code-switch the way their customers do. A customer calling about a prescription might use English for most of the conversation, but switch to their native language for symptoms or medication names. Natural switching reduces repeats and transfers because callers stop testing the agent's language ability mid-call. Running your full voice stack on Together AI means one compliance review covers LLM, STT, and TTS.
International call centers
Call centers serving multilingual markets handle customers who code-switch across multiple languages in a single call. A business customer in Luxembourg might mix French, German, and English in one conversation. V3 processes these transitions while maintaining flow, and Together AI's unified observability means you can track performance across all languages from a single dashboard.
Production inference on Together AI
Both Rime Arcana V3 models run on Together AI dedicated endpoints with isolated GPU capacity alongside LLM and STT workloads. Together AI offers a broad TTS catalog on a single platform, from open-source models to enterprise-grade proprietary models like Rime, all with unified tooling.
Get started
→ Try both models now
→ Read TTS Documentation
→ Contact Sales for deterministic pronunciation control, dedicated deployment, and volume pricing