Arcana v2
Expressive text-to-speech with extensive voice library and multi-lingual support
About model
- 300+ voices: 35 flagship options
- 5 languages: EN, ES, FR, DE + bilingual
- Code-switching: mid-sentence language mixing
- Extensive Voice Library: 300+ voices with diverse accents, ages, and styles
- Multi-Lingual Code-Switching: Seamless Spanglish, Franglais, Denglish
- Expressive Speech: False starts, breathwork, vocal nuances
- 5 Languages: English, Spanish, French, German, plus bilingual combinations
Model card
Architecture Overview:
• Autoregressive TTS model with discrete audio tokenization and high-resolution codec.
• Large language model backbone trained on text and conversational audio data.
• 300+ voice library, including 35 flagship voices: 18 English, 4 Spanish, 3 bilingual English/Spanish, 5 French, 5 German.
• Multi-lingual code-switching enables seamless mid-sentence language transitions.
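As a quick sanity check on the voice library figures, the per-language flagship counts given in the architecture overview can be tallied in a few lines. This is a minimal sketch: the dictionary name and helper function are illustrative only and are not part of any Rime API; only the counts themselves come from this page.

```python
# Flagship voice counts as listed in the architecture overview.
# The variable and function names are hypothetical, chosen for illustration.
FLAGSHIP_VOICES = {
    "English": 18,
    "Spanish": 4,
    "English/Spanish (bilingual)": 3,
    "French": 5,
    "German": 5,
}


def total_flagship_voices(counts: dict[str, int]) -> int:
    """Sum the per-language flagship voice counts."""
    return sum(counts.values())


if __name__ == "__main__":
    for language, count in FLAGSHIP_VOICES.items():
        print(f"{language}: {count}")
    print("Total flagship voices:", total_flagship_voices(FLAGSHIP_VOICES))
```

The total agrees with the "35 flagship options" figure quoted in the summary at the top of the page.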
Training Methodology:
• Three-stage training: pre-training, conversational fine-tuning, speaker optimization.
• Trained on large-scale conversational speech with sociolinguistic annotations.
• Captures paralinguistic features: false starts, breathwork, glottal stops, vocal fry.
• Multi-lingual training for code-switching between English, Spanish, French, German.
Performance Characteristics:
• 300+ voices with extensive accent, age, and stylistic diversity for varied applications.
• Paralinguistic features (false starts, breathwork, pauses) create expressive, natural speech.
• Multi-lingual code-switching supports Spanglish, Franglais, Denglish without interruption.
• Faster-than-real-time synthesis with natural rhythm and emotional range.
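"Faster-than-real-time" is commonly expressed as a real-time factor (RTF): the time spent synthesizing divided by the duration of the audio produced, with values below 1 meaning the audio is generated faster than it plays back. The sketch below shows that calculation under stated assumptions: the function names are hypothetical, and the timings in the usage example are made-up illustrative numbers, not measurements of Arcana v2.

```python
def real_time_factor(synthesis_seconds: float, audio_seconds: float) -> float:
    """Return synthesis time divided by the duration of the generated audio."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    if synthesis_seconds < 0:
        raise ValueError("synthesis_seconds must be non-negative")
    return synthesis_seconds / audio_seconds


def is_faster_than_real_time(synthesis_seconds: float, audio_seconds: float) -> bool:
    """True when audio is produced faster than it would take to play back."""
    return real_time_factor(synthesis_seconds, audio_seconds) < 1.0


if __name__ == "__main__":
    # Illustrative numbers only: 2 s of compute for a 10 s clip.
    rtf = real_time_factor(synthesis_seconds=2.0, audio_seconds=10.0)
    print(f"RTF: {rtf:.2f}")
    print("Faster than real time:", is_faster_than_real_time(2.0, 10.0))
```

In practice the synthesis time would be measured around the actual request (for example with `time.perf_counter()`), and the audio duration read from the returned audio.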
Applications & use cases
Content Production:
• Audiobook generation with expressive narration and character voices.
• Podcast creation with natural conversational delivery and multiple speakers.
• E-learning course voiceovers with clear, engaging presentation.
• YouTube video narration and explainer content.
Conversational AI:
• Voice agents requiring expressive speech and emotional range.
• Customer service bots with natural personality and varied voice options.
• Interactive storytelling and narrative experiences.
Media & Entertainment:
• Character voices for games, animations, and interactive fiction.
• Voice acting for indie game development and virtual productions.
• Voiceover for commercials, trailers, and promotional content.
Multi-Lingual Applications:
• Bilingual content creation with code-switching (Spanglish, Franglais, Denglish).
• Language learning apps with authentic pronunciation and natural speech.
• International content localization with native-sounding voices.
Accessibility:
• Screen readers with high-quality, natural voice output.
• Text-to-speech for visually impaired users with expressive delivery.
• Assistive technology requiring diverse voice options and languages.
- Model provider: Rime
- Type: Audio
- Main use cases: Text-to-Speech
- Deployment: On-Demand Dedicated, Monthly Reserved
- Input modalities: Text
- Output modalities: Audio
- Released: August 19, 2025
- External link
- Category: Audio