Deepgram Nova-3 Multilingual
Real-time multilingual speech-to-text with code-switching

About model
Nova-3 Multilingual is Deepgram's speech-to-text model for real-time multilingual transcription across 10 languages with seamless code-switching. Built on a latent space architecture, it delivers low latency and robust multilingual performance for global enterprise production workloads.
- 10 Languages: Real-time multilingual transcription with code-switching
- Code-Switching: Seamless language switching without routing
- Self-Serve: Vocabulary adaptation across all supported languages
- 10-Language Support: Real-time conversation transcription with seamless code-switching
- Unified Multilingual: Single model handles language switches without routing or separate models
- Enterprise Ready: Maintains accuracy in challenging environments for global production workloads
Model card
Architecture Overview:
• Latent space architecture compressing audio into expressive representations while preserving acoustic features
• Audio embedding framework using representation learning for diverse acoustic conditions
• Unified multilingual system trained to handle language switches without routing
• Audio-text alignment enabling training on challenging multilingual examples
Training Methodology:
• Multi-stage training combining synthetic code-switched data with real-world multilingual datasets
• Targeted data augmentation for specialized vocabulary across languages
• Trained on multilingual conversational data covering challenging environments
• Optimization for medical, legal, financial, and technical vocabulary across 10 languages
Performance Characteristics:
• Real-time multilingual conversation transcription across 10 languages with seamless code-switching
• Maintains accuracy in noisy environments with speaker distance variation and overlapping speech
• Self-serve customization with keyterm boosting for vocabulary adaptation in any supported language
• Optional personal information redaction for compliance requirements across languages
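As a rough illustration of how keyterm boosting and redaction might be requested together, the sketch below builds a transcription request URL. The endpoint and the parameter names (`model`, `language`, `keyterm`, `redact`) are assumptions drawn from Deepgram's public hosted API rather than from this model card, and a dedicated deployment may expose a different base URL. The helper `build_listen_url` is hypothetical.

```python
from urllib.parse import urlencode

# Assumption: Deepgram's hosted endpoint; a dedicated deployment
# may use a different base URL.
BASE_URL = "https://api.deepgram.com/v1/listen"


def build_listen_url(keyterms=(), redact=()):
    """Return a request URL for multilingual transcription.

    Parameter names are assumed from Deepgram's public API and are
    not stated in this model card.
    """
    # Assumed selection of the multilingual, code-switching mode
    params = [("model", "nova-3"), ("language", "multi")]
    # One keyterm parameter per boosted term, in any supported language
    params += [("keyterm", term) for term in keyterms]
    # One redact parameter per category, e.g. "pii"
    params += [("redact", kind) for kind in redact]
    return f"{BASE_URL}?{urlencode(params)}"


url = build_listen_url(
    keyterms=["metformin", "tasa de interés"],
    redact=["pii"],
)
print(url)
```

Repeating the query parameter once per term keeps each keyterm separate, so multi-word phrases in different languages are not merged into one string.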
Applications & use cases
Global Customer Support:
• International call centers with multilingual agent-customer interactions
• Global customer support transcribing conversations across 10 languages with code-switching
• Multilingual quality monitoring and compliance recording
• Cross-border business communications requiring multilingual transcription
Real-Time Multilingual Applications:
• Voice agents for global markets supporting code-switching between languages
• Real-time translation pipelines combining transcription with machine translation
• Multilingual live captioning for international events and webinars
• Broadcast media for multilingual subtitling and accessibility
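For a translation pipeline, a code-switched transcript usually needs to be split into single-language runs before each run is handed to a translator. The sketch below shows one way to do that. It assumes each transcribed word arrives with a `language` tag; the dictionary shape and the helper `split_by_language` are hypothetical and should be adapted to the actual response format.

```python
from itertools import groupby


def split_by_language(words):
    """Group consecutive words that share a language tag into runs."""
    runs = []
    for lang, group in groupby(words, key=lambda w: w["language"]):
        runs.append((lang, " ".join(w["word"] for w in group)))
    return runs


# Hypothetical word-level output from a code-switched utterance
words = [
    {"word": "please", "language": "en"},
    {"word": "confirm", "language": "en"},
    {"word": "mi", "language": "es"},
    {"word": "dirección", "language": "es"},
    {"word": "today", "language": "en"},
]

runs = split_by_language(words)
for lang, text in runs:
    # Each run could now be routed to a translator for its source language
    print(lang, text)
```

Grouping only consecutive words preserves the original word order, so the translated runs can be reassembled in sequence.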
Enterprise Multilingual:
• Multilingual content moderation and compliance monitoring
• International legal transcription for depositions and proceedings
• Global financial services for multilingual earnings calls and compliance
• Healthcare transcription supporting multilingual patient interactions
- Model provider: Deepgram
- Type: Audio
- Deployment: On-Demand Dedicated
- Price: $0.0077/min + GPU hourly / min
- Input modalities: Audio
- Output modalities: Text
- Released: February 11, 2025
- Category: Transcribe