Deepgram Nova-3 Multilingual

Real-time multilingual speech-to-text with code-switching

About model

Nova-3 Multilingual is Deepgram's speech-to-text model for real-time multilingual transcription across 10 languages, with seamless code-switching between them. Built on a latent space architecture, it delivers low-latency, robust multilingual transcription for global enterprise production workloads.

  • Languages: 10 (real-time multilingual transcription with code-switching)
  • Support: Code-Switching (seamless language switching without routing)
  • Customization: Self-Serve (vocabulary adaptation across all supported languages)

Model key capabilities
  • 10-Language Support: Real-time conversation transcription with seamless code-switching
  • Unified Multilingual: Single model handles language switches without routing or separate models
  • Enterprise Ready: Maintains accuracy in challenging environments for global production workloads
  • Model card

    Architecture Overview:
    • Latent space architecture compressing audio into expressive representations while preserving acoustic features
    • Audio embedding framework using representation learning for diverse acoustic conditions
    • Unified multilingual system trained to handle language switches without routing
    • Audio-text alignment enabling training on challenging multilingual examples

    Training Methodology:
    • Multi-stage training combining synthetic code-switched data with real-world multilingual datasets
    • Targeted data augmentation for specialized vocabulary across languages
    • Trained on multilingual conversational data covering challenging environments
    • Optimization for medical, legal, financial, and technical vocabulary across 10 languages

    Performance Characteristics:
    • Real-time multilingual conversation transcription across 10 languages with seamless code-switching
    • Maintains accuracy in noisy environments with speaker distance variation and overlapping speech
    • Self-serve customization with keyterm boosting for vocabulary adaptation in any supported language
    • Optional personal information redaction for compliance requirements across languages

  • Applications & use cases

    Global Customer Support:
    • International call centers with multilingual agent-customer interactions
    • Global customer support transcribing conversations across 10 languages with code-switching
    • Multilingual quality monitoring and compliance recording
    • Cross-border business communications requiring multilingual transcription

    Real-Time Multilingual Applications:
    • Voice agents for global markets supporting code-switching between languages
    • Real-time translation pipelines combining transcription with machine translation
    • Multilingual live captioning for international events and webinars
    • Broadcast media for multilingual subtitling and accessibility

    Enterprise Multilingual:
    • Multilingual content moderation and compliance monitoring
    • International legal transcription for depositions and proceedings
    • Global financial services for multilingual earnings calls and compliance
    • Healthcare transcription supporting multilingual patient interactions
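
The capabilities listed above (a single multilingual model, keyterm boosting, optional personal information redaction) can be illustrated with a minimal sketch of how a streaming request might be configured. It assumes the query parameters of Deepgram's public API (`model`, `language=multi`, `keyterm`, `redact`); a hosted or dedicated deployment may expose a different endpoint or option names, and the helper `build_listen_url` is hypothetical.

```python
from urllib.parse import urlencode

# Assumption: endpoint and parameter names follow Deepgram's public API.
# A hosted deployment of this model may use a different URL or options.
BASE_URL = "wss://api.deepgram.com/v1/listen"


def build_listen_url(keyterms=(), redact_pii=False, base_url=BASE_URL):
    """Return a streaming URL that requests multilingual transcription.

    Hypothetical helper for illustration only; it builds the URL and
    does not open a connection.
    """
    # One model and one language setting cover all supported languages,
    # so no per-language routing is configured here.
    params = [("model", "nova-3"), ("language", "multi")]
    # One keyterm entry per boosted term, in any supported language.
    params += [("keyterm", term) for term in keyterms]
    if redact_pii:
        params.append(("redact", "pii"))
    return f"{base_url}?{urlencode(params)}"


url = build_listen_url(
    keyterms=["tretinoin", "factura electrónica"],
    redact_pii=True,
)
print(url)
```

Building the URL separately from opening the connection keeps the vocabulary and redaction choices easy to inspect and test before any audio is sent.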

Related models
  • Model provider
    Deepgram
  • Type
    Audio
  • Deployment
    On-Demand Dedicated
  • Price

    $0.0077/min + GPU hourly / min

  • Input modalities
    Audio
  • Output modalities
    Text
  • Released
    February 11, 2025
  • Category
    Transcribe
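
As a rough budgeting aid, the price line above can be read as a per-minute transcription charge plus a separate hourly charge for the dedicated GPU. That reading is an assumption, and the GPU hourly rate is not stated on this page, so it appears below as a caller-supplied parameter; `estimate_cost` is a hypothetical helper, not a billing formula from the provider.

```python
PER_MINUTE_USD = 0.0077  # per-minute rate taken from the price line above


def estimate_cost(audio_minutes, gpu_hours, gpu_hourly_usd):
    """Estimate total cost in USD under the assumed pricing structure.

    audio_minutes:  minutes of audio transcribed
    gpu_hours:      hours the dedicated GPU is provisioned
    gpu_hourly_usd: GPU hourly rate (not given in the source; supply your own)
    """
    transcription = audio_minutes * PER_MINUTE_USD
    gpu = gpu_hours * gpu_hourly_usd
    return transcription + gpu


# Example: 10,000 audio minutes on an endpoint provisioned for 24 hours,
# using a placeholder GPU rate chosen purely for illustration.
total = estimate_cost(audio_minutes=10_000, gpu_hours=24, gpu_hourly_usd=2.0)
print(f"Estimated cost: ${total:.2f}")
```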