Qwen3 30B A3B Base

30.5B-parameter Mixture-of-Experts base model with 3.3B parameters activated per token, pretrained on 36T tokens across 119 languages.

About model

Qwen3-30B-A3B-Base is a Mixture-of-Experts causal language model with 30.5B total parameters, of which 3.3B are activated per token. It was pretrained on a diverse corpus of 36 trillion tokens across 119 languages. As a base model it has not been instruction-tuned, so it is best treated as a starting point for fine-tuning on tasks that need long-context comprehension and reasoning.
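The gap between total and activated parameters matters for sizing a deployment: all weights must be held in memory, while per-token compute follows the activated count. The arithmetic below is a rough sketch only. The 16-bit weight format and the "2 FLOPs per active parameter" rule of thumb are assumptions, not figures from this page, and the estimate ignores the KV cache and activations.

```python
TOTAL_PARAMS = 30.5e9    # total parameters, from the model card
ACTIVE_PARAMS = 3.3e9    # parameters activated per token, from the model card
BYTES_PER_PARAM = 2      # assumption: 16-bit weights (e.g. bf16)

# Share of the weights that participates in computing one token.
active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS

# Memory for the weights alone; every expert must be resident even if idle.
weight_memory_gb = TOTAL_PARAMS * BYTES_PER_PARAM / 1e9

# Rule of thumb: about 2 FLOPs per active parameter per token (forward pass).
forward_gflops_per_token = 2 * ACTIVE_PARAMS / 1e9

print(f"{active_fraction:.1%}")                    # 10.8%
print(f"{weight_memory_gb:.0f} GB")                # 61 GB
print(f"{forward_gflops_per_token:.1f} GFLOPs")    # 6.6 GFLOPs
```

In other words, the model needs the memory of a roughly 30B model but spends about the per-token compute of a 3.3B one.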

To run this model you first need to deploy it on a Dedicated Endpoint.

  • Model card

    Architecture Overview:
    • Mixture-of-Experts with 48 layers, grouped-query attention (32 query heads, 4 key/value heads), and 128 experts of which 8 are activated per token
    • 128K context window for extensive document processing
    • Sparse activation: only the selected experts run for each token, which keeps per-token compute well below that of a dense 30.5B model
    • Designed for fine-tuning and custom training pipelines
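To make "128 experts (8 activated)" concrete, here is a minimal sketch of top-k expert routing for a single token. It uses random stand-in router scores rather than real model weights, and renormalizing the selected weights so they sum to 1 is an assumption of this sketch, not something this page specifies. The function names are hypothetical.

```python
import math
import random

NUM_EXPERTS = 128   # experts per MoE layer, from the model card
TOP_K = 8           # experts activated per token, from the model card


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_token(router_logits, top_k=TOP_K):
    """Select the top_k experts for one token.

    Returns (expert_index, weight) pairs; weights are renormalized
    over the selected experts so that they sum to 1.
    """
    probs = softmax(router_logits)
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    chosen = ranked[:top_k]
    mass = sum(probs[i] for i in chosen)
    return [(i, probs[i] / mass) for i in chosen]


# Hypothetical router scores for one token (random stand-ins, fixed seed).
rng = random.Random(0)
logits = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]
assignment = route_token(logits)

print(len(assignment))        # 8
print(TOP_K / NUM_EXPERTS)    # 0.0625
```

Only the 8 selected experts would run their feed-forward computation for this token; the layer output is the weighted sum of their outputs. The other 120 experts stay idle for that token, which is where the compute saving comes from.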

    Training Foundation:
    • Pretrained on 36 trillion tokens across 119 languages
    • Optimized for downstream fine-tuning across diverse domains
    • Expert specialization enables efficient knowledge transfer
    • Strong baseline for specialized model development

    Fine-Tuning Capabilities:
    • Efficient fine-tuning through expert-specific adaptation
    • Supports supervised fine-tuning, reinforcement learning, and custom training approaches
    • Excellent foundation for domain-specific model creation
    • Maintains computational efficiency during adaptation processes

  • Prompting

    Base Model Characteristics:
    • Foundation model designed for fine-tuning and custom applications
    • No chat template or special prompt format is required; the model simply continues the text it is given
    • Requires task-specific fine-tuning for optimal performance
    • Supports various downstream training methodologies
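Because a base model continues plain text instead of following chat-style instructions, a common way to steer it without fine-tuning is a few-shot prompt that ends where the answer should begin. The helper below is a minimal sketch of that idea; the function name, labels, and example data are hypothetical and not part of any Qwen or platform API.

```python
def build_few_shot_prompt(examples, query, input_label="Input", output_label="Output"):
    """Format (input, output) pairs plus a new input as one completion prompt."""
    blocks = [f"{input_label}: {src}\n{output_label}: {dst}" for src, dst in examples]
    # Leave the final output empty so the model continues from this point.
    blocks.append(f"{input_label}: {query}\n{output_label}:")
    return "\n\n".join(blocks)


# Hypothetical translation examples used purely for illustration.
examples = [
    ("cat", "chat"),
    ("dog", "chien"),
]
prompt = build_few_shot_prompt(
    examples, "bird", input_label="English", output_label="French"
)
print(prompt)
```

When a prompt like this is sent to a deployed endpoint, some stop condition (for example, stopping at the next blank line) is usually needed so the model does not go on to invent further examples; how that is configured depends on the serving API.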

    Fine-Tuning Approaches:
    • Supervised fine-tuning for specific task adaptation
    • Reinforcement learning for behavior optimization
    • Domain-specific training for specialized applications
    • Custom training pipelines for unique requirements

    Development Considerations:
    • Excellent starting point for advanced AI model development
    • Efficient expert utilization during fine-tuning processes
    • Supports extensive customization for specialized domains
    • Foundation for creating proprietary conversational AI systems

  • Applications & use cases

    Research & Development:
    • Academic research in natural language processing and AI
    • Custom AI training pipelines for specialized applications
    • Foundation for domain-specific model development
    • Large-scale language model research and experimentation

    Enterprise Customization:
    • Multilingual AI applications requiring extensive customization
    • STEM reasoning applications for scientific computing
    • Coding assistance tools requiring specialized training
    • Model fine-tuning for proprietary business applications

    Advanced Applications:
    • Foundation for specialized conversational AI systems
    • Custom training for industry-specific requirements
    • Research in mixture-of-experts architectures
    • Development of next-generation AI applications requiring extensive domain adaptation

Related models
  • Model provider
    Qwen
  • Type
    LLM
  • Main use cases
    Chat
    Small & Fast
    Medium General Purpose
  • Fine tuning
    Supported
  • Deployment
    On-Demand Dedicated
    Monthly Reserved
  • Parameters
    30.5B
  • Context length
    128K
  • Input modalities
    Text
  • Output modalities
    Text