Models / DeepSeekDeepSeek / / DeepSeek-V3.1 API
DeepSeek-V3.1 API
Advanced reasoning model with hybrid thinking capabilities
.webp)
This model is not currently supported on Together AI.
Visit our Models page to view all the latest models.
Revolutionary Hybrid AI:
DeepSeek-V3.1 is a groundbreaking hybrid model that switches between thinking and non-thinking modes, delivering exceptional performance in reasoning, coding, and agent tasks. Built on a massive 671B parameter MoE architecture with 37B activated parameters, it offers unparalleled flexibility for developers seeking both fast responses and deep analytical capabilities.
DeepSeek-V3.1 API Usage
Endpoint
How to use DeepSeek-V3.1
Model details
Architecture Overview:
• Mixture-of-Experts (MoE) architecture with 671B total parameters and 37B activated parameters
• Built upon DeepSeek-V3.1-Base with two-phase long context extension approach
• Extended training with 630B tokens for 32K phase and 209B tokens for 128K phase
• Compatible with UE8M0 FP8 scale data format for microscaling optimization
Training Methodology:
• Post-trained on expanded dataset with additional long documents
• 10-fold increase in 32K extension phase training
• 3.3x extension in 128K phase training
• Advanced post-training optimization for tool usage and agent tasks
Performance Characteristics:
• Hybrid mode supporting both thinking and non-thinking operations
• Superior performance on MMLU-Redux (91.8% non-thinking, 93.7% thinking)
• Exceptional coding capabilities with LiveCodeBench Pass@1 of 56.4% (non-thinking) and 74.8% (thinking)
• Advanced math reasoning with AIME 2024 Pass@1 of 66.3% (non-thinking) and 93.1% (thinking)
Prompting DeepSeek-V3.1
Applications & Use Cases
Advanced Reasoning & Analysis:
• Complex mathematical problem solving with step-by-step reasoning
• Scientific research and analysis with transparent thought processes
• Academic writing and research with comprehensive literature review capabilities
• Strategic planning and decision-making with multi-factor analysis
Software Development & Engineering:
• Full-stack application development with multi-language support
• Code review and optimization with detailed explanations
• Architecture design and system planning
• Debugging and troubleshooting with systematic approaches
Agent & Automation Tasks:
• Autonomous code agents for software development workflows
• Search agents for information gathering and analysis
• Multi-step task automation with tool integration
• Workflow orchestration and process optimization
Enterprise & Business Applications:
• Data analysis and reporting with comprehensive insights
• Technical documentation and knowledge management
• Customer support and query resolution
• Process automation and workflow optimization