DeepSeek R1 Distilled Qwen 14B
Qwen 14B distilled with reasoning capabilities from DeepSeek R1. Outperforms GPT-4o in math and matches o1-mini on coding.
About model
DeepSeek R1 Distilled Qwen 14B excels at complex problem-solving and reasoning tasks. It leverages large-scale reinforcement learning to develop chain-of-thought capabilities. Suitable for researchers and developers seeking advanced reasoning models.
| Model | AIME 2025 | GPQA Diamond | HLE | LiveCodeBench | MATH500 | SWE-bench verified |
|---|---|---|---|---|---|---|
| DeepSeek R1 Distilled Qwen 14B | 65.0% | 78.3% | | | | |
API usage
Endpoint:
- Type: Chat, Reasoning
- Main use cases: Chat, Reasoning
- Features: JSON Mode
- Fine-tuning: Supported
- Deployment: Serverless, On-Demand Dedicated, Monthly Reserved
- Parameters: 14.8B
- Input price: $0.18 / 1M tokens
- Output price: $0.18 / 1M tokens
- Input modalities: Text
- Output modalities: Text
- Released: January 20, 2025
- Last updated: September 9, 2025
- Quantization level: FP16
- External link
- Category: Chat
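The JSON Mode feature and per-token pricing above can be sketched with a small client-side helper. This is a sketch under stated assumptions: the model identifier and the `response_format` spelling follow common OpenAI-compatible conventions and are not confirmed by this page; only the $0.18 / 1M token prices come from the listing. The request body is assembled locally and never sent.

```python
# Sketch: build an OpenAI-compatible chat request for this model and
# estimate call cost from the listed prices. MODEL_ID and the JSON Mode
# field name are assumptions, not documented values from this page.

MODEL_ID = "deepseek-r1-distill-qwen-14b"  # assumed identifier
INPUT_PRICE_PER_M = 0.18   # $ per 1M input tokens (from the listing above)
OUTPUT_PRICE_PER_M = 0.18  # $ per 1M output tokens (from the listing above)

def build_request(prompt: str, json_mode: bool = False) -> dict:
    """Assemble a chat-completions request body (not sent anywhere here)."""
    body = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
    }
    if json_mode:
        # "JSON Mode" is listed as a feature; the usual OpenAI-compatible
        # spelling is the response_format field below (an assumption here).
        body["response_format"] = {"type": "json_object"}
    return body

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one call at the listed per-token prices."""
    return (input_tokens * INPUT_PRICE_PER_M
            + output_tokens * OUTPUT_PRICE_PER_M) / 1_000_000

req = build_request("Return {\"answer\": 84} as JSON.", json_mode=True)
print(req["response_format"])     # {'type': 'json_object'}
print(estimate_cost(1000, 2000))  # 0.00054
```

Because input and output are priced identically here, cost depends only on total tokens; the helper keeps the two terms separate so it still works for models with asymmetric pricing.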