# Together AI

> Together AI is the AI Native Cloud, a platform for running, fine-tuning, and deploying
> open-source and frontier AI models at scale. Developers use Together for serverless
> inference, dedicated endpoints, GPU clusters, fine-tuning, and code sandboxes via an
> OpenAI-compatible API.

## Getting Started

- [Quickstart](https://docs.together.ai/docs/quickstart): Get your first API call working in minutes
- [OpenAI Compatibility](https://docs.together.ai/docs/openai-api-compatibility): Drop-in replacement for OpenAI SDK
- [Pricing](https://www.together.ai/pricing): Per-token and per-compute pricing for all products
- [API Authentication](https://docs.together.ai/reference/authentication-1): How to authenticate API requests
- [Introduction](https://docs.together.ai/docs/introduction): Platform overview and capabilities
- [Integrations](https://docs.together.ai/docs/integrations): Connect Together AI with your existing tools
- [Multiple API Keys](https://docs.together.ai/docs/multiple-api-keys): Manage API keys across teams and projects

## Products

- [Serverless Inference](https://www.together.ai/serverless-inference): Pay-per-token access to 200+ open-source and frontier models
- [Dedicated Model Inference](https://www.together.ai/dedicated-model-inference): Reserved model capacity for production workloads
- [Dedicated Container Inference](https://www.together.ai/dedicated-container-inference): Deploy custom containers on dedicated infrastructure
- [Batch Inference](https://www.together.ai/batch-inference): Async large-scale inference at lower cost
- [Fine-tuning](https://www.together.ai/fine-tuning): Full fine-tuning and LoRA adapters for custom models
- [Accelerated Compute / GPU Clusters](https://www.together.ai/accelerated-compute): On-demand H100/H200/GB200/B300 clusters for training and inference
- [Sandbox](https://www.together.ai/sandbox): Secure code interpreter and execution environment
- [Managed Storage](https://www.together.ai/managed-storage): Persistent storage for AI workloads
- [Evaluations](https://www.together.ai/evaluations): Evaluate and benchmark model performance
- [Enterprise](https://www.together.ai/scale-enterprise): Private deployments, SLAs, and compliance features

## Models

- [Models Catalog](https://www.together.ai/models): Browse 200+ available models by category and capability
- [Llama 4 Scout](https://www.together.ai/deploy/llama-4-scout): Meta's latest multimodal frontier model
- [Llama 4 Maverick](https://www.together.ai/deploy/llama-4-maverick): Meta's high-performance reasoning model
- [DeepSeek R1](https://www.together.ai/deploy/deepseek-r1): Top-ranked open reasoning model
- [DeepSeek V3](https://www.together.ai/deploy/deepseek-v3): High-performance open frontier model
- [DeepSeek V3.1](https://www.together.ai/deploy/deepseek-v3-1): Updated DeepSeek V3 with improvements
- [Qwen3 Coder](https://www.together.ai/deploy/qwen3-coder): State-of-the-art open coding model
- [Qwen3 Instruct](https://www.together.ai/deploy/qwen3-instruct): Qwen3 instruction-tuned model family
- [Llama 3.3 70B](https://www.together.ai/deploy/llama-3-3-70b): Meta's Llama 3.3 70B instruct model
- [OpenAI Open Models](https://www.together.ai/deploy/openai-open-models): OpenAI's open-weight models on Together

## API Reference

- [Chat Completions](https://docs.together.ai/reference/chat-completions-1): OpenAI-compatible chat endpoint
- [Text Completions](https://docs.together.ai/reference/completions-1): Legacy completions endpoint
- [Image Generation](https://docs.together.ai/reference/post_images-generations): Text-to-image API
- [Embeddings](https://docs.together.ai/reference/embeddings-2): Vector embeddings for RAG
- [Rerank](https://docs.together.ai/reference/rerank-1): Rerank documents by relevance
- [Fine-tuning Jobs](https://docs.together.ai/reference/finetune): Create and manage fine-tuning runs
- [Fine-tuning Events](https://docs.together.ai/reference/get_fine-tunes-id-events): Monitor fine-tuning job progress
- [Fine-tuning Checkpoints](https://docs.together.ai/reference/get_fine-tunes-id-checkpoints): Download fine-tuning checkpoints
- [Dedicated Endpoints: Create](https://docs.together.ai/reference/createendpoint): Deploy a dedicated endpoint
- [Dedicated Endpoints: List](https://docs.together.ai/reference/listendpoints): List all dedicated endpoints
- [Dedicated Endpoints: Get](https://docs.together.ai/reference/getendpoint): Get endpoint details
- [Dedicated Endpoints: Update](https://docs.together.ai/reference/updateendpoint): Update endpoint configuration
- [Dedicated Endpoints: Delete](https://docs.together.ai/reference/deleteendpoint): Delete a dedicated endpoint
- [Batch API: Submit](https://docs.together.ai/reference/post_batches): Submit async inference batches
- [Batch API: Get](https://docs.together.ai/reference/get_batches-id): Get batch job status
- [Batch API: List](https://docs.together.ai/reference/get_batches): List all batch jobs
- [Files: Upload](https://docs.together.ai/reference/post_files-upload): Upload training or batch files
- [Files: List](https://docs.together.ai/reference/get_files): List uploaded files
- [Files: Get](https://docs.together.ai/reference/get_files-id): Get file metadata
- [Files: Download](https://docs.together.ai/reference/get_files-id-content): Download file content
- [Files: Delete](https://docs.together.ai/reference/delete_files-id): Delete a file
- [Models List](https://docs.together.ai/reference/models-1): List all available models
- [Audio: Speech](https://docs.together.ai/reference/audio-speech): Text-to-speech API
- [Audio: Transcriptions](https://docs.together.ai/reference/audio-transcriptions): Speech-to-text transcription
- [Audio: Translations](https://docs.together.ai/reference/audio-translations): Translate audio to English
- [Inference: Execute](https://docs.together.ai/reference/inference): Direct inference endpoint
- [Hardware: List](https://docs.together.ai/reference/listhardware): List available GPU hardware
- [Upload Model](https://docs.together.ai/reference/uploadmodel): Upload a custom model

## Key Documentation

- [Function Calling](https://docs.together.ai/docs/function-calling): Structured outputs and tool use
- [JSON Mode](https://docs.together.ai/docs/json-mode): Guaranteed JSON-formatted responses
- [Reasoning Models Guide](https://docs.together.ai/docs/reasoning-models-guide): Working with chain-of-thought models
- [RAG Quickstart](https://docs.together.ai/docs/quickstart-retrieval-augmented-generation-rag): Build retrieval-augmented apps
- [Rate Limits](https://docs.together.ai/docs/rate-limits): Understand and increase rate limits
- [Error Codes](https://docs.together.ai/docs/error-codes): API error reference
- [Inference Best Practices Guide](https://www.together.ai/guides/best-practices-to-accelerate-inference-for-large-scale-production-workloads): Optimizing large-scale production workloads
- [Serverless Models Reference](https://docs.together.ai/docs/serverless-models): Complete list of serverless-available models
- [Dedicated Inference Guide](https://docs.together.ai/docs/dedicated-inference): Configure and optimize dedicated deployments
- [Fine-tuning Quickstart](https://docs.together.ai/docs/fine-tuning-quickstart): Fine-tune your first model
- [Fine-tuning Data Preparation](https://docs.together.ai/docs/fine-tuning-data-preparation): Format training data correctly
- [Fine-tuning Pricing](https://docs.together.ai/docs/fine-tuning-pricing): Per-token pricing for training runs
- [Fine-tuning FAQs](https://docs.together.ai/docs/fine-tuning-faqs): Common questions about fine-tuning
- [Batch Inference Guide](https://docs.together.ai/docs/batch-inference): Submit and manage async inference jobs
- [Embeddings Overview](https://docs.together.ai/docs/embeddings-overview): Vector embeddings for semantic search
- [Embeddings +
RAG](https://docs.together.ai/docs/embeddings-rag): Build RAG pipelines with embeddings
- [Rerank Overview](https://docs.together.ai/docs/rerank-overview): Improve retrieval quality with reranking
- [Images Overview](https://docs.together.ai/docs/images-overview): Text-to-image generation guide
- [Vision Overview](https://docs.together.ai/docs/vision-overview): Multimodal image understanding
- [Logprobs](https://docs.together.ai/docs/logprobs): Access token log probabilities
- [LoRA Inference](https://docs.together.ai/docs/lora-inference): Deploy fine-tuned LoRA adapters
- [Custom Models](https://docs.together.ai/docs/custom-models): Upload and run your own model weights
- [Deploying a Fine-tuned Model](https://docs.together.ai/docs/deploying-a-fine-tuned-model): Deploy a trained model to production
- [Deployment Options](https://docs.together.ai/docs/deployment-options): Overview of serverless vs dedicated
- [Deprecations](https://docs.together.ai/docs/deprecations): Model deprecation schedule
- [Instant Clusters](https://docs.together.ai/docs/instant-clusters): On-demand GPU cluster provisioning
- [Cluster Storage](https://docs.together.ai/docs/cluster-storage): Persistent storage for GPU clusters
- [Cluster User Management](https://docs.together.ai/docs/cluster-user-management): Manage team access to clusters
- [Speech to Text](https://docs.together.ai/docs/speech-to-text): Whisper-based transcription API
- [Text to Speech](https://docs.together.ai/docs/text-to-speech): TTS model API
- [OCR](https://docs.together.ai/docs/ocr): Optical character recognition guide
- [Language Overview](https://docs.together.ai/docs/language-overview): Chat and completion model overview
- [Chat Overview](https://docs.together.ai/docs/chat-overview): Chat model capabilities
- [Preference Fine-tuning](https://docs.together.ai/docs/preference-fine-tuning): DPO and RLHF fine-tuning
- [DeepSeek FAQs](https://docs.together.ai/docs/deepseek-faqs): Common questions about DeepSeek models
- [Prompting DeepSeek R1](https://docs.together.ai/docs/prompting-deepseek-r1): Effective prompting for reasoning models
- [DeepSeek R1 Guide](https://docs.together.ai/docs/deepseek-r1): Deploy and use DeepSeek R1
- [Llama 4 Quickstart](https://docs.together.ai/docs/llama4-quickstart): Get started with Llama 4 models
- [Together Code Sandbox](https://docs.together.ai/docs/together-code-sandbox): Secure code execution environment
- [Workflows](https://docs.together.ai/docs/workflows): Multi-step AI pipeline orchestration
- [Mixture of Agents](https://docs.together.ai/docs/mixture-of-agents): Combine multiple models for better results
- [SLURM](https://docs.together.ai/docs/slurm): SLURM job scheduling on GPU clusters

## Framework & Integration Guides

- [LangGraph](https://docs.together.ai/docs/langgraph): Build stateful LLM agents with LangGraph
- [CrewAI](https://docs.together.ai/docs/crewai): Multi-agent frameworks with CrewAI
- [AutoGen](https://docs.together.ai/docs/autogen): Microsoft AutoGen integration
- [PydanticAI](https://docs.together.ai/docs/pydanticai): Type-safe LLM apps with PydanticAI
- [Agno](https://docs.together.ai/docs/agno): Build AI agents with Agno
- [DSPy](https://docs.together.ai/docs/dspy): Programmatic LLM pipelines with DSPy
- [Composio](https://docs.together.ai/docs/composio): Tool calling integrations via Composio
- [Vercel AI SDK](https://docs.together.ai/docs/using-together-with-vercels-ai-sdk): Next.js and Edge Runtime integration
- [Next.js Chat Quickstart](https://docs.together.ai/docs/nextjs-chat-quickstart): Build a chat app with Next.js
- [Cline](https://docs.together.ai/docs/cline): Cline IDE extension integration
- [Inference Web Interface](https://docs.together.ai/docs/inference-web-interface): Use the Together Playground UI
- [Hugging Face Integration](https://docs.together.ai/docs/quickstart-using-hugging-face-inference): Run Hugging Face models via Together
- [Together + LlamaRank](https://docs.together.ai/docs/together-and-llamarank): Salesforce LlamaRank integration

## Cookbook Quickstarts

- [AI Search Engine](https://docs.together.ai/docs/ai-search-engine): Build a semantic search engine
- [AI Tutor](https://docs.together.ai/docs/ai-tutor): Build an adaptive tutoring system
- [Data Analyst Agent](https://docs.together.ai/docs/data-analyst-agent): Build a data analysis agent
- [Building a RAG Workflow](https://docs.together.ai/docs/building-a-rag-workflow): End-to-end RAG implementation
- [FLUX Kontext Quickstart](https://docs.together.ai/docs/quickstart-flux-kontext): Image editing with FLUX Kontext
- [FLUX LoRA Quickstart](https://docs.together.ai/docs/quickstart-flux-lora): Fine-tune and run FLUX LoRA models
- [FLUX Tools Quickstart](https://docs.together.ai/docs/quickstart-flux-tools-models): Use FLUX Canny and Depth models
- [Building Coding Agents](https://docs.together.ai/docs/how-to-build-coding-agents): Practical guide to AI coding agents
- [NotebookLM Clone](https://docs.together.ai/docs/open-notebooklm-pdf-to-podcast): PDF-to-podcast with open models
- [Multimodal Document RAG](https://docs.together.ai/docs/quickstart): Vision models + ColQwen2 retrieval
- [Contextual RAG](https://docs.together.ai/docs/how-to-implement-contextual-rag-from-anthropic): Anthropic-style contextual RAG
- [Search with Rerankers](https://docs.together.ai/docs/how-to-improve-search-with-rerankers): Improve search quality with reranking
- [Sequential Agent Workflow](https://docs.together.ai/docs/sequential-agent-workflow): Chain agents in sequence
- [Parallel Workflows](https://docs.together.ai/docs/parallel-workflows): Run agents in parallel
- [Conditional Workflows](https://docs.together.ai/docs/conditional-workflows): Branch workflows based on model output
- [Iterative Workflow](https://docs.together.ai/docs/iterative-workflow): Self-improving agent loops
- [Create Slack Tickets](https://docs.together.ai/docs/create-tickets-in-slack): Automate
ticket creation in Slack

## Customers

- [Customer Stories](https://www.together.ai/customers): How leading AI companies build on Together
- [Cursor](https://www.together.ai/customers/cursor): Real-time low-latency inference at scale for AI coding
- [Decagon](https://www.together.ai/customers/decagon): AI customer support with sub-second voice AI
- [Washington Post](https://www.together.ai/customers/washington-post): AI-powered editorial workflows at a major news publisher
- [Zomato](https://www.together.ai/customers/zomato): AI customer support bot that doubled satisfaction scores
- [Arcee AI](https://www.together.ai/customers/arcee-ai): Custom model training and deployment at scale
- [Dippy AI](https://www.together.ai/customers/dippy-ai): Character AI and roleplay applications
- [Hedra](https://www.together.ai/customers/hedra): AI video generation platform
- [HeroUI](https://www.together.ai/customers/heroui): AI-powered UI component generation
- [Latent Health](https://www.together.ai/customers/latent-health): Healthcare AI applications
- [LegionEdge](https://www.together.ai/customers/legionedge): AI-powered gaming infrastructure
- [Runware](https://www.together.ai/customers/runware): High-throughput image generation
- [Scaled Cognition](https://www.together.ai/customers/scaled-cognition): AI reasoning and agent systems
- [SCB10X](https://www.together.ai/customers/scb10x): Financial services AI
- [Slingshot AI](https://www.together.ai/customers/slingshot-ai): Sales intelligence AI
- [Vercept](https://www.together.ai/customers/vercept): AI-powered video analysis

## Company

- [About](https://www.together.ai/about-us): Company mission and team
- [Research](https://www.together.ai/research): Together AI research publications
- [Research Blog](https://www.together.ai/research-blog): Technical research posts
- [Blog](https://www.together.ai/blog): Engineering and product updates
- [Careers](https://www.together.ai/careers): Open roles at Together AI
- [Brand](https://www.together.ai/brand): Together AI brand assets and guidelines
- [Events](https://www.together.ai/events): Upcoming and past Together AI events

## GPU Hardware

- [NVIDIA GB200 NVL72](https://www.together.ai/nvidia-gb200-nvl72): GB200 NVL72 specs and availability
- [NVIDIA GB300 NVL72](https://www.together.ai/nvidia-gb300-nvl72): GB300 NVL72 specs and availability
- [NVIDIA HGX B200](https://www.together.ai/nvidia-hgx-b200): HGX B200 specs and availability
- [NVIDIA H200](https://www.together.ai/nvidia-h200): H200 specs and availability
- [NVIDIA H100](https://www.together.ai/nvidia-h100): H100 specs and availability
- [B300 GPU Cluster](https://www.together.ai/gpu/b300): B300 cluster configuration and pricing

## Optional

- [Cookbooks](https://www.together.ai/cookbooks): End-to-end implementation examples
- [Demos](https://www.together.ai/demos): Interactive product demos
- [Startup Accelerator](https://www.together.ai/startup-accelerator): Credits and support for early-stage AI startups
- [Contact Sales](https://www.together.ai/contact-sales): Enterprise and GPU cluster inquiries
- [Support](https://www.together.ai/support): Documentation and help center
- [Privacy Policy](https://www.together.ai/privacy): Data handling and privacy
- [Terms of Service](https://www.together.ai/terms-of-service): Usage terms
- [Cookie Policy](https://www.together.ai/cookie-policy): Cookie usage policy
- [Data Center Locations](https://www.together.ai/data-center-locations): Infrastructure regions

## All Model Pages

https://www.together.ai/models/afm-4-5b-preview https://www.together.ai/models/apriel-1-5-15b-thinker https://www.together.ai/models/apriel-1-6-15b-thinker https://www.together.ai/models/arcana-v2 https://www.together.ai/models/arcee-ai-afm-4-5b https://www.together.ai/models/arcee-ai-arcee-blitz https://www.together.ai/models/arcee-ai-arcee-spotlight https://www.together.ai/models/arcee-ai-caller https://www.together.ai/models/arcee-ai-coder-large
https://www.together.ai/models/arcee-ai-maestro-reasoning https://www.together.ai/models/arcee-ai-virtuoso-large https://www.together.ai/models/arcee-ai-virtuoso-medium https://www.together.ai/models/bge-base-en-v1-5 https://www.together.ai/models/bge-large-en-v1-5 https://www.together.ai/models/bytedance-seedance-1-0-lite https://www.together.ai/models/bytedance-seedance-1-0-pro https://www.together.ai/models/bytedance-seededit https://www.together.ai/models/bytedance-seedream-3-0 https://www.together.ai/models/bytedance-seedream-4-0 https://www.together.ai/models/cartesia-sonic https://www.together.ai/models/cartesia-sonic-3 https://www.together.ai/models/cogito-109b-moe https://www.together.ai/models/cogito-405b https://www.together.ai/models/cogito-671b-moe https://www.together.ai/models/cogito-70b https://www.together.ai/models/cogito-v1-preview-llama-3b https://www.together.ai/models/cogito-v1-preview-llama-70b https://www.together.ai/models/cogito-v1-preview-qwen-14b https://www.together.ai/models/cogito-v1-preview-qwen-32b https://www.together.ai/models/cogito-v2-1-671b https://www.together.ai/models/dbrx-instruct https://www.together.ai/models/deepseek-r1 https://www.together.ai/models/deepseek-r1-0528-throughput https://www.together.ai/models/deepseek-r1-distilled-llama-70 https://www.together.ai/models/deepseek-r1-distilled-llama-70b-free https://www.together.ai/models/deepseek-r1-distilled-qwen-14 https://www.together.ai/models/deepseek-v3 https://www.together.ai/models/deepseek-v3-1 https://www.together.ai/models/deepseek-v3-2-exp https://www.together.ai/models/devstral-small-2505 https://www.together.ai/models/dreamshaper https://www.together.ai/models/exaone-3-5-32b-instruct https://www.together.ai/models/exaone-deep-32b https://www.together.ai/models/flux-1-canny-pro https://www.together.ai/models/flux-1-dev https://www.together.ai/models/flux-1-kontext-dev https://www.together.ai/models/flux-1-kontext-max 
https://www.together.ai/models/flux-1-kontext-pro https://www.together.ai/models/flux-1-krea-dev https://www.together.ai/models/flux-1-schnell-2 https://www.together.ai/models/flux-1-schnell-fixedres https://www.together.ai/models/flux-2-flex https://www.together.ai/models/flux-2-max https://www.together.ai/models/flux-2-pro https://www.together.ai/models/flux1-1-pro https://www.together.ai/models/gemini-flash-image-2-5 https://www.together.ai/models/gemma-3-12b https://www.together.ai/models/gemma-3-1b https://www.together.ai/models/gemma-3-27b https://www.together.ai/models/gemma-3-4b https://www.together.ai/models/gemma-3n-e4b-it https://www.together.ai/models/gemma-instruct-2b https://www.together.ai/models/glm-4-5-air https://www.together.ai/models/glm-4-6 https://www.together.ai/models/glm-4-7 https://www.together.ai/models/glm-5 https://www.together.ai/models/google-imagen-4-0-fast https://www.together.ai/models/google-imagen-4-0-preview https://www.together.ai/models/google-imagen-4-0-ultra https://www.together.ai/models/google-veo-2-0 https://www.together.ai/models/google-veo-3-0 https://www.together.ai/models/google-veo-3-0-audio https://www.together.ai/models/google-veo-3-0-fast https://www.together.ai/models/google-veo-3-0-fast-audio https://www.together.ai/models/gpt-oss-120b https://www.together.ai/models/gpt-oss-20b https://www.together.ai/models/gryphe-mythomax-l2-lite-13b https://www.together.ai/models/gte-modernbert-base https://www.together.ai/models/hidream-i1-dev https://www.together.ai/models/hidream-i1-fast https://www.together.ai/models/hidream-i1-full https://www.together.ai/models/ideogram-3-0 https://www.together.ai/models/juggernaut-lightning-flux https://www.together.ai/models/juggernaut-pro-flux https://www.together.ai/models/kimi-k2-0905 https://www.together.ai/models/kimi-k2-5 https://www.together.ai/models/kimi-k2-instruct https://www.together.ai/models/kimi-k2-thinking https://www.together.ai/models/kling-1-6-pro 
https://www.together.ai/models/kling-1-6-standard https://www.together.ai/models/kling-2-0-master https://www.together.ai/models/kling-2-1-master https://www.together.ai/models/kling-2-1-pro https://www.together.ai/models/kling-2-1-standard https://www.together.ai/models/kokoro-82m https://www.together.ai/models/lfm2-24b-a2b https://www.together.ai/models/llama-2-chat-13b https://www.together.ai/models/llama-2-chat-7b https://www.together.ai/models/llama-3-1 https://www.together.ai/models/llama-3-1-70b https://www.together.ai/models/llama-3-2-3b-instruct-turbo https://www.together.ai/models/llama-3-3-70b https://www.together.ai/models/llama-3-3-70b-free https://www.together.ai/models/llama-3-70b-instruct-reference https://www.together.ai/models/llama-3-70b-instruct-turbo https://www.together.ai/models/llama-3-8b-instruct-lite https://www.together.ai/models/llama-4-maverick https://www.together.ai/models/llama-4-scout https://www.together.ai/models/llama-guard-2-8b https://www.together.ai/models/llama-guard-3-11b-vision-turbo https://www.together.ai/models/llama-guard-3-8b https://www.together.ai/models/llama-guard-4-12b https://www.together.ai/models/llama-guard-7b https://www.together.ai/models/m2-bert-80m-2k-retrieval https://www.together.ai/models/m2-bert-80m-32k-retrieval https://www.together.ai/models/m2-bert-80m-8k-retrieval https://www.together.ai/models/magistral-small-2506 https://www.together.ai/models/marin-8b-instruct https://www.together.ai/models/minimax-01-director https://www.together.ai/models/minimax-hailuo-02 https://www.together.ai/models/minimax-m1-40k https://www.together.ai/models/minimax-m1-80k https://www.together.ai/models/minimax-m2-5 https://www.together.ai/models/minimax-m21 https://www.together.ai/models/minimax-speech-2-6-turbo https://www.together.ai/models/ministral-3-14b-instruct-2512 https://www.together.ai/models/ministral-3-3b-instruct-2512 https://www.together.ai/models/mist-v2 
https://www.together.ai/models/mistral-7b-instruct-v0-2 https://www.together.ai/models/mistral-beb7b https://www.together.ai/models/mistral-instruct https://www.together.ai/models/mistral-small-3 https://www.together.ai/models/mixtral-8x7b-v0-1 https://www.together.ai/models/mixtral-instruct https://www.together.ai/models/multilingual-e5-large-instruct https://www.together.ai/models/mxbai-rerank-large-v2 https://www.together.ai/models/mythomax-l2 https://www.together.ai/models/nim-llama-3-1-70b-instruct https://www.together.ai/models/nim-llama-3-1-8b-instruct https://www.together.ai/models/nim-llama-3-1-nemotron-70b-instruct https://www.together.ai/models/nim-llama-3-2-11b-vision-instruct https://www.together.ai/models/nim-llama-3-2-90b-vision-instruct https://www.together.ai/models/nim-llama-3-3-70b-instruct https://www.together.ai/models/nim-llama-3-3-nemotron-super-49b-v1 https://www.together.ai/models/nim-mistral-nemo-12b-instruct https://www.together.ai/models/nim-mixtral-8x22b-instruct-v0-1 https://www.together.ai/models/nim-mixtral-8x7b-instruct-v0-1 https://www.together.ai/models/nvidia-nemotron-3-nano https://www.together.ai/models/nvidia-nemotron-nano-9b-v2 https://www.together.ai/models/openai-whisper-large-v3 https://www.together.ai/models/orpheus-tts https://www.together.ai/models/pixverse-v5 https://www.together.ai/models/qwen-2-5 https://www.together.ai/models/qwen-2-5-coder-32b-instruct https://www.together.ai/models/qwen-image https://www.together.ai/models/qwen-image-edit https://www.together.ai/models/qwen-qwq-32b https://www.together.ai/models/qwen2-5-7b-instruct-turbo https://www.together.ai/models/qwen2-5-vl-72b-instruct https://www.together.ai/models/qwen3-0-6b https://www.together.ai/models/qwen3-0-6b-base https://www.together.ai/models/qwen3-1-7b https://www.together.ai/models/qwen3-1-7b-base https://www.together.ai/models/qwen3-14b-base https://www.together.ai/models/qwen3-235b-a22b-fp8-tput 
https://www.together.ai/models/qwen3-235b-a22b-instruct-2507-fp8 https://www.together.ai/models/qwen3-235b-a22b-thinking-2507 https://www.together.ai/models/qwen3-30b-a3b https://www.together.ai/models/qwen3-30b-a3b-base https://www.together.ai/models/qwen3-32b https://www.together.ai/models/qwen3-4b https://www.together.ai/models/qwen3-4b-base https://www.together.ai/models/qwen3-5-397b-a17b https://www.together.ai/models/qwen3-8b https://www.together.ai/models/qwen3-coder-480b-a35b-instruct https://www.together.ai/models/qwen3-coder-next https://www.together.ai/models/qwen3-next-80b-a3b-instruct https://www.together.ai/models/qwen3-next-80b-a3b-thinking https://www.together.ai/models/qwen3-vl-32b-instruct https://www.together.ai/models/refuel-llm-2 https://www.together.ai/models/refuel-llm-2-small https://www.together.ai/models/rime-arcana-v3 https://www.together.ai/models/rime-arcana-v3-turbo https://www.together.ai/models/rnj-1-instruct https://www.together.ai/models/salesforce-llamarank https://www.together.ai/models/sd-xl https://www.together.ai/models/sora-2 https://www.together.ai/models/sora-2-pro https://www.together.ai/models/stable-diffusion-3 https://www.together.ai/models/trinity-mini https://www.together.ai/models/typhoon-2-8b-instruct https://www.together.ai/models/typhoon2-1-gemma3-12b https://www.together.ai/models/uae-large-v1 https://www.together.ai/models/virtueguard-text-lite https://www.together.ai/models/vidu-2-0 https://www.together.ai/models/vidu-q1 https://www.together.ai/models/voxtral-mini-3b-2507 https://www.together.ai/models/wan-2-2-i2v https://www.together.ai/models/wan-2-2-t2v https://www.together.ai/models/whisper-large-v3-streaming

## All Blog Posts

https://www.together.ai/blog/20-exaflops-gpu-clusters https://www.together.ai/blog/40-new-image-and-video-models https://www.together.ai/blog/a-practitioners-guide-to-testing-and-running-large-gpu-clusters-for-training-generative-ai-models
https://www.together.ai/blog/adaptive-learning-speculator-system-atlas https://www.together.ai/blog/ai-agents-to-automate-complex-engineering-tasks https://www.together.ai/blog/alon-gavrielov-as-vp-of-infrastructure-strategy https://www.together.ai/blog/announcing-the-availability-of-openais-open-models-on-together-ai https://www.together.ai/blog/announcing-together-ai-startup-accelerator https://www.together.ai/blog/announcing-together-custom-models https://www.together.ai/blog/api-announcement https://www.together.ai/blog/arcee-ai https://www.together.ai/blog/august-2023-pricing-update https://www.together.ai/blog/axiomatic-agents https://www.together.ai/blog/based https://www.together.ai/blog/batch-api https://www.together.ai/blog/batch-inference-api-updates-2025 https://www.together.ai/blog/benchmarking-language-models-using-the-together-research-computer https://www.together.ai/blog/bitdelta https://www.together.ai/blog/build-ultra-low-latency-voice-ai-applications-with-together-ai-and-cartesia-sonic https://www.together.ai/blog/building-an-autonomous-and-open-data-scientist-agent-from-scratch https://www.together.ai/blog/cache-aware-disaggregated-inference https://www.together.ai/blog/chipmunk https://www.together.ai/blog/clustermax-gold https://www.together.ai/blog/cocktailsgd https://www.together.ai/blog/code-sandbox-code-interpreter https://www.together.ai/blog/collinear-simulations-together-evals https://www.together.ai/blog/consistency-diffusion-language-models https://www.together.ai/blog/continued-fine-tuning https://www.together.ai/blog/customized-speculative-decoding https://www.together.ai/blog/decentralized-training-of-foundation-models-in-heterogeneous-environments https://www.together.ai/blog/deepcoder https://www.together.ai/blog/deepseek-v3-1-hybrid-thinking-model-now-available-on-together-ai https://www.together.ai/blog/deepswe https://www.together.ai/blog/deploy-deepseek-r1-and-distilled-models-securely-on-together-ai 
https://www.together.ai/blog/deploy-deepseek-r1-at-scale-fast-secure-serverless-apis-and-large-scale-together-reasoning-clusters https://www.together.ai/blog/dippy-ai https://www.together.ai/blog/direct-preference-optimization https://www.together.ai/blog/dragonfly-v1 https://www.together.ai/blog/embeddings-endpoint-release https://www.together.ai/blog/evaluate-and-benchmark-llms https://www.together.ai/blog/even-better-even-faster-quantized-llms-with-qtip https://www.together.ai/blog/evo https://www.together.ai/blog/fastest-inference-for-deepseek-r1-0528-with-nvidia-hgx-b200 https://www.together.ai/blog/fastest-inference-for-the-top-open-source-models https://www.together.ai/blog/fine-tune-gpt-oss-models-into-domain-experts-together-ai https://www.together.ai/blog/fine-tune-small-open-source-llms-outperform-closed-models https://www.together.ai/blog/fine-tuning-api-introducing-long-context-training-conversation-data-support-and-more-configuration-options https://www.together.ai/blog/fine-tuning-language-models-over-slow-networks-using-activation-compression-with-guarantees https://www.together.ai/blog/fine-tuning-llms-for-multi-turn-conversations-a-technical-deep-dive https://www.together.ai/blog/fine-tuning-open-llm-judges-to-outperform-gpt-5-2 https://www.together.ai/blog/fine-tuning-updates-sept-2025 https://www.together.ai/blog/finetuning https://www.together.ai/blog/flash-decoding-for-long-context-inference https://www.together.ai/blog/flashattentionfandm https://www.together.ai/blog/flashfftconv https://www.together.ai/blog/flexgen-high-throughput-generative-inference-of-large-language-models-with-a-single-gpu https://www.together.ai/blog/flux-1-kontext https://www.together.ai/blog/flux-2-multi-reference-image-generation-now-available-on-together-ai https://www.together.ai/blog/flux-api-is-now-available-on-together-ai-new-pro-free-access-to-flux-schnell https://www.together.ai/blog/flux-tools-models-together-apis-canny-depth-image-generation 
https://www.together.ai/blog/function-calling-json-mode https://www.together.ai/blog/futurebench https://www.together.ai/blog/generate-images-with-specific-styles-using-flux-loras-on-together-ai https://www.together.ai/blog/h3 https://www.together.ai/blog/how-speech-models-fail https://www.together.ai/blog/how-to-build-a-coding-agent-from-scratch-a-practical-guide-for-developers https://www.together.ai/blog/how-to-build-a-real-time-image-generator-with-together-ai https://www.together.ai/blog/how-to-choose-the-right-open-model-for-production https://www.together.ai/blog/how-zomato-built-an-ai-customer-support-bot-that-doubled-customer-satisfaction https://www.together.ai/blog/hungry-hungry-hippos-towards-language-modeling-with-state-space-models https://www.together.ai/blog/hyena-hierarchy-towards-larger-convolutional-language-models https://www.together.ai/blog/instant-gpu-clusters https://www.together.ai/blog/introducing-autojudge-streamlined-inference-acceleration-via-automated-dataset-curation https://www.together.ai/blog/introducing-fine-tuning-platform https://www.together.ai/blog/introducing-the-together-enterprise-platform https://www.together.ai/blog/introducing-together-evaluations https://www.together.ai/blog/kimi-k2-leading-open-source-model-now-available-on-together-ai https://www.together.ai/blog/large-reasoning-models-fail-to-follow-instructions-during-reasoning-a-benchmark-study https://www.together.ai/blog/learn-how-cursor-partnered-with-together-ai-to-deliver-real-time-low-latency-inference-at-scale https://www.together.ai/blog/linearizing-llms-with-lolcats https://www.together.ai/blog/llama-2-7b-32k https://www.together.ai/blog/llama-2-7b-32k-instruct https://www.together.ai/blog/llama-3-2-vision-stack https://www.together.ai/blog/llama-3-3 https://www.together.ai/blog/llama-31-quality https://www.together.ai/blog/llama-4 https://www.together.ai/blog/long-context-retrieval-models-with-monarch-mixer 
https://www.together.ai/blog/mahadev-konar-svp-infrastructure-engineering https://www.together.ai/blog/mamba-3b-slimpj https://www.together.ai/blog/medusa https://www.together.ai/blog/meta-llama-3-1 https://www.together.ai/blog/minions https://www.together.ai/blog/mistral-small-3-api-now-available-on-together-ai-a-new-category-leader-in-small-models https://www.together.ai/blog/mixtral https://www.together.ai/blog/monarch-mixer https://www.together.ai/blog/multi-node-gpu-training https://www.together.ai/blog/multimodal-document-rag-with-llama-3-2-vision-and-colqwen2 https://www.together.ai/blog/nemotron-3-nano-now-available-on-together-ai https://www.together.ai/blog/neurips-2022-overcoming-communication-bottlenecks-for-decentralized-training-12 https://www.together.ai/blog/neurips-2022-overcoming-communication-bottlenecks-for-decentralized-training-2 https://www.together.ai/blog/nvidia-ai-foundry-partnership https://www.together.ai/blog/nvidia-blackwell-test-drive https://www.together.ai/blog/nvidia-cloud-partner https://www.together.ai/blog/nvidia-gb200-together-gpu-cluster-36k https://www.together.ai/blog/nvidia-h200-and-h100-gpu-cluster-performance-together-kernel-collection https://www.together.ai/blog/nvidia-hgx-b200-with-together-kernel-collection https://www.together.ai/blog/nvidia-nim https://www.together.ai/blog/on-demand-dedicated-endpoints https://www.together.ai/blog/open-deep-research https://www.together.ai/blog/openais-new-open-gpt-oss-models-vs-o4-mini-a-real-world-comparison https://www.together.ai/blog/openchatkit https://www.together.ai/blog/openchatkit-016 https://www.together.ai/blog/optimizing-inference-speed-and-costs https://www.together.ai/blog/python-sdk-v1 https://www.together.ai/blog/qwen-3-coder https://www.together.ai/blog/rag-fine-tuning https://www.together.ai/blog/rag-tutorial-langchain https://www.together.ai/blog/rag-tutorial-llamaindex https://www.together.ai/blog/rag-tutorial-mongodb https://www.together.ai/blog/redpajama 
https://www.together.ai/blog/redpajama-3b-updates https://www.together.ai/blog/redpajama-7b https://www.together.ai/blog/redpajama-data-v2 https://www.together.ai/blog/redpajama-models-v1 https://www.together.ai/blog/redpajama-training-progress https://www.together.ai/blog/redpajama-v2-faq https://www.together.ai/blog/releasing-v1-of-gpt-jt-powered-by-open-source-ai https://www.together.ai/blog/research-pov-yes-agi-can-happen https://www.together.ai/blog/rime-arcana-v3-turbo-and-rime-arcana-v3-now-available-on-together-ai https://www.together.ai/blog/safety-models https://www.together.ai/blog/seed-funding https://www.together.ai/blog/sequoia https://www.together.ai/blog/series-a https://www.together.ai/blog/series-a2 https://www.together.ai/blog/serverless-multi-lora-fine-tune-and-deploy-hundreds-of-adapters-for-model-customization-at-scale https://www.together.ai/blog/snorkel-partnership https://www.together.ai/blog/snowflake-artic-llm https://www.together.ai/blog/soc-2-compliance https://www.together.ai/blog/sota-search-stack-for-llms https://www.together.ai/blog/specexec https://www.together.ai/blog/speculative-decoding-for-high-throughput-long-context-inference https://www.together.ai/blog/speech-to-text-whisper-apis https://www.together.ai/blog/stanford-open-source-software-award https://www.together.ai/blog/stripedhyena-7b https://www.together.ai/blog/teal-training-free-activation-sparsity-in-large-language-models https://www.together.ai/blog/the-fastest-inference-for-realtime-voice-ai-agents https://www.together.ai/blog/the-frontier-is-open https://www.together.ai/blog/the-mamba-in-the-llama-distilling-and-accelerating-hybrid-models https://www.together.ai/blog/thunderkittens https://www.together.ai/blog/thunderkittens-nvidia-blackwell-gpus https://www.together.ai/blog/together-ai-acquires-refuel-ai https://www.together.ai/blog/together-ai-and-meta-partner-to-bring-pytorch-reinforcement-learning-to-the-ai-native-cloud 
https://www.together.ai/blog/together-ai-announcing-305m-series-b https://www.together.ai/blog/together-ai-available-aws-marketplace-to-accelerate-enterprise-ai-development https://www.together.ai/blog/together-ai-expands-in-europe https://www.together.ai/blog/together-ai-partners-with-meta-to-release-meta-llama-3-for-inference-and-fine-tuning https://www.together.ai/blog/together-ai-powers-pioneers-at-nvidia-gtc-2025 https://www.together.ai/blog/together-ai-welcomes-kai-mak https://www.together.ai/blog/together-chat https://www.together.ai/blog/together-crusoe-reduce-carbon-impact-of-generative-ai https://www.together.ai/blog/together-evaluations-v2 https://www.together.ai/blog/together-inference-engine-2 https://www.together.ai/blog/together-inference-engine-v1 https://www.together.ai/blog/together-instant-clusters-ga https://www.together.ai/blog/together-moa https://www.together.ai/blog/together-python-sdk-2-0 https://www.together.ai/blog/together-rerank-api-and-salesforce-llamarank https://www.together.ai/blog/torchforge-reinforcement-learning-pipelines https://www.together.ai/blog/tri-dao-flash-attention https://www.together.ai/blog/virtueguard https://www.together.ai/blog/yaqa

## Webinars & Events

https://www.together.ai/events https://www.together.ai/build-coding-agent-webinar https://www.together.ai/codesandbox-sdk-webinar https://www.together.ai/deepseek-r1-how-it-works-simplified-together-ai-webinar https://www.together.ai/nvidia-blackwell-deep-dive-webinar https://www.together.ai/webinar-how-advanced-tool-calling-transforms-agentic-use-cases https://www.together.ai/webinar-openai-gpt-oss-deep-dive