Apriel-1.6-15B-Thinker

Frontier-level multimodal reasoning in a compact, token-efficient model

About model

Apriel-1.6-15B-Thinker is an updated multimodal reasoning model from ServiceNow's Apriel SLM series. It scores 57 on the Artificial Analysis (AA) Intelligence Index, matching models like Qwen-235B-A22B and DeepSeek-v3.2 Exp while being 15x smaller. It uses 30% fewer reasoning tokens than its predecessor, Apriel-1.5, and delivers frontier-level performance on a single GPU.

AA Intelligence Index: 57 (matches Qwen-235B & DeepSeek-v3.2 Exp)
AIME 2025: 88% (elite mathematical reasoning)
Fewer Tokens: 30% (better reasoning efficiency vs. v1.5)

Model key capabilities

Frontier Performance: 57 on AA Index, matching models 15x its size
Token Efficiency: 30% fewer reasoning tokens than Apriel-1.5
Enterprise Ready: 69% Tau2 Bench Telecom, 69% IFBench
Single GPU: 15B parameters fit entirely on one GPU
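
As a rough check on the single-GPU claim, the weight footprint can be estimated from the parameter count and the bytes per parameter. The sketch below assumes 15e9 parameters stored in BF16 (2 bytes each, the quantization level listed in the model specifications). It covers weights only: KV cache, activations, and framework overhead come on top and depend on the serving setup. The function name is illustrative, not part of any library.

```python
def weight_memory_gib(num_params: float, bytes_per_param: int = 2) -> float:
    """Estimate the memory needed for model weights alone, in GiB."""
    return num_params * bytes_per_param / 2**30


# 15B parameters in BF16 (2 bytes per parameter), weights only
print(f"{weight_memory_gib(15e9):.1f} GiB")  # prints "27.9 GiB"
```

Whether the full deployment fits on a given GPU also depends on context length and batch size, so treat this number as a lower bound.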

Quickstart guides

Building a RAG Workflow

Performance benchmarks

Model	AIME 2025	GPQA Diamond	HLE	LiveCodeBench	MATH500	SWE-bench Verified
Apriel-1.6-15B-Thinker	88.0%	73.0%	10.0%	81.0%	-	23.0%
Competitor closed-source models:
Claude Opus 4.6	-	90.5%	34.2%	-	-	78.7%
OpenAI o3	-	83.3%	24.9%	-	99.2%	62.3%
OpenAI o1	-	76.8%	-	-	96.4%	48.9%
GPT-4o	-	49.2%	2.7%	32.3%	89.3%	31.0%
("-" marks a score that is not reported.)

API usage

Endpoint: ServiceNow-AI/Apriel-1.6-15b-Thinker

cURL:

curl -X POST "https://api.together.xyz/v1/chat/completions" \
  -H "Authorization: Bearer $TOGETHER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "ServiceNow-AI/Apriel-1.6-15b-Thinker",
    "messages": [
      {
        "role": "user",
        "content": "What are some fun things to do in New York?"
      }
    ]
}'

Python:

from together import Together

# The client reads the API key from the TOGETHER_API_KEY environment variable.
client = Together()

response = client.chat.completions.create(
    model="ServiceNow-AI/Apriel-1.6-15b-Thinker",
    messages=[
        {
            "role": "user",
            "content": "What are some fun things to do in New York?"
        }
    ]
)
# Print the text of the first (and only) completion choice.
print(response.choices[0].message.content)

TypeScript:

import Together from 'together-ai';

// The client reads the API key from the TOGETHER_API_KEY environment variable.
const together = new Together();

const completion = await together.chat.completions.create({
  model: 'ServiceNow-AI/Apriel-1.6-15b-Thinker',
  messages: [
    {
      role: 'user',
      content: 'What are some fun things to do in New York?'
    }
  ],
});

// Print the text of the first (and only) completion choice.
console.log(completion.choices[0].message.content);
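
The requests above send text only. Since the model also accepts image input, a request with an image might look like the sketch below. It assumes the endpoint accepts OpenAI-style content parts (`text` and `image_url`) for this model, which is worth confirming in the provider docs. The helper names and the example URL are hypothetical placeholders.

```python
from typing import Any

MODEL = "ServiceNow-AI/Apriel-1.6-15b-Thinker"


def build_image_messages(prompt: str, image_url: str) -> list[dict[str, Any]]:
    """Build a single user turn holding a text part and an image part.

    Assumption: the endpoint accepts OpenAI-style content parts for this model.
    """
    return [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": prompt},
                {"type": "image_url", "image_url": {"url": image_url}},
            ],
        }
    ]


def ask_about_image(prompt: str, image_url: str) -> str:
    """Send the multimodal turn and return the text of the first choice."""
    from together import Together  # imported lazily; needs TOGETHER_API_KEY set

    client = Together()
    response = client.chat.completions.create(
        model=MODEL,
        messages=build_image_messages(prompt, image_url),
    )
    return response.choices[0].message.content
```

A call would then look like `ask_about_image("What trend does this chart show?", "https://example.com/chart.png")`, with the placeholder URL replaced by a reachable image.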

Model card
Architecture Overview:
• 15B parameter multimodal model supporting image-text-to-text reasoning with 131K context window for complex tasks.
• Built on continual pre-training across billions of tokens covering math, code, science, logical reasoning, and multimodal image-text data.
• Simplified chat template for easier output parsing: the reasoning steps come first, followed by a delimiter that marks the final response.
• Fits entirely on a single GPU, making it highly memory-efficient for deployment.
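
Because the output places reasoning steps before a final-response delimiter, client code can split the two with a simple string partition. The sketch below is a minimal illustration. The delimiter string used as the default is an assumption, not something this page specifies, so check the model's chat template for the exact marker. The function name is hypothetical.

```python
# Assumption: this marker is a placeholder; verify it against the chat template.
ASSUMED_DELIMITER = "[BEGIN FINAL RESPONSE]"


def split_reasoning(text: str, delimiter: str = ASSUMED_DELIMITER) -> tuple[str, str]:
    """Return (reasoning, final_response) from raw model output.

    If the delimiter is absent, the whole text is treated as the final response.
    """
    reasoning, found, final = text.partition(delimiter)
    if not found:
        return "", text.strip()
    return reasoning.strip(), final.strip()


raw = "First, add the numbers.\n[BEGIN FINAL RESPONSE]\nThe answer is 42."
reasoning, answer = split_reasoning(raw)
print(answer)  # prints "The answer is 42."
```

Passing the delimiter as a parameter keeps the parser usable if the template's marker differs from the assumed one.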

Training Methodology:
• Multi-stage training: continual pre-training, supervised fine-tuning (2.4M samples), and reinforcement learning optimization.
• Training data includes ~15% from the NVIDIA Nemotron collection for depth up-scaling and diverse domain coverage.
• The RL stage specifically optimizes reasoning efficiency: using fewer tokens, stopping earlier when confident, and giving direct answers to simple queries.
• A lightweight, incremental multimodal SFT stage follows the text-based supervised fine-tuning phase.

Performance Characteristics:
• Elite reasoning: 88% AIME 2025, 73% GPQA Diamond, 81% LiveCodeBench, 79% MMLU Pro.
• Strong instruction following: 69% IFBench, 83.34% Multi IF, 57.2% Agent IF.
• Enterprise-ready: 69% Tau2 Bench Telecom, 66.67% Tau2 Bench Retail, 58% Tau2 Bench Airline.
• Advanced function calling: 63.5% BFCL v3, 33.2% ComplexFuncBench.
• Multimodal excellence: 72% MMMU validation, 60.28% MMMU-PRO, 79.9% MathVista, 86.04% AI2D Test.
• Reduces reasoning token usage by 30%+ compared to Apriel-1.5 while maintaining or improving task performance.
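
To make the 30% figure concrete, the sketch below estimates the reasoning-token count after the stated reduction. It assumes the reduction applies uniformly to a workload, which is a simplification: the real saving varies by task. The function name and the 10,000-token baseline are illustrative.

```python
def estimated_reasoning_tokens(baseline_tokens: int, reduction: float = 0.30) -> int:
    """Estimate reasoning tokens after a fractional reduction from a baseline."""
    if not 0.0 <= reduction <= 1.0:
        raise ValueError("reduction must be between 0 and 1")
    return round(baseline_tokens * (1.0 - reduction))


# A task that took 10,000 reasoning tokens on Apriel-1.5 (hypothetical baseline)
print(estimated_reasoning_tokens(10_000))  # prints 7000
```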
Applications & use cases
Multimodal Reasoning:
• Visual question answering and complex image understanding tasks requiring deep reasoning.
• Mathematical problem solving from visual inputs including charts, diagrams, and equations.
• Document analysis combining text and visual elements for comprehensive understanding.

Code & Development:
• Code assistance and generation with logical reasoning and multi-step problem decomposition.
• Technical documentation understanding and creation with visual component support.
• Software development workflows requiring reasoning over code structure and logic.

Enterprise Applications:
• Telecom, retail, and airline domain-specific workflows with strong Tau2 Bench performance.
• Complex instruction following and function calling for business automation.
• Agent-based systems requiring reliable instruction adherence and multi-turn interactions.
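
For the function-calling use case, a business-automation loop typically declares tools and then dispatches the calls the model requests. The sketch below assumes an OpenAI-style `tools` schema and that tool-call arguments arrive as a JSON string; whether this endpoint supports tool calling for this model should be confirmed in the provider docs. `get_order_status`, `TOOLS`, `REGISTRY`, and `dispatch_tool_call` are all hypothetical names.

```python
import json
from typing import Any, Callable


def get_order_status(order_id: str) -> dict[str, str]:
    """Hypothetical business function; a real one would query a backend."""
    return {"order_id": order_id, "status": "shipped"}


# Assumed OpenAI-style tool declaration that would be passed with the request.
TOOLS = [
    {
        "type": "function",
        "function": {
            "name": "get_order_status",
            "description": "Look up the shipping status of an order.",
            "parameters": {
                "type": "object",
                "properties": {"order_id": {"type": "string"}},
                "required": ["order_id"],
            },
        },
    }
]

# Maps tool names to the local functions that implement them.
REGISTRY: dict[str, Callable[..., Any]] = {"get_order_status": get_order_status}


def dispatch_tool_call(name: str, arguments_json: str) -> str:
    """Run the named tool with JSON-encoded arguments and return a JSON result."""
    if name not in REGISTRY:
        raise KeyError(f"unknown tool: {name}")
    kwargs = json.loads(arguments_json)
    return json.dumps(REGISTRY[name](**kwargs))
```

The returned JSON string would then be sent back to the model as a tool message so it can compose the final answer.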

Knowledge & Question Answering:
• Information retrieval combining text and visual context for accurate responses.
• Scientific and technical question answering with reasoning transparency.
• Educational applications requiring step-by-step problem solving explanations.

Creative & General Purpose:
• Question answering across diverse domains with multimodal context.
• Logical reasoning tasks requiring systematic analysis and structured thinking.
• Real-world workflows where efficiency and single-GPU deployment are critical constraints.

Model specifications

Model provider: ServiceNow AI
Type: Reasoning, Vision, Chat
Main use cases: Reasoning, Vision
Deployment: Serverless, Monthly Reserved
Endpoint: ServiceNow-AI/Apriel-1.6-15b-Thinker
Parameters: 15B
Context length: 128K
Input modalities: Text, Image
Output modalities: Text
Released: November 28, 2025
Quantization level: BF16
External link: Provider docs
Category: Chat
