Apriel-1.6-15B-Thinker

Frontier-level multimodal reasoning in a compact, token-efficient model

About model

Apriel-1.6-15B-Thinker is an updated multimodal reasoning model from ServiceNow's Apriel SLM series. It scores 57 on the Artificial Analysis (AA) Intelligence Index, matching models like Qwen-235B-A22B and DeepSeek-v3.2 Exp while being 15x smaller. It uses 30% fewer reasoning tokens than its predecessor, Apriel-1.5, and delivers frontier-level performance on a single GPU.

  • AA Intelligence Index: 57 (matches Qwen-235B & DeepSeek-v3.2 Exp)
  • AIME 2025: 88% (elite mathematical reasoning)
  • Fewer Tokens: 30% (better reasoning efficiency vs. v1.5)

Model key capabilities
  • Frontier Performance: 57 on AA Index, matching models 15x its size
  • Token Efficiency: 30% fewer reasoning tokens than Apriel-1.5
  • Enterprise Ready: 69% Tau2 Bench Telecom, 69% IFBench
  • Single GPU: 15B parameters fit entirely on one GPU
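
The single-GPU claim can be sanity-checked with back-of-the-envelope arithmetic. The sketch below assumes BF16 weights (2 bytes per parameter, matching the quantization level listed for this model) and counts weights only. KV cache, activations, and framework overhead are not included, so real memory use will be higher.

```python
def weight_memory_gb(num_params: int, bytes_per_param: int = 2) -> float:
    """Approximate memory for model weights alone, in decimal gigabytes."""
    return num_params * bytes_per_param / 1e9


# 15B parameters, as stated on this page; 2 bytes per parameter for BF16.
apriel_params = 15_000_000_000
print(weight_memory_gb(apriel_params))  # 30.0
```

Roughly 30 GB of weights leaves headroom on an 80 GB accelerator for the KV cache. That GPU size is an illustrative assumption, not a requirement stated by the provider.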

Performance benchmarks

Model | AIME 2025 | GPQA Diamond | HLE | LiveCodeBench | MATH500 | SWE-bench verified

Apriel-1.6-15B-Thinker: 88.0%, 73.0%, 10.0%, 81.0%, 23.0%

Related open-source models

Competitor closed-source models

Claude Opus 4.6: 90.5%, 34.2%, 78.7%
OpenAI o3: 83.3%, 24.9%, 99.2%, 62.3%
OpenAI o1: 76.8%, 96.4%, 48.9%
GPT-4o: 49.2%, 2.7%, 32.3%, 89.3%, 31.0%

  • API usage

    Endpoint:

    ServiceNow-AI/Apriel-1.6-15b-Thinker

    cURL:

    curl -X POST "https://api.together.xyz/v1/chat/completions" \
      -H "Authorization: Bearer $TOGETHER_API_KEY" \
      -H "Content-Type: application/json" \
      -d '{
        "model": "ServiceNow-AI/Apriel-1.6-15b-Thinker",
        "messages": [
          {
            "role": "user",
            "content": "What are some fun things to do in New York?"
          }
        ]
      }'

    Python:

    from together import Together

    # The client reads the API key from the TOGETHER_API_KEY environment variable.
    client = Together()

    response = client.chat.completions.create(
      model="ServiceNow-AI/Apriel-1.6-15b-Thinker",
      messages=[
        {
          "role": "user",
          "content": "What are some fun things to do in New York?"
        }
      ]
    )
    print(response.choices[0].message.content)

    TypeScript:

    import Together from 'together-ai';

    // The client reads the API key from the TOGETHER_API_KEY environment variable.
    const together = new Together();

    const completion = await together.chat.completions.create({
      model: 'ServiceNow-AI/Apriel-1.6-15b-Thinker',
      messages: [
        {
          role: 'user',
          content: 'What are some fun things to do in New York?'
        }
      ],
    });

    console.log(completion.choices[0].message.content);

  • Model card

    Architecture Overview:
    • 15B-parameter multimodal model supporting image-text-to-text reasoning, with a 131K-token context window for complex tasks.
    • Built on continual pre-training across billions of tokens covering math, code, science, logical reasoning, and multimodal image-text data.
    • Simplified chat template for easier output parsing: the model emits its reasoning steps first, followed by a delimiter that marks the final response.
    • Fits entirely on a single GPU, making it highly memory-efficient for deployment.
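
Because the chat template places reasoning before a final-response delimiter, client code can separate the two with a simple string split. The following is a minimal sketch: the delimiter string used here is a hypothetical placeholder, and the actual marker should be taken from the official model card. `split_reasoning` is an illustrative helper, not part of any SDK.

```python
from typing import Tuple

# Hypothetical marker: the real delimiter is defined by the model's chat
# template and should be copied from the official model card.
FINAL_DELIMITER = "[BEGIN FINAL RESPONSE]"


def split_reasoning(output: str, delimiter: str = FINAL_DELIMITER) -> Tuple[str, str]:
    """Split raw model output into (reasoning, final_response).

    If the delimiter is absent, the whole output is treated as the final
    response and the reasoning is returned as an empty string.
    """
    reasoning, found, final = output.partition(delimiter)
    if not found:
        return "", output.strip()
    return reasoning.strip(), final.strip()


sample = "The user asks for 2 + 2. That is 4.\n[BEGIN FINAL RESPONSE]\n4"
reasoning, answer = split_reasoning(sample)
print(answer)  # 4
```

Falling back to the full text when no delimiter is found keeps the helper usable on short, direct answers that may skip the reasoning section.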

    Training Methodology:
    • Multi-stage training: continual pre-training, supervised fine-tuning (2.4M samples), and reinforcement learning optimization.
    • Training data includes ~15% from the NVIDIA Nemotron collection for depth up-scaling and diverse domain coverage.
    • The RL stage optimizes reasoning efficiency: the model learns to use fewer tokens, stop earlier when confident, and answer simple queries directly.
    • An incremental, lightweight multimodal SFT stage follows the text-based supervised fine-tuning phase.

    Performance Characteristics:
    • Elite reasoning: 88% AIME 2025, 73% GPQA Diamond, 81% LiveCodeBench, 79% MMLU Pro.
    • Strong instruction following: 69% IFBench, 83.34% Multi IF, 57.2% Agent IF.
    • Enterprise-ready: 69% Tau2 Bench Telecom, 66.67% Tau2 Bench Retail, 58% Tau2 Bench Airline.
    • Advanced function calling: 63.5% BFCL v3, 33.2% ComplexFuncBench.
    • Multimodal excellence: 72% MMMU validation, 60.28% MMMU-PRO, 79.9% MathVista, 86.04% AI2D Test.
    • Reduces reasoning token usage by 30%+ compared to Apriel-1.5 while maintaining or improving task performance.
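
To make the 30% figure concrete, the arithmetic below shows what the reduction means for a given reasoning budget. The baseline token count is a hypothetical number chosen for illustration, not a measured value for Apriel-1.5.

```python
def reduced_tokens(baseline_tokens: int, reduction_pct: int = 30) -> int:
    """Tokens remaining after a percentage reduction (integer arithmetic)."""
    return baseline_tokens * (100 - reduction_pct) // 100


baseline = 10_000  # hypothetical reasoning-token count for one Apriel-1.5 response
print(reduced_tokens(baseline))  # 7000
```

Since reasoning tokens are billed and generated like any other output tokens, a 30% reduction translates roughly into proportionally lower latency and cost for the reasoning portion of a response.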

  • Applications & use cases

    Multimodal Reasoning:
    • Visual question answering and complex image understanding tasks requiring deep reasoning.
    • Mathematical problem solving from visual inputs including charts, diagrams, and equations.
    • Document analysis combining text and visual elements for comprehensive understanding.
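
A visual question answering request might be assembled as sketched below. This assumes the endpoint accepts OpenAI-style content parts (`text` and `image_url`) for image input, which this page does not confirm. `build_image_question` is an illustrative helper and the image URL is a placeholder.

```python
from typing import Any, Dict, List

MODEL_ID = "ServiceNow-AI/Apriel-1.6-15b-Thinker"


def build_image_question(question: str, image_url: str) -> List[Dict[str, Any]]:
    """Build a single-turn `messages` list pairing a text question with one image."""
    return [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": question},
                {"type": "image_url", "image_url": {"url": image_url}},
            ],
        }
    ]


messages = build_image_question(
    "What trend does this chart show?",
    "https://example.com/chart.png",  # placeholder URL
)
# With a Together client, this list would be passed as:
# client.chat.completions.create(model=MODEL_ID, messages=messages)
```

Keeping payload construction separate from the network call makes the request shape easy to inspect and unit-test before any tokens are spent.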

    Code & Development:
    • Code assistance and generation with logical reasoning and multi-step problem decomposition.
    • Technical documentation understanding and creation with visual component support.
    • Software development workflows requiring reasoning over code structure and logic.

    Enterprise Applications:
    • Telecom, retail, and airline domain-specific workflows with strong Tau2 Bench performance.
    • Complex instruction following and function calling for business automation.
    • Agent-based systems requiring reliable instruction adherence and multi-turn interactions.
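
Function calling for business automation usually follows a declare-then-dispatch pattern, sketched below. The tool schema uses the OpenAI-style `tools` format; whether this endpoint accepts it is an assumption. The `get_order_status` tool, the registry, and the simulated tool call are all hypothetical and exist only for illustration.

```python
import json
from typing import Any, Callable, Dict


# Hypothetical business function, invented for this example.
def get_order_status(order_id: str) -> Dict[str, str]:
    return {"order_id": order_id, "status": "shipped"}


# Tool declaration that would be sent alongside the chat messages.
TOOLS = [
    {
        "type": "function",
        "function": {
            "name": "get_order_status",
            "description": "Look up the shipping status of an order.",
            "parameters": {
                "type": "object",
                "properties": {"order_id": {"type": "string"}},
                "required": ["order_id"],
            },
        },
    }
]

# Maps tool names to the local callables that implement them.
REGISTRY: Dict[str, Callable[..., Any]] = {"get_order_status": get_order_status}


def dispatch(tool_call: Dict[str, Any]) -> Any:
    """Run the local function named in an OpenAI-style tool call."""
    fn = tool_call["function"]
    args = json.loads(fn["arguments"])  # arguments arrive as a JSON string
    return REGISTRY[fn["name"]](**args)


# Simulated tool call, shaped like what a model might return.
call = {"function": {"name": "get_order_status", "arguments": '{"order_id": "A123"}'}}
result = dispatch(call)
print(result["status"])  # shipped
```

In an agent loop, the result would be serialized and appended to the conversation as a tool message so the model can produce its final answer.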

    Knowledge & Question Answering:
    • Information retrieval combining text and visual context for accurate responses.
    • Scientific and technical question answering with reasoning transparency.
    • Educational applications requiring step-by-step problem solving explanations.

    Creative & General Purpose:
    • Question answering across diverse domains with multimodal context.
    • Logical reasoning tasks requiring systematic analysis and structured thinking.
    • Real-world workflows where efficiency and single-GPU deployment are critical constraints.

  • Model provider
    ServiceNow AI
  • Type
    Reasoning
    Vision
    Chat
  • Main use cases
    Reasoning
    Vision
  • Deployment
    Serverless
    Monthly Reserved
  • Parameters
    15B
  • Context length
    128K
  • Input modalities
    Text
    Image
  • Output modalities
    Text
  • Released
    November 28, 2025
  • Quantization level
    BF16
  • Category
    Chat