Chat

NIM Llama 3.1 Nemotron 70B Instruct

NVIDIA NIM for GPU accelerated Llama 3.1 Nemotron 70B Instruct inference through OpenAI compatible APIs.

About model

NVIDIA's Llama-3.1-Nemotron-70B-Instruct fine-tunes for alignment and helpfulness, providing accurate and informative responses. It specializes in generating human-like text based on user input. Suitable for developers and researchers requiring advanced language understanding capabilities.

To run this model, you first need to deploy it on a Dedicated Endpoint.

Related models

Model specifications

Model data

Model provider
Meta
Type
Chat
Deployment
On-Demand Dedicated
Monthly Reserved
Parameters
70B
Context length
128K
Input modalities
Text
Output modalities
Text

Released
September 30, 2024
Last updated
August 26, 2025
External link
Provider docs
Category
Chat

Quickstart docs

Deploy model