NIM Llama 3.2 11B Vision Instruct
NVIDIA NIM for GPU-accelerated Llama 3.2 11B Vision Instruct inference through OpenAI-compatible APIs.
About the model
NIM Llama 3.2 11B Vision Instruct processes multimodal inputs, combining text and vision capabilities. It excels at tasks that require both language understanding and visual context, making it suitable for developers and researchers who need advanced multimodal processing.
To run this model, you first need to deploy it on a Dedicated Endpoint.
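Once the model is deployed on a Dedicated Endpoint, it can be queried like any OpenAI-compatible chat completions service. The sketch below uses only the Python standard library; the endpoint URL, API key, model id, and image URL are placeholders, not values from this page — substitute the ones from your own deployment.

```python
# Minimal sketch of calling a deployed NIM Llama 3.2 11B Vision Instruct
# endpoint through its OpenAI-compatible chat completions API.
# base_url, api_key, model id, and image URL below are placeholders.
import json
import urllib.request


def build_payload(model: str, prompt: str, image_url: str) -> dict:
    """Build an OpenAI-style multimodal chat request: text plus one image."""
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
    }


def query(base_url: str, api_key: str, payload: dict) -> str:
    """POST the request to the endpoint's /chat/completions route."""
    req = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]


payload = build_payload(
    "meta-llama/Llama-3.2-11B-Vision-Instruct",  # assumed model id
    "Describe this image in one sentence.",
    "https://example.com/photo.jpg",             # placeholder image URL
)
# Uncomment with your Dedicated Endpoint's URL and key:
# print(query("https://your-endpoint.example/v1", "YOUR_API_KEY", payload))
```

Because the API is OpenAI-compatible, the official `openai` Python client works the same way: point its `base_url` at the Dedicated Endpoint and pass the same `messages` structure.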
- Type: Vision
- Main use cases: Small & Fast, Vision
- Deployment: On-Demand Dedicated, Monthly Reserved
- Parameters: 11B
- Context length: 128K
- Input modalities: Text, Image
- Output modalities: Text
- Released: September 24, 2024
- Last updated: August 26, 2025
- Category: Vision