Llama Guard 4 12B API
Multimodal safety model derived from Llama 4 Scout that classifies text and images in LLM prompts and responses as safe or unsafe.

To use this moderation model, please follow the instructions from our blog post.
Llama Guard 4 12B API Usage
How to use Llama Guard 4 12B
1. Use Llama Guard as a standalone classifier
Use this code snippet in your command line to run inference on Llama Guard 4 12B:
curl -v -X POST https://api.together.xyz/inference \
  -H 'Content-Type: application/json' \
  -H "Authorization: Bearer $TOGETHER_API_KEY" \
  -d '{
    "model": "meta-llama/Llama-Guard-4-12B",
    "messages": [ { "role": "user", "content": "a horse is a horse" } ]
  }'
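The request above returns the classifier's verdict as generated text. A minimal sketch of interpreting that text follows, assuming it uses the usual Llama Guard output format: `safe`, or `unsafe` followed by a line of comma-separated hazard category codes such as `S1`. The function name `parse_guard_verdict` is hypothetical, and extracting the text from the JSON response is left out because the response schema is not shown on this page.

```python
def parse_guard_verdict(text: str) -> tuple[bool, list[str]]:
    """Return (is_safe, violated_categories) from Llama Guard's raw text output.

    Assumption: the first non-empty line is "safe" or "unsafe", and an
    "unsafe" verdict may be followed by one line of comma-separated codes.
    """
    # Drop blank lines and surrounding whitespace the model may emit.
    lines = [line.strip() for line in text.strip().splitlines() if line.strip()]
    if not lines:
        raise ValueError("empty classifier output")

    verdict = lines[0].lower()
    if verdict == "safe":
        return True, []
    if verdict == "unsafe":
        categories: list[str] = []
        if len(lines) > 1:
            categories = [c.strip() for c in lines[1].split(",") if c.strip()]
        return False, categories

    # Anything else is treated as an error rather than silently allowed.
    raise ValueError(f"unrecognized verdict: {lines[0]!r}")
```

Raising on unrecognized output, rather than defaulting to "safe", keeps a malformed reply from being mistaken for an approval.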
2. Use Llama Guard as a filter to safeguard responses from 200+ models
Use this code snippet in your command line to run inference on any of our 200+ models with Llama Guard as a filter (the only change is adding the safety_model parameter):
curl -v -X POST https://api.together.xyz/inference \
  -H 'Content-Type: application/json' \
  -H "Authorization: Bearer $TOGETHER_API_KEY" \
  -d '{
    "model": "meta-llama/Llama-4-Scout-17B-16E-Instruct",
    "max_tokens": 32,
    "prompt": "a horse is a horse",
    "temperature": 0.1,
    "safety_model": "meta-llama/Llama-Guard-4-12B"
  }'
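Because the only change is the `safety_model` field, the filter can be layered onto an existing request body. The sketch below builds the same JSON body as the curl command, assuming the body is an ordinary JSON object; `with_safety_model` and `GUARD_MODEL` are hypothetical names for illustration and not part of any SDK. Sending the request is omitted.

```python
import json

# Model identifier taken from the curl example above.
GUARD_MODEL = "meta-llama/Llama-Guard-4-12B"


def with_safety_model(payload: dict, safety_model: str = GUARD_MODEL) -> dict:
    """Return a copy of a request body with the safety_model field added."""
    guarded = dict(payload)  # shallow copy so the caller's dict is not mutated
    guarded["safety_model"] = safety_model
    return guarded


base_request = {
    "model": "meta-llama/Llama-4-Scout-17B-16E-Instruct",
    "max_tokens": 32,
    "prompt": "a horse is a horse",
    "temperature": 0.1,
}

request_body = with_safety_model(base_request)

# Serialize to the JSON string that would be passed as the request body.
print(json.dumps(request_body, indent=2))
```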
Model details
Model Provider: Meta
Type: Moderation
Variant:
Parameters: 12B
Deployment: Serverless, On-Demand Dedicated, Monthly Reserved
Quantization:
Context length: 1M
Pricing: $0.20
Looking for production scale? Deploy on a dedicated endpoint
Deploy Llama Guard 4 12B on a dedicated endpoint with custom hardware configuration, as many instances as you need, and auto-scaling.
