Llama Guard 4 12B API
A multimodal safety model derived from Llama 4 Scout that classifies text and images in LLM prompts and responses as safe or unsafe.

Llama Guard 4 12B API Usage
Endpoint
cURL
curl -X POST "https://api.together.xyz/v1/chat/completions" \
  -H "Authorization: Bearer $TOGETHER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "meta-llama/Llama-Guard-4-12B",
    "messages": [
      {
        "role": "user",
        "content": "What are some fun things to do in New York?"
      }
    ]
  }'
Python
from together import Together

# The client reads the API key from the TOGETHER_API_KEY environment variable.
client = Together()

response = client.chat.completions.create(
    model="meta-llama/Llama-Guard-4-12B",
    messages=[
        {
            "role": "user",
            "content": "What are some fun things to do in New York?"
        }
    ]
)

# Print Llama Guard's verdict for the message.
print(response.choices[0].message.content)
TypeScript
import Together from "together-ai";

// The client reads the API key from the TOGETHER_API_KEY environment variable.
const together = new Together();

const response = await together.chat.completions.create({
  messages: [
    {
      role: "user",
      content: "What are some fun things to do in New York?"
    }
  ],
  model: "meta-llama/Llama-Guard-4-12B"
});

// Print Llama Guard's verdict for the message.
console.log(response.choices[0].message.content);
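The examples above classify a single user prompt. Since the model is described as covering both prompts and responses, the sketch below shows one way to build a request for either case. It assumes the usual Llama Guard convention that the last turn in the conversation is the one being judged, which this page does not confirm; build_guard_request is a hypothetical helper, not part of the Together SDK.

```python
from typing import Optional

# Model ID taken from the examples on this page.
GUARD_MODEL = "meta-llama/Llama-Guard-4-12B"


def build_guard_request(user_prompt: str, assistant_reply: Optional[str] = None) -> dict:
    """Build a chat-completions payload for Llama Guard (hypothetical helper).

    With only user_prompt, the prompt itself is classified. When assistant_reply
    is given, the conversation ends on the assistant turn, so the reply is
    assumed to be the turn that gets judged.
    """
    messages = [{"role": "user", "content": user_prompt}]
    if assistant_reply is not None:
        messages.append({"role": "assistant", "content": assistant_reply})
    return {"model": GUARD_MODEL, "messages": messages}
```

The resulting dictionary can be passed to the Python client shown above as client.chat.completions.create(**build_guard_request(prompt, reply)).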
Model Provider: Meta
Type: Moderation
Parameters: 12B
Deployment: ✔️ Serverless
Context length: 1M
Pricing: $0.20
How to use Llama Guard 4 12B
1. Use Llama Guard as a standalone classifier
Run this command in your terminal to classify a message with Llama Guard 4 12B:
curl -v -X POST https://api.together.xyz/inference \
  -H 'Content-Type: application/json' \
  -H "Authorization: Bearer $TOGETHER_API_KEY" \
  -d '{
    "model": "meta-llama/Llama-Guard-4-12B",
    "messages": [ { "role": "user", "content": "a horse is a horse" } ]
  }'
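When Llama Guard is used as a standalone classifier, the calling code still has to interpret the verdict text. This page does not show the response body, so the sketch below assumes the format described for the Llama Guard family: a first line reading safe or unsafe, and for unsafe results a second line of comma-separated category codes such as S1. GuardVerdict and parse_guard_output are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class GuardVerdict:
    safe: bool
    categories: List[str] = field(default_factory=list)


def parse_guard_output(text: str) -> GuardVerdict:
    """Parse Llama Guard's text verdict (assumed format, see lead-in)."""
    # Drop blank lines and surrounding whitespace before inspecting the label.
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    if not lines:
        raise ValueError("empty Llama Guard output")

    label = lines[0].lower()
    if label == "safe":
        return GuardVerdict(safe=True)
    if label == "unsafe":
        categories: List[str] = []
        if len(lines) > 1:
            # Second line is assumed to hold codes like "S1,S10".
            categories = [c.strip() for c in lines[1].split(",") if c.strip()]
        return GuardVerdict(safe=False, categories=categories)

    # Anything else is treated as an error rather than silently passed as safe.
    raise ValueError(f"unexpected Llama Guard output: {text!r}")
```

Raising on unrecognized output is a deliberate choice in this sketch: a moderation check that cannot be parsed should fail closed rather than let the content through.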
2. Use Llama Guard as a filter to safeguard responses from 200+ models
Run this command in your terminal to query any of our 200+ models with Llama Guard as a safety filter; the only change to the request is the added safety_model parameter:
curl -v -X POST https://api.together.xyz/inference \
  -H 'Content-Type: application/json' \
  -H "Authorization: Bearer $TOGETHER_API_KEY" \
  -d '{
    "model": "meta-llama/Llama-4-Scout-17B-16E-Instruct",
    "max_tokens": 32,
    "prompt": "a horse is a horse",
    "temperature": 0.1,
    "safety_model": "meta-llama/Llama-Guard-4-12B"
  }'
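Because the filter is enabled by a single extra field, an existing request payload can be wrapped rather than rewritten. The sketch below mirrors the curl command above using only the Python standard library; with_safety_model and post_inference are hypothetical helpers, and the URL, headers, and field names are copied from that command rather than from a separate API reference.

```python
import copy
import json
import os
import urllib.request

INFERENCE_URL = "https://api.together.xyz/inference"  # URL from the curl example


def with_safety_model(payload: dict, safety_model: str = "meta-llama/Llama-Guard-4-12B") -> dict:
    """Return a copy of payload with the safety_model field added."""
    guarded = copy.deepcopy(payload)
    guarded["safety_model"] = safety_model
    return guarded


def post_inference(payload: dict, url: str = INFERENCE_URL) -> dict:
    """POST the payload as JSON and return the decoded response body."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {os.environ['TOGETHER_API_KEY']}",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Same request body as the curl example, minus the safety_model field.
base_request = {
    "model": "meta-llama/Llama-4-Scout-17B-16E-Instruct",
    "max_tokens": 32,
    "prompt": "a horse is a horse",
    "temperature": 0.1,
}
guarded_request = with_safety_model(base_request)
```

Calling post_inference(guarded_request) would send the filtered request; the original base_request is left unchanged, so the same payload can be sent with or without the filter.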
Looking for production scale? Deploy on a dedicated endpoint
Deploy Llama Guard 4 12B on a dedicated endpoint with custom hardware configuration, as many instances as you need, and auto-scaling.
