Models / Meta
Moderation

Llama Guard 4 12B

Multimodal safety model derived from Llama 4 Scout that classifies text and images in LLM prompts and responses as safe or unsafe.

About model

Llama Guard 4 12B detects and mitigates harmful content, providing a safe environment for users. Its key strength is accurately flagging unsafe or sensitive content across both text and image inputs. It is designed for developers and organizations that require robust content moderation.

To use this moderation model, follow the instructions in our blog post.

  • API usage

    • cURL
    • Python
    • TypeScript

    Endpoint:

    meta-llama/Llama-Guard-4-12B

    curl -X POST https://api.together.xyz/v1/completions \
      -H "Content-Type: application/json" \
      -H "Authorization: Bearer $TOGETHER_API_KEY" \
      -d '{
        "model": "meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8",
        "prompt": "A horse is a horse",
        "max_tokens": 32,
        "temperature": 0.1,
        "safety_model": "meta-llama/Llama-Guard-4-12B"
      }'
    
    from together import Together
    
    client = Together()
    
    response = client.completions.create(
      model="meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8",
      prompt="A horse is a horse",
      max_tokens=32,
      temperature=0.1,
      safety_model="meta-llama/Llama-Guard-4-12B",
    )
    
    print(response.choices[0].text)
    
    
    import Together from "together-ai";
    
    const together = new Together();
    
    async function main() {
      const response = await together.completions.create({
        model: "meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8",
        prompt: "A horse is a horse",
        max_tokens: 32,
        temperature: 0.1,
        safety_model: "meta-llama/Llama-Guard-4-12B"
      });
      
      console.log(response.choices[0]?.text);
    }
    
    main();
    
    
  • How to use model

    1. Use Llama Guard as a standalone classifier

    Use this code snippet in your command line to run inference on Llama Guard 4 12B:

        
          curl -X POST https://api.together.xyz/v1/chat/completions \
            -H 'Content-Type: application/json' \
            -H "Authorization: Bearer $TOGETHER_API_KEY" \
            -d '{
            "model": "meta-llama/Llama-Guard-4-12B",
            "messages": [ { "role": "user", "content": "a horse is a horse" } ]
          }'
        
    

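Llama Guard replies with the word `safe`, or with `unsafe` followed by the violated hazard-category codes (e.g. `S1`) on the next line. A minimal sketch of parsing that verdict, assuming this two-line output format (the helper name `parse_guard_verdict` is illustrative, not part of the API):

```python
def parse_guard_verdict(text: str) -> tuple[bool, list[str]]:
    """Parse a Llama Guard classification into (is_safe, categories).

    Assumes the model returns "safe", or "unsafe" followed by a
    comma-separated list of hazard codes (e.g. "S1") on the next line.
    """
    lines = [line.strip() for line in text.strip().splitlines() if line.strip()]
    if not lines or lines[0].lower() == "safe":
        return True, []
    # Second line, when present, lists the violated category codes.
    categories = lines[1].split(",") if len(lines) > 1 else []
    return False, [c.strip() for c in categories]
```

You can feed the classifier's raw completion text straight into this helper and branch on the boolean before showing content to users.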
    2. Use Llama Guard as a filter to safeguard responses from 200+ models

    Use this code snippet in your command line to run inference on any of our 200+ models together with Llama Guard (the only change from a standard request is the added safety_model parameter):

        
          curl -X POST https://api.together.xyz/v1/completions \
            -H 'Content-Type: application/json' \
            -H "Authorization: Bearer $TOGETHER_API_KEY" \
            -d '{
            "model": "meta-llama/Llama-4-Scout-17B-16E-Instruct",
            "prompt": "a horse is a horse",
            "max_tokens": 32,
            "temperature": 0.1,
            "safety_model": "meta-llama/Llama-Guard-4-12B"
          }'
        
    
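Because Llama Guard 4 also classifies images, a multimodal request is possible as well. A hedged sketch of what such a request body might look like, assuming the OpenAI-style `image_url` content-part format for chat messages (the prompt text and image URL below are placeholders):

```python
import json

# Hypothetical request body for a multimodal Llama Guard classification.
# Assumes the OpenAI-style content-part schema; the image URL is a placeholder.
payload = {
    "model": "meta-llama/Llama-Guard-4-12B",
    "messages": [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Is this image safe to show?"},
                {
                    "type": "image_url",
                    "image_url": {"url": "https://example.com/photo.jpg"},
                },
            ],
        }
    ],
}

# Serialized body, ready to POST to the chat completions endpoint
# with your Authorization header.
body = json.dumps(payload)
```

The serialized `body` would be sent as the JSON payload of the classification request, in the same way as the text-only cURL examples above.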
Model details
  • Model provider
    Meta
  • Type
    Moderation
  • Main use cases
    Moderation
  • Deployment
    Serverless
    On-Demand Dedicated
    Monthly Reserved
  • Parameters
    12B
  • Context length
    1M
  • Input price

    $0.20 / 1M tokens

  • Output price

    $0.20 / 1M tokens

  • Input modalities
    Text
    Image
  • Output modalities
    Text
  • Released
    April 23, 2025
  • Category
    Moderation