Filter responses of any model with Llama Guard or your own safety model

December 10, 2023



We’re excited to announce we’ve partnered with Meta to make Llama Guard available through the Together Platform.

You can now use Llama Guard, an LLM-based input-output safeguard model, through the Together API. From day one, the Together API was designed with safety in mind: it natively supports optional safety models that moderate any open model it hosts. You can now use Llama Guard with all 100+ available open models by adding one parameter to your API call (safety_model=Meta-Llama/Llama-Guard-7b). You can also run Llama Guard as a model on its own.

Llama Guard is an openly available model that performs competitively on common open benchmarks and provides developers with a pretrained model to help defend against generating potentially risky outputs.

We are committed to transparent and safe science, and are thrilled to help build towards open trust and safety in the new world of generative AI.

We are also working closely with the Meta team to allow end users to benchmark their models against Meta's Cybersecurity Evals.

1. Use Llama Guard as a standalone classifier

Use this code snippet in your command line to run inference on Llama-Guard-7b:

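# Set TOGETHER_API_URL to the Together API inference endpoint given in the API docs.
curl -v -X POST "$TOGETHER_API_URL" \
  -H 'Content-Type: application/json' \
  -H "Authorization: Bearer $TOGETHER_API_KEY" \
  -d '{
  "model": "Meta-Llama/Llama-Guard-7b",
  "messages": [ { "role": "user", "content": "a horse is a horse" } ]
}'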

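When Llama Guard runs as a standalone classifier, its reply is a short text verdict rather than free-form prose. The sketch below assumes the output format described in the Llama Guard model card: a first line reading safe or unsafe, and, for unsafe content, a second line listing the violated category codes separated by commas. The names Verdict and parse_verdict are hypothetical helpers, not part of the Together API, and extracting the reply text from the API response is left out because this post does not show the response format.

```python
from dataclasses import dataclass, field


@dataclass
class Verdict:
    """Structured form of a Llama Guard text verdict (hypothetical helper type)."""
    safe: bool
    categories: list[str] = field(default_factory=list)


def parse_verdict(text: str) -> Verdict:
    """Parse Llama Guard output, assumed to be 'safe' or 'unsafe' plus a category line."""
    # Drop blank lines and surrounding whitespace before inspecting the verdict.
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    if not lines:
        raise ValueError("empty Llama Guard output")

    label = lines[0].lower()
    if label == "safe":
        return Verdict(safe=True)
    if label == "unsafe":
        # The second line, when present, holds comma-separated category codes.
        categories = []
        if len(lines) > 1:
            categories = [c.strip() for c in lines[1].split(",") if c.strip()]
        return Verdict(safe=False, categories=categories)

    # Anything else does not match the assumed format, so fail loudly.
    raise ValueError(f"unrecognized verdict: {lines[0]!r}")
```

Raising on unrecognized output, instead of defaulting to safe, keeps a malformed reply from silently passing moderation.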
2. Use Llama Guard as a filter to safeguard responses from 100+ models

Use this code snippet in your command line to run inference on any of our 100+ models together with Llama Guard (the only change is adding the safety_model parameter):

# Set TOGETHER_API_URL to the Together API inference endpoint given in the API docs.
curl -v -X POST "$TOGETHER_API_URL" \
  -H 'Content-Type: application/json' \
  -H "Authorization: Bearer $TOGETHER_API_KEY" \
  -d '{
  "model": "togethercomputer/llama-2-13b-chat",
  "max_tokens": 32,
  "prompt": "a horse is a horse",
  "temperature": 0.1,
  "safety_model": "Meta-Llama/Llama-Guard-7b"
}'

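Because the filter is enabled by a single extra field, adding it to an existing request body can be a one-line change in application code. The following is a minimal sketch that only builds the JSON body from the curl example above and sends nothing over the network. The helper with_safety_model and the constant LLAMA_GUARD are hypothetical names introduced for illustration.

```python
import json

# Model identifier taken from the safety_model parameter shown in this post.
LLAMA_GUARD = "Meta-Llama/Llama-Guard-7b"


def with_safety_model(payload: dict, safety_model: str = LLAMA_GUARD) -> dict:
    """Return a copy of a request payload with the safety_model field added."""
    guarded = dict(payload)  # shallow copy so the caller's payload is not mutated
    guarded["safety_model"] = safety_model
    return guarded


# The same request as the curl example, before the safety filter is added.
request = {
    "model": "togethercomputer/llama-2-13b-chat",
    "max_tokens": 32,
    "prompt": "a horse is a horse",
    "temperature": 0.1,
}

# Serialize the guarded request; this string is what would go in the POST body.
body = json.dumps(with_safety_model(request), indent=2)
print(body)
```

Returning a copy rather than editing the payload in place makes it easy to send the same request with and without the filter for comparison.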
Llama Guard is also available in our playground.

