How to create an AI Agent

AI agents are specialized models that follow your own moderation guidelines. You can create an agent in minutes and use it just like any other model in your projects. To create one, you describe your content policies, and the agent classifies content according to those instructions.

How it works

Define purpose

Give your agent a name, describe its job, and choose which LLM to use.

Add your rules

Add content policies that the agent should follow. You can select from a list of pre-defined rules, or create your own.

Use your AI agent

Add the agent to classify your content in your projects, just like any other model.
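
To make the first two steps concrete, here is a minimal sketch of what an agent definition holds. It is only an illustration: the `AgentDefinition` class, its field names, and the `add_rule` helper are hypothetical and are not part of the dashboard or any documented API. The two model names are the ones offered on this page.

```python
from dataclasses import dataclass, field
from typing import List

# The two LLMs this page lists as available for AI agents.
SUPPORTED_MODELS = ("Llama Guard 3", "GPT-4o-mini")


@dataclass
class AgentDefinition:
    """Hypothetical container mirroring the fields you fill in when creating an agent."""
    name: str                      # label only, not used for moderation
    description: str               # label only, not used for moderation
    model: str = "Llama Guard 3"   # the recommended default
    rules: List[str] = field(default_factory=list)  # your content policies

    def __post_init__(self) -> None:
        if self.model not in SUPPORTED_MODELS:
            raise ValueError(f"Unsupported model: {self.model!r}")

    def add_rule(self, rule: str) -> None:
        """Append a plain-language content policy, rejecting empty input."""
        rule = rule.strip()
        if not rule:
            raise ValueError("A rule must not be empty")
        self.rules.append(rule)


# Example: define the purpose, then add one custom rule.
agent = AgentDefinition(
    name="Marketplace listings",
    description="Flags prohibited items in listings",
)
agent.add_rule("Do not allow listings that offer weapons for sale.")
```

In the product itself these values are entered through the Model Studio form rather than in code.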

Need help getting started?

If you need help creating AI agents, send us a message any time. We’re happy to help you set up your first agent.

Get started

Head over to the Model Studio in your dashboard and press the “Add new agent” button. Here you’ll give your agent a name and describe its job. These fields are not used for moderation, but they help you identify the agent’s purpose. Next, choose which LLM should power your agent. We recommend Llama Guard 3 because it’s optimized for moderation, but you can also use GPT-4o-mini.

Choose between Llama Guard and GPT

We currently offer two LLMs for AI agents:

Llama Guard 3

An LLM optimized for moderation. Recommended for most use cases.

GPT-4o-mini

A small general-purpose model from OpenAI. Recommended if you need to moderate something that is not covered by Llama Guard 3.

Note: if you would like to use a different LLM, send us a message and we’ll see what we can do.

Benefits of Llama Guard 3 compared to GPT-4o-mini:

Faster for real-time applications
More accurate within its 14 built-in safety categories
Hosted on GDPR-compliant GPU servers in Germany
Can be deployed on-premises for enterprise customers
Can be fine-tuned for enterprise customers
Returns probability scores instead of binary flags
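
The last benefit is worth illustrating: with probability scores you choose your own cut-off per category instead of accepting a fixed yes/no. The sketch below assumes a hypothetical result shape, a mapping from category name to a probability between 0 and 1. The category names, the `flag_categories` helper, and the default threshold are illustrative assumptions, not the documented response format.

```python
from typing import Dict, List

DEFAULT_THRESHOLD = 0.5  # assumed default; tune it for your own content


def flag_categories(scores: Dict[str, float],
                    thresholds: Dict[str, float]) -> List[str]:
    """Return, sorted by name, the categories whose score meets its threshold."""
    flagged = []
    for category, score in scores.items():
        # Fall back to the default when no per-category threshold is given.
        limit = thresholds.get(category, DEFAULT_THRESHOLD)
        if score >= limit:
            flagged.append(category)
    return sorted(flagged)


# Hypothetical scores for one piece of content.
example_scores = {"hate": 0.82, "violence": 0.35, "self_harm": 0.61}

# A stricter policy lowers the bar for violence.
strict = flag_categories(example_scores, {"violence": 0.3})

# A more lenient policy raises the bar for hate and self-harm.
lenient = flag_categories(example_scores, {"hate": 0.9, "self_harm": 0.7})
```

With a binary flag the same content would get one fixed verdict; here the strict policy flags all three categories and the lenient one flags none.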

When to use GPT-4o-mini over Llama Guard 3:

For custom safety categories that fall far outside the scope of the MLCommons safety categories
If you prefer false positives over false negatives
When speed is not crucial
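
The two lists above can be read as a simple decision rule. The sketch below is one possible interpretation, not an official recommendation engine: the `recommend_model` function and its parameters are hypothetical, and the way the three criteria are combined is an assumption.

```python
def recommend_model(needs_custom_categories: bool,
                    prefers_false_positives: bool,
                    latency_sensitive: bool) -> str:
    """Suggest a model name based on the criteria listed above (assumed logic)."""
    # Categories far outside the MLCommons scope are the main reason to switch.
    if needs_custom_categories:
        return "GPT-4o-mini"
    # Over-flagging is acceptable and speed is not crucial.
    if prefers_false_positives and not latency_sensitive:
        return "GPT-4o-mini"
    # Otherwise use the default recommended for most use cases.
    return "Llama Guard 3"


# A real-time chat filter working within the standard categories:
chat_choice = recommend_model(False, False, True)

# A batch review queue for a niche, custom policy:
batch_choice = recommend_model(True, True, False)
```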

You can read more about Llama Guard 3 here and GPT-4o-mini here.
