Analyze content
To analyze content, send a POST request with the content to the moderation endpoint. You will receive a response with the results right away.
Remember to create a project first and configure the models you want to use to analyze your content.
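As a minimal sketch, a text moderation request could look like the following. The endpoint URL, the `Authorization` header format, and the `value` field name are assumptions here; use the exact values from your project's API reference.

```typescript
// Minimal sketch of a text moderation request.
// The endpoint URL, auth header, and `value` field are assumptions;
// check your project's API reference for the exact values.
const response = await fetch("https://moderationapi.com/api/v1/moderate/text", {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    Authorization: `Bearer ${process.env.MODERATION_API_KEY}`,
  },
  body: JSON.stringify({ value: "Hello world!" }),
});

const result = await response.json();
console.log(result);
```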
Content types
We provide 5 different endpoints for analyzing content:
- Text moderation
- Image moderation
- Object moderation
- Audio moderation (enterprise)
- Video moderation (enterprise)
Acting on the response
It’s up to you how you want to act on the result of the moderation.
- You could block the content and return an error message to your user if it gets flagged.
- You could store it to your database and let a human review it (e.g. use our review queues).
- Or do something in between, where you only send content for human review when the AI is unsure about it.
Here we look at which fields you can use to act on the response.
Utilize flagged field
The easiest way to get started is to use the `flagged` field. The `flagged` field is a boolean that indicates whether any of the models detected something non-neutral. You can adjust the thresholds for flagging content in the project settings. See flagging thresholds.
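For example, a minimal sketch of acting on the `flagged` field, assuming the response has been parsed into a `result` object as in the request example above:

```typescript
// Block flagged content outright and tell the user.
if (result.flagged) {
  // Alternatively, store the content and send it to a review queue instead.
  throw new Error("This content is not allowed.");
}
```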
Utilize model scores
You can also utilize the model scores to have more granular control over the content moderation. For example, if you want to flag content that has a negative sentiment score of more than 0.9, you can do the following:
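A sketch of that check. The exact shape of the per-model scores in the response is an assumption here; adapt the field path to the actual response:

```typescript
// Hypothetical response shape: each model reports scores between 0 and 1.
const negativeSentiment = result.sentiment?.scores?.negative ?? 0;

if (negativeSentiment > 0.9) {
  // Treat strongly negative content as flagged, even if `flagged` is false.
  await sendToReviewQueue(result); // sendToReviewQueue is your own helper
}
```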
Saving matched entities
You can also save the matched entities to your database. For example, if you want to save the email address that was detected in the content, you can do the following:
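A sketch of saving a detected email address. The `email.matches` path and the `db` object are placeholders for the actual response shape and your own storage layer:

```typescript
// Hypothetical response shape: the email model returns the matched strings.
const emailMatches: string[] = result.email?.matches ?? [];

if (emailMatches.length > 0) {
  // `db` is a placeholder for your own database client.
  await db.contentEntities.insert({
    contentId: result.id,
    emails: emailMatches,
  });
}
```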
Detecting look-alike characters
Spammers might try to get around your moderation by spoofing text with look-alike characters. For example, they might write 🅼🅾🅽🅴🆈 instead of `money`, or use an even more subtle replacement such as hidden spaces or very similar-looking letters.
Moderation API detects and normalizes look-alike characters before analyzing content. You can find the normalized text in the `content` field of the response, and the original text in the `original` field. Additionally, you can use the `unicode_spoofing` field to see if look-alike characters were detected.
Lastly, models like the `spam` model are trained to take look-alike characters into account. Specifically, the `spam` model should raise a flag for excessive use of look-alike characters. Other models, like the `email` model, run on the normalized text to improve accuracy.
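For illustration, a sketch of reading these fields from the response (the exact shape of `unicode_spoofing` is an assumption):

```typescript
// `content` holds the normalized text, `original` the text as submitted.
console.log(result.content);  // e.g. "send me money"
console.log(result.original); // e.g. "send me 🅼🅾🅽🅴🆈"

// Check whether look-alike characters were detected.
// Treating `unicode_spoofing` as an object with a boolean is an assumption.
if (result.unicode_spoofing?.found) {
  console.log("Look-alike characters detected");
}
```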
Storing modified content
Some models can be configured to modify the content. For example, the `email` model can mask email addresses with `{{ email hidden }}`. You can store the modified content in your database by using the `content` field.
This can be useful if you need to anonymize/pseudonymize the content before storing it in your database, or if you do not wish your end users to see personal information.
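A sketch of storing the modified text instead of the raw submission; `db` is again a placeholder for your own storage layer:

```typescript
// Store the masked text, e.g. "Contact me at {{ email hidden }}",
// rather than the user's original submission.
await db.messages.insert({
  authorId: submission.authorId, // `submission` is your own incoming message object
  text: result.content,          // modified/normalized text returned by the API
});
```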
Adding data to the request
You can add metadata to the content you are sending for moderation. Some specific metadata is used by the moderation pipeline and is documented below.
ContextId
The text moderation endpoint allows for `contextId` in the request body. This could be the ID of a chatroom or a post thread.
If you are using review queues, the `contextId` can also be used to filter the review queue to only show content from a specific context.
Enable context awareness to improve the accuracy of your moderation using the `contextId`.
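For example, a sketch of passing the chatroom ID as `contextId` (the endpoint URL and the `value` field follow the assumptions of the earlier example):

```typescript
const response = await fetch("https://moderationapi.com/api/v1/moderate/text", {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    Authorization: `Bearer ${process.env.MODERATION_API_KEY}`,
  },
  body: JSON.stringify({
    value: "See you there!",
    contextId: "chatroom-42", // e.g. the chatroom or thread the message belongs to
  }),
});
```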
AuthorId
The text moderation endpoint allows for `authorId` in the request body. This would be the ID of the user who wrote the content.
This is useful if you are using review queues, as it enables you to perform user-level moderation and to filter the review queue to only show content from a specific user.
Enable context awareness to improve the accuracy of your moderation using the `authorId`.
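Likewise, a sketch of identifying the author in the request body:

```typescript
// Same request as above, now also identifying the author.
const body = JSON.stringify({
  value: "See you there!",
  contextId: "chatroom-42",
  authorId: "user-123", // the ID of the user who wrote the content
});
```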
Other metadata
The text moderation endpoint allows for a `metadata` object in the request body.
This can be used to add any additional information to the request. For example, you can keep a reference ID to the original content in the metadata object.
Metadata is shown in the review queues and included in any webhooks.
If you add a link, it will be clickable from the review queue. This is useful if you link to the original content in the metadata, as you can then easily navigate to it from the review queues.
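A sketch of attaching metadata such as a reference ID and a link back to the original content; the keys inside `metadata` are your own:

```typescript
const body = JSON.stringify({
  value: "Check out my new post!",
  authorId: "user-123",
  metadata: {
    originalId: "post-9876",               // your own reference ID
    url: "https://example.com/posts/9876", // shown as a clickable link in review queues
  },
});
```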
Context awareness
Enable Context awareness in your project settings, and then include `authorId` and/or `contextId` in the API request. This enables models to pull in previous messages for their analysis and increases their accuracy.
Context awareness can increase quota usage as it requires additional processing and analysis of previous messages. Each contextual message analyzed counts towards your quota usage. Consider this when implementing context awareness in your project, especially for high-volume applications.
LLM models like AI agents can use the `contextId` to see previous messages in the same context, and the `authorId` to see previous messages from the same author.
Specifically, this can prevent unwanted content that is spread over multiple messages. It can also help the models understand messages in the context of a conversation.
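For example, two requests that share the same `contextId` let the models evaluate the second message in light of the first. This sketch reuses the assumptions of the earlier request examples; the `moderate` helper is hypothetical:

```typescript
// Hypothetical helper wrapping the request shown earlier.
const moderate = (value: string) =>
  fetch("https://moderationapi.com/api/v1/moderate/text", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${process.env.MODERATION_API_KEY}`,
    },
    body: JSON.stringify({ value, contextId: "chatroom-42", authorId: "user-123" }),
  }).then((res) => res.json());

// Each message alone may look harmless; with context awareness enabled,
// the second request can be analyzed together with the first.
await moderate("Send me your card number");
await moderate("and the 3 digits on the back");
```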
Opt out of content store
The text moderation endpoint allows for a `doNotStore` boolean in the request body.
If you set `doNotStore` to `true`, the content will not be stored and will only pass through the moderation models.
Note that setting `doNotStore` to `true` will make parts of the moderation dashboards less useful.
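A sketch of a request body that opts out of the content store:

```typescript
const body = JSON.stringify({
  value: "Some sensitive content",
  doNotStore: true, // analyze only; do not persist the content
});
```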
Do not disable content store if you wish to train or optimize models based on your own data.