Analyze text
To analyze text, you need to send a POST request with the text to the text moderation endpoint and you will receive a response with the results right away.
Example
{
"status": "success",
"request": {
"timestamp": 1612792574690,
"quota_usage": 1
},
"content_moderated": true,
"data_found": true,
"flagged": true,
"original": "You can contact me on mr_robot[at]gmail|DOT|com or call me on 12 34 65 78",
"content": "You can contact me on {{ email hidden }} or call me on {{ number hidden }}",
"email": {
"found": true,
"mode": "SUSPICIOUS",
"matches": ["mr_robot[at]gmail|DOT|com"]
},
"phone": {
"found": true,
"mode": "NORMAL",
"matches": ["12 34 65 78"]
},
"sentiment": {
"label": "NEUTRAL",
"labelIndex": 0,
"score": 0.997857,
"label_scores": {
"NEUTRAL": 0.997857,
"POSITIVE": 0.001105,
"NEGATIVE": 0.000666
}
}
}
Acting on the response
Here are some examples on how to use the response in your application.
Utilize flagged field
The easiest way to get started is to use the flagged
field. The flagged
field is a boolean that indicates if any of the models detected something non-neutral. You can adjust the thresholds for flagging content in the project settings. See flagging thresholds.
if (data.flagged) {
// Do something
}
Utilize model scores
You can also utilize the model scores to have more granular control over the content moderation. For example, if you want to flag content that has a negative sentiment score of more than 0.9, you can do the following:
if (data.flagged && data.sentiment.label_scores.NEGATIVE > 0.9) {
// Do something
}
Saving matched entities
You can also save the matched entities to your database. For example, if you want to save the email address that was detected in the content, you can do the following:
if (data.email.mathces.length > 0) {
// Save data.email.matches[0] to your database
}
Detecting look-alike characters
Spammers might try to get around your moderation by spoofing text with look-alike characters. For example, they might write 🅼🅾🅽🅴🆈
instead of money
, or it can be an even more subtle replacement like hidden spaces or very similar looking letters.
Moderation API detects and normalizes look-alike characters before analyzing content. You can find the normalized text in the content
field of the response, and the original text in the original
field.
Additionally you can use the unicode_spoofing
field to see if look-alike characters were detected.
Lastly, models like the spam
model are trained to take look-alike characters into account. Specifically the spam
model should raise a flag for excessive use of look-alike characters. Other models like the email
model will run on the normalized text to improve accuracy.
Storing moderated content
Some models can be configured to modify the content. For example, the email
model can mask email addresses with {{ email hidden }}
. You can store the modified content in your database by using the content
field.
if (data.content_moderated) {
// Store data.content in your database
}
This can be useful if you need to anonymize/pseudonymize the content before storing it in your database, or if you do not wish your end users to see personal information.
Adding context to requests
The text moderation endpoint allows for contextId
in the request body.
This could be the ID of a chatroom or a post thread.
This enables improved accuracy in some models as they account for previous messages in the same context.
If you are using review queues the contextId
can also be used to filter the review queue to only show content from a specific context.
Adding author information
The text moderation endpoint allows for authorId
in the request body.
This would be the ID of the user who wrote the content.
This is useful if you are using review queues, as it enables you to perform user level moderation, and to filter the review queue to only show content from a specific user.
Metadata
The text moderation endpoint allows for a metadata
object in the request body.
This can be used to add any additional information to the request. For example you can keep a reference ID to the original content in the metadata object.
Metadata is shown in the review queues and included in any webhooks.
If you add a link to the original content in the metadata, you can easily navigate to the original content from the review queues.
Opt out of content store
The text moderation endpoint allows for a doNotStore
boolean in the request body.
If you set doNotStore
to true
, the content will not be stored and only pass through the moderation models.
If you set doNotStore
to true
, the item will not enter the content
queues and you will not be able to train custom models based
on previous requests.
Was this page helpful?