To analyze text, send a POST request containing the text to the text moderation endpoint. The API returns a response with the results right away.

Example

import ModerationApi from "@moderation-api/sdk";

const moderationApi = new ModerationApi({
  key: process.env.API_KEY,
});

const analysis = await moderationApi.moderate.text({
  value:
    "You can contact me on mr_robot[at]gmail|DOT|com or call me on 12 34 65 78",
});
Response example
{
  "status": "success",
  "request": {
    "timestamp": 1612792574690,
    "quota_usage": 1
  },
  "content_moderated": true,
  "data_found": true,
  "flagged": true,
  "original": "You can contact me on mr_robot[at]gmail|DOT|com or call me on 12 34 65 78",
  "content": "You can contact me on {{ email hidden }} or call me on {{ number hidden }}",
  "email": {
    "found": true,
    "mode": "SUSPICIOUS",
    "matches": ["mr_robot[at]gmail|DOT|com"]
  },
  "phone": {
    "found": true,
    "mode": "NORMAL",
    "matches": ["12 34 65 78"]
  },
  "sentiment": {
    "label": "NEUTRAL",
    "labelIndex": 0,
    "score": 0.997857,
    "label_scores": {
      "NEUTRAL": 0.997857,
      "POSITIVE": 0.001105,
      "NEGATIVE": 0.000666
    }
  }
}

Acting on the response

Here are some examples of how to use the response in your application.

Utilize flagged field

The easiest way to get started is to use the flagged field, a boolean that indicates whether any of the models detected something non-neutral. You can adjust the thresholds for flagging content in the project settings; see flagging thresholds.

if (data.flagged) {
  // Do something
}

Utilize model scores

You can also utilize the model scores to have more granular control over the content moderation. For example, if you want to flag content that has a negative sentiment score of more than 0.9, you can do the following:

if (data.flagged && data.sentiment.label_scores.NEGATIVE > 0.9) {
  // Do something
}

Saving matched entities

You can also save the matched entities to your database. For example, if you want to save the email address that was detected in the content, you can do the following:

if (data.email.matches.length > 0) {
  // Save data.email.matches[0] to your database
}

Detecting look-alike characters

Spammers might try to get around your moderation by spoofing text with look-alike characters. For example, they might write 🅼🅾🅽🅴🆈 instead of money, or use subtler replacements like hidden spaces or very similar-looking letters.

Moderation API detects and normalizes look-alike characters before analyzing content. You can find the normalized text in the content field of the response, and the original text in the original field.

Additionally, you can use the unicode_spoofing field to see if look-alike characters were detected.

Lastly, models like the spam model are trained to take look-alike characters into account. Specifically, the spam model should flag excessive use of look-alike characters. Other models, like the email model, run on the normalized text to improve accuracy.
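
If you want to act on spoofing directly, here is a minimal sketch, assuming the unicode_spoofing field exposes a found boolean like the email and phone matchers above:

if (data.unicode_spoofing?.found) {
  // Look-alike characters were detected; prefer the normalized
  // text in data.content over data.original when storing or
  // displaying the message.
  const textToStore = data.content;
}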

Storing moderated content

Some models can be configured to modify the content. For example, the email model can mask email addresses with {{ email hidden }}. You can store the modified content in your database by using the content field.

if (data.content_moderated) {
  // Store data.content in your database
}

This can be useful if you need to anonymize/pseudonymize the content before storing it in your database, or if you do not wish your end users to see personal information.

Adding context to requests

The text moderation endpoint allows for contextId in the request body.

This could be the ID of a chatroom or a post thread.

This enables improved accuracy in some models as they account for previous messages in the same context.

If you are using review queues, the contextId can also be used to filter the review queue to only show content from a specific context.
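
For example, when moderating chat messages you could pass the chatroom's ID. A minimal sketch, assuming the SDK forwards contextId as-is (chatRoom.id is a placeholder for your own identifier):

const analysis = await moderationApi.moderate.text({
  value: message.text,
  // Messages sharing a contextId are analyzed with awareness
  // of previous messages in the same context.
  contextId: chatRoom.id,
});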

Adding author information

The text moderation endpoint allows for authorId in the request body.

This would be the ID of the user who wrote the content.

This is useful if you are using review queues, as it enables you to perform user-level moderation and to filter the review queue to only show content from a specific user.
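
A minimal sketch (message.userId is a placeholder for however you identify your users):

const analysis = await moderationApi.moderate.text({
  value: message.text,
  // Ties the content to its author for user-level moderation
  // and review queue filtering.
  authorId: message.userId,
});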

Metadata

The text moderation endpoint allows for a metadata object in the request body.

This can be used to add any additional information to the request. For example, you can keep a reference ID to the original content in the metadata object.

Metadata is shown in the review queues and included in any webhooks.

If you add a link to the original content in the metadata, you can easily navigate to the original content from the review queues.
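
A minimal sketch with hypothetical metadata fields; use whatever references fit your application:

const analysis = await moderationApi.moderate.text({
  value: post.text,
  metadata: {
    // Hypothetical fields: a reference ID and a link back to the content
    originalContentId: post.id,
    link: `https://example.com/posts/${post.id}`,
  },
});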

Opt out of content store

The text moderation endpoint allows for a doNotStore boolean in the request body.

If you set doNotStore to true, the content will not be stored and will only pass through the moderation models.

Note that the item will then not enter the content queues, and you will not be able to train custom models based on previous requests.
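
A minimal sketch:

const analysis = await moderationApi.moderate.text({
  value: "Text that should only pass through the models",
  // Skip the content store: no review queues, no training data.
  doNotStore: true,
});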
