Model key: toxicity

Analyzes the whole text to detect general features such as profanity, swearing, racism, and threats. Unlike our profanity filter, the toxicity analyzer also detects cases where the profanity is less pronounced.

Toxicity Response Example:
{
  "label": "TOXICITY",
  "score": 0.9563754,
  "label_scores": {
    "TOXICITY": 0.9563754,
    "PROFANITY": 0.89909166,
    "INSULT": 0.68668073,
    "SEVERE_TOXICITY": 0.45895407,
    "DISCRIMINATION": 0.27914262,
    "THREAT": 0.09009721,
    "NEUTRAL": 0.0436246
  }
}
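The response contains the top label, its score, and the full breakdown under label_scores. As a minimal sketch of how you might request and read this analysis, assuming a hypothetical endpoint URL, API key, and request body (check the API reference for the actual request format):

import requests

# Hypothetical endpoint and API key; the real request format is documented in the API reference.
API_URL = "https://api.example.com/analyze"
API_KEY = "YOUR_API_KEY"

def analyze_toxicity(text: str) -> dict:
    """Send text for analysis with the toxicity model enabled (assumed request shape)."""
    response = requests.post(
        API_URL,
        headers={"Authorization": f"Bearer {API_KEY}"},
        json={"text": text, "models": ["toxicity"]},  # "toxicity" is the model key shown above
        timeout=10,
    )
    response.raise_for_status()
    return response.json()

result = analyze_toxicity("You are a terrible person.")
print(result["label"])                   # top label, e.g. "TOXICITY"
print(result["score"])                   # score of the top label
print(result["label_scores"]["THREAT"])  # score for a specific label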

Labels

Label              Description
TOXICITY           The general toxicity. If any other labels have a high score, this one is likely to score high as well.
PROFANITY          Containing swearing, curse words, and other obscene language.
DISCRIMINATION     Racism and other discrimination based on race, religion, gender, etc.
INSULT             Insulting, inflammatory, or negative language.
SEVERE_TOXICITY    Very toxic, containing severe profanity, racism, etc.
THREAT             Threatening, bullying, or aggressive language.
NEUTRAL            Nothing toxic was detected.
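The scores in the example response above do not sum to 1, so each label is scored independently and a single text can score high on several labels at once. A minimal sketch of turning the label scores into a moderation decision, using illustrative thresholds that you would tune for your own content:

# Illustrative thresholds; tune these for your own content and tolerance.
BLOCK_LABELS = {"SEVERE_TOXICITY": 0.7, "THREAT": 0.7, "DISCRIMINATION": 0.7}
REVIEW_THRESHOLD = 0.8  # general TOXICITY score that sends content to manual review

def decide(label_scores: dict) -> str:
    """Return "block", "review", or "allow" based on the label scores."""
    if any(label_scores.get(label, 0.0) >= threshold
           for label, threshold in BLOCK_LABELS.items()):
        return "block"
    if label_scores.get("TOXICITY", 0.0) >= REVIEW_THRESHOLD:
        return "review"
    return "allow"

# Using the example response above:
print(decide({
    "TOXICITY": 0.9563754,
    "SEVERE_TOXICITY": 0.45895407,
    "THREAT": 0.09009721,
}))  # -> "review"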

Supported languages

This model works in the following languages:

  • English (en)
  • Spanish (es)
  • French (fr)
  • German (de)
  • Italian (it)
  • Portuguese (pt)
  • Russian (ru)
  • Japanese (ja)
  • Indonesian (id)
  • Chinese (zh)
  • Dutch (nl)
  • Polish (pl)
  • Swedish (sv)

The model might also work in other languages we haven't tested. Feel free to try it on languages that are not listed above and provide us with feedback.

Limitations

This model does not have any API limitations.
