Policies
| ID | Type | Supported inputs | What it does |
|---|---|---|---|
toxicity | classifier | text, audio | General-purpose toxicity detection: insults, harassment, hostile language. |
toxicity_severe | classifier | text, audio | A stricter sub-classifier for severe toxicity. Useful when you want to distinguish “rude” from “abusive.” |
hate | classifier | text, image, video, audio | Hate speech, discrimination, racism, and extremism, including in image and video content. |
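The table above can be mirrored as a small lookup to validate a request before sending it, e.g. to check which policies apply to a given input type. The snippet below is an illustrative sketch only; the `POLICIES` mapping and `supported_policies` helper are not part of any SDK.

```python
# Policy catalog mirroring the table above (illustrative, not an official SDK).
POLICIES = {
    "toxicity": {"text", "audio"},
    "toxicity_severe": {"text", "audio"},
    "hate": {"text", "image", "video", "audio"},
}

def supported_policies(input_type: str) -> list[str]:
    """Return the policy IDs that can screen the given input type."""
    return sorted(pid for pid, inputs in POLICIES.items() if input_type in inputs)

print(supported_policies("image"))   # only `hate` supports images
print(supported_policies("audio"))   # all three policies support audio
```

A check like this lets a client reject unsupported policy/input combinations locally instead of waiting for a server-side error.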