Spam and security policies catch content that’s trying to manipulate, deceive, or exploit your platform, from low-effort promotion all the way to active phishing and code injection attempts. For URL-specific risk scoring, see URL Risk.
Policies
| ID | Type | Supported content | What it does |
|---|---|---|---|
| spam | classifier | text, audio | Spam, repetitive content, unsolicited messages. |
| self_promotion | classifier | text, audio | Self-promotional content and advertising. |
| code_abuse | classifier | text, audio | Malicious code, code injection attempts, and abuse of code features. Useful as a guardrail in front of LLM agents. |
| phishing | classifier | text, audio | Phishing attempts and scam messages. Requires conversation context to be enabled on the channel. |
| url | entity_matcher | text, audio | Extracts URLs from content and can mask them in the returned content. |
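
As a rough illustration, the table can be mirrored in a small local lookup to decide which policies are worth enabling for a given piece of content. This is a sketch only: the `Policy` class, the `needs_context` flag, and the `policies_for` helper are hypothetical names made up for this example and are not part of any SDK. The ids, types, and supported content types are copied from the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    """Local mirror of one table row (hypothetical helper, not an SDK type)."""
    id: str
    type: str                    # "classifier" or "entity_matcher"
    supported: tuple             # content types listed in the table
    needs_context: bool = False  # True only for phishing, per the table


POLICIES = (
    Policy("spam", "classifier", ("text", "audio")),
    Policy("self_promotion", "classifier", ("text", "audio")),
    Policy("code_abuse", "classifier", ("text", "audio")),
    Policy("phishing", "classifier", ("text", "audio"), needs_context=True),
    Policy("url", "entity_matcher", ("text", "audio")),
)


def policies_for(content_type, context_enabled):
    """Return the ids of policies that can run for this content type.

    Policies that need conversation context are skipped when the
    channel does not have it enabled.
    """
    return [
        p.id
        for p in POLICIES
        if content_type in p.supported
        and (context_enabled or not p.needs_context)
    ]


if __name__ == "__main__":
    print(policies_for("text", context_enabled=False))
```

With conversation context disabled, `phishing` drops out of the list while the other four policies remain.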
Reading the result
The phishing policy evaluates the conversation as a whole, not just the latest message. Enable conversation context on the channel and pass prior turns when you submit content; otherwise the policy can’t run.
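
The two preconditions for the phishing policy can be checked on the client before submitting. The sketch below is a local pre-flight check, not an API call: `phishing_preflight` and its parameters are hypothetical names. It also treats an empty history as missing context, which is an assumption rather than documented behavior.

```python
def phishing_preflight(context_enabled, prior_turns):
    """Return a list of reasons the phishing policy could not run.

    An empty list means both preconditions described above are met.
    Hypothetical helper: the names and the empty-history rule are
    assumptions made for illustration.
    """
    problems = []
    if not context_enabled:
        problems.append("conversation context is not enabled on the channel")
    if not prior_turns:
        problems.append("no prior turns were passed with the submission")
    return problems


if __name__ == "__main__":
    history = ["Hi, is this item still available?", "Yes, it is."]
    print(phishing_preflight(context_enabled=True, prior_turns=history))
    print(phishing_preflight(context_enabled=False, prior_turns=[]))
```

Running a check like this before submission makes a skipped phishing policy visible in your own logs.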