Spam & Security

Spam and security policies catch content that’s trying to manipulate, deceive, or exploit your platform — from low-effort promotion all the way to active phishing and code injection attempts. For URL-specific risk scoring, see URL Risk.

Policies

`id`	Type	Supported content	What it does
`spam`	`classifier`	text, audio	Spam, repetitive content, unsolicited messages.
`self_promotion`	`classifier`	text, audio	Self-promotional content and advertising.
`code_abuse`	`classifier`	text, audio	Malicious code, code injection attempts, and abuse of code features. Useful as a guardrail in front of LLM agents.
`phishing`	`classifier`	text, audio	Phishing attempts and scam messages. Requires conversation context to be enabled on the channel.
`url`	`entity_matcher`	text, audio	Extracts URLs from content and can mask them in the returned content.
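
The guardrail use of `code_abuse` can be sketched roughly as follows. This is a minimal illustration, not the platform's prescribed integration: `moderate` and `runAgent` are hypothetical callbacks you would supply yourself, and only the `policies`, `id`, and `flagged` fields are taken from the response shape described under Reading the result.

```javascript
// Sketch: gate an LLM agent behind the `code_abuse` policy.
// ASSUMPTION: `moderate(text)` resolves to a response with a `policies` array,
// and `runAgent(text)` is your own agent call. Both are hypothetical callbacks.
async function guardedAgentCall(userInput, moderate, runAgent) {
  const response = await moderate(userInput);
  const codeAbuse = (response.policies ?? []).find(p => p.id === "code_abuse");

  // Stop before the agent ever sees input that the policy flagged.
  if (codeAbuse?.flagged) {
    return { blocked: true, reason: "code_abuse" };
  }

  // Otherwise forward the original input to the agent.
  return { blocked: false, output: await runAgent(userInput) };
}
```

Passing the two callbacks in as arguments keeps the check independent of any particular client or agent framework, so it can be exercised with stubs.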

Reading the result

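// Look up the spam classifier's result and route flagged content to review.
// sendToReview stands in for your own review handler.
const spam = response.policies.find(p => p.id === "spam");
if (spam?.flagged) {
  await sendToReview(content);
}

// Log every URL extracted by the url entity matcher.
const urls = response.policies.find(p => p.id === "url");
urls?.matches?.forEach(m => console.log(m.match));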
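
When several of these policies are enabled together, it can be convenient to reduce the response to a small summary. The helper below is a sketch under assumptions: `summarizeResult` is a hypothetical name, the sample object is hand-written for illustration rather than real API output, and only the `policies`, `id`, `flagged`, `matches`, and `match` fields come from the snippet above.

```javascript
// Collect the ids of flagged policies and any URLs the `url` matcher extracted.
// `summarizeResult` is a hypothetical helper, not part of any SDK.
function summarizeResult(response) {
  const policies = response.policies ?? [];

  // Ids of every policy whose result is flagged.
  const flaggedIds = policies.filter(p => p.flagged).map(p => p.id);

  // The `url` entity matcher reports extracted URLs under `matches`.
  const urlPolicy = policies.find(p => p.id === "url");
  const urls = (urlPolicy?.matches ?? []).map(m => m.match);

  return { flaggedIds, urls };
}

// Hand-written sample for illustration only; not real API output.
const sample = {
  policies: [
    { id: "spam", flagged: true },
    { id: "phishing", flagged: false },
    { id: "url", matches: [{ match: "https://example.com/offer" }] },
  ],
};

console.log(summarizeResult(sample));
```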

`phishing` looks at the conversation as a whole, not just the latest message. Enable conversation context on the channel and pass prior turns when you submit content; otherwise the policy can’t run.
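
One way to keep prior turns ready for submission is a bounded window over the conversation history. The sketch below is only an illustration: the payload field names (`content`, `context`) and the turn shape (`author`, `text`) are placeholders invented for this example, not the platform's documented request schema, so check the API reference for the real field names.

```javascript
// Sketch: bundle the latest message with a bounded window of prior turns.
// ASSUMPTION: `content`, `context`, `author`, and `text` are placeholder names,
// NOT the platform's documented request fields.
function buildSubmission(history, latest, maxTurns = 10) {
  // slice(-0) would return the whole array, so handle a zero window explicitly.
  const priorTurns = maxTurns > 0 ? history.slice(-maxTurns) : [];

  return {
    content: latest.text,
    context: priorTurns.map(t => ({ author: t.author, text: t.text })),
  };
}
```

Capping the window keeps request size predictable in long conversations; the default of 10 turns is an arbitrary choice for the example.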

See Understanding API responses for the full response shape.
