URL Risk

When URL Risk is on, you don’t have to pass URLs separately. Anything that looks like a link in the submitted text gets pulled out and scored. Each URL goes through threat-intel feeds and a model that’s seen a lot of phishing infrastructure. The response gives you a risk score and a handful of reason codes per URL. This page documents those fields and how to interpret them.

Fields

{
  "url": "https://secure-paypal-verify.xyz/account",
  "risk_score": 0.98,
  "reasons": ["brand_impersonation", "suspicious_keywords", "high_risk_tld"],
  "signals": {
    "brand_impersonation": {"brand": "paypal", "method": "registered_domain_token"},
    "has_suspicious_characters": false,
    "is_link_shortener": false,
    "domain_age_days": null,
    "has_email_setup": null,
    "redirect_count": null,
    "final_url": null,
    "bot_protection": null,
    "is_reported": false
  }
}

Field	Type	Meaning
`url`	string	The URL that was evaluated.
`risk_score`	number (0.0–1.0)	Risk score. Higher is riskier. Scores at or above `0.5` are treated as malicious by default; you can apply a stricter or looser cutoff for your use case.
`reasons`	string[]	Stable codes explaining why the URL looks risky. Empty when the URL is clean. A non-empty list means something was actually flagged; it is not a full audit of what was checked.
`signals`	object	Observable properties of the URL, described below.
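
As a rough illustration of consuming these fields, the Python sketch below applies the default `0.5` cutoff to a single per-URL result. The helper names (`is_malicious`, `summarize`) are made up for this example, and it assumes the result has already been parsed from JSON into a dict shaped like the sample above.

```python
from typing import Any

DEFAULT_THRESHOLD = 0.5  # documented default cutoff for "treat as malicious"


def is_malicious(result: dict[str, Any], threshold: float = DEFAULT_THRESHOLD) -> bool:
    """True when risk_score is at or above the cutoff (hypothetical helper)."""
    return result["risk_score"] >= threshold


def summarize(result: dict[str, Any], threshold: float = DEFAULT_THRESHOLD) -> str:
    """One-line summary of a per-URL result (hypothetical helper)."""
    verdict = "malicious" if is_malicious(result, threshold) else "not flagged"
    reasons = ", ".join(result["reasons"]) or "none"
    return f"{result['url']}: {verdict}, score {result['risk_score']:.2f}, reasons: {reasons}"


# Trimmed copy of the sample result shown above.
sample = {
    "url": "https://secure-paypal-verify.xyz/account",
    "risk_score": 0.98,
    "reasons": ["brand_impersonation", "suspicious_keywords", "high_risk_tld"],
}

print(summarize(sample))
```

Passing a different `threshold` is how a stricter or looser cutoff would be applied in this sketch.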

Signals

Observable properties of the URL. The shape is consistent on every request. Fields that aren’t applicable or weren’t checked come back as null.

Field	Type	Meaning	Null when
`brand_impersonation`	`{brand, method}` \| null	A well-known brand name appears in the URL in a way that doesn’t match its legitimate domain, e.g. `paypal-verify.xyz` or `paypal.evil.com`. `brand` is the impersonated brand (e.g. `"paypal"`); `method` is `"registered_domain_token"` or `"subdomain_token"`.	No brand match detected.
`has_suspicious_characters`	boolean	Punycode, Unicode lookalike characters, or an unusual ratio of special characters (classic typosquatting and homograph-attack indicators). Flagged if any URL in the redirect chain exhibits this.	Always populated.
`is_link_shortener`	boolean	A known shortener is used anywhere in the redirect chain. This includes first-party (`youtu.be`, `lnkd.in`) and third-party (`bit.ly`, `tinyurl.com`, and others).	Always populated.
`domain_age_days`	integer \| null	How many days ago the destination’s domain was registered. Freshly registered domains (under 30 days old) are disproportionately used for phishing. Describes the registered domain, not the subdomain.	The signal isn’t informative for this URL, or wasn’t needed to reach a verdict.
`has_email_setup`	boolean \| null	Whether the destination’s domain is set up to receive email. Legitimate businesses almost always are; throwaway phishing domains often aren’t. Describes the registered domain.	Not needed to reach a verdict.
`redirect_count`	integer \| null	Number of redirect hops from the submitted URL to its final destination. `0` means no redirect.	Not needed to reach a verdict.
`final_url`	string \| null	The final URL reached after following redirects. Equal to the submitted URL when there’s no redirect.	Not needed to reach a verdict.
`bot_protection`	boolean \| null	Whether the destination sits behind a bot challenge or web application firewall. When `true`, some destination-describing signals may be `null` because we can’t see past the challenge.	Not needed to reach a verdict.
`is_reported`	boolean	The submitted URL matches one of our threat-intelligence feeds. Stays `false` if a redirect destination is reported but the submitted URL itself isn’t.	Always populated.

Not every URL is analyzed in full depth. URLs that are clearly clean or clearly malicious from the string alone get a fast verdict, and the network-level signals (`domain_age_days`, `has_email_setup`, `redirect_count`, `final_url`, `bot_protection`) come back `null`. Treat `null` as “not checked,” not “not present.”
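
One way to keep “not checked” separate from “checked and false” in client code is to carry the null through as a third state. The sketch below is illustrative only: `unchecked_signals` and `is_fresh_domain` are hypothetical helpers, and the 30-day window is taken from the `domain_age_days` description in the table above.

```python
from typing import Any, Optional

# Network-level signals that come back null on a fast verdict.
NETWORK_SIGNALS = (
    "domain_age_days",
    "has_email_setup",
    "redirect_count",
    "final_url",
    "bot_protection",
)


def unchecked_signals(signals: dict[str, Any]) -> list[str]:
    """Names of network-level signals that were not checked (value is None)."""
    return [name for name in NETWORK_SIGNALS if signals.get(name) is None]


def is_fresh_domain(signals: dict[str, Any], max_age_days: int = 30) -> Optional[bool]:
    """True/False when the age is known; None when it was not checked."""
    age = signals.get("domain_age_days")
    if age is None:
        return None  # unknown, which is different from "not fresh"
    return age < max_age_days
```

Returning `None` instead of `False` keeps downstream logic from reading a skipped check as a clean bill of health.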

How signals describe redirect chains

When a URL redirects across domains (e.g. a shortener resolving to a landing page), signals are assembled from both the submitted URL and the final URL:

Describe the destination (where the user ends up): `brand_impersonation`, `domain_age_days`, `has_email_setup`, `bot_protection`
Describe the submitted URL (what was sent): `redirect_count`, `final_url`, `is_reported`
Either URL exhibiting the trait: `is_link_shortener`, `has_suspicious_characters`

Same-domain redirects (http:// → https://, trailing-slash canonicalization) don’t trigger re-analysis.
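
The grouping above can be written down as a lookup table, which may help when logging or displaying signals. This is a sketch, not part of the API: `SIGNAL_SUBJECT` and `group_by_subject` are hypothetical names, and the signal values in the example are made up to resemble a shortener that redirects once to a young domain.

```python
from typing import Any

# Which URL each signal describes, per the grouping above.
SIGNAL_SUBJECT = {
    "brand_impersonation": "destination",
    "domain_age_days": "destination",
    "has_email_setup": "destination",
    "bot_protection": "destination",
    "redirect_count": "submitted",
    "final_url": "submitted",
    "is_reported": "submitted",
    "is_link_shortener": "either",
    "has_suspicious_characters": "either",
}


def group_by_subject(signals: dict[str, Any]) -> dict[str, dict[str, Any]]:
    """Split a signals object into destination / submitted / either buckets."""
    grouped: dict[str, dict[str, Any]] = {"destination": {}, "submitted": {}, "either": {}}
    for name, value in signals.items():
        subject = SIGNAL_SUBJECT.get(name)
        if subject is not None:  # ignore fields this sketch does not know about
            grouped[subject][name] = value
    return grouped


# Illustrative values only, not real output.
example_signals = {
    "is_link_shortener": True,
    "redirect_count": 1,
    "final_url": "https://landing.example/offer",
    "domain_age_days": 4,
    "is_reported": False,
}
grouped = group_by_subject(example_signals)
```

Here `domain_age_days` lands in the destination bucket because it describes the landing page's domain, while `is_reported` stays with the submitted URL.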

Reason codes

`reasons` is an ordered list of stable codes explaining why the URL looks risky. Codes only appear when a signal or rule actually attributed risk to this URL. A field being present is not enough; it has to have driven the score. Benign URLs return `reasons: []`.

Code	Aligns with signal	What it means
`blocklisted`	None	The URL’s registered domain matched your blocklist. Verdict comes from configuration, not from analysis.
`allowlisted`	None	The URL’s registered domain matched your allowlist. Verdict comes from configuration, not from analysis.
`brand_impersonation`	`signals.brand_impersonation`	A brand name is used in the domain or subdomain in a way that doesn’t match its legitimate home.
`has_suspicious_characters`	`signals.has_suspicious_characters`	Punycode, Unicode lookalikes, or an unusual special-character ratio.
`is_link_shortener`	`signals.is_link_shortener`	The URL uses a shortener and that pattern contributed to the risk score.
`is_reported`	`signals.is_reported`	The URL is on one of our threat-intelligence feeds.
`new_domain`	`signals.domain_age_days`	The destination domain was registered recently and that freshness drove up risk.
`missing_email_setup`	`signals.has_email_setup`	The destination isn’t set up for email, a common characteristic of throwaway phishing domains.
`high_risk_tld`	None	Top-level domain with disproportionate phishing prevalence.
`suspicious_keywords`	None	URL contains phishing keywords such as `login`, `verify`, `account`, `password`, `secure`.
`suspicious_url_structure`	None	Structural red flags: `@` in the authority, `//` redirect trick, IP address as host, URL embedded in path, credential-collecting query parameters, and similar tricks.
`ssl_invalid`	None	The destination’s SSL certificate failed to validate.

Reasons only describe what increased risk. You will not see `has_email_setup` as a reason. It’s the absence of email setup that’s concerning, and that surfaces as `missing_email_setup`.
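
To show how reason codes line up with signal fields, the sketch below pairs each returned code with the signal value it aligns with, or `None` for codes that have no matching signal. The mapping is copied from the table above; `REASON_TO_SIGNAL` and `explain` are hypothetical names for this example.

```python
from typing import Any

# Reason codes that align with a signal field, per the table above.
# Codes not listed here (e.g. high_risk_tld) have no matching signal.
REASON_TO_SIGNAL = {
    "brand_impersonation": "brand_impersonation",
    "has_suspicious_characters": "has_suspicious_characters",
    "is_link_shortener": "is_link_shortener",
    "is_reported": "is_reported",
    "new_domain": "domain_age_days",
    "missing_email_setup": "has_email_setup",
}


def explain(result: dict[str, Any]) -> list[tuple[str, Any]]:
    """Pair each reason code with its aligned signal value (None if there is none)."""
    signals = result.get("signals") or {}
    explained = []
    for code in result.get("reasons", []):
        field = REASON_TO_SIGNAL.get(code)
        explained.append((code, signals.get(field) if field else None))
    return explained


# Trimmed copy of the sample result from the Fields section.
sample_result = {
    "reasons": ["brand_impersonation", "suspicious_keywords", "high_risk_tld"],
    "signals": {
        "brand_impersonation": {"brand": "paypal", "method": "registered_domain_token"},
        "is_reported": False,
    },
}
pairs = explain(sample_result)
```

The order of `reasons` is preserved, so the first pair corresponds to the first code in the response.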

Allowlists and blocklists

You can configure per-tenant allowlists and blocklists in the dashboard. These are applied before the risk model runs:

A blocklist hit returns `risk_score: 1` and `reasons: ["blocklisted"]`. No signals are returned. The verdict comes from your configuration, not from analysis of the URL.
An allowlist hit returns `risk_score: 0` and `reasons: ["allowlisted"]`. Also no signals.
Everything else flows through the risk model.

If a domain is on both lists, the blocklist wins.
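
Because list hits skip the risk model, client code may want to tell them apart from analyzed results before reading signals. The sketch below is one way to do that; `verdict_source` is a hypothetical helper, and since this page does not say whether `signals` is omitted or set to null on a list hit, the code tolerates both.

```python
from typing import Any


def verdict_source(result: dict[str, Any]) -> str:
    """Return 'blocklist', 'allowlist', or 'model' for a per-URL result."""
    reasons = result.get("reasons", [])
    if "blocklisted" in reasons:
        return "blocklist"
    if "allowlisted" in reasons:
        return "allowlist"
    return "model"


def signals_or_empty(result: dict[str, Any]) -> dict[str, Any]:
    """Signals for analyzed URLs; an empty dict for list hits.

    Assumption: on a list hit the signals key may be missing or null.
    """
    if verdict_source(result) != "model":
        return {}
    return result.get("signals") or {}
```

Routing on `reasons` rather than on the score alone avoids treating a configured `risk_score: 1` as a model verdict.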

Domain-level matching

Entries match at the registered domain level. Subdomains are not matched automatically. To allow every subdomain of your service, add each one explicitly. Given an allowlist entry of `example.com`:

URL	Matches?
`https://example.com/page`	Yes
`https://www.example.com/page`	Yes (`www` is normalized away)
`https://login.example.com/page`	No, add `login.example.com` explicitly
`https://api.prod.example.com/`	No, add `api.prod.example.com` explicitly
`https://example.com.evil.xyz/`	No, the registered domain is `evil.xyz`

Matching is case-insensitive. Enter plain domain strings: no scheme, no path, no wildcards. Internationalized domains should be entered in their punycode form (xn--...).
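
For checking list entries locally before adding them in the dashboard, the following sketch approximates the matching rules described above: lowercase the host, convert it to punycode, drop a leading `www.`, then compare exactly. It is a local approximation under those assumptions, not the service's implementation, and `normalize_host` and `matches` are hypothetical names.

```python
from urllib.parse import urlsplit


def normalize_host(host: str) -> str:
    """Lowercase, convert to punycode, and drop a leading 'www.'."""
    host = host.strip().lower().rstrip(".")
    host = host.encode("idna").decode("ascii")  # stdlib IDNA codec
    if host.startswith("www."):
        host = host[len("www."):]
    return host


def matches(url: str, entries: list[str]) -> bool:
    """True when the URL's host exactly equals a list entry after normalization."""
    host = urlsplit(url).hostname or ""
    return normalize_host(host) in {normalize_host(entry) for entry in entries}


allowlist = ["example.com"]
for url in (
    "https://example.com/page",
    "https://www.example.com/page",
    "https://login.example.com/page",
    "https://example.com.evil.xyz/",
):
    print(url, matches(url, allowlist))
```

The same `normalize_host` call can be used to produce the `xn--` form of an internationalized domain before entering it.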

FAQ

Why does the score for the same URL change over time?

Risk is a moving target. Several inputs change between requests:

Domains age. A freshly registered domain looks risky today and less risky in six months. `domain_age_days` grows naturally.
Email infrastructure gets added. Legitimate businesses set up MX, SPF, and DMARC records as they mature; throwaway domains rarely do. `has_email_setup` can flip from `false` to `true` as a domain matures.
Threat-intelligence feeds update constantly. A URL not on any feed today may be reported tomorrow.
Redirect destinations change. Shorteners and redirectors can be repointed at any time. The destination is re-resolved on every request.
The model is updated as the threat landscape shifts.

If you’re caching scores, cache them briefly. Re-evaluate any URL still in active circulation rather than relying on a result that’s hours or days old.
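
A minimal sketch of brief caching is shown below. Everything in it is an assumption for illustration: the `ShortTtlCache` class, the 300-second default, and the `fetch` callable standing in for whatever function calls the API in your code.

```python
import time
from typing import Any, Callable


class ShortTtlCache:
    """Cache per-URL results briefly, then re-evaluate (hypothetical helper)."""

    def __init__(
        self,
        fetch: Callable[[str], dict[str, Any]],
        ttl_seconds: float = 300.0,  # assumed value; pick what suits your traffic
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: dict[str, tuple[float, dict[str, Any]]] = {}

    def get(self, url: str) -> dict[str, Any]:
        now = self._clock()
        hit = self._entries.get(url)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]  # still fresh enough to reuse
        result = self._fetch(url)  # expired or missing: score it again
        self._entries[url] = (now, result)
        return result


def fake_fetch(url: str) -> dict[str, Any]:
    """Stand-in for a real API call, used only to make the sketch runnable."""
    return {"url": url, "risk_score": 0.1, "reasons": []}


cache = ShortTtlCache(fake_fetch, ttl_seconds=60)
first = cache.get("https://example.com/")
second = cache.get("https://example.com/")  # served from cache within 60 seconds
```

Injecting the clock keeps expiry testable without sleeping; in production the default `time.monotonic` is used.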

What threshold should I use?

`risk_score >= 0.5` is the default cutoff for “treat as malicious,” and it’s tuned so the rate of false positives at that threshold is low across typical user-generated content. Raise it (e.g. `0.7`) if your audience is unusually tolerant of risky links, or lower it (e.g. `0.3`) if you’d rather over-block. Whichever cutoff you choose, the `reasons` array tells you why a URL was flagged.

A legitimate URL of mine is being flagged. What do I do?

Add it to your allowlist. Allowlist entries override the risk model. This is the right tool for your own product domains, trusted partners, and URLs you’ve manually verified as safe.

If you think the score is wrong in a way that would also affect other customers (for example, a brand-impersonation false positive on a legitimate brand variant), let us know and we’ll look at the model.
