When URL Risk is on, you don’t have to pass URLs separately. Anything that looks like a link in the submitted text gets pulled out and scored. Each URL goes through threat-intel feeds and a model that’s seen a lot of phishing infrastructure. The response gives you a risk score and a handful of reason codes per URL. This page documents those fields and how to interpret them.

Fields

{
  "url": "https://secure-paypal-verify.xyz/account",
  "risk_score": 0.98,
  "reasons": ["brand_impersonation", "suspicious_keywords", "high_risk_tld"],
  "signals": {
    "brand_impersonation": {"brand": "paypal", "method": "registered_domain_token"},
    "has_suspicious_characters": false,
    "is_link_shortener": false,
    "domain_age_days": null,
    "has_email_setup": null,
    "redirect_count": null,
    "final_url": null,
    "bot_protection": null,
    "is_reported": false
  }
}

| Field | Type | Meaning |
| --- | --- | --- |
| url | string | The URL that was evaluated. |
| risk_score | number (0.0–1.0) | Risk score. Higher is riskier. Scores at or above 0.5 are treated as malicious by default; you can apply a stricter or looser cutoff for your use case. |
| reasons | string[] | Stable codes explaining why the URL looks risky. Empty when the URL is clean. A list of reasons means something actually flagged, not a full audit of what was checked. |
| signals | object | Observable properties of the URL, described below. |
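
As a minimal sketch of how a client might consume these fields, the snippet below applies the default 0.5 cutoff to a parsed result. The helper name `is_malicious` and the assumption that the result has already been decoded into a Python dict are illustrative and not part of the API.

```python
import json

DEFAULT_THRESHOLD = 0.5  # default cutoff described in the table above


def is_malicious(result: dict, threshold: float = DEFAULT_THRESHOLD) -> bool:
    """Return True when the risk score is at or above the cutoff."""
    return result["risk_score"] >= threshold


# Trimmed copy of the example response shown above.
example = json.loads("""
{
  "url": "https://secure-paypal-verify.xyz/account",
  "risk_score": 0.98,
  "reasons": ["brand_impersonation", "suspicious_keywords", "high_risk_tld"]
}
""")

if is_malicious(example):
    print(f"Flagged {example['url']}: {', '.join(example['reasons'])}")
```

Passing a different `threshold` is one way to apply a stricter or looser cutoff without changing the rest of the handling code.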

Signals

Observable properties of the URL. The shape is consistent on every request. Fields that aren’t applicable or weren’t checked come back as null.

| Field | Type | Meaning | Null when |
| --- | --- | --- | --- |
| brand_impersonation | {brand, method} or null | A well-known brand name appears in the URL in a way that doesn’t match its legitimate domain, e.g. paypal-verify.xyz or paypal.evil.com. brand is the impersonated brand (e.g. "paypal"); method is "registered_domain_token" or "subdomain_token". | No brand match detected. |
| has_suspicious_characters | boolean | Punycode, Unicode lookalike characters, or an unusual ratio of special characters (classic typosquatting and homograph-attack indicators). Flagged if any URL in the redirect chain exhibits this. | Always populated. |
| is_link_shortener | boolean | A known shortener is used anywhere in the redirect chain. This includes first-party (youtu.be, lnkd.in) and third-party (bit.ly, tinyurl.com, and others). | Always populated. |
| domain_age_days | integer or null | How many days ago the destination’s domain was registered. Freshly registered domains (under 30 days old) are disproportionately used for phishing. Describes the registered domain, not the subdomain. | The signal isn’t informative for this URL, or wasn’t needed to reach a verdict. |
| has_email_setup | boolean or null | Whether the destination’s domain is set up to receive email. Legitimate businesses almost always are; throwaway phishing domains often aren’t. Describes the registered domain. | Not needed to reach a verdict. |
| redirect_count | integer or null | Number of redirect hops from the submitted URL to its final destination. 0 means no redirect. | Not needed to reach a verdict. |
| final_url | string or null | The final URL reached after following redirects. Equal to the submitted URL when there’s no redirect. | Not needed to reach a verdict. |
| bot_protection | boolean or null | Whether the destination sits behind a bot challenge or web application firewall. When true, some destination-describing signals may be null because we can’t see past the challenge. | Not needed to reach a verdict. |
| is_reported | boolean | The submitted URL matches one of our threat-intelligence feeds. Stays false if a redirect destination is reported but the submitted URL itself isn’t. | Always populated. |

Not every URL is analyzed in full depth. URLs that are clearly clean or clearly malicious from the string alone get a fast verdict, and the network-level signals (domain_age_days, has_email_setup, redirect_count, final_url, bot_protection) come back null. Treat null as “not checked,” not “not present.”
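
The distinction between null and a real value is easy to lose in client code, because a truthiness check treats `None`, `False`, and `0` alike. The sketch below shows one way to keep the three states apart. The helper names and the returned strings are hypothetical; the 30-day boundary comes from the domain_age_days description above.

```python
def describe_domain_age(signals: dict) -> str:
    """Summarize domain_age_days without confusing 0 with 'not checked'."""
    age = signals.get("domain_age_days")
    if age is None:
        return "domain age not checked"
    if age < 30:
        return f"domain is new ({age} days old)"
    return f"domain is {age} days old"


def email_setup_status(signals: dict) -> str:
    """Map the nullable boolean has_email_setup to three distinct states."""
    value = signals.get("has_email_setup")
    if value is None:
        return "not checked"
    return "set up" if value else "missing"


# Fast-verdict response: network-level signals come back as null.
fast = {"domain_age_days": None, "has_email_setup": None}
print(describe_domain_age(fast), "/", email_setup_status(fast))
```

Comparing against `None` explicitly, rather than writing `if not value`, is what keeps a domain registered today (age 0) from being reported as unchecked.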

How signals describe redirect chains

When a URL redirects across domains (e.g. a shortener resolving to a landing page), signals are assembled from both the submitted URL and the final URL:
  • Describe the destination (where the user ends up): brand_impersonation, domain_age_days, has_email_setup, bot_protection
  • Describe the submitted URL (what was sent): redirect_count, final_url, is_reported
  • Either URL exhibiting the trait: is_link_shortener, has_suspicious_characters
Same-domain redirects (http:// to https://, trailing-slash canonicalization) don’t trigger re-analysis.
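
If you log or display signals, it can help to keep track of which URL each one describes. The sketch below encodes the grouping listed above as a lookup table. The names `SIGNAL_SCOPE` and `group_by_scope` are hypothetical, and the sample values are invented for illustration rather than taken from a real response.

```python
# Which URL each signal describes, per the list above.
SIGNAL_SCOPE = {
    "brand_impersonation": "destination",
    "domain_age_days": "destination",
    "has_email_setup": "destination",
    "bot_protection": "destination",
    "redirect_count": "submitted",
    "final_url": "submitted",
    "is_reported": "submitted",
    "is_link_shortener": "either",
    "has_suspicious_characters": "either",
}


def group_by_scope(signals: dict) -> dict:
    """Split a signals object into destination / submitted / either groups."""
    grouped = {"destination": {}, "submitted": {}, "either": {}}
    for name, value in signals.items():
        scope = SIGNAL_SCOPE.get(name)
        if scope is not None:  # skip fields this sketch does not know about
            grouped[scope][name] = value
    return grouped


# Invented values for a shortener that resolves to a landing page.
sample = {
    "brand_impersonation": None,
    "has_suspicious_characters": False,
    "is_link_shortener": True,
    "domain_age_days": 12,
    "has_email_setup": False,
    "redirect_count": 1,
    "final_url": "https://landing.example/promo",
    "bot_protection": False,
    "is_reported": False,
}
grouped = group_by_scope(sample)
print(grouped["destination"])
```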

Reason codes

reasons is an ordered list of stable codes explaining why the URL looks risky. Codes only appear when a signal or rule actually attributed risk to this URL. A field being present is not enough; it has to have driven the score. Benign URLs return reasons: [].

| Code | Aligns with signal | What it means |
| --- | --- | --- |
| blocklisted | None | The URL’s registered domain matched your blocklist. Verdict comes from configuration, not from analysis. |
| allowlisted | None | The URL’s registered domain matched your allowlist. Verdict comes from configuration, not from analysis. |
| brand_impersonation | signals.brand_impersonation | A brand name is used in the domain or subdomain in a way that doesn’t match its legitimate home. |
| has_suspicious_characters | signals.has_suspicious_characters | Punycode, Unicode lookalikes, or an unusual special-character ratio. |
| is_link_shortener | signals.is_link_shortener | The URL uses a shortener and that pattern contributed to the risk score. |
| is_reported | signals.is_reported | The URL is on one of our threat-intelligence feeds. |
| new_domain | signals.domain_age_days | The destination domain was registered recently and that freshness drove up risk. |
| missing_email_setup | signals.has_email_setup | The destination isn’t set up for email, a common characteristic of throwaway phishing domains. |
| high_risk_tld | None | Top-level domain with disproportionate phishing prevalence. |
| suspicious_keywords | None | URL contains phishing keywords such as login, verify, account, password, secure. |
| suspicious_url_structure | None | Structural red flags: @ in the authority, // redirect trick, IP address as host, URL embedded in path, credential-collecting query parameters, and similar tricks. |
| ssl_invalid | None | The destination’s SSL certificate failed to validate. |

Reasons only describe what increased risk. You will not see has_email_setup as a reason. It’s the absence of email setup that’s concerning, and that surfaces as missing_email_setup.
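
Because the codes are stable, a client can map them to its own wording. The sketch below is one way to do that. The message text and the `explain` helper are hypothetical, and the fallback for unrecognized codes is an assumption meant to keep the client working if new codes are added later.

```python
# Hypothetical user-facing wording for a few of the codes listed above.
REASON_MESSAGES = {
    "brand_impersonation": "Imitates a well-known brand",
    "is_reported": "Listed on a threat-intelligence feed",
    "new_domain": "Domain was registered recently",
    "missing_email_setup": "Domain is not set up to receive email",
    "high_risk_tld": "Uses a top-level domain common in phishing",
    "suspicious_keywords": "Contains phishing-style keywords",
}


def explain(reasons: list) -> list:
    """Translate reason codes, preserving order and passing unknown codes through."""
    return [REASON_MESSAGES.get(code, code) for code in reasons]


print(explain(["brand_impersonation", "suspicious_keywords", "high_risk_tld"]))
```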

Allowlists and blocklists

You can configure per-tenant allowlists and blocklists in the dashboard. These are applied before the risk model runs:
  • A blocklist hit returns risk_score: 1 and reasons: ["blocklisted"]. No signals are returned. The verdict comes from your configuration, not from analysis of the URL.
  • An allowlist hit returns risk_score: 0 and reasons: ["allowlisted"]. Also no signals.
  • Everything else flows through the risk model.
If a domain is on both lists, the blocklist wins.
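
The precedence described above can be sketched as a small pre-check. This only illustrates the documented behavior and is not the service’s implementation. `list_verdict` is a hypothetical name, and `domain` is assumed to be already normalized.

```python
from typing import Optional


def list_verdict(domain: str, allowlist: set, blocklist: set) -> Optional[dict]:
    """Return a configured verdict, or None when the risk model should run."""
    if domain in blocklist:  # checked first, so the blocklist wins on conflict
        return {"risk_score": 1, "reasons": ["blocklisted"]}
    if domain in allowlist:
        return {"risk_score": 0, "reasons": ["allowlisted"]}
    return None  # neither list matched: flows through to the risk model


allow = {"example.com", "partner.example"}
block = {"example.com", "evil.xyz"}
print(list_verdict("example.com", allow, block))  # on both lists
```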

Domain-level matching

Entries match at the registered domain level. Subdomains are not matched automatically. To allow every subdomain of your service, add each one explicitly. Given an allowlist entry of example.com:

| URL | Matches? |
| --- | --- |
| https://example.com/page | Yes |
| https://www.example.com/page | Yes (www is normalized away) |
| https://login.example.com/page | No, add login.example.com explicitly |
| https://api.prod.example.com/ | No, add api.prod.example.com explicitly |
| https://example.com.evil.xyz/ | No, the registered domain is evil.xyz |

Matching is case-insensitive. Enter plain domain strings: no scheme, no path, no wildcards. Internationalized domains should be entered in their punycode form (xn--...).
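
To check entries locally before adding them in the dashboard, the rules in the table can be approximated as follows. This is a sketch that reproduces the table’s outcomes by comparing the host, with a leading www. removed, against the entry. The service’s actual matching logic may differ, and `normalize_host` and `matches_entry` are hypothetical names.

```python
from urllib.parse import urlsplit


def normalize_host(url: str) -> str:
    """Lowercased host with a leading 'www.' stripped."""
    host = urlsplit(url).hostname or ""  # hostname is already lowercased
    if host.startswith("www."):
        host = host[len("www."):]
    return host


def matches_entry(url: str, entry: str) -> bool:
    """True when the URL's normalized host equals the list entry exactly."""
    return normalize_host(url) == entry.lower()


for url in [
    "https://example.com/page",
    "https://www.example.com/page",
    "https://login.example.com/page",
    "https://api.prod.example.com/",
    "https://example.com.evil.xyz/",
]:
    print(url, matches_entry(url, "example.com"))
```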

FAQ

Why did the score for a URL change between requests?

Risk is a moving target. Several inputs change between requests:
  • Domains age. A freshly registered domain looks risky today and less risky in six months. domain_age_days grows naturally.
  • Email infrastructure gets added. Legitimate businesses set up MX, SPF, and DMARC records as they grow up; throwaway domains rarely do. has_email_setup can flip from false to true as a domain matures.
  • Threat-intelligence feeds update constantly. A URL not on any feed today may be reported tomorrow.
  • Redirect destinations change. Shorteners and redirectors can be repointed at any time. The destination is re-resolved on every request.
  • The model is updated as the threat landscape shifts.
If you’re caching scores, cache them briefly. Re-evaluate any URL still in active circulation rather than relying on a result that’s hours or days old.
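
One way to cache briefly is a small time-to-live cache in front of the scoring call. The sketch below assumes a 300-second lifetime, which is an arbitrary illustrative value and not a recommendation from this page. `ShortLivedCache`, `score_with_cache`, and the `fetch` callable are hypothetical stand-ins for your own client code.

```python
import time
from typing import Callable, Optional


class ShortLivedCache:
    """In-memory cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds: float = 300.0,
                 clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: dict = {}

    def get(self, url: str) -> Optional[dict]:
        entry = self._entries.get(url)
        if entry is None:
            return None
        stored_at, result = entry
        if self._clock() - stored_at > self._ttl:
            del self._entries[url]  # expired: force a fresh evaluation
            return None
        return result

    def put(self, url: str, result: dict) -> None:
        self._entries[url] = (self._clock(), result)


def score_with_cache(url: str, cache: ShortLivedCache,
                     fetch: Callable[[str], dict]) -> dict:
    """Return a cached result if still fresh, otherwise call fetch again."""
    cached = cache.get(url)
    if cached is not None:
        return cached
    result = fetch(url)
    cache.put(url, result)
    return result
```

Injecting the clock keeps the expiry logic testable without sleeping. In production the default `time.monotonic` is used.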

What threshold should I use?

risk_score >= 0.5 is the default cutoff for “treat as malicious,” and it’s tuned so the rate of false positives at that threshold is low across typical user-generated content. Raise it (e.g. 0.7) if your audience is unusually tolerant of risky links and you want to flag fewer URLs, or lower it (e.g. 0.3) if you’d rather over-block. The reasons array gives you the why in either direction.

What if a URL I trust gets flagged?

Add it to your allowlist. Allowlist entries override the risk model. This is the right tool for your own product domains, trusted partners, and URLs you’ve manually verified as safe.

If you think the score is wrong in a way that would also affect other customers (for example, a brand-impersonation false positive on a legitimate brand variant), let us know and we’ll look at the model.