Fields
| Field | Type | Meaning |
|---|---|---|
| url | string | The URL that was evaluated. |
| risk_score | number (0.0–1.0) | Risk score. Higher is riskier. Scores at or above 0.5 are treated as malicious by default; you can apply a stricter or looser cutoff for your use case. |
| reasons | string[] | Stable codes explaining why the URL looks risky. Empty when the URL is clean. A non-empty list means something was actually flagged; it is not a full audit of what was checked. |
| signals | object | Observable properties of the URL, described below. |
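As a minimal sketch of consuming these fields, the snippet below parses a response body and applies the default 0.5 cutoff. The JSON values are illustrative rather than real service output, and `is_malicious` and `DEFAULT_CUTOFF` are hypothetical helper names, not part of the API.

```python
import json

# Default "treat as malicious" cutoff described in the risk_score row.
DEFAULT_CUTOFF = 0.5

def is_malicious(result: dict, cutoff: float = DEFAULT_CUTOFF) -> bool:
    """Return True when the risk score is at or above the cutoff."""
    return result["risk_score"] >= cutoff

# Hypothetical response body for illustration only.
raw = """
{
  "url": "https://paypal-verify.xyz/login",
  "risk_score": 0.91,
  "reasons": ["brand_impersonation", "suspicious_keywords"],
  "signals": {
    "brand_impersonation": {"brand": "paypal", "method": "registered_domain_token"},
    "has_suspicious_characters": false,
    "is_link_shortener": false,
    "domain_age_days": null,
    "has_email_setup": null,
    "redirect_count": null,
    "final_url": null,
    "bot_protection": null,
    "is_reported": false
  }
}
"""

result = json.loads(raw)
print(is_malicious(result))               # True: 0.91 >= 0.5
print(is_malicious(result, cutoff=0.95))  # False: 0.91 < 0.95
```

Passing the cutoff as a parameter keeps the default visible while letting each call site choose its own.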
Signals
Observable properties of the URL. The shape is consistent on every request. Fields that aren’t applicable or weren’t checked come back as null.
| Field | Type | Meaning | Null when |
|---|---|---|---|
| brand_impersonation | {brand, method} \| null | A well-known brand name appears in the URL in a way that doesn’t match its legitimate domain, e.g. paypal-verify.xyz or paypal.evil.com. brand is the impersonated brand (e.g. "paypal"); method is "registered_domain_token" or "subdomain_token". | No brand match detected. |
| has_suspicious_characters | boolean | Punycode, Unicode lookalike characters, or an unusual ratio of special characters (classic typosquatting and homograph-attack indicators). Flagged if any URL in the redirect chain exhibits this. | Always populated. |
| is_link_shortener | boolean | A known shortener is used anywhere in the redirect chain. This includes first-party (youtu.be, lnkd.in) and third-party (bit.ly, tinyurl.com, and others). | Always populated. |
| domain_age_days | integer \| null | How many days ago the destination’s domain was registered. Freshly registered domains (under 30 days old) are disproportionately used for phishing. Describes the registered domain, not the subdomain. | The signal isn’t informative for this URL, or wasn’t needed to reach a verdict. |
| has_email_setup | boolean \| null | Whether the destination’s domain is set up to receive email. Legitimate businesses almost always are; throwaway phishing domains often aren’t. Describes the registered domain. | Not needed to reach a verdict. |
| redirect_count | integer \| null | Number of redirect hops from the submitted URL to its final destination. 0 means no redirect. | Not needed to reach a verdict. |
| final_url | string \| null | The final URL reached after following redirects. Equal to the submitted URL when there’s no redirect. | Not needed to reach a verdict. |
| bot_protection | boolean \| null | Whether the destination sits behind a bot challenge or web application firewall. When true, some destination-describing signals may be null because we can’t see past the challenge. | Not needed to reach a verdict. |
| is_reported | boolean | The submitted URL matches one of our threat-intelligence feeds. Stays false if a redirect destination is reported but the submitted URL itself isn’t. | Always populated. |
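Because several signals are nullable, client code has three states to distinguish: true, false, and not checked. The sketch below shows one way to do that; the helper names are hypothetical, and the 30-day boundary comes from the domain_age_days row above.

```python
from typing import Optional

def email_setup_status(signals: dict) -> str:
    """Map has_email_setup to one of three states without conflating null and false."""
    value: Optional[bool] = signals.get("has_email_setup")
    if value is None:
        return "not checked"
    return "present" if value else "absent"

def domain_age_status(signals: dict) -> str:
    """Classify domain_age_days, treating null as 'not checked'."""
    age: Optional[int] = signals.get("domain_age_days")
    if age is None:
        return "not checked"
    return "fresh" if age < 30 else "established"

print(email_setup_status({"has_email_setup": None}))   # not checked
print(email_setup_status({"has_email_setup": False}))  # absent
print(domain_age_status({"domain_age_days": 12}))      # fresh
```

A plain truthiness test such as `if not signals["has_email_setup"]` would treat null and false the same way, which is the mistake this pattern avoids.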
Not every URL is analyzed in full depth. URLs that are clearly clean or clearly malicious from the string alone get a fast verdict, and the network-level signals (domain_age_days, has_email_setup, redirect_count, final_url, bot_protection) come back null. Treat null as “not checked,” not “not present.”
How signals describe redirect chains
When a URL redirects across domains (e.g. a shortener resolving to a landing page), signals are assembled from both the submitted URL and the final URL:
- Describe the destination (where the user ends up): brand_impersonation, domain_age_days, has_email_setup, bot_protection
- Describe the submitted URL (what was sent): redirect_count, final_url, is_reported
- Either URL exhibiting the trait: is_link_shortener, has_suspicious_characters
Same-domain redirects (e.g. http:// → https://, trailing-slash canonicalization) don’t trigger re-analysis.
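When displaying or logging signals, it can help to keep track of which URL each one describes. The following sketch encodes the grouping from this section as a lookup table; `SIGNAL_SCOPE` and `group_by_scope` are hypothetical names introduced for illustration.

```python
# Which URL each signal describes, per the grouping in this section.
SIGNAL_SCOPE = {
    "brand_impersonation": "destination",
    "domain_age_days": "destination",
    "has_email_setup": "destination",
    "bot_protection": "destination",
    "redirect_count": "submitted",
    "final_url": "submitted",
    "is_reported": "submitted",
    "is_link_shortener": "either",
    "has_suspicious_characters": "either",
}

def group_by_scope(signals: dict) -> dict:
    """Split a signals object into destination, submitted, and either groups."""
    grouped = {"destination": {}, "submitted": {}, "either": {}}
    for name, value in signals.items():
        scope = SIGNAL_SCOPE.get(name)
        if scope is not None:  # ignore fields this table does not know about
            grouped[scope][name] = value
    return grouped

# Illustrative values only.
sample = {"is_link_shortener": True, "redirect_count": 1, "domain_age_days": 4}
print(group_by_scope(sample))
```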
Reason codes
reasons is an ordered list of stable codes explaining why the URL looks risky. Codes only appear when a signal or rule actually attributed risk to this URL. A field being present is not enough; it has to have driven the score. Benign URLs return reasons: [].
| Code | Aligns with signal | What it means |
|---|---|---|
| blocklisted | None | The URL’s registered domain matched your blocklist. Verdict comes from configuration, not from analysis. |
| allowlisted | None | The URL’s registered domain matched your allowlist. Verdict comes from configuration, not from analysis. |
| brand_impersonation | signals.brand_impersonation | A brand name is used in the domain or subdomain in a way that doesn’t match its legitimate home. |
| has_suspicious_characters | signals.has_suspicious_characters | Punycode, Unicode lookalikes, or an unusual special-character ratio. |
| is_link_shortener | signals.is_link_shortener | The URL uses a shortener and that pattern contributed to the risk score. |
| is_reported | signals.is_reported | The URL is on one of our threat-intelligence feeds. |
| new_domain | signals.domain_age_days | The destination domain was registered recently and that freshness drove up risk. |
| missing_email_setup | signals.has_email_setup | The destination isn’t set up for email, a common characteristic of throwaway phishing domains. |
| high_risk_tld | None | Top-level domain with disproportionate phishing prevalence. |
| suspicious_keywords | None | URL contains phishing keywords such as login, verify, account, password, secure. |
| suspicious_url_structure | None | Structural red flags: @ in the authority, // redirect trick, IP address as host, URL embedded in path, credential-collecting query parameters, and similar tricks. |
| ssl_invalid | None | The destination’s SSL certificate failed to validate. |
You won’t see has_email_setup as a reason. It’s the absence of email setup that’s concerning, and that surfaces as missing_email_setup.
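To show supporting evidence next to each reason, a client can pair reason codes with the signal they align with. This is a sketch based on the "Aligns with signal" column; `REASON_TO_SIGNAL` and `evidence_for` are hypothetical names, and codes with no aligned signal yield None.

```python
# Reason code -> aligned field under signals, taken from the reason-code table.
REASON_TO_SIGNAL = {
    "brand_impersonation": "brand_impersonation",
    "has_suspicious_characters": "has_suspicious_characters",
    "is_link_shortener": "is_link_shortener",
    "is_reported": "is_reported",
    "new_domain": "domain_age_days",
    "missing_email_setup": "has_email_setup",
}

def evidence_for(result: dict) -> list:
    """Return (reason, aligned signal value) pairs, preserving the order of reasons."""
    signals = result.get("signals") or {}
    pairs = []
    for code in result.get("reasons", []):
        field = REASON_TO_SIGNAL.get(code)
        pairs.append((code, signals.get(field) if field else None))
    return pairs

# Illustrative result, not real service output.
result = {
    "reasons": ["new_domain", "high_risk_tld"],
    "signals": {"domain_age_days": 3, "has_email_setup": False},
}
print(evidence_for(result))  # [('new_domain', 3), ('high_risk_tld', None)]
```

Note that has_email_setup is false in the sample but does not appear in the output, since only codes present in reasons are reported.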
Allowlists and blocklists
You can configure per-tenant allowlists and blocklists in the dashboard. These are applied before the risk model runs:
- A blocklist hit returns risk_score: 1 and reasons: ["blocklisted"]. No signals are returned. The verdict comes from your configuration, not from analysis of the URL.
- An allowlist hit returns risk_score: 0 and reasons: ["allowlisted"]. Also no signals.
- Everything else flows through the risk model.
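Since list hits come back without signals, client code that reads signals should tolerate their absence. The sketch below assumes the signals key may be either missing or null on a list hit (the text above does not say which), and uses hypothetical helper names.

```python
# Reason codes that indicate a verdict from tenant configuration.
CONFIG_REASONS = {"blocklisted", "allowlisted"}

def verdict_source(result: dict) -> str:
    """Return 'configuration' for allowlist/blocklist hits, otherwise 'model'."""
    reasons = result.get("reasons", [])
    if any(code in CONFIG_REASONS for code in reasons):
        return "configuration"
    return "model"

def safe_signals(result: dict) -> dict:
    """Return the signals object, or an empty dict when it is missing or null."""
    return result.get("signals") or {}

# Illustrative blocklist hit.
hit = {"url": "https://bad.example/", "risk_score": 1, "reasons": ["blocklisted"]}
print(verdict_source(hit))  # configuration
print(safe_signals(hit))    # {}
```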
Domain-level matching
Entries match at the registered domain level. Subdomains are not matched automatically. To allow every subdomain of your service, add each one explicitly. Given an allowlist entry of example.com:
| URL | Matches? |
|---|---|
| https://example.com/page | Yes |
| https://www.example.com/page | Yes (www is normalized away) |
| https://login.example.com/page | No, add login.example.com explicitly |
| https://api.prod.example.com/ | No, add api.prod.example.com explicitly |
| https://example.com.evil.xyz/ | No, the registered domain is evil.xyz |
xn--...).
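To preview locally whether a URL would match a list entry, the table's behavior can be approximated with an exact host comparison after stripping a leading www. This is a sketch that reproduces the rows above, not the service's actual matcher; `normalize_host` and `matches_entry` are hypothetical names, and internationalized hosts are not handled.

```python
from urllib.parse import urlsplit

def normalize_host(host: str) -> str:
    """Lowercase the host and drop a single leading 'www.' label."""
    host = host.lower()
    return host[4:] if host.startswith("www.") else host

def matches_entry(url: str, entry: str) -> bool:
    """Return True when the URL's host equals the list entry after normalization."""
    host = urlsplit(url).hostname or ""
    return normalize_host(host) == normalize_host(entry)

for url in [
    "https://example.com/page",
    "https://www.example.com/page",
    "https://login.example.com/page",
    "https://api.prod.example.com/",
    "https://example.com.evil.xyz/",
]:
    print(url, matches_entry(url, "example.com"))
```

Only the first two URLs match; the last one fails because the host is compared as a whole rather than by prefix.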
FAQ
Why does the score for the same URL change over time?
Risk is a moving target. Several inputs change between requests:
- Domains age. A freshly registered domain looks risky today and less risky in six months. domain_age_days grows naturally.
- Email infrastructure gets added. Legitimate businesses set up MX, SPF, and DMARC records as they grow up; throwaway domains rarely do. has_email_setup can flip from false to true as a domain matures.
- Threat-intelligence feeds update constantly. A URL not on any feed today may be reported tomorrow.
- Redirect destinations change. Shorteners and redirectors can be repointed at any time. The destination is re-resolved on every request.
- The model is updated as the threat landscape shifts.
What threshold should I use?
risk_score >= 0.5 is the default cutoff for “treat as malicious,” and it’s tuned so the rate of false positives at that threshold is low across typical user-generated content. Raise it (e.g. 0.7) if your audience is unusually tolerant of risky links, or lower it (e.g. 0.3) if you’d rather over-block. The reasons array gives you the why in either direction.
A legitimate URL of mine is being flagged. What do I do?
Add it to your allowlist. Allowlist entries override the risk model. This is the right tool for your own product domains, trusted partners, and URLs you’ve manually verified as safe.

If you think the score is wrong in a way that would also affect other customers (for example, a brand-impersonation false positive on a legitimate brand variant), let us know and we’ll look at the model.