- Detect if a text contains certain data
- Extract structured data from unstructured text
- Mask out swear words, phone numbers, and other sensitive data
- Pseudonymize data for GDPR compliance
Masking
You can use entity matchers to mask out certain words or phrases, such as swear words, phone numbers, or other sensitive data. You can enable masking and set the replacement value when you add a matcher to your project. When masking is enabled, the content field of the text moderation response will contain the modified text, and the original field will contain the original text. The content_moderated field will indicate whether the content differs from the original text.
Response example with masked content
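The relationship between the three fields can be illustrated with a small local sketch. This is not the service's implementation or its exact response payload: the phone number pattern, the `mask_text` helper, and the `[PHONE]` replacement value are assumptions made up for this illustration. Only the field names content, original, and content_moderated come from the description above.

```python
import re

# Hypothetical pattern for this illustration only; the real matcher is far more robust.
PHONE_PATTERN = re.compile(r"\+?\d[\d\s\-]{7,}\d")


def mask_text(text: str, replacement: str = "[PHONE]") -> dict:
    """Simulate the masking-related fields of a text moderation response."""
    masked = PHONE_PATTERN.sub(replacement, text)
    return {
        "content": masked,                     # modified (masked) text
        "original": text,                      # text as it was submitted
        "content_moderated": masked != text,   # True only if masking changed something
    }


response = mask_text("Call me at 555-123-4567 tonight")
print(response)
```

When nothing is masked, content and original are identical and content_moderated is false, so a client can check that single flag instead of comparing the two strings itself.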
Detection levels
For most matchers you can set the detection level. This determines how strict the matcher should be.

Level | Description |
---|---|
NORMAL | Detect values that are spelled and formatted correctly. |
SUSPICIOUS | Also detect values that are misspelled or obfuscated. |
PARANOID | Also detect values even if somewhat unsure about correctness. |
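
The levels are cumulative: each one detects everything the previous level does, plus more. A minimal sketch of that ordering follows, assuming a hypothetical `DetectionLevel` enum and `is_reported` helper. The numeric values are arbitrary and are not part of the API; only the level names come from the table above.

```python
from enum import IntEnum


class DetectionLevel(IntEnum):
    # Numeric values are an assumption used only to express the ordering.
    NORMAL = 1
    SUSPICIOUS = 2
    PARANOID = 3


def is_reported(required: DetectionLevel, configured: DetectionLevel) -> bool:
    """Return True if a value that needs `required` to be caught is
    reported by a matcher configured at `configured`."""
    return configured >= required


# An obfuscated value needs at least SUSPICIOUS to be caught.
print(is_reported(DetectionLevel.SUSPICIOUS, DetectionLevel.NORMAL))
print(is_reported(DetectionLevel.SUSPICIOUS, DetectionLevel.PARANOID))
```

Under this model, raising the level can only add matches, never remove them, which is the trade-off to weigh: stricter levels catch more obfuscated values at the cost of more false positives.
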
Response signature
All matcher models have the same response signature:
- The detection level used for the match.
- Whether the matcher found a match.
- An array of all the matches found.
- An array of objects with the components of each match. For example, for a name matcher, the components would be the first and last name.