Configure entity type handling for redaction
By default, when you:
Configure a dataset
Redact a string
Retrieve a redacted file
Textual does the following:
For string and file redaction, replaces detected values with tokens.
For LLM synthesis, generates realistic synthesized values.
When you make the request, you can override the default behavior.
Specifying the handling option for entity types
For each entity type, you can choose to redact, synthesize, or ignore the value.
When you redact a value, Textual replaces the value with a token that consists of the entity type. For example, ORGANIZATION.
When you synthesize a value, Textual replaces the value with a different realistic value.
When you ignore a value, Textual passes through the original value.
To specify the handling option for entity types, you use the generator_config parameter. The parameter maps each <entity_type> to a <handling_option>.
Where:
<entity_type> is the identifier of the entity type. For example, ORGANIZATION. For the list of built-in entity types that Textual scans for, go to Entity types that Textual detects.
<handling_option> is the handling option to use for the specified entity type. The possible values are Redact, Synthesis, and Off.
For example, to synthesize organization values, and ignore languages:
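A minimal sketch of such a configuration, written as a plain Python dictionary. The helper build_generator_config is hypothetical and only checks the handling options named above. LANGUAGE is assumed to be the identifier of the language entity type.

```python
# Handling options named in this section.
HANDLING_OPTIONS = {"Redact", "Synthesis", "Off"}


def build_generator_config(**handling):
    """Hypothetical helper: check each handling option, then return the mapping."""
    for entity_type, option in handling.items():
        if option not in HANDLING_OPTIONS:
            raise ValueError(f"{entity_type}: unknown handling option {option!r}")
    return dict(handling)


# Synthesize organization values and ignore languages.
# LANGUAGE is an assumed identifier for the language entity type.
generator_config = build_generator_config(ORGANIZATION="Synthesis", LANGUAGE="Off")
print(generator_config)
```

You would then pass the resulting dictionary as the generator_config parameter of the request.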
Specifying a default handling option
For string and file redaction, you can specify a default handling option to use for entity types that are not specified in generator_config.
To do this, you use the generator_default parameter.
generator_default can be Redact, Synthesis, or Off.
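The interaction between the two parameters can be sketched as a simple lookup. The function effective_handling is hypothetical and only illustrates the rule described above. It is not part of Textual. The fallback to Redact when no default is given is an assumption based on the default token replacement described at the start of this page.

```python
def effective_handling(entity_type, generator_config, generator_default="Redact"):
    """Hypothetical illustration: return the handling option that applies to entity_type.

    An entry in generator_config wins. Otherwise generator_default applies.
    """
    return generator_config.get(entity_type, generator_default)


generator_config = {"ORGANIZATION": "Synthesis"}
generator_default = "Off"

for entity_type in ("ORGANIZATION", "NAME_GIVEN"):
    print(entity_type, effective_handling(entity_type, generator_config, generator_default))
```

Here ORGANIZATION is synthesized because it is listed in generator_config, and NAME_GIVEN is ignored because it falls back to generator_default.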
Providing added and excluded values for entity types
You can also configure added and excluded values for each entity type.
You add values that Textual does not detect for an entity type, but should. You exclude values that you do not want Textual to identify as that entity type.
To specify the added values, use label_allow_lists.
To specify the excluded values, use label_block_lists.
For each of these parameters, the value identifies the entity types to add or exclude values for. For each entity type, you provide an array of regular expressions that match the values.
The following example uses label_allow_lists to add values:
For NAME_GIVEN, adds the values There and Here.
For NAME_FAMILY, adds values that match the regular expression ([a-z]{2}).
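The configuration described above might look like the following dictionary. The helper added_matches is hypothetical. It uses Python's re module to show which substrings each expression would pick up, and assumes the expressions are applied as a search over the text. How Textual applies the expressions internally may differ.

```python
import re

# Added values for two entity types, each given as an array of regular expressions.
label_allow_lists = {
    "NAME_GIVEN": ["There", "Here"],
    "NAME_FAMILY": ["([a-z]{2})"],
}


def added_matches(text, patterns):
    """Hypothetical helper: return the substrings of text matched by any pattern."""
    found = []
    for pattern in patterns:
        found.extend(m.group(0) for m in re.finditer(pattern, text))
    return found


print(added_matches("Here and There", label_allow_lists["NAME_GIVEN"]))
print(added_matches("ab CD ef", label_allow_lists["NAME_FAMILY"]))
```

A label_block_lists value would follow the same shape, with the expressions matching the values to exclude.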