Selecting the handling option for entity types

Required dataset permission: Edit dataset settings

For each entity type, you choose how to handle the detected values.

Available handling options

The available options are:

Synthesis - Replaces the value with another realistic value. For example, the first name value Michael might be replaced with the value John. The synthesized values are always consistent, which means that a given entity value always has the same replacement value. For example, if the first name Michael appears multiple times in the text, it is always replaced with John. Textual does not synthesize any excluded values. For custom entity types, Textual scrambles the values.
Redaction - This is the default option for every entity type except Full Mailing Address, which is Off by default. For text files, Redaction tokenizes the value, which means that it replaces the value with a token that consists of the entity type followed by a unique identifier. For example, the first name value Michael might be replaced with NAME_GIVEN_12m5s. The identifiers are consistent, which means that a given original value always gets the same unique identifier. For example, the first name Michael might always be replaced with NAME_GIVEN_12m5s, while the first name Helen might always be replaced with NAME_GIVEN_9ha3m2. For PDF files, Redaction either covers the value with a black box or, if there is space, displays the entity type and identifier. For image files, Redaction covers the value with a black box. Textual does not redact any excluded values.
Off - Leaves the values unchanged. For example, the first name value Michael remains Michael. This is the default option for the Full Mailing Address entity type.
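
The consistency behavior described above can be illustrated with a short sketch. This is not how Textual generates replacement values. It is a minimal Python illustration that assumes a hash-based identifier and a small, made-up pool of replacement names. The function name handle_value, the pool REPLACEMENT_FIRST_NAMES, and the identifier format are all hypothetical.

```python
import hashlib

# Hypothetical pool of replacement names, used only for this illustration.
REPLACEMENT_FIRST_NAMES = ["John", "Maria", "Wei", "Aisha", "Carlos"]


def _digest(value: str) -> str:
    """Return a stable hex digest so that the same input always maps to the same output."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def handle_value(value: str, entity_type: str, option: str) -> str:
    """Apply one handling option to one detected value (illustrative only)."""
    if option == "Off":
        # Off leaves the value unchanged.
        return value
    if option == "Redaction":
        # Token = entity type + identifier derived from the original value.
        # The 6-character hash prefix is an assumption, not Textual's format.
        return f"{entity_type}_{_digest(value)[:6]}"
    if option == "Synthesis":
        # Deterministically pick a replacement, so repeated values stay consistent.
        index = int(_digest(value), 16) % len(REPLACEMENT_FIRST_NAMES)
        return REPLACEMENT_FIRST_NAMES[index]
    raise ValueError(f"Unknown handling option: {option}")


if __name__ == "__main__":
    for option in ("Synthesis", "Redaction", "Off"):
        print(option, "->", handle_value("Michael", "NAME_GIVEN", option))
```

Because each replacement is derived only from the original value, every occurrence of Michael receives the same token or the same synthesized name, which is the consistency property that the Synthesis and Redaction options describe.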

Selecting the handling option for a specific entity type

To select the handling option for an individual entity type, click the option for that type.

Selecting the handling option for all of the entity types

To select the same handling option for all of the entity types in a dataset, select the option from the Bulk Edit dropdown above the data type list.

For a pipeline that generates synthesized files, on the Generator Config tab, use the Bulk Edit options at the top of the entity types list.
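
Conceptually, a bulk edit assigns one handling option to every entity type, after which individual entity types can still be changed one at a time. The sketch below models that as a plain Python dictionary. It does not call any Textual API. The helper bulk_edit is hypothetical, and the entity type names other than NAME_GIVEN are assumptions chosen for illustration.

```python
# The three handling options described on this page.
HANDLING_OPTIONS = {"Synthesis", "Redaction", "Off"}

# Entity type names for illustration; only NAME_GIVEN appears in the text above.
ENTITY_TYPES = ["NAME_GIVEN", "NAME_FAMILY", "LOCATION_ADDRESS"]


def bulk_edit(entity_types: list[str], option: str) -> dict[str, str]:
    """Return a mapping that assigns the same handling option to every entity type."""
    if option not in HANDLING_OPTIONS:
        raise ValueError(f"Unknown handling option: {option}")
    return {entity_type: option for entity_type in entity_types}


# Bulk step: set every entity type to Redaction.
config = bulk_edit(ENTITY_TYPES, "Redaction")

# Individual step: override a single entity type afterward.
config["NAME_GIVEN"] = "Synthesis"

if __name__ == "__main__":
    for entity_type, option in config.items():
        print(f"{entity_type}: {option}")
```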
