Configuring added and excluded values for an entity type
Last updated
For each entity type in a dataset, you can configure additional values to detect, and values to exclude.
You might add values that Textual does not detect because, for example, they are specific to your organization or industry.
You might exclude a value because:
Textual labeled the value incorrectly.
You do not want to redact a specific value. For example, you might want to preserve known test values.
Note that for a pipeline that redacts files, you cannot add or exclude specific values.
In the entity types list, the add values and exclude values icons indicate whether there are configured added and excluded values for the entity type.
When added or excluded values are configured, the corresponding icon is green.
When there are no configured values, the corresponding icon is black.
From the Custom Entity Detection panel, you configure both added and excluded values for entity types.
To display the panel, either:
Click the add values or exclude values icon for an entity type.
In the word count panel, click Custom Entity Detection.
The panel contains an Add to detection tab for added values, and an Exclude from detection tab for excluded values.
The entity type dropdown list at the top of the Custom Entity Detection panel identifies the entity type for which you are configuring added and excluded values.
If you display the panel from an add values or exclude values icon, then the initial selected entity type is the entity type for which you clicked the icon. To configure values for a different entity type, select the entity type from the list.
If you display the panel from the Custom Entity Detection option, then there is no default selection. You must select the entity type.
On the Add to detection tab, you configure the added values for the selected entity type.
Each value can be a specific word or phrase, or a regular expression to identify the values to add. Regular expressions must be C# compatible.
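As a rough illustration of what a regular expression for an added value might look like, the following Python sketch checks a hypothetical pattern for internal ticket IDs such as ACME-12345. The pattern, the function name, and the sample text are assumptions made for this example and do not come from Textual. Python's re module is used only for a local sanity check. Because Textual requires C#-compatible expressions, the sketch sticks to constructs that behave the same way in both engines.

```python
import re

# Hypothetical pattern for an organization-specific ID (not from Textual).
# It uses only constructs shared by Python and C# regular expressions:
# word boundaries, literal text, and a bounded character-class repetition.
TICKET_ID = r"\bACME-[0-9]{5}\b"


def find_added_values(pattern: str, text: str) -> list[str]:
    """Return every substring of text that the pattern matches."""
    return [m.group(0) for m in re.finditer(pattern, text)]


sample = "Ticket ACME-12345 was merged into ACME-00042; see ACME-7."
matches = find_added_values(TICKET_ID, sample)
print(matches)  # ['ACME-12345', 'ACME-00042']
```

The explicit [0-9] class is used instead of \d because \d can also match non-ASCII digits, which is rarely what an ID pattern intends.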
To add an added value:
Click the empty entry.
Type the value into the field.
To edit an added value:
Click the value.
Update the value text.
For each added value, you can test whether Textual correctly detects it.
To test a value:
From the Test Entry dropdown list, select the number for the value to test.
In the text field, type or paste content that contains a value or values that Textual should detect.
The Results field displays the text and highlights matching values.
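To preview roughly what the Results field would show before you enter a value in the panel, a small local script can mark matches in sample text. This is only an approximation under stated assumptions: the highlight function and the [[...]] markers are invented for this sketch, matching is case-sensitive here, and Textual's own matching behavior may differ.

```python
import re


def highlight(pattern: str, text: str, literal: bool = False) -> str:
    """Wrap each match in [[...]] to mimic a highlighted preview."""
    if literal:
        # Treat the value as a plain word or phrase, not a regular expression.
        pattern = re.escape(pattern)
    return re.sub(pattern, lambda m: f"[[{m.group(0)}]]", text)


preview = highlight(
    "Project Falcon",
    "Notes on Project Falcon and Project Eagle.",
    literal=True,
)
print(preview)  # Notes on [[Project Falcon]] and Project Eagle.
```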
To remove an added value, click its delete icon.
On the Exclude from detection tab, you configure the excluded values for the selected entity type.
Each value can be either a specific word or phrase to exclude, or a regular expression to identify the values to exclude. The regular expression must be C# compatible.
You can also provide a specific context within which to ignore a value. For example, in the phrase "one moment, please", you probably do not want the word "one" to be detected as a numeric value. If you specify "one moment, please" as an excluded value for the numeric entity type, then "one" is not identified as a number when it is seen in that context.
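One way to picture the context-based exclusion is as a filter over detected spans. The Python sketch below is a conceptual model only, not Textual's implementation: the Span type, the apply_exclusions function, the NUMERIC_VALUE label, and the stand-in detector that flags every "one" are all assumptions made for illustration.

```python
import re
from typing import NamedTuple


class Span(NamedTuple):
    start: int
    end: int
    label: str


def apply_exclusions(text, detections, excluded_phrases, label):
    """Drop detections of `label` that fall inside an excluded phrase."""
    # Character ranges covered by any occurrence of an excluded phrase.
    blocked = [
        (m.start(), m.end())
        for phrase in excluded_phrases
        for m in re.finditer(re.escape(phrase), text)
    ]
    kept = []
    for d in detections:
        inside = any(s <= d.start and d.end <= e for s, e in blocked)
        if d.label == label and inside:
            continue  # detected only within the excluded context
        kept.append(d)
    return kept


text = "one moment, please. I have one question."
# Stand-in detector (hypothetical): flag every standalone "one" as numeric.
detections = [
    Span(m.start(), m.end(), "NUMERIC_VALUE")
    for m in re.finditer(r"\bone\b", text)
]
kept = apply_exclusions(text, detections, ["one moment, please"], "NUMERIC_VALUE")
print([(d.start, text[d.start:d.end]) for d in kept])  # [(27, 'one')]
```

In this model, the first "one" is dropped because it sits inside the excluded phrase, while the second "one" is still reported as a numeric value.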
To add an excluded value:
Click the empty entry.
Type the value into the field.
To edit an excluded value:
Click the value.
Update the value text.
For each excluded value, you can test whether Textual correctly excludes it.
To test a value:
From the Test Entry dropdown list, select the number for the value to test.
In the text field, type or paste content that contains a value or values to exclude.
The Results field displays the text and highlights matching values.
To remove an excluded value, click its delete icon.
The new added and excluded values are not reflected in the entity types list until Textual runs a new scan.
When you save the changes, you can choose whether to immediately run a new scan on the dataset files.
To save the changes and also start a scan, click Save and Scan Files.
To save the changes without running a scan, click Save Without Scanning Files. If you do not run the scan, then the dataset details page displays a prompt to run one.