Previewing Textual detection and redaction
The Tonic Textual Home page provides a tool that allows you to see how Textual detects and replaces values in plain text or an uploaded file.
It also provides a preview of the redaction configuration options, including:
How to replace the values for each entity type.
Added and excluded values for each entity type.
The Home page displays automatically when you log in to Textual. To return to the Home page from other pages, in the navigation menu, click Home.
To provide the content to redact, you can enter text directly, or you can upload a file.
As you enter or paste text in the Original Content text area, Textual displays the redacted version in the Results panel at the right.
Textual also provides sample text options for some common use cases. To populate the text with a sample, under Try a sample, click the sample to use.
You can also redact a file. The file must be in one of the supported file formats for datasets.
To provide a file, either:
Drag and drop the file to the Original Content text area.
Click the upload prompt, then search for and select the file.
Textual processes the file and then displays the redacted version in the Results panel. The Original Content text area is removed.
To clear the text, click Clear.
The handling option indicates how Textual replaces a detected value for an entity type. You can experiment with different handling options.
Note that the updated configuration is only used for the current redacted text. When you clear the text, Textual also clears the configuration.
The options are:
Redact - This is the default value. Textual replaces the value with the name of the entity type.
For example, the first name John is replaced with NAME_GIVEN.
Synthesize - Textual replaces the value with a realistic generated value. For example, the first name John is replaced with the first name Michael. The replacement values are consistent, which means that a given value always has the same replacement. For example, Michael is always the replacement value for John.
Off - Textual ignores the value and copies it as is to the Results panel.
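The three handling options can be illustrated with a small sketch. This is not Textual's implementation: the apply_handling function, the FAKE_GIVEN_NAMES pool, and the hash-based selection are all hypothetical, chosen only to show how a replacement can stay consistent for a given input value.

```python
import hashlib

# Hypothetical pool of replacement first names. Textual's real generator is not shown here.
FAKE_GIVEN_NAMES = ["Michael", "Priya", "Sofia", "Daniel", "Aiko", "Omar"]


def apply_handling(value: str, entity_type: str, option: str) -> str:
    """Return the replacement for one detected value under a handling option."""
    if option == "Redact":
        # Replace the value with the name of its entity type.
        return entity_type
    if option == "Synthesize":
        # Hash the original value so that the same input always selects
        # the same replacement (an assumed way to get consistency).
        digest = hashlib.sha256(value.encode("utf-8")).digest()
        return FAKE_GIVEN_NAMES[digest[0] % len(FAKE_GIVEN_NAMES)]
    if option == "Off":
        # Copy the value through unchanged.
        return value
    raise ValueError(f"Unknown handling option: {option}")


for option in ("Redact", "Synthesize", "Off"):
    print(option, "->", apply_handling("John", "NAME_GIVEN", option))
```

In this sketch, Redact yields NAME_GIVEN and Off yields John. Synthesize yields one name from the pool, and repeated calls with John return that same name.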
To change the handling option for an entity type:
In the Results panel, click an instance of the entity type.
On the configuration panel, click the handling option to use.
Textual updates all instances of that entity type to use the selected handling option.
For example, if you change the handling option for NAME_GIVEN to Synthesize, then all instances of first names are replaced with realistic values.
For each entity type in entered text, you can use regular expressions to define added and excluded values.
Added values are values that Textual does not detect for an entity type, but that you want to include. For example, you might have values that are specific to your company or industry.
Excluded values are values that you do not want Textual to identify as a given entity type.
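As a rough illustration of how added and excluded values interact, the following sketch applies regular expression rules to a set of detected spans for a single entity type. The apply_rules function, the sample sentence, and the pretend detections are hypothetical. Whether Textual matches a rule against the whole value or part of it is an assumption here: added values use a search across the text, and excluded values must match the whole detected value.

```python
import re


def apply_rules(text, detected_spans, include_patterns=(), exclude_patterns=()):
    """Return the sorted (start, end) spans for one entity type after the rules run."""
    spans = set(detected_spans)
    # Added values: every regex match becomes an extra span for the entity type.
    for pattern in include_patterns:
        spans.update(match.span() for match in re.finditer(pattern, text))
    # Excluded values: drop any span whose full text matches an exclude regex.
    return sorted(
        (start, end)
        for start, end in spans
        if not any(re.fullmatch(p, text[start:end]) for p in exclude_patterns)
    )


text = "Acme hired Globex to audit the Zorblatt account."


def span_of(word):
    """Locate the first occurrence of a word in the sample text."""
    start = text.index(word)
    return (start, start + len(word))


# Pretend the model detected these two organizations and missed "Zorblatt".
detected = [span_of("Acme"), span_of("Globex")]

result = apply_rules(
    text,
    detected,
    include_patterns=[r"Zorblatt"],  # company-specific value to add
    exclude_patterns=[r"Acme"],      # value that should not count as this entity type
)
print([text[start:end] for start, end in result])  # ['Globex', 'Zorblatt']
```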
Note that the configuration is only used for the current redacted text. When you clear the text, Textual also clears the configuration.
Also, this option is only available for text that you enter directly. For an uploaded file, to do additional configuration or to download the file, you must create a dataset from the file.
To display the configuration panel for added and excluded values, click Fine-tune Results.
The Fine-Tune Results panel displays the list of configured rules for the current text. For each rule, the list includes:
The entity type.
Whether the rule adds or excludes values.
The regular expression to identify the added or excluded values.
On the Fine-Tune Results panel, to create a rule:
Click Add Rule.
From the entity type dropdown list, select the entity type that the rule applies to.
From the rule type dropdown list:
If the rule adds values, then select Include.
If the rule excludes values, then select Exclude.
In the regular expression field, provide the regular expression to use to identify the values to add or exclude.
To save the rule, click the save icon.
To edit a rule:
On the Fine-Tune Results panel, click the edit icon for the rule.
Update the configuration.
Click the save icon.
On the Fine-Tune Results panel, to delete a rule, click its delete icon.
From an uploaded file, you can create a dataset that contains the file.
You can then provide additional configuration, such as added and excluded values, and download the redacted file.
To create a dataset from an uploaded file:
Click Download.
Click Create a Dataset.
Textual displays the dataset details for the new dataset. The dataset name is Playground Dataset <number>, where the number reflects the number of datasets that were created from the Home page.
The dataset contains the uploaded file.
When Textual generates the redacted version of the text, it also generates the corresponding API request. The request includes the entity type configuration.
To view the API request code, click Show Code.
To hide the code, click Hide Code.
On the code panel:
The Python tab contains the Python version of the request.
The cURL tab contains the cURL version of the request.
To copy the currently selected version of the request code, click Copy Code.
For entered text on the Home page, Textual offers an option to send the following to an OpenAI large language model (LLM):
The detected entity values.
The text that surrounds each value.
The LLM processing is not available for uploaded files.
It is also limited to text that contains 100 or fewer words.
The LLM processing is intended to improve the detection and the replacement values. The LLM:
Verifies that the assigned entity type is correct.
If it is not, determines the correct entity type.
Standardizes an entity value that has different formats, such as Main St. versus Main Street.
Generates replacement values that use the same format as the original value.
To enable the LLM processing:
Set the environment variable ENABLE_EXPERIMENTAL_SYNTHESIS to True.
By default, the LLM processing uses the gpt-4o model.
To use a different model, configure the environment variable LLM_MODEL. For example, to use gpt-4o-mini, set LLM_MODEL to openai/gpt-4o-mini.
To use Azure OpenAI, you also configure the following environment variables:
AZURE_OPENAI_API_KEY - The API key.
AZURE_API_BASE - The URL of your Azure OpenAI deployment.
AZURE_API_VERSION - The API version of Azure to use.
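Before you start Textual, a quick check of these settings can catch a missing variable. The following is a minimal sketch and not part of Textual: the check_llm_settings and effective_model helpers are hypothetical, and treating True as case-insensitive is an assumption. The functions take a plain dictionary so that they can be run against os.environ or against values parsed from a deployment file.

```python
import os

# Variable names are taken from the list above.
AZURE_VARS = ("AZURE_OPENAI_API_KEY", "AZURE_API_BASE", "AZURE_API_VERSION")


def check_llm_settings(env, use_azure=False):
    """Return a list of problems. An empty list means the settings look complete."""
    problems = []
    # Assumption: the flag is compared case-insensitively to "true".
    if env.get("ENABLE_EXPERIMENTAL_SYNTHESIS", "").strip().lower() != "true":
        problems.append("ENABLE_EXPERIMENTAL_SYNTHESIS is not set to True")
    if use_azure:
        for name in AZURE_VARS:
            if not env.get(name):
                problems.append(f"{name} is not set")
    return problems


def effective_model(env):
    """Return the configured LLM_MODEL, or the documented default when it is unset."""
    return env.get("LLM_MODEL") or "gpt-4o"


if __name__ == "__main__":
    for problem in check_llm_settings(dict(os.environ), use_azure=False):
        print("Problem:", problem)
    print("Model:", effective_model(dict(os.environ)))
```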
After you enter text in the Original Content panel, to enable the LLM processing, in the Results panel, click Use an LLM to perform AI synthesis.
You cannot use this option for text that contains more than 100 words.
When you clear the text, Textual reverts to the default processing.
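To check in advance whether a piece of text fits the 100-word limit for LLM processing, a simple count is usually enough. This sketch assumes that words are separated by whitespace. How Textual itself counts words is not documented here, so text that is close to the limit might be treated differently. The within_llm_word_limit helper is hypothetical.

```python
MAX_LLM_WORDS = 100  # Limit stated for LLM processing on the Home page.


def within_llm_word_limit(text, limit=MAX_LLM_WORDS):
    """Return True when the text has at most `limit` whitespace-separated words."""
    return len(text.split()) <= limit


sample = "John Smith moved to 12 Main St. last year."
print(within_llm_word_limit(sample))  # True
```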