Previewing Textual detection and redaction
The Tonic Textual Home page provides a tool that allows you to see how Textual detects and replaces values in plain text or an uploaded file.
It also provides a preview of the redaction configuration options, including:
How to replace the values for each entity type.
Added and excluded values for each entity type.
The Home page displays automatically when you log in to Textual. To return to the Home page from other pages, in the navigation menu, click Home.
To provide the content to redact, you can enter text directly, or you can upload a file.
As you enter or paste text in the Original Content text area, Textual displays the redacted version in the Results panel at the right.
Textual also provides sample text options for some common use cases. To populate the text with a sample, under Try a sample, click the sample to use.
You can also redact .txt or .docx files.
To provide a file, either:
Drag and drop the file to the Original Content text area.
Click the upload prompt, then search for and select the file.
Textual processes the file and then displays the redacted version in the Results panel. The Original Content text area is removed.
To clear the text, click Clear.
The handling option indicates how Textual replaces a detected value for an entity type. You can experiment with different handling options.
Note that the updated configuration is only used for the current redacted text. When you clear the text, Textual also clears the configuration.
The options are:
Redact - This is the default value. Textual replaces the value with the name of the entity type.
For example, the first name John is replaced with NAME_GIVEN.
Synthesize - Textual replaces the value with a realistic generated value. For example, the first name John is replaced with the first name Michael. The replacement values are consistent, which means that a given value always has the same replacement. For example, Michael is always the replacement value for John.
Off - Textual ignores the value and copies it as is to the Results panel.
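The difference between the three handling options can be illustrated with a small standalone sketch. This is not Textual's implementation and does not call the Textual API. The function names, the sample name pool, and the hash-based selection are all assumptions chosen only to show the behavior described above, in particular that Synthesize gives the same replacement every time for the same input.

```python
import hashlib

# Hypothetical pool of replacement first names, for illustration only.
SAMPLE_FIRST_NAMES = ["Michael", "Sarah", "David", "Priya", "Elena"]


def synthesize(value: str, candidates=SAMPLE_FIRST_NAMES) -> str:
    """Pick a replacement deterministically, so the same input always
    maps to the same output (the consistency property)."""
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    return candidates[digest[0] % len(candidates)]


def apply_handling(value: str, entity_type: str, option: str) -> str:
    """Return the text shown in place of a detected value."""
    if option == "Redact":
        # Replace the value with the name of the entity type.
        return entity_type
    if option == "Synthesize":
        # Replace the value with a generated value.
        return synthesize(value)
    if option == "Off":
        # Copy the value through unchanged.
        return value
    raise ValueError(f"Unknown handling option: {option}")


for option in ("Redact", "Synthesize", "Off"):
    print(option, "->", apply_handling("John", "NAME_GIVEN", option))
```

A toy hash over a fixed list can map a name to itself or map two names to the same replacement. A real generator would need to handle those cases.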
To change the handling option for an entity type:
In the Results panel, click an instance of the entity type.
On the configuration panel, click the handling option to use.
Textual updates all instances of that entity type to use the selected handling option.
For example, if you change the handling option for NAME_GIVEN to Synthesize, then all instances of first names are replaced with realistic values.
For each entity type in entered text, you can use regular expressions to define added and excluded values.
Added values are values that Textual does not detect for an entity type, but that you want to include. For example, you might have values that are specific to your company or industry.
Excluded values are values that you do not want Textual to identify as a given entity type.
Note that the configuration is only used for the current redacted text. When you clear the text, Textual also clears the configuration.
Also, this option is only available for text that you enter directly. For an uploaded file, to do additional configuration or to download the file, you must create a dataset from the file.
To display the configuration panel for added and excluded values, click Fine-tune Results.
The Fine-Tune Results panel displays the list of configured rules for the current text. For each rule, the list includes:
The entity type.
Whether the rule adds or excludes values.
The regular expression to identify the added or excluded values.
On the Fine-Tune Results panel, to create a rule:
Click Add Rule.
From the entity type dropdown list, select the entity type that the rule applies to.
From the rule type dropdown list:
If the rule adds values, then select Include.
If the rule excludes values, then select Exclude.
In the regular expression field, provide the regular expression to use to identify the values to add or exclude.
To save the rule, click the save icon.
To edit a rule:
On the Fine-Tune Results panel, click the edit icon for the rule.
Update the configuration.
Click the save icon.
On the Fine-Tune Results panel, to delete a rule, click its delete icon.
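To make the effect of Include and Exclude rules concrete, the following sketch applies a list of rules to a set of detected spans. It is only an illustration and makes several assumptions: the Rule class and apply_rules function are hypothetical, an Exclude rule is assumed to match the whole detected value, an Include rule is assumed to match anywhere in the text, and the ORGANIZATION label is used only as an example entity type. Textual's actual matching behavior might differ.

```python
import re
from dataclasses import dataclass


@dataclass
class Rule:
    entity_type: str  # entity type that the rule applies to
    kind: str         # "Include" adds values, "Exclude" removes them
    pattern: str      # regular expression that identifies the values


def apply_rules(text, detections, rules):
    """Return (start, end, entity_type) spans after applying the rules.

    detections is a list of (start, end, entity_type) tuples that stand
    in for the values that were detected automatically.
    """
    kept = []
    for start, end, entity_type in detections:
        value = text[start:end]
        excluded = any(
            rule.kind == "Exclude"
            and rule.entity_type == entity_type
            and re.fullmatch(rule.pattern, value)
            for rule in rules
        )
        if not excluded:
            kept.append((start, end, entity_type))

    for rule in rules:
        if rule.kind != "Include":
            continue
        for match in re.finditer(rule.pattern, text):
            span = (match.start(), match.end(), rule.entity_type)
            if span not in kept:
                kept.append(span)
    return sorted(kept)


text = "Order ACME-1234 was handled by John."
name_start = text.index("John")
detections = [(name_start, name_start + 4, "NAME_GIVEN")]
rules = [
    Rule("ORGANIZATION", "Include", r"ACME-\d{4}"),
    Rule("NAME_GIVEN", "Exclude", r"John"),
]
for start, end, entity_type in apply_rules(text, detections, rules):
    print(entity_type, text[start:end])
```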
From an uploaded file, you can create a dataset that contains the file.
You can then provide additional configuration, such as added and excluded values, and download the redacted file.
To create a dataset from an uploaded file:
Click Download.
Click Create a Dataset.
Textual displays the dataset details for the new dataset. The dataset name is Playground Dataset <number>, where the number reflects the number of datasets that were created from the Home page.
The dataset contains the uploaded file.
When Textual generates the redacted version of the text, it also generates the corresponding API request. The request includes the entity type configuration.
To view the API request code, click Show Code.
To hide the code, click Hide Code.
On the code panel:
The Python tab contains the Python version of the request.
The cURL tab contains the cURL version of the request.
To copy the currently selected version of the request code, click Copy Code.
For entered text on the Home page, Textual offers an option to use our custom large language model (LLM) to synthesize accurate replacements. When you use this option, the following information is sent to our models:
The detected entity values.
The text that surrounds each value.
The LLM processing is not available for uploaded files.
It is also limited to text that contains 100 or fewer words.
Textual's LLM functionality is run only on our cloud and does not use any third parties.
The LLM processing is intended to improve the detection and the replacement values. The LLM:
Groups entities based on whether they refer to the same thing, concept, or person. The grouping is only done within each entity type. For example, Lyon the person and Lyon the city are never grouped together.
Chooses a representative value for each group. For example, if the content includes the names Will, William, and W.I.L.L, the LLM processing chooses William as the representative value, because it's the most complete form of the name.
Sends the representative value to our standard, non-LLM, synthesis generators.
Gets the replacement value from the generators, and then formats it to match the original format. For example, because Will is replaced with Rob, W.I.L.L becomes R.O.B.
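The representative-value and format-matching steps above can be sketched in a few lines. This is a simplified illustration and not the LLM's actual logic: the function names are hypothetical, "most complete" is approximated by counting letters, and only dotted and all-uppercase or all-lowercase formats are handled.

```python
import re

# Matches single letters separated by periods, such as "W.I.L.L" or "J.R.".
DOTTED = re.compile(r"[A-Za-z](?:\.[A-Za-z])+\.?")


def choose_representative(group):
    """Pick the most complete form: most letters, then least punctuation."""
    def completeness(value):
        letters = sum(ch.isalpha() for ch in value)
        extras = sum(not ch.isalnum() for ch in value)
        return (letters, -extras)
    return max(group, key=completeness)


def match_format(original, replacement):
    """Format a replacement so that it looks like the original variant."""
    if DOTTED.fullmatch(original):
        letters = replacement.upper() if original.isupper() else replacement
        dotted = ".".join(letters)
        return dotted + "." if original.endswith(".") else dotted
    if original.isupper():
        return replacement.upper()
    if original.islower():
        return replacement.lower()
    return replacement


group = ["Will", "William", "W.I.L.L"]
print(choose_representative(group))
print(match_format("W.I.L.L", "Rob"))
```

The sketch does not cover how a shortened form such as Will is paired with a shortened replacement such as Rob. It only shows the formatting that is applied after a replacement is chosen.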
To enable the LLM processing, set the environment variable ENABLE_EXPERIMENTAL_SYNTHESIS to True. If this is not set to true, then the LLM processing does not work.
You must also set up the Solar.LLM container.
To configure the container, you can use the following Docker Compose content as a reference:
services:
  textual-llm:
    image: textual-llm:[textual-version-here]
    container_name: textual-llm
    volumes:
      - llm-models:/app/models
    ports:
      - "11443:11443"
    secrets:
      - llm_aws_key_id
      - llm_aws_access_key
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
    restart: unless-stopped
    networks:
      - llm-network
volumes:
  llm-models:
networks:
  llm-network:
    driver: bridge
secrets:
  llm_aws_key_id:
    environment: "LLM_AWS_KEY_ID"
  llm_aws_access_key:
    environment: "LLM_AWS_ACCESS_KEY"
The AWS keys are used to download our custom models. To get a copy of the keys, contact your Tonic.ai support representative.
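Before you start the container, a short preflight check can confirm that the settings mentioned above are present. The following sketch is an illustration only. The check_llm_environment function is hypothetical, it assumes that all three variables are visible in the same environment (which might not match your deployment), and it assumes that the value True is compared without regard to case.

```python
import os

# Environment variables that the Docker Compose secrets read from.
REQUIRED_SECRETS = ("LLM_AWS_KEY_ID", "LLM_AWS_ACCESS_KEY")


def check_llm_environment(env=None):
    """Return a list of problems found in the environment (empty if none)."""
    env = os.environ if env is None else env
    problems = []

    # Assumption: the flag is treated as enabled when it equals "true",
    # ignoring case and surrounding whitespace.
    flag = env.get("ENABLE_EXPERIMENTAL_SYNTHESIS", "")
    if flag.strip().lower() != "true":
        problems.append("ENABLE_EXPERIMENTAL_SYNTHESIS is not set to True")

    for name in REQUIRED_SECRETS:
        if not env.get(name):
            problems.append(f"{name} is not set")
    return problems


for problem in check_llm_environment():
    print("Problem:", problem)
```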
After you enter text in the Original Content panel, to enable the LLM processing, in the Results panel, click Use an LLM to perform AI synthesis.
You cannot use this option for text that contains more than 100 words.
When you clear the text, Textual reverts to the default processing.
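If you prepare text outside of Textual, you can check it against the 100-word limit before you paste it in. The sketch below assumes that words are separated by whitespace. How Textual itself counts words is not specified here, so treat the result as an approximation. The function name is hypothetical.

```python
MAX_LLM_WORDS = 100  # limit stated for the LLM processing option


def within_llm_word_limit(text: str, limit: int = MAX_LLM_WORDS) -> bool:
    """Return True if the text has no more than `limit` whitespace-separated words."""
    return len(text.split()) <= limit


sample = "My name is John and I live in Lyon."
print(within_llm_word_limit(sample))
```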
In the Python SDK, to use LLM synthesis, call the llm_synthesis function.