Redact and synthesize individual strings

Redact a single text string

To redact a specific text string and view the results, use textual.redact:

redaction_response = textual.redact("""<text of the string>""")
redaction_response.describe()

The response provides the redacted version of the string, and the list of redacted values. For each redacted item, the response includes:

  • The location of the value

  • The type of sensitive value

  • The original value

  • A score to indicate confidence in the detection and redaction

For example:

redaction_response = textual.redact("""Contact Tonic AI with questions""")
redaction_response.describe()

Contact ORGANIZATION_EPfC7XZUZ with questions
    
{"start": 8, "end": 16, "label": "ORGANIZATION", "text": "Tonic AI", "score": 0.85}

Redact a text string and specify the handling type

By default, Textual redacts detected sensitive values. The redacted value is <value type>_<generated identifier>. For example, ORGANIZATION_EPfC7XZUZ.

For each value type, you can instead choose to either synthesize or ignore the value.

  • When you synthesize a value, Textual replaces the value with a different realistic value.

  • When you ignore a value, Textual passes through the original value.

To specify the handling type for a value type, you use the generator_config parameter.

generator_config={'<value_type>':'<handling_type>'}

Where:

  • <value_type> is the identifier of the type of value. For example, ORGANIZATION. For the list of built-in value types that Textual scans for, go to About entity types in Textual.

  • <handling_type> is the handling type to use for the specified value type. The possible values are Redact, Synthesis, and Off.

To specify a handling type to use for value types that are not specified in generator_config, you use the generator_default parameter. generator_default can be either Redact, Synthesis, or Off.

The following example redacts a string and indicates to synthesize organization values. In the response, the organization value is replaced with a realistic value instead of an ORGANIZATION placeholder value.

redaction_response = textual.redact(
    """Contact Tonic AI with questions""", 
    generator_config={'ORGANIZATION':'Synthesis'}
)

redaction_response.describe()

    Contact Live Torch Works with questions
    
    {"start": 8, "end": 16, "label": "ORGANIZATION", "text": "Tonic AI", "score": 0.85}

Using an LLM to generate synthesized values

You can also request synthesized values from a large language model (LLM).

When you use this process, Textual first identifies the sensitive values in the text. It then sends the value locations and redacted values to the LLM. For example, if Textual identifies a product name, it sends the location and the redacted value PRODUCT to the LLM. Textual does not send the original values to the LLM.

The LLM then generates realistic synthesized values of the appropriate value types.

To send text to an LLM, use textual.llm_synthesis:

raw_synthesis = textual.llm_synthesis("Text of the string")

The response provides the text with the synthesized replacement values, followed by the list of synthesized values. For each value, the list includes:

  • Where the value is located in the string

  • The type of value

  • The original value

  • A score to indicate confidence in the detection and synthesis

Here is an example of a request to send a text string to an LLM, and the response with the updated string and value list:

raw_synthesis = textual.llm_synthesis("My name is John, and today I am demoing Textual, a software product created by Tonic")
raw_synthesis.describe()

My name is John, and on Monday afternoon I am demoing Widget Pro, a software product created by Initech Enterprises.
{"start": 11, "end": 15, "label": "NAME_GIVEN", "text": "John", "score": 0.9}
{"start": 21, "end": 26, "label": "DATE_TIME", "text": "today", "score": 0.85}
{"start": 40, "end": 47, "label": "PRODUCT", "text": "Textual", "score": 0.85}
{"start": 79, "end": 84, "label": "ORGANIZATION", "text": "Tonic", "score": 0.85}

Last updated