To call a tool on a self-hosted instance of Textual, the configuration parameters must include your Textual instance URL:
Available tools
| Tool | Input | Use to |
| --- | --- | --- |
| TonicTextualRedactText | Plain text string | Synthesize or tokenize entities in raw text or the content of a .txt file. |
| TonicTextualRedactJson | JSON string | Synthesize or tokenize entities in raw JSON or the content of a .json file. |
| TonicTextualRedactHtml | HTML string | Synthesize or tokenize entities in raw HTML, or the content of an .html or .htm file. |
| TonicTextualRedactFile | File path | Synthesize or tokenize entities in PDF, image (JPG, PNG), CSV, or TSV files. For .txt, .json, .htm, or .html files, read the file content, then pass the content to the text, JSON, or HTML tool. |
| TonicTextualExtractEntities | Plain text string | Return a list of detected entities. For each entity, identify the type, value, location, and detection confidence. |
| TonicTextualPiiTypes | None | List the supported entity types. Provide the type names to use in the configuration for the redaction tools. |
Text redaction
JSON redaction
HTML redaction
File redaction (PDF, image, CSV, TSV)
For .txt, .json, and .html/.htm files, you do not use the file redaction tool. Instead, you read the file content, then pass the content to the text, JSON, or HTML redaction tool.
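The routing rule above can be captured in a small helper. The sketch below is illustrative only: `choose_redaction_tool` is a hypothetical function, not part of langchain_textual, and the extension sets simply mirror the file types named in this section. It returns the name of the tool to use and whether you need to read the file content yourself first.

```python
from pathlib import Path

# Extensions whose content you read yourself and pass to the tool as a string.
CONTENT_TOOLS = {
    ".txt": "TonicTextualRedactText",
    ".json": "TonicTextualRedactJson",
    ".html": "TonicTextualRedactHtml",
    ".htm": "TonicTextualRedactHtml",
}

# Extensions that go to the file redaction tool as a file path.
FILE_TOOL_EXTENSIONS = {".pdf", ".jpg", ".png", ".csv", ".tsv"}


def choose_redaction_tool(path: str) -> tuple[str, bool]:
    """Return (tool name, read_content_first) for a file path.

    Hypothetical helper: it only encodes the routing rule described above.
    """
    suffix = Path(path).suffix.lower()
    if suffix in CONTENT_TOOLS:
        return CONTENT_TOOLS[suffix], True
    if suffix in FILE_TOOL_EXTENSIONS:
        return "TonicTextualRedactFile", False
    raise ValueError(f"No redaction tool is listed for {path!r}")
```

For example, `choose_redaction_tool("notes.txt")` points to the text tool and signals that the content must be read first, while `choose_redaction_tool("scan.pdf")` points to the file tool, which takes the path directly.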
Get the list of detected entities
TonicTextualExtractEntities returns a JSON array of detected entities. Each entity has the following fields:
label - The entity type for the detected entity.
text - The text of the detected entity.
start - The start location of the entity.
end - The end location of the entity.
score - The confidence score. Indicates how confident Textual is in its detection.
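Because the result is a JSON array, a typical next step is to parse it and keep only the confident detections. The sketch below uses only the standard library. The sample result is illustrative data shaped like the fields above, not real Textual output, and `confident_entities` and the 0.88 threshold are hypothetical. It also assumes that start and end are character offsets into the input text with an exclusive end. That is consistent with the extraction example in this document, where "John" is reported at 11 to 15, but it is not stated explicitly.

```python
import json

text = "My name is John Smith."

# Illustrative result string (not real Textual output), shaped like the fields above.
raw_result = json.dumps([
    {"label": "NAME_GIVEN", "text": "John", "start": 11, "end": 15, "score": 0.9},
    {"label": "NAME_FAMILY", "text": "Smith", "start": 16, "end": 21, "score": 0.85},
])


def confident_entities(result_json: str, min_score: float) -> list[dict]:
    """Parse the JSON array and keep entities scoring at least min_score."""
    return [e for e in json.loads(result_json) if e["score"] >= min_score]


for entity in confident_entities(raw_result, min_score=0.88):
    # Assumption: start/end are character offsets, with end exclusive.
    span = text[entity["start"]:entity["end"]]
    print(entity["label"], span, entity["score"])
# prints: NAME_GIVEN John 0.9
```

Slicing the input with start and end, rather than trusting the text field alone, is a cheap way to confirm that the offsets refer to the string you think they do.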
Tool configuration options
All of the redaction tools provide the same configuration options to determine how to de-identify entities of specific types.
Getting the list of available entity types
When you configure entity type handling, you must provide the entity type names.
To get a list of all of the supported entity type names, use TonicTextualPiiTypes:
Available handling options
The available entity type handling options are:
Redaction - The default, unless you specify otherwise. Replaces the entity value with the entity type name followed by a unique identifier for each unique value. For example, John becomes NAME_GIVEN_1234.
Synthesis - Replaces the entity value with a realistic generated value. For example, John becomes Michael.
Off - Leaves the entity value unchanged in the output.
Specifying the default handling option
To specify the default handling option to use for all entity types, use the generator_default parameter.
In the following example, the default handling option is set to Synthesis. All entities are replaced with realistic generated values.
Providing handling options for specific entity types
To provide handling options for specific entity types, use the generator_config parameter.
Within generator_config, provide one entry for each entity type: the key is the entity type name, and the value is the handling option.
In the following example, the default handling option is Off. First and last names are replaced with realistic generated values, and email addresses are redacted:
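# Text redaction: with no configuration, detected entities are redacted (the default handling option).
from langchain_textual import TonicTextualRedactText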
tool = TonicTextualRedactText()
tool.invoke("My name is John Smith and my email is [email protected].")
# "My name is [NAME_GIVEN_xxxx] [NAME_FAMILY_xxxx] and my email is [EMAIL_ADDRESS_xxxx]."
# HTML redaction: entities in the text content are redacted and the markup is kept.
from langchain_textual import TonicTextualRedactHtml
tool = TonicTextualRedactHtml()
tool.invoke("<p>Contact John Smith at [email protected]</p>")
# "<p>Contact [NAME_GIVEN_xxxx] [NAME_FAMILY_xxxx] at [EMAIL_ADDRESS_xxxx]</p>"
# Get the list of detected entities: the result is a JSON array of entities.
from langchain_textual import TonicTextualExtractEntities
tool = TonicTextualExtractEntities()
tool.invoke("My name is John Smith and my email is [email protected].")
# '[{"label": "NAME_GIVEN", "text": "John", "start": 11, "end": 15, "score": 0.9}, ...]'
# Default handling option set to Synthesis.
tool = TonicTextualRedactText(generator_default="Synthesis")
tool.invoke("Contact Jane Doe at [email protected].")
# "Contact Maria Chen at [email protected]."
# Per-type handling: each generator_config entry has the form
# "<type name>": "<handling option>"
tool = TonicTextualRedactText(
    generator_default="Off",
    generator_config={
        "NAME_GIVEN": "Synthesis",
        "NAME_FAMILY": "Synthesis",
        "EMAIL_ADDRESS": "Redaction",
    },
)
tool.invoke("Contact Jane Doe at [email protected].")
# "Contact Maria Chen at chen@[EMAIL_ADDRESS_xxxx]."
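The example above depends on per-type entries taking precedence over the default. As a rough mental model only, and not the library's implementation, the effective option for an entity type can be thought of as a dictionary lookup with a fallback. In this sketch, `effective_option` is a hypothetical helper, the exact-case check on option names is an assumption, and PHONE_NUMBER merely stands in for any entity type that has no entry.

```python
from typing import Optional

# The three handling options described in this document.
HANDLING_OPTIONS = {"Redaction", "Synthesis", "Off"}


def effective_option(
    entity_type: str,
    generator_default: str = "Redaction",
    generator_config: Optional[dict] = None,
) -> str:
    """Return the handling option that applies to entity_type.

    Hypothetical mental model: a per-type entry wins, otherwise the default applies.
    """
    option = (generator_config or {}).get(entity_type, generator_default)
    if option not in HANDLING_OPTIONS:
        raise ValueError(f"Unknown handling option: {option!r}")
    return option


config = {
    "NAME_GIVEN": "Synthesis",
    "NAME_FAMILY": "Synthesis",
    "EMAIL_ADDRESS": "Redaction",
}

print(effective_option("NAME_GIVEN", "Off", config))     # Synthesis
print(effective_option("EMAIL_ADDRESS", "Off", config))  # Redaction
print(effective_option("PHONE_NUMBER", "Off", config))   # Off (no entry, so the default applies)
```

Running a check like this before constructing a tool can catch a misspelled option name early, without calling Textual.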