> For the complete documentation index, see [llms.txt](https://docs.tonic.ai/llms.txt). Markdown versions of documentation pages are available by appending `.md` to page URLs; this page is available as [Markdown](https://docs.tonic.ai/textual/textual-integrations/textual-langchain-integration.md).

# Textual LangChain integration

The [Textual LangChain integration](https://github.com/TonicAI/langchain-textual) provides Textual tools that you can use to detect and de-identify sensitive data in text, JSON, HTML, and files.

You can replace entity values with realistic generated values, or with tokenized placeholders. You can also extract the list of detected entities.

You can drop the Textual tools into any LangChain chain or agent as standard tools.

## Installing the integration

```
pip install langchain-textual
```

## Providing a Textual API key

To use the Textual tools, you must provide a [Textual API key](/textual/tonic-textual-api/textual-api-keys.md).

To set the API key as an environment variable value:

```
export TONIC_TEXTUAL_API_KEY="your-api-key"
```

To provide the API key when you create a tool:

```
from langchain_textual import TonicTextualRedactText

tool = TonicTextualRedactText(tonic_textual_api_key="your-api-key")
```

## Calling a tool

To call a tool on Textual Cloud:

```
from langchain_textual import <tool_name>

tool = <tool_name>(<configuration_parameters>)
tool.invoke(<input>)
```

To call a tool on a self-hosted instance of Textual, the configuration parameters must include your Textual instance URL:

```
tool = <tool_name>(tonic_textual_base_url="https://textual.your-company.com")
```

## Available tools

| Tool | Input | Use to |
| --- | --- | --- |
| `TonicTextualRedactText` | Plain text string | Synthesize or tokenize entities in raw text or the content of a `.txt` file. |
| `TonicTextualRedactJson` | JSON string | Synthesize or tokenize entities in raw JSON or the content of a `.json` file. |
| `TonicTextualRedactHtml` | HTML string | Synthesize or tokenize entities in raw HTML, or the content of an `.html` or `.htm` file. |
| `TonicTextualRedactFile` | File path | Synthesize or tokenize entities in PDF, image (JPG, PNG), CSV, or TSV files. For `.txt`, `.json`, `.htm`, or `.html` files, you read the file content, then pass the content to the text, JSON, or HTML tool. |
| `TonicTextualExtractEntities` | Plain text string | Return a list of detected entities. For each entity, identify the type, value, location, and detection confidence. |
| `TonicTextualPiiTypes` | None | List the supported entity types. Provides the type names to use in the configuration for the redaction tools. |
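
The routing rule in the table can be sketched as a small helper that maps a file extension to a tool. This is a minimal sketch and not part of the integration: the name `pick_tool_name`, the two lookup tables, and the `"content"` and `"path"` labels are hypothetical, and the extensions are only the ones that the table lists. The helper returns the name of the tool class and whether to pass the file content or the file path.

```python
from pathlib import Path

# Hypothetical lookup tables built from the table above.
# They are not part of langchain-textual.
CONTENT_TOOLS = {
    ".txt": "TonicTextualRedactText",
    ".json": "TonicTextualRedactJson",
    ".html": "TonicTextualRedactHtml",
    ".htm": "TonicTextualRedactHtml",
}
FILE_TOOL_EXTENSIONS = {".pdf", ".jpg", ".png", ".csv", ".tsv"}


def pick_tool_name(file_path: str) -> tuple[str, str]:
    """Return (tool class name, input kind) for a file.

    The input kind is "content" when you read the file and pass its text,
    or "path" when you pass the file path to the file redaction tool.
    """
    suffix = Path(file_path).suffix.lower()
    if suffix in CONTENT_TOOLS:
        return CONTENT_TOOLS[suffix], "content"
    if suffix in FILE_TOOL_EXTENSIONS:
        return "TonicTextualRedactFile", "path"
    raise ValueError(f"No redaction tool is listed for {file_path!r}")
```

A caller would import the returned class from `langchain_textual`, then invoke it with either the file content or a `file_path` value, as the sections that follow show.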

### Text redaction

```
from langchain_textual import TonicTextualRedactText

tool = TonicTextualRedactText()
tool.invoke("My name is John Smith and my email is john@example.com.")
# "My name is [NAME_GIVEN_xxxx] [NAME_FAMILY_xxxx] and my email is [EMAIL_ADDRESS_xxxx]."
```

### JSON redaction

```
from langchain_textual import TonicTextualRedactJson

tool = TonicTextualRedactJson()
tool.invoke('{"name": "John Smith", "email": "john@example.com"}')
# '{"name": "[NAME_GIVEN_xxxx] [NAME_FAMILY_xxxx]", "email": "[EMAIL_ADDRESS_xxxx]"}'
```

### HTML redaction

```
from langchain_textual import TonicTextualRedactHtml

tool = TonicTextualRedactHtml()
tool.invoke("<p>Contact John Smith at john@example.com</p>")
# "<p>Contact [NAME_GIVEN_xxxx] [NAME_FAMILY_xxxx] at [EMAIL_ADDRESS_xxxx]</p>"
```

### File redaction (PDF, image, CSV, TSV)

```
from langchain_textual import TonicTextualRedactFile

tool = TonicTextualRedactFile()

tool.invoke({"file_path": "/path/to/scan.pdf"})
# "/path/to/scan_redacted.pdf"

tool.invoke({"file_path": "/path/to/photo.jpg", "output_path": "/tmp/redacted.jpg"})
# "/tmp/redacted.jpg"
```

For `.txt`, `.json`, and `.html`/`.htm` files, you do not use the file redaction tool. Instead, you read the file content, then pass the content to the text, JSON, or HTML redaction tool.

### Get the list of detected entities

```
from langchain_textual import TonicTextualExtractEntities

tool = TonicTextualExtractEntities()
tool.invoke("My name is John Smith and my email is john@example.com.")
# '[{"label": "NAME_GIVEN", "text": "John", "start": 11, "end": 15, "score": 0.9}, ...]'
```

The tool returns a JSON array of detected entities. Each entity has the following fields:

* `label` - The entity type for the detected entity.
* `text` - The text of the detected entity.
* `start` - The start location of the entity.
* `end` - The end location of the entity.
* `score` - The confidence score. Indicates how confident Textual is in its detection.

## Tool configuration options

All of the redaction tools provide the same configuration options to determine how to de-identify entities of specific types.

### Getting the list of available entity types

When you configure entity type handling, you must provide the entity type names.

To get a list of all of the supported entity type names, use `TonicTextualPiiTypes`:

```
from langchain_textual import TonicTextualPiiTypes

TonicTextualPiiTypes().invoke("")
# "NUMERIC_VALUE, LANGUAGE, MONEY, ..., EMAIL_ADDRESS, NAME_GIVEN, NAME_FAMILY, ..."
```

### Available handling options

The available entity type handling options are:

* `Redaction` - The default option, unless you specify otherwise. Replaces the entity value with the entity type name, followed by a unique identifier for each unique value.
  For example, replace `John` with `NAME_GIVEN_1234`.
* `Synthesis` - Replaces the entity value with a realistic generated value. For example, replace `John` with `Michael`.
* `Off` - Leaves the entity value unchanged in the output.

### Specifying the default handling option

To specify the default handling option to use for all entity types, use the `generator_default` parameter.

In the following example, the default handling option is set to `Synthesis`. All entities are replaced with realistic generated values.

```
tool = TonicTextualRedactText(generator_default="Synthesis")
tool.invoke("Contact Jane Doe at jane.doe@example.com.")
# "Contact Maria Chen at maria.chen@gmail.com."
```

### Providing handling options for specific entity types

To provide handling options for specific entity types, use the `generator_config` parameter.

Within `generator_config`, for each entity type:

{% code overflow="wrap" %}
```
"<entity_type>": "<handling_option>"
```
{% endcode %}

In the following example, the default handling option is `Off`. First and last names are replaced with realistic generated values, and email addresses are redacted:

```
tool = TonicTextualRedactText(
    generator_default="Off",
    generator_config={
        "NAME_GIVEN": "Synthesis",
        "NAME_FAMILY": "Synthesis",
        "EMAIL_ADDRESS": "Redaction",
    },
)
tool.invoke("Contact Jane Doe at jane.doe@example.com.")
# "Contact Maria Chen at [EMAIL_ADDRESS_xxxx]."
```
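
Because the handling options are plain strings, a typo such as `"Synthesize"` is easy to miss. The following is a minimal sketch of a check that you could run before you create a redaction tool. The helper `build_generator_kwargs` and the `HANDLING_OPTIONS` set are hypothetical and not part of the integration. The sketch assumes that the option names must match the spelling used on this page, and it does not check the entity type names.

```python
# The three handling options named on this page.
HANDLING_OPTIONS = {"Redaction", "Synthesis", "Off"}


def build_generator_kwargs(default="Redaction", overrides=None):
    """Validate handling options and return keyword arguments for a redaction tool.

    Hypothetical helper, not part of langchain-textual.
    """
    overrides = dict(overrides or {})
    # Check the default first, then the option given for each entity type.
    for label, option in [("generator_default", default), *overrides.items()]:
        if option not in HANDLING_OPTIONS:
            raise ValueError(f"{label}: unknown handling option {option!r}")
    return {"generator_default": default, "generator_config": overrides}


kwargs = build_generator_kwargs(
    "Off",
    {"NAME_GIVEN": "Synthesis", "EMAIL_ADDRESS": "Redaction"},
)
# With the integration installed, you could then write:
# tool = TonicTextualRedactText(**kwargs)
```

To also catch misspelled entity type names, you could compare the keys of `generator_config` against the names that `TonicTextualPiiTypes` returns.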