Textual Haystack

The Textual Haystack integration provides Haystack components that you use to call Textual detection and redaction functions from within Haystack. You can add those components to your Haystack pipeline.

Installing and configuring the integration

To install the integration, run:

pip install textual-haystack

Providing a Textual API key

The calls to Textual require a Textual API key.

To use the same key for every call, set the TONIC_TEXTUAL_API_KEY environment variable to your API key:

export TONIC_TEXTUAL_API_KEY="your-api-key"

Alternatively, you can provide the API key when you create a component:

from haystack.utils.auth import Secret

extractor = TonicTextualEntityExtractor(
    api_key=Secret.from_token("your-api-key")
)
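When both options are available, it can help to check up front that a key is configured before any component is built. The following is a minimal sketch that uses only the Python standard library. resolve_textual_api_key is a hypothetical helper, not part of the integration, and the rule that an explicitly passed key wins over the environment variable is this sketch's own choice.

```python
import os

ENV_VAR = "TONIC_TEXTUAL_API_KEY"


def resolve_textual_api_key(explicit_key=None, environ=None):
    """Return the API key to use, or raise if none is configured.

    Hypothetical helper: an explicitly passed key takes precedence
    over the environment variable. That order is an assumption of
    this sketch, not documented behavior of the integration.
    """
    environ = os.environ if environ is None else environ
    key = (explicit_key or environ.get(ENV_VAR, "")).strip()
    if not key:
        raise RuntimeError(
            f"No Textual API key found; set {ENV_VAR} or pass one explicitly."
        )
    return key
```

The returned string could then be wrapped with Secret.from_token, as in the snippet above.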

Providing the URL for a self-hosted instance

If you call a self-hosted instance of Textual, then the call must also include the URL of that instance.

Textual Haystack components

The Textual Haystack integration includes the following components:

  • TonicTextualEntityExtractor - Extracts entities from provided content. The results include the entity type, entity value, location within the text, and the detection confidence score.

  • TonicTextualDocumentCleaner - Replaces entities in provided content. You can optionally specify how to replace values for different entity types.
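To make the four pieces of information in an extraction result concrete, here is a library-independent sketch of how such results might be modeled and filtered downstream. The DetectedEntity class, its field names, the sample scores, and the keep_confident helper are all hypothetical. They are not the component's actual output schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DetectedEntity:
    # Hypothetical field names that mirror the documented result
    # contents: entity type, entity value, location, confidence.
    entity_type: str
    value: str
    start: int   # offset of the first character in the text
    end: int     # offset one past the last character
    score: float


def keep_confident(entities, min_score=0.8):
    """Keep entities whose confidence score is at least min_score."""
    return [e for e in entities if e.score >= min_score]


text = "My name is John Smith"
# Made-up sample detections, for illustration only.
found = [
    DetectedEntity("NAME_GIVEN", "John", 11, 15, 0.95),
    DetectedEntity("NAME_FAMILY", "Smith", 16, 21, 0.60),
]
confident = keep_confident(found)
```

Keeping the location as start and end offsets means that each value can be checked against the original text with text[start:end].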

Using the entity extraction component

To use the entity extraction component, you provide the content for the component to analyze.

For example:

Using the document cleaning component

To use the document cleaning component, you provide the content for the component to redact.

For example:

You can optionally specify how to replace each entity type:

  • Redaction - Replaces each value with the entity type name plus a unique identifier.

  • Synthesis - Replaces each value with a realistic generated value.

  • Off - Does not replace the value.

generator_default sets the handling to use for every entity type that does not have its own setting. generator_config specifies the handling for individual entity types.

For example:
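As a library-independent illustration, the sketch below imitates how a default plus per-type settings might resolve, and what each of the three options does to a detected value. It does not call Textual. The apply_modes and resolve_mode helpers, the mode strings, the bracketed token format, and the canned synthetic values are all assumptions made for this example.

```python
import itertools

# Toy stand-ins for the three handling options described above.
REDACTION, SYNTHESIS, OFF = "Redaction", "Synthesis", "Off"

# Hypothetical canned values used to imitate synthesis.
FAKE_VALUES = {"NAME_GIVEN": "Alex", "EMAIL_ADDRESS": "alex@example.org"}


def resolve_mode(entity_type, generator_default, generator_config):
    """A per-type setting wins; otherwise fall back to the default."""
    return generator_config.get(entity_type, generator_default)


def apply_modes(text, entities, generator_default=REDACTION,
                generator_config=None):
    """Replace spans in text according to the resolved mode.

    entities is a list of (entity_type, start, end) tuples.
    """
    generator_config = generator_config or {}
    counter = itertools.count(1)
    pieces, cursor = [], 0
    for entity_type, start, end in sorted(entities, key=lambda e: e[1]):
        mode = resolve_mode(entity_type, generator_default, generator_config)
        original = text[start:end]
        if mode == REDACTION:
            # Entity type name plus a unique identifier (toy format).
            replacement = f"[{entity_type}_{next(counter)}]"
        elif mode == SYNTHESIS:
            replacement = FAKE_VALUES.get(entity_type, "synthetic-value")
        else:  # OFF: leave the value untouched
            replacement = original
        pieces.append(text[cursor:start])
        pieces.append(replacement)
        cursor = end
    pieces.append(text[cursor:])
    return "".join(pieces)


text = "Contact John at john@example.com"
entities = [("NAME_GIVEN", 8, 12), ("EMAIL_ADDRESS", 16, 32)]

redacted = apply_modes(text, entities)
mixed = apply_modes(text, entities, generator_default=OFF,
                    generator_config={"EMAIL_ADDRESS": SYNTHESIS})
```

In this sketch, redacted replaces both values with bracketed tokens, while mixed leaves the name alone and swaps only the email address, because the per-type setting takes priority over the default.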
