Creating and editing pipelines

Creating a pipeline

Required global permission: Create pipelines

To create a pipeline, on the Pipelines page, click Create a New Pipeline.

Setting the pipeline name and source type

On the Create A New Pipeline panel:

  1. In the Name field, type the name of the pipeline.

  2. Under Files Source, select the location of the source files.

    • To upload files from a local file system, click File upload, then click Save.

    • To select files from and write output to Amazon S3, click Amazon S3.

    • To select files from and write output to Databricks, click Databricks.

    • To select files from and write output to Azure Blob Storage, click Azure.

    • To select files from and write output to Sharepoint, click Sharepoint.

  3. Click Save.

Providing Amazon S3 credentials

If you selected Amazon S3, provide the credentials to use to connect to Amazon S3.

  1. In the Access Key field, provide an AWS access key that is associated with an IAM user or role. For an example of a role that has the required permissions for an Amazon S3 pipeline, go to Required IAM role permissions for Amazon S3.

  2. In the Access Secret field, provide the secret key that is associated with the access key.

  3. From the Region dropdown list, select the AWS Region to send the authentication request to.

  4. In the Session Token field, provide the session token to use for the authentication request.

  5. To test the credentials, click Test AWS Connection.

  6. By default, connections to Amazon S3 use Amazon S3 encryption. To instead use AWS KMS encryption:

    1. Click Show Advanced Options.

    2. From the Server-Side Encryption Type dropdown list, select AWS KMS. Note that after you save the new pipeline, you cannot change the encryption type.

    3. In the Server-side Encryption AWS KMS ID field, provide the KMS key ID. Note that if the KMS key doesn't exist in the same account that issues the command, you must provide the full key ARN instead of the key ID.

  7. Click Save.

  8. On the Pipeline Settings page, provide the rest of the pipeline configuration. For more information, go to Configuring an Amazon S3 pipeline.

  9. Click Save.
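
If you want to confirm the credentials outside of Textual before you save, the Test AWS Connection check is roughly equivalent to the boto3 sketch below. This is illustrative only: the bucket name, credential values, and KMS key ARN are placeholders, not values that Textual provides, and the final call simply shows where an SSE-KMS key ID or full key ARN is passed on a write.

```python
# Illustrative only: verify S3 credentials, similar to "Test AWS Connection".
# The bucket name, credential values, and KMS key ARN are placeholders.
import boto3
from botocore.exceptions import ClientError

s3 = boto3.client(
    "s3",
    aws_access_key_id="AKIA...",            # Access Key
    aws_secret_access_key="wJalr...",       # Access Secret
    aws_session_token="FwoGZXIvYXdzE...",   # Session Token (if required)
    region_name="us-east-1",                # Region
)

try:
    s3.head_bucket(Bucket="my-pipeline-bucket")
    print("Credentials can reach the bucket.")
except ClientError as err:
    print(f"Connection check failed: {err}")

# With SSE-KMS selected, writes include the KMS key. If the key lives in a
# different account, pass the full key ARN instead of the bare key ID.
s3.put_object(
    Bucket="my-pipeline-bucket",
    Key="connection-check.txt",
    Body=b"test",
    ServerSideEncryption="aws:kms",
    SSEKMSKeyId="arn:aws:kms:us-east-1:123456789012:key/1234abcd-12ab-34cd-56ef-1234567890ab",
)
```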

Providing Databricks connection information

If you selected Databricks, provide the connection information:

  1. In the Databricks URL field, provide the URL to the Databricks workspace.

  2. In the Access Token field, provide the access token to use to get access to the volume.

  3. To test the connection, click Test Databricks Connection.

  4. Click Save.

  5. On the Pipeline Settings page, provide the rest of the pipeline configuration. For more information, go to Configuring a Databricks pipeline.

  6. Click Save.
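
To sanity-check the workspace URL and access token outside of Textual, you can call the Databricks REST API directly. The sketch below is illustrative only: the workspace URL and token are placeholders, and the identity endpoint used here is just one convenient way to confirm that the token authenticates against the workspace.

```python
# Illustrative only: confirm a Databricks workspace URL and access token.
# The URL and token are placeholders.
import requests

databricks_url = "https://adb-1234567890123456.7.azuredatabricks.net"
access_token = "dapi..."  # personal access token

resp = requests.get(
    f"{databricks_url}/api/2.0/preview/scim/v2/Me",
    headers={"Authorization": f"Bearer {access_token}"},
    timeout=30,
)
resp.raise_for_status()
print("Token is valid for user:", resp.json().get("userName"))
```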

Providing Azure credentials

If you selected Azure, provide the connection information:

  1. In the Account Name field, provide the name of your Azure account.

  2. In the Account Key field, provide the access key for your Azure account.

  3. To test the connection, click Test Azure Connection.

  4. Click Save.

  5. On the Pipeline Settings page, provide the rest of the pipeline configuration. For more information, go to Configuring an Azure pipeline.

  6. Click Save.
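
The account name and account key are the same values that the azure-storage-blob library accepts, so you can verify them outside of Textual with a short script. The sketch below is illustrative only; the account name and key are placeholders.

```python
# Illustrative only: verify an Azure storage account name and key.
# Values are placeholders; this approximates what "Test Azure Connection" checks.
from azure.storage.blob import BlobServiceClient

account_name = "mystorageaccount"
account_key = "base64-encoded-key=="

service = BlobServiceClient(
    account_url=f"https://{account_name}.blob.core.windows.net",
    credential=account_key,
)

# Listing containers confirms that the key grants access to the account.
for container in service.list_containers():
    print(container.name)
```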

Providing Sharepoint credentials

If you selected Sharepoint, provide the credentials for the Entra ID application.

The credentials must have the following application permissions (not delegated permissions):

  • Files.Read.All - To see the Sharepoint files

  • Files.ReadWrite.All - To write redacted files and metadata back to Sharepoint

  • Sites.ReadWrite.All - To view and modify the Sharepoint sites

  1. In the Tenant ID field, provide the Sharepoint tenant identifier for the Sharepoint site.

  2. In the Client ID field, provide the client identifier for the Sharepoint site.

  3. In the Client Secret field, provide the secret to use to connect to the Sharepoint site.

  4. To test the connection, click Test Sharepoint Connection.

  5. On the Pipeline Settings page, provide the rest of the pipeline configuration. For more information, go to Configuring a Sharepoint pipeline.

  6. Click Save.
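
The tenant ID, client ID, and client secret are standard Entra ID client-credentials inputs, so you can confirm them, along with the application permissions listed above, against Microsoft Graph before you save. The sketch below uses the msal library and is illustrative only; all identifiers are placeholders.

```python
# Illustrative only: confirm Entra ID app credentials and Graph access.
# Tenant ID, client ID, and client secret are placeholders.
import msal
import requests

tenant_id = "00000000-0000-0000-0000-000000000000"
client_id = "11111111-1111-1111-1111-111111111111"
client_secret = "your-client-secret"

app = msal.ConfidentialClientApplication(
    client_id,
    authority=f"https://login.microsoftonline.com/{tenant_id}",
    client_credential=client_secret,
)

# Client-credentials flow uses the application permissions granted to the app
# (Files.Read.All, Files.ReadWrite.All, Sites.ReadWrite.All), not delegated ones.
token = app.acquire_token_for_client(scopes=["https://graph.microsoft.com/.default"])
if "access_token" not in token:
    raise SystemExit(f"Token request failed: {token.get('error_description')}")

# Listing sites confirms the Sites permission was granted and admin-consented.
resp = requests.get(
    "https://graph.microsoft.com/v1.0/sites?search=*",
    headers={"Authorization": f"Bearer {token['access_token']}"},
    timeout=30,
)
resp.raise_for_status()
for site in resp.json().get("value", []):
    print(site.get("displayName"))
```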

Editing a pipeline

Required pipeline permissions:

  • Edit pipeline settings

  • Manage the pipeline file list

To update a pipeline configuration:

  1. On the pipeline details page, click the settings icon. For cloud storage pipelines, the settings icon is next to the Run Pipeline option. For uploaded file pipelines, the settings icon is next to the Upload Files option.

  2. On the Pipeline Settings page, update the configuration. For all pipelines, you can change the pipeline name, and whether to also create redacted versions of the original files. For cloud storage pipelines, you can change the file selection. For more information, go to:

    • Configuring an Amazon S3 pipeline

    • Configuring a Databricks pipeline

    • Configuring an Azure pipeline

    • Configuring a Sharepoint pipeline

    For uploaded file pipelines, you do not manage files from the Pipeline Settings page. For information about uploading files, go to Selecting files for an uploaded file pipeline.

  3. Click Save.

Deleting a pipeline

Required pipeline permission: Delete a pipeline

To delete a pipeline, on the Pipeline Settings page, click Delete Pipeline.
