Creating and managing datasets

Last updated 9 days ago

A Tonic Textual dataset is a collection of text-based files. Textual uses models to detect and redact the sensitive information in each file.

Viewing the list of datasets

Displaying the Datasets page

To display the Datasets page, in the navigation menu, click Datasets.

The datasets list displays only the datasets that you have access to.

Users who have the global permission View all datasets can see the complete list of datasets.

For each dataset, the Datasets page includes:

  • The name of the dataset

  • Any tags assigned to the dataset, as well as an option to add tags. For more information, go to Assigning tags to datasets.

  • When the dataset was most recently updated

  • The user who most recently updated the dataset

Filtering the datasets by name

To filter the datasets by name, in the search field, begin to type text that is in the dataset name.

As you type, the list is filtered to only include datasets with names that contain the filter text.
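
The name filter described above behaves like a substring match. The following is a minimal sketch of that idea, not Textual's implementation. The `filter_by_name` function and the sample dataset names are hypothetical, and case-insensitive matching is an assumption, because this page does not say whether the search is case-sensitive.

```python
def filter_by_name(dataset_names, filter_text):
    """Return the names that contain filter_text.

    Assumption: matching ignores case. The Textual UI may behave differently.
    """
    needle = filter_text.casefold()
    return [name for name in dataset_names if needle in name.casefold()]


# Hypothetical dataset names, for illustration only.
names = ["Customer emails", "Support transcripts", "Email archive 2023"]
matches = filter_by_name(names, "email")
```

In this sketch, an empty filter string matches every name, which mirrors an empty search field that shows the full list.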

Filtering the datasets by tag

You can assign tags to each dataset. Tags can help you to organize your datasets and to see the dataset configuration at a glance.

On the Datasets page, to filter the datasets by their assigned tags:

  1. In the heading for the Tags column, click the filter icon.

  2. On the tag list, check the checkbox for each tag to include.

To find a specific tag, in the search field, type the tag name.
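
The tag filter can be pictured as a set intersection between the checked tags and the tags on each dataset. The sketch below is illustrative only. The `filter_by_tags` function and the sample data are hypothetical. It also assumes that a dataset is listed when it has at least one of the checked tags. This page does not state whether Textual requires any or all of the checked tags.

```python
def filter_by_tags(datasets, selected_tags):
    """Keep datasets that carry at least one of the selected tags.

    Assumption: checking several tags widens the result (any-match).
    With no tags checked, the list is left unfiltered.
    """
    selected = set(selected_tags)
    if not selected:
        return list(datasets)
    return [d for d in datasets if selected & set(d["tags"])]


# Hypothetical datasets and tags, for illustration only.
datasets = [
    {"name": "Customer emails", "tags": ["pii", "email"]},
    {"name": "Support transcripts", "tags": ["audio"]},
    {"name": "Contracts", "tags": []},
]
tagged = [d["name"] for d in filter_by_tags(datasets, ["pii", "audio"])]
```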

Creating a dataset

Required global permission: Create datasets

From the Datasets page, you can create a new empty dataset. Textual prompts you for the dataset name, then displays the dataset details page.

To create a dataset:

  1. On the Datasets page, click Create a Dataset.

  2. On the dataset creation panel, in the Dataset Name field, provide the name of the dataset.

  3. Click Create Dataset. The dataset details page for the new dataset is displayed.

Displaying details for a dataset

Required dataset permission: View dataset settings

To display the details page for a dataset, on the Datasets page, click the dataset name.

The dataset details page includes:

  • The tags assigned to the dataset, as well as an option to add tags. For more information, go to Assigning tags to datasets.

  • The list of files in the dataset

  • The results of the scan for entity values

  • The configured handling for each type of value

Changing the dataset name

Required dataset permission: Edit dataset settings

The dataset name displays in the panel at the top left of the dataset details page.

To change the dataset name:

  1. On the dataset details page, click Settings.

  2. On the Dataset Settings page, in the Dataset Name field, provide the new name for the dataset.

  3. Click Save Dataset.

Deleting a dataset

Required dataset permission: Delete a dataset

To delete a dataset:

  1. On the dataset details page, click Settings.

  2. On the Dataset Settings page, click Delete Dataset.

  3. Click Confirm Delete.
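
The permission requirements listed for each task on this page can be summarized as a lookup table. The sketch below uses the permission names from this page, but the action keys, the `can_perform` function, and the sample user are hypothetical. It is not how Textual evaluates permissions internally.

```python
# Permission names are taken from this page; the structure is hypothetical.
REQUIRED_PERMISSION = {
    "create": ("global", "Create datasets"),
    "view_details": ("dataset", "View dataset settings"),
    "rename": ("dataset", "Edit dataset settings"),
    "delete": ("dataset", "Delete a dataset"),
}


def can_perform(action, global_permissions, dataset_permissions):
    """Return True if the user holds the permission that the action requires."""
    scope, permission = REQUIRED_PERMISSION[action]
    granted = global_permissions if scope == "global" else dataset_permissions
    return permission in granted


# Hypothetical user: can create datasets and view or edit this dataset,
# but cannot delete it.
user_global = {"Create datasets"}
user_dataset = {"View dataset settings", "Edit dataset settings"}
```

With this sample user, `can_perform("rename", user_global, user_dataset)` is true and `can_perform("delete", user_global, user_dataset)` is false.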
