
Using Textual with Snowpark Container Services directly

Snowpark Container Services (SPCS) allow developers to run containerized workloads directly within Snowflake. Because Tonic Textual is distributed using a private Docker repository, you can use these images in SPCS to run Textual workloads.

It is quicker to use the Snowflake Native App, but SPCS allows for more customization.

Add images to the repository

To use the Textual images, you must add them to Snowflake. The Snowflake documentation and tutorial walk through the process in great detail, but the basic steps are as follows:

  1. Set up an image repository in Snowflake.

  2. To pull down the required images, you must have access to our private Docker image repository on Quay.io. You should have been provided credentials during onboarding. If you require new credentials, or you experience issues accessing the repository, contact support@tonic.ai. Once you have access, pull down the following images:

    • textual-snowflake

    • Either textual-ml or textual-ml-gpu, depending on whether you plan to use a GPU compute pool

  3. Use the Docker CLI to upload the images to the image repository.

The images are now available in Snowflake.
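For steps 1 and 3, a minimal SQL sketch might look like the following. The database, schema, and repository names are placeholders, and the repository_url column returned by SHOW IMAGE REPOSITORIES is the registry path to use when you tag and push the images with the Docker CLI.

USE DATABASE your_db;
USE SCHEMA your_schema;

-- Create an image repository to hold the Textual images
CREATE IMAGE REPOSITORY IF NOT EXISTS your_image_repository;

-- The repository_url column contains the registry URL to use with
-- docker login, docker tag, and docker push
SHOW IMAGE REPOSITORIES IN SCHEMA your_db.your_schema;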

Create the API service

The API service exposes the functions that are used to redact sensitive values in Snowflake. The service must be attached to a compute pool. You can scale the instances as needed, but a single instance of the API service is usually enough.
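If the compute pool that the services run in does not already exist, a minimal sketch might look like the following. The instance family is an assumption; choose one that matches your workload, and use a GPU family (for example, GPU_NV_S) if you plan to run textual-ml-gpu.

-- Create a small CPU compute pool for the services (sizing is an assumption)
CREATE COMPUTE POOL IF NOT EXISTS compute_pool
  MIN_NODES = 1
  MAX_NODES = 1
  INSTANCE_FAMILY = CPU_X64_M;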

DROP SERVICE IF EXISTS api_service;
CREATE SERVICE api_service
  IN COMPUTE POOL compute_pool
  FROM SPECIFICATION $$
    spec:
      containers:
      - name: api_container
        image: /your_db/your_schema/your_image_repository/textual-snowflake:latest
        env:
          ML_SERVICE_URL: https://ml-service:7701
      endpoints:
        - name: api_endpoint
          port: 9002
          protocol: HTTP
      $$
   MIN_INSTANCES=1
   MAX_INSTANCES=1;
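To confirm that the service was created and to troubleshoot startup issues, queries like the following should work. The container name matches the name in the specification above.

-- List the endpoints that the service exposes
SHOW ENDPOINTS IN SERVICE api_service;

-- View the container logs for instance 0 of the api_container container
SELECT SYSTEM$GET_SERVICE_LOGS('api_service', '0', 'api_container');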

Create the machine learning (ML) service

Next, you create the ML service, which recognizes personally identifiable information (PII) and other sensitive values in text. This service is more likely than the API service to need scaling.

DROP SERVICE IF EXISTS ml_service;
CREATE SERVICE ml_service
  IN COMPUTE POOL compute_pool
  FROM SPECIFICATION $$
    spec:
      containers:
      - name: ml_container
        image: /your_db/your_schema/your_image_repository/textual-ml:latest
      endpoints:
        - name: ml_endpoint
          port: 7701
          protocol: TCP
      $$
   MIN_INSTANCES=1
   MAX_INSTANCES=1;
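Because the ML service is the one that is most likely to need scaling, you can raise its instance count later without recreating it. A sketch, with a placeholder value:

-- Allow the ML service to scale out to three instances (the value is a placeholder)
ALTER SERVICE ml_service SET MAX_INSTANCES = 3;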

Create functions

You can create custom SQL functions that use your API and ML services. You can then call these functions directly from Snowflake.

CREATE OR REPLACE FUNCTION textual_redact(input_text STRING, config STRING)
  RETURNS STRING
  SERVICE = your_db.your_schema.api_service
  ENDPOINT = 'api_endpoint'
  AS '/api/redact';

CREATE OR REPLACE FUNCTION textual_redact(input_text STRING)
  RETURNS STRING
  SERVICE = your_db.your_schema.api_service
  CONTEXT_HEADERS = (current_user)
  ENDPOINT = 'api_endpoint'
  AS '/api/redact';
  
CREATE OR REPLACE FUNCTION textual_parse(path VARCHAR, stage_name VARCHAR, md5sum VARCHAR)
  RETURNS STRING
  SERVICE = your_db.your_schema.api_service
  CONTEXT_HEADERS = (current_user)
  ENDPOINT = 'api_endpoint'
  MAX_BATCH_ROWS = 10
  AS '/api/parse/start';

Example usage

It can take a couple of minutes for the containers to start. After the containers are started, you can use the functions that you created in Snowflake.
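To check whether the containers are ready, queries like the following should work. The status for each container is READY when the service can accept requests.

-- Returns the status of each container in the service
SELECT SYSTEM$GET_SERVICE_STATUS('api_service');
SELECT SYSTEM$GET_SERVICE_STATUS('ml_service');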

To test the functions, use an existing table. You can also create this simple test table:

CREATE TABLE Messages (
    Message TEXT
);

INSERT INTO Messages (Message) VALUES ('Hi my name is John Smith');
INSERT INTO Messages (Message) VALUES ('Hi John, mine is Jane Doe');

For example:

SELECT Message,
  textual_redact(Message) AS REDACTED,
  textual_redact(Message, PARSE_JSON('{"NAME_GIVEN": "Synthesis", "NAME_FAMILY": "Off"}')) AS SYNTHESIZED
FROM Messages;

By default, the function redacts the detected entity values. In other words, it replaces each value with a placeholder that includes the entity type. Synthesis replaces the value with a realistic replacement value. Off leaves the value as is.

The response from the above example should look something like this:

| Message | Redacted | Synthesized |
| --- | --- | --- |
| Hi my name is John Smith | Hi my name is [NAME_GIVEN_Kx0Y7] [NAME_FAMILY_s9TTP0] | Hi my name is Lamar Smith |
| Hi John, mine is Jane Doe | Hi [NAME_GIVEN_Kx0Y7], mine is [NAME_GIVEN_veAy9] [NAME_FAMILY_6eC2] | Hi Lamar, mine is Doris Doe |

You use the function in the same way as any other user-defined function. You can pass in additional configuration to determine how to process specific types of sensitive values.
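For example, a sketch that leaves given names untouched and synthesizes family names, using the same entity type keys as the earlier query (other built-in entity types can be configured in the same way):

SELECT Message,
  textual_redact(Message, PARSE_JSON('{"NAME_GIVEN": "Off", "NAME_FAMILY": "Synthesis"}')) AS CUSTOMIZED
FROM Messages;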

The textual_redact function works identically to the textual_redact function in the Snowflake Native App.

The textual_parse function works identically to the textual_parse function in the Snowflake Native App.
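A hypothetical sketch of calling textual_parse over the files on a stage that has a directory table enabled. The stage name docs_stage and the exact argument format are assumptions; the RELATIVE_PATH and MD5 columns from the directory table are passed as the path and md5sum arguments.

-- Parse every file on the stage (docs_stage is a placeholder stage name)
SELECT relative_path,
  textual_parse(relative_path, 'docs_stage', md5) AS parsed_json
FROM DIRECTORY(@docs_stage);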
