Using Textual with Snowpark Container Services directly
Snowpark Container Services (SPCS) allows developers to run containerized workloads directly within Snowflake. Because Tonic Textual is distributed using a private Docker repository, you can use these images in SPCS to run Textual workloads.
It is quicker to use the Snowflake Native App, but SPCS allows for more customization.
Add images to the repository
To use the Textual images, you must add them to Snowflake. The Snowflake documentation and tutorial walk through the process in great detail, but the basic steps are as follows:
To pull down the required images, you must have access to our private Docker image repository on Quay.io. You should have been provided credentials during onboarding. If you require new credentials, or you experience issues accessing the repository, contact support@tonic.ai. Once you have access, pull down the following images:
textual-snowflake
Either textual-ml or textual-ml-gpu, depending on whether you plan to use a GPU compute pool
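The pull step above, together with the tag and push steps that a Snowflake image repository requires, can be sketched as a small Python helper that prints the docker commands to run. This is only an illustration: the Quay.io organization path, the Snowflake repository URL, and the latest tag are placeholders, not values from Tonic or Snowflake. You also need to run docker login against both registries first.

```python
def image_transfer_commands(source_repo, target_repo, images, tag="latest"):
    """Return docker commands that copy each image from source_repo to target_repo."""
    commands = []
    for image in images:
        source = f"{source_repo}/{image}:{tag}"
        target = f"{target_repo}/{image}:{tag}"
        commands.append(["docker", "pull", source])          # download from Quay.io
        commands.append(["docker", "tag", source, target])   # rename for Snowflake
        commands.append(["docker", "push", target])          # upload to Snowflake
    return commands


# Placeholders (assumptions): use the Quay.io path from onboarding and the
# URL of your own Snowflake image repository.
SOURCE_REPO = "quay.io/<tonic-organization>"
TARGET_REPO = "<org>-<account>.registry.snowflakecomputing.com/<db>/<schema>/<repo>"

commands = image_transfer_commands(
    SOURCE_REPO, TARGET_REPO, ["textual-snowflake", "textual-ml"]
)
for command in commands:
    print(" ".join(command))
```

For a GPU compute pool, pass textual-ml-gpu in place of textual-ml.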
The images are now available in Snowflake.
Create the API service
The API service exposes the functions that are used to redact sensitive values in Snowflake. The service must be attached to a compute pool. You can scale the number of instances as needed, but you likely need only one API instance.
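As a rough sketch of what creating the API service might involve, the following Python helper renders a Snowflake CREATE SERVICE statement with an inline specification. The statement shape follows Snowflake's documented syntax, but the service name, container name, compute pool, endpoint name, and port are all hypothetical; the actual Textual specification may require other settings, such as environment variables.

```python
def create_service_sql(service, compute_pool, container, image, endpoint, port,
                       min_instances=1, max_instances=1):
    """Render a CREATE SERVICE statement with an inline YAML specification."""
    spec_lines = [
        "spec:",
        "  containers:",
        f"  - name: {container}",
        f"    image: {image}",
        "  endpoints:",
        f"  - name: {endpoint}",
        f"    port: {port}",
    ]
    spec = "\n".join(spec_lines)
    return (
        f"CREATE SERVICE {service}\n"
        f"  IN COMPUTE POOL {compute_pool}\n"
        f"  FROM SPECIFICATION $$\n{spec}\n$$\n"
        f"  MIN_INSTANCES = {min_instances}\n"
        f"  MAX_INSTANCES = {max_instances};"
    )


# Every value below is an assumption chosen for illustration.
api_sql = create_service_sql(
    service="textual_api",
    compute_pool="textual_pool",
    container="api",
    image="/<db>/<schema>/<repo>/textual-snowflake:latest",
    endpoint="api",
    port=8080,
)
print(api_sql)
```

The defaults keep the service at a single instance, which matches the expectation that one API instance is usually enough.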
Create the machine learning (ML) service
Next, you create the ML service, which recognizes personally identifiable information (PII) and other sensitive values in text. This is more likely to need scaling.
Create functions
You can create custom SQL functions that use your API and ML services. These functions are accessible directly from within Snowflake.
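A function that is backed by a service is declared with Snowflake's CREATE FUNCTION statement, using the SERVICE and ENDPOINT clauses. The helper below is a minimal sketch that renders such a statement. The argument list, return type, service name, endpoint name, and request path are assumptions, not the actual Textual definitions.

```python
def create_service_function_sql(function, arguments, returns, service, endpoint, path):
    """Render a CREATE FUNCTION statement that routes calls to a service endpoint."""
    args = ", ".join(f"{name} {sql_type}" for name, sql_type in arguments)
    return (
        f"CREATE OR REPLACE FUNCTION {function}({args})\n"
        f"  RETURNS {returns}\n"
        f"  SERVICE = {service}\n"
        f"  ENDPOINT = {endpoint}\n"
        f"  AS '{path}';"
    )


# The signature, service, endpoint, and path below are hypothetical.
redact_sql = create_service_function_sql(
    function="textual_redact",
    arguments=[("input_text", "VARCHAR"), ("config", "VARIANT")],
    returns="VARCHAR",
    service="textual_api",
    endpoint="api",
    path="/redact",
)
print(redact_sql)
```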
Example usage
It can take a couple of minutes for the containers to start. After the containers are started, you can use the functions that you created in Snowflake.
To test the functions, use an existing table. You can also create this simple test table:
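One possible shape for such a test table is a single text column that holds a few short messages. The sketch below renders the CREATE TABLE and INSERT statements for it. The table name and column name are hypothetical, and the two messages are the ones used in the sample output later in this section.

```python
def sql_literal(value):
    """Quote a string as a SQL literal, doubling any embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def build_sample_table_sql(table, messages):
    """Return CREATE TABLE and INSERT statements for a one-column test table."""
    values = ",\n  ".join(f"({sql_literal(message)})" for message in messages)
    return [
        f"CREATE TABLE {table} (message VARCHAR);",
        f"INSERT INTO {table} (message) VALUES\n  {values};",
    ]


# 'textual_test_messages' is an assumed table name.
statements = build_sample_table_sql(
    "textual_test_messages",
    ["Hi my name is John Smith", "Hi John, mine is Jane Doe"],
)
for statement in statements:
    print(statement)
```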
You use the function in the same way as any other user-defined function. You can pass in additional configuration to determine how to process specific types of sensitive values.
For example:
By default, the function redacts the entity values. In other words, it replaces the values with a placeholder that includes the type. Synthesis replaces the value with a realistic substitute. Off leaves the value as is.
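To make the three behaviors concrete, here is a toy Python model of how a per-entity-type configuration could be applied to already detected entities. It is not the Textual implementation: the entity dictionaries, the configuration shape, and the name "Redaction" for the default mode are assumptions made for illustration.

```python
def apply_entity_config(text, entities, config=None):
    """Rewrite text by handling each detected entity according to config.

    config maps an entity label to "Synthesis" or "Off"; any label that is
    missing (or has another value) falls back to placeholder redaction.
    """
    config = config or {}
    parts = []
    cursor = 0
    for entity in sorted(entities, key=lambda e: e["start"]):
        parts.append(text[cursor:entity["start"]])
        mode = config.get(entity["label"], "Redaction")
        if mode == "Off":
            parts.append(text[entity["start"]:entity["end"]])    # keep original
        elif mode == "Synthesis":
            parts.append(entity["synthetic"])                     # realistic substitute
        else:
            parts.append(f"[{entity['label']}_{entity['token']}]")  # typed placeholder
        cursor = entity["end"]
    parts.append(text[cursor:])
    return "".join(parts)


text = "Hi my name is John Smith"
given, family = text.index("John"), text.index("Smith")
# Hand-written detections; a real ML service would supply these.
entities = [
    {"label": "NAME_GIVEN", "start": given, "end": given + 4,
     "token": "Kx0Y7", "synthetic": "Lamar"},
    {"label": "NAME_FAMILY", "start": family, "end": family + 5,
     "token": "s9TTP0", "synthetic": "Jones"},
]

print(apply_entity_config(text, entities))
print(apply_entity_config(text, entities,
                          {"NAME_GIVEN": "Synthesis", "NAME_FAMILY": "Off"}))
```

In this model, the first call redacts both names with typed placeholders, and the second call synthesizes the given name while leaving the family name unchanged.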
The textual_redact function works identically to the textual_redact function in the Snowflake Native App.
The response from the above example should look something like this:
Message | Redacted | Synthesized |
---|---|---|
Hi my name is John Smith | Hi my name is [NAME_GIVEN_Kx0Y7] [NAME_FAMILY_s9TTP0] | Hi my name is Lamar Smith |
Hi John, mine is Jane Doe | Hi [NAME_GIVEN_Kx0Y7], mine is [NAME_GIVEN_veAy9] [NAME_FAMILY_6eC2] | Hi Lamar, mine is Doris Doe |
The textual_parse function works identically to the textual_parse function in the Snowflake Native App.