Snowpark Container Services (SPCS) allows developers to run containerized workloads directly within Snowflake. Because Tonic Textual is distributed as Docker images in a private repository, you can use those images in SPCS to run Textual workloads.
It is quicker to use the Snowflake Native App, but SPCS allows for more customization.
To use the Textual images, you must add them to Snowflake. The Snowflake documentation and tutorial walk through the process in great detail, but the basic steps are as follows:
To pull down the required images, you must have access to our private Docker image repository on Quay.io. You should have been provided credentials during onboarding. If you require new credentials, or you experience issues accessing the repository, contact support@tonic.ai. Once you have access, pull down the following images:
- `textual-snowflake`
- Either `textual-ml` or `textual-ml-gpu`, depending on whether you plan to use a GPU compute pool
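After you pull the images, you tag each one and push it to a Snowflake image repository. The following is a minimal sketch of how that could be scripted, not an official procedure. The `quay.io/tonicai` source path, the `latest` tag, and the target repository URL are assumptions; substitute the values from your onboarding details and from the output of `SHOW IMAGE REPOSITORIES`. The sketch also assumes that you have already run `docker login` against both registries.

```python
import subprocess

# Assumed source path on Quay.io; confirm it against your onboarding details.
SOURCE_REGISTRY = "quay.io/tonicai"


def required_images(use_gpu: bool) -> list[str]:
    """Return the API image plus the ML image that matches the compute pool."""
    ml_image = "textual-ml-gpu" if use_gpu else "textual-ml"
    return ["textual-snowflake", ml_image]


def transfer_commands(target_repo: str, use_gpu: bool = False, tag: str = "latest") -> list[list[str]]:
    """Build the docker pull, tag, and push commands for each required image."""
    commands = []
    for name in required_images(use_gpu):
        source = f"{SOURCE_REGISTRY}/{name}:{tag}"
        target = f"{target_repo}/{name}:{tag}"
        commands.append(["docker", "pull", source])
        commands.append(["docker", "tag", source, target])
        commands.append(["docker", "push", target])
    return commands


def transfer_images(target_repo: str, use_gpu: bool = False) -> None:
    """Run each command in order and stop at the first failure."""
    for command in transfer_commands(target_repo, use_gpu):
        subprocess.run(command, check=True)
```

The target repository URL typically has the form `<orgname>-<acctname>.registry.snowflakecomputing.com/<db>/<schema>/<repository>`. To review the commands without running them, print the result of `transfer_commands(...)`.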
The images are now available in Snowflake.
The API service exposes the functions that are used to redact sensitive values in Snowflake. The service must be attached to a compute pool. You can scale the instances as needed, but you likely only need one API service instance.
Next, you create the ML service, which recognizes personally identifiable information (PII) and other sensitive values in text. The ML service is more likely to need scaling than the API service.
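As an illustration, both services could be created with `CREATE SERVICE` statements submitted through a DB-API cursor, such as one from the Snowflake Connector for Python. This is a sketch under stated assumptions: the service names, container name, endpoint names, and ports are hypothetical, and the real Textual containers may need additional specification settings that are not shown here.

```python
# Hypothetical names and ports throughout; the real Textual containers may
# need additional specification settings that are not shown here.
SERVICE_TEMPLATE = """\
CREATE SERVICE IF NOT EXISTS {name}
  IN COMPUTE POOL {pool}
  FROM SPECIFICATION $$
spec:
  containers:
  - name: main
    image: {image}
  endpoints:
  - name: {endpoint}
    port: {port}
$$
  MIN_INSTANCES = 1
  MAX_INSTANCES = {max_instances}"""


def service_sql(name: str, pool: str, image: str, endpoint: str, port: int, max_instances: int = 1) -> str:
    """Render one CREATE SERVICE statement with an inline specification."""
    return SERVICE_TEMPLATE.format(
        name=name, pool=pool, image=image, endpoint=endpoint,
        port=port, max_instances=max_instances,
    )


def create_services(cursor, pool: str, repo_path: str,
                    ml_image: str = "textual-ml", ml_max_instances: int = 2) -> None:
    """Create the API service first, then the ML service with more headroom."""
    api_sql = service_sql("textual_api_service", pool,
                          f"{repo_path}/textual-snowflake:latest", "api", 8080)
    ml_sql = service_sql("textual_ml_service", pool,
                         f"{repo_path}/{ml_image}:latest", "ml", 8081, ml_max_instances)
    for statement in (api_sql, ml_sql):
        cursor.execute(statement)
```

Here `repo_path` is the image repository path in the form `/<db>/<schema>/<repository>`. The API service stays at a single instance, while `ml_max_instances` lets the ML service scale out.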
You can create custom SQL functions that use your API and ML services. These functions are accessible directly from within Snowflake.
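A service function binds a SQL function name to an endpoint and HTTP path on a service. The sketch below registers two overloads of `textual_redact`, one with and one without a configuration argument. The service name, endpoint name, `/api/redact` path, and the `VARIANT` type of the configuration argument are all assumptions for illustration; use the values that match your own service specification.

```python
FUNCTION_TEMPLATE = """\
CREATE OR REPLACE FUNCTION {name}({args})
  RETURNS {returns}
  SERVICE = {service}
  ENDPOINT = {endpoint}
  AS '{path}'"""


def create_redact_functions(cursor, service: str = "textual_api_service",
                            endpoint: str = "api", path: str = "/api/redact") -> None:
    """Register textual_redact with and without a config argument.

    The service, endpoint, and path defaults are hypothetical placeholders.
    """
    signatures = ("input_text STRING", "input_text STRING, config VARIANT")
    for args in signatures:
        cursor.execute(FUNCTION_TEMPLATE.format(
            name="textual_redact", args=args, returns="STRING",
            service=service, endpoint=endpoint, path=path,
        ))
```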
It can take a couple of minutes for the containers to start. After the containers are started, you can use the functions that you created in Snowflake.
To test the functions, use an existing table. You can also create this simple test table:
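A test table along these lines could be created through a DB-API cursor. The table name `textual_test_messages` and the column name `message` are hypothetical; the two rows are the sample messages used in this section. The `%s` placeholder assumes the connector's default `pyformat` parameter style.

```python
# Sample messages used in this section.
TEST_ROWS = ["Hi my name is John Smith", "Hi John, mine is Jane Doe"]


def create_test_table(cursor, table: str = "textual_test_messages") -> None:
    """Create a one-column table and load the sample messages.

    The table name is a hypothetical placeholder. CREATE OR REPLACE drops
    any existing table with the same name.
    """
    cursor.execute(f"CREATE OR REPLACE TABLE {table} (message STRING)")
    cursor.executemany(
        f"INSERT INTO {table} (message) VALUES (%s)",
        [(row,) for row in TEST_ROWS],
    )
```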
You use the function in the same way as any other user-defined function. You can pass in additional configuration to determine how to process specific types of sensitive values.
For example:
By default, the function redacts the entity values. In other words, it replaces the values with a placeholder that includes the type. `Synthesis` replaces the value with a realistic replacement value. `Off` leaves the value as is.
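To make the three behaviors concrete, here is a small local simulation. It is not the Textual implementation: the entity spans are supplied by hand instead of coming from the ML service, the synthetic values come from a hypothetical lookup table, and the hash-based placeholder suffix only illustrates that the same value maps to the same placeholder.

```python
import hashlib


def apply_config(text, spans, config=None, synthetic_values=None):
    """Rewrite detected spans according to a per-entity-type handling option."""
    config = config or {}
    synthetic_values = synthetic_values or {}
    pieces = []
    position = 0
    for start, end, entity_type in sorted(spans):
        value = text[start:end]
        option = config.get(entity_type)
        if option == "Off":
            replacement = value  # leave the value as is
        elif option == "Synthesis":
            # Hypothetical lookup; the real service generates realistic values.
            replacement = synthetic_values.get(value, value)
        else:
            # Default: a placeholder that includes the entity type. The
            # hash-based suffix is an illustration, not the real token scheme.
            suffix = hashlib.sha256(value.encode("utf-8")).hexdigest()[:5]
            replacement = f"[{entity_type}_{suffix}]"
        pieces.append(text[position:start])
        pieces.append(replacement)
        position = end
    pieces.append(text[position:])
    return "".join(pieces)


message = "Hi my name is John Smith"
spans = [(14, 18, "NAME_GIVEN"), (19, 24, "NAME_FAMILY")]  # hand-labeled spans
redacted = apply_config(message, spans)
synthesized = apply_config(
    message, spans,
    {"NAME_GIVEN": "Synthesis", "NAME_FAMILY": "Off"},
    {"John": "Lamar"},
)
```

With these inputs, `synthesized` replaces the given name and keeps the family name, while `redacted` replaces both names with typed placeholders.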
The `textual_redact` function works identically to the `textual_redact` function in the Snowflake Native App.
The response from the above example should look something like this:

Message | Redacted | Synthesized |
---|---|---|
Hi my name is John Smith | Hi my name is [NAME_GIVEN_Kx0Y7] [NAME_FAMILY_s9TTP0] | Hi my name is Lamar Smith |
Hi John, mine is Jane Doe | Hi [NAME_GIVEN_Kx0Y7], mine is [NAME_GIVEN_veAy9] [NAME_FAMILY_6eC2] | Hi Lamar, mine is Doris Doe |

The `textual_parse` function works identically to the `textual_parse` function in the Snowflake Native App.