Amazon Redshift

Overview

Tonic supports moving data from one database to another within a single Redshift instance and can also move data between Redshift instances.
In both cases, Tonic uses S3 as an intermediate stage to host both real and masked data.

Diagram of the Tonic ETL process

The following high-level diagram describes how Tonic orchestrates the processing and moving of data in Redshift.
This diagram is not the same as the Tonic architectural diagram.
Tonic orchestrates the moving and transforming of data between Redshift databases. Tonic uses the S3, SQS, and Lambda services in AWS to accomplish this task.
Tonic manages the lifetimes of the data and resources that it uses in AWS. You only need to assign the necessary permissions to the IAM role that Tonic uses.
At a high level, the process is:
  1. Create an AWS Lambda function for your version of Tonic. This step is performed once per version of Tonic. The Lambda function is created when you run your first job after a new installation or after a version upgrade.
  2. Create an AWS SQS queue and S3 event triggers. This step is performed once per job. The resource names are scoped to your specific generation job.
  3. Copy table data into S3. The S3 bucket path is specified in the Tonic UI.
  4. As files land in S3, S3 event notifications place messages in SQS, and the messages in SQS trigger Lambda function invocations. By default, each file placed in S3 has a maximum size of 50 MB. Each Lambda invocation processes a single file and then writes the result back to S3 in a different location, which the user also specifies in the Tonic application.
  5. After all files for a table are processed, Tonic copies the data back into Redshift, into the destination database.
  6. After all tables are processed, Tonic removes ephemeral AWS components such as the SQS queue and the S3 event notifications.
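
To make the file-by-file step more concrete, the following Python sketch shows how a Lambda function that is triggered by SQS could read the S3 event notifications in each message. It is an illustration only, not Tonic's actual Lambda code. The function names `files_from_sqs_event` and `min_invocations` are hypothetical, and reading "50MB" as 50 MiB is an assumption. The sketch relies only on the standard AWS event shapes: each SQS record carries the S3 notification as a JSON string in `body`, and object keys in S3 notifications are URL-encoded.

```python
import json
import math
from urllib.parse import unquote_plus

# Assumption: the default "50MB" file limit is read as 50 MiB.
MAX_FILE_BYTES = 50 * 1024 * 1024


def files_from_sqs_event(event):
    """Return (bucket, key) pairs for each S3 object named in an
    SQS-triggered Lambda event (hypothetical helper)."""
    files = []
    for sqs_record in event.get("Records", []):
        # The SQS message body is the S3 event notification, as a JSON string.
        notification = json.loads(sqs_record["body"])
        # Messages without "Records" (such as S3 test events) are skipped.
        for s3_record in notification.get("Records", []):
            bucket = s3_record["s3"]["bucket"]["name"]
            # S3 URL-encodes object keys in notifications; decode before use.
            key = unquote_plus(s3_record["s3"]["object"]["key"])
            files.append((bucket, key))
    return files


def min_invocations(table_bytes, max_file_bytes=MAX_FILE_BYTES):
    """Lower bound on Lambda invocations for one table, given that each
    invocation processes a single file of at most max_file_bytes."""
    return math.ceil(table_bytes / max_file_bytes)
```

Under these assumptions, a table that produces 120 MiB of files needs at least three Lambda invocations. The real count can be higher, because files are allowed to be smaller than the maximum size.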

Limitations

Redshift support includes all standard Tonic features except for subsetting, which is not currently supported.
To use subsetting with Redshift, reach out to [email protected]