Structural process overview for Amazon Redshift
Last updated
The following high-level diagram describes how Tonic Structural orchestrates the processing and moving of data in Amazon Redshift.
This diagram is not the same as the Tonic architectural diagram.
Structural orchestrates the moving and transforming of data between Redshift databases. To do this, Structural uses the Amazon S3, Amazon SQS, and AWS Lambda services in AWS.
Structural manages the lifetimes of the data and resources that it uses in AWS. You only need to assign the necessary permissions to the IAM role that Structural uses.
At a high level, the process is:
Structural creates a Lambda function for your version of Structural. This step is performed once per version of Structural. The Lambda function is created when you run your first data generation job after you install or update Structural.
Structural creates an Amazon SQS queue and Amazon S3 event triggers. This is done once for each data generation job. The resource names are scoped to your specific generation job.
Structural copies the table data into Amazon S3 as CSV files. You specify the S3 bucket path in the Structural workspace configuration. Within the S3 bucket, the data files are copied into an input folder.
As files land in Amazon S3, Amazon S3 event notifications place messages in Amazon SQS.
Messages in Amazon SQS trigger Lambda function invocations. By default, each file placed in Amazon S3 has a maximum file size of 50 MB. Each Lambda invocation processes a single file and then writes the result back to Amazon S3, in an output folder in the S3 bucket.
After it processes all of the files for a table, Structural copies data back into Amazon Redshift, into the destination database.
After it processes all of the tables, Structural removes ephemeral AWS components such as Amazon SQS and Amazon S3 event notifications.
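To make the event flow in the steps above more concrete, the following is a minimal Python sketch of how an SQS-triggered Lambda handler might locate the S3 file that each message refers to and work out where the processed file goes. It is not Structural's actual Lambda code. The folder names `input` and `output`, the function names, and the key layout are assumptions for illustration only. Reading the file, transforming it, and writing it back to Amazon S3 are omitted.

```python
import json
from urllib.parse import unquote_plus

# Hypothetical folder names; the real names that Structural uses may differ.
INPUT_FOLDER = "input"
OUTPUT_FOLDER = "output"


def s3_objects_from_sqs_event(event):
    """Yield (bucket, key) for each S3 object referenced by an SQS-triggered event."""
    for sqs_record in event.get("Records", []):
        # Each SQS message body carries the S3 event notification as JSON text.
        body = json.loads(sqs_record["body"])
        # Amazon S3 sends an s3:TestEvent message without "Records" when
        # notifications are first configured, so default to an empty list.
        for s3_record in body.get("Records", []):
            bucket = s3_record["s3"]["bucket"]["name"]
            # Object keys in S3 notifications are URL-encoded.
            key = unquote_plus(s3_record["s3"]["object"]["key"])
            yield bucket, key


def output_key_for(input_key):
    """Map a key under the input folder to the same path under the output folder."""
    parts = input_key.split("/")
    if INPUT_FOLDER not in parts[:-1]:
        raise ValueError(f"key is not under an '{INPUT_FOLDER}' folder: {input_key}")
    # Replace the first folder segment that matches the input folder name.
    parts[parts.index(INPUT_FOLDER)] = OUTPUT_FOLDER
    return "/".join(parts)


def handler(event, context=None):
    """Return (bucket, input_key, output_key) for each file in the event.

    A real handler would download the input file, transform it, and upload
    the result to the output key. This sketch only plans the work.
    """
    return [
        (bucket, key, output_key_for(key))
        for bucket, key in s3_objects_from_sqs_event(event)
    ]
```

In this sketch, each entry that `handler` returns corresponds to a single file, which mirrors the one-file-per-invocation behavior described above when the queue delivers one message at a time.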