Structural process overview for Amazon Redshift
The data generation process for Amazon Redshift differs slightly based on whether you use the previous data generation process or the newer Data Pipeline V2 process. Structural uses the Data Pipeline V2 process by default.
Both processes use Amazon S3 to store the data as CSV files during data generation.
Data Pipeline V2 process
The following high-level diagram describes how Structural orchestrates the processing and moving of data in Amazon Redshift during the Data Pipeline V2 data generation process.

This diagram specifically shows the Amazon Redshift data generation process. For the Structural architecture diagram, go to Structural architecture.
Structural orchestrates the moving and transforming of data between Amazon Redshift databases. To do this, Structural uses Amazon S3.
Structural manages the lifetimes of the data and resources that it uses in AWS. You only need to assign the necessary permissions to the IAM role that Structural uses.
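The specific permissions depend on your deployment. As a purely hypothetical illustration of the kind of S3 access that the process below implies, an inline policy on the Structural role might resemble the following boto3 sketch; the role name, bucket name, policy name, and action list are assumptions for illustration, not Structural's documented requirements.

```python
import json

import boto3  # AWS SDK for Python

iam = boto3.client("iam")

# Hypothetical names -- replace with your own values.
BUCKET = "my-structural-staging-bucket"
ROLE_NAME = "structural-data-generation-role"

# A minimal sketch of the S3 access implied by the process below:
# Structural writes CSV files to the bucket, reads them back, and
# lists the input/output folders. Illustrative only; consult the
# Structural documentation for the authoritative permission list.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:PutObject", "s3:GetObject", "s3:DeleteObject"],
            "Resource": f"arn:aws:s3:::{BUCKET}/*",
        },
        {
            "Effect": "Allow",
            "Action": ["s3:ListBucket"],
            "Resource": f"arn:aws:s3:::{BUCKET}",
        },
    ],
}

# Attach the policy inline to the role that Structural assumes.
iam.put_role_policy(
    RoleName=ROLE_NAME,
    PolicyName="structural-s3-staging-access",
    PolicyDocument=json.dumps(policy),
)
```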
At a high level, the process is:
1. Structural copies the table data into Amazon S3 as CSV files. You specify the S3 bucket path in the Structural workspace configuration. Within the S3 bucket, the data files are copied into an `input` folder. (For the Redshift primitives that correspond to these copy steps, see the sketch after this list.)
2. After it transforms the data in a file, Structural copies the transformed file to the `output` folder in the configured S3 bucket.
3. After it processes all of the files for a table, Structural copies the output data back into Amazon Redshift, into the destination database.
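Structural performs these copy steps internally; you do not run any SQL yourself. As a rough sketch of the underlying Amazon Redshift primitives, the export in step 1 corresponds to an UNLOAD to the S3 `input` path, and the load in step 3 to a COPY from the `output` path. The example below issues those statements through the Amazon Redshift Data API via boto3; the cluster, database, user, table, and bucket names are placeholders, and the real file layout and options are managed by Structural.

```python
import boto3

# Hypothetical identifiers -- replace with your own values.
CLUSTER = "my-redshift-cluster"
DB_USER = "structural"
IAM_ROLE = "arn:aws:iam::123456789012:role/structural-data-generation-role"
BUCKET_PATH = "s3://my-structural-staging-bucket"

client = boto3.client("redshift-data")

# Step 1 equivalent: export a source table to CSV files in the
# input folder of the staging bucket.
client.execute_statement(
    ClusterIdentifier=CLUSTER,
    Database="source_db",
    DbUser=DB_USER,
    Sql=f"""
        UNLOAD ('SELECT * FROM public.customers')
        TO '{BUCKET_PATH}/input/customers_'
        IAM_ROLE '{IAM_ROLE}'
        FORMAT AS CSV;
    """,
)

# Step 3 equivalent: after transformation, load the CSV files from
# the output folder into the table in the destination database.
client.execute_statement(
    ClusterIdentifier=CLUSTER,
    Database="destination_db",
    DbUser=DB_USER,
    Sql=f"""
        COPY public.customers
        FROM '{BUCKET_PATH}/output/customers_'
        IAM_ROLE '{IAM_ROLE}'
        FORMAT AS CSV;
    """,
)
```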
Previous data generation process
The following high-level diagram describes how Structural orchestrates the processing and moving of data in Amazon Redshift during the previous data generation process.

This diagram specifically shows the Amazon Redshift data generation process. For the Structural architecture diagram, go to Structural architecture.
Structural orchestrates the moving and transforming of data between Amazon Redshift databases. To do this, Structural uses the Amazon S3, Amazon SQS, and AWS Lambda services.
Structural manages the lifetimes of the data and resources that it uses in AWS. You only need to assign the necessary permissions to the IAM role that Structural uses.
At a high level, the process is:
1. Structural creates a Lambda function for your version of Structural. This step is performed once per version of Structural. The Lambda function is created when you run your first data generation job after you install or update Structural.
2. Structural creates an Amazon SQS queue and Amazon S3 event triggers. This is done once for each data generation job. The resource names are scoped to your specific generation job.
3. Structural copies the table data into Amazon S3 as CSV files. You specify the S3 bucket path in the Structural workspace configuration. Within the S3 bucket, the data files are copied into an `input` folder.
4. As files land in Amazon S3, S3 event notifications place messages in the Amazon SQS queue, and those messages trigger Lambda function invocations. By default, each file placed in Amazon S3 has a maximum size of 50 MB. Each Lambda invocation processes a single file, then writes the transformed file back to Amazon S3 in an `output` folder in the S3 bucket. (See the handler sketch after this list.)
5. After it processes all of the files for a table, Structural copies the data back into Amazon Redshift, into the destination database.
6. After it processes all of the tables, Structural removes ephemeral AWS components such as the Amazon SQS queue and the Amazon S3 event notifications.
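Structural creates and tears down these resources automatically; nothing here requires you to write a Lambda yourself. As an illustration of the event flow in steps 2 through 4, the sketch below shows a hypothetical handler that receives S3 event notifications delivered through SQS, reads each newly landed input file, applies a transformation, and writes the result to the `output` folder. The `transform_csv` function and the folder layout are assumptions for illustration, not Structural's actual implementation.

```python
import json
from urllib.parse import unquote_plus

import boto3

s3 = boto3.client("s3")


def transform_csv(data: bytes) -> bytes:
    """Placeholder for the de-identification transform that the
    Lambda function applies to each file. Illustrative only."""
    return data


def handler(event, context):
    # For an SQS-triggered Lambda, each SQS message body wraps
    # one S3 event notification.
    for sqs_record in event["Records"]:
        s3_event = json.loads(sqs_record["body"])
        for s3_record in s3_event.get("Records", []):
            bucket = s3_record["s3"]["bucket"]["name"]
            # S3 object keys in event notifications are URL-encoded.
            key = unquote_plus(s3_record["s3"]["object"]["key"])

            # Each invocation processes a single file
            # (at most 50 MB by default).
            obj = s3.get_object(Bucket=bucket, Key=key)
            transformed = transform_csv(obj["Body"].read())

            # Write the transformed file to the output folder,
            # e.g. "input/customers_0000" -> "output/customers_0000".
            out_key = key.replace("input/", "output/", 1)
            s3.put_object(Bucket=bucket, Key=out_key, Body=transformed)
```

Processing one file per invocation keeps each Lambda run small and lets the transformation parallelize naturally across files, which fits the 50 MB default cap on each file's size.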