1 of 6

Running data generation jobs

Data generation uses the configured tables modes and generators to transform the data in the source database or source files. The transformed data is used to create the destination database or to write transformed files to file storage.

Types of data generation

Simple data generation

In the simplest type of data generation, Tonic Structural uses the configured table modes and generators to transform data in the source database and then write the transformed data to the destination location. The destination location is usually a database server, but might also be:

A storage location such as an S3 bucket.
A container repository.
A Tonic Ephemeral snapshot.

For a file connector workspace, the data generation job uses the configured generators for each file group to transform the data in the source files. The transformed data is used to create output files that correspond to the source files.

Subsetting data generation

When subsetting is enabled, Structural first identifies the tables and rows to include in the subset. It uses the configured table modes and generators to transform the data. It then writes the transformed data to the destination location.

Upsert data generation

Required license: Professional or Enterprise

When upsert is enabled, Structural runs a data generation job that writes the transformed data to an intermediate database. The data generation can include subsetting.

After the initial data generation, Structural runs an upsert job to add or update the appropriate records from the intermediate database to the destination database. The upsert job only adds and updates records. It does not remove any records from previous data generation jobs.

Before Structural can run an upsert job, the destination database must already exist and have the correct schema defined. To initialize the destination database:

Disable upsert.
Run a regular data generation.
Re-enable upsert.

Data generation process

The following diagram provides an overview of how Tonic Structural produces destination data during data generation.

Tear down the destination database

To start, Structural analyzes the destination database to understand all of the database objects, including the schemas, tables, views, columns, and constraints.

For tables that use Preserve Destination or Incremental mode, Structural creates a copy of those tables in a temporary schema. The temporary schema holds those tables outside of the teardown process.

To remove all of the destination database objects, Structural issues logically ordered DROP and DELETE statements to the destination database. It continues to issue the statements until the destination database is empty.

Rebuild the destination database

After it tears down the destination database, Structural next analyzes the source database to understand all of the affected database objects.

In a logical order, Structural dumps the source database schema into pre-data and post-data schemas.

Pre-data schema - Schemas, tables, and columns with data types. The pre-data schema does not include keys or constraints.
Post-data schema - Other database elements, including keys, constraints, indexes, and views.

During data generation, Structural stores the pre-data and post-data schemas as temporary files.

Structural then applies the pre-data schema to the destination database.

Populate the destination data

Next, Structural de-identifies the source data, and uses the de-identified data to populate the destination database.

Structural processes the tables in the following order:

Tables that directly have generators applied to them.
Tables that have foreign key dependencies on those de-identified tables.
Tables for which no data is de-identified, also referred to as Passthrough tables.

For all of these tables, Structural:

Reads the data from the source database.
De-identifies the data based on the workspace configuration.
Writes the de-identified data to the destination database.

For tables that use Incremental mode, Structural copies the data back to the destination database. To understand the data to copy, Structural executes queries from the destination database to the source database.

Complete post-data tasks

After Structural copies all of the required data for a destination table, it applies the post-data schema elements.

Structural performs this process in an order that ensures that all of the dependencies for a database object are in place before Structural attempts to create the object.

Structural then updates any sequences that were created to ensure that they have the most accurate data.

For the Preserve Destination tables, Structural restores the data from the temporary schema. It then removes the temporary schema.

Run post-job tasks

When the data generation job is complete, Structural runs any post-job tasks, including:

Webhooks
Post-job scripts

Running data generation manually

Selecting the data generation option

Required workspace permission: Run data generation

To start the data generation, at the top right of the workspace management view, click Generate Data.

As you configure the data generation options, Structural runs checks to verify that you can use the current configuration to generate data.

If any of these checks do not pass, then when you click Generate Data, Structural displays information about why you cannot run the data generation job.

If all of those checks pass, then when you click Generate Data, if there are no warnings, the Confirm Generation panel displays.

Warning for non-conflicting schema changes

Data generation is always blocked by conflicting schema changes.

The workspace configuration includes whether to block data generation for all schema changes, including non-conflicting changes.

If this setting is turned off, then if there are non-conflicting schema changes, when you click Generate Data, a warning displays. Non-conflicting schema changes include new tables and columns. If the new columns contain sensitive data, then if you do not assign generators before you generate data, that sensitive data will be in the destination database.

If you are sure that the data in the new tables and columns is not sensitive, then to continue to the Confirm Generation panel, click Continue to Data Generation.

Confirming the generation details

The Confirm Generation panel allows you to confirm the details for the data generation.

Indicating whether to use upsert

If upsert is available for the workspace, then you can also determine whether to use upsert for data generation.

If upsert is enabled for the workspace, then by default Use Upsert is in the on position.

To not use upsert, toggle Use Upsert to the off position. When upsert is turned off, the data generation is a simple data generation that directly populates and replaces the destination database.

Indicating whether to generate a subset

If you configured subsetting, then you can indicate whether to only generate the subset.

To create a subset based on the current subsetting configuration, toggle Use Subsetting to the on position.

The initial setting matches the current setting in the subsetting configuration. If Use subsetting is enabled in Subsetting view, then it is enabled by default on the Generation Confirmation panel.

When you change the setting on the generation confirmation panel, it also updates the setting on Subsetting view.

Determining the data generation process to use (Oracle and SQL Server only)

Tonic.ai has released an improved version of the data generation process. It is used automatically for several data connectors. It is optional for Oracle and SQL Server.

For the new process, the job type is Data Pipeline Generation instead of Data Generation.

By default, Oracle and SQL Server workspaces use the new process.

To instead use the previous process, on the Confirm Generation panel, toggle Data Pipeline V2 to the off position.

Enabling diagnostic logging for the job

Required global permission: Enable diagnostic logging

If the data connector is not configured to use diagnostic logging, then you can choose whether to enable diagnostic logging for an individual data generation job. The option is also available for data connectors that do not have a diagnostic logging setting.

On the Confirm Generation panel, to enable diagnostic logging for the job, toggle Enable Diagnostic Logging to the on position.

Access to diagnostic logs is also controlled by the Enable diagnostic logging global permission. If you do not have this permission, then you cannot download diagnostic logs.

Indicating whether to generate performance metrics

Required workspace permission: Download job logs

To help to troubleshoot issues, for workspaces that use the newer data generation processing, you can configure the data generation job to also generate performance metrics.

The performance metrics start when a specified table is processed, and continue for a specified length of time.

To enable performance metrics for the data generation job:

Toggle Collect Performance Metrics to the on position.
From the Table Trigger dropdown list, select the table that triggers the performance metrics.
From the Trace Duration dropdown list, select the length of time to run the performance metrics.

Viewing the destination location

The Confirm Generation panel provides the destination information for the workspace. To display the destination database connection details, click Destination Settings.

Depending on the workspace configuration and data connector type, the destination information is either:

Connection information for a database server.
A storage location such as an S3 bucket.
Configuration for an Ephemeral snapshot.
Information to create container artifacts.

If the destination information is incorrect, to navigate to the workspace configuration view to make updates, click Edit Destination Settings.

For a file connector workspace, if the source files came from a local file system, then the destination files are written to the large file store in the Structural application database. You can download the most recently generated files.

Verifying the intermediate database connection information (for upsert)

When upsert is enabled, the Confirm Generation panel provides access to the connection information for the intermediate database. To display the intermediate database connection details, click Intermediate Upsert Database.

If the intermediate database information is incorrect, to navigate to the workspace configuration view to make updates, click Edit Intermediate.

Viewing generation performance tips

For data generation, assigning Truncate table mode to tables that you don't need data for can improve generation performance.

For subsetting, if an upstream table is very large, and the foreign key columns are not indexed, then it can make the subsetting process run more slowly.

The Want faster generations? message displays at the bottom of the Confirm Generation panel. It displays for all non-subsetting jobs. For subsetting jobs, the panel only displays if Structural identified columns that you should consider indexing.

To display information about tips for faster generation, click Generation Tips.

Viewing suggested columns to index

On the Generation Tips panel for subsetting jobs, the Add Indexes panel displays the first few columns that you might consider indexing.

To display a panel with a suggested SQL command to add the index, click the information icon next to the column.

On the panel, to copy the command to the clipboard, click Copy SQL to Clipboard.

If there are additional columns that are not listed, then to display the full list of columns to index, click Show all columns.

On the full list, to download the list to a CSV file, click Download list of columns (.csv).

Hint to truncate tables

On the Generation Tips panel for non-subsetting jobs, the Truncate Tables panel displays the hint to truncate tables that contain data that you do not need in the destination database.

To navigate to Database View to change the current configuration, click Go to Database View.

Starting the generation job

On the Confirm Generation panel, after you confirm the generation details, to start the data generation, click Run Generation.

When upsert is enabled, to start the data generation and upsert jobs:

Click Run Generation + Upsert.
In the menu, click Run Generation + Upsert.

Structural displays a notification that the job has started. To track the progress of the data generation job and view the results, click the View Job button on the notification, or go to Jobs view.

Starting an upsert job based on the most recent data generation

If upsert is enabled for a workspace, then on the Confirm Generation panel, the more common option is to run both data generation and upsert.

After you run at least one successful data generation to the intermediate database, then you can also choose to run only the upsert process.

For example, if the data generation succeeds but the upsert process fails, then after you address the issues that caused the upsert to fail, you can run the upsert process again.

You also must start the upsert job manually if you turn off Automatically Start Upsert After Successful Data Generation in the workspace settings.

From the Confirm Generation panel, to run upsert only:

Click the Run Generation + Upsert button.
In the menu, click Run Upsert Only.

When you run upsert only, the process uses the results of the most recent data generation.

Scheduling data generation

Required workspace permission: Run data generation

Not available for Structural free trials, or for expired pay-as-you-go accounts.

You can also configure data generation to run on a regular schedule. For example, if new data is added regularly to your source database, then you might want to automatically run data generation jobs to transform the new data.

To schedule the job, you set up one or more configurations. Each configuration includes a cron expression to specify the schedule.

Creating and editing a job schedule

You manage the job schedule from Jobs view.

To create a schedule for a job, click Create Schedule.

To edit an existing job schedule, click Edit Schedule.

Set up the schedule configurations, then click Save.

Adding a configuration to a schedule

To add a configuration to the schedule:

Click Add Configuration.
By default, the configuration is active, and Active is in the on position. To have Structural ignore the configuration, toggle Active to the off position.
In the Cron Expression field, provide the schedule expression. The default value is 0 0 * * *, which means to run the job every day at midnight.
From the time zone dropdown list, select the time zone to use for the schedule.

Overview of the cron expression syntax

A cron expression is made up of five values separated by a space.

<minute> <hour> <day of month> <month> <day of week>

Where:

<minute>

The minute at which to run the job. Is a value between 0 and 59. Use * to run every minute. Use , to separate a list of values. For example, to run at 10 and 40 past the hour, use 10,40. Use - to provide a range of values. For example, to run every minute between 20 and 25 past the hour, use 20-25.

<hour>

The hour at which to run the job. Is a value between 0 and 23. Use * to run every hour. Use , to separate a list of values. For example, to run at 11:00 AM and 2:00 PM, use 11,14.

Use - to provide a range of values. For example, to run every hour between 5:00 PM and 9:00 PM, use 17-21.

<day of month>

The day of the month on which to run the job. Is a value between 1 and 31.

Use * to run every day.

Use , to separate a list of values. For example, to run on the 5th and the 15th of the month, use 5,15.

Use - to provide a range of values. For example, to run on the 8th through the 12th of the month, use 8-12.

<month>

The month in which to run the job. Can use either numbers (1 through 12) or 3-letter abbreviations (JAN through DEC).

Use * to run every month.

Use , to separate a list of values. For example, to run in March and September, use 3,9 or MAR,SEP.

Use - to provide a range of values. For example, to run every month between June and August, use 6-8 or JUN-AUG.

<day of week>

The day of the week on which to run the job. Can use either numbers (0 through 6) or 3-letter abbreviations (SUN through SAT).

Use * to run every day.

Use , to separate a list of values. For example, to run every Monday, Wednesday, and Friday, use 1,3,5 or MON,WED,FRI.

Use - to provide a range of values. For example, to run every week from Tuesday through Thursday, use 2-4 or TUE-THU.

Here are some example expressions:

15 * * * *

Every hour at 15 minutes past the hour.

15 10,14,18 * * *

Every day at 10:15 AM, 2:15 PM, and 6:15 PM.

0 0 15 * *

Midnight on the 15th of every month.

0 0 10-15 6 *

Midnight on June 10 through June 15th.

30 12 * * TUE

Every Tuesday at 12:30 PM.

Removing a configuration from the schedule

To remove a configuration from the schedule, click its Delete option.

If you remove all of the configurations and then save, then the job is no longer scheduled.

Scheduled jobs and other workspace configuration

When a job runs on a schedule, you cannot configure the options that are available from the Confirm Generation panel when you run data generation manually.

Subsetting

For a workspace that has subsetting configured, whether the scheduled job generates a subset is based on the Use Subsetting toggle on Subsetting view.

Upsert

For a workspace that has upsert enabled, whether a scheduled job completes both the initial data generation and the upsert process is determined by the workspace setting Automatically Start Upsert After Successful Data Generation.

If this is in the on position, then the scheduled job does both data generation and upsert.

If this is in the off position, then the scheduled job only does the data generation to the intermediate database.

Diagnostic logging

For a scheduled job, whether the job uses diagnostic logging is based on the environment setting configuration for the data connector.

Performance traces

A scheduled job does not generate performance traces.

Issues that prevent data generation

The following issues prevent a data generation or subsetting job.

Structural and workspace access or permissions

Issue

Description

How to resolve

Structural license expired

The current Structural license is expired.

No workspace configured

The Structural instance does not have any workspaces to generate data from.

Insufficient workspace permissions

You do not have permission to run data generation on this workspace.

Source and destination data availability

Issue

Description

How to resolve

Unable to connect to the source database

The source database connection is either missing or incomplete. If the connection uses a secret from a secrets manager, the secret is not available. For example, the authentication details might be incorrect, or the secret might not be in a valid format.

Unable to connect to the destination database

The destination database connection is either missing or incomplete. If the connection uses a secret from a secrets manager, the secret is not available. For example, the authentication details might be incorrect, or the secret might not be in a valid format.

Only for licenses that have a configured size limit - The source database size exceeds the maximum allowed

Only applies to licenses that are configured to enforce a maximum data size. When a Structural license is configured with a maximum source database size, then if the source database is larger than that maximum, the generation fails.

Reduce the size of the source database.

Table mode and generator configuration

Issue

Description

How to resolve

Scale mode - Invalid generators

A table that uses Scale mode has columns with assigned generators that are not valid for Scale mode.

Change the selected generator for the columns.

Scale mode - Passthrough generator or sub-generator for the Conditional generator

A table that uses Scale mode has columns that use the Conditional generator, and that are assigned Passthrough as a generator or sub-generator.

Change the selected generator, sub-generator, or default generator.

Preserve Destination - cannot resolve foreign key references

A table that uses Preserve Destination mode is referenced from another table.

Truncate - cannot resolve foreign key references

A table that uses Truncate mode is referenced from another table.

Cross Table Sum - incomplete generation

A column is assigned the Cross Table Sum generator, but some required configuration field values are missing.

Incremental mode - circular foreign key dependency

There is a circular foreign key dependency between tables that use Incremental mode.

Schema changes

Issue

Description

How to resolve

Unresolved schema changes

There are detected schema changes that are not resolved.

Subsetting configuration

The following errors occur when you attempt to generate a subset. They do not apply if Use subsetting is turned off.

Issue

Description

How to resolve

No target tables configured

The subsetting configuration does not include any target tables.

Invalid target table configuration

A target table has an invalid percentage value or WHERE clause.

In-subset table uses Scale mode

A table that is in the subset uses Scale table mode.

In-subset table uses Truncate mode.

A table that is in the subset uses Truncate table mode.

In-subset table uses Preserve Destination mode.

A table that is in the subset uses Preserve Destination table mode.

In-subset table uses Incremental mode.

A table that is in the subset uses Incremental table mode.

Upsert issues

When upsert is enabled, the following issues cause the upsert job to fail.

Issue

Description

How to resolve

Destination database not populated

The destination database is empty.

Invalid table mode

A table is assigned a mode other than De-Identify or Truncate.

Unable to connect to the intermediate database

The intermediate database connection is either missing or incomplete.

Unresolved schema conflicts

There are schema conflicts between the source and destination databases.

Update the source or destination database schema to resolve the changes.

Issues that prevent data generation

The following issues prevent a data generation or subsetting job.

Structural and workspace access or permissions

Issue

Description

How to resolve

Structural license expired

The current Structural license is expired.

No workspace configured

The Structural instance does not have any workspaces to generate data from.

Insufficient workspace permissions

You do not have permission to run data generation on this workspace.

Source and destination data availability

Issue

Description

How to resolve

Unable to connect to the source database

For a secret from a secrets manager, verify the secrets manager configuration and the secret format. Make sure to test the connection.

Unable to connect to the destination database

For a secret from a secrets manager, verify the secrets manager configuration and the secret format. Make sure to test the connection.

Only for licenses that have a configured size limit - The source database size exceeds the maximum allowed

Reduce the size of the source database.

Table mode and generator configuration

Issue

Description

How to resolve

Scale mode - Invalid generators

A table that uses Scale mode has columns with assigned generators that are not valid for Scale mode.

Change the selected generator for the columns.

Scale mode - Passthrough generator or sub-generator for the Conditional generator

A table that uses Scale mode has columns that use the Conditional generator, and that are assigned Passthrough as a generator or sub-generator.

Change the selected generator, sub-generator, or default generator.

Preserve Destination - cannot resolve foreign key references

A table that uses Preserve Destination mode is referenced from another table.

Truncate - cannot resolve foreign key references

A table that uses Truncate mode is referenced from another table.

Cross Table Sum - incomplete generation

A column is assigned the Cross Table Sum generator, but some required configuration field values are missing.

Incremental mode - circular foreign key dependency

There is a circular foreign key dependency between tables that use Incremental mode.

Schema changes

Issue

Description

How to resolve

Unresolved schema changes

There are detected schema changes that are not resolved.

Subsetting configuration

The following errors occur when you attempt to generate a subset. They do not apply if Use subsetting is turned off.

Issue

Description

How to resolve

No target tables configured

The subsetting configuration does not include any target tables.

Invalid target table configuration

A target table has an invalid percentage value or WHERE clause.

In-subset table uses Scale mode

A table that is in the subset uses Scale table mode.

In-subset table uses Truncate mode.

A table that is in the subset uses Truncate table mode.

In-subset table uses Preserve Destination mode.

A table that is in the subset uses Preserve Destination table mode.

In-subset table uses Incremental mode.

A table that is in the subset uses Incremental table mode.

Upsert issues

When upsert is enabled, the following issues cause the upsert job to fail.

Issue

Description

How to resolve

Destination database not populated

The destination database is empty.

, then . You can then re-enable upsert.

Invalid table mode

A table is assigned a mode other than De-Identify or Truncate.

Unable to connect to the intermediate database

The intermediate database connection is either missing or incomplete.

Unresolved schema conflicts

There are schema conflicts between the source and destination databases.

Update the source or destination database schema to resolve the changes.

Running data generation manually

Selecting the data generation option

Required workspace permission: Run data generation

To start the data generation, at the top right of the workspace management view, click Generate Data.

As you configure the data generation options, Structural runs checks to verify that you can use the current configuration to generate data.

If any of these checks do not pass, then when you click Generate Data, Structural displays information about why you cannot run the data generation job.

If all of those checks pass, then when you click Generate Data, if there are no warnings, the Confirm Generation panel displays.

Warning for non-conflicting schema changes

Data generation is always blocked by conflicting schema changes.

The workspace configuration includes whether to block data generation for all schema changes, including non-conflicting changes.

If you are sure that the data in the new tables and columns is not sensitive, then to continue to the Confirm Generation panel, click Continue to Data Generation.

Confirming the generation details

The Confirm Generation panel allows you to confirm the details for the data generation.

Indicating whether to use upsert

If upsert is available for the workspace, then you can also determine whether to use upsert for data generation.

If upsert is enabled for the workspace, then by default Use Upsert is in the on position.

To not use upsert, toggle Use Upsert to the off position. When upsert is turned off, the data generation is a simple data generation that directly populates and replaces the destination database.

Indicating whether to generate a subset

If you configured subsetting, then you can indicate whether to only generate the subset.

To create a subset based on the current subsetting configuration, toggle Use Subsetting to the on position.

When you change the setting on the generation confirmation panel, it also updates the setting on Subsetting view.

Determining the data generation process to use (Oracle and SQL Server only)

Tonic.ai has released an improved version of the data generation process. It is used automatically for several data connectors. It is optional for Oracle and SQL Server.

For the new process, the job type is Data Pipeline Generation instead of Data Generation.

By default, Oracle and SQL Server workspaces use the new process.

To instead use the previous process, on the Confirm Generation panel, toggle Data Pipeline V2 to the off position.

Enabling diagnostic logging for the job

Required global permission: Enable diagnostic logging

By default, Structural redacts sensitive values from the logs. To help support troubleshooting, you can configure some Structural data connectors to use diagnostic logging, which generates unredacted versions of the log files. For details, go to .

On the Confirm Generation panel, to enable diagnostic logging for the job, toggle Enable Diagnostic Logging to the on position.

Access to diagnostic logs is also controlled by the Enable diagnostic logging global permission. If you do not have this permission, then you cannot download diagnostic logs.

Indicating whether to generate performance metrics

Required workspace permission: Download job logs

To help to troubleshoot issues, for workspaces that use the newer data generation processing, you can configure the data generation job to also generate performance metrics.

The performance metrics start when a specified table is processed, and continue for a specified length of time.

To enable performance metrics for the data generation job:

Toggle Collect Performance Metrics to the on position.
From the Table Trigger dropdown list, select the table that triggers the performance metrics.
From the Trace Duration dropdown list, select the length of time to run the performance metrics.

Viewing the destination location

The Confirm Generation panel provides the destination information for the workspace. To display the destination database connection details, click Destination Settings.

Depending on the workspace configuration and data connector type, the destination information is either:

Connection information for a database server.
A storage location such as an S3 bucket.
Configuration for an Ephemeral snapshot.
Information to create container artifacts.

If the destination information is incorrect, to navigate to the workspace configuration view to make updates, click Edit Destination Settings.

If the destination data is written to a container repository, then from the Confirm Generation panel, you can configure custom tag values to use for the artifacts that the data generation job generates. For information about how to configure the tag values, go to .

Verifying the intermediate database connection information (for upsert)

If the intermediate database information is incorrect, to navigate to the workspace configuration view to make updates, click Edit Intermediate.

Viewing generation performance tips

For data generation, assigning Truncate table mode to tables that you don't need data for can improve generation performance.

For subsetting, if an upstream table is very large, and the foreign key columns are not indexed, then it can make the subsetting process run more slowly.

To display information about tips for faster generation, click Generation Tips.

Viewing suggested columns to index

On the Generation Tips panel for subsetting jobs, the Add Indexes panel displays the first few columns that you might consider indexing.

To display a panel with a suggested SQL command to add the index, click the information icon next to the column.

On the panel, to copy the command to the clipboard, click Copy SQL to Clipboard.

If there are additional columns that are not listed, then to display the full list of columns to index, click Show all columns.

On the full list, to download the list to a CSV file, click Download list of columns (.csv).

Hint to truncate tables

On the Generation Tips panel for non-subsetting jobs, the Truncate Tables panel displays the hint to truncate tables that contain data that you do not need in the destination database.

To navigate to Database View to change the current configuration, click Go to Database View.

Starting the generation job

On the Confirm Generation panel, after you confirm the generation details, to start the data generation, click Run Generation.

When upsert is enabled, to start the data generation and upsert jobs:

Click Run Generation + Upsert.
In the menu, click Run Generation + Upsert.

Starting an upsert job based on the most recent data generation

If upsert is enabled for a workspace, then on the Confirm Generation panel, the more common option is to run both data generation and upsert.

After you run at least one successful data generation to the intermediate database, then you can also choose to run only the upsert process.

For example, if the data generation succeeds but the upsert process fails, then after you address the issues that caused the upsert to fail, you can run the upsert process again.

You also must start the upsert job manually if you turn off Automatically Start Upsert After Successful Data Generation in the workspace settings.

From the Confirm Generation panel, to run upsert only:

Click the Run Generation + Upsert button.
In the menu, click Run Upsert Only.

When you run upsert only, the process uses the results of the most recent data generation.

Scheduling data generation

Required workspace permission: Run data generation

Not available for Structural free trials, or for expired pay-as-you-go accounts.

To schedule the job, you set up one or more configurations. Each configuration includes a cron expression to specify the schedule.

Creating and editing a job schedule

You manage the job schedule from Jobs view.

To create a schedule for a job, click Create Schedule.

To edit an existing job schedule, click Edit Schedule.

Set up the schedule configurations, then click Save.

Adding a configuration to a schedule

To add a configuration to the schedule:

Click Add Configuration.
By default, the configuration is active, and Active is in the on position. To have Structural ignore the configuration, toggle Active to the off position.
In the Cron Expression field, provide the schedule expression. The default value is 0 0 * * *, which means to run the job every day at midnight.
From the time zone dropdown list, select the time zone to use for the schedule.

Overview of the cron expression syntax

A cron expression is made up of five values separated by a space.

<minute> <hour> <day of month> <month> <day of week>

Where:

<minute>

<hour>

The hour at which to run the job. Is a value between 0 and 23. Use * to run every hour. Use , to separate a list of values. For example, to run at 11:00 AM and 2:00 PM, use 11,14.

Use - to provide a range of values. For example, to run every hour between 5:00 PM and 9:00 PM, use 17-21.

<day of month>

The day of the month on which to run the job. Is a value between 1 and 31.

Use * to run every day.

Use , to separate a list of values. For example, to run on the 5th and the 15th of the month, use 5,15.

Use - to provide a range of values. For example, to run on the 8th through the 12th of the month, use 8-12.

<month>

The month in which to run the job. Can use either numbers (1 through 12) or 3-letter abbreviations (JAN through DEC).

Use * to run every month.

Use , to separate a list of values. For example, to run in March and September, use 3,9 or MAR,SEP.

Use - to provide a range of values. For example, to run every month between June and August, use 6-8 or JUN-AUG.

<day of week>

The day of the week on which to run the job. Can use either numbers (0 through 6) or 3-letter abbreviations (SUN through SAT).

Use * to run every day.

Use , to separate a list of values. For example, to run every Monday, Wednesday, and Friday, use 1,3,5 or MON,WED,FRI.

Use - to provide a range of values. For example, to run every week from Tuesday through Thursday, use 2-4 or TUE-THU.

Here are some example expressions:

15 * * * *

Every hour at 15 minutes past the hour.

15 10,14,18 * * *

Every day at 10:15 AM, 2:15 PM, and 6:15 PM.

0 0 15 * *

Midnight on the 15th of every month.

0 0 10-15 6 *

Midnight on June 10 through June 15th.

30 12 * * TUE

Every Tuesday at 12:30 PM.

Removing a configuration from the schedule

To remove a configuration from the schedule, click its Delete option.

If you remove all of the configurations and then save, then the job is no longer scheduled.

Scheduled jobs and other workspace configuration

When a job runs on a schedule, you cannot configure the options that are available from the Confirm Generation panel when you run data generation manually.

Subsetting

For a workspace that has subsetting configured, whether the scheduled job generates a subset is based on the Use Subsetting toggle on Subsetting view.