Required license: Professional or Enterprise
Not compatible with writing output to a container repository or a Tonic Ephemeral snapshot.
By default, Tonic Structural data generation replaces the existing destination database with the transformed data from the current job.
Upsert adds and updates rows in the destination database, but keeps all of the other existing rows intact. For example, you might have a standard set of test records that you do not want to replace every time you generate data in Structural.
If you enable upsert, then you cannot write the destination data to a container repository or to a Tonic Ephemeral snapshot. You must write the data to a database server.
Upsert is currently only supported for the following data connectors:
MySQL
Oracle
PostgreSQL
SQL Server
For an overview of upsert, you can also view the video tutorial.
When upsert is enabled, the data generation job writes the generated data to an intermediate database. Structural then runs the upsert job to write the new and updated records to the destination database.
The destination database must already exist. Structural cannot run an upsert job to an empty destination database.
The upsert job adds and updates records based on the primary keys.
If the primary key for a record already exists in the destination database, the upsert job updates the record.
If the primary key for a record does not exist in the destination database, the upsert job inserts a new row.
To only update or insert records that Structural creates based on source records, and ignore other records that are already in the destination database, ensure that the primary keys for each set of records operate on different ranges. For example, allocate the integer range 1-1000 for existing destination database records that you add manually. Then ensure that the source database records, and by extension the records that Structural creates during data generation, use a different range.
Also note that when upsert is enabled, the Truncate table mode does not actually truncate the destination table. Instead, it works more like Preserve Destination table mode, which preserves existing records in the destination table.
To enable upsert, in the Upsert section of the workspace details, toggle Enable Upsert to the on position.
When you enable upsert for a workspace, you are prompted to configure the upsert processing and provide the connection details for the intermediate database.
When you enable upsert, Structural displays the following settings to configure the upsert process.
Disable Triggers
Indicates whether to disable any user-defined triggers before the upsert job runs. This prevents duplicate rows from being added to the destination database. By default, this is enabled.
Automatically Start Upsert After Successful Data Generation
Indicates whether to automatically start the upsert job after the data generation to the intermediate database completes successfully. If this is disabled, you must start the upsert job manually.
Persist Conflicting Data Tables
When an upsert job cannot process rows that have unique constraint conflicts, or rows that have foreign keys to those rows, this setting indicates whether to preserve the temporary tables that contain the unprocessed rows. By default, this is disabled. Structural only keeps the applicable temporary tables from the most recent upsert job.
Warn on Mismatched Constraints
Indicates whether to treat mismatched foreign key and unique constraints between the source and destination databases as warnings instead of errors, so that the upsert job does not fail. By default, this is disabled.
Required license: Enterprise
The intermediate database must have the same schema as the destination database. If the schemas do not match, then the upsert process fails.
To ensure that schema changes are automatically reflected in the intermediate database, you can connect the workspace to your own database migration script or tool. Structural then runs the migration script or tool whenever you run upsert data generation.
When you start an upsert data generation job:
If migration is enabled, Structural calls the endpoint to start the migration.
Structural cannot start the upsert data generation until the migration completes successfully. It regularly calls the status check endpoint to check whether the migration is complete.
When the migration is complete, Structural starts the upsert data generation.
Required. Structural calls this endpoint, at the URL that you provide, to start the migration process.
The request includes:
Any custom parameter values that you add.
The connection information for the intermediate database.
The request uses the following general format. The field names in this sketch are assumptions for illustration; the exact body that Structural sends may differ:
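```json
{
  "parameters": {
    // custom parameter values that you configured (hypothetical example)
    "environment": "staging"
  },
  "intermediateDatabase": {
    // connection details for the intermediate database (field names assumed)
    "host": "intermediate-db.example.com",
    "port": 5432,
    "database": "intermediate_db",
    "username": "structural",
    "password": "<password>"
  }
}
```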
The response contains the identifier of the migration task.
The response uses the following general format. The field name in this sketch is an assumption; the actual response may differ:
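```json
{
  // identifier of the migration task (field name assumed)
  "id": "6f9c2a1e-1234-4d5e-8abc-0123456789ab"
}
```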
Required. Structural calls this endpoint to check the current status of the migration process.
The request includes the task identifier that was returned when the migration process started. The request URL must be able to pass the task identifier as either a path or a query parameter.
The response provides the current status of the migration task. The possible status values are:
Unknown
Queued
Running
Canceled
Completed
Failed
The response uses the following general format. The field name in this sketch is an assumption; the actual response may differ:
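```json
{
  // current status of the migration task (field name assumed)
  // one of: Unknown, Queued, Running, Canceled, Completed, Failed
  "status": "Running"
}
```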
Optional. Structural calls this endpoint to retrieve the log entries for the migration process. It adds the migration logs to the upsert logs.
The request includes the task identifier that was returned when the migration process started. The request URL must be able to pass the task identifier as either a path or a query parameter.
The response body should be text/plain. It contains the raw logs.
Optional. Structural calls this endpoint to cancel the migration process.
The request includes the task identifier that was returned when the migration process started. The request URL must be able to pass the task identifier as either a path or a query parameter.
To enable the migration process, toggle Enable Migration Service to the on position.
When you enable the migration process, you must configure the POST Start Schema Changes and GET Status of Schema Change endpoints. You can optionally configure the GET Schema Change Logs and DELETE Cancel Schema Changes endpoints.
To configure the endpoints:
To configure the POST Start Schema Changes endpoint:
In the URL field, provide the URL of the migration script.
Optionally, in the Parameters field, provide any additional parameter values that your migration scripts need.
To configure the GET Status of Schema Change endpoint, in the URL field, provide the URL for the status check. The URL must include an {id} placeholder, which is used to pass the identifier that is returned from the Start Schema Changes endpoint.
To configure the GET Schema Change Logs endpoint, in the URL field, provide the URL to use to retrieve the logs. The URL must include an {id} placeholder, which is used to pass the identifier that is returned from the Start Schema Changes endpoint.
To configure the DELETE Cancel Schema Changes endpoint, in the URL field, provide the URL to use for the cancellation. The URL must include an {id} placeholder, which is used to pass the identifier that is returned from the Start Schema Changes endpoint.
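As an illustration, the three {id}-based URLs might look like the following. The hostname and path structure here are hypothetical and depend entirely on your migration service:
```
https://migrations.example.com/api/tasks/{id}/status
https://migrations.example.com/api/tasks/{id}/logs
https://migrations.example.com/api/tasks/{id}
```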
When you enable upsert, you must provide the connection information for the intermediate database.
For details, go to the workspace configuration information for the data connector.
Most workspaces that connect to a database have a Block data generation if schema changes detected toggle. The setting is usually in the Source Settings section.
By default, the option is turned off. When the option is off, Structural only blocks data generation when there are conflicting schema changes. Structural does not block data generation when there are non-conflicting schema changes.
If the option is turned on, then Structural blocks data generation whenever it detects any schema changes at all, until you resolve the changes. For more information, go to the schema changes documentation.
For generators where consistency is enabled, a statistics seed enables consistency across data generation runs. The Structural-wide statistics seed value ensures consistency across both data generation runs and workspaces.
You use the Override Statistics Seed setting to override the Structural-wide statistics seed value. For workspaces that connect to a database, the setting is under Destination Settings. For a file connector workspace, the setting is under Output Location.
You can either disable consistency across data generations, or provide a seed value for the workspace. The workspace seed value ensures consistency across data generation runs for that workspace, and across other workspaces that have the same seed value.
For details about using seed values to ensure consistency across data generation runs and databases, go to the consistency documentation.
Every workspace includes the following settings to identify the workspace and to select the type of data connector.
All workspaces have the following fields that identify the workspace:
In the Workspace name field, enter the name of the workspace.
In the Workspace description field, provide a brief description of the workspace. The description can contain up to 200 characters.
In the Tags field, provide a comma-separated list of tags to assign to the workspace. For more information on managing tags, go to the workspace tags documentation.
Under Connection Type, select the type of data connector to use for the workspace data. You cannot change the connection type on a child workspace.
The Basic and Professional licenses limit the number and type of data connectors you can use.
A Basic instance can only use one data connector type, which can be either PostgreSQL or MySQL. After you create your first workspace, any subsequent workspaces must use the same data connector type.
A Professional instance can use up to two different data connector types, which can be any type other than Oracle or Db2 for LUW. After you create workspaces that use two different data connector types, any subsequent workspaces must use one of those data connector types.
If you don't see the database that you want to connect to, or you want to have different database types for your source and destination database, contact support@tonic.ai.
When you select a connector type, Structural updates the view to display the connection fields used for that connector type. The specific fields vary based on the data connector.
Required license: Enterprise
Required global permission: Manage secrets managers
Your organization might use a secrets manager to secure credentials, including database connection credentials.
For data connector credentials, you can configure a set of available secrets managers. In the workspace configuration, users can then select a secret name from a secrets manager.
Structural currently supports AWS Secrets Manager.
Structural only supports secrets that store passwords. For AWS Secrets Manager, the passwords must be in one of the following formats:
String
JSON
The JSON must contain a map of key-value pairs. It can either:
Contain a single key for which the value is the password in plaintext.
Contain a key that is labeled either password or pw, for which the value is the password in plaintext.
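For example, a minimal secret in the second format might look like the following sketch, where the key name password is one of the two accepted labels and the value is a placeholder:
```json
{
  "password": "s3cr3t-value"
}
```
A single-key secret such as {"db_password": "s3cr3t-value"} would also satisfy the first format, because the map contains only one key.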
To display the list of secrets managers, on the Structural Settings view, click Secrets Manager.
To create a secrets manager:
On the Secrets Manager tab, click Add Secrets Manager.
On the Create Secrets Manager panel, in the Name field, provide a name to use to identify the secrets manager. Secrets manager names must be unique. The name is used in the secrets manager dropdown list on the workspace settings view.
From the Type dropdown list, select the secrets manager product. Structural currently supports AWS Secrets Manager.
Configure the credentials to use to connect to the secrets manager.
Click Save.
For an existing secrets manager, you can change the name and the credentials configuration.
You cannot change the type.
To edit an existing secrets manager:
In the secrets manager list, click the edit icon for the secrets manager.
On the Edit Secrets Manager panel, update the configuration.
Click Save.
When you delete a secrets manager, it is removed from the workspace database connections that use it. Structural is no longer able to connect to those databases.
To delete a secrets manager:
In the secrets manager list, click the delete icon for the secrets manager.
On the confirmation panel, click Delete.
The AWS Secrets Manager credentials that you provide must have the following permissions:
secretsmanager:ListSecrets
On each secret to use, secretsmanager:GetSecretValue
On the encryption key for secrets that are encrypted with a customer managed key (CMK), kms:Decrypt
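As a sketch, an IAM policy that grants these permissions might look like the following. The Region, account ID, secret path, and key ID are placeholders that you would replace with your own values:
```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "ListSecrets",
      "Effect": "Allow",
      "Action": "secretsmanager:ListSecrets",
      "Resource": "*"
    },
    {
      "Sid": "ReadSecretValues",
      "Effect": "Allow",
      "Action": "secretsmanager:GetSecretValue",
      "Resource": "arn:aws:secretsmanager:us-east-1:123456789012:secret:structural/*"
    },
    {
      "Sid": "DecryptCmkEncryptedSecrets",
      "Effect": "Allow",
      "Action": "kms:Decrypt",
      "Resource": "arn:aws:kms:us-east-1:123456789012:key/11111111-2222-3333-4444-555555555555"
    }
  ]
}
```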
For AWS Secrets Manager, under Authentication, select the source of the credentials:
Environment - Only available on self-hosted instances. Indicates to use either:
The credentials for the AWS Identity and Access Management (IAM) role on the host machine.
The following environment variables:
TONIC_AWS_ACCESS_KEY_ID - An AWS access key that is associated with an IAM user or role.
TONIC_AWS_SECRET_ACCESS_KEY - The secret key that is associated with the access key.
TONIC_AWS_REGION - The AWS Region to send the authentication request to.
Assumed role - Indicates to use the specified assumed role.
User credentials - Indicates to use the provided user credentials.
To provide an assumed role, click Assume Role, then:
In the Role ARN field, provide the Amazon Resource Name (ARN) for the role.
In the Session Name field, provide the role session name.
If you do not provide a session name, then Structural automatically generates a default unique value. The generated value begins with TonicStructural.
In the Duration (in seconds) field, provide the maximum length in seconds of the session. The default is 3600, indicating that the session can be active for up to 1 hour. The provided value must be less than the maximum session duration that is allowed for the role.
From the AWS Region dropdown list, select the AWS Region to send the authentication request to.
Structural generates the external ID that is used in the assume role request. Your role’s trust policy must be configured to condition on your unique external ID.
Here is an example trust policy. The principal ARN is a placeholder for the Structural identity that assumes the role, and you substitute the external ID that Structural generates for you:
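```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::123456789012:root"
      },
      "Action": "sts:AssumeRole",
      "Condition": {
        "StringEquals": {
          "sts:ExternalId": "<external-id-generated-by-structural>"
        }
      }
    }
  ]
}
```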
To provide the credentials, click User Credentials, then:
In the AWS Access Key field, enter the AWS access key that is associated with an IAM user or role.
In the AWS Secret Key field, enter the secret key that is associated with the access key.
Optional. In the AWS Session Token field, provide the session token to use.
From the AWS Region dropdown list, select the AWS Region to send the authentication request to.
The workspace settings for a new workspace (New Workspace view) or edited workspace (Workspace Settings tab) provide information about the workspace and its data.
After you select the connector type, you configure:
Where to find the source data
Where to write the data generation output
For data connectors that connect to a database, the Source Settings section provides connection information for the source database.
You cannot change the source data configuration for a child workspace.
For information about the source connection fields for a specific data connector, go to the workspace configuration topic for that data connector.
For data connectors that support upsert, the workspace configuration includes an Upsert section to allow you to enable and configure upsert. Upsert adds and updates rows in the destination database, but keeps all other existing rows intact.
If you enable upsert, then you cannot write output to an Ephemeral database or to a container repository. You must write the output to a destination database.
For more information, go to the upsert documentation.
For data connectors that connect to a database, the Destination Settings section provides information about where and how Structural writes the output data from data generation.
Depending on the data connector type, you might be able to write to either:
Destination database - Writes the output data to a destination database on a database server.
Ephemeral snapshot - Writes the output data to a Tonic Ephemeral user snapshot.
Container repository - Writes the output data to a data volume in a container repository.
When you write the output to a destination database, the destination database must be of the same type as the source database.
Structural does not create the destination database. It must exist before you generate data.
If available, the Copy Settings from Source option allows you to copy the source connection details to the destination configuration, if both databases are in the same location. Structural does not copy the connection password.
If Ephemeral supports your workspace database type, then you can write the destination data to a snapshot in Ephemeral. For data larger than 10 GB, this option is recommended instead of writing to a container repository.
From Ephemeral, you can use the snapshot to start new Ephemeral databases.
Some data connectors allow you to write the transformed data to a data volume in a container repository instead of to a database server.
You can use the resulting data volume to create a database in Tonic Ephemeral. If you do plan to use the data to start an Ephemeral database, and the size of the data is larger than 10 GB, then the recommendation is to write the data to an Ephemeral user snapshot instead.
When you provide connection details for a database server, Structural provides a Test Connection button to verify that Structural can use the connection details to connect to the database. When you click the button, Structural tries to reach the database and indicates whether the connection succeeded or failed. We strongly recommend that you test the connections.
For file connector workspaces, the File Location section indicates where the source files are obtained from - either a local file system or a cloud storage solution (Amazon S3 or Google Cloud Storage).
When the files come from cloud storage, the Output Location section indicates where to write the transformed files. You must also provide the cloud storage connection credentials.
Only available for PostgreSQL, MySQL, SQL Server, and Oracle.
Not compatible with upsert.
Not compatible with Preserve Destination or Incremental table modes.
Tonic Ephemeral is a separate Tonic.ai product that allows you to create temporary databases to use for testing and demos. For more information about Ephemeral, go to the Ephemeral documentation.
If Ephemeral supports your workspace database type, then you can write the destination data to a snapshot in Ephemeral. You can then use the snapshot to start Ephemeral databases.
To write the transformed data to Ephemeral, under Destination Settings, click Ephemeral Database.
Structural can write the data snapshot to either Ephemeral Cloud or to a self-hosted instance of Ephemeral. By default, Structural writes the data snapshot to Ephemeral Cloud.
All workspaces on the same self-hosted Structural instance or in the same Structural Cloud organization must write to the same instance of Ephemeral. When you change the Ephemeral output configuration in one workspace, it is automatically changed in other workspaces that write to Ephemeral.
For Ephemeral Cloud, Structural writes the snapshot to the account for the user who runs the data generation job. If that user has an Ephemeral account on Ephemeral Cloud, then Structural uses that account. If the user does not have an account, then Structural creates a two-week Ephemeral free trial account for the user.
Note that if you are on a self-hosted instance of Ephemeral, then you must always provide an Ephemeral API key.
To write a snapshot to Ephemeral Cloud:
Click Tonic Ephemeral cloud.
If you are on a self-hosted instance of Structural:
In the API Key field, provide an Ephemeral API key from your Ephemeral account.
To test the connection, click Test Connection.
To write the snapshot to a self-hosted instance of Ephemeral:
Click Tonic Ephemeral self-hosted.
In the API Key field, provide an Ephemeral API key from your Ephemeral account. Structural writes the snapshot to the Ephemeral account that is associated with the API key.
In the Tonic Ephemeral URL field, provide the URL to your self-hosted Ephemeral instance.