When you go to Tonic Structural for the first time, you create an account. How you create an account depends on the type of user you are.
A new Structural user can be one of the following:
A completely new user who is starting a Structural 14-day free trial. Free trial users use Structural Cloud to explore and experiment with Structural before they decide whether to purchase it.
A new user on a self-hosted Structural instance. Self-hosted instances are installed on-premises. The customer administers the Structural users.
A new user in an existing Structural Cloud organization. New users are added to existing organizations based on their email domain.
The workspace settings for a new workspace (New Workspace view) or edited workspace (Workspace Settings tab) provide information about the workspace and its data.
The Tonic Structural synthetic data platform combines sensitive data detection and data transformation to allow users to create safe, secure, and compliant datasets.
Common Structural use cases include creating staging and development environments, and trying out a new cloud provider without complex data agreements.
Structural allows you to reduce bug counts, shorten testing life cycles, and share data with partners, all while helping to ensure security and compliance with the latest regulations, from GDPR to CCPA.
You can use the Structural API to integrate with CI/CD pipelines or to create automated processes that ensure that the generated data is available on demand.
Every workspace includes the following settings to identify the workspace and to select the type of data connector.
All workspaces have the following fields that identify the workspace:
In the Workspace name field, enter the name of the workspace.
In the Workspace description field, provide a brief description of the workspace. The description can contain up to 200 characters.
In the Tags field, provide a comma-separated list of tags to assign to the workspace. For more information on managing tags, go to .
Under Connection Type, select the type of data connector to use for the workspace data. You cannot change the connection type on a .
The Basic and Professional licenses limit the number and type of data connectors you can use.
A Basic instance can only use one data connector type, which can be either PostgreSQL or MySQL. After you create your first workspace, any subsequent workspaces must use the same data connector type.
A Professional instance can use up to two different data connector types, which can be any type other than Oracle or Db2 for LUW. After you create workspaces that use two different data connector types, any subsequent workspaces must use one of those data connector types.
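The license rules above can be expressed as a small check. This is only an illustrative sketch: the function name, the tier strings, and the connector type strings are hypothetical and are not part of any Structural API, and the behavior for tiers other than Basic and Professional is an assumption.

```python
# Illustrative sketch of the connector limits described above.
# All names here are hypothetical; this is not Structural code.
BASIC_ALLOWED = {"PostgreSQL", "MySQL"}
PROFESSIONAL_EXCLUDED = {"Oracle", "Db2 for LUW"}


def can_use_connector(license_tier: str, existing_types: set, new_type: str) -> bool:
    """Return True if a new workspace may use new_type under the license tier.

    existing_types is the set of connector types already used by workspaces.
    """
    if license_tier == "Basic":
        # One connector type only, and it must be PostgreSQL or MySQL.
        if new_type not in BASIC_ALLOWED:
            return False
        return not existing_types or new_type in existing_types
    if license_tier == "Professional":
        # Up to two connector types, excluding Oracle and Db2 for LUW.
        if new_type in PROFESSIONAL_EXCLUDED:
            return False
        return new_type in existing_types or len(existing_types) < 2
    # Assumption: other tiers are not limited by these rules.
    return True
```

For example, `can_use_connector("Basic", {"PostgreSQL"}, "MySQL")` returns `False`, because a Basic instance that already uses PostgreSQL must keep using PostgreSQL.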
If the database that you want to connect to isn't in the list, or you want to have different database types for your source and destination database, contact [email protected].
When you select a connector type, Structural updates the view to display the connection fields used for that connector type. The specific fields vary based on the .
Tonic Structural data generation combines sensitive data detection and data transformation to create safe, secure, and compliant datasets.
The Structural data generation workflow involves the following steps:
You can also view this video overview of the Structural data generation workflow.
To get started, you create a workspace. When you create a workspace, you identify the type of source data, such as PostgreSQL or MySQL, and establish the connections to the source database and the destination location. The source database contains the original data that you want to synthesize. The destination location is where Structural stores the synthesized data. It might be a database, a storage location, a container repository, or an Ephemeral database snapshot.
Next, you analyze the results of the initial sensitivity scan. The sensitivity scan identifies columns that contain sensitive data. These columns need to be protected by a generator.
Based on the sensitivity scan results, you configure the data generation. The configuration includes:
Assigning table modes to tables. The table mode controls the number of rows and columns that are copied to the destination database.
Indicating column sensitivity. You can make adjustments to the initial sensitivity assignments. For example, you can mark additional columns as sensitive that the initial scan did not identify as sensitive.
Assigning and configuring column generators. To protect the data in a column, especially a sensitive column, you assign a generator to it. The generator replaces the source value with a different value in the destination database. For example, the generator might scramble the characters or assign a random value of the same type.
After you complete the configuration, you run the data generation job. The data generation job uses the configured table modes and generators to transform the data from the source database and write the transformed data to the destination location. You can track the job progress and view the job results.
A Tonic Structural workspace provides a context within which to configure and generate transformed data.
A workspace represents a path between the source data and the transformed output data. For example, postgres-prod-copy to postgres-staging.
A workspace includes:
Where to find the source data to transform during data generation
Where to write the transformed data
The rules for the transformation
You can .
For a self-hosted instance, Structural provides administrator tools that allow you to and .
You can to customize your instance.
On a self-hosted instance, based on your , you have access to the full set of supported data connectors.
Structural Cloud is our secure hosted environment. On Structural Cloud, Tonic.ai handles monitoring Structural services and updating Structural.
For , Structural Cloud only supports Okta.
Structural Cloud does not include:
. Structural Cloud uses a single configuration.
Access to the following data connectors:
Structural Cloud also supports a pay-as-you-go plan, where free trial users can move on to set up a monthly subscription. For more information, go to .
Each Structural Cloud user belongs to a Structural Cloud organization, which is determined either by the user's email domain or by a workspace invitation. Structural Cloud users do not have any access to workspaces or users from other organizations.
Each free trial user is in a separate organization, along with any users that they invite to have access to a free trial workspace.
For information about Structural Cloud organizations, go to .
The Account Admin permission set allows a Structural Cloud user to manage organization users and workspaces. For information about granting access to the Account Admin permission set, go to .
A Tonic Structural implementation can involve the following roles, from those who set up the Structural environment to the consumers of the data that Structural processes.
Note that these roles are not related to role-based access (RBAC) within Structural, which is managed using .
For self-hosted instances of Structural.
Infrastructure engineers set up the Structural application and its relevant dependencies. They are typically DevOps, Site Reliability Engineering (SRE), or Kubernetes cluster administrators.
Infrastructure engineers perform the following Structural-related tasks:
Ensure that the proper infrastructure is ready for Structural installation based on the .
. Works with Tonic.ai support as needed.
Perform routine maintenance of Structural and the Structural environment. and its dependencies as needed.
Create Structural-processed data pipelines for development and testing workflows.
For both self-hosted instances of Structural and Structural Cloud.
Database administrators integrate Structural into your data architecture to support .
They ensure that source databases are available to Structural, and that Structural can write to destination databases.
Database administrators perform the following Structural-related tasks:
Set up the required Structural access to source databases.
Set up destination databases for Structural to write transformed data to.
Structural users are the actual users of the Structural application.
Depending on the use case, Structural users might be compliance analysts, DevOps, or data engineers.
Structural users perform the following Structural-related tasks:
Use the to configure the logic used to transform the source data and to generate the transformed data.
Work with data consumers to produce usable data.
Data consumers are the end users of transformed destination data.
They are typically QA testers, developers, or analysts.
Data consumers perform the following Structural-related tasks:
Validate the usability of the destination data.
Provide guidance on application-specific requirements for data.
Security and compliance specialists ensure and validate that the data that Structural produces meets expectations, and that Structural is compliant with other security-related processes.
Security and compliance specialists perform the following Structural-related tasks:
Provide guidance on what data is sensitive.
Sign off on proposed approaches to mask sensitive data.
Approve data access and permissions.

You can associate custom tags with each workspace. Tags can help to organize and provide a quick glance into the workspace configuration.
Tags are accessible to every user that has access to the workspace.
Tags are stored in the workspace JSON, and are included in the workspace export. You can also use the API to get access to tags.
You can add and edit tags in the Tags field on the New Workspace and Workspace Settings views.
To add tags, enter a comma-separated list of the tags to add.
To remove a tag, click its delete icon.
You can also manage tags directly from Workspaces view.
To add tags to a workspace that does not currently have tags:
Hover over the Tags column for the workspace.
Click Add Tags.
In the tag input field, type a comma-separated list of tags to apply.
Press Enter.
To edit the assigned tags:
Click the Tags column for the workspace.
In the tag input field, to remove a tag, click its delete icon.
To add tags, type a comma-separated list of the tags to add.
To save the tag changes, press Enter.
Every workspace has an owner. The owner is always a user.
The user who creates the workspace is automatically the owner of the workspace.
By default, the workspace owner is assigned the built-in Manager workspace permission set. On Enterprise instances, you can choose a different workspace permission set to assign to all workspace owners.
You cannot remove that permission set from the workspace owner.
You can transfer a workspace to a different owner. The new owner is assigned the owner permission set. If the previous owner does not otherwise have access to the owner permission set, then that permission set is removed.
To transfer workspace ownership:
To transfer ownership of a single workspace, from the workspace actions menu, select Transfer Ownership.
To transfer ownership of multiple workspaces:
Check the checkbox for each workspace to transfer.
From the Actions menu, select Transfer Ownership.
On the transfer ownership panel, from the User dropdown list, select the new owner.
If you are the current owner of the workspace, then to grant yourself non-owner access after you transfer the ownership:
Toggle Receive access to workspace to the on position.
Select the workspace permission set to assign to yourself.
Click Transfer Ownership.
For each column on Database View, you can display a sample list of the column values.
For columns that have an assigned generator, the sample shows both the current values and the possible values after the generator is applied.
To display the sample values, in the Column column, click the magnifying glass icon.
If the generator is Passthrough, then the sample data panel contains only Original Data.
If a different generator is assigned, then the sample data panel contains both Original Data and Protected Output.
During sensitivity scans and schema change scans, Tonic Structural identifies groups of similar columns.
To identify similar columns, Structural uses a text embedding model to calculate the semantic similarity between any two column names in the database. When a column name's semantic similarity to the name of a given column is above a specified threshold, then the column is similar to the given column.
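The threshold rule above can be sketched as follows. This is a minimal illustration, not the Structural implementation: it assumes that each column name already has an embedding vector, uses cosine similarity as a stand-in for the similarity measure, and uses made-up vectors and a made-up threshold. The embedding model itself is not shown.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def similar_columns(target, embeddings, threshold):
    """Return the column names whose similarity to target is above threshold.

    embeddings maps column name -> embedding vector (hypothetical values).
    """
    target_vec = embeddings[target]
    return [
        name
        for name, vec in embeddings.items()
        if name != target and cosine_similarity(target_vec, vec) > threshold
    ]


# Toy two-dimensional vectors, for illustration only.
toy_embeddings = {
    "first_name": [1.0, 0.0],
    "fname": [0.9, 0.1],
    "order_total": [0.0, 1.0],
}
print(similar_columns("first_name", toy_embeddings, threshold=0.8))
```

With these toy vectors, only `fname` is above the threshold for `first_name`.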
If a column has similar columns, then the Applied Generator column contains an icon that includes the count of similar columns.
By default, the similar columns icon is hidden. To display the similar columns icon, hover over the column row.
When you assign a generator to a column, the similar columns icon for that column remains visible during your current session.
When you click the similar columns icon, Structural displays a panel with an option to filter the list to display the current column and its similar columns. To apply the filter, click Filter.
The similar columns filter is applied, and other column filters are removed. Table filters remain in place.
Workspaces view lists the workspaces that you have access to. To display Workspaces view, in the Tonic Structural heading, click Workspaces.
The workspace list contains:
Workspaces that you own
Workspaces that you are granted access to
If you have the global permission Copy any workspace or Manage user access to Tonic and to any workspace, then the list includes all of the workspaces.
The Permissions column lists the workspace permission sets that you are granted in each workspace. The permission sets include both permission sets that were granted to you directly as a user, and permission sets that were granted to an SSO group that you are a member of.
Child workspaces always display under their parent workspace. The list only includes child workspaces that you have access to. If you have access to a child workspace, but not to its parent workspace, then the parent workspace is grayed out. You cannot select it.
You can filter the workspaces based on the following information:
Name - In the filter field, begin to type text that is in the name of the workspaces to display in the list.
Owner - From the Filter by Owner dropdown list, select the owner of the workspaces to display in the list.
Database type - From the Filter by Database Type dropdown list, select the type of database for the workspaces to display in the list.
Generation status - In the Generation Status column heading, click the filter icon. Check the checkbox next to the generation status values for the workspaces to display in the list.
Tags - In the Tags column heading, click the filter icon. By default, the workspaces are not filtered by tag, and all of the checkboxes are unchecked. To only include workspaces that have specific tags, check the checkbox next to each tag to include. To uncheck all of the selected tags, click Reset Tags. When you filter by tag, Structural checks whether each workspace contains any of the selected tags.
Permissions - In the Permissions column heading, click the filter icon. You can check and uncheck checkboxes to include or exclude specific permission sets. For example, you can filter the list to only display workspaces for which the Editor permission set is granted either to you or to an SSO group that you belong to. For users that have the global permission Copy any workspace, the Permissions filter panel also contains an Any permissions checkbox. By default, Any permissions is unchecked, and the list includes workspaces for which you are not assigned any workspace permission sets. To display all of the workspaces for which you have any assigned workspace permission sets, check Any permissions. If you filter the list based on a specific permission set, to clear the filter and show all workspaces for which you have any permission set, check Any permissions. To display all workspaces, including workspaces that you do not have any permissions for, uncheck Any permissions.
You can combine different filters. For example, you can filter the list to only include workspaces that use PostgreSQL and for which the generation status is Canceled or Failed.
Child workspaces always display under their parent workspace, even if the parent workspace does not match the filter.
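The way that filters combine can be sketched in a few lines. This is a hypothetical illustration only: the dictionary keys and the function name are invented for the example, and the sketch ignores the rule that child workspaces always display under their parent. It does reflect that different filters are combined, and that the tag filter matches a workspace that contains any of the selected tags.

```python
def matches_filters(workspace, database_type=None, statuses=None, tags=None):
    """Return True if a workspace passes every filter that is set.

    workspace is a dict with hypothetical keys: database_type, status, tags.
    A filter that is None or empty is treated as not applied.
    """
    if database_type is not None and workspace["database_type"] != database_type:
        return False
    if statuses and workspace["status"] not in statuses:
        return False
    # Tag filter: the workspace needs at least one of the selected tags.
    if tags and not set(tags) & set(workspace["tags"]):
        return False
    return True


workspaces = [
    {"name": "orders", "database_type": "PostgreSQL", "status": "Failed", "tags": ["staging"]},
    {"name": "billing", "database_type": "PostgreSQL", "status": "Completed", "tags": ["qa"]},
    {"name": "users", "database_type": "MySQL", "status": "Canceled", "tags": ["staging", "qa"]},
]

# PostgreSQL workspaces whose most recent generation was Canceled or Failed.
selected = [
    w["name"]
    for w in workspaces
    if matches_filters(w, database_type="PostgreSQL", statuses={"Canceled", "Failed"})
]
print(selected)
```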
You can sort the workspace list by name, status, or owner.
By default, the list is sorted alphabetically by name.
To sort by a column, click the column heading. To reverse the order of the sort, click the column heading again.
Child workspaces always display under their parent workspace. The child workspaces are sorted within the parent.
Workspaces view provides the following information about each workspace:
Name - Contains the name and database type for the workspace. To view the workspace description, hover over the name.
Generation status - The status for the most recent generation job. To display the job details for the job, click the job status. To display more details about the date, time, and duration for the job, hover over the generation timestamp. If a job failed recently, you are given additional information about how long this job has been failing (the date of the first failure occurrence among a continuous series of failures).
Schema changes - Indicates whether Structural detected changes to the source database schema. If there are changes, the column shows the number of changes. Hover over the column value to display additional details, and to navigate to the Schema Changes view. Go to .
Tags - The tags that are assigned to the workspace.
Permissions - The permission sets that are assigned to you for the workspace.
Owner - The name and email address of the workspace owner.
On Workspaces view, when you click the workspace name, the for the workspace is displayed. The Privacy Hub tab is selected.
The Name column also provides access to a menu of workspace configuration options. When you select an option, the is displayed, open to the view for the selected option.
The last column in the workspaces list provides additional workspace options:
Subsetting icon - Displays the subsetting configuration for the workspace. Go to .
Post-job actions icon - Displays the post-job actions for the workspace. For more information, go to and .
Actions menu - Provides access to additional options.
The Actions menu at the top left of the workspaces list allows you to perform bulk actions on multiple workspaces. It is enabled when you check one or more of the checkboxes in the first column of each row. The Actions menu provides options for the selected workspaces.
When you create a new workspace, you can either:
The copy initially uses the configuration from the original workspace. After the copy is created, it is completely independent from the original workspace.
Child workspaces inherit configuration from the parent workspace. They continue to be updated automatically when the parent workspace is updated. For more information, go to .
You can also view this .
To create a completely new workspace, on Workspaces view, click Create Workspace > New Workspace.
To create a workspace based on an existing workspace, either:
On the workspace management view of the workspace to copy, from the workspace actions menu, select Duplicate Workspace.
On Workspaces view, click the actions menu for the workspace, then select Duplicate Workspace.
When you create a copy of a workspace, the copy initially inherits the following workspace configuration:
Source and destination database connections
Sensitivity designations, including manual designations that override the sensitivity scan results
Table mode assignments
Generator configuration
Subsetting configuration
Post-job scripts
You can create a workspace that is a child of an existing workspace. You cannot create a child workspace of another child workspace.
The parent workspace must have a source database configured. You cannot create a child workspace from a workspace that uses the Databricks, Spark (Amazon EMR or self-managed Spark cluster), or MongoDB data connector.
To create a child workspace, either:
On Workspaces view:
Click Create Workspace > Child Workspace.
Click the actions menu for the parent workspace, then select Create Child Workspace.
On the workspace management view, from the workspace actions menu, select Create Child Workspace.
On the New Workspace view, under Child Workspace, Parent Workspace identifies the parent workspace.
If you used the Create Workspace > Child Workspace option to create the child workspace, then Parent Workspace is not populated. From the Parent Workspace dropdown list, select the parent workspace for the new child workspace.
If you selected the child workspace option for a specific workspace, then Parent Workspace is set to that workspace.
If you originally chose to create a completely new workspace, then on the New Workspace view:
To change to a child workspace, select Create Child Workspace from the Create a child workspace panel at the right. Structural adds the Child Workspace panel to the New Workspace view.
From the Parent Workspace dropdown list, select the parent workspace for the new child workspace.
To edit the configuration for an existing workspace, either:
On the workspace management view:
On the workspace navigation bar, click Workspace Settings.
From the workspace actions menu, select Workspace Settings.
On Workspaces view, click the actions menu for the workspace, then select Workspace Settings.
You can delete workspaces that you no longer need.
You cannot delete a parent workspace. You must first delete all of its child workspaces.
To delete a workspace:
On the workspace management view, from the workspace actions menu, select Delete Workspace.
On the Workspaces view, click the actions menu for the workspace, then select Delete.
Use these tutorial videos to learn more about how to use Tonic Structural.
Provides an overview of the Structural workflow and how to use Structural to generate de-identified data. For more information, go to .
Provides an overview of what a Structural workspace is and how to create a new Structural workspace. For more information, go to .
Provides an overview of how Structural detects sensitive values and how you can apply recommended generators to the detected values.
Provides an overview of workspace owners, permissions, and permission sets. Explains how to share and transfer ownership of a workspace. For more information, go to .
Identifies the types of generators and transformations that you can use in Structural, and explains how to assign a generator to a column. For more information, go to .
Provides an overview of generator presets. Includes how to create and update them, and how to track where each generator preset is used. For more information, go to .
Provides an overview of the file connector and how to manage file groups in a file connector workspace. For more information, go to .
Provides an overview of the consistency generator property and how it works. For more information, go to .
Provides an overview of how to enable Document View for a JSON column and how to use it to configure generators for JSON fields.
Provides an overview of subsetting, how it is configured, and how Structural uses the configuration to generate a subset. For more information, go to .
Provides an overview of upsert data generation. Includes how it works and how to enable and run it for a workspace. For more information, go to .
Provides an overview of how to write destination data to a container repository instead of a database server. For more information, go to .
From the User Settings view, you can manage settings for your individual Tonic Structural account.
To display the User Settings view:
Click your user image at the top right.
In the menu, click User Settings.
The User Settings view includes options to:
(if your Structural instance does not use SSO).
You can select an image to associate with your account. The image is displayed next to your name and email address throughout Structural.
If your instance uses Google or Azure single sign-on (SSO) to manage Structural users, then by default your Structural account image is the image from the SSO.
Otherwise, the default image displays your initials.
To change your user image, click Upload, then select the image file.
Below your user image file name is the identifier of the organization that your account belongs to.
To copy the identifier, click the copy icon.
Structural allows users to provide comments on columns. You can do this from and .
From the Comment Notification Settings section of User Settings, you can configure when to receive email notifications for comments.
The available options are:
I am an owner, editor, auditor, or am being replied to - This is the default option. You receive email notifications when comments are made on columns in a workspace that you are an owner, editor, or auditor for. You also receive an email notification when someone replies to a comment that you made.
I am @ mentioned - You only receive an email notification if someone specifically mentions you in a comment.
Never - You never receive email notifications for column comments.
Before you can use the Structural API, you must create an API token. From the User API Tokens section of the User Settings view, you can create and manage API tokens.
To create an API token:
Click Create Token.
On the Create New Token dialog, enter a name for the new token.
Click Confirm.
In the list, the new token displays as clear text. To copy the new token, click the copy icon next to the token.
The new token text and copy icon only display during the current session. After that, Structural masks the token and removes the copy icon.
To revoke a token, click the Revoke option for the token.
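After you copy a token, a script can attach it to API requests. The following is a minimal sketch, not an official client. The base URL and path are placeholders, and the `Apikey` authorization scheme is an assumption that you should confirm against the Structural API documentation for your instance. The sketch only builds the request and does not send it.

```python
import urllib.request


def build_api_request(base_url: str, path: str, api_token: str) -> urllib.request.Request:
    """Build (but do not send) a request that carries the API token.

    Assumption: the token is passed in an Authorization header that uses
    an "Apikey" scheme. Verify this against your instance's API reference.
    """
    request = urllib.request.Request(base_url.rstrip("/") + path)
    request.add_header("Authorization", f"Apikey {api_token}")
    return request


# Placeholder values for illustration only.
example_request = build_api_request(
    "https://structural.example.com/", "/api/example", "my-token"
)
print(example_request.full_url)
```

Because the clear-text token is only displayed during the session in which it is created, store it somewhere safe, such as an environment variable or a secrets manager, instead of hard-coding it in scripts.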
If your Structural account is not managed using SSO, then from User Settings, you can change your Structural password.
If your Structural instance uses SSO to manage users, then your user credentials are managed in the SSO system. You cannot change your user password in Structural.
Under Password Change, to change your Structural password:
In the Old Password field, type your current Structural password.
In the New Password field, type your new Structural password.
In the Repeat New Password field, type your new Structural password again.
Click Confirm.
From User Settings, you can delete your Structural account. If your instance uses SSO to manage users, then deleting your account only affects your access to Structural.
You cannot delete your Structural account if you are the owner of a workspace for which other users are granted access. Before you can delete your Structural account, you must either:
To delete your Structural account, click Delete Account.
When you delete your account, you are logged out of Structural.
The minimum screen width is 1120 pixels.
If the locally running database that you want to connect to runs in a Docker container:
Run: docker inspect
In the networks section of the results, find the Gateway IP address.
Use this IP address as the server address in Structural.
If the locally running database does NOT run in a container, but runs on the machine, then:
On Windows or Mac, use host.docker.internal.
On Linux, use 172.17.0.1, which is the IP address of the docker0 interface.
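For a database that runs directly on the host machine rather than in a container, the choice of server address can be sketched as a small helper. The function name is hypothetical. It maps the operating system names that Python's `platform.system()` reports to the addresses listed above.

```python
import platform


def local_database_host(system=None):
    """Return the server address to use for a database on the host machine.

    system defaults to the current operating system as reported by
    platform.system() ("Windows", "Darwin" for Mac, or "Linux").
    """
    system = system or platform.system()
    if system in ("Windows", "Darwin"):
        return "host.docker.internal"
    if system == "Linux":
        # IP address of the docker0 interface.
        return "172.17.0.1"
    raise ValueError(f"No documented address for operating system: {system}")


print(local_database_host("Linux"))
```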
If you use Structural Cloud, and your database only allows connections from allowlisted IP addresses, then you need to allowlist Structural static IP addresses.
This is not required for self-hosted instances of Structural.
For the United States-based instance (), the static IP address is:
54.92.217.68
For the Europe-based instance (), the static IP address is:
3.69.249.144
The URL https://telemetry.tonic.ai/ is used for our Amplitude telemetry.
https://telemetry.tonic.ai/logs is used specifically for log sharing.
Allowlist https://telemetry.tonic.ai/ or the following IP address:
44.193.110.147
Telemetry sharing is required. These metrics are valuable for us as we debug, make product roadmaps, and determine feature viability.
No customer data is included. For more information about the specific telemetry data that we collect, go to .
For more information on how to verify that telemetry is shared, go to .
To support the one-click update option, Structural needs to be able to retrieve information about the latest Structural version.
For more information, go to .
Click your user image at the top right. The menu includes the Tonic version.
We recommend that you use a static copy of your production database that was restored from a backup.
If that's not possible, consider the following when you connect Structural to your source data:
Structural cannot guarantee referential integrity of the output data if the source database is written to while data is generated. For this reason we recommend that you connect to a static copy of production data.
Read replicas and fast followers can be problematic for Structural because of how long it takes some queries to run. Read replicas tend to have short query timeout limits, which causes the queries to time out. Read replicas also reflect recent writes, which means that we cannot guarantee the referential integrity of the output.
For details about the types of data that Tonic.ai does and does not collect, go to .
On Workspace Settings view for a workspace, the schema management settings are generally at the end of the Source Settings section.
Schema changes include:
Schema changes that could expose data, which if not addressed can result in data leakage. These changes include new tables and columns, and changes to data types.
Notifications, which Structural can handle automatically during each data generation. These include removed tables and columns.
On the Workspace Settings view, under Block Data Generation on Schema Changes, select how Structural responds when there are unaddressed changes to the database schema.
The options are:
Do Not Block - With this option, schema changes never block data generation. Structural ignores sensitive schema changes, and automatically handles notifications during data generation.
Block On Changes That Could Expose Data - Indicates to only block data generation if there are schema changes that might expose data, such as new columns. Structural automatically handles notifications during data generation. For this option, Structural does not block data generation for schema changes on truncated tables.
Block On All Changes - For this option, if there are any unaddressed schema changes at all, either sensitive changes or notifications, then data generation fails.
For more information, go to .
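The three blocking options above can be summarized as a decision function. This is an illustrative sketch with hypothetical names, not Structural code. It takes counts of unaddressed changes as input, and it does not model the exception for schema changes on truncated tables.

```python
def generation_is_blocked(option: str, exposing_changes: int, notifications: int) -> bool:
    """Return True if unaddressed schema changes block data generation.

    exposing_changes: unaddressed changes that could expose data
        (for example, new tables or columns, or data type changes).
    notifications: unaddressed changes that Structural can handle
        automatically (for example, removed tables or columns).
    """
    if option == "Do Not Block":
        return False
    if option == "Block On Changes That Could Expose Data":
        return exposing_changes > 0
    if option == "Block On All Changes":
        return exposing_changes > 0 or notifications > 0
    raise ValueError(f"Unknown option: {option}")


print(generation_is_blocked("Block On All Changes", 0, 2))
```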
By default, every time you load a workspace, Structural queries the source database to retrieve the schema.
You can instead configure the workspace to cache the schema. Structural then updates the cache at a regular interval, and whenever a change to the workspace triggers a schema cache update.
You can also trigger a cache update manually.
By default, the schema cache is only used by calls from within Structural. To enable an external API request to use the cached schema, add the query parameter useSchemaCache=true to the request.
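As a minimal sketch of how a script might add the parameter, the following helper appends useSchemaCache=true to a request URL. The host and endpoint path in the example are placeholders, not real Structural API routes, and `with_schema_cache` is a hypothetical helper name.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_schema_cache(url: str) -> str:
    """Return the URL with useSchemaCache=true set in its query string."""
    parts = urlsplit(url)
    # Keep the existing parameters, but drop any earlier useSchemaCache value.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "useSchemaCache"
    ]
    query.append(("useSchemaCache", "true"))
    return urlunsplit(parts._replace(query=urlencode(query)))


# Placeholder URL, used only to show the resulting query string.
print(with_schema_cache("https://structural.example.com/api/example?workspaceId=abc"))
```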
In the application, each update to the schema cache is represented by a schema retrieval job. Schema retrieval jobs are short-lived, and run on the Structural web server. You can view the schema retrieval jobs from the .
Note that the schema cache does not include the schema for JSON columns that use Document View. Those schemas are detected by a different scan.
To enable and configure the caching:
On the Workspace Settings view, toggle Cache source schema for faster loading to the on position.
Under Schema Freshness, configure the maximum length of time between schema retrievals.
In the field, provide the value.
From the dropdown list, select the unit of time. You can configure the length of time in minutes, hours, or days.
If the cached schema is older than that length of time, then the next time the application loads, it queries the source database for the current schema. The default value is 6 hours. Note that for some data connectors, schema retrievals run automatically in the background. This setting does not affect the frequency of those schema retrievals. For example, suppose that a schema retrieval runs automatically in the background every 2 hours. If you set the schema freshness to 6 hours, the background retrieval still runs every 2 hours. However, if you set the schema freshness to 1 hour, then a schema retrieval occurs no more than 1 hour after the previous schema retrieval.
You can optionally enable diagnostic logging for the schema retrieval. Diagnostic logging adds additional diagnostic errors to help with troubleshooting. Note that this additional information might contain sensitive information such as schema identifiers. To enable diagnostic logging:
Click Show advanced options.
Toggle Enable diagnostic logging to the on position.
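The interaction between the schema freshness value and a background retrieval interval can be expressed as a small calculation. This sketch only restates the 2-hour and 6-hour example from the Schema Freshness step. The function name is hypothetical, and it assumes that the application is loaded often enough for the freshness check to take effect.

```python
from datetime import timedelta
from typing import Optional


def max_time_between_retrievals(
    freshness: timedelta, background_interval: Optional[timedelta]
) -> timedelta:
    """Longest expected gap between schema retrievals.

    background_interval is None for connectors without background retrievals.
    """
    if background_interval is None:
        return freshness
    # The shorter of the two intervals determines the longest gap.
    return min(freshness, background_interval)
```

With a 2-hour background interval, a freshness of 6 hours gives a gap of 2 hours, and a freshness of 1 hour gives a gap of 1 hour.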
You can export a workspace configuration to a JSON file, and import configuration from a workspace configuration JSON file.
For example, you might want to preserve a version of the workspace configuration before you test other changes. You can then use the exported file to restore the original configuration.
Or you might want to use a script to make changes to an exported configuration file. You can then import the updated file to update the workspace configuration.
The workspace JSON configuration file includes the following information:
Sensitivity designations that you assigned to columns
Assigned table modes
Assigned column generators
Subsetting configuration
Post-job script configuration
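If you script changes to an exported file, it is a good idea to keep a copy of the original. The following sketch shows one way to do that. It assumes that the exported file contains a single JSON object. The helper names are hypothetical, and the sketch does not assume any particular key names inside the file.

```python
import json
import shutil
from pathlib import Path


def load_with_backup(config_path: str) -> dict:
    """Copy the exported file to <name>.bak, then return its parsed JSON."""
    path = Path(config_path)
    shutil.copyfile(path, path.with_name(path.name + ".bak"))
    with path.open(encoding="utf-8") as f:
        return json.load(f)


def save_config(config: dict, config_path: str) -> None:
    """Write the modified configuration back to disk, ready to import."""
    with open(config_path, "w", encoding="utf-8") as f:
        json.dump(config, f, indent=2)
```

A script would call `load_with_backup`, change the returned dictionary, and then call `save_config` before you import the file.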
To export the workspace configuration, either:
On the workspace management view, from the download menu, select Export Workspace.
On Workspaces view, click the actions menu for the workspace, then select Export.
When you export a child workspace, the exported workspace does not retain any of the inheritance information. The exported information is the same for all exported workspaces.
To import a workspace configuration file:
Select the import option. Either:
On the workspace management view, from the download menu, select Import Workspace.
On Workspaces view, click the actions menu for the workspace, then select Import.
On the Import Workspace dialog, to select the file to import, click Browse.
After you select the file, click Import.
When you import a workspace configuration into a child workspace, Tonic Structural only updates the configuration that can be overridden. If a configuration must be inherited from the parent workspace, then it is not affected by the imported configuration. For more information, go to .
When you create a workspace, you become the owner of the workspace, and by default are assigned the built-in Manager workspace permission set for the workspace. The Manager permission set provides full access to the workspace configuration, data, and results.
With a Professional or Enterprise license, you can also assign workspace permission sets to other users and to SSO groups. You can also transfer a workspace to a different owner.
If you are granted access to any workspace permission set for a workspace, then you have access to all of the workspace management views for that workspace. However, you can only perform tasks that you have permission for in that workspace.
Workspace access is managed from the Workspaces view. You cannot assign workspace permission sets from Structural Settings view.
You can also view an .
You can configure a workspace to cache the source schema, to reduce the number of times that Tonic Structural needs to query the source database.
When schema caching is enabled for a workspace, then in the , below the workspace name, Structural shows the current status of the schema cache.
The status indicates when:
Structural is retrieving the schema information.
Structural is checking for schema updates.
Structural is refreshing the schema cache.
Structural fails to connect to the source database.
The schema cache refresh fails.
The schema cache is updated. The status includes the timestamp of the most recent refresh.
When the schema cache is updated, or when Structural has detected updates to the schema, then the schema cache status includes an option to refresh the schema cache.
To refresh the schema:
Click the schema cache status.
On the panel, click Refresh Schema.
Structural starts a new schema retrieval job. To track the progress of the job, go to the .
Tonic Structural uses its sensitivity scan to identify source data columns that contain sensitive information. The scan ignores truncated tables.
The sensitivity scan identifies Structural's built-in sensitivity types. It also looks for custom types that you define.
You can also manually mark a column as sensitive or not sensitive.
For example, the sensitivity scan might incorrectly identify a column as sensitive. Or a column might contain data that you consider sensitive but that does not match a detected sensitivity type.
When you manually change a column from not sensitive to sensitive, Structural marks the sensitivity detection as full confidence.
For information on how to change whether a column is sensitive:
For Privacy Hub, go to .
For Database View, go to:
For a single column,
For multiple selected columns,
For Table View, go to .
The Structural API also provides .
Uses a single value to mask all of the values in the column.
For example, you can replace every value in a string column with the value String1. Or you can replace every value in a numeric column with the value 12345.
To configure the generator, in the Constant Value field, provide the value to use.
The value must be compatible with the field type. For example, you cannot provide a string value for an integer column.
If Structural data encryption is enabled, then to use it for this column, in the advanced options section, toggle Use data encryption process to the on position.
This generator replaces all instances of the find string with the replace string.
For example, you can indicate to replace all instances of abc with 123.
To configure the generator:
In the Find field, type the string to look for in the source column value.
To use a regular expression to identify the source value, check the Use Regex checkbox.
If you use a regular expression, use backslash ( \ ) as the escape character.
In the Replace field, type the string to replace the matching string with.
If Structural data encryption is enabled, then to use it for this column, in the advanced options section, toggle Use data encryption process to the on position.
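The difference between a literal find string and a regular expression can be illustrated with a short sketch. This example uses Python's `re` module, which is an assumption made for illustration. The regular expression dialect that Structural uses might differ, and `find_and_replace` is a hypothetical function name.

```python
import re


def find_and_replace(value: str, find: str, replace: str, use_regex: bool = False) -> str:
    """Replace every occurrence of find in value with replace."""
    if use_regex:
        # In regex mode, characters such as "." are special unless escaped with a backslash.
        return re.sub(find, replace, value)
    return value.replace(find, replace)


print(find_and_replace("abc-abc", "abc", "123"))                   # literal match
print(find_and_replace("a.c abc", r"a\.c", "123", use_regex=True))  # escaped "."
```

In the second call, the backslash makes the period match only a literal period, so `abc` is left unchanged.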
Generates unique numeric strings of the same length as the input value.
For example, for the input value 123456, the output value would be something like 832957.
You can apply this generator only to columns that contain numeric strings.
To configure the generator, toggle the Consistency setting to indicate whether to make the generator self-consistent.
By default, the generator is not consistent.
If Structural data encryption is enabled, then to use it for this column, in the advanced options section, toggle Use data encryption process to the on position.
From Database View, you can add comments to columns. For example, you might use a comment to explain why you selected a particular generator or marked a column as sensitive or not sensitive.
If a column does not have any comments, then to add a comment:
In the Applied Generator column, click the comment icon.
In the comment field, type the comment text.
Click Comment.
When a column has existing comments, the comment icon is green. To add comments:
Click the comment icon. The comments panel shows the previous comments. Each comment includes the comment user.
In the comment field, type the comment text.
Click Reply.
Generates a random MAC address formatted string.
To configure the generator:
In the Bytes Preserved field, enter the number of bytes to preserve in the generated address.
Toggle the Consistency setting to indicate whether to make the column self-consistent. By default, consistency is disabled.
If Structural data encryption is enabled, then to use it for this column, in the advanced options section, toggle Use data encryption process to the on position.
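The following sketch illustrates the idea behind the Bytes Preserved setting. It assumes that the preserved bytes are the leading bytes of a colon-separated address, which this topic does not state. `mask_mac` is a hypothetical function, not Structural's implementation.

```python
import secrets


def mask_mac(source: str, bytes_preserved: int = 0) -> str:
    """Keep the first bytes_preserved octets and randomize the rest."""
    octets = source.split(":")
    if len(octets) != 6 or not 0 <= bytes_preserved <= 6:
        raise ValueError("expected a 6-octet address and 0 to 6 preserved bytes")
    kept = octets[:bytes_preserved]
    generated = [f"{secrets.randbelow(256):02x}" for _ in range(6 - bytes_preserved)]
    return ":".join(kept + generated)


# Keeps "00:1a:2b" and generates the last three octets.
print(mask_mac("00:1a:2b:3c:4d:5e", bytes_preserved=3))
```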
For document-based data connectors - currently MongoDB and Amazon DynamoDB - Database View and Table View are replaced by Collection View. "Collection" is the term that Structural uses to refer to MongoDB collections and DynamoDB tables.
For JSON columns in file connector and PostgreSQL workspaces, you can use Document View to view and assign generators to JSON fields.
Consistency: No, cannot be made consistent.
Linking: No, cannot be linked.
Differential privacy: Yes
Data-free: Yes
Allowed for primary keys: No
Allowed for unique columns: No
Uses format-preserving encryption (FPE): No
Privacy ranking: 1
Generator ID (for the API)
Consistency: No, cannot be made consistent.
Linking: No, cannot be linked.
Differential privacy: No
Data-free: No
Allowed for primary keys: No
Allowed for unique columns: No
Uses format-preserving encryption (FPE): No
Privacy ranking: 5
Generator ID (for the API)
Consistency: Yes, can be made self-consistent.
Linking: No, cannot be linked.
Differential privacy: No
Data-free: No
Allowed for primary keys: Yes
Allowed for unique columns: Yes
Uses format-preserving encryption (FPE): Yes
Privacy ranking: 3 if not consistent, 4 if consistent
Generator ID (for the API)
Consistency: Yes, can be made self-consistent.
Linking: No, cannot be linked.
Differential privacy: Yes, if consistency is not enabled.
Data-free: Yes, if consistency is not enabled.
Allowed for primary keys: No
Allowed for unique columns: No
Uses format-preserving encryption (FPE): Yes
Privacy ranking: 1 if not consistent, 4 if consistent
Generator ID (for the API)
















Scan collections - Collection scans identify the fields and data types in a collection.





After you select the connector type, you configure:
Where to find the source data
Where to write the data generation output
For data connectors that connect to a database, the Source Settings section provides connection information for the source database.
You cannot change the source data configuration for a child workspace.
For information about the source connection fields for a specific data connector, go to the workspace configuration topic for that connector type.
For data connectors that support upsert, the workspace configuration includes an Upsert section to allow you to enable and configure upsert. Upsert adds and updates rows in the destination database, but keeps all other existing rows intact.
If you enable upsert, then you cannot write output to an Ephemeral database or to a container repository. You must write the output to a destination database.
For more information, go to Enabling and configuring upsert.
For data connectors that connect to a database, the Destination Settings section provides information about where and how Structural writes the output data from data generation.
Depending on the data connector type, you might be able to write to either:
Destination database - Writes the output data to a destination database on a database server.
Ephemeral snapshot - Writes the output data to a Tonic Ephemeral user snapshot.
Container repository - Writes the output data to a data volume in a container repository.
When you write the output to a destination database, the destination database must be of the same type as the source database.
Structural does not create the destination database. It must exist before you generate data.
In Destination Settings, you provide the connection information for the destination database. For information about the destination database connection fields for a specific data connector, go to the workspace configuration topic for that connector type.
If available, the Copy Settings from Source option allows you to copy the source connection details to the destination database, if both databases are in the same location. Structural does not copy the connection password.
Tonic Ephemeral is a separate Tonic.ai product that allows you to create temporary databases to use for testing and demos. For more information about Ephemeral, go to the .
If Ephemeral supports your workspace database type, then you can write the destination data to a snapshot in Ephemeral. For data larger than 10 GB, this option is recommended instead of writing to a container repository.
From Ephemeral, you can use the snapshot to start new Ephemeral databases.
For more information, go to Writing output to Tonic Ephemeral.
Some data connectors allow you to write the transformed data to a data volume in a container repository instead of to a database server.
You can use the resulting data volume to create a database in Tonic Ephemeral. If you do plan to use the data to start an Ephemeral database, and the size of the data is larger than 10 GB, then the recommendation is to write the data to an Ephemeral user snapshot instead.
For more information, go to Writing output to a container repository.
When you provide connection details for a database server, Structural provides a Test Connection button to test the connection, and verify that Structural can use the connection details to connect to the database. Structural uses the connection details to try to reach the database, and indicates whether it succeeded or failed. We strongly recommend that you test the connections.
The environment setting TONIC_TEST_CONNECTION_TIMEOUT_IN_SECONDS determines the number of seconds before a connection test times out. You can configure this setting from the Environment Settings tab on Structural Settings. By default, the connection test times out after 15 seconds.
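To illustrate how a timeout setting of this kind behaves, the following sketch reads the same setting name from the process environment and applies it to a plain TCP connection attempt. This is an assumption-laden illustration only. Structural's own connection test does more than open a socket, and the helper names here are hypothetical.

```python
import os
import socket

DEFAULT_TIMEOUT_SECONDS = 15.0  # the documented default


def connection_test_timeout() -> float:
    """Read the timeout from the environment, falling back to the default."""
    raw = os.environ.get("TONIC_TEST_CONNECTION_TIMEOUT_IN_SECONDS")
    return float(raw) if raw else DEFAULT_TIMEOUT_SECONDS


def can_reach(host: str, port: int) -> bool:
    """Return True if a TCP connection opens before the timeout expires."""
    try:
        with socket.create_connection((host, port), timeout=connection_test_timeout()):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False
```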
A file connector workspace uses files as its source data and produces transformed versions of those files as its output.
For file connector workspaces, the File Location section indicates where the source files are obtained from - either a local file system or a cloud storage solution (Amazon S3 or Google Cloud Storage).
When the files come from cloud storage, the Output Location section indicates where to write the transformed files. You must also provide the cloud storage connection credentials.
For more information, go to Configuring the file connector storage type and output options.
Tonic Structural uses workspace permission sets for role-based access control (RBAC) of each workspace.
A workspace permission set is a set of workspace permissions. Each permission provides access to a specific workspace feature or function.
Structural provides built-in workspace permission sets. Enterprise instances can also configure custom permission sets.
To share workspace access, you assign workspace permission sets to users and, if you use SSO to manage Structural users, to SSO groups. Before you assign a workspace permission set to an SSO group, make sure that you are aware of who is in the group. The permissions that are granted to an SSO group automatically are granted to all of the users in the group. For information on how to configure Structural to filter the allowed SSO groups, go to Synchronizing SSO groups with Structural.
You cannot remove the owner workspace permission set from the workspace owner. By default, the owner permission set is the built-in Manager permission set.
To change the current access to the workspace:
To manage access to a single workspace, either:
On the workspace management view, in the heading, click the share icon.
On Workspaces view, click the actions menu for the workspace, then select Share.
To manage access for multiple workspaces:
Check the checkbox for each workspace to grant access to.
From the Actions menu, select Share Workspaces.
The workspace access panel contains the current list of users and groups that have access to the workspace. To add a user or group to the list, begin to type the user email address or group name. From the list of matching users or groups, select the user or group to add.
Free trial users can invite other users to start their own free trial. Provide the email addresses of the users to invite. The email addresses must have the same corporate email domain as your email address. When the invited users sign up for the free trial, they are added to the Structural organization for the free trial user that invited them and have access to the workspace.
For a user or group, to change the assigned workspace permission sets:
Click Access. The dropdown list is populated with the list of custom and built-in workspace permission sets. If you selected multiple workspaces, then on the initial display of the workspace sharing panel, for each permission set that a user or group currently has access to, the list shows the number of workspaces for which the user or group has that permission set. For example, you select three workspaces. A user currently has Editor access for one workspace and Viewer access for the other two. The Editor permission set has 1 next to it, and the Viewer permission set has 2 next to it.
Under Custom Permission Sets, check the checkbox next to each workspace permission set to assign to the user or group. Uncheck the checkbox next to each workspace permission set to remove from the user or group.
Under Built-In Permission Sets, check the workspace permission set to assign to the user or group. You can only assign one built-in permission set. By default, for an added user or group, the Editor permission set is selected. To select a built-in workspace permission set that is lower in access than the currently selected permission set, you must first uncheck the selected permission set. For example, if Editor is currently checked, then to change the selection to Viewer, you must first uncheck Editor.
To remove all access for a user or group, and remove the user or group from the list, click Access, then click Revoke.
To save the new access, click Save.
Database View provides a complete view of your source database structure and configuration.
To display Database View, either:
On the workspace management view, in the workspace navigation bar, click Database View.
On Workspaces view, from the dropdown menu in the Name column, select Database View.
Database View consists of:
On the left, the list of tables in the source database.
On the right, the list of columns in those tables.
The Categorical generator shuffles the existing values within a field while maintaining the overall frequency of the values. It disassociates the values from other pieces of data. Note that NULL is considered a separate value.
For example, a column contains the values Small, Medium, and Large. Small appears 3 times, Medium appears 4 times, and Large appears 5 times. In the output data, each value still appears the same number of times, but the values are shuffled to different rows.
This generator is optimized for categories with fewer than 10,000 unique values. If your underlying data has more unique values (for example, your field is populated by freeform text entry), we recommend that you use the Character Scramble or Custom Categorical generator.
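The core shuffling behavior can be sketched in a few lines. This is a simplified illustration, not Structural's implementation. It does not cover linking or differential privacy, and `shuffle_column` is a hypothetical name.

```python
import random
from collections import Counter


def shuffle_column(values, seed=None):
    """Return the same values in a different row order. None (NULL) is treated as a value."""
    shuffled = list(values)
    random.Random(seed).shuffle(shuffled)
    return shuffled


source = ["Small"] * 3 + ["Medium"] * 4 + ["Large"] * 5
output = shuffle_column(source)
# Each value still appears the same number of times.
assert Counter(output) == Counter(source)
```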
Consistency: No, cannot be made consistent.
Linking: Yes, can be linked.
Differential privacy: Configurable
Data-free: No
Allowed for primary keys: No
Allowed for unique columns: No
Uses format-preserving encryption (FPE): No
Privacy ranking: 2 if differential privacy enabled, 3 if differential privacy not enabled
Generator ID (for the API)
To configure the generator:
From the Link To dropdown, select the columns to link to the current column. You can select from other columns that use the Categorical generator.
Toggle the Differential Privacy setting to indicate whether to make the output data differentially private. By default, differential privacy is disabled.
If Structural data encryption is enabled, then to use it for this column, in the advanced options section, toggle Use data encryption process to the on position.
Generates a random company name-like string.
Consistency: Yes, can be made self-consistent or consistent with another column.
Linking: No, cannot be linked.
Differential privacy: Yes, if consistency is not enabled.
Data-free: Yes, if consistency is not enabled.
Allowed for primary keys: No
Allowed for unique columns: No
Uses format-preserving encryption (FPE): No
Privacy ranking: 1 if not consistent, 4 if consistent
Generator ID (for the API)
To configure the generator, toggle the Consistency setting to indicate whether to make the generator consistent.
By default, the generator is not consistent.
If consistency is enabled, then by default it is self-consistent. To make the generator consistent with another column, from the Consistent to dropdown list, select the column.
When the generator is consistent with itself, then a given source value is always mapped to the same destination value. For example, My Company is always mapped to New Company.
When the generator is consistent with another column, then a given source value in that other column always results in the same destination value for the company name column. For example, if the company name column is consistent with a name column, then every instance of John Smith in the name column in the source database has the same company name in the destination database.
If Structural data encryption is enabled, then to use it for this column, in the advanced options section, toggle Use data encryption process to the on position.
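One common way to get consistent output is to derive the replacement from a keyed hash of the source value. The following sketch illustrates that idea. It is an assumption made for illustration, not a description of how Structural implements consistency. The name list, the key, and the function name are all hypothetical.

```python
import hashlib
import hmac

# Hypothetical pool of replacement names.
NAMES = ["Acme Holdings", "Blue Harbor LLC", "Cedar Works", "Delta Freight"]


def consistent_company_name(source_value: str, key: bytes) -> str:
    """Map a source value to a name so that equal inputs give equal outputs."""
    digest = hmac.new(key, source_value.encode("utf-8"), hashlib.sha256).digest()
    return NAMES[int.from_bytes(digest[:8], "big") % len(NAMES)]


key = b"example-key"
# Self-consistent: the input is the company name column value.
assert consistent_company_name("My Company", key) == consistent_company_name("My Company", key)
# Consistent with another column: the input is that column's value, such as a person's name.
assert consistent_company_name("John Smith", key) == consistent_company_name("John Smith", key)
```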
Generates a random company name-like string.
Consistency: Yes, can be made self-consistent or consistent with another column.
Linking: No, cannot be linked.
Differential privacy: Yes, if consistency is not enabled.
Data-free: Yes, if consistency is not enabled.
Allowed for primary keys: No
Allowed for unique columns: No
Uses format-preserving encryption (FPE): No
Privacy ranking: 1 if not consistent, 4 if consistent
Generator ID (for the API)
To configure the generator, toggle the Consistency setting to indicate whether to make the generator consistent.
By default, the generator is not consistent.
If consistency is enabled, then by default it is self-consistent. To make the generator consistent with another column, from the Consistent to dropdown list, select the column.
When the generator is consistent with itself, then a given source value is always mapped to the same destination value. For example, My Business is always mapped to New Business.
When the generator is consistent with another column, then a given source value in that other column always results in the same destination value for the company name column. For example, if the company name column is consistent with a name column, then every instance of John Smith in the name column in the source database has the same company name in the destination database.
If Structural data encryption is enabled, then to use it for this column, in the advanced options section, toggle Use data encryption process to the on position.
Generates timestamps that fit an event distribution. The source timestamp must include a date. It cannot be a time-only value.
Link columns to create a sequence of events across multiple columns. This generator can be partitioned by other columns.
Consistency: No, cannot be made consistent.
Linking: Yes, can be linked.
Differential privacy: No
Data-free: No
Allowed for primary keys: No
Allowed for unique columns: No
Uses format-preserving encryption (FPE): No
Privacy ranking: 3
Generator ID (for the API)
To configure the generator:
From the Link To dropdown list, select the other Event Timestamps generator columns to link this column to. Linking creates a sequence across multiple columns.
From the Partition drop-down list, select one or more columns to use to partition the data. The selected columns must have their generator set to either Passthrough or Categorical. For more information about partitioning and how it works, go to Partitioning a column.
The Options list displays the current column and linked columns. Use the Up and Down buttons to configure the column sequence.
If Structural data encryption is enabled, then to use it for this column, in the advanced options section, toggle Use data encryption process to the on position.
The FNR generator transforms Norwegian national identity numbers. In Norwegian, the term for national identity number abbreviates to FNR.
The first six digits of an FNR reflect the person's birthdate. You can choose to preserve the birthdates from the source values in the destination values. If you do not preserve the source values, the destination values are still within the same date range as the source values.
Another digit in an FNR indicates whether the person is male or female. You can specify whether to preserve in the generated value the gender indicated in the source value.
The last digits in an FNR are a checksum value. The last digits in the destination value are not a checksum - the values are random.
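The parts of an FNR described above can be separated with a short helper. This sketch relies on the commonly documented 11-digit layout (six birthdate digits in DDMMYY order, three individual digits of which the last is odd for male and even for female, and two check digits). That layout comes from general knowledge of the format and is not stated in this topic. The sample number is made up and does not have a valid checksum.

```python
from typing import NamedTuple


class FnrParts(NamedTuple):
    birthdate_digits: str   # DDMMYY
    individual_digits: str  # three digits; the last one indicates gender
    check_digits: str       # two checksum digits


def split_fnr(fnr: str) -> FnrParts:
    """Split an 11-digit FNR string into its parts."""
    if len(fnr) != 11 or not (fnr.isascii() and fnr.isdigit()):
        raise ValueError("expected 11 digits")
    return FnrParts(fnr[:6], fnr[6:9], fnr[9:])


def indicated_gender(fnr: str) -> str:
    """Return the gender that the individual digits indicate."""
    last_individual_digit = int(split_fnr(fnr).individual_digits[-1])
    return "male" if last_individual_digit % 2 else "female"


print(split_fnr("01019912345"))
```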
Consistency: Yes, can be made self-consistent or consistent with another column.
Linking: No, cannot be linked.
Differential privacy: No
Data-free: No
Allowed for primary keys: No
Allowed for unique columns: Yes
Uses format-preserving encryption (FPE): No
Privacy ranking: 3 if not consistent, 4 if consistent
Generator ID (for the API)
To configure the generator:
To preserve the gender from the source value in the destination value, toggle Preserve Gender to the on position.
To preserve the birthdate from the source value in the destination value, toggle Preserve Birthdate to the on position.
Toggle the Consistency setting to indicate whether to make the generator consistent. By default, consistency is disabled.
If you enable consistency, then by default the generator is self-consistent. To make the generator consistent with another column, from the Consistent to dropdown list, select the column. When a generator is self-consistent, then a given value in the source database is always mapped to the same value in the destination database. When a generator is consistent with another column, then a given value for that other column in the source database results in the same value in the destination database. For example, if the FNR column is consistent with a Name column, then every instance of John Smith in the source database results in the same FNR in the destination database.
If Structural data encryption is enabled, then to use it for this column, in the advanced options section, toggle Use data encryption process to the on position.
Generates unique alphanumeric strings based on any printable ASCII characters. The length of the source string is not preserved. You can choose to exclude lowercase letters from the generated values.
Consistency: Yes, can be made self-consistent.
Linking: No, cannot be linked.
Differential privacy: No
Data-free: No
Allowed for primary keys: Yes
Allowed for unique columns: Yes
Uses format-preserving encryption (FPE): Yes
Privacy ranking: 3 if not consistent, 4 if consistent
Generator ID (for the API)
To configure the generator:
To exclude lowercase letters from the generated values, toggle Exclude Lowercase Alphabet to the on position.
Toggle the Consistency setting to indicate whether to make the generator consistent. By default, the generator is not consistent.
If Structural data encryption is enabled, then to use it for this column, in the advanced options section, toggle Use data encryption process to the on position.
Generates unique object identifiers.
Can be assigned to text columns that contain MongoDB ObjectId values. The column value must be 12 bytes long.
Consistency: Yes, can be made self-consistent.
Linking: No, cannot be linked.
Differential privacy: No
Data-free: No
Allowed for primary keys: No
Allowed for unique columns: No
Uses format-preserving encryption (FPE): No
Privacy ranking: 3 if not consistent, 4 if consistent
Generator ID (for the API)
To configure the generator:
A MongoID object identifier consists of an epoch timestamp, a random value, and an incremented counter. To only change the random value portion of the identifier, but keep the timestamp and counter portions, toggle Preserve Timestamp and Incremental Counter to the on position.
Toggle the Consistency setting to indicate whether to make the generator self-consistent. By default, the generator is not consistent.
If Structural data encryption is enabled, then to use it for this column, in the advanced options section, toggle Use data encryption process to the on position.
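The effect of Preserve Timestamp and Incremental Counter can be sketched on the 12-byte layout of an ObjectId: a 4-byte timestamp, a 5-byte random value, and a 3-byte counter. The function below is a hypothetical illustration, not Structural's implementation. In particular, replacing all 12 bytes when the option is off is an assumption.

```python
import secrets


def mask_object_id(object_id_hex: str, preserve_timestamp_and_counter: bool = True) -> str:
    """Return a new 24-character hex ObjectId string."""
    raw = bytes.fromhex(object_id_hex)
    if len(raw) != 12:
        raise ValueError("an ObjectId must be 12 bytes long")
    if not preserve_timestamp_and_counter:
        # Assumption: without the option, every byte is replaced.
        return secrets.token_bytes(12).hex()
    timestamp, counter = raw[:4], raw[9:]
    # Only the 5-byte random portion in the middle changes.
    return (timestamp + secrets.token_bytes(5) + counter).hex()


print(mask_object_id("507f1f77bcf86cd799439011"))
```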
This generator scrambles characters, but preserves formatting and keeps the file extension intact.
For example, for the following input value:
DataSummary1.pdf
The output value would look something like:
RsnoPwcsrtv5.pdf
This generator securely masks letters and numbers. There is no way to recover the original data.
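The following sketch shows one way to scramble a file name while keeping its formatting and extension. It assumes that uppercase letters, lowercase letters, and digits are each replaced by a random character of the same kind, which matches the example above. `scramble_filename` is a hypothetical name, not Structural's implementation.

```python
import os
import random
import string


def scramble_filename(filename: str, rng=None) -> str:
    """Scramble letters and digits in the name, but keep the extension intact."""
    rng = rng or random.SystemRandom()
    stem, extension = os.path.splitext(filename)

    def scramble(ch: str) -> str:
        if ch.isdigit():
            return rng.choice(string.digits)
        if ch.isupper():
            return rng.choice(string.ascii_uppercase)
        if ch.islower():
            return rng.choice(string.ascii_lowercase)
        return ch  # punctuation and spaces keep the original formatting

    return "".join(scramble(ch) for ch in stem) + extension


print(scramble_filename("DataSummary1.pdf"))
```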
Consistency: Yes, can be made self-consistent.
Linking: No, cannot be linked.
Differential privacy: No
Data-free: No
Allowed for primary keys: No
Allowed for unique columns: No
Uses format-preserving encryption (FPE): No
Privacy ranking: 3 if not consistent, 4 if consistent
Generator ID (for the API)
To configure the generator, toggle the Consistency setting to indicate whether to make the generator self-consistent.
By default, the generator is not consistent.
If Structural data encryption is enabled, then to use it for this column, in the advanced options section, toggle Use data encryption process to the on position.
Generates unique alphanumeric strings of the same length as the input.
For example, for the input value ABC123, the output value is a six-character alphanumeric string such as D24N05.
Consistency: Yes, can be made self-consistent.
Linking: No, cannot be linked.
Differential privacy: No
Data-free: No
Allowed for primary keys: Yes
Allowed for unique columns: Yes
Uses format-preserving encryption (FPE): Yes
Privacy ranking: 3 if not consistent, 4 if consistent
Generator ID (for the API)
To configure the generator, toggle the Consistency setting to indicate whether to make the generator self-consistent.
By default, the generator is not consistent.
If Structural data encryption is enabled, then to use it for this column, in the advanced options section, toggle Use data encryption process to the on position.
You use the workspace management view to configure and run data generation for an individual workspace.
When you log in to Tonic Structural, it displays the workspace management view for the workspace that was selected when you logged out.
The workspace management view includes the following components.
The top left of the workspace management view provides information about the workspace, including:
The workspace name
When the workspace was last updated
The user who last updated the workspace
Whether the workspace is a
The top right of the workspace management view provides general options for working with the workspace, including:
Undo and redo options for configuration changes
The workspace share icon, to
The workspace download menu to:
Download sensitivity scan and privacy reports
The workspace actions menu
The Generate Data button, to
The workspace navigation bar provides access to workspace configuration options.
To display the workspace management view for a workspace:
On Workspaces view, in the Name column either:
Click the workspace name. The workspace management view opens to Privacy Hub.
Click the dropdown icon, then select a workspace management option.
Click the search field at the top, then type the name of the workspace. As you type, Structural displays a list of matching workspaces. In the list, click the workspace name.
To reduce the amount of vertical space used by the heading of the workspace management view, you can collapse it.
To collapse the heading, click the collapse icon in the Structural heading.
When you collapse the workspace management heading:
The workspace information is hidden. The workspace name is displayed in the search field.
The workspace options are moved up into the Structural heading.
The workspace navigation bar remains visible.
When you collapse the heading, the collapse icon changes to an expand icon. To restore the full heading, click the expand icon.
When you first connect to a MongoDB or Amazon DynamoDB database, Tonic Structural performs a scan to determine the available fields in each collection, the field types, and how prevalent the fields are. It performs this scan at the same time as the initial sensitivity scan.
For each collection, Structural creates a hybrid document, which is a superset of all of the fields contained in the collection documents.
By default, for each collection:
The scan includes all of the documents in the collection, and continues until the scan is finished.
Every unique path (field+data type) in the collection is added to the hybrid document.
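The idea of a hybrid document can be illustrated with a short Python sketch. This is not Structural's implementation. The function names `iter_paths` and `build_hybrid` are hypothetical, and the sketch assumes that a path is a dotted field name paired with a Python type name. It also ignores arrays for brevity.

```python
from collections import Counter
from typing import Any, Iterable, Iterator, Tuple


def iter_paths(value: Any, prefix: str = "") -> Iterator[Tuple[str, str]]:
    """Yield (path, type name) pairs for every field in a nested document."""
    if isinstance(value, dict):
        for key, child in value.items():
            path = f"{prefix}.{key}" if prefix else key
            yield path, type(child).__name__
            yield from iter_paths(child, path)


def build_hybrid(documents: Iterable[dict]) -> Counter:
    """Count how many documents contain each unique (path, type) pair."""
    hybrid: Counter = Counter()
    for doc in documents:
        # Use a set so that each path is counted once per document.
        hybrid.update(set(iter_paths(doc)))
    return hybrid


docs = [
    {"name": "Ann", "age": 31},
    {"name": "Bob", "age": "unknown", "address": {"city": "Oslo"}},
]
hybrid = build_hybrid(docs)
for (path, type_name), count in sorted(hybrid.items()):
    print(f"{path} ({type_name}): in {count} of {len(docs)} documents")
```

In this sketch, the `age` field appears twice in the result, once as an integer and once as a string, because a path is the combination of the field and its data type. The per-path counts stand in for how prevalent each field is.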
You can change the default scan behavior. To change the scan configuration, use the following environment settings. You can add these settings manually to the Environment Settings list on Structural Settings.
Note that these settings, including settings that include MONGO in the name, apply to both MongoDB and Amazon DynamoDB.
The following options control the number of documents that Structural scans in a collection.
These options allow you to limit the number of scanned documents when the additional documents do not add fields to the hybrid document.
For large homogeneous collections, where all or most documents have the same structure, configuring these options can improve performance.
If you set both options, then the scan completes when it reaches either limit. For example, if the maximum document count is 10 and the maximum scan time is 360 seconds, then the scan completes either after 10 documents or after 360 seconds, whichever comes first.
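The "whichever comes first" behavior can be sketched in Python as follows. This is an illustration only. The function `limited_scan` and its parameters are hypothetical stand-ins for the document count and scan time limits, and do not reflect how Structural implements the scan.

```python
import time
from typing import Iterable, Iterator, Optional


def limited_scan(
    documents: Iterable[dict],
    max_docs: Optional[int] = None,
    max_seconds: Optional[float] = None,
) -> Iterator[dict]:
    """Yield documents until either limit is reached, whichever comes first."""
    started = time.monotonic()
    for scanned, doc in enumerate(documents):
        if max_docs is not None and scanned >= max_docs:
            break  # document count limit reached
        if max_seconds is not None and time.monotonic() - started >= max_seconds:
            break  # time limit reached
        yield doc


docs = ({"_id": i} for i in range(1000))
scanned = list(limited_scan(docs, max_docs=10, max_seconds=360))
print(len(scanned))  # 10
```

When both limits are `None`, the sketch scans every document, which mirrors the default behavior when the settings are empty.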
Typically, the number of unique fields in a collection is small relative to the number of documents. However, in some cases the number of fields is similar to or greater than the number of documents. This most commonly occurs when documents have "data as keys", such as keys that are ObjectIds, UUIDs, or incrementing integers.
In these cases, adding every unique field to the hybrid document can result in a large hybrid document that has an undesirable structure.
Structural offers configuration options to "collapse" fields within the hybrid document. This shrinks the size of the hybrid document. It also allows you to assign a generator to the collapsed group instead of to each unique key.
By default, Structural does not collapse fields.
To enable this for ObjectId keys, set the environment setting TONIC_MONGO_OBJECT_ID_COLLAPSE_THRESHOLD to the number of ObjectId keys that an object can contain before Structural collapses the object schema into a single key.
For example, if this is 10, then any object that has 10 or more ObjectId keys is collapsed into a single key.
A negative value indicates to not collapse the keys.
The default value is -1.
To enable Structural to collapse fields, you provide a regular expression to identify the fields that can be collapsed into the same field. You then configure the number of matches that must exist before Structural collapses the fields.
To configure how the fields are collapsed, use the following environment settings:
For example:
To collapse keys that are integer values, use the regular expression [0-9]+ or \d+
To collapse keys that are UUIDs, use the regular expression [0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}
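As a rough illustration of how a regular expression and a threshold could work together, consider the following Python sketch. The function `collapse_keys` and the `<collapsed>` placeholder key are hypothetical. The sketch assumes that a key must match the whole expression and that one representative value is kept for the collapsed group. Structural's actual matching rules might differ.

```python
import re

# The UUID expression from the example above.
UUID_KEY = r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}"


def collapse_keys(obj: dict, pattern: str, threshold: int,
                  placeholder: str = "<collapsed>") -> dict:
    """Merge keys that match `pattern` into one key once `threshold` is reached."""
    if threshold < 0:
        return dict(obj)  # a negative threshold disables collapsing
    regex = re.compile(pattern)
    matching = [key for key in obj if regex.fullmatch(key)]
    if not matching or len(matching) < threshold:
        return dict(obj)
    collapsed = {key: value for key, value in obj.items() if key not in matching}
    # Keep one representative value for the collapsed group.
    collapsed[placeholder] = obj[matching[0]]
    return collapsed


orders = {"customer": "c-1", "1": {"qty": 2}, "2": {"qty": 1}, "3": {"qty": 5}}
print(collapse_keys(orders, r"\d+", threshold=3))
# {'customer': 'c-1', '<collapsed>': {'qty': 2}}
print(collapse_keys(orders, r"\d+", threshold=-1) == orders)  # True
print(bool(re.fullmatch(UUID_KEY, "123e4567-e89b-12d3-a456-426614174000")))  # True
```

After the keys are collapsed, a single generator assignment on the collapsed key can cover all of the original keys, which is the benefit that the text describes.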
On Privacy Hub, the Latest Collection Scan table shows the most recent scans on each scanned collection.
The Build Schema option runs a new scan on the collection.
When the source database has a new collection, then on Collection View, you are prompted to run a scan either on that collection or on all collections.
The bulk edit option on Database View allows you to configure multiple columns at the same time. From the bulk editing panel, you can:
Mark the selected columns as sensitive or not sensitive.
Assign a generator to the selected columns.
Apply the recommended generator to the selected columns.
Reset the generator configuration to the baseline. This option requires that all of the selected columns are assigned the same preset.
Depending on the column selection, you can also create a new sensitivity rule.
To select the columns and display the bulk edit option:
Check the checkbox next to each column to update.
Click Bulk Edit.
On the Bulk Edit panel, under Sensitivity:
To mark the selected columns as sensitive, click Sensitive.
To mark the selected columns as not sensitive, click Not Sensitive.
On the Bulk Edit panel, under Bulk Edit Applied Generator, select and configure the generator to assign to the selected columns.
If any of the selected columns have a recommended generator, then on the Bulk Edit panel, the Generator recommendations found panel displays. The panel indicates the number of selected columns that have a recommendation.
To assign the recommended generators to those columns, click Apply.
For a generator preset, the baseline configuration is the configuration that is saved for that preset. The baseline configuration determines the default configuration to use when you assign the preset to a column. After you select the preset, you can override the baseline configuration.
If all of the selected columns are assigned the same preset, then to restore the baseline configuration for all of the columns, click Reset to Baseline.
You might bulk edit columns that could benefit from a custom sensitivity rule.
For example, in your data, the Widget column is in multiple tables and contains sensitive data that Structural cannot identify. You select all of the Widget columns so that you can mark them as sensitive and apply the Character Scramble generator to them.
However, a custom sensitivity rule would ensure that in the future, Widget columns are always marked as sensitive and have the Character Scramble generator recommended.
On the Bulk Edit panel, when all of the selected columns:
Have the same data type.
Do not have a generator assigned.
Do not have a recommended generator.
Then Structural displays the Create a Sensitivity Rule panel, which contains the option to create a new sensitivity rule.
To create a sensitivity rule:
Click Create Custom Rule.
On the Create Custom Rule view, configure the new sensitivity rule. Structural automatically selects a data type based on the selected columns. The current workspace is used as the testing workspace to verify the columns that match the rule configuration. For details about the sensitivity rule configuration, go to .
When you finish configuring the new rule:
To both save the rule and apply the generator preset to all workspace columns that match the rule, click Save and Apply. On the confirmation panel, click Confirm Auto Apply.
To save the rule, but not apply the configured generator preset to matching columns, click Save.
Structural closes the sensitivity rule configuration view and returns you to Database View. It maintains the previous column selection.
If you did not apply the generator preset, then the sensitivity rule is included in the next sensitivity scan.
For self-hosted instances, Structural provides environment settings to configure features that include:
Consistency across runs and databases
Data generation performance
The Advanced Workspace Overrides section of the workspace details view allows you to override those environment settings for an individual workspace.
For example, the environment setting TONIC_TABLE_PARALLELISM determines the number of tables that Structural processes simultaneously. You can then override that value within individual workspaces.
The workspace overrides are available on both self-hosted instances and on Structural Cloud.
To display the available override settings, expand Advanced Workspace Overrides.
For information on how to configure the statistics seed, go to .
For other settings, to enable the override and set the override value:
Toggle the setting to the on position.
Set the value.
To remove the override, toggle the setting to the off position.
For generators where consistency is enabled, a statistics seed enables consistency across data generation runs. The Structural-wide statistics seed value ensures consistency across both data generation runs and workspaces.
You use the Override Statistics Seed setting to override the Structural-wide statistics seed value.
You can either disable consistency across data generations, or provide a seed value for the workspace. The workspace seed value ensures consistency across data generation runs for that workspace, and across other workspaces that have the same seed value.
For details about using seed values to ensure consistency across data generation runs and databases, go to .
Structural provides environment settings to manage data generation performance. For example, these settings include configuration for parallel processing.
From Advanced Workspace Overrides, you can override some of these data generation performance settings for an individual workspace.
To use Structural data encryption, you must .
You use the Override Data Decryption Key and Override Data Encryption Key settings to override the Structural-wide keys that are provided in the environment settings.
Some data connectors allow you to configure whether you provide the schema for the destination database. For more information, go to related information for , , , , and .
From Advanced Workspace Overrides, you can override the instance-wide configuration.
Databricks and Amazon EMR allow you to configure how Structural handles overwrites of existing data.
You use the Override Workspace Default Error on Override and Override Workspace Default Save Mode settings to override the instance-wide configuration.
The algebraic generator identifies the algebraic relationship between three or more numeric values and generates new values to match. At least one of the values must be a non-integer.
If a relationship cannot be found, then the generator defaults to the generator.
This generator can be linked with other Algebraic generators.
To configure the generator, from the Link To dropdown list, select the columns to link this column to. You can select other columns that are assigned the Algebraic generator.
You must select at least three columns.
The column values must be numeric. At least one of the columns must contain a value other than an integer.
If Structural data encryption is enabled, then to use it for this column, in the advanced options section, toggle Use data encryption process to the on position.
This is a .
Applies different generators to the value conditionally based on any value in the table.
For example, a Users table contains Name, Username, and Role columns. For the Username column, you can use a conditional generator to indicate that if the value of Role is something other than Test, then use the Character Scramble generator for the Username value. For Test users, the name is not masked.
The generator consists of a list of options. Each option includes the required conditions and the generator to use if those conditions are met.
The generator always contains a Default option. The Default option is used if the value does not meet any of the conditions. To configure the Default option:
From the Default dropdown list, select the generator to use by default.
Configure the selected generator.
To add a condition option:
Click + Conditional Generator.
To add a condition:
Click + Condition.
From the column list, select the column for which to check the value.
Select the comparison type.
Enter the column value to check for.
To remove a condition, click the delete icon for the condition.
From the Generator dropdown list, select the generator to run on the current column if the conditions are met. You cannot select another composite generator.
Choose the configuration options for the selected generator.
To view details for and edit a condition option, click the expand icon for that option.
To remove a condition option, click the delete icon for the option.
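The option list can be pictured as an ordered set of condition and generator pairs with a default. The following Python sketch models the Users table example. The names `apply_conditional`, `mask`, and `passthrough` are hypothetical stand-ins, and the sketch assumes that all conditions in an option must be met and that the first matching option wins.

```python
from typing import Callable, Dict, List, Tuple

Row = Dict[str, str]
Generator = Callable[[str], str]
Condition = Callable[[Row], bool]
Option = Tuple[List[Condition], Generator]


def mask(value: str) -> str:
    """Stand-in for a masking generator: replaces every character with 'x'."""
    return "x" * len(value)


def passthrough(value: str) -> str:
    """Stand-in for a generator that leaves the value unchanged."""
    return value


def apply_conditional(row: Row, column: str, options: List[Option],
                      default: Generator) -> str:
    """Run the generator of the first option whose conditions are all met."""
    for conditions, generator in options:
        if all(condition(row) for condition in conditions):
            return generator(row[column])
    return default(row[column])  # no option matched, so use the Default option


# Mask Username unless the Role is Test.
options: List[Option] = [([lambda row: row["Role"] != "Test"], mask)]

admin = {"Name": "Alice", "Username": "alice", "Role": "Admin"}
tester = {"Name": "QA Bot", "Username": "qa_bot", "Role": "Test"}
print(apply_conditional(admin, "Username", options, passthrough))   # xxxxx
print(apply_conditional(tester, "Username", options, passthrough))  # qa_bot
```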
A version of the generator that can be used for array values.
This generator replaces letters with random other letters, and numbers with random other numbers. Punctuation and whitespace are preserved.
For example, for the following array value:
["ABC.123", 3, "last week"]
The output might be something like:
["KFR.860", 7, "sdrw mwoc"]
This generator securely masks letters and numbers. There is no way to recover the original data.
To configure the generator, toggle the Consistency setting to indicate whether to make the generator self-consistent.
By default, the generator is not consistent.
If Structural data encryption is enabled, then to use it for this column, in the advanced options section, toggle Use data encryption process to the on position.
Performs a random character replacement that preserves formatting (spaces, capitalization, and punctuation).
Characters are replaced with other characters from within the same Unicode Block. A given source character is always mapped to the same destination character. For example, M might always map to V.
For example, for the following input string:
Miami Store #162
The output would be something like:
Vgkjg Gmlvf #681
Note that for a numeric column, when a generated number starts with a 0, the starting 0 is removed. This could result in matching output values in different columns. For example, one column is changed to 113 and the other to 0113, which also becomes 113.
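A fixed character mapping can be sketched in Python as follows. This is an illustration under stated assumptions, not Structural's algorithm. The sketch covers only ASCII letters and digits instead of full Unicode blocks, derives the mapping from a hypothetical `seed` value, and uses a shuffled alphabet so that the mapping is one-to-one.

```python
import random
import string


def build_substitution(seed: int) -> dict:
    """Build a fixed one-to-one character mapping within each character class."""
    rng = random.Random(seed)
    table = {}
    for alphabet in (string.ascii_uppercase, string.ascii_lowercase, string.digits):
        shuffled = list(alphabet)
        rng.shuffle(shuffled)
        table.update(zip(alphabet, shuffled))
    return table


def substitute(value: str, table: dict) -> str:
    """Replace mapped characters; spaces and punctuation pass through unchanged."""
    return "".join(table.get(ch, ch) for ch in value)


table = build_substitution(seed=42)
print(substitute("Miami Store #162", table))
# The same source text always produces the same output text.
print(substitute("Miami", table) == substitute("Miami Store #162", table)[:5])  # True
# Leading zeros disappear when digit strings are stored as numbers.
print(int("0113") == int("113"))  # True
```

Because the mapping is one-to-one, two different input strings cannot produce the same output string. The last line shows why that guarantee can still break for numeric columns, as described in the note above.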
Character Substitution is similar to Character Scramble, with a couple of key differences. Because Character Substitution always maps the same source character to the same destination character, it is always consistent. It also can be used for unique columns.
In Character Scramble, the character mapping is random, which makes Character Scramble slightly more secure. However, Character Scramble cannot be used for unique columns.
If Structural data encryption is enabled, then to use it for this column, in the advanced options section, toggle Use data encryption process to the on position.
Generates a continuous distribution to fit the underlying data.
This generator can be linked to other Continuous generators to create multivariate distributions and can be partitioned by other columns.
To configure the generator:
From the Link To drop-down list, select the other Continuous generator columns to link to. The linking creates a multivariate distribution.
From the Partition By drop-down list, select one or more columns to use to partition the data. The selected columns must have the generator set to either Passthrough or Categorical. For more information about partitioning and how it works, go to .
Toggle the Differential Privacy setting to indicate whether to make the output data differentially private. By default, the generator is not differentially private.
If Structural data encryption is enabled, then to use it for this column, in the advanced options section, toggle Use data encryption process to the on position.
Generates an address-like string to replace either:
For a Canadian postal address, the street name or postal code.
For a United Kingdom (UK) mailing address, the postal code.
To replace a Canadian postal code:
The generator selects a real postal code that starts with the same three characters as the original postal code, which means that it has the same Forward Sortation Area (FSA), but that has a different Local Delivery Unit (LDU).
For a postal code whose FSA is not on the list that the generator uses, you can provide a fallback value to use.
To replace a UK postal code, the generator selects a real postal code.
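The Canadian postal code replacement can be sketched in Python as follows. The list `SAMPLE_POSTAL_CODES` contains illustrative values only, and `replace_postal_code` is a hypothetical function. The sketch assumes that the fallback value is an FSA, and that when no fallback is set and the FSA is not found, the result is the string "NULL".

```python
import random
from typing import Optional

# Illustrative values only; a real list of postal codes would be far larger.
SAMPLE_POSTAL_CODES = ["K1A 0B1", "K1A 0A6", "K1A 0A9", "M5V 3L9", "M5V 2T6"]


def replace_postal_code(original: str, rng: random.Random,
                        fallback_fsa: Optional[str] = None) -> str:
    """Pick a listed postal code with the same FSA but a different LDU."""
    normalized = original.strip().upper()
    fsa = normalized[:3]  # the first three characters are the FSA
    matches = [code for code in SAMPLE_POSTAL_CODES
               if code.startswith(fsa) and code != normalized]
    if not matches and fallback_fsa is not None:
        # The FSA is not in the list, so use the fallback FSA instead.
        matches = [code for code in SAMPLE_POSTAL_CODES
                   if code.startswith(fallback_fsa.upper())]
    if not matches:
        return "NULL"
    return rng.choice(matches)


rng = random.Random(1)
print(replace_postal_code("K1A 0B1", rng))         # another K1A code
print(replace_postal_code("Z9Z 9Z9", rng))         # NULL
print(replace_postal_code("Z9Z 9Z9", rng, "M5V"))  # an M5V code
```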
To configure the generator:
From the Generator Type dropdown list, select International Address.
From the Country dropdown list, select the country (Canada or United Kingdom).
From the Address Component dropdown list, select the address component that this column contains. For Canada, the available options are:
Street Name
Postal Code
For the UK, the only option is to generate a postal code.
For a Canadian postal code, in the Fallback Value field, type the FSA to use if the value in the data does not exist.
For example, the FSA in the data might be new and not yet in the list that Structural uses, or the FSA might be invalid.
By default, the fallback value is NULL, meaning that in the destination data, the postal code value is the string literal "NULL".
Toggle the Consistency setting to indicate whether to make the column self-consistent. By default, consistency is disabled.
If Structural data encryption is enabled, then to use it for this column, in the advanced options section, toggle Use data encryption process to the on position.
Generators transform the data in a source database column. You assign the generators to use. Tonic Structural offers a variety of generators to transform different types of data.
For details about how to assign and configure generators, and manage generator presets, go to .
You can also view this .
Generates a random name string from a dictionary of first and last names.
You specify the name information that is contained in the column. A column might only contain a first name or last name, or it might contain a full name. A full name might be first name first or last name first.
For example, a Name column contains a full name in the format Last, First. For the input value Smith, John, the output value would be something like, Jones, Mary.
To configure the generator:
From the name format dropdown list, select the type of name value that the column contains:
First. This also is commonly used for standalone middle name fields.
Last
First Last
First Middle Last
First Middle Initial Last
Last, First
Last, First Middle
Middle Initial
Toggle the Preserve Capitalization setting to indicate whether to preserve the capitalization of the column value. By default, the capitalization is not preserved.
Toggle the Consistency setting to indicate whether to make the column consistent. By default, consistency is disabled.
If you enable consistency, then by default the generator is self-consistent. To make the generator consistent with another column, from the Consistent to dropdown list, select the column.
If Structural data encryption is enabled, then to use it for this column, in the advanced options section, toggle Use data encryption process to the on position.
Generates random host names, based on the English language.
To configure the generator, toggle the Consistency setting to indicate whether to make the generator consistent.
By default, the generator is not consistent.
If you enable consistency, then by default the generator is self-consistent. To make the generator consistent with another column, from Consistent to, select the column.
When the generator is consistent with itself, then a given value in the source database is mapped to the same value in the destination database. For example, Host123 in the source database always produces MyHostABC in the destination database.
When the generator is consistent with another column, then a given source value in the other column results in the same host name value in the destination database. For example, a host name column is consistent with a department column. Every instance of Sales in the source data is given the same host name in the destination database.
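One common way to get this kind of repeatable mapping is to derive the output from a keyed hash of the source value. The following Python sketch shows that general technique. It is not a description of how Structural implements consistency. The `HOSTS` pool, the `consistent_host` function, and the `seed` value are hypothetical.

```python
import hashlib
import hmac

# Hypothetical pool of replacement host names.
HOSTS = ["alpha-host", "bravo-host", "charlie-host", "delta-host"]


def consistent_host(source_value: str, seed: bytes) -> str:
    """Map a source value to a host name that is stable for a given seed."""
    digest = hmac.new(seed, source_value.encode("utf-8"), hashlib.sha256).digest()
    return HOSTS[int.from_bytes(digest[:8], "big") % len(HOSTS)]


seed = b"example-seed"
row = {"host": "Host123", "department": "Sales"}

# Self-consistent: the key is the value of the host name column itself.
print(consistent_host(row["host"], seed) == consistent_host("Host123", seed))  # True

# Consistent with another column: the key is the department value instead,
# so every Sales row receives the same host name.
print(consistent_host(row["department"], seed) == consistent_host("Sales", seed))  # True
```

The only difference between the two modes in this sketch is which column supplies the key. A small pool such as `HOSTS` means that different source values can share an output, so the sketch does not guarantee unique results.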
If Structural data encryption is enabled, then to use it for this column, in the advanced options section, toggle Use data encryption process to the on position.
This generator can be used to mask columns of latitude and longitude.
The Geo generator divides the globe into grids that are approximately 4.9 x 4.9 km. It then counts the number of points within each grid.
During data generation, each (latitude, longitude) pair is mapped to its grid.
If the grid contains a sufficient number of points to preserve privacy, then the generator returns a randomly chosen point in that grid.
If the grid does not contain enough points to preserve privacy, then the generator returns a random coordinate from the nearest grid that contains enough points.
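The grid logic can be approximated with the following Python sketch. Several details are assumptions: the cell size `CELL_DEG` treats 4.9 km as a fixed number of degrees in both directions, `MIN_POINTS` is a made-up privacy threshold, "nearest" is measured in grid cells, and the returned point is a uniformly random coordinate inside the cell.

```python
import math
import random
from collections import defaultdict
from typing import Dict, List, Tuple

Point = Tuple[float, float]  # (latitude, longitude)
Cell = Tuple[int, int]

CELL_DEG = 0.044  # about 4.9 km of latitude; a simplification for longitude
MIN_POINTS = 3    # hypothetical threshold for "enough points"


def cell_of(point: Point) -> Cell:
    """Return the grid cell that contains the point."""
    lat, lon = point
    return math.floor(lat / CELL_DEG), math.floor(lon / CELL_DEG)


def random_point_in(cell: Cell, rng: random.Random) -> Point:
    """Return a random coordinate inside the cell."""
    row, col = cell
    return (row + rng.random()) * CELL_DEG, (col + rng.random()) * CELL_DEG


def mask_points(points: List[Point], rng: random.Random) -> List[Point]:
    """Replace each point with a random point from a sufficiently dense cell."""
    counts: Dict[Cell, int] = defaultdict(int)
    for point in points:
        counts[cell_of(point)] += 1
    dense = [cell for cell, n in counts.items() if n >= MIN_POINTS]
    if not dense:
        raise ValueError("no cell contains enough points")
    masked = []
    for point in points:
        home = cell_of(point)
        target = home
        if counts[home] < MIN_POINTS:
            # Use the nearest cell that contains enough points.
            target = min(dense, key=lambda c: (c[0] - home[0]) ** 2 + (c[1] - home[1]) ** 2)
        masked.append(random_point_in(target, rng))
    return masked


points = [(40.001, -74.001), (40.002, -74.002), (40.003, -74.003), (41.0, -73.0)]
masked = mask_points(points, random.Random(0))
print(masked)
```

In this example, the isolated point at (41.0, -73.0) is moved into the cell that holds the other three points, because its own cell contains too few points.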
To configure the generator:
From the Link To dropdown list, select the column to link to this one. You typically assign the Geo generator to both the latitude and longitude column, then link those columns.
From the value type dropdown, select whether this column contains a latitude value or a longitude value.
If Structural data encryption is enabled, then to use it for this column, in the advanced options section, toggle Use data encryption process to the on position.
Generates a valid Finnish Personal Identity Code (PIC) that would have been issued during a specific date range.
To configure the generator:
Under Date Range, set the start and end dates for the date range to generate the PICs for.
Toggle the Consistency setting to indicate whether to make the generator self-consistent. By default, the generator is not consistent.
If Structural data encryption is enabled, then to use it for this column, toggle Use data encryption process to the on position.
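For context, a PIC consists of a birth date (DDMMYY), a century sign, a three-digit individual number, and a check character. The following Python sketch generates values in that format for a date range. It is a simplified illustration, not Structural's implementation. The function names are hypothetical, the sketch uses only the original century signs (+, -, and A), and it assumes that individual numbers range from 002 through 899.

```python
import random
from datetime import date, timedelta

CHECK_CHARS = "0123456789ABCDEFHJKLMNPRSTUVWXY"
CENTURY_SIGNS = {18: "+", 19: "-", 20: "A"}  # original signs only


def check_char(ddmmyy: str, individual: str) -> str:
    """Check character: the nine digits, read as a number, modulo 31."""
    return CHECK_CHARS[int(ddmmyy + individual) % 31]


def random_pic(start: date, end: date, rng: random.Random) -> str:
    """Generate a PIC for a random birth date between start and end."""
    birth = start + timedelta(days=rng.randint(0, (end - start).days))
    ddmmyy = f"{birth.day:02d}{birth.month:02d}{birth.year % 100:02d}"
    individual = f"{rng.randint(2, 899):03d}"  # assumed range of individual numbers
    sign = CENTURY_SIGNS[birth.year // 100]
    return f"{ddmmyy}{sign}{individual}{check_char(ddmmyy, individual)}"


# Check against the widely published sample value 131052-308T.
print(check_char("131052", "308"))  # T
print(random_pic(date(1990, 1, 1), date(1999, 12, 31), random.Random(7)))
```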
Generates a random IP address formatted string.
To configure the generator:
In the Percent IPv4 field, type the percentage of output values that are IPv4 addresses.
For example, if you set this to 60, then 60% of the generated IP addresses are IPv4 addresses, and 40% of the generated IP addresses are IPv6 addresses.
If you set this to 100, then all of the generated IP addresses are IPv4 addresses.
If you set this to 0, then all of the generated IP addresses are IPv6 addresses.
Toggle the Consistency setting to indicate whether to make the column consistent. By default, consistency is disabled.
If you enable consistency, then by default the generator is self-consistent. To make the generator consistent with another column, from the Consistent to dropdown list, select the column. When a generator is self-consistent, then a given value in the source database is always mapped to the same value in the destination database. When a generator is consistent with another column, then a given source value in that column always results in the same IP address value in the destination database. For example, an IP address column is consistent with a username column. For each instance of User1 in the source database, the value in the IP address column is the same.
If Structural data encryption is enabled, then to use it for this column, in the advanced options section, toggle Use data encryption process to the on position.
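The effect of the Percent IPv4 value can be sketched with the Python standard library `ipaddress` module. The function `random_ip` is hypothetical. The sketch treats the percentage as a per-value probability, so the actual split in a small sample only approximates the configured value.

```python
import ipaddress
import random


def random_ip(percent_ipv4: float, rng: random.Random) -> str:
    """Return a random IPv4 address with the given probability, else IPv6."""
    if rng.random() * 100 < percent_ipv4:
        return str(ipaddress.IPv4Address(rng.getrandbits(32)))
    return str(ipaddress.IPv6Address(rng.getrandbits(128)))


rng = random.Random(0)
sample = [random_ip(60, rng) for _ in range(1000)]
ipv4_count = sum(ipaddress.ip_address(value).version == 4 for value in sample)
print(f"{ipv4_count / 10:.1f}% of the sample is IPv4")
```

With a value of 100, every result is an IPv4 address. With a value of 0, every result is an IPv6 address.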
This generator replaces letters with random other letters and numbers with random other numbers. Punctuation, whitespace, and mathematical symbols are preserved.
For example, for the following input string:
ABC.123 123-456-789 Go!
The output would be something like:
PRX.804 296-915-378 Ab!
This generator securely masks letters and numbers. There is no way to recover the original data.
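The replacement rule can be sketched in Python as follows. This is an illustration, not Structural's implementation. The function `scramble` is hypothetical, the sketch covers only ASCII letters and digits, and it assumes that the case of each letter is preserved, as in the example output above. A random replacement can occasionally equal the original character.

```python
import random
import string


def scramble(value: str, rng: random.Random) -> str:
    """Replace each letter or digit with a random one of the same class."""
    result = []
    for ch in value:
        if ch in string.ascii_uppercase:
            result.append(rng.choice(string.ascii_uppercase))
        elif ch in string.ascii_lowercase:
            result.append(rng.choice(string.ascii_lowercase))
        elif ch in string.digits:
            result.append(rng.choice(string.digits))
        else:
            result.append(ch)  # punctuation, whitespace, and symbols are kept
    return "".join(result)


# SystemRandom draws from the operating system, so the output is not repeatable.
print(scramble("ABC.123 123-456-789 Go!", random.SystemRandom()))
```

Because each character is drawn independently, the same source character can map to different output characters, and two different inputs can produce the same output.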
Character Scramble is similar to Character Substitution, with a couple of key differences.
While you can enable consistency for the entire value, Character Scramble does not always replace the same source character with the same destination character. Because there is no guarantee of unique values, you cannot use Character Scramble on unique columns.
Character Substitution, however, does always map the same source character to the same destination character. Character Substitution is always consistent, which makes it less secure than Character Scramble. You can use Character Substitution on unique columns.
To configure the generator, toggle the Consistency setting to indicate whether to make the generator self-consistent.
By default, the generator is not consistent.
If Structural data encryption is enabled, then to use it for this column, in the advanced options section, toggle Use data encryption process to the on position.
Consistency
No, cannot be made consistent.
Linking
No, cannot be linked.
Differential privacy
No
Data-free
No
Allowed for primary keys
No
Allowed for unique columns
Yes
Uses format-preserving encryption (FPE)
No
Privacy ranking
6
Generator ID (for the API)
Consistency
Yes, can be made self-consistent.
Linking
No, cannot be linked.
Differential privacy
No, cannot be made differentially private.
Data-free
Yes, if consistency is not enabled.
Allowed for primary keys
No
Allowed for unique columns
Yes
Generator ID (for the API)
Consistency
No, cannot be made consistent.
Linking
No, cannot be linked.
Differential privacy
Yes
Data-free
Yes
Allowed for primary keys
No
Allowed for unique columns
Yes
Uses format-preserving encryption (FPE)
No
Privacy ranking
1
Generator ID (for the API)
Consistency
No, cannot be made consistent.
Linking
Yes, can be linked.
Differential privacy
No
Data-free
No
Allowed for primary keys
No
Allowed for unique columns
No
Uses format-preserving encryption (FPE)
No
Privacy ranking
3
Generator ID (for the API)
Consistency
Determined by the selected generators.
Linking
Determined by the selected generators.
Differential privacy
Determined by the selected generators.
Data-free
Determined by the selected generators.
Allowed for primary keys
Yes, but:
Make sure that the configuration preserves uniqueness.
Do not use on primary key columns that are used for subsetting.
Allowed for unique columns
Yes
Uses format-preserving encryption (FPE)
No
Privacy ranking
If a fallback generator is selected, then the lower of either 5 or the fallback generator.
5 if no fallback generator is selected
Generator ID (for the API)
Consistency
Yes, can be made self-consistent.
Linking
No, cannot be linked.
Differential privacy
No
Data-free
No
Allowed for primary keys
No
Allowed for unique columns
No
Uses format-preserving encryption (FPE)
No
Privacy ranking
3 if not consistent
4 if consistent
Generator ID (for the API)
Consistency
This generator is implicitly self-consistent. You do not specify whether the generator is consistent. Every occurrence of a character always maps to the same substitute character. Because of this, it can be used to preserve a join between two text columns, such as a join on a name or email.
Linking
No, cannot be linked.
Differential privacy
No
Data-free
No
Allowed for primary keys
Yes
Allowed for unique columns
Yes
Uses format-preserving encryption (FPE)
No
Privacy ranking
4
Generator ID (for the API)
Consistency
No, cannot be made consistent.
Linking
Yes, can be linked.
Differential privacy
Configurable
Data-free
No
Allowed for primary keys
No
Allowed for unique columns
No
Uses format-preserving encryption (FPE)
No
Privacy ranking
2 if differential privacy enabled
3 if differential privacy not enabled
Generator ID (for the API)
Consistency
Yes, can be made self-consistent.
Linking
No, cannot be linked.
Differential privacy
Yes, if consistency is not enabled.
Data-free
Yes, if consistency is not enabled.
Allowed for primary keys
No
Allowed for unique columns
No
Uses format-preserving encryption (FPE)
No
Privacy ranking
1 if not consistent
4 if consistent
Generator ID (for the API)
Consistency
Yes, can be made self-consistent or consistent with another column. Note that all Name generator columns that have the same consistency configuration are automatically consistent with each other. The columns must either be all self-consistent or all consistent with the same other column. For example, you can use this to ensure that a first name and last name column value always match the first name and last name in a full name column.
Linking
No, cannot be linked.
Differential privacy
Yes, if consistency is not enabled.
Data-free
Yes, if consistency is not enabled.
Allowed for primary keys
No
Allowed for unique columns
No
Uses format-preserving encryption (FPE)
No
Privacy ranking
1 if not consistent
4 if consistent
Generator ID (for the API)
Consistency
Yes, can be made self-consistent or consistent with another column.
Linking
No, cannot be linked.
Differential privacy
Yes, if consistency is not enabled.
Data-free
Yes, if consistency is not enabled.
Allowed for primary keys
No
Allowed for unique columns
No
Uses format-preserving encryption (FPE)
No
Privacy ranking
1 if not consistent
4 if consistent
Generator ID (for the API)
Consistency
No, cannot be made consistent.
Linking
Yes, can be linked.
Differential privacy
No
Data-free
No
Allowed for primary keys
No
Allowed for unique columns
Yes
Uses format-preserving encryption (FPE)
No
Privacy ranking
3
Generator ID (for the API)
Consistency
Yes, can be made self-consistent or consistent with another column.
Linking
No, cannot be linked.
Differential privacy
Yes, if consistency is not enabled.
Data-free
Yes, if consistency is not enabled.
Allowed for primary keys
No
Allowed for unique columns
No
Uses format-preserving encryption (FPE)
No
Privacy ranking
1 if not consistent
4 if consistent
Generator ID (for the API)
Consistency
Yes, can be made self-consistent.
Linking
No, cannot be linked.
Differential privacy
No
Data-free
No
Allowed for primary keys
No
Allowed for unique columns
No
Uses format-preserving encryption (FPE)
No
Privacy ranking
3 if not consistent
4 if consistent
Generator ID (for the API)








TONIC_DOCUMENT_SCAN_MAX_DOCS_COUNT
The maximum number of documents to scan for each schema in a collection. For example, if this is 10, then Structural scans up to 10 documents, and ignores the remaining documents. When this value is empty, Structural scans all of the documents.
TONIC_DOCUMENT_SCAN_MAX_TIME_SECONDS
The maximum amount of time in seconds to scan a schema. For example, if this is 360, then Structural scans a schema for up to 360 seconds. When this value is empty, Structural continues the scan until it is complete.
TONIC_DOCUMENT_COLLAPSE_FIELDS_REGEX
The regular expression that identifies the fields that can be collapsed into a single field. By default, this value is empty.
TONIC_DOCUMENT_COLLAPSE_FIELDS_REGEX_THRESHOLD
The number of fields that match the regular expression before Structural collapses the fields into a single field. For example, if this is 5, then after Structural finds 5 fields that match the regular expression, it collapses all of the matching fields into a single field. A negative value indicates to not collapse the fields. The default value is -1.



View and configure tables
Filter the table list, and assign table modes to tables.
View the column list
Apply filters to and sort the list of columns.
View sample data
View example source and destination data for a column.
Configure an individual column
Assign a generator and determine the column sensitivity.
Configure multiple columns
Use the bulk edit option to configure multiple columns.
Identify similar columns
Identify and filter to columns that are similar to a column, based on the column name.
Comment on columns
Add and respond to column comments.

Structural identifies the following types of sensitive values. These include some information types that are considered by many privacy standards and frameworks such as HIPAA, GDPR, CCPA, and PCI.
For more information about the HIPAA and Safe Harbor information types that Structural detects, go to the Tonic.ai guide Using Tonic Structural and the Safe Harbor method to de-identify PHI.
Names
First
Last
Full
Organization
Location
Street address
ZIP
PO Box
City
State and two-letter abbreviation
Country
Postal code
GPS coordinates
Contact information
Email address
Telephone number
User credentials
Username
Password
Financial information
Credit card number
International bank account number (IBAN)
SWIFT code for bank transfers
Money amount
BTC (Bitcoin) address
Identification
Social Security Number
Passport number
Driver's license number
Birth date
Gender
Biometric identifier, such as a fingerprint or voiceprint
Full face photographic images and similar images
Medical information
ICD-9 and ICD-10 codes (used to identify diseases)
Medical record number
Health plan beneficiary number
Admission date
Discharge date
Date of death
Other personal information
Marital status
Accounts and licenses
Account number
Certificate or license number
Network and web location
IP address
IPv6 address
MAC address
Web URL
International Mobile Equipment Identity (IMEI)
Vehicle information
Vehicle identification number (VIN)
License plate number




Tonic Ephemeral is a separate Tonic.ai product that allows you to create temporary databases to use for testing and demos. For more information about Ephemeral, go to the Ephemeral documentation.
If Ephemeral supports your workspace database type, then you can write the destination data to a snapshot in Ephemeral. You can then use the snapshot to start Ephemeral databases.
You can view and manage the snapshots and databases from the Outputs tab of the workspace management view. For more information, go to Viewing and managing Ephemeral output.
To write the transformed data to Ephemeral, under Destination Settings, click Ephemeral Database.
Structural can write the data snapshot to either Ephemeral Cloud or to a self-hosted instance of Ephemeral. By default, Structural writes the data snapshot to Ephemeral Cloud.
All workspaces on the same self-hosted Structural instance or in the same Structural Cloud organization must write to the same instance of Ephemeral. When you change the Ephemeral output configuration in one workspace, it is automatically changed in other workspaces that write to Ephemeral.
Structural writes the snapshot to the Ephemeral Cloud account for the user who runs the data generation job.
If that user has an Ephemeral account on Ephemeral Cloud, then Structural uses that account.
If the user does not have an account, then Structural creates an account for them.
On Structural Cloud, when you save the workspace, if you do not have an Ephemeral Cloud account, then an Ephemeral Cloud account is created for you.
When Structural creates an Ephemeral account, if the user belongs to an existing Ephemeral Cloud organization, then the account is added to the organization. Otherwise, the account is a two-week free trial account.
For a self-hosted Structural workspace, you must provide an API key from an existing Ephemeral Cloud account.
To write a snapshot to Ephemeral Cloud:
Click Tonic Ephemeral cloud.
If you are on a self-hosted instance of Structural:
In the API Key field, provide an Ephemeral API key from your Ephemeral account.
To test the connection, click Test Connection.
When you write to a self-hosted instance of Ephemeral, you must always provide an Ephemeral API key.
To write the snapshot to a self-hosted instance of Ephemeral:
Click Tonic Ephemeral self-hosted.
In the API Key field, provide an Ephemeral API key from your Ephemeral account. Structural writes the snapshot to the Ephemeral account that is associated with the API key.
In the Tonic Ephemeral URL field, provide the URL to your self-hosted Ephemeral instance.
To test the connection, click Test Connection.
For Oracle, you select the base image to use to create the data snapshot.
If you write to Ephemeral Cloud, then you must use the Oracle 23c base image that comes with Ephemeral. This image has the following limitations:
A maximum of 12 GB of user data
A maximum of 2 CPU cores and 2 GB of RAM
If you write to a self-hosted instance of Ephemeral, then you can also select a custom image that you created in Ephemeral.
For information about how to create and manage custom images for Oracle, go to the Ephemeral documentation.
If you do not configure any advanced settings, then:
The snapshot uses the same name as the workspace, and has no description.
The snapshot size allocation is determined by the source data size.
Structural discards the temporary Ephemeral database that is created during the data generation.
To change any of these settings, click Advanced settings.
By default, the snapshot name uses the workspace name.
When you run data generation, if a snapshot with the same name already exists in Ephemeral, then Structural overwrites that snapshot with the new snapshot.
Under Advanced settings:
In the Snapshot name field, provide the name of the snapshot. The snapshot name can use the following placeholder values to help identify the snapshot:
{workspaceName} - Inserts the name of the workspace.
{workspaceId} - Inserts the identifier of the workspace.
{jobId} - Inserts the identifier of the data generation job that created the snapshot.
{timestamp} - Inserts the timestamp when the snapshot was created.
Including the job ID or timestamp ensures that a data generation job does not overwrite a previous snapshot.
Optionally, in the Snapshot description field, provide a longer description of the snapshot.
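To see how the snapshot name placeholders expand, here is a minimal Python sketch. It only mimics the substitution. The function name `render_snapshot_name`, the sample identifiers, and the timestamp format are assumptions. The format that Structural actually uses for `{timestamp}` is not stated here.

```python
from datetime import datetime, timezone

def render_snapshot_name(template, workspace_name, workspace_id, job_id, created_at):
    """Illustrative only: expand the documented placeholders in a name template."""
    values = {
        "{workspaceName}": workspace_name,
        "{workspaceId}": workspace_id,
        "{jobId}": job_id,
        # Assumed timestamp format, chosen only for this example.
        "{timestamp}": created_at.strftime("%Y%m%dT%H%M%SZ"),
    }
    name = template
    for placeholder, value in values.items():
        name = name.replace(placeholder, value)
    return name

created = datetime(2024, 1, 15, 9, 30, 0, tzinfo=timezone.utc)
# A template without {jobId} or {timestamp} produces the same name on every run,
# so each data generation overwrites the previous snapshot.
print(render_snapshot_name("{workspaceName}", "orders", "ws-1", "job-42", created))
# Including the job ID makes the name different for each job.
print(render_snapshot_name("{workspaceName}-{jobId}", "orders", "ws-1", "job-42", created))
```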
By default, the resources used for the snapshot are based on the size of the source data.
For source data that is 25 GB or less, Nano is used.
For source data larger than 25 GB, Micro is used.
To select a specific option:
Toggle Custom pod resources to the on position.
From the dropdown list, select the option to use for the combination of vCPUs and memory:
Nano - 0.125 vCPU with 0.5 GB RAM
Micro - 0.5 vCPU with 2 GB RAM
Small - 1 vCPU with 4 GB RAM
Medium - 2 vCPU with 8 GB RAM
Large - 4 vCPU with 16 GB RAM
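The default selection rule and the available options can be summarized in a short Python sketch. The names `POD_SIZES` and `default_pod_size` are hypothetical. The values come from the list above.

```python
# vCPU and RAM (in GB) for each option, as listed above.
POD_SIZES = {
    "Nano": (0.125, 0.5),
    "Micro": (0.5, 2),
    "Small": (1, 4),
    "Medium": (2, 8),
    "Large": (4, 16),
}

def default_pod_size(source_data_gb):
    """Illustrative only: the documented default when custom pod resources are off."""
    # 25 GB or less uses Nano. Anything larger uses Micro.
    return "Nano" if source_data_gb <= 25 else "Micro"

for size_gb in (10, 25, 26):
    option = default_pod_size(size_gb)
    vcpu, ram_gb = POD_SIZES[option]
    print(f"{size_gb} GB source -> {option} ({vcpu} vCPU, {ram_gb} GB RAM)")
```

Small, Medium, and Large are never selected by default. To use them, you enable Custom pod resources.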
By default, the Ephemeral size allocation for the snapshot is based on the size of the source data.
To instead provide a custom data size allocation, under Advanced settings:
Toggle Custom data size allocation to the on position.
In the field, enter the size allocation in gigabytes.
When Structural creates the Ephemeral snapshot, it creates a temporary Ephemeral database.
By default, Structural deletes that database when the data generation is complete.
To instead keep the database, under Advanced settings, toggle Keep database active in Tonic Ephemeral after data generation to the on position.
In Ephemeral Cloud, by default, databases are publicly accessible. To limit database access, you can configure Ephemeral Cloud with an IP allowlist for your organization. For more information, go to the Ephemeral documentation.
For a MySQL or PostgreSQL workspace, you can provide a customization file that helps to ensure that the temporary Ephemeral database is configured correctly.
To provide the customization details:
Toggle Use custom configuration to the on position.
In the text area, paste the contents of the customization file.
This is a composite generator.
A version of the JSON Mask generator that can be used for array values.
Runs a selected generator on values that match a user-specified JSONPath.
Consistency
Determined by the specified sub-generators.
Linking
Determined by the specified sub-generators.
Differential privacy
Determined by the specified sub-generators.
Data-free
Determined by the specified sub-generators.
Allowed for primary keys
No
Allowed for unique columns
No
Uses format-preserving encryption (FPE)
No
Privacy ranking
5
Generator ID (for the API)
To assign a generator to a path expression:
Under Sub-generators, click Add Generator. On the sub-generator configuration panel, the Cell JSON field contains a sample value from the source database. You can use the previous and next icons to page through different values.
In the Path Expression field, type the JSONPath expression to identify the value to apply the generator to. To populate a path expression, you can also click a value in the Cell JSON field. Matched JSON Values shows the result from the value in Cell JSON.
By default, the selected generator is applied to any value that matches the expression. To limit the types of values to apply the generator to, from the Type Filter, specify the applicable types. You can select Any, or you can select any combination of String, Number, and Null.
From the Generator Configuration dropdown list, select the generator to apply to the path expression. You cannot select another composite generator.
Configure the selected generator. You cannot configure the selected generator to be consistent with another column.
To save the configuration and immediately add a generator for another path expression, click Save and Add Another. To save the configuration and close the add generator panel, click Save.
From the Sub-Generators list:
To edit a generator assignment, click the edit icon.
To remove a generator assignment, click the delete icon.
To move a generator assignment up or down in the list, click the up or down arrow.
If Structural data encryption is enabled, then to use it for this column, in the advanced options section, toggle Use data encryption process to the on position.
This is a composite generator.
Masks text columns by parsing the values as rows whose columns are delimited by a specified character.
You can assign specific generators to specific indexes. You can also use the generator that is assigned to a specific index as the default. This applies the generator to every index that does not have an assigned generator.
The output value maintains the quotes around the index values.
For example, a column contains the following value:
"first","second","third"
You assign the Character Scramble generator to index 0 and assign Passthrough to index 2. You select index 0 as the index to use for the default generator.
In the output, the first and second values are masked by the Character Scramble generator. The third value is not masked. The output looks something like:
"wmcop", "xjorsl", "third"
Consistency
Determined by the selected sub-generators.
Linking
Determined by the selected sub-generators.
Differential privacy
Determined by the selected sub-generators.
Data-free
Determined by the selected sub-generators.
Allowed for primary keys
No
Allowed for unique columns
No
Uses format-preserving encryption (FPE)
No
Privacy ranking
5
Generator ID (for the API)
In the Delimiter field, type the delimiter that is used as a separator in the value.
For example, for the value "first","second","third", the delimiter is a comma.
You can configure a generator for any or all of the indexes. To add a sub-generator for an index:
Under Sub-Generators, click Add Generator. On the add generator dialog, the Cell CSV field contains a sample value from the source data. You can use the navigation icons to page through the values.
In the CSV Index field, type the index to assign a generator to. The index numbers start with 0. You cannot use an index that already has an assigned generator. Matched CSV values shows the value at that index for the current sample column value.
Under Generator Configuration, from the Select a Generator dropdown list, select the generator to use for the selected index. You cannot select another composite generator. To remove the selection, click the delete icon.
Configure the selected generator. You cannot configure the selected generator to be consistent with another column.
To save the configuration and immediately add a generator for another index, click Save and Add Another. To save the configuration and close the add generator panel, click Save.
From the Sub-Generators list:
To edit a generator assignment, click the edit icon.
To remove a generator assignment, click the delete icon.
To move a generator assignment up or down in the list, click the up or down arrow.
After you configure a generator for at least one index, the Default Link dropdown list is displayed.
From the Default Link dropdown list, select the index to use to determine how to mask values for indexes that do not have an assigned generator.
For example, you assign the Character Scramble generator to index 2. If you set Default Link to 2, then all indexes that do not have an assigned generator use the Character Scramble generator.
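The index assignment and Default Link behavior can be sketched in Python as follows. This is a rough illustration under stated assumptions, not the generator's actual code. `character_scramble` is a simplified stand-in for the Character Scramble generator, and `csv_mask` and its parameters are hypothetical names. The sketch also assumes that every value in the cell is quoted.

```python
import csv
import io
import random
import string

def character_scramble(value, rng):
    """Simplified stand-in: swap letters and digits for random ones, keep the rest."""
    out = []
    for ch in value:
        if ch.isalpha():
            out.append(rng.choice(string.ascii_lowercase))
        elif ch.isdigit():
            out.append(rng.choice(string.digits))
        else:
            out.append(ch)
    return "".join(out)

def passthrough(value, rng):
    return value

def csv_mask(cell, delimiter, generators, default_index, rng):
    """Illustrative only: apply per-index generators, with a default for the rest."""
    row = next(csv.reader(io.StringIO(cell), delimiter=delimiter))
    # The generator assigned to default_index is used for unassigned indexes.
    default = generators.get(default_index)
    masked = []
    for index, value in enumerate(row):
        generator = generators.get(index, default)
        masked.append(generator(value, rng) if generator else value)
    # Re-quote every value so that the output keeps the quotes.
    return delimiter.join('"' + v.replace('"', '""') + '"' for v in masked)

rng = random.Random(0)
assigned = {0: character_scramble, 2: passthrough}
print(csv_mask('"first","second","third"', ",", assigned, 0, rng))
```

Index 1 has no assigned generator, so it falls back to the generator for index 0. Index 2 is passed through unchanged.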
A version of the Categorical generator that selects from values that you provide instead of shuffling the original values.
Consistency
Yes, can be made self-consistent or consistent with another column.
Linking
Yes, can be linked.
Differential privacy
Yes, if consistency is not enabled.
Data-free
Yes, if consistency is not enabled.
Allowed for primary keys
No
Allowed for unique columns
No
Uses format-preserving encryption (FPE)
No
Privacy ranking
1 if not consistent
4 if consistent
Generator ID (for the API)
To configure the generator:
From the Link To dropdown list, select the columns to link this column to. You can only select other columns that use the Custom Categorical generator.
In the Custom Categories text area, enter the list of values that the generator can choose from.
Put each value on a separate line.
To add a NULL value to the list, use the keyword {NULL}.
Toggle the Consistency setting to indicate whether to make the column consistent. By default, consistency is disabled.
If you enable consistency, then by default the generator is self-consistent. To make the generator consistent with another column, from the Consistent to dropdown list, select the column. When a generator is self-consistent, then a given value in the source database is always mapped to the same value in the destination database. When a generator is consistent with another column, then a given source value in that column always results in the same value for the current column in the destination database. For example, a department column is consistent with a username column. For each instance of User1 in the source database, the value in the department column is the same.
If Structural data encryption is enabled, then to use it for this column, in the advanced options section, toggle Use data encryption process to the on position.
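The difference between a random pick and a consistent pick can be illustrated with a small Python sketch. The way that Structural implements consistency is not described here. This example uses a keyed hash (HMAC) as one possible way to get a stable mapping. The names `parse_categories` and `pick_category` and the `secret` value are hypothetical.

```python
import hashlib
import hmac
import random

def parse_categories(text):
    """One value per line. The keyword {NULL} becomes a NULL (None) entry."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    return [None if line == "{NULL}" else line for line in lines]

def pick_category(categories, source_value=None, secret=b"demo-secret", rng=None):
    """Illustrative only: random pick, or a stable pick keyed on a source value."""
    if source_value is None:
        # Consistency off: any value from the list, independent of the input.
        return (rng or random).choice(categories)
    # Consistency on: the same source value always selects the same category.
    digest = hmac.new(secret, str(source_value).encode("utf-8"), hashlib.sha256).digest()
    return categories[int.from_bytes(digest[:8], "big") % len(categories)]

departments = parse_categories("Sales\nEngineering\nSupport\n{NULL}")
# Self-consistent: key on the column's own source value.
print(pick_category(departments, source_value="Marketing"))
# Consistent with another column: key on that column's value, such as a username.
print(pick_category(departments, source_value="User1"))
print(pick_category(departments, source_value="User1"))  # same result as the line above
```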
Links columns in two tables. This column value is the sum of the values in a column in another table.
This generator does not provide a preview. The sums are not computed until the other table is generated.
For example, a Customers table contains a Total_Sales column. The Transactions table uses a foreign key Customer_ID column to identify the customer who made the transaction, and an Amount column that contains the amount of the sale. The Customer_ID value in the Transactions table is a value from the ID primary key column in the Customers table.
You assign the Cross Table Sum generator to the Total_Sales column. In the generator configuration, you indicate that the value is the sum of the Amount values for the Customer_ID value that matches the primary key ID value for the current row.
For the Customers row for ID 123, the Total_Sales column contains the sum of the Amount column for Transactions rows where Customer_ID is 123.
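The Customers and Transactions example can be written out as a short Python sketch. The function name `cross_table_sum` and its parameters are hypothetical, and the sample amounts are made up. The sketch writes 0 for a customer that has no transactions, which is an assumption. The behavior of the generator in that case is not stated here.

```python
from collections import defaultdict

def cross_table_sum(rows, primary_key, target, foreign_rows, foreign_key, sum_over):
    """Illustrative only: set `target` to the sum of `sum_over` per matching key."""
    totals = defaultdict(float)
    for foreign_row in foreign_rows:
        totals[foreign_row[foreign_key]] += foreign_row[sum_over]
    # Assumption: a row with no matching foreign rows gets 0.0.
    return [{**row, target: totals.get(row[primary_key], 0.0)} for row in rows]

customers = [{"ID": 123}, {"ID": 456}, {"ID": 789}]
transactions = [
    {"Customer_ID": 123, "Amount": 20.0},
    {"Customer_ID": 456, "Amount": 10.0},
    {"Customer_ID": 123, "Amount": 5.5},
]

result = cross_table_sum(customers, "ID", "Total_Sales", transactions, "Customer_ID", "Amount")
for row in result:
    print(row)
# {'ID': 123, 'Total_Sales': 25.5}
# {'ID': 456, 'Total_Sales': 10.0}
# {'ID': 789, 'Total_Sales': 0.0}
```

The four arguments after `rows` correspond to the Primary Key, Foreign Table, Foreign Key, and Sum Over settings in the generator configuration.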
Consistency
No, cannot be made consistent.
Linking
No, cannot be linked.
Differential privacy
No
Data-free
No
Allowed for primary keys
No
Allowed for unique columns
No
Uses format-preserving encryption (FPE)
No
Privacy ranking
3
Generator ID (for the API)
To configure the generator:
From the Foreign Table dropdown list, select the table that contains the column for which to sum the values.
From the Foreign Key dropdown list, select the foreign key. The foreign key identifies the row from the current table that is referred to in the foreign table.
From the Sum Over dropdown list, select the column for which to sum the values.
From the Primary Key dropdown list, select the primary key for the current table.
If Structural data encryption is enabled, then to use it for this column, in the advanced options section, toggle Use data encryption process to the on position.
Scrambles the characters in an email address. It preserves formatting and keeps the @ and . characters.
For example, for the following input value:
The output value would be something like:
By default, the generator scrambles the domain. You can configure the generator to not mask specific domains. You can also specify a domain to use for all of the output email addresses.
For example, if you configure the generator to not scramble the domain company.com, then the output for [email protected] would look something like:
This generator securely masks letters and numbers. There is no way to recover the original data.
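As a rough illustration of format-preserving scrambling, the following Python sketch replaces letters and digits and leaves the @ and . characters in place. It is not the generator's actual algorithm. The names `scramble_text` and `scramble_email` and the sample address are hypothetical, and the excluded domains option is not modeled.

```python
import random
import string

def scramble_text(text, rng):
    """Replace each letter or digit with a random one. Keep all other characters."""
    out = []
    for ch in text:
        if ch.isdigit():
            out.append(rng.choice(string.digits))
        elif ch.isalpha():
            out.append(rng.choice(string.ascii_lowercase))
        else:
            out.append(ch)  # keeps @, ., and other punctuation in position
    return "".join(out)

def scramble_email(email, rng, email_domain=None):
    """Illustrative only: scramble an address, optionally forcing one output domain."""
    local, at, _domain = email.partition("@")
    if at and email_domain:
        # Only the part before the @ is scrambled when a fixed domain is set.
        return scramble_text(local, rng) + "@" + email_domain.lstrip("@")
    return scramble_text(email, rng)

rng = random.Random(7)
print(scramble_email("jane.doe@example.org", rng))
print(scramble_email("jane.doe@example.org", rng, email_domain="@mycompany.com"))
```

Because the replacement characters are random and do not depend on a stored mapping, the original address cannot be derived from the output.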
If your email addresses include name values - for example, [email protected] - then you can use the Regex Mask generator to produce email addresses that are tied to name values in the same table. For information on how to do this, go to .
Consistency
Yes, can be made self-consistent.
Linking
No, cannot be linked.
Differential privacy
No
Data-free
No
Allowed for primary keys
No
Allowed for unique columns
No
Uses format-preserving encryption (FPE)
No
Privacy ranking
3 if not consistent
4 if consistent
Generator ID (for the API)
To configure the generator:
In the Email Domain field, enter a domain to use for all of the output values.
For example, use @mycompany.com for all of the generated values. The generator scrambles the content before the @.
In the Excluded Email Domains field, enter a comma-separated list of domains for which email addresses are not masked in the output values. This allows you, for example, to maintain internal or testing email addresses that are not considered sensitive.
Toggle the Replace invalid emails setting to indicate whether to replace an invalid email address with a generated valid email address. By default, invalid email addresses are not replaced. In the replacement values, the username is generated. If you specify a value for Email Domain, then the email addresses use that domain. Otherwise, the domain is generated.
Toggle the Consistency setting to indicate whether to make the column self-consistent. By default, consistency is disabled.
If Structural data encryption is enabled, then to use it for this column, in the advanced options section, toggle Use data encryption process to the on position.
Generates a random telephone number that matches the country or region of the input telephone number and preserves the input format. For example, (123) 456-7890 or 123-456-7890.
If the input is not a valid telephone number, the generator randomly replaces numeric characters. You can also replace invalid numbers with valid numbers.
By default, the numbers are United States telephone numbers.
If the input is a valid telephone number, or if you replace invalid numbers, then the generated numbers pass Google's libphonenumber verification.
Consistency
Yes, can be made self-consistent.
Linking
No, cannot be linked.
Differential privacy
No
Data-free
No
Allowed for primary keys
No
Allowed for unique columns
No
Uses format-preserving encryption (FPE)
No
Privacy ranking
3
Generator ID (for the API)
To configure the generator:
Toggle the Replace invalid numbers setting to indicate whether to replace invalid input values with a valid output value. By default, the generator does not replace invalid values. It randomly replaces numeric characters.
Toggle the Consistency setting to indicate whether to make the generator self-consistent. By default, consistency is disabled.
If Structural data encryption is enabled, then to use it for this column, in the advanced options section, toggle Use data encryption process to the on position.
This is a composite generator.
A version of the Regex Mask generator that can be used for array values.
Uses regular expressions to parse strings and replace specified substrings with the output of specified generators. The parts of the string to replace are specified inside unnamed top-level capture groups.
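The capture group replacement can be sketched with Python's `re` module. This is a minimal illustration, not the generator's implementation. The function name `regex_mask`, the sample pattern, and the lambda stand-ins for sub-generators are hypothetical. Nested groups inside a replaced group are skipped.

```python
import re

def regex_mask(value, pattern, generators, replace_all=True):
    """Illustrative only: replace capture groups with generator output.

    `generators` maps a capture group number (starting at 1) to a function.
    """
    regex = re.compile(pattern)
    pieces = []
    cursor = 0
    for match in regex.finditer(value):
        for group_index in range(1, regex.groups + 1):
            start, end = match.span(group_index)
            generator = generators.get(group_index)
            # Skip groups that did not participate, that have no generator,
            # or that fall inside a span that was already replaced.
            if generator is None or start == -1 or start < cursor:
                continue
            pieces.append(value[cursor:start])
            pieces.append(generator(match.group(group_index)))
            cursor = end
        if not replace_all:
            break  # only the first occurrence of the pattern is processed
    pieces.append(value[cursor:])
    return "".join(pieces)

generators = {1: lambda s: "9" * len(s), 2: str.lower}
text = "ID-123-ABC and ID-456-DEF"
print(regex_mask(text, r"ID-(\d+)-([A-Z]+)", generators))
# ID-999-abc and ID-999-def
print(regex_mask(text, r"ID-(\d+)-([A-Z]+)", generators, replace_all=False))
# ID-999-abc and ID-456-DEF
```

Text outside the capture groups, such as the `ID-` prefix, is copied to the output unchanged.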
Consistency
Determined by the selected sub-generators.
Linking
Determined by the selected sub-generators.
Differential privacy
Determined by the selected sub-generators.
Data-free
Determined by the selected sub-generators.
Allowed for primary keys
No
Allowed for unique columns
No
Uses format-preserving encryption (FPE)
No
Privacy ranking
5
Generator ID (for the API)
To add a regular expression:
Click Add Regex. On the configuration panel, Cell Value shows a sample value from the source database. You can use the previous and next options to navigate through the values.
By default, Replace all matches is enabled. To only match the first occurrence of a pattern, toggle Replace all matches to the off position.
In the Pattern field, enter a regular expression. If the expression is valid, then Structural displays the capture groups for the expression.
For each capture group, to select and configure the generator to apply, click the selected generator. You cannot select another composite generator.
To save the configuration and immediately add another regular expression, click Save and Add Another. To save the configuration and close the panel, click Save.
From the Regexes list:
To edit a regular expression, click the edit icon.
To remove a regular expression, click the delete icon.
Generates unique integer values. By default, the generated values are within the range of the column’s data type.
You can also specify a range for the generated values. The source values must be within that range.
This generator cannot be used to transform negative numbers.
Consistency
Yes, can be made self-consistent.
Linking
No, cannot be linked.
Differential privacy
Yes, if consistency is not enabled.
Data-free
Yes, if consistency is not enabled.
Allowed for primary keys
Yes
Allowed for unique columns
Yes
Uses format-preserving encryption (FPE)
Yes
Privacy ranking
1 if not consistent
4 if consistent
Generator ID (for the API)
To configure the generator:
In the Minimum field, enter the minimum value to use for an output value. The minimum value cannot be larger than any of the values in the source data.
In the Maximum field, enter the maximum value to use for an output value. The maximum value cannot be smaller than any of the values in the source data.
Toggle the Consistency setting to indicate whether to make the column self-consistent. By default, consistency is disabled.
If Structural data encryption is enabled, then to use it for this column, in the advanced options section, toggle Use data encryption process to the on position.
This is a composite generator.
Runs selected generators on specified key values in an HStore column in a PostgreSQL database. HStore columns contain a set of key-value pairs.
Consistency
Determined by the selected sub-generators.
Linking
Determined by the selected sub-generators.
Differential privacy
Determined by the selected sub-generators.
Data-free
Determined by the selected sub-generators.
Allowed for primary keys
No
Allowed for unique columns
No
Uses format-preserving encryption (FPE)
No
Privacy ranking
5
Generator ID (for the API)
To assign a generator to a key:
Under Sub-generators, click Add Generator. On the sub-generator configuration panel, the Cell HStore field contains a sample value from the source database. You can use the previous and next icons to page through different values.
Under Enter a key, enter the name of a key from the column value.
For example, for the column value:
 "pages"=>"446", "title"=>"The Iliad", "category"=>"mythology"
To apply a generator to the title, you would enter title as the key.
Matched HStore Values shows the result from the value in Cell HStore.
From the Generator Configuration dropdown list, select the generator to apply to the key value. You cannot select another composite generator.
Configure the selected generator. You cannot configure the selected generator to be consistent with another column.
To save the configuration and immediately add a generator for another key, click Save and Add Another. To save the configuration and close the add generator panel, click Save.
From the Sub-Generators list:
To edit a generator assignment, click the edit icon.
To remove a generator assignment, click the delete icon.
To move a generator assignment up or down in the list, click the up or down arrow.
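To show what applying a generator to a single HStore key looks like, here is a minimal Python sketch that works on the text form of the value. It is an illustration only. The function name `hstore_mask` and the lambda stand-in for a sub-generator are hypothetical. The sketch handles only quoted keys and values, leaves unquoted NULL values untouched, and does not escape the generator output.

```python
import re

# Matches "key"=>"value" pairs where both sides are double-quoted.
PAIR = re.compile(r'"((?:[^"\\]|\\.)*)"\s*=>\s*"((?:[^"\\]|\\.)*)"')

def hstore_mask(cell, generators):
    """Illustrative only: run a generator on the values of the selected keys."""
    def replace(match):
        key, value = match.group(1), match.group(2)
        generator = generators.get(key)
        if generator is None:
            return match.group(0)  # keys without a generator are left as they are
        return '"%s"=>"%s"' % (key, generator(value))
    return PAIR.sub(replace, cell)

cell = '"pages"=>"446", "title"=>"The Iliad", "category"=>"mythology"'
print(hstore_mask(cell, {"title": lambda v: "x" * len(v)}))
# "pages"=>"446", "title"=>"xxxxxxxxx", "category"=>"mythology"
```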
The Tonic Structural platform creates safe, realistic datasets to use in staging environments or for local development. The Structural web application and API can be used by engineers, data analysts, or security experts.
Structural connects to source databases that contain sensitive data such as personally identifiable information (PII) or protected health information (PHI). To protect that data, Structural transforms the sensitive values and then writes the transformed data to a destination location.
New to Structural? Review the Tonic Structural workflow overview. For information on how to create a Structural account and start a Structural free trial, go to Getting started with the Structural free trial.
Want to know what's in the latest Structural releases? Go to the Tonic Structural release notes.
The Structural application heading includes a feature updates icon, which displays a summary of the newest features, and includes a link to the Structural release notes.
Need help with Structural? Contact [email protected].
For an individual column in Database View, you can configure the assigned generator and determine the column sensitivity.
From the column list, to display the generator configuration panel, in the Applied Generator column, click the generator name tag.
The Structural sensitivity scan provides an initial indication of whether a column is sensitive and, if it is sensitive:
The type of sensitive data that is in the column.
The confidence level of the sensitivity detection.
For more information, go to .
In a child workspace, you cannot configure whether a column is sensitive. A child workspace always inherits the sensitivity designations from its parent workspace.
From the Status column, to confirm or change the column sensitivity, click the Status value.
The status panel indicates whether the column is sensitive. It identifies the sensitivity type, and indicates how the sensitivity was determined - by a sensitivity scan or by a user.
For a column that matches a built-in sensitivity type, the first time that you display the panel, the Sensitive data? setting displays Yes and No options for you to confirm or change the sensitivity.
To indicate that the column is sensitive, click Yes.
To indicate that the column is not sensitive, click No.
When you click Yes or No, the Yes and No options change to a simple toggle. When you click Yes, the sensitivity confidence level changes to full.
After that:
To indicate that the column is sensitive, toggle Sensitive data? to the on position.
To indicate that the column is not sensitive, toggle Sensitive data? to the off position.
When a column matches a sensitivity rule, the sensitivity panel indicates that the column matched a sensitivity rule.
You use the Sensitive data? toggle to indicate whether the column is actually sensitive.
When a column does not match a built-in sensitivity type or a custom sensitivity rule, the sensitivity panel indicates that the column is not sensitive.
The Sensitive data? setting displays Yes and No options for you to confirm or change the sensitivity.
To indicate that the column is sensitive, click Yes.
To confirm that the column is not sensitive, click No.
When you click Yes or No, the Yes and No options change to a simple toggle.
If you click Yes:
The panel updates to indicate that a user confirmed that the column is sensitive.
The sensitivity confidence level is set to full confidence.
After that:
To indicate that the column is sensitive, toggle Sensitive data? to the on position.
To indicate that the column is not sensitive, toggle Sensitive data? to the off position.
To configure the sensitivity, you can also use the Sensitive Data toggle on the column configuration panel.
To indicate that a column is sensitive, toggle the sensitivity setting to the on position.
To indicate that the column is not sensitive, toggle the sensitivity setting to the off position.
When you change the sensitivity from the generator configuration panel, the Sensitive data? setting on the sensitivity panel also changes from the Yes and No options to the toggle.
When a sensitivity scan identifies a column, Structural recommends a generator for the column. For example, when the sensitivity scan identifies a column as a first name, Structural recommends the Name generator configured to generate a first name value.
In the Assigned Generator column on Database View, columns that do not have an assigned generator, and that have a recommended generator, display the available recommendation icon.
When you click the generator dropdown, the column configuration panel includes the following information:
The sensitivity confidence level.
The recommended generator.
Sample source and destination values based on the recommended generator.
From the panel, you choose whether to assign or ignore the recommended generator for that type.
To assign the recommended generator, click Apply.
To ignore the recommendation, click Ignore. Structural clears the recommendation.
To change the generator that is assigned to a selected column:
Click the generator name tag for the column.
To assign a different generator to the column, from the Generator Type dropdown list, select the generator.
Configure the generator options.
To reset an assigned generator to Passthrough, which indicates to not transform the data:
Click the generator name tag.
On the generator configuration panel, click the delete icon next to the generator dropdown.
For details about the configuration options for each generator, go to the .
For more information about selecting and configuring generators and generator presets, go to .
For a JSON column, instead of assigning a generator, you can enable Document View.
From Document View, you can view the JSON schema structure and assign generators to individual JSON fields. For more information, go to .
To enable Document View, on the column configuration panel, toggle Use Document View to the on position. Note that if you have , or enabled , then the Use Document View toggle is in the advanced options.
When Document View is enabled, the generator dropdown is replaced with the Open in Document View option.
The Structural sensitivity scan identifies sensitive columns in source data. The scan ignores truncated tables.
For most data connectors, Structural runs sensitivity scans automatically based on specific events. You can also run manual sensitivity scans on demand.
On a self-hosted instance, sensitivity scans can also run automatically at the same time each day.
Structural automatically runs a sensitivity scan when you:
Create a completely new workspace and connect a data source.
Change the data connection details for the source database.
Add a file group to a file connector workspace.
A child workspace always inherits the sensitivity designations from its parent workspace.
When you copy a workspace, Structural runs a new sensitivity scan on the copy to identify sensitive columns. However, it keeps the sensitivity designation for columns that you specifically marked as sensitive or not sensitive.
In addition to the automatic scans, from Privacy Hub, you can run a manual sensitivity scan.
On self-hosted instances, Structural can also run scheduled daily sensitivity scans in the background.
The daily scans only run on the 10 workspaces that had the most recent activity. Activity includes:
User-initiated updates that are included in the .
Data generation jobs.
By default, Structural runs the sensitivity scans each day at midnight.
To enable and configure the daily sensitivity scans, use the following environment settings. You can add these settings to the Environment Settings list on Structural Settings.
TONIC_ENABLE_SCHEDULED_SENSITIVITY_SCAN - Boolean to indicate whether to enable the scheduled daily sensitivity scans.
The default value is true. To disable the scheduled daily scan, set this to false.
TONIC_SENSITIVITY_SCAN_HOUR - When scheduled scans are enabled, the hour at which to run the scans. The setting uses the local time zone.
The value is an integer between 0 and 23, where 0 is midnight and 23 is 11:00 PM.
For example, a value of 14 indicates to run the job at 2:00 PM.
The default value is 0.
TONIC_PII_SCAN_MAX_TIMEOUT_IN_MINUTES_IF_AUTOMATIC - The number of minutes after which a scheduled scan times out.
By default, the scan times out after 3 minutes.
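As a rough illustration of how the hour setting maps to a clock time, here is a minimal Python sketch. It only reads the two setting names and defaults that are described above. The function name and the way the Boolean text is parsed are assumptions for this example, not part of Structural.

```python
import os


def scheduled_scan_time(env=os.environ):
    """Describe when the daily scan would run, or return None if it is disabled.

    Hypothetical helper: it only mirrors the defaults described in the text
    (enabled by default, hour 0 = midnight, local time zone).
    """
    # Assumption: the Boolean setting is written as the text "true" or "false".
    enabled = env.get("TONIC_ENABLE_SCHEDULED_SENSITIVITY_SCAN", "true").strip().lower() == "true"
    if not enabled:
        return None

    hour = int(env.get("TONIC_SENSITIVITY_SCAN_HOUR", "0"))
    if not 0 <= hour <= 23:
        raise ValueError("TONIC_SENSITIVITY_SCAN_HOUR must be an integer between 0 and 23")

    suffix = "AM" if hour < 12 else "PM"
    hour_12 = hour % 12 or 12  # 0 -> 12 AM, 13 -> 1 PM
    return f"{hour_12}:00 {suffix}"


print(scheduled_scan_time({"TONIC_SENSITIVITY_SCAN_HOUR": "14"}))
```

Passing a plain dictionary instead of the real environment makes it easy to check a value before you add it to the Environment Settings list.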
For improved performance, sensitivity scans can use parallel processing.
For relational databases such as PostgreSQL and SQL Server, to configure parallel processing, you use the environment setting TONIC_PII_SCAN_PARALLELISM_RDBMS. The default value is 4.
For document-based databases such as MongoDB, you use the environment setting TONIC_PII_SCAN_PARALLELISM_DOCUMENTDB. The default value is 1.
The Structural sensitivity scan uses the following rules and processes to:
Identify sensitive columns.
Recommend generators for those columns. For information about applying recommended generators to columns, go to .
Indicate its confidence that an identified column is sensitive and is of the detected sensitivity type.
Note that this process cannot guarantee perfect precision and recall. We strongly recommend that a human reviews the sensitivity scan results and the broader dataset to ensure that nothing sensitive was missed.
To identify that a column contains sensitive information for a built-in sensitivity type, Structural looks at the data type, column name, and column values.
This part of the sensitivity scan uses regular expression matching and dictionary lookups. It produces high, medium, or low confidence detections.
When this part of the sensitivity scan determines that a column contains sensitive data, it:
Marks the column as sensitive.
Assigns the sensitivity type to the column.
Recommends the generator configuration for the identified sensitivity type. Note that if the recommended generator is not compatible with the column, then Structural discards the recommendation.
Marks the sensitivity detection as high, medium, or low confidence. The confidence level is based on a calculation of how well the column matched the applicable rules.
The sensitivity scan also looks for any columns that match custom sensitivity types that you define in your custom sensitivity rules.
Custom sensitivity rules are based on the column data type and column name. For more information about custom sensitivity rules, go to .
Custom sensitivity rules always produce full confidence detections.
When a column matches a custom sensitivity rule, Structural:
Marks the column as sensitive.
Assigns the sensitivity rule name as the sensitivity type.
Recommends the generator preset from the sensitivity rule.
Marks the sensitivity detection as full confidence.
To identify additional sensitive columns that might not be captured by the other parts of the scan, the sensitivity scan uses an artificial intelligence (AI) model. Note that the model is pre-trained. Structural does not use customer data to train the model, and it does not send any customer data externally.
This part of the scan produces medium or low confidence detections for built-in entity types.
The model considers the table and column name. If the combination of table and column name is similar in meaning to a sensitivity type that Structural has a recommended generator for, then Structural:
Marks the column as sensitive.
Assigns the sensitivity type to the column.
Recommends the generator configuration for that sensitivity type.
Uses AI to compare the table name and column name combination to the sensitivity type, and produces a semantic similarity score.
Based on the semantic similarity score, marks the sensitivity detection as either medium or low confidence.
To download the log of the most recent sensitivity scan, either:
On the workspace management view, from the download menu, select Download Sensitivity Scan Log.
On Privacy Hub, click Reports and Logs, then select Scan Log.
The log tracks the progress of the scan.
The table list at the left of Database View contains the list of tables in the source database. You can filter the table list and assign table modes to the tables.
The table list is grouped by schema. You can expand and collapse the list of tables in each schema. This does not affect the displayed columns.
For a file connector workspace, each table corresponds to a file group.
For each table, the table list includes the following information:
The name of the table.
The number of columns that have an assigned generator (a generator other than Passthrough). The number does not display if none of the table columns has an assigned generator.
The assigned table mode. The table list only shows the first letter of the table mode:
D = De-identify
S = Scale
T = Truncate
P = Preserve Destination
I = Incremental
For a child workspace, if the selected table mode overrides the parent workspace configuration, then the override icon displays.
To display for a table, click the arrow icon to the right of the table entry.
You can filter the table list by table name and by assigned table mode. You can also filter the tables based on whether they have assigned generators.
As you filter the table list, the column list also is filtered to only include the columns for the filtered tables.
To filter the table list by name, in the filter field, begin to type text that is in the table name.
As you type, Tonic Structural filters the list to only display tables with names that contain the filter text.
To filter the table list based on the assigned table mode:
Click Filters.
On the filter panel, check the checkbox next to each table mode to include. By default, the list includes all of the table modes. As you check and uncheck the table mode checkboxes, Structural adds and removes the associated tables from the list.
You can filter the table list to only display tables that have no assigned generators:
Click Filters.
On the filter panel, to only show tables that do not have assigned generators, check the No Generators Applied checkbox.
The table mode determines the number of rows and columns in the destination database. For details about the available table modes and how they work, go to .
To change the assigned table mode for a single table:
Click the table mode dropdown next to the table name.
From the table mode dropdown list, select the table mode.
For a child workspace, the table mode selection panel indicates whether the selected table mode is inherited from the parent workspace. If the child workspace currently overrides the parent workspace configuration, then to reset the table mode to the table mode that is assigned in the parent workspace, click Reset.
To change the assigned table mode for multiple tables:
Check the checkbox for each table to change the table mode for. To select a continuous range of tables, click the first table in the range, then Shift-click the last table in the range. To select all of the tables in a schema, click the schema name.
Click Bulk Edit.
On the panel, click the radio button for the table mode to assign to the selected tables.
Generates a random mailing address-like string.
You can indicate which part of an address string that the column contains. For example, the column might contain only the street address or the city, or it might contain the full address.
To configure the generator:
From the Link To dropdown list, select the columns to link this column to. You can link columns that use the Address generator to mask one of the following address components:
City
City State
Country
Country Code
State
State Abbreviation
Zip Code
Latitude
Longitude
Note that when linked to another address column, a country or country code is always the United States.
From the address component dropdown list, select the address component that this column contains. The available options are:
Building Number
Cardinal Direction (North, South, East, West)
City
City Prefix (Examples: North, South, East, West, Port, New)
City Suffix (Examples: land, ville, furt, town)
City with State (Example: Spokane, Washington)
City with State Abbr (Example: Houston, TX)
Country (Examples: Spain, Canada)
Country Code (Uses the 2-character country code. Examples: ES, CA)
County
Direction (Examples: North, Northeast, Southwest, East)
Full Address
Latitude (Examples: 33.51, 41.32)
Longitude (Examples: -84.05, -74.21)
Ordinal Direction (Examples: Northeast, Southwest)
Secondary Address (Examples: Apt 123, Suite 530)
State (Examples: Alabama, Wisconsin)
State Abbr (Examples: AL, WI)
Street Address (Example: 123 Main Street)
Street Name (Examples: Broad, Elm)
Street Suffix (Examples: Way, Hill, Drive)
US Address
US Address with Country
Zip Code (Example: 12345)
Toggle the Consistency setting to indicate whether to make the column consistent. By default, the consistency is disabled.
If consistency is enabled, then by default, the generator is self-consistent. To make the generator consistent with another column, from the Consistent to dropdown list, select the column.
When the Address generator is consistent with itself, then the same value in the source database is always mapped to the same destination value. For example, for a column that contains a state name, Alabama is always mapped to Illinois.
When the Address generator is consistent with another column, then the same value in the other column always results in the same destination value for the address column. For example, if the address column is consistent with a name column, then every instance of John Smith in the name column in the source database has the same address value in the destination database.
If data encryption is enabled, then to use it for this column, in the advanced options section, toggle Use data encryption process to the on position.
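The difference between self-consistency and consistency with another column can be illustrated with a small Python sketch. This is not how Structural implements consistency. It is one common approach, keyed hashing, shown under the assumption that a secret key and a fixed pool of replacement values are available. The names consistent_pick, STATES, and the demo secret are hypothetical.

```python
import hashlib
import hmac

# Illustrative pool of replacement values (hypothetical).
STATES = ["Alabama", "Illinois", "Ohio", "Texas", "Wisconsin"]


def consistent_pick(key, options, secret=b"demo-secret"):
    """Deterministically choose a replacement value for a given key."""
    digest = hmac.new(secret, key.encode("utf-8"), hashlib.sha256).digest()
    index = int.from_bytes(digest[:8], "big") % len(options)
    return options[index]


# Self-consistent: the key is the column's own source value,
# so the same source state always produces the same output state.
first = consistent_pick("Alabama", STATES)
second = consistent_pick("Alabama", STATES)
print(first == second)

# Consistent with another column: the key is the value of the other column,
# so every row for the same name receives the same replacement value.
print(consistent_pick("John Smith", STATES) == consistent_pick("John Smith", STATES))
```

In both cases the only thing that changes is which value is used as the key.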
For the Address generator, Spark workspaces (Amazon EMR, Databricks, and self-managed Spark clusters) only support the following address parts:
Building Number
City
Country
Country Code
Full Address
Latitude
Longitude
State
State Abbr
Street Address
Street Name
Street Suffix
US Address
US Address with Country
Zip Code
This generator can be used to generate cities, states, and zip codes that follow HIPAA guidelines for safe harbor.
How the HIPAA Address generator handles zip codes is based on whether the Replace zeros in truncated Zip Code toggle in the generator configuration is off or on.
By default, the setting is off. In this case, the last two digits of the zip code in the column are replaced with zeros, unless the zip code is a low population area as designated by the current census. For a low population area, all of the digits in the zip code are replaced with zeros.
If the setting is on, then the generator selects a real zip code that starts with the same three digits as the original zip code. For a low population area, if a state is linked, then the generator selects a random zip code from within that state. Otherwise the generator selects a random zip code from the United States.
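The default behavior (toggle off) can be sketched in a few lines of Python. The function name is hypothetical, and the set of low population 3-digit prefixes is passed in by the caller because the actual list comes from census data and is not reproduced here. The sketch assumes 5-digit zip codes.

```python
def truncate_zip(zip_code, low_population_prefixes):
    """Truncate a 5-digit zip code when Replace zeros in truncated Zip Code is off.

    low_population_prefixes is a caller-supplied set of 3-digit prefixes
    (an assumption for this sketch; the real list is census-based).
    """
    if len(zip_code) != 5 or not zip_code.isdigit():
        raise ValueError("expected a 5-digit zip code")

    prefix = zip_code[:3]
    if prefix in low_population_prefixes:
        # Low population area: every digit is replaced with zero.
        return "00000"
    # Otherwise keep the first three digits and zero out the last two.
    return prefix + "00"


print(truncate_zip("30305", low_population_prefixes=set()))
```

The prefix "999" in the following check is a made-up placeholder, not a real low population prefix: truncate_zip("99999", {"999"}) returns "00000".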
When a zip code column is not linked, a random city is chosen in the United States. When a zip code is already added to the link, a city is chosen at random that has at least some overlap with the zip code.
If the original zip code is designated as a low population area, then a random city is chosen within the state. This is done only if the user has linked a State column. If they have not, a random city within the United States is chosen.
For example, if the original city and zip code are (Atlanta, 30305), the zip code would be replaced with 30300. Many cities contain zip codes that begin in 303, such as Atlanta, Decatur, Chamblee, Hapeville, Dunwoody, and College Park. One of these cities is chosen at random so that, for example, the final value is (Chamblee, 30300).
HIPAA guidelines allow for information at the state level to be kept. Therefore, these values are passed through.
GPS coordinates are randomly generated in descending order of dependence of the linked HIPAA address components:
If a zip code is linked, a random point within the same 3-digit zip code prefix is generated, if the 3-digit zip code prefix is not designated a low population area. If it is a low population area, use the linked state.
If a state is available and a zip code and city are not, or the zip code or city are in a 3-digit zip code prefix that is designated a low population area, then a random GPS coordinate is generated somewhere within the state.
If no zip code, city, or state is linked, or one or more of them were provided, but there was a problem generating a random GPS coordinate within the linked areas, then a GPS coordinate is generated at a random location within the United States.
Note: If the city component of the HIPAA address is linked with latitude and/or longitude, the GPS coordinate components are randomly generated independently of the city.
All other address parts are generated randomly. The output value is not influenced at all by the underlying value in the column.
To configure the generator:
From the Link To dropdown list, select the other columns to link to. You can only select columns that are also assigned the HIPAA Address generator.
From the address part dropdown list, select the type of address value that is in the column.
Toggle the Replace zeros in truncated Zip Code setting to indicate how to generate zip codes. If the setting is off, then the last two digits are replaced with zeros. For low population areas, the entire zip code is populated with zeros. If the setting is on, then a real zip code is selected that starts with the first three digits of the original zip code. For low population areas, if a state is linked, a random zip code from the state is used. Otherwise, a random zip code from the United States is used.
Toggle the Consistency setting to indicate whether to make the column self-consistent. By default, consistency is disabled.
If data encryption is enabled, then to use it for this column, in the advanced options section, toggle Use data encryption process to the on position.
For the HIPAA Address generator, Spark workspaces (Amazon EMR, Databricks, and self-managed Spark clusters) only support the following address parts:
City
City with State
City with State Abbr
State
State Abbr
US Address
US Address with Country
Zip Code
The provides support for additional address parts in Spark workspaces.
This is a composite generator.
Masks text columns by parsing the contents as HTML, and applying sub-generators to specified path expressions.
If applying a sub-generator fails because of an error, the generator selected as the fallback generator is applied instead.
Path expressions are defined using the XPath syntax.
For example, for the following HTML:
To get the value of h1, the expression is //h1/text().
To get the value of the first list item, the expression is //ul/li[1]/text().
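To experiment with path expressions outside of Structural, the following Python sketch evaluates the two expressions against a small well-formed sample page. It uses the standard library xml.etree.ElementTree module, which supports only a subset of XPath and does not support the text() function, so the sketch selects the element and then reads its text. The sample markup and variable names are assumptions for this example.

```python
import xml.etree.ElementTree as ET

# Sample page (must be well-formed for this parser).
SAMPLE_HTML = """<html>
<body>
  <div class="container">
    <h1>Title</h1>
    <p>Paragraph content</p>
    <ul>
      <li>Item 1</li>
      <li>Item 2</li>
      <li>Item 3</li>
    </ul>
  </div>
</body>
</html>"""

root = ET.fromstring(SAMPLE_HTML)

# Equivalent of //h1/text()
heading = root.find(".//h1").text

# Equivalent of //ul/li[1]/text() (positions are 1-based)
first_item = root.find(".//ul/li[1]").text

print(heading)     # Title
print(first_item)  # Item 1
```

Real-world HTML is often not well-formed XML. In that case a dedicated HTML parser with full XPath support is a better fit than this sketch.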
To assign a generator to a path expression:
Under Sub-generators, click Add Generator. On the sub-generator configuration panel, the Cell HTML field contains a sample value from the source database. You can use the previous and next icons to page through different values.
In the Path Expression field, type the path expression to identify the value to apply the generator to. Matched HTML Values shows the result from the value in Cell HTML.
From the Generator Configuration dropdown list, select the generator to apply to the path expression. You cannot select another composite generator.
Configure the selected generator. You cannot configure the selected generator to be consistent with another column.
To save the configuration and immediately add a generator for another path expression, click Save and Add Another. To save the configuration and close the add generator panel, click Save.
From the Sub-Generators list:
To edit a generator assignment, click the edit icon.
To remove a generator assignment, click the delete icon.
To move a generator assignment up or down in the list, click the up or down arrow.
From the Fallback Generator dropdown list, select the generator to use if the assigned generator for a path expression fails.
The options are:
Truncates a date value or a timestamp to a specific part.
For a date or a timestamp, you can truncate to the year, month, or day.
For a timestamp, you can also truncate to the hour, minute, or second.
To configure the generator:
From the dropdown list, select the part of the date or timestamp to truncate to.
For both date and timestamp values, you can truncate to the year, month, or day. When you select one of these options, the time portion of a timestamp is set to 00:00:00. For the date, the values below the selected truncation value are set to 01. For example, when you truncate to month, the day value is set to 01, and the time portion of a timestamp is set to 00:00:00.
For a timestamp value, you can also truncate to the hour, minute, or second. The date values remain the same as the original data. The time values below the selected truncation value are set to 00. For example, when you truncate to minute, the seconds value is set to 00.
Toggle the Birth Date option. When you enable Birth Date, the generator shifts dates that are more than 90 years before the generation date to the date exactly 90 years before the generation date. For example, data generation occurs on January 1, 2023. Any date that occurs before January 1, 1933 is changed to January 1, 1933.
This is mostly intended for birthdate values, to group birthdates for everyone who is older than 89 into a single year. This is used to comply with HIPAA Safe Harbor.
If data encryption is enabled, then to use it for this column, in the advanced options section, toggle Use data encryption process to the on position.
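The Birth Date option described in the steps above amounts to clamping old dates to a cutoff. The following Python sketch shows that rule. The function name is hypothetical, and the handling of a February 29 generation date is an assumption, because the text does not cover that case.

```python
from datetime import date


def cap_birth_date(value, generation_date):
    """Shift any date more than 90 years before generation_date to the cutoff."""
    try:
        cutoff = generation_date.replace(year=generation_date.year - 90)
    except ValueError:
        # Assumption: if the generation date is February 29 and the cutoff year
        # is not a leap year, use February 28 as the cutoff.
        cutoff = generation_date.replace(year=generation_date.year - 90, day=28)
    return max(value, cutoff)


generation = date(2023, 1, 1)
print(cap_birth_date(date(1920, 5, 17), generation))  # 1933-01-01
print(cap_birth_date(date(1980, 3, 2), generation))   # 1980-03-02
```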
Here are examples of date and time values and how the selected truncation affects the output:
Masks values in numeric columns. Adds or multiplies the original value by random noise.
The additive noise generator draws noise from an interval around 0 that is scaled to the magnitude of the original value. For example, the default scale is 10% of the underlying value. The larger the value, the larger the amount of noise that is added.
The multiplicative noise generator multiplies the original value by a random scaling factor that falls within a specified range.
You can use either the additive noise generator or the multiplicative noise generator, then set the other generator settings.
To use the additive noise generator:
From the dropdown list, choose Additive.
In the Relative noise scale field, type the percentage of the underlying value to scale the noise to. The default value is 10.
In the decimal places field, set the number of decimal places to use. The default value is 2.
Tonic samples the additive noise from the range [-(scale/100) * |value|, (scale/100) * |value|), where scale is the noise scale, and value is the original data value.
The lower value of the range is inclusive, and the upper value of the range is exclusive.
For example, for the default noise scale of 10, and a data value of 20, the additive noise range would be [-.1 * 20, .1 * 20). In other words, between -2 (inclusive) and 2 (exclusive).
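The range calculation can be sketched in Python as follows. The text does not state how the noise is distributed within the range, so uniform sampling is an assumption here, as are the function and parameter names.

```python
import random


def additive_noise(value, scale=10, decimals=2, rng=random):
    """Add noise drawn from [-(scale/100) * |value|, (scale/100) * |value|).

    Assumption: the noise is sampled uniformly. rng can be the random module
    or a random.Random instance (useful for reproducible tests).
    """
    bound = (scale / 100) * abs(value)
    # rng.random() is in [0, 1), so the noise is in [-bound, bound).
    noise = -bound + 2 * bound * rng.random()
    return round(value + noise, decimals)


# With the default scale of 10, a value of 20 lands between 18 and 22.
print(additive_noise(20, rng=random.Random(42)))
```

Because the result is rounded to the configured number of decimal places, a draw very close to the exclusive upper bound can round up to it.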
To use the multiplicative noise generator:
From the dropdown list, choose Multiplicative.
In the Min field, type the minimum value for the scaling factor. The minimum value is inclusive. The default value is 0.5.
In the Max field, type the maximum value for the scaling factor. The maximum value is exclusive. The default value is 5.
In the decimal places field, set the number of decimal places to use. The default value is 2.
Tonic multiplies the original value by a scaling factor from the range [min, max), where min is the minimum scaling factor, and max is the maximum scaling factor.
For example, for the default values of 0.5 and 5, Tonic multiplies the original data value by a value from between 0.5 (inclusive) and 5 (exclusive).
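A comparable Python sketch for the multiplicative case is shown below. As with the additive example, uniform sampling of the scaling factor is an assumption, and the function and parameter names are hypothetical.

```python
import random


def multiplicative_noise(value, min_factor=0.5, max_factor=5.0, decimals=2, rng=random):
    """Multiply value by a factor drawn from [min_factor, max_factor).

    Assumption: the factor is sampled uniformly within the range.
    """
    if not min_factor < max_factor:
        raise ValueError("min_factor must be less than max_factor")
    factor = min_factor + (max_factor - min_factor) * rng.random()
    return round(value * factor, decimals)


# With the defaults, a value of 100 lands between 50 and 500.
print(multiplicative_noise(100, rng=random.Random(42)))
```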
To configure the generator consistency and data encryption:
Toggle the Consistency setting to indicate whether to make the column consistent. By default, the consistency is disabled.
If you enable consistency, then by default the generator is self-consistent. To make the generator consistent with another column, from the Consistent to dropdown list, select the column.
If the generator is self-consistent, then a given value in the source database is masked in exactly the same way to produce the value in the destination database.
If the generator is consistent with another column, then for a given value in that other column, the column that is assigned the Noise generator is always masked in exactly the same way in the destination database. For example, a field containing a salary value is assigned the Noise Generator and is consistent with the username field. For each instance of User1, the Noise Generator masks the salary value in exactly the same way.
If data encryption is enabled, then to use it for this column, in the advanced options section, toggle Use data encryption process to the on position.
Consistency: Yes, can be made self-consistent or consistent with another column.
Linking: Yes, can be linked.
Differential privacy: Yes, if consistency is not enabled.
Data-free: Yes, if consistency is not enabled.
Allowed for primary keys: No
Allowed for unique columns: No
Uses format-preserving encryption (FPE): No
Privacy ranking: 1 if not consistent, 4 if consistent
Generator ID (for the API)
Consistency: Yes, can be made self-consistent.
Linking: Yes, can be linked.
Differential privacy: No
Data-free: No
Allowed for primary keys: No
Allowed for unique columns: No
Uses format-preserving encryption (FPE): No
Privacy ranking: 3 if not consistent, 4 if consistent
Generator ID (for the API)
<html>
<body>
  <div class="container">
    <h1>Title</h1>
    <p>Paragraph content</p>
    <ul>
      <li>Item 1</li>
      <li>Item 2</li>
      <li>Item 3</li>
    </ul>
  </div>
</body>
</html>
Consistency: Determined by the selected sub-generators.
Linking: Determined by the selected sub-generators.
Differential privacy: Determined by the selected sub-generators.
Data-free: Determined by the selected sub-generators.
Allowed for primary keys: No
Allowed for unique columns: No
Uses format-preserving encryption (FPE): No
Privacy ranking: 5
Generator ID (for the API)
Consistency: No, cannot be made consistent.
Linking: No, cannot be linked.
Differential privacy: No
Data-free: No
Allowed for primary keys: No
Allowed for unique columns: No
Uses format-preserving encryption (FPE): No
Privacy ranking: 5
Generator ID (for the API)
Original value: 2021-12-20 (date), 2021-12-20 13:42:55 (timestamp)
Truncate to year: 2021-01-01 (date), 2021-01-01 00:00:00 (timestamp)
Truncate to month: 2021-12-01 (date), 2021-12-01 00:00:00 (timestamp)
Truncate to day: 2021-12-20 (date), 2021-12-20 00:00:00 (timestamp)
Truncate to hour: Not applicable (date), 2021-12-20 13:00:00 (timestamp)
Truncate to minute: Not applicable (date), 2021-12-20 13:42:00 (timestamp)
Truncate to second: Not applicable (date), 2021-12-20 13:42:55 (timestamp)
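The truncation examples can be reproduced with a short Python sketch that uses the standard datetime module. The function name and the lookup table are hypothetical. The sketch only illustrates the rule that every part below the selected one is reset to its lowest value.

```python
from datetime import datetime

# For each truncation part, the fields below it and the values to reset them to.
_RESET = {
    "year": dict(month=1, day=1, hour=0, minute=0, second=0, microsecond=0),
    "month": dict(day=1, hour=0, minute=0, second=0, microsecond=0),
    "day": dict(hour=0, minute=0, second=0, microsecond=0),
    "hour": dict(minute=0, second=0, microsecond=0),
    "minute": dict(second=0, microsecond=0),
    "second": dict(microsecond=0),
}


def truncate_timestamp(value, part):
    """Truncate a datetime to the given part (year, month, day, hour, minute, second)."""
    return value.replace(**_RESET[part])


original = datetime(2021, 12, 20, 13, 42, 55)
for part in _RESET:
    print(part, truncate_timestamp(original, part))
```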
Consistency: Yes, can be made self-consistent or consistent with another column.
Linking: No, cannot be linked.
Differential privacy: No
Data-free: No
Allowed for primary keys: No
Allowed for unique columns: No
Uses format-preserving encryption (FPE): No
Privacy ranking: 3 if not consistent, 4 if consistent
Generator ID (for the API)
Workspaces - A workspace contains the data connections and data generation configuration.
Data connectors - Each data connector allows Structural to read from and write to a specific type of data source.
Privacy Hub - View and update the current protection status based on the sensitivity scan and workspace configuration.
Database View - Configure transformation options for tables and columns.
Generators - A generator is assigned to a column and performs a data transformation.
Subsetting - Configure a subset of source data to include in the transformed destination data.
Generate data - Run the data generation process to produce transformed destination data.
Schema changes - Review and address changes to the source data schema.
User access - Manage who has access to your instance.
Monitoring and logging - Monitor Structural services and share logs with Tonic.ai.
Updating Structural - Upgrade to the latest version of Structural.
Your organization might use a secrets manager to secure credentials, including database connection credentials.
For data connector credentials, you can configure a set of available secrets managers. In the workspace configuration, users can then select a secret name from a secrets manager.
Structural currently supports AWS Secrets Manager.
Structural only supports secrets that store passwords. For AWS Secrets Manager, the passwords must be in one of the following formats:
String
JSON
The JSON must contain a map of key-value pairs. It can either:
Contain a single key for which the value is the password in plaintext.
Contain a key that is labeled either password or pw, for which the value is the password in plaintext.
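To check whether a secret value fits one of the supported formats, you could use a sketch like the following. It is written in Python and is not Structural code. The function name is hypothetical, and the order in which the JSON cases are checked is an assumption.

```python
import json


def extract_password(secret_string):
    """Return the password from a secret that is a plain string or a JSON map."""
    try:
        parsed = json.loads(secret_string)
    except json.JSONDecodeError:
        # Not JSON: treat the whole value as the password.
        return secret_string

    if not isinstance(parsed, dict):
        # Valid JSON but not a key-value map (for example, "12345"): plain string.
        return secret_string

    if len(parsed) == 1:
        # A single key whose value is the password.
        return next(iter(parsed.values()))

    for key in ("password", "pw"):
        if key in parsed:
            return parsed[key]

    raise ValueError("secret does not match a supported password format")


print(extract_password('{"username": "app", "password": "s3cret"}'))  # s3cret
```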
To display the list of secrets managers, on Structural Settings view, click Secrets Manager.
To create a secrets manager:
On the Secrets Manager tab, click Add Secrets Manager.
On the Create Secrets Manager panel, in the Name field, provide a name to use to identify the secrets manager. Secrets manager names must be unique. The name is used in the secrets manager dropdown list on the workspace settings view.
From the Type dropdown list, select the secrets manager product. Structural currently supports AWS Secrets Manager.
Configure the credentials to use to connect to the secrets manager.
Click Save.
For an existing secrets manager, you can change the name and the credentials configuration.
You cannot change the type.
To edit an existing secrets manager:
In the secrets manager list, click the edit icon for the secrets manager.
On the Edit Secrets Manager panel, update the configuration.
Click Save.
When you delete a secrets manager, it is removed from the workspace database connections that use it. Structural is no longer able to connect to those databases.
To delete a secrets manager:
In the secrets manager list, click the delete icon for the secrets manager.
On the confirmation panel, click Delete.
The AWS Secrets Manager credentials that you provide must have the following permissions:
secretsmanager:ListSecrets
On each secret to use, secretsmanager:GetSecretValue 
On the encryption key for secrets that are encrypted with a customer managed key (CMK), kms:Decrypt
Here is an example policy that grants the required Secrets Manager permissions:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowListingAllSecrets",
      "Effect": "Allow",
      "Action": "secretsmanager:ListSecrets",
      "Resource": "*"
    },
    {
      "Sid": "AllowReadingSecrets",
      "Effect": "Allow",
      "Action": [
        "secretsmanager:GetSecretValue"
      ],
      "Resource": "arn:aws:secretsmanager:us-east-1:111111111111:secret:mySecretNamespace/*"
    }
  ]
}
For AWS Secrets Manager, under Authentication, select the source of the credentials:
Environment - Only available on self-hosted instances. Indicates to use either:
The credentials for the AWS Identity and Access Management (IAM) role on the host machine.
The credentials set in the following environment settings:
TONIC_AWS_ACCESS_KEY_ID - An AWS access key that is associated with an IAM user or role
TONIC_AWS_SECRET_ACCESS_KEY - The secret key that is associated with the access key
TONIC_AWS_REGION - The AWS Region to send the authentication request to
Assumed role - Indicates to use the specified assumed role.
User credentials - Indicates to use the provided user credentials.
To provide an assumed role, click Assume Role, then:
In the Role ARN field, provide the Amazon Resource Name (ARN) for the role.
In the Session Name field, provide the role session name.
If you do not provide a session name, then Structural automatically generates a default unique value. The generated value begins with TonicStructural.
In the Duration (in seconds) field, provide the maximum length in seconds of the session. The default is 3600, indicating that the session can be active for up to 1 hour. The provided value must be less than the maximum session duration that is allowed for the role.
From the AWS Region dropdown list, select the AWS Region to send the authentication request to.
Structural generates the external ID that is used in the assume role request. Your role’s trust policy must be configured to condition on your unique external ID.
Here is an example trust policy:
{
  "Version": "2012-10-17",
  "Statement": {
    "Effect": "Allow",
    "Principal": {
      "AWS": "<originating-account-id>"
    },
    "Action": "sts:AssumeRole",
    "Condition": {
      "StringEquals": {
        "sts:ExternalId": "<external-id>"
      }
    }
  }
}
To provide the credentials, click User Credentials, then:
In the AWS Access Key field, enter the AWS access key that is associated with an IAM user or role.
In the AWS Secret Key field, enter the secret key that is associated with the access key.
Optional. In the AWS Session Token field, provide the session token to use.
From the AWS Region dropdown list, select the AWS Region to send the authentication request to.
By default, when a Structural sensitivity scan runs on a workspace, it looks for the built-in sensitivity types.
Your data might also include sensitive values that are specific to your organization. To identify those values and the corresponding recommended generator, you can define custom sensitivity rules.
Each custom sensitivity rule specifies:
The data type for matching columns.
Text matching criteria for the names of matching columns.
The recommended generator preset.
To display the current list of sensitivity rules, in the Structural navigation menu, click Sensitivity Rules.
The list contains the sensitivity rules for a self-hosted Structural instance or a Structural Cloud organization.
For each rule, the list includes:
The rule name and description
The recommended generator preset
When the rule was most recently modified
You can filter the rule list by the following:
Rule name
Rule description
Generator preset name
Name of the user who most recently updated the rule
In the filter field, start to type text from any of those values. As you type, the list is filtered to only include matching rules.
Note that when the list is filtered, you cannot change the display sequence of the rules.
Structural applies the rules based on their display order in the list.
If a column matches more than one rule, Structural applies the first matching rule.
To change the display order of a rule, drag and drop it to the new location in the list.
Note that you cannot change the rule sequence when the list is filtered.
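The effect of rule order can be illustrated with a small Python sketch. The rule names and matching functions are made up for this example. The only behavior taken from the text is that the first matching rule in the list is the one that applies.

```python
def first_matching_rule(column_name, rules):
    """Return the name of the first rule whose predicate matches, or None."""
    for rule_name, predicate in rules:
        if predicate(column_name):
            return rule_name
    return None


# Hypothetical rules, listed in display order.
rules = [
    ("Employee ID", lambda name: "emp" in name.lower()),
    ("Generic ID", lambda name: name.lower().endswith("_id")),
]

# emp_id matches both rules, so the rule that is listed first is applied.
print(first_matching_rule("emp_id", rules))                  # Employee ID
print(first_matching_rule("emp_id", list(reversed(rules))))  # Generic ID
```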
To create a sensitivity rule:
On the Sensitivity Rules view, click New Custom Rule.
On the Create Custom Rule view, configure the sensitivity rule.
Click Save.
To change the configuration of a sensitivity rule:
On the Sensitivity Rules view, click the edit icon for the rule.
On the Edit Custom Rule view, update the sensitivity rule configuration.
Click Save.
Note that any changes to a sensitivity rule do not take effect until the next sensitivity scan.
In the Name field, type the name of the sensitivity rule. The rule name becomes the sensitivity type for matching columns. The rule name must be unique, and also cannot match the name of a built-in sensitivity type.
Optionally, in the Description field, type a longer description of the sensitivity rule.
From the Data Type dropdown list, select the data type for matching columns. For example, a rule might only be used for columns that contain text.
The available data types are general types that map to specific data types in a given database. The available types are:
Array
Binary
Boolean
Continuous Numerical
Date Range
Datetime
Integer
JSON
MAC Address
Network Address
Text
UUID
XML
Under Column Name Match, provide the criteria to identify matching columns based on the column name.
Note that a matching column must match both the data type and the column name criteria.
When you provide a list of text matching conditions, a matching column must match all of the conditions. In other words, the conditions are joined by AND.
To apply the same generator preset to columns that have completely different names, you must create separate sensitivity rules.
To create a list of text matching conditions:
Click Text Match.
To add a column name condition, click Add String Match.
For each condition:
From the comparison type dropdown list, select the type of comparison. For example, Contains, Starts with, Ends with.
In the comparison text field, provide the text to check for.
The comparison text is case insensitive. For example, if you set a condition to match column names that contain the text term, it also matches column names that contain TERM or Term or tErM.
To remove a column name condition, click its delete icon.
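The matching rules above can be sketched in a few lines of Python. This is only an illustration of the described behavior (data type plus AND-joined, case-insensitive text conditions). The `TextCondition` class, the `rule_matches` function, and the comparison identifiers are hypothetical names, not part of Structural.

```python
from dataclasses import dataclass


@dataclass
class TextCondition:
    comparison: str  # hypothetical identifiers: "contains", "starts_with", "ends_with"
    text: str

    def matches(self, column_name: str) -> bool:
        # Comparison text is case insensitive, so normalize both sides.
        name, text = column_name.lower(), self.text.lower()
        if self.comparison == "contains":
            return text in name
        if self.comparison == "starts_with":
            return name.startswith(text)
        if self.comparison == "ends_with":
            return name.endswith(text)
        raise ValueError(f"unknown comparison: {self.comparison}")


def rule_matches(column_name, column_type, rule_type, conditions):
    # A column must match the data type AND every text condition.
    return column_type == rule_type and all(
        condition.matches(column_name) for condition in conditions
    )


conditions = [TextCondition("starts_with", "cust"), TextCondition("contains", "EMAIL")]
print(rule_matches("Customer_Email", "Text", "Text", conditions))
print(rule_matches("customer_phone", "Text", "Text", conditions))
```

Because the conditions are joined by AND, the second column fails: it starts with the right prefix but does not contain the second text.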
To use a regular expression to identify matching columns based on the column name:
Click Regular Expression.
In the field, provide the regular expression.
From the Recommended Generator Preset dropdown list, select the generator preset that is the recommended generator for matching columns.
To search for a specific preset, begin to type the generator preset name.
When you configure a sensitivity rule, you can also create a new generator preset or update the configuration of the selected generator preset.
To create a new generator preset, click Create Preset. On the generator preset details panel, provide the generator preset configuration, then click Create.
To edit the selected generator preset, click Edit Current Preset. On the generator preset details panel, update the generator preset configuration, then click Save and Apply.
For more information about generator preset configuration, go to .
If you have access to a workspace, then you can use the workspace to preview the sensitivity rule results.
Under Test Results, from the workspace dropdown list, select the workspace to use.
Structural searches the workspace schema for matching columns based on the sensitivity rule configuration.
It displays any matching columns. You can filter the matching columns based on the table or column name.
For each matching column, the list includes:
The column name and table
A sample value from the source data. The sample source value is only present if you have the Preview source data permission for the workspace.
A sample replacement value, based on the selected generator preset for the sensitivity rule. The sample replacement value is only present if you have the Preview destination data permission for the workspace.
To delete a sensitivity rule, on the Sensitivity Rules view, click the delete icon for the rule.
Note that existing generator recommendations for the rule remain in place until the next sensitivity scan.
This is a composite generator.
Runs a selected sub-generator on values that match a user-specified JSONPath expression. You can only search for and apply sub-generators to individual key values. You cannot apply a sub-generator to an object or to an array.
If an error occurs, the selected fallback generator is used for the entire JSON value.
For JSON columns in a file connector workspace, you can instead use Document View to assign generators to individual paths. For more information, go to .
Sub-generators are applied sequentially, from the sub-generator at the top of the list to the sub-generator at the bottom of the list.
If a key matches more than one JSONPath expression, then the most recently added generator takes priority.
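As a rough sketch of the sequential application and the fallback behavior, the following Python applies a list of transforms from top to bottom and falls back for the entire JSON value if anything fails. It is a simplification under stated assumptions: sub-generators are modeled as hypothetical `(top_level_key, transform)` pairs instead of JSONPath expressions, and `apply_sub_generators` is an illustrative name.

```python
import json


def apply_sub_generators(raw_json, sub_generators, fallback):
    """Apply (key, transform) pairs in list order; on any error, use the fallback.

    Assumption: each sub-generator targets one top-level key. Real path
    expressions are richer than this.
    """
    try:
        value = json.loads(raw_json)
        for key, transform in sub_generators:
            if key in value:
                value[key] = transform(value[key])
        return json.dumps(value)
    except Exception:
        # The fallback replaces the entire JSON value, not only the failing key.
        return fallback(raw_json)


raw = '{"name": "John", "age": 41}'
masked = apply_sub_generators(
    raw,
    [("name", lambda v: "REDACTED"), ("age", lambda v: v + 1)],
    fallback=lambda r: "null",
)
print(masked)
```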
JSON paths can also contain regular expressions and comparison logic, which allow the configured sub-generators to be applied only when there are properties that satisfy the query.
For example, a column contains this JSON:
[ { file_name: "foo.txt", b: 10 }, ... ]
The following JSON path only applies to array elements that contain a file_name key for which the value ends in .txt:
$.[?(@.file_name =~ /^.*\.txt$/)]
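To make the filter semantics concrete, here is a plain-Python emulation of that expression. It does not use a JSONPath engine. It only reproduces the intent: keep the array elements that have a file_name key with a value that ends in .txt. The function name `matching_elements` is illustrative.

```python
import re

# Same pattern as the filter, with the dot escaped so that it matches a literal "."
TXT_FILE = re.compile(r"^.*\.txt$")


def matching_elements(elements):
    """Return the array elements whose file_name value ends in .txt."""
    return [
        element
        for element in elements
        if isinstance(element, dict)
        and isinstance(element.get("file_name"), str)
        and TXT_FILE.search(element["file_name"])
    ]


column_value = [
    {"file_name": "foo.txt", "b": 10},
    {"file_name": "bar.csv", "b": 5},
    {"b": 1},
]
print(matching_elements(column_value))
```

Only the first element is selected. The second has a different extension and the third has no file_name key.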
A JSON path can also be used to point to a key name recursively. For example, a column contains this JSON:
{
  "first_name": "John",
  "last_name": "Smith",
  "children": [
    {
      "first_name": "Mary",
      "last_name": "Jones",
      "children": [
        {
          "first_name": "Ann",
          "last_name": "Jones"
        }
      ]
    }
  ]
}
The following JSON path applies to all properties for which the key is first_name:
$..first_name
To assign a generator to a path expression:
Under Sub-generators, click Add Generator. On the sub-generator configuration panel, the Cell JSON field contains a sample value from the source database. You can use the previous and next icons to page through different values.
In the Path Expression field, type the path expression to identify the value to apply the generator to. You cannot use the exact same path expression more than once. To create a path expression, you can also click the value in Cell JSON that you want the expression to point to. The path expression must identify a key value. You cannot apply sub-generators to an object or to an array. Matched JSON Values shows the result from the value in Cell JSON.
By default, the selected generator is applied to any value that matches the expression. To limit the types of values to apply the generator to, from the Type Filter, specify the applicable types. You can select Any, or you can select any combination of String, Number, Boolean, and Null.
From the Generator Configuration dropdown list, select the generator to apply to the path expression. You cannot select another composite generator.
Configure the selected generator. You cannot configure the selected generator to be consistent with another column.
To save the configuration and immediately add a generator for another path expression, click Save and Add Another. To save the configuration and close the add generator panel, click Save.
From the Sub-Generators list:
To edit a generator assignment, click the edit icon.
To remove a generator assignment, click the delete icon.
To move a generator assignment up or down in the list, click the up or down arrow.
From the Fallback Generator dropdown list, select the generator to use if the assigned generator for a path expression fails.
The options are:
Consistency
Determined by the selected sub-generators.
Linking
Determined by the selected sub-generators.
Differential privacy
Determined by the selected sub-generators.
Data-free
Determined by the selected sub-generators.
Allowed for primary keys
No
Allowed for unique columns
No
Uses format-preserving encryption (FPE)
No
Privacy ranking
5
Generator ID (for the API)

On Collection View and Document View, Structural can automatically assign generators to fields that match a configured path expression. Each collection or JSON column has its own set of path generators.
Structural always applies the path generator to matching fields that do not have an assigned generator (are set to Passthrough).
Structural does not apply the path generator to matching fields that have a generator configuration applied directly.
However, in a child workspace, Structural does apply the path generator to matching fields that inherit their current generator configuration from the parent workspace.
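The three rules above reduce to a small decision function. The sketch below is illustrative only. The `Assignment` enum and the `path_generator_applies` function are hypothetical names for the states described in the text.

```python
from enum import Enum


class Assignment(Enum):
    PASSTHROUGH = "passthrough"  # no generator assigned
    DIRECT = "direct"            # generator configured directly on the field
    INHERITED = "inherited"      # child workspace field that inherits from the parent


def path_generator_applies(matches_path: bool, assignment: Assignment) -> bool:
    """Return True when a matching path generator would be applied to the field."""
    if not matches_path:
        return False
    # Directly configured fields keep their own configuration.
    return assignment in (Assignment.PASSTHROUGH, Assignment.INHERITED)


for state in Assignment:
    print(state.name, path_generator_applies(True, state))
```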
To display the list of path generators for the current collection or JSON column, click Path Generators.
For each path generator, the list includes:
The priority order. Structural checks the fields against the paths in the order that the paths are displayed. The first matching path wins.
The path expression to identify matching fields.
The data type filter for matching fields. You can configure a path generator to only apply to fields of a specific type or types.
The name of the generator preset that Structural applies to matching fields.
To create a path generator, you can either create a completely new path generator, or start from a duplicate of an existing path generator.
Structural saves the new generator automatically when the configuration is complete. The Saved button at the bottom right indicates when the generator is saved.
To create a path generator:
On the path generators panel, click Add path generator.
On the path generator details panel, in the Path Expression field, provide the path expression to use to identify matching paths. The path expression uses the JSONPath syntax. Note that for a path generator, you cannot use the expression to check for a field value. For more information about the supported operators and some examples, go to Supported JSONPath operators and examples. When you provide a path expression, the matching fields list displays the fields that match the expression.
You can optionally filter the matching fields based on the data type. For example, you might only want to apply a generator to text or integer fields. By default, the data type filter list is empty. The available data types are general types that map to specific data types in a given database. Under Data Types, to add a data type to the filter, select it from the dropdown list. To remove a data type, click its delete icon. When you configure data type filters, the matching fields list is updated to only include fields that have one of the specified data types.
From the generator dropdown list, select the generator to apply to matching fields. The available generators are affected by the data type filter. When the data type filter is empty, you can only select from generators that can be used for any type of column. When you specify a list of data types, you can only select from generators that can be used for all of those data types.
Configure the selected generator. For a generator assigned to a path expression:
Linking is not supported.
Consistency with other columns is not supported.
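The interaction between the data type filter and the available generators can be pictured as a set check. The sketch below is an assumption-laden illustration. The catalog, its generator names, and the `ANY_TYPE` sentinel are all hypothetical and do not reflect the actual type support of any Structural generator.

```python
ANY_TYPE = None  # hypothetical sentinel: the generator works for any type of column


def available_generators(catalog, type_filter):
    """Return the generator names that remain selectable for a data type filter.

    catalog maps a generator name to a set of supported data types, or to ANY_TYPE.
    """
    selected = set(type_filter)
    result = []
    for name, supported in catalog.items():
        if supported is ANY_TYPE:
            result.append(name)  # usable regardless of the filter
        elif selected and selected <= supported:
            result.append(name)  # must support ALL of the filtered types
    return result


catalog = {
    "any_type_gen": ANY_TYPE,
    "text_gen": {"Text"},
    "text_or_int_gen": {"Text", "Integer"},
}
print(available_generators(catalog, []))
print(available_generators(catalog, ["Text", "Integer"]))
```

With an empty filter, only the any-type generator remains. With both Text and Integer in the filter, the text-only generator drops out because it does not support every listed type.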
You can create a path generator based on an existing one. For example, for the same path expression, you might want to assign a different generator based on the data type.
To create a new path generator based on an existing path generator:
On the path generators list, click the options menu for the path generator to copy.
Click Duplicate path generator.
On the path generator details panel, edit the configuration.
To update a path generator:
On the path generators list, click the options menu for the path generator.
Click Edit path generator.
On the path generator details panel, edit the configuration.
Structural saves the changes automatically.
For fields that were assigned a generator based on the previous configuration, but that do not match the updated path generator configuration:
If the field matches other path generators, then the next matching configuration is applied.
If the field does not match any other path generators, then the field reverts to Passthrough.
When you delete a path generator, the generator assignment is removed from the matching fields. If a field matched more than one path generator, then the next match is used.
To delete a path generator:
On the path generators list, click the options menu for the path generator.
Click Delete path generator.
For fields that were assigned a generator based on the path generator:
If the field matches other path generators, then the next matching path generator is applied.
If the field does not match any other path generators, then the field reverts to Passthrough.
For a path generator path expression, Structural supports the following operators:
$ - Root
. - Child operator
.. - Recursive descent operator
* - Wildcard operator
[*] - Array operator. Note that a path generator must always target all of the items in an array. Any use of the array operator must include the wildcard operator.
For example, a document includes an array of objects. Each object contains name, address, and email address fields.
[
  {
    "name": "John Smith",
    "address": "1 Main Street",
    "email_address": "[email protected]"
  },
  {
    "name": "Mary Jones",
    "address": "5 Elm Avenue",
    "email_address": "[email protected]"
  }
]
You can configure a path generator that assigns a generator to the address field in all of the array objects. You cannot assign a generator to the address field in only one of the array objects.
Here are some example path expressions, based on the following JSON:
{
  "bookstore_name": "Read & Brew Books",
  "mailing_address": {
    "street": "123 Literary Lane",
    "city": "Bookville",
    "state": "MA",
    "zip_code": "02451",
    "country": "USA"
  },
  "phone_number": "555-123-4567",
  "books": [
    {
      "title": "The Great Gatsby",
      "author": "F. Scott Fitzgerald",
      "isbn": "978-0743273565",
      "publication_year": 1925,
      "country": "USA"
    },
    {
      "title": "Moby Dick",
      "author": "Herman Melville",
      "isbn": "978-1503280786",
      "publication_year": 1851,
      "country": "USA"
    },
    {
      "title": "To the Lighthouse",
      "author": "Virginia Woolf",
      "isbn": "978-0156907392",
      "publication_year": 1927,
      "country": "UK"
    },
    {
      "title": "The Catcher in the Rye",
      "author": "J.D. Salinger",
      "isbn": "978-0316769174",
      "publication_year": 1951,
      "country": "USA"
    }
  ]
}
$.bookstore_name
Find the bookstore_name field at the top level of the JSON.
$.mailing_address.zip_code
Find the zip_code field in the mailing address.
$.books[*].isbn
Find the isbn field in each entry in the array of books.
$..country
Find all country fields in the JSON.
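To see how these expressions resolve, the following is a minimal Python evaluator for only the operators listed above ($, ., .., * and [*]). It is a teaching sketch, not Structural's implementation and not a complete JSONPath engine. The function names are illustrative, and the sample document is an abbreviated version of the bookstore JSON.

```python
import re

# One step of a path: "..name", ".name" (name may be "*"), or "[*]"
TOKEN = re.compile(r"\.\.(\w+|\*)|\.(\w+|\*)|\[\*\]")


def children(node):
    if isinstance(node, dict):
        return list(node.values())
    if isinstance(node, list):
        return list(node)
    return []


def descendants(node):
    """Yield the node and everything below it, depth first."""
    yield node
    for child in children(node):
        yield from descendants(child)


def select(node, name):
    """Child step: one named key, or every child for the wildcard."""
    if name == "*":
        return children(node)
    if isinstance(node, dict) and name in node:
        return [node[name]]
    return []


def find(document, path):
    if not path.startswith("$"):
        raise ValueError("path must start with $")
    nodes, pos = [document], 1
    while pos < len(path):
        step = TOKEN.match(path, pos)
        if step is None:
            raise ValueError(f"unsupported syntax at position {pos}")
        recursive_name, child_name = step.group(1), step.group(2)
        if recursive_name is not None:  # recursive descent
            nodes = [hit for n in nodes for d in descendants(n)
                     for hit in select(d, recursive_name)]
        elif child_name is not None:  # child operator
            nodes = [hit for n in nodes for hit in select(n, child_name)]
        else:  # [*] targets every item of an array
            nodes = [item for n in nodes if isinstance(n, list) for item in n]
        pos = step.end()
    return nodes


bookstore = {
    "bookstore_name": "Read & Brew Books",
    "mailing_address": {"zip_code": "02451", "country": "USA"},
    "books": [
        {"isbn": "978-0743273565", "country": "USA"},
        {"isbn": "978-0156907392", "country": "UK"},
    ],
}
for expression in ("$.bookstore_name", "$.books[*].isbn", "$..country"):
    print(expression, find(bookstore, expression))
```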
When Structural looks for matching fields, it checks the path generators in the order that they are displayed on the Path Generators panel.
For each field, it uses the first matching configuration.
To change the order of the path generators, drag and drop each configuration to the appropriate location in the list.
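The first-match rule can be sketched as a simple ordered lookup. In this illustration, `PathGenerator`, `resolve`, and the preset names are hypothetical, and the `matches` callable stands in for real JSONPath evaluation.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class PathGenerator:
    matches: Callable[[str], bool]  # stand-in for JSONPath evaluation
    preset: str                     # hypothetical preset name


def resolve(field_path: str, path_generators: List[PathGenerator]) -> str:
    """Return the preset of the first matching path generator, in display order."""
    for path_generator in path_generators:
        if path_generator.matches(field_path):
            return path_generator.preset
    return "Passthrough"  # no path generator matches


ordered = [
    PathGenerator(lambda p: p.endswith(".country"), "preset_a"),
    PathGenerator(lambda p: p.startswith("$.books"), "preset_b"),
]
print(resolve("$.books[0].country", ordered))        # both match, the first wins
print(resolve("$.books[0].country", ordered[::-1]))  # reordering changes the result
print(resolve("$.phone_number", ordered))
```

The same lookup also describes what happens after a path generator is deleted: the field falls through to the next matching entry, or to Passthrough when nothing else matches.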
On Collection View and Document View, when the assigned generator comes from a path generator, the generator assignment is marked with an icon.
When you click the generator, the configuration panel indicates that the generator is assigned based on a path generator.
When you change the configuration, the panel indicates that the current configuration overrides the path generator.
To override the path generator configuration, click Override path generator. On the generator configuration panel, update the generator configuration.
Note that you cannot override the generator to Passthrough. If you set the generator to Passthrough, then the field reverts to the path generator.
This generator reference provides the details for each of the supported generators in Tonic Structural.
The generators are in alphabetical order by the generator name.
Here are some groupings to help to identify generators that are used for different types of values. Suggestions for generators to use for specific use cases are also available.
For columns that contain JSON content, you can use the JSON Mask generator to assign generators to individual JSON fields. To identify the fields, you use JSONPath expressions.
Another option is to use Document View, which allows you to view the structure of the JSON content and then assign generators to individual JSON fields.
For a JSON column, the Document View option is available from Privacy Hub, Database View, and Table View.
On the column configuration panel, to enable Document View, toggle Use Document View to the on position. When you enable document parsing:
The generator dropdown changes to an Open in Document View button.
If this is the first column that you enabled Document View for, then the Document View tab becomes visible.
Any existing generator assignment is discarded.
On Privacy Hub, in the protection status display, each JSON path is displayed as a separate column. In the Database Tables list, each JSON path is a separate entry.
Structural also runs a scan on the column to detect the JSON structure and identify sensitive fields.
On the workspace management view, you use Document View to view the JSON structure.
Document View is only available when it is enabled for at least one JSON column.
From the Column dropdown list, select the JSON column to configure. The dropdown contains the columns that have Document View enabled.
From the View dropdown list, select the view to use for the selected column.
Hybrid view provides a consolidated view of the schema across all of the rows.
For example, for an array, hybrid view contains a single entry with all of the possible fields.
Single view shows the structure for one row at a time. You can then page through up to 100 rows. For each row, Structural displays the row structure.
For example, for an array, single view shows the actual array entries for each record.
For each JSON field, Document View always displays:
The field name and data type.
The assigned generator.
An example value. In hybrid view, you can use the magnifying glass icon to display additional example values.
Hybrid view also displays a Field Freq column. Field Freq shows the percentage of rows that contain that permutation of field and type. For example, a field might be Null 33% of the time and contain a numeric value 67% of the time. Or a field value might be an Int32 value 3% of the time and an Int64 value 6% of the time. The percentages apply to the first 100 rows.
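As an illustration of how such a percentage could be derived, the sketch below counts each combination of field name and value type over the first 100 rows. It only looks at top-level fields and uses Python type names. Both are simplifying assumptions, and `field_frequencies` is a hypothetical name.

```python
from collections import Counter


def field_frequencies(rows, limit=100):
    """Percentage of sampled rows that contain each (field, type) permutation."""
    sample = rows[:limit]
    if not sample:
        return {}
    counts = Counter()
    for row in sample:
        for name, value in row.items():
            counts[(name, type(value).__name__)] += 1
    return {key: 100 * count / len(sample) for key, count in counts.items()}


rows = [{"score": None}, {"score": 7}, {"score": 12}]
for (name, type_name), percent in field_frequencies(rows).items():
    print(f"{name} ({type_name}): {percent:.0f}%")
```

For these three rows, the score field is null in one row and an integer in the other two, which corresponds to the 33% and 67% split mentioned above.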
The Preview toggle at the top right of Document View allows you to choose whether to display original source data or the transformed data. You can switch back and forth to determine exactly how Tonic Structural transforms the data based on the field configuration.
By default, the Preview toggle is in the on position, and the displayed data reflects the assigned generators.
To display the original source data, toggle Preview to the off position.
In single view, you can filter by either a field name or a field value.
In hybrid view, you can filter by either field name or field properties.
You can filter single view to only display fields that have specific text in either the field name or the field value.
To filter by value, toggle Search by Value to the on position.
After you select the filter type, in the search field, type text that is in the field name or value. As you type, Structural filters the list to only include fields that contain the filter text.
To filter hybrid view by field name, in the search field, begin typing text that is in the field name. As you type, Structural filters the list to only include fields with names that include the filter text.
From the hybrid document view, you can filter the fields based on field properties.
To display the Filters panel, click Filters.
To search for a filter or a filter value, in the search field, start to type the value. The search looks for text in the individual settings.
To add a filter, depending on the filter type, either check the checkbox or select a filter option. As you add filters, Structural applies them to the field list.
Above the list, Structural displays tags for the selected filters.
To clear all of the currently selected filters, click Clear All.
The Filters panel in hybrid view includes the following options.
An at-risk JSON field:
Is marked as sensitive
Is assigned the Passthrough generator.
To only display at-risk JSON fields, on the Filters panel, check At-Risk Field.
When you check At-Risk Field, Structural adds the following filters under Privacy Settings:
Sets the sensitivity filter to Sensitive.
Sets the protection status filter to Not protected.
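Expressed as code, the At-Risk Field filter is the conjunction of the two Privacy Settings filters. The `JsonField` class and its attribute names below are hypothetical and exist only for this sketch.

```python
from dataclasses import dataclass


@dataclass
class JsonField:
    path: str
    sensitive: bool
    generator: str  # "Passthrough" means that no generator is assigned


def is_at_risk(field: JsonField) -> bool:
    # Equivalent to sensitivity = Sensitive AND protection status = Not protected.
    return field.sensitive and field.generator == "Passthrough"


fields = [
    JsonField("$.name", sensitive=True, generator="Passthrough"),
    JsonField("$.email", sensitive=True, generator="Email"),
    JsonField("$.notes", sensitive=False, generator="Passthrough"),
]
print([field.path for field in fields if is_at_risk(field)])
```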
You can filter the JSON fields based on the field sensitivity.
On the Filters panel, under Privacy Settings, the sensitivity filter is by default set to All, which indicates to display both sensitive and non-sensitive JSON fields.
To only display sensitive JSON fields, click Sensitive.
To only display non-sensitive JSON fields, click Not sensitive.
Note that when you check At-Risk Field, Structural automatically selects Sensitive.
You can filter the JSON fields based on whether they have any generator other than Passthrough assigned.
On the Filters panel, under Privacy Settings, the field protection filter is by default set to All, which indicates to display both protected and not protected JSON fields.
To only display JSON fields that have an assigned generator, click Protected.
To only display JSON fields that do not have an assigned generator, click Not protected.
Note that when you check At-Risk Field, Structural automatically selects Not protected.
When Structural detects that a JSON field is sensitive, it can also determine a recommended generator.
For example, when it detects a name value, it also recommends the Name generator.
You can filter the fields to display the fields that have recommended generators.
On the Filters panel, under Recommended Generators, check the checkbox next to the recommended generator for which to display the fields that have that recommendation.
You can filter the fields by the field data type. For example, you might only display fields that contain either numeric or integer values.
To only display fields that have specific data types, on the Filters panel, under Database Data Types, check the checkbox for each data type to include.
The list of data types only includes data types that are present in the currently displayed fields and that are compatible with other applied filters.
To search for a specific data type, in the Filters search field, begin to type the data type.
When the structure of the JSON changes, you might need to update the configuration to reflect those changes. If you do not resolve the changes, then the data generation might fail.
To only display fields that have unresolved changes to the JSON structure, on the Filters panel, check Unresolved Schema Changes.
For detected sensitive fields, the sensitivity type indicates the type of data that was detected. Examples of sensitivity types include First Name, Address, and Email.
To only display fields that contain specific sensitivity types, on the Filters panel, under Sensitivity Type, check the checkbox for each sensitivity type to include.
The list of sensitivity types only includes sensitivity types that are present in the currently displayed fields.
To search for a specific sensitivity type, in the Filters search field, type the sensitivity type.
When the document scan identifies a value as belonging to a sensitivity type, it also determines how confident it is in that determination.
You can filter the fields based on the confidence level.
To only display fields that have a specific confidence level, on the Filters panel, under Sensitivity confidence, check the checkbox next to each confidence level to include.
On the field configuration panel, the sensitivity toggle at the top right indicates whether the field is marked as sensitive.
To mark a field as sensitive, toggle the setting to the Sensitive position.
To mark a field as not sensitive, toggle the setting to the Not Sensitive position.
For each node, you assign a generator.
To assign a generator:
Click the generator value for the JSON field.
On the configuration panel, from the Generator Type dropdown list, select the generator. Other than the Conditional and Regex Mask generators, you cannot assign a composite generator to a JSON field.
Configure the generator options. For details about the available configuration options for each generator, go to the generator reference.
When you configure a generator in Document View:
You can only link to other JSON fields.
You can only enable self-consistency.
In addition to assigning generators to individual fields, you can assign generators to generic paths. The paths use JSONPath syntax.
For more information, go to .
Each table is assigned a table mode. The table mode determines at a high level how the table is populated in the destination database.
Both Database View and Table View allow you to view and update the selected table mode for a table.
For Database View, go to .
For Table View, go to .
This is the default table mode for new tables.
In this mode, Tonic Structural copies over all of the rows to the destination database.
For columns that have the generator set to Passthrough, Structural copies the original source data to the destination database.
For columns that are assigned a generator other than Passthrough, Structural uses the generator to replace the column data in the destination database.
This mode drops all data for the table in the destination database. Sensitivity scans ignore truncated tables.
For data connectors other than Spark-based data connectors, the table schema and any constraints associated with the table are included in the destination database.
For Spark-based data connectors, the table is ignored completely.
For the file connector, file groups are treated as tables. When a file group is assigned Truncate mode, the data generation process ignores the files that are in that file group.
Any existing data in the destination database is removed. For example, if you change the table mode to Truncate after an initial data generation, the next data generation clears the table data. For Spark-based data connectors, the table is removed.
If you assign Truncate mode to a table that has a foreign key constraint, the data generation fails. If this is a requirement, contact [email protected] for assistance.
When upsert is enabled, the Truncate table mode does not actually truncate the destination table. Instead, it works more like Preserve Destination table mode, which preserves existing records in the destination table.
This mode preserves the data in the destination database for this table. It does not add or update any records.
This feature is primarily used for very large tables that don't need to be de-identified during subsequent runs after the data exists in the destination database.
When you assign Preserve Destination mode to a table, Structural locks the generator configuration for the table columns.
The destination database must have the same schema as the source database.
You cannot use Preserve Destination mode when you:
Enable upsert for a workspace.
Write destination data to a container repository.
Write destination data to an Ephemeral snapshot.
Incremental mode only processes the changes that occurred to the source table since the most recent data generation or other changes in the destination. This can greatly reduce generation time for large tables that don't have a lot of changes.
For Incremental mode to work, the following conditions must be satisfied:
The table must exist in the destination database. Either Structural created the table during data generation, or the table was created and populated in some other way.
A reliable date updated column must be present. When you select Incremental mode for a table, Structural prompts you to select the date updated column to use.
The table must have a primary key.
To maximize performance, we recommend that you have an index on the date updated field.
For tables that use Incremental mode, Structural checks the source database for records that have an updated date that is greater than the maximum date in that column in the destination database.
When identifying records to update, Structural only checks the updated date. It does not check for other updates. Records where the generator configuration is changed are not updated if they do not meet the updated date requirement.
For the identified records, Structural checks for primary key matches between the source and destination databases, then does one of the following:
If the primary key value exists in the destination database, then Structural overwrites the record in the destination database.
If the primary key value does not exist in the destination database, then Structural adds a new record to the destination database.
This mode currently only updates and adds records. Rows that are deleted from the source database remain in the destination database.
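The incremental logic described above can be sketched with in-memory rows. This is an illustration of the described behavior, not Structural's implementation. The function name, the column names `id` and `updated_at`, and the handling of an empty destination (every source row is treated as changed) are assumptions.

```python
from datetime import datetime


def incremental_sync(source_rows, destination, transform, pk="id", updated="updated_at"):
    """Upsert the source rows that are newer than the newest destination row.

    destination maps primary key -> row and is modified in place.
    transform stands in for the configured generators and must keep `updated`.
    """
    if destination:
        high_water = max(row[updated] for row in destination.values())
        changed = [row for row in source_rows if row[updated] > high_water]
    else:
        changed = list(source_rows)  # assumption: empty destination takes everything
    for row in changed:
        # Existing primary key: overwrite the record. New primary key: add a record.
        destination[row[pk]] = transform(row)
    return [row[pk] for row in changed]


destination = {
    1: {"id": 1, "name": "x1", "updated_at": datetime(2024, 1, 1)},
    2: {"id": 2, "name": "x2", "updated_at": datetime(2024, 1, 5)},
    9: {"id": 9, "name": "x9", "updated_at": datetime(2024, 1, 2)},
}
source = [
    {"id": 1, "name": "Ann", "updated_at": datetime(2024, 1, 1)},
    {"id": 2, "name": "Bob", "updated_at": datetime(2024, 1, 9)},
    {"id": 3, "name": "Cy", "updated_at": datetime(2024, 1, 10)},
]
print(incremental_sync(source, destination, lambda row: {**row, "name": "***"}))
print(sorted(destination))
```

Row 9 no longer exists in the source but stays in the destination, which mirrors the note that deleted rows are not removed.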
To ensure accurate incremental processing of records, we recommend that you do not directly modify the destination database. A direct modification might cause the maximum updated date in the destination database to be after the date of the last data generation. This could prevent records from being identified for incremental processing.
Incremental mode is currently supported on PostgreSQL, MySQL, and SQL Server. If you want to use this table mode with another database type, contact .
You cannot use Incremental mode when you:
Enable upsert for a workspace.
Write destination data to a container repository.
Write destination data to an Ephemeral snapshot.
In this mode, Structural generates an arbitrary number of new rows, as specified by the user, using the generators that are assigned to the table columns.
You can use linking and partitioning to create complex relationships between columns.
Structural generates primary and foreign keys that reflect the distribution (1:1 or 1:many) between the tables in the source database.
You cannot use Scale mode when you enable upsert for a workspace.
For the Databricks data connector, the table mode configuration includes an Error on Overwrite setting. The setting indicates whether to return an error when Structural attempts to write data to a destination table that already contains data. The option is not available when you write destination data to Databricks Delta tables.
To return the error, toggle the setting to the on position.
To not return the error, toggle the setting to the off position.
For workspaces that use the following data connectors, the table mode configuration for De-Identify mode includes an option to apply a filter to the table:
Table filters provide a way to generate a smaller set of data when a data connector does not support subsetting. For more information, go to .
This option is only available for workspaces that use the following data connectors:
On the table mode configuration panel, you can use the Repartition or Coalesce option to indicate a number of partitions to generate.
By default, the destination database uses the same partitioning as the source database. The partition option is set to Neither.
The Repartition option allows you to provide a specific number of partitions to generate.
To use the Repartition option:
Click Repartition.
In the field, enter the number of partitions.
The Coalesce option allows you to provide a maximum number of partitions to generate. If the source data has fewer partitions than the number that you specify, then Structural only generates the number of partitions that are in the source data.
The Coalesce option should be more efficient than the Repartition option.
To use the Coalesce option:
Click Coalesce.
In the field, enter the number of partitions.
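The difference between the three partition options comes down to how the resulting partition count is derived. The following sketch captures only that arithmetic, as described above. The function name and the validation of the entered number are assumptions.

```python
def destination_partitions(source_partitions, option="Neither", value=None):
    """Return the number of partitions that the selected option would produce."""
    if option == "Neither":
        return source_partitions  # default: same partitioning as the source
    if value is None or value < 1:
        raise ValueError("a positive number of partitions is required")
    if option == "Repartition":
        return value  # exactly the requested number
    if option == "Coalesce":
        return min(source_partitions, value)  # the requested number is a maximum
    raise ValueError(f"unknown option: {option}")


print(destination_partitions(8))
print(destination_partitions(8, "Repartition", 20))
print(destination_partitions(8, "Coalesce", 20))
print(destination_partitions(8, "Coalesce", 4))
```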
If you have multiple workspaces, then it is likely that many of the workspace components and configurations are the same or similar. It can be difficult to maintain that consistency across separate, independent workspaces.
When you copy a workspace, the new workspace is completely independent of the original workspace. There is no visibility into or inheritance of changes from the original workspace.
Workspace inheritance allows you to create workspaces that are children of a selected workspace. Unlike a copy of a workspace, a child workspace remains tied to its parent workspace.
By default, a child workspace's configuration is synchronized with the configuration of the parent. In other words, any changes to the parent workspace are copied to its child workspaces. Child workspaces can also override some of the parent configuration. From the parent workspace, you can track the child workspaces and how they are customized.
For example, you might want separate workspaces for different development teams. Each team can make adjustments to suit their specific projects - such as different subsets - but inherit everything else.
By default, a child workspace inherits all of the configuration from the parent workspace, except for the following:
Workspace name - A child workspace has its own name.
Workspace description - A child workspace has its own description.
Tags - A child workspace has its own tags.
Destination database - A child workspace writes output data to its own destination database. You can copy the destination database from the parent workspace.
Intermediate database - For upsert, a child workspace does not inherit the intermediate database.
Webhooks - A child workspace has its own webhooks.
When you change the configuration of a parent workspace, the configuration is also updated in the child workspaces.
The exception is when a child workspace overrides the configuration. If the configuration is overridden, then the child workspace does not inherit the change.
Tonic Structural indicates on both the parent and child workspaces when the configuration is overridden.
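One way to picture this inheritance is as an overlay: the child's effective configuration is the parent configuration with the child's overrides applied on top. The sketch below uses table modes as the example. The `effective_config` function and the table names are hypothetical.

```python
def effective_config(parent, overrides):
    """Child configuration = parent values, replaced wherever the child overrides."""
    return {**parent, **overrides}


parent_table_modes = {"users": "De-Identify", "events": "Truncate"}
child_overrides = {"events": "De-Identify"}  # the child overrides one table

print(effective_config(parent_table_modes, child_overrides))

# A later change in the parent reaches the child only where nothing is overridden.
parent_table_modes["users"] = "Scale"
parent_table_modes["events"] = "Preserve Destination"
print(effective_config(parent_table_modes, child_overrides))
```

Removing the entry from the overrides restores the inheritance for that table.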
A child workspace can override the following configuration items.
Table modes - A child workspace can override the table mode for individual tables. The other tables continue to inherit the table mode that is configured in the parent workspace.
Column generators - A child workspace can override the generator for individual columns. The other columns continue to inherit the generator that is configured in the parent workspace. For linked columns, a change to any of the linked columns overrides the inheritance for all of the columns.
Subsetting - A child workspace can override the subsetting configuration from the parent workspace. Any change in the child workspace means that the child workspace no longer inherits any changes to the subsetting configuration from the parent workspace. For example, if you change the percentage setting on a single target table from 5 to 6, that eliminates the subsetting inheritance. The child workspace keeps the subsetting configuration that it already has, but it is not updated when the parent workspace is updated.
Post-job scripts - A child workspace can override the post-job scripts. Any change to the post-job scripts in the child workspace means that the child workspace no longer inherits any changes to the post-job scripts configuration.
Statistics seed - A child workspace can override the statistics seed.
From each view, you can eliminate the overrides and restore the inheritance.
A child workspace cannot override the following configuration items:
Data connector type and source database - A child workspace always uses the same source data as the parent workspace.
Foreign keys - A child workspace always uses the same foreign key configuration as the parent workspace.
Sensitivity designation for a column - A child workspace cannot change whether a column is marked as sensitive.
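One way to picture the inheritance rules above is as a lookup that prefers a child override and otherwise falls back to the parent value. The following Python sketch is illustrative only. The Workspace class and its generators, generator_overrides, effective_generator, and reset members are hypothetical names, not part of Structural or its API.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Workspace:
    """Hypothetical model of a workspace; not a Structural API object."""
    name: str
    parent: Optional["Workspace"] = None
    # Column generators configured directly on this workspace.
    generators: Dict[str, str] = field(default_factory=dict)
    # Per-column overrides made in a child workspace.
    generator_overrides: Dict[str, str] = field(default_factory=dict)

    def effective_generator(self, column: str) -> str:
        """Return the child override if present, else the inherited value."""
        if column in self.generator_overrides:
            return self.generator_overrides[column]
        source = self.parent if self.parent is not None else self
        # Columns without a configured generator are treated as Passthrough.
        return source.generators.get(column, "Passthrough")

    def reset(self, column: str) -> None:
        """Remove an override so that the column inherits again."""
        self.generator_overrides.pop(column, None)


parent = Workspace("parent", generators={
    "users.first_name": "Passthrough",
    "users.notes": "Passthrough",
})
child = Workspace("team-a", parent=parent)

# The child overrides one column; the other column keeps inheriting.
child.generator_overrides["users.notes"] = "Character Scramble"

# A later parent change reaches the child only where there is no override.
parent.generators["users.first_name"] = "Name"

inherited = child.effective_generator("users.first_name")
overridden = child.effective_generator("users.notes")

# Resetting the override restores the inheritance.
child.reset("users.notes")
restored = child.effective_generator("users.notes")
```

In this sketch, the per-column lookup mirrors how table modes and column generators are overridden individually. Subsetting and post-job scripts behave differently: any change in the child stops inheritance for the whole configuration.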
For removed tables and columns, when a child workspace overrides the parent workspace configuration for the table or column, you must resolve the change in the child workspace.
If there is a conflicting change for the removed table or column in the parent workspace configuration, then regardless of whether the configuration is inherited, you must resolve that change in the parent workspace before the change is resolved for the child workspace.
For changes to column nullability or data type, you resolve the change separately in the child and parent workspaces.
You also dismiss notifications (new tables and columns) separately in the parent and child workspaces.

The column list on Database View contains information about the sensitivity and generator configuration for each column.
The Column column provides general information about the columns and their content, including:
Table and column name. When you click the column name, Table View for the column table displays.
The name of the schema that contains the table.
The data type for the column.
An indicator when the column is a primary key.
The Column column also contains the option to display sample data for the column.
The Status column provides information about whether the column contains sensitive data and whether it has an assigned generator.
The protection status can be one of the following values:
Protected - The column has an assigned generator.
Not Sensitive - The column is marked as not sensitive.
At Risk - The column is sensitive and does not have an assigned generator.
At the right of the Status column is a confidence indicator. For At Risk columns, the confidence indicator shows how confident Structural is that the column is sensitive and contains values of the displayed sensitivity type. Protected columns also reflect the original confidence level.
For more information about how Structural identifies values and assigns the confidence level, go to How Structural identifies sensitive values in Running the Structural sensitivity scan.
From the Status column, you can change whether a column is sensitive.
The Applied Generator column is where you select and configure the generator to assign.
The generator dropdown indicates the currently assigned generator. It also indicates when an unprotected column has a recommended generator.
For foreign key columns, the generator dropdown is disabled and the column is marked as a foreign key. Foreign key columns always inherit the generator that is assigned to the primary key.
In a child workspace, when the generator configuration overrides the parent workspace, the generator dropdown displays the override icon.
The Applied Generator column also contains the option to display and create column comments.
To filter the column list, you can:
Use the table list to filter the displayed columns based on the table that the columns belong to.
Use the filter field to filter the columns by table or column name.
Use the Filters panel to filter the columns based on column attributes and generator configuration.
You can use column filters to quickly find columns that you want to verify or to update the configuration for.
To filter the column list to only include columns for specific tables, in the table list, check the checkbox for each table to display columns for.
To filter the column list by table or column name, in the filter field, begin to type text that is in the table or column name.
As you type, Structural filters the column list.
The Filters panel provides access to column filters other than the table and column name.
To display the Filters panel, click Filters.
To search for a filter or a filter value, in the search field, start to type the value. The search looks for text in the individual settings.
For each filter, the Filters panel indicates the number of matching columns, based on the selected tables and the current filters.
To add a filter, depending on the filter type, either check the checkbox or select a filter option. As you add filters, Structural applies them to the column list. Above the list, Structural displays tags for the selected filters.
To clear all of the currently selected filters, click Clear All.
To only display detected sensitive columns for which there is a recommended generator, on the Filters panel, check Has Generator Recommendation.
An at-risk column:
Is marked as sensitive.
Is included in the destination data.
Is assigned the Passthrough generator.
To only display at-risk columns, on the Filters panel, check At-Risk Column.
When you check At-Risk Column, Structural makes the following changes under Privacy Settings:
Sets the sensitivity filter to Sensitive.
Sets the protection status filter to Not protected.
Sets the column inclusion filter to Included.
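The at-risk definition is the combination of three conditions, which the following Python sketch expresses as a predicate. The Column record and its field names are hypothetical and chosen only for illustration. They do not reflect how Structural stores column metadata.

```python
from dataclasses import dataclass

PASSTHROUGH = "Passthrough"


@dataclass
class Column:
    """Hypothetical column record; the field names are illustrative."""
    table: str
    name: str
    sensitive: bool
    generator: str = PASSTHROUGH
    included: bool = True  # False when, for example, the table is truncated


def is_protected(col: Column) -> bool:
    """A column is protected when any generator other than Passthrough is assigned."""
    return col.generator != PASSTHROUGH


def is_at_risk(col: Column) -> bool:
    """Sensitive, included in the destination data, and not protected."""
    return col.sensitive and col.included and not is_protected(col)


columns = [
    Column("users", "email", sensitive=True),
    Column("users", "first_name", sensitive=True, generator="Name"),
    Column("archive", "ssn", sensitive=True, included=False),
    Column("users", "id", sensitive=False),
]

at_risk = [f"{c.table}.{c.name}" for c in columns if is_at_risk(c)]
```

Only users.email satisfies all three conditions. The other sensitive columns are either protected or not included in the destination data.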
You can filter the columns based on the column sensitivity.
On the Filters panel, under Privacy Settings, the sensitivity filter is by default set to All, which indicates to display both sensitive and non-sensitive columns.
To only display sensitive columns, click Sensitive.
To only display non-sensitive columns, click Not sensitive.
Note that when you check At-Risk Column, Structural automatically selects Sensitive.
You can filter the columns based on whether they have any generator other than Passthrough assigned. To filter the columns based on specific assigned generators, use the Applied Generator filter.
On the Filters panel, under Privacy Settings, the column protection filter is by default set to All, which indicates to display both protected and not protected columns.
To only display columns that have an assigned generator, click Protected.
To only display columns that do not have an assigned generator, click Not protected.
Note that when you check At-Risk Column, Structural automatically selects Not protected.
You can filter the columns based on whether they are populated in the destination database. For example, if a table is truncated, then the columns in that table are not populated.
On the Filters panel, under Privacy Settings, the column inclusion filter is by default set to All, which indicates to display both included and not included columns.
To only display columns that are populated in the destination database, click Included.
To only display columns that are not populated in the destination database, click Not included.
Note that when you check At-Risk Column, Structural automatically selects Included.
To only display columns that are assigned specific generators, on the Filters panel, under Applied Generator, check the checkbox for each generator to include.
The list of generators only includes generators that are assigned to the currently displayed columns and that are compatible with other applied filters.
To search for a specific generator, in the Filters search field, begin to type the generator name.
You can filter the columns by the column data type. For example, you can only display varchar columns, or only columns that contain either numeric or integer values.
To only display columns that have specific data types, on the Filters panel, under Database Data Types, check the checkbox for each data type to include.
The list of data types only includes data types that are present in the currently displayed columns and that are compatible with other applied filters.
To search for a specific data type, in the Filters search field, begin to type the data type.
When the source database schema changes, you might need to update the configuration to reflect those changes. If you do not resolve the schema changes, then the data generation might fail. The data generation fails if there are unresolved conflicting changes, or if you configure Structural to always fail data generation when there are any unresolved changes.
For more information about schema changes, go to Viewing and resolving schema changes.
To only display columns that have unresolved schema changes, on the Filters panel, check Unresolved Schema Changes.
For detected sensitive columns, the sensitivity type indicates the type of data that was detected. Examples of sensitivity types include First Name, Address, and Email.
To only display columns that contain specific sensitivity types, on the Filters panel, under Sensitivity Type, check the checkbox for each sensitivity type to include.
The list of sensitivity types only includes sensitivity types that are present in the currently displayed columns.
To search for a specific sensitivity type, in the Filters search field, type the sensitivity type.
When the Structural sensitivity scan identifies a value as belonging to a sensitivity type, it also determines how confident it is in that determination. The Status column displays the confidence level.
You can filter the columns based on the confidence level.
To only display columns that have a specific confidence level, on the Filters panel, under Sensitivity confidence, check the checkbox next to each confidence level to include.
You can filter the column list based on whether the column is nullable.
On the Filters panel, under Data Attributes, the nullability filter is by default set to All, which indicates to display both nullable and non-nullable columns.
To only display columns that are nullable, click Nullable.
To only display columns that are not nullable, click Non-nullable.
You can filter the column list based on whether the column must be unique.
On the Filters panel, under Data Attributes, the uniqueness filter is by default set to All, which indicates to display both unique and not unique columns.
To only display columns that must be unique, click Unique.
To only display columns that do not require uniqueness, click Not unique.
You can filter the column list to indicate whether to include:
Columns that are not primary or foreign keys.
Columns that are foreign keys.
Columns that are primary keys.
On the Filters panel, under Column Type:
To display columns that are neither a primary key nor a foreign key, check Non-keyed.
To display columns that are primary keys, check Primary key.
To display columns that are foreign keys, check Foreign key.
In a child workspace, to only display columns that override the generator configuration that is in the parent workspace, on the Filters panel, check Overrides Inheritance.
You can enable Structural data encryption, a configuration that allows Structural to:
Decrypt source data before applying the generator.
Encrypt generated data before writing it to the destination database.
For more information, go to Configuring and using Structural data encryption.
When Structural data encryption is enabled, the generator configuration panel includes an option to use Structural data encryption.
To only display columns that are configured to use Structural data encryption, on the Filters panel, check Uses Data Encryption.
By default, the column list is sorted first by table name, then by column name. The columns for each table display together. Within each table, the columns are in alphabetical order.
You can also sort the column list by column name first, then by table. Columns that have the same name display together. Those columns are sorted by the name of the table.
The button at the right of the Column column heading indicates the current sort order.
T.C indicates that the list is sorted by table, then by column.
C.T indicates that the list is sorted by column, then by table.
To switch the sort order, click the button.
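The two sort orders differ only in which name is compared first. As a small Python sketch, with made-up table and column names and plain string comparison (how Structural handles case or special characters is not stated here):

```python
# Each entry is (table name, column name). The names are invented.
columns = [
    ("users", "id"),
    ("orders", "id"),
    ("users", "email"),
    ("orders", "total"),
]

# T.C: sort by table name, then by column name.
by_table_then_column = sorted(columns, key=lambda tc: (tc[0], tc[1]))

# C.T: sort by column name, then by table name.
by_column_then_table = sorted(columns, key=lambda tc: (tc[1], tc[0]))
```

With C.T, the two id columns display next to each other, which makes it easier to compare the configuration of columns that have the same name in different tables.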
Table View displays source or preview data for a single table. For a file connector workspace, each table corresponds to a file group.
To display Table View:
On the workspace management view, click Table View.
On Workspaces view, from the dropdown menu in the Name column, select Table View.
From Database View, either click the arrow icon for the table, or click a row in the table.
From Table View, you can view and update the table and column configuration.
When you display Table View from Database View, it displays the data for the selected table.
When you display Table View from the workspace management view or Workspaces view, it displays the most recently displayed table.
If Table View was never displayed before, then it displays the first table in the workspace.
To change the selected table, from the Table dropdown list, select the table to view.
To change the table mode that is assigned to the table:
Click the current table mode.
On the table mode panel, from the table mode dropdown list, select the new table mode.
When you change the table mode, Tonic Structural updates the preview data as needed. For example, if you change the table mode to Truncate, then the preview data is empty.
For a child workspace, the table mode selection panel indicates whether the selected table mode is inherited from the parent workspace.
If the child workspace currently overrides the parent workspace configuration, then to reset the table mode to the table mode that is assigned in the parent workspace, click Reset.
The Model section of Table View displays the configured generators for the table columns.
The header for each Model entry is the column name.
Linked columns share an entry. The heading is a comma-separated list of the linked columns.
Each entry contains the following information:
The column and generator, in the format Column Name >> Generator Name. For example, First_Name >> Name indicates that the First_Name column has the Name generator applied.
For linked columns, there is a Column Name >> Generator Name entry for each column.
The selected configuration options for the generator.
For a child workspace, each Model entry indicates whether the configuration overrides the parent configuration. For configurations that override the parent, to remove the overrides and restore the inheritance, click Reset.
The Model entry also indicates when Structural data encryption is enabled for the column.
To remove the generator from a column, click the delete icon.
The Preview toggle at the top right of Table View allows you to choose whether to display original source data or the transformed data. You can switch back and forth to understand exactly how Structural transforms the data based on the table and column configuration.
By default, the Preview toggle is in the on position, and the displayed data reflects the selected table mode and the assigned generators. For tables that use Truncate mode, the preview data is empty. Truncated tables do not have data in the destination database.
To display the original source data, toggle Preview to the off position.
Note that for , you cannot preview the destination data from Table View. You must preview the data from Document View.
You can provide a query to filter the source data. The query is always against the source data, not the preview data, regardless of whether the Preview toggle is off or on.
For example, you configure a first name field to use the Name generator and enable consistency. You can then query the source data for a specific first name value to check that the preview data uses the same destination value for all of those records.
To apply a query to the source data:
Click the query filter icon, located between the table name and the table mode.
On the Table Filter dialog, provide the WHERE clause for the query.
To apply the query, click Apply.
To close the dialog, click Close.
To clear an applied query, on the Table Filter dialog, click Clear.
If no filter is applied, then the query filter icon has a white background.
If a valid filter is applied, then the query filter icon has a gray background.
If the provided WHERE clause is not valid, then the query filter icon has a red background.
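As an illustration of the consistency check described above, the following Python sketch filters a small in-memory SQLite table with a WHERE clause and confirms that every matching row maps to the same output value. Everything here is an assumption made for the example: the customers table, the consistent_stand_in function (a hash-based stand-in, not the Name generator), and the use of SQLite. The WHERE clause syntax that Structural accepts depends on the source database.

```python
import hashlib
import sqlite3

# Hypothetical source table; the names are not from Structural.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, first_name TEXT)")
conn.executemany(
    "INSERT INTO customers (id, first_name) VALUES (?, ?)",
    [(1, "Alice"), (2, "Bob"), (3, "Alice"), (4, "Carol")],
)

# Only the WHERE clause is provided on the Table Filter dialog.
where_clause = "first_name = 'Alice'"
source_rows = conn.execute(
    f"SELECT id, first_name FROM customers WHERE {where_clause} ORDER BY id"
).fetchall()


def consistent_stand_in(value: str) -> str:
    """Stand-in for a consistent generator: the same input gives the same output."""
    names = ["Jordan", "Riley", "Morgan", "Casey"]
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    return names[digest[0] % len(names)]


# The filter selects source rows; the preview shows their transformed values.
preview_rows = [(row_id, consistent_stand_in(name)) for row_id, name in source_rows]
distinct_outputs = {name for _, name in preview_rows}
```

Because the filter runs against the source data, both Alice rows are selected even though the preview displays a different name. A single entry in distinct_outputs indicates that the mapping is consistent.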
In addition to the column name, the column heading provides details about the column type and protection status. It also provides access to change the column configuration.
The column heading indicates when a column is either a primary key or a foreign key.
The column heading indicates the column protection status:
At risk columns are sensitive and do not have an assigned generator.
Protected columns have an assigned generator.
Not sensitive columns are not sensitive and do not have an assigned generator.
The sensitivity confidence indicator indicates the confidence in the detection.
For sensitive columns that Structural detected, the confidence level can be high, medium, or low.
For custom sensitivity rule matches or columns that you manually marked as sensitive, the confidence level is full confidence.
For more information about how Structural identifies values and assigns the confidence level, go to How Structural identifies sensitive values in Running the Structural sensitivity scan.
The column heading displays the type of data that the column contains.
In a child workspace, when a column overrides the parent configuration, an Overriding label displays in the column heading.
To filter Table View to only display columns with overrides, toggle Show Overrides Only to the on position.
When a sensitivity scan identifies a column, Structural recommends a generator for the column. For example, when the sensitivity scan identifies a column as a first name, Structural recommends the Name generator configured to generate a first name value.
For unprotected columns that have a recommended generator, the column heading displays the available recommendation icon.
When you click the dropdown, the column configuration panel includes the following information:
The sensitivity confidence level
The recommended generator
Sample source and destination values based on the recommended generator
From the panel, you can choose whether to assign or ignore the recommended generator for that type.
To assign the recommended generator, click Apply.
To ignore the recommendation, click Ignore. Structural clears the recommendation.
To assign a generator to a column that does not have an assigned generator, or to change the current configuration, click the dropdown in the column heading.
On the generator configuration panel, from the generator type dropdown list, select the generator to assign to the column.
Structural displays the available configuration options for the selected generator. For details about the configuration options for each generator, go to the generator reference.
To remove the selected generator or generator preset, and reset the generator to Passthrough, click the delete icon next to the generator.
For more information about selecting and configuring generators and generator presets, go to .
On the column configuration panel, the Sensitive Data toggle indicates whether the column is marked as sensitive. The initial configuration is based on the sensitivity scan.
To mark a column as sensitive, toggle the setting to the on position.
To mark a column as not sensitive, toggle the setting to the off position.
In a child workspace, you cannot configure whether a column is sensitive. A child workspace always inherits the sensitivity designation from its parent workspace.
When you copy a workspace, Structural performs a new sensitivity scan on the copy. It does not copy the sensitivity designations from the original workspace.
For a JSON column, instead of assigning a generator, you can enable Document View.
From Document View, you can view the JSON schema structure and assign generators to individual JSON fields. For more information, go to .
To enable Document View, on the column configuration panel, toggle Use Document View to the on position. Note that if you have , or enabled , then the Use Document View toggle is in the advanced options.
When Document View is enabled, the generator dropdown is replaced with the Open in Document View option.
For MongoDB and Amazon DynamoDB, Collection View replaces Database View and Table View. From Collection View, you can view the fields in a selected collection. You can then assign a collection mode to the collection, and assign generators to fields.
From the Collection dropdown list, select the collection to view.
Collection mode is the term used for table mode. The collection mode determines at the collection level how Structural uses the collection data to generate the destination database.
By default, the collection mode is De-Identify. In this mode, Structural uses the assigned generators to transform the source database into the destination database.
For MongoDB and Amazon DynamoDB, the only other options are Truncate and Preserve Destination.
Truncate means that only the collection structure is included in the destination database. The collection has no data in the destination database.
Preserve Destination means that Structural does not change the data that is currently in the destination database.
To assign the collection mode:
Click the Collection Mode dropdown list.
On the panel, click the current collection mode.
From the drop-down list, select the mode to use.
You can view a collection either as a hybrid document or as single documents. From the View dropdown list, select the view to use.
The default view is Hybrid Document. For the hybrid document view, the key list reflects all of the permutations of every field from every document. For example, a field might sometimes be a datetime value and sometimes a string. Hybrid document view lists both types.
Single Document view displays a single document at a time. You can then page through up to 100 documents. Single Document view displays the structure for each document.
For each field, Collection View always displays:
The field name and type.
For fields that you configured as primary or foreign keys, a key icon.
The assigned generator.
An example value. For the hybrid view, you can use the magnifying glass icon to display additional example values.
For the hybrid document view, there is also a Field Freq column. Field Freq shows the percentage of documents that contain that permutation of field and type.
For example, a field might be Null 33% of the time and contain a numeric value 67% of the time. Or a field value might be an Int32 value 3% of the time and an Int64 value 6% of the time. The percentages apply to the first 100 documents.
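A percentage like Field Freq can be computed by counting each combination of field and type across the sampled documents. The Python sketch below is a rough illustration under several assumptions: the field_frequencies function is hypothetical, it looks only at top-level fields, and it uses Python type names rather than the database type names that Collection View displays.

```python
from collections import Counter
from typing import Any, Dict, List, Tuple


def field_frequencies(
    documents: List[Dict[str, Any]], sample_size: int = 100
) -> Dict[Tuple[str, str], float]:
    """Percentage of sampled documents that contain each (field, type) pair."""
    sample = documents[:sample_size]
    if not sample:
        return {}
    counts: Counter = Counter()
    for doc in sample:
        for name, value in doc.items():
            # Null is tracked as its own type, as in the hybrid document view.
            type_name = "Null" if value is None else type(value).__name__
            counts[(name, type_name)] += 1
    return {key: 100.0 * n / len(sample) for key, n in counts.items()}


docs = [
    {"age": None},
    {"age": 30},
    {"age": 41, "nickname": "sam"},
]
freq = field_frequencies(docs)
```

For these three documents, the age field is Null in one of three documents and an integer in two of three, which corresponds to the 33% and 67% split in the example above.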
The Preview toggle at the top right of Collection View allows you to choose whether to display original source data or the transformed data. You can switch back and forth to determine exactly how Tonic Structural transforms the data based on the collection and field configuration.
By default, the Preview toggle is in the on position, and the displayed data reflects the selected collection mode and the assigned generators. For collections that use Truncate mode, the preview data is empty. Truncated collections do not have data in the destination database.
To display the original source data, toggle Preview to the off position.
In the single document view, you can filter the fields by either the field name or the field value.
In the hybrid document view, you can filter the fields based on either the field name or field properties.
You can filter single document view to only display fields that have specific text in either the field name or the field value.
To filter by value, toggle Search by Value to the on position.
After you select the filter type, in the search field, type text that is in the field name or value. As you type, Structural filters the list to only include fields that contain the filter text.
To filter hybrid view by field name, in the search field, begin to type text that is in the field name. As you type, Structural filters the list to only include fields with names that include the filter text.
From the hybrid document view, you can filter the fields based on field properties.
To display the Filters panel, click Filters.
To search for a filter or a filter value, in the search field, start to type the value. The search looks for text in the individual settings.
To add a filter, depending on the filter type, either check the checkbox or select a filter option. As you add filters, Structural applies them to the field list.
Above the list, Structural displays tags for the selected filters.
To clear all of the currently selected filters, click Clear All.
The Filters panel in hybrid view includes the following filters.
An at-risk field:
Is marked as sensitive.
Is assigned the Passthrough generator.
To only display at-risk fields, on the Filters panel, check At-Risk Field.
When you check At-Risk Field, Structural makes the following changes under Privacy Settings:
Sets the sensitivity filter to Sensitive.
Sets the protection status filter to Not protected.
You can filter the fields based on the field sensitivity.
On the Filters panel, under Privacy Settings, the sensitivity filter is by default set to All, which indicates to display both sensitive and non-sensitive fields.
To only display sensitive fields, click Sensitive.
To only display non-sensitive fields, click Not sensitive.
Note that when you check At-Risk Field, Structural automatically selects Sensitive.
You can filter the fields based on whether they have any generator other than Passthrough assigned.
On the Filters panel, under Privacy Settings, the field protection filter is by default set to All, which indicates to display both protected and not protected fields.
To only display fields that have an assigned generator, click Protected.
To only display fields that do not have an assigned generator, click Not protected.
Note that when you check At-Risk Field, Structural automatically selects Not protected.
When Structural detects that a field is sensitive, it can also determine a recommended generator.
For example, when it detects a name value, it also recommends the Name generator.
You can filter the fields to display the fields that have recommended generators.
On the Filters panel, under Recommended Generators, check the checkbox next to the recommended generator for which to display the fields that have that recommendation.
You can filter the fields by the field data type. For example, you might only display fields that contain either numeric or integer values.
To only display fields that have specific data types, on the Filters panel, under Database Data Types, check the checkbox for each data type to include.
The list of data types only includes data types that are present in the currently displayed fields and that are compatible with other applied filters.
To search for a specific data type, in the Filters search field, begin to type the data type.
When the source database schema changes, you might need to update the configuration to reflect those changes. If you do not resolve the schema changes, then the data generation might fail. The data generation fails if there are unresolved conflicting changes, or if you configure Structural to always fail data generation when there are any unresolved changes.
For more information about schema changes, go to Viewing and resolving schema changes.
To only display fields that have unresolved schema changes, on the Filters panel, check Unresolved Schema Changes.
For detected sensitive fields, the sensitivity type indicates the type of data that was detected. Examples of sensitivity types include First Name, Address, and Email.
To only display fields that contain specific sensitivity types, on the Filters panel, under Sensitivity Type, check the checkbox for each sensitivity type to include.
The list of sensitivity types only includes sensitivity types that are present in the currently displayed fields.
To search for a specific sensitivity type, in the Filters search field, type the sensitivity type.
When the Structural sensitivity scan identifies a value as belonging to a sensitivity type, it also determines how confident it is in that determination.
You can filter the fields based on the confidence level.
To only display fields that have a specific confidence level, on the Filters panel, under Sensitivity confidence, check the checkbox next to each confidence level to include.
You can filter the field list to indicate whether to include:
Fields that are not primary or foreign keys.
Fields that are foreign keys.
Fields that are primary keys.
On the Filters panel, under Field Type:
To display fields that are neither a primary key nor a foreign key, check Non-keyed.
To display fields that are primary keys, check Primary key.
To display fields that are foreign keys, check Foreign key.
You can add comments to fields. For example, you might use a comment to explain why you selected a particular generator or marked a field as sensitive or not sensitive.
If a field does not have any comments, then to add a comment:
Click the comment icon.
In the comment field, type the comment text.
Click Comment.
When a field has existing comments, the comment icon is green. To add comments:
Click the comment icon. The comments panel shows the previous comments. Each comment includes the comment user and timestamp.
In the comment field, type the comment text.
Click Reply.
On the field configuration panel, the sensitivity toggle at the top right indicates whether the field is marked as sensitive.
To mark a field as sensitive, toggle the setting to the Sensitive position.
To mark a field as not sensitive, toggle the setting to the Not Sensitive position.
You can assign a generator to each combination of field and type. For example, depending on the document, the data type for a field might be either string or integer. You can indicate to use the Character Scramble generator when the field type is a string and the Random Integer generator when the field type is integer.
In hybrid document view, the Null type reflects when the field value is Null. You do not assign a generator to it.
To assign a generator:
Click the generator value for the field.
On the configuration panel, from the Generator Type dropdown list, select the generator.
Configure the generator options. For details about the available configuration options for each generator, go to the generator reference.
In addition to assigning generators to individual fields, you can assign generators to generic paths. The paths use JSONPath syntax.
For more information, go to .
By default, Structural retrieves 100 documents. It then uses the data in these documents to populate example values in the hybrid document.
For sparsely populated collections, where less common fields are not present in those 100 documents, Structural retrieves extra documents until it has example values for all fields. For very sparsely populated collections, this might cause the collection view to load slowly, because it must retrieve many documents.
To disable the extra examples for sparse collections, set the environment setting TONIC_MONGO_DISABLE_EXTRA_EXAMPLES to true. You can add this setting manually to the Environment Settings list on Structural Settings.
Note that this setting applies to both MongoDB and Amazon DynamoDB.
When this setting is true, fields that do not have a retrieved value use a dummy default value that is based on the data type.

By default, Tonic Structural data generation replaces the existing destination database with the transformed data from the current job.
Upsert adds and updates rows in the destination database, but keeps all of the other existing rows intact. For example, you might have a standard set of test records that you do not want to replace every time you generate data in Structural.
If you enable upsert, then you cannot write the destination data to a container repository or to a Tonic Ephemeral snapshot. You must write the data to a database server.
Upsert is currently only supported for the following data connectors:
MySQL
Oracle
PostgreSQL
SQL Server
For an overview of upsert, you can also view the video tutorial.
When upsert is enabled, the data generation job writes the generated data to an intermediate database. Structural then runs the upsert job to write the new and updated records to the destination database.
The destination database must already exist. Structural cannot run an upsert job to an empty destination database.
The upsert job adds and updates records based on the primary keys.
If the primary key for a record already exists in the destination database, the upsert job updates the record.
If the primary key for a record does not exist in the destination database, the upsert job inserts a new row.
To only update or insert records that Structural creates based on source records, and ignore other records that are already in the destination database, ensure that the primary keys for each set of records operate on different ranges. For example, allocate the integer range 1-1000 for existing destination database records that you add manually. Then ensure that the source database records, and by extension the records that Structural creates during data generation, use a different range.
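The insert-or-update rule above can be illustrated with a minimal Python sketch. It is not Structural's implementation. The `upsert` function, the dictionary-based tables, and the sample key ranges are hypothetical stand-ins that only show how matching on primary keys leaves unrelated destination rows intact.

```python
def upsert(destination, generated):
    """Apply generated rows to destination rows, matched by primary key.

    Both arguments are dicts that map a primary key to a row. They are
    hypothetical in-memory stand-ins for database tables.
    """
    result = dict(destination)  # rows that are not touched stay intact
    inserted, updated = [], []
    for key, row in generated.items():
        if key in result:
            updated.append(key)   # primary key exists: update the record
        else:
            inserted.append(key)  # primary key is new: insert a row
        result[key] = row
    return result, inserted, updated


# Assumed ranges: keys 1-1000 are reserved for manually added test records.
destination = {1: "manual test user", 1001: "old generated user"}
generated = {1001: "new generated user", 1002: "another generated user"}
merged, inserted, updated = upsert(destination, generated)
```

Because the manually added record uses a key outside the generated range, it is neither updated nor replaced.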
Also note that when upsert is enabled, the Truncate table mode does not actually truncate the destination table. Instead, it works more like Preserve Destination table mode, which preserves existing records in the destination table.
To enable upsert, in the Upsert section of the workspace details, toggle Enable Upsert to the on position.
When you enable upsert for a workspace, you are prompted to configure the upsert processing and provide the connection details for the intermediate database.
When you enable upsert, Structural displays the following settings to configure the upsert process.
Disable Triggers
Indicates whether to disable any user-defined triggers before the upsert job runs. This prevents duplicate rows from being added to the destination database. By default, this is enabled.
Automatically Start Upsert After Successful Data Generation
Indicates whether to immediately run the upsert job after the initial data generation to the intermediate database. By default, this is enabled. If you turn this off, then after the initial data generation, you must start the upsert job manually. For more information, go to .
Persist Conflicting Data Tables
When an upsert job cannot process rows with unique constraint conflicts, as well as rows that have foreign keys to those rows, this setting indicates whether to preserve the temporary tables that contain those rows. By default, this is disabled. Structural only keeps the applicable temporary tables from the most recent upsert job.
Warn on Mismatched Constraints
Indicates whether to treat mismatched foreign key and unique constraints between the source and destination databases as warnings instead of errors, so that the upsert job does not fail. By default, this is disabled.
The intermediate database must have the same schema as the destination database. If the schemas do not match, then the upsert process fails.
To ensure that schema changes are automatically reflected in the intermediate database, you can connect the workspace to your own database migration script or tool. Structural then runs the migration script or tool whenever you run upsert data generation.
When you start an upsert data generation job:
If migration is enabled, Structural calls the endpoint to start the migration.
Structural cannot start the upsert data generation until the migration completes successfully. It regularly calls the status check endpoint to check whether the migration is complete.
When the migration is complete, Structural starts the upsert data generation.
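The sequence above amounts to a start call followed by a polling loop. The following Python sketch shows that pattern under stated assumptions: `start_migration` and `get_status` are hypothetical callables that wrap the configured endpoints, the poll interval is arbitrary, and treating Completed, Failed, and Canceled as the statuses that end polling is an assumption based on the status values that the status check endpoint can return.

```python
import time

# Assumed set of statuses that end the polling loop.
TERMINAL_STATUSES = {"Completed", "Failed", "Canceled"}


def wait_for_migration(start_migration, get_status, poll_seconds=5.0, sleep=time.sleep):
    """Start a migration, then poll until it reaches a terminal status.

    start_migration() returns {"id": ...}. get_status(task_id) returns
    {"id": ..., "status": ..., "errors": [...]}. Both are hypothetical
    callables that wrap the configured endpoints.
    """
    task_id = start_migration()["id"]
    while True:
        response = get_status(task_id)
        if response["status"] in TERMINAL_STATUSES:
            # Only a Completed migration allows the upsert data generation to start.
            return response["status"] == "Completed", response
        sleep(poll_seconds)


# Simulated endpoints for demonstration.
_statuses = iter(["Queued", "Running", "Completed"])
ok, final = wait_for_migration(
    start_migration=lambda: {"id": "task-1"},
    get_status=lambda task_id: {"id": task_id, "status": next(_statuses), "errors": []},
    sleep=lambda seconds: None,  # skip real waiting in the demonstration
)
```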
POST Start Schema Changes
Required. Structural calls this endpoint to start the migration process specified by the provided URL.
The request includes:
Any custom parameter values that you add.
The connection information for the intermediate database.
The request uses the following format:
{ 
  "parameters": {/* user supplied parameters */ },
  "databaseConnectionDetails": {
        "server": "rds.amazon.com",
        "port": "54321",
        "username": "user",
        "password": "password",
        "databaseName": "tonic_upsert",
        "schemaName": "<Oracle schema to use>",
        "sslEnabled": true,
        "trustServerCertificate": false
  }
}
The response contains the identifier of the migration task.
The response uses the following format:
{ "id": "<unique-string-identifier>" }
GET Status of Schema Change
Required. Structural calls this endpoint to check the current status of the migration process.
The request includes the task identifier that was returned when the migration process started. The request URL must be able to pass the task identifier as either a path or a query parameter.
The response provides the current status of the migration task. The possible status values are:
Unknown
Queued
Running
Canceled
Completed
Failed
The response uses the following format:
{
  "id": "a0c5c4c3-a593-4daa-a935-53c45ec255ea",
  "status": "Completed",
  "errors": []
}
GET Schema Change Logs
Optional. Structural calls this endpoint to retrieve the log entries for the migration process. It adds the migration logs to the upsert logs.
The request includes the task identifier that was returned when the migration process started. The request URL must be able to pass the task identifier as either a path or a query parameter.
The response must use the text/plain content type. The response body contains the raw logs.
DELETE Cancel Schema Changes
Optional. Structural calls this endpoint to cancel the migration process.
The request includes the task identifier that was returned when the migration process started. The request URL must be able to pass the task identifier as either a path or a query parameter.
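To make the four endpoint contracts concrete, here is a minimal, framework-agnostic Python sketch of the logic that a migration service might run behind them. Everything in it is illustrative: the `MigrationService` class, its in-memory task store, and the choice to report Unknown for an unrecognized identifier are assumptions. A real service would run an actual migration tool and expose these methods over HTTP.

```python
import uuid


class MigrationService:
    """Hypothetical in-memory stand-in for a migration tool behind the four endpoints."""

    def __init__(self):
        self._tasks = {}

    def start(self, body):
        """POST start: accept parameters and connection details, return the task id."""
        task_id = str(uuid.uuid4())
        connection = body["databaseConnectionDetails"]
        self._tasks[task_id] = {
            "status": "Queued",
            "errors": [],
            "log": [f"queued migration for {connection['databaseName']}"],
        }
        return {"id": task_id}

    def status(self, task_id):
        """GET status: report the current status of the task."""
        task = self._tasks.get(task_id)
        if task is None:
            # Assumption: an unrecognized identifier maps to the Unknown status.
            return {"id": task_id, "status": "Unknown", "errors": []}
        return {"id": task_id, "status": task["status"], "errors": task["errors"]}

    def logs(self, task_id):
        """GET logs: return the raw log text, to be served as text/plain."""
        return "\n".join(self._tasks[task_id]["log"])

    def cancel(self, task_id):
        """DELETE cancel: stop a task that has not finished."""
        task = self._tasks[task_id]
        if task["status"] in ("Queued", "Running"):
            task["status"] = "Canceled"
            task["log"].append("canceled")


service = MigrationService()
started = service.start({
    "parameters": {},
    "databaseConnectionDetails": {"databaseName": "tonic_upsert"},
})
service.cancel(started["id"])
```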
To enable the migration process, toggle Enable Migration Service to the on position.
When you enable the migration process, you must configure the POST Start Schema Changes and GET Status of Schema Change endpoints.
You can optionally configure the GET Schema Change Logs and DELETE Cancel Schema Changes endpoints.
To configure the endpoints:
To configure the POST Start Schema Changes endpoint:
In the URL field, provide the URL of the migration script.
Optionally, in the Parameters field, provide any additional parameter values that your migration scripts need.
To configure the GET Status of Schema Change endpoint, in the URL field, provide the URL for the status check.
The URL must include an {id} placeholder. This is used to pass the identifier that is returned from the Start Schema Changes endpoint.
To configure the GET Schema Change Logs endpoint, in the URL field, provide the URL to use to retrieve the logs.
The URL must include an {id} placeholder. This is used to pass the identifier that is returned from the Start Schema Changes endpoint.
To configure the DELETE Cancel Schema Changes endpoint, in the URL field, provide the URL to use for the cancellation.
The URL must include an {id} placeholder. This is used to pass the identifier that is returned from the Start Schema Changes endpoint.
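As a rough illustration of how an {id} placeholder can carry the task identifier as either a path segment or a query parameter, consider the following Python sketch. The `fill_task_id` helper and the example.com URLs are hypothetical, and URL-encoding the identifier is an assumption of the sketch, not documented Structural behavior.

```python
from urllib.parse import quote


def fill_task_id(url_template, task_id):
    """Replace the {id} placeholder with the URL-encoded task identifier."""
    if "{id}" not in url_template:
        raise ValueError("URL must include an {id} placeholder")
    return url_template.replace("{id}", quote(task_id, safe=""))


# Hypothetical URLs: the identifier can be a path segment or a query parameter.
path_url = fill_task_id("https://migrations.example.com/tasks/{id}/status", "a0c5c4c3")
query_url = fill_task_id("https://migrations.example.com/status?task={id}", "a0c5c4c3")
```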
When you enable upsert, you must provide the connection information for the intermediate database.
For details, go to the workspace configuration information for the data connector.
During upsert data generation, when Structural finds inconsistencies between the source and destination database schemas:
Where possible, Structural attempts to address the issue so that the data generation can succeed.
Structural does not change the schema of the destination database.
For constraint-related schema issues, Structural only attempts to address the issues if Warn on Mismatched Constraints is enabled for the workspace. If the setting is turned off, then the job fails.
Here are some common schema issues that can occur, and how Structural responds to them.
In this case, a column that is present in the source schema is not present in the destination schema.
For example, a new column is added to a production source table, but is not in the schema of the de-identified destination database that is used for testing.
When this occurs, Structural ignores the column. It does not add the column to the destination schema.
Structural adds a warning to the job logs.
In this case, a column that is present in the destination schema is not present in the source schema.
For example, a developer adds a column to the de-identified destination database so that they can test a new feature. The new feature is not yet released, so the source production data doesn't include the column.
When this occurs:
If the destination column is nullable, then Structural sets the value to NULL.
If the destination column is not nullable, but the column has a default value, then Structural sets the destination value to the default.
If the non-nullable destination column does not have a default value, then Structural attempts to set a value based on the column data type. For example, Structural might set an integer column to 0, or a varchar column to an empty string.
If Structural is unable to set a value, then the data generation fails and Structural returns an error.
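The fallback order above can be summarized as a small decision function. This Python sketch is only an illustration of that order. The `fill_value` function and the `TYPE_DEFAULTS` table are hypothetical, and the table contains only the two type defaults that the text gives as examples.

```python
# Assumed type defaults: the text only names integer -> 0 and varchar -> "".
TYPE_DEFAULTS = {"integer": 0, "varchar": ""}
_NO_DEFAULT = object()  # sentinel, because None can be a legitimate default


def fill_value(nullable, data_type, default=_NO_DEFAULT):
    """Choose a value for a destination column that has no source column."""
    if nullable:
        return None  # written as NULL
    if default is not _NO_DEFAULT:
        return default  # the default value of the column
    if data_type in TYPE_DEFAULTS:
        return TYPE_DEFAULTS[data_type]  # value based on the data type
    raise ValueError(f"cannot choose a value for non-nullable {data_type} column")
```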
In this case, the same column has different data types in the source and destination schemas.
For example, a column might be a string in the source schema and a timestamp in the destination schema.
When this occurs, for each record:
If possible, Structural converts the values. For example, the source column is a string and contains datetime values. The generator also produces datetime values. In that case, Structural should be able to populate a datetime destination column.
If it cannot convert the value, and the column is nullable, then Structural sets the destination column value to NULL.
If it cannot convert the value, and the column is not nullable, then the record is excluded from the upsert.
For each of these actions, Structural also adds warnings to the job logs.
If Structural cannot perform any of those actions to work around the issue, then the data generation fails and Structural returns an error.
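The per-record handling above can be sketched in Python for the string-to-datetime example. The `convert_for_destination` function, the `SKIP` marker, and the warning messages are hypothetical, and ISO 8601 parsing is an assumption chosen to keep the example self-contained.

```python
from datetime import datetime

SKIP = object()  # marks a record that is excluded from the upsert


def convert_for_destination(value, nullable, warnings):
    """Convert a string value for a datetime destination column."""
    try:
        converted = datetime.fromisoformat(value)
    except (TypeError, ValueError):
        if nullable:
            warnings.append(f"could not convert {value!r}; wrote NULL")
            return None
        warnings.append(f"could not convert {value!r}; excluded the record")
        return SKIP
    warnings.append(f"converted {value!r} to datetime")
    return converted


warnings = []
converted = convert_for_destination("2024-05-01T12:30:00", nullable=False, warnings=warnings)
as_null = convert_for_destination("not a date", nullable=True, warnings=warnings)
excluded = convert_for_destination("not a date", nullable=False, warnings=warnings)
```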
In this case, a constraint on a source column is not present in the destination schema.
For example, a column is required in the source schema but optional in the destination schema.
If Warn On Mismatched Constraints is enabled for the workspace, then Structural does not have to make any changes to the data. It populates the destination column correctly.
Structural also adds a warning to the job logs.
If Warn On Mismatched Constraints is turned off, then the job fails.
In this case, a constraint on a destination column is not present in the source schema.
For example, a column has no constraints in the source schema, but has a uniqueness constraint in the destination schema.
When this occurs, if Warn on Mismatched Constraints is enabled for the workspace, Structural removes any records that fail the constraint. For example, for a uniqueness constraint, Structural removes duplicate records.
Structural also adds warnings to the job logs.
If Warn on Mismatched Constraints is turned off, then the job fails.
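For the uniqueness example, removing the records that fail the constraint might look like the following Python sketch. It is an assumption-laden illustration: the `drop_unique_violations` function is hypothetical, and the source does not state which of the duplicate records is kept. This sketch keeps the first one.

```python
def drop_unique_violations(records, column):
    """Keep the first record for each value of a unique column.

    Which duplicate survives is an assumption of this sketch.
    """
    seen = set()
    kept, warnings = [], []
    for record in records:
        value = record[column]
        if value in seen:
            warnings.append(f"removed record with duplicate {column}={value!r}")
            continue
        seen.add(value)
        kept.append(record)
    return kept, warnings


rows = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": "b@example.com"},
    {"id": 3, "email": "a@example.com"},
]
kept, warnings = drop_unique_violations(rows, "email")
```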
In this case, a table in the source schema is not present in the destination schema.
For example, a new table is added to a production source database, but is not yet in the schema of the de-identified destination database that is used for testing.
When this occurs, Structural ignores the table. It does not add the table to the destination schema.
Structural also adds warnings to the job logs.
In this case, a table in the destination schema is not present in the source schema.
For example, a developer adds a table to the de-identified destination database so that they can test a new feature. Because the new feature is not yet released, the source production data doesn't include the table.
When this occurs, Structural ignores the table. It does not attempt to populate the destination table.
Structural also adds warnings to the job logs.
Structural cannot detect that a table is renamed.
From Structural's perspective, the original table is removed, and the table with the new name is added.
For example, a source and destination schema both contain a table called Users.
In the source database, the Users table is renamed to People.
Structural would detect the following schema issues:
The source schema contains a People table that is not in the destination schema. For information about how Structural addresses this, go to "Source table is not in the destination schema".
The destination schema contains a Users table that is not in the source schema. For information about how Structural addresses this, go to "Destination table is not in the source schema".
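The reason that a rename appears as two separate issues is that a comparison by table name can only see set differences. The following Python sketch illustrates this with the Users and People example. The `compare_tables` function and the extra Orders table are hypothetical.

```python
def compare_tables(source_tables, destination_tables):
    """Return the tables that are found in only one of the two schemas."""
    source, destination = set(source_tables), set(destination_tables)
    return {
        "source_only": sorted(source - destination),
        "destination_only": sorted(destination - source),
    }


# Users was renamed to People in the source database only.
issues = compare_tables(["People", "Orders"], ["Users", "Orders"])
```

Nothing in the result links People to Users, so each table is handled as an independent schema issue.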
You can view a list of jobs for the workspace, and view details for individual jobs.
Tonic Structural runs the following types of jobs on a workspace:
Sensitivity scans - Analyze the source database to identify sensitive data.
Collection scans - Analyze the source data for a MongoDB workspace to determine the available fields in each collection, the field types, and how prevalent the fields are.
Schema retrieval jobs - Refresh the cached version of the source database schema.
Data generation, data pipeline generation, and containerized generation jobs - Generate the destination data from the source data.
Upsert data generation jobs - Generate the intermediate database from the source database.
Upsert jobs - Use data from the intermediate database to add new rows to and update changed rows in the destination database. If the migration process is enabled, then it is a step in the upsert job.
SDK table statistics jobs - Only run when you use the SDK to generate data in a Spark workspace, and the assigned generators require the statistics.
A job can have one of the following statuses:
Queued - The job is queued to run, but has not yet started. A job is queued for one of the following reasons:
Another job is currently running on the same workspace. For example, you cannot run a sensitivity scan and a data generation, or multiple data generations, at the same time on the same workspace. This is true regardless of the number of workers on the instance. On Structural Cloud, there is also a limit on the number of concurrent running jobs for each organization. When that maximum is reached, a new job remains queued until a current running job completes.
There isn't an available worker on the instance to run the job. A Structural instance with one worker can only run one job at a time. If a job from one workspace is currently running, a job from another workspace cannot start until the first job is finished.
To view information about why a job is queued, click the status value.
Running - The job is in progress.
Canceled - The job is canceled.
Completed - The job completed successfully.
Failed - The job failed to complete.
Each of these statuses has a corresponding "with warnings" status, such as Running with warnings or Completed with warnings. A "with warnings" status indicates that the job had at least one warning at the time of the request.
Jobs view displays the list of jobs for the workspace.
To display Jobs view:
On the workspace management view, in the workspace navigation bar, click Jobs.
On Workspaces view, from the dropdown menu in the Name column, select Jobs.
On Jobs view, you use the tabs to filter the jobs based on the job type.
The possible tabs are:
All Jobs - Always displayed. Contains all of the workspace jobs across all job types. When you first display Jobs view, All Jobs is selected.
Data Generation - Always displayed. Includes the following types of jobs:
Data generation
Data pipeline generation
Containerized data generation
Upsert data generation
Upsert
Sensitivity Scan - Always displayed. Lists the sensitivity scans for the workspace.
Collection Scan - Displays for MongoDB and Amazon DynamoDB workspaces. Lists the collection scans on the source data.
Schema - Lists the schema retrieval jobs for the workspace.
Statistics - Displays for workspaces that use Spark-based data. Lists the SDK table statistics jobs.
The list is always sorted by the submission date, with the most recent jobs at the top of the list.
For each job, the job list includes the following information:
Job ID - The identifier of the job. To copy the job identifier, click the icon at the left of the row.
Type - The type of job.
Status - The current status of the job, and how long ago the job reached that status. When you hover over the status, a tooltip displays the actual timestamp for the status change, and the length of time that the job ran. For queued jobs, to display a panel with information about why the job is queued, click the status value.
Submitted - The date and time when the job was submitted.
Completed - The date and time when the job finished running.
To filter the list by the job status:
Click the filter icon in the Status column heading. The status panel displays all of the statuses that are currently in the list. For example, if there are no Queued jobs, then the Queued status is not in the list. By default, all of the statuses are included, and none of the checkboxes are checked.
To only include jobs that have specific statuses, check the checkbox next to each status to include. Checking all of the checkboxes has the same effect as unchecking all of the checkboxes.
To filter the list by the job identifier, in the filter field, provide the full identifier.
For jobs other than Queued jobs, you can display details about the workspace and the job progress.
From the Jobs view, to display the details for a job, click the job row.
The left side of the job details view contains the workspace information.
For a sensitivity scan, the workspace information is limited to the owner, database type, and worker version.
For a data generation job, the workspace information also includes:
Whether subsetting, post-job scripts, or webhooks are used.
The number of schemas, tables, and columns in the source database.
The number of schemas, tables, and columns in the destination database.
The Job Log tab shows the start date, start time, and duration of the job, followed by the list of job process steps.
For data generation jobs, the Privacy Report tab displays the number of at-risk, protected, and not sensitive columns in the source database.
At-risk columns contain sensitive data, but still have Passthrough as the assigned generator.
Protected columns have an assigned generator other than Passthrough.
Not sensitive columns have Passthrough as the assigned generator, but do not contain sensitive data.
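The three categories follow from two facts about each column: whether it contains sensitive data and which generator is assigned. A minimal Python sketch of that classification is shown here. The `privacy_status` function and the sample columns are hypothetical.

```python
def privacy_status(is_sensitive, generator):
    """Classify a column using the category definitions above."""
    if generator != "Passthrough":
        return "Protected"
    return "At-risk" if is_sensitive else "Not sensitive"


# Hypothetical columns: (name, contains sensitive data, assigned generator).
columns = [
    ("users.email", True, "Passthrough"),
    ("users.name", True, "Character Scramble"),
    ("orders.status", False, "Passthrough"),
]
report = {name: privacy_status(sensitive, generator) for name, sensitive, generator in columns}
```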
A workspace can write output to a Tonic Ephemeral snapshot, with an option to preserve the temporary Ephemeral database that is used to create the snapshot.
For data generation jobs that write to Ephemeral, the Data available in Tonic Ephemeral panel displays.
It contains:
A link to Outputs view. View Job Outputs displays Outputs view, filtered to only display snapshots and databases for this job. For more information, go to Viewing and managing Ephemeral output.
A link to Ephemeral.
If you preserve the temporary database, access to the database connection details.
When the temporary database is not preserved, the Data available in Tonic Ephemeral panel contains the following:
When the temporary database is preserved, the Data available in Tonic Ephemeral panel contains the following:
To display the connection details for the Ephemeral database, click View connection info.
For an Ephemeral database, the connection details include the database location and credentials. Each field contains a copy icon to allow you to copy the value.
The job identifier is a unique identifier for the job. To copy the job identifier, either:
From Jobs view, click the copy icon in the leftmost column.
From the job details view, click Copy Job ID.
You can cancel Queued or Running jobs.
For jobs with those statuses, the rightmost column in the job list contains a cancel icon.
To cancel the job, click the icon.
For workspaces that are configured to write destination data to a container repository, the Jobs view also provides access to the generated artifacts. For more information, go to Viewing and downloading container artifacts.
For all jobs, the job logs provide detailed information about the job processing. Tonic.ai support might request the job logs to help diagnose issues.
For a failed data generation to Ephemeral, the job logs include the Ephemeral logs and the destination database pod logs.
For upsert jobs where the migration process is enabled, and you configured the GET Schema Change Logs endpoint, the upsert job logs include the migration process logs.
You can download the job logs from the Jobs view or the job details view. The download includes up to 1MB of log entries.
On the Jobs view, to download the logs for a job, click the download icon in the rightmost column.
On the job details view, to download the logs for a job, click Reports and Logs, then select Job Logs.
By default, Structural redacts sensitive values from the job logs. To help support troubleshooting, you can configure data connectors or an individual data generation job to create unredacted versions of the log files, referred to as diagnostic logs. For more information, go to Redacted and diagnostic (unredacted) logs.
To access diagnostic log files, you must have the Enable diagnostic logging global permission.
If you do not have the Enable diagnostic logging global permission, then you cannot download the logs for that job. The download option is disabled.
From the job details view, you can download a Privacy Report file that provides an overview of the current protection status of the database columns based on the workspace configuration at the time that the job ran.
You can download either:
The Privacy Report .csv file, which provides details about the table columns, the column content, and the current protection configuration.
The Privacy Report PDF file, which provides charts that summarize the privacy ranking scores for the table columns. It also includes the table from the .csv file.
To display the download options, click Reports and Logs. In the menu:
To download the Privacy Report .csv file, click Privacy Report CSV.
To download the Privacy Report PDF file, click Privacy Report PDF.
For more information about the Privacy Report files and their content, go to Using the Privacy Report to verify data protection.
For a workspace that writes the output to a container repository, the job includes the following additional logs:
Database logs - Logs for the database container that is used as the destination.
Datapacker logs - Logs for creating the OCI artifact and uploading it to an OCI registry.
To download these logs for a data generation job, on the job details view, click Reports and Logs, then select Database Logs or Datapacker Logs.
For workspaces that are connected to Amazon Redshift or Snowflake on AWS databases, the data generation job requires multiple calls to a Lambda function. For these data generation jobs, the CloudWatch logs monitor the progress of and display errors for these Lambda function calls.
To download the CloudWatch logs for a data generation job, on the job details view, click Reports and Logs, then select CloudWatch Logs.
The CloudWatch Logs option only displays for Amazon Redshift and Snowflake on AWS data generation jobs.
For an Oracle data generation, if both of the following are true:
The data generation job ran SQL Loader (sqlldr).
sqlldr either failed or succeeded with errors.
Then to download the sqlldr log files, click Reports and Logs, then select sqlldr Logs.
The job details include an option to send a log package to Tonic.ai.
You typically send the log package at the request of the Structural support team, to help troubleshoot a data generation issue.
To send the package, from the Reports and Logs dropdown list, select Send logs to Tonic.ai.
Structural creates the package, then uploads it to an S3 bucket. Packages are removed from the S3 bucket automatically after 30 days.
For a data generation from a file connector workspace that uses local files, you can download the transformed files for that job.
The download is a .zip file that contains the files for a selected file group.
On the job details view, when files are available to download, the Data available for file groups panel displays.
To download the files for a file group:
Click Download Results.
From the list, select the file group. Use the filter field to filter the list by the file group name.
For workspaces that use the newer data generation processing, users can configure a data generation job to also generate performance metrics. This is usually done for troubleshooting purposes.
On the job details view, to download the performance metrics for the job, click Reports and Logs, then click Performance Metrics.
From the job details view, you can display a Gantt chart that shows the flow of a data generation job over time. The chart can help you to understand the different steps of a data generation job and how long it takes Structural to complete each step.
Note that this option is only available for data generation jobs that use the newer data generation process. For more information, go to . Data generation jobs that use the older process do not produce the Gantt chart.
To display the chart, click Reports and Logs, then select View Gantt.
The Job Visualization page displays the Gantt chart of the job progress.
If you are a user who wants to set up an account in an existing Tonic Structural Cloud or self-hosted organization, go to .
The Structural 14-day free trial allows you to explore and experiment in Structural Cloud before you decide whether to purchase Structural.
When you sign up for a free trial, Structural automatically creates a sample workspace for you to use. You can also create a workspace that uses your own database or files.
The free trial provides tools to introduce you to Structural and to guide you through configuring and completing a data generation.
Structural tracks and displays the amount of time remaining in your free trial. You can request a demonstration and contact support.
When the free trial period ends, you can continue to use Structural to configure workspaces. You can no longer generate data or train models. Contact Tonic.ai to discuss purchasing a Structural license, or select the option to .
To start a new free trial of Structural:
Go to .
Click Create Account.
On the Create your account dialog, to create an account, either:
To use a corporate Google email address to create the account, click Create account using Google.
To create a new Structural account:
Enter your email address. You cannot use a public email address for a free trial account.
Create and confirm a Structural password.
Click Create Account.
Structural sends an activation link to your email address.
After you activate your account and log in, Structural next prompts you to select the use case that best matches why you are exploring Structural.
If none of the provided use cases fits, use the Other option to tell us about your use case.
After you select a use case, click Next. The Create Your Workspace panel displays.
When you sign up for a free trial, Structural provides access to a sample PostgreSQL workspace that you can use to explore how to configure and run data generation.
You can also choose to create a workspace that uses your own data, either from local files or from a database.
If you do connect to your own data, then you must allowlist the Structural static IP addresses. For more information, go to .
On the Create your workspace panel:
To use the sample workspace, click Use a sample workspace, then click Next. Structural displays , which summarizes the protection status for the source data. It also displays the and the .
To create a workspace that uses local files as the source data, click Upload Files, then click Next. Go to .
To create a new workspace that uses your own data, click Bring your own data, then click Next. Go to .
The Upload files option creates a local files workspace. The source data consists of groups of files selected from a local file system. The files in a file group must have the same type and structure. Each file group becomes a "table" in the source data.
For other workspaces that you create during the free trial, you can also create a file connector workspace that uses files from cloud storage (Amazon S3 or Google Cloud Storage).
After you select Upload files and click Next, you are prompted to provide a name for the workspace.
In the field provided, enter the name to use for the workspace, then click Next.
Structural displays the File Groups view, where you can .
It also displays the with links to resources to help you get started.
After you create at least one file group, you can start to use the other Structural features and functions.
If you connect to your own data, then you must allowlist the Structural static IP addresses. For more information, go to .
If you choose to create a workspace with your own data, then the first step is to provide a name for the workspace.
In the field provided, enter the name to use for your first workspace, then click Next.
The Invite others to Tonic panel displays.
Under Invite others to Tonic, you can optionally invite other users with the same corporate email domain to start their own Structural free trial. The users that you invite are able to view and edit your workspace.
For example, you might want to invite other users if you don't have access to the connection information for the source data. You can invite a user who does have access. They can then update the workspace configuration to add the connection details.
To continue without inviting other users, click Skip this step.
To invite users:
For each user to invite, enter the email address, then press Enter. The email addresses must have the same corporate email domain as your email address.
After you create the list of users to invite, click Next.
The Add source data connection view displays.
The final step in the workspace creation is to provide the source data to use for your workspace.
Structural provides data connectors that allow you to connect to an existing database. Each data connector allows you to connect to a specific type of database. Structural supports several types of application databases, data warehouses, and Spark data solutions.
For the first workspace that you create using the free trial wizard, you can choose:
For subsequent workspaces that you create from Workspaces view, you can also choose , , and .
To connect to an existing database, on the Add source data connection panel, click the data connector to use, then click Add connection details.
The panel also includes a Local files option, which creates a local files file connector workspace, the same as the Upload files option.
Use the connection details fields to provide the connection information for your source data. The specific fields depend on the type of data connector that you select.
After you provide the connection details, to test the connection, click Test Connection.
To save your workspace, click Save.
Structural displays Privacy Hub, which summarizes the protection status for the source data.
It also displays the Getting Started Guide panel, with links to resources to help you get started.
The Structural free trial includes a couple of resources to introduce you to Structural and to guide you through the tasks for your first data generation.
The Getting Started Guide panel provides access to Structural information and support resources.
The Getting Started Guide panel displays automatically when you first start the free trial. To display the Getting Started Guide panel manually, in the Structural heading, click Getting Started.
The Getting Started Guide panel provides links to Structural instructional videos and this Structural documentation. It also contains links to request a Structural demo, contact Tonic.ai support, and purchase a Structural Cloud pay-as-you-go subscription.
For each free trial workspace, Structural provides access to a workspace checklist.
The checklist displays at the bottom left of the workspace management view. It displays automatically when you display the workspace management view. To hide the checklist, click the minimize icon. To display the checklist again, click the checklist icon.
The checklist provides a basic list of tasks to perform in order to complete a Structural data generation.
Each checklist task is linked to the Structural location where you can complete that task. Structural automatically detects and marks when a task is completed.
The checklist tasks are slightly different based on the type of workspace.
For workspaces that are connected to a database, including the sample PostgreSQL workspace and workspaces that you connect to your own data, the checklist contains:
Connect a source database - Set the connection to the source database. In most cases, you set the source connection when you create the workspace. When you click this step, Structural navigates to the Source Settings section of the workspace details view.
Connect to destination database - Set the location where Structural writes the transformed data. When you click this step, Structural navigates to the Destination Settings section of the workspace details view.
Apply generators to modify dataset - Configure how Structural transforms at least one column in the source data. When you click this step:
If there are available generator recommendations, then Structural navigates to Privacy Hub and displays the generator recommendations panel.
If there are no available generator recommendations, then Structural navigates to Database View.
Generate data - Run the data generation to produce the destination data. When you click this item, Structural navigates to the Confirm Generation panel.
For workspaces that use data from local files, the checklist contains:
Create a file group - Create a file group with files that you upload from a local file system. Each file group becomes a table in the workspace. When you click this step, Structural navigates to the File Groups view for the workspace.
Apply generators to modify dataset - Configure how Structural transforms at least one column in the source files. When you click this step:
If there are available generator recommendations, then Structural navigates to Privacy Hub and displays the generator recommendations panel.
If there are no available generator recommendations, then Structural navigates to Database View.
Generate data - Run the data generation to produce transformed versions of the source files. When you click this step, Structural navigates to the Confirm Generation panel.
Download your dataset - Download the transformed files from the Structural application database.
For workspaces that use data from files in cloud storage (Amazon S3 or Google Cloud Storage), the checklist contains:
Configure output location - Configure the cloud storage location where Structural writes the transformed files. When you click this step, Structural navigates to the Output location section of the workspace details view.
Create a file group - Create a file group that contains files selected from cloud storage. When you click this step, Structural navigates to the File Groups view for the workspace.
Apply generators to modify dataset - Configure how Structural transforms at least one column in the source data. When you click this step:
If there are available generator recommendations, then Structural navigates to Privacy Hub and displays the generator recommendations panel.
If there are no available generator recommendations, then Structural navigates to Database View.
Generate data - Run the data generation to produce transformed versions of the source files. When you click this step, Structural navigates to the Confirm Generation panel.
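The three checklists above can be summarized as a small lookup table. The following Python sketch is only an illustration of the documented task order. The workspace type keys (database, local_files, cloud_files) and the helper names are hypothetical and are not part of Structural.

```python
# Checklist tasks per workspace type, in the order listed above.
# The dictionary keys are hypothetical labels, not Structural identifiers.
CHECKLISTS = {
    "database": [
        "Connect a source database",
        "Connect to destination database",
        "Apply generators to modify dataset",
        "Generate data",
    ],
    "local_files": [
        "Create a file group",
        "Apply generators to modify dataset",
        "Generate data",
        "Download your dataset",
    ],
    "cloud_files": [
        "Configure output location",
        "Create a file group",
        "Apply generators to modify dataset",
        "Generate data",
    ],
}


def next_task(workspace_type, completed):
    """Return the first checklist task that is not yet completed, or None."""
    for task in CHECKLISTS[workspace_type]:
        if task not in completed:
            return task
    return None


def apply_generators_target(has_recommendations):
    """Where the "Apply generators to modify dataset" step navigates."""
    if has_recommendations:
        return "Privacy Hub (generator recommendations panel)"
    return "Database View"
```

For example, next_task("local_files", {"Create a file group"}) returns "Apply generators to modify dataset".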
In addition to the workspace checklists, Structural uses next step hints to help guide you through the workspace configuration and data generation.
When a next step hint is available, it displays as an animated marker next to the suggested next action.
When you hover over the highlighted action, Structural displays a help text popup that explains the recommended action.
When you click the highlighted action, the hint is removed, and the next hint is displayed.
For a file connector workspace, to identify the source data, you create file groups. A file group is a set of files of the same type and with the same structure. Each file group becomes a table in the workspace. For CSV files, each column becomes a table column. For XML and JSON file groups, the table contains a single XML or JSON column.
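To illustrate how the file type determines the table structure, here is a minimal Python sketch. It assumes that the first row of a CSV file is a header row, and the single-column names for XML and JSON (xml_document, json_document) are hypothetical. It is not how Structural itself parses files.

```python
import csv
import io


def csv_table_columns(csv_text, delimiter=","):
    """Return the column names from the first row of CSV text.

    Assumes that the first row is a header row.
    """
    reader = csv.reader(io.StringIO(csv_text), delimiter=delimiter)
    header = next(reader, [])
    return [name.strip() for name in header]


def file_group_columns(file_type, sample_text=""):
    """Sketch of the table columns that a file group would produce."""
    kind = file_type.lower()
    if kind == "csv":
        # Each CSV column becomes a table column.
        return csv_table_columns(sample_text)
    if kind in ("json", "xml"):
        # The whole document is held in a single column.
        # The column name used here is hypothetical.
        return [kind + "_document"]
    raise ValueError("Unsupported file type: " + file_type)
```

For example, a CSV sample that starts with the row id,name,email produces three columns, while a JSON file group produces one.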
On the File Groups view, click Create File Group.
For a file connector workspace that uses local files, you can either drag and drop files from your local file system to the file group, or you can search for and select files to add. For more information, go to .
For a file connector workspace that uses cloud storage, you select the files to include in the file group. For more information, go to .
For files that contain CSV content, you configure the delimiters and other file settings. For more information, go to .
To get value out of the data generation process, you assign generators to the data columns.
A generator indicates how to transform the data in a column. For example, for a column that contains a name value, you might assign the Name generator, which indicates how to generate a replacement name in the generation output.
For sensitive columns that Structural detects, Structural can also provide a recommended generator configuration.
When there are recommendations available, Privacy Hub displays a link to review all of the recommendations.
The Recommended Generators by Sensitivity Type panel displays a list of sensitive columns that Structural detected, along with the suggested generators to apply.
After reviewing, to apply all of the suggested generators, click Apply All. For more information about using this panel, go to .
You can also choose to apply an individual generator manually. You can do this from , , or .
To display Database View, on the workspace management view, click Database View.
On Database View, in the column list, the Applied Generator column lists the currently assigned generator for each column. For a new workspace, the columns are all assigned the Passthrough generator. The Passthrough generator simply passes the source value through to the destination data without masking it.
Click a column that is marked as Passthrough and that is not marked as sensitive, such as the customers.Last_Transaction column in the sample workspace. The column configuration panel displays. To select a generator, click the generator dropdown. The list contains generators that can be assigned to the column based on the column data type. For customers.Last_Transaction, the Timestamp Shift generator is a good option.
For Passthrough columns that Structural identified as containing sensitive data, the Applied Generator column displays an icon to indicate that there is a recommended generator.
In Database View, click one of those columns. For example, in the sample workspace, the customers.Email column is marked as containing an email address.
For customers.Email, click the generator dropdown. Instead of the column configuration panel, there is a panel that indicates the recommended generator. For customers.Email, the recommended generator is Email. To assign the Email generator, click Apply. The column configuration panel displays with the generator assigned.
To run a data generation, Structural must have a destination for the transformed data.
For a local files workspace, Structural saves the transformed files to the application database.
For workspaces that use data from a database, and for workspaces that use cloud storage files, you configure where Structural writes the output data.
The destination location for data generation output can be one of the following:
If the data connector supports Tonic Ephemeral, then the default option is to .
For database-based data connectors, you can write the transformed data to a destination database.
For some Structural data connectors, Structural can .
For file connector workspaces that transform files from cloud storage (Amazon S3 or Google Cloud Storage), you .
To display the destination configuration for the workspace:
Click the Workspace Settings tab.
Scroll to the Destination Settings section or, for a file connector workspace that uses cloud storage files, scroll to the Output location section.
For data connectors that Ephemeral supports, the default option is to write the output to Ephemeral.
For the Ephemeral option, the default configuration is:
Structural writes the output to Ephemeral Cloud. If you do not have an Ephemeral Cloud account, then we create an Ephemeral free trial account for you. If your organization has a self-hosted Ephemeral instance, then you can choose to write the output to that instance. Note that all workspaces in the same organization or for the same self-hosted Structural instance must use the same Ephemeral instance.
Structural uses the output data to create an Ephemeral user snapshot. You can use the user snapshot to create Ephemeral databases.
When Structural creates the user snapshot in Ephemeral, it creates a temporary Ephemeral database to use as the basis for the user snapshot. There is an option to keep that temporary database. For a free trial workspace, this option is enabled by default. The database expires after 48 hours.
For details about how to configure Structural to write output to Ephemeral, go to . For more information about Ephemeral, go to the .
To write the data to a destination database, click Database Server. Structural displays the configuration fields for the destination database.
For information on how to configure the destination information for a specific data connector, go to the workspace configuration information for that data connector. The contains a list of the available data connectors, and provides a link to the documentation for each data connector.
To write the data to a data volume in a container repository, click Container Repository. Structural displays the configuration fields to select a base image and provide the details about the repository.
For more information, go to .
For a file connector workspace that uses files from cloud storage (Amazon S3 or Google Cloud Storage), you configure the cloud storage output location where Structural writes the transformed files. The configuration includes the required credentials to use.
For more information, go to .
After you complete the workspace and generator configuration, you can run your first data generation.
The data generation process uses the assigned generators to transform the source data. It writes the transformed data to the configured destination location.
For a local files workspace, it writes the files to the Structural application database.
The Generate Data option is at the top right of the Structural heading.
When you click Generate Data, Structural displays the Confirm Generation panel.
The Confirm Generation panel provides access to the current destination configuration, along with other advanced generation options such as subsetting and upsert.
It also indicates if there are any issues that prevent you from starting the data generation. For example, if the workspace does not have a configured destination, then Structural cannot run the data generation.
To start the data generation, click Run Generation. For more information about running data generation, go to .
For a new Tonic Ephemeral account, the first time that you run data generation, you also receive an activation email message for the account.
To view the job status and details:
Click Jobs.
In the list, click the data generation job.
For a data generation that writes the output to an Ephemeral database, the Data Available in Tonic Ephemeral panel provides access to the database connection information.
To display the connection details, click Connecting to your database.
The connection details include the database location and credentials. Each field contains a copy icon to allow you to copy the value.
The first time that you complete all of the steps in a checklist, Structural displays a panel with options to chat with our sales team, schedule a demo, or purchase a subscription.
You can also continue to get to know Structural and experiment with other Structural features such as or using to mask more complex values such as JSON or XML.
If your free trial has expired, to get an extension, you can reach out to us using either the in-app chat or an email message.
You can configure a workspace to write destination data to a container repository instead of to a database server.
When Structural writes data generation output to a repository, it writes the destination data to a container volume. From the list of container artifacts, you can copy the volume digest, and download a Docker Compose file that provides connection settings for the database on the volume. Structural generates the Compose file when you make the request to download it. For more information about getting access to the container artifacts, go to .
You can also use the data volume to start a Tonic Ephemeral database. However, if the data is larger than 10 GB, we recommend that you write the data to an Ephemeral user snapshot instead. For information about writing to an Ephemeral snapshot, go to .
For an overview of writing destination data to container artifacts, you can also view the .
Under Destination Settings, to indicate to write the destination data to container artifacts, click Container Repository.
For a Structural instance that is deployed on Docker, unless you , the Container Repository option is hidden.
You can switch between writing to a database server and writing to a container repository at any time. Structural preserves the configuration details for both options. When you run data generation, it uses the currently selected option for the workspace.
From the Database Image dropdown list, select the image to use to create the container artifacts.
Select an image version that is compatible with the version of the database that is used in the workspace.
For a MySQL workspace, you can provide a customization file that helps to ensure that the temporary destination database is configured correctly.
To provide the customization details:
Toggle Use customization to the on position.
In the text area, paste the contents of the customization file.
To provide the location where Structural publishes the container artifacts:
In the Registry field, type the path to the container registry where Structural publishes the data volume.
Do not include the HTTP protocol, such as http:// or https://.
In the Repository Path field, provide the path within the registry where Structural publishes the data volume.
For a Google Artifact Registry (GAR) repository, the path format is PROJECT-ID/REPOSITORY/IMAGE.
For more information about repository and image names, go to the .
Next, you provide the credentials that Structural uses to read from and write to the registry.
When you provide the registry, Structural detects whether the registry is from Amazon Elastic Container Registry (Amazon ECR), Google Artifact Registry (GAR), or a different container solution.
It displays the appropriate fields based on the registry type.
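As an illustration of the registry rules above, the following Python sketch removes the protocol from a registry value, guesses the registry type from the hostname, and checks the three-part GAR repository path. The hostname patterns are assumptions based on the usual Amazon ECR (account.dkr.ecr.region.amazonaws.com) and GAR (location-docker.pkg.dev) hostname formats. The documentation does not describe how Structural performs the detection, and all function names are hypothetical.

```python
import re

# Assumed hostname patterns; Structural's own detection logic is not documented.
ECR_HOST = re.compile(r"^\d{12}\.dkr\.ecr\.[a-z0-9-]+\.amazonaws\.com(\.cn)?$")
GAR_HOST = re.compile(r"^[a-z0-9-]+-docker\.pkg\.dev$")


def normalize_registry(value):
    """Strip the http:// or https:// protocol and any trailing slash."""
    registry = re.sub(r"^https?://", "", value.strip(), flags=re.IGNORECASE)
    return registry.rstrip("/")


def registry_type(value):
    """Return "ecr", "gar", or "other" based on the registry hostname."""
    host = normalize_registry(value).split("/", 1)[0].lower()
    if ECR_HOST.match(host):
        return "ecr"
    if GAR_HOST.match(host):
        return "gar"
    return "other"


def is_gar_repository_path(path):
    """Check the PROJECT-ID/REPOSITORY/IMAGE shape of a GAR repository path."""
    parts = path.strip("/").split("/")
    return len(parts) == 3 and all(parts)
```

A check like this can catch a pasted URL that still includes https:// before you enter it in the Registry field.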
For a registry other than an Amazon ECR or a GAR registry, the credentials can be either a username and access token, or a secret.
In general, the credentials must be for a user that has read and write permissions for the registry.
The secret is the name of a Kubernetes secret that lives on the pod that the Structural worker runs on. The secret type must be kubernetes.io/dockerconfigjson. The Kubernetes documentation provides information on .
To use a username and access token:
Click Access token.
In the Username field, provide the username.
In the Access Token field, provide the access token.
To use a secret:
Click Secret name.
In the Secret Name field, provide the name of the secret.
For Azure Container Registry (ACR), the provided credentials must be for a service principal that has sufficient permissions on the registry.
For Structural, the service principal must at least have the permissions that are associated with the.
For a GAR registry, you upload a service account file, which is a JSON file that contains credentials that provide access to Google Cloud Platform (GCP).
The associated service account must have the Artifact Registry Writer role.
For Service Account File, to search for and select the file, click Browse.
For an Amazon ECR registry, you can either:
Provide the AWS access and secret key that is associated with the IAM user that will connect to the registry
Provide an assumed role
(Self-hosted only) Use the credentials configured in the Structural environment settings TONIC_AWS_ACCESS_KEY_ID and TONIC_AWS_SECRET_ACCESS_KEY.
(Self-hosted only) If Structural is deployed in Amazon Elastic Kubernetes Service (Amazon EKS), then you can use the AWS credentials that live on the EC2 instance.
To provide an AWS access key and secret key:
Click Access Keys.
In the AWS Access Key field, enter an AWS access key that is associated with an IAM user or role.
In the AWS Secret Key field, enter the secret key that is associated with the access key.
Optionally, in the AWS Session Token field, enter the session token to use for the connection.
To provide an assumed role:
Click Assume Role.
In the Role ARN field, provide the Amazon Resource Name (ARN) for the role.
In the Session Name field, provide the role session name.
If you do not provide a session name, then Structural automatically generates a default unique value. The generated value begins with TonicStructural.
In the Duration (in seconds) field, provide the maximum length in seconds of the session. 
The default is 3600, indicating that the session can be active for up to 1 hour.
The provided value must be less than the maximum session duration that is allowed for the role.
For the assumed role, Structural generates the external ID that is used in the assume role request. Your role’s trust policy must be configured to condition on your unique external ID.
Here is an example trust policy:
On a self-hosted instance, to use the credentials configured in the environment settings, click Environment Variables.
On a self-hosted instance, to use the AWS credentials from the EC2 instance, click Instance Profile.
The IAM user must have permission to list, push, and pull images from the registry. The following example policy includes the required permissions.
For additional security, a repository name filter allows you to limit access to only the repositories that are used in Structural. You need to make sure that the repositories that you create for Structural match the filter.
For example, you could prefix Structural repository names with tonic-. In the policy, you include a filter based on the tonic- prefix:
In the Tags field, provide the tag values to apply to the container artifacts. You can also change the tag configuration for individual data generation jobs.
Use commas to separate the tags.
A tag cannot contain spaces. Structural provides the following built-in values for you to use in tags:
{workspaceId} - The identifier of the workspace.
{workspaceName} - The name of the workspace.
{timestamp} - The timestamp when the data generation job that created the artifact completed.
{jobId} - The identifier of the data generation job that created the artifact.
For example, the following creates a tag that contains the workspace name, job identifier, and timestamp:
{workspaceName}_{jobId}_{timestamp}
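To preview what a Tags value might expand to, here is a minimal Python sketch. It assumes simple text substitution of the built-in values, and it rejects any resulting tag that contains whitespace. The function name, the sample values, and the timestamp format are hypothetical. Structural performs the actual expansion when the job runs.

```python
def render_tags(tag_field, values):
    """Expand a comma-separated Tags value into a list of tags.

    values maps the built-in names (workspaceId, workspaceName,
    timestamp, jobId) to sample values for the preview.
    """
    tags = []
    for raw in tag_field.split(","):
        tag = raw.strip()
        if not tag:
            continue
        # Replace each {name} placeholder with its sample value.
        for name, value in values.items():
            tag = tag.replace("{" + name + "}", value)
        # A tag cannot contain spaces.
        if any(ch.isspace() for ch in tag):
            raise ValueError("Tag contains whitespace: %r" % tag)
        tags.append(tag)
    return tags


sample_values = {
    "workspaceId": "ws-1",          # hypothetical identifier
    "workspaceName": "sales",
    "timestamp": "20240101T000000",  # hypothetical format
    "jobId": "job-42",
}
```

With these sample values, render_tags("{workspaceName}_{jobId}_{timestamp}, nightly", sample_values) returns the two tags sales_job-42_20240101T000000 and nightly.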
To also tag the artifacts as latest, check the Tag as "latest" in your repository checkbox.
You can also optionally configure custom resource values for the Kubernetes pods. You can specify the ephemeral storage, memory, and CPU millicores.
To provide custom resources:
Toggle Set custom pod resources to the on position.
Under Storage Size:
In the field, provide the number of megabytes or gigabytes of storage.
From the dropdown list, select the unit to use.
The storage can be between 32 MB and 25 GB.
Under Memory Size:
In the field, provide the number of megabytes or gigabytes of RAM.
From the dropdown list, select the unit to use.
The memory can be between 512 MB and 4 GB.
Under Processor Size:
In the field, provide the number of millicores.
From the dropdown list, select the unit.
The processor size can be between 250m and 1000m.
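The documented ranges can be checked before you enter the values. The following Python sketch assumes that the limits are inclusive and that 1 GB equals 1024 MB. Neither detail is stated in this documentation, and the function names are hypothetical.

```python
# Assumption: 1 GB = 1024 MB, and the documented limits are inclusive.
UNIT_TO_MB = {"MB": 1, "GB": 1024}
SIZE_LIMITS_MB = {
    "storage": (32, 25 * 1024),   # 32 MB to 25 GB
    "memory": (512, 4 * 1024),    # 512 MB to 4 GB
}
CPU_LIMITS_MILLICORES = (250, 1000)


def to_megabytes(amount, unit):
    """Convert an (amount, unit) pair to megabytes."""
    return amount * UNIT_TO_MB[unit.upper()]


def validate_pod_resources(storage, memory, cpu_millicores):
    """Return a list of error messages; an empty list means all values fit.

    storage and memory are (amount, unit) tuples, such as (10, "GB").
    """
    errors = []
    requested = {"storage": storage, "memory": memory}
    for name, (amount, unit) in requested.items():
        low, high = SIZE_LIMITS_MB[name]
        if not low <= to_megabytes(amount, unit) <= high:
            errors.append("%s must be between %d MB and %d MB" % (name, low, high))
    low, high = CPU_LIMITS_MILLICORES
    if not low <= cpu_millicores <= high:
        errors.append("cpu must be between %dm and %dm" % (low, high))
    return errors
```

For example, validate_pod_resources((10, "GB"), (2, "GB"), 500) returns an empty list.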
In the Custom Database Name field, provide the name to use for the destination database.
If you do not provide a custom database name, then the destination database uses the same name as the source database.
In the Custom Password field, provide the password for the destination database user.
If you do not provide a password, then Structural generates a password.
The destination database username is always the default user for the database:
For PostgreSQL, postgres
For MySQL, root
For SQL Server, sa
If your Kubernetes nodes are configured with taints, then on a self-hosted instance, you can configure the tolerations that enable the datapacker pods to be scheduled on the nodes. The datapacker pod hosts the temporary database that Structural uses during the data generation.
For an overview of taints and tolerations, go to the .
To configure the tolerations, you configure the following environment settings. You can add these settings to the Environment Settings list on Structural Settings.
CONTAINERIZATION_POD_NODE_TOLERATION_KEY - The toleration key value to apply to the datapacker pods. This setting is required. If you do not configure this setting, then Structural ignores the other settings.
CONTAINERIZATION_POD_NODE_TOLERATION_VALUES - A comma-separated list of toleration values to apply to the datapacker pods.
CONTAINERIZATION_POD_NODE_TOLERATION_EFFECT - The toleration effect to apply to the datapacker pods.
CONTAINERIZATION_POD_NODE_TOLERATION_OPERATOR - The toleration operator to apply to the datapacker pods.
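To show how these four settings relate to Kubernetes toleration fields (key, operator, value, effect), here is a hedged Python sketch. It assumes one toleration per listed value, which this documentation does not confirm, and the function name is hypothetical. The only documented behavior it encodes is that the other settings are ignored when the key is not set.

```python
PREFIX = "CONTAINERIZATION_POD_NODE_TOLERATION_"


def datapacker_tolerations(env):
    """Build toleration dictionaries from the environment settings.

    Assumption: each comma-separated value yields its own toleration.
    """
    key = env.get(PREFIX + "KEY", "").strip()
    if not key:
        # Without the key, the other settings are ignored.
        return []
    values = [v.strip() for v in env.get(PREFIX + "VALUES", "").split(",") if v.strip()]
    base = {"key": key}
    operator = env.get(PREFIX + "OPERATOR", "").strip()
    effect = env.get(PREFIX + "EFFECT", "").strip()
    if operator:
        base["operator"] = operator
    if effect:
        base["effect"] = effect
    if not values:
        return [base]
    return [dict(base, value=v) for v in values]
```

For example, a key of dedicated with the values tonic and datapacker, the operator Equal, and the effect NoSchedule produces two toleration entries under this assumption.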
Example trust policy for the assumed role:
{
  "Version": "2012-10-17",
  "Statement": {
    "Effect": "Allow",
    "Principal": {
      "AWS": "<originating-account-id>"
    },
    "Action": "sts:AssumeRole",
    "Condition": {
      "StringEquals": {
        "sts:ExternalId": "<external-id>"
      }
    }
  }
}

Example IAM policy that includes the required permissions:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "ManageTonicRepositoryContents",
      "Effect": "Allow",
      "Action": [
        "ecr:DescribeRepositories",
        "ecr:ListImages",
        "ecr:DescribeImages",
        "ecr:BatchGetImage",
        "ecr:BatchCheckLayerAvailability",
        "ecr:InitiateLayerUpload",
        "ecr:UploadLayerPart",
        "ecr:CompleteLayerUpload",
        "ecr:PutImage"
      ],
      "Resource": [
        "arn:aws:ecr:<region>:<account_id>:repository/<optional name filter>"
      ]
    },
    {
      "Sid": "GetAuthorizationToken",
      "Effect": "Allow",
      "Action": [
        "ecr:GetAuthorizationToken"
      ],
      "Resource": "*"
    }
  ]
}

Example Resource element that filters on the tonic- prefix:
"Resource": [
  "arn:aws:ecr:<region>:<account_id>:repository/tonic-*"
]
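If you maintain these policies for several accounts or regions, it can help to generate them. The following Python sketch builds the same two documents with the standard json module. The function names are hypothetical, and the default name filter of * (all repositories) is an assumption. Replace the sample account ID, region, and external ID with your own values.

```python
import json

# Actions taken from the example policy above.
ECR_ACTIONS = [
    "ecr:DescribeRepositories",
    "ecr:ListImages",
    "ecr:DescribeImages",
    "ecr:BatchGetImage",
    "ecr:BatchCheckLayerAvailability",
    "ecr:InitiateLayerUpload",
    "ecr:UploadLayerPart",
    "ecr:CompleteLayerUpload",
    "ecr:PutImage",
]


def build_trust_policy(originating_account_id, external_id):
    """Trust policy that conditions sts:AssumeRole on the external ID."""
    return {
        "Version": "2012-10-17",
        "Statement": {
            "Effect": "Allow",
            "Principal": {"AWS": originating_account_id},
            "Action": "sts:AssumeRole",
            "Condition": {"StringEquals": {"sts:ExternalId": external_id}},
        },
    }


def build_ecr_policy(region, account_id, name_filter="*"):
    """Permissions policy limited to repositories that match name_filter."""
    resource = "arn:aws:ecr:%s:%s:repository/%s" % (region, account_id, name_filter)
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ManageTonicRepositoryContents",
                "Effect": "Allow",
                "Action": ECR_ACTIONS,
                "Resource": [resource],
            },
            {
                "Sid": "GetAuthorizationToken",
                "Effect": "Allow",
                "Action": ["ecr:GetAuthorizationToken"],
                "Resource": "*",
            },
        ],
    }


if __name__ == "__main__":
    # Sample values only; use your own region and account ID.
    print(json.dumps(build_ecr_policy("us-east-1", "123456789012", "tonic-*"), indent=2))
```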

Privacy Hub tracks the current protection status of source data columns based on:
Column sensitivity, either from the most recent sensitivity scan or from manual assignments
Assigned table modes
Assigned generators
To display Privacy Hub, either:
On the workspace management view, in the workspace navigation bar, click Privacy Hub.
On Workspaces view, click the workspace name.
From Privacy Hub, you can:
Review and apply the recommended generators for all detected sensitive columns
View the current protection status of columns
Manually mark columns as sensitive or not sensitive
Configure protection for sensitive columns
Download a preview Privacy Report
Run a new sensitivity scan
You can also track the history of changes to column sensitivity and the assigned column generators. For more information, go to Tracking changes to workspaces, generator presets, and sensitivity rules.
The sensitivity scan detects specific types of sensitive data.
If your workspace contains any columns that the sensitivity scan identified as sensitive, and that you have neither assigned a generator to nor marked as not sensitive, then Tonic Structural displays a Sensitivity Recommendations banner that contains a count of those columns.
The count only includes sensitive columns that the sensitivity scan detects. If you manually mark a column as sensitive, it is not included in the list.
On the banner, the Review Recommendations option allows you to review the detected columns and the recommended generators for each detected sensitive data type.
You can then apply the recommended generators or ignore the recommendations. When you ignore a recommendation, you either:
Indicate to remove the generator recommendation for the column.
Indicate that the column data is not sensitive.
For more information, go to Reviewing and applying recommended generators.
The protection status panels at the top of Privacy Hub provide an overview of the current protection status of the columns in the source data.
Each panel displays:
The number of columns that are in that category.
The estimated percentage of columns that are in that category.
Note that for a JSON column that uses Document View, the protection status displays a separate box for each combination of JSON path and data type.
From each panel, you can display details for and configure protection for each column.
The column counts do not include columns that do not have data in the destination database. For example, if a table is assigned Truncate table mode, then Privacy Hub ignores the columns in that table.
The information on these panels updates automatically as you change whether columns are sensitive and assign generators to columns.
The At-Risk Columns panel reflects columns that:
Are populated in the destination database.
Are marked as sensitive.
Have the generator set to Passthrough, which indicates that Structural does not perform any transformation on the data.
For each column, the At-Risk Columns panel also indicates the sensitivity confidence, from full confidence (completely red) to low confidence (a small percentage of red).
The goal is to have 0 at-risk columns.
When you click Open in Database View, you navigate to Database View. The column list is filtered to show columns that are at risk.
The Protected Columns panel reflects columns that:
Are populated in the destination database.
Are assigned a generator other than Passthrough.
It includes both sensitive and non-sensitive columns.
Note that a column is considered protected based solely on the assigned generator. Some more complex generators, such as JSON Mask or Conditional, allow you to apply different generators to specific portions of a value or based on a specific condition. However, the protection status does not reflect these sub-generators. An applied sub-generator could be Passthrough.
When you click Open in Database View, you navigate to Database View. The column list is filtered to show all included columns that are protected.
The Not Sensitive Columns panel reflects columns that:
Are populated in the destination database.
Are marked as not sensitive.
Have the generator set to Passthrough.
When you click Open in Database View, you navigate to Database View. The column list is filtered to show included columns that are not sensitive and are not protected.
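The rules for the three panels can be expressed as a short decision function. The following Python sketch is an illustration only. The Column class and its field names are hypothetical, and the percentage is computed as a rounded share of the populated columns, which is an assumption about how the estimate is derived.

```python
from collections import Counter
from dataclasses import dataclass

PASSTHROUGH = "Passthrough"


@dataclass
class Column:
    table: str
    name: str
    sensitive: bool
    generator: str = PASSTHROUGH
    populated: bool = True  # False for tables such as Truncate mode tables


def protection_status(column):
    """Return "at_risk", "protected", "not_sensitive", or None if not counted."""
    if not column.populated:
        return None
    if column.generator != PASSTHROUGH:
        # Protected is based solely on the assigned generator.
        return "protected"
    return "at_risk" if column.sensitive else "not_sensitive"


def panel_summary(columns):
    """Return {status: (count, rounded percentage)} for the three panels."""
    statuses = [s for s in map(protection_status, columns) if s is not None]
    counts = Counter(statuses)
    total = len(statuses)
    return {
        status: (counts[status], round(100 * counts[status] / total) if total else 0)
        for status in ("at_risk", "protected", "not_sensitive")
    }
```

For example, a sensitive Passthrough column is at risk, and the same column becomes protected as soon as any other generator is assigned.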
The Database Tables list shows the protection status for each table in the source database. You can view the number of columns that have each protection status, and update the column configuration.
The list does not include tables where the table mode is Truncate or Preserve Destination. Truncated tables are not populated in the destination database. For Preserve Destination tables, the existing data in the destination database does not change.
For each table, Database Tables provides the following information:
Name - The table name. For a file connector workspace, each table corresponds to a file group. Each JSON column that uses Document View is also in a separate row. For JSON columns, the Name column displays both the table name and the column name.
Not Sensitive - The number of not sensitive columns in the table. Not sensitive columns are not marked as sensitive and have Passthrough as the generator. When you click the value, you navigate to Database View, filtered to display the not sensitive columns for the table.
Protected - The number of protected columns in the table. Protected columns have an assigned generator. A protected column can be either sensitive or not sensitive. When you click the value, you navigate to Database View, filtered to display the protected columns for the table.
At-Risk - The number of at-risk columns in the table. These columns are marked as sensitive, but have Passthrough as the generator. The goal is to have 0 unprotected sensitive columns. When you click the value, you navigate to Database View, filtered to display the at-risk columns for the table.
Privacy Status - Indicates the current protection status of the columns in the table. It provides the same view and configuration options as the protection status panels at the top of Privacy Hub.
You can filter the Database Tables list either by the table name or by the schema.
To filter the list by table name, in the filter field, begin to type text that is in the table name. As you type, Structural updates the list to only display matching tables.
To filter the list to only include tables that belong to a specific schema:
Click Filter by Schema.
From the schema dropdown list, select the schema.
When you select a schema, Structural adds it to the filter field.
You can sort the Database Tables list by any column except for the Privacy Status column.
To sort by a column, click the column heading. To reverse the sort order, click the heading again.
The Privacy Status column in the Database Tables list indicates the protection status of the columns in the table.
This column provides the same options to view and configure columns as the protection status panels at the top of Privacy Hub, but is limited to the columns in a specific table.
Each protection status panel displays a series of boxes to represent the columns that apply to that status. For example, if the source data contains four columns that are at-risk, then the At-Risk Columns panel displays four boxes, one for each column.
The Privacy Status column in the Database Tables list displays the same set of boxes for the columns in an individual table.
If the number of columns is too large to fit, then the last box shows the number of additional columns that apply. For example, if there are 15 columns that don't fit, then the last box is labeled +15.
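The overflow behavior can be sketched as follows. This Python example assumes that the display has a fixed number of box positions and that the final position is reserved for the overflow count. The function name and the max_boxes parameter are hypothetical.

```python
def status_boxes(column_names, max_boxes):
    """Return the box labels for a protection status display.

    When the columns do not all fit, the last box shows how many
    additional columns there are, such as "+15".
    """
    if max_boxes < 1:
        raise ValueError("max_boxes must be at least 1")
    if len(column_names) <= max_boxes:
        return list(column_names)
    shown = list(column_names[: max_boxes - 1])
    remaining = len(column_names) - len(shown)
    return shown + ["+%d" % remaining]
```

For example, 20 columns with 6 available positions produce 5 column boxes followed by a box labeled +15.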
When you hover over a box, the column name displays in a tooltip.
When you click a box, the details panel for that column displays.
When you click the box for remaining columns, the details panel for the first column in the remaining columns displays.
You can use the next and previous icons at the bottom right of the details panel to display the details for the next or previous column.
The column details panel opens to the settings view. The settings view contains the following information:
The table and column name.
Whether the column is flagged as sensitive.
The type of sensitive data that the column contains.
The data type for the column data.
The generator that is assigned to the column.
For a child workspace, whether the column configuration is inherited from the parent workspace. For columns that have overrides, you can reset to the parent configuration.
From the settings view of the column details, you can configure the column sensitivity.
You cannot change the sensitivity of columns in a child workspace. A child workspace always inherits the sensitivity from its parent workspace. For more information, go to About workspace inheritance.
As you change the column sensitivity, Structural updates the protection status panels.
To change whether the column is sensitive, toggle the Sensitive option. If needed, Structural moves the column to the panel that reflects its new status. However, you remain on the current panel.
For example, from the At-Risk Columns panel, you change a column to be not sensitive. The column is moved to the Not Sensitive Columns panel. When you click the next or previous icons, you view the details for the next or previous column on the At-Risk Columns panel.
From the column details, you can assign and configure the column generator.
When you change the column generator, Structural updates the protection status panels.
If the column generator was previously Passthrough, then the column is moved to the Protected Columns panel. However, you remain on the current panel. For example, you assign a generator to a column that is on the At-Risk Columns panel. The column is moved to the Protected Columns panel, but when you click the next or previous icons, you view the details for the next or previous column on the At-Risk Columns panel.
For sensitive columns that are not protected, Structural displays the recommended generator as a button.
For self-hosted instances that have an Enterprise license, the recommended generator is the built-in generator preset.
To assign the recommended generator to the column, click the button.
Otherwise, select the generator from the Generator Type dropdown list.
For more information about selecting a generator, go to Assigning and configuring generators.
If the selected generator requires additional configuration, then an Edit Generator Options link displays below the Generator Type dropdown list.
To display the configuration fields for the generator, click Edit Generator Options.
For information about configuring a selected generator or generator preset, go to Assigning and configuring generators.
After you configure the generator, to return to the settings view, click Back.
From the column details, you can display sample data for the column. The sample data allows you to compare the source and destination versions of the column values.
To display the sample data, click the view sample (magnifying glass) icon.
On the sample data view of the column details:
The Original Data tab shows the values in the source data.
The Protected Output tab shows the values that the generator produced.
For a JSON column, instead of assigning a generator, you can enable Document View.
From Document View, you can view the JSON schema structure and assign generators to individual JSON fields. For more information, go to Using Document View for JSON columns.
To enable Document View, on the column details panel, toggle Use Document View to the on position. When Document View is enabled, the generator dropdown is replaced with the Open in Document View option.
From the column details, you can view and add comments on the column. You might use a comment to explain why you selected a particular generator or marked a column as sensitive or not sensitive.
From the column details, to display the comments for the column, click the comment icon.
The comments view displays any existing comments on the column. The most recent comment is at the bottom of the list. Each comment includes the name of the user who made the comment.
To add the first comment to a column, type the comment into the comment text area, then click Comment.
To add an additional comment, type the comment into the comment text area, then click Reply.
The Privacy Report files that you download from Privacy Hub or the workspace download menu provide an overview of the current protection status based on the current configuration.
This is different from the Privacy Report files that you download from the data generation job details, which show the protection status for the data produced by that data generation.
You can download either:
The Privacy Report .csv file, which provides details about the table columns, the column content, and the current protection configuration.
The Privacy Report PDF file, which provides charts that summarize the privacy ranking scores for the table columns. It also includes the table from the .csv file.
For more information about the Privacy Report files and their content, go to Using the Privacy Report to verify data protection.
To download the report from the workspace management view, click the download icon. In the download menu:
To download the Privacy Report PDF file, click Download Privacy Report PDF.
To download the Privacy Report .csv file, click Download Privacy Report CSV.
To download the report from Privacy Hub, click Reports and Logs, then:
To download the Privacy Report .csv file, click Privacy Report CSV.
To download the Privacy Report PDF file, click Privacy Report PDF.
Privacy Hub provides an option to manually start a new sensitivity scan. For example, you might want to run a new sensitivity scan when:
You add columns to the source database. The new scan identifies whether the new columns contain sensitive data.
The data in a column changes significantly, and a column that Structural originally marked as not sensitive might now contain sensitive data.
You cannot run a sensitivity scan on a child workspace. Child workspaces always inherit the sensitivity results from their parent workspace.
To run a new sensitivity scan, click Run Sensitivity Scan.
When Structural runs a new sensitivity scan:
Structural analyzes and determines the sensitivity of any new columns.
It does not change the sensitivity of existing columns that you marked as sensitive or not sensitive.
For existing columns whose sensitivity you did not change:
Structural does not change the sensitivity of columns that the original scan marked as sensitive.
It can change the sensitivity of columns that the original scan marked as not sensitive.
The protection status panels are updated to reflect the results of the new scan.
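The rules above for how a new scan affects each column can be summarized as a small decision function. This is a sketch of the documented behavior, not Structural's code. The function name and parameters are hypothetical, and previous is None for a column that did not exist before the scan.

```python
from typing import Optional

def sensitivity_after_scan(previous: Optional[bool], user_set: bool, scan_result: bool) -> bool:
    """Return whether a column is sensitive after a new sensitivity scan."""
    if previous is None:
        return scan_result  # new column: the new scan decides
    if user_set:
        return previous     # sensitivity that you set manually is kept
    if previous:
        return True         # columns the original scan marked as sensitive stay sensitive
    return scan_result      # columns marked as not sensitive can change
```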











Tonic Structural provides different license plans to accommodate organizations that are of different sizes and that have more or less complex data architectures.
The Basic license is designed for very small organizations that have a very simple data architecture. It provides access to Structural's core de-identification and data generation features.
The Basic license allows access for a single user, with an option to purchase an additional two users.
There is no access to single sign-on (SSO).
With a Basic license, you can create workspaces for one data connector type. The data connector type must be either PostgreSQL or MySQL.
With a Basic license, your Structural instance can have only one Structural worker. This means that only one sensitivity scan or data generation job can run at a time.
With a Basic license, you can create and configure workspaces, and run data generation for those workspaces.
You can use Privacy Hub to view the current sensitivity status based on the current workspace configuration.
The Basic license does NOT provide access to the following features:
Virtual foreign keys. You can view foreign keys from the data, but you cannot add virtual foreign keys.
Custom generators
With a Basic license, you only have access to the basic version of the Structural API.
You cannot use the basic Structural API to perform the following API tasks, which require the advanced API:
The Professional license is designed for larger organizations that have more complex data architectures. The organization might have a larger team that supports multiple databases.
The Professional license is also granted to pay-as-you-go subscriptions on Structural Cloud.
The Professional license provides access to a larger set of Structural features than the Basic license.
The Professional license allows up to 10 users. You can purchase access for unlimited users as an add-on.
You can use single sign-on (SSO) to manage your Structural users.
With a Professional license, you can create workspaces for up to two types of data connectors. You can purchase one additional data connector type as an add-on.
Those data connectors can be of any type except for Oracle and Db2 for LUW.
With a Professional license, your Structural instance can have more than one Structural worker.
This means that you can run multiple jobs from different workspaces at the same time. You can never run multiple jobs from the same workspace at the same time.
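The concurrency rule can be expressed as a simple check. The following is a minimal sketch, assuming that each worker runs one job at a time. The function and parameter names are hypothetical and are not part of Structural.

```python
def can_start_job(workspace_id: str, running_workspaces: set[str], worker_count: int) -> bool:
    """Return True if a new job for workspace_id could start right now."""
    if workspace_id in running_workspaces:
        return False  # never two jobs from the same workspace at the same time
    # Assumption: one job per worker, so a free worker is required.
    return len(running_workspaces) < worker_count
```

With a single worker, as on a Basic license, the check only passes when no other job is running.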
With a Professional license, you can do the following:
Create and configure workspaces, and run data generation for those workspaces.
Use Privacy Hub to view the current sensitivity status for your workspace configuration.
Grant other users Manager and Editor access to your workspaces. The Professional license does not allow you to assign the built-in Viewer and Auditor permission sets.
Make comments on table columns. The comments can trigger email notifications.
Run post-job scripts and configure webhooks.
Use subsetting to generate a smaller destination database.
Create and manage generator presets.
Create and manage custom sensitivity rules.
Create virtual foreign keys.
Use upsert to add destination database records and update existing destination database records, but keep unchanged destination database records in place. The Professional license does not allow you to connect to migration scripts.
Use Schema Changes view to view and address both conflicting and non-conflicting changes to the source data schema.
Use Structural data encryption to have Structural decrypt source data, encrypt destination data, or both.
Request custom value processors, which are primarily developed to preserve encryption that can't be managed using Structural data encryption. You can also purchase custom generators.
The Professional license does NOT provide access to the following features:
With a Professional license, you only have access to the basic version of the Structural API.
You cannot use the basic Structural API to perform the following API tasks, which require the advanced API:
The Enterprise license is ideal for very large organizations that have multiple teams that support very large and complex data structures, and that might have more requirements related to scale and compliance.
It provides full access to all Structural features.
An Enterprise instance does not limit the number of users.
You can use any number of any of the available data connectors.
The Enterprise license provides exclusive access to the Oracle and Db2 for LUW data connectors.
The following features are exclusive to the Enterprise license:
The Enterprise license provides exclusive access to the advanced API.
The advanced Structural API provides access to all of the available API tasks, including the following tasks that are not available in the basic API:
The following table compares the available features for the Structural license plans.
Number of users
Basic: 1. 2 additional users available as add-ons.
Professional: 10. Unlimited users available as an add-on.
Enterprise: Unlimited.
Data connectors
Basic: 1 data connector. PostgreSQL or MySQL.
Professional: 2 data connectors. 1 additional data connector available as an add-on. Any data connector except for Oracle or Db2 for LUW.
Enterprise: Unlimited number from any available data connector.
Workspace permission sets
Basic: Manager.
Professional: Manager, Editor.
Enterprise: Manager, Editor, Auditor, Viewer.
Custom generators
Basic: Not available.
Professional: Available for purchase.
Enterprise: 2 included. Additional ones available for purchase.
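Concurrent jobs (more than 1 worker)
Basic: Not available.
Professional: ✓
Enterprise: ✓
Structural API
Basic: Basic API.
Professional: Basic API.
Enterprise: Advanced API.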
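For reference, the headline limits of the three license plans can be captured in a small lookup structure. This is only an illustration of the limits described above. It ignores add-ons, and the names LicensePlan, PLANS, and can_add_user are hypothetical, not part of Structural or its API.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class LicensePlan:
    included_users: Optional[int]    # None means unlimited
    connector_types: Optional[int]   # None means unlimited
    permission_sets: tuple[str, ...]
    api: str

PLANS = {
    "Basic": LicensePlan(1, 1, ("Manager",), "basic"),
    "Professional": LicensePlan(10, 2, ("Manager", "Editor"), "basic"),
    "Enterprise": LicensePlan(None, None, ("Manager", "Editor", "Auditor", "Viewer"), "advanced"),
}

def can_add_user(plan_name: str, current_users: int) -> bool:
    """Check the included user limit only; purchased add-ons are not modeled."""
    limit = PLANS[plan_name].included_users
    return limit is None or current_users < limit
```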
The following table summarizes the available generators. The table includes generator characteristics that you might take into account when you select the generator to use for a column.
Generator hints and tips also provides some suggestions for generators to use for specific use cases.
API:
Generates replacement values for U.S. mailing addresses. You select the address component or format for the replacement values. For example, the column might only contain a street address or a postal code, or it might contain a full address.
Consistency - Self and other
Linkable
Differential privacy if not consistent
Data-free if not consistent
Privacy ranking:
1 if not consistent
4 if consistent
API:
Identifies the algebraic relationship between 3 or more numeric values, including at least one non-integer. Based on the relationship, generates new values to match. If there is no relationship, uses the Categorical generator.
Linkable - linking is required
Privacy ranking: 3
API:
Generates unique alphanumeric strings of the same length as the input.
For example, for the original value ABC123, the output value is a six-character alphanumeric string such as D24N05.
Consistency - Self only
Primary key generator
Unique columns allowed
Format-preserving encryption (FPE)
Privacy ranking:
3 if not consistent
4 if consistent
API:
Within an array, replaces letters with random other letters, and numbers with random other numbers. Preserves punctuation and whitespace.
Consistency - Self only
Privacy ranking:
3 if not consistent
4 if consistent
API:
Used to transform array values in JSON.
To identify values to transform, you provide a list of JSONPaths. For each JSONPath, you assign a sub-generator to apply to matching values.
Composite generator. Feature support is based on the sub-generators.
Privacy ranking: 5
API:
Used to transform values in an array. To identify values to transform, you provide a regular expression. For each capture group in an expression, you assign a sub-generator to apply to matching values.
Composite generator. Feature support is based on the sub-generators.
Privacy ranking: 5
API:
Generates unique alphanumeric strings based on any printable ASCII characters. You can optionally exclude lowercase letters from the generated values. The replacement value does not preserve the length of the original value.
Consistency - Self only
Primary key generator
Unique columns allowed
Format-preserving encryption (FPE)
Privacy ranking:
3 if not consistent
4 if consistent
API:
Generates a random company name-like string.
Consistency - Self or other
Differential privacy if not consistent
Data-free if not consistent
Privacy ranking:
1 if not consistent
4 if consistent
API:
Shuffles the original values for a column to different rows. Maintains the overall frequency of each value.
For example, a column contains the values Small (3 times), Medium (4 times), and Large (5 times). In the transformed data, each value appears the same number of times, but the values are shuffled to different rows.
Linkable
Differential privacy is configurable
Privacy ranking:
2 with differential privacy
3 without differential privacy
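The basic shuffle behavior in the example above can be sketched as follows. This is a simplified illustration that does not apply differential privacy or linking. The function name shuffle_column and the seed parameter are assumptions.

```python
import random
from collections import Counter

def shuffle_column(values, seed=None):
    """Move the existing values to different rows without changing their counts."""
    rng = random.Random(seed)
    shuffled = list(values)
    rng.shuffle(shuffled)
    return shuffled

sizes = ["Small"] * 3 + ["Medium"] * 4 + ["Large"] * 5
shuffled_sizes = shuffle_column(sizes, seed=0)
# The frequency of each value is unchanged.
assert Counter(shuffled_sizes) == Counter(sizes)
```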
API:
Replaces letters with random other letters and numbers with random other numbers. Preserves punctuation, whitespace, and mathematical symbols.
Consistency - Self only
Privacy ranking:
3 if not consistent
4 if consistent
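A character scramble of this kind can be sketched as shown below. This is an approximation under stated assumptions, not the generator's implementation: it only handles ASCII letters and digits, it keeps the case of each letter, and it does not implement consistency. The function name is hypothetical.

```python
import random
import string

def character_scramble(value: str, rng: random.Random) -> str:
    """Replace letters with other letters and digits with other digits."""
    out = []
    for ch in value:
        if ch.isascii() and ch.isalpha():
            pool = string.ascii_uppercase if ch.isupper() else string.ascii_lowercase
            out.append(rng.choice(pool.replace(ch, "")))  # always a different letter
        elif ch.isascii() and ch.isdigit():
            out.append(rng.choice(string.digits.replace(ch, "")))
        else:
            out.append(ch)  # punctuation, whitespace, and symbols are kept
    return "".join(out)
```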
API:
Replaces characters with other random characters. Preserves punctuation, capitalization, and whitespace.
A replacement character is always from within the same Unicode Block as the source character.
A source character is always mapped to the same destination character. For example, M might always map to V.
Always self-consistent
Unique columns allowed
Privacy ranking: 4
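The fixed character-to-character mapping can be illustrated with a seeded substitution table. This sketch is limited to ASCII letters and digits, and the seed stands in for whatever secret Structural actually uses. The function names are hypothetical. A shuffled alphabet might map a character to itself, which the real generator might avoid.

```python
import random
import string

def build_substitution(seed: int) -> dict[int, str]:
    """Build a one-to-one mapping within lowercase, uppercase, and digits."""
    rng = random.Random(seed)
    table: dict[int, str] = {}
    for alphabet in (string.ascii_lowercase, string.ascii_uppercase, string.digits):
        shuffled = list(alphabet)
        rng.shuffle(shuffled)
        table.update(zip(map(ord, alphabet), shuffled))
    return table

def substitute(value: str, table: dict[int, str]) -> str:
    """Apply the mapping; characters outside the table are kept as-is."""
    return value.translate(table)
```

Because the mapping is one-to-one, distinct source values produce distinct output values, which is consistent with the generator being allowed on unique columns.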
(Deprecated) API:
This generator is deprecated. Use the generator instead. Generates a random company name-like string.
Consistency - Self or other
Differential privacy if not consistent
Data-free if not consistent
Privacy ranking:
1 if not consistent
4 if consistent
API:
Applies different generators to rows conditionally based on the column value. For example, apply the Character Scramble generator for values other than Test. You configure a list of conditions. Each condition performs a check against the column value. For each condition, you assign a sub-generator to apply to matching values.
Unique columns allowed
Composite generator. Other feature support is based on the sub-generators.
Privacy ranking:
If a fallback generator is selected, then the lower of 5 or the ranking of the fallback generator
5 if no fallback generator is selected
API:
Uses a single specified value to replace all of the values in the column. The replacement value must be compatible with the column data type.
Differential privacy
Data-free
Privacy ranking: 1
API:
Generates a continuous distribution to fit the underlying data. Can link to other columns to create multivariate distributions. Can also be partitioned by other columns.
Linkable
Differential privacy is configurable
Privacy ranking:
2 with differential privacy
3 without differential privacy
API:
Populates the column using the sum of values from a column in another table. To select the rows to use, uses a foreign key value that matches the primary key value for the current row. For example, to transform the Total_Sales column in the Customers table, from the Transactions table, use the sum of the Amount values for rows where the Customer_ID value matches the primary key value for the current customer.
Privacy ranking: 3
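The Total_Sales example can be sketched with plain Python dictionaries. The Customers primary key name ID is an assumption, as is the function name. The other column names come from the example above.

```python
def cross_table_sum(customers, transactions):
    """Set Total_Sales to the sum of Amount for each customer's transactions."""
    totals = {}
    for row in transactions:
        key = row["Customer_ID"]
        totals[key] = totals.get(key, 0) + row["Amount"]
    # Customers without matching transactions get a total of 0.
    return [{**c, "Total_Sales": totals.get(c["ID"], 0)} for c in customers]

customers = [{"ID": 1}, {"ID": 2}, {"ID": 3}]
transactions = [
    {"Customer_ID": 1, "Amount": 10},
    {"Customer_ID": 1, "Amount": 5},
    {"Customer_ID": 2, "Amount": 2},
]
result = cross_table_sum(customers, transactions)
```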
API:
Used to mask text in a delimited format.
Parses the text as a row where the columns are delimited by a specified character. For each index, you assign a sub-generator to apply to the index value.
Composite generator. Feature support is based on the sub-generators.
Privacy ranking: 5
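The per-index behavior can be sketched as follows. This simplified illustration splits on the delimiter only and does not handle quoted fields. The zero-based indexes, the function name, and the use of plain callables as sub-generators are assumptions.

```python
def mask_delimited(value: str, delimiter: str, sub_generators: dict) -> str:
    """Apply the sub-generator assigned to each index; keep other fields as-is."""
    fields = value.split(delimiter)
    masked = [
        sub_generators[i](field) if i in sub_generators else field
        for i, field in enumerate(fields)
    ]
    return delimiter.join(masked)
```

For example, mask_delimited("alice|42|NY", "|", {0: str.upper, 1: lambda f: "0"}) returns ALICE|0|NY.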
API:
Replaces the original column value with a value from a list of values that you provide.
Consistency - Self and other
Linkable
Differential privacy if not consistent
Data-free if not consistent
Privacy ranking:
1 if not consistent
4 if consistent
API:
Truncates dates or timestamps to a specific date or time component. For example, you might truncate a date value to the month or a timestamp to the hour.
Privacy ranking: 5
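Truncation to a date or time component can be sketched with the standard datetime module. The function name and the set of unit names are assumptions for illustration.

```python
from datetime import datetime

def truncate(ts: datetime, unit: str) -> datetime:
    """Reset every component that is smaller than the selected unit."""
    if unit == "year":
        return ts.replace(month=1, day=1, hour=0, minute=0, second=0, microsecond=0)
    if unit == "month":
        return ts.replace(day=1, hour=0, minute=0, second=0, microsecond=0)
    if unit == "day":
        return ts.replace(hour=0, minute=0, second=0, microsecond=0)
    if unit == "hour":
        return ts.replace(minute=0, second=0, microsecond=0)
    raise ValueError(f"unsupported unit: {unit}")
```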
API:
Scrambles characters in an email address.
Preserves the formatting and keeps the @ and . characters.
You can identify specific email domains to not scramble.
Consistency - Self only
Privacy ranking:
3 if not consistent
4 if consistent
API:
Generates timestamps that fit an event distribution. You can link columns to create a sequence of events across multiple columns. You can also partition the generator by other columns.
Linkable
Privacy ranking: 3
API:
Scrambles characters in a file name.
Preserves the formatting and the file extension.
Consistency - Self only
Privacy ranking:
3 if not consistent
4 if consistent
API:
Replaces all instances of the find string with the replace string. For the find string, you can optionally provide a regular expression.
Privacy ranking: 5
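The two modes, literal and regular expression, can be sketched with Python's standard string and re functions. The function name and the use_regex flag are hypothetical, and the regular expression syntax that Structural accepts might differ from Python's.

```python
import re

def find_and_replace(value: str, find: str, replace: str, use_regex: bool = False) -> str:
    """Replace every occurrence of find with replace."""
    if use_regex:
        return re.sub(find, replace, value)
    return value.replace(find, replace)
```

In literal mode, find_and_replace("a.b.c", ".", "-") returns a-b-c. With use_regex=True, the same find string would match every character.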
API:
Generates a valid Finnish Personal Identity Code (PIC).
You configure the date range during which the PIC was issued.
Consistency - Self only
Data-free if not consistent
Unique columns allowed
Format-preserving encryption (FPE)
Privacy ranking:
1 if not consistent
4 if consistent
API:
Transforms Norwegian national identity numbers. You can optionally preserve the gender and birthdate portions of the identifier values.
Consistency - Self and other
Unique columns allowed
Privacy ranking:
3 if not consistent
4 if consistent
API:
Used to transform columns that contain latitude and longitude values.
Linkable
Unique columns allowed
Privacy ranking: 3
API:
Can be used to generate cities, states, zip codes, and latitude/longitude values that follow HIPAA guidelines for safe harbor.
Consistency - Self only
Privacy ranking:
3 if not consistent
4 if consistent
API:
Generates random host names, based on the English language.
Consistency - Self and other
Differential privacy if not consistent
Data-free if not consistent
Privacy ranking:
1 if not consistent
4 if consistent
API:
Used to transform values in an HStore column in a PostgreSQL database. You specify a list of keys for which to transform the values. For each key, you assign a generator to apply to the key value.
Composite generator. Feature support is based on the sub-generators.
Privacy ranking: 5
API:
Used to transform columns that contain HTML content. To identify the values to transform, you provide a list of path expressions. For each path expression, you assign a generator to apply to the matching value.
Composite generator. Feature support is based on the sub-generators.
Privacy ranking: 5
API:
Generates unique integer values.
By default, the generated values are within the range of the column’s data type.
You can also specify a range for the generated values. The source values must be within that range.
Consistency - Self only
Differential privacy if not consistent
Data-free if not consistent
Primary key generator
Unique columns allowed
Format-preserving encryption (FPE)
Privacy ranking:
1 if not consistent
4 if consistent
API:
For Canadian mailing addresses, can generate:
Street name
Postal code
For United Kingdom (UK) mailing addresses, can generate postal codes.
Consistency - Self only
Differential privacy if not consistent
Data-free if not consistent
Privacy ranking:
1 if not consistent
4 if consistent
API:
Generates a random IP address-formatted string. You specify the percentage of IPv4 addresses. The remaining addresses are IPv6.
Consistency - Self or other
Differential privacy if not consistent
Data-free if not consistent
Privacy ranking:
1 if not consistent
4 if consistent
API:
Used to transform values in JSON columns. To identify values to transform, you provide a list of JSONPaths.
For each JSONPath, you assign a sub-generator to apply to matching values.
Composite generator. Feature support is based on the sub-generators.
Privacy ranking: 5
API:
Generates a random MAC address-formatted string.
Consistency - Self only
Differential privacy if not consistent
Data-free if not consistent
Format-preserving encryption (FPE)
Privacy ranking:
1 if not consistent
4 if consistent
API:
Generates unique MongoDB ObjectId values. Can be assigned to text columns that contain MongoDB ObjectId values. The column value must be 12 bytes long.
Consistency - Self only
Privacy ranking:
3 if not consistent
4 if consistent
API:
Generates a random name string from a dictionary of first and last names. You specify the name format. For example, a column might contain only a first name, or a full name that is last name first.
Consistency - Self or other
Differential privacy if not consistent
Data-free if not consistent
Privacy ranking:
1 if not consistent
4 if consistent
API:
Masks values in numeric columns.
Either adds random noise to the original value or multiplies the original value by random noise.
Consistency - Self or other
Privacy ranking:
3 if not consistent
4 if consistent
API:
Replaces all of the column values with NULL values.
Differential privacy
Data-free
Unique columns allowed
Privacy ranking: 1
API:
Generates unique numeric strings of the same length as the input numeric string.
Consistency - Self only
Primary key generator
Unique columns allowed
Format-preserving encryption (FPE)
Privacy ranking:
3 if not consistent
4 if consistent
API:
Default generator. Does not perform any transformation on the source data.
Unique columns allowed
Privacy ranking: 6
API:
Generates a random telephone number that matches the country or region and format of the input telephone number. For invalid telephone numbers, either replaces individual numbers or generates a valid replacement number.
Consistency - Self only
Privacy ranking: 3
API:
Generates a random boolean value. You specify the percentage of true values. The remaining values are false.
Differential privacy
Data-free
Privacy ranking: 1
API:
Generates a random double number that is between the specified minimum (inclusive) and maximum (exclusive) values.
Differential privacy
Data-free
Privacy ranking: 1
API:
Generates a random hash string.
Differential privacy
Data-free
Privacy ranking: 1
API:
Returns a random integer that is between the specified minimum (inclusive) and maximum (exclusive) values.
Differential privacy
Data-free
Privacy ranking: 1
API:
Generates random dates, times, and timestamps that fall within a specified range.
Differential privacy
Data-free
Privacy ranking: 1
API:
Generates a random new UUID string.
Differential privacy
Data-free
Unique columns allowed
Privacy ranking: 1
API:
To identify values to transform, you provide a regular expression.
For each capture group in an expression, you assign a sub-generator to apply to matching values.
Unique columns allowed
Composite generator. Other feature support is based on the sub-generators.
Privacy ranking: 5
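Applying a sub-generator to each capture group can be sketched with Python's re module. This is a minimal sketch that assumes the capture groups are not nested and that only the first match is transformed. The function name and the use of plain callables as sub-generators are hypothetical.

```python
import re

def regex_mask(value: str, pattern: str, sub_generators: dict) -> str:
    """Transform the text of each capture group; keep everything else."""
    match = re.search(pattern, value)
    if match is None:
        return value  # values that do not match are left unchanged
    pieces, cursor = [], 0
    for index in range(1, match.re.groups + 1):
        start, end = match.span(index)
        if start == -1 or index not in sub_generators:
            continue  # group did not participate or has no sub-generator
        pieces.append(value[cursor:start])
        pieces.append(sub_generators[index](match.group(index)))
        cursor = end
    pieces.append(value[cursor:])
    return "".join(pieces)
```

For example, with the pattern ID-(\d+)-([A-Z]+), a sub-generator that zeroes digits for group 1, and str.lower for group 2, the value ID-1234-XY becomes ID-0000-xy.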
API:
Generates a column of unique integer values that start with a specified value, and then increment by 1 for each processed row.
Linkable
Unique columns allowed
Privacy ranking: 3
API:
Generates ISO 6346-compliant shipping container codes. The codes are all in the freight ("U") category.
Consistency - Self or other
Differential privacy if not consistent
Data-free if not consistent
Privacy ranking:
1 if not consistent
4 if consistent
API:
Generates a new valid Canadian Social Insurance Number. Preserves the formatting from the original value.
Consistency - Self only
Data-free if not consistent
Unique columns allowed
Format-preserving encryption (FPE)
Privacy ranking:
1 if not consistent
4 if consistent
API:
Generates a new valid United States Social Security Number. For numeric columns, the dashes (xxx-xx-xxxx) are always excluded. Otherwise, you can specify the percentage of values for which to include the dashes.
Consistency - Self or other
Differential privacy if not consistent
Data-free if not consistent
Privacy ranking:
1 if not consistent
4 if consistent
API:
Used to transform StructFields within a StructType in Spark databases (Databricks and Amazon EMR). To identify the StructField value to transform, you provide a path expression. For each path expression, you assign a sub-generator to apply to the matching values.
Composite generator. Feature support is based on the sub-generators.
Privacy ranking: 5
API:
Shifts timestamps by a random amount of a specific unit of time, within a set range. The range can start before the original value.
Consistency - Self or other
Privacy ranking:
3 if not consistent
4 if consistent
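A random shift within a range can be sketched as follows, using days as the unit. The function and parameter names are assumptions, and the sketch does not implement consistency. A negative minimum lets the range start before the original value.

```python
import random
from datetime import datetime, timedelta

def shift_timestamp(ts: datetime, min_days: int, max_days: int, rng: random.Random) -> datetime:
    """Shift ts by a random whole number of days in [min_days, max_days]."""
    return ts + timedelta(days=rng.randint(min_days, max_days))
```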
API:
Generates unique email addresses.
Replaces the username with a randomly generated GUID, and masks the domain with a character scramble.
Consistency - Self only
Unique columns allowed
Privacy ranking:
3 if not consistent
4 if consistent
API:
Used to transform URLs. Preserves the formatting. Keeps the URL scheme and top-level domain intact.
Unique columns allowed
Privacy ranking: 3
API:
Generates UUIDs.
Consistency - Self only
Primary key generator
Unique columns allowed
Format-preserving encryption (FPE)
Privacy ranking:
3 if not consistent
4 if consistent
API:
Used to transform values in XML columns. To identify the values to transform, you provide XPaths. For each XPath, you assign a sub-generator to apply to the matching values.
Composite generator. Feature support is based on the sub-generators.
Privacy ranking: 5