
Structural implementation roles

A Tonic Structural implementation can involve the following roles - from those who set up the Structural environment to the consumers of the data that Structural processes.

Note that these roles are not related to role-based access control (RBAC) within Structural, which is managed using permission sets.

Infrastructure engineers

For self-hosted instances of Structural.

Infrastructure engineers set up the Structural application and its relevant dependencies. They are typically DevOps, Site Reliability Engineering (SRE), or Kubernetes cluster administrators.

Infrastructure engineers perform the following Structural-related tasks:

  • Ensure that the proper infrastructure is ready for Structural installation based on the deployment checklist.

  • Follow the installation instructions, and work with Tonic.ai support as needed.

  • Perform routine maintenance of Structural and the Structural environment. Update Structural and its dependencies as needed.

  • Create Structural-processed data pipelines for development and testing workflows.

Database administrators

For both self-hosted instances of Structural and Structural Cloud.

Database administrators integrate Structural into your data architecture to support Structural data connectors.

They ensure that source databases are available to Structural, and that Structural can write to destination databases.

Database administrators perform the following Structural-related tasks:

  • Set up the required Structural access to source databases.

  • Set up destination databases for Structural to write transformed data to.

Structural users

Structural users are the actual users of the Structural application.

Depending on the use case, Structural users might be compliance analysts, DevOps, or data engineers.

Structural users perform the following Structural-related tasks:

  • Use the Structural data generation workflow to configure the logic used to transform the source data and to generate the transformed data.

  • Work with data consumers to produce usable data.

Data consumers

Data consumers are the end users of transformed destination data.

They are typically QA testers, developers, or analysts.

Data consumers perform the following Structural-related tasks:

  • Validate the usability of the destination data.

  • Provide guidance on application-specific requirements for data.

Security and compliance

Security and compliance specialists ensure and validate that the data that Structural produces meets expectations, and that Structural is compliant with other security-related processes.

Security and compliance specialists perform the following Structural-related tasks:

  • Provide guidance on what data is sensitive.

  • Sign off on proposed approaches to mask sensitive data.

  • Approve data access and permissions.

Structural deployment types

Self-hosted Tonic Structural instance

You can deploy a self-hosted, on-premises instance of Tonic Structural.

For a self-hosted instance, Structural provides administrator tools that allow you to monitor Structural services and manage Structural users.

You can configure Structural environment settings to customize your instance.

On a self-hosted instance, you have access to the full set of supported data connectors, subject to your license plan.

Structural Cloud

Structural Cloud is our secure hosted environment. On Structural Cloud, Tonic.ai handles monitoring Structural services and updating Structural.

For single sign-on (SSO), Structural Cloud only supports Okta.

Structural Cloud does not include:

  • Custom permission sets

  • Environment setting configuration. Structural Cloud uses a single configuration.

  • Structural data encryption

  • Access to the following data connectors:

    • Amazon EMR

    • Amazon Redshift

    • Db2 for LUW

    • Spark SDK

Structural Cloud also supports a pay-as-you-go plan, which allows free trial users to move to a monthly subscription. For more information, go to Setting up and managing a Structural Cloud pay-as-you-go subscription.

Each Structural Cloud user belongs to a Structural Cloud organization, which is determined either by the user's email domain or by a workspace invitation. Structural Cloud users do not have any access to workspaces or users from other organizations.

Each free trial user is in a separate organization, along with any users that they invite to have access to a free trial workspace.

For information about Structural Cloud organizations, go to Structural organizations.

The Account Admin permission set allows a Structural Cloud user to manage organization users and workspaces. For information about granting access to the Account Admin permission set, go to Granting Account Admin access for a Structural Cloud organization.

Tonic Structural User Guide

The Tonic Structural platform creates safe, realistic datasets to use in staging environments or for local development. The Structural web application and API can be used by engineers, data analysts, or security experts.

Structural connects to source databases that contain sensitive data such as personally identifiable information (PII) or protected health information (PHI). To protect that data, Structural transforms the sensitive values and then writes the transformed data to a destination location.

Data flow from the source database through Tonic Structural to the destination database

New to Structural? Review the Tonic Structural workflow overview. For information on how to create a Structural account and start a Structural free trial, go to Getting started with the Structural free trial.

Want to know what's in the latest Structural releases? Go to the Tonic Structural release notes.

The Structural application heading includes a feature updates icon, which displays a summary of the newest features, and includes a link to the Structural release notes.

Feature updates icon

Connect to your data

Configure and generate transformed data

Manage a self-hosted Tonic Structural instance

Need help with Structural? Contact [email protected].

Structural data generation workflow

Tonic Structural data generation combines sensitive data detection and data transformation to create safe, secure, and compliant datasets.

The Structural data generation workflow involves the following steps:

Overview diagram of the Tonic Structural data generation workflow

You can also view this video overview of the Structural data generation workflow.

  1. To get started, you create a workspace. When you create a workspace, you identify the type of source data, such as PostgreSQL or MySQL, and establish the connections to the source database and the destination location. The source database contains the original data that you want to synthesize. The destination location is where Structural stores the synthesized data. It might be a database, a storage location, a container repository, or an Ephemeral database snapshot.

  2. Next, you analyze the results of the initial sensitivity scan. The sensitivity scan identifies columns that contain sensitive data. These columns need to be protected by a generator.

  3. Based on the sensitivity scan results, you configure the data generation. The configuration includes:

    • Assigning table modes to tables. The table mode controls the number of rows and columns that are copied to the destination database.

    • Indicating column sensitivity. You can make adjustments to the initial sensitivity assignments. For example, you can mark additional columns as sensitive that the initial scan did not identify as sensitive.

    • Assigning and configuring column generators. To protect the data in a column, especially a sensitive column, you assign a generator to it. The generator replaces the source value with a different value in the destination database. For example, the generator might scramble the characters or assign a random value of the same type.

  4. After you complete the configuration, you run the data generation job. The data generation job uses the configured table modes and generators to transform the data from the source database and write the transformed data to the destination location. You can track the job progress and view the job results.

About Tonic Structural

The Tonic Structural synthetic data platform combines sensitive data detection and data transformation to allow users to create safe, secure, and compliant datasets.

Common Structural use cases include creating staging and development environments, and trying out a new cloud provider without complex data agreements.

Structural allows you to reduce bug counts, shorten testing life cycles, and share data with partners, all while helping to ensure security and compliance with the latest regulations, from GDPR to CCPA.

You can use the Structural API to integrate with CI/CD pipelines or to create automated processes that ensure that the generated data is available on demand.
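For example, a CI/CD pipeline can call the API to trigger a data generation run and wait for it to finish. The following Python sketch only illustrates the pattern; the host, API key, endpoint paths, and response fields shown here are placeholders, so check the Structural API reference for the exact routes and payloads.

```python
import time
import requests

BASE_URL = "https://your-structural-instance.example.com"  # placeholder host
HEADERS = {"Authorization": "Apikey YOUR_API_KEY"}          # placeholder credentials
WORKSPACE_ID = "your-workspace-id"                          # placeholder workspace

# Trigger a data generation run for the workspace (illustrative route).
resp = requests.post(
    f"{BASE_URL}/api/generate-data/start",
    params={"workspaceId": WORKSPACE_ID},
    headers=HEADERS,
)
resp.raise_for_status()
job_id = resp.json()["id"]

# Poll until the job reaches a terminal state, then fail the pipeline
# if the run was not successful.
while True:
    status = requests.get(
        f"{BASE_URL}/api/generate-data/jobs/{job_id}",
        headers=HEADERS,
    ).json()["status"]
    if status in ("Completed", "Failed", "Canceled"):
        break
    time.sleep(30)

assert status == "Completed", f"Data generation did not succeed: {status}"
```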

Structural data generation workflow

Overview of the Structural steps to generate de-identified data.

Structural deployment types

You can use Structural Cloud or set up a self-hosted Structural instance.

Structural implementation roles

Functions that participate in a Structural implementation.

Structural license plans

View the license options and their available features.

Workspaces

A workspace contains the data connections and data generation configuration.

Data connectors

Each data connector allows Structural to read from and write to a specific type of data source.

Privacy Hub

View and update the current protection status based on the sensitivity scan and workspace configuration.

Database View

Configure transformation options for tables and columns.

Generators

A generator is assigned to a column and performs a data transformation.

Subsetting

Configure a subset of source data to include in the transformed destination data.

Generate data

Run the data generation process to produce transformed destination data.

Schema changes

Review and address changes to the source data schema.

User access

Manage who has access to your instance.

Monitoring and logging

Monitor Structural services and share logs with Tonic.ai.

Updating Structural

Upgrade to the latest version of Structural.

Assigning tags to a workspace

Required workspace permission: Configure workspace settings

You can associate custom tags with each workspace. Tags help to organize workspaces and provide an at-a-glance view of the workspace configuration.

Tags are accessible to every user that has access to the workspace.

Tags are stored in the workspace JSON, and are included in the workspace export. You can also use the API to get access to tags.

Managing tags from workspace settings

You can add and edit tags in the Tags field on the New Workspace and Workspace Settings views.

  • To add tags, enter a comma-separated list of the tags to add.

  • To remove a tag, click its delete icon.

Managing tags from Workspaces view

You can also manage tags directly from Workspaces view.

Assigning tags

To add tags to a workspace that does not currently have tags:

  1. Hover over the Tags column for the workspace.

  2. Click Add Tags.

  3. In the tag input field, type a comma-separated list of tags to apply.

  4. Press Enter.

Editing the assigned tags

To edit the assigned tags:

  1. Click the Tags column for the workspace.

  2. In the tag input field, to remove a tag, click its delete icon.

  3. To add tags, type a comma-separated list of the tags to add.

  4. To save the tag changes, press Enter.

Displaying sample data for a column

Required workspace permission:

  • Source data: Preview source data

  • Destination data: Preview destination data

For each column on Database View, you can display a sample list of the column values.

For columns that have an assigned generator, the sample shows both the current values and the possible values after the generator is applied.

To display the sample values, in the Column column, click the magnifying glass icon.

If the generator is Passthrough, then the sample data panel contains only Original Data.

Sample data for a column that does not have an assigned generator

If a different generator is assigned, then the sample data panel contains both Original Data and Protected Output.

Sample data for a column that has an assigned generator

Identifying similar columns

During sensitivity scans and schema change scans, Tonic Structural identifies groups of similar columns.

To identify similar columns, Structural uses a text embedding model to calculate the semantic similarity between any two column names in the database. When a column name's semantic similarity to the name of a given column is above a specified threshold, then the column is similar to the given column.
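To make the idea concrete, the following Python sketch scores column-name similarity with a text embedding model and a cosine-similarity cutoff. It is illustrative only: Structural's actual model and threshold are internal, and the model name and cutoff below are assumptions.

```python
# Illustrative only: Structural's embedding model and threshold are not public.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed stand-in model
THRESHOLD = 0.75                                  # assumed similarity cutoff

def similar_columns(target: str, column_names: list[str]) -> list[str]:
    """Return the column names whose embedding is close to the target's."""
    embeddings = model.encode([target] + column_names)
    scores = util.cos_sim(embeddings[0], embeddings[1:])[0]
    return [
        name
        for name, score in zip(column_names, scores)
        if score >= THRESHOLD
    ]

print(similar_columns("email_address", ["user_email", "zip_code", "contact_email"]))
```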

If a column has similar columns, then the Applied Generator column contains an icon that includes the count of similar columns.

Similar columns icon with the count of similar columns

By default, the similar columns icon is hidden. To display the similar columns icon, hover over the column row.

When you assign a generator to a column, the similar columns icon for that column remains visible during your current session.

When you click the similar columns icon, Structural displays a panel with an option to filter the list to display the current column and its similar columns. To apply the filter, click Filter.

Similar columns panel with filter option

The similar columns filter is applied, and other column filters are removed. Table filters remain in place.

Database View column list with a similar columns filter applied

Manually indicating whether a column is sensitive

You can also manually indicate that a column is sensitive or not sensitive.

For example, the sensitivity scan might incorrectly identify a column as sensitive. Or a column might contain data that you consider sensitive but that does not match a detected sensitivity type.

When you manually change a column from not sensitive to sensitive, Structural marks the sensitivity detection as full confidence.

For information on how to change whether a column is sensitive, go to the related instructions:

  • For Privacy Hub

  • For Database View, both for a single column and for multiple selected columns

  • For Table View

The Structural API also provides endpoints to designate columns as sensitive or not sensitive.

Logging into Structural for the first time

When you go to Tonic Structural for the first time, you create an account. How you create an account depends on the type of user you are.

A new Structural user can be one of the following:

  • A completely new user who is starting a Structural 14-day free trial. Free trial users use Structural Cloud to explore and experiment with Structural before they decide whether to purchase it.

  • A new user on a self-hosted Structural instance. Self-hosted instances are installed on-premises, and the customer administers the Structural users.

  • A new user in an existing Structural Cloud organization. New users are added to existing organizations based on their email domain.

Workspace configuration settings

The workspace settings for a new workspace (New Workspace view) or edited workspace (Workspace Settings tab) provide information about the workspace and its data.

Workspace identification and connection type

Every workspace includes the following settings to identify the workspace and to select the type of data connector.

Fields to identify the workspace

All workspaces have the following fields that identify the workspace:

  1. In the Workspace name field, enter the name of the workspace.

  2. In the Workspace description field, provide a brief description of the workspace. The description can contain up to 200 characters.

  3. In the Tags field, provide a comma-separated list of tags to assign to the workspace. For more information on managing tags, go to Assigning tags to a workspace.

Connection type

Under Connection Type, select the type of data connector to use for the workspace data. You cannot change the connection type on a child workspace.

The Basic and Professional licenses limit the number and type of data connectors you can use.

  • A Basic instance can only use one data connector type, which can be either PostgreSQL or MySQL. After you create your first workspace, any subsequent workspaces must use the same data connector type.

  • A Professional instance can use up to two different data connector types, which can be any type other than Oracle or Db2 for LUW. After you create workspaces that use two different data connector types, any subsequent workspaces must use one of those data connector types.

If the database that you want to connect to isn't in the list, or you want to have different database types for your source and destination database, contact [email protected].

When you select a connector type, Structural updates the view to display the connection fields used for that connector type.

Data generation settings

Blocking data generation for all schema changes

Most workspaces that connect to a database have a Block data generation if schema changes detected toggle. The setting is usually in the Source Settings section.

By default, the option is turned off. When the option is off:

  • Structural blocks data generation when there are sensitive schema changes. A sensitive schema change is a change that might result in data leakage. For example, a new column is added. Until you assign a generator to the column, the data is passed through as is to the output data. Before you can generate data with the new column, you must resolve the schema change.

  • Structural does not block data generation when there are only notification schema changes. A notification schema change is a change that Structural can ignore during data generation. For example, a column is removed. Structural can ignore that column during data generation, until you are able to permanently resolve the schema change.

If this option is turned on, then Structural blocks data generation when it detects any schema change at all. Data generation remains blocked until you resolve the schema changes.

For more information, go to Viewing and resolving schema changes.

Advanced workspace overrides

For self-hosted instances, Structural provides environment settings to configure features that include:

  • Consistency across runs and databases

  • Data generation performance

The Advanced Workspace Overrides section of the workspace details view allows you to override those environment settings for an individual workspace.

For example, the environment setting TONIC_TABLE_PARALLELISM determines the number of tables that Structural processes simultaneously. You can then override that value within individual workspaces.

The workspace overrides are available on both self-hosted instances and on Structural Cloud.

Configuring the overrides

To display the available override settings, expand Advanced Workspace Overrides.

Enabling and setting an override

For information on how to configure the statistics seed, go to Workspace statistics seed for cross-run consistency.

For other settings, to enable the override and set the override value:

  1. Toggle the setting to the on position.

  2. Set the value.

Removing an override

To remove the override, toggle the setting to the off position.

Available overrides

Workspace statistics seed for cross-run consistency

For generators where consistency is enabled, a statistics seed enables consistency across data generation runs. The Structural-wide statistics seed value ensures consistency across both data generation runs and workspaces.

You use the Override Statistics Seed setting to override the Structural-wide statistics seed value.

You can either disable consistency across data generations, or provide a seed value for the workspace. The workspace seed value ensures consistency across data generation runs for that workspace, and across other workspaces that have the same seed value.

For details about using seed values to ensure consistency across data generation runs and databases, go to the consistency documentation.

Data generation performance settings

Structural provides environment settings to manage data generation performance. For example, these settings include configuration for parallel processing.

From Advanced Workspace Overrides, you can override some of these data generation performance settings for an individual workspace.

Data encryption and decryption keys

To use Structural data encryption, you must provide encryption and decryption keys in the environment settings.

You use the Override Data Decryption Key and Override Data Encryption Key settings to override the Structural-wide keys that are provided in the environment settings.

Destination database schema creation

Some data connectors allow you to configure whether you provide the schema for the destination database. For more information, go to the related information for Databricks, MySQL, Oracle, Snowflake on AWS, Snowflake on Azure, and SQL Server.

From Advanced Workspace Overrides, you can override the instance-wide configuration.

Exporting and importing the workspace configuration

Required workspace permission: Export and import workspace

You can export a workspace configuration to a JSON file, and import configuration from a workspace configuration JSON file.

For example, you might want to preserve a version of the workspace configuration before you test other changes. You can then use the exported file to restore the original configuration.

Or you might want to use a script to make changes to an exported configuration file. You can then import the updated file to update the workspace configuration.
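As a minimal sketch of that scripting pattern, the following Python snippet loads an exported file, edits it, and writes an updated copy to import. The key names inside the JSON are assumptions for illustration; the actual structure of the exported file may differ.

```python
import json

# Load a previously exported workspace configuration.
with open("workspace_export.json") as f:
    config = json.load(f)

# Illustrative edit: the key names here are assumptions, not the real schema.
for column in config.get("columnConfigurations", []):
    if column.get("generator") == "Passthrough":
        column["generator"] = "CharacterScrambleGenerator"

# Write the updated file, ready to import back into the workspace.
with open("workspace_export_updated.json", "w") as f:
    json.dump(config, f, indent=2)
```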

Information in the exported file

The workspace JSON configuration file includes the following information:

  • Sensitivity designations that you assigned to columns

  • Assigned table modes

  • Assigned column generators

  • Subsetting configuration

  • Post-job script configuration

Exporting the workspace configuration

To export the workspace configuration, either:

  • On the workspace management view, from the download menu, select Export Workspace.

  • On Workspaces view, click the actions menu for the workspace, then select Export.

When you export a child workspace, the exported workspace does not retain any of the inheritance information. The exported information is the same for all exported workspaces.

Importing a workspace configuration file

To import a workspace configuration file:

  1. Select the import option. Either:

    • On the workspace management view, from the download menu, select Import Workspace.

    • On Workspaces view, click the actions menu for the workspace, then select Import.

  2. On the Import Workspace dialog, to select the file to import, click Browse.

  3. After you select the file, click Import.

When you import a workspace configuration into a child workspace, Tonic Structural only updates the configuration that can be overridden. If a configuration must be inherited from the parent workspace, then it is not affected by the imported configuration. For more information, go to About workspace inheritance.

Managing access to workspaces

When you create a workspace, you become the owner of the workspace, and by default are assigned the built-in Manager workspace permission set for the workspace. The Manager permission set provides full access to the workspace configuration, data, and results.

With a Professional or Enterprise license, you can also assign workspace permission sets to other users and to SSO groups. You can also transfer a workspace to a different owner.

If you are granted access to any workspace permission set for a workspace, then you have access to all of the workspace management views for that workspace. However, you can only perform tasks that you have permission for in that workspace.

Workspace access is managed from the Workspaces view. You cannot assign workspace permission sets from Structural Settings view.

You can also view an overview video tutorial about workspace access.

Commenting on columns

Required license: Professional or Enterprise

From Database View, you can add comments to columns. For example, you might use a comment to explain why you selected a particular generator or marked a column as sensitive or not sensitive.

Creating a new comment

If a column does not have any comments, then to add a comment:

  1. In the Applied Generator column, click the comment icon.

  2. In the comment field, type the comment text.

  3. Click Comment.

Responding to existing comments

When a column has existing comments, the comment icon is green. To add comments:

  1. Click the comment icon. The comments panel shows the previous comments. Each comment includes the comment user.

  2. In the comment field, type the comment text.

  3. Click Reply.

Working with document-based data

For document-based data connectors - currently MongoDB and Amazon DynamoDB - Database View and Table View are replaced by Collection View.

"Collection" is the term that Structural uses to refer to MongoDB collections and DynamoDB tables.

Structural also allows you to run collection scans to identify the data structure.

Identifying sensitive data

Tonic Structural uses its sensitivity scan to identify source data columns that contain sensitive information. The scan ignores truncated tables.

The sensitivity scan identifies Structural's built-in sensitivity types. It also looks for custom types that you define.

You can also manually mark a column as sensitive or not sensitive.

Algebraic

The algebraic generator identifies the algebraic relationship between three or more numeric values and generates new values to match. At least one of the values must be a non-integer.

If a relationship cannot be found, then the generator defaults to the Categorical generator.

This generator can be linked with other Algebraic generators.

Characteristics

Consistency

No, cannot be made consistent.

Linking

Yes, can be linked.

Differential privacy

No

Data-free

No

Allowed for primary keys

No

Allowed for unique columns

No

Uses format-preserving encryption (FPE)

No

Privacy ranking

3

Generator ID (for the API)

AlgebraicGenerator

How to configure

To configure the generator, from the Link To dropdown list, select the columns to link this column to. You can select other columns that are assigned the Algebraic generator.

You must select at least three columns.

The column values must be numeric. At least one of the columns must contain a value other than an integer.

If Structural data encryption is enabled, then to use it for this column, in the advanced options section, toggle Use data encryption process to the on position.

ASCII Key

Generates unique alphanumeric strings based on any printable ASCII characters. The length of the source string is not preserved. You can choose to exclude lowercase letters from the generated values.

Characteristics

Consistency

Yes, can be made self-consistent.

Linking

No, cannot be linked.

Differential privacy

No

Data-free

No

Allowed for primary keys

Yes

Allowed for unique columns

Yes

Uses format-preserving encryption (FPE)

Yes

Privacy ranking

  • 3 if not consistent

  • 4 if consistent

Generator ID (for the API)

AsciiPkGenerator

How to configure

To configure the generator:

  1. To exclude lowercase letters from the generated values, toggle Exclude Lowercase Alphabet to the on position.

  2. Toggle the Consistency setting to indicate whether to make the generator consistent. By default, the generator is not consistent.

  3. If Structural data encryption is enabled, then to use it for this column, in the advanced options section, toggle Use data encryption process to the on position.

Find and Replace

This generator replaces all instances of the find string with the replace string.

For example, you can indicate to replace all instances of abc with 123.

Characteristics

Consistency

No, cannot be made consistent.

Linking

No, cannot be linked.

Differential privacy

No

Data-free

No

Allowed for primary keys

No

Allowed for unique columns

No

Uses format-preserving encryption (FPE)

No

Privacy ranking

5

Generator ID (for the API)

FindAndReplaceGenerator

How to configure

To configure the generator:

  1. In the Find field, type the string to look for in the source column value. To use a regular expression to identify the source value, check the Use Regex checkbox. If you use a regular expression, use backslash ( \ ) as the escape character.

  2. In the Replace field, type the string to replace the matching string with.

  3. If Structural data encryption is enabled, then to use it for this column, in the advanced options section, toggle Use data encryption process to the on position.
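To illustrate the regex find-and-replace semantics from step 1, here is a rough Python equivalent. It is illustrative only; the pattern and values below are examples, and Structural's matching engine may differ in details.

```python
import re

# Replace every match of the find pattern with the replace string,
# comparable to enabling Use Regex with these values.
find_pattern = r"\d{3}-\d{2}-\d{4}"   # example: SSN-shaped strings
replace_with = "XXX-XX-XXXX"

source_value = "SSN on file: 123-45-6789"
print(re.sub(find_pattern, replace_with, source_value))
# SSN on file: XXX-XX-XXXX
```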

Finnish Personal Identity Code

Generates a valid Finnish Personal Identity Code (PIC) that would have been issued during a specific date range.

Characteristics

Consistency

Yes, can be made self-consistent.

Linking

No, cannot be linked.

Differential privacy

No, cannot be made differentially private.

Data-free

Yes, if consistency is not enabled.

Allowed for primary keys

No

Allowed for unique columns

Yes

Generator ID (for the API)

FinnishPicGenerator

How to configure

To configure the generator:

  1. Under Date Range, set the start and end dates for the date range to generate the PICs for.

  2. Toggle the Consistency setting to indicate whether to make the generator self-consistent. By default, the generator is not consistent.

  3. If Structural data encryption is enabled, then to use it for this column, toggle Use data encryption process to the on position.


Identification and connection type

Settings to identify the workspace and to select the data connector.

Data connection settings

Connect to source and destination databases or, for the file connector, local or cloud storage files.

Data generation settings

Block data generation on schema changes.

Enable and configure upsert

Add new destination records and update changed destination records. Ignore other unchanged destination records.

Write output to Tonic Ephemeral

Use the data generation output to create an Ephemeral user snapshot.

Write output to a container repository

Use the data generation output to populate a container data volume.

Advanced workspace overrides

Workspace-specific settings for cross-run consistency and data generation performance.


Share workspace access

Grant access to other users. The assigned workspace permission sets determine the level of access.

Transfer ownership of a workspace

Make another user the workspace owner. You can also assign yourself workspace permission sets.


Scan collections

Collection scans identify the fields and data types in a collection.

Use Collection View

Configure generators and collection modes.

Run the Structural sensitivity scan

Run, configure, and get the results of the sensitivity scan.

Set column sensitivity manually

Options to override the results of the sensitivity scan.

Built-in sensitivity types

Types of sensitive data that the sensitivity scan can identify.

Configure custom sensitivity rules

Set up rules to enable the scan to identify other sensitive columns based on the column data types and names.

Transferring ownership of a workspace

Required permission

  • Global permission: View organization users. This permission is only required for the Tonic Structural application. It is not needed when you use the Structural API.

  • Either:

    • Workspace permission: Transfer workspace ownership

    • Global permission: Manage access to Tonic Structural and to any workspace

To grant yourself access after the transfer:

  • Workspace permission: Share workspace access

Every workspace has an owner. The owner is always an individual user, not a group.

The user who creates the workspace is automatically the owner of the workspace.

By default, the workspace owner is assigned the built-in Manager workspace permission set. On Enterprise instances, you can choose a different workspace permission set to assign to all workspace owners.

You cannot remove that permission set from the workspace owner.

You can transfer a workspace to a different owner. The new owner is assigned the owner permission set. If the previous owner does not otherwise have access to the owner permission set, then that permission set is removed from them.

To transfer workspace ownership:

  1. To transfer ownership of a single workspace, from the workspace actions menu, select Transfer Ownership.

  2. To transfer ownership of multiple workspaces:

    1. Check the checkbox for each workspace to transfer.

    2. From the Actions menu, select Transfer Ownership.

  3. On the transfer ownership panel, from the User dropdown list, select the new owner.

  4. If you are the current owner of the workspace, then to grant yourself non-owner access after you transfer the ownership:

    1. Toggle Receive access to workspace to the on position.

    2. Select the workspace permission set to assign to yourself.

  5. Click Transfer Ownership.


Database View

Database View provides a complete view of your source database structure and configuration.

To display Database View, either:

  • On the workspace management view, in the workspace navigation bar, click Database View.

  • On Workspaces view, from the dropdown menu in the Name column, select Database View.

Database View consists of:

  • On the left, the list of tables in the source database.

  • On the right, the list of columns in those tables.

Database View

View table and column information

Configure and comment on columns

Generator information

Generators transform the data in a source database column. You assign the generators to use. Tonic Structural offers a variety of generators to transform different types of data.

For details about how to assign and configure generators, and manage generator presets, go to Generator assignment and configuration.

You can also view this video overview of generators and how they work.

About the available generators

Generator characteristics and types

Alphanumeric String Key

Generates unique alphanumeric strings of the same length as the input.

For example, for the source value ABC123, the output value is a six-character alphanumeric string such as D24N05.

Characteristics

Consistency

Yes, can be made self-consistent.

Linking

No, cannot be linked.

Differential privacy

No

Data-free

No

Allowed for primary keys

Yes

Allowed for unique columns

Yes

Uses format-preserving encryption (FPE)

Yes

Privacy ranking

  • 3 if not consistent

  • 4 if consistent

Generator ID (for the API)

How to configure

To configure the generator, toggle the Consistency setting to indicate whether to make the generator self-consistent.

By default, the generator is not consistent.

If Structural data encryption is enabled, then to use it for this column, in the advanced options section, toggle Use data encryption process to the on position.

Array Character Scramble

A version of the Character Scramble generator that can be used for array values.

This generator replaces letters with random other letters, and numbers with random other numbers. Punctuation and whitespace are preserved.

For example, for the following array value:

["ABC.123", 3, "last week"]

The output might be something like:

["KFR.860", 7, "sdrw mwoc"]

This generator securely masks letters and numbers. There is no way to recover the original data.

Characteristics

Consistency

Yes, can be made self-consistent.

Linking

No, cannot be linked.

Differential privacy

No

Data-free

No

Allowed for primary keys

No

Allowed for unique columns

No

Uses format-preserving encryption (FPE)

No

Privacy ranking

  • 3 if not consistent

  • 4 if consistent

Generator ID (for the API)

How to configure

To configure the generator, toggle the Consistency setting to indicate whether to make the generator self-consistent.

By default, the generator is not consistent.

If Structural data encryption is enabled, then to use it for this column, in the advanced options section, toggle Use data encryption process to the on position.

Business Name

Generates a random company name-like string.

Characteristics

Consistency

Yes, can be made self-consistent or consistent with another column.

Linking

No, cannot be linked.

Differential privacy

Yes, if consistency is not enabled.

Data-free

Yes, if consistency is not enabled.

Allowed for primary keys

No

Allowed for unique columns

No

Uses format-preserving encryption (FPE)

No

Privacy ranking

  • 1 if not consistent

  • 4 if consistent

Generator ID (for the API)

How to configure

To configure the generator, toggle the Consistency setting to indicate whether to make the generator consistent.

By default, the generator is not consistent.

If consistency is enabled, then by default it is self-consistent. To make the generator consistent with another column, from the Consistent to dropdown list, select the column.

When the generator is consistent with itself, then a given source value is always mapped to the same destination value. For example, My Business is always mapped to New Business.

When the generator is consistent with another column, then a given source value in that other column always results in the same destination value for the company name column. For example, if the company name column is consistent with a name column, then every instance of John Smith in the name column in the source database has the same company name in the destination database.

If Structural data encryption is enabled, then to use it for this column, in the advanced options section, toggle Use data encryption process to the on position.

Categorical

The Categorical generator shuffles the existing values within a field while maintaining the overall frequency of the values. It disassociates the values from other pieces of data. Note that NULL is considered a separate value.

For example, a column contains the values Small, Medium, and Large. Small appears 3 times, Medium appears 4 times, and Large appears 5 times. In the output data, each value still appears the same number of times, but the values are shuffled to different rows.

This generator is optimized for categories with fewer than 10,000 unique values. If your underlying data has more unique values (for example, your field is populated by freeform text entry), we recommend that you use the Character Scramble or Custom Categorical generator.
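Conceptually, the transformation preserves each value's frequency while breaking its association with the rest of the row. A minimal Python sketch of that idea (not Structural's implementation):

```python
import random

column = ["Small"] * 3 + ["Medium"] * 4 + ["Large"] * 5

# Shuffle the existing values across rows; each value keeps its overall count.
shuffled = column[:]
random.shuffle(shuffled)

assert sorted(shuffled) == sorted(column)  # frequencies are unchanged
print(shuffled)
```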

Characteristics

Consistency

No, cannot be made consistent.

Linking

Yes, can be linked.

Differential privacy

Configurable

Data-free

No

Allowed for primary keys

No

Allowed for unique columns

No

Uses format-preserving encryption (FPE)

No

Privacy ranking

  • 2 if differential privacy enabled

  • 3 if differential privacy not enabled

Generator ID (for the API)

How to configure

To configure the generator:

  1. From the Link To dropdown, select the columns to link to the current column. You can select from other columns that use the Categorical generator.

  2. Toggle the Differential Privacy setting to indicate whether to make the output data differentially private. By default, differential privacy is disabled.

  3. If Structural data encryption is enabled, then to use it for this column, in the advanced options section, toggle Use data encryption process to the on position.

Character Scramble

This generator replaces letters with random other letters and numbers with random other numbers. Punctuation, whitespace, and mathematical symbols are preserved.

For example, for the following input string:

ABC.123 123-456-789 Go!

The output would be something like:

PRX.804 296-915-378 Ab!

This generator securely masks letters and numbers. There is no way to recover the original data.
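Conceptually, the transformation maps each letter to a random letter and each digit to a random digit while leaving everything else untouched. A simplified sketch of that behavior (Structural's actual implementation is not public):

```python
import random
import string

def character_scramble(value: str) -> str:
    """Replace letters with random letters and digits with random digits,
    preserving case, punctuation, and whitespace."""
    out = []
    for ch in value:
        if ch.isdigit():
            out.append(random.choice(string.digits))
        elif ch.isalpha():
            pool = string.ascii_uppercase if ch.isupper() else string.ascii_lowercase
            out.append(random.choice(pool))
        else:
            out.append(ch)
    return "".join(out)

print(character_scramble("ABC.123 123-456-789 Go!"))
```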

Character Scramble is similar to Character Substitution, with a couple of key differences.

While you can enable consistency for the entire value, Character Scramble does not always replace the same source character with the same destination character. Because there is no guarantee of unique values, you cannot use Character Scramble on unique columns.

Character Substitution, however, does always map the same source character to the same destination character. Character Substitution is always consistent, which makes it less secure than Character Scramble. You can use Character Substitution on unique columns.

Characteristics

Consistency

Yes, can be made self-consistent

Linking

No, cannot be linked

Differential privacy

No

Data-free

No

Allowed for primary keys

No

Allowed for unique columns

No

Uses format-preserving encryption (FPE)

No

Privacy ranking

  • 3 if not consistent

  • 4 if consistent

Generator ID (for the API)

How to configure

To configure the generator, toggle the Consistency setting to indicate whether to make the generator self-consistent.

By default, the generator is not consistent.

If Structural data encryption is enabled, then to use it for this column, in the advanced options section, toggle Use data encryption process to the on position.

Character Substitution

Performs a random character replacement that preserves formatting (spaces, capitalization, and punctuation).

Characters are replaced with other characters from within the same Unicode Block. A given source character is always mapped to the same destination character. For example, M might always map to V.

For example, for the following input string:

Miami Store #162

The output would be something like:

Vgkjg Gmlvf #681

Note that for a numeric column, when a generated number starts with a 0, the starting 0 is removed. This could result in matching output values in different columns. For example, one column is changed to 113 and the other to 0113, which also becomes 113.

Character Substitution is similar to Character Scramble, with a couple of key differences. Because Character Substitution always maps the same source character to the same destination character, it is always consistent. It also can be used for unique columns.

In Character Scramble, the character mapping is random, which makes Character Scramble slightly more secure. However, Character Scramble cannot be used for unique columns.
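The consistent per-character mapping can be sketched as a fixed random permutation of each character class. This is a simplification, assuming ASCII letters and digits; Structural substitutes within the source character's Unicode block.

```python
import random
import string

rng = random.Random(42)  # fixed seed keeps the mapping stable across runs

def _permutation(alphabet: str) -> dict[str, str]:
    shuffled = list(alphabet)
    rng.shuffle(shuffled)
    return dict(zip(alphabet, shuffled))

# One fixed mapping per character class; every occurrence of a character
# always maps to the same substitute, so joins on text columns survive.
MAPPING = {
    **_permutation(string.ascii_lowercase),
    **_permutation(string.ascii_uppercase),
    **_permutation(string.digits),
}

def character_substitution(value: str) -> str:
    return "".join(MAPPING.get(ch, ch) for ch in value)

print(character_substitution("Miami Store #162"))
print(character_substitution("Miami"))  # same input always gives same output
```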

Characteristics

Consistency

This generator is implicitly self-consistent. You do not specify whether the generator is consistent. Every occurrence of a character always maps to the same substitute character. Because of this, it can be used to preserve a join between two text columns, such as a join on a name or email.

Linking

No, cannot be linked.

Differential privacy

No

Data-free

No

Allowed for primary keys

No

Allowed for unique columns

Yes

Uses format-preserving encryption (FPE)

No

Privacy ranking

4

Generator ID (for the API)

How to configure

If Structural data encryption is enabled, then to use it for this column, in the advanced options section, toggle Use data encryption process to the on position.

Company Name

This generator is deprecated. Use the Business Name generator instead.

Generates a random company name-like string.

Characteristics

Consistency

Yes, can be made self-consistent or consistent with another column.

Linking

No, cannot be linked.

Differential privacy

Yes, if consistency is not enabled.

Data-free

Yes, if consistency is not enabled.

Allowed for primary keys

No

Allowed for unique columns

No

Uses format-preserving encryption (FPE)

No

Privacy ranking

  • 1 if not consistent

  • 4 if consistent

Generator ID (for the API)

How to configure

To configure the generator, toggle the Consistency setting to indicate whether to make the generator consistent.

By default, the generator is not consistent.

If consistency is enabled, then by default it is self-consistent. To make the generator consistent with another column, from the Consistent to dropdown list, select the column.

When the generator is consistent with itself, then a given source value is always mapped to the same destination value. For example, My Company is always mapped to New Company.

When the generator is consistent with another column, then a given source value in that other column always results in the same destination value for the company name column. For example, if the company name column is consistent with a name column, then every instance of John Smith in the name column in the source database has the same company name in the destination database.

If Structural data encryption is enabled, then to use it for this column, in the advanced options section, toggle Use data encryption process to the on position.

Constant

Uses a single value to mask all of the values in the column.

For example, you can replace every value in a string column with the value String1. Or you can replace every value in a numeric column with the value 12345.

Characteristics

Consistency

No, cannot be made consistent.

Linking

No, cannot be linked.

Differential privacy

Yes

Data-free

Yes

Allowed for primary keys

No

Allowed for unique columns

No

Uses format-preserving encryption (FPE)

No

Privacy ranking

1

Generator ID (for the API)

How to configure

To configure the generator, in the Constant Value field, provide the value to use.

The value must be compatible with the field type. For example, you cannot provide a string value for an integer column.

If Structural data encryption is enabled, then to use it for this column, in the advanced options section, toggle Use data encryption process to the on position.

Continuous

Generates a continuous distribution to fit the underlying data.

This generator can be linked to other Continuous generators to create multivariate distributions and can be partitioned by other columns.
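As a rough illustration of fitting a continuous distribution to the source data and sampling replacement values from it, here is a univariate Python sketch. The normal fit is an assumption for illustration; Structural also supports linked multivariate distributions.

```python
import numpy as np

source = np.array([10.2, 11.5, 9.8, 10.9, 12.1, 10.4, 11.0])

# Fit a simple normal distribution to the underlying data...
mu, sigma = source.mean(), source.std()

# ...then draw replacement values from the fitted distribution.
rng = np.random.default_rng()
destination = rng.normal(mu, sigma, size=len(source))
print(destination.round(2))
```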

Characteristics

Consistency

No, cannot be made consistent.

Linking

Yes, can be linked.

Differential privacy

Configurable

Data-free

No

Allowed for primary keys

No

Allowed for unique columns

No

Uses format-preserving encryption (FPE)

No

Privacy ranking

  • 2 if differential privacy enabled

  • 3 if differential privacy not enabled

Generator ID (for the API)

How to configure

To configure the generator:

  1. From the Link To drop-down list, select the other Continuous generator columns to link to. The linking creates a multivariate distribution.

  2. From the Partition By drop-down list, select one or more columns to use to partition the data. The selected columns must have the generator set to either Passthrough or Categorical. For more information about partitioning and how it works, go to Partitioning a column.

  3. Toggle the Differential Privacy setting to indicate whether to make the output data differentially private. By default, the generator is not differentially private.

  4. If Structural data encryption is enabled, then to use it for this column, in the advanced options section, toggle Use data encryption process to the on position.

Custom Categorical

A version of the Categorical generator that selects from values that you provide instead of shuffling the original values.

Characteristics

Consistency

Yes, can be made self-consistent or consistent with another column.

Linking

Yes, can be linked.

Differential privacy

Yes, if consistency is not enabled.

Data-free

Yes, if consistency is not enabled.

Allowed for primary keys

No

Allowed for unique columns

No

Uses format-preserving encryption (FPE)

No

Privacy ranking

  • 1 if not consistent

  • 4 if consistent

Generator ID (for the API)

How to configure

To configure the generator:

  1. From the Link To dropdown list, select the columns to link this column to. You can only select other columns that use the Custom Categorical generator.

  2. In the Custom Categories text area, enter the list of values that the generator can choose from. Put each value on a separate line. To add a NULL value to the list, use the keyword {NULL}.

  3. Toggle the Consistency setting to indicate whether to make the column consistent. By default, consistency is disabled.

  4. If you enable consistency, then by default the generator is self-consistent. To make the generator consistent with another column, from the Consistent to dropdown list, select the column. When a generator is self-consistent, then a given value in the source database is always mapped to the same value in the destination database. When a generator is consistent with another column, then a given source value in that column always results in the same value for the current column in the destination database. For example, a department column is consistent with a username column. For each instance of User1 in the source database, the value in the department column is the same.

  5. If Structural data encryption is enabled, then to use it for this column, in the advanced options section, toggle Use data encryption process to the on position.

Event Timestamps

Generates timestamps that fit an event distribution. The source timestamp must include a date. It cannot be a time-only value.

Link columns to create a sequence of events across multiple columns. This generator can be partitioned by other columns.

Characteristics

Consistency

No, cannot be made consistent.

Linking

Yes, can be linked.

Differential privacy

No

Data-free

No

Allowed for primary keys

No

Allowed for unique columns

No

Uses format-preserving encryption (FPE)

No

Privacy ranking

3

Generator ID (for the API)

How to configure

To configure the generator:

  1. From the Link To dropdown list, select the other Event Timestamps generator columns to link this column to. Linking creates a sequence across multiple columns.

  2. From the Partition drop-down list, select one or more columns to use to partition the data. The selected columns must have their generator set to either Passthrough or Categorical. For more information about partitioning and how it works, go to Partitioning a column.

  3. The Options list displays the current column and linked columns. Use the Up and Down buttons to configure the column sequence.

  4. If Structural data encryption is enabled, then to use it for this column, in the advanced options section, toggle Use data encryption process to the on position.

File Name

This generator scrambles characters, but preserves formatting and keeps the file extension intact.

For example, for the following input value:

DataSummary1.pdf

The output value would look something like:

RsnoPwcsrtv5.pdf

This generator securely masks letters and numbers. There is no way to recover the original data.

Characteristics

Consistency

Yes, can be made self-consistent.

Linking

No, cannot be linked.

Differential privacy

No

Data-free

No

Allowed for primary keys

No

Allowed for unique columns

No

Uses format-preserving encryption (FPE)

No

Privacy ranking

  • 3 if not consistent

  • 4 if consistent

Generator ID (for the API)

How to configure

To configure the generator, toggle the Consistency setting to indicate whether to make the generator self-consistent.

By default, the generator is not consistent.

If Structural data encryption is enabled, then to use it for this column, in the advanced options section, toggle Use data encryption process to the on position.

FNR

The FNR generator transforms Norwegian national identity numbers. In Norwegian, the term for national identity number abbreviates to FNR.

The first six digits of an FNR reflect the person's birthdate. You can choose to preserve the birthdates from the source values in the destination values. If you do not preserve the source birthdates, the destination values are still within the same date range as the source values.

Another digit in an FNR indicates whether the person is male or female. You can specify whether the generated value preserves the gender indicated in the source value.

The last digits in an FNR are a checksum value. The last digits in the destination value are not a checksum - the values are random.
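To make the structure concrete, here is a simplified Python sketch that assembles an FNR-shaped value from a birthdate, a gender parity digit, and random final digits. As described above, the final digits are random rather than a real checksum; real FNRs use mod-11 check digits, which this sketch deliberately omits.

```python
import random
from datetime import date

def fnr_like(birthdate: date, male: bool) -> str:
    """Build an FNR-shaped value: six birthdate digits, three individual
    digits (the last indicates gender), and a random two-digit tail."""
    prefix = birthdate.strftime("%d%m%y")          # first six digits: birthdate
    gender_digit = random.choice("13579" if male else "02468")
    individual = f"{random.randint(0, 99):02d}{gender_digit}"
    tail = f"{random.randint(0, 99):02d}"          # random, not a checksum
    return prefix + individual + tail

print(fnr_like(date(1985, 3, 14), male=True))
```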

Characteristics

Consistency

Yes, can be made self-consistent or consistent with another column.

Linking

No, cannot be linked

Differential privacy

No

Data-free

No

Allowed for primary keys

No

Allowed for unique columns

Yes

Uses format-preserving encryption (FPE)

No

Privacy ranking

  • 3 if not consistent

  • 4 if consistent

Generator ID (for the API)

How to configure

To configure the generator:

  1. To preserve the gender from the source value in the destination value, toggle Preserve Gender to the on position.

  2. To preserve the birthdate from the source value in the destination value, toggle Preserve Birthdate to the on position.

  3. Toggle the Consistency setting to indicate whether to make the generator consistent. By default, consistency is disabled.

  4. If you enable consistency, then by default the generator is self-consistent. To make the generator consistent with another column, from the Consistent to dropdown list, select the column. When a generator is self-consistent, then a given value in the source database is always mapped to the same value in the destination database. When a generator is consistent with another column, then a given value for that other column in the source database results in the same value in the destination database. For example, if the FNR column is consistent with a Name column, then every instance of John Smith in the source database results in the same FNR in the destination database.

  5. If Structural data encryption is enabled, then to use it for this column, in the advanced options section, toggle Use data encryption process to the on position.

Geo

This generator can be used to mask columns of latitude and longitude.

The Geo generator divides the globe into grids that are approximately 4.9 x 4.9 km. It then counts the number of points within each grid.

During data generation, each (latitude, longitude) pair is mapped to its grid.

  • If the grid contains a sufficient number of points to preserve privacy, then the generator returns a randomly chosen point in that grid.

  • If the grid does not contain enough points to preserve privacy, then the generator returns a random coordinate from the nearest grid that contains enough points.
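A simplified Python sketch of the grid idea described above: snap coordinates to roughly 4.9 km cells and return a random coordinate from a sufficiently populated cell. The cell size, privacy threshold, and fallback choice here are illustrative assumptions, not Structural's actual parameters.

```python
import random
from collections import defaultdict

CELL_DEG = 0.044  # roughly 4.9 km of latitude per grid cell (illustrative)
MIN_POINTS = 5    # assumed privacy threshold

def cell(lat, lon):
    """Map a coordinate to its grid cell."""
    return (int(lat // CELL_DEG), int(lon // CELL_DEG))

def mask_points(points):
    # Count the points in each grid cell.
    counts = defaultdict(int)
    for lat, lon in points:
        counts[cell(lat, lon)] += 1

    masked = []
    for lat, lon in points:
        key = cell(lat, lon)
        if counts[key] < MIN_POINTS:
            # Stand-in for "nearest cell with enough points":
            # fall back to the best-populated cell overall.
            key = max(counts, key=counts.get)
        # Return a random coordinate within the chosen cell.
        ci, cj = key
        masked.append(((ci + random.random()) * CELL_DEG,
                       (cj + random.random()) * CELL_DEG))
    return masked
```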

Characteristics

Consistency

No, cannot be made consistent.

Linking

Yes, can be linked.

Differential privacy

No

Data-free

No

Allowed for primary keys

No

Allowed for unique columns

Yes

Uses format-preserving encryption (FPE)

No

Privacy ranking

3

Generator ID (for the API)

How to configure

To configure the generator:

  1. From the Link To dropdown list, select the column to link to this one. You typically assign the Geo generator to both the latitude and longitude columns, then link those columns.

  2. From the value type dropdown, select whether this column contains a latitude value or a longitude value.

  3. If Structural data encryption is enabled, then to use it for this column, in the advanced options section, toggle Use data encryption process to the on position.

Hostname

Generates random host names, based on the English language.

Characteristics

Consistency

Yes, can be made self-consistent or consistent with another column.

Linking

No, cannot be linked.

Differential privacy

Yes, if consistency is not enabled.

Data-free

Yes, if consistency is not enabled.

Allowed for primary keys

No

Allowed for unique columns

No

Uses format-preserving encryption (FPE)

No

Privacy ranking

  • 1 if not consistent

  • 4 if consistent

Generator ID (for the API)

HostnameGenerator

How to configure

To configure the generator, toggle the Consistency setting to indicate whether to make the generator consistent.

By default, the generator is not consistent.

If you enable consistency, then by default the generator is self-consistent. To make the generator consistent with another column, from the Consistent to dropdown list, select the column.

When the generator is consistent with itself, then a given value in the source database is mapped to the same value in the destination database. For example, Host123 in the source database always produces MyHostABC in the destination database.

When the generator is consistent with another column, then a given source value in the other column results in the same host name value in the destination database. For example, a host name column is consistent with a department column. Every instance of Sales in the source data is given the same host name in the destination database.

If Structural data encryption is enabled, then to use it for this column, in the advanced options section, toggle Use data encryption process to the on position.

Integer Key

Generates unique integer values. By default, the generated values are within the range of the column’s data type.

You can also specify a range for the generated values. The source values must be within that range.

This generator cannot be used to transform negative numbers.

Characteristics

Consistency

Yes, can be made self-consistent.

Linking

No, cannot be linked.

Differential privacy

Yes, if consistency is not enabled.

Data-free

Yes, if consistency is not enabled.

Allowed for primary keys

Yes

Allowed for unique columns

Yes

Uses format-preserving encryption (FPE)

Yes

Privacy ranking

  • 1 if not consistent

  • 4 if consistent

Generator ID (for the API)

IntegerPkGenerator

How to configure

To configure the generator:

  1. In the Minimum field, enter the minimum value to use for an output value. The minimum value cannot be larger than any of the values in the source data.

  2. In the Maximum field, enter the maximum value to use for an output value. The maximum value cannot be smaller than any of the values in the source data.

  3. Toggle the Consistency setting to indicate whether to make the column self-consistent. By default, consistency is disabled.

  4. If Structural data encryption is enabled, then to use it for this column, in the advanced options section, toggle Use data encryption process to the on position.

Mongo ObjectId Key

Generates unique object identifiers.

Can be assigned to text columns that contain MongoDB ObjectId values. The column value must be 12 bytes long.

Characteristics

Consistency

Yes, can be made self-consistent

Linking

No, cannot be linked

Differential privacy

No

Data-free

No

Allowed for primary keys

No

Allowed for unique columns

No

Uses format-preserving encryption (FPE)

No

Privacy ranking

  • 3 if not consistent

  • 4 if consistent

Generator ID (for the API)

ObjectIdPkGenerator

How to configure

To configure the generator:

  1. A MongoDB ObjectId consists of an epoch timestamp, a random value, and an incrementing counter. To change only the random value portion of the identifier, but keep the timestamp and counter portions, toggle Preserve Timestamp and Incremental Counter to the on position. A sketch of this layout follows these steps.

  2. Toggle the Consistency setting to indicate whether to make the generator self-consistent. By default, the generator is not consistent.

  3. If Structural data encryption is enabled, then to use it for this column, in the advanced options section, toggle Use data encryption process to the on position.
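For illustration, a sketch of the preserve behavior, assuming the standard 12-byte ObjectId layout (4-byte timestamp, 5-byte random value, 3-byte counter, hex-encoded). This is not Structural's implementation.

```python
import secrets

def mask_object_id(oid_hex, preserve_ts_and_counter=True):
    """24 hex chars split as timestamp (8), random (10), counter (6).
    Only the random section is redrawn when the preserve option is on."""
    ts, _, counter = oid_hex[:8], oid_hex[8:18], oid_hex[18:]
    if preserve_ts_and_counter:
        return ts + secrets.token_hex(5) + counter
    return secrets.token_hex(12)

print(mask_object_id("507f1f77bcf86cd799439011"))
```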

Random Boolean

Generates a random boolean value.

Characteristics

Consistency

No, cannot be made consistent.

Linking

No, cannot be linked.

Differential privacy

Yes

Data-free

Yes

Allowed for primary keys

No

Allowed for unique columns

No

Uses format-preserving encryption (FPE)

No

Privacy ranking

1

Generator ID (for the API)

RandomBooleanGenerator

How to configure

To configure the generator, in the Percent True field, enter the percentage of values to set to True in the output.

For example, if you set this to 60, then 60% of the output values are True, and 40% of the output values are False.

If you set this to 100, then all of the output values are True.

If you set this to 0, then all of the output values are False.
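The percentage behaves like a simple probability per row; a one-line sketch of the idea:

```python
import random

def random_boolean(percent_true):
    # percent_true = 60 means roughly 60% of outputs are True.
    return random.random() * 100 < percent_true
```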

If Structural data encryption is enabled, then to use it for this column, in the advanced options section, toggle Use data encryption process to the on position.

Null

Generates NULL values to fill the rows of the specified column.

Characteristics

Consistency

No, cannot be made consistent.

Linking

No, cannot be linked.

Differential privacy

Yes

Data-free

Yes

Allowed for primary keys

No

Allowed for unique columns

Yes

Uses format-preserving encryption (FPE)

No

Privacy ranking

1

Generator ID (for the API)

NullGenerator

How to configure

The Null generator has no configuration options.

Passthrough

Passthrough is the default option.

It passes through the value from the source database to the destination database without masking it.

Characteristics

Consistency

No, cannot be made consistent.

Linking

No, cannot be linked.

Differential privacy

No

Data-free

No

Allowed for primary keys

No

Allowed for unique columns

Yes

Uses format-preserving encryption (FPE)

No

Privacy ranking

6

Generator ID (for the API)

PassthroughGenerator

How to configure

Passthrough has no configuration options.

Data connection settings

After you select the connector type, you configure:

  • Where to find the source data

  • Where to write the data generation output

Source database connection

For data connectors that connect to a database, the Source Settings section provides connection information for the source database.

You cannot change the source data configuration for a child workspace.

For information about the source connection fields for a specific data connector, go to the workspace configuration topic for that connector type.

Upsert configuration

For data connectors that support upsert, the workspace configuration includes an Upsert section to allow you to enable and configure upsert. Upsert adds and updates rows in the destination database, but keeps all other existing rows intact.

If you enable upsert, then you cannot write output to an Ephemeral database or to a container repository. You must write the output to a destination database.

For more information, go to Enabling and configuring upsert.

Destination data location

For data connectors that connect to a database, the Destination Settings section provides information about where and how Structural writes the output data from data generation.

Depending on the data connector type, you might be able to write to one of the following:

  • Destination database - Writes the output data to a destination database on a database server.

  • Ephemeral snapshot - Writes the output data to a Tonic Ephemeral user snapshot.

  • Container repository - Writes the output data to a data volume in a container repository.

Destination database

When you write the output to a destination database, the destination database must be of the same type as the source database.

Structural does not create the destination database. It must exist before you generate data.

In Destination Settings, you provide the connection information for the destination database. For information about the destination database connection fields for a specific data connector, go to the workspace configuration topic for that connector type.

If available, the Copy Settings from Source option copies the source connection details to the destination settings, for when both databases are in the same location. Structural does not copy the connection password.

Tonic Ephemeral snapshot

Tonic Ephemeral is a separate Tonic.ai product that allows you to create temporary databases to use for testing and demos. For more information about Ephemeral, go to the Tonic Ephemeral documentation.

If Ephemeral supports your workspace database type, then you can write the destination data to a snapshot in Ephemeral. For data larger than 10 GB, this option is recommended instead of writing to a container repository.

From Ephemeral, you can use the snapshot to start new Ephemeral databases.

For more information, go to Writing output to Tonic Ephemeral.

Container repository

Some data connectors allow you to write the transformed data to a data volume in a container repository instead of to a database server.

You can use the resulting data volume to create a database in Tonic Ephemeral. If you do plan to use the data to start an Ephemeral database, and the size of the data is larger than 10 GB, then the recommendation is to write the data to an Ephemeral user snapshot instead.

For more information, go to Writing output to a container repository.

Testing database connections

When you provide connection details for a database server, Structural provides a Test Connection button. The test uses the connection details to try to reach the database, verifies that Structural can connect, and indicates whether the attempt succeeded or failed. We strongly recommend that you test the connections.

The TONIC_TEST_CONNECTION_TIMEOUT_IN_SECONDS environment setting determines the number of seconds before a connection test times out. You can configure this setting from the Environment Settings tab on Structural Settings. By default, the connection test times out after 15 seconds.

File connector source and destination data

A file connector workspace uses files as its source data and produces transformed versions of those files as its output.

For file connector workspaces, the File Location section indicates where the source files are obtained from - either a local file system or a cloud storage solution (Amazon S3 or Google Cloud Storage).

When the files come from cloud storage, the Output Location section indicates where to write the transformed files. You must also provide the cloud storage connection credentials.

For more information, go to Configuring the file connector storage type and output options.

Sharing workspace access

Required license: Professional or Enterprise

Required permission

  • Global permission: View organization users. This permission is only required for the Tonic Structural application. It is not needed when you use the Structural API.

  • Either:

    • Workspace permission: Share workspace access

    • Global permission: Manage user access to Tonic and to any workspace

Tonic Structural uses workspace permission sets for role-based access (RBAC) of each workspace.

A workspace permission set is a set of workspace permissions. Each permission provides access to a specific workspace feature or function.

Structural provides built-in workspace permission sets. Enterprise instances can also configure custom permission sets.

To share workspace access, you assign workspace permission sets to users and, if you use SSO to manage Structural users, to SSO groups. Before you assign a workspace permission set to an SSO group, make sure that you are aware of who is in the group. The permissions that are granted to an SSO group are automatically granted to all of the users in the group. For information on how to configure Structural to filter the allowed SSO groups, go to Synchronizing SSO groups with Structural.

You cannot remove the owner workspace permission set from the workspace owner. By default, the owner permission set is the built-in Manager permission set.

To change the current access to the workspace:

  1. To manage access to a single workspace, either:

    • On the workspace management view, in the heading, click the share icon.

    • On Workspaces view, click the actions menu for the workspace, then select Share.

  2. To manage access for multiple workspaces:

    1. Check the checkbox for each workspace to grant access to.

    2. From the Actions menu, select Share Workspaces.

  3. The workspace access panel contains the current list of users and groups that have access to the workspace. To add a user or group to the list, begin to type the user email address or group name. From the list of matching users or groups, select the user or group to add.

    Free trial users can invite other users to start their own free trial. Provide the email addresses of the users to invite. The email addresses must have the same corporate email domain as your email address. When the invited users sign up for the free trial, they are added to the Structural organization of the free trial user that invited them, and have access to the workspace.

  4. For a user or group, to change the assigned workspace permission sets:

    1. Click Access. The dropdown list is populated with the list of custom and built-in workspace permission sets. If you selected multiple workspaces, then on the initial display of the workspace sharing panel, for each permission set that a user or group currently has access to, the list shows the number of workspaces for which the user or group has that permission set. For example, you select three workspaces. A user currently has Editor access for one workspace and Viewer access for the other two. The Editor permission set has 1 next to it, and the Viewer permission set has 2 next to it.

    2. Under Custom Permission Sets, check the checkbox next to each workspace permission set to assign to the user or group. Uncheck the checkbox next to each workspace permission set to remove from the user or group.

    3. Under Built-In Permission Sets, check the workspace permission set to assign to the user or group. You can only assign one built-in permission set. By default, for an added user or group, the Editor permission set is selected. To select a built-in workspace permission set that is lower in access than the currently selected permission set, you must first uncheck the selected permission set. For example, if Editor is currently checked, then to change the selection to Viewer, you must first uncheck Editor.

  5. To remove all access for a user or group, and remove the user or group from the list, click Access, then click Revoke.

  6. To save the new access, click Save.

Built-in sensitivity types that Structural detects

Structural identifies the following types of sensitive values. These include information types that are covered by privacy standards and frameworks such as HIPAA, GDPR, CCPA, and PCI.

For more information about the HIPAA and Safe Harbor information types that Structural detects, go to the Tonic.ai guide Using Tonic Structural and the Safe Harbor method to de-identify PHI.

Names

  • First

  • Last

  • Full

Organization

Location

  • Street address

  • ZIP

  • PO Box

  • City

  • State and two-letter abbreviation

  • Country

  • Postal code

  • GPS coordinates

Contact information

  • Email address

  • Telephone number

User credentials

  • Username

  • Password

Financial information

  • Credit card number

  • International bank account number (IBAN)

  • SWIFT code for bank transfers

  • Money amount

  • BTC (Bitcoin) address

Identification

  • Social Security Number

  • Passport number

  • Driver's license number

  • Birth date

  • Gender

  • Biometric identifier, such as a fingerprint or voiceprint

  • Full face photographic images and similar images

Medical information

  • ICD-9 and ICD-10 codes (used to identify diseases)

  • Medical record number

  • Health plan beneficiary number

  • Admission date

  • Discharge date

  • Date of death

Other personal information

  • Marital status

Accounts and licenses

  • Account number

  • Certificate or license number

Network and web location

  • IP address

  • IPv6 address

  • MAC address

  • Web URL

International Mobile Equipment Identity (IMEI)

Vehicle information

  • Vehicle identification number (VIN)

  • License plate number

Array Regex Mask

This is a composite generator.

A version of the Regex Mask generator that can be used for array values.

Uses regular expressions to parse strings and replace specified substrings with the output of specified generators. The parts of the string to replace are specified inside unnamed top-level capture groups.
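A rough sketch of the capture-group replacement idea follows. Here, sub_generator is a stand-in for whatever generator you assign to a group; this is not Structural's parser.

```python
import re

def apply_regex_mask(value, pattern, sub_generator, replace_all=True):
    """Replace the text captured by each unnamed top-level group with the
    output of a sub-generator."""
    def rewrite(m):
        out, last = [], m.start()
        for i in range(1, (m.re.groups or 0) + 1):
            if m.start(i) == -1:
                continue
            out.append(m.string[last:m.start(i)])  # keep text outside the group
            out.append(sub_generator(m.group(i)))  # replace the captured text
            last = m.end(i)
        out.append(m.string[last:m.end()])
        return "".join(out)
    return re.sub(pattern, rewrite, value, count=0 if replace_all else 1)

# Example: mask only the digits captured by the group.
print(apply_regex_mask("id-12345", r"id-(\d+)", lambda s: "9" * len(s)))
```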

Characteristics

Consistency

Determined by the selected sub-generators.

Linking

Determined by the selected sub-generators.

Differential privacy

Determined by the selected sub-generators.

Data-free

Determined by the selected sub-generators.

Allowed for primary keys

No

Allowed for unique columns

No

Uses format-preserving encryption (FPE)

No

Privacy ranking

5

Generator ID (for the API)

ArrayRegexMaskGenerator

How to configure

Adding a regular expression

To add a regular expression:

  1. Click Add Regex. On the configuration panel, Cell Value shows a sample value from the source database. You can use the previous and next options to navigate through the values.

  2. By default, Replace all matches is enabled. To only match the first occurrence of a pattern, toggle Replace all matches to the off position.

  3. In the Pattern field, enter a regular expression. If the expression is valid, then Structural displays the capture groups for the expression.

  4. For each capture group, to select and configure the generator to apply, click the selected generator. You cannot select another composite generator.

  5. To save the configuration and immediately add another regular expression, click Save and Add Another. To save the configuration and close the add generator panel, click Save.

Managing the regular expressions list

From the Regexes list:

  • To edit a regular expression, click the edit icon.

  • To remove a regular expression, click the delete icon.

Cross Table Sum

Links columns in two tables. The value of this column is the sum of the values in a column in another table.

This generator does not provide a preview. The sums are not computed until the other table is generated.

For example, a Customers table contains a Total_Sales column. The Transactions table uses a foreign key Customer_ID column to identify the customer who made the transaction, and an Amount column that contains the amount of the sale. The Customer_ID value in the Transactions table is a value from the ID primary key column in the Customers table.

You assign the Cross Table Sum generator to the Total_Sales column. In the generator configuration, you indicate that the value is the sum of the Amount values for the Customer_ID value that matches the primary key ID value for the current row.

For the Customers row for ID 123, the Total_Sales column contains the sum of the Amount column for Transactions rows where Customer_ID is 123.
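The aggregation in this example is equivalent to grouping the Transactions rows by Customer_ID and summing Amount; a quick sketch of the semantics, using made-up rows:

```python
from collections import defaultdict

# Source rows, mirroring the example: Transactions carries Customer_ID and Amount.
transactions = [
    {"Customer_ID": 123, "Amount": 25.0},
    {"Customer_ID": 123, "Amount": 10.0},
    {"Customer_ID": 456, "Amount": 99.0},
]

totals = defaultdict(float)
for row in transactions:
    totals[row["Customer_ID"]] += row["Amount"]

# Customers.Total_Sales for ID 123 becomes the sum over its Transactions rows.
print(totals[123])  # 35.0
```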

Characteristics

Consistency

No, cannot be made consistent.

Linking

No, cannot be linked.

Differential privacy

No

Data-free

No

Allowed for primary keys

No

Allowed for unique columns

No

Uses format-preserving encryption (FPE)

No

Privacy ranking

3

Generator ID (for the API)

CrossTableAggregateGenerator

How to configure

To configure the generator:

  1. From the Foreign Table dropdown list, select the table that contains the column for which to sum the values.

  2. From the Foreign Key dropdown list, select the foreign key. The foreign key identifies the row from the current table that is referred to in the foreign table.

  3. From the Sum Over dropdown list, select the column for which to sum the values.

  4. From the Primary Key dropdown list, select the primary key for the current table.

  5. If Structural data encryption is enabled, then to use it for this column, in the advanced options section, toggle Use data encryption process to the on position.

HStore Mask

This is a composite generator.

Runs selected generators on specified key values in an HStore column in a PostgreSQL database. HStore columns contain a set of key-value pairs.

Characteristics

Consistency

Determined by the selected sub-generators.

Linking

Determined by the selected sub-generators.

Differential privacy

Determined by the selected sub-generators.

Data-free

Determined by the selected sub-generators.

Allowed for primary keys

No

Allowed for unique columns

No

Uses format-preserving encryption (FPE)

No

Privacy ranking

5

Generator ID (for the API)

HStoreMaskGenerator

How to configure

Adding a sub-generator

To assign a generator to a key:

  1. Under Sub-generators, click Add Generator. On the sub-generator configuration panel, the Cell HStore field contains a sample value from the source database. You can use the previous and next icons to page through different values.

  2. Under Enter a key, enter the name of a key from the column value. For example, for the column value:

    "pages"=>"446", "title"=>"The Iliad", "category"=>"mythology"

    To apply a generator to the title, you would enter title as the key (a parsing sketch follows these steps). Matched HStore Values shows the result from the value in Cell HStore.

  3. From the Generator Configuration dropdown list, select the generator to apply to the key value. You cannot select another composite generator.

  4. Configure the selected generator. You cannot configure the selected generator to be consistent with another column.

  5. To save the configuration and immediately add a generator for another key, click Save and Add Another. To save the configuration and close the add generator panel, click Save.
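A naive sketch of applying a generator to one key in the quoted HStore form shown above. The mask_hstore_key helper and the scramble lambda are illustrative, not Structural's parser.

```python
import re

def mask_hstore_key(hstore, key, generator):
    """Apply a generator to one key's value in a simple "k"=>"v" HStore string."""
    pattern = re.compile(r'"%s"=>"([^"]*)"' % re.escape(key))
    return pattern.sub(lambda m: '"%s"=>"%s"' % (key, generator(m.group(1))), hstore)

value = '"pages"=>"446", "title"=>"The Iliad", "category"=>"mythology"'
print(mask_hstore_key(value, "title", lambda s: "X" * len(s)))
```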

Managing the sub-generators list

From the Sub-Generators list:

  • To edit a generator assignment, click the edit icon.

  • To remove a generator assignment, click the delete icon.

  • To move a generator assignment up or down in the list, click the up or down arrow.

International Address

Generates an address-like string to replace either:

  • For a Canadian postal address, the street name or postal code.

  • For a United Kingdom (UK) mailing address, the postal code.

To replace a Canadian postal code:

  • The generator selects a real postal code that starts with the same three characters - has the same Forward Sortation Area (FSA) - as the original postal code, but that has a different Local Delivery Unit (LDU).

  • For a postal code whose FSA is not on the list that the generator uses, you can provide a fallback value to use.

To replace a UK postal code, the generator selects a real postal code.
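A rough sketch of the Canadian postal code behavior follows. Unlike the generator, which selects a real postal code, this simply draws a random LDU, and known_fsas stands in for the FSA list that the generator uses.

```python
import random
import string

def mask_ca_postal_code(code, known_fsas, fallback_fsa="NULL"):
    """Keep the FSA (first three characters) and redraw the LDU (last three,
    digit-letter-digit), falling back when the FSA is unknown."""
    fsa, ldu = code[:3], code[-3:]
    if fsa not in known_fsas:
        return fallback_fsa
    while True:
        new_ldu = (random.choice(string.digits)
                   + random.choice(string.ascii_uppercase)
                   + random.choice(string.digits))
        if new_ldu != ldu:
            return fsa + " " + new_ldu

print(mask_ca_postal_code("M5V 3L9", {"M5V"}))
```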

Characteristics

Consistency

Yes, can be made self-consistent.

Linking

No, cannot be linked.

Differential privacy

Yes, if consistency is not enabled.

Data-free

Yes, if consistency is not enabled.

Allowed for primary keys

No

Allowed for unique columns

No

Uses format-preserving encryption (FPE)

No

Privacy ranking

  • 1 if not consistent

  • 4 if consistent

Generator ID (for the API)

InternationalAddressGenerator

How to configure

To configure the generator:

  1. From the Generator Type dropdown list, select International Address.

  2. From the Country dropdown list, select the country (Canada or United Kingdom).

  3. From the Address Component dropdown list, select the address component that this column contains. For Canada, the available options are:

    • Street Name

    • Postal Code

    For the UK, the only option is to generate a postal code.

  4. For a Canadian postal code, in the Fallback Value field, type the FSA to use if the value in the data does not exist. For example, the FSA in the data might be new and not yet in the list that Structural uses, or the FSA might be invalid. By default, the fallback value is NULL, meaning that in the destination data, the postal code value is the string literal "NULL".

  5. Toggle the Consistency setting to indicate whether to make the column self-consistent. By default, consistency is disabled.

  6. If Structural data encryption is enabled, then to use it for this column, in the advanced options section, toggle Use data encryption process to the on position.

Phone

Generates a random telephone number that matches the country or region of the input telephone number, but preserves the format. For example, (123) 456-7890 or 123-456-7890.

If the input is not a valid telephone number, the generator randomly replaces numeric characters. You can also replace invalid numbers with valid numbers.

By default, the numbers are United States telephone numbers.

If the input is a valid telephone number, or if you replace invalid numbers, then the generated numbers pass validation by Google's libphonenumber library.
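A minimal sketch of the format-preserving digit replacement. It does not perform the region and validity checks that the generator applies.

```python
import random

def mask_phone(number):
    """Replace each digit while preserving punctuation and layout,
    e.g. (123) 456-7890 keeps its parentheses, space, and dash."""
    return "".join(str(random.randint(0, 9)) if ch.isdigit() else ch
                   for ch in number)

print(mask_phone("(123) 456-7890"))
```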

Characteristics

Consistency

Yes, can be made self-consistent.

Linking

No, cannot be linked.

Differential privacy

No

Data-free

No

Allowed for primary keys

No

Allowed for unique columns

No

Uses format-preserving encryption (FPE)

No

Privacy ranking

3

Generator ID (for the API)

USPhoneNumberGenerator

How to configure

To configure the generator:

  1. Toggle the Replace invalid numbers setting to indicate whether to replace invalid input values with a valid output value. By default, the generator does not replace invalid values. It randomly replaces numeric characters.

  2. Toggle the Consistency setting to indicate whether to make the generator self-consistent. By default, consistency is disabled.

  3. If Structural data encryption is enabled, then to use it for this column, in the advanced options section, toggle Use data encryption process to the on position.

Random Double

Generates a random double number between the specified minimum (inclusive) and maximum (exclusive).
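The inclusive minimum and exclusive maximum correspond to the standard uniform draw:

```python
import random

def random_double(minimum, maximum):
    # Minimum is inclusive, maximum is exclusive.
    return minimum + random.random() * (maximum - minimum)
```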

Characteristics

Consistency

No, cannot be made consistent.

Linking

No, cannot be linked.

Differential privacy

Yes

Data-free

Yes

Allowed for primary keys

No

Allowed for unique columns

No

Uses format-preserving encryption (FPE)

No

Privacy ranking

1

Generator ID (for the API)

RandomDoubleGenerator

How to configure

To configure the generator:

  1. In the Minimum field, type the minimum value to use in the output values. The minimum value is inclusive. The output values can be that value or higher.

  2. In the Maximum field, type the maximum value to use in the output values. The maximum value is exclusive. The output values are lower than that value.

  3. If Structural data encryption is enabled, then to use it for this column, in the advanced options section, toggle Use data encryption process to the on position.

Random Hash

Generates a random hash string.

Characteristics

Consistency

No, cannot be made consistent.

Linking

No, cannot be linked.

Differential privacy

Yes

Data-free

Yes

Allowed for primary keys

No

Allowed for unique columns

No

Uses format-preserving encryption (FPE)

No

Privacy ranking

1

Generator ID (for the API)

RandomStringGenerator

How to configure

If Structural data encryption is enabled, then to use it for this column, in the advanced options section, toggle Use data encryption process to the on position.

Numeric String Key

Generates unique numeric strings of the same length as the input value.

For example, for the input value 123456, the output value would be something like 832957.

You can apply this generator only to columns that contain numeric strings.
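A sketch of the length-preserving replacement; the uniqueness guarantee that the real generator provides is omitted here.

```python
import random

def numeric_string_key(value):
    """Produce a numeric string of the same length as the input."""
    return "".join(str(random.randint(0, 9)) for _ in value)

print(numeric_string_key("123456"))  # e.g. "832957"
```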

Characteristics

Consistency

Yes, can be made self-consistent.

Linking

No, cannot be linked.

Differential privacy

No

Data-free

No

Allowed for primary keys

Yes

Allowed for unique columns

Yes

Uses format-preserving encryption (FPE)

Yes

Privacy ranking

  • 3 if not consistent

  • 4 if consistent

Generator ID (for the API)

NumericStringPkGenerator

How to configure

To configure the generator, toggle the Consistency setting to indicate whether to make the generator self-consistent.

By default, the generator is not consistent.

If Structural data encryption is enabled, then to use it for this column, in the advanced options section, toggle Use data encryption process to the on position.

Generator summary

Summary list of generators.

Generator reference

Details about the characteristics and configuration options for each generator.

Generator API reference

Details about the structure of each generator assignment in the API.

Generator characteristics

Common generator characteristics to be aware of, such as consistency and linking.

Composite generators

Composite generators apply a generator to a specific data element or based on a condition.

Primary key generators

Learn about generators that you can apply to primary key columns.


View and configure tables

Filter the table list, and assign table modes to tables.

View the column list

Apply filters to and sort the list of columns.

View sample data

View example source and destination data for a column.

Configure an individual column

Assign a generator and determine the column sensitivity.

Configure multiple columns

Use the bulk edit option to configure multiple columns.

Identify similar columns

Identify and filter to columns that are similar to a column, based on the column name.

Comment on columns

Add and respond to column comments.

About workspace inheritance

Required license: Enterprise

If you have multiple workspaces, then it is likely that many of the workspace components and configurations are the same or similar. It can be difficult to maintain that consistency across separate, independent workspaces.

When you copy a workspace, the new workspace is completely independent of the original workspace. There is no visibility into or inheritance of changes from the original workspace.

Workspace inheritance allows you to create workspaces that are children of a selected workspace. Unlike a copy of a workspace, a child workspace remains tied to its parent workspace.

By default, a child workspace's configuration is synchronized with the configuration of the parent. In other words, any changes to the parent workspace are copied to its child workspaces. Child workspaces can also override some of the parent configuration. From the parent workspace, you can track the child workspaces and how they are customized.

For example, you might want separate workspaces for different development teams. Each team can make adjustments to suit their specific projects - such as different subsets - but inherit everything else.

What does a child workspace inherit?

By default, a child workspace inherits all of the configuration from the parent workspace, except for the following:

  • Workspace name - A child workspace has its own name.

  • Workspace description - A child workspace has its own description.

  • Tags - A child workspace has its own tags.

  • Destination database - A child workspace writes output data to its own destination database. You can copy the destination database from the parent workspace.

  • Intermediate database - For upsert, a child workspace does not inherit the intermediate database.

  • Webhooks - A child workspace has its own webhooks.

How parent workspace changes affect child workspaces

When you change the configuration of a parent workspace, the configuration is also updated in the child workspaces.

The exception is when a child workspace overrides the configuration. If the configuration is overridden, then the child workspace does not inherit the change.

Tonic Structural indicates on both the parent and child workspaces when the configuration is overridden.

What can a child workspace override?

A child workspace can override the following configuration items.

  • Table modes - A child workspace can override the table mode for individual tables. The other tables continue to inherit the table mode that is configured in the parent workspace.

  • Column generators - A child workspace can override the generator for individual columns. The other columns continue to inherit the generator that is configured in the parent workspace. For linked columns, a change to any of the linked columns overrides the inheritance for all of the columns.

  • Subsetting - A child workspace can override the subsetting configuration from the parent workspace. Any change in the child workspace means that the child workspace no longer inherits any changes to the subsetting configuration from the parent workspace. For example, if you change the percentage setting on a single target table from 5 to 6, that eliminates the subsetting inheritance. The child workspace keeps the subsetting configuration that it already has, but it is not updated when the parent workspace is updated.

  • Post-job scripts - A child workspace can override the post-job scripts. Any change to the post-job scripts in the child workspace means that the child workspace no longer inherits any changes to the post-job scripts configuration.

  • Statistics seed - A child workspace can override the statistics seed configuration.

From each view, you can eliminate the overrides and restore the inheritance.

What must a child workspace inherit?

A child workspace cannot override the following configuration items:

  • Data connector type and source database - A child workspace always uses the same source data as the parent workspace.

  • Foreign keys - A child workspace always uses the same foreign key configuration as the parent workspace.

  • Sensitivity designation for a column - A child workspace cannot change whether a column is marked as sensitive.

How schema changes are resolved in parent and child workspaces

For removed tables and columns, when a child workspace overrides the parent workspace configuration for the table or column, you must resolve the change in the child workspace.

If there is a conflicting change for the removed table or column in the parent workspace configuration, then regardless of whether the configuration is inherited, you must resolve that change in the parent workspace before the change is resolved for the child workspace.

For changes to column nullability or data type, you resolve the change separately in the child and parent workspaces.

You also dismiss notifications (new tables and columns) separately in the parent and child workspaces.

Running the Structural sensitivity scan

The Structural sensitivity scan identifies sensitive columns in source data. The scan ignores truncated tables.

When sensitivity scans run

Structural runs sensitivity scans automatically based on specific events. You can also run manual sensitivity scans on demand.

On a self-hosted instance, sensitivity scans can also run automatically at the same time each day.

Event-based sensitivity scans

Structural automatically runs a sensitivity scan when you:

  • Create a completely new workspace and connect a data source.

  • Change the data connection details for the source database.

  • Add a file group to a file connector workspace.

A child workspace always inherits the sensitivity designations from its parent workspace.

When you copy a workspace, Structural runs a new sensitivity scan on the copy to identify sensitive columns. However, it keeps the sensitivity designation for columns that you specifically marked as sensitive or not sensitive.

Manual sensitivity scans

In addition to the automatic scans, from Privacy Hub, you can start a sensitivity scan manually.

Scheduling daily sensitivity scans

On self-hosted instances, Structural can also run scheduled daily sensitivity scans in the background.

The daily scans only run on the 10 workspaces that had the most recent activity. Activity includes:

  • User-initiated updates that are included in the Protection Audit Trail.

  • Data generation jobs.

By default, Structural runs the sensitivity scans each day at midnight.

To enable and configure the daily sensitivity scans, use the following environment settings. You can add these settings to the Environment Settings list on Structural Settings.

  • TONIC_ENABLE_SCHEDULED_SENSITIVITY_SCAN - Boolean to indicate whether to enable the scheduled daily sensitivity scans. The default value is true. To disable the scheduled daily scan, set this to false.

  • TONIC_SENSITIVITY_SCAN_HOUR - When scheduled scans are enabled, the hour at which to run the scans. The setting uses the local time zone. The value is an integer between 0 and 23, where 0 is midnight and 23 is 11:00 PM. For example, a value of 14 indicates to run the job at 2:00 PM. The default value is 0.

  • TONIC_PII_SCAN_MAX_TIMEOUT_IN_MINUTES_IF_AUTOMATIC - The number of minutes after which a scheduled scan times out. By default, the scan times out after 3 minutes.
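For example, the following environment settings (the values here are illustrative) enable the daily scan, run it at 2:00 PM local time, and allow it five minutes before timing out:

```
TONIC_ENABLE_SCHEDULED_SENSITIVITY_SCAN=true
TONIC_SENSITIVITY_SCAN_HOUR=14
TONIC_PII_SCAN_MAX_TIMEOUT_IN_MINUTES_IF_AUTOMATIC=5
```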

Configuring parallel processing for sensitivity scans

For improved performance, sensitivity scans can use parallel processing.

For relational databases such as PostgreSQL and SQL Server, to configure parallel processing, you use the environment setting TONIC_PII_SCAN_PARALLELISM_RDBMS. The default value is 4.

For document-based databases such as MongoDB, you use the environment setting TONIC_PII_SCAN_PARALLELISM_DOCUMENTDB. The default value is 1.

How Structural identifies sensitive values

The Structural sensitivity scan uses the following rules and processes to:

  • Identify sensitive columns.

  • Recommend generators for those columns. For information about applying recommended generators to columns, go to Reviewing and applying recommended generators.

  • Indicate its confidence that an identified column is sensitive and is of the detected sensitivity type.

Note that this process cannot guarantee perfect precision and recall. We strongly recommend that a human reviews the sensitivity scan results and the broader dataset to ensure that nothing sensitive was missed.

Rule-based data type, column name, and value analysis - High, medium, or low confidence

To identify that a column contains sensitive information for a built-in sensitivity type, Structural looks at the data type, column name, and column values.

This part of the sensitivity scan uses regular expression matching and dictionary lookups. It produces high, medium, or low confidence detections.

When this part of the sensitivity scan determines that a column contains sensitive data, it:

  • Marks the column as sensitive.

  • Assigns the sensitivity type to the column.

  • Recommends the generator configuration for the identified sensitivity type. Note that if the recommended generator is not compatible with the column, then Structural discards the recommendation.

  • Marks the sensitivity detection as high, medium, or low confidence. The confidence level is based on a calculation of how well the column matched the applicable rules.

Custom sensitivity rules - Full confidence

The sensitivity scan also looks for any columns that match custom sensitivity types that you define in your custom sensitivity rules.

Custom sensitivity rules are based on the column data type and column name. For more information about custom sensitivity rules, go to Creating and managing custom sensitivity rules.

Custom sensitivity rules always produce full confidence detections.

When a column matches a custom sensitivity rule, Structural:

  • Marks the column as sensitive.

  • Assigns the sensitivity rule name as the sensitivity type.

  • Recommends the generator preset from the sensitivity rule.

  • Marks the sensitivity detection as full confidence.

Model-based analysis - Medium and low confidence

To identify additional sensitive columns that might not be captured by the other parts of the scan, the sensitivity scan uses an artificial intelligence (AI) model. Note that the model is pre-trained. Structural does not use customer data to train the model, and it does not send any customer data externally.

This part of the scan produces medium or low confidence detections for built-in entity types.

The model considers the table and column name. If the combination of table and column name is similar in meaning to a sensitivity type that Structural has a recommended generator for, then Structural:

  • Marks the column as sensitive.

  • Assigns the sensitivity type to the column.

  • Recommends the generator configuration for that sensitivity type.

  • Uses AI to compare the table name and column name combination to the sensitivity type, and produces a semantic similarity score.

  • Based on the semantic similarity score, marks the sensitivity detection as either medium or low confidence.

Downloading the sensitivity scan log

To download the log of the most recent sensitivity scan, either:

  • On the workspace management view, from the download menu, select Download Sensitivity Scan Log.

  • On Privacy Hub, click Reports and Logs, then select Scan Log.

The log tracks the progress of the scan.


Managing workspaces

A Tonic Structural workspace provides a context within which to configure and generate transformed data.

A workspace represents a path between the source data and the transformed output data. For example, postgres-prod-copy to postgres-staging.

A workspace includes:

  • Where to find the source data to transform during data generation

  • Where to write the transformed data

  • The rules for the transformation

Manage workspaces

Workspace details and tools

About the workspace management view

You use the workspace management view to configure and run data generation for an individual workspace.

When you log in to Tonic Structural, it displays the workspace management view for the workspace that was selected when you logged out.

Components of the workspace management view

Workspace management view for a workspace

The workspace management view includes the following components.

Workspace information

The top left of the workspace management view provides information about the workspace, including:

Workspace information
  • The workspace name

  • When the workspace was last updated

  • The user who last updated the workspace

  • Whether the workspace is a child workspace

Workspace options

The top right of the workspace management view provides general options for working with the workspace, including:

Workspace options
  • Undo and redo options for configuration changes

  • The workspace share icon, to grant workspace access to other users and groups

  • The workspace download menu to:

    • Download sensitivity scan and privacy reports

    • Export and import workspace configuration

  • The workspace actions menu

  • The Generate Data button, to start a data generation job

Workspace navigation bar

The workspace navigation bar provides access to workspace configuration options.

Workspace navigation bar

Displaying the workspace management view

To display the workspace management view for a workspace:

  • On Workspaces view, in the Name column either:

    • Click the workspace name. The workspace management view opens to Privacy Hub.

    • Click the dropdown icon, then select a workspace management option.

Workspace tools menu
  • Click the search field at the top, then begin to type the name of the workspace. As you type, Structural displays a list of matching workspaces. In the list, click the workspace name.

Workspace search

Collapsing and expanding the workspace heading

To reduce the amount of vertical space used by the heading of the workspace management view, you can collapse it.

To collapse the heading, click the collapse icon in the Structural heading.

Workspace heading with collapse option highlighted

When you collapse the workspace management heading:

  • The workspace information is hidden. The workspace name is displayed in the search field.

  • The workspace options are moved up into the Structural heading.

The workspace navigation bar remains visible.

When you collapse the heading, the collapse icon changes to an expand icon. To restore the full heading, click the expand icon.

Collapsed workspace heading with expand option highlighted

Viewing and configuring tables

The table list at the left of Database View contains the list of tables in the source database. You can filter the table list and assign table modes to the tables.

Information in the table list

The table list is grouped by schema. You can expand and collapse the list of tables in each schema. This does not affect the displayed columns.

For a file connector workspace, each table corresponds to a file group.

Table list on Database View

For each table, the table list includes the following information:

  • The name of the table.

  • The number of columns that have an assigned generator (a generator other than Passthrough). The number does not display if none of the table columns has an assigned generator.

  • The assigned table mode. The table list only shows the first letter of the table mode:

    • D = De-identify

    • S = Scale

    • T = Truncate

    • P = Preserve Destination

    • I = Incremental

For a child workspace, if the selected table mode overrides the parent workspace configuration, then the override icon displays.

Table that overrides the configuration from the parent workspace

To display Table View for a table, click the arrow icon to the right of the table entry.

Filtering the table list

You can filter the table list by name and by the assigned table mode. You can also filter the tables based on whether any of the columns have assigned generators.

As you filter the table list, the column list also is filtered to only include the columns for the filtered tables.

Filtering by table name

To filter the table list by name, in the filter field, begin to type text that is in the table name.

As you type, Tonic Structural filters the list to only display tables with names that contain the filter text.

Filtering the table list by table name

Filtering by the assigned table mode

To filter the table list based on the assigned table mode:

Filtering the table list by table mode
  1. Click Filters.

  2. On the filter panel, check the checkbox next to each table mode to include. By default, the list includes all of the table modes. As you check and uncheck the table mode checkboxes, Structural adds and removes the associated tables from the list.

Filtering to exclude tables that have assigned generators

You can filter the table list to only display tables that have no assigned generators:

  1. Click Filters.

  2. On the filter panel, to only show tables that do not have assigned generators, check the No Generators Applied checkbox.

Assigning table modes to tables

Required workspace permission: Assign table modes

The table mode determines the number of rows and columns in the destination database. For details about the available table modes and how they work, go to Table modes.

Updating a single table

To change the assigned table mode for a single table:

Assigning a table mode to a single table
  1. Click the table mode dropdown next to the table name.

  2. From the table mode dropdown list, select the table mode.

  3. For a child workspace, the table mode selection panel indicates whether the selected table mode is inherited from the parent workspace. If the child workspace currently overrides the parent workspace configuration, then to reset the table mode to the table mode that is assigned in the parent workspace, click Reset.

Updating multiple tables

To change the assigned table mode for multiple tables:

Assigning a table mode to multiple tables
  1. Check the checkbox for each table to change the table mode for. To select a continuous range of tables, click the first table in the range, then Shift-click the last table in the range. To select all of the tables in a schema, click the schema name.

  2. Click Bulk Edit.

  3. On the panel, click the radio button for the table mode to assign to the selected tables.

Configuring multiple columns

The bulk edit option on Database View allows you to configure multiple columns at the same time. From the bulk editing panel, you can:

  • Mark the selected columns as sensitive or not sensitive.

  • Assign a generator to the selected columns.

  • Apply the recommended generator to the selected columns.

  • Reset the generator configuration to the baseline. This option requires that all of the selected columns are assigned the same preset.

Depending on the column selection, you can also create a new sensitivity rule.

Displaying the bulk edit option

To select the columns and display the bulk edit option:

  1. Check the checkbox next to each column to update.

  2. Click Bulk Edit.

Bulk Edit panel to update multiple columns

Marking the columns as sensitive or not sensitive

Required workspace permission: Configure column sensitivity

On the Bulk Edit panel, under Sensitivity:

  • To mark the selected columns as sensitive, click Sensitive.

  • To mark the selected columns as not sensitive, click Not Sensitive.

Changing the assigned generator

Required workspace permission: Configure column generators

On the Bulk Edit panel, under Bulk Edit Applied Generator, select and configure the generator to assign to the selected columns.

Assigning the recommended generator to the columns

Required workspace permission: Configure column generators

If any of the selected columns have a recommended generator, then on the Bulk Edit panel, the Generator recommendations found panel displays. The panel indicates the number of selected columns that have a recommendation.

To assign the recommended generators to those columns, click Apply.

Restoring the baseline configuration for the columns

Required workspace permission: Configure column generators

For a generator preset, the baseline configuration is the configuration that is saved for that preset. The baseline configuration determines the default configuration to use when you assign the preset to a column. After you select the preset, you can override the baseline configuration.

If all of the selected columns are assigned the same preset, then to restore the baseline configuration for all of the columns, click Reset to Baseline.

Creating a sensitivity rule

Required license: Enterprise

Required global permission: Create and manage sensitivity rules

You might bulk edit columns that could benefit from a custom sensitivity rule.

For example, in your data, the Widget column is in multiple tables and contains sensitive data that Structural cannot identify. You select all of the Widget columns so that you can mark them as sensitive and apply the Character Scramble generator to them.

However, a custom sensitivity rule would ensure that in the future, Widget columns are always marked as sensitive and have the Character Scramble generator recommended.

On the Bulk Edit panel, when all of the selected columns:

  • Have the same data type.

  • Do not have a generator assigned.

  • Do not have a recommended generator.

Then Structural displays the Create a Sensitivity Rule panel, which contains the option to create a new sensitivity rule.

Bulk Edit panel with the option to create a sensitivity rule

To create a sensitivity rule:

  1. Click Create Custom Rule.

  2. On the Create Custom Rule view, configure the new sensitivity rule. Structural automatically selects a data type based on the selected columns. The current workspace is used as the testing workspace to verify the columns that match the rule configuration. For details about the sensitivity rule configuration, go to Creating and managing custom sensitivity rules.

  3. When you finish configuring the new rule:

    • To both save the rule and apply the generator preset to all workspace columns that match the rule, click Save and Apply. On the confirmation panel, click Confirm Auto Apply.

    • To save the rule, but not apply the configured generator preset to matching columns, click Save.

Structural closes the sensitivity rule configuration view and returns you to Database View. It maintains the previous column selection.

If you did not apply the generator preset, then the sensitivity rule is included in the next sensitivity scan.


Performing scans on collections

Required workspace permission: Run collection scan

When you first connect to a MongoDB or Amazon DynamoDB database, Tonic Structural performs a scan to determine the available fields in each collection, the field types, and how prevalent the fields are. It performs this scan at the same time as the initial sensitivity scan.

For each collection, Structural creates a hybrid document, which is a superset of all of the fields contained in the collection documents.

Configuring the collection scan

By default, for each collection:

  • The scan includes all of the documents in the collection, and continues until the scan is finished.

  • Every unique path (field+data type) in the collection is added to the hybrid document.

You can change the default scan behavior. To change the scan configuration, use the following environment settings. You can add these settings manually to the Environment Settings list on Structural Settings.

Note that these settings, including settings that include MONGO in the name, apply to both MongoDB and Amazon DynamoDB.

Configuring how schemas are scanned

The following options control the number of documents that Structural scans in a collection.

These options allow you to limit the number of scanned documents when the additional documents do not add fields to the hybrid document.

For large homogenous collections, where all or most documents have the same structure, configuring these options can improve performance.

TONIC_DOCUMENT_SCAN_MAX_DOCS_COUNT

The maximum number of documents to scan for each schema in a collection. For example, if this is 10, then Structural scans up to 10 documents, and ignores the remaining documents. When this value is empty, Structural scans all of the documents.

TONIC_DOCUMENT_SCAN_MAX_TIME_SECONDS

The maximum amount of time in seconds to scan a schema. For example, if this is 360, then Structural scans a schema for up to 360 seconds. When this value is empty, Structural continues the scan until it is complete.

If you set both options, then the scan completes when it reaches either limit. For example, if the maximum document count is 10 and the maximum scan time is 360 seconds, then the scan completes either after 10 documents or after 360 seconds, whichever comes first.

Configuring how fields are collapsed in the hybrid document

Typically, the number of unique fields in a collection is small relative to the number of documents. However, in some cases the number of fields is similar to or greater than the number of documents. This most commonly occurs when documents have "data as keys", such as keys that are ObjectIds, UUIDs, or incrementing integers.

In these cases, adding every unique field to the hybrid document can result in a large hybrid document that has an undesirable structure.

Structural offers configuration options to "collapse" fields within the hybrid document. This shrinks the size of the hybrid document. It also allows you to assign a generator to the collapsed group instead of to each unique key.

By default, Structural does not collapse fields.

Collapsing fields when the key is an ObjectId

To enable this, set the environment setting TONIC_MONGO_OBJECT_ID_COLLAPSE_THRESHOLD to the number of ObjectId keys that an object can contain before Structural collapses the object schema into a single key.

For example, if this is 10, then any object that has 10 or more ObjectId keys is collapsed into a single key.

A negative value indicates to not collapse the keys.

The default value is -1.

Collapsing fields when the key matches a custom pattern

To enable Structural to collapse fields, you provide a regular expression to identify the fields that can be collapsed into the same field. You then configure the number of matches that must exist before Structural collapses the fields.

To configure how the fields are collapsed, use the following environment settings:

TONIC_DOCUMENT_COLLAPSE_FIELDS_REGEX

The regular expression that identifies the fields that can be collapsed into a single field. By default, this value is empty.

TONIC_DOCUMENT_COLLAPSE_FIELDS_REGEX_THRESHOLD

The number of fields that match the regular expression before Structural collapses the fields into a single field. For example, if this is 5, then after Structural finds 5 fields that match the regular expression, it collapses all of the matching fields into a single field. A negative value indicates to not collapse the fields. The default value is -1.

For example:

  • To collapse keys that are integer values, use the regular expression [0-9]+ or \d+

  • To collapse keys that are UUIDs, use the regular expression [0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}
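As an illustration, suppose documents store entries under UUID keys (the field names here are hypothetical):

{
  "entries": {
    "9f8c2d1e-44aa-4b6f-9c1d-0a2b3c4d5e6f": { "score": 10 },
    "1a2b3c4d-5e6f-7a8b-9c0d-1e2f3a4b5c6d": { "score": 12 }
  }
}

With the UUID regular expression configured and a threshold of 2, the two UUID keys count as matches, so Structural collapses them into a single field in the hybrid document. You can then assign one generator to the collapsed field.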

Viewing the most recent scans for each collection

On Privacy Hub, the Latest Collection Scan table shows the most recent scans on each scanned collection.

The Build Schema option runs a new scan on the collection.

Latest Collection Scan on Privacy Hub

Starting a collection scan

When the source database has a new collection, then on Collection View, you are prompted to run a scan either on that collection or on all collections.

Collection scan prompt on Collection View

Array JSON Mask

This is a composite generator.

A version of the JSON Mask generator that can be used for array values.

Runs a selected generator on values that match a user-specified JSONPath.

Characteristics

Consistency

Determined by the specified sub-generators.

Linking

Determined by the specified sub-generators.

Differential privacy

Determined by the specified sub-generators.

Data-free

Determined by the specified sub-generators.

Allowed for primary keys

No

Allowed for unique columns

No

Uses format-preserving encryption (FPE)

No

Privacy ranking

5

Generator ID (for the API)

ArrayJsonMaskGenerator

How to configure

Adding a sub-generator

To assign a generator to a path expression:

  1. Under Sub-generators, click Add Generator. On the sub-generator configuration panel, the Cell JSON field contains a sample value from the source database. You can use the previous and next icons to page through different values.

  2. In the Path Expression field, type the JSONPath expression to identify the value to apply the generator to. To populate a path expression, you can also click a value in the Cell JSON field. Matched JSON Values shows the result from the value in Cell JSON.

  3. By default, the selected generator is applied to any value that matches the expression. To limit the types of values to apply the generator to, from the Type Filter, specify the applicable types. You can select Any, or you can select any combination of String, Number, and Null.

  4. From the Generator Configuration dropdown list, select the generator to apply to the path expression. You cannot select another composite generator.

  5. Configure the selected generator. You cannot configure the selected generator to be consistent with another column.

  6. To save the configuration and immediately add a generator for another path expression, click Save and Add Another. To save the configuration and close the add generator panel, click Save.
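As an illustration of the path expressions in step 2, suppose a cell contains the following array (the values are hypothetical):

["555-0100", "555-0199"]

The JSONPath expression $[*] matches every element of the array, so the selected sub-generator is applied to each value.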

Managing the sub-generator list

From the Sub-Generators list:

  • To edit a generator assignment, click the edit icon.

  • To remove a generator assignment, click the delete icon.

  • To move a generator assignment up or down in the list, click the up or down arrow.

Enabling data encryption

If Structural data encryption is enabled, then to use it for this column, in the advanced options section, toggle Use data encryption process to the on position.

Conditional

This is a composite generator.

Applies different generators to the value conditionally based on any value in the table.

For example, a Users table contains Name, Username, and Role columns. For the Username column, you can use a conditional generator to indicate that if the value of Role is something other than Test, then use the Character Scramble generator for the Username value. For Test users, the name is not masked.
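To make this concrete, here is how source and destination rows might compare under that configuration (the scrambled values are illustrative):

Role    Source username    Destination username
Admin   jsmith             qvrxtp
Test    jdoe               jdoe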

Characteristics

Consistency

Determined by the selected generators.

Linking

Determined by the selected generators.

Differential privacy

Determined by the selected generators.

Data-free

Determined by the selected generators.

Allowed for primary keys

No

Allowed for unique columns

Yes

Uses format-preserving encryption (FPE)

No

Privacy ranking

  • If a fallback generator is selected, then the lower of either 5 or the fallback generator.

  • 5 if no fallback generator is selected

Generator ID (for the API)

ConditionalGenerator

How to configure

The generator consists of a list of options. Each option includes the required conditions and the generator to use if those conditions are met.

Setting the default generator

The generator always contains a Default option. The Default option is used if the value does not meet any of the conditions. To configure the Default option:

  1. From the Default dropdown list, select the generator to use by default.

  2. Configure the selected generator.

Adding a condition option

To add a condition option:

  1. Click + Conditional Generator.

  2. To add a condition:

    1. Click + Condition.

    2. From the column list, select the column for which to check the value.

    3. Select the comparison type.

    4. Enter the column value to check for.

    To remove a condition, click the delete icon for the condition.

  3. From the Generator dropdown list, select the generator to run on the current column if the conditions are met. You cannot select another composite generator.

  4. Choose the configuration options for the selected generator.

Viewing and editing condition options

To view details for and edit a condition option, click the expand icon for that option.

Removing a condition option

To remove a condition option, click the delete icon for the option.

CSV Mask

This is a composite generator.

Masks text columns by parsing the values as rows whose columns are delimited by a specified character.

You can assign specific generators to specific indexes. You can also use the generator that is assigned to a specific index as the default. This applies the generator to every index that does not have an assigned generator.

The output value maintains the quotes around the index values.

For example, a column contains the following value:

"first","second","third"

You assign the Character Scramble generator to index 0 and assign Passthrough to index 2. You select index 0 as the index to use for the default generator.

In the output, the first and second values are masked by the Character Scramble generator. The third value is not masked. The output looks something like:

"wmcop", "xjorsl", "third"

Characteristics

Consistency

Determined by the selected sub-generators.

Linking

Determined by the selected sub-generators.

Differential privacy

Determined by the selected sub-generators.

Data-free

Determined by the selected sub-generators.

Allowed for primary keys

No

Allowed for unique columns

No

Uses format-preserving encryption (FPE)

No

Privacy ranking

5

Generator ID (for the API)

CsvMaskGenerator

How to configure

Setting the delimiter

In the Delimiter field, type the delimiter that is used as a separator in the value.

For example, for the value "first","second","third", the delimiter is a comma.

Adding a sub-generator

You can configure a generator for any or all of the indexes. To add a sub-generator for an index:

  1. Under Sub-Generators, click Add Generator. On the add generator dialog, the Cell CSV field contains a sample value from the source data. You can use the navigation icons to page through the values.

  2. In the CSV Index field, type the index to assign a generator to. The index numbers start with 0. You cannot use an index that already has an assigned generator. Matched CSV values shows the value at that index for the current sample column value.

  3. Under Generator Configuration, from the Select a Generator dropdown list, select the generator to use for the selected index. You cannot select another composite generator. To remove the selection, click the delete icon.

  4. Configure the selected generator. You cannot configure the selected generator to be consistent with another column.

  5. To save the configuration and immediately add a generator for another index, click Save and Add Another. To save the configuration and close the add generator panel, click Save.

Managing the sub-generator list

From the Sub-Generators list:

  • To edit a generator assignment, click the edit icon.

  • To remove a generator assignment, click the delete icon.

  • To move a generator assignment up or down in the list, click the up or down arrow.

Setting the default for indexes without a generator

After you configure a generator for at least one index, the Default Link dropdown list is displayed.

From the Default Link dropdown list, select the index to use to determine how to mask values for indexes that do not have an assigned generator.

For example, you assign the Character Scramble generator to index 2. If you set Default Link to 2, then all indexes that do not have an assigned generator use the Character Scramble generator.

Date Truncation

Truncates a date value or a timestamp to a specific part.

For a date or a timestamp, you can truncate to the year, month, or day.

For a timestamp, you can also truncate to the hour, minute, or second.

Characteristics

Consistency

No, cannot be made consistent.

Linking

No, cannot be linked.

Differential privacy

No

Data-free

No

Allowed for primary keys

No

Allowed for unique columns

No

Uses format-preserving encryption (FPE)

No

Privacy ranking

5

Generator ID (for the API)

DateTruncationGenerator

How to configure

To configure the generator:

  1. From the dropdown list, select the part of the date or timestamp to truncate to.

    For both date and timestamp values, you can truncate to the year, month, or day. When you select one of these options, the time portion of a timestamp is set to 00:00:00. For the date, the values below the selected truncation value are set to 01. For example, when you truncate to month, the day value is set to 01, and the timestamp is set to 00:00:00.

    For a timestamp value, you can also truncate to the hour, minute, or second. The date values remain the same as the original data. The time values below the selected truncation value are set to 00. For example, when you truncate to minute, the seconds value is set to 00.

  2. Toggle the Birth Date option. When you enable Birth Date, the generator shifts dates that are more than 90 years before the generation date to the date exactly 90 years before the generation date. For example, data generation occurs on January 1, 2023. Any date that occurs before January 1, 1933 is changed to January 1, 1933.

    This is mostly intended for birthdate values, to group birthdates for everyone who is older than 89 into a single year. This is used to comply with HIPAA Safe Harbor.

  3. If Structural data encryption is enabled, then to use it for this column, in the advanced options section, toggle Use data encryption process to the on position.

Truncation examples

Here are examples of date and time values and how the selected truncation affects the output:

Option              Date value      Timestamp value
Original value      2021-12-20      2021-12-20 13:42:55
Truncate to year    2021-01-01      2021-01-01 00:00:00
Truncate to month   2021-12-01      2021-12-01 00:00:00
Truncate to day     2021-12-20      2021-12-20 00:00:00
Truncate to hour    Not applicable  2021-12-20 13:00:00
Truncate to minute  Not applicable  2021-12-20 13:42:00
Truncate to second  Not applicable  2021-12-20 13:42:55

Email

Scrambles the characters in an email address. It preserves formatting and keeps the @ and . characters.

For example, for the following input value:

john.smith@example.com

The output value would be something like:

qzvh.wrtno@kcdmvfq.ltd

By default, the generator scrambles the domain. You can configure the generator to not mask specific domains. You can also specify a domain to use for all of the output email addresses.

For example, if you configure the generator to not scramble the domain company.com, then the output for jane.doe@company.com would look something like:

mwvq.xtr@company.com

This generator securely masks letters and numbers. There is no way to recover the original data.

If your email addresses include name values - for example, jane.doe@company.com - then you can use the Regex Mask generator to produce email addresses that are tied to name values in the same table. For information on how to do this, go to the Regex Mask generator documentation.

Characteristics

Consistency

Yes, can be made self-consistent.

Linking

No, cannot be linked.

Differential privacy

No

Data-free

No

Allowed for primary keys

No

Allowed for unique columns

No

Uses format-preserving encryption (FPE)

No

Privacy ranking

  • 3 if not consistent

  • 4 if consistent

Generator ID (for the API)

EmailGenerator

How to configure

To configure the generator:

  1. In the Email Domain field, enter a domain to use for all of the output values. For example, use @mycompany.com for all of the generated values. The generator scrambles the content before the @.

  2. In the Excluded Email Domains field, enter a comma-separated list of domains for which email addresses are not masked in the output values. This allows you, for example, to maintain internal or testing email addresses that are not considered sensitive.

  3. Toggle the Replace invalid emails setting to indicate whether to replace an invalid email address with a generated valid email address. By default, invalid email addresses are not replaced. In the replacement values, the username is generated. If you specify a value for Email Domain, then the email addresses use that domain. Otherwise, the domain is generated.

  4. Toggle the Consistency setting to indicate whether to make the column self-consistent. By default, consistency is disabled.

  5. If Structural data encryption is enabled, then to use it for this column, in the advanced options section, toggle Use data encryption process to the on position.

HTML Mask

This is a composite generator.

Masks text columns by parsing the contents as HTML, and applying sub-generators to specified path expressions.

If applying a sub-generator fails because of an error, the generator selected as the fallback generator is applied instead.

Path expressions are defined using the XPath syntax.

For example, for the following HTML:

<html>
<body>
  <div class="container">
    <h1>Title</h1>
    <p>Paragraph content</p>
    <ul>
      <li>Item 1</li>
      <li>Item 2</li>
      <li>Item 3</li>
    </ul>
  </div>
</body>
</html>

To get the value of h1, the expression is //h1/text().

To get the value of the first list item, the expression is //ul/li[1]/text().
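Attribute predicates use the same XPath syntax. For example, to get the paragraph text inside the container div, the expression is //div[@class="container"]/p/text().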

Characteristics

Consistency

Determined by the selected sub-generators.

Linking

Determined by the selected sub-generators.

Differential privacy

Determined by the selected sub-generators.

Data-free

Determined by the selected sub-generators.

Allowed for primary keys

No

Allowed for unique columns

No

Uses format-preserving encryption (FPE)

No

Privacy ranking

5

Generator ID (for the API)

HtmlMaskGenerator

How to configure

Adding a sub-generator

To assign a generator to a path expression:

  1. Under Sub-generators, click Add Generator. On the sub-generator configuration panel, the Cell HTML field contains a sample value from the source database. You can use the previous and next icons to page through different values.

  2. In the Path Expression field, type the path expression to identify the value to apply the generator to. Matched HTML Values shows the result from the value in Cell HTML.

  3. From the Generator Configuration dropdown list, select the generator to apply to the path expression. You cannot select another composite generator.

  4. Configure the selected generator. You cannot configure the selected generator to be consistent with another column.

  5. To save the configuration and immediately add a generator for another path expression, click Save and Add Another. To save the configuration and close the add generator panel, click Save.

Managing the sub-generators list

From the Sub-Generators list:

  • To edit a generator assignment, click the edit icon.

  • To remove a generator assignment, click the delete icon.

  • To move a generator assignment up or down in the list, click the up or down arrow.

Selecting the fallback generator

From the Fallback Generator dropdown list, select the generator to use if the assigned generator for a path expression fails.

The options are:

  • Passthrough

  • Constant

  • Null

IP Address

Generates a random IP address formatted string.

Characteristics

Consistency

Yes, can be made self-consistent or consistent with another column.

Linking

No, cannot be linked.

Differential privacy

Yes, if consistency is not enabled.

Data-free

Yes, if consistency is not enabled.

Allowed for primary keys

No

Allowed for unique columns

No

Uses format-preserving encryption (FPE)

No

Privacy ranking

  • 1 if not consistent

  • 4 if consistent

Generator ID (for the API)

IPAddressGenerator

How to configure

To configure the generator:

  1. In the Percent IPv4 field, type the percentage of output values that are IPv4 addresses. For example, if you set this to 60, then 60% of the generated IP addresses are IPv4 addresses, and 40% of the generated IP addresses are IPv6 addresses. If you set this to 100, then all of the generated IP addresses are IPv4 addresses. If you set this to 0, then all of the generated IP addresses are IPv6 addresses.

  2. Toggle the Consistency setting to indicate whether to make the column consistent. By default, consistency is disabled.

  3. If you enable consistency, then by default the generator is self-consistent. To make the generator consistent with another column, from the Consistent to dropdown list, select the column. When a generator is self-consistent, then a given value in the source database is always mapped to the same value in the destination database. When a generator is consistent with another column, then a given source value in that column always results in the same IP address value in the destination database. For example, an IP address column is consistent with a username column. For each instance of User1 in the source database, the value in the IP address column is the same.

  4. If Structural data encryption is enabled, then to use it for this column, in the advanced options section, toggle Use data encryption process to the on position.

Name

Generates a random name string from a dictionary of first and last names.

You specify the name information that is contained in the column. A column might only contain a first name or last name, or it might contain a full name. A full name might be first name first or last name first.

For example, a Name column contains a full name in the format Last, First. For the input value Smith, John, the output value would be something like, Jones, Mary.

Characteristics

Consistency

Yes, can be made self-consistent or consistent with another column. Note that all Name generator columns that have the same consistency configuration are automatically consistent with each other. The columns must either be all self-consistent or all consistent with the same other column. For example, you can use this to ensure that a first name and last name column value always match the first name and last name in a full name column.

Linking

No, cannot be linked.

Differential privacy

Yes, if consistency is not enabled.

Data-free

Yes, if consistency is not enabled.

Allowed for primary keys

No

Allowed for unique columns

No

Uses format-preserving encryption (FPE)

No

Privacy ranking

  • 1 if not consistent

  • 4 if consistent

Generator ID (for the API)

NameGenerator

How to configure

To configure the generator:

  1. From the name format dropdown list, select the type of name value that the column contains:

    • First. This is also commonly used for standalone middle name fields.

    • Last

    • First Last

    • First Middle Last

    • First Middle Initial Last

    • Last, First

    • Last, First Middle

    • Middle Initial

  2. Toggle the Preserve Capitalization setting to indicate whether to preserve the capitalization of the column value. By default, the capitalization is not preserved.

  3. Toggle the Consistency setting to indicate whether to make the column consistent. By default, consistency is disabled.

  4. If you enable consistency, then by default the generator is self-consistent. To make the generator consistent with another column, from the Consistent to dropdown list, select the column.

  5. If Structural data encryption is enabled, then to use it for this column, in the advanced options section, toggle Use data encryption process to the on position.

Managing your user account

From the User Settings view, you can manage settings for your individual Tonic Structural account.

To display the User Settings view:

  1. Click your user image at the top right.

  2. In the menu, click User Settings.

The User Settings view includes options to:

  • Configure your user image.

  • View and copy your organization identifier.

  • Configure email notifications for column comments.

  • Generate and manage API tokens.

  • Change your Structural password (if your Structural instance does not use SSO).

  • Delete your Structural account.

Choosing your user image

You can select an image to associate with your account. The image is displayed next to your name and email address throughout Structural.

If your instance uses Google or Azure single sign-on (SSO) to manage Structural users, then by default your Structural account image is the image from the SSO.

Otherwise, the default image displays your initials.

To change your user image, click Upload, then select the image file.

Viewing and copying your organization identifier

Below your user image file name is the identifier of the organization that your account belongs to.

To copy the identifier, click the copy icon.

Configuring notifications for column comments

Required license: Professional or Enterprise

Structural allows users to provide comments on columns. You can do this from Privacy Hub and from Database View.

From the Comment Notification Settings section of User Settings, you can configure when to receive email notifications for comments.

The available options are:

  • I am an owner, editor, auditor, or am being replied to - This is the default option. You receive email notifications when comments are made on columns in a workspace that you are an owner, editor, or auditor for. You also receive an email notification when someone replies to a comment that you made.

  • I am @ mentioned - You only receive an email notification if someone specifically mentions you in a comment.

  • Never - You never receive email notifications for column comments.

Generating and managing API tokens

Before you can use the Structural API, you must create an API token. From the User API Tokens section of the User Settings view, you can create and manage API tokens.

Creating an API token

To create an API token:

  1. Click Create Token.

  2. On the Create New Token dialog, enter a name for the new token.

  3. Click Confirm.

In the list, the new token displays as clear text. To copy the new token, click the copy icon next to the token.

The new token text and copy icon only display during the current session. After that, Structural masks the token and removes the copy icon.
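As a minimal sketch of using a token from a script, assuming Python with the requests library (the base URL, endpoint path, and authorization scheme below are illustrative assumptions; consult the Structural API documentation for the exact values):

import requests

# Illustrative values; replace with your instance URL and a real token.
BASE_URL = "https://structural.example.com"
API_TOKEN = "<your API token>"

# Assumed authorization scheme, for illustration only.
headers = {"Authorization": f"Apikey {API_TOKEN}"}

# Hypothetical endpoint that lists workspaces.
response = requests.get(f"{BASE_URL}/api/workspaces", headers=headers)
response.raise_for_status()
print(response.json())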

Revoking an API token

To revoke a token, click the Revoke option for the token.

Changing your Structural password

If your Structural account is not managed using SSO, then from User Settings, you can change your Structural password.

If your Structural instance uses SSO to manage users, then your user credentials are managed in the SSO system. You cannot change your user password in Structural.

Under Password Change, to change your Structural password:

  1. In the Old Password field, type your current Structural password.

  2. In the New Password field, type your new Structural password.

  3. In the Repeat New Password field, type your new Structural password again.

  4. Click Confirm.

Deleting your Structural account

From User Settings, you can delete your Structural account. If your instance uses SSO to manage users, then deleting your account only affects your access to Structural.

You cannot delete your Structural account if you are the owner of a workspace for which other users are granted access. Before you can delete your Structural account, you must either:

  • Revoke access from other users.

  • Transfer ownership to a different user.

To delete your Structural account, click Delete Account.

When you delete your account, you are logged out of Structural.

Tutorial videos

Use these tutorial videos to learn more about how to use Tonic Structural.

Tonic Structural 101

Provides an overview of the Structural workflow and how to use Structural to generate de-identified data. For more information, go to Structural data generation workflow.

Creating a Structural workspace

Provides an overview of what a Structural workspace is and how to create a new Structural workspace. For more information, go to Managing workspaces.

Sensitivity detection and generator recommendations

Provides an overview of how Structural detects sensitive values and how you can apply recommended generators to the detected values.

Managing workspace access

Provides an overview of workspace owners, permissions, and permission sets. Explains how to share and transfer ownership of a workspace. For more information, go to Managing access to workspaces.

Structural generators overview

Identifies the types of generators and transformations that you can use in Structural, and explains how to assign a generator to a column. For more information, go to Generator information.

Generator presets

Provides an overview of generator presets. Includes how to create and update them, and how to track where each generator preset is used. For more information, go to Managing generator presets.

File connector overview

Provides an overview of the file connector and how to manage file groups in a file connector workspace. For more information, go to File connector.

Generating data with consistency

Provides an overview of the consistency generator property and how it works. For more information, go to Enabling consistency.

Using Document View to configure JSON columns

Provides an overview of how to enable Document View for a JSON column and how to use it to configure generators for JSON fields.

Subsetting your data

Provides an overview of subsetting, how it is configured, and how Structural uses the configuration to generate a subset. For more information, go to Subsetting data.

Upsert data generation

Provides an overview of upsert data generation. Includes how it works and how to enable and run it for a workspace. For more information, go to Enabling and configuring upsert.

Writing destination data to a container repository

Provides an overview of how to write destination data to a container repository instead of a database server. For more information, go to Writing output to a container repository.

Frequently Asked Questions

What is the minimum required screen width for the Tonic Structural application?

The minimum screen width is 1120 pixels.

How do you connect to a local database when running Structural in a Docker container locally?

If the locally running database that you want to connect to runs in a Docker container:

  1. Run docker inspect <container name>.

  2. In the networks section of the results, find the Gateway IP address. Use this IP address as the server address in Structural.
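The relevant portion of the docker inspect output looks something like the following (network names vary):

"NetworkSettings": {
  "Networks": {
    "bridge": {
      "Gateway": "172.17.0.1",
      "IPAddress": "172.17.0.2"
    }
  }
}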

If the locally running database does NOT run in a container, but runs on the machine, then:

  • On Windows or Mac, use host.docker.internal.

  • On Linux, use 172.17.0.1, which is the IP address of the docker0 interface.
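To confirm the docker0 address on Linux, you can run ip addr show docker0 and look for the inet entry.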

I allowlist access to my database. What are your static IP addresses?

If you use Structural Cloud, and your database only allows connections from allowlisted IP addresses, then you need to allowlist Structural static IP addresses.

This is not required for self-hosted instances of Structural.

United States-based instance

For the United States-based instance (app.tonic.ai), the static IP addresses are:

  • 54.92.217.68

  • 52.22.13.250

The following IP addresses are used if needed for scaling or failover:

  • 44.215.74.226

  • 3.232.203.148

  • 3.224.2.189

  • 44.230.136.147

  • 44.230.79.194

Europe-based instance

For the Europe-based instance (app-de.tonic.ai), the static IP addresses are:

  • 18.159.127.160

  • 3.69.249.144

The following IP addresses are used if needed for scaling or failover:

  • 18.159.179.95

  • 3.120.214.225

  • 3.75.12.1

  • 16.16.71.42

  • 16.170.51.237

I allowlist network calls. What do I need to allowlist?

URLs for telemetry sharing

The URL https://telemetry.tonic.ai/ is used for our Amplitude telemetry.

https://telemetry.tonic.ai/logs is used specifically for log sharing.

Allowlist https://telemetry.tonic.ai/ or the following IP addresses:

  • 75.2.74.76

  • 99.83.246.105

Telemetry sharing is required. These metrics are valuable for us as we debug, make product roadmaps, and determine feature viability.

No customer data is included. For more information about the specific telemetry data that we collect, go to Data that Tonic.ai collects.

For more information on how to verify that telemetry is shared, go to Verifying and enabling telemetry sharing.

URLs for Structural version information

To support the one-click update option, Structural needs to be able to retrieve information about the latest Structural version.

How do I check my current version of Structural?

Click your user image at the top right. The menu includes the Tonic version.

How should we provision our source database?

We recommend that you use a static copy of your production database that was restored from a backup.

If that's not possible, consider the following when you connect Structural to your source data:

  • Structural cannot guarantee referential integrity of the output data if the source database is written to while data is generated. For this reason, we recommend that you connect to a static copy of production data.

  • Read replicas and fast followers can be problematic for Structural because of how long it takes some queries to run. Read replicas tend to have short query timeout limits, which causes the queries to time out. Read replicas also reflect recent writes, which means that we cannot guarantee the referential integrity of the output.

What data does Tonic.ai collect from Structural?

For details about the types of data that Tonic.ai does and does not collect, go to Data that Tonic.ai collects.

Creating, editing, or deleting a workspace

Creating a workspace

When you create a new workspace, you can either:

  • Create a completely new workspace.

  • Create a copy of an existing workspace. The copy initially uses the configuration from the original workspace. After the copy is created, it is completely independent from the original workspace.

  • Create a child of an existing workspace. Child workspaces inherit configuration from the parent workspace. They continue to be updated automatically when the parent workspace is updated. For more information, go to About workspace inheritance.

You can also view this video overview of how to create a workspace.

Creating a completely new workspace

Required global permission: Create workspaces

To create a completely new workspace, on Workspaces view, click Create Workspace > New Workspace.

Creating a copy of a workspace

Required workspace permission: Copy workspace (in the workspace to copy)

Or

Required global permission: Copy any workspace

To create a workspace based on an existing workspace, either:

  • On the workspace management view of the workspace to copy, from the workspace actions menu, select Duplicate Workspace.

  • On Workspaces view, click the actions menu for the workspace, then select Duplicate Workspace.

When you create a copy of a workspace, the copy initially inherits the following workspace configuration:

  • Source and destination database connections

  • Sensitivity designations, including manual designations that override the sensitivity scan results

  • Table mode assignments

  • Generator configuration

  • Subsetting configuration

  • Post-job scripts

Creating a child workspace

Required license: Enterprise

Required workspace permission: Create child workspaces (in the parent workspace)

You can create a workspace that is a child of an existing workspace. You cannot create a child workspace of another child workspace.

The parent workspace must have a source database configured. You cannot create a child workspace from a workspace that uses the Databricks, Spark (Amazon EMR or self-managed Spark cluster), or MongoDB data connector.

To create a child workspace, either:

  • On Workspaces view:

    • Click Create Workspace > Child Workspace.

    • Click the actions menu for the parent workspace, then select Create Child Workspace.

  • On the workspace management view, from the workspace actions menu, select Create Child Workspace.

On the New Workspace view, under Child Workspace, Parent Workspace identifies the parent workspace.

If you used the Create Workspace > Child Workspace option to create the child workspace, then Parent Workspace is not populated. From the Parent Workspace dropdown list, select the parent workspace for the new child workspace.

If you selected the child workspace option for a specific workspace, then Parent Workspace is set to that workspace.

If you originally chose to create a completely new workspace, then on the New Workspace view:

  1. To change to a child workspace, select Create Child Workspace from the Create a child workspace panel at the right. Structural adds the Child Workspace panel to the New Workspace view.

  2. From the Parent Workspace dropdown list, select the parent workspace for the new child workspace.

Editing a workspace

Required workspace permission: Configure workspace settings

To edit the configuration for an existing workspace, either:

  • On the workspace management view:

    • On the workspace navigation bar, click Workspace Settings.

    • From the workspace actions menu, select Workspace Settings.

  • On Workspaces view, click the actions menu for the workspace, then select Workspace Settings.

Deleting a workspace

Required workspace permission: Delete workspace

You can delete workspaces that you no longer need.

You cannot delete a parent workspace. You must first delete all of its child workspaces.

To delete a workspace:

  • On the workspace management view, from the workspace actions menu, select Delete Workspace.

  • On the Workspaces view, click the actions menu for the workspace, then select Delete.

Writing output to Tonic Ephemeral

Only available for PostgreSQL, MySQL, SQL Server, and Oracle.

Not compatible with upsert.

Not compatible with Preserve Destination or Incremental table modes.

Tonic Ephemeral is a separate Tonic.ai product that allows you to create temporary databases to use for testing and demos. For more information about Ephemeral, go to the Ephemeral documentation.

If Ephemeral supports your workspace database type, then you can write the destination data to a snapshot in Ephemeral. You can then use the snapshot to start Ephemeral databases.

To write the transformed data to Ephemeral, under Destination Settings, click Ephemeral Database.

Selecting the Ephemeral instance type

Structural can write the data snapshot to either Ephemeral Cloud or to a self-hosted instance of Ephemeral. By default, Structural writes the data snapshot to Ephemeral Cloud.

All workspaces on the same self-hosted Structural instance or in the same Structural Cloud organization must write to the same instance of Ephemeral. When you change the Ephemeral output configuration in one workspace, it is automatically changed in other workspaces that write to Ephemeral.

Writing output to Ephemeral Cloud

Structural writes the snapshot to the Ephemeral Cloud account for the user who runs the data generation job.

  • If that user has an Ephemeral account on Ephemeral Cloud, then Structural uses that account.

  • If the user does not have an account, then Structural creates an account for them.

On Structural Cloud, when you save the workspace, if you do not have an Ephemeral Cloud account, then an Ephemeral Cloud account is created for you.

When Structural creates an Ephemeral account, if the user belongs to an existing Ephemeral Cloud organization, then the account is added to the organization. Otherwise, the account is a two-week free trial account.

For a self-hosted Structural workspace, you must provide an API key from an existing Ephemeral Cloud account.

To write a snapshot to Ephemeral Cloud:

  1. Click Tonic Ephemeral cloud.

  2. If you are on a self-hosted instance of Structural:

    1. In the API Key field, provide an Ephemeral API key from your Ephemeral account.

    2. To test the connection, click Test Connection.

Writing output to self-hosted Ephemeral

When you write to a self-hosted instance of Ephemeral, then you must always provide an Ephemeral API key.

To write the snapshot to a self-hosted instance of Ephemeral:

  1. Click Tonic Ephemeral self-hosted.

  2. In the API Key field, provide an Ephemeral API key from your Ephemeral account. Structural writes the snapshot to the Ephemeral account that is associated with the API key.

  3. In the Tonic Ephemeral URL field, provide the URL to your self-hosted Ephemeral instance.

  4. To test the connection, click Test Connection.

Selecting the image for Oracle

For Oracle, you select the base image to use to create the data snapshot.

If you write to Ephemeral Cloud, then you must use the Oracle 23c base image that comes with Ephemeral. This image has the following limitations:

  • A maximum of 12 GB of user data

  • A maximum of 2 CPU cores and 2 GB of RAM

If you write to a self-hosted instance of Ephemeral, then you can also select a custom image that you created in Ephemeral.

For information about how to create and manage custom images for Oracle, go to the Ephemeral documentation.

Configuring advanced settings for the snapshot

If you do not configure any advanced settings, then:

  • The snapshot uses the same name as the workspace, and has no description.

  • The snapshot size allocation is determined by the source data size.

  • Structural discards the temporary Ephemeral database that is created during the data generation.

To change any of these settings, click Advanced settings.

Providing a snapshot name and description

By default, the snapshot name uses the workspace name.

When you run data generation, if a snapshot with the same name already exists in Ephemeral, then Structural overwrites that snapshot with the new snapshot.

Under Advanced settings:

  1. In the Snapshot name field, provide the name of the snapshot. The snapshot name can use the following placeholder values to help identify the snapshot:

    • {workspaceName} - Inserts the name of the workspace.

    • {workspaceId} - Inserts the identifier of the workspace.

    • {jobId} - Inserts the identifier of the data generation job that created the snapshot.

    • {timestamp} - Inserts the timestamp when the snapshot was created.

    Including the job ID or timestamp ensures that a data generation job does not overwrite a previous snapshot.

  2. Optionally, in the Snapshot description field, provide a longer description of the snapshot.
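For example, with the placeholders from step 1, a workspace named Payments and a snapshot name of {workspaceName}-{jobId} might produce a snapshot named something like Payments-4821 (the exact identifier and timestamp formats are determined by Structural).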

Setting the pod resources for the snapshot

By default, the resources used for the snapshot are based on the size of the source data.

  • For source data that is 25 GB or less, Nano is used.

  • For source data larger than 25 GB, Micro is used.

To select a specific option:

  1. Toggle Custom pod resources to the on position.

  2. From the dropdown list, select the option to use for the combination of vCPUs and memory:

    • Nano - 0.125 vCPU with 0.5 GB RAM

    • Micro - 0.5 vCPU with 2 GB RAM

    • Small - 1 vCPU with 4 GB RAM

    • Medium - 2 vCPU with 8 GB RAM

    • Large - 4 vCPU with 16 GB RAM

Setting the size allocation for the snapshot

By default, the Ephemeral size allocation for the snapshot is based on the size of the source data.

To instead provide a custom data size allocation, under Advanced settings:

  1. Toggle Custom data size allocation to the on position.

  2. In the field, enter the size allocation in gigabytes.

Indicating whether to keep the temporary Ephemeral database

When Structural creates the Ephemeral snapshot, it creates a temporary Ephemeral database.

By default, Structural deletes that database when the data generation is complete.

To instead keep the database, under Advanced settings, toggle Keep database active in Tonic Ephemeral after data generation to the on position.

In Ephemeral Cloud, by default, databases are publicly accessible. To limit database access, you can configure Ephemeral Cloud with an IP allowlist for your organization. For more information, go to the Ephemeral documentation.

Providing a customization file for MySQL or PostgreSQL

For a MySQL or PostgreSQL workspace, you can provide a customization file that helps to ensure that the temporary Ephemeral database is configured correctly.

To provide the customization details:

  1. Toggle Use custom configuration to the on position.

  2. In the text area, paste the contents of the customization file.
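As a sketch, for PostgreSQL the pasted contents might be standard postgresql.conf settings such as the following (whether Ephemeral expects this exact format is an assumption; check the Ephemeral documentation):

shared_preload_libraries = 'pg_stat_statements'
max_connections = 200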

HIPAA Address

This generator can be used to generate cities, states, and zip codes that follow HIPAA guidelines for safe harbor.

Handling of address parts

Zip codes

How the HIPAA Address generator handles zip codes is based on whether the Replace zeros in truncated Zip Code toggle in the generator configuration is off or on.

By default, the setting is off. In this case, the last two digits of the zip code in the column are replaced with zeros, unless the zip code is a low population area as designated by the current census. For a low population area, all of the digits in the zip code are replaced with zeros.

If the setting is on, then the generator selects a real zip code that starts with the same three digits as the original zip code. For a low population area, if a state is linked, then the generator selects a random zip code from within that state. Otherwise the generator selects a random zip code from the United States.

Cities

When a zip code column is not linked, a random city in the United States is chosen. When a zip code column is linked, a city that has at least some overlap with the zip code is chosen at random.

If the original zip code is designated as a low population area, then a random city is chosen within the state. This is done only if the user has linked a State column. If they have not, a random city within the United States is chosen.

For example, if the original city and zip code are (Atlanta, 30305), the zip code would be replaced with 30300. Many cities contain zip codes that begin in 303, such as Atlanta, Decatur, Chamblee, Hapeville, Dunwoody, and College Park. One of these cities is chosen at random so that, for example, the final value is (Chamblee, 30300).

States

HIPAA guidelines allow for information at the state level to be kept. Therefore, these values are passed through.

Latitude and longitude (GPS) coordinates

GPS coordinates are randomly generated in descending order of dependence of the linked HIPAA address components:

  1. If a zip code is linked, and the 3-digit zip code prefix is not designated a low population area, then a random point within the same 3-digit zip code prefix is generated. If it is a low population area, the generator falls back to the linked state.

  2. If a state is available and a zip code and city are not, or the zip code or city are in a 3-digit zip code prefix that is designated a low population area, then a random GPS coordinate is generated somewhere within the state.

  3. If no zip code, city, or state is linked, or one or more of them were provided, but there was a problem generating a random GPS coordinate within the linked areas, then a GPS coordinate is generated at a random location within the United States.

Note: If the city component of the HIPAA address is linked with latitude and/or longitude, the GPS coordinate components are randomly generated independently of the city.

Other address parts

All other address parts are generated randomly. The output value is not influenced at all by the underlying value in the column.

Characteristics

Consistency

Yes, can be made self-consistent.

Linking

Yes, can be linked.

Differential privacy

No

Data-free

No

Allowed for primary keys

No

Allowed for unique columns

No

Uses format-preserving encryption (FPE)

No

Privacy ranking

  • 3 if not consistent

  • 4 if consistent

Generator ID (for the API)

HipaaAddressGenerator

How to configure

To configure the generator:

  1. From the Link To dropdown list, select the other columns to link to. You can only select columns that are also assigned the HIPAA Address generator.

  2. From the address part dropdown list, select the type of address value that is in the column.

  3. Toggle the Replace zeros in truncated Zip Code setting to indicate how to generate zip codes. If the setting is off, then the last two digits are replaced with zeros. For low population areas, the entire zip code is populated with zeros. If the setting is on, then a real zip code is selected that starts with the first three digits of the original zip code. For low population areas, if a state is linked, a random zip code from the state is used. Otherwise, a random zip code from the United States is used.

  4. Toggle the Consistency setting to indicate whether to make the column self-consistent. By default, consistency is disabled.

  5. If Structural data encryption is enabled, then to use it for this column, in the advanced options section, toggle Use data encryption process to the on position.

Spark supported address parts

For the HIPAA Address generator, Spark workspaces (Amazon EMR, Databricks, and self-managed Spark clusters) only support the following address parts:

  • City

  • City with State

  • City with State Abbr

  • State

  • State Abbr

  • US Address

  • US Address with Country

  • Zip Code

The Address generator provides support for additional address parts in Spark workspaces.

JSON Mask

This is a composite generator.

Runs a selected sub-generator on values that match a user-specified JSONPath expression. You can only search for and apply sub-generators to individual key values. You cannot apply a sub-generator to an object or to an array.

If an error occurs, the selected fallback generator is used for the entire JSON value.

For JSON columns in a file connector workspace, you can instead use Document View to assign generators to individual paths. For more information, go to Document View for JSON columns.

Sequence for applying the sub-generators

Sub-generators are applied sequentially, from the sub-generator at the top of the list to the sub-generator at the bottom of the list.

If a key matches more than one JSONPath expression, then the most recently added generator takes priority.

JSON path options

Regular expressions and comparisons

JSON paths can also contain regular expressions and comparison logic, which allow the configured sub-generators to be applied only when there are properties that satisfy the query.

For example, a column contains this JSON:

[ { "file_name": "foo.txt", "b": 10 }, ... ]

The following JSON path only applies to array elements that contain a file_name key for which the value ends in .txt:

$.[?(@.file_name =~ /^.*\.txt$/)]

Using recursion

A JSON path can also be used to point to a key name recursively. For example, a column contains this JSON:

{
  "first_name": "John",
  "last_name": "Smith",
  "children": [
    {
      "first_name": "Mary",
      "last_name": "Jones",
      "children": [
        {
          "first_name": "Ann",
          "last_name": "Jones"
        }
      ]
    }
  ]
}

The following JSON path applies to all properties for which the key is first_name:

$..first_name

Characteristics

Consistency

Determined by the selected sub-generators.

Linking

Determined by the selected sub-generators.

Differential privacy

Determined by the selected sub-generators.

Data-free

Determined by the selected sub-generators.

Allowed for primary keys

No

Allowed for unique columns

No

Uses format-preserving encryption (FPE)

No

Privacy ranking

5

Generator ID (for the API)

JsonMaskGenerator

How to configure

Adding a sub-generator

To assign a generator to a path expression:

  1. Under Sub-generators, click Add Generator. On the sub-generator configuration panel, the Cell JSON field contains a sample value from the source database. You can use the previous and next icons to page through different values.

  2. In the Path Expression field, type the path expression to identify the value to apply the generator to. You cannot use the exact same path expression more than once. To create a path expression, you can also click the value in Cell JSON that you want the expression to point to. The path expression must identify a key value. You cannot apply sub-generators to an object or to an array. Matched JSON Values shows the result from the value in Cell JSON.

  3. By default, the selected generator is applied to any value that matches the expression. To limit the types of values to apply the generator to, from the Type Filter, specify the applicable types. You can select Any, or you can select any combination of String, Number, Boolean, and Null.

  4. From the Generator Configuration dropdown list, select the generator to apply to the path expression. You cannot select another composite generator.

  5. Configure the selected generator. You cannot configure the selected generator to be consistent with another column.

  6. To save the configuration and immediately add a generator for another path expression, click Save and Add Another. To save the configuration and close the add generator panel, click Save.

Managing the sub-generators list

From the Sub-Generators list:

  • To edit a generator assignment, click the edit icon.

  • To remove a generator assignment, click the delete icon.

  • To move a generator assignment up or down in the list, click the up or down arrow.

Selecting the fallback generator

From the Fallback Generator dropdown list, select the generator to use if the assigned generator for a path expression fails.

The options are:

  • Passthrough

  • Constant

  • Null

Noise Generator

Masks values in numeric columns. Adds or multiplies the original value by random noise.

The additive noise generator draws noise from an interval around 0 that is scaled to the magnitude of the original value. The default scale is 10% of the underlying value. The larger the value, the larger the amount of noise that is added.

The multiplicative noise generator multiplies the original value by a random scaling factor that falls within a specified range.

Characteristics

Consistency

Yes, can be made self-consistent or consistent with another column.

Linking

No, cannot be linked.

Differential privacy

No

Data-free

No

Allowed for primary keys

No

Allowed for unique columns

No

Uses format-preserving encryption (FPE)

No

Privacy ranking

  • 3 if not consistent

  • 4 if consistent

Generator ID (for the API)

NoiseGenerator

How to configure

Choose either the additive noise option or the multiplicative noise option, then configure the remaining generator settings.

Using the additive noise generator

To use the additive noise generator:

  1. From the dropdown list, choose Additive.

  2. In the Relative noise scale field, type the percentage of the underlying value to scale the noise to. The default value is 10.

Tonic samples the additive noise from the range [-(scale/100) * |value|, (scale/100) * |value|), where scale is the noise scale, and value is the original data value.

The lower value of the range is inclusive, and the upper value of the range is exclusive.

For example, for the default noise scale of 10, and a data value of 20, the additive noise range would be [-.1 * 20, .1 * 20). In other words, between -2 (inclusive) and 2 (exclusive).

Using the multiplicative noise generator

To use the multiplicative noise generator:

  1. From the dropdown list, choose Multiplicative.

  2. In the Min field, type the minimum value for the scaling factor. The minimum value is inclusive. The default value is 0.5.

  3. In the Max field, type the maximum value for the scaling factor. The maximum value is exclusive. The default value is 5.

Tonic scales the original value from a range between [min, max), where min is the minimum scaling factor, and max is the maximum scaling factor.

For example, for the default values of 0.5 and 5, Tonic multiplies the original data value by a value from between 0.5 (inclusive) and 5 (exclusive).
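The following is a minimal sketch of both noise options in Python, assuming uniform sampling within the stated ranges (the documentation specifies only the ranges, not the distribution):

import random

def additive_noise(value: float, scale_percent: float = 10.0) -> float:
    # Noise is drawn from [-(s * |value|), s * |value|), where s = scale_percent / 100.
    s = scale_percent / 100.0
    bound = s * abs(value)
    return value + random.uniform(-bound, bound)

def multiplicative_noise(value: float, min_factor: float = 0.5, max_factor: float = 5.0) -> float:
    # The scaling factor is drawn from [min_factor, max_factor).
    return value * random.uniform(min_factor, max_factor)

# With the defaults, a value of 20 gains additive noise in [-2, 2),
# or is multiplied by a factor in [0.5, 5).
print(additive_noise(20.0), multiplicative_noise(20.0))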

Other generator configuration

To configure the generator consistency and data encryption:

  1. Toggle the Consistency setting to indicate whether to make the column consistent. By default, consistency is disabled.

  2. If you enable consistency, then by default the generator is self-consistent. To make the generator consistent with another column, from the Consistent to dropdown list, select the column. If the generator is self-consistent, then a given value in the source database is masked in exactly the same way to produce the value in the destination database. If the generator is consistent with another column, then for a given value in that other column, the column that is assigned the Noise generator is always masked in exactly the same way in the destination database. For example, a field containing a salary value is assigned the Noise Generator and is consistent with the username field. For each instance of User1, the Noise Generator masks the salary value in exactly the same way.

  3. If Structural data encryption is enabled, then to use it for this column, in the advanced options section, toggle Use data encryption process to the on position.

Workspaces view

View the list of workspaces that you have access to.

Create, edit, and delete workspaces

Add and remove workspaces, or update a workspace configuration.

Export and import workspace configuration

Save an existing workspace configuration. Apply a saved configuration to a workspace.

Assign workspace tags

Use tags to identify and organize your workspaces.

Workspace settings

Includes identifying information, data connection settings, and data generation settings.

Workspace management view

Provides access to workspace configuration and generation tools.

Workspace inheritance

Create child workspaces that inherit source data and configuration from their parent workspace.

Viewing your list of workspaces

Workspaces view lists the workspaces that you have access to. To display Workspaces view, in the Tonic Structural heading, click Workspaces.

Workspaces view

How the workspace list is displayed

The workspace list contains:

  • Workspaces that you own

  • Workspaces that you are granted access to

If you have the global permission Copy any workspace or Manage user access to Tonic and to any workspace, then the list includes all of the workspaces.

The Permissions column lists the workspace permission sets that you are granted in each workspace. The permission sets include both permission sets that were granted to you directly as a user, and permission sets that were granted to an SSO group that you are a member of.

Child workspaces always display under their parent workspace. The list only includes child workspaces that you have access to. If you have access to a child workspace, but not to its parent workspace, then the parent workspace is grayed out. You cannot select it.

Filtering the workspace list

You can filter the workspaces based on the following information:

  • Name - In the filter field, begin to type text that is in the name of the workspaces to display in the list.

  • Owner - From the Filter by Owner dropdown list, select the owner of the workspaces to display in the list.

  • Database type - From the Filter by Database Type dropdown list, select the type of database for the workspaces to display in the list.

  • Generation status - In the Generation Status column heading, click the filter icon. Check the checkbox next to the generation status values for the workspaces to display in the list.

  • Tags - In the Tags column heading, click the filter icon. By default, the workspaces are not filtered by tag, and all of the checkboxes are unchecked. To only include workspaces that have specific tags, check the checkbox next to each tag to include. To uncheck all of the selected tags, click Reset Tags. When you filter by tag, Structural checks whether each workspace contains any of the selected tags.

  • Permissions - In the Permissions column heading, click the filter icon. You can check and uncheck checkboxes to include or exclude specific permission sets. For example, you can filter the list to only display workspaces for which the Editor permission set is granted either to you or to an SSO group that you belong to.

    For users that have the global permission Copy any workspace, the Permissions filter panel also contains an Any permissions checkbox:

    • By default, Any permissions is unchecked, and the list includes workspaces for which you are not assigned any workspace permission sets.

    • To display all of the workspaces for which you have any assigned workspace permission sets, check Any permissions.

    • If you filtered the list based on a specific permission set, then to clear that filter and show all workspaces for which you have any permission set, check Any permissions.

    • To display all workspaces, including workspaces that you do not have any permissions for, uncheck Any permissions.

You can combine different filters. For example, you can filter the list to only include workspaces that use PostgreSQL and for which the generation status is Canceled or Failed.

Child workspaces always display under their parent workspace, even if the parent workspace does not match the filter.

Sorting the workspace list

You can sort the workspace list by name, status, or owner.

By default, the list is sorted alphabetically by name.

To sort by a column, click the column heading. To reverse the order of the sort, click the column heading again.

Child workspaces always display under their parent workspace. The child workspaces are sorted within the parent.

Workspace details on Workspaces view

Workspaces view provides the following information about each workspace:

  • Name - Contains the name and database type for the workspace. To view the workspace description, hover over the name.

  • Generation status - The status of the most recent generation job. To display the job details, click the job status. To display more details about the date, time, and duration of the job, hover over the generation timestamp. If the most recent job failed, the column also shows how long the job has been failing (the date of the first failure in the current unbroken series of failures).

  • Schema changes - Indicates whether Structural detected changes to the source database schema. If there are changes, the column shows the number of changes. Hover over the column value to display additional details, and to navigate to the Schema Changes view. Go to Viewing and resolving schema changes.

  • Tags - The tags that are assigned to the workspace.

  • Permissions - The permission sets that are assigned to you for the workspace.

  • Owner - The name and email address of the workspace owner.

Getting access to workspace tools and actions

Displaying the workspace management view

On Workspaces view, when you click the workspace name, the workspace management view for the workspace is displayed. The Privacy Hub tab is selected.

The Name column also provides access to a menu of workspace configuration options. When you select an option, the workspace management view is displayed, open to the view for the selected option.

Workspace tools menu

Options column

The last column in the workspaces list provides additional workspace options:

Options column and dropdown for a workspace
  • Subsetting icon - Displays the subsetting configuration for the workspace. Go to Viewing the current subsetting configuration.

  • Post-job actions icon - Displays the post-job actions for the workspace. For more information, go to Post-job scripts and Webhooks.

  • Actions menu - Provides access to additional options.

Actions menu for bulk actions

The Actions menu at the top left of the workspaces list allows you to perform bulk actions on multiple workspaces. It is enabled when you check one or more of the checkboxes in the first column of each row. The Actions menu provides options for the selected workspaces.

Actions menu for selected workspaces

Address

Generates a random mailing address-like string.

You can indicate which part of an address the column contains. For example, the column might contain only the street address or the city, or it might contain the full address.

Characteristics

Consistency

Yes, can be made self-consistent or consistent with another column.

Linking

Yes, can be linked.

Differential privacy

Yes, if consistency is not enabled.

Data-free

Yes, if consistency is not enabled.

Allowed for primary keys

No

Allowed for unique columns

No

Uses format-preserving encryption (FPE)

No

Privacy ranking

  • 1 if not consistent

  • 4 if consistent

Generator ID (for the API)

AddressGenerator

How to configure

To configure the generator:

  1. From the Link To dropdown list, select the columns to link this column to. You can link columns that use the Address generator to mask one of the following address components:

    • City

    • City State

    • Country

    • Country Code

    • State

    • State Abbreviation

    • Zip Code

    • Latitude

    • Longitude

    Note that when a country or country code column is linked to another address column, the generated country is always the United States.

  2. From the address component dropdown list, select the address component that this column contains. The available options are:

    • Building Number

    • Cardinal Direction (North, South, East, West)

    • City

    • City Prefix (Examples: North, South, East, West, Port, New)

    • City Suffix (Examples: land, ville, furt, town)

    • City with State (Example: Spokane, Washington)

    • City with State Abbr (Example: Houston, TX)

    • Country (Examples: Spain, Canada)

    • Country Code (Uses the 2-character country code. Examples: ES, CA)

    • County

    • Direction (Examples: North, Northeast, Southwest, East)

    • Full Address

    • Latitude (Examples: 33.51, 41.32)

    • Longitude (Examples: -84.05, -74.21)

    • Ordinal Direction (Examples: Northeast, Southwest)

    • Secondary Address (Examples: Apt 123, Suite 530)

    • State (Examples: Alabama, Wisconsin)

    • State Abbr (Examples: AL, WI)

    • Street Address (Example: 123 Main Street)

    • Street Name (Examples: Broad, Elm)

    • Street Suffix (Examples: Way, Hill, Drive)

    • US Address

    • US Address with Country

    • Zip Code (Example: 12345)

  3. Toggle the Consistency setting to indicate whether to make the column consistent. By default, consistency is disabled.

  4. If consistency is enabled, then by default, the generator is self-consistent. To make the generator consistent with another column, from the Consistent to dropdown list, select the column. When the Address generator is consistent with itself, then the same value in the source database is always mapped to the same destination value. For example, for a column that contains a state name, Alabama is always mapped to Illinois. When the Address generator is consistent with another column, then the same value in the other column always results in the same destination value for the address column. For example, if the address column is consistent with a name column, then every instance of John Smith in the name column in the source database has the same address value in the destination database.

  5. If Structural data encryption is enabled, then to use it for this column, in the advanced options section, toggle Use data encryption process to the on position.

Spark supported address parts

For the Address generator, Spark workspaces (Amazon EMR, Databricks, and self-managed Spark clusters) only support the following address parts:

  • Building Number

  • City

  • Country

  • Country Code

  • Full Address

  • Latitude

  • Longitude

  • State

  • State Abbr

  • Street Address

  • Street Name

  • Street Suffix

  • US Address

  • US Address with Country

  • Zip Code

Configuring secrets managers for database connections

Required license: Enterprise

Required global permission: Manage secrets managers

Your organization might use a secrets manager to secure credentials, including database connection credentials.

For data connector credentials, you can configure a set of available secrets managers. In the workspace configuration, users can then select a secret name from a secrets manager.

Supported secrets manager tools and formats

Structural currently supports AWS Secrets Manager.

Structural only supports secrets that store passwords. For AWS Secrets Manager, the passwords must be in one of the following formats:

  • String

  • JSON

The JSON must contain a map of key-value pairs. It can either:

  • Contain a single key for which the value is the password in plaintext.

  • Contain a key that is labeled either password or pw, for which the value is the password in plaintext.
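
For example, either of the following JSON secret values would match these formats. The key names and password values here are purely illustrative:

{ "myConnectionSecret": "s3cret-value" }

{ "username": "app_user", "password": "s3cret-value" }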

Viewing the secrets manager list

To display the list of secrets managers, on Structural Settings view, click Secrets Manager.

Working with secrets managers

Creating a secrets manager

To create a secrets manager:

  1. On the Secrets Manager tab, click Add Secrets Manager.

  2. On the Create Secrets Manager panel, in the Name field, provide a name to use to identify the secrets manager. Secrets manager names must be unique. The name is used in the secrets manager dropdown list on the workspace settings view.

  3. From the Type dropdown list, select the secrets manager product. Structural currently supports AWS Secrets Manager.

  4. Configure the credentials to use to connect to the secrets manager.

  5. Click Save.

Editing an existing secrets manager

For an existing secrets manager, you can change the name and the credentials configuration.

You cannot change the type.

To edit an existing secrets manager:

  1. In the secrets manager list, click the edit icon for the secrets manager.

  2. On the Edit Secrets Manager panel, update the configuration.

  3. Click Save.

Deleting a secrets manager

When you delete a secrets manager, it is removed from the workspace database connections that use it. Structural is no longer able to connect to those databases.

To delete a secrets manager:

  1. In the secrets manager list, click the delete icon for the secrets manager.

  2. On the confirmation panel, click Delete.

Providing credentials for AWS Secrets Manager

Required AWS Secrets Manager permissions

The AWS Secrets Manager credentials that you provide must have the following permissions:

  • secretsmanager:ListSecrets

  • On each secret to use, secretsmanager:GetSecretValue

  • On the encryption key for secrets that are encrypted with a customer managed key (CMK), kms:Decrypt

Here is an example policy that grants the required Secrets Manager permissions:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowSecretsManagerActions",
      "Effect": "Allow",
      "Action": [
        "secretsmanager:GetSecretValue",
        "secretsmanager:ListSecrets"
      ],
      "Resource": "arn:aws:secretsmanager:us-east-1:111111111111:secret:mySecretNamespace/*"
    }
  ]
}

Selecting the source of the credentials

For AWS Secrets Manager, under Authentication, select the source of the credentials:

  • Environment - Only available on self-hosted instances. Indicates to use either:

    • The credentials for the AWS Identity and Access Management (IAM) role on the host machine.

    • The credentials set in the following environment settings:

      • TONIC_AWS_ACCESS_KEY_ID - An AWS access key that is associated with an IAM user or role

      • TONIC_AWS_SECRET_ACCESS_KEY - The secret key that is associated with the access key

      • TONIC_AWS_REGION - The AWS Region to send the authentication request to

  • Assumed role - Indicates to use the specified assumed role.

  • User credentials - Indicates to use the provided user credentials.

Providing an assumed role

To provide an assumed role, click Assume Role, then:

  1. In the Role ARN field, provide the Amazon Resource Name (ARN) for the role.

  2. In the Session Name field, provide the role session name. If you do not provide a session name, then Structural automatically generates a default unique value. The generated value begins with TonicStructural.

  3. In the Duration (in seconds) field, provide the maximum length in seconds of the session. The default is 3600, indicating that the session can be active for up to 1 hour. The provided value must be less than the maximum session duration that is allowed for the role.

  4. From the AWS Region dropdown list, select the AWS Region to send the authentication request to.

Structural generates the external ID that is used in the assume role request. Your role’s trust policy must be configured to condition on your unique external ID.

Here is an example trust policy:

{
  "Version": "2012-10-17",
  "Statement": {
    "Effect": "Allow",
    "Principal": {
      "AWS": "<originating-account-id>"
    },
    "Action": "sts:AssumeRole",
    "Condition": {
      "StringEquals": {
        "sts:ExternalId": "<external-id>"
      }
    }
  }
}

Providing AWS user credentials

To provide the credentials, click User Credentials, then:

  1. In the AWS Access Key field, enter the AWS access key that is associated with an IAM user or role.

  2. In the AWS Secret Key field, enter the secret key that is associated with the access key.

  3. Optional. In the AWS Session Token field, provide the session token to use.

  4. From the AWS Region dropdown list, select the AWS Region to send the authentication request to.

Configuring an individual column

For an individual column in Database View, you can configure the assigned generator and determine the column sensitivity.

Displaying the generator configuration panel

From the column list, to display the generator configuration panel, in the Applied Generator column, click the generator name tag.

Indicating whether a column is sensitive

Required workspace permission: Configure column sensitivity

The Structural sensitivity scan provides an initial indication of whether a column is sensitive and, if it is sensitive:

  • The type of sensitive data that is in the column.

  • The confidence level of the sensitivity detection.

For more information, go to Identifying sensitive data.

In a child workspace, you cannot configure whether a column is sensitive. A child workspace always inherits the sensitivity designations from its parent workspace.

Status column

From the Status column, to confirm or change the column sensitivity, click the Status value.

The status panel indicates whether the column is sensitive. It identifies the sensitivity type, and indicates how the sensitivity was determined - by a sensitivity scan or by a user.

Built-in sensitivity type

For a column that matches a built-in sensitivity type, the first time that you display the panel, the Sensitive data? setting displays Yes and No options for you to confirm or change the sensitivity.

  • To indicate that the column is sensitive, click Yes.

  • To indicate that the column is not sensitive, click No.

When you click Yes or No, the Yes and No options change to a simple toggle. When you click Yes, the sensitivity confidence level changes to full.

After that:

  • To indicate that the column is sensitive, toggle Sensitive data? to the on position.

  • To indicate that the column is not sensitive, toggle Sensitive data? to the off position.

Sensitivity rule match

When a column matches a sensitivity rule, the sensitivity panel indicates that the column matched a sensitivity rule.

You use the Sensitive data? toggle to indicate whether the column is actually sensitive.

No built-in sensitivity type or sensitivity rule match

When a column does not match a built-in sensitivity type or a custom sensitivity rule, the sensitivity panel indicates that the column is not sensitive.

The Sensitive data? setting displays Yes and No options for you to confirm or change the sensitivity.

  • To indicate that the column is sensitive, click Yes.

  • To confirm that the column is not sensitive, click No.

When you click Yes or No, the Yes and No options change to a simple toggle.

If you click Yes:

  • The panel updates to indicate that a user confirmed that the column is sensitive.

  • The sensitivity confidence level is set to full confidence.

After that:

  • To indicate that the column is sensitive, toggle Sensitive data? to the on position.

  • To indicate that the column is not sensitive, toggle Sensitive data? to the off position.

Column configuration panel

To configure the sensitivity, you can also use the Sensitive Data toggle on the column configuration panel.

  • To indicate that a column is sensitive, toggle the sensitivity setting to the on position.

  • To indicate that the column is not sensitive, toggle the sensitivity setting to the off position.

When you change the sensitivity from the generator configuration panel, the Sensitive data? setting on the sensitivity panel also changes from the Yes and No options to the toggle.

Assigning or ignoring the recommended generator

Required workspace permission: Configure column generators

When a sensitivity scan identifies a column, Structural recommends a generator for the column. For example, when the sensitivity scan identifies a column as a first name, Structural recommends the Name generator configured to generate a first name value.

In the Assigned Generator column on Database View, columns that do not have an assigned generator, and that have a recommended generator, display the available recommendation icon.

When you click the generator dropdown, the column configuration panel includes the following information:

  • The sensitivity confidence level.

  • The recommended generator.

  • Sample source and destination values based on the recommended generator.

From the panel, you choose whether to assign or ignore the recommended generator for that type.

  • To assign the recommended generator, click Apply.

  • To ignore the recommendation, click Ignore. Structural clears the recommendation.

Changing the column generator configuration

Required workspace permission: Configure column generators

To change the generator that is assigned to a selected column:

  1. Click the generator name tag for the column.

  2. To assign a different generator to the column, from the Generator Type dropdown list, select the generator.

  3. Configure the generator options.

To reset an assigned generator to Passthrough, which indicates to not transform the data:

  1. Click the generator name tag.

  2. On the generator configuration panel, click the delete icon next to the generator dropdown.

For details about the configuration options for each generator, go to the Generator reference.

For more information about selecting and configuring generators and generator presets, go to Assigning and configuring generators.

Enabling Document View for JSON columns

Supported only for the file connector and PostgreSQL.

For a JSON column, instead of assigning a generator, you can enable Document View.

From Document View, you can view the JSON schema structure and assign generators to individual JSON fields. For more information, go to Document View for JSON columns.

To enable Document View, on the column configuration panel, toggle Use Document View to the on position. Note that if you have custom value processors, or enabled Structural data encryption, then the Use Document View toggle is in the advanced options.

When Document View is enabled, the generator dropdown is replaced with the Open in Document View option.


Creating and managing custom sensitivity rules

Required license: Professional

Required global permission: Create and manage sensitivity rules

By default, when a Structural security scan runs on a workspace, it looks for the built-in sensitivity types.

You can also define custom sensitivity rules to identify other values and the corresponding recommended generator. Your data might include values that are specific to your organization.

Each custom sensitivity rule specifies:

  • The data type for matching columns.

  • Text matching criteria for the names of matching columns.

  • The recommended generator preset.

Displaying the list of custom sensitivity rules

To display the current list of sensitivity rules, in the Structural navigation menu, click Sensitivity Rules.

The list contains the sensitivity rules for a self-hosted Structural instance or a Structural Cloud organization.

For each rule, the list includes:

  • The rule name and description

  • The recommended generator preset

  • When the rule was most recently modified

Filtering the rules

You can filter the rule list by the following:

  • Rule name

  • Rule description

  • Generator preset name

  • Name of the user who most recently updated the rule

In the filter field, start to type text from any of those values. As you type, the list is filtered to only include matching rules.

Note that when the list is filtered, you cannot change the display sequence of the rules.

Setting the rule sequence

Structural applies the rules based on their display order in the list.

If a column matches more than one rule, Structural applies the first matching rule.

To change the display order of a rule, drag and drop it to the new location in the list.

Note that you cannot change the rule sequence when the list is filtered.

Creating and editing a sensitivity rule

Creating a sensitivity rule

To create a sensitivity rule:

  1. On the Sensitivity Rules view, click New Custom Rule.

  2. On the Create Custom Rule view, configure the new rule.

  3. Click Save.

Editing a sensitivity rule

To change the configuration of a sensitivity rule:

  1. On the Sensitivity Rules view, click the edit icon for the rule.

  2. On the Edit Custom Rule view, update the configuration.

  3. Click Save.

Note that any changes to a sensitivity rule do not take effect until the next sensitivity scan.

Sensitivity rule configuration

Rule name and description

In the Name field, type the name of the sensitivity rule. The rule name becomes the sensitivity type for matching columns. The rule name must be unique, and also cannot match the name of a built-in sensitivity type.

Optionally, in the Description field, type a longer description of the sensitivity rule.

Data type

From the Data Type dropdown list, select the data type for matching columns. For example, a rule might only be used for columns that contain text.

The available data types are general types that map to specific data types in a given database. The available types are:

  • Array

  • Binary

  • Boolean

  • Continuous Numerical

  • Date Range

  • Datetime

  • Integer

  • JSON

  • MAC Address

  • Network Address

  • Text

  • UUID

  • XML

Column name criteria

Under Column Name Match, provide the criteria to identify matching columns based on the column name.

Note that a matching column must match both the data type and the column name criteria.

Configuring text matching conditions

When you provide a list of text matching conditions, a matching column must match all of the conditions. In other words, the conditions are joined by AND.

To apply the same generator preset to columns that have completely different names, you must create separate sensitivity rules.
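
For example, a rule with the two conditions Starts with "cust" and Ends with "_id" matches a column named customer_id, but does not match customer_name or account_id. (These column names are purely illustrative.)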

To create a list of text matching conditions:

  1. Click Text Match.

  2. To add a column name condition, click Add String Match.

  3. For each condition:

    1. From the comparison type dropdown list, select the type of comparison. For example, Contains, Starts with, Ends with.

    2. In the comparison text field, provide the text to check for. The comparison text is case insensitive. For example, if you set a condition to match column names that contain the text term, it also matches column names that contain TERM or Term or tErM.

  4. To remove a column name condition, click its delete icon.

Providing a regular expression

To use a regular expression to identify matching columns based on the column name:

  1. Click Regular Expression.

  2. In the field, provide the regular expression.
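
For example, a regular expression such as ^(ssn|social_security) would match column names like ssn_last4 and social_security_number. This pattern is purely illustrative; use an expression that reflects your own naming conventions.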

Generator preset to apply

From the Recommended Generator Preset dropdown list, select the generator preset that is the recommended generator for matching columns.

To search for a specific preset, begin to type the generator preset name.

Managing generator preset configuration

Required global permission: Create and manage generator presets

When you configure a sensitivity rule, you can also create a new generator preset or update the configuration of the selected generator preset.

To create a new generator preset, click Create Preset. On the generator preset details panel, provide the generator preset configuration, then click Create.

To edit the selected generator preset, click Edit Current Preset. On the generator preset details panel, update the generator preset configuration, then click Save and Apply.

For more information about generator preset configuration, go to Managing generator presets.

Previewing the rule results

If you have access to a workspace, then you can use the workspace to preview the sensitivity rule results.

Under Test Results, from the workspace dropdown list, select the workspace to use.

Structural searches the workspace schema for matching columns based on the sensitivity rule configuration.

It displays any matching columns. You can filter the matching columns based on the table or column name.

For each matching column, the list includes:

  • The column name and table

  • A sample value from the source data. The sample source value is only present if you have the Preview source data permission for the workspace.

  • A sample replacement value, based on the selected generator preset for the sensitivity rule. The sample replacement value is only present if you have the Preview destination data permission for the workspace.

Deleting a sensitivity rule

To delete a sensitivity rule, on the Sensitivity Rules view, click the delete icon for the rule.

Note that existing generator recommendations for the rule remain in place until the next sensitivity scan.


Table modes

Each table is assigned a table mode. The table mode determines at a high level how the table is populated in the destination database.

Selecting the table mode for a table

Required workspace permission: Assign table modes

Both Database View and Table View allow you to view and update the selected table mode for a table.

For Database View, go to Assigning table modes to tables.

For Table View, go to Selecting the table mode.

Available table modes

De-Identify

This is the default table mode for new tables.

In this mode, Tonic Structural copies over all of the rows to the destination database.

For columns that have the generator set to Passthrough, Structural copies the original source data to the destination database.

For columns that are assigned a generator other than Passthrough, Structural uses the generator to replace the column data in the destination database.

Truncate

This mode drops all data for the table in the destination database. Sensitivity scans ignore truncated tables.

For data connectors other than Spark-based data connectors, the table schema and any constraints associated with the table are included in the destination database.

For Spark-based data connectors (Amazon EMR, Databricks, Spark SDK), the table is ignored completely.

For the file connector, file groups are treated as tables. When a file group is assigned Truncate mode, the data generation process ignores the files that are in that file group.

Any existing data in the destination database is removed. For example, if you change the table mode to Truncate after an initial data generation, the next data generation clears the table data. For Spark-based data connectors, the table is removed.

If you assign Truncate mode to a table that has a foreign key constraint, then the data generation fails. If you need to truncate such a table, contact [email protected] for assistance.

When upsert is enabled, the Truncate table mode does not actually truncate the destination table. Instead, it works more like Preserve Destination table mode, which preserves existing records in the destination table.

Preserve Destination

This mode preserves the data in the destination database for this table. It does not add or update any records.

This feature is primarily used for very large tables that don't need to be de-identified during subsequent runs after the data exists in the destination database.

When you assign Preserve Destination mode to a table, Structural locks the generator configuration for the table columns.

The destination database must have the same schema as the source database.

You cannot use Preserve Destination mode when you:

  • Enable upsert for a workspace.

  • Write destination data to a container repository.

  • Write destination data to an Ephemeral snapshot.

Incremental

Incremental mode only processes the changes that occurred in the source table since the most recent data generation or other update to the destination table. This can greatly reduce generation time for large tables that do not have a lot of changes.

For Incremental mode to work, the following conditions must be satisfied:

  • The table must exist in the destination database. Either Structural created the table during data generation, or the table was created and populated in some other way.

  • A reliable date updated column must be present. When you select Incremental mode for a table, Structural prompts you to select the date updated column to use.

  • The table must have a primary key.

To maximize performance, we recommend that you have an index on the date updated field.

For tables that use Incremental mode, Structural checks the source database for records that have an updated date that is greater than the maximum date in that column in the destination database.
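
For example, if the maximum updated date in the destination column is 2024-01-15 10:00:00, then the next data generation only processes source records with an updated date after that timestamp. (The timestamp is purely illustrative.)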

When identifying records to update, Structural only checks the updated date. It does not check for other updates. Records where the generator configuration is changed are not updated if they do not meet the updated date requirement.

For the identified records, Structural checks for primary key matches between the source and destination databases, then does one of the following:

  • If the primary key value exists in the destination database, then Structural overwrites the record in the destination database.

  • If the primary key value does not exist in the destination database, then Structural adds a new record to the destination database.

This mode currently only updates and adds records. Rows that are deleted from the source database remain in the destination database.

To ensure accurate incremental processing of records, we recommend that you do not directly modify the destination database. A direct modification might cause the maximum updated date in the destination database to be after the date of the last data generation. This could prevent records from being identified for incremental processing.

Incremental mode is currently supported on PostgreSQL, MySQL, and SQL Server. If you want to use this table mode with another database type, contact [email protected].

You cannot use Incremental mode when you:

  • Enable upsert for a workspace.

  • Write destination data to a container repository.

  • Write destination data to an Ephemeral snapshot.

Scale

In this mode, Structural generates an arbitrary number of new rows, as specified by the user, using the generators that are assigned to the table columns.

You can use linking and partitioning to create complex relationships between columns.

Structural generates primary and foreign keys that reflect the distribution (1:1 or 1:many) between the tables in the source database.

You cannot use Scale mode when you enable upsert for a workspace.

Indicating whether to return an error when destination data already exists (Databricks only)

For the Databricks data connector, the table mode configuration includes an Error on Overwrite setting. The setting indicates whether to return an error when Structural attempts to write data to a destination table that already contains data. The option is not available when you write destination data to Databricks Delta tables.

To return the error, toggle the setting to the on position.

To not return the error, toggle the setting to the off position.

Applying a filter to tables

For workspaces that use the following data connectors, the table mode configuration for De-Identify mode includes an option to apply a filter to the table:

  • Amazon EMR

  • Amazon Redshift

  • Databricks

  • Google BigQuery

  • Snowflake on AWS

  • Snowflake on Azure

Table filters provide a way to generate a smaller set of data when a data connector does not support subsetting. For more information, go to Using table filtering for data warehouses and Spark-based data connectors.

Configuring partitioning for the destination database

This option is only available for workspaces that use the following data connectors:

  • Amazon EMR

  • Databricks

On the table mode configuration panel, you can use the Repartition or Coalesce option to indicate a number of partitions to generate.

Table mode configuration panel for a Spark-based workspace

By default, the destination database uses the same partitioning as the source database. The partition option is set to Neither.

Using the Repartition option

The Repartition option allows you to provide a specific number of partitions to generate.

To use the Repartition option:

  1. Click Repartition.

  2. In the field, enter the number of partitions.

Using the Coalesce option

The Coalesce option allows you to provide a maximum number of partitions to generate. If the source data has fewer partitions than the number that you specify, then Structural keeps the source number of partitions.

The Coalesce option is generally more efficient than the Repartition option.

To use the Coalesce option:

  1. Click Coalesce.

  2. In the field, enter the number of partitions.
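
For example, assuming that Coalesce is set to 50, a table with 200 source partitions would be written with 50 partitions, while a table with 10 source partitions would keep its 10 partitions.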

MAC Address

Generates a random MAC address formatted string.

Characteristics

How to configure

To configure the generator:

  1. In the Bytes Preserved field, enter the number of bytes to preserve in the generated address.

  2. Toggle the Consistency setting to indicate whether to make the column self-consistent. By default, consistency is disabled.

  3. If Structural data encryption is enabled, then to use it for this column, in the advanced options section, toggle Use data encryption process to the on position.
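
For example, with Bytes Preserved set to 3, a source address such as 00:1A:2B:3C:4D:5E might become 00:1A:2B:7F:90:12. This example assumes that the preserved bytes are the leading bytes, which contain the manufacturer (OUI) prefix; the remaining bytes are replaced.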

Consistency

Yes, can be made self-consistent.

Linking

No, cannot be linked.

Differential privacy

Yes, if consistency is not enabled.

Data-free

Yes, if consistency is not enabled.

Allowed for primary keys

No

Allowed for unique columns

No

Uses format-preserving encryption (FPE)

Yes

Privacy ranking

  • 1 if not consistent

  • 4 if consistent

Generator ID (for the API)

MACAddressGenerator


Generator reference

This generator reference provides the details for each of the supported generators in Tonic Structural.

Information provided for each generator

For each generator, the reference provides:

  • Overview description

  • A table that contains:

    • Generator characteristics that you might want to take into account when you select the generator.

    • The generator privacy ranking, which indicates the level of protection that the generator provides.

    • The generator ID to use in the Structural API. The generator ID is linked to the API details for the generator.

  • Instructions for how to configure the generator

The generator characteristics include:

  • Consistency - Whether you can configure the generator to base the destination values on the source values.

  • Linking - Whether you can link columns that use the generator to indicate that there is a relationship between them.

  • Differential privacy - Whether the generator supports differential privacy, which ensures that the source value cannot be reverse engineered from the output value.

  • Data-free - Whether the generator is data-free, meaning that the output data is completely unrelated to the source data.

  • Allowed for primary keys - Whether you can assign the generator to primary key columns.

  • Allowed for unique columns - Whether you can assign the generator to columns that require unique values.

  • Uses format-preserving encryption (FPE) - Whether the generator uses FPE to encrypt the values.

The generators are in alphabetical order by the generator name.

Here are some groupings to help to identify generators that are used for different types of values. Generator hints and tips also provides some suggestions for generators to use for specific use cases.

Composite generators

Transform data that uses complex formats or based on a condition. For more information, go to Composite generators.

  • Array JSON Mask

  • Array Regex Mask

  • Conditional

  • CSV Mask

  • HStore Mask

  • HTML Mask

  • JSON Mask

  • Regex Mask

  • Struct Mask

  • XML Mask

Information type generators

These generators produce specific types of values.

  • Address

  • Business Name (and the deprecated Company Name)

  • Email

  • File Name

  • Finnish Personal Identity Code

  • FNR

  • Geo

  • HIPAA Address

  • Hostname

  • International Address

  • IP Address

  • MAC Address

  • Name

  • Phone

  • Shipping Container

  • SIN

  • SSN

  • Unique Email

  • URL

Datetime value generators

These generators are used to specifically transform datetime values.

  • Date Truncation

  • Event Timestamps

  • Random Timestamp

  • Timestamp Shift Generator

Key generators

Intended for use with primary key columns. For more information, go to Primary key generators.

  • Alphanumeric String Key

  • ASCII Key

  • Integer Key

  • Numeric String Key

  • UUID Key

Numeric value generators

These generators are specifically intended to work with numeric values.

  • Algebraic

  • Continuous

  • Cross Table Sum

  • Noise Generator

  • Random Double

  • Random Integer

  • Sequential Integer

String value generators

These generators are useful for transforming string values that aren't covered by a specific information type generator.

  • Categorical

  • Character Scramble

  • Character Substitution

  • Constant - Also usable for numeric columns.

  • Custom Categorical - Also usable for numeric columns.

  • Find and Replace

  • Regex Mask

Other value substitution and replacement generators

These generators perform other types of transformation on column values.

  • Array Character Scramble

  • Null

  • Random Boolean

  • Random Hash

  • Random UUID


Enabling and configuring upsert

Required license: Professional or Enterprise

Not compatible with writing output to a container repository or a Tonic Ephemeral snapshot.

By default, Tonic Structural data generation replaces the existing destination database with the transformed data from the current job.

Upsert adds and updates rows in the destination database, but keeps all of the other existing rows intact. For example, you might have a standard set of test records that you do not want to replace every time you generate data in Structural.

If you enable upsert, then you cannot write the destination data to a container repository or to a Tonic Ephemeral snapshot. You must write the data to a database server.

Upsert is currently only supported for the following data connectors:

  • MySQL

  • Oracle

  • PostgreSQL

  • SQL Server

For an overview of upsert, you can also view the video tutorial.

About the upsert process

When upsert is enabled, the data generation job writes the generated data to an intermediate database. Structural then runs the upsert job to write the new and updated records to the destination database.

Data generation process with upsert

The destination database must already exist. Structural cannot run an upsert job to an empty destination database.

The upsert job adds and updates records based on the primary keys.

  • If the primary key for a record already exists in the destination database, the upsert job updates the record.

  • If the primary key for a record does not exist in the destination database, the upsert job inserts a new row.
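
For example, if a generated record has primary key 42, and the destination table already contains a row with primary key 42, then the upsert job overwrites that row. If no row with primary key 42 exists, the upsert job inserts the record as a new row. (The key value is purely illustrative.)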

To only update or insert records that Structural creates based on source records, and ignore other records that are already in the destination database, ensure that the primary keys for each set of records operate on different ranges. For example, allocate the integer range 1-1000 for existing destination database records that you add manually. Then ensure that the source database records, and by extension the records that Structural creates during data generation, use a different range.

Also note that when upsert is enabled, the Truncate table mode does not actually truncate the destination table. Instead, it works more like Preserve Destination table mode, which preserves existing records in the destination table.

Enabling upsert

To enable upsert, in the Upsert section of the workspace details, toggle Enable Upsert to the on position.

When you enable upsert for a workspace, you are prompted to configure the upsert processing and provide the connection details for the intermediate database.

Configuring upsert processing

When you enable upsert, Structural displays the following settings to configure the upsert process.

Disable Triggers

Indicates whether to disable any user-defined triggers before the upsert job runs. This prevents duplicate rows from being added to the destination database. By default, this is enabled.

Automatically Start Upsert After Successful Data Generation

Indicates whether to immediately run the upsert job after the initial data generation to the intermediate database. By default, this is enabled. If you turn this off, then after the initial data generation, you must start the upsert job manually.

Persist Conflicting Data Tables

When an upsert job cannot process rows because of unique constraint conflicts, including rows that have foreign keys to those rows, this setting indicates whether to preserve the temporary tables that contain those rows. By default, this is disabled. Structural only keeps the applicable temporary tables from the most recent upsert job.

Warn on Mismatched Constraints

Indicates whether to treat mismatched foreign key and unique constraints between the source and destination databases as warnings instead of errors, so that the upsert job does not fail. By default, this is disabled.

Connecting to migration scripts for schema changes

Required license: Enterprise

The intermediate database must have the same schema as the destination database. If the schemas do not match, then the upsert process fails.

To ensure that schema changes are automatically reflected in the intermediate database, you can connect the workspace to your own database migration script or tool. Structural then runs the migration script or tool whenever you run upsert data generation.

How upsert works with the migration process

When you start an upsert data generation job:

Upsert data generation process with migration
  1. If migration is enabled, Structural calls the endpoint to start the migration.

  2. Structural cannot start the upsert data generation until the migration completes successfully. It regularly calls the status check endpoint to check whether the migration is complete.

  3. When the migration is complete, Structural starts the upsert data generation.

POST Start Schema Changes endpoint

Required. Structural calls this endpoint to start the migration process specified by the provided URL.

The request includes:

  • Any custom parameter values that you add.

  • The connection information for the intermediate database.

The request uses the following format:

{ 
  "parameters": {/* user supplied parameters */ },
  "databaseConnectionDetails": {
        "server": "rds.amazon.com",
        "port": "54321",
        "username": "user",
        "password": "password",
        "databaseName": "tonic_upsert",
        "schemaName": "<Oracle schema to use>",
        "sslEnabled": true,
        "trustServerCertificate": false
  }
}

The response contains the identifier of the migration task.

The response uses the following format:

{ "id": "<unique-string-identifier>" }

GET Status of Schema Change endpoint

Required. Structural calls this endpoint to check the current status of the migration process.

The request includes the task identifier that was returned when the migration process started. The request URL must be able to pass the request identifier as either a path or a query parameter.

The response provides the current status of the migration task. The possible status values are:

  • Unknown

  • Queued

  • Running

  • Canceled

  • Completed

  • Failed

The response uses the following format:

{
  "id": "a0c5c4c3-a593-4daa-a935-53c45ec255ea",
  "status": "Completed",
  "errors": []
}

GET Schema Change Logs endpoint

Optional. Structural calls this endpoint to retrieve the log entries for the migration process. It adds the migration logs to the upsert logs.

The request includes the task identifier that was returned when the migration process started. The request URL must be able to pass the request identifier as either a path or a query parameter.

The response body should use the text/plain content type. It contains the raw logs.

DELETE Cancel Schema Changes endpoint

Optional. Structural calls this endpoint to cancel the migration process.

The request includes the task identifier that was returned when the migration process started. The request URL must be able to pass the request identifier as either a path or query parameter.

Enabling and configuring the migration process

To enable the migration process, toggle Enable Migration Service to the on position.

When you enable the migration process, you must configure the POST Start Schema Changes and GET Status of Schema Change endpoints.

You can optionally configure the GET Schema Change Logs and DELETE Cancel Schema Changes endpoints.

To configure the endpoints:

  1. To configure the POST Start Schema Changes endpoint:

    1. In the URL field, provide the URL of the migration script.

    2. Optionally, in the Parameters field, provide any additional parameter values that your migration scripts need.

  2. To configure the GET Status of Schema Change endpoint, in the URL field, provide the URL for the status check.

    The URL must include an {id} placeholder. This is used to pass the identifier that is returned from the Start Schema Changes endpoint.

  3. To configure the GET Schema Change Logs endpoint, in the URL field, provide the URL to use to retrieve the logs. The URL must include an {id} placeholder. This is used to pass the identifier that is returned from the Start Schema Changes endpoint.

  4. To configure the DELETE Cancel Schema Changes endpoint, in the URL field, provide the URL to use for the cancellation. The URL must include an {id} placeholder. This is used to pass the identifier that is returned from the Start Schema Changes endpoint.
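
For example, a hypothetical configuration might use https://migrations.example.com/tasks as the POST Start Schema Changes URL and https://migrations.example.com/tasks/{id}/status as the GET Status of Schema Change URL, with a Parameters value such as:

{ "environment": "staging", "timeoutSeconds": 600 }

These URLs and parameter names are placeholders. Use the values that your own migration script or tool expects.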

Connecting to the intermediate database

When you enable upsert, you must provide the connection information for the intermediate database.

For details, go to the workspace configuration information for the data connector.


Table View

Table View displays source or preview data for a single table. For a file connector workspace, each table corresponds to a file group.

Required workspace permission:

  • Source data: Preview source data

  • Destination data: Preview destination data

If you do not have either of these permissions, then you cannot display Table View.

To display Table View:

  • On the workspace management view, click Table View.

  • On Workspaces view, from the dropdown menu in the Name column, select Table View.

  • From Database View, either click the arrow icon for the table, or click a row in the table.

From Table View, you can view and update the table and column configuration.

Selecting and configuring tables

Selecting the table to view

When you display Table View from Database View, it displays the data for the selected table.

When you display Table View from the workspace management view or Workspaces view, it displays the most recently displayed table.

If Table View was never displayed before, then it displays the first table in the workspace.

To change the selected table, from the Table dropdown list, select the table to view.

Selecting the table mode

Required workspace permission: Assign table modes

To change the table mode that is assigned to the table:

  1. Click the current table mode.

  2. On the table mode panel, from the table mode dropdown list, select the new table mode.

When you change the table mode, Tonic Structural updates the preview data as needed. For example, if you change the table mode to Truncate, then the preview data is empty.

For a child workspace, the table mode selection panel indicates whether the selected table mode is inherited from the parent workspace.

If the child workspace currently overrides the parent workspace configuration, then to reset the table mode to the table mode that is assigned in the parent workspace, click Reset.

Viewing the generator configuration summary

The Model section of Table View displays the configured generators for the table columns.

The header for each Model entry is the column name.

Linked columns share an entry. The heading is a comma-separated list of the linked columns.

Each entry contains the following information:

  • The column and generator, in the format Column Name >> Generator Name. For example, First_Name >> Name indicates that the First_Name column has the Name generator applied. For linked columns, there is a Column Name >> Generator Name entry for each column.

  • The selected configuration options for the generator.

For a child workspace, each Model entry indicates whether the configuration overrides the parent configuration. For configurations that override the parent, to remove the overrides and restore the inheritance, click Reset.

The Model entry also indicates when Structural data encryption is enabled for the column.

To remove the generator from a column, click the delete icon.

Changing the column data display

Toggling between source and preview data

The Preview toggle at the top right of Table View allows you to choose whether to display original source data or the transformed data. You can switch back and forth to understand exactly how Structural transforms the data based on the table and column configuration.

By default, the Preview toggle is in the on position, and the displayed data reflects the selected table mode and the assigned generators. For tables that use Truncate mode, the preview data is empty. Truncated tables do not have data in the destination database.

To display the original source data, toggle Preview to the off position.

Note that for JSON columns that use Document View, you cannot preview the destination data from Table View. You must preview the data from Document View.

Using a query to filter the source data

You can provide a query to filter the source data. The query is always against the source data, not the preview data, regardless of whether the Preview toggle is off or on.

For example, you configure a first name field to use the Name generator and enable consistency. You can then query the source data for a specific first name value to check that the preview data uses the same destination value for all of those records.

To apply a query to the source data:

  1. Click the query filter icon, located between the table name and the table mode.

  2. On the Table Filter dialog, provide the WHERE clause for the query.

  3. To apply the query, click Apply.

  4. To close the dialog, click Close.

To clear an applied query, on the Table Filter dialog, click Clear.
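
For example, to check consistency for a specific first name, you might apply a filter such as first_name = 'John'. The column name and value are purely illustrative, and the exact WHERE clause syntax depends on your source database.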

If no filter is applied, then the query filter icon has a white background.

If a valid filter is applied, then the query filter icon has a gray background.

If the provided WHERE clause is not valid, then the query filter icon has a red background.

Information in the column headings

In addition to the column name, the column heading provides details about the column type and protection status. It also provides access to change the column configuration.

Primary and foreign key indicators

The column heading indicates when a column is either a primary key or a foreign key.

Protection status

The column heading indicates the column protection status:

  • At risk columns are sensitive and do not have an assigned generator.

  • Protected columns have an assigned generator.

  • Not sensitive columns are not sensitive and do not have an assigned generator.

Sensitivity confidence

The sensitivity confidence indicator indicates the confidence in the detection.

For sensitive columns that Structural detected, the confidence level can be high, medium, or low.

For custom sensitivity rule matches or columns that you manually marked as sensitive, the confidence level is full confidence.

For more information about how Structural identifies values and assigns the confidence level, go to Identifying sensitive data.

Column data type

The column heading displays the type of data that the column contains.

Child workspace overrides

Required license: Enterprise

In a child workspace, when a column overrides the parent configuration, an Overriding label displays in the column heading.

To filter Table View to only display columns with overrides, toggle Show Overrides Only to the on position.

Configuring a column

Applying or ignoring a recommended generator

Required workspace permission: Configure column generators

When a sensitivity scan identifies a column, Structural recommends a generator for the column. For example, when the sensitivity scan identifies a column as a first name, Structural recommends the Name generator configured to generate a first name value.

For unprotected columns that have a recommended generator, the column heading displays the available recommendation icon.

When you click the dropdown, the column configuration panel includes the following information:

  • The sensitivity confidence level

  • The recommended generator

  • Sample source and destination values based on the recommended generator

From the panel, you can choose whether to assign or ignore the recommended generator for that type.

  • To assign the recommended generator, click Apply.

  • To ignore the recommendation, click Ignore. Structural clears the recommendation.

Changing the column generator configuration

Required workspace permission: Configure column generators

To assign a generator to a column that does not have an assigned generator, or to change the current configuration, click the dropdown in the column heading.

On the generator configuration panel, from the generator type dropdown list, select the generator to assign to the column.

Structural displays the available configuration options for the selected generator. For details about the configuration options for each generator, go to the Generator reference.

To remove the selected generator or generator preset, and reset the generator to Passthrough, click the delete icon next to the generator.

For more information about selecting and configuring generators and generator presets, go to Assigning and configuring generators.

Indicating whether a column is sensitive

Required workspace permission: Configure column sensitivity

On the column configuration panel, the Sensitive Data toggle indicates whether the column is marked as sensitive. The initial configuration is based on the sensitivity scan.

  • To mark a column as sensitive, toggle the setting to the on position.

  • To mark a column as not sensitive, toggle the setting to the off position.

In a child workspace, you cannot configure whether a column is sensitive. A child workspace always inherits the sensitivity designation from its parent workspace.

When you copy a workspace, Structural performs a new sensitivity scan on the copy. It does not copy the sensitivity designations from the original workspace.

Enabling Document View for JSON columns

Supported only for the file connector and PostgreSQL.

For a JSON column, instead of assigning a generator, you can enable Document View.

From Document View, you can view the JSON schema structure and assign generators to individual JSON fields. For more information, go to Document View for JSON columns.

To enable Document View, on the column configuration panel, toggle Use Document View to the on position. Note that if you have custom value processors, or have enabled Structural data encryption, then the Use Document View toggle is in the advanced options.

When Document View is enabled, the generator dropdown is replaced with the Open in Document View option.

Viewing the column list

The column list on Database View contains information about the sensitivity and generator configuration for each column.

Column list on Database View

Column - Column name and type

The Column column provides general information about the columns and their content, including:

  • Table and column name. When you click the column name, Table View for the column table displays.

  • The name of the schema that contains the table.

  • The data type for the column.

  • An indicator when the column is a primary key.

The Column column also contains the option to display sample data for the column.

Status - Protection and sensitivity status

The Status column provides information about whether the column contains sensitive data and whether it has an assigned generator.

The protection status can be one of the following values:

Status Values for a column

  • Protected - The column has an assigned generator.

  • Not Sensitive - The column is marked as not sensitive.

  • At Risk - The column is sensitive and does not have an assigned generator.

At the right of the Status column is a confidence indicator. For At Risk columns, the confidence indicator shows how confident Structural is that the column is sensitive and contains values of the displayed sensitivity type. Protected columns also reflect the original confidence level.

Confidence level indicators for database columns

For more information about how Structural identifies values and assigns the confidence level, go to How Structural identifies sensitive values.

From the Status column, you can change whether a column is sensitive.

Applied Generator - Column configuration

The Applied Generator column is where you select and configure the generator to assign.

The generator dropdown indicates the currently assigned generator. It also indicates when an unprotected column has a recommended generator.

Unprotected column that has a recommended generator

For foreign key columns, the generator dropdown is disabled and the column is marked as a foreign key. Foreign key columns always inherit the generator that is assigned to the primary key.

Disabled generator dropdown for a foreign key column

In a child workspace, when the generator configuration overrides the parent workspace, the generator dropdown displays the override icon.

Column with a generator configuration that overrides the parent workspace

The Applied Generator column also contains the option to display and create column comments.

Filtering the column list

To filter the column list, you can:

  • Use the table list to filter the displayed columns based on the table that the columns belong to.

  • Use the filter field to filter the columns by table or column name.

  • Use the Filters panel to filter the columns based on column attributes and generator configuration.

You can use column filters to quickly find columns whose configuration you want to verify or update.

Filter by table

To filter the column list to only include columns for specific tables, either:

  • Apply a filter to the table list.

  • Check the checkbox for each table to display columns for.

Filter by table or column name

To filter the column list by table or column name, in the filter field, begin to type text that is in the table or column name.

As you type, Structural filters the column list.

Filtering columns by name

Using the Filters panel

The Filters panel provides access to column filters other than the table and column name.

To display the Filters panel, click Filters.

Filters panel for columns

Searching for a filter

To search for a filter or a filter value, in the search field, start to type the value. The search looks for text in the individual settings.

For each filter, the Filters panel indicates the number of matching columns, based on the selected tables and the current filters.

Using the column filter search

Adding a filter

To add a filter, depending on the filter type, either check the checkbox or select a filter option. As you add filters, Structural applies them to the column list. Above the list, Structural displays tags for the selected filters.

Filters panel with filters selected

Clearing the selected filters

To clear all of the currently selected filters, click Clear All.

Filters panel filters

Columns with generator recommendations

To only display detected sensitive columns for which there is a recommended generator, on the Filters panel, check Has Generator Recommendation.

At-risk columns

An at-risk column:

  • Is marked as sensitive.

  • Is included in the destination data.

  • Is assigned the Passthrough generator.

To only display at-risk columns, on the Filters panel, check At-Risk Column.

When you check At-Risk Column, Structural adds the following filters under Privacy Settings:

  • Sets the sensitivity filter to Sensitive

  • Sets the protection status filter to Not protected

  • Sets the column inclusion filter to Included

Sensitivity

You can filter the columns based on the column sensitivity.

On the Filters panel, under Privacy Settings, the sensitivity filter is by default set to All, which indicates to display both sensitive and non-sensitive columns.

  • To only display sensitive columns, click Sensitive.

  • To only display non-sensitive columns, click Not sensitive.

Note that when you check At-Risk Column, Structural automatically selects Sensitive.

Protection status

You can filter the columns based on whether they have any generator other than Passthrough assigned. To filter the columns based on specific assigned generators, use the Applied Generator filter.

On the Filters panel, under Privacy Settings, the column protection filter is by default set to All, which indicates to display both protected and not protected columns.

  • To only display columns that have an assigned generator, click Protected.

  • To only display columns that do not have an assigned generator, click Not protected.

Note that when you check At-Risk Column, Structural automatically selects Not protected.

Inclusion in the destination database

You can filter the columns based on whether they are populated in the destination database. For example, if a table is truncated, then the columns in that table are not populated.

On the Filters panel, under Privacy Settings, the column inclusion filter is by default set to All, which indicates to display both included and not included columns.

  • To only display columns that are populated in the destination database, click Included.

  • To only display columns that are not populated in the destination database, click Not included.

Note that when you check At-Risk Column, Structural automatically selects Included.

Assigned generator

To only display columns that are assigned specific generators, on the Filters panel, under Applied Generator, check the checkbox for each generator to include.

The list of generators only includes generators that are assigned to the currently displayed columns and that are compatible with other applied filters.

To search for a specific generator, in the Filters search field, begin to type the generator name.

Column data type

You can filter the columns by the column data type. For example, you can only display varchar columns, or only columns that contain either numeric or integer values.

To only display columns that have specific data types, on the Filters panel, under Database Data Types, check the checkbox for each data type to include.

The list of data types only includes data types that are present in the currently displayed columns and that are compatible with other applied filters.

To search for a specific data type, in the Filters search field, begin to type the data type.

Unresolved schema changes

When the source database schema changes, you might need to update the configuration to reflect those changes. If you do not resolve the schema changes, then the data generation might fail. The data generation fails if there are unresolved conflicting changes, or if you configure Structural to always fail data generation when there are any unresolved changes.

For more information about schema changes, go to Viewing and resolving schema changes.

To only display columns that have unresolved schema changes, on the Filters panel, check Unresolved Schema Changes.

Sensitivity type

For detected sensitive columns, the sensitivity type indicates the type of data that was detected. Examples of sensitivity types include First Name, Address, and Email.

To only display columns that contain specific sensitivity types, on the Filters panel, under Sensitivity Type, check the checkbox for each sensitivity type to include.

The list of sensitivity types only includes sensitivity types that are present in the currently displayed columns.

To search for a specific sensitivity type, in the Filters search field, type the sensitivity type.

Sensitivity confidence

When the Structural sensitivity scan identifies a value as belonging to a sensitivity type, it also determines how confident it is in that determination. The Status column displays the confidence level.

You can filter the columns based on the confidence level.

To only display columns that have a specific confidence level, on the Filters panel, under Sensitivity confidence, check the checkbox next to each confidence level to include.

Column nullability

You can filter the column list based on whether the column is nullable.

On the Filters panel, under Data Attributes, the nullability filter is by default set to All, which indicates to display both nullable and non-nullable columns.

  • To only display columns that are nullable, click Nullable.

  • To only display columns that are not nullable, click Non-nullable.

Column uniqueness

You can filter the column list based on whether the column must be unique.

On the Filters panel, under Data Attributes, the uniqueness filter is by default set to All, which indicates to display both unique and not unique columns.

  • To only display columns that must be unique, click Unique.

  • To only display columns that do not require uniqueness, click Not unique.

Primary or foreign keys

You can filter the column list to indicate whether to include:

  • Columns that are not primary or foreign keys.

  • Columns that are foreign keys.

  • Columns that are primary keys.

On the Filters panel, under Column Type:

  • To display columns that are neither a primary key nor a foreign key, check Non-keyed.

  • To display columns that are primary keys, check Primary key.

  • To display columns that are foreign keys, check Foreign key.

Generator overrides in a child workspace

In a child workspace, to only display columns that override the generator configuration that is in the parent workspace, on the Filters panel, check Overrides Inheritance.

Uses Structural data encryption

You can enable Structural data encryption, a configuration that allows Structural to:

  • Decrypt source data before applying the generator.

  • Encrypt generated data before writing it to the destination database.

For more information, go to Configuring and using Structural data encryption.

When Structural data encryption is enabled, the generator configuration panel includes an option to use Structural data encryption.

To only display columns that are configured to use Structural data encryption, on the Filters panel, check Uses Data Encryption.

Sorting the column list

By default, the column list is sorted first by table name, then by column name. The columns for each table display together. Within each table, the columns are in alphabetical order.

You can also sort the column list by column name first, then by table. Columns that have the same name display together. Those columns are sorted by the name of the table.

The button at the right of the Column column heading indicates the current sort order.

Sort button in the Column heading

  • T.C indicates that the list is sorted by table, then by column.

  • C.T indicates that the list is sorted by column, then by table.

To switch the sort order, click the button.

Using Collection View

For MongoDB and Amazon DynamoDB, Collection View replaces Database View and Table View. From Collection View, you can view the fields in a selected collection. You can then assign a collection mode to the collection, and assign generators to fields.

Collection View for a MongoDB workspace

Selecting the collection to view

From the Collection dropdown list, select the collection to view.

Assigning a collection mode to the collection

Collection mode is the term used for table mode. The collection mode determines at the collection level how Structural uses the collection data to generate the destination database.

Available collection modes

By default, the collection mode is De-Identify. In this mode, Structural uses the assigned generators to transform the source database into the destination database.

For MongoDB and DynamoDB, the only other options are Truncate and Preserve Destination.

  • Truncate means that only the collection structure is included in the destination database. The collection has no data in the destination database.

  • Preserve Destination means that Structural does not change the data that is currently in the destination database.

Assigning the collection mode

Required workspace permission: Assign table modes

To assign the collection mode:

  1. Click the Collection Mode dropdown list.

  2. On the panel, click the current collection mode.

  3. From the dropdown list, select the mode to use.

Selecting the type of view

You can view a collection either as a hybrid document or as single documents. From the View dropdown list, select the view to use.

Hybrid document view

The default view is Hybrid Document. For the hybrid document view, the key list reflects all of the permutations of every field from every document. For example, a field might sometimes be a datetime value and sometimes a string. Hybrid document view lists both types.

Hybrid Document view of Collection View

Single document view

Single Document view displays a single document at a time. You can then page through up to 100 documents. Single Document view displays the structure for each document.

Single Document view of Collection View

Information on the field list

For each field, Collection View always displays:

  • The field name and type.

  • For fields that you configured as primary or foreign keys, a key icon.

  • The assigned generator.

  • An example value. For the hybrid view, you can use the magnifying glass icon to display additional example values.

For the hybrid document view, there is also a Field Freq column. Field Freq shows the percentage of documents that contain that permutation of field and type.

For example, a field might be Null 33% of the time and contain a numeric value 67% of the time. Or a field value might be an Int32 value 3% of the time and an Int64 value 6% of the time. The percentages apply to the first 100 documents.

Toggling between source and preview data

Required workspace permission:

  • Source data: Preview source data

  • Destination data: Preview destination data

The Preview toggle at the top right of Collection View allows you to choose whether to display original source data or the transformed data. You can switch back and forth to determine exactly how Tonic Structural transforms the data based on the collection and field configuration.

By default, the Preview toggle is in the on position, and the displayed data reflects the selected collection mode and the assigned generators. For collections that use Truncate mode, the preview data is empty. Truncated collections do not have data in the destination database.

To display the original source data, toggle Preview to the off position.

Filtering collection fields

In the single document view, you can filter the fields by either the field name or the field value.

In the hybrid document view, you can filter the fields based on either the field name or field properties.

Filtering single document view by field name or value

You can filter single document view to only display fields that have specific text in either the field name or the field value.

To filter by value, toggle Search by Value to the on position.

Filter field and Search by Value toggle for single document view

After you select the filter type, in the search field, type text that is in the field name or value. As you type, Structural filters the list to only include fields that contain the filter text.

Filtering hybrid view by field name

To filter hybrid view by field name, in the search field, begin to type text that is in the field name. As you type, Structural filters the list to only include fields with names that include the filter text.

Filter field and Filters button for hybrid view

Filtering hybrid view by field properties

From the hybrid document view, you can filter the fields based on field properties.

To display the Filters panel, click Filters.

Filters panel for hybrid view on Collection View

Searching for a filter

To search for a filter or a filter value, in the search field, start to type the value. The search looks for text in the individual settings.

Adding a filter

To add a filter, depending on the filter type, either check the checkbox or select a filter option. As you add filters, Structural applies them to the field list.

Above the list, Structural displays tags for the selected filters.

Clearing the selected filters

To clear all of the currently selected filters, click Clear All.

Filters panel filters

The Filters panel in hybrid view includes the following fields.

At-risk fields

An at-risk field:

  • Is marked as sensitive

  • Is assigned the Passthrough generator.

To only display at-risk fields, on the Filters panel, check At-Risk Field.

When you check At-Risk Field, Structural adds the following filters under Privacy Settings:

  • Sets the sensitivity filter to Sensitive.

  • Sets the protection status filter to Not protected.

Sensitivity

You can filter the fields based on the field sensitivity.

On the Filters panel, under Privacy Settings, the sensitivity filter is by default set to All, which indicates to display both sensitive and non-sensitive fields.

  • To only display sensitive fields, click Sensitive.

  • To only display non-sensitive fields, click Not sensitive.

Note that when you check At-Risk Field, Structural automatically selects Sensitive.

Protection status

You can filter the fields based on whether they have any generator other than Passthrough assigned.

On the Filters panel, under Privacy Settings, the field protection filter is by default set to All, which indicates to display both protected and not protected fields.

  • To only display fields that have an assigned generator, click Protected.

  • To only display fields that do not have an assigned generator, click Not protected.

Note that when you check At-Risk Field, Structural automatically selects Not protected.

Recommended generators

When Structural detects that a field is sensitive, it can also determine a recommended generator.

For example, when it detects a name value, it also recommends the Name generator.

You can filter the fields to display the fields that have recommended generators.

On the Filters panel, under Recommended Generators, check the checkbox next to the recommended generator for which to display the fields that have that recommendation.

Field data type

You can filter the fields by the field data type. For example, you might only display fields that contain either numeric or integer values.

To only display fields that have specific data types, on the Filters panel, under Database Data Types, check the checkbox for each data type to include.

The list of data types only includes data types that are present in the currently displayed fields and that are compatible with other applied filters.

To search for a specific data type, in the Filters search field, begin to type the data type.

Unresolved schema changes

When the source database schema changes, you might need to update the configuration to reflect those changes. If you do not resolve the schema changes, then the data generation might fail. The data generation fails if there are unresolved conflicting changes, or if you configure Structural to always fail data generation when there are any unresolved changes.

For more information about schema changes, go to Viewing and resolving schema changes.

To only display fields that have unresolved schema changes, on the Filters panel, check Unresolved Schema Changes.

Sensitivity type

For detected sensitive fields, the sensitivity type indicates the type of data that was detected. Examples of sensitivity types include First Name, Address, and Email.

To only display fields that contain specific sensitivity types, on the Filters panel, under Sensitivity Type, check the checkbox for each sensitivity type to include.

The list of sensitivity types only includes sensitivity types that are present in the currently displayed fields.

To search for a specific sensitivity type, in the Filters search field, type the sensitivity type.

Sensitivity confidence

When the Structural sensitivity scan identifies a value as belonging to a sensitivity type, it also determines how confident it is in that determination.

You can filter the fields based on the confidence level.

To only display fields that have a specific confidence level, on the Filters panel, under Sensitivity confidence, check the checkbox next to each confidence level to include.

Primary or foreign keys

You can filter the field list to indicate whether to include:

  • Fields that are not primary or foreign keys.

  • Fields that are foreign keys.

  • Fields that are primary keys.

On the Filters panel, under Field Type:

  • To display fields that are neither a primary key nor a foreign key, check Non-keyed.

  • To display fields that are primary keys, check Primary key.

  • To display fields that are foreign keys, check Foreign key.

Commenting on fields

Required license: Professional or Enterprise

You can add comments to fields. For example, you might use a comment to explain why you selected a particular generator or marked a field as sensitive or not sensitive.

Adding a new comment

If a field does not have any comments, then to add a comment:

  1. Click the comment icon.

  2. In the comment field, type the comment text.

  3. Click Comment.

Replying to an existing comment

When a field has existing comments, the comment icon is green. To add comments:

  1. Click the comment icon. The comments panel shows the previous comments. Each comment includes the comment user and timestamp.

  2. In the comment field, type the comment text.

  3. Click Reply.

Indicating whether a field is sensitive

Required workspace permission: Configure column sensitivity

On the field configuration panel, the sensitivity toggle at the top right indicates whether the field is marked as sensitive.

To mark a field as sensitive, toggle the setting to the Sensitive position.

To mark a field as not sensitive, toggle the setting to the Not Sensitive position.

Assigning a generator to a field and type

Required workspace permission: Configure column generators

You can assign a generator to each combination of field and type. For example, depending on the document, the data type for a field might be either string or integer. You can indicate to use the Character Scramble generator when the field type is a string and the Random Integer generator when the field type is integer.

In hybrid document view, the Null type reflects when the field value is Null. You do not assign a generator to it.

To assign a generator:

  1. Click the generator value for the field.

  2. On the configuration panel, from the Generator Type dropdown list, select the generator.

  3. Configure the generator options. For details about the available configuration options for each generator, go to the Generator reference.

Disabling examples for sparse collections

By default, Structural retrieves 100 documents. It then uses the data in these documents to populate example values in the hybrid document.

For sparsely populated collections, where less common fields are not present in those 100 documents, Structural retrieves extra documents until it has example values for all fields. For very sparsely populated collections, this might cause the collection view to load slowly, because it must retrieve many documents.

To disable examples for sparse collections, set the environment setting TONIC_MONGO_DISABLE_EXTRA_EXAMPLES to true. You can add this setting manually to the Environment Settings list on Structural Settings.

Note that this setting applies to both MongoDB and Amazon DynamoDB.

When this setting is true, fields that do not have a retrieved value use a dummy default value that is based on the data type.

Structural license plans

Tonic Structural provides different license plans to accommodate organizations that are of different sizes and that have more or less complex data architectures.

Basic license

The Basic license is designed for very small organizations that have a very simple data architecture. It provides access to Structural's core de-identification and data generation features.

Users

The Basic license allows access for a single user, with an option to purchase an additional two users.

There is no access to single sign-on (SSO).

Data connectors

With a Basic license, you can create workspaces for one data connector type. The data connector type must be one of the following:

  • PostgreSQL

  • MySQL

Concurrent jobs

With a Basic license, your Structural instance can have only one Structural worker. This means that only one sensitivity scan or data generation job can run at a time.

Structural features

With a Basic license, you can create and configure workspaces, and run data generation for those workspaces.

You can use Privacy Hub to view the current sensitivity status based on the current workspace configuration.

The Basic license does NOT provide access to the following features:

  • Workspace inheritance

  • Workspace sharing

  • Generator presets

  • Custom sensitivity rules

  • Column commenting and email notifications

  • Audit Trail

  • Privacy Report

  • Virtual foreign keys - Can view foreign keys from the data, but cannot add virtual foreign keys

  • Subsetting

  • Upsert

  • Post-job scripts

  • Webhooks

  • Alerts for non-conflicting schema changes

  • Custom generators

  • Custom value processors

  • Custom permission sets

Structural API

With a Basic license, you only have access to the basic version of the Structural API.

You cannot use the basic Structural API to perform the following API tasks, which require the advanced API:

  • Assigning table modes to tables

  • Assigning generators to columns

Professional license

The Professional license is designed for larger organizations that have more complex data architectures. The organization might have a larger team that supports multiple databases.

The Professional license is also granted to pay-as-you-go subscriptions on Structural Cloud.

The Professional license provides access to a larger set of Structural features than the Basic license.

Users

The Professional license allows up to 10 users. You can purchase access for unlimited users as an add-on.

You can use single sign-on (SSO) to manage your Structural users.

Data connectors

With a Professional license, you can create workspaces for up to two types of data connectors. You can purchase one additional data connector type as an add-on.

Those data connectors can be of any type except for Oracle and Db2 for LUW.

Concurrent jobs

With a Professional license, your Structural instance can have more than one Structural worker.

This means that you can run multiple jobs from different workspaces at the same time. You can never run multiple jobs from the same workspace at the same time.

Structural features

With a Professional license, you can do the following:

  • Create and configure workspaces, and run data generation for those workspaces.

  • Use Privacy Hub to view the current sensitivity status for your workspace configuration.

  • Grant other users Manager and Editor access to your workspaces. The Professional license does not allow you to assign the built-in Viewer and Auditor permission sets.

  • Make comments on table columns. The comments can trigger email notifications.

  • Run post-job scripts and configure webhooks.

  • Use subsetting to generate a smaller destination database.

  • Create and manage generator presets.

  • Create and manage custom sensitivity rules.

  • Create virtual foreign keys.

  • Use upsert to add destination database records and update existing destination database records, but keep unchanged destination database records in place. The Professional license does not allow you to connect to migration scripts.

  • Use Schema Changes view to view and address both conflicting and non-conflicting changes to the source data schema.

  • Use Structural data encryption to have Structural decrypt source data, encrypt destination data, or both.

  • Request custom value processors, which are primarily developed to preserve encryption that can't be managed using Structural data encryption. You can also purchase custom generators.

The Professional license does NOT provide access to the following features:

  • Workspace inheritance

  • Secrets managers for database connections

  • Audit Trail

  • Privacy Report

  • Custom permission sets

  • Global permission set assignment

Structural API

With a Professional license, you only have access to the basic version of the Structural API.

You cannot use the basic Structural API to perform the following API tasks, which require the advanced API:

  • Assigning table modes to tables

  • Assigning generators to columns

Enterprise license

The Enterprise license is ideal for very large organizations that have multiple teams that support very large and complex data structures, and that might have more requirements related to scale and compliance.

It provides full access to all Structural features.

Users

An Enterprise instance does not limit the number of users.

Data connectors

You can use any number of any of the available data connectors.

The Enterprise license provides exclusive access to the Oracle and Db2 for LUW data connectors.

Structural features

The following features are exclusive to the Enterprise license:

  • Granting Viewer and Auditor access to workspaces

  • Audit Trail

  • Privacy Report

  • Workspace inheritance

  • Secrets managers for database connections

  • Upsert migration service

Structural API

The Enterprise license provides exclusive access to the advanced API.

The advanced Structural API provides access to all of the available API tasks, including the following tasks that are not available in the basic API:

  • Assigning table modes to tables

  • Assigning generators to columns

  • Managing generator presets

  • Managing custom sensitivity rules

Feature comparison across Structural license plans

The following comparison summarizes the available features for each Structural license plan.

  • Number of users:

    • Basic - 1. 2 additional users are available as an add-on.

    • Professional - 10. Unlimited users are available as an add-on.

    • Enterprise - Unlimited.

  • Data connectors:

    • Basic - 1 data connector type, either PostgreSQL or MySQL.

    • Professional - 2 data connector types, with 1 additional type available as an add-on. Any data connector except for Oracle and Db2 for LUW.

    • Enterprise - An unlimited number of any available data connector.

  • Workspace permission sets:

    • Basic - Manager.

    • Professional - Manager, Editor.

    • Enterprise - Manager, Editor, Auditor, Viewer.

  • Custom generators:

    • Basic - Available for purchase.

    • Professional - 2 included. Additional custom generators are available for purchase.

    • Enterprise - Included.

  • Concurrent jobs (more than 1 worker):

    • Basic - Not available.

    • Professional - Available.

    • Enterprise - Available.

  • Structural API:

    • Basic - Basic API only.

    • Professional - Basic API only.

    • Enterprise - Full access, including the advanced API.

Writing output to a container repository

Requires Kubernetes.

For self-hosted Docker deployments, you can install and configure a separate Kubernetes cluster to use. For more information, go to .

For information about required Kubernetes permissions, go to .

Not compatible with upsert.

Not compatible with Preserve Destination or Incremental table modes.

Only supported for PostgreSQL, MySQL, and SQL Server.

You can configure a workspace to write destination data to a container repository instead of to a database server.

When Structural writes data generation output to a repository, it writes the destination data to a container volume. From the list of container artifacts, you can copy the volume digest, and download a Docker Compose file that provides connection settings for the database on the volume. Structural generates the Compose file when you make the request to download it. For more information about getting access to the container artifacts, go to .

You can also use the data volume to start a Tonic Ephemeral database. However, if the data is larger than 10 GB, we recommend that you write the data to an Ephemeral user snapshot instead. For information about writing to an Ephemeral snapshot, go to .

For an overview of writing destination data to container artifacts, you can also view the tutorial video.

Indicating to write destination data to container artifacts

Under Destination Settings, to indicate to write the destination data to container artifacts, click Container Repository.

For a Structural instance that is deployed on Docker, unless you install and configure a separate Kubernetes cluster to use, the Container Repository option is hidden.

You can switch between writing to a database server and writing to a container repository at any time. Structural preserves the configuration details for both options. When you run data generation, it uses the currently selected option for the workspace.

Identifying the base image to use to create the container artifacts

From the Database Image dropdown list, select the image to use to create the container artifacts.

Select an image version that is compatible with the version of the database that is used in the workspace.

Providing a customization file for MySQL

For a MySQL workspace, you can provide a customization file that helps to ensure that the temporary destination database is configured correctly.

To provide the customization details:

  1. Toggle Use customization to the on position.

  2. In the text area, paste the contents of the customization file.
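
A minimal sketch of such a customization file, under the hypothetical assumption that the destination database needs case-insensitive table names and a larger packet limit:

[mysqld]
lower_case_table_names = 1
max_allowed_packet = 256M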

Setting the location for the container artifacts

To provide the location where Structural publishes the container artifacts:

  1. In the Registry field, type the path to the container registry where Structural publishes the data volume.

    Do not include the HTTP protocol, such as http:// or https://.

  2. In the Repository Path field, provide the path within the registry where Structural publishes the data volume.

    For a Google Artifact Registry (GAR) repository, the path format is PROJECT-ID/REPOSITORY/IMAGE.

    For more information about repository and image names, go to the Google Artifact Registry documentation.
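
For example, with that format, a hypothetical project my-project with repository tonic-data and image customers-db uses the following path:

my-project/tonic-data/customers-db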

Providing the credentials to write to the registry

You next provide the credentials that Structural uses to read from and write to the registry.

When you provide the registry, Structural detects whether the registry is from Amazon Elastic Container Registry (Amazon ECR), Google Artifact Registry (GAR), or a different container solution.

It displays the appropriate fields based on the registry type.

Fields for registries other than Amazon ECR or GAR

For a registry other than an Amazon ECR or a GAR registry, the credentials can be either a username and access token, or a secret.

The option to use a secret is not available on Structural Cloud.

In general, the credentials must be for a user that has read and write permissions for the registry.

The secret is the name of a Kubernetes secret that is available to the pod that the Structural worker runs on. The secret type must be kubernetes.io/dockerconfigjson. The Kubernetes documentation provides information on how to create this type of secret.
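
For illustration, one way to create a secret of this type is with kubectl. The secret name, server, credentials, and namespace below are all placeholders:

# Creates a kubernetes.io/dockerconfigjson secret in the worker's namespace
kubectl create secret docker-registry structural-registry-creds \
  --docker-server=registry.example.com \
  --docker-username=REGISTRY_USER \
  --docker-password=REGISTRY_TOKEN \
  --namespace=tonic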

To use a username and access token:

  1. Click Access token.

  2. In the Username field, provide the username.

  3. In the Access Token field, provide the access token.

To use a secret:

  1. Click Secret name.

  2. In the Secret Name field, provide the name of the secret.

Azure Container Registry (ACR) permission requirements

For ACR, the provided credentials must be for a service principal that has sufficient permissions on the registry.

For Structural, the service principal must at least have the permissions that are associated with the.

Providing a service file for GAR

Structural only supports Google Artifact Registry (GAR). It does not support Google Container Registry (GCR).

For a GAR registry, you upload a service account file, which is a JSON file that contains credentials that provide access to Google Cloud Platform (GCP).

The associated service account must have the Artifact Registry Writer role.

For Service Account File, to search for and select the file, click Browse.

Amazon ECR registries

For an Amazon ECR registry, you can do one of the following:

  • Provide the AWS access key and secret key that are associated with the IAM user that will connect to the registry.

  • Provide an assumed role

  • (Self-hosted only) Use the credentials configured in the Structural environment settings TONIC_AWS_ACCESS_KEY_ID and TONIC_AWS_SECRET_ACCESS_KEY.

  • (Self-hosted only) If Structural is deployed in Amazon Elastic Kubernetes Service (Amazon EKS), then you can use the AWS credentials that live on the EC2 instance.

Using AWS access keys

To provide an AWS access key and secret key:

  1. Click Access Keys.

  2. In the AWS Access Key field, enter an AWS access key that is associated with an IAM user or role.

  3. In the AWS Secret Key field, enter the secret key that is associated with the access key.

  4. Optionally, in the AWS Session Token field, enter the session token to use for the connection.

Using an assumed role

To provide an assumed role:

  1. Click Assume Role.

  2. In the Role ARN field, provide the Amazon Resource Name (ARN) for the role.

  3. In the Session Name field, provide the role session name. If you do not provide a session name, then Structural automatically generates a default unique value. The generated value begins with TonicStructural.

  4. In the Duration (in seconds) field, provide the maximum length in seconds of the session. The default is 3600, indicating that the session can be active for up to 1 hour. The provided value must be less than the maximum session duration that is allowed for the role.

For the assumed role, Structural generates the external ID that is used in the assume role request. Your role’s trust policy must be configured to condition on your unique external ID.

Here is an example trust policy:
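
The exact policy depends on your environment. The following sketch uses a placeholder account ID and external ID, and shows only the external ID condition described above:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "AWS": "arn:aws:iam::111122223333:root" },
      "Action": "sts:AssumeRole",
      "Condition": {
        "StringEquals": { "sts:ExternalId": "STRUCTURAL-GENERATED-EXTERNAL-ID" }
      }
    }
  ]
}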

Using the credentials from the environment settings (self-hosted only)

On a self-hosted instance, to use the credentials configured in the environment settings, click Environment Variables.

Using the AWS credentials from the EC2 instance (self-hosted only)

On a self-hosted instance, to use the AWS credentials from the EC2 instance, click Instance Profile.

Required permissions for the IAM user

The IAM user must have permission to list, push, and pull images from the registry. The following example policy includes the required permissions.

For additional security, a repository name filter allows you to limit access to only the repositories that are used in Structural. You need to make sure that the repositories that you create for Structural match the filter.

For example, you could prefix Structural repository names with tonic-. In the policy, you include a filter based on the tonic- prefix:
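
The following sketch is a representative version of such a policy, with a placeholder region and account ID; your exact action list might differ:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "ecr:GetAuthorizationToken",
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "ecr:DescribeRepositories",
        "ecr:ListImages",
        "ecr:BatchCheckLayerAvailability",
        "ecr:GetDownloadUrlForLayer",
        "ecr:BatchGetImage",
        "ecr:InitiateLayerUpload",
        "ecr:UploadLayerPart",
        "ecr:CompleteLayerUpload",
        "ecr:PutImage"
      ],
      "Resource": "arn:aws:ecr:us-east-1:111122223333:repository/tonic-*"
    }
  ]
}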

Providing tags for the container artifacts

In the Tags field, provide the tag values to apply to the container artifacts. You can also change the tag configuration for individual data generation jobs.

Use commas to separate the tags.

A tag cannot contain spaces. Structural provides the following built-in values for you to use in tags:

  • {workspaceId} - The identifier of the workspace.

  • {workspaceName} - The name of the workspace.

  • {timestamp} - The timestamp when the data generation job that created the artifact completed.

  • {jobId} - The identifier of the data generation job that created the artifact.

For example, the following creates a tag that contains the workspace name, job identifier, and timestamp:

{workspaceName}_{jobId}_{timestamp}

To also tag the artifacts as latest, check the Tag as "latest" in your repository checkbox.

Specifying custom resources for the Kubernetes pods

You can also optionally configure custom resource values for the Kubernetes pods. You can specify the ephemeral storage, memory, and CPU millicores.

To provide custom resources:

  1. Toggle Set custom pod resources to the on position.

  2. Under Storage Size:

    1. In the field, provide the number of megabytes or gigabytes of storage.

    2. From the dropdown list, select the unit to use.

    The storage can be between 32MB and 25GB.

  3. Under Memory Size:

    1. In the field, provide the number of megabytes or gigabytes of RAM.

    2. From the dropdown list, select the unit to use.

    The memory can be between 512MB and 4GB.

  4. Under Processor Size:

    1. In the field, provide the number of millicores.

    2. From the dropdown list, select the unit.

    The processor size can be between 250m and 1000m.

Setting a custom database name

Only available for PostgreSQL and SQL Server. Not available for MySQL.

In the Custom Database Name field, provide the name to use for the destination database.

If you do not provide a custom database name, then the destination database uses the same name as the source database.

Setting a custom database user password

In the Custom Password field, provide the password for the destination database user.

If you do not provide a password, then Structural generates a password.

The destination database username is always the default user for the database:

  • For PostgreSQL, postgres

  • For MySQL, root

  • For SQL Server, sa

Configuring the required tolerations for datapacker node taints

If your Kubernetes nodes are configured with taints, then on a self-hosted instance, you can configure the tolerations that enable the datapacker pods to be scheduled on the nodes. The datapacker pod hosts the temporary database that Structural uses during the data generation.

For an overview of taints and tolerations, go to the Kubernetes documentation.

To configure the tolerations, you configure the following environment settings. You can add these settings to the Environment Settings list on Structural Settings. An example configuration appears after the list.

  • CONTAINERIZATION_POD_NODE_TOLERATION_KEY - The toleration key value to apply to the datapacker pods. This setting is required. If you do not configure this setting, then Structural ignores the other settings.

  • CONTAINERIZATION_POD_NODE_TOLERATION_VALUES - A comma-separated list of toleration values to apply to the datapacker pods.

  • CONTAINERIZATION_POD_NODE_TOLERATION_EFFECT - The toleration effect to apply to the datapacker pods.

  • CONTAINERIZATION_POD_NODE_TOLERATION_OPERATOR - The toleration operator to apply to the datapacker pods.
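
For example, if the nodes for these pods carry a hypothetical taint of dedicated=datapacker:NoSchedule, the matching settings would be:

CONTAINERIZATION_POD_NODE_TOLERATION_KEY=dedicated
CONTAINERIZATION_POD_NODE_TOLERATION_VALUES=datapacker
CONTAINERIZATION_POD_NODE_TOLERATION_EFFECT=NoSchedule
CONTAINERIZATION_POD_NODE_TOLERATION_OPERATOR=Equal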

Viewing workspace jobs and job details

Tonic Structural runs the following types of jobs on a workspace:

  • Sensitivity scans, which analyze the source database to identify sensitive data.

  • Collection scans, which analyze the source data for a MongoDB workspace to determine the available fields in each collection, the field types, and how prevalent the fields are.

  • Data generation, data pipeline generation, and containerized generation jobs, which generate the destination data from the source data.

  • Upsert data generation jobs, which generate the intermediate database from the source database.

  • Upsert jobs, which use data from the intermediate database to add new rows to and update changed rows in the destination database. If the migration process is enabled, then it is a step in the upsert job.

  • SDK table statistics jobs. These jobs only run when you use the SDK to generate data in a Spark workspace, and the assigned generators require the statistics.

You can view a list of jobs that ran on the workspace, and view details for individual jobs.

Viewing the list of jobs

The Jobs view displays the list of jobs that ran on the workspace. The list includes the 100 most recent jobs.

To display the Jobs view:

  • On the workspace management view, in the workspace navigation bar, click Jobs.

  • On Workspaces view, from the dropdown menu in the Name column, select Jobs.

Information in the job list

For each job, the job list includes the following information:

  • Job ID - The identifier of the job. To copy the job identifier, click the icon at the left of the row.

  • Type - The type of job.

  • Status - The current status of the job, and how long ago the job reached that status. When you hover over the status, a tooltip displays the actual timestamp for the status change, and the length of time that the job ran. For queued jobs, to display a panel with information about why the job is queued, click the status value.

  • Submitted - The date and time when the job was submitted.

  • Completed - The date and time when the job finished running.

Job statuses

A job can have one of the following statuses:

  • Queued - The job is queued to run, but has not yet started. A job is queued for one of the following reasons:

    • Another job is currently running on the same workspace. For example, you cannot run a sensitivity scan and a data generation, or multiple data generations, at the same time on the same workspace. This is true regardless of the number of workers on the instance. On Structural Cloud, there is also a limit on the number of concurrent running jobs for each organization. When that maximum is reached, a new job remains queued until a current running job completes.

    • There isn't an available worker on the instance to run the job. A Structural instance with one worker can only run one job at a time. If a job from one workspace is currently running, a job from another workspace cannot start until the first job is finished.

    To view information about why a job is queued, click the status value.

  • Running - The job is in progress.

  • Canceled - The job is canceled.

  • Completed - The job completed successfully.

  • Failed - The job failed to complete.

Each of these statuses has a corresponding "with warnings" status. For example, Running with warnings, Completed with warnings. A "with warnings" status indicates that the job had at least one warning at the time of the request.

Filtering the job list

You can filter the list by either the type or the status.

To filter the list by the job type:

  1. Click the filter icon in the Type column heading. By default, all types are included, and none of the checkboxes are checked.

  2. To only include specific types of jobs, check the checkbox next to each type to include. Checking all of the checkboxes has the same effect as unchecking all of the checkboxes.

To filter the list by the job status:

  1. Click the filter icon in the Status column heading. The status panel displays all of the statuses that are currently in the list. For example, if there are no Queued jobs, then the Queued status is not in the list. By default, all of the statuses are included, and none of the checkboxes are checked.

  2. To only include jobs that have specific statuses, check the checkbox next to each status to include. Checking all of the checkboxes has the same effect as unchecking all of the checkboxes.

Sorting the job list

You can sort the jobs by either the submission or completion timestamp.

To sort by submission date, click the Submitted column heading. To reverse the sort order, click the heading again.

To sort by completion date, click the Completed column heading. To reverse the sort order, click the heading again.

Viewing details for a selected job

For jobs other than Queued jobs, you can display details about the workspace and the job progress.

From the Jobs view, to display the details for a job, click the job row.

Workspace information

The left side of the job details view contains the workspace information.

For a sensitivity scan, the workspace information is limited to the owner, database type, and worker version.

For a data generation job, the workspace information also includes:

  • Whether subsetting, post-job scripts, or webhooks are used.

  • The number of schemas, tables, and columns in the source database.

  • The number of schemas, tables, and columns in the destination database.

Job Log

The Job Log tab shows the start date, start time, and duration of the job, followed by the list of job process steps.

Privacy Report

For data generation jobs, the Privacy Report tab displays the number of at-risk, protected, and not sensitive columns in the source database.

At-risk columns contain sensitive data, but still have Passthrough as the assigned generator.

Protected columns have an assigned generator other than Passthrough.

Not sensitive columns have Passthrough as the assigned generator, but do not contain sensitive data.

Ephemeral output details

A workspace can write its output to a Tonic Ephemeral snapshot, with an option to preserve the temporary Ephemeral database that is used to create the snapshot.

For data generation jobs that write to Ephemeral, the Data available in Tonic Ephemeral panel displays. It contains a link to Ephemeral, and access to either the snapshot or the database.

Snapshot details

When the temporary database is not preserved, the Data available in Tonic Ephemeral panel provides access to the snapshot.

To navigate to Ephemeral and view the details for an Ephemeral snapshot, click View Snapshot in Tonic Ephemeral.

Database connection details

When the temporary database is preserved, the Data available in Tonic Ephemeral panel provides access to the database.

To display the connection details for the Ephemeral database, click View connection info.

For an Ephemeral database, the connection details include the database location and credentials. Each field contains a copy icon to allow you to copy the value.

Copying the job identifier

The job identifier is a unique identifier for the job. To copy the job identifier, either:

  • From the Jobs view, click the copy icon in the leftmost column.

  • From the job details view, click the copy icon next to the job identifier.

Canceling a job

You can cancel Queued or Running jobs.

For jobs with those statuses, the rightmost column in the job list contains a cancel icon.

To cancel the job, click the icon.

Downloading job information

For workspaces that are configured to write destination data to a container repository, the Jobs view also provides access to the generated artifacts. For more information, go to .

Job logs

Required workspace permission: Download job logs

To download diagnostic logs, you must have the Enable diagnostic logging global permission.

For all jobs, the job logs provide detailed information about the job processing. Tonic.ai support might request the job logs to help diagnose issues.

For a failed data generation to Ephemeral, the job logs include the Ephemeral logs and the destination database pod logs.

For upsert jobs where the migration process is enabled, and you configured the GET Schema Change Logs endpoint, the upsert job logs include the migration process logs.

Where to download the job logs

You can download the job logs from the Jobs view or the job details view. The download includes up to 1MB of log entries.

On the Jobs view, to download the logs for a job, click the download icon in the rightmost column.

On the job details view, to download the logs for a job, click Reports and Logs, then select Job Logs.

Downloading diagnostic logs

By default, Structural redacts sensitive values from the job logs. To help support troubleshooting, you can configure data connectors or an individual data generation job to create unredacted versions of the log files, referred to as diagnostic logs. For more information, go to .

To access diagnostic log files, you must have the Enable diagnostic logging global permission.

If you do not have the Enable diagnostic logging global permission, then you cannot download the logs for that job. The download option is disabled.

Privacy Report for data generation

Required workspace permission: View and download Privacy Report

From the job details view, you can download a Privacy Report file that provides an overview of the current protection status of the database columns based on the workspace configuration at the time that the job ran.

You can download either:

  • The Privacy Report .csv file, which provides details about the table columns, the column content, and the current protection configuration.

  • The Privacy Report PDF file, which provides charts that summarize the privacy ranking scores for the table columns. It also includes the table from the .csv file.

To display the download options, click Reports and Logs. In the menu:

  • To download the Privacy Report .csv file, click Privacy Report CSV.

  • To download the Privacy Report PDF file, click Privacy Report PDF.

For more information about the Privacy Report files and their content, go to Using the Privacy Report to verify data protection.

Additional logs for output to a container repository

For a workspace that writes the output to a container repository, the job includes the following additional logs:

  • Database logs - Logs for the database container that is used as the destination.

  • Datapacker logs - Logs for creating the OCI artifact and uploading it to an OCI registry.

To download these logs for a data generation job, on the job details view, click Reports and Logs, then select Database Logs or Datapacker Logs.

CloudWatch logs for data generation

For workspaces that are connected to Amazon Redshift or Snowflake on AWS databases, the data generation job requires multiple calls to a Lambda function. For these data generation jobs, the CloudWatch logs monitor the progress of and display errors for these Lambda function calls.

To download the CloudWatch logs for a data generation job, on the job details view, click Reports and Logs, then select CloudWatch Logs.

The CloudWatch Logs option only displays for Amazon Redshift and Snowflake on AWS data generation jobs.

Oracle SQL Loader log files

Required workspace permission: Download SqlLdr Files

For an Oracle data generation, if both of the following are true:

  • The data generation job ran SQL Loader (sqlldr).

  • sqlldr either failed or succeeded with errors.

Then to download the sqlldr log files, click Reports and Logs, then select sqlldr Logs.

Transformed files for file connector data generation

For a data generation from a file connector workspace that uses local files, you can download the transformed files for that job.

The download is a .zip file that contains the files for a selected file group.

On the job details view, when files are available to download, the Data available for file groups panel displays.

To download the files for a file group:

  1. Click Download Results.

  2. From the list, select the file group. Use the filter field to filter the list by the file group name.

Performance metrics for data generation

Required workspace permission: Download job logs

For workspaces that use the newer data generation processing, users can configure a data generation job to also generate performance metrics. This is usually done for troubleshooting purposes.

On the job details view, to download the performance metrics for the job, click Reports and Logs, then click Performance Metrics.

Viewing a Gantt chart of a data generation job flow

This feature is currently in beta.

From the job details view, you can display a Gantt chart that shows the flow of a data generation job over time. The chart can help you to understand the different steps of a data generation job and how long it takes Structural to complete each step.

Note that this option is only available for data generation jobs that use the newer data generation process. For more information, go to Determining the data generation process to use (Oracle, SQL Server, MySQL only). Data generation jobs that use the older process do not produce the Gantt chart.

To display the chart, click Reports and Logs, then select View Gantt.

The Job Visualization page displays the Gantt chart of the job progress.

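The following example policies relate to granting Structural access to an Amazon ECR container repository for workspaces that write output to a container repository. The first example is a role trust policy that allows the role to be assumed from the originating AWS account, restricted by an external ID:
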
{
  "Version": "2012-10-17",
  "Statement": {
    "Effect": "Allow",
    "Principal": {
      "AWS": "<originating-account-id>"
    },
    "Action": "sts:AssumeRole",
    "Condition": {
      "StringEquals": {
        "sts:ExternalId": "<external-id>"
      }
    }
  }
}
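
The second example is an IAM permissions policy that grants the ECR actions used to read from and push images to the repository. The ecr:GetAuthorizationToken action must apply to all resources:
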
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "ManageTonicRepositoryContents",
      "Effect": "Allow",
      "Action": [
        "ecr:DescribeRepositories",
        "ecr:ListImages",
        "ecr:DescribeImages",
        "ecr:BatchGetImage",
        "ecr:BatchCheckLayerAvailability",
        "ecr:InitiateLayerUpload",
        "ecr:UploadLayerPart",
        "ecr:CompleteLayerUpload",
        "ecr:PutImage"
      ],
      "Resource": [
        "arn:aws:ecr:<region>:<account_id>:repository/<optional name filter>"
      ]
    },
    {
      "Sid": "GetAuthorizationToken",
      "Effect": "Allow",
      "Action": [
        "ecr:GetAuthorizationToken"
      ],
      "Resource": "*"
    }
  ]
}
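
In the Resource value, the optional name filter restricts the permissions to specific repositories. For example, to restrict access to repositories with names that start with tonic-:
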
"Resource": [
  "arn:aws:ecr:<region>:<account_id>:repository/tonic-*"
]
Indicating whether a column is sensitive

Privacy Hub

About Privacy Hub

Privacy Hub tracks the current protection status of source data columns based on:

  • Column sensitivity, either from the most recent sensitivity scan or from manual assignments

  • Assigned table modes

  • Assigned generators

Privacy Hub

To display Privacy Hub, either:

  • On the workspace management view, in the workspace navigation bar, click Privacy Hub.

  • On Workspaces view, click the workspace name.

From Privacy Hub, you can:

  • Review and apply the recommended generators for all detected sensitive columns

  • View the current protection status of columns

  • Manually mark columns as sensitive or not sensitive

  • Configure protection for sensitive columns

  • Download a preview Privacy Report

  • Run a new sensitivity scan

You can also track the history of changes to column sensitivity and the assigned column generators. For more information, go to Tracking changes to workspaces, generator presets, and sensitivity rules.

Viewing the count of detected sensitive columns that are not protected

The sensitivity scan detects specific types of sensitive data.

If your workspace contains any columns that the sensitivity scan identified, and for which you have not either:

  • Assigned a generator

  • Marked the column as not sensitive

Then Tonic Structural displays a Sensitivity Recommendations banner that contains a count of those columns.

Sensitivity Recommendations banner on Privacy Hub

The count only includes sensitive columns that the sensitivity scan detects. If you manually mark a column as sensitive, it is not included in the list.

On the banner, the Review Recommendations option allows you to review the detected columns and the recommended generators for each detected sensitive data type.

You can then apply the recommended generators or ignore the recommendations. When you ignore a recommendation, you either:

  • Indicate to remove the generator recommendation for the column.

  • Indicate that the column data is not sensitive.

For more information, go to Reviewing and applying recommended generators.

Viewing the protection status for each column

The protection status panels at the top of Privacy Hub provide an overview of the current protection status of the columns in the source data.

Protection status panels

Each panel displays:

  • The number of columns that are in that category.

  • The estimated percentage of columns that are in that category.

Note that for a JSON column that uses Document View, the protection status displays a separate box for each combination of JSON path and data type.

From each panel, you can display details for and configure protection for each column.

The column counts do not include columns that do not have data in the destination database. For example, if a table is assigned Truncate table mode, then Privacy Hub ignores the columns in that table.

The information on these panels updates automatically as you change whether columns are sensitive and assign generators to columns.

At-Risk Columns

The At-Risk Columns panel reflects columns that:

  • Are populated in the destination database.

  • Are marked as sensitive.

  • Have the generator set to Passthrough, which indicates that Structural does not perform any transformation on the data.

For each column, the At-Risk Columns panel also indicates the sensitivity confidence, from full confidence (completely red) to low confidence (a small percentage of red).

The goal is to have 0 at-risk columns.

When you click Open in Database View, you navigate to Database View. The column list is filtered to show columns that are at risk.

Protected Columns

The Protected Columns panel reflects columns that:

  • Are populated in the destination database.

  • Are assigned a generator other than Passthrough.

It includes both sensitive and non-sensitive columns.

Note that a column is considered protected based solely on the assigned generator. Some more complex generators, such as JSON Mask or Conditional, allow you to apply different generators to specific portions of a value or based on a specific condition. However, the protection status does not reflect these sub-generators. An applied sub-generator could be Passthrough.

When you click Open in Database View, you navigate to Database View. The column list is filtered to show all included columns that are protected.

Not Sensitive Columns

The Not Sensitive Columns panel reflects columns that:

  • Are populated in the destination database.

  • Are marked as not sensitive.

  • Have the generator set to Passthrough.

When you click Open in Database View, you navigate to Database View. The column list is filtered to show included columns that are not sensitive and are not protected.

Viewing the protection status for each table

The Database Tables list shows the protection status for each table in the source database. You can view the number of columns that have each protection status, and update the column configuration.

The list does not include tables where the table mode is Truncate or Preserve Destination. Truncated tables are not populated in the destination database. For Preserve Destination tables, the existing data in the destination database does not change.

Information in the list

For each table, Database Tables provides the following information:

  • Name - The table name. For a file connector workspace, each table corresponds to a file group. Each JSON column that uses Document View is also in a separate row. For JSON columns, the Name column displays both the table name and the column name.

  • Not Sensitive - The number of not sensitive columns in the table. Not sensitive columns are not marked as sensitive and have Passthrough as the generator. When you click the value, you navigate to Database View, filtered to display the not sensitive columns for the table.

  • Protected - The number of protected columns in the table. Protected columns have an assigned generator. A protected column can be either sensitive or not sensitive. When you click the value, you navigate to Database View, filtered to display the protected columns for the table.

  • At-Risk - The number of at-risk columns in the table. These columns are marked as sensitive, but have Passthrough as the generator. The goal is to have 0 unprotected sensitive columns. When you click the value, you navigate to Database View, filtered to display the at-risk columns for the table.

  • Privacy Status - Indicates the current protection status of the columns in the table. It provides the same view and configuration options as the protection status panels at the top of Privacy Hub.

Filtering the list

You can filter the Database Tables list either by the table name or by the schema.

Filtering by table name

To filter the list by table name, in the filter field, begin to type text that is in the table name. As you type, Structural updates the list to only display matching tables.

Filtering by schema

To filter the list to only include tables that belong to a specific schema:

  1. Click Filter by Schema.

  2. From the schema dropdown list, select the schema.

When you select a schema, Structural adds it to the filter field.

Sorting the list

You can sort the Database Tables list by any column except for the Privacy Status column.

To sort by a column, click the column heading. To reverse the sort order, click the heading again.

Managing columns from the table list

The Privacy Status column in the Database Tables list indicates the protection status of the columns in the table.

This column provides the same options to view and configure columns as the protection status panels at the top of Privacy Hub, but is limited to the columns in a specific table.

Viewing and configuring columns

Navigating through columns and viewing column details

Each protection status panel displays a series of boxes to represent the columns that apply to that status. For example, if the source data contains four columns that are at-risk, then the At-Risk Columns panel displays four boxes, one for each column.

The Privacy Status column in the Database Tables list displays the same set of boxes for the columns in an individual table.

If the number of columns is too large to fit, then the last box shows the number of additional columns that apply. For example, if there are 15 columns that don't fit, then the last box is labeled +15.

When you hover over a box, the column name displays in a tooltip.

When you click a box, the details panel for that column displays.

Settings view of column details panel

When you click the box for the remaining columns, the details panel for the first of those columns displays.

You can use the next and previous icons at the bottom right of the details panel to display the details for the next or previous column.

The column details panel opens to the settings view. The settings view contains the following information:

  • The table and column name.

  • Whether the column is flagged as sensitive.

  • The type of sensitive data that the column contains.

  • The data type for the column data.

  • The generator that is assigned to the column.

  • For a child workspace, whether the column configuration is inherited from the parent workspace. For columns that have overrides, you can reset to the parent configuration.

Indicating whether a column is sensitive

Required workspace permission: Configure column sensitivity

From the settings view of the column details, you can configure the column sensitivity.

You cannot change the sensitivity of columns in a child workspace. A child workspace always inherits the sensitivity from its parent workspace. For more information, go to About workspace inheritance.

As you change the column sensitivity, Structural updates the protection status panels.

To change whether the column is sensitive, toggle the Sensitive option. The column is moved if needed to reflect its new status. However, you remain on the current panel.

For example, from the At-Risk Columns panel, you change a column to be not sensitive. The column is moved to the Not Sensitive Columns panel. When you click the next or previous icons, you view the details for the next or previous column on the At-Risk Columns panel.

Selecting and configuring a generator for the column

Required workspace permission: Configure column generators

From the column details, you can assign and configure the column generator.

When you change the column generator, Structural updates the protection status panels.

If the column generator was previously Passthrough, then the column is moved to the Protected Columns panel. However, you remain on the current panel. For example, you assign a generator to a column that is on the At-Risk Columns panel. The column is moved to the Protected Columns panel, but when you click the next or previous icons, you view the details for the next or previous column on the At-Risk Columns panel.

Selecting the generator

For sensitive columns that are not protected, Structural displays the recommended generator as a button.

For self-hosted instances that have an Enterprise license, the recommended generator is the built-in generator preset.

To assign the recommended generator to the column, click the button.

Otherwise, select the generator from the Generator Type dropdown list.

For more information about selecting a generator, go to Assigning and configuring generators.

Configuring the generator

If the selected generator requires additional configuration, then below the Generator Type dropdown list is an Edit Generator Options link.

Column details panel with generator selected

To display the configuration fields for the generator, click Edit Generator Options.

Configuration options for a selected generator

For information about configuring a selected generator or generator preset, go to Assigning and configuring generators.

After you configure the generator, to return to the settings view, click Back.

Displaying sample data for a column

Required workspace permission:

  • Source data: Preview source data

  • Destination data: Preview destination data

From the column details, you can display sample data for the column. The sample data allows you to compare the source and destination versions of the column values.

To display the sample data, click the view sample (magnifying glass) icon.

On the sample data view of the column details:

  • The Original Data tab shows the values in the source data.

  • The Protected Output tab shows the values that the generator produced.

Sample data view on the column details panel

Enabling Document View for JSON columns

Supported only for the file connector and PostgreSQL.

For a JSON column, instead of assigning a generator, you can enable Document View.

From Document View, you can view the JSON schema structure and assign generators to individual JSON fields. For more information, go to Document View for JSON columns.

To enable Document View, on the column details panel, toggle Use Document View to the on position. When Document View is enabled, the generator dropdown is replaced with the Open in Document View option.
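
As a purely hypothetical illustration, for a JSON column that contains values such as the following, Document View would let you assign one generator to the name field and a different generator to the email field, while leaving plan untouched:

{
  "customer": {
    "name": "Ana Diaz",
    "email": "ana.diaz@example.com",
    "plan": "enterprise"
  }
}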

Commenting on a column

Required license: Professional or Enterprise

From the column details, you can view and add comments on the column. You might use a comment to explain why you selected a particular generator or marked a column as sensitive or not sensitive.

From the column details, to display the comments for the column, click the comment icon.

The comments view displays any existing comments on the column. The most recent comment is at the bottom of the list. Each comment includes the name of the user who made the comment.

To add the first comment to a column, type the comment into the comment text area, then click Comment.

To add an additional comment, type the comment into the comment text area, then click Reply.

Comment view of the column details panel

Downloading a preview Privacy Report

Required license: Enterprise

The Privacy Report files that you download from Privacy Hub or the workspace download menu provide an overview of the current protection status based on the current configuration.

This is different from the Privacy Report files that you download from the data generation job details, which show the protection status for the data produced by that data generation.

You can download either:

  • The Privacy Report .csv file, which provides details about the table columns, the column content, and the current protection configuration.

  • The Privacy Report PDF file, which provides charts that summarize the privacy ranking scores for the table columns. It also includes the table from the .csv file.

For more information about the Privacy Report files and their content, go to Using the Privacy Report to verify data protection.

From workspace management view

To download the report from the workspace management view, click the download icon. In the download menu:

Download menu for a workspace
  • To download the Privacy Report PDF file, click Download Privacy Report PDF.

  • To download the Privacy Report .csv file, click Download Privacy Report CSV.

From Privacy Hub

To download the report from Privacy Hub, click Reports and Logs, then:

Reports and Logs menu on Privacy Hub
  • To download the Privacy Report .csv file, click Privacy Report CSV.

  • To download the Privacy Report PDF file, click Privacy Report PDF.

Running a new sensitivity scan on the data

Required workspace permission: Run sensitivity scan

Privacy Hub provides an option to manually start a new sensitivity scan. For example, you might want to run a new sensitivity scan when:

  • You add columns to the source database. The new scan identifies whether the new columns contain sensitive data.

  • The data in a column changes significantly, and a column that Structural originally marked as not sensitive might now contain sensitive data.

You cannot run a sensitivity scan on a child workspace. Child workspaces always inherit the sensitivity results from their parent workspace.

To run a new sensitivity scan, click Run Sensitivity Scan.

Buttons at the top of Privacy Hub

When Structural runs a new sensitivity scan:

  • Structural analyzes and determines the sensitivity of any new columns.

  • It does not change the sensitivity of existing columns that you marked as sensitive or not sensitive.

  • For existing columns that you did not change the sensitivity of:

    • Structural does not change the sensitivity of columns that the original scan marked as sensitive.

    • It can change the sensitivity of columns that the original scan marked as not sensitive.

The protection status panels are updated to reflect the results of the new scan.

Getting started with the Structural free trial

If you are a user who wants to set up an account in an existing Tonic Structural Cloud or self-hosted organization, go to Creating a new account in an existing organization.

About the Structural free trial

The Structural 14-day free trial allows you to explore and experiment in Structural Cloud before you decide whether to purchase Structural.

When you sign up for a free trial, Structural automatically creates a sample workspace for you to use. You can also create a workspace that uses your own database or files.

The free trial provides tools to introduce you to Structural and to guide you through configuring and completing a data generation.

Structural tracks and displays the amount of time remaining in your free trial. You can request a demonstration and contact support.

When the free trial period ends, you can continue to use Structural to configure workspaces. You can no longer generate data or train models. Contact Tonic.ai to discuss purchasing a Structural license, or select the option to start a Structural Cloud pay-as-you-go subscription.

Signing up for the free trial

To start a new free trial of Structural:

  1. Go to app.tonic.ai.

  2. Click Create Account.

On the Create your account dialog, to create an account, either:

  • To use a corporate Google email address to create the account, click Create account using Google.

  • To create a new Structural account:

    1. Enter your email address. You cannot use a public email address for a free trial account.

    2. Create and confirm a Structural password.

    3. Click Create Account.

Structural sends an activation link to your email address.

After you activate your account and log in, Structural next prompts you to select the use case that best matches why you are exploring Structural.

If none of the provided use cases fits, use the Other option to tell us about your use case.

After you select a use case, click Next. The Create Your Workspace panel displays.

Determining whether to use your own data

When you sign up for a free trial, Structural provides access to a sample PostgreSQL workspace that you can use to explore how to configure and run data generation.

You can also choose to create a workspace that uses your own data, either from local files or from a database.

If you do connect to your own data, then you must allowlist the Structural static IP addresses. For more information, go to "I allowlist access to my database. What are your static IP addresses?"

On the Create your workspace panel:

  • To use the sample workspace, click Use a sample workspace, then click Next. Structural displays Privacy Hub, which summarizes the protection status for the source data. It also displays the Getting Started Guide panel and the quick start checklist.

  • To create a workspace that uses local files as the source data, click Upload Files, then click Next. Go to Uploading files.

  • To create a new workspace that uses your own data, click Bring your own data, then click Next. Go to Connecting to a database.

Uploading files

The Upload files option creates a local files workspace. The source data consists of groups of files selected from a local file system. The files in a file group must have the same type and structure. Each file group becomes a "table" in the source data.

For other workspaces that you create during the free trial, you can also create a file connector workspace that uses files from cloud storage (Amazon S3 or Google Cloud Storage).

After you select Upload files and click Next, you are prompted to provide a name for the workspace.

In the field provided, enter the name to use for the workspace, then click Next.

Structural displays the File Groups view, where you can set up the file groups for the workspace.

It also displays the Getting Started Guide panel with links to resources to help you get started.

After you create at least one file group, you can start to use the other Structural features and functions.

Connecting to a database

If you connect to your own data, then you must allowlist the Structural static IP addresses. For more information, go to "I allowlist access to my database. What are your static IP addresses?"

Provide a name for your workspace

If you choose to create a workspace with your own data, then the first step is to provide a name for the workspace.

In the field provided, enter the name to use for your first workspace, then click Next.

The Invite others to Tonic panel displays.

Invite other users to Structural and your workspace

Under Invite others to Tonic, you can optionally invite other users with the same corporate email domain to start their own Structural free trial. The users that you invite are able to view and edit your workspace.

For example, you might want to invite other users if you don't have access to the connection information for the source data. You can invite a user who does have access. They can then update the workspace configuration to add the connection details.

To continue without inviting other users, click Skip this step.

To invite users:

  1. For each user to invite, enter the email address, then press Enter. The email addresses must have the same corporate email domain as your email address.

  2. After you create the list of users to invite, click Next.

The Add source data connection view displays.

Supported databases for free trial workspaces

The final step in the workspace creation is to provide the source data to use for your workspace.

Structural provides data connectors that allow you to connect to an existing database. Each data connector allows you to connect to a specific type of database. Structural supports several types of application databases, data warehouses, and Spark data solutions.

For the first workspace that you create using the free trial wizard, you can choose Google BigQuery, MongoDB, MySQL, PostgreSQL, Snowflake on AWS, Snowflake on Azure, or SQL Server.

For subsequent workspaces that you create from Workspaces view, you can also choose Databricks, Salesforce, and Amazon DynamoDB.

Selecting the database type

To connect to an existing database, on the Add source data connection panel, click the data connector to use, then click Add connection details.

The panel also includes a Local files option, which creates a local files file connector workspace, the same as the Upload files option.

Use the connection details fields to provide the connection information for your source data. The specific fields depend on the type of data connector that you select.

After you provide the connection details, to test the connection, click Test Connection.

To save your workspace, click Save.

Structural displays Privacy Hub, which summarizes the protection status for the source data.

It also displays the Getting Started Guide panel with links to resources to help you get started.

Free trial resources

The Structural free trial includes a couple of resources to introduce you to Structural and to guide you through the tasks for your first data generation.

Getting Started Guide panel

The Getting Started Guide panel provides access to Structural information and support resources.

The Getting Started Guide panel displays automatically when you first start the free trial. To display the Getting Started Guide panel manually, in the Structural heading, click Getting Started.

The Getting Started Guide panel provides links to Structural instructional videos and this Structural documentation. It also contains links to request a Structural demo, contact Tonic.ai support, and purchase a Structural Cloud pay-as-you-go subscription.

Quick start checklist

For each free trial workspace, Structural provides access to a workspace checklist.

The checklist displays at the bottom left of the workspace management view. It displays automatically when you display the workspace management view. To hide the checklist, click the minimize icon. To display the checklist again, click the checklist icon.

The checklist provides a basic list of tasks to perform in order to complete a Structural data generation.

Each checklist task is linked to the Structural location where you can complete that task. Structural automatically detects and marks when a task is completed.

The checklist tasks are slightly different based on the type of workspace.

Checklist for database-based workspaces

For workspaces that are connected to a database, including the sample PostgreSQL workspace and workspaces that you connect to your own data, the checklist contains:

  1. Connect a source database - Set the connection to the source database. In most cases, you set the source connection when you create the workspace. When you click this step, Structural navigates to the Source Settings section of the workspace details view.

  2. Connect to destination database - Set the location where Structural writes the transformed data. When you click this step, Structural navigates to the Destination Settings section of the workspace details view.

  3. Apply generators to modify dataset - Configure how Structural transforms at least one column in the source data. When you click this step:

    • If there are available generator recommendations, then Structural navigates to Privacy Hub and displays the generator recommendations panel.

    • If there are no available generator recommendations, then Structural navigates to Database View.

  4. Generate data - Run the data generation to produce the destination data. When you click this item, Structural navigates to the Confirm Generation panel.

Checklist for local file workspaces

For workspaces that use data from local files, the checklist contains:

  1. Create a file group - Create a file group with files that you upload from a local file system. Each file group becomes a table in the workspace. When you click this step, Structural navigates to the File Groups view for the workspace.

  2. Apply generators to modify dataset - Configure how Structural transforms at least one column in the source files. When you click this step:

    • If there are available generator recommendations, then Structural navigates to Privacy Hub and displays the generator recommendations panel.

    • If there are no available generator recommendations, then Structural navigates to Database View.

  3. Generate data - Run the data generation to produce transformed versions of the source files. When you click this step, Structural navigates to the Confirm Generation panel.

  4. Download your dataset - Download the transformed files from the Structural application database.

Checklist for cloud storage file workspaces

For workspaces that use data from files in cloud storage (Amazon S3 or Google Cloud Storage), the checklist contains:

  1. Configure output location - Configure the cloud storage location where Structural writes the transformed files. When you click this step, Structural navigates to the Output location section of the workspace details view.

  2. Create a file group - Create a file group that contains files selected from cloud storage. When you click this step, Structural navigates to the File Groups view for the workspace.

  3. Apply generators to modify dataset - Configure how Structural transforms at least one column in the source data. When you click this step:

    • If there are available generator recommendations, then Structural navigates to Privacy Hub and displays the generator recommendations panel.

    • If there are no available generator recommendations, then Structural navigates to Database View.

  4. Generate data - Run the data generation to produce transformed versions of the source files. When you click this step, Structural navigates to the Confirm Generation panel.

Next step hints

In addition to the workspace checklists, Structural uses next step hints to help guide you through the workspace configuration and data generation.

When a next step hint is available, it displays as an animated marker next to the suggested next action.

When you hover over the highlighted action, Structural displays a help text popup that explains the recommended action.

When you click the highlighted action, the hint is removed, and the next hint is displayed.

Creating a file group

For a file connector workspace, to identify the source data, you create file groups. A file group is a set of files of the same type and with the same structure. Each file group becomes a table in the workspace. For CSV files, each column becomes a table column. For XML and JSON file groups, the table contains a single XML or JSON column.

On the File Groups view, click Create File Group.

Uploading local files

For a file connector workspace that uses local files, you can either drag and drop files from your local file system to the file group, or you can search for and select files to add. For more information, go to Selecting local files.

Selecting files from cloud storage

For a file connector workspace that uses cloud storage, you select the files to include in the file group. For more information, go to Selecting cloud storage or file mount files.

Configuring file delimiters and settings

For files that contain CSV content, you configure the delimiters and other file settings. For more information, go to Configuring delimiters and file settings for .csv files.

Assigning a generator

To get value out of the data generation process, you assign generators to the data columns.

A generator indicates how to transform the data in a column. For example, for a column that contains a name value, you might assign the Name generator, which indicates how to generate a replacement name in the generation output.

Applying all recommendations

For sensitive columns that Structural detects, Structural can also provide a recommended generator configuration.

When there are recommendations available, Privacy Hub displays a link to review all of the recommendations.

The Recommended Generators by Sensitivity Type panel displays a list of sensitive columns that Structural detected, along with the suggested generators to apply.

After reviewing, to apply all of the suggested generators, click Apply All. For more information about using this panel, go to Reviewing and applying recommended generators.

Selecting a generator

You can also choose to apply an individual generator manually. You can do this from Privacy Hub, Database View, or Table View.

To display Database View, on the workspace management view, click Database View.

On Database View, in the column list, the Applied Generator column lists the currently assigned generator for each column. For a new workspace, the columns are all assigned the Passthrough generator. The Passthrough generator simply passes the source value through to the destination data without masking it.

Click a column that is marked as Passthrough, and that is not marked as sensitive. For example, in the sample workspace, the customers.Last_Transaction column. The column configuration panel displays. To select a generator, click the generator dropdown. The list contains generators that can be assigned to the column based on the column data type. For customers.Last_Transaction, the Timestamp Shift generator is a good option.

Assigning a recommended generator

For Passthrough columns that Structural identified as containing sensitive data, the Applied Generator column displays an icon to indicate that there is a recommended generator.

In Database View, click one of those columns. For example, in the sample workspace, the customers.Email column is marked as containing an email address.

For customers.Email, click the generator dropdown. Instead of the column configuration panel, there is a panel that indicates the recommended generator. For customers.Email, the recommended generator is Email. To assign the Email generator, click Apply. The column configuration panel displays with the generator assigned.

Configuring the destination location

To run a data generation, Structural must have a destination for the transformed data.

For a local files workspace, Structural saves the transformed files to the application database.

For workspaces that use data from a database, and for workspaces that use cloud storage files, you configure where Structural writes the output data.

Available output options

The destination location for data generation output can be one of the following:

  • If the data connector supports Tonic Ephemeral, then the default option is to write the output data to Ephemeral.

  • For database-based data connectors, you can write the transformed data to a destination database.

  • For some Structural data connectors, Structural can write the transformed data to a data volume in a container repository.

  • For file connector workspaces that transform files from cloud storage (Amazon S3 or Google Cloud Storage), you configure the cloud storage location where Structural writes the transformed files.

Displaying the current destination configuration

To display the destination configuration for the workspace:

  1. Click the Workspace Settings tab.

  2. Scroll to the Destination Settings section or, for a file connector workspace that uses cloud storage files, scroll to the Output location section.

Confirming or changing the destination configuration

Ephemeral snapshot

For data connectors that Ephemeral supports, the default option is to write the output to Ephemeral.

For the Ephemeral option, the default configuration is:

  • Structural writes the output to Ephemeral Cloud. If you do not have an Ephemeral Cloud account, then we create an Ephemeral free trial account for you. If your organization has a self-hosted Ephemeral instance, then you can choose to write the output to that instance. Note that all workspaces in the same organization or for the same self-hosted Structural instance must use the same Ephemeral instance.

  • Structural uses the output data to create an Ephemeral user snapshot. You can use the user snapshot to create Ephemeral databases.

  • When Structural creates the user snapshot in Ephemeral, it creates a temporary Ephemeral database to use as the basis for the user snapshot. There is an option to keep that temporary database. For a free trial workspace, this option is enabled by default. The database expires after 48 hours.

For details about how to configure Structural to write output to Ephemeral, go to Writing output to Tonic Ephemeral. For more information about Ephemeral, go to the Ephemeral documentation.

Destination database

To write the data to a destination database, click Database Server. Structural displays the configuration fields for the destination database.

For information on how to configure the destination information for a specific data connector, go to the workspace configuration information for that data connector. The data connector summary contains a list of the available data connectors, and provides a link to the documentation for each data connector.

Container repository

To write the data to a data volume in a container repository, click Container Repository. Structural displays the configuration fields to select a base image and provide the details about the repository.

For more information, go to Writing output to a container repository.

Cloud storage files output location

For a file connector workspace that uses files from cloud storage (Amazon S3 or Google Cloud Storage), you configure the cloud storage output location where Structural writes the transformed files. The configuration includes the required credentials to use.

For more information, go to Configuring the file connector storage type and output options.

Running data generation

After you complete the workspace and generator configuration, you can run your first data generation.

The data generation process uses the assigned generators to transform the source data. It writes the transformed data to the configured destination location.

For a local files workspace, it writes the files to the Structural application database.

Starting the generation

The Generate Data option is at the top right of the Tonic heading.

When you click Generate Data, Structural displays the Confirm Generation panel.

The Confirm Generation panel provides access to the current destination configuration, along with other advanced generation options such as subsetting and upsert.

It also indicates if there are any issues that prevent you from starting the data generation. For example, if the workspace does not have a configured destination, then Structural cannot run the data generation.

To start the data generation, click Run Generation. For more information about running data generation, go to Running data generation jobs.

For a new Tonic Ephemeral account, the first time that you run data generation, you also receive an activation email message for the account.

Viewing the job details and connecting to an Ephemeral database

To view the job status and details:

  1. Click Jobs.

  2. In the list, click the data generation job.

For a data generation that writes the output to an Ephemeral database, the Data Available in Tonic Ephemeral panel provides access to the database connection information.

To display the connection details, click Connecting to your database.

The connection details include the database location and credentials. Each field contains a copy icon to allow you to copy the value.

Next steps for free trial users

The first time that you complete all of the steps in a checklist, Structural displays a panel with options to chat with our sales team, schedule a demo, or purchase a subscription.

You can also continue to get to know Structural and experiment with other Structural features, such as subsetting, or use composite generators to mask more complex values such as JSON or XML.

If your free trial has expired, to get an extension, you can reach out to us using either the in-app chat or an email message.


Generator summary

The following table summarizes the available generators. The table includes generator characteristics that you might take into account when you select the generator to use for a column.

Generator hints and tips also provides some suggestions for generators to use for specific use cases.

Information in the table

The generator summary includes the following columns:

  • Generator - The name of the generator, linked to the entry in the generator reference.

  • Description - An overview description of the generator.

  • Supported features - Includes the following information:

    • The generator characteristics that the generator supports

    • Whether the generator is a composite generator or a primary key generator

    • The generator privacy ranking

Generator
Description
Supported features

API:

Generates replacement values for U.S. mailing addresses. You select the address component or format for the replacement values. For example, the column might only contain a street address or a postal code, or it might contain a full address.

Consistency - Self and other

Linkable

Differential privacy if not consistent

Data-free if not consistent

Privacy ranking:

  • 1 if not consistent

  • 4 if consistent

API:

Identifies the algebraic relationship between 3 or more numeric values, including at least one non-integer. Based on the relationship, generates new values to match. If there is no relationship, uses the Categorical generator.

Linkable - linking is required

Privacy ranking: 3

API:

Generates unique alphanumeric strings of the same length as the input. For example, for the origin value ABC123, the output value is a six-character alphanumeric string such as D24N05.

Consistency - Self only

Primary key generator

Unique columns allowed

Format-preserving encryption (FPE)

Privacy ranking:

  • 3 if not consistent

  • 4 if consistent

API:

Within an array, replaces letters with random other letters, and numbers with random other numbers. Preserves punctuation and whitespace.

Consistency - Self only

Privacy ranking:

  • 3 if not consistent

  • 4 if consistent

API:

Used to transform array values in JSON.

To identify values to transform, you provide a list of JSONPaths. For each JSONPath, you assign a sub-generator to apply to matching values.

Composite generator. Feature support is based on the sub-generators.

Privacy ranking: 5

API:

Used to transform values in an array. To identify values to transform, you provide a regular expression. For each capture group in an expression, you assign a sub-generator to apply to matching values.

Composite generator. Feature support is based on the sub-generators.

Privacy ranking: 5

API:

Generates unique alphanumeric strings based on any printable ASCII characters. You can optionally exclude lowercase letters from the generated values. The replacement value does not preserve the length of the original value.

Consistency - Self only

Primary key generator

Unique columns allowed

Format-preserving encryption (FPE)

Privacy ranking:

  • 3 if not consistent

  • 4 if consistent

API:

Generates a random company name-like string.

Consistency - Self or other

Differential privacy if not consistent

Data-free if not consistent

Privacy ranking:

  • 1 if not consistent

  • 4 if consistent

API:

Shuffles the original values for a column to different rows. Maintains the overall frequency of each value. For example, a column contains the values Small (3 times), Medium (4 times), and Large (5 times). In the transformed data, each value appears the same number of times, but the values are shuffled to different rows.

Linkable

Differential privacy is configurable

Privacy ranking:

  • 2 with differential privacy

  • 3 without differential privacy

API:

Replaces letters with random other letters and numbers with random other numbers. Preserves punctuation, whitespace, and mathematical symbols.

Consistency - Self only

Privacy ranking:

  • 3 if not consistent

  • 4 if consistent

API:

Replaces characters with other random characters. Preserves punctuation, capitalization, and whitespace. A replacement character is always from within the same Unicode Block as the source character. A source character is always mapped to the same destination character. For example, M might always map to V.

Always self-consistent

Unique columns allowed

Privacy ranking: 4

(Deprecated) API:

This generator is deprecated. Use the generator instead. Generates a random company name-like string.

Consistency - Self or other

Differential privacy if not consistent

Data-free if not consistent

Privacy ranking:

  • 1 if not consistent

  • 4 if consistent

API:

Applies different generators to rows conditionally based on the column value. For example, apply the Character Scramble generator for values other than Test. You configure a list of conditions. Each condition performs a check against the column value. For each condition, you assign a sub-generator to apply to matching values.

Unique columns allowed

Composite generator. Other feature support is based on the sub-generators.

Privacy ranking:

  • If a fallback generator is selected, the lower of 5 or the fallback generator's ranking

  • 5 if no fallback generator is selected

API:

Uses a single specified value to replace all of the values in the column. The replacement value must be compatible with the column data type.

Differential privacy

Data-free

Privacy ranking: 1

API:

Generates a continuous distribution to fit the underlying data. Can link to other columns to create multivariate distributions. Can also be partitioned by other columns.

Linkable

Differential privacy is configurable

Privacy ranking:

  • 2 with differential privacy

  • 3 without differential privacy

API:

Populates the column using the sum of values from a column in another table. To select the rows to use, uses a foreign key value that matches the primary key value for the current row. For example, to transform the Total_Sales column in the Customers table, from the Transactions table, use the sum of the Amount values for rows where the Customer_ID value matches the primary key value for the current customer.

Privacy ranking: 3

API:

Used to mask text in a delimited format.

Parses the text as a row where the columns are delimited by a specified character. For each index, you assign a sub-generator to apply to the index value.

Composite generator. Feature support is based on the sub-generators.

Privacy ranking: 5

API:

Replaces the original column value with a value from a list of values that you provide.

Consistency - Self and other

Linkable

Differential privacy if not consistent

Data-free if not consistent

Privacy ranking:

  • 1 if not consistent

  • 4 if consistent

API:

Truncates dates or timestamps to a specific date or time component. For example, you might truncate a date value to the month or a timestamp to the hour.

Privacy ranking: 5

API:

Scrambles characters in an email address.

Preserves the formatting and keeps the @ and . characters. You can identify specific email domains to not scramble.

Consistency - Self only

Privacy ranking:

  • 3 if not consistent

  • 4 if consistent

API:

Generates timestamps that fit an event distribution. You can link columns to create a sequence of events across multiple columns. You can also partition the generator by other columns.

Linkable

Privacy ranking: 3

API:

Scrambles characters in a file name.

Preserves the formatting and the file extension.

Consistency - Self only

Privacy ranking:

  • 3 if not consistent

  • 4 if consistent

API:

Replaces all instances of the find string with the replace string. For the find string, you can optionally provide a regular expression.

Privacy ranking: 5

API:

Generates a valid Finnish Personal Identity Code (PIC).

You configure the date range during which the PIC was issued.

Consistency - Self only

Data-free if not consistent

Unique columns allowed

Format-preserving encryption (FPE)

Privacy ranking:

  • 1 if not consistent

  • 4 if consistent

API:

Transforms Norwegian national identity numbers. You can optionally preserve the gender and birthdate portions of the identifier values.

Consistency - Self and other

Unique columns allowed

Privacy ranking:

  • 3 if not consistent

  • 4 if consistent

API:

Used to transform columns that contain latitude and longitude values.

Linkable

Unique columns allowed

Privacy ranking: 3

API:

Can be used to generate cities, states, zip codes, and latitude/longitude values that follow HIPAA guidelines for safe harbor.

Consistency - Self only

Privacy ranking:

  • 3 if not consistent

  • 4 if consistent

API:

Generates random host names, based on the English language.

Consistency - Self and other

Differential privacy if not consistent

Data-free if not consistent

Privacy ranking:

  • 1 if not consistent

  • 4 if consistent

API:

Used to transform values in an HStore column in a PostgreSQL database. You specify a list of keys for which to transform the values. For each key, you assign a generator to apply to the key value.

Composite generator. Feature support is based on the sub-generators.

Privacy ranking: 5

API:

Used to transform columns that contain HTML content. To identify the values to transform, you provide a list of path expressions. For each path expression, you assign a generator to apply to the matching value.

Composite generator. Feature support is based on the sub-generators.

Privacy ranking: 5

API:

Generates unique integer values.

By default, the generated values are within the range of the column’s data type.

You can also specify a range for the generated values. The source values must be within that range.

Consistency - Self only

Differential privacy if not consistent

Data-free if not consistent

Primary key generator

Unique columns allowed

Format-preserving encryption (FPE)

Privacy ranking:

  • 1 if not consistent

  • 4 if consistent

API:

For Canadian mailing addresses, can generate:

  • Street name

  • Postal code

For United Kingdom (UK) mailing addresses, can generate postal codes.

Consistency - Self only

Differential privacy if not consistent

Data-free if not consistent

Privacy ranking:

  • 1 if not consistent

  • 4 if consistent

API: InternationalAddressGenerator

Generates a random IP address-formatted string. You specify the percentage of IPv4 addresses. The remaining addresses are IPv6.

Consistency - Self or other

Differential privacy if not consistent

Data-free if not consistent

Privacy ranking:

  • 1 if not consistent

  • 4 if consistent

API: IPAddressGenerator

Used to transform values in JSON columns. To identify values to transform, you provide a list of JSONPaths.

For each JSONPath, you assign a sub-generator to apply to matching values.
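A simplified sketch of the idea, using dotted paths instead of full JSONPath syntax and lambdas in place of sub-generators (`apply_path` is a hypothetical helper, not part of Structural):

```python
import json

def apply_path(doc: dict, path: str, transform) -> None:
    # Walk a simplified dotted path (e.g. "user.email") and apply the
    # sub-generator to the value at that location. Real JSONPath is richer.
    keys = path.split(".")
    node = doc
    for key in keys[:-1]:
        node = node[key]
    node[keys[-1]] = transform(node[keys[-1]])

doc = json.loads('{"user": {"email": "jane@example.com", "age": 41}}')

# One sub-generator per path, as the generator's configuration describes.
apply_path(doc, "user.email", lambda v: "scrambled@example.com")
apply_path(doc, "user.age", lambda v: 0)
print(json.dumps(doc))  # {"user": {"email": "scrambled@example.com", "age": 0}}
```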

Composite generator. Feature support is based on the sub-generators.

Privacy ranking: 5

API: JsonMaskGenerator

Generates a random MAC address-formatted string.

Consistency - Self only

Differential privacy if not consistent

Data-free if not consistent

Format-preserving encryption (FPE)

Privacy ranking:

  • 1 if not consistent

  • 4 if consistent

API: MACAddressGenerator

Generates unique MongoDB ObjectId values. Can be assigned to text columns that contain MongoDB ObjectId values. The column value must be 12 bytes long.

Consistency - Self only

Privacy ranking:

  • 3 if not consistent

  • 4 if consistent

API: ObjectIdPkGenerator

Generates a random name string from a dictionary of first and last names. You specify the name format. For example, a column might contain only a first name, or a full name that is last name first.

Consistency - Self or other

Differential privacy if not consistent

Data-free if not consistent

Privacy ranking:

  • 1 if not consistent

  • 4 if consistent

API: NameGenerator

Masks values in numeric columns.

Either adds or multiplies the original value by random noise.
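A minimal sketch of the two modes, assuming a symmetric uniform noise term (the real generator's noise parameters may differ):

```python
import random

def add_noise(value: float, magnitude: float) -> float:
    # Additive noise: shift the value by a random amount in +/- magnitude.
    return value + random.uniform(-magnitude, magnitude)

def multiply_noise(value: float, percent: float) -> float:
    # Multiplicative noise: scale the value by up to +/- percent.
    return value * (1 + random.uniform(-percent, percent))

print(add_noise(100.0, 5.0))        # e.g. 97.3
print(multiply_noise(100.0, 0.05))  # e.g. 103.8
```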

Consistency - Self or other

Privacy ranking:

  • 3 if not consistent

  • 4 if consistent

API: NoiseGenerator

Replaces all of the column values with NULL values.

Differential privacy

Data-free

Unique columns allowed

Privacy ranking: 1

API: NullGenerator

Generates unique numeric strings of the same length as the input numeric string.

Consistency - Self only

Primary key generator

Unique columns allowed

Format-preserving encryption (FPE)

Privacy ranking:

  • 3 if not consistent

  • 4 if consistent

API: NumericStringPkGenerator

Default generator. Does not perform any transformation on the source data.

Unique columns allowed

Privacy ranking: 6

API: PassthroughGenerator

Generates a random telephone number that matches the country or region and format of the input telephone number. For invalid telephone numbers, either replaces the individual digits or generates a valid replacement number.

Consistency - Self only

Privacy ranking: 3

API: USPhoneNumberGenerator

Generates a random boolean value. You specify the percentage of true values. The remaining values are false.
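The contract is easy to sketch: with a configured percentage p, each generated value is true with probability p (an illustration, not the product code):

```python
import random

def random_boolean(percent_true: float) -> bool:
    # percent_true of 0.3 yields True roughly 30% of the time.
    return random.random() < percent_true

sample = [random_boolean(0.3) for _ in range(10_000)]
print(sum(sample) / len(sample))  # close to 0.3
```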

Differential privacy

Data-free

Privacy ranking: 1

API: RandomBooleanGenerator

Generates a random double value between the specified minimum (inclusive) and maximum (exclusive) values.

Differential privacy

Data-free

Privacy ranking: 1

API: RandomDoubleGenerator

Generates a random hash string.

Differential privacy

Data-free

Privacy ranking: 1

API: RandomStringGenerator

Returns a random integer between the specified minimum (inclusive) and maximum (exclusive) values.
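In Python terms, the inclusive-minimum / exclusive-maximum contract matches random.randrange:

```python
import random

# Values fall in [10, 20), so 20 itself is never produced.
values = [random.randrange(10, 20) for _ in range(5)]
print(values)  # e.g. [13, 19, 10, 16, 11]
```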

Differential privacy

Data-free

Privacy ranking: 1

API: RandomIntegerGenerator

Generates random dates, times, and timestamps that fall within a specified range.

Differential privacy

Data-free

Privacy ranking: 1

API: RandomTimestampGenerator

Generates a new random UUID string.

Differential privacy

Data-free

Unique columns allowed

Privacy ranking: 1

API: UUIDGenerator

To identify values to transform, you provide a regular expression.

For each capture group in an expression, you assign a sub-generator to apply to matching values.
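As a sketch of the capture-group mechanics, using a character scramble as a stand-in sub-generator (illustrative only, not Structural's implementation):

```python
import random
import re

# Pattern with two capture groups: the area code and the line number.
pattern = re.compile(r"\((\d{3})\) (\d{3}-\d{4})")

def scramble_digits(text: str) -> str:
    # Stand-in sub-generator: replace each digit with a random digit.
    return "".join(random.choice("0123456789") if c.isdigit() else c for c in text)

def mask(match: re.Match) -> str:
    # Apply a sub-generator per capture group; text outside the groups is kept.
    return f"({scramble_digits(match.group(1))}) {scramble_digits(match.group(2))}"

print(pattern.sub(mask, "Call (415) 555-0199 today"))
# e.g. "Call (902) 314-7286 today"
```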

Unique columns allowed

Composite generator. Other feature support is based on the sub-generators.

Privacy ranking: 5

API: RegexMaskGenerator

Generates a column of unique integer values that start at a specified value and then increment by 1 for each processed row.
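The behavior is equivalent to a simple counter, as in this sketch:

```python
import itertools

# Start at 1000 and increment by 1 for each processed row.
counter = itertools.count(start=1000)
rows = ["alice", "bob", "carol"]
print([(next(counter), row) for row in rows])
# [(1000, 'alice'), (1001, 'bob'), (1002, 'carol')]
```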

Linkable

Unique columns allowed

Privacy ranking: 3

API: UniqueIntegerGenerator

Generates ISO 6346-compliant shipping container codes. The codes are all in the freight ("U") category.

Consistency - Self or other

Differential privacy if not consistent

Data-free if not consistent

Privacy ranking:

  • 1 if not consistent

  • 4 if consistent

API: ShippingContainerGenerator

Generates a new valid Canadian Social Insurance Number. Preserves the formatting from the original value.

Consistency - Self only

Data-free if not consistent

Unique columns allowed

Format-preserving encryption (FPE)

Privacy ranking:

  • 1 if not consistent

  • 4 if consistent

API: SINGenerator

Generates a new valid United States Social Security Number. For numeric columns, the dashes (xxx-xx-xxxx) are always excluded. Otherwise, you can specify the percentage of values for which to include the dashes.

Consistency - Self or other

Differential privacy if not consistent

Data-free if not consistent

Privacy ranking:

  • 1 if not consistent

  • 4 if consistent

API: SsnGenerator

Used to transform StructFields within a StructType in Spark databases (Databricks and Amazon EMR). To identify the StructField value to transform, you provide a path expression. For each path expression, you assign a sub-generator to apply to the matching values.

Composite generator. Feature support is based on the sub-generators.

Privacy ranking: 5

API: StructMaskGenerator

Shifts timestamps by a random amount of a specific unit of time, within a set range. The range can start before the original value.
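A sketch of the behavior, assuming the unit is days and the range is expressed as minimum and maximum offsets (negative offsets shift backward):

```python
import random
from datetime import datetime, timedelta

def shift_timestamp(ts: datetime, min_days: int, max_days: int) -> datetime:
    # The range can start before the original value, so shifts may go backward.
    return ts + timedelta(days=random.randint(min_days, max_days))

print(shift_timestamp(datetime(2023, 5, 17), min_days=-30, max_days=30))
# e.g. 2023-06-02 00:00:00
```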

Consistency - Self or other

Privacy ranking:

  • 3 if not consistent

  • 4 if consistent

API: TimestampShiftGenerator

Generates unique email addresses.

Replaces the username with a randomly generated GUID, and masks the domain with a character scramble.
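A rough sketch of that combination, assuming the domain scramble only replaces letters (`unique_email` is a hypothetical name, not Structural's code):

```python
import random
import string
import uuid

def unique_email(original: str) -> str:
    _, _, domain = original.partition("@")
    scrambled = "".join(
        random.choice(string.ascii_lowercase) if c.isalpha() else c
        for c in domain
    )
    # A random GUID as the username guarantees uniqueness across rows.
    return f"{uuid.uuid4()}@{scrambled}"

print(unique_email("jane.doe@example.com"))
# e.g. "8c2f...-...@fxkqpwl.zrv" (GUID username, scrambled domain)
```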

Consistency - Self only

Unique columns allowed

Privacy ranking:

  • 3 if not consistent

  • 4 if consistent

API: UniqueEmailGenerator

Used to transform URLs. Preserves the formatting. Keeps the URL scheme and top-level domain intact.

Unique columns allowed

Privacy ranking: 3

API: UrlGenerator

Generates UUIDs.

Consistency - Self only

Primary key generator

Unique columns allowed

Format-preserving encryption (FPE)

Privacy ranking:

  • 3 if not consistent

  • 4 if consistent

API: UuidPkGenerator

Used to transform values in XML columns. To identify the values to transform, you provide XPaths. For each XPath, you assign a sub-generator to apply to the matching values.

Composite generator. Feature support is based on the sub-generators.

Privacy ranking: 5

API: XmlMaskGenerator

Generator - API generator name

Address - AddressGenerator
Algebraic - AlgebraicGenerator
Alphanumeric String Key - AlphaNumericPkGenerator
Array Character Scramble - ArrayTextMaskGenerator
Array JSON Mask - ArrayJsonMaskGenerator
Array Regex Mask - ArrayRegexMaskGenerator
ASCII Key - AsciiPkGenerator
Business Name - BusinessNameGenerator
Categorical - CategoricalGenerator
Character Scramble - TextMaskGenerator
Character Substitution - StringMaskGenerator
Company Name - CompanyNameGenerator (see Business Name)
Conditional - ConditionalGenerator
Constant - ConstantGenerator
Continuous - GaussianGenerator
Cross Table Sum - CrossTableAggregateGenerator
CSV Mask - CsvMaskGenerator
Custom Categorical - CustomCategoricalGenerator
Date Truncation - DateTruncationGenerator
Email - EmailGenerator
Event Timestamps - EventGenerator
File Name - FileNameGenerator
Find and Replace - FindAndReplaceGenerator
Finnish Personal Identity Code - FinnishPicGenerator
FNR - FnrGenerator
Geo - GeoGenerator
HIPAA Address - HipaaAddressGenerator
Hostname - HostnameGenerator
HStore Mask - HStoreMaskGenerator
HTML Mask - HtmlMaskGenerator
Integer Key - IntegerPkGenerator
International Address - InternationalAddressGenerator
IP Address - IPAddressGenerator
JSON Mask - JsonMaskGenerator
MAC Address - MACAddressGenerator
Mongo ObjectId Key - ObjectIdPkGenerator
Name - NameGenerator
Noise Generator - NoiseGenerator
Null - NullGenerator
Numeric String Key - NumericStringPkGenerator
Passthrough - PassthroughGenerator
Phone - USPhoneNumberGenerator
Random Boolean - RandomBooleanGenerator
Random Double - RandomDoubleGenerator
Random Hash - RandomStringGenerator
Random Integer - RandomIntegerGenerator
Random Timestamp - RandomTimestampGenerator
Random UUID - UUIDGenerator
Regex Mask - RegexMaskGenerator
Sequential Integer - UniqueIntegerGenerator
Shipping Container - ShippingContainerGenerator
SIN - SINGenerator
SSN - SsnGenerator
Struct Mask - StructMaskGenerator
Timestamp Shift - TimestampShiftGenerator
Unique Email - UniqueEmailGenerator
URL - UrlGenerator
UUID Key - UuidPkGenerator
XML Mask - XmlMaskGenerator