
Tonic Structural

About Tonic Structural

The Tonic Structural synthetic data platform combines sensitive data detection and data transformation to allow users to create safe, secure, and compliant datasets.

Common Structural use cases include creating staging and development environments, and trying out a new cloud provider without complex data agreements.

Structural allows you to reduce bug counts, shorten testing life cycles, and share data with partners, all while helping to ensure security and compliance with the latest regulations, from GDPR to CCPA.

You can use the Structural API to integrate with CI/CD pipelines or to create automated processes that ensure that the generated data is available on demand.
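For example, a CI/CD pipeline can call the API to start a data generation job and wait for it to finish before refreshing a staging environment. The following Python sketch illustrates that pattern only; the endpoint paths, authentication header format, and response fields shown here are placeholders, not the documented Structural API routes.

```python
# Illustrative CI/CD integration sketch. The routes, auth header format, and
# response fields below are placeholders; consult the Structural API
# documentation for the real endpoints.
import os
import time

import requests

BASE_URL = os.environ["STRUCTURAL_URL"]              # e.g. https://structural.example.com
API_KEY = os.environ["STRUCTURAL_API_KEY"]           # API token created in Structural
WORKSPACE_ID = os.environ["STRUCTURAL_WORKSPACE_ID"]

HEADERS = {"Authorization": f"Apikey {API_KEY}"}

# Start a data generation job for the workspace (placeholder route).
response = requests.post(
    f"{BASE_URL}/api/generatedata/start",
    params={"workspaceId": WORKSPACE_ID},
    headers=HEADERS,
    timeout=30,
)
response.raise_for_status()
job_id = response.json()["id"]

# Poll the job status so the pipeline can gate on the result (placeholder route).
while True:
    job = requests.get(f"{BASE_URL}/api/job/{job_id}", headers=HEADERS, timeout=30).json()
    if job["status"] in ("Completed", "Failed", "Canceled"):
        break
    time.sleep(30)

print(f"Data generation job {job_id} finished with status {job['status']}")
```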

Structural data generation workflow

Tonic Structural data generation combines sensitive data detection and data transformation to create safe, secure, and compliant datasets.

The Structural data generation workflow involves the following steps:

Overview diagram of the Tonic Structural data generation workflow

You can also view this video overview of the Structural data generation workflow.

  1. To get started, you create a workspace. When you create a workspace, you identify the type of source data, such as PostgreSQL or MySQL, and establish the connections to the source database and the destination location. The source database contains the original data that you want to synthesize. The destination location is where Structural stores the synthesized data. It might be a database, a storage location, or a container repository.

  2. Next, you analyze the results of the initial sensitivity scan. The sensitivity scan identifies columns that contain sensitive data. These columns need to be protected by a generator.

  3. Based on the sensitivity scan results, you configure the data generation. The configuration includes:

    • Assigning table modes to tables. The table mode controls the number of rows and columns that are copied to the destination database.

    • Indicating column sensitivity. You can make adjustments to the initial sensitivity assignments. For example, you can mark additional columns as sensitive that the initial scan did not identify as sensitive.

    • Assigning and configuring column generators. To protect the data in a column, especially a sensitive column, you assign a generator to it. The generator replaces the source value with a different value in the destination database. For example, the generator might scramble the characters or assign a random value of the same type.

  4. After you complete the configuration, you run the data generation job. The data generation job uses the configured table modes and generators to transform the data from the source database and write the transformed data to the destination location. You can track the job progress and view the job results.

Tonic Structural User Guide

The Tonic Structural platform creates safe, realistic datasets to use in staging environments or for local development. The Structural web application and API can be used by engineers, data analysts, or security experts.

Structural connects to source databases that contain sensitive data such as personally identifiable information (PII) or protected health information (PHI). To protect that data, Structural transforms the sensitive values and then writes the transformed data to a destination location.

Data flow from the source database through Tonic Structural to the destination database

New to Structural? Review the Tonic Structural workflow overview. For information on how to create a Structural account and start a Structural free trial, go to Getting started with the Structural free trial.

Want to know what's in the latest Structural releases? Go to the Tonic Structural release notes.

The Structural application heading includes a feature updates icon, which displays a summary of the newest features, and includes a link to the Structural release notes.

Connect to your data

Configure and generate transformed data

Manage a self-hosted Tonic Structural instance

Need help with Structural? Contact [email protected].

Displaying sample data for a column


Required workspace permission:

  • Source data: Preview source data

  • Destination data: Preview destination data

Identifying similar columns

During sensitivity scans and schema change scans, Tonic Structural identifies groups of similar columns.

To identify similar columns, Structural uses a text embedding model to calculate the semantic similarity between any two column names in the database. When a column name's semantic similarity to the name of a given column is above a specified threshold, then the column is similar to the given column.
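As a conceptual illustration only (Structural's embedding model and threshold are not documented here), the following Python sketch uses the third-party sentence-transformers package to score column-name similarity and apply a placeholder threshold.

```python
# Conceptual sketch of similar-column detection. The model name and threshold
# are placeholders, not Structural's actual configuration.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")   # placeholder embedding model
THRESHOLD = 0.80                                   # placeholder similarity threshold

column_names = ["customer_email", "contact_email", "email_address", "order_total"]
embeddings = model.encode(column_names, convert_to_tensor=True)
scores = util.cos_sim(embeddings, embeddings)

target = 0  # index of "customer_email"
similar = [
    column_names[i]
    for i in range(len(column_names))
    if i != target and float(scores[target][i]) >= THRESHOLD
]
print(similar)  # column names semantically close to "customer_email"
```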

If a column has similar columns, then the Applied Generator column contains an icon that includes the count of similar columns.

By default, the similar columns icon is hidden. To display the similar columns icon, hover over the column row.

When you assign a generator to a column, the similar columns icon for that column remains visible during your current session.

Structural data generation workflow

Overview of the Structural steps to generate de-identified data.

Structural deployment types

You can use Structural Cloud or set up a self-hosted Structural instance.

Structural implementation roles

Functions that participate in a Structural implementation.

Structural license plans

View the license options and their available features.


Workspaces

A workspace contains the data connections and data generation configuration.

Data connectors

Each data connector allows Structural to read from and write to a specific type of data source.

Privacy Hub

View and update the current protection status based on the sensitivity scan and workspace configuration.

Database View

Configure transformation options for tables and columns.

Generators

A generator is assigned to a column and performs a data transformation.

Subsetting

Configure a subset of source data to include in the transformed destination data.

Generate data

Run the data generation process to produce transformed destination data.

Schema changes

Review and address changes to the source data schema.

User access

Manage who has access to your instance.

Monitoring and logging

Monitor Structural services and share logs with Tonic.ai.

Updating Structural

Upgrade to the latest version of Structural.


For each column on Database View, you can display a sample list of the column values.

For columns that have an assigned generator, the sample shows both the current values and the possible values after the generator is applied.

To display the sample values, in the Column column, click the magnifying glass icon.

If the generator is Passthrough, then the sample data panel contains only Original Data.

Sample data for a column that does not have an assigned generator

If a different generator is assigned, then the sample data panel contains both Original Data and Protected Output.

Sample data for a column that has an assigned generator
When you click the similar columns icon, Structural displays a panel with an option to filter the list to display the current column and its similar columns. To apply the filter, click Filter.
Similar columns panel with filter option

The similar columns filter is applied, and other table and column filters are removed.

Database View column list with a similar columns filter applied
Similar columns icon with the count of similar columns

Logging into Structural for the first time

When you go to Tonic Structural for the first time, you create an account. How you create an account depends on the type of user you are.

A new Structural user can be one of the following:

  • A completely new user who is starting a Structural 14-day free trial. Free trial users use Structural Cloud to explore and experiment with Structural before they decide whether to purchase it.

  • A new user on a self-hosted Structural instance. Self-hosted instances are installed on-premises. The customer administers the Structural users.

  • A new user in an existing Structural Cloud organization. New users are added to existing organizations based on their email domain.

Workspace configuration settings

The workspace settings for a new workspace (New Workspace view) or edited workspace (Workspace Settings tab) provide information about the workspace and its data.

Workspace identification and connection type

Every workspace includes the following settings to identify the workspace and to select the type of data connector.

Fields to identify the workspace

All workspaces have the following fields that identify the workspace:

  1. In the Workspace name field, enter the name of the workspace.

  2. In the Workspace description field, provide a brief description of the workspace. The description can contain up to 200 characters.

  3. In the Tags field, provide a comma-separated list of tags to assign to the workspace. For more information on managing tags, go to Assigning tags to a workspace.

Connection type

Under Connection Type, select the type of data connector to use for the workspace data. You cannot change the connection type on a child workspace.

The Basic and Professional licenses limit the number and type of data connectors you can use.

  • A Basic instance can only use one data connector type, which can be either PostgreSQL or MySQL. After you create your first workspace, any subsequent workspaces must use the same data connector type.

  • A Professional instance can use up to two different data connector types, which can be any type other than Oracle or Db2 for LUW. After you create workspaces that use two different data connector types, any subsequent workspaces must use one of those data connector types.

If the database that you want to connect to isn't in the list, or you want to have different database types for your source and destination database, contact [email protected].

When you select a connector type, Structural updates the view to display the connection fields used for that connector type. The specific fields vary based on the connector type.

Using secrets managers for authentication


Required license: Enterprise

Your organization might use a secrets manager to secure credentials, including database connection credentials.

You can configure a set of available secrets managers. In the workspace configuration, users can then select a secret name from a secrets manager.

Assigning tags to a workspace


Required workspace permission: Configure workspace settings

You can associate custom tags with each workspace. Tags can help to organize and provide a quick glance into the workspace configuration.

Tags are accessible to every user that has access to the workspace.

Tags are stored in the workspace JSON, and are included in the workspace export. You can also use the API to get access to tags.

Managing tags from workspace settings

You can add and edit tags in the Tags field on the New Workspace and Workspace Settings views.

  • To add tags, enter a comma-separated list of the tags to add.

  • To remove a tag, click its delete icon.

Managing tags from Workspaces view

You can also manage tags directly from Workspaces view.

Assigning tags

To add tags to a workspace that does not currently have tags:

  1. Hover over the Tags column for the workspace.

  2. Click Add Tags.

  3. In the tag input field, type a comma-separated list of tags to apply.

  4. Press Enter.

Editing the assigned tags

To edit the assigned tags:

  1. Click the Tags column for the workspace.

  2. In the tag input field, to remove a tag, click its delete icon.

  3. To add tags, type a comma-separated list of the tags to add.

  4. To save the tag changes, press Enter.

Managing the workspace schema cache

You can configure a workspace to cache the source database schema, to reduce the number of times that Tonic Structural needs to query the source database.

Viewing the schema cache status

When schema caching is enabled for a workspace, then in the expanded version of the workspace management view header, below the workspace name, Structural shows the current status of the schema cache.

Schema cache status for a workspace

The status indicates when:

  • Structural is retrieving the schema information.

  • Structural is checking for schema updates.

  • Structural is refreshing the schema cache.

  • Structural fails to connect to the source database.

  • The schema cache refresh fails.

  • The schema cache is updated. The status includes the timestamp of the most recent refresh.

Refreshing the schema cache

When the schema cache is updated, or when Structural has detected updates to the schema, then the schema cache status includes an option to refresh the schema cache.

To refresh the schema:

  1. Click the schema cache status.

  2. On the panel, click Refresh Schema.

Structural starts a new schema retrieval job. To track the progress of the job, go to the workspace Jobs view.

Commenting on columns


Required license: Professional or Enterprise

From Database View, you can add comments to columns. For example, you might use a comment to explain why you selected a particular generator or marked a column as sensitive or not sensitive.

Creating a new comment

If a column does not have any comments, then to add a comment:

  1. In the Applied Generator column, click the comment icon.

  2. In the comment field, type the comment text.

  3. Click Comment.

Responding to existing comments

When a column has existing comments, the comment icon is green. To add comments:

  1. Click the comment icon. The comments panel shows the previous comments. Each comment includes the comment user.

  2. In the comment field, type the comment text.

  3. Click Reply.

Working with document-based data

For document-based data connectors - currently MongoDB and Amazon DynamoDB - Database View and Table View are replaced by Collection View. "Collection" is the term that Structural uses to refer to MongoDB collections and DynamoDB tables.

For JSON columns in file connector and PostgreSQL workspaces, you can use Document View to view and assign generators to JSON fields.

Manually indicating whether a column is sensitive

You can also manually indicate that a column is sensitive or not sensitive.

For example, the sensitivity scan might incorrectly identify a column as sensitive. Or a column might contain data that you consider sensitive but that does not match a detected sensitivity type.

When you manually change a column from not sensitive to sensitive, Structural marks the sensitivity detection as full confidence.

For information on how to change whether a column is sensitive:

  • For Privacy Hub, go to .

  • For Database View, go to:

    • For a single column,

    • For multiple selected columns,

  • For Table View, go to .

The Structural API also provides endpoints to designate columns as sensitive or not sensitive.

Algebraic

The algebraic generator identifies the algebraic relationship between three or more numeric values and generates new values to match. At least one of the values must be a non-integer.

If a relationship cannot be found, then the generator defaults to the Categorical generator.

This generator can be linked with other Algebraic generators.
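The sketch below is a simplified illustration of the idea, not Structural's detection logic: it checks whether three linked columns satisfy c = a + b and, if so, generates new values that preserve that relationship.

```python
# Simplified illustration of preserving a detected relationship (c = a + b).
# Structural's actual relationship detection is not shown here.
import random

rows = [(1.5, 2.0, 3.5), (4.0, 0.25, 4.25), (10.0, 2.5, 12.5)]  # (a, b, c)

# "Detect" an additive relationship: c - (a + b) is ~0 for every row.
is_additive = all(abs(c - (a + b)) < 1e-9 for a, b, c in rows)

synthetic = []
for _a, _b, c in rows:
    new_a = round(random.uniform(0, 20), 2)
    new_b = round(random.uniform(0, 20), 2)
    new_c = new_a + new_b if is_additive else c  # fall back to the original value
    synthetic.append((new_a, new_b, new_c))

print(synthetic)
```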

Characteristics

Consistency: No, cannot be made consistent.
Linking: Yes, can be linked.
Differential privacy: No
Data-free: No
Allowed for primary keys: No
Allowed for unique columns: No
Uses format-preserving encryption (FPE): No
Privacy ranking: 3
Generator ID (for the API): AlgebraicGenerator

How to configure

To configure the generator, from the Link To dropdown list, select the columns to link this column to. You can select other columns that are assigned the Algebraic generator.

You must select at least three columns.

The column values must be numeric. At least one of the columns must contain a value other than an integer.

If Structural data encryption is enabled, then to use it for this column, in the advanced options section, toggle Use data encryption process to the on position.

Alphanumeric String Key

Generates unique alphanumeric strings of the same length as the input.

For example, for the origin value ABC123, the output value is a six-character alphanumeric string such as D24N05.
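As an illustration of the behavior described above (not Structural's implementation), the following sketch produces an alphanumeric string of the same length as the input; a keyed hash makes the mapping repeatable when consistency is wanted.

```python
# Length-preserving alphanumeric key sketch. The secret key and hashing scheme
# are placeholders, not Structural's implementation.
import hashlib
import hmac
import os
import string

ALPHABET = string.ascii_uppercase + string.digits
SECRET = b"example-secret"  # placeholder key for the consistent mapping

def alphanumeric_key(value: str, consistent: bool = True) -> str:
    if consistent:
        digest = hmac.new(SECRET, value.encode(), hashlib.sha256).digest()
    else:
        digest = os.urandom(len(value))
    # One output character per input character keeps the original length.
    # (Inputs longer than the digest would need an extended hash.)
    return "".join(ALPHABET[b % len(ALPHABET)] for b in digest[: len(value)])

print(alphanumeric_key("ABC123"))  # always the same six-character output for ABC123
```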

Characteristics

Consistency: Yes, can be made self-consistent.
Linking: No, cannot be linked.
Differential privacy: No
Data-free: No
Allowed for primary keys: Yes
Allowed for unique columns: Yes
Uses format-preserving encryption (FPE): Yes
Privacy ranking: 3 if not consistent; 4 if consistent
Generator ID (for the API): AlphaNumericPkGenerator

How to configure

To configure the generator, toggle the Consistency setting to indicate whether to make the generator self-consistent.

By default, the generator is not consistent.

If Structural data encryption is enabled, then to use it for this column, in the advanced options section, toggle Use data encryption process to the on position.

Find and Replace

This generator replaces all instances of the find string with the replace string.

For example, you can indicate to replace all instances of abc with 123.
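The equivalent logic, sketched in Python for illustration: a literal replacement, and a regex variant in which backslash is the escape character, as in the configuration described below.

```python
# Illustration of the find-and-replace behavior; not Structural's implementation.
import re

value = "abc-abc-001"

# Literal replacement: every "abc" becomes "123".
print(value.replace("abc", "123"))      # -> 123-123-001

# Regex replacement: the trailing digits become "XXX" (backslash escapes \d).
print(re.sub(r"\d+$", "XXX", value))    # -> abc-abc-XXX
```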

Characteristics

Consistency: No, cannot be made consistent.
Linking: No, cannot be linked.
Differential privacy: No
Data-free: No
Allowed for primary keys: No
Allowed for unique columns: No
Uses format-preserving encryption (FPE): No
Privacy ranking: 5
Generator ID (for the API): FindAndReplaceGenerator

How to configure

To configure the generator:

  1. In the Find field, type the string to look for in the source column value. To use a regular expression to identify the source value, check the Use Regex checkbox. If you use a regular expression, use backslash ( \ ) as the escape character.

  2. In the Replace field, type the string to replace the matching string with.

  3. If Structural data encryption is enabled, then to use it for this column, in the advanced options section, toggle Use data encryption process to the on position.

Managing access to workspaces

When you create a workspace, you become the owner of the workspace, and by default are assigned the built-in Manager workspace permission set for the workspace. The Manager permission set provides full access to the workspace configuration, data, and results.

With a Professional or Enterprise license, you can also assign workspace permission sets to other users and to SSO groups. You can also transfer a workspace to a different owner.

If you are granted access to any workspace permission set for a workspace, then you have access to all of the workspace management views for that workspace. However, you can only perform tasks that you have permission for in that workspace.

Workspace access is managed from the Workspaces view. You cannot assign workspace permission sets from Structural Settings view.

You can also view an overview video tutorial about workspace access.

Transferring ownership of a workspace


Required permission

  • Global permission: View organization users. This permission is only required for the Tonic Structural application. It is not needed when you use the Structural API.

Either:

  • Workspace permission: Transfer workspace ownership

  • Global permission: Manage access to Tonic Structural and to any workspace

To grant yourself access after the transfer:

  • Workspace permission: Share workspace access

Database View

Database View provides a complete view of your source database structure and configuration.

To display Database View, either:

  • On the workspace management view, in the workspace navigation bar, click Database View.

  • On Workspaces view, from the dropdown menu in the Name column, select Database View.

Database View consists of:

  • On the left, the list of tables in the source database.

  • On the right, the list of columns in those tables.

Identifying sensitive data

Tonic Structural uses its sensitivity scan to identify source data columns that contain sensitive information. The scan ignores truncated tables.

The sensitivity scan identifies Structural's built-in sensitivity types. It also looks for custom types that you define.

You can also manually mark a column as sensitive or not sensitive.

Run the Structural sensitivity scan

Run, configure, and get the results of the sensitivity scan.

Set column sensitivity manually

Options to override the results of the sensitivity scan.

Built-in sensitivity types

Types of sensitive data that the sensitivity scan can identify.

Configure custom sensitivity rules

Set up rules to enable the scan to identify other sensitive columns based on the column data types and names.

Array Character Scramble

A version of the Character Scramble generator that can be used for array values.

This generator replaces letters with random other letters, and numbers with random other numbers. Punctuation and whitespace are preserved.

For example, for the following array value:

["ABC.123", 3, "last week"]

The output might be something like:

["KFR.860", 7, "sdrw mwoc"]
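A minimal Python sketch of the scramble described above (not Structural's implementation): letters map to random letters, digits to random digits, and everything else is preserved; non-string array elements are left untouched here for simplicity.

```python
# Illustrative character scramble applied to string elements of an array value.
import random
import string

def scramble(text: str) -> str:
    out = []
    for ch in text:
        if ch.isalpha():
            pool = string.ascii_uppercase if ch.isupper() else string.ascii_lowercase
            out.append(random.choice(pool))
        elif ch.isdigit():
            out.append(random.choice(string.digits))
        else:
            out.append(ch)  # punctuation and whitespace are preserved
    return "".join(out)

value = ["ABC.123", 3, "last week"]
masked = [scramble(v) if isinstance(v, str) else v for v in value]
print(masked)  # e.g. ['KFR.860', 3, 'sdrw mwoc']
```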

Character Substitution

Performs a random character replacement that preserves formatting (spaces, capitalization, and punctuation).

Characters are replaced with other characters from within the same Unicode Block. A given source character is always mapped to the same destination character. For example, M might always map to V.

For example, for the following input string:

Miami Store #162

The output would be something like:

Vgkjg Gmlvf #681
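A fixed character-to-character mapping can be sketched as follows. This is an illustration only, seeded with a placeholder key; Structural draws substitutes from the same Unicode block, which this simplified version does not attempt.

```python
# Deterministic substitution sketch: the same source character always maps to
# the same masked character, so joins on the column survive masking.
import random
import string

rng = random.Random("placeholder-secret")  # a real mapping key would be secret

mapping = {}
for pool in (string.ascii_uppercase, string.ascii_lowercase, string.digits):
    shuffled = list(pool)
    rng.shuffle(shuffled)
    mapping.update(dict(zip(pool, shuffled)))

def substitute(text: str) -> str:
    return "".join(mapping.get(ch, ch) for ch in text)

print(substitute("Miami Store #162"))  # identical output every time for this input
```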

Continuous

Generates a continuous distribution to fit the underlying data.

This generator can be linked to other Continuous generators to create multivariate distributions and can be partitioned by other columns.
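As a minimal illustration (Structural's distribution fitting is not specified here), the sketch below fits a normal distribution to a numeric column and samples replacement values from it.

```python
# Fit-and-sample sketch for a continuous column; illustrative only.
import numpy as np

source = np.array([10.2, 11.8, 9.9, 10.5, 12.1, 10.0, 11.2])

mean, std = source.mean(), source.std(ddof=1)
rng = np.random.default_rng()
synthetic = rng.normal(loc=mean, scale=std, size=source.size)

print(np.round(synthetic, 2))  # values drawn from a distribution fit to the source
```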

Characteristics

File Name

This generator scrambles characters, but preserves formatting and keeps the file extension intact.

For example, for the following input value:

DataSummary1.pdf

The output value would look something like:

RsnoPwcsrtv5.pdf

This generator securely masks letters and numbers. There is no way to recover the original data.

FNR

The FNR generator transforms Norwegian national identity numbers. In Norwegian, the term for national identity number is abbreviated as FNR.

The first six digits of an FNR reflect the person's birthdate. You can choose to preserve the birthdates from the source values in the destination values. If you do not preserve the source values, the destination values are still within the same date range as the source values.

Another digit in an FNR indicates whether the person is male or female. You can specify whether the generated value preserves the gender that is indicated in the source value.

The last digits in an FNR are a checksum value. The last digits in the destination value are not a checksum - the values are random.
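The sketch below illustrates the described behavior for an 11-digit FNR-style value: optionally keep the birthdate digits, preserve the parity of the gender digit, and fill the remaining digits randomly with no valid checksum. The digit positions used are assumptions for illustration, not taken from this documentation.

```python
# Illustrative FNR-style masking; not Structural's implementation.
import random

def mask_fnr(fnr: str, preserve_birthdate: bool = True, preserve_gender: bool = True) -> str:
    rng = random.Random()
    if preserve_birthdate:
        birthdate = fnr[:6]
    else:
        birthdate = f"{rng.randint(1, 28):02d}{rng.randint(1, 12):02d}{rng.randint(0, 99):02d}"
    individual = [str(rng.randint(0, 9)) for _ in range(3)]
    if preserve_gender:
        # Keep the odd/even parity of the assumed gender digit.
        individual[2] = str(rng.randrange(int(fnr[8]) % 2, 10, 2))
    tail = f"{rng.randint(0, 99):02d}"  # random digits, not a checksum
    return birthdate + "".join(individual) + tail

print(mask_fnr("01016012345"))  # keeps 010160, randomizes the rest
```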


Hostname

Generates random host names, based on the English language.

Characteristics


About workspace ownership

Every workspace has an owner. The owner is always a user.

The user who creates the workspace is automatically the owner of the workspace.

By default, the workspace owner is assigned the built-in Manager workspace permission set. On Enterprise instances, you can choose a different workspace permission set to assign to all workspace owners.

You cannot remove that permission set from the workspace owner.

About ownership transfer

You can transfer a workspace to a different owner.

When you transfer ownership of a workspace, the new owner is assigned the owner permission set.

If the previous owner was not separately granted access to the owner permission set, then that permission set is removed.

Completing the ownership transfer

To transfer workspace ownership:

  1. To transfer ownership of a single workspace, from the workspace actions menu, select Transfer Ownership.

  2. To transfer ownership of multiple workspaces:

    1. Check the checkbox for each workspace to transfer.

    2. From the Actions menu, select Transfer Ownership.

  3. On the transfer ownership panel, from the User dropdown list, select the new owner.

  4. If you are the current owner of the workspace, then to grant yourself non-owner access after you transfer the ownership:

    1. Toggle Receive access to workspace to the on position.

    2. Select the workspace permission set to assign to yourself.

  5. Click Transfer Ownership.

Identification and connection type

Settings to identify the workspace and to select the data connector.

Data connection settings

Connect to source and destination databases or, for the file connector, local or cloud storage files.

Data generation settings

Block data generation on schema changes.

Enable and configure upsert

Add new destination records and update changed destination records. Ignore other unchanged destination records.

Write output to a container repository

Use the data generation output to populate a container data volume.

Advanced workspace overrides

Workspace-specific settings for cross-run consistency and data generation performance.

Configure available secrets managers

Set up and test the secrets managers that can be used.

Select a secret from a secrets manager

In the workspace configuration, identify a secret to use for the connection.

Scan collections

Collection scans identify the fields and data types in a collection.

Use Collection View

Configure generators and collection modes.

Use Document View for JSON columns

Assign generators to fields in JSON columns.

Assign generators to path expressions

In Collection View or Document View, assign a generator to fields that match a JSONPath expression.




    This generator securely masks letters and numbers. There is no way to recover the original data.

    Characteristics

Consistency: Yes, can be made self-consistent.
Linking: No, cannot be linked.
Differential privacy: No
Data-free: No
Allowed for primary keys: No
Allowed for unique columns: No

    How to configure

    To configure the generator, toggle the Consistency setting to indicate whether to make the generator self-consistent.

    By default, the generator is not consistent.

    If Structural data encryption is enabled, then to use it for this column, in the advanced options section, toggle Use data encryption process to the on position.


Note that for a numeric column, when a generated number starts with a 0, the leading 0 is removed. This could result in matching output values in different columns. For example, one value is changed to 113 and another to 0113; because the leading zero is dropped, both become 113.

    Character Substitution is similar to Character Scramble, with a couple of key differences. Because Character Substitution always maps the same source character to the same destination character, it is always consistent. It also can be used for unique columns.

    In Character Scramble, the character mapping is random, which makes Character Scramble slightly more secure. However, Character Scramble cannot be used for unique columns.

    Characteristics

Consistency: This generator is implicitly self-consistent. You do not specify whether the generator is consistent. Every occurrence of a character always maps to the same substitute character. Because of this, it can be used to preserve a join between two text columns, such as a join on a name or email.
Linking: No, cannot be linked.
Differential privacy: No
Data-free: No
Allowed for primary keys: Yes
Allowed for unique columns: Yes

    How to configure

    If Structural data encryption is enabled, then to use it for this column, in the advanced options section, toggle Use data encryption process to the on position.

Linking: Yes, can be linked.
Differential privacy: Configurable
Data-free: No
Allowed for primary keys: No
Allowed for unique columns: No
Uses format-preserving encryption (FPE): No
Privacy ranking: 2 if differential privacy enabled; 3 if differential privacy not enabled

    How to configure

    To configure the generator:

    1. From the Link To drop-down list, select the other Continuous generator columns to link to. The linking creates a multivariate distribution.

    2. From the Partition By drop-down list, select one or more columns to use to partition the data. The selected columns must have the generator set to either Passthrough or Categorical. For more information about partitioning and how it works, go to Partitioning a column.

    3. Toggle the Differential Privacy setting to indicate whether to make the output data differentially private. By default, the generator is not differentially private.

    4. If Structural data encryption is enabled, then to use it for this column, in the advanced options section, toggle Use data encryption process to the on position.

Consistency: No, cannot be made consistent.

    Characteristics

Consistency: Yes, can be made self-consistent.
Linking: No, cannot be linked.
Differential privacy: No
Data-free: No
Allowed for primary keys: No
Allowed for unique columns: No

    How to configure

    To configure the generator, toggle the Consistency setting to indicate whether to make the generator self-consistent.

    By default, the generator is not consistent.

    If Structural data encryption is enabled, then to use it for this column, in the advanced options section, toggle Use data encryption process to the on position.

    Characteristics

Consistency: Yes, can be made self-consistent or consistent with another column.
Linking: No, cannot be linked.
Differential privacy: No
Data-free: No
Allowed for primary keys: No
Allowed for unique columns: Yes

    How to configure

    To configure the generator:

    1. To preserve the gender from the source value in the destination value, toggle Preserve Gender to the on position.

    2. To preserve the birthdate from the source value in the destination value, toggle Preserve Birthdate to the on position.

    3. Toggle the Consistency setting to indicate whether to make the generator consistent. By default, consistency is disabled.

    4. If you enable consistency, then by default the generator is self-consistent. To make the generator consistent with another column, from the Consistent to dropdown list, select the column. When a generator is self-consistent, then a given value in the source database is always mapped to the same value in the destination database. When a generator is consistent with another column, then a given value for that other column in the source database results in the same value in the destination database. For example, if the FNR column is consistent with a Name column, then every instance of John Smith in the source database results in the same FNR in the destination database.

    5. If Structural data encryption is enabled, then to use it for this column, in the advanced options section, toggle Use data encryption process to the on position.

Differential privacy: Yes, if consistency is not enabled.
Data-free: Yes, if consistency is not enabled.
Allowed for primary keys: No
Allowed for unique columns: No
Uses format-preserving encryption (FPE): No
Privacy ranking: 1 if not consistent; 4 if consistent

    How to configure

    To configure the generator, toggle the Consistency setting to indicate whether to make the generator consistent.

    By default, the generator is not consistent.

    If you enable consistency, then by default the generator is self-consistent. To make the generator consistent with another column, from Consistent to, select the column.

    When the generator is consistent with itself, then a given value in the source database is mapped to the same value in the destination database. For example, Host123 in the source database always produces MyHostABC in the destination database.

    When the generator is consistent with another column, then a given source value in the other column results in the same host name value in the destination database. For example, a host name column is consistent with a department column. Every instance of Sales in the source data is given the same host name in the destination database.

    If Structural data encryption is enabled, then to use it for this column, in the advanced options section, toggle Use data encryption process to the on position.

Consistency: Yes, can be made self-consistent or consistent with another column.
Linking: No, cannot be linked.


    Structural deployment types

    Self-hosted Tonic Structural instance

    You can deploy a self-hosted, on-premises instance of Tonic Structural.

    For a self-hosted instance, Structural provides administrator tools that allow you to monitor Structural services and manage Structural users.

    You can configure Structural environment settings to customize your instance.

On a self-hosted instance, based on your license plan, you have access to the full set of supported data connectors.

    Structural Cloud

    Structural Cloud is our secure hosted environment. On Structural Cloud, Tonic.ai handles monitoring Structural services and updating Structural.

For SSO, Structural Cloud only supports Okta.

    Structural Cloud does not include:

    • Environment settings. Structural Cloud uses a single configuration.

    Structural Cloud also supports a pay-as-you-go plan, where free trial users can move on to set up a monthly subscription. For more information, go to .

    Each Structural Cloud user belongs to a Structural Cloud organization, which is determined either by the user's email domain or by a workspace invitation. Structural Cloud users do not have any access to workspaces or users from other organizations.

    Each free trial user is in a separate organization, along with any users that they invite to have access to a free trial workspace.

    For information about Structural Cloud organizations, go to .

    The Account Admin permission set allows a Structural Cloud user to manage organization users and workspaces. For information about granting access to the Account Admin permission set, go to .

    Structural implementation roles

    A Tonic Structural implementation can involve the following roles - from those who set up the Structural environment to the consumers of the data that Structural processes.

    Note that these roles are not related to role-based access (RBAC) within Structural, which is managed using permission sets.

    Infrastructure engineers

    For self-hosted instances of Structural.

    Infrastructure engineers set up the Structural application and its relevant dependencies. They are typically DevOps, Site Reliability Engineering (SRE), or Kubernetes cluster administrators.

    Infrastructure engineers perform the following Structural-related tasks:

    • Ensure that the proper infrastructure is ready for Structural installation based on the .

    • Install Structural, working with Tonic.ai support as needed.

    • Perform routine maintenance of Structural and the Structural environment. Update Structural and its dependencies as needed.

    Database administrators

    For both self-hosted instances of Structural and Structural Cloud.

    Database administrators integrate Structural into your data architecture to support .

    They ensure that source databases are available to Structural, and that Structural can write to destination databases.

Database administrators perform the following Structural-related tasks:

    • Set up the required Structural access to source databases.

    • Set up destination databases for Structural to write transformed data to.

    Structural users

    Structural users are the actual users of the Structural application.

    Depending on the use case, Structural users might be compliance analysts, DevOps, or data engineers.

Structural users perform the following Structural-related tasks:

    • Use the Structural application to configure the logic used to transform the source data and to generate the transformed data.

    • Work with data consumers to produce usable data.

    Data consumers

    Data consumers are the end users of transformed destination data.

    They are typically QA testers, developers, or analysts.

    Data consumers perform the following Structural-related tasks:

    • Validate the usability of the destination data.

    • Provide guidance on application-specific requirements for data.

    Security and compliance

    Security and compliance specialists ensure and validate that the data that Structural produces meets expectations, and that Structural is compliant with other security-related processes.

    Security and compliance specialists perform the following Structural-related tasks:

    • Provide guidance on what data is sensitive.

    • Sign off on proposed approaches to mask sensitive data.

    • Approve data access and permissions.

    Advanced workspace overrides

    For self-hosted instances, Structural provides environment settings to configure features that include:

    • Consistency across runs and databases

    • Data generation performance

    The Advanced Workspace Overrides section of the workspace details view allows you to override those environment settings for an individual workspace.

    For example, the environment setting TONIC_TABLE_PARALLELISM determines the number of tables that Structural processes simultaneously. You can then override that value within individual workspaces.

    The workspace overrides are available on both self-hosted instances and on Structural Cloud.

    Configuring the overrides

    To display the available override settings, expand Advanced Workspace Overrides.

    Enabling and setting an override

    For information on how to configure the statistics seed, go to .

    For other settings, to enable the override and set the override value:

    1. Toggle the setting to the on position.

    2. Set the value.

    Removing an override

    To remove the override, toggle the setting to the off position.

    Available overrides

    Workspace statistics seed for cross-run consistency

For generators where consistency is enabled, a statistics seed enables consistency across data generation runs. The Structural-wide statistics seed value ensures consistency across both data generation runs and workspaces.

    You use the Override Statistics Seed setting to override the Structural-wide statistics seed value.

    You can either disable consistency across data generations, or provide a seed value for the workspace. The workspace seed value ensures consistency across data generation runs for that workspace, and across other workspaces that have the same seed value.

    For details about using seed values to ensure consistency across data generation runs and databases, go to .
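The effect of a shared seed can be illustrated with a few lines of Python: the same seed value produces the same pseudo-random draws on every run, which is what makes output consistent across runs and across workspaces that share the seed. The seed name below is a placeholder, not a Structural setting.

```python
# Same seed in, same draws out: the basis of cross-run consistency.
import numpy as np

WORKSPACE_STATISTICS_SEED = 123456  # placeholder value

run_1 = np.random.default_rng(WORKSPACE_STATISTICS_SEED).normal(size=3)
run_2 = np.random.default_rng(WORKSPACE_STATISTICS_SEED).normal(size=3)

assert (run_1 == run_2).all()  # identical output across runs with the same seed
```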

    Data generation performance settings

Structural provides environment settings to manage data generation performance. For example, these settings include configuration for parallel processing.

    From Advanced Workspace Overrides, you can override some of these data generation performance settings for an individual workspace.

    Data encryption and decryption keys

To use Structural data encryption, you must provide the data encryption and decryption keys in the environment settings.

    You use the Override Data Decryption Key and Override Data Encryption Key settings to override the Structural-wide keys that are provided in the environment settings.

    Destination database schema creation

    Some data connectors allow you to configure whether you provide the schema for the destination database. For more information, go to related information for , , , , and .

    From Advanced Workspace Overrides, you can override the instance-wide configuration.

    Overwrite handling for Databricks

    Databricks allows you to configure how Structural handles overwrites of existing data.

    You use the Override Workspace Default Error on Override and Override Workspace Default Save Mode settings to override the instance-wide configuration.

    Configuring multiple columns

    The bulk edit option on Database View allows you to configure multiple columns at the same time. From the bulk editing panel, you can:

    • Mark the selected columns as sensitive or not sensitive.

    • Assign a generator to the selected columns.

    • Apply the recommended generator to the selected columns.

    • Reset the generator configuration to the baseline. This option requires that all of the selected columns are assigned the same preset.

    Depending on the column selection, you can also create a new sensitivity rule.

    Displaying the bulk edit option

    To select the columns and display the bulk edit option:

    1. Check the checkbox next to each column to update.

    2. Click Bulk Edit.

    Marking the columns as sensitive or not sensitive


    Required workspace permission: Configure column sensitivity

    On the Bulk Edit panel, under Sensitivity:

    • To mark the selected columns as sensitive, click Sensitive.

    • To mark the selected columns as not sensitive, click Not Sensitive.

    Changing the assigned generator


    Required workspace permission: Configure column generators

    On the Bulk Edit panel, under Bulk Edit Applied Generator, select and configure the generator to assign to the selected columns.

    Assigning the recommended generator to the columns


    Required workspace permission: Configure column generators

    If any of the selected columns have a recommended generator, then on the Bulk Edit panel, the Generator recommendations found panel displays. The panel indicates the number of selected columns that have a recommendation.

    To assign the recommended generators to those columns, click Apply.

    Restoring the baseline configuration for the columns


    Required workspace permission: Configure column generators

    For a generator preset, the baseline configuration is the configuration that is saved for that preset. The baseline configuration determines the default configuration to use when you assign the preset to a column. After you select the preset, you can override the baseline configuration.

    If all of the selected columns are assigned the same preset, then to restore the baseline configuration for all of the columns, click Reset, then select Reset to Generator Preset.

    Creating a sensitivity rule


    Required license: Enterprise

    Required global permission: Create and manage sensitivity rules

    You might bulk edit columns that could benefit from a custom sensitivity rule.

    For example, in your data, the Widget column is in multiple tables and contains sensitive data that Structural cannot identify. You select all of the Widget columns so that you can mark them as sensitive and apply the Character Scramble generator to them.

    However, a custom sensitivity rule would ensure that in the future, Widget columns are always marked as sensitive and have the Character Scramble generator recommended.

    On the Bulk Edit panel, when all of the selected columns:

    • Have the same data type.

    • Do not have a generator assigned.

    • Do not have a recommended generator.

    Then Structural displays the Create a Sensitivity Rule panel, which contains the option to create a new sensitivity rule.

    To create a sensitivity rule:

    1. Click Create Custom Rule.

    2. On the Create Custom Rule view, configure the new sensitivity rule. Structural automatically selects a data type based on the selected columns. The current workspace is used as the testing workspace to verify the columns that match the rule configuration. For details about the sensitivity rule configuration, go to .

    3. When you finish configuring the new rule:

    Structural closes the sensitivity rule configuration view and returns you to Database View. It maintains the previous column selection.

    If you did not apply the generator preset, then the sensitivity rule is included in the next sensitivity scan.

    Array JSON Mask

    This is a composite generator.

    A version of the JSON Mask generator that can be used for array values.

Runs a selected generator on values that match a user-specified JSONPath.

    Characteristics

    How to configure

    Adding a sub-generator

    To assign a generator to a path expression:

    1. Under Sub-generators, click Add Generator. On the sub-generator configuration panel, the Cell JSON field contains a sample value from the source database. You can use the previous and next icons to page through different values.

    2. In the Path Expression field, type the JSONPath expression to identify the value to apply the generator to. To populate a path expression, you can also click a value in the Cell JSON field. Matched JSON Values shows the result from the value in Cell JSON.
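As a simplified stand-in for a path-based sub-generator (real JSONPath expressions such as $.user.email are far more expressive), the sketch below applies a masking function to one matched field and leaves the rest of the document unchanged. The helper names are hypothetical.

```python
# Simplified illustration of applying a sub-generator to a matched JSON field.
import random
import string

def scramble(text: str) -> str:
    return "".join(
        random.choice(string.ascii_lowercase) if c.isalpha() else c for c in text
    )

def apply_to_field(document: dict, field: str, sub_generator) -> dict:
    masked = dict(document)
    if isinstance(masked.get(field), str):
        masked[field] = sub_generator(masked[field])
    return masked

cell = {"name": "Jane Doe", "plan": "basic"}
print(apply_to_field(cell, "name", scramble))  # e.g. {'name': 'qwne huz', 'plan': 'basic'}
```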

    Managing the sub-generator list

    From the Sub-Generators list:

    • To edit a generator assignment, click the edit icon.

    • To remove a generator assignment, click the delete icon.

    • To move a generator assignment up or down in the list, click the up or down arrow.

    Enabling data encryption

If Structural data encryption is enabled, then to use it for this column, in the advanced options section, toggle Use data encryption process to the on position.

    ASCII Key

    Generates unique alphanumeric strings based on any printable ASCII characters. The length of the source string is not preserved. You can choose to exclude lowercase letters from the generated values.

    Characteristics

Consistency: Yes, can be made self-consistent.

    How to configure

    To configure the generator:

    1. To exclude lowercase letters from the generated values, toggle Exclude Lowercase Alphabet to the on position.

    2. Toggle the Consistency setting to indicate whether to make the generator consistent. By default, the generator is not consistent.

    3. If Structural data encryption is enabled, then to use it for this column, in the advanced options section, toggle Use data encryption process to the on position.

    Business Name

    Generates a random company name-like string.

    Characteristics

Consistency: Yes, can be made self-consistent or consistent with another column.
Linking: No, cannot be linked.

    How to configure

    To configure the generator, toggle the Consistency setting to indicate whether to make the generator consistent.

    By default, the generator is not consistent.

    If consistency is enabled, then by default it is self-consistent. To make the generator consistent with another column, from the Consistent to dropdown list, select the column.

    When the generator is consistent with itself, then a given source value is always mapped to the same destination value. For example, My Business is always mapped to New Business.

    When the generator is consistent with another column, then a given source value in that other column always results in the same destination value for the company name column. For example, if the company name column is consistent with a name column, then every instance of John Smith in the name column in the source database has the same company name in the destination database.

If Structural data encryption is enabled, then to use it for this column, in the advanced options section, toggle Use data encryption process to the on position.

    Character Scramble

    This generator replaces letters with random other letters and numbers with random other numbers. Punctuation, whitespace, and mathematical symbols are preserved.

    For example, for the following input string:

    ABC.123 123-456-789 Go!

    The output would be something like:

    PRX.804 296-915-378 Ab!

    This generator securely masks letters and numbers. There is no way to recover the original data.

Character Scramble is similar to Character Substitution, with a couple of key differences.

    While you can enable consistency for the entire value, Character Scramble does not always replace the same source character with the same destination character. Because there is no guarantee of unique values, you cannot use Character Scramble on unique columns.

    Character Substitution, however, does always map the same source character to the same destination character. Character Substitution is always consistent, which makes it less secure than Character Scramble. You can use Character Substitution on unique columns.

    Characteristics

    How to configure

    To configure the generator, toggle the Consistency setting to indicate whether to make the generator self-consistent.

    By default, the generator is not consistent.

If Structural data encryption is enabled, then to use it for this column, in the advanced options section, toggle Use data encryption process to the on position.

    Constant

    Uses a single value to mask all of the values in the column.

    For example, you can replace every value in a string column with the value String1. Or you can replace every value in a numeric column with the value 12345.

    Characteristics

    How to configure

    To configure the generator, in the Constant Value field, provide the value to use.

    The value must be compatible with the field type. For example, you cannot provide a string value for an integer column.

If Structural data encryption is enabled, then to use it for this column, in the advanced options section, toggle Use data encryption process to the on position.

    Cross Table Sum

    Links columns in two tables. This column value is the sum of the values in a column in another table.

    This generator does not provide a preview. The sums are not computed until the other table is generated.

    For example, a Customers table contains a Total_Sales column. The Transactions table uses a foreign key Customer_ID column to identify the customer who made the transaction, and an Amount column that contains the amount of the sale. The Customer_ID value in the Transactions table is a value from the ID primary key column in the Customers table.

    You assign the Cross Table Sum generator to the Total_Sales column. In the generator configuration, you indicate that the value is the sum of the Amount values for the Customer_ID value that matches the primary key ID value for the current row.

    For the Customers row for ID 123, the Total_Sales column contains the sum of the Amount column for Transactions rows where Customer_ID is 123.
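The relationship in the example can be sketched as a simple aggregation, shown below for illustration (the table and column names follow the example above).

```python
# Each customer's Total_Sales is the sum of Amount over their Transactions rows.
from collections import defaultdict

customers = [{"ID": 123, "Total_Sales": None}, {"ID": 456, "Total_Sales": None}]
transactions = [
    {"Customer_ID": 123, "Amount": 25.00},
    {"Customer_ID": 123, "Amount": 10.50},
    {"Customer_ID": 456, "Amount": 99.99},
]

sums = defaultdict(float)
for row in transactions:
    sums[row["Customer_ID"]] += row["Amount"]

for customer in customers:
    customer["Total_Sales"] = round(sums[customer["ID"]], 2)

print(customers)  # ID 123 -> 35.5, ID 456 -> 99.99
```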

    Characteristics

    How to configure

    To configure the generator:

    1. From the Foreign Table dropdown list, select the table that contains the column for which to sum the values.

    2. From the Foreign Key dropdown list, select the foreign key. The foreign key identifies the row from the current table that is referred to in the foreign table.

    3. From the Sum Over dropdown list, select the column for which to sum the values.

    Email

    Scrambles the characters in an email address. It preserves formatting and keeps the @ and . characters.

    For example, for the following input value:

    [email protected]

    The output value would be something like:

    [email protected]

    By default, the generator scrambles the domain. You can configure the generator to not mask specific domains. You can also specify a domain to use for all of the output email addresses.

    For example, if you configure the generator to not scramble the domain company.com, then the output for [email protected] would look something like:

    [email protected]

    This generator securely masks letters and numbers. There is no way to recover the original data.
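A minimal sketch of the behavior described above (not Structural's implementation): the local part is always scrambled, the domain is scrambled unless it is excluded, and a fixed output domain can be forced. The example address and domain values are placeholders.

```python
# Illustrative email masking with excluded and forced domains.
import random
import string

EXCLUDED_DOMAINS = {"company.com"}  # domains to leave unmasked
FORCED_DOMAIN = None                # e.g. "mycompany.com" to use one domain for all output

def scramble(text: str) -> str:
    return "".join(
        random.choice(string.ascii_lowercase) if c.isalpha()
        else random.choice(string.digits) if c.isdigit()
        else c
        for c in text
    )

def mask_email(address: str) -> str:
    local, domain = address.split("@", 1)
    if FORCED_DOMAIN:
        domain = FORCED_DOMAIN
    elif domain not in EXCLUDED_DOMAINS:
        domain = scramble(domain)
    return f"{scramble(local)}@{domain}"

print(mask_email("jane.doe@company.com"))  # e.g. kxqw.vtz@company.com
```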

    If your email addresses include name values - for example, [email protected] - then you can use the Regex Mask generator to produce email addresses that are tied to name values in the same table. For information on how to do this, go to .

    Characteristics

    How to configure

    To configure the generator:

    1. In the Email Domain field, enter a domain to use for all of the output values. For example, use @mycompany.com for all of the generated values. The generator scrambles the content before the @.

    2. In the Excluded Email Domains field, enter a comma-separated list of domains for which email addresses are not masked in the output values. This allows you, for example, to maintain internal or testing email addresses that are not considered sensitive.

    Event Timestamps

    Generates timestamps that fit an event distribution. The source timestamp must include a date. It cannot be a time-only value.

    Link columns to create a sequence of events across multiple columns. This generator can be partitioned by other columns.

    Characteristics

Consistency: No, cannot be made consistent.

    How to configure

    To configure the generator:

    1. From the Link To dropdown list, select the other Event Timestamps generator columns to link this column to. Linking creates a sequence across multiple columns.

    2. From the Partition drop-down list, select one or more columns to use to partition the data. The selected columns must have their generator set to either Passthrough or Categorical. For more information about partitioning and how it works, go to .

    3. The Options list displays the current column and linked columns. Use the Up and Down arrows to set the order of the columns in the event sequence.

    Geo

    This generator can be used to mask columns of latitude and longitude.

    The Geo generator divides the globe into grids that are approximately 4.9 x 4.9 km. It then counts the number of points within each grid.

    During data generation, each (latitude, longitude) pair is mapped to its grid.

    • If the grid contains a sufficient number of points to preserve privacy, then the generator returns a randomly chosen point in that grid.

    • If the grid does not contain enough points to preserve privacy, then the generator returns a random coordinate from the nearest grid that contains enough points.
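A conceptual sketch of the grid-based mapping follows. The grid size and the minimum point count are placeholders chosen for illustration; Structural's actual values and nearest-grid search are not documented here.

```python
# Grid-bucketing sketch: return a random point from the source point's grid
# cell, or from the nearest sufficiently populated cell.
import math
import random
from collections import defaultdict

GRID_DEG = 0.045  # roughly 4.9 km of latitude (placeholder)
K = 5             # minimum points per cell to preserve privacy (placeholder)

def cell(lat: float, lon: float) -> tuple:
    return (math.floor(lat / GRID_DEG), math.floor(lon / GRID_DEG))

def mask_point(lat: float, lon: float, points_by_cell: dict) -> tuple:
    home = cell(lat, lon)
    if len(points_by_cell.get(home, [])) >= K:
        return random.choice(points_by_cell[home])
    # Otherwise fall back to the nearest cell that holds enough points.
    candidates = [c for c, pts in points_by_cell.items() if len(pts) >= K]
    nearest = min(candidates, key=lambda c: (c[0] - home[0]) ** 2 + (c[1] - home[1]) ** 2)
    return random.choice(points_by_cell[nearest])

# Bucket every (latitude, longitude) pair in the source column by grid cell.
points_by_cell = defaultdict(list)
for lat, lon in [(40.71, -74.00), (40.72, -74.01), (40.73, -74.02)] * 4:
    points_by_cell[cell(lat, lon)].append((lat, lon))

print(mask_point(40.715, -74.005, points_by_cell))
```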

    hashtag
    Characteristics

    hashtag
    How to configure

    To configure the generator:

    1. From the Link To dropdown list, select the column to link to this one. You typically assign the Geo generator to both the latitude and longitude column, then link those columns.

    2. From the value type dropdown, select whether this column contains a latitude value or a longitude value.

3. If Structural data encryption is enabled, then to use it for this column, in the advanced options section, toggle Use data encryption process to the on position.

    HStore Mask

    This is a composite generator.

    Runs selected generators on specified key values in an HStore column in a PostgreSQL database. HStore columns contain a set of key-value pairs.

    hashtag
    Characteristics

    Consistency

    Determined by the selected sub-generators.

    hashtag
    How to configure

    hashtag
    Adding a sub-generator

    To assign a generator to a key:

    1. Under Sub-generators, click Add Generator. On the sub-generator configuration panel, the Cell HStore field contains a sample value from the source database. You can use the previous and next icons to page through different values.

2. Under Enter a key, enter the name of a key from the column value. For example, for the column value "pages"=>"446", "title"=>"The Iliad", "category"=>"mythology", to apply a generator to the title, you would enter title as the key. Matched HStore Values shows the result from the value in Cell HStore.
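The following Python sketch illustrates how targeting a single key works, using the example value above. It is not Structural's HStore handling; the parsing helper and the masking function are simplified stand-ins.

```python
import re

def parse_hstore(value):
    """Very loose parse of '"key"=>"value"' pairs; illustration only."""
    return dict(re.findall(r'"([^"]+)"=>"([^"]*)"', value))

def to_hstore(pairs):
    return ", ".join(f'"{k}"=>"{v}"' for k, v in pairs.items())

def mask_key(value, key, generator):
    """Apply the generator function only to the value of the chosen key."""
    pairs = parse_hstore(value)
    if key in pairs:
        pairs[key] = generator(pairs[key])
    return to_hstore(pairs)

row = '"pages"=>"446", "title"=>"The Iliad", "category"=>"mythology"'
print(mask_key(row, "title", lambda v: "*" * len(v)))
# "pages"=>"446", "title"=>"*********", "category"=>"mythology"
```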

    hashtag
    Managing the sub-generators list

    From the Sub-Generators list:

    • To edit a generator assignment, click the edit icon.

    • To remove a generator assignment, click the delete icon.

    • To move a generator assignment up or down in the list, click the up or down arrow.

    Frequently Asked Questions

    hashtag
    What is the minimum required screen width for the Tonic Structural application?

    The minimum screen width is 1120 pixels.


    Tutorial videos

    Use these tutorial videos to learn more about how to use Tonic Structural.

    hashtag
    Tonic Structural 101

    Provides an overview of the Structural workflow and how to use Structural to generate de-identified data. For more information, go to .

    Managing workspaces

    A Tonic Structural workspace provides a context within which to configure and generate transformed data.

    A workspace represents a path between the source data and the transformed output data. For example, postgres-prod-copy to postgres-staging.

    A workspace includes:

    • Where to find the source data to transform during data generation

    Data connection settings

    After you select the connector type, you configure:

    • Where to find the source data

    • Where to write the data generation output

    Built-in sensitivity types that Structural detects

Structural identifies the following types of sensitive values. These include information types that are covered by privacy standards and frameworks such as HIPAA, GDPR, CCPA, and PCI.

For more information about the HIPAA and Safe Harbor information types that Structural detects, go to the Tonic.ai guide Using Tonic Structural and the Safe Harbor method to de-identify PHI.

    Names

    • First

    Array Regex Mask

This is a composite generator.

    A version of the Regex Mask generator that can be used for array values.

    Uses regular expressions to parse strings and replace specified substrings with the output of specified generators. The parts of the string to replace are specified inside unnamed top-level capture groups.

    hashtag
    Characteristics

    Company Name

    circle-info

This generator is deprecated. Use the Business Name generator instead.

    Generates a random company name-like string.

    hashtag
    Characteristics

    Conditional

This is a composite generator.

    Applies different generators to the value conditionally based on any value in the table.

For example, a Users table contains Name, Username, and Role columns. For the Username column, you can use a conditional generator to indicate that if the value of Role is something other than Test, then use the Character Scramble generator for the Username value. For Test users, the username is not masked.
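The following Python sketch illustrates that example. It is not Structural's implementation; it simply dispatches on the Role value and applies a character scramble to the username for non-Test users.

```python
import random
import string

def character_scramble(value):
    """Replace letters with random letters; keep other characters in place."""
    return "".join(random.choice(string.ascii_lowercase) if c.isalpha() else c
                   for c in value)

def conditional_username(row):
    """Mask Username unless Role is 'Test' (the example described above)."""
    if row["Role"] == "Test":
        return row["Username"]              # pass through for test users
    return character_scramble(row["Username"])

print(conditional_username({"Name": "Ada", "Username": "ada.l", "Role": "Admin"}))
print(conditional_username({"Name": "QA Bot", "Username": "qa_bot", "Role": "Test"}))
```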

    hashtag
    Characteristics

    HIPAA Address

    This generator can be used to generate cities, states, and zip codes that follow HIPAA guidelines for safe harbor.

    hashtag
    Handling of address parts

    hashtag
    Zip codes

    Mongo ObjectId Key

    Generates unique object identifiers.

    Can be assigned to text columns that contain MongoDB ObjectId values. The column value must be 12 bytes long.

    hashtag
    Characteristics

    Uses format-preserving encryption (FPE)

    No

    Privacy ranking

    • 3 if not consistent

    • 4 if consistent

    Generator ID (for the API)

    ArrayTextMaskGenerator

    Uses format-preserving encryption (FPE)

    No

    Privacy ranking

    4

    Generator ID (for the API)

    StringMaskGenerator

    Uses format-preserving encryption (FPE)

    No

    Privacy ranking

    • 3 if not consistent

    • 4 if consistent

    Generator ID (for the API)

    FileNameGenerator

    From the Generator Configuration dropdown list, select the generator to apply to the key value. You cannot select another composite generator.

  • Configure the selected generator. You cannot configure the selected generator to be consistent with another column.

  • To save the configuration and immediately add a generator for another key, click Save and Add Another. To save the configuration and close the add generator panel, click Save.

  • Linking

    Determined by the selected sub-generators.

    Differential privacy

    Determined by the selected sub-generators.

    Data-free

    Determined by the selected sub-generators.

    Allowed for primary keys

    No

    Allowed for unique columns

    No

    Uses format-preserving encryption (FPE)

    No

    Privacy ranking

    5

    Generator ID (for the API)

    HStoreMaskGenerator

    Share workspace access

    Grant access to other users. The assigned workspace permission sets determine the level of access.

    Transfer ownership of a workspace

    Make another user the workspace owner. You can also assign yourself workspace permission sets.

    Uses format-preserving encryption (FPE)

    No

    Privacy ranking

    • 3 if not consistent

    • 4 if consistent

    Generator ID (for the API)

    FnrGenerator

    Structural data encryption
    Access to the following data connectors:
    • Amazon Redshift

    • Db2 for LUW

    • Spark SDK

    license plan
    single sign-on (SSO)
    Custom permission sets
    Environment setting configuration
    Structural data encryption
    Setting up and managing a Structural Cloud pay-as-you-go subscription
    Structural organizations
    Granting Account Admin access for a Structural Cloud organization

    Create Structural-processed data pipelines for development and testing workflows.

    By default, the selected generator is applied to any value that matches the expression. To limit the types of values to apply the generator to, from the Type Filter, specify the applicable types. You can select Any, or you can select any combination of String, Number, and Null.
  • From the Generator Configuration dropdown list, select the generator to apply to the path expression. You cannot select another composite generator.

  • Configure the selected generator. You cannot configure the selected generator to be consistent with another column.

  • To save the configuration and immediately add a generator for another path expression, click Save and Add Another. To save the configuration and close the add generator panel, click Save.

  • Consistency

    Determined by the specified sub-generators.

    Linking

    Determined by the specified sub-generators.

    Differential privacy

    Determined by the specified sub-generators.

    Data-free

    Determined by the specified sub-generators.

    Allowed for primary keys

    No

    Allowed for unique columns

    No

    Uses format-preserving encryption (FPE)

    No

    Privacy ranking

    5

    Generator ID (for the API)

    ArrayJsonMaskGenerator

    Structural data encryption

    No, cannot be linked.

    Differential privacy

    No

    Data-free

    No

    Allowed for primary keys

    Yes

    Allowed for unique columns

    Yes

    Uses format-preserving encryption (FPE)

    Yes

    Privacy ranking

    • 3 if not consistent

    • 4 if consistent

    Generator ID (for the API)

    AsciiPkGenerator

    Structural data encryption

    Differential privacy

    Yes, if consistency is not enabled.

    Data-free

    Yes, if consistency is not enabled.

    Allowed for primary keys

    No

    Allowed for unique columns

    No

    Uses format-preserving encryption (FPE)

    No

    Privacy ranking

    • 1 if not consistent

    • 4 if consistent

    Generator ID (for the API)

    BusinessNameGenerator

    Structural data encryption

    Consistency

    No, cannot be made consistent.

    Linking

    No, cannot be linked.

    Differential privacy

    Yes

    Data-free

    Yes

    Allowed for primary keys

    No

    Allowed for unique columns

    No

    Uses format-preserving encryption (FPE)

    No

    Privacy ranking

    1

    Generator ID (for the API)

    ConstantGenerator

    Structural data encryption
    buttons to configure the column sequence.
  • If Structural data encryption is enabled, then to use it for this column, in the advanced options section, toggle Use data encryption process to the on position.

  • Linking

    Yes, can be linked.

    Differential privacy

    No

    Data-free

    No

    Allowed for primary keys

    No

    Allowed for unique columns

    No

    Uses format-preserving encryption (FPE)

    No

    Privacy ranking

    3

    Generator ID (for the API)

    EventGenerator

    Partitioning a column
    Structural data encryption
    GaussianGenerator
    HostnameGenerator
    Last
  • Full

  • Organization

    Location

    • Street address

    • ZIP

    • PO Box

    • City

    • State and two-letter abbreviation

    • Country

    • Postal code

    • GPS coordinates

    Contact information

    • Email address

    • Telephone number

    User credentials

    • Username

    • Password

    Financial information

    • Credit card number

    • International bank account number (IBAN)

    • SWIFT code for bank transfers

    • Money amount

    • BTC (Bitcoin) address

    Identification

    • Social Security Number

    • Passport number

    • Driver's license number

    • Birth date

    • Gender

    • Biometric identifier, such as a fingerprint or voiceprint

    • Full face photographic images and similar images

    Medical information

    • ICD-9 and ICD-10 codes (Used to identify diseases)

    • Medical record number

    • Health plan beneficiary number

    • Admission date

    • Discharge date

    • Date of death

    Other personal information

    • Marital status

    Accounts and licenses

    • Account number

    • Certificate or license number

    Network and web location

    • IP address

    • IPv6 address

    • MAC address

    • Web URL

• International Mobile Equipment Identity (IMEI)

    Vehicle information

    • Vehicle identification number (VIN)

    • License plate number


    View and configure tables

    Filter the table list, and assign table modes to tables.

    View the column list

    Apply filters to and sort the list of columns.

    View sample data

    View example source and destination data for a column.

    Configure an individual column

    Assign a generator and determine the column sensitivity.

    Configure multiple columns

    Use the bulk edit option to configure multiple columns.

    Identify similar columns

    Identify and filter to columns that are similar to a column, based on the column name.

    Comment on columns

    Add and respond to column comments.

    To both save the rule and apply the generator preset to all workspace columns that match the rule, click Save and Apply. On the confirmation panel, click Confirm Auto Apply.

  • To save the rule, but not apply the configured generator preset to matching columns, click Save.

  • Bulk Edit panel to update multiple columns
    Bulk Edit panel with the option to create a sensitivity rule
    Sensitivity rule configuration

    No

    Privacy ranking

    • 3 if not consistent

    • 4 if consistent

    Generator ID (for the API)

    Consistency

    Yes, can be made self-consistent

    Linking

    No, cannot be linked

    Differential privacy

    No

    Data-free

    No

    Allowed for primary keys

    No

    Allowed for unique columns

    No

    Character Substitution
    Structural data encryption

    Uses format-preserving encryption (FPE)

    No

    Privacy ranking

    3

    Generator ID (for the API)

    From the Primary Key dropdown list, select the primary key for the current table.

  • If Structural data encryption is enabled, then to use it for this column, in the advanced options section, toggle Use data encryption process to the on position.

  • Consistency

    No, cannot be made consistent.

    Linking

    No, cannot be linked.

    Differential privacy

    No

    Data-free

    No

    Allowed for primary keys

    No

    Allowed for unique columns

    No

    Uses format-preserving encryption (FPE)

    No

    Privacy ranking

    • 3 if not consistent

    • 4 if consistent

    Generator ID (for the API)

    Toggle the Replace invalid emails setting to indicate whether to replace an invalid email address with a generated valid email address. By default, invalid email addresses are not replaced. In the replacement values, the username is generated. If you specify a value for Email Domain, then the email addresses use that domain. Otherwise, the domain is generated.
  • Toggle the Consistency setting to indicate whether to make the column self-consistent. By default, consistency is disabled.

  • If Structural data encryption is enabled, then to use it for this column, in the advanced options section, toggle Use data encryption process to the on position.

  • Consistency

    Yes, can be made self-consistent.

    Linking

    No, cannot be linked.

    Differential privacy

    No

    Data-free

    No

    Allowed for primary keys

    No

    Allowed for unique columns

    No

    Uses format-preserving encryption (FPE)

    No

    Privacy ranking

    3

    Generator ID (for the API)

    Consistency

    No, cannot be made consistent.

    Linking

    Yes, can be linked.

    Differential privacy

    No

    Data-free

    No

    Allowed for primary keys

    No

    Allowed for unique columns

    Yes

    Structural data encryption

    Uses format-preserving encryption (FPE)

    How do you connect to a local database when running Structural in a Docker container locally?

    If the locally running database that you want to connect to runs in a Docker container:

1. Run docker inspect <container name> for the database container.

    2. In the networks section of the results, find the Gateway IP address. Use this IP address as the server address in Structural.

If the locally running database does NOT run in a container, but runs directly on the host machine, then:

    • On Windows or Mac, use host.docker.internal.

    • On Linux, use 172.17.0.1, which is the IP address of the docker0 interface.

    hashtag
    I allowlist access to my database. What are your static IP addresses?

    If you use Structural Cloud, and your database only allows connections from allowlisted IP addresses, then you need to allowlist Structural static IP addresses.

    This is not required for self-hosted instances of Structural.

    hashtag
    United States-based instance

For the United States-based instance (app.tonic.ai), the static IP address is:

    • 54.92.217.68

    hashtag
    Europe-based instance

For the Europe-based instance (app-de.tonic.ai), the static IP address is:

    • 3.69.249.144

    hashtag
    I allowlist network calls. What do I need to allowlist?

    hashtag
    URLs for telemetry sharing

    The URL https://telemetry.tonic.ai/ is used for our Amplitude telemetry.

    https://telemetry.tonic.ai/logs is used specifically for log sharing.

    Allowlist https://telemetry.tonic.ai/ or the following IP address:

    • 44.193.110.147

    Telemetry sharing is required. These metrics are valuable for us as we debug, make product roadmaps, and determine feature viability.

    No customer data is included. For more information about the specific telemetry data that we collect, go to Data that Tonic.ai collects.

    For more information on how to verify that telemetry is shared, go to Verifying and enabling telemetry sharing.

    hashtag
    URLs for Structural version information

    To support the one-click update option, Structural needs to be able to retrieve information about the latest Structural version.

    For more information, go to .

    hashtag
    How do I check my current version of Structural?

    Click your user image at the top right. The menu includes the Tonic version.

    hashtag
    How should we provision our source database?

    We recommend that you use a static copy of your production database that was restored from a backup.

    If that's not possible, consider the following when you connect Structural to your source data:

    • Structural cannot guarantee referential integrity of the output data if the source database is written to while data is generated. For this reason we recommend that you connect to a static copy of production data.

    • Read replicas and fast followers can be problematic for Structural because of how long it takes some queries to run. Read replicas tend to have short query timeout limits, which causes the queries to time out. Read replicas also reflect recent writes, which means that we cannot guarantee the referential integrity of the output.

    hashtag
    How does Structural use AI?

    For details about the available AI features and how they are supported, go to How Structural uses AI.

    hashtag
    What data does Tonic.ai collect from Structural?

    For details about the types of data that Tonic.ai does and does not collect, go to Data that Tonic.ai collects.

    hashtag
    Creating a Structural workspace

    Provides an overview of what a Structural workspace is and how to create a new Structural workspace. For more information, go to Managing workspaces.

    hashtag
    Sensitivity detection and generator recommendations

    Provides an overview of how Structural detects sensitive values and how you can apply recommended generators to the detected values.

    hashtag
    Managing workspace access

    Provides an overview of workspace owners, permissions, and permission sets. Explains how to share and transfer ownership of a workspace. For more information, go to Managing access to workspaces.

    hashtag
    Structural generators overview

    Identifies the types of generators and transformations that you can use in Structural, and explains how to assign a generator to a column. For more information, go to Generator information.

    hashtag
    Generator presets

    Provides an overview of generator presets. Includes how to create and update them, and how to track where each generator preset is used. For more information, go to Managing generator presets.

    hashtag
    File connector overview

    Provides an overview of the file connector and how to manage file groups in a file connector workspace. For more information, go to File connector.

    hashtag
    Generating data with consistency

    Provides an overview of the consistency generator property and how it works. For more information, go to Enabling consistency.

    hashtag
    Using Document View to configure JSON columns

    Provides an overview of how to enable Document View for a JSON column and how to use it to configure generators for JSON fields.

    hashtag
    Subsetting your data

    Provides an overview of subsetting, how it is configured, and how Structural uses the configuration to generate a subset. For more information, go to Subsetting data.

    hashtag
    Upsert data generation

    Provides an overview of upsert data generation. Includes how it works and how to enable and run it for a workspace. For more information, go to Enabling and configuring upsert.

    hashtag
    Writing destination data to a container repository

    Provides an overview of how to write destination data to a container repository instead of a database server. For more information, go to Writing output to a container repository.

    Structural data generation workflow

• Where to write the transformed data

  • The rules for the transformation

    hashtag
    Manage workspaces

    hashtag
    Workspace details and tools

    hashtag
    Source database connection

    For data connectors that connect to a database, the Source Settings section provides connection information for the source database.

    You cannot change the source data configuration for a child workspace.

    For information about the source connection fields for a specific data connector, go to the workspace configuration topic for that connector type.

    hashtag
    Upsert configuration

    For data connectors that support upsert, the workspace configuration includes an Upsert section to allow you to enable and configure upsert. Upsert adds and updates rows in the destination database, but keeps all other existing rows intact.

    If you enable upsert, then you cannot write output to an Ephemeral database or to a container repository. You must write the output to a destination database.

    For more information, go to Enabling and configuring upsert.

    hashtag
    Destination data location

    For data connectors that connect to a database, the Destination Settings section provides information about where and how Structural writes the output data from data generation.

    Depending on the data connector type, you might be able to write to either:

    • Destination database - Writes the output data to a destination database on a database server.

    • Container repository - Writes the output data to a data volume in a container repository.

    hashtag
    Destination database

    When you write the output to a destination database, the destination database must be of the same type as the source database.

    Structural does not create the destination database. It must exist before you generate data.

    In Destination Settings, you provide the connection information for the destination database. For information about the destination database connection fields for a specific data connector, go to the workspace configuration topic for that connector type.

If available, the Copy Settings from Source option copies the source connection details to the destination settings, for use when both databases are in the same location. Structural does not copy the connection password.

    hashtag
    Container repository

    Some data connectors allow you to write the transformed data to a data volume in a container repository instead of to a database server.

    For more information, go to Writing output to a container repository.

    hashtag
    Testing database connections

When you provide connection details for a database server, Structural provides a Test Connection button to verify that Structural can use the connection details to connect to the database. Structural attempts to reach the database with the provided details, and indicates whether the connection succeeded or failed. We strongly recommend that you test the connections.

    The environment setting TONIC_TEST_CONNECTION_TIMEOUT_IN_SECONDS determines the number of seconds before a connection test times out. You can configure this setting from the Environment Settings tab on Structural Settings. By default, the connection test times out after 15 seconds.

    For the following data connectors, if the connection is successful, then Structural tests and reports on the connection speed.

    Connection speed information for a successful connection test
    • MySQL

    • Oracle

    • PostgreSQL

    • SQL Server

    hashtag
    File connector source and destination data

    A file connector workspace uses files as its source data and produces transformed versions of those files as its output.

    For file connector workspaces, the File Location section indicates where the source files are obtained from - either a local file system or a cloud storage solution (Amazon S3 or Google Cloud Storage).

    When the files come from cloud storage, the Output Location section indicates where to write the transformed files. You must also provide the cloud storage connection credentials.

    For more information, go to Configuring the file connector storage type and output options.

    Consistency

    Determined by the selected sub-generators.

    Linking

    Determined by the selected sub-generators.

    Differential privacy

    Determined by the selected sub-generators.

    Data-free

    Determined by the selected sub-generators.

    Allowed for primary keys

    No

    Allowed for unique columns

    No

    Uses format-preserving encryption (FPE)

    No

    Privacy ranking

    5

    Generator ID (for the API)

    hashtag
    How to configure

    hashtag
    Adding a regular expression

    To add a regular expression:

    1. Click Add Regex. On the configuration panel, Cell Value shows a sample value from the source database. You can use the previous and next options to navigate through the values.

    2. By default, Replace all matches is enabled. To only match the first occurrence of a pattern, toggle Replace all matches to the off position.

    3. In the Pattern field, enter a regular expression. If the expression is valid, then Structural displays the capture groups for the expression.

    4. For each capture group, to select and configure the generator to apply, click the selected generator. You cannot select another composite generator.

5. To save the configuration and immediately add another regular expression, click Save and Add Another. To save the configuration and close the panel, click Save.
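The following Python sketch illustrates how unnamed top-level capture groups drive the replacement described in the steps above. It is not the product's code; the pattern, the sample value, and the generator function are hypothetical.

```python
import re

def regex_mask(value, pattern, group_generators, replace_all=True):
    """Replace each top-level capture group with its generator's output."""
    regex = re.compile(pattern)

    def replace(match):
        out, cursor = [], match.start()
        for i in range(1, regex.groups + 1):
            out.append(match.string[cursor:match.start(i)])   # text before the group
            out.append(group_generators[i - 1](match.group(i)))  # generated output
            cursor = match.end(i)
        out.append(match.string[cursor:match.end()])           # text after last group
        return "".join(out)

    return regex.sub(replace, value, count=0 if replace_all else 1)

# Mask the numeric id but keep the surrounding structure.
print(regex_mask("user-12345@example", r"user-(\d+)", [lambda v: "0" * len(v)]))
# user-00000@example
```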

    hashtag
    Managing the regular expressions list

    From the Regexes list:

    • To edit a regular expression, click the edit icon.

    • To remove a regular expression, click the delete icon.


    Consistency

    Yes, can be made self-consistent or consistent with another column.

    Linking

    No, cannot be linked.

    Differential privacy

    Yes, if consistency is not enabled.

    Data-free

    Yes, if consistency is not enabled.

    Allowed for primary keys

    No

    Allowed for unique columns

    No

    Uses format-preserving encryption (FPE)

    hashtag
    How to configure

    To configure the generator, toggle the Consistency setting to indicate whether to make the generator consistent.

    By default, the generator is not consistent.

    If consistency is enabled, then by default it is self-consistent. To make the generator consistent with another column, from the Consistent to dropdown list, select the column.

    When the generator is consistent with itself, then a given source value is always mapped to the same destination value. For example, My Company is always mapped to New Company.

    When the generator is consistent with another column, then a given source value in that other column always results in the same destination value for the company name column. For example, if the company name column is consistent with a name column, then every instance of John Smith in the name column in the source database has the same company name in the destination database.

    If Structural data encryption is enabled, then to use it for this column, in the advanced options section, toggle Use data encryption process to the on position.
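One common way to get consistent behavior, shown here purely as a sketch rather than a description of Structural's mechanism, is to derive the output deterministically from the source value (for self-consistency) or from the other column's value (for consistency with another column). The company names and salt below are hypothetical.

```python
import hashlib
import random

COMPANIES = ["New Company", "Acme Holdings", "Blue Fern LLC"]   # hypothetical outputs

def consistent_pick(source_value, choices, salt="workspace-secret"):
    """The same input (plus salt) always yields the same output."""
    digest = hashlib.sha256((salt + source_value).encode()).digest()
    rng = random.Random(digest)
    return rng.choice(choices)

# Self-consistent: the same source company always maps to the same fake name.
print(consistent_pick("My Company", COMPANIES))
print(consistent_pick("My Company", COMPANIES))   # identical result

# Consistent with another column: key the pick off that column's value instead.
print(consistent_pick("John Smith", COMPANIES))
```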

    Business Name

    Consistency

    Determined by the selected generators.

    Linking

    Determined by the selected generators.

    Differential privacy

    Determined by the selected generators.

    Data-free

    Determined by the selected generators.

    Allowed for primary keys

    Yes, but:

    • Make sure that the configuration preserves uniqueness.

    • Do not use on primary key columns that are used for subsetting.

    Allowed for unique columns

    Yes

    Uses format-preserving encryption (FPE)

    No

    Privacy ranking

• If a fallback generator is selected, the lower of 5 or the fallback generator's privacy ranking.

    • 5 if no fallback generator is selected

    Generator ID (for the API)

    hashtag
    How to configure

    The generator consists of a list of options. Each option includes the required conditions and the generator to use if those conditions are met.

    hashtag
    Setting the default generator

    The generator always contains a Default option. The Default option is used if the value does not meet any of the conditions. To configure the Default option:

    1. From the Default dropdown list, select the generator to use by default.

    2. Configure the selected generator.

    hashtag
    Adding a condition option

    To add a condition option:

    1. Click + Conditional Generator.

    2. To add a condition:

      1. Click + Condition.

      2. From the column list, select the column for which to check the value.

      3. Select the comparison type.

      4. Enter the column value to check for.

      To remove a condition, click the delete icon for the condition.

    3. From the Generator dropdown list, select the generator to run on the current column if the conditions are met. You cannot select another composite generator.

    4. Choose the configuration options for the selected generator.

    hashtag
    Viewing and editing condition options

    To view details for and edit a condition option, click the expand icon for that option.

    hashtag
    Removing a condition option

    To remove a condition option, click the delete icon for the option.


    How the HIPAA Address generator handles zip codes is based on whether the Replace zeros in truncated Zip Code toggle in the generator configuration is off or on.

By default, the setting is off. In this case, the last two digits of the zip code are replaced with zeros, unless the zip code is in a low population area as designated by the current census. For a low population area, all of the digits in the zip code are replaced with zeros.

    If the setting is on, then the generator selects a real zip code that starts with the same three digits as the original zip code. For a low population area, if a state is linked, then the generator selects a random zip code from within that state. Otherwise the generator selects a random zip code from the United States.
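The following Python sketch illustrates the default (toggle off) behavior. The low-population prefix list is a stand-in for the census-based designation that Structural uses.

```python
LOW_POPULATION_PREFIXES = {"036", "059", "102"}   # illustrative stand-in only

def truncate_zip(zip_code):
    """Default behavior: keep the first three digits and zero the last two,
    unless the prefix is low population, in which case zero everything."""
    prefix = zip_code[:3]
    if prefix in LOW_POPULATION_PREFIXES:
        return "00000"
    return prefix + "00"

print(truncate_zip("30305"))   # 30300
print(truncate_zip("03601"))   # 00000
```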

    hashtag
    Cities

When a zip code column is not linked, a random city in the United States is chosen. When a zip code column is linked, a city that has at least some overlap with the zip code is chosen at random.

    If the original zip code is designated as a low population area and a State column is linked, then a random city within that state is chosen. If a State column is not linked, a random city within the United States is chosen.

    For example, if the original city and zip code are (Atlanta, 30305), the zip code would be replaced with 30300. Many cities contain zip codes that begin in 303, such as Atlanta, Decatur, Chamblee, Hapeville, Dunwoody, and College Park. One of these cities is chosen at random so that, for example, the final value is (Chamblee, 30300).

    hashtag
    States

    HIPAA guidelines allow for information at the state level to be kept. Therefore, these values are passed through.

    hashtag
    Latitude and longitude (GPS) coordinates

GPS coordinates are randomly generated based on the linked HIPAA address components, in the following order of precedence:

    1. If a zip code is linked and its 3-digit prefix is not designated as a low population area, then a random point within the same 3-digit zip code prefix is generated. If the prefix is a low population area, then the linked state is used instead.

    2. If a state is available and a zip code and city are not, or the zip code or city are in a 3-digit zip code prefix that is designated a low population area, then a random GPS coordinate is generated somewhere within the state.

    3. If no zip code, city, or state is linked, or one or more of them were provided, but there was a problem generating a random GPS coordinate within the linked areas, then a GPS coordinate is generated at a random location within the United States.

    Note: If the city component of the HIPAA address is linked with latitude and/or longitude, the GPS coordinate components are randomly generated independently of the city.
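The fallback order can be summarized as a simple cascade. The following Python sketch only captures the decision order described above, not the actual coordinate generation; the region helper and bounds are hypothetical.

```python
import random

def random_point_in(region):
    """Hypothetical helper: return a random (lat, lon) inside a bounding box."""
    (lat_min, lat_max), (lon_min, lon_max) = region
    return (random.uniform(lat_min, lat_max), random.uniform(lon_min, lon_max))

US_BOUNDS = ((24.5, 49.4), (-124.8, -66.9))   # rough continental US box

def hipaa_gps(zip_prefix_region=None, low_population=False, state_region=None):
    """Decision order only: zip prefix, then state, then the United States."""
    if zip_prefix_region is not None and not low_population:
        return random_point_in(zip_prefix_region)
    if state_region is not None:
        return random_point_in(state_region)
    return random_point_in(US_BOUNDS)
```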

    hashtag
    Other address parts

    All other address parts are generated randomly. The output value is not influenced at all by the underlying value in the column.

    hashtag
    Characteristics

    Consistency

    Yes, can be made self-consistent.

    Linking

    Yes, can be linked.

    Differential privacy

    No

    Data-free

    No

    Allowed for primary keys

    No

    Allowed for unique columns

    No

    hashtag
    How to configure

    To configure the generator:

    1. From the Link To dropdown list, select the other columns to link to. You can only select columns that are also assigned the HIPAA Address generator.

    2. From the address part dropdown list, select the type of address value that is in the column.

3. Toggle the Replace zeros in truncated Zip Code setting to indicate how to generate zip codes. If the setting is off, then the last two digits are replaced with zeros. For low population areas, the entire zip code is populated with zeros. If the setting is on, then a real zip code is selected that starts with the first three digits of the original zip code. For low population areas, if a state is linked, a random zip code from the state is used. Otherwise, a random zip code from the United States is used.

    4. Toggle the Consistency setting to indicate whether to make the column self-consistent. By default, consistency is disabled.

5. If Structural data encryption is enabled, then to use it for this column, in the advanced options section, toggle Use data encryption process to the on position.

    hashtag
    Spark supported address parts

    For the HIPAA Address generator, Spark workspaces (Databricks and self-managed Spark clusters) only support the following address parts:

    • City

    • City with State

    • City with State Abbr

    • State

    • State Abbr

    • US Address

    • US Address with Country

    • Zip Code

    The Address generator provides support for additional address parts in Spark workspaces.

    Linking

    No, cannot be linked

    Differential privacy

    No

    Data-free

    No

    Allowed for primary keys

    No

    Allowed for unique columns

    No

    Uses format-preserving encryption (FPE)

    No

    Privacy ranking

    • 3 if not consistent

    • 4 if consistent

    Generator ID (for the API)

    hashtag
    How to configure

    To configure the generator:

1. A MongoDB ObjectId consists of an epoch timestamp, a random value, and an incrementing counter (see the sketch after these steps). To only change the random value portion of the identifier, but keep the timestamp and counter portions, toggle Preserve Timestamp and Incremental Counter to the on position.

    2. Toggle the Consistency setting to indicate whether to make the generator self-consistent. By default, the generator is not consistent.

    3. If Structural data encryption is enabled, then to use it for this column, in the advanced options section, toggle Use data encryption process to the on position.
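The following Python sketch illustrates the preserve option, assuming the standard MongoDB ObjectId layout of a 4-byte timestamp, a 5-byte random value, and a 3-byte counter. It is not Structural's code.

```python
import secrets

def mask_object_id(hex_id, preserve_timestamp_and_counter=True):
    """Replace the 5-byte random segment of a 12-byte ObjectId; optionally
    keep the 4-byte timestamp prefix and 3-byte counter suffix."""
    raw = bytes.fromhex(hex_id)
    assert len(raw) == 12, "ObjectId values must be 12 bytes"
    if preserve_timestamp_and_counter:
        return (raw[:4] + secrets.token_bytes(5) + raw[9:]).hex()
    return secrets.token_bytes(12).hex()

print(mask_object_id("507f1f77bcf86cd799439011"))
```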

    Consistency

    Yes, can be made self-consistent


    Viewing your list of workspaces

    Workspaces view lists the workspaces that you have access to. To display Workspaces view, in the Tonic Structural heading, click Workspaces.

    Workspaces view

    hashtag
    How the workspace list is displayed

    The workspace list contains:

    • Workspaces that you own

    • Workspaces that you are granted access to

If you have the global permission Copy any workspace or Manage user access to Tonic and to any workspace, then the list includes all of the workspaces.

    The Permissions column lists the workspace permission sets that you are granted in each workspace. The permission sets include both permission sets that were granted to you directly as a user, and permission sets that were granted to an SSO group that you are a member of.

Child workspaces always display under their parent workspace. The list only includes child workspaces that you have access to. If you have access to a child workspace, but not to its parent workspace, then the parent workspace is grayed out. You cannot select it.

    hashtag
    Filtering the workspace list

    You can filter the workspaces based on the following information:

    • Name - In the filter field, begin to type text that is in the name of the workspaces to display in the list.

    • Owner - From the Filter by Owner dropdown list, select the owner of the workspaces to display in the list.

    • Database type - From the Filter by Database Type dropdown list, select the type of database for the workspaces to display in the list.

    You can combine different filters. For example, you can filter the list to only include workspaces that use PostgreSQL and for which the generation status is Canceled or Failed.

    Child workspaces always display under their parent workspace, even if the parent workspace does not match the filter.

    hashtag
    Sorting the workspace list

    You can sort the workspace list by name, status, or owner.

    By default, the list is sorted alphabetically by name.

    To sort by a column, click the column heading. To reverse the order of the sort, click the column heading again.

    Child workspaces always display under their parent workspace. The child workspaces are sorted within the parent.

    hashtag
    Workspace details on Workspaces view

    Workspaces view provides the following information about each workspace:

    • Name - Contains the name and database type for the workspace. To view the workspace description, hover over the name.

• Generation status - The status of the most recent generation job. To display the job details, click the job status. To display more details about the date, time, and duration of the job, hover over the generation timestamp. If a job failed recently, Structural also indicates how long the job has been failing, based on the date of the first failure in the current run of consecutive failures.

    hashtag
    Getting access to workspace tools and actions

    hashtag
    Displaying the workspace management view

On Workspaces view, when you click the workspace name, the workspace management view for the workspace is displayed. The Privacy Hub tab is selected.

    The Name column also provides access to a menu of workspace configuration options. When you select an option, the workspace management view is displayed, open to the view for the selected option.

    hashtag
    Options column

    The last column in the workspaces list provides additional workspace options:

    • Subsetting icon - Displays the subsetting configuration for the workspace. Go to .

    • Post-job actions icon - Displays the post-job actions for the workspace. For more information, go to and .

    • Actions menu - Provides access to additional options.

    hashtag
    Actions menu for bulk actions

The Actions menu at the top left of the workspaces list allows you to perform bulk actions on multiple workspaces. It is enabled when you check one or more of the checkboxes in the first column. The Actions menu provides options for the selected workspaces.

    Creating, editing, or deleting a workspace

    hashtag
    Creating a workspace

When you create a new workspace, you can either:

    • Create a completely new workspace.

    • Create a copy of an existing workspace. The copy initially uses the configuration from the original workspace. After the copy is created, it is completely independent from the original workspace.

    • Create a child workspace of an existing workspace. Child workspaces inherit configuration from the parent workspace. They continue to be updated automatically when the parent workspace is updated. For more information, go to .

You can also view this video overview.

    hashtag
    Creating a completely new workspace

    circle-info

    Required global permission: Create workspaces

    To create a completely new workspace, on Workspaces view, click Create Workspace > New Workspace.

    hashtag
    Creating a copy of a workspace

    circle-info

    Required workspace permission: Copy workspace (in the workspace to copy)

    Or

    Required global permission: Copy any workspace

    To create a workspace based on an existing workspace, either:

    • On the workspace management view of the workspace to copy, from the workspace actions menu, select Duplicate Workspace.

    • On Workspaces view, click the actions menu for the workspace, then select Duplicate Workspace.

    When you create a copy of a workspace, the copy initially inherits the following workspace configuration:

    • Source and destination database connections

    • Sensitivity designations, including manual designations that override the sensitivity scan results

    • Table mode assignments

    hashtag
    Creating a child workspace

    circle-info

    Required license: Enterprise

    Required workspace permission: Create child workspaces (in the parent workspace)

    You can create a workspace that is a child of an existing workspace. You cannot create a child workspace of another child workspace.

    The parent workspace must have a source database configured. You cannot create a child workspace from a workspace that uses the Databricks, self-managed Spark cluster, or MongoDB data connector.

    To create a child workspace, either:

    • On Workspaces view:

      • Click Create Workspace > Child Workspace.

      • Click the actions menu for the parent workspace, then select Create Child Workspace.

    On the New Workspace view, under Child Workspace, Parent Workspace identifies the parent workspace.

    If you used the Create Workspace > Child Workspace option to create the child workspace, then Parent Workspace is not populated. From the Parent Workspace dropdown list, select the parent workspace for the new child workspace.

    If you selected the child workspace option for a specific workspace, then Parent Workspace is set to that workspace.

    If you originally chose to create a completely new workspace, then on the New Workspace view:

    1. To change to a child workspace, select Create Child Workspace from the Create a child workspace panel at the right. Structural adds the Child Workspace panel to the New Workspace view.

    2. From the Parent Workspace dropdown list, select the parent workspace for the new child workspace.

    hashtag
    Editing a workspace

    circle-info

    Required workspace permission: Configure workspace settings

    To edit the configuration for an existing workspace, either:

    • On the workspace management view:

      • On the workspace navigation bar, click Workspace Settings.

      • From the workspace actions menu, select Workspace Settings.

    hashtag
    Deleting a workspace

    circle-info

    Required workspace permission: Delete workspace

    You can delete workspaces that you no longer need.

    You cannot delete a parent workspace. You must first delete all of its child workspaces.

    To delete a workspace:

    • On the workspace management view, from the workspace actions menu, select Delete Workspace.

    • On the Workspaces view, click the actions menu for the workspace, then select Delete.

    • On the Workspace Settings view, click Delete Workspace.

    Selecting a secrets manager secret

    For fields that support secrets managers, such as database password fields, if at least one secrets manager is available, then at the top right of the field is a Use Secret link.

    Use Secret option for a field that supports using a secret from a secrets manager

    hashtag
    Indicating to use a secret

    To use a secret to populate the value:

1. Click Use Secret.

    2. On the Use Secret panel, from the Secrets Manager dropdown list, select the name of the secrets manager that contains the secret.

    3. Based on the secrets manager type, Structural prompts you for the information needed to identify and retrieve the secret.

    4. After you provide the required information, click Confirm.

    hashtag
    Updating the secret selection

    To change the secret selection or other information about the selected secret:

    1. Click the Using Secret link.

    2. On the Use Secret panel, update the information.

    3. Click Confirm.

    hashtag
    Selecting a secret from AWS Secrets Manager

When you select a secret from AWS Secrets Manager:

    1. In the Secret ARN or Name field, provide the name or ARN of the secret.

    2. If the secret is part of a structured key-value pair, then in the Property Name field, provide the property name that contains the secret value.

    hashtag
    Selecting a secret from HashiCorp Vault

When you select a secret from HashiCorp Vault:

    hashtag
    Using chained credentials

For HashiCorp Vault, for additional security, you can choose to use chained credentials. When you enable chained credentials, you provide a set of credentials that is used to retrieve a second set of credentials, which in turn is used to retrieve the specified secret.

    To enable chained credentials, toggle Use chained credentials to the on position. The Chained Credentials Configuration section is displayed.

    hashtag
    Selecting the authentication method

    From the Method dropdown list, select the type of authentication to use:

    • AppRole

    • LDAP

    • Token

    hashtag
    Configuring the shared authentication settings

    For all authentication types:

    1. If the selected authentication method is enabled in a specific namespace, then in the first Namespace field, provide the namespace.

    2. If the selected authentication method does not use the default mount path, then in the first Mount path field, provide the mount path.

    3. In the Secret Name field, provide the name of the secret.

    hashtag
    Configuring app role authentication

    For app role authentication:

    1. In the Role ID field, provide the name of the secret property that contains the identifier of the application role.

    2. In the Secret ID field, provide the name of the secret property that contains the secret identifier of the application role.

    hashtag
    Configuring token authentication

    For token authentication, in the Token field, provide the name of the secret property that contains the authentication token to use.

    hashtag
    Configuring LDAP authentication

    For LDAP authentication:

    1. In the LDAP Username field, provide the name of the secret property that contains the LDAP username.

    2. In the LDAP Password field, provide the name of the secret property that contains the password for the LDAP user.

    hashtag
    Providing the database secret

    If you use chained credentials, then the database secret fields are under Database Secret Configuration.

    To provide information about the secret to retrieve:

    1. In the Secret Name field, provide the name of the secret.

    2. If the secret is in a specific namespace, then in the Namespace field, provide the namespace.

    3. If the authentication does not use the default mount path, then in the Mount Path field, provide the mount path.

    hashtag
    Selecting a secret from CyberArk Central Credential Provider

When you select a secret from CyberArk Central Credential Provider:

    1. In the Secret Name field, provide the name of the secret.

    2. Optionally:

1. In the CyberArk Safe field, provide the name of the CyberArk safe that contains the secret.

    hashtag
    Removing the secret selection

To remove the secret selection entirely and allow the value to be entered manually:

    1. Click the Using Secret link.

    2. On the Use Secret panel, click Remove.

    Sharing workspace access

    circle-info

    Required license: Professional or Enterprise

    Required permission

    • Global permission: View organization users. This permission is only required for the Tonic Structural application. It is not needed when you use the Structural API.

    • Either:

      • Workspace permission: Share workspace access

      • Global permission: Manage user access to Tonic and to any workspace

    hashtag
    About workspace access

Tonic Structural uses workspace permission sets for role-based access control (RBAC) for each workspace.

    A workspace permission set is a set of workspace permissions. Each permission provides access to a specific workspace feature or function.

    Structural provides built-in workspace permission sets. Enterprise instances can also create custom workspace permission sets.

    To share workspace access, you assign workspace permission sets to users and, if you use SSO to manage Structural users, to SSO groups.

Before you assign a workspace permission set to an SSO group, make sure that you are aware of who is in the group. The permissions that are granted to an SSO group are automatically granted to all of the users in the group. For information on how to configure Structural to filter the allowed SSO groups, go to .

    hashtag
    Limitations on workspace sharing

    hashtag
    Cannot remove the owner permission set from the owner

    You cannot remove the owner workspace permission set from the workspace owner. By default, the owner permission set is the built-in Manager permission set.

    hashtag
    Cannot grant or remove a permission you do not have

    Within a workspace, the Share workspace access permission grants you the ability to share access to the workspace.

However, you cannot grant or revoke a workspace permission that you do not have.

    For example, for a given workspace, the workspace permission set that is assigned to you includes Share workspace access, but does not include Run data generation. Because of this, when you share workspace access:

    • You cannot grant access to a workspace permission set that includes Run data generation.

    • You cannot remove access to a workspace permission set that includes Run data generation.

    Note that this requirement does not apply to users who have the global permission Manage user access to Tonic and to any workspace, which is by default granted to Admin users. Those users can grant or revoke any workspace permission set.

    hashtag
    Changing the workspace access

    To change the current access to the workspace:

    1. To manage access to a single workspace, either:

      • On the workspace management view, in the heading, click the share icon.

      • On Workspaces view, click the actions menu for the workspace, then select Share.

    Viewing and configuring tables

The table list at the left of Database View contains the list of tables in the source database. You can filter the table list and assign table modes to the tables.

    hashtag
    Information in the table list

    The table list is grouped by schema. You can expand and collapse the list of tables in each schema. This does not affect the displayed columns.

    For a file connector workspace, each table corresponds to a file group.

    For each table, the table list includes the following information:

    • The name of the table.

    • The number of columns that have an assigned generator (a generator other than Passthrough). The number does not display if none of the table columns has an assigned generator.

• The assigned table mode. The table list only shows the first letter of the table mode.

    For a child workspace, if the selected table mode overrides the parent workspace configuration, then the override icon displays.

To display the details for a table, click the arrow icon to the right of the table entry.

    hashtag
    Filtering the table list

You can filter the table list by table name and by the assigned table mode. You can also filter the tables based on whether they have assigned generators.

    As you filter the table list, the column list also is filtered to only include the columns for the filtered tables.

    hashtag
    Filtering by table name

    To filter the table list by name, in the filter field, begin to type text that is in the table name.

    As you type, Tonic Structural filters the list to only display tables with names that contain the filter text.

    hashtag
    Filtering by the assigned table mode

    To filter the table list based on the assigned table mode:

    1. Click Filters.

    2. On the filter panel, check the checkbox next to each table mode to include. By default, the list includes all of the table modes. As you check and uncheck the table mode checkboxes, Structural adds and removes the associated tables from the list.

    hashtag
    Filtering to exclude tables that have assigned generators

    You can filter the table list to only display tables that have no assigned generators:

    1. Click Filters.

    2. On the filter panel, to only show tables that do not have assigned generators, check the No Generators Applied checkbox.

    hashtag
    Assigning table modes to tables

    circle-info

    Required workspace permission: Assign table modes

    The table mode determines the number of rows and columns in the destination database. For details about the available table modes and how they work, go to .

    hashtag
    Updating a single table

    To change the assigned table mode for a single table:

    1. Click the table mode dropdown next to the table name.

    2. From the table mode dropdown list, select the table mode.

    3. For a child workspace, the table mode selection panel indicates whether the selected table mode is inherited from the parent workspace. If the child workspace currently overrides the parent workspace configuration, then to reset the table mode to the table mode that is assigned in the parent workspace, click Reset.

    hashtag
    Updating multiple tables

    To change the assigned table mode for multiple tables:

    1. Check the checkbox for each table to change the table mode for. To select a continuous range of tables, click the first table in the range, then Shift-click the last table in the range. To select all of the tables in a schema, click the schema name.

    2. Click Bulk Edit.

    3. On the panel, click the radio button for the table mode to assign to the selected tables.

    Performing scans on collections

    circle-info

    Required workspace permission: Run collection scan

    When you first connect to a MongoDB or Amazon DynamoDB database, Tonic Structural performs a scan to determine the available fields in each collection, the field types, and how prevalent the fields are. It performs this scan at the same time as the initial sensitivity scan.

    For each collection, Structural creates a hybrid document, which is a superset of all of the fields contained in the collection documents.

    hashtag
    Configuring the collection scan

    By default, for each collection:

    • The scan includes all of the documents in the collection, and continues until the scan is finished.

    • Every unique path (field+data type) in the collection is added to the hybrid document.

You can change the default scan behavior. To change the scan configuration, use the following environment settings. You can add these settings manually to the Environment Settings list on Structural Settings.

    Note that these settings, including settings that include MONGO in the name, apply to both MongoDB and Amazon DynamoDB.

    hashtag
    Configuring how schemas are scanned

    The following options control the number of documents that Structural scans in a collection.

    These options allow you to limit the number of scanned documents when the additional documents do not add fields to the hybrid document.

    For large homogenous collections, where all or most documents have the same structure, configuring these options can improve performance.

    If you set both options, then the scan completes when it reaches either limit. For example, if the maximum document count is 10 and the maximum scan time is 360 seconds, then the scan completes either after 10 documents or after 360 seconds, whichever comes first.
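    For example, to apply the limits described above, you might add the following entries to the Environment Settings list (the setting names are described later in this section; the values are illustrative):

        TONIC_DOCUMENT_SCAN_MAX_DOCS_COUNT=10
        TONIC_DOCUMENT_SCAN_MAX_TIME_SECONDS=360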

    hashtag
    Configuring how fields are collapsed in the hybrid document

    Typically, the number of unique fields in a collection is small relative to the number of documents. However, in some cases the number of fields is similar to or greater than the number of documents. This most commonly occurs when documents have "data as keys", such as keys that are ObjectIds, UUIDs, or incrementing integers.

    In these cases, adding every unique field to the hybrid document can result in a large hybrid document that has an undesirable structure.

    Structural offers configuration options to "collapse" fields within the hybrid document. This shrinks the size of the hybrid document. It also allows you to assign a generator to the collapsed group instead of to each unique key.

    By default, Structural does not collapse fields.

    hashtag
    Collapsing fields when the key is an ObjectId

To enable this behavior, set the TONIC_MONGO_OBJECT_ID_COLLAPSE_THRESHOLD environment setting to the number of ObjectId keys that an object can contain before Structural collapses the object schema into a single key.

    For example, if this is 10, then any object that has 10 or more ObjectId keys is collapsed into a single key.

    A negative value indicates to not collapse the keys.

    The default value is -1.
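    For example, to collapse any object that has 10 or more ObjectId keys into a single key, you might set:

        TONIC_MONGO_OBJECT_ID_COLLAPSE_THRESHOLD=10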

    hashtag
    Collapsing fields when the key matches a custom pattern

    To enable Structural to collapse fields, you provide a regular expression to identify the fields that can be collapsed into the same field. You then configure the number of matches that must exist before Structural collapses the fields.

To configure how the fields are collapsed, use the TONIC_DOCUMENT_COLLAPSE_FIELDS_REGEX and TONIC_DOCUMENT_COLLAPSE_FIELDS_REGEX_THRESHOLD environment settings (see the example after this list).

    For example:

    • To collapse keys that are integer values, use the regular expression [0-9]+ or \d+

    • To collapse keys that are UUIDs, use the regular expression [0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}
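    For example, to collapse groups of 5 or more keys that are UUIDs into a single field, you might add the following entries (the threshold value is illustrative):

        TONIC_DOCUMENT_COLLAPSE_FIELDS_REGEX=[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}
        TONIC_DOCUMENT_COLLAPSE_FIELDS_REGEX_THRESHOLD=5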

    hashtag
    Viewing the most recent scans for each collection

    On Privacy Hub, the Latest Collection Scan table shows the most recent scans on each scanned collection.

    The Build Schema option runs a new scan on the collection.

    hashtag
    Starting a collection scan

    When the source database has a new collection, then on Collection View, you are prompted to run a scan either on that collection or on all collections.

    Running the Structural sensitivity scan

    The Structural sensitivity scan identifies sensitive columns in source data. The scan ignores truncated tables.

    hashtag
    When sensitivity scans run

    For most data connectors, Structural runs sensitivity scans automatically based on specific events. You can also run manual sensitivity scans on demand.

    On a self-hosted instance, sensitivity scans can also run automatically at the same time each day.

    hashtag
    Event-based sensitivity scans

    circle-info

    Structural does not run automatic sensitivity scans for Databricks and Spark SDK workspaces.

By default, Structural does not run automatic sensitivity scans for file connector workspaces. To enable automatic sensitivity scans for the file connector, set the TONIC_ENABLE_FILES_PRIVACY_SCAN_AUTORUN environment setting to true.

    Structural automatically runs a sensitivity scan when you:

    • Create a completely new workspace and connect a data source.

    • Change the data connection details for the source database.

    • Add a file group to a file connector workspace.

    A child workspace always inherits the sensitivity designations from its parent workspace.

    When you copy a workspace, Structural runs a new sensitivity scan on the copy to identify sensitive columns. However, it keeps the sensitivity designation for columns that you specifically marked as sensitive or not sensitive.

    hashtag
    Manual sensitivity scans

In addition to the automatic scans, from Privacy Hub, you can start a sensitivity scan manually.

    hashtag
    Scheduling daily sensitivity scans

    On self-hosted instances, Structural can also run scheduled daily sensitivity scans in the background.

    The daily scans only run on the 10 workspaces that had the most recent activity. Activity includes:

• User-initiated updates that are included in the Protection Audit Trail.

    • Data generation jobs.

    By default, Structural runs the sensitivity scans each day at midnight.

To enable and configure the daily sensitivity scans, use the following environment settings (see the example after the list). You can add these settings to the Environment Settings list on Structural Settings.

    • TONIC_ENABLE_SCHEDULED_SENSITIVITY_SCAN - Boolean to indicate whether to enable the scheduled daily sensitivity scans. The default value is true. To disable the scheduled daily scan, set this to false.

    • TONIC_SENSITIVITY_SCAN_HOUR - When scheduled scans are enabled, the hour at which to run the scans. The setting uses the local time zone. The value is an integer between 0 and 23, where 0 is midnight and 23 is 11:00 PM. For example, a value of 14 indicates to run the job at 2:00 PM. The default value is 0.
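    For example, to keep the scheduled scans enabled but run them at 2:00 PM local time instead of midnight, you might set:

        TONIC_ENABLE_SCHEDULED_SENSITIVITY_SCAN=true
        TONIC_SENSITIVITY_SCAN_HOUR=14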

    hashtag
    Configuring parallel processing for sensitivity scans

    For improved performance, sensitivity scans can use parallel processing.

For relational databases such as PostgreSQL and SQL Server, to configure parallel processing, you use the TONIC_PII_SCAN_PARALLELISM_RDBMS environment setting. The default value is 4.

    For document-based databases such as MongoDB, you use the environment setting TONIC_PII_SCAN_PARALLELISM_DOCUMENTDB. The default value is 1.
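    For example, to increase the parallelism for relational databases while keeping the document database default, you might set the following values (the value 8 is illustrative):

        TONIC_PII_SCAN_PARALLELISM_RDBMS=8
        TONIC_PII_SCAN_PARALLELISM_DOCUMENTDB=1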

    hashtag
    How Structural identifies sensitive values

    The Structural sensitivity scan uses the following rules and processes to:

    • Identify sensitive columns.

• Recommend generators for those columns. For information about applying recommended generators to columns, go to Reviewing and applying recommended generators.

    • Indicate its confidence that an identified column is sensitive and is of the detected sensitivity type.

    Note that this process cannot guarantee perfect precision and recall. We strongly recommend that a human reviews the sensitivity scan results and the broader dataset to ensure that nothing sensitive was missed.

    hashtag
    Rule-based data type, column name, and value analysis - High, medium, or low confidence

To identify that a column contains sensitive information for a built-in sensitivity type, Structural looks at the data type, column name, and column values.

    This part of the sensitivity scan uses regular expression matching and dictionary lookups. It produces high, medium, or low confidence detections.

When this part of the sensitivity scan determines that a column contains sensitive data, it:

    • Marks the column as sensitive

    • Assigns the sensitivity type to the column

    • Recommends the generator configuration for the identified sensitivity type. Note that if the recommended generator is not compatible with the column, then Structural discards the recommendation.

    hashtag
    Custom sensitivity rules - Full confidence

    The sensitivity scan also looks for any columns that match custom sensitivity types that you define in your custom sensitivity rules.

Custom sensitivity rules are based on the column data type and column name. For more information about custom sensitivity rules, go to Creating and managing custom sensitivity rules.

    Custom sensitivity rules always produce full confidence detections.

    When a column matches a custom sensitivity rule, Structural:

    • Marks the column as sensitive.

    • Assigns the sensitivity rule name as the sensitivity type.

    • Recommends the generator preset from the sensitivity rule.

    hashtag
    Model-based analysis - Medium and low confidence

    To identify additional sensitive columns that might not be captured by the other parts of the scan, the sensitivity scan uses an artificial intelligence (AI) model. Note that the model is pre-trained. Structural does not use customer data to train the model, and it does not send any customer data externally.

    This part of the scan produces medium or low confidence detections for built-in entity types.

    The model considers the table and column name. If the combination of table and column name is similar in meaning to a sensitivity type that Structural has a recommended generator for, then Structural:

    • Marks the column as sensitive.

    • Assigns the sensitivity type to the column.

    • Recommends the generator configuration for that sensitivity type.

    hashtag
    Downloading the sensitivity scan log

    To download the log of the most recent sensitivity scan, either:

    • On the workspace management view, from the download menu, select Download Sensitivity Scan Log.

    • On Privacy Hub, click Reports and Logs, then select Scan Log.

    The log tracks the progress of the scan.

    CSV Mask

    This is a composite generator.

    Masks text columns by parsing the values as rows whose columns are delimited by a specified character.

    You can assign specific generators to specific indexes. You can also use the generator that is assigned to a specific index as the default. This applies the generator to every index that does not have an assigned generator.

    The output value maintains the quotes around the index values.

    For example, a column contains the following value:

    "first","second","third"

    You assign the Character Scramble generator to index 0 and assign Passthrough to index 2. You select index 0 as the index to use for the default generator.

    In the output, the first and second values are masked by the Character Scramble generator. The third value is not masked. The output looks something like:

    "wmcop", "xjorsl", "third"

    hashtag
    Characteristics

    hashtag
    How to configure

    hashtag
    Setting the delimiter

    In the Delimiter field, type the delimiter that is used as a separator in the value.

    For example, for the value "first","second","third", the delimiter is a comma.

    hashtag
    Adding a sub-generator

    You can configure a generator for any or all of the indexes. To add a sub-generator for an index:

    1. Under Sub-Generators, click Add Generator. On the add generator dialog, the Cell CSV field contains a sample value from the source data. You can use the navigation icons to page through the values.

    2. In the CSV Index field, type the index to assign a generator to. The index numbers start with 0. You cannot use an index that already has an assigned generator. Matched CSV values shows the value at that index for the current sample column value.

3. Under Generator Configuration, from the Select a Generator dropdown list, select the generator to use for the selected index. You cannot select another composite generator. To remove the selection, click the delete icon.

    4. Configure the selected generator. You cannot configure the selected generator to be consistent with another column.

    5. To save the configuration and immediately add a generator for another index, click Save and Add Another. To save the configuration and close the add generator panel, click Save.

    hashtag
    Managing the sub-generator list

    From the Sub-Generators list:

    • To edit a generator assignment, click the edit icon.

    • To remove a generator assignment, click the delete icon.

    • To move a generator assignment up or down in the list, click the up or down arrow.

    hashtag
    Setting the default for indexes without a generator

    After you configure a generator for at least one index, the Default Link dropdown list is displayed.

    From the Default Link dropdown list, select the index to use to determine how to mask values for indexes that do not have an assigned generator.

    For example, you assign the Character Scramble generator to index 2. If you set Default Link to 2, then all indexes that do not have an assigned generator use the Character Scramble generator.

    Custom Categorical

    A version of the Categorical generator that selects from values that you provide instead of shuffling the original values.

    hashtag
    Characteristics

    Consistency

    Yes, can be made self-consistent or consistent with another column.

    Linking

    hashtag
    How to configure

    hashtag
    Linking the column

    From the Link To dropdown list, select the columns to link this column to.

    You can only select other columns that use the Custom Categorical generator.

    hashtag
    Providing the values to use

    In the Custom Categories text area, provide the list of values that the generator can choose from.

    To provide the values, you can either:

    • Enter the values manually

    • Provide an AI prompt to populate the values (Structural Cloud only)

    hashtag
    Entering the values manually

    When you enter the values manually, put each value on a separate line.

    To add a NULL value to the list, use the keyword {NULL}.

    hashtag
    Providing an AI prompt

    circle-info

    Only available on the Structural Cloud instance that is hosted in the United States. Not available on self-hosted instances or on the European instance of Structural Cloud.

    To use an AI prompt to create the values:

    1. In the AI prompt field below the Custom Categories text area, type the prompt to use to create the values. The prompt can include the number of values to create. For example, 10 names of flowers or 20 cities in California. If the prompt does not include a number, then Structural determines a reasonable set of values to generate based on the prompt. If there is a very limited set of values, then Structural often generates the full set of values. Otherwise it attempts to generate a reasonable number of values, usually between 10 and 20.

    2. Press Enter or click the add values icon.

    The values replace any existing values in the list.

    After you use the prompt to create a set of values, you can edit the list manually.

For information about how Structural uses AI, go to How Structural uses AI.

    hashtag
    Configuring consistency

    Toggle the Consistency setting to indicate whether to make the column consistent.

    By default, consistency is disabled.

    If you enable consistency, then by default the generator is self-consistent.

    To make the generator consistent with another column, from the Consistent to dropdown list, select the column.

    When a generator is self-consistent, then a given value in the source database is always mapped to the same value in the destination database.

    When a generator is consistent with another column, then a given source value in that column always results in the same value for the current column in the destination database. For example, a department column is consistent with a username column. For each instance of User1 in the source database, the value in the department column is the same.
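    One way to picture consistency is a deterministic mapping from a key to one of the custom categories. The sketch below is an assumption about the general idea, not Structural's algorithm; the category values and keys are hypothetical.

        import hashlib

        def consistent_choice(key: str, categories: list) -> str:
            # Hash the key and use the hash to deterministically pick a category,
            # so the same key always maps to the same output value.
            digest = hashlib.sha256(key.encode("utf-8")).digest()
            return categories[int.from_bytes(digest[:8], "big") % len(categories)]

        departments = ["Engineering", "Sales", "Support", "Finance"]

        # Self-consistent: the column's own source value is the key.
        print(consistent_choice("R&D", departments))

        # Consistent with another column: the other column's value (for example, a
        # username) is the key, so every User1 row gets the same department.
        print(consistent_choice("User1", departments))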

    hashtag
    Enabling Structural data encryption

If Structural data encryption is enabled, then to use it for this column, in the advanced options section, toggle Use data encryption process to the on position.

    Date Truncation

    Truncates a date value or a timestamp to a specific part.

    For a date or a timestamp, you can truncate to the year, month, or day.

    For a timestamp, you can also truncate to the hour, minute, or second.

    hashtag
    Characteristics

    hashtag
    How to configure

    To configure the generator:

1. From the dropdown list, select the part of the date or timestamp to truncate to (a sketch of this truncation logic appears after these steps).

       For both date and timestamp values, you can truncate to the year, month, or day. When you select one of these options, the time portion of a timestamp is set to 00:00:00, and the date values below the selected truncation value are set to 01. For example, when you truncate to month, the day value is set to 01 and the time is set to 00:00:00.

       For a timestamp value, you can also truncate to the hour, minute, or second. The date values remain the same as the original data, and the time values below the selected truncation value are set to 00. For example, when you truncate to minute, the seconds value is set to 00.

    2. Toggle the Birth Date option. When you enable Birth Date, the generator shifts dates that are more than 90 years before the generation date to the date exactly 90 years before the generation date. For example, data generation occurs on January 1, 2023. Any date that occurs before January 1, 1933 is changed to January 1, 1933.

      This is mostly intended for birthdate values, to group birthdates for everyone who is older than 89 into a single year. This is used to comply with HIPAA Safe Harbor.
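    The following Python sketch is a conceptual illustration of the truncation and Birth Date behavior described above; it is not Structural's implementation.

        from datetime import datetime

        def truncate(ts: datetime, part: str) -> datetime:
            # Date components below the truncation point become 01; time components become 00.
            if part == "year":
                return ts.replace(month=1, day=1, hour=0, minute=0, second=0, microsecond=0)
            if part == "month":
                return ts.replace(day=1, hour=0, minute=0, second=0, microsecond=0)
            if part == "day":
                return ts.replace(hour=0, minute=0, second=0, microsecond=0)
            if part == "hour":
                return ts.replace(minute=0, second=0, microsecond=0)
            if part == "minute":
                return ts.replace(second=0, microsecond=0)
            return ts.replace(microsecond=0)  # "second"

        def birth_date_floor(ts: datetime, generation_date: datetime) -> datetime:
            # Shift any date that is more than 90 years before the generation date
            # to the date exactly 90 years before the generation date.
            floor = generation_date.replace(year=generation_date.year - 90)
            return max(ts, floor)

        print(truncate(datetime(2021, 12, 20, 13, 42, 55), "month"))   # 2021-12-01 00:00:00
        print(birth_date_floor(datetime(1925, 6, 1), datetime(2023, 1, 1)))  # 1933-01-01 00:00:00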

    hashtag
    Truncation examples

    Here are examples of date and time values and how the selected truncation affects the output:

    Option | Date value | Timestamp value
    Original value | 2021-12-20 | 2021-12-20 13:42:55
    Truncate to year | 2021-01-01 | 2021-01-01 00:00:00
    Truncate to month | 2021-12-01 | 2021-12-01 00:00:00
    Truncate to day | 2021-12-20 | 2021-12-20 00:00:00
    Truncate to hour | Not applicable | 2021-12-20 13:00:00
    Truncate to minute | Not applicable | 2021-12-20 13:42:00
    Truncate to second | Not applicable | 2021-12-20 13:42:55

    MAC Address

    Generates a random MAC address formatted string.

    hashtag
    Characteristics

    Consistency

    Yes, can be made self-consistent.

    Linking

    No, cannot be linked.

    hashtag
    How to configure

    To configure the generator:

1. In the Bytes Preserved field, enter the number of bytes to preserve in the generated address (see the sketch after these steps).

    2. Toggle the Consistency setting to indicate whether to make the column self-consistent. By default, consistency is disabled.

3. If Structural data encryption is enabled, then to use it for this column, in the advanced options section, toggle Use data encryption process to the on position.
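    The following Python sketch illustrates one way to interpret the Bytes Preserved setting; it is a conceptual illustration, not Structural's implementation, and it assumes that the preserved bytes are the leading bytes of the source address.

        import random

        def mask_mac(mac: str, bytes_preserved: int) -> str:
            # Keep the first bytes_preserved bytes of the source address and
            # randomize the remaining bytes, preserving the colon-separated format.
            parts = mac.split(":")
            masked = [
                part if i < bytes_preserved else f"{random.randrange(256):02x}"
                for i, part in enumerate(parts)
            ]
            return ":".join(masked)

        # Preserve the first 3 bytes (for example, to keep a vendor prefix intact).
        print(mask_mac("00:1a:2b:3c:4d:5e", 3))   # for example, 00:1a:2b:7f:09:c3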

    IP Address

    Generates a random IP address formatted string.

    hashtag
    Characteristics

    Consistency

    Yes, can be made self-consistent or consistent with another column.

    Linking

    No, cannot be linked.

    hashtag
    How to configure

    To configure the generator:

    1. In the Percent IPv4 field, type the percentage of output values that are IPv4 addresses. For example, if you set this to 60, then 60% of the generated IP addresses are IPv4 addresses, and 40% of the generated IP addresses are IPv6 addresses. If you set this to 100, then all of the generated IP addresses are IPv4 addresses. If you set this to 0, then all of the generated IP addresses are IPv6 addresses.

    2. Toggle the Consistency setting to indicate whether to make the column consistent. By default, consistency is disabled.

    Finnish Personal Identity Code

    Generates a valid Finnish Personal Identity Code (PIC) that would have been issued during a specific date range.

    hashtag
    Characteristics

    Consistency

    Yes, can be made self-consistent.

    Linking

    No, cannot be linked.

    hashtag
    How to configure

    To configure the generator:

1. Under Date Range, set the start and end dates for the date range to generate the PICs for.

    2. Toggle the Consistency setting to indicate whether to make the generator self-consistent. By default, the generator is not consistent.

    3. If Structural data encryption is enabled, then to use it for this column, toggle Use data encryption process to the on position.

    Numeric String Key

    Generates unique numeric strings of the same length as the input value.

    For example, for the input value 123456, the output value would be something like 832957.

    You can apply this generator only to columns that contain numeric strings.
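    The following Python sketch illustrates the general idea of generating a unique numeric string of the same length as the input; it is a conceptual illustration, not Structural's implementation.

        import random

        def numeric_string_key(value: str, used: set) -> str:
            # Produce a random numeric string with the same length as the input,
            # retrying until the result has not been used before (to keep values unique).
            while True:
                candidate = "".join(random.choice("0123456789") for _ in value)
                if candidate not in used:
                    used.add(candidate)
                    return candidate

        used = set()
        print(numeric_string_key("123456", used))   # for example, 832957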

    hashtag
    Characteristics

    hashtag
    How to configure

    To configure the generator, toggle the Consistency setting to indicate whether to make the generator self-consistent.

    By default, the generator is not consistent.

If Structural data encryption is enabled, then to use it for this column, in the advanced options section, toggle Use data encryption process to the on position.

    Null

    Generates NULL values to fill the rows of the specified column.

    hashtag
    Characteristics

    Consistency

    No, cannot be made consistent.

    Linking

    No, cannot be linked.

    hashtag
    How to configure

    The Null generator has no configuration options.

    Managing your user account

    From the User Settings view, you can manage settings for your individual Tonic Structural account.

    To display the User Settings view:

    1. Click your user image at the top right.

    2. In the menu, click User Settings.

    About the workspace management view

    You use the workspace management view to configure and run data generation for an individual workspace.

    When you log in to Tonic Structural, it displays the workspace management view for the workspace that was selected when you logged out.

    hashtag
    Components of the workspace management view

    About workspace inheritance

    circle-info

    Required license: Enterprise

    If you have multiple workspaces, then it is likely that many of the workspace components and configurations are the same or similar. It can be difficult to maintain that consistency across separate, independent workspaces.

    When you copy a workspace, the new workspace is completely independent of the original workspace. There is no visibility into or inheritance of changes from the original workspace.

    Workspace inheritance allows you to create workspaces that are children of a selected workspace. Unlike a copy of a workspace, a child workspace remains tied to its parent workspace.

    Creating and managing custom sensitivity rules

    circle-info

    Required license: Professional

    Required global permission: Create and manage sensitivity rules

By default, when a Structural sensitivity scan runs on a workspace, it looks for the built-in sensitivity types.

    You can also define custom sensitivity rules to identify other values and the corresponding recommended generator. Your data might include values that are specific to your organization.

    Address

    Generates a random mailing address-like string.

You can indicate which part of an address string the column contains. For example, the column might contain only the street address or the city, or it might contain the full address.

    hashtag
    Characteristics

    HTML Mask

This is a composite generator.

    Masks text columns by parsing the contents as HTML, and applying sub-generators to specified path expressions.

    If applying a sub-generator fails because of an error, the generator selected as the fallback generator is applied instead.

Path expressions are defined using XPath syntax.

For example, to get the text value of an h1 heading, the expression is //h1/text().
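    The following Python sketch shows how a path expression can select text for masking. It uses the lxml library and an assumed HTML fragment; it is a conceptual illustration, not Structural's implementation.

        from lxml import html

        def html_mask(document: str, expression: str, sub_generator) -> str:
            # Parse the document, apply the sub-generator to every text node that
            # matches the path expression, and serialize the result.
            tree = html.fromstring(document)
            for node in tree.xpath(expression):
                parent = node.getparent()
                if node.is_text:
                    parent.text = sub_generator(str(node))
                elif node.is_tail:
                    parent.tail = sub_generator(str(node))
            return html.tostring(tree, encoding="unicode")

        doc = "<html><body><h1>Jane Doe</h1><ul><li>item one</li></ul></body></html>"
        print(html_mask(doc, "//h1/text()", lambda text: "*" * len(text)))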

    International Address

    Generates an address-like string to replace either:

    • For a Canadian postal address:

      • Street name

    No

    Privacy ranking

    • 1 if not consistent

    • 4 if consistent

    Generator ID (for the API)

    CompanyNameGenerator

    If you enable consistency, then by default the generator is self-consistent. To make the generator consistent with another column, from the Consistent to dropdown list, select the column. When a generator is self-consistent, then a given value in the source database is always mapped to the same value in the destination database. When a generator is consistent with another column, then a given source value in that column always results in the same IP address value in the destination database. For example, an IP address column is consistent with a username column. For each instance of User1 in the source database, the value in the IP address column is the same.

  • If Structural data encryption is enabled, then to use it for this column, in the advanced options section, toggle Use data encryption process to the on position.

  • Differential privacy

    Yes, if consistency is not enabled.

    Data-free

    Yes, if consistency is not enabled.

    Allowed for primary keys

    No

    Allowed for unique columns

    No

    Uses format-preserving encryption (FPE)

    No

    Privacy ranking

    • 1 if not consistent

    • 4 if consistent

    Generator ID (for the API)

    IPAddressGenerator

    Differential privacy

    No, cannot be made differentially private.

    Data-free

    Yes, if consistency is not enabled.

    Allowed for primary keys

    No

    Allowed for unique columns

    Yes

    Generator ID (for the API)

    FinnishPicGenerator

    Differential privacy

    Yes

    Data-free

    Yes

    Allowed for primary keys

    No

    Allowed for unique columns

    Yes

    Uses format-preserving encryption (FPE)

    No

    Privacy ranking

    1

    Generator ID (for the API)

    NullGenerator

    Workspaces view

    View the list of workspaces that you have access to.

    Create, edit, and delete workspaces

    Add and remove workspaces, or update a workspace configuration.

    Export and import workspace configuration

    Save an existing workspace configuration. Apply a saved configuration to a workspace.

    Assign workspace tags

    Use tags to identify and organize your workspaces.

    Workspace settings

    Includes identifying information, data connection settings, and data generation settings.

    Workspace management view

    Provides access to workspace configuration and generation tools.

    Workspace inheritance

    Create child workspaces that inherit source data and configuration from their parent workspace.

    Uses format-preserving encryption (FPE)

    No

    Privacy ranking

    • 3 if not consistent

    • 4 if consistent

    Generator ID (for the API)

    HipaaAddressGenerator


An additional environment setting applies to scheduled sensitivity scans:

    TONIC_PII_SCAN_MAX_TIMEOUT_IN_MINUTES_IF_AUTOMATIC - The number of minutes after which a scheduled scan times out. By default, the scan times out after 3 minutes.

    How each part of the sensitivity scan indicates its confidence:

    • The rule-based data type, column name, and value analysis marks the sensitivity detection as high, medium, or low confidence. The confidence level is based on a calculation of how well the column matched the applicable rules.

    • Custom sensitivity rules mark the sensitivity detection as full confidence.

    • The model-based analysis uses AI to compare the table name and column name combination to the sensitivity type, and produces a semantic similarity score. Based on the semantic similarity score, it marks the sensitivity detection as either medium or low confidence.


    Yes, can be linked.

    Differential privacy

    Yes, if consistency is not enabled.

    Data-free

    Yes, if consistency is not enabled.

    Allowed for primary keys

    No

    Allowed for unique columns

    No

    Uses format-preserving encryption (FPE)

    No

    Privacy ranking

    • 1 if not consistent

    • 4 if consistent

    Generator ID (for the API)

    CustomCategoricalGenerator


    Differential privacy

    Yes, if consistency is not enabled.

    Data-free

    Yes, if consistency is not enabled.

    Allowed for primary keys

    No

    Allowed for unique columns

    No

    Uses format-preserving encryption (FPE)

    Yes

    Privacy ranking

    • 1 if not consistent

    • 4 if consistent

    Generator ID (for the API)

    MACAddressGenerator


  • On the workspace management view, from the workspace actions menu, select Create Child Workspace.

    On Workspaces view, click the actions menu for the workspace, then select Workspace Settings.


    To manage access for multiple workspaces:

    1. Check the checkbox for each workspace to grant access to.

    2. From the Actions menu, select Share Workspaces.

• The workspace access panel contains the current list of users and groups that have access to the workspace. To add a user or group to the list, begin to type the user email address or group name. From the list of matching users or groups, select the user or group to add.

    Free trial users can invite other users to start their own free trial. Provide the email addresses of the users to invite. The email addresses must have the same corporate email domain as your email address. When the invited users sign up for the free trial, they are added to the Structural organization for the free trial user that invited them and have access to the workspace.

  • For a user or group, to change the assigned workspace permission sets:

    1. Click Access. The dropdown list is populated with the list of custom and built-in workspace permission sets. If you selected multiple workspaces, then on the initial display of the workspace sharing panel, for each permission set that a user or group currently has access to, the list shows the number of workspaces for which the user or group has that permission set. For example, you select three workspaces. A user currently has Editor access for one workspace and Viewer access for the other two. The Editor permission set has 1 next to it, and the Viewer permission set has 2 next to it.

    2. Under Custom Permission Sets, check the checkbox next to each workspace permission set to assign to the user or group. Uncheck the checkbox next to each workspace permission set to remove from the user or group.

    3. Under Built-In Permission Sets, check the workspace permission set to assign to the user or group. You can only assign one built-in permission set. By default, for an added user or group, the Editor permission set is selected. To select a built-in workspace permission set that is lower in access than the currently selected permission set, you must first uncheck the selected permission set. For example, if Editor is currently checked, then to change the selection to Viewer, you must first uncheck Editor.

  • To remove all access for a user or group, and remove the user or group from the list, click Access, then click Revoke.

  • To save the new access, click Save.


  • D = De-identify

  • S = Scale

  • T = Truncate

  • P = Preserve Destination

  • I = Incremental


    TONIC_DOCUMENT_SCAN_MAX_DOCS_COUNT

    The maximum number of documents to scan for each schema in a collection. For example, if this is 10, then Structural scans up to 10 documents, and ignores the remaining documents. When this value is empty, Structural scans all of the documents.

    TONIC_DOCUMENT_SCAN_MAX_TIME_SECONDS

    The maximum amount of time in seconds to scan a schema. For example, if this is 360, then Structural scans a schema for up to 360 seconds. When this value is empty, Structural continues the scan until it is complete.

    TONIC_DOCUMENT_COLLAPSE_FIELDS_REGEX

    The regular expression that identifies the fields that can be collapsed into a single field. By default, this value is empty.

    TONIC_DOCUMENT_COLLAPSE_FIELDS_REGEX_THRESHOLD

    The number of fields that match the regular expression before Structural collapses the fields into a single field. For example, if this is 5, then after Structural finds 5 fields that match the regular expression, it collapses all of the matching fields into a single field. A negative value indicates to not collapse the fields. The default value is -1.


    Structural changes the link to Using Secret, and disables the field.

    If the vault is enabled in a specific namespace, then in the second Namespace field, provide the namespace.

  • If the secrets engine does not use the default mount path, then in the second Mount path field, provide the mount path.

  • If the secret is part of a structured key-value pair, then in the Property Name field, provide the property name that contains the secret value.
    In the CyberArk Folder field, provide the name of the folder within the safe that contains the secrets manager. If you do not specify a folder here, Structural uses the folder configured in the secrets manager. If a folder is not configured in the secrets manager, then the folder defaults to Root. To specify a folder path, use Root followed by the rest of the path, with each path component separated by backslashes. For example: Root\OS\Linux.

    If you do not provide these values, they fall back to the values that are configured for the secrets manager.

  • If Structural data encryption is enabled, then to use it for this column, in the advanced options section, toggle Use data encryption process to the on position.

Date Truncation characteristics:

    Consistency: No, cannot be made consistent.
    Linking: No, cannot be linked.
    Differential privacy: No
    Data-free: No
    Allowed for primary keys: No
    Allowed for unique columns: No
    Uses format-preserving encryption (FPE): No
    Privacy ranking: 5
    Generator ID (for the API): DateTruncationGenerator

  • Generation status - In the Generation Status column heading, click the filter icon. Check the checkbox next to the generation status values for the workspaces to display in the list.

  • Tags - In the Tags column heading, click the filter icon. By default, the workspaces are not filtered by tag, and all of the checkboxes are unchecked. To only include workspaces that have specific tags, check the checkbox next to each tag to include. To uncheck all of the selected tags, click Reset Tags. When you filter by tag, Structural checks whether each workspace contains any of the selected tags.

  • Permissions - In the Permissions column heading, click the filter icon. You can check and uncheck checkboxes to include or exclude specific permission sets. For example, you can filter the list to only display workspaces for which the Editor permission set is granted either to you or to an SSO group that you belong to. For users that have the global permission Copy any workspace, the Permissions filter panel also contains an Any permissions checkbox. By default, Any permissions is unchecked, and the list includes workspaces for which you are not assigned any workspace permission sets. To display all of the workspaces for which you have any assigned workspace permission sets, check Any permissions. If you filter the list based on a specific permission set, to clear the filter and show all workspaces for which you have any permission set, check Any permissions. To display all workspaces, including workspaces that you do not have any permissions for, uncheck Any permissions.

• Schema changes - Indicates whether Structural detected changes to the source database schema. If there are changes, the column shows the number of changes. Hover over the column value to display additional details, and to navigate to the Schema Changes view. Go to Viewing and resolving schema changes.
  • Tags - The tags that are assigned to the workspace.

  • Permissions - The permission sets that are assigned to you for the workspace.

  • Owner - The name and email address of the workspace owner.


    No

    Privacy ranking

    5

    Generator ID (for the API)


  • Consistency

    Determined by the selected sub-generators.

    Linking

    Determined by the selected sub-generators.

    Differential privacy

    Determined by the selected sub-generators.

    Data-free

    Determined by the selected sub-generators.

    Allowed for primary keys

    No

    Allowed for unique columns

    No

    Uses format-preserving encryption (FPE)

    Privacy ranking

    • 3 if not consistent

    • 4 if consistent

    Generator ID (for the API)

    Consistency

    Yes, can be made self-consistent.

    Linking

    No, cannot be linked.

    Differential privacy

    No

    Data-free

    No

    Allowed for primary keys

    Yes

    Allowed for unique columns

    Yes

    Uses format-preserving encryption (FPE)

    Structural data encryption

    Yes

    The User Settings view includes options to:

    • Configure your user image.

    • View and copy your organization identifier.

    • Configure email notifications for column comments.

• Change your Structural password (if your Structural instance does not use SSO).

    hashtag
    Choosing your user image

    You can select an image to associate with your account. The image is displayed next to your name and email address throughout Structural.

    If your instance uses Google or Azure single sign-on (SSO) to manage Structural users, then by default your Structural account image is the image from the SSO.

    Otherwise, the default image displays your initials.

    To change your user image, click Upload, then select the image file.

    hashtag
    Viewing and copying your organization identifier

    Below your user image file name is the identifier of the organization that your account belongs to.

    To copy the identifier, click the copy icon.

    hashtag
    Configuring notifications for column comments

    circle-info

    Required license: Professional or Enterprise

    Structural allows users to provide comments on columns. You can do this from Privacy Hub and Database View.

    From the Comment Notification Settings section of User Settings, you can configure when to receive email notifications for comments.

    The available options are:

• I am an owner, editor, auditor, or am being replied to - This is the default option. You receive email notifications when comments are made on columns in a workspace that you are an owner, editor, or auditor for. You also receive an email notification when someone replies to a comment that you made.

    • I am @ mentioned - You only receive an email notification if someone specifically mentions you in a comment.

    • Never - You never receive email notifications for column comments.

    hashtag
    Generating and managing API tokens

    Before you can use the Structural API, you must create an API token. From the User API Tokens section of the User Settings view, you can create and manage API tokens.

    User API Tokens list on the User Settings view

    hashtag
    Creating an API token

    To create an API token:

    1. Click Create Token.

    2. On the Create New Token dialog, enter a name for the new token.

    3. Click Confirm.

    In the list, the new token displays as clear text. To copy the new token, click the copy icon next to the token.

    The new token text and copy icon only display during the current session. After that, Structural masks the token and removes the copy icon.

    hashtag
    Revoking an API token

    To revoke a token, click the Revoke option for the token.

    hashtag
    Changing your Structural password

    If your Structural account is not managed using SSO, then from User Settings, you can change your Structural password.

    If your Structural instance uses SSO to manage users, then your user credentials are managed in the SSO system. You cannot change your user password in Structural.

    Under Password Change, to change your Structural password:

    1. In the Old Password field, type your current Structural password.

    2. In the New Password field, type your new Structural password.

    3. In the Repeat New Password field, type your new Structural password again.

    4. Click Confirm.

    hashtag
    Deleting your Structural account

    From User Settings, you can delete your Structural account. If your instance uses SSO to manage users, then deleting your account only affects your access to Structural.

    You cannot delete your Structural account if you are the owner of a workspace for which other users are granted access. Before you can delete your Structural account, you must either:

    • Revoke access from other users.

    • Transfer ownership to a different user.

    To delete your Structural account, click Delete Account.

    When you delete your account, you are logged out of Structural.

    The workspace management view includes the following components.

    hashtag
    Workspace information

    The top left of the workspace management view provides information about the workspace, including:

    Workspace information
    • The workspace name

    • When the workspace was last updated

    • The user who last updated the workspace

• Whether the workspace is a child workspace

    hashtag
    Workspace options

    The top right of the workspace management view provides general options for working with the workspace, including:

    Workspace options
    • Undo and redo options for configuration changes

    • The workspace share icon, to grant workspace access to other users and groups

    • The workspace download menu to:

      • Download sensitivity scan and privacy reports

    • The workspace actions menu

• The Generate Data button, to start data generation

    hashtag
    Workspace navigation bar

    The workspace navigation bar provides access to workspace configuration options.

    Workspace navigation bar

    hashtag
    Displaying the workspace management view

    To display the workspace management view for a workspace:

    • On Workspaces view, in the Name column either:

      • Click the workspace name. The workspace management view opens to Privacy Hub.

      • Click the dropdown icon, then select a workspace management option.

    Workspace tools menu
• Click the search field at the top, then begin to type the name of the workspace. As you type, Structural displays a list of matching workspaces. In the list, click the workspace name.

    Workspace search

    hashtag
    Collapsing and expanding the workspace heading

    To reduce the amount of vertical space used by the heading of the workspace management view, you can collapse it.

    To collapse the heading, click the collapse icon in the Structural heading.

    Workspace heading with collapse option highlighted

    When you collapse the workspace management heading:

    • The workspace information is hidden. The workspace name is displayed in the search field.

    • The workspace options are moved up into the Structural heading.

    The workspace navigation bar remains visible.

    When you collapse the heading, the collapse icon changes to an expand icon. To restore the full heading, click the expand icon.

    Collapsed workspace heading with expand option highlighted
    Workspace management view for a workspace

By default, a child workspace's configuration is synchronized with the configuration of the parent. In other words, any changes to the parent workspace are copied to its child workspaces. Child workspaces can also override some of the parent configuration. From the parent workspace, you can track the child workspaces and how they are customized.

    For example, you might want separate workspaces for different development teams. Each team can make adjustments to suit their specific projects - such as different subsets - but inherit everything else.

    hashtag
    What does a child workspace inherit?

    By default, a child workspace inherits all of the configuration from the parent workspace, except for the following:

    • Workspace name - A child workspace has its own name.

    • Workspace description - A child workspace has its own description.

    • Tags - A child workspace has its own tags.

    • Destination database - A child workspace writes output data to its own destination database. You can copy the destination database from the parent workspace.

    • Intermediate database - For upsert, a child workspace does not inherit the intermediate database.

    • Webhooks - A child workspace has its own webhooks.

    hashtag
    How parent workspace changes affect child workspaces

    When you change the configuration of a parent workspace, the configuration is also updated in the child workspaces.

    The exception is when a child workspace overrides the configuration. If the configuration is overridden, then the child workspace does not inherit the change.

    Tonic Structural indicates on both the parent and child workspaces when the configuration is overridden.

    hashtag
    What can a child workspace override?

    A child workspace can override the following configuration items.

    • Schema management settings - A child workspace can override the settings to determine how to respond to schema changes and whether to cache the source database schema.

    • Table modes - A child workspace can override the table mode for individual tables. The other tables continue to inherit the table mode that is configured in the parent workspace.

    • Column generators - A child workspace can override the generator for individual columns. The other columns continue to inherit the generator that is configured in the parent workspace. For linked columns, a change to any of the linked columns overrides the inheritance for all of the columns.

    • Subsetting - A child workspace can override the subsetting configuration from the parent workspace. Any change in the child workspace means that the child workspace no longer inherits any changes to the subsetting configuration from the parent workspace. For example, if you change the percentage setting on a single target table from 5 to 6, that eliminates the subsetting inheritance. The child workspace keeps the subsetting configuration that it already has, but it is not updated when the parent workspace is updated.

    • Post-job scripts - A child workspace can override the post-job scripts. Any change to the post-job scripts in the child workspace means that the child workspace no longer inherits any changes to the post-job scripts configuration.

• Statistics seed - A child workspace can override the statistics seed.

    From each view, you can eliminate the overrides and restore the inheritance.

    hashtag
    What must a child workspace inherit?

    A child workspace cannot override the following configuration items:

    • Data connector type and source database - A child workspace always uses the same source data as the parent workspace.

    • Foreign keys - A child workspace always uses the same foreign key configuration as the parent workspace.

    • Sensitivity designation for a column - A child workspace cannot change whether a column is marked as sensitive.

    hashtag
    How schema changes are resolved in parent and child workspaces

    For removed tables and columns, when a child workspace overrides the parent workspace configuration for the table or column, you must resolve the change in the child workspace.

    If there is a conflicting change for the removed table or column in the parent workspace configuration, then regardless of whether the configuration is inherited, you must resolve that change in the parent workspace before the change is resolved for the child workspace.

    For changes to column nullability or data type, you resolve the change separately in the child and parent workspaces.

    You also dismiss notifications (new tables and columns) separately in the parent and child workspaces.

    Each custom sensitivity rule specifies:
    • The data type for matching columns.

    • Text matching criteria for the names of matching columns.

    • The recommended generator preset.

    hashtag
    Displaying the list of custom sensitivity rules

    To display the current list of sensitivity rules, in the Structural navigation menu, click Sensitivity Rules.

    Sensitivity Rules view with the list of custom sensitivity rules

    The list contains the sensitivity rules for a self-hosted Structural instance or a Structural Cloud organization.

    For each rule, the list includes:

    • The rule name and description

    • The recommended generator preset

    • When the rule was most recently modified

    hashtag
    Filtering the rules

    You can filter the rule list by the following:

    • Rule name

    • Rule description

    • Generator preset name

    • Name of the user who most recently updated the rule

    In the filter field, start to type text from any of those values. As you type, the list is filtered to only include matching rules.

    Note that when the list is filtered, you cannot change the display sequence of the rules.

    hashtag
    Setting the rule sequence

    Structural applies the rules based on their display order in the list.

    If a column matches more than one rule, Structural applies the first matching rule.

    To change the display order of a rule, drag and drop it to the new location in the list.

    Note that you cannot change the rule sequence when the list is filtered.

    hashtag
    Creating and editing a sensitivity rule

    hashtag
    Creating a sensitivity rule

    To create a sensitivity rule:

    1. On the Sensitivity Rules view, click New Custom Rule.

    2. On the Create Custom Rule view, configure the new rule.

    3. Click Save.

    hashtag
    Editing a sensitivity rule

    To change the configuration of a sensitivity rule:

    1. On the Sensitivity Rules view, click the edit icon for the rule.

    2. On the Edit Custom Rule view, update the configuration.

    3. Click Save.

    Note that any changes to a sensitivity rule do not take effect until the next sensitivity scan.

    hashtag
    Sensitivity rule configuration

    Details view for a custom sensitivity rule

    hashtag
    Rule name and description

    In the Name field, type the name of the sensitivity rule. The rule name becomes the sensitivity type for matching columns. The rule name must be unique, and also cannot match the name of a built-in sensitivity type.

    Optionally, in the Description field, type a longer description of the sensitivity rule.

    hashtag
    Data type

    From the Data Type dropdown list, select the data type for matching columns. For example, a rule might only be used for columns that contain text.

    The available data types are general types that map to specific data types in a given database. The available types are:

    • Array

    • Binary

    • Boolean

    • Continuous Numerical

    • Date Range

    • Datetime

    • Integer

    • JSON

    • MAC Address

    • Network Address

    • Text

    • UUID

    • XML

    hashtag
    Column name criteria

    Under Column Name Match, provide the criteria to identify matching columns based on the column name.

    Note that a matching column must match both the data type and the column name criteria.

    hashtag
    Configuring text matching conditions

When you provide a list of text matching conditions, a matching column must match all of the conditions. In other words, the conditions are joined by AND. A sketch of this matching logic appears after the steps below.

    To apply the same generator preset to columns that have completely different names, you must create separate sensitivity rules.

    To create a list of text matching conditions:

    Column name text match rules for a custom sensitivity rule
    1. Click Text Match.

    2. To add a column name condition, click Add String Match.

    3. For each condition:

      1. From the comparison type dropdown list, select the type of comparison. For example, Contains, Starts with, Ends with.

      2. In the comparison text field, provide the text to check for. The comparison text is case insensitive. For example, if you set a condition to match column names that contain the text term, it also matches column names that contain TERM or Term or tErM.

    4. To remove a column name condition, click its delete icon.
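    The following Python sketch illustrates the matching semantics described above (all conditions must match, and comparisons are case insensitive). The column names and condition values are hypothetical.

        def column_matches(column_name: str, conditions) -> bool:
            # A column matches only if it satisfies every condition (AND semantics).
            name = column_name.lower()
            for comparison, text in conditions:
                text = text.lower()
                if comparison == "Contains" and text not in name:
                    return False
                if comparison == "Starts with" and not name.startswith(text):
                    return False
                if comparison == "Ends with" and not name.endswith(text):
                    return False
            return True

        conditions = [("Starts with", "cust"), ("Contains", "TERM")]
        print(column_matches("customer_terms", conditions))   # True
        print(column_matches("customer_name", conditions))    # False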

    hashtag
    Providing a regular expression

    To use a regular expression to identify matching columns based on the column name:

    Column name regular expression field for a custom sensitivity rule
    1. Click Regular Expression.

    2. In the field, provide the regular expression.

    hashtag
    Generator preset to apply

    From the Recommended Generator Preset dropdown list, select the generator preset that is the recommended generator for matching columns.

    To search for a specific preset, begin to type the generator preset name.

    hashtag
    Managing generator preset configuration

    circle-info

    Required global permission: Create and manage generator presets

    When you configure a sensitivity rule, you can also create a new generator preset or update the configuration of the selected generator preset.

    To create a new generator preset, click Create Preset. On the generator preset details panel, provide the generator preset configuration, then click Create.

    To edit the selected generator preset, click Edit Current Preset. On the generator preset details panel, update the generator preset configuration, then click Save and Apply.

For more information about generator preset configuration, go to the generator presets documentation.

    hashtag
    Previewing the rule results

    If you have access to a workspace, then you can use the workspace to preview the sensitivity rule results.

    Under Test Results, from the workspace dropdown list, select the workspace to use.

    Structural searches the workspace schema for matching columns based on the sensitivity rule configuration.

    It displays any matching columns. You can filter the matching columns based on the table or column name.

    Test Results section to preview the results for a sensitivity rule

    For each matching column, the list includes:

    • The column name and table

    • A sample value from the source data. The sample source value is only present if you have the Preview source data permission for the workspace.

    • A sample replacement value, based on the selected generator preset for the sensitivity rule. The sample replacement value is only present if you have the Preview destination data permission for the workspace.

    hashtag
    Deleting a sensitivity rule

    To delete a sensitivity rule, on the Sensitivity Rules view, click the delete icon for the rule.

    Note that existing generator recommendations for the rule remain in place until the next sensitivity scan.


    Linking

    Yes, can be linked.

    Differential privacy

    Yes, if consistency is not enabled.

    Data-free

    Yes, if consistency is not enabled.

    Allowed for primary keys

    No

    Allowed for unique columns

    No

    Uses format-preserving encryption (FPE)

    No

    Privacy ranking

    • 1 if not consistent

    • 4 if consistent

    Generator ID (for the API)

    AddressGenerator

    hashtag
    How to configure

    To configure the generator:

    1. From the Link To dropdown list, select the columns to link this column to. You can link columns that use the Address generator to mask one of the following address components:

      • City

      • City State

      • Country

      • Country Code

      • State

      • State Abbreviation

      • Zip Code

      • Latitude

      • Longitude

      Note that when linked to another address column, a country or country code is always the United States.

    2. From the address component dropdown list, select the address component that this column contains. The available options are:

      • Building Number

      • Cardinal Direction (North, South, East, West)

      • City

      • City Prefix (Examples: North, South, East, West, Port, New)

      • City Suffix (Examples: land, ville, furt, town)

      • City with State (Example: Spokane, Washington)

      • City with State Abbr (Example: Houston, TX)

      • Country (Examples: Spain, Canada)

      • Country Code (Uses the 2-character country code. Examples: ES, CA)

      • County

      • Direction (Examples: North, Northeast, Southwest, East)

      • Full Address

      • Latitude (Examples: 33.51, 41.32)

      • Longitude (Examples: -84.05, -74.21)

      • Ordinal Direction (Examples: Northeast, Southwest)

      • Secondary Address (Examples: Apt 123, Suite 530)

      • State (Examples: Alabama, Wisconsin)

      • State Abbr (Examples: AL, WI)

      • Street Address (Example: 123 Main Street)

      • Street Name (Examples: Broad, Elm)

      • Street Suffix (Examples: Way, Hill, Drive)

      • US Address

      • US Address with Country

      • Zip Code (Example: 12345)

    3. Toggle the Consistency setting to indicate whether to make the column consistent. By default, the consistency is disabled.

    4. If consistency is enabled, then by default, the generator is self-consistent. To make the generator consistent with another column, from the Consistent to dropdown list, select the column.

      When the Address generator is consistent with itself, the same value in the source database is always mapped to the same destination value. For example, for a column that contains a state name, Alabama is always mapped to Illinois.

      When the Address generator is consistent with another column, the same value in the other column always results in the same destination value for the address column. For example, if the address column is consistent with a name column, then every instance of John Smith in the name column in the source database has the same address value in the destination database. A conceptual sketch of this behavior follows these steps.

    5. If Structural data encryption is enabled, then to use it for this column, in the advanced options section, toggle Use data encryption process to the on position.
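    Here is the conceptual sketch referenced in step 4. It shows the general idea of deriving the destination value deterministically from a key, either the source value itself (self-consistent) or the value of another column. The hashing approach and the state list are illustrative assumptions, not Structural's algorithm.

    # Conceptual sketch of consistency: the output is a deterministic
    # function of a key, so the same key always yields the same output.
    import hashlib

    STATES = ["Alabama", "Alaska", "Arizona", "Arkansas", "California", "Illinois"]

    def pick_state(key: str) -> str:
        # Deterministic choice: hashing the key selects a fixed state.
        digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
        return STATES[int(digest, 16) % len(STATES)]

    # Self-consistent: every "Alabama" in the source maps to the same output.
    print(pick_state("Alabama"))
    # Consistent with a name column: every row for "John Smith" gets the same state.
    print(pick_state("John Smith"))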

    hashtag
    Spark supported address parts

    For the Address generator, Spark workspaces (Databricks and self-managed Spark clusters) only support the following address parts:

    • Building Number

    • City

    • Country

    • Country Code

    • Full Address

    • Latitude

    • Longitude

    • State

    • State Abbr

    • Street Address

    • Street Name

    • Street Suffix

    • US Address

    • US Address with Country

    • Zip Code

    Consistency

    Yes, can be made self-consistent or consistent with another column.

    For the HTML Mask generator, you use XPath expressions to identify the HTML values to mask. For example, to get the value of the first list item in an unordered list, the expression is //ul/li[1]/text().
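    As an illustration only, the following sketch evaluates that expression with the third-party lxml library against the sample HTML that appears later in this reference. Structural evaluates the expression internally when it masks the column; this is simply a way to experiment with XPath expressions.

    # Evaluate the XPath expression against sample HTML using lxml
    # (a third-party library, used here only for illustration).
    from lxml import html

    cell_html = """
    <html>
      <body>
        <div class="container">
          <h1>Title</h1>
          <p>Paragraph content</p>
          <ul>
            <li>Item 1</li>
            <li>Item 2</li>
            <li>Item 3</li>
          </ul>
        </div>
      </body>
    </html>
    """

    doc = html.fromstring(cell_html)
    print(doc.xpath("//ul/li[1]/text()"))  # ['Item 1']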

    hashtag
    Characteristics

    Consistency

    Determined by the selected sub-generators.

    Linking

    Determined by the selected sub-generators.

    Differential privacy

    Determined by the selected sub-generators.

    Data-free

    Determined by the selected sub-generators.

    Allowed for primary keys

    No

    Allowed for unique columns

    No

    hashtag
    How to configure

    hashtag
    Adding a sub-generator

    To assign a generator to a path expression:

    1. Under Sub-generators, click Add Generator. On the sub-generator configuration panel, the Cell HTML field contains a sample value from the source database. You can use the previous and next icons to page through different values.

    2. In the Path Expression field, type the path expression to identify the value to apply the generator to. Matched HTML Values shows the result from the value in Cell HTML.

    3. From the Generator Configuration dropdown list, select the generator to apply to the path expression. You cannot select another composite generator.

    4. Configure the selected generator. You cannot configure the selected generator to be consistent with another column.

    5. To save the configuration and immediately add a generator for another path expression, click Save and Add Another. To save the configuration and close the add generator panel, click Save.

    hashtag
    Managing the sub-generators list

    From the Sub-Generators list:

    • To edit a generator assignment, click the edit icon.

    • To remove a generator assignment, click the delete icon.

    • To move a generator assignment up or down in the list, click the up or down arrow.

    hashtag
    Selecting the fallback generator

    From the Fallback Generator dropdown list, select the generator to use if the assigned generator for a path expression fails.

    The options are:

    • Passthrough

    • Constant

    • Null

  • For a Canadian mailing address:

    • Street Name

    • Postal code
  • For a United Kingdom (UK) mailing address:

    • City

    • County

    • District

    • Country

    • Postal code

  • To replace a Canadian postal code:

    • The generator selects a real postal code that starts with the same three characters - that is, has the same Forward Sortation Area (FSA) - as the original postal code, but that has a different Local Delivery Unit (LDU).

    • For a postal code whose FSA is not on the list that the generator uses, you can provide a fallback value to use.

    To replace UK address components, the generator selects real values.

    hashtag
    Characteristics

    Consistency

    Yes, can be made self-consistent.

    Linking

    No, cannot be linked.

    Differential privacy

    Yes, if consistency is not enabled.

    Data-free

    Yes, if consistency is not enabled.

    Allowed for primary keys

    No

    Allowed for unique columns

    No

    Uses format-preserving encryption (FPE)

    No

    hashtag
    How to configure

    To configure the generator:

    1. From the Generator Type dropdown list, select International Address.

    2. From the Country dropdown list, select the country (Canada or United Kingdom).

    3. From the Address Component dropdown list, select the address component that this column contains. For Canada, the available options are:

      • Street Name

      • Postal Code

      For the UK, the available options are:

      • Postal code

      • City

      • County

      • District

    4. For a Canadian postal code, in the Fallback Value field, type the FSA to use if the value in the data does not exist. For example, the FSA in the data might be new and not yet in the list that Structural uses, or the FSA might be invalid. By default, the fallback value is NULL, meaning that in the destination data, the postal code value is the string literal "NULL".

    5. Toggle the Consistency setting to indicate whether to make the column self-consistent. By default, consistency is disabled.

    6. If Structural data encryption is enabled, then to use it for this column, in the advanced options section, toggle Use data encryption process to the on position.


    Schema management settings

    On the Workspace Settings view for a workspace, the Schema Changes section contains the schema management settings.

    Schema management settings for a workspace

    hashtag
    Responding to schema changes

    Schema changes include:

    • Schema changes that could expose data, which if not addressed can result in data leakage. These changes include new tables and columns, and changes to data types.

    • Notifications, which Structural can handle automatically during each data generation. These include removed tables and columns.

    For more information, go to Viewing and resolving schema changes.

    hashtag
    Selecting the handling option

    On the Workspace Settings view, under Block Data Generation on Schema Changes, select how Structural responds when there are unaddressed changes to the database schema.

    The options are:

    • Do Not Block - With this option, schema changes never block data generation. When you select this option, then Structural automatically handles notifications during data generation.

      The Automatically apply generators toggle determines how Structural responds to changes that could expose data.

      • When this is enabled, then during data generation, Structural automatically applies generators to the affected columns.

        When it is disabled, Structural does not change the column configuration.

    • Block On Changes That Could Expose Data - Indicates to only block data generation if there are schema changes that might expose data, such as new columns. Structural automatically handles notifications during data generation. For this option, Structural does not block data generation for schema changes on truncated tables.

    • Block On All Changes - For this option, if there are any unaddressed schema changes at all, either sensitive changes or notifications, then data generation fails.

    hashtag
    How Structural selects generators to apply automatically

    When Structural automatically applies generators to columns affected by a schema change:

    • For new columns or document fields that are detected as sensitive, Structural applies the recommended generator for the sensitivity type.

    • For new columns or document fields that are not sensitive, Structural applies an appropriate generator based on the data type.

    • For changes to column data types, nullability, or uniqueness, Structural applies a new recommended generator or an appropriate generator for the data type.

    Here is a summary of how Structural determines the generator to apply to columns that are not sensitive and do not match an existing sensitivity type or custom sensitivity rule:

    Data type
    Generator selection

    hashtag
    Indicating whether to cache the source schema

    circle-info

    Schema caching is not available for document-based databases (MongoDB, DynamoDB).

    hashtag
    About the schema cache

    By default, every time you load a workspace, Structural queries the source database to retrieve the schema.

    You can instead configure the workspace to cache the schema. Structural then updates the cache at a regular interval, and whenever a change to the workspace triggers a schema cache update.

    You can also trigger a cache update manually.

    By default, the schema cache is only used by calls from within Structural. To enable an external API request to use the cached schema, add the query parameter useSchemaCache=true to the request.
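    For example, the following sketch shows how the query parameter might be added to an API request. The base URL, endpoint path, workspace identifier, and authorization header format are placeholders; check the Structural API documentation for the exact route and header for your instance.

    # Hedged example of adding useSchemaCache=true to a Structural API call.
    import requests

    BASE_URL = "https://structural.example.com"   # your Structural instance (placeholder)
    API_KEY = "YOUR_API_KEY"                      # a Structural API token (placeholder)
    WORKSPACE_ID = "YOUR_WORKSPACE_ID"            # placeholder

    response = requests.get(
        f"{BASE_URL}/api/Workspace/{WORKSPACE_ID}/schema",   # hypothetical endpoint path
        headers={"Authorization": f"Apikey {API_KEY}"},      # header format may differ
        params={"useSchemaCache": "true"},   # use the cached schema instead of querying the source
    )
    response.raise_for_status()
    print(response.json())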

    In the application, each update to the schema cache is represented by a schema retrieval job. Schema retrieval jobs are short-lived, and run on the Structural web server. You can view the schema retrieval jobs from the workspace Jobs view.

    Note that the schema cache does not include the schema for JSON columns that use Document View. Those schemas are detected by a different scan.

    hashtag
    Enabling and configuring schema caching

    To enable and configure the caching:

    1. On the Workspace Settings view, toggle Cache source schema for faster loading to the on position.

    2. Under Schema Freshness, configure the maximum length of time between schema retrievals.

      1. In the field, provide the value.

      2. From the dropdown list, select the unit of time. You can configure the length of time in minutes, hours, or days.

      If the cached schema is older than that length of time, then the next time the application loads, it queries the source database for the current schema. The default value is 6 hours.

      Note that for some data connectors, schema retrievals run automatically in the background. The schema freshness setting does not affect the frequency of those background retrievals. For example, if a background retrieval runs every 2 hours and you set the schema freshness to 6 hours, the background retrieval still runs every 2 hours. If you set the schema freshness to 1 hour, then a schema retrieval occurs no more than 1 hour after the previous retrieval.

    3. You can optionally enable diagnostic logging for the schema retrieval. Diagnostic logging adds additional diagnostic errors to help with troubleshooting. Note that this additional information might contain sensitive information such as schema identifiers. To enable diagnostic logging:

      1. Click Show advanced options.

      2. Toggle Enable diagnostic logging to the on position.
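    As a conceptual illustration of the freshness check only (assumed names, not Structural's code):

    # If the cached schema is older than the configured maximum age,
    # the schema is re-retrieved on the next load.
    from datetime import datetime, timedelta

    schema_freshness = timedelta(hours=6)        # configured maximum age (default: 6 hours)
    last_retrieval = datetime(2024, 1, 1, 8, 0)  # example timestamp of the cached schema

    def cache_is_stale(now: datetime) -> bool:
        return now - last_retrieval > schema_freshness

    print(cache_is_stale(datetime(2024, 1, 1, 13, 0)))  # False: the cache is 5 hours old
    print(cache_is_stale(datetime(2024, 1, 1, 15, 0)))  # True: the cache is 7 hours old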

    Configuring an individual column

    For an individual column in Database View, you can configure the assigned generator and determine the column sensitivity.

    hashtag
    Displaying the generator configuration panel

    From the column list, to display the generator configuration panel, in the Applied Generator column, click the generator name tag.

    Generator configuration panel

    hashtag
    Indicating whether a column is sensitive

    circle-info

    Required workspace permission: Configure column sensitivity

    The Structural sensitivity scan provides an initial indication of whether a column is sensitive and, if it is sensitive:

    • The type of sensitive data that is in the column.

    • The confidence level of the sensitivity detection.

    For more information, go to Identifying sensitive data.

    In a child workspace, you cannot configure whether a column is sensitive. A child workspace always inherits the sensitivity designations from its parent workspace.

    hashtag
    Status column

    From the Status column, to confirm or change the column sensitivity, click the Status value.

    The status panel indicates whether the column is sensitive. It identifies the sensitivity type, and indicates how the sensitivity was determined - by a sensitivity scan or by a user.

    hashtag
    Built-in sensitivity type

    For a column that matches a built-in sensitivity type, the first time that you display the panel, the Sensitive data? setting displays Yes and No options for you to confirm or change the sensitivity.

    • To indicate that the column is sensitive, click Yes.

    • To indicate that the column is not sensitive, click No.

    When you click Yes or No, the Yes and No options change to a simple toggle. When you click Yes, the sensitivity confidence level changes to full.

    After that:

    • To indicate that the column is sensitive, toggle Sensitive data? to the on position.

    • To indicate that the column is not sensitive, toggle Sensitive data? to the off position.

    hashtag
    Sensitivity rule match

    When a column matches a sensitivity rule, the sensitivity panel indicates that the column matched a sensitivity rule.

    You use the Sensitive data? toggle to indicate whether the column is actually sensitive.

    hashtag
    No built-in sensitivity type or sensitivity rule match

    When a column does not match a built-in sensitivity type or a custom sensitivity rule, the sensitivity panel indicates that the column is not sensitive.

    The Sensitive data? setting displays Yes and No options for you to confirm or change the sensitivity.

    • To indicate that the column is sensitive, click Yes.

    • To confirm that the column is not sensitive, click No.

    When you click Yes or No, the Yes and No options change to a simple toggle.

    If you click Yes:

    • The panel updates to indicate that a user confirmed that the column is sensitive.

    • The sensitivity confidence level is set to full confidence.

    After that:

    • To indicate that the column is sensitive, toggle Sensitive data? to the on position.

    • To indicate that the column is not sensitive, toggle Sensitive data? to the off position.

    hashtag
    Column configuration panel

    To configure the sensitivity, you can also use the Sensitive Data toggle on the column configuration panel.

    • To indicate that a column is sensitive, toggle the sensitivity setting to the on position.

    • To indicate that the column is not sensitive, toggle the sensitivity setting to the off position.

    When you change the sensitivity from the generator configuration panel, the Sensitive data? setting on the sensitivity panel also changes from the Yes and No options to the toggle.

    hashtag
    Assigning or ignoring the recommended generator

    circle-info

    Required workspace permission: Configure column generators

    When a sensitivity scan identifies a column, Structural recommends a generator for the column. For example, when the sensitivity scan identifies a column as a first name, Structural recommends the Name generator configured to generate a first name value.

    In the Assigned Generator column on Database View, columns that do not have an assigned generator, and that have a recommended generator, display the available recommendation icon.

    When you click the generator dropdown, the column configuration panel includes the following information:

    • The sensitivity confidence level.

    • The recommended generator.

    • Sample source and destination values based on the recommended generator.

    From the panel, you choose whether to assign or ignore the recommended generator for that type.

    • To assign the recommended generator, click Apply.

    • To ignore the recommendation, click Ignore. Structural clears the recommendation.

    hashtag
    Changing the column generator configuration

    circle-info

    Required workspace permission: Configure column generators

    To change the generator that is assigned to a selected column:

    1. Click the generator name tag for the column.

    2. To assign a different generator to the column, from the Generator Type dropdown list, select the generator.

    3. Configure the generator options.

    To reset an assigned generator to Passthrough, which indicates to not transform the data, click Reset, then click Reset to Passthrough.

    For details about the configuration options for each generator, go to the Generator reference.

    For more information about selecting and configuring generators and generator presets, go to Assigning and configuring generators.

    hashtag
    Enabling Document View for JSON columns

    circle-info

    Supported only for the file connector and PostgreSQL.

    For a JSON column, instead of assigning a generator, you can enable Document View.

    From Document View, you can view the JSON schema structure and assign generators to individual JSON fields. For more information, go to Using Document View for JSON columns.

    To enable Document View, on the column configuration panel, toggle Use Document View to the on position. Note that if you have custom value processors, or have enabled Structural data encryption, then the Use Document View toggle is in the advanced options.

    When Document View is enabled, the generator dropdown is replaced with the Open in Document View option.


    Passthrough

    Passthrough is the default option.

    It passes through the value from the source database to the destination database without masking it.

    hashtag
    Characteristics

    Consistency

    No, cannot be made consistent.

    hashtag
    How to configure

    Passthrough has no configuration options.

    Generator information

    Generators transform the data in a source database column. You assign the generators to use. Tonic Structural offers a variety of generators to transform different types of data.

    For details about how to assign and configure generators, and manage generator presets, go to Generator assignment and configuration.

    You can also view this video overview of generators and how they work.

    hashtag
    About the available generators

    JSON Mask

    This is a composite generator.

    Runs a selected sub-generator on values that match a user-specified JSONPath. You can only search for and apply sub-generators to individual key values. You cannot apply a sub-generator to an object or to an array.

    If an error occurs, the selected fallback generator is used for the entire JSON value.

    For JSON columns in a file connector workspace, you can instead use Document View to assign generators to individual paths. For more information, go to Using Document View for JSON columns.

    hashtag

    Noise Generator

    Masks values in numeric columns. Adds or multiplies the original value by random noise.

    The additive noise generator draws noise from an interval around 0 that is scaled to the magnitude of the original value. By default, the scale is 10% of the underlying value. The larger the value, the larger the amount of noise that is added.

    The multiplicative noise generator multiplies the original value by a random scaling factor that falls within a specified range.

    hashtag
    Characteristics

    Name

    Generates a random name string from a dictionary of first and last names.

    You specify the name information that is contained in the column. A column might only contain a first name or last name, or it might contain a full name. A full name might be first name first or last name first.

    For example, a Name column contains a full name in the format Last, First. For the input value Smith, John, the output value would be something like Jones, Mary.

    hashtag

    Integer Key

    Generates unique integer values. By default, the generated values are within the range of the column’s data type.

    You can also specify a range for the generated values. The source values must be within that range.

    This generator cannot be used to transform negative numbers.

    hashtag
    Characteristics

    <html>
    <body>
      <div class="container">
        <h1>Title</h1>
        <p>Paragraph content</p>
        <ul>
          <li>Item 1</li>
          <li>Item 2</li>
          <li>Item 3</li>
        </ul>
      </div>
    </body>
    </html>

    Uses format-preserving encryption (FPE)

    No

    Privacy ranking

    5

    Generator ID (for the API)

    HtmlMaskGenerator

    Linking

    No, cannot be linked.

    Differential privacy

    No

    Data-free

    No

    Allowed for primary keys

    No

    Allowed for unique columns

    Yes

    Uses format-preserving encryption (FPE)

    No

    Privacy ranking

    6

    Generator ID (for the API)

    PassthroughGenerator


    Privacy ranking

    • 1 if not consistent

    • 4 if consistent

    Generator ID (for the API)

    InternationalAddressGenerator


  • MAC address

    UUID

    If unique or a primary key, .

    Otherwise .

    JSON

    If Struct format, .

    Otherwise .

    XML

    Either or

    Other

    If HStore format,

    User-defined


  • Integer

    If unique or a primary key, Integer Key. Otherwise Random Integer.

    Boolean

    Random Boolean

    Datetime

    Either Timestamp Shift, Date Truncation, or Random Timestamp

    Text

    If unique or a primary key, either Alphanumeric String Key, ASCII Key, or Numeric String Key.

    Otherwise either Character Scramble or Categorical.

    Continuous

    If unique or a primary key, Integer Key.

    Otherwise Random Double.

    Network

    IP Address


    Sequence for applying the sub-generators

    Sub-generators are applied sequentially, from the sub-generator at the top of the list to the sub-generator at the bottom of the list.

    If a key matches more than one JSONPath expression, then the most recently added generator takes priority.

    hashtag
    JSON path options

    hashtag
    Regular expressions and comparisons

    JSON paths can also contain regular expressions and comparison logic, which allow the configured sub-generators to be applied only when there are properties that satisfy the query.

    For example, a column contains this JSON:

    [ { "file_name": "foo.txt", "b": 10 }, ... ]

    The following JSON path only applies to array elements that contain a file_name key for which the value ends in .txt:

    $.[?(@.file_name =~ /^.*.txt$/)]
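    The following plain-Python sketch mirrors that filter; it is not a JSONPath engine. It keeps the array elements whose file_name value ends in .txt. The second array element and the escaped dot in the regular expression are added here only for contrast and precision.

    # Keep array elements whose file_name matches the regular expression.
    import re

    column_value = [
        {"file_name": "foo.txt", "b": 10},
        {"file_name": "bar.csv", "b": 20},
    ]

    pattern = re.compile(r"^.*\.txt$")
    matches = [item for item in column_value
               if "file_name" in item and pattern.match(str(item["file_name"]))]
    print(matches)  # [{'file_name': 'foo.txt', 'b': 10}]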

    hashtag
    Using recursion

    A JSON path can also be used to point to a key name recursively. For example, a column contains this JSON:

    {
      "first_name": "John",
      "last_name": "Smith",
      "children": [
        {
          "first_name": "Mary",
          "last_name": "Jones",
          "children": [
            {
              "first_name": "Ann",
              "last_name": "Jones"
            }
          ]
        }
      ]
    }

    The following JSON path applies to all properties for which the key is first_name:

    $..first_name
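    The following sketch shows the recursive descent idea in plain Python: it collects every value whose key is first_name, at any depth. It mirrors the behavior of the expression; it is not Structural's implementation.

    # Collect every value stored under the given key, at any nesting depth.
    def find_key(value, key):
        results = []
        if isinstance(value, dict):
            for k, v in value.items():
                if k == key:
                    results.append(v)
                results.extend(find_key(v, key))
        elif isinstance(value, list):
            for item in value:
                results.extend(find_key(item, key))
        return results

    document = {
        "first_name": "John",
        "last_name": "Smith",
        "children": [{"first_name": "Mary", "last_name": "Jones",
                      "children": [{"first_name": "Ann", "last_name": "Jones"}]}],
    }
    print(find_key(document, "first_name"))  # ['John', 'Mary', 'Ann']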

    hashtag
    Characteristics

    Consistency

    Determined by the selected sub-generators.

    Linking

    Determined by the selected sub-generators.

    Differential privacy

    Determined by the selected sub-generators.

    Data-free

    Determined by the selected sub-generators.

    Allowed for primary keys

    No

    Allowed for unique columns

    No

    hashtag
    How to configure

    hashtag
    Adding a sub-generator

    To assign a generator to a path expression:

    1. Under Sub-generators, click Add Generator. On the sub-generator configuration panel, the Cell JSON field contains a sample value from the source database. You can use the previous and next icons to page through different values.

    2. In the Path Expression field, type the path expression to identify the value to apply the generator to. You cannot use the exact same path expression more than once. To create a path expression, you can also click the value in Cell JSON that you want the expression to point to. The path expression must identify a key value. You cannot apply sub-generators to an object or to an array. Matched JSON Values shows the result from the value in Cell JSON.

    3. By default, the selected generator is applied to any value that matches the expression. To limit the types of values to apply the generator to, from the Type Filter, specify the applicable types. You can select Any, or you can select any combination of String, Number, Boolean, and Null.

    4. From the Generator Configuration dropdown list, select the generator to apply to the path expression. You cannot select another composite generator.

    5. Configure the selected generator. You cannot configure the selected generator to be consistent with another column.

    6. To save the configuration and immediately add a generator for another path expression, click Save and Add Another. To save the configuration and close the add generator panel, click Save.

    hashtag
    Managing the sub-generators list

    From the Sub-Generators list:

    • To edit a generator assignment, click the edit icon.

    • To remove a generator assignment, click the delete icon.

    • To move a generator assignment up or down in the list, click the up or down arrow.

    hashtag
    Selecting the fallback generator

    From the Fallback Generator dropdown list, select the generator to use if the assigned generator for a path expression fails.

    The options are:

    • Passthrough

    • Constant

    • Null

    composite generator
    JSONPatharrow-up-right
    Using Document View for JSON columns

    Consistency

    Yes, can be made self-consistent or consistent with another column.

    Linking

    No, cannot be linked.

    Differential privacy

    No

    Data-free

    No

    Allowed for primary keys

    No

    Allowed for unique columns

    No

    Uses format-preserving encryption (FPE)

    No

    Privacy ranking

    • 3 if not consistent

    • 4 if consistent

    Generator ID (for the API)

    NoiseGenerator

    hashtag
    How to configure

    You can use either the additive noise generator or the multiplicative noise generator, then set the other generator settings.

    hashtag
    Using the additive noise generator

    To use the additive noise generator:

    1. From the dropdown list, choose Additive.

    2. In the Relative noise scale field, type the percentage of the underlying value to scale the noise to. The default value is 10.

    3. In the decimal places field, set the number of decimal places to use. The default value is 2.

    Tonic samples the additive noise from the range [-{scale/100} * |value|, {scale/100} * |value|), where scale is the noise scale, and value is the original data value.

    The lower value of the range is inclusive, and the upper value of the range is exclusive.

    For example, for the default noise scale of 10, and a data value of 20, the additive noise range would be [-.1 * 20, .1 * 20). In other words, between -2 (inclusive) and 2 (exclusive).

    hashtag
    Using the multiplicative noise generator

    To use the multiplicative noise generator:

    1. From the dropdown list, choose Multiplicative.

    2. In the Min field, type the minimum value for the scaling factor. The minimum value is inclusive. The default value is 0.5.

    3. In the Max field, type the maximum value for the scaling factor. The maximum value is exclusive. The default value is 5.

    4. In the decimal places field, set the number of decimal places to use. The default value is 2.

    Tonic multiplies the original value by a scaling factor drawn from the range [min, max), where min is the minimum scaling factor, and max is the maximum scaling factor.

    For example, for the default values of 0.5 and 5, Tonic multiplies the original data value by a value from between 0.5 (inclusive) and 5 (exclusive).
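    The following sketch, with assumed function names, shows the two sampling behaviors described above. It is an illustration only, not Structural's sampling code; random.uniform approximates a draw from the half-open ranges described above.

    # Additive noise: value plus noise from roughly [-(scale/100)*|value|, (scale/100)*|value|).
    # Multiplicative noise: value times a factor from roughly [min, max).
    import random

    def additive_noise(value, scale_percent=10, decimal_places=2):
        bound = (scale_percent / 100) * abs(value)
        return round(value + random.uniform(-bound, bound), decimal_places)

    def multiplicative_noise(value, minimum=0.5, maximum=5, decimal_places=2):
        return round(value * random.uniform(minimum, maximum), decimal_places)

    print(additive_noise(20))        # 20 plus noise drawn from roughly [-2, 2)
    print(multiplicative_noise(20))  # 20 times a factor drawn from roughly [0.5, 5)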

    hashtag
    Other generator configuration

    To configure the generator consistency and data encryption:

    1. Toggle the Consistency setting to indicate whether to make the column consistent. By default, the consistency is disabled.

    2. If you enable consistency, then by default the generator is self-consistent. To make the generator consistent with another column, from the Consistent to dropdown list, select the column. If the generator is self-consistent, then a given value in the source database is masked in exactly the same way to produce the value in the destination database. If the generator is consistent with another column, then for a given value in that other column, the column that is assigned the Noise generator is always masked in exactly the same way in the destination database. For example, a field containing a salary value is assigned the Noise Generator and is consistent with the username field. For each instance of User1, the Noise Generator masks the salary value in exactly the same way.

    3. If Structural data encryption is enabled, then to use it for this column, in the advanced options section, toggle Use data encryption process to the on position.

    Characteristics

    Consistency

    Yes, can be made self-consistent or consistent with another column. Note that all Name generator columns that have the same consistency configuration are automatically consistent with each other. The columns must either be all self-consistent or all consistent with the same other column. For example, you can use this to ensure that a first name and last name column value always match the first name and last name in a full name column.

    Linking

    No, cannot be linked.

    Differential privacy

    Yes, if consistency is not enabled.

    Data-free

    Yes, if consistency is not enabled.

    Allowed for primary keys

    No

    Allowed for unique columns

    No

    hashtag
    How to configure

    To configure the generator:

    1. From the name format dropdown list, select the type of name value that the column contains:

      • First. This is also commonly used for standalone middle name fields.

      • Last

      • First Last

      • First Middle Last

      • First Middle Initial Last

      • Last, First

      • Last, First Middle

      • Middle Initial

    2. Toggle the Preserve Capitalization setting to indicate whether to preserve the capitalization of the column value. By default, the capitalization is not preserved.

    3. Toggle the Consistency setting to indicate whether to make the column consistent. By default, consistency is disabled.

    4. If you enable consistency, then by default the generator is self-consistent. To make the generator consistent with another column, from the Consistent to dropdown list, select the column.

    5. If Structural data encryption is enabled, then to use it for this column, in the advanced options section, toggle Use data encryption process to the on position.

    Yes, can be made self-consistent.

    Linking

    No, cannot be linked.

    Differential privacy

    Yes, if consistency is not enabled.

    Data-free

    Yes, if consistency is not enabled.

    Allowed for primary keys

    Yes

    Allowed for unique columns

    Yes

    Uses format-preserving encryption (FPE)

    Yes

    Privacy ranking

    • 1 if not consistent

    • 4 if consistent

    Generator ID (for the API)

    IntegerPkGenerator

    hashtag
    How to configure

    To configure the generator:

    1. In the Minimum field, enter the minimum value to use for an output value. The minimum value cannot be larger than any of the values in the source data.

    2. In the Maximum field, enter the maximum value to use for an output value. The maximum value cannot be smaller than any of the values in the source data.

    3. Toggle the Consistency setting to indicate whether to make the column self-consistent. By default, consistency is disabled.

    4. If Structural data encryption is enabled, then to use it for this column, in the advanced options section, toggle Use data encryption process to the on position.

    Consistency

    Assigning generators to path expressions

    On Collection View and Document View, Structural can automatically assign generators to fields that match a configured path expression. Each collection or JSON column has its own set of path generators.

    Structural always applies the path generator to matching fields that do not have an assigned generator (are set to Passthrough).

    Structural does not apply the path generator to matching fields that have a generator configuration applied directly.

    However, in a child workspace, Structural does apply the path generator to matching fields that inherit their current generator configuration from the parent workspace.

    hashtag
    Displaying the list of path generators

    To display the list of path generators for the current collection or JSON column, click Path Generators.

    For each path generator, the list includes:

    • The priority order. Structural checks the fields against the paths in the order that the paths are displayed. The first matching path wins.

    • The path expression to identify matching fields.

    • The data type filter for matching fields. You can configure a path generator to only apply to fields of a specific type or types.

    • The name of the generator preset that Structural applies to matching fields.

    hashtag
    Creating a path generator

    To create a path generator, you can either create a completely new path generator, or start from a duplicate of an existing path generator.

    Structural saves the new generator automatically when the configuration is complete. The Saved button at the bottom right indicates when the generator is saved.

    hashtag
    Creating a completely new path generator

    To create a path generator:

    1. On the path generators panel, click Add path generator.

    2. On the path generator details panel, in the Path Expression field, provide the path expression to use to identify matching paths. The path expression uses the JSONPath syntax. Note that for a path generator, you cannot use the expression to check for a field value. For more information about the supported operators and some examples, go to Supported JSONPath operators and examples. When you provide a path expression, the matching fields list displays the fields that match the expression.

    3. You can optionally filter the matching fields based on the data type. For example, you might only want to apply a generator to text or integer fields. By default, the data type filter list is empty. The available data types are general types that map to specific data types in a given database. Under Data Types, to add a data type to the filter, select it from the dropdown list. To remove a data type, click its delete icon. When you configure data type filters, the matching fields list is updated to only include fields that have one of the specified data types.

    4. From the generator dropdown list, select the generator to apply to matching fields. The available generators are affected by the data type filter. When the data type filter is empty, you can only select from generators that can be used for any type of column. When you specify a list of data types, you can only select from generators that can be used for all of those data types.

    5. Configure the selected generator. For a generator assigned to a path expression:

      • Linking is not supported.

      • Consistency with other columns is not supported.

    hashtag
    Copying an existing path generator

    You can create a path generator based on an existing one. For example, for the same path expression, you might want to assign a different generator based on the data type.

    To create a new path generator based on an existing path generator:

    1. On the path generators list, click the options menu for the path generator to copy.

    2. Click Duplicate path generator.

    3. On the path generator details panel, edit the configuration.

    hashtag
    Updating a path generator

    To update a path generator:

    1. On the path generators list, click the options menu for the path generator.

    2. Click Edit path generator.

    3. On the path generator details panel, edit the configuration.

    Structural saves the changes automatically.

    For fields that were assigned a generator based on the previous configuration, but that do not match the updated path generator configuration:

    • If the field matches other path generators, then the next matching configuration is applied.

    • If the field does not match any other path generators, then the field reverts to Passthrough.

    hashtag
    Deleting a path generator

    When you delete a path generator, the generator assignment is removed from the matching fields. If a field matched more than one path generator, then the next match is used.

    To delete a path configuration:

    1. On the path generators list, click the options menu for the path configuration.

    2. Click Delete path generator.

    For fields that were assigned a generator based on the path generator:

    • If the field matches other path generators, then the next matching path generator is applied.

    • If the field does not match any other path generators, then the field reverts to Passthrough.

    hashtag
    Supported JSONPath operators and examples

    For a path generator path expression, Structural supports the following operators:

    • $ - Root

    • . - Child operator

    • .. - Recursive descent operator

    • * - Wildcard operator

    • [*] - Array operator. Note that a path generator must always target all of the items in an array. Any use of the array operator must include the wildcard operator.

    For example, a document includes an array of objects. Each object contains name, address, and email address fields:

    [
      {
        "name": "John Smith",
        "address": "1 Main Street",
        "email_address": "[email protected]"
      },
      {
        "name": "Mary Jones",
        "address": "5 Elm Avenue",
        "email_address": "[email protected]"
      }
    ]

    You can configure a path generator that assigns a generator to the address field in all of the array objects. You cannot assign a generator to the address field in only one of the array objects.

    Here are some example path expressions, based on the following JSON:

    {
      "bookstore_name": "Read & Brew Books",
      "mailing_address": {
        "street": "123 Literary Lane",
        "city": "Bookville",
        "state": "MA",
        "zip_code": "02451",
        "country": "USA"
      },
      "phone_number": "555-123-4567",
      "books": [
        {
          "title": "The Great Gatsby",
          "author": "F. Scott Fitzgerald",
          "isbn": "978-0743273565",
          "publication_year": 1925,
          "country": "USA"
        },
        {
          "title": "Moby Dick",
          "author": "Herman Melville",
          "isbn": "978-1503280786",
          "publication_year": 1851,
          "country": "USA"
        },
        {
          "title": "To the Lighthouse",
          "author": "Virginia Woolf",
          "isbn": "978-0156907392",
          "publication_year": 1927,
          "country": "UK"
        },
        {
          "title": "The Catcher in the Rye",
          "author": "J.D. Salinger",
          "isbn": "978-0316769174",
          "publication_year": 1951,
          "country": "USA"
        }
      ]
    }

    • $.bookstore_name - Find the bookstore_name field at the top level of the JSON.

    • $.mailing_address.zip_code - Find the zip_code field in the mailing address.

    • $.books[*].isbn - Find the isbn field in each entry in the array of books.

    • $..country - Find all country fields in the JSON.
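    To make the targeting concrete, here is a small plain-Python check of the first three expressions against an abridged copy of the bookstore JSON (only the fields the expressions reference are included). It illustrates which fields the expressions select; it is not a JSONPath engine.

    # Abridged copy of the bookstore JSON, for illustration only.
    document = {
        "bookstore_name": "Read & Brew Books",
        "mailing_address": {"zip_code": "02451", "country": "USA"},
        "books": [
            {"isbn": "978-0743273565", "country": "USA"},
            {"isbn": "978-1503280786", "country": "USA"},
        ],
    }

    # $.bookstore_name -> the top-level bookstore_name field
    print(document["bookstore_name"])
    # $.mailing_address.zip_code -> the zip_code field inside mailing_address
    print(document["mailing_address"]["zip_code"])
    # $.books[*].isbn -> the isbn field of every entry in the books array
    print([book["isbn"] for book in document["books"]])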

    hashtag
    Determining the priority order for the path generators

    When Structural looks for matching fields, it checks the path generators in the order that they are displayed on the Path Generators panel.

    For each field, it uses the first matching configuration.

    To change the order of the path generators, drag and drop each configuration to the appropriate location in the list.

    hashtag
    Identifying matching fields on Collection View and Document View

    On Collection View and Document View, when the assigned generator comes from a path generator, the generator assignment is marked with an icon.

    When you click the generator, the configuration panel indicates that the generator is assigned based on a path generator.

    hashtag
    Overriding the path generator assignment

    When you change the configuration for the field, the icon is removed. The override tooltip indicates that the path generator was overridden.

    If you set the generator to Passthrough, then the field reverts to the path generator.

    Generator reference

    This generator reference provides the details for each of the supported generators in Tonic Structural.

    chevron-rightInformation provided for each generatorhashtag

    For each generator, the reference provides:

    • Overview description

    • A table that contains:

      • Generator characteristics that you might want to take into account when you select the generator.

      • The generator privacy ranking, which indicates the level of protection that the generator provides.

      • The generator ID to use in the Structural API. The generator ID is linked to the API details for the generator.

    • Instructions for how to configure the generator

    The generator characteristics include:

    • Consistency - Whether you configure the generator to base the destination values on the source values.

    • Linking - Whether you can link columns that use the generator to indicate that there is a relationship between them.

    • Differential privacy - Whether the generator supports differential privacy, which ensures that the source value cannot be reverse engineered from the output value.

    • Data-free - Whether the generator is data-free, meaning that the output data is completely unrelated to the source data.

    • Allowed for primary keys - Whether you can assign the generator to primary key columns.

    • Allowed for unique columns - Whether you can assign the generator to columns that require unique values.

    • Uses format-preserving encryption (FPE) - Whether the generator uses FPE to encrypt the values.

    The generators are in alphabetical order by the generator name.

    Here are some groupings to help to identify generators that are used for different types of values. Generator hints and tips also provides some suggestions for generators to use for specific use cases.

    chevron-rightComposite generatorshashtag

    These generators transform data that uses complex formats, or apply a generator based on a condition. For more information, go to Composite generators.

    chevron-rightInformation type generatorshashtag

    These generators produce specific types of values.

    • Company Name (and the deprecated Business Name)

    chevron-rightDatetime value generatorshashtag

    These generators are used to specifically transform datetime values.

    chevron-rightKey generatorshashtag

    Intended for use with primary key columns. For more information, go to Primary key generators.

    chevron-rightNumeric value generatorshashtag

    These generators are specifically intended to work with numeric values.

    chevron-rightString value generatorshashtag

    These generators are useful for transforming string values that aren't covered by a specific information type generator.

    chevron-rightOther value substitution and replacement generatorshashtag

    These generators perform other types of transformation on column values.

    Configuring the available secrets managers

    circle-info

    Required global permission: Manage secrets managers

    hashtag
    Supported secrets manager tools and formats

    Structural currently supports:

    Using Document View for JSON columns

    circle-info

    Only supported for the file connector and PostgreSQL.

    Note that for PostgreSQL, Document View cannot be used when the entire JSON document is an array. The JSON can contain arrays, but the document itself cannot be an array.

    For columns that contain JSON content, you can use the JSON Mask generator to assign generators to individual JSON fields. To identify the fields, you use JSONPath expressions.

    Another option is to use Document View, which allows you to view the structure of the JSON content and then assign generators to individual JSON fields.

    Categorical

    The Categorical generator shuffles the existing values within a field while maintaining the overall frequency of the values. It disassociates the values from other pieces of data. Note that NULL is considered a separate value.

    For example, a column contains the values Small, Medium, and Large. Small appears 3 times, Medium appears 4 times, and Large appears 5 times. In the output data, each value still appears the same number of times, but the values are shuffled to different rows.
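    The following sketch illustrates the idea: shuffling the column keeps the value frequencies identical while moving values to different rows. It is a conceptual illustration only, not Structural's implementation.

    # Shuffling preserves the multiset of values, so frequencies are unchanged.
    import random
    from collections import Counter

    column = ["Small"] * 3 + ["Medium"] * 4 + ["Large"] * 5
    shuffled = column[:]
    random.shuffle(shuffled)

    print(Counter(column) == Counter(shuffled))  # True: same value frequencies
    print(shuffled[:5])                          # values now appear in different rows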

    This generator is optimized for categories with fewer than 10,000 unique values. If your underlying data has more unique values (for example, your field is populated by freeform text entry), we recommend that you use the


    Uses format-preserving encryption (FPE)

    No

    Privacy ranking

    5

    Generator ID (for the API)

    JsonMaskGenerator

    Generator summary

    Summary list of generators.

    Generator reference

    Details about the characteristics and configuration options for each generator.

    Generator API reference

    Details about the structure of each generator assignment in the API.

    Generator characteristics

    Common generator characteristics to be aware of, such as consistency and linking.

    Composite generators

    Composite generators apply a generator to a specific data element or based on a condition.

    Primary key generators

    Learn about generators that you can apply to primary key columns.

    Uses format-preserving encryption (FPE)

    No

    Privacy ranking

    • 1 if not consistent

    • 4 if consistent

    Generator ID (for the API)

    NameGenerator


  • Conditional

  • CSV Mask

  • HStore Mask

  • HTML Mask

  • JSON Mask

  • Regex Mask

  • Struct Mask

  • XML Mask

  • Email

  • File Name

  • Finnish Personal Identity Code

  • FNR

  • Geo

  • HIPAA Address

  • Hostname

  • International Address

  • IP Address

  • MAC Address

  • Name

  • Phone

  • Shipping Container

  • SIN

  • SSN

  • Unique Email

  • URL

  • Random Timestamp

  • Timestamp Shift Generator

  • Integer Key

  • Numeric String Key

  • UUID Key

  • Cross Table Sum

  • Noise Generator

  • Random Double

  • Random Integer

  • Sequential Integer

  • Character Substitution

  • Constant - Also useable for numeric columns.

  • Custom Categorical - Also useable for numeric columns.

  • Find and Replace

  • Regex Mask

  • Random Boolean

  • Random Hash

  • Random UUID

    Array JSON Mask
    Array Regex Mask
    Address
    Business Name
    Company Name
    Date Truncation
    Event Timestamps
    Primary key generators
    Alphanumeric String Key
    ASCII Key
    Algebraic
    Continuous
    Categorical
    Character Scramble
    Array Character Scramble
    Null
    MAC Address
    UUID Key
    Random UUID
    Struct Mask
    JSON Mask
    XML Mask
    HTML Mask
    HStore Mask
    Categorical
    • AWS Secrets Manager

    • HashiCorp Vault

    Structural only supports secrets that store passwords. For AWS Secrets Manager, the passwords must be in one of the following formats:

    • String

    • JSON

    The JSON must contain a map of key-value pairs. It can either:

    • Contain a single key for which the value is the password in plaintext.

    • Contain multiple key-value pairs, where one key's value is the password in plaintext.

    hashtag
    Viewing the secrets manager list

    To display the list of secrets managers, on the Structural Settings view, click Secrets Managers.

    Secrets Managers tab on the Structural Settings view

    hashtag
    Working with secrets managers

    hashtag
    Creating a secrets manager

    To create a secrets manager:

    1. On the Secrets Managers tab, click Add Secrets Manager.

    2. On the Create Secrets Manager panel, in the Name field, provide a name to use to identify the secrets manager. Secrets manager names must be unique. The name is used in the secrets manager dropdown list on the workspace settings view.

    3. From the Type dropdown list, select the secrets manager product.

    4. Configure the credentials to use to connect to the secrets manager.

    5. Click Save.

    hashtag
    Editing an existing secrets manager

    For an existing secrets manager, you can change the name and the credentials configuration.

    You cannot change the type.

    To edit an existing secrets manager:

    1. In the secrets manager list, click the edit icon for the secrets manager.

    2. On the Edit Secrets Manager panel, update the configuration.

    Edit Secrets Manager panel
    3. Click Save.

    hashtag
    Deleting a secrets manager

    When you delete a secrets manager, it is removed from the workspace database connections that use it. Structural is no longer able to connect to those databases.

    To delete a secrets manager:

    1. In the secrets manager list, click the delete icon for the secrets manager.

    2. On the confirmation panel, click Delete.

    hashtag
    Providing credentials for AWS Secrets Manager

    hashtag
    Required AWS Secrets Manager permissions

    The AWS Secrets Manager credentials that you provide must have the following permissions:

    • On each secret to use, secretsmanager:GetSecretValue

    • On the encryption key for secrets that are encrypted with a customer managed key (CMK), kms:Decrypt

    Here is an example policy that grants the required Secrets Manager permissions:
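    The sketch below, written as a Python dictionary for illustration, shows one way to express such a policy. The Region, account ID, secret ARN, and KMS key ARN are placeholders that you replace with your own values.

    # Hedged sketch of an IAM policy with the permissions listed above.
    import json

    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Allow reading the specific secrets that Structural uses.
                "Effect": "Allow",
                "Action": "secretsmanager:GetSecretValue",
                "Resource": "arn:aws:secretsmanager:us-east-1:123456789012:secret:my-db-password-*",
            },
            {
                # Allow decryption for secrets encrypted with a customer managed key.
                "Effect": "Allow",
                "Action": "kms:Decrypt",
                "Resource": "arn:aws:kms:us-east-1:123456789012:key/REPLACE-WITH-KEY-ID",
            },
        ],
    }
    print(json.dumps(policy, indent=2))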

    hashtag
    Selecting the source of the credentials

    For AWS Secrets Manager, from the Authentication dropdown list, select the source of the credentials:

    Authentication options for AWS Secrets Manager
    • Environment - Only available on self-hosted instances. Indicates to use either:

      • The credentials for the AWS Identity and Access Management (IAM) role on the host machine.

      • The credentials set in the following environment settings:

        • TONIC_AWS_ACCESS_KEY_ID - An AWS access key that is associated with an IAM user or role

        • TONIC_AWS_SECRET_ACCESS_KEY - The secret key that is associated with the access key

        • TONIC_AWS_REGION - The AWS Region to send the authentication request to

    • Assume Role - Indicates to use the specified assumed role.

    • User Credentials - Indicates to use the provided user credentials.

    hashtag
    Providing an assumed role

    To provide an assumed role, select Assume Role, then:

    Configuration fields for the Assume Role option for AWS Secrets Manager credentials
    1. In the Role ARN field, provide the Amazon Resource Name (ARN) for the role.

    2. In the Session Name field, provide the role session name. If you do not provide a session name, then Structural automatically generates a default unique value. The generated value begins with TonicStructural.

    3. In the Duration (in seconds) field, provide the maximum length in seconds of the session. The default is 3600, indicating that the session can be active for up to 1 hour. The provided value must be less than the maximum session duration that is allowed for the role.

    4. From the AWS Region dropdown list, select the AWS Region to send the authentication request to.

    Structural generates the external ID that is used in the assume role request. Your role’s trust policy must be configured to condition on your unique external ID.

    Here is an example trust policy:
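    The sketch below, written as a Python dictionary for illustration, shows one way to express such a trust policy. The principal ARN is a placeholder, and the external ID value must be the external ID that Structural generates.

    # Hedged sketch of a role trust policy that conditions on the external ID.
    import json

    trust_policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": "arn:aws:iam::123456789012:user/structural-service-user"},
                "Action": "sts:AssumeRole",
                "Condition": {"StringEquals": {"sts:ExternalId": "EXTERNAL-ID-FROM-STRUCTURAL"}},
            }
        ],
    }
    print(json.dumps(trust_policy, indent=2))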

    hashtag
    Providing AWS user credentials

    To provide the credentials, select User Credentials, then:

    Configuration fields for the User Credentials option for AWS Secrets Manager credentials
    1. In the AWS Access Key field, enter the AWS access key that is associated with an IAM user or role.

    2. In the AWS Secret Key field, enter the secret key that is associated with the access key.

    3. Optional. In the AWS Session Token field, provide the session token to use.

    4. From the AWS Region dropdown list, select the AWS Region to send the authentication request to.

    hashtag
    Configuring a HashiCorp Vault secrets manager

    For a HashiCorp Vault secrets manager:

    Configuration panel for a HashiCorp Vault secrets manager
    1. In the Vault Server field, provide the server name or IP address where the secrets manager is located.

    2. From the Secrets Engine dropdown list, select the version of the secrets engine used by the secrets manager.

    3. From the Authentication Method dropdown list, select how to authenticate to the secrets manager. The options are:

      • AppRole

      • Token

      • LDAP

    4. For app role authentication:

      1. In the Role ID field, provide the identifier of the application role.

      2. In the Secret ID field, provide the secret identifier of the application role.

      On a self-hosted instance, if you do not provide a role identifier and secret identifier, then Structural uses the values of the TONIC_SECRET_MANAGERS_HASHICORP_VAULT_APPROLE_ROLE_ID and TONIC_SECRET_MANAGERS_HASHICORP_VAULT_APPROLE_SECRET_ID environment settings.

    5. For token authentication, in the Token field, provide the authentication token to use. On a self-hosted instance, if you do not provide a token, Structural uses the value of the TONIC_SECRET_MANAGERS_HASHICORP_VAULT_TOKEN environment setting to authenticate.

    6. For LDAP authentication:

      1. In the Username field, provide the LDAP username.

      2. In the Password field, provide the password for the LDAP user.

      On a self-hosted instance, if you do not provide a username and password, then Structural uses the values of the TONIC_SECRET_MANAGERS_HASHICORP_VAULT_LDAP_USERNAME and TONIC_SECRET_MANAGERS_HASHICORP_VAULT_LDAP_PASSWORD environment settings.

    7. If the authentication method is enabled in a specific namespace, then in the Namespace field, provide the namespace.

    8. If the selected authentication method does not use the default mount path, then in the Mount path field, provide the mount path.

    hashtag
    Configuring a CyberArk Central Credential Provider secrets manager

    For a CyberArk secrets manager:

    Configuration panel for a CyberArk Credential Provider secrets manager
    1. In the CyberArk Server field, provide the server name or IP address where the secrets manager is located.

    2. In the CyberArk Port field, provide the port to use to connect to the secrets manager.

    3. In the CyberArk Application ID field, provide the identifier of the CyberArk application that contains the secrets manager.

    4. In the CyberArk Safe field, provide the name of the CyberArk safe that contains the secrets.

    5. Optional. In the CyberArk Folder field, provide the name of the folder within the safe that contains the secrets. If you do not specify a folder here or in the workspace configuration, the folder defaults to Root. To specify a folder path, use Root followed by the rest of the path, with each path component separated by backslashes. For example: Root\OS\Linux.

    By default, Structural uses the application identifier to authenticate to CyberArk.

    You can instead use a CyberArk authentication certificate.

    When you use a certificate, by default, Structural validates the certificate that the server sends during authentication. You might need to install a root certificate on the Structural web server and workers. To bypass the validation instead, set the environment setting TONIC_SECRET_MANAGERS_CYBER_ARK_CCP_VERIFY_SERVER_CERTIFICATE to false.

    To configure the CyberArk authentication certificate:

    1. Click Certificate.

    Configuration for a CyberArk authentication certificate
    2. Under CyberArk Authentication Certificate, to search for and select the certificate file, click Browse. The certificate must be either .pfx or .p12.

    3. In the CyberArk Certificate Password field, provide the password that is associated with the certificate.

    hashtag
    Testing a secrets manager configuration

    From the configuration panel, you can test whether Structural can use the configured credentials to connect to the secrets manager and retrieve a specific secret.

    hashtag
    Running the configuration test

    To test a secrets manager configuration:

    1. Click Test Secrets Manager.

    2. On the Test Secrets Manager panel, in the Secret Name field, provide the name of the secret.

    3. If applicable for the secrets manager type, provide any other information that is needed for the connection.

    4. Click Run Test.

    hashtag
    Additional test information for HashiCorp Vault

    For a HashiCorp Vault secrets manager, in addition to the secret name, you can provide the following if applicable:

    • Namespace - If the secret is in a specific namespace, the name of the namespace.

    • Mount Path - If the secrets engine does not use the default mount path, the mount path to use.

    hashtag
    Additional test information for CyberArk Central Credential Provider

    For a CyberArk Central Credential Provider secrets manager, in addition to the secret name, you can optionally provide:

    • CyberArk Safe - The name of the CyberArk safe that contains the secret.

    • CyberArk Folder - The name of the folder within the safe that contains the secret.

    If you do not provide these values, they fall back to the values that are configured for the secrets manager.

    Document View

    You can also view this video overview of Document View.

    hashtag
    Enabling Document View for a JSON column

    For a JSON column, the Document View option is available from Privacy Hub, Database View, and Table View.

    On the column configuration panel, to enable Document View, toggle Use Document View to the on position. When you enable document parsing:

    • The generator dropdown changes to an Open in Document View button.

    • If this is the first column that you enabled Document View for, then the Document View tab becomes visible.

    • Any existing generator assignment is discarded.

    • On Privacy Hub, in the protection status display, each JSON path is displayed as a separate column. In the Database Tables list, each JSON path is a separate entry.

    Structural also runs a scan on the column to detect the JSON structure and identify sensitive fields.

    hashtag
    Displaying Document View

    On the workspace management view, you use Document View to view the JSON structure.

    Document View is only available when it is enabled for at least one JSON column.

    hashtag
    Selecting the JSON column to configure

    From the Column dropdown list, select the JSON column to configure. The dropdown contains the columns that have Document View enabled.

    Column dropdown list on Document View

    hashtag
    Selecting the type of view

    From the View dropdown list, select the view to use for the selected column.

    View dropdown on Document View

    hashtag
    Hybrid view

    Hybrid view provides a consolidated view of the schema across all of the rows.

    Hybrid view of Document view

    For example, for an array, hybrid view contains a single entry with all of the possible fields.

    hashtag
    Single view

    Single view shows the structure for one row at a time. You can then page through up to 100 rows. For each row, Structural displays the row structure.

    Single view of Document view

    For example, for an array, single view shows the actual array entries for each record.
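
    As a purely hypothetical illustration (the field names are invented), suppose that a JSON column contains the following two row values:

    [
      { "orders": [ { "id": 1, "status": "shipped" } ] },
      { "orders": [ { "id": 2, "tracking_number": "ZZ123" } ] }
    ]

    Hybrid view would show a single orders entry whose item lists the id, status, and tracking_number fields. Single view would show the actual items for each row: id and status for the first row, and id and tracking_number for the second row.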

    hashtag
    Information in the field list

    For each JSON field, Document View always displays:

    • The field name and data type.

    • The assigned generator.

    • An example value. In hybrid view, you can use the magnifying glass icon to display additional example values.

    Hybrid view also displays a Field Freq column. Field Freq shows the percentage of rows that contain that permutation of field and type. For example, a field might be Null 33% of the time and contain a numeric value 67% of the time. Or a field value might be an Int32 value 3% of the time and an Int64 value 6% of the time. The percentages apply to the first 100 rows.

    hashtag
    Toggling between source and preview data

    circle-info

    Required workspace permission:

    • Source data: Preview source data

    • Destination data: Preview destination data

    The Preview toggle at the top right of Document View allows you to choose whether to display original source data or the transformed data. You can switch back and forth to determine exactly how Tonic Structural transforms the data based on the field configuration.

    By default, the Preview toggle is in the on position, and the displayed data reflects the assigned generators.

    To display the original source data, toggle Preview to the off position.

    hashtag
    Filtering Document View fields

    In single view, you can filter by either a field name or a field value.

    In hybrid view, you can filter by either field name or field properties.

    hashtag
    Filtering single document view by field name or value

    You can filter single view to only display fields that have specific text in either the field name or the field value.

    To filter by value, toggle Search by Value to the on position.

    After you select the filter type, in the search field, type text that is in the field name or value. As you type, Structural filters the list to only include fields that contain the filter text.

    Searching single view by field value

    hashtag
    Filtering hybrid view by field name

    To filter hybrid view by field name, in the search field, begin typing text that is in the field name. As you type, Structural filters the list to only include fields with names that include the filter text.

    Searching hybrid view by field name

    hashtag
    Filtering hybrid view by field properties

    From the hybrid document view, you can filter the fields based on field properties.

    To display the Filters panel, click Filters.

    Filters panel on Document View

    hashtag
    Searching for a filter

    To search for a filter or a filter value, in the search field, start to type the value. The search looks for text in the individual settings.

    Filter search for Document View

    hashtag
    Adding a filter

    To add a filter, depending on the filter type, either check the checkbox or select a filter option. As you add filters, Structural applies them to the field list.

    Above the list, Structural displays tags for the selected filters.

    Document View with applied filters

    hashtag
    Clearing the selected filters

    To clear all of the currently selected filters, click Clear All.

    hashtag
    Filters panel filters

    The Filters panel in hybrid view includes the following options.

    hashtag
    At-risk JSON fields

    An at-risk JSON field:

    • Is marked as sensitive.

    • Is assigned the Passthrough generator.

    To only display at-risk JSON fields, on the Filters panel, check At-Risk Field.

    When you check At-Risk Field, Structural adds the following filters under Privacy Settings:

    • Sets the sensitivity filter to Sensitive.

    • Sets the protection status filter to Not protected.

    hashtag
    Sensitivity

    You can filter the JSON fields based on the field sensitivity.

    On the Filters panel, under Privacy Settings, the sensitivity filter is by default set to All, which indicates to display both sensitive and non-sensitive JSON fields.

    • To only display sensitive JSON fields, click Sensitive.

    • To only display non-sensitive JSON fields, click Not sensitive.

    Note that when you check At-risk Field, Structural automatically selects Sensitive.

    hashtag
    Protection status

    You can filter the JSON fields based on whether they have any generator other than Passthrough assigned.

    On the Filters panel, under Privacy Settings, the field protection filter is by default set to All, which indicates to display both protected and not protected JSON fields.

    • To only display JSON fields that have an assigned generator, click Protected.

    • To only display JSON fields that do not have an assigned generator, click Not protected.

    Note that when you check At-Risk Field, Structural automatically selects Not protected.

    hashtag
    Recommended generators

    When Structural detects that a JSON field is sensitive, it can also determine a recommended generator.

    For example, when it detects a name value, it also recommends the Name generator.

    You can filter the fields to display the fields that have recommended generators.

    On the Filters panel, under Recommended Generators, check the checkbox next to the recommended generator for which to display the fields that have that recommendation.

    hashtag
    Field data type

    You can filter the fields by the field data type. For example, you might only display columns that contain either numeric or integer values.

    To only display fields that have specific data types, on the Filters panel, under Database Data Types, check the checkbox for each data type to include.

    The list of data types only includes data types that are present in the currently displayed fields and that are compatible with other applied filters.

    To search for a specific data type, in the Filters search field, begin to type the data type.

    hashtag
    Unresolved schema changes

    When the structure of the JSON changes, you might need to update the configuration to reflect those changes. If you do not resolve the changes, then the data generation might fail.

    To only display fields that have unresolved changes to the JSON structure, on the Filters panel, check Unresolved Schema Changes.

    hashtag
    Sensitivity type

    For detected sensitive fields, the sensitivity type indicates the type of data that was detected. Examples of sensitivity types include First Name, Address, and Email.

    To only display fields that contain specific sensitivity types, on the Filters panel, under Sensitivity Type, check the checkbox for each sensitivity type to include.

    The list of sensitivity types only includes sensitivity types that are present in the currently displayed fields.

    To search for a specific sensitivity type, in the Filters search field, type the sensitivity type.

    hashtag
    Sensitivity confidence

    When the document scan identifies a value as belonging to a sensitivity type, it also determines how confident it is in that determination.

    You can filter the columns based on the confidence level.

    To only display columns that have a specific confidence level, on the Filters panel, under Sensitivity confidence, check the checkbox next to each confidence level to include.

    hashtag
    Indicating whether a JSON field is sensitive

    circle-info

    Required workspace permission: Configure column sensitivity

    On the field configuration panel, the sensitivity toggle at the top right indicates whether the field is marked as sensitive.

    Field configuration panel on Document View

    To mark a field as sensitive, toggle the setting to the Sensitive position.

    To mark a field as not sensitive, toggle the setting to the Not Sensitive position.

    hashtag
    Assigning a generator to a JSON field

    circle-info

    Required workspace permission: Configure column generators

    For each node, you assign a generator.

    To assign a generator:

    1. Click the generator value for the JSON field.

    2. On the configuration panel, from the Generator Type dropdown list, select the generator. Other than the Conditional and Regex Mask generators, you cannot assign a composite generator to a JSON field.

    3. Configure the generator options. For details about the available configuration options for each generator, go to the generator reference.

    When you configure a generator in Document View:

    • You can only link to other JSON fields.

    • You can only enable self-consistency.

    hashtag
    Assigning generators to fields that match JSONPath expressions

    In addition to assigning generators to individual fields, you can assign generators to generic paths. The paths use JSONPath syntax.

    For more information, go to Assigning generators to path expressions.
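
    For example, given a hypothetical document such as the following (the field names are invented for illustration):

    {
      "customer": {
        "name": "Ana Lopez",
        "email": "ana@example.com"
      },
      "orders": [
        { "id": 1, "shipping_email": "ana@example.com" }
      ]
    }

    In standard JSONPath syntax, $.customer.email identifies only the customer's email field, $.orders[*].shipping_email identifies the shipping_email field in each entry of the orders array, and $..email identifies every field named email at any depth. The expressions that Structural accepts are described in Assigning generators to path expressions.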

    Categorical

    hashtag
    Characteristics

    Consistency

    No, cannot be made consistent.

    Linking

    Yes, can be linked.

    Differential privacy

    Configurable

    Data-free

    No

    Allowed for primary keys

    No

    Allowed for unique columns

    No

    Uses format-preserving encryption (FPE)

    No

    Privacy ranking

    • 2 if differential privacy enabled

    • 3 if differential privacy not enabled

    Generator ID (for the API)

    CategoricalGenerator

    hashtag
    How to configure

    To configure the generator:

    1. From the Link To dropdown, select the columns to link to the current column. You can select from other columns that use the Categorical generator.

    2. Toggle the Differential Privacy setting to indicate whether to make the output data differentially private. By default, differential privacy is disabled.

    3. If Structural data encryption is enabled, then to use it for this column, in the advanced options section, toggle Use data encryption process to the on position.

    Viewing the column list

    The column list on Database View contains information about the sensitivity and generator configuration for each column.

    Column list on Database View

    hashtag
    Column - Column name and type

    The Column column provides general information about the columns and their content, including:

    • Table and column name. When you click the column name, Table View for the column's table displays.

    • The name of the schema that contains the table.

    • The data type for the column.

    • An indicator when the column is a primary key.

    The Column column also contains the option to display sample data for the column.

    hashtag
    Status - Protection and sensitivity status

    The Status column provides information about whether the column contains sensitive data and whether it has an assigned generator.

    The protection status can be one of the following values:

    • Protected - The column has an assigned generator.

    • Not Sensitive - The column is marked as not sensitive.

    • At Risk - The column is sensitive and does not have an assigned generator.

    At the right of the Status column is a confidence indicator. For At Risk columns, the confidence indicator shows how confident Structural is that the column is sensitive and contains values of the displayed sensitivity type. Protected columns also reflect the original confidence level.

    For more information about how Structural identifies values and assigns the confidence level, go to How Structural identifies sensitive values.

    From the Status column, you can change whether a column is sensitive.

    hashtag
    Applied Generator - Column configuration

    The Applied Generator column is where you select and configure the generator to assign.

    The generator dropdown indicates the currently assigned generator. It also indicates when an unprotected column has a recommended generator.

    For foreign key columns, the generator dropdown is disabled and the column is marked as a foreign key. Foreign key columns always inherit the generator that is assigned to the primary key.

    In a child workspace, when the generator configuration overrides the parent workspace, the generator dropdown displays the override icon.

    The Applied Generator column also contains the option to display and create column comments.

    hashtag
    Filtering the column list

    To filter the column list, you can:

    • Use the table list to filter the displayed columns based on the table that the columns belong to.

    • Use the filter field to filter the columns by table or column name.

    • Use the Filters panel to filter the columns based on column attributes and generator configuration.

    You can use column filters to quickly find columns that you want to verify or to update the configuration for.

    hashtag
    Filter by table

    To filter the column list to only include columns for specific tables, either:

    • Apply a filter to the table list.

    • Check the checkbox for each table to display columns for.

    hashtag
    Filter by table or column name

    To filter the column list by table or column name, in the filter field, begin to type text that is in the table or column name.

    As you type, Structural filters the column list.

    hashtag
    Using the Filters panel

    The Filters panel provides access to column filters other than the table and column name.

    To display the Filters panel, click Filters. The list only includes the filters that apply to the displayed data.

    hashtag
    Searching for a filter

    To search for a filter or a filter value, in the search field, start to type the value. The search looks for text in the individual settings.

    For each filter, the Filters panel indicates the number of matching columns, based on the selected tables and the current filters.

    hashtag
    Adding a filter

    To add a filter, depending on the filter type, either check the checkbox or select a filter option. As you add filters, Structural applies them to the column list. Above the list, Structural displays tags for the selected filters.

    hashtag
    Clearing the selected filters

    To clear all of the currently selected filters, click Clear All.

    hashtag
    Filters panel filters

    hashtag
    Columns with generator recommendations

    To only display detected sensitive columns for which there is a recommended generator, on the Filters panel, check Has Generator Recommendation.

    hashtag
    At-risk columns

    An at-risk column:

    • Is marked as sensitive.

    • Is included in the destination data.

    • Is assigned the Passthrough generator.

    To only display at-risk columns, on the Filters panel, check At-Risk Column.

    When you check At-Risk Column, Structural adds the following filters under Privacy Settings:

    • Sets the sensitivity filter to Sensitive

    • Sets the protection status filter to Not protected

    • Sets the column inclusion filter to Included

    hashtag
    Sensitivity

    You can filter the columns based on the column sensitivity.

    On the Filters panel, under Privacy Settings, the sensitivity filter is by default set to All, which indicates to display both sensitive and non-sensitive columns.

    • To only display sensitive columns, click Sensitive.

    • To only display non-sensitive columns, click Not sensitive.

    Note that when you check At-risk Column, Structural automatically selects Sensitive.

    hashtag
    Protection status

    You can filter the columns based on whether they have any generator other than Passthrough assigned. To filter the columns based on specific assigned generators, use the Applied Generator filter.

    On the Filters panel, under Privacy Settings, the column protection filter is by default set to All, which indicates to display both protected and not protected columns.

    • To only display columns that have an assigned generator, click Protected.

    • To only display columns that do not have an assigned generator, click Not protected.

    Note that when you check At-Risk Column, Structural automatically selects Not protected.

    hashtag
    Inclusion in the destination database

    You can filter the columns based on whether they are populated in the destination database. For example, if a table is truncated, then the columns in that table are not populated.

    On the Filters panel, under Privacy Settings, the column inclusion filter is by default set to All, which indicates to display both included and not included columns.

    • To only display columns that are populated in the destination database, click Included.

    • To only display columns that are not populated in the destination database, click Not included.

    Note that when you check At-Risk Column, Structural automatically selects Included.

    hashtag
    Assigned generator

    To only display columns that are assigned specific generators, on the Filters panel, under Applied Generator, check the checkbox for each generator to include.

    The list of generators only includes generators that are assigned to the currently displayed columns and that are compatible with other applied filters.

    To search for a specific generator, in the Filters search field, begin to type the generator name.

    hashtag
    Column data type

    You can filter the columns by the column data type. For example, you can only display varchar columns, or only columns that contain either numeric or integer values.

    To only display columns that have specific data types, on the Filters panel, under Database Data Types, check the checkbox for each data type to include.

    The list of data types only includes data types that are present in the currently displayed columns and that are compatible with other applied filters.

    To search for a specific data type, in the Filters search field, begin to type the data type.

    hashtag
    Unresolved schema changes

    When the source database schema changes, you might need to update the configuration to reflect those changes. If you do not resolve the schema changes, then the data generation might fail. The data generation fails if there are unresolved conflicting changes, or if you configure Structural to always fail data generation when there are any unresolved changes.

    For more information about schema changes, go to Viewing and resolving schema changes.

    To only display columns that have unresolved schema changes, on the Filters panel, check Unresolved Schema Changes.

    hashtag
    Sensitivity type

    For detected sensitive columns, the sensitivity type indicates the type of data that was detected. Examples of sensitivity types include First Name, Address, and Email.

    To only display columns that contain specific sensitivity types, on the Filters panel, under Sensitivity Type, check the checkbox for each sensitivity type to include.

    The list of sensitivity types only includes sensitivity types that are present in the currently displayed columns.

    To search for a specific sensitivity type, in the Filters search field, type the sensitivity type.

    hashtag
    Sensitivity confidence

    When the Structural sensitivity scan identifies a value as belonging to a sensitivity type, it also determines how confident it is in that determination. The Status column displays the confidence level.

    You can filter the columns based on the confidence level.

    To only display columns that have a specific confidence level, on the Filters panel, under Sensitivity confidence, check the checkbox next to each confidence level to include.

    hashtag
    Column nullability

    You can filter the column list based on whether the column is nullable.

    On the Filters panel, under Data Attributes, the nullability filter is by default set to All, which indicates to display both nullable and non-nullable columns.

    • To only display columns that are nullable, click Nullable.

    • To only display columns that are not nullable, click Non-nullable.

    hashtag
    Column uniqueness

    You can filter the column list based on whether the column must be unique.

    On the Filters panel, under Data Attributes, the uniqueness filter is by default set to All, which indicates to display both unique and not unique columns.

    • To only display columns that must be unique, click Unique.

    • To only display columns that do not require uniqueness, click Not unique.

    hashtag
    Primary or foreign keys

    You can filter the column list to indicate whether to include:

    • Columns that are not primary or foreign keys.

    • Columns that are foreign keys.

    • Columns that are primary keys.

    On the Filters panel, under Column Type:

    • To display columns that are neither a primary key nor a foreign key, check Non-keyed.

    • To display columns that are primary keys, check Primary key.

    • To display columns that are foreign keys, check Foreign key.

    hashtag
    Generator overrides in a child workspace

    In a child workspace, to only display columns that override the generator configuration that is in the parent workspace, on the Filters panel, check Overrides Inheritance.

    hashtag
    Uses Structural data encryption

    You can enable Structural data encryption, a configuration that allows Structural to:

    • Decrypt source data before applying the generator.

    • Encrypt generated data before writing it to the destination database.

    For more information, go to Configuring and using Structural data encryption.

    When Structural data encryption is enabled, the generator configuration panel includes an option to use Structural data encryption.

    To only display columns that are configured to use Structural data encryption, on the Filters panel, check Uses Data Encryption.

    hashtag
    Columns with automatically assigned generators

    You can filter the columns to only display those for which Structural automatically assigned the generator. Based on the workspace configuration, this can happen during data generation in response to changes to the source data schema.

    To only display columns that have automatically applied generators, on the Filters panel, toggle Automatically Applied Generators to the on position.

    hashtag
    Sorting the column list

    By default, the column list is sorted first by table name, then by column name. The columns for each table display together. Within each table, the columns are in alphabetical order.

    You can also sort the column list by column name first, then by table. Columns that have the same name display together. Those columns are sorted by the name of the table.

    The button at the right of the Column column heading indicates the current sort order.

    • T.C indicates that the list is sorted by table, then by column.

    • C.T indicates that the list is sorted by column, then by table.

    To switch the sort order, click the button.

    Table modes

    Each table is assigned a table mode. The table mode determines at a high level how the table is populated in the destination database.

    hashtag
    Selecting the table mode for a table

    circle-info

    Required workspace permission: Assign table modes

    Both Database View and Table View allow you to view and update the selected table mode for a table.

    For Database View, go to Assigning table modes to tables.

    For Table View, go to Selecting the table mode.

    hashtag
    Available table modes

    hashtag
    De-Identify

    This is the default table mode for new tables.

    In this mode, Tonic Structural copies over all of the rows to the destination database.

    For columns that have the generator set to Passthrough, Structural copies the original source data to the destination database.

    For columns that are assigned a generator other than Passthrough, Structural uses the generator to replace the column data in the destination database.

    hashtag
    Truncate

    This mode drops all data for the table in the destination database. Sensitivity scans ignore truncated tables.

    For data connectors other than Spark-based data connectors, the table schema and any constraints associated with the table are included in the destination database.

    For Spark-based data connectors (Databricks, Spark SDK), the table is ignored completely.

    For the file connector, file groups are treated as tables. When a file group is assigned Truncate mode, the data generation process ignores the files that are in that file group.

    Any existing data in the destination database is removed. For example, if you change the table mode to Truncate after an initial data generation, the next data generation clears the table data. For Spark-based data connectors, the table is removed.

    If you assign Truncate mode to a table that has a foreign key constraint, data generation fails. If this is a requirement, contact [email protected] for assistance.

    When upsert is enabled, the Truncate table mode does not actually truncate the destination table. Instead, it works more like Preserve Destination table mode, which preserves existing records in the destination table.

    hashtag
    Preserve Destination

    This mode preserves the data in the destination database for this table. It does not add or update any records.

    This feature is primarily used for very large tables that don't need to be de-identified during subsequent runs after the data exists in the destination database.

    When you assign Preserve Destination mode to a table, Structural locks the generator configuration for the table columns.

    The destination database must have the same schema as the source database.

    You cannot use Preserve Destination mode when you:

    • Enable upsert for a workspace.

    • Write destination data to a container repository.

    • Write destination data to an Ephemeral snapshot.

    hashtag
    Incremental

    Incremental mode only processes the changes that occurred in the source table since the most recent data generation or other update to the destination. This can greatly reduce generation time for large tables that do not have a lot of changes.

    For Incremental mode to work, the following conditions must be satisfied:

    • The table must exist in the destination database. Either Structural created the table during data generation, or the table was created and populated in some other way.

    • A reliable date updated column must be present. When you select Incremental mode for a table, Structural prompts you to select the date updated column to use.

    • The table must have a primary key.

    To maximize performance, we recommend that you have an index on the date updated field.

    For tables that use Incremental mode, Structural checks the source database for records that have an updated date that is greater than the maximum date in that column in the destination database.

    When identifying records to update, Structural only checks the updated date. It does not check for other updates. Records where the generator configuration is changed are not updated if they do not meet the updated date requirement.

    For the identified records, Structural checks for primary key matches between the source and destination databases, then does one of the following:

    • If the primary key value exists in the destination database, then Structural overwrites the record in the destination database.

    • If the primary key value does not exist in the destination database, then Structural adds a new record to the destination database.

    This mode currently only updates and adds records. Rows that are deleted from the source database remain in the destination database.
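
    For example, suppose that the maximum updated date in the destination table is June 1. A source row that was updated on June 5 and whose primary key already exists in the destination overwrites the destination row, and a June 5 row with a new primary key is added as a new row. A source row that was last updated on May 20 is not processed, even if its generator configuration changed, and a row that was deleted from the source remains in the destination.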

    To ensure accurate incremental processing of records, we recommend that you do not directly modify the destination database. A direct modification might cause the maximum updated date in the destination database to be after the date of the last data generation. This could prevent records from being identified for incremental processing.

    Incremental mode is currently supported on PostgreSQL, MySQL, and SQL Server. If you want to use this table mode with another database type, contact [email protected].

    You cannot use Incremental mode when you:

    • Enable upsert for a workspace.

    • Write destination data to a container repository.

    • Write destination data to an Ephemeral snapshot.

    hashtag
    Scale

    In this mode, Structural generates an arbitrary number of new rows, as specified by the user, using the generators that are assigned to the table columns.

    You can use linking and partitioning to create complex relationships between columns.

    Structural generates primary and foreign keys that reflect the distribution (1:1 or 1:many) between the tables in the source database.

    You cannot use Scale mode when you enable upsert for a workspace.

    In Scale mode tables, you can only use the following generators:

    hashtag
    Indicating whether to return an error when destination data already exists (Databricks only)

    For the Databricks data connector, the table mode configuration includes an Error on Overwrite setting. The setting indicates whether to return an error when Structural attempts to write data to a destination table that already contains data. The option is not available when you write destination data to Databricks Delta tables.

    To return the error, toggle the setting to the on position.

    To not return the error, toggle the setting to the off position.

    hashtag
    Applying a filter to tables

    For workspaces that use the following data connectors, the table mode configuration for De-Identify mode includes an option to apply a filter to the table:

    Table filters provide a way to generate a smaller set of data when a data connector does not support subsetting. For more information, go to Using table filtering for data warehouses and Spark-based data connectors.

    hashtag
    Configuring partitioning for the destination database

    This option is only available for workspaces that use the following data connectors:

    On the table mode configuration panel, you can use the Repartition or Coalesce option to indicate a number of partitions to generate.

    By default, the destination database uses the same partitioning as the source database. The partition option is set to Neither.

    hashtag
    Using the Repartition option

    The Repartition option allows you to provide a specific number of partitions to generate.

    To use the Repartition option:

    1. Click Repartition.

    2. In the field, enter the number of partitions.

    hashtag
    Using the Coalesce option

    The Coalesce option allows you to provide a maximum number of partitions to generate. If the source data has fewer partitions than the number that you specify, then Structural only generates the number of partitions that are in the source data.

    The Coalesce option should be more efficient than the Repartition option.
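
    For example, if you enter 10 for Coalesce, a source table that has 40 partitions is written with 10 partitions, but a source table that has 4 partitions keeps its 4 partitions. With Repartition set to 10, both tables would be written with exactly 10 partitions.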

    To use the Coalesce option:

    1. Click Coalesce.

    2. In the field, enter the number of partitions.

    Exporting and importing the workspace configuration

    circle-info

    Required workspace permission: Export and import workspace

    You can export a workspace configuration to a JSON file, and import configuration from a workspace configuration JSON file.

    For example, you might want to preserve a version of the workspace configuration before you test other changes. You can then use the exported file to restore the original configuration.

    Or you might want to use a script to make changes to an exported configuration file. You can then import the updated file to update the workspace configuration.

    hashtag
    Information in the exported file

    The workspace JSON configuration file includes the following information:

    • Sensitivity designations that you assigned to columns

    • Assigned table modes

    • Assigned column generators

    • Subsetting configuration

    • Post-job script configuration

    hashtag
    Exporting the workspace configuration

    To export the workspace configuration, either:

    • On the workspace management view, from the download menu, select Export Workspace.

    • On Workspaces view, click the actions menu for the workspace, then select Export.

    When you export a child workspace, the exported workspace does not retain any of the inheritance information. The exported information is the same for all exported workspaces.

    hashtag
    Importing a workspace configuration file

    To import a workspace configuration file:

    1. Select the import option. Either:

      • On the workspace management view, from the download menu, select Import Workspace.

      • On Workspaces view, click the actions menu for the workspace, then select Import.

    2. On the Import Workspace dialog, to select the file to import, click Browse.

    3. After you select the file, click Import.

    When you import a workspace configuration into a child workspace, Tonic Structural only updates the configuration that can be overridden. If a configuration must be inherited from the parent workspace, then it is not affected by the imported configuration. For more information, go to About workspace inheritance.

    Viewing workspace jobs and job details

    You can view a list of jobs for the workspace, and view details for individual jobs.

    hashtag
    Job types

    Tonic Structural runs the following types of jobs on a workspace:

    • Sensitivity scans - Analyze the source database to identify sensitive data.

    • Collection scans - Analyze the source data for a MongoDB workspace to determine the available fields in each collection, the field types, and how prevalent the fields are.

    • Schema retrieval jobs - Refresh the cached version of the source database schema.

    • Data generation, data pipeline generation, and containerized generation jobs - Generate the destination data from the source data.

    • Upsert data generation jobs - Generate the intermediate database from the source database.

    • Upsert jobs - Use data from the intermediate database to add new rows to and update changed rows in the destination database. If the migration process is enabled, then it is a step in the upsert job.

    • SDK table statistics jobs - Only run when you use the SDK to generate data in a Spark workspace, and the assigned generators require the statistics.

    hashtag
    Job statuses

    A job can have one of the following statuses:

    • Queued - The job is queued to run, but has not yet started. A job is queued for one of the following reasons:

      • Another job is currently running on the same workspace. For example, you cannot run a sensitivity scan and a data generation, or multiple data generations, at the same time on the same workspace. This is true regardless of the number of workers on the instance. On Structural Cloud, there is also a limit on the number of concurrent running jobs for each organization. When that maximum is reached, a new job remains queued until a current running job completes.

      • There isn't an available worker on the instance to run the job. A Structural instance with one worker can only run one job at a time. If a job from one workspace is currently running, a job from another workspace cannot start until the first job is finished.

    Each of these statuses has a corresponding "with warnings" status. For example, Running with warnings, Completed with warnings. A "with warnings" status indicates that the job had at least one warning at the time of the request.

    hashtag
    Viewing the list of jobs

    Jobs view displays the list of jobs for the workspace.

    hashtag
    Displaying Jobs view

    To display Jobs view:

    • On the workspace management view, in the workspace navigation bar, click Jobs.

    • On Workspaces view, from the dropdown menu in the Name column, select Jobs.

    hashtag
    Filtering the jobs by type

    On Jobs view, you use the tabs to filter the jobs based on the job type.

    The possible tabs are:

    • All Jobs - Always displayed. Contains all of the workspace jobs across all job types. When you first display Jobs view, All Jobs is selected.

    • Data Generation - Always displayed. Includes the following types of jobs:

      • Data generation

    hashtag
    Information in a jobs list

    The list is always sorted by the submission date, with the most recent jobs at the top of the list.

    For each job, the job list includes the following information:

    • Job ID - The identifier of the job. To copy the job identifier, click the icon at the left of the row.

    • Type - The type of job.

    • Status - The current status of the job, and how long ago the job reached that status. When you hover over the status, a tooltip displays the actual timestamp for the status change, and the length of time that the job ran. For queued jobs, to display a panel with information about why the job is queued, click the status value.

    hashtag
    Filtering jobs by job status

    To filter the list by the job status:

    1. Click the filter icon in the Status column heading. The status panel displays all of the statuses that are currently in the list. For example, if there are no Queued jobs, then the Queued status is not in the list. By default, all of the statuses are included, and none of the checkboxes are checked.

    2. To only include jobs that have specific statuses, check the checkbox next to each status to include. Checking all of the checkboxes has the same effect as unchecking all of the checkboxes.

    hashtag
    Filtering jobs by identifier

    To filter the list by the job identifier, in the filter field, provide the full identifier.

    hashtag
    Viewing details for a selected job

    For jobs other than Queued jobs, you can display details about the workspace and the job progress.

    From the Jobs view, to display the details for a job, click the job row.

    hashtag
    Workspace information

    The left side of the job details view contains the workspace information.

    For a sensitivity scan, the workspace information is limited to the owner, database type, and worker version.

    For a data generation job, the workspace information also includes:

    • Whether subsetting, post-job scripts, or webhooks are used.

    • The number of schemas, tables, and columns in the source database.

    • The number of schemas, tables, and columns in the destination database.

    hashtag
    Job Log

    The Job Log tab shows the start date, start time, and duration of the job, followed by the list of job process steps.

    hashtag
    Privacy Report

    For data generation jobs, the Privacy Report tab displays the number of at-risk, protected, and not sensitive columns in the source database.

    At-risk columns contain sensitive data, but still have Passthrough as the assigned generator.

    Protected columns have an assigned generator other than Passthrough.

    Not sensitive columns have Passthrough as the assigned generator, but do not contain sensitive data.

    hashtag
    Copying the job identifier

    The job identifier is a unique identifier for the job. To copy the job identifier, either:

    • From Jobs view, click the copy icon in the leftmost column.

    • From the job details view, click Copy Job ID.

    hashtag
    Canceling a job

    You can cancel Queued or Running jobs.

    For jobs with those statuses, the rightmost column in the job list contains a cancel icon.

    To cancel the job, click the icon.

    hashtag
    Downloading job information

    For workspaces that are configured to write destination data to a container repository, the Jobs view also provides access to the generated artifacts. For more information, go to .

    hashtag
    Job logs

    circle-info

    Required workspace permission: Download job logs

    To download diagnostic logs, you must have the Enable diagnostic logging global permission.

    For all jobs, the job logs provide detailed information about the job processing. Tonic.ai support might request the job logs to help diagnose issues.

    For upsert jobs where the migration process is enabled, and you configured the GET Schema Change Logs endpoint, the upsert job logs include the migration process logs.

    hashtag
    Where to download the job logs

    You can download the job logs from the Jobs view or the job details view. The download includes up to 1MB of log entries.

    On the Jobs view, to download the logs for a job, click the download icon in the rightmost column.

    On the job details view, to download the logs for a job, click Reports and Logs, then select Job Logs.

    hashtag
    Downloading diagnostic logs

    By default, Structural redacts sensitive values from the job logs. To help support troubleshooting, you can configure data connectors or an individual data generation job to create unredacted versions of the log files, referred to as diagnostic logs. For more information, go to .

    To access diagnostic log files, you must have the Enable diagnostic logging global permission.

    If you do not have the Enable diagnostic logging global permission, then you cannot download the logs for that job. The download option is disabled.

    hashtag
    Privacy Report for data generation

    circle-info

    Required workspace permission: View and download Privacy Report

    From the job details view, you can download a Privacy Report file that provides an overview of the current protection status of the database columns based on the workspace configuration at the time that the job ran.

    You can download either:

    • The Privacy Report .csv file, which provides details about the table columns, the column content, and the current protection configuration.

    • The Privacy Report PDF file, which provides charts that summarize the privacy ranking scores for the table columns. It also includes the table from the .csv file.

    To display the download options, click Reports and Logs. In the menu:

    • To download the Privacy Report .csv file, click Privacy Report CSV.

    • To download the Privacy Report PDF file, click Privacy Report PDF.

    For more information about the Privacy Report files and their content, go to .

    hashtag
    Additional logs for output to a container repository

    For a workspace that writes destination data to a container repository, the job includes the following additional logs:

    • Database logs - Logs for the database container that is used as the destination.

    • Datapacker logs - Logs for creating the OCI artifact and uploading it to an OCI registry.

    To download these logs for a data generation job, on the job details view, click Reports and Logs, then select Database Logs or Datapacker Logs.

    hashtag
    CloudWatch logs for data generation

    For workspaces that are connected to Amazon Redshift or Snowflake on AWS databases, the data generation job requires multiple calls to a Lambda function. For these data generation jobs, the CloudWatch logs monitor the progress of and display errors for these Lambda function calls.

    To download the CloudWatch logs for a data generation job, on the job details view, click Reports and Logs, then select CloudWatch Logs.

    The CloudWatch Logs option only displays for Amazon Redshift and Snowflake on AWS data generation jobs.

    hashtag
    Oracle SQL Loader log files

    circle-info

    Required workspace permission: Download SqlLdr Files

    For an Oracle data generation, if both of the following are true:

    • The data generation job ran SQL Loader (sqlldr).

    • sqlldr either failed or succeeded with errors.

    Then to download the sqlldr log files, click Reports and Logs, then select sqlldr Logs.

    hashtag
    Sending a log package to Tonic.ai

    circle-info

    Required global permission: Enable diagnostic logging and uploading logs directly to Tonic.ai

    The job details include an option to send a log package to Tonic.ai.

    You would likely send the log package at the request of the Structural support team, to help to troubleshoot a data generation issue.

    To send the package, from the Reports and Logs dropdown list, select Send logs to Tonic.ai.

    Structural creates the package, then uploads it to an S3 bucket. Packages are removed from the S3 bucket automatically after 30 days.

    hashtag
    Transformed files for file connector data generation

    For a data generation from a file connector workspace that uses local files, you can download the transformed files for that job.

    The download is a .zip file that contains the files for a selected file group.

    On the job details view, when files are available to download, the Data available for file groups panel displays.

    To download the files for a file group:

    1. Click Download Results.

    2. From the list, select the file group. Use the filter field to filter the list by the file group name.

    hashtag
    Performance metrics for data generation

    circle-info

    Required workspace permission: Download job logs

    For workspaces that use the newer data generation processing, users can configure a data generation job to also record performance metrics. This is usually done for troubleshooting purposes.

    On the job details view, to download the performance metrics for the job, click Reports and Logs, then click Performance Metrics.

    hashtag
    Viewing a Gantt chart of a data generation job flow

    circle-info

    This feature is currently in beta.

    From the job details view, you can display a Gantt chart that shows the flow of a data generation job over time. The chart can help you to understand the different steps of a data generation job and how long it takes Structural to complete each step.

    Note that this option is only available for data generation jobs that use the newer data generation process. For more information, go to . Data generation jobs that use the older process do not produce the Gantt chart.

    To display the chart, click Reports and Logs, then select View Gantt.

    The Job Visualization page displays the Gantt chart of the job progress.

    Table View

    Table View displays source or preview data for a single table. For a file connector workspace, each table corresponds to a file group.

    circle-info

    Required workspace permission:

    • Source data: Preview source data

    • Destination data: Preview destination data

    If you do not have either of these permissions, then you cannot display Table View.

    To display Table View:

    • On the workspace management view, click Table View.

    • On Workspaces view, from the dropdown menu in the Name column, select Table View.

    • From Database View, either click the arrow icon for the table, or click a row in the table.

    From Table View, you can view and update the table and column configuration.

    hashtag
    Selecting and configuring tables

    hashtag
    Selecting the table to view

    When you display Table View from Database View, it displays the data for the selected table.

    When you display Table View from the workspace management view or Workspaces view, it displays the most recently displayed table.

    If Table View was never displayed before, then it displays the first table in the workspace.

    To change the selected table, from the Table dropdown list, select the table to view.

    hashtag
    Selecting the table mode

    circle-info

    Required workspace permission: Assign table modes

    To change the table mode that is assigned to the table:

    1. Click the current table mode.

    2. On the table mode panel, from the table mode dropdown list, select the new table mode.

    When you change the table mode, Tonic Structural updates the preview data as needed. For example, if you change the table mode to Truncate, then the preview data is empty.

For a child workspace, the table mode selection panel indicates whether the selected table mode is inherited from the parent workspace.

If the child workspace currently overrides the parent workspace configuration, then to reset the table mode to the one that is assigned in the parent workspace, click Reset.

    hashtag
    Viewing the generator configuration summary

    The Model section of Table View displays the configured generators for the table columns.

    The header for each Model entry is the column name.

    Linked columns share an entry. The heading is a comma-separated list of the linked columns.

    Each entry contains the following information:

    • The column and generator, in the format Column Name >> Generator Name. For example, First_Name >> Name indicates that the First_Name column has the Name generator applied. For linked columns, there is a Column Name >> Generator Name entry for each column.

    • The selected configuration options for the generator.

For a child workspace, each Model entry indicates whether the configuration overrides the parent configuration. For configurations that override the parent, to remove the overrides and restore the inheritance, click Reset.

The Model entry also indicates when Structural data encryption is enabled for the column.

    To remove the generator from a column, click the delete icon.

    hashtag
    Changing the column data display

    hashtag
    Toggling between source and preview data

    The Preview toggle at the top right of Table View allows you to choose whether to display original source data or the transformed data. You can switch back and forth to understand exactly how Structural transforms the data based on the table and column configuration.

    By default, the Preview toggle is in the on position, and the displayed data reflects the selected table mode and the assigned generators. For tables that use Truncate mode, the preview data is empty. Truncated tables do not have data in the destination database.

    To display the original source data, toggle Preview to the off position.

Note that for JSON columns that use Document View, you cannot preview the destination data from Table View. You must preview the data from Document View.

    hashtag
    Using a query to filter the source data

    You can provide a query to filter the source data. The query is always against the source data, not the preview data, regardless of whether the Preview toggle is off or on.

    For example, you configure a first name field to use the Name generator and enable consistency. You can then query the source data for a specific first name value to check that the preview data uses the same destination value for all of those records.
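    Continuing that example, and assuming the table has a first_name column, the WHERE clause you provide might be as simple as the following (the column name and value are placeholders):

    first_name = 'Alice'

    With the filter applied, you can toggle Preview on and off to confirm that every matching record receives the same transformed name.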

    To apply a query to the source data:

    1. Click the query filter icon, located between the table name and the table mode.

    2. On the Table Filter dialog, provide the WHERE clause for the query.

    3. To apply the query, click Apply.

    To clear an applied query, on the Table Filter dialog, click Clear.

    If no filter is applied, then the query filter icon has a white background.

    If a valid filter is applied, then the query filter icon has a gray background.

    If the provided WHERE clause is not valid, then the query filter icon has a red background.

    hashtag
    Navigating to a specific column

    When the width of the table is more than 1.5 times the visible display area, then the Jump to column option displays.

    To bring a specific column into view:

    1. At the top right of Table View, click Jump to column. The list of table columns is displayed.

    2. To filter the list, in the filter field, begin to type text that is in the column name.

    3. Click the column. Structural scrolls the columns to make the selected column visible.

    hashtag
    Information in the column headings

    In addition to the column name, the column heading provides details about the column type and protection status. It also provides access to change the column configuration.

    hashtag
    Primary and foreign key indicators

    The column heading indicates when a column is either a primary key or a foreign key.

    hashtag
    Protection status

    The column heading indicates the column protection status:

    • At risk columns are sensitive and do not have an assigned generator.

    • Protected columns have an assigned generator.

    • Not sensitive columns are not sensitive and do not have an assigned generator.

    hashtag
    Sensitivity confidence

    The sensitivity confidence indicator shows how confident Structural is in the sensitivity detection.

    For sensitive columns that Structural detected, the confidence level can be high, medium, or low.

    For custom sensitivity rule matches or columns that you manually marked as sensitive, the confidence level is full confidence.

    For more information about how Structural identifies values and assigns the confidence level, go to How Structural identifies sensitive values.

    hashtag
    Column data type

    The column heading displays the type of data that the column contains.

    hashtag
    Child workspace overrides

    circle-info

    Required license: Enterprise

    In a child workspace, when a column overrides the parent configuration, an Overriding label displays in the column heading.

    To filter Table View to only display columns with overrides, toggle Show Overrides Only to the on position.

    hashtag
    Configuring a column

    hashtag
    Applying or ignoring a recommended generator

    circle-info

    Required workspace permission: Configure column generators

    When a sensitivity scan identifies a column, Structural recommends a generator for the column. For example, when the sensitivity scan identifies a column as a first name, Structural recommends the Name generator configured to generate a first name value.

    For unprotected columns that have a recommended generator, the column heading displays the available recommendation icon.

    When you click the dropdown, the column configuration panel includes the following information:

    • The sensitivity confidence level

    • The recommended generator

    • Sample source and destination values based on the recommended generator

    From the panel, you can choose whether to assign or ignore the recommended generator for that type.

    • To assign the recommended generator, click Apply.

    • To ignore the recommendation, click Ignore. Structural clears the recommendation.

    hashtag
    Changing the column generator configuration

    circle-info

    Required workspace permission: Configure column generators

    To assign a generator to a column that does not have an assigned generator, or to change the current configuration, click the dropdown in the column heading.

    On the generator configuration panel, from the generator type dropdown list, select the generator to assign to the column.

    Structural displays the available configuration options for the selected generator. For details about the configuration options for each generator, go to the Generator reference.

    To remove the selected generator or generator preset, and reset the generator to Passthrough, click Reset, then click Reset to Passthrough.

    For more information about selecting and configuring generators and generator presets, go to Assigning and configuring generators.

    hashtag
    Indicating whether a column is sensitive

    circle-info

    Required workspace permission: Configure column sensitivity

    On the column configuration panel, the Sensitive Data toggle indicates whether the column is marked as sensitive. The initial configuration is based on the sensitivity scan.

    • To mark a column as sensitive, toggle the setting to the on position.

    • To mark a column as not sensitive, toggle the setting to the off position.

    In a child workspace, you cannot configure whether a column is sensitive. A child workspace always inherits the sensitivity designation from its parent workspace.

    When you copy a workspace, Structural performs a new sensitivity scan on the copy. It does not copy the sensitivity designations from the original workspace.

    hashtag
    Enabling Document View for JSON columns

    circle-info

    Supported only for the file connector and PostgreSQL.

    For a JSON column, instead of assigning a generator, you can enable Document View.

    From Document View, you can view the JSON schema structure and assign generators to individual JSON fields. For more information, go to Using Document View for JSON columns.

    To enable Document View, on the column configuration panel, toggle Use Document View to the on position. Note that if you have custom value processors, or have enabled Structural data encryption, then the Use Document View toggle is in the advanced options.

    When Document View is enabled, the generator dropdown is replaced with the Open in Document View option.


    Using Collection View

    For MongoDB and Amazon DynamoDB, Collection View replaces Database View and Table View. From Collection View, you can view the fields in a selected collection. You can then assign a collection mode to the collection, and assign generators to fields.

    Collection View for a MongoDB workspace

    hashtag
    Selecting the collection to view

    From the Collection dropdown list, select the collection to view.

    hashtag
    Assigning a collection mode to the collection

    Collection mode is the collection-level equivalent of table mode. The collection mode determines how Structural uses the collection data to generate the destination database.

    hashtag
    Available collection modes

    By default, the collection mode is De-Identify. In this mode, Structural uses the assigned generators to transform the source database into the destination database.

    For MongoDB and Amazon DynamoDB, the only other options are Truncate and Preserve Destination.

    • Truncate means that only the collection structure is included in the destination database. The collection has no data in the destination database.

    • Preserve Destination means that Tonic does not change the data that is currently in the destination database.

    hashtag
    Assigning the collection mode

    circle-info

    Required workspace permission: Assign table modes

    To assign the collection mode:

    1. Click the Collection Mode dropdown list.

    2. On the panel, click the current collection mode.

    3. From the drop-down list, select the mode to use.

    hashtag
    Selecting the type of view

    You can view a collection either as a hybrid document or as single documents. From the View dropdown list, select the view to use.

    hashtag
    Hybrid document view

    The default view is Hybrid Document. For the hybrid document view, the key list reflects all of the permutations of every field from every document. For example, a field might sometimes be a datetime value and sometimes a string. Hybrid document view lists both types.

    hashtag
    Single document view

    Single Document view displays a single document at a time. You can then page through up to 100 documents. Single Document view displays the structure for each document.

    hashtag
    Information on the field list

    For each field, Collection View always displays:

    • The field name and type.

    • For fields that you configured as primary or foreign keys, a key icon.

    • The assigned generator.

    For the hybrid document view, there is also a Field Freq column. Field Freq shows the percentage of documents that contain that permutation of field and type.

    For example, a field might be Null 33% of the time and contain a numeric value 67% of the time. Or a field value might be an Int32 value 3% of the time and an Int64 value 6% of the time. The percentages apply to the first 100 documents.

    hashtag
    Toggling between source and preview data

    circle-info

    Required workspace permission:

    • Source data: Preview source data

    • Destination data: Preview destination data

    The Preview toggle at the top right of Collection View allows you to choose whether to display original source data or the transformed data. You can switch back and forth to determine exactly how Tonic Structural transforms the data based on the collection and field configuration.

    By default, the Preview toggle is in the on position, and the displayed data reflects the selected collection mode and the assigned generators. For collections that use Truncate mode, the preview data is empty. Truncated collections do not have data in the destination database.

    To display the original source data, toggle Preview to the off position.

    hashtag
    Filtering collection fields

    In the single document view, you can filter the fields by either the field name or the field value.

    In the hybrid document view, you can filter the fields based on either the field name or field properties.

    hashtag
    Filtering single document view by field name or value

    You can filter single document view to only display fields that have specific text in either the field name or the field value.

    To filter by value, toggle Search by Value to the on position.

    After you select the filter type, in the search field, type text that is in the field name or value. As you type, Structural filters the list to only include fields that contain the filter text.

    hashtag
    Filtering hybrid view by field name

    To filter hybrid view by field name, in the search field, begin to type text that is in the field name. As you type, Structural filters the list to only include fields with names that include the filter text.

    hashtag
    Filtering hybrid view by field properties

    From the hybrid document view, you can filter the fields based on field properties.

    To display the Filters panel, click Filters.

    hashtag
    Searching for a filter

    To search for a filter or a filter value, in the search field, start to type the value. The search looks for text in the individual settings.

    hashtag
    Adding a filter

    To add a filter, depending on the filter type, either check the checkbox or select a filter option. As you add filters, Structural applies them to the field list.

    Above the list, Structural displays tags for the selected filters.

    hashtag
    Clearing the selected filters

    To clear all of the currently selected filters, click Clear All.

    hashtag
    Filters panel filters

    The Filters panel in hybrid view includes the following fields.

    hashtag
    At-risk fields

    An at-risk field:

    • Is marked as sensitive

    • Is assigned the Passthrough generator.

    To only display at-risk fields, on the Filters panel, check At-Risk Field.

    When you check At-Risk Field, Structural adds the following filters under Privacy Settings:

    • Sets the sensitivity filter to Sensitive.

    • Sets the protection status filter to Not protected.

    hashtag
    Sensitivity

    You can filter the fields based on the field sensitivity.

    On the Filters panel, under Privacy Settings, the sensitivity filter is by default set to All, which indicates to display both sensitive and non-sensitive fields.

    • To only display sensitive fields, click Sensitive.

    • To only display non-sensitive fields, click Not sensitive.

    Note that when you check At-risk Field, Structural automatically selects Sensitive.

    hashtag
    Protection status

    You can filter the fields based on whether they have any generator other than Passthrough assigned.

    On the Filters panel, under Privacy Settings, the field protection filter is by default set to All, which indicates to display both protected and not protected fields.

    • To only display fields that have an assigned generator, click Protected.

    • To only display fields that do not have an assigned generator, click Not protected.

    Note that when you check At-Risk Field, Structural automatically selects Not protected.

    hashtag
    Recommended generators

    When Structural detects that a field is sensitive, it can also determine a recommended generator.

    For example, when it detects a name value, it also recommends the Name generator.

    You can filter the fields to display the fields that have recommended generators.

    On the Filters panel, under Recommended Generators, check the checkbox next to the recommended generator for which to display the fields that have that recommendation.

    hashtag
    Field data type

    You can filter the fields by the field data type. For example, you might only display fields that contain either numeric or integer values.

    To only display fields that have specific data types, on the Filters panel, under Database Data Types, check the checkbox for each data type to include.

    The list of data types only includes data types that are present in the currently displayed fields and that are compatible with other applied filters.

    To search for a specific data type, in the Filters search field, begin to type the data type.

    hashtag
    Unresolved schema changes

    When the source database schema changes, you might need to update the configuration to reflect those changes. If you do not resolve the schema changes, then the data generation might fail. The data generation fails if there are unresolved conflicting changes, or if you configure Structural to always fail data generation when there are any unresolved changes.

    For more information about schema changes, go to Viewing and resolving schema changes.

    To only display fields that have unresolved schema changes, on the Filters panel, check Unresolved Schema Changes.

    hashtag
    Sensitivity type

    For detected sensitive fields, the sensitivity type indicates the type of data that was detected. Examples of sensitivity types include First Name, Address, and Email.

    To only display fields that contain specific sensitivity types, on the Filters panel, under Sensitivity Type, check the checkbox for each sensitivity type to include.

    The list of sensitivity types only includes sensitivity types that are present in the currently displayed fields.

    To search for a specific sensitivity type, in the Filters search field, type the sensitivity type.

    hashtag
    Sensitivity confidence

    When the Structural sensitivity scan identifies a value as belonging to a sensitivity type, it also determines how confident it is in that determination.

    You can filter the fields based on the confidence level.

    To only display fields that have a specific confidence level, on the Filters panel, under Sensitivity confidence, check the checkbox next to each confidence level to include.

    hashtag
    Primary or foreign keys

    You can filter the field list to indicate whether to include:

    • Fields that are not primary or foreign keys.

    • Fields that are foreign keys.

    • Fields that are primary keys.

    On the Filters panel, under Field Type:

    • To display fields that are neither a primary key nor a foreign key, check Non-keyed.

    • To display fields that are primary keys, check Primary key.

    • To display fields that are foreign keys, check Foreign key.

    hashtag
    Commenting on fields

    circle-info

    Required license: Professional or Enterprise

    You can add comments to fields. For example, you might use a comment to explain why you selected a particular generator or marked a field as sensitive or not sensitive.

    hashtag
    Adding a new comment

    If a field does not have any comments, then to add a comment:

    1. Click the comment icon.

    2. In the comment field, type the comment text.

    3. Click Comment.

    hashtag
    Replying to an existing comment

    When a field has existing comments, the comment icon is green. To add comments:

    1. Click the comment icon. The comments panel shows the previous comments. Each comment includes the comment user and timestamp.

    2. In the comment field, type the comment text.

    3. Click Reply.

    hashtag
    Indicating whether a field is sensitive

    circle-info

    Required workspace permission: Configure column sensitivity

    On the field configuration panel, the sensitivity toggle at the top right indicates whether the field is marked as sensitive.

    To mark a field as sensitive, toggle the setting to the Sensitive position.

    To mark a field as not sensitive, toggle the setting to the Not Sensitive position.

    hashtag
    Assigning a generator to a field and type

    circle-info

    Required workspace permission: Configure column generators

    You can assign a generator to each combination of field and type. For example, depending on the document, the data type for a field might be either string or integer. You can indicate to use the Character Scramble generator when the field type is a string and the Random Integer generator when the field type is integer.

    In hybrid document view, the Null type reflects when the column value is Null. You do not assign a generator to it.

    To assign a generator:

    1. Click the generator value for the field.

    2. On the configuration panel, from the Generator Type dropdown list, select the generator.

    3. Configure the generator options. For details about the available configuration options for each generator, go to the Generator reference.

    hashtag
    Assigning generators to fields that match JSONPath expressions

    In addition to assigning generators to individual fields, you can assign generators to generic paths. The paths use JSONPath syntax.

    For more information, go to Assigning generators to path expressions.
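    As an illustration, a JSONPath expression such as the following (the field names are hypothetical) matches every city field inside an addresses array, regardless of which document it appears in:

    $.addresses[*].city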

    hashtag
    Disabling examples for sparse collections

    By default, Structural retrieves 100 documents. It then uses the data in these documents to populate example values in the hybrid document.

    For sparsely populated collections, where less common fields are not present in those 100 documents, Structural retrieves extra documents until it has example values for all fields. For very sparsely populated collections, this might cause the collection view to load slowly, because it must retrieve many documents.

    To disable examples for sparse collections, set the TONIC_MONGO_DISABLE_EXTRA_EXAMPLES environment setting to true. You can add this setting manually to the Environment Settings list on Structural Settings.
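    For example, the entry that you add to the Environment Settings list would be the following name-value pair (shown here in name=value form for brevity):

    TONIC_MONGO_DISABLE_EXTRA_EXAMPLES=true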

    Note that this setting applies to both MongoDB and Amazon DynamoDB.

    When this setting is true, fields that do not have a retrieved value use a dummy default value that is based on the data type.

    Enabling and configuring upsert

    circle-info

    Required license: Professional or Enterprise

    Not compatible with writing output to a container repository or a Tonic Ephemeral snapshot.

    By default, Tonic Structural data generation replaces the existing destination database with the transformed data from the current job.

    Upsert adds and updates rows in the destination database, but keeps all of the other existing rows intact. For example, you might have a standard set of test records that you do not want to replace every time you generate data in Structural.

    Writing output to a container repository

    circle-info

    Requires Kubernetes.

    For self-hosted Docker deployments, you can install and configure a separate Kubernetes cluster to use. For more information, go to Setting up a Kubernetes cluster to use to write output data to a container repository.

    For information about required Kubernetes permissions, go to Required access to write destination data to a container repository.





    If you enable upsert, then you cannot write the destination data to a container repository. You must write the data to a database server.

    Upsert is currently only supported for the following data connectors:

    • MySQL

    • Oracle

    • PostgreSQL

    • SQL Server

    For an overview of upsert, you can also view the video tutorialarrow-up-right.

    hashtag
    About the upsert process

    When upsert is enabled, the data generation job writes the generated data to an intermediate database. Structural then runs the upsert job to write the new and updated records to the destination database.

    Data generation process with upsert

    The destination database must already exist. Structural cannot run an upsert job to an empty destination database.

    The upsert job adds and updates records based on the primary keys.

    • If the primary key for a record already exists in the destination database, the upsert job updates the record.

    • If the primary key for a record does not exist in the destination database, the upsert job inserts a new row.

    To only update or insert records that Structural creates based on source records, and ignore other records that are already in the destination database, ensure that the primary keys for each set of records operate on different ranges. For example, allocate the integer range 1-1000 for existing destination database records that you add manually. Then ensure that the source database records, and by extension the records that Structural creates during data generation, use a different range.
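    A minimal PostgreSQL sketch of this approach, assuming a users table whose id column draws values from a users_id_seq sequence (both names are hypothetical):

    -- Reserve ids 1-1000 for hand-maintained records in the destination database,
    -- and have the source database generate ids starting at 1001.
    ALTER SEQUENCE users_id_seq RESTART WITH 1001;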

    Also note that when upsert is enabled, the Truncate table mode does not actually truncate the destination table. Instead, it works more like Preserve Destination table mode, which preserves existing records in the destination table.

    hashtag
    Enabling upsert

    To enable upsert, in the Upsert section of the workspace details, toggle Enable Upsert to the on position.

    When you enable upsert for a workspace, you are prompted to configure the upsert processing and provide the connection details for the intermediate database.

    hashtag
    Configuring upsert processing

    When you enable upsert, Structural displays the following settings to configure the upsert process.

    Disable Triggers

    Indicates whether to disable any user-defined triggers before the upsert job runs. This prevents duplicate rows from being added to the destination database. By default, this is enabled.

    Automatically Start Upsert After Successful Data Generation

    Indicates whether to immediately run the upsert job after the initial data generation to the intermediate database. By default, this is enabled. If you turn this off, then after the initial data generation, you must start the upsert job manually. For more information, go to .

    Persist Conflicting Data Tables

    When an upsert job cannot process rows because of unique constraint conflicts, it also cannot process rows that have foreign keys to those rows. This setting indicates whether to preserve the temporary tables that contain those rows. By default, this is disabled. Structural only keeps the applicable temporary tables from the most recent upsert job.

    Warn on Mismatched Constraints

    Indicates whether to treat mismatched foreign key and unique constraints between the source and destination databases as warnings instead of errors, so that the upsert job does not fail. By default, this is disabled.

    hashtag
    Connecting to migration scripts for schema changes

    circle-info

    Required license: Enterprise

    The intermediate database must have the same schema as the destination database. If the schemas do not match, then the upsert process fails.

    To ensure that schema changes are automatically reflected in the intermediate database, you can connect the workspace to your own database migration script or tool. Structural then runs the migration script or tool whenever you run upsert data generation.

    hashtag
    How upsert works with the migration process

    When you start an upsert data generation job:

    Upsert data generation process with migration
    1. If migration is enabled, Structural calls the endpoint to start the migration.

    2. Structural cannot start the upsert data generation until the migration completes successfully. It regularly calls the status check endpoint to check whether the migration is complete.

    3. When the migration is complete, Structural starts the upsert data generation.

    hashtag
    POST Start Schema Changes endpoint

    Required. Structural calls this endpoint to start the migration process specified by the provided URL.

    The request includes:

    • Any custom parameter values that you add.

    • The connection information for the intermediate database.

    The request uses the following format:

    The response contains the identifier of the migration task.

    The response uses the following format:

    hashtag
    GET Status of Schema Change endpoint

    Required. Structural calls this endpoint to check the current status of the migration process.

    The request includes the task identifier that was returned when the migration process started. The request URL must be able to pass the task identifier as either a path or a query parameter.

    The response provides the current status of the migration task. The possible status values are:

    • Unknown

    • Queued

    • Running

    • Canceled

    • Completed

    • Failed

    The response uses the following format:

    hashtag
    GET Schema Change Logs endpoint

    Optional. Structural calls this endpoint to retrieve the log entries for the migration process. It adds the migration logs to the upsert logs.

    The request includes the task identifier that was returned when the migration process started. The request URL must be able to pass the task identifier as either a path or a query parameter.

    The response content type should be text/plain. The response body contains the raw logs.

    hashtag
    DELETE Cancel Schema Changes endpoint

    Optional. Structural calls this endpoint to cancel the migration process.

    The request includes the task identifier that was returned when the migration process started. The request URL must be able to pass the task identifier as either a path or a query parameter.
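    To make the contract concrete, here is a minimal sketch of a migration service that exposes these four endpoints, written with FastAPI. The URL paths, the in-memory task store, and the immediate Completed status are illustrative assumptions; a real service would launch your migration tool against the intermediate database and report its actual status. Only the request and response shapes come from the endpoint descriptions above.

    import uuid

    from fastapi import FastAPI
    from fastapi.responses import PlainTextResponse

    app = FastAPI()

    # Illustrative in-memory task store: task id -> {"status": ..., "logs": [...]}.
    tasks = {}

    @app.post("/api/migrations")
    def start_schema_changes(body: dict):
        # body carries "parameters" and "databaseConnectionDetails" as shown in
        # the request format; this sketch only records a task and pretends the
        # migration finishes immediately.
        task_id = str(uuid.uuid4())
        tasks[task_id] = {"status": "Completed", "logs": ["migration started", "migration completed"]}
        return {"id": task_id}

    @app.get("/api/migrations/{id}")
    def status_of_schema_change(id: str):
        task = tasks.get(id, {"status": "Unknown", "logs": []})
        return {"id": id, "status": task["status"], "errors": []}

    @app.get("/api/migrations/{id}/logs", response_class=PlainTextResponse)
    def schema_change_logs(id: str):
        # Returns the raw logs as text/plain.
        return "\n".join(tasks.get(id, {"logs": []})["logs"])

    @app.delete("/api/migrations/{id}")
    def cancel_schema_changes(id: str):
        if id in tasks:
            tasks[id]["status"] = "Canceled"
        return {"id": id}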

    hashtag
    Enabling and configuring the migration process

    To enable the migration process, toggle Enable Migration Service to the on position.

    When you enable the migration process, you must configure the POST Start Schema Changes and GET Status of Schema Change endpoints.

    You can optionally configure the GET Schema Change Logs and DELETE Cancel Schema Changes endpoints.

    To configure the endpoints:

    1. To configure the POST Start Schema Changes endpoint:

      1. In the URL field, provide the URL of the migration script.

      2. Optionally, in the Parameters field, provide any additional parameter values that your migration scripts need.

    2. To configure the GET Status of Schema Change endpoint, in the URL field, provide the URL for the status check.

      The URL must include an {id} placeholder. This is used to pass the identifier that is returned from the Start Schema Changes endpoint.

    3. To configure the GET Schema Change Logs endpoint, in the URL field, provide the URL to use to retrieve the logs. The URL must include an {id} placeholder. This is used to pass the identifier that is returned from the Start Schema Changes endpoint.

    4. To configure the DELETE Cancel Schema Changes endpoint, in the URL field, provide the URL to use for the cancellation. The URL must include an {id} placeholder. This is used to pass the identifier that is returned from the Start Schema Changes endpoint.
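    Putting this together, a hypothetical configuration might use URLs like the following, where migrate.example.com is a placeholder for your own migration service:

    POST Start Schema Changes: https://migrate.example.com/api/migrations
    GET Status of Schema Change: https://migrate.example.com/api/migrations/{id}
    GET Schema Change Logs: https://migrate.example.com/api/migrations/{id}/logs
    DELETE Cancel Schema Changes: https://migrate.example.com/api/migrations/{id}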

    hashtag
    Connecting to the intermediate database

    When you enable upsert, you must provide the connection information for the intermediate database.

    For details, go to the workspace configuration information for the data connector.

    hashtag
    How Structural responds to inconsistencies in the source and destination schemas

    During upsert data generation, when Structural finds inconsistencies between the source and destination database schemas:

    • Where possible, Structural attempts to address the issue so that the data generation can succeed.

    • Structural does not change the schema of the destination database.

    • For constraint-related schema issues, Structural only attempts to address the issues if Warn on Mismatched Constraints is enabled for the workspace. If the setting is turned off, then the job fails.

    Here are some common schema issues that can occur, and how Structural responds to them.

    hashtag
    Source column is not in the destination schema

    In this case, a column that is present in the source schema is not present in the destination schema.

    Source column missing from destination schema

    For example, a new column is added to a production source table, but is not in the schema of the de-identified destination database that is used for testing.

    When this occurs, Structural ignores the column. It does not add the column to the destination schema.

    Structural adds a warning to the job logs.

    hashtag
    Destination column is not in the source schema

    In this case, a column that is present in the destination schema is not present in the source schema.

    Destination column is missing from source schema

    For example, a developer adds a column to the de-identified destination database so that they can test a new feature. The new feature is not yet released, so the source production data doesn't include the column.

    When this occurs:

    Flow to address a column missing from the source schema
    1. If the destination column is nullable, then Structural sets the value to NULL.

    2. If the destination column is not nullable, but the column has a default value, then Structural sets the destination value to the default.

    3. If the non-nullable destination column does not have a default value, then Structural attempts to set a value based on the column data type. For example, Structural might set an integer column to 0, or a varchar column to an empty string.

    4. If Structural is unable to set a value, then the data generation fails and Structural returns an error.

    hashtag
    Source and destination columns have different data types

    In this case, the same column has different data types in the source and destination schemas.

    Data type mismatch between the source and destination columns

    For example, a column might be a string in the source schema and a timestamp in the destination schema.

    When this occurs, for each record:

    Flow to address a data type mismatch
    1. If possible, Structural converts the values. For example, the source column is a string and contains datetime values. The generator also produces datetime values. In that case, Structural should be able to populate a datetime destination column.

    2. If it cannot convert the value, and the column is nullable, then Structural sets the destination column value to NULL.

    3. If it cannot convert the value, and the column is not nullable, then the record is excluded from the upsert.

    For each of these actions, Structural also adds warnings to the job logs.

    If Structural cannot perform any of those actions to work around the issue, then the data generation fails and Structural returns an error.

    hashtag
    Source constraint is not in the destination schema

    In this case, a constraint on a source column is not present in the destination schema.

    Constraint in source schema is not in the destination schema

    For example, a column is required in the source schema but optional in the destination schema.

    If Warn On Mismatched Constraints is enabled for the workspace, then Structural does not have to make any changes to the data. It populates the destination column correctly.

    Structural also adds a warning to the job logs.

    If Warn On Mismatched Constraints is turned off, then the job fails.

    hashtag
    Destination constraint is not in the source schema

    In this case, a constraint on a destination column is not present in the source schema.

    Constraint in destination schema is not in the source schema

    For example, a column has no constraints in the source schema, but has a uniqueness constraint in the destination schema.

    When this occurs, if Warn on Mismatched Constraints is enabled for the workspace, Structural removes any records that fail the constraint. For example, for a uniqueness constraint, Structural removes duplicate records.

    Structural also adds warnings to the job logs.

    If Warn on Mismatched Constraints is turned off, then the job fails.

    hashtag
    Source table is not in the destination schema

    In this case, a table in the source schema is not present in the destination schema.

    Source table is not in the destination schema

    For example, a new table is added to the production source database, but is not yet in the schema of the de-identified destination database that is used for testing.

    When this occurs, Structural ignores the table. It does not add the table to the destination schema.

    Structural also adds warnings to the job logs.

    hashtag
    Destination table is not in the source schema

    In this case, a table in the destination schema is not present in the source schema.

    Destination table is not in the source schema

    For example, a developer adds a table to the de-identified destination database so that they can test a new feature. Because the new feature is not yet released, the source production data doesn't include the table.

    When this occurs, Structural ignores the table. It does not attempt to populate the destination table.

    Structural also adds warnings to the job logs.

    hashtag
    Source or destination table is renamed

    Structural cannot detect that a table is renamed.

    From Structural's perspective, the original table is removed, and the table with the new name is added.

    For example, a source and destination schema both contain a table called Users.

    In the source database, the Users table is renamed to People.

    Structural would detect the following schema issues:

    • The source schema contains a People table that is not in the destination schema. For information about how Structural addresses this, go to Source table is not in the destination schema.

    • The destination schema contains a Users table that is not in the source schema. For information about how Structural addresses this, go to Destination table is not in the source schema.

    circle-info

    Only supported for PostgreSQL, MySQL, and SQL Server.

    Not compatible with upsert.

    Not compatible with Preserve Destination or Incremental table modes.

    You can configure a workspace to write destination data to a container repository instead of to a database server.

    Diagram showing how data is written to and accessed from a container artifact

    When Structural writes data generation output to a repository, it writes the destination data to a container volume. From the list of container artifacts, you can copy the volume digest, and download a Docker Compose file that provides connection settings for the database on the volume. Structural generates the Compose file when you make the request to download it. For more information about getting access to the container artifacts, go to Viewing and downloading container artifacts.

    For an overview of writing destination data to container artifacts, you can also view the video tutorialarrow-up-right.

    hashtag
    Indicating to write destination data to container artifacts

    Under Destination Settings, to indicate to write the destination data to container artifacts, click Container Repository.

    For a Structural instance that is deployed on Docker, unless you set up a separate Kubernetes cluster, the Container Repository option is hidden.

    You can switch between writing to a database server and writing to a container repository at any time. Structural preserves the configuration details for both options. When you run data generation, it uses the currently selected option for the workspace.

    hashtag
    Identifying the base image to use to create the container artifacts

    From the Database Image dropdown list, select the image to use to create the container artifacts.

    Select an image version that is compatible with the version of the database that is used in the workspace.

    hashtag
    Providing a customization file for MySQL

    For a MySQL workspace, you can provide a customization file that helps to ensure that the temporary destination database is configured correctly.

    To provide the customization details:

    1. Toggle Use customization to the on position.

    2. In the text area, paste the contents of the customization file.

    hashtag
    Setting the location for the container artifacts

    To provide the location where Structural publishes the container artifacts:

    1. In the Registry field, type the path to the container registry where Structural publishes the data volume.

      Do not include the HTTP protocol, such as http:// or https://.

    2. In the Repository Path field, provide the path within the registry where Structural publishes the data volume.

      For a Google Artifact Registry (GAR) repository, the path format is PROJECT-ID/REPOSITORY/IMAGE.

      For more information about repository and image names, go to the Google Cloud documentation.
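    For example, for a GAR repository the values might look like the following; the registry host, project, repository, and image names are placeholders:

    Registry: us-central1-docker.pkg.dev
    Repository Path: my-project/tonic-artifacts/customers-db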

    hashtag
    Providing the credentials to write to the registry

    You next provide the credentials that Structural uses to read from and write to the registry.

    When you provide the registry, Structural detects whether the registry is from Amazon Elastic Container Registry (Amazon ECR), Google Artifact Registry (GAR), or a different container solution.

    It displays the appropriate fields based on the registry type.

    hashtag
    Fields for registries other than Amazon ECR or GAR

    For a registry other than an Amazon ECR or a GAR registry, the credentials can be either a username and access token, or a secret.

    circle-info

    The option to use a secret is not available on Structural Cloud.

    In general, the credentials must be for a user that has read and write permissions for the registry.

    The secret is the name of a Kubernetes secret that lives on the pod that the Structural worker runs on. The secret type must be kubernetes.io/dockerconfigjson. The Kubernetes documentation provides information on how to create a registry credentials secretarrow-up-right.

    To use a username and access token:

    1. Click Access token.

    2. In the Username field, provide the username.

    3. In the Access Token field, provide the access token.

    To use a secret:

    1. Click Secret name.

    2. In the Secret Name field, provide the name of the secret.
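    One common way to create a registry credentials secret of type kubernetes.io/dockerconfigjson is with kubectl; the secret name, namespace, and credential values below are placeholders:

    kubectl create secret docker-registry tonic-registry-creds \
      --namespace <structural-worker-namespace> \
      --docker-server=registry.example.com \
      --docker-username=<username> \
      --docker-password=<access-token>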

    hashtag
    Azure Container Registry (ACR) permission requirements

    For ACR, the provided credentials must be for a service principal that has sufficient permissions on the registry.

    For Structural, the service principal must at least have the permissions that are associated with the AcrPush rolearrow-up-right.

    hashtag
    Providing a service file for GAR

    circle-info

    Structural only supports Google Artifact Registry (GAR). It does not support Google Container Registry (GCR).

    For a GAR registry, you upload a service account file, which is a JSON file that contains credentials that provide access to Google Cloud Platform (GCP).

    The associated service account must have the Artifact Registry Writer role.

    For Service Account File, to search for and select the file, click Browse.

    hashtag
    Amazon ECR registries

    For an Amazon ECR registry, you can do one of the following:

    • Provide the AWS access and secret key that is associated with the IAM user that will connect to the registry

    • Provide an assumed role

    • (Self-hosted only) Use the credentials configured in the Structural environment settings TONIC_AWS_ACCESS_KEY_ID and TONIC_AWS_SECRET_ACCESS_KEY.

    • (Self-hosted only) If Structural is deployed in Amazon Elastic Kubernetes Service (Amazon EKS), then you can use the AWS credentials that live on the EC2 instance.

    hashtag
    Using AWS access keys

    To provide an AWS access key and secret key:

    1. Click Access Keys.

    2. In the AWS Access Key field, enter an AWS access key that is associated with an IAM user or role.

    3. In the AWS Secret Key field, enter the secret key that is associated with the access key.

    4. Optionally, in the AWS Session Token field, enter the session token to use for the connection.

    hashtag
    Using an assumed role

    To provide an assumed role:

    1. Click Assume Role.

    2. In the Role ARN field, provide the Amazon Resource Name (ARN) for the role.

    3. In the Session Name field, provide the role session name. If you do not provide a session name, then Structural automatically generates a default unique value. The generated value begins with TonicStructural.

    4. In the Duration (in seconds) field, provide the maximum length in seconds of the session. The default is 3600, indicating that the session can be active for up to 1 hour. The provided value must be less than the maximum session duration that is allowed for the role.

    For the assumed role, Structural generates the external ID that is used in the assume role request. Your role’s trust policy must be configured to condition on your unique external ID.

    Here is an example trust policy:

    hashtag
    Using the credentials from the environment settings (self-hosted only)

    On a self-hosted instance, to use the credentials configured in the environment settings, click Environment Variables.

    hashtag
    Using the AWS credentials from the EC2 instance (self-hosted only)

    On a self-hosted instance, to use the AWS credentials from the EC2 instance, click Instance Profile.

    hashtag
    Required permissions for the IAM user

    The IAM user must have permission to list, push, and pull images from the registry. The following example policy includes the required permissions.

    For additional security, a repository name filter allows you to limit access to only the repositories that are used in Structural. You need to make sure that the repositories that you create for Structural match the filter.

    For example, you could prefix Structural repository names with tonic-. In the policy, you include a filter based on the tonic- prefix:

    hashtag
    Providing tags for the container artifacts

    In the Tags field, provide the tag values to apply to the container artifacts. You can also change the tag configuration for individual data generation jobs.

    Use commas to separate the tags.

    A tag cannot contain spaces. Structural provides the following built-in values for you to use in tags:

    • {workspaceId} - The identifier of the workspace.

    • {workspaceName} - The name of the workspace.

    • {timestamp} - The timestamp when the data generation job that created the artifact completed.

    • {jobId} - The identifier of the data generation job that created the artifact.

    For example, the following creates a tag that contains the workspace name, job identifier, and timestamp:

    {workspaceName}_{jobId}_{timestamp}

    To also tag the artifacts as latest, check the Tag as "latest" in your repository checkbox.

    hashtag
    Specifying custom resources for the Kubernetes pods

    You can also optionally configure custom resource values for the Kubernetes pods. You can specify the ephemeral storage, memory, and CPU millicores.

    To provide custom resources:

    1. Toggle Set custom pod resources to the on position.

    2. Under Storage Size:

      1. In the field, provide the number of megabytes or gigabytes of storage.

      2. From the dropdown list, select the unit to use.

      The storage can be between 32MB and 25GB.

    3. Under Memory Size:

      1. In the field, provide the number of megabytes or gigabytes of RAM.

      2. From the dropdown list, select the unit to use.

      The memory can be between 512MB and 4GB.

    4. Under Processor Size:

      1. In the field, provide the number of millicores.

      2. From the dropdown list, select the unit.

      The processor size can be between 250m and 1000m.

    hashtag
    Setting a custom database name

    circle-info

    Only available for PostgreSQL and SQL Server. Not available for MySQL.

    In the Custom Database Name field, provide the name to use for the destination database.

    If you do not provide a custom database name, then the destination database uses the same name as the source database.

    hashtag
    Setting a custom database user password

    In the Custom Password field, provide the password for the destination database user.

    If you do not provide a password, then Structural generates a password.

    The destination database username is always the default user for the database:

    • For PostgreSQL, postgres

    • For MySQL, root

    • For SQL Server, sa

    hashtag
    Configuring the required tolerations for datapacker node taints

    If your Kubernetes nodes are configured with taints, then on a self-hosted instance, you can configure the tolerations that enable the datapacker pods to be scheduled on the nodes. The datapacker pod hosts the temporary database that Structural uses during the data generation.

    For an overview of taints and tolerations, go to the Kubernetes documentationarrow-up-right.

    To configure the tolerations, you configure the following environment settings. You can add these settings to the Environment Settings list on Structural Settings.

    • CONTAINERIZATION_POD_NODE_TOLERATION_KEY - The toleration key value to apply to the datapacker pods. This setting is required. If you do not configure this setting, then Structural ignores the other settings.

    • CONTAINERIZATION_POD_NODE_TOLERATION_VALUES - A comma-separated list of toleration values to apply to the datapacker pods.

    • CONTAINERIZATION_POD_NODE_TOLERATION_EFFECT - The toleration effect to apply to the datapacker pods.

    • CONTAINERIZATION_POD_NODE_TOLERATION_OPERATOR - The toleration operator to apply to the datapacker pods.
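    For example, if your datapacker nodes carry a hypothetical taint of dedicated=tonic-datapacker:NoSchedule, the settings might be:

    CONTAINERIZATION_POD_NODE_TOLERATION_KEY=dedicated
    CONTAINERIZATION_POD_NODE_TOLERATION_VALUES=tonic-datapacker
    CONTAINERIZATION_POD_NODE_TOLERATION_OPERATOR=Equal
    CONTAINERIZATION_POD_NODE_TOLERATION_EFFECT=NoSchedule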

    { 
      "parameters": {/* user supplied parameters */ },
      "databaseConnectionDetails": {
            "server": "rds.amazon.com",
            "port": "54321",
            "username": "user",
            "password": "password",
            "databaseName": "tonic_upsert",
            "schemaName": "<Oracle schema to use>",
            "sslEnabled": true,
            "trustServerCertificate": false
      }
    }
    { "id": "<unique-string-identifier>" }
    {
      "id": "a0c5c4c3-a593-4daa-a935-53c45ec255ea",
      "status": "Completed",
      "errors": []
    }
    {
      "Version": "2012-10-17",
      "Statement": {
        "Effect": "Allow",
        "Principal": {
          "AWS": "<originating-account-id>"
        },
        "Action": "sts:AssumeRole",
        "Condition": {
          "StringEquals": {
            "sts:ExternalId": "<external-id>"
          }
        }
      }
    }
    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Sid": "ManageTonicRepositoryContents",
          "Effect": "Allow",
          "Action": [
            "ecr:DescribeRepositories",
            "ecr:ListImages",
            "ecr:DescribeImages",
            "ecr:BatchGetImage",
            "ecr:BatchCheckLayerAvailability",
            "ecr:InitiateLayerUpload",
            "ecr:UploadLayerPart",
            "ecr:CompleteLayerUpload",
            "ecr:PutImage"
          ],
          "Resource": [
             "arn:aws:ecr:<region>:<account_id>:repository/<optional name filter>"
          ]
        },
        {
          "Sid": "GetAuthorizationToken",
          "Effect": "Allow",
          "Action": [
            "ecr:GetAuthorizationToken"
          ],
          "Resource": "*"
        }
      ]
    }
    "Resource": [
      "arn:aws:ecr:<region>:<account_id>:repository/tonic-*"
    ]

    Getting started with the Structural free trial

    If you are a user who wants to set up an account in an existing Tonic Structural Cloud or self-hosted organization, go to Creating a new account in an existing organization.

    hashtag
    About the Structural free trial

    The Structural 14-day free trial allows you to explore and experiment in Structural Cloud before you decide whether to purchase Structural.

    When you sign up for a free trial, Structural automatically creates a sample workspace for you to use. You can also create a workspace that uses your own database or files.

    The free trial provides tools to introduce you to Structural and to guide you through configuring and completing a data generation.

    Structural tracks and displays the amount of time remaining in your free trial. You can request a demonstration and contact support.

    When the free trial period ends, you can continue to use Structural to configure workspaces. You can no longer generate data or train models. Contact Tonic.ai to discuss purchasing a Structural license, or select the option to start a Structural Cloud pay-as-you-go subscription.

    hashtag
    Signing up for the free trial

    To start a new free trial of Structural:

    1. Go to app.tonic.ai.

    2. Click Create Account.

    On the Create your account dialog, to create an account, either:

    • To use a corporate Google email address to create the account, click Create account using Google.

    • To create a new Structural account:

      1. Enter your email address. You cannot use a public email address for a free trial account.

      2. Create and confirm a Structural password.

      3. Click Create Account.

    Structural sends an activation link to your email address.

    After you activate your account and log in, Structural next prompts you to select the use case that best matches why you are exploring Structural.

    If none of the provided use cases fits, use the Other option to tell us about your use case.

    After you select a use case, click Next. The Create Your Workspace panel displays.

    hashtag
    Determining whether to use your own data

    When you sign up for a free trial, Structural provides access to a sample PostgreSQL workspace that you can use to explore how to configure and run data generation.

    You can also choose to create a workspace that uses your own data, either from local files or from a database.

    If you do connect to your own data, then you must allowlist the Structural static IP addresses. For more information, go to the FAQ "I allowlist access to my database. What are your static IP addresses?"

    On the Create your workspace panel:

    • To use the sample workspace, click Use a sample workspace, then click Next. Structural displays Privacy Hub, which summarizes the protection status for the source data. It also displays the Getting Started Guide panel and the quick start checklist.

    • To create a workspace that uses local files as the source data, click Upload Files, then click Next. Go to Uploading files.

    • To create a new workspace that uses your own data, click Bring your own data, then click Next. Go to Connecting to a database.

    hashtag
    Uploading files

    The Upload files option creates a local files workspace. The source data consists of groups of files selected from a local file system. The files in a file group must have the same type and structure. Each file group becomes a "table" in the source data.

    For other workspaces that you create during the free trial, you can also create a file connector workspace that uses files from cloud storage (Amazon S3 or Google Cloud Storage).

    After you select Upload files and click Next, you are prompted to provide a name for the workspace.

    In the field provided, enter the name to use for the workspace, then click Next.

    Structural displays the File Groups view, where you can set up the file groups for the workspace.

    It also displays the Getting Started Guide panel with links to resources to help you get started.

    After you create at least one file group, you can start to use the other Structural features and functions.

    hashtag
    Connecting to a database

    If you connect to your own data, then you must allowlist the Structural static IP addresses. For more information, go to the FAQ "I allowlist access to my database. What are your static IP addresses?"

    hashtag
    Provide a name for your workspace

    If you choose to create a workspace with your own data, then the first step is to provide a name for the workspace.

    In the field provided, enter the name to use for your first workspace, then click Next.

    The Invite others to Tonic panel displays.

    hashtag
    Invite other users to Structural and your workspace

    Under Invite others to Tonic, you can optionally invite other users with the same corporate email domain to start their own Structural free trial. The users that you invite are able to view and edit your workspace.

    For example, you might want to invite other users if you don't have access to the connection information for the source data. You can invite a user who does have access. They can then update the workspace configuration to add the connection details.

    To continue without inviting other users, click Skip this step.

    To invite users:

    1. For each user to invite, enter the email address, then press Enter. The email addresses must have the same corporate email domain as your email address.

    2. After you create the list of users to invite, click Next.

    The Add source data connection view displays.

    hashtag
    Supported data connectors for free trial workspaces

    The final step in the workspace creation is to provide the source data to use for your workspace.

    Structural provides data connectors that allow you to connect to an existing database. Each data connector allows you to connect to a specific type of database. Structural supports several types of application databases, data warehouses, and Spark data solutions.

    For the first workspace that you create using the free trial wizard, you can choose:

    • PostgreSQL

    • Snowflake

    • SQL Server

    For subsequent workspaces that you create from Workspaces view, you can also choose Google BigQuery, MongoDB, MySQL, Databricks, Salesforce, and Amazon DynamoDB.

    hashtag
    Selecting the data connector

    To connect to an existing database, on the Add source data connection panel, click the data connector to use, then click Add connection details.

    The panel also includes a Local files option, which creates a local files file connector workspace, the same as the Upload files option.

    Use the connection details fields to provide the connection information for your source data. The specific fields depend on the type of data connector that you select.

    After you provide the connection details, to test the connection, click Test Connection.

    To save your workspace, click Save.

    Structural displays Privacy Hub, which summarizes the protection status for the source data.

    It also displays the Getting Started Guide panel with links to resources to help you get started.

    hashtag
    Free trial resources

    The Structural free trial includes a couple of resources to introduce you to Structural and to guide you through the tasks for your first data generation.

    hashtag
    Getting Started Guide panel

    The Getting Started Guide panel provides access to Structural information and support resources.

    The Getting Started Guide panel displays automatically when you first start the free trial. To display the Getting Started Guide panel manually, in the Structural heading, click Getting Started.

    The Getting Started Guide panel provides links to Structural instructional videos and this Structural documentation. It also contains links to request a Structural demo, contact Tonic.ai support, and purchase a Structural Cloud pay-as-you-go subscription.

    hashtag
    Quick start checklist

    For each free trial workspace, Structural provides access to a workspace checklist.

    The checklist displays at the bottom left of the workspace management view. It displays automatically when you display the workspace management view. To hide the checklist, click the minimize icon. To display the checklist again, click the checklist icon.

    The checklist provides a basic list of tasks to perform in order to complete a Structural data generation.

    Each checklist task is linked to the Structural location where you can complete that task. Structural automatically detects and marks when a task is completed.

    The checklist tasks are slightly different based on the type of workspace.

    hashtag
    Checklist for database-based workspaces

    For workspaces that are connected to a database, including the sample PostgreSQL workspace and workspaces that you connect to your own data, the checklist contains:

    1. Connect a source database - Set the connection to the source database. In most cases, you set the source connection when you create the workspace. When you click this step, Structural navigates to the Source Settings section of the workspace details view.

    2. Connect to destination database - Set the location where Structural writes the transformed data. When you click this step, Structural navigates to the Destination Settings section of the workspace details view.

    3. Apply generators to modify dataset - Configure how Structural transforms at least one column in the source data. When you click this step:

      • If there are available generator recommendations, then Structural navigates to Privacy Hub and displays the generator recommendations panel.

      • If there are no available generator recommendations, then Structural navigates to Database View.

    4. Generate data - Run the data generation to produce the destination data. When you click this item, Structural navigates to the Confirm Generation panel.

    hashtag
    Checklist for local file workspaces

    For workspaces that use data from local files, the checklist contains:

    1. Create a file group - Create a file group with files that you upload from a local file system. Each file group becomes a table in the workspace. When you click this step, Structural navigates to the File Groups view for the workspace.

    2. Apply generators to modify dataset - Configure how Structural transforms at least one column in the source files. When you click this step:

      • If there are available generator recommendations, then Structural navigates to Privacy Hub and displays the generator recommendations panel.

      • If there are no available generator recommendations, then Structural navigates to Database View.

    3. Generate data - Run the data generation to produce transformed versions of the source files. When you click this step, Structural navigates to the Confirm Generation panel.

    4. Download your dataset - Download the transformed files from the Structural application database.

    hashtag
    Checklist for cloud storage file workspaces

    For workspaces that use data from files in cloud storage (Amazon S3 or Google Cloud Storage), the checklist contains:

    1. Configure output location - Configure the cloud storage location where Structural writes the transformed files. When you click this step, Structural navigates to the Output location section of the workspace details view.

    2. Create a file group - Create a file group that contains files selected from cloud storage. When you click this step, Structural navigates to the File Groups view for the workspace.

    3. Apply generators to modify dataset - Configure how Structural transforms at least one column in the source data. When you click this step:

      • If there are available generator recommendations, then Structural navigates to Privacy Hub and displays the generator recommendations panel.

      • If there are no available generator recommendations, then Structural navigates to Database View.

    4. Generate data - Run the data generation to produce transformed versions of the source files. When you click this step, Structural navigates to the Confirm Generation panel.

    hashtag
    Next step hints

    In addition to the workspace checklists, Structural uses next step hints to help guide you through the workspace configuration and data generation.

    When a next step hint is available, it displays as an animated marker next to the suggested next action.

    When you hover over the highlighted action, Structural displays a help text popup that explains the recommended action.

    When you click the highlighted action, the hint is removed, and the next hint is displayed.

    hashtag
    Creating a file group

    For a file connector workspace, to identify the source data, you create file groups. A file group is a set of files of the same type and with the same structure. Each file group becomes a table in the workspace. For CSV files, each column becomes a table column. For XML and JSON file groups, the table contains a single XML or JSON column.
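
    As a rough conceptual sketch of that mapping, the following Python example combines two hypothetical CSV files that share the same columns into a single set of rows, the way a file group becomes one table. The file names are made up, and the code is only an illustration; Structural performs this mapping for you.

    import csv
    from pathlib import Path

    # A hypothetical file group: files of the same type (CSV) with the same structure.
    file_group = [Path("customers_2023.csv"), Path("customers_2024.csv")]

    columns = None
    rows = []
    for path in file_group:
        with path.open(newline="") as f:
            reader = csv.DictReader(f)
            if columns is None:
                columns = reader.fieldnames      # the columns of the resulting "table"
            elif reader.fieldnames != columns:
                raise ValueError(f"{path} does not match the file group structure")
            rows.extend(reader)                  # the rows of the resulting "table"

    print(columns, len(rows))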

    On the File Groups view, click Create File Group.

    hashtag
    Uploading local files

    For a file connector workspace that uses local files, you can either drag and drop files from your local file system to the file group, or you can search for and select files to add. For more information, go to Selecting local files.

    hashtag
    Selecting files from cloud storage

    For a file connector workspace that uses cloud storage, you select the files to include in the file group. For more information, go to Selecting cloud storage or file mount files.

    hashtag
    Configuring file delimiters and settings

    For files that contain CSV content, you configure the delimiters and other file settings. For more information, go to Configuring delimiters and file settings for .csv files.
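
    For example, the delimiter and quote character, and whether the first row contains column headers, are the kinds of settings involved. The following Python sketch parses a semicolon-delimited file; the file name and setting values are hypothetical and are not Structural's configuration fields.

    import csv

    # Hypothetical settings for one file group.
    settings = {"delimiter": ";", "quote_char": '"', "has_header_row": True}

    with open("transactions.csv", newline="") as f:
        reader = csv.reader(f, delimiter=settings["delimiter"], quotechar=settings["quote_char"])
        rows = list(reader)

    header = rows[0] if settings["has_header_row"] else None
    data_rows = rows[1:] if settings["has_header_row"] else rows
    print(header, len(data_rows))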

    hashtag
    Assigning a generator

    To get value out of the data generation process, you assign generators to the data columns.

    A generator indicates how to transform the data in a column. For example, for a column that contains a name value, you might assign the Name generator, which indicates how to generate a replacement name in the generation output.
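
    Conceptually, a generator is a transformation function that is applied to every value in a column. The following Python sketch is only an illustration of that idea, with a made-up name-replacement function; it is not how Structural implements generators.

    import random

    FIRST_NAMES = ["Alex", "Jordan", "Morgan", "Riley"]
    LAST_NAMES = ["Kim", "Rivera", "Chen", "Okafor"]

    def name_generator(original):
        """Replace the original value with a realistic-looking replacement name."""
        return f"{random.choice(FIRST_NAMES)} {random.choice(LAST_NAMES)}"

    def passthrough(original):
        """Default behavior: pass the source value through unchanged."""
        return original

    # Assign a generator to each column; unassigned columns default to Passthrough.
    generators = {"name": name_generator}

    row = {"name": "Ada Lovelace", "plan": "basic"}
    masked = {column: generators.get(column, passthrough)(value) for column, value in row.items()}
    print(masked)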

    hashtag
    Applying all recommendations

    For sensitive columns that Structural detects, Structural can also provide a recommended generator configuration.

    When there are recommendations available, Privacy Hub displays a link to review all of the recommendations.

    The Recommended Generators by Sensitivity Type panel displays a list of sensitive columns that Structural detected, along with the suggested generators to apply.

    After reviewing, to apply all of the suggested generators, click Apply All. For more information about using this panel, go to Reviewing and applying recommended generators.

    hashtag
    Selecting a generator

    You can also choose to apply an individual generator manually. You can do this from Privacy Hub, Database View, or Table View.

    To display Database View, on the workspace management view, click Database View.

    On Database View, in the column list, the Applied Generator column lists the currently assigned generator for each column. For a new workspace, the columns are all assigned the Passthrough generator. The Passthrough generator simply passes the source value through to the destination data without masking it.

    Click a column that is marked as Passthrough and that is not marked as sensitive. For example, in the sample workspace, click the customers.Last_Transaction column. The column configuration panel displays. To select a generator, click the generator dropdown. The list contains the generators that can be assigned to the column based on the column data type. For customers.Last_Transaction, the Timestamp Shift generator is a good option.

    hashtag
    Assigning a recommended generator

    For Passthrough columns that Structural identified as containing sensitive data, the Applied Generator column displays an icon to indicate that there is a recommended generator.

    In Database View, click one of those columns. For example, in the sample workspace, the customers.Email column is marked as containing an email address.

    For customers.Email, click the generator dropdown. Instead of the column configuration panel, there is a panel that indicates the recommended generator. For customers.Email, the recommended generator is Email. To assign the Email generator, click Apply. The column configuration panel displays with the generator assigned.

    hashtag
    Configuring the destination location

    To run a data generation, Structural must have a destination for the transformed data.

    For a local files workspace, Structural saves the transformed files to the application database.

    For workspaces that use data from a database, and for workspaces that use cloud storage files, you configure where Structural writes the output data.

    hashtag
    Available output options

    The destination location for data generation output can be one of the following:

    • For database-based data connectors, you can write the transformed data to a destination database.

    • For some Structural data connectors, Structural can write the transformed data to a data volume in a container repository.

    • For file connector workspaces that transform files from cloud storage (Amazon S3 or Google Cloud Storage), you configure the cloud storage location where Structural writes the transformed files.

    hashtag
    Displaying the current destination configuration

    To display the destination configuration for the workspace:

    1. Click the Workspace Settings tab.

    2. Scroll to the Destination Settings section or, for a file connector workspace that uses cloud storage files, scroll to the Output location section.

    hashtag
    Confirming or changing the destination configuration

    hashtag
    Destination database

    To write the data to a destination database, click Database Server. Structural displays the configuration fields for the destination database.

    For information on how to configure the destination information for a specific data connector, go to the workspace configuration information for that data connector. The data connector summary contains a list of the available data connectors, and provides a link to the documentation for each data connector.

    hashtag
    Container repository

    To write the data to a data volume in a container repository, click Container Repository. Structural displays the configuration fields to select a base image and provide the details about the repository.

    For more information, go to Writing output to a container repository.

    hashtag
    Cloud storage files output location

    For a file connector workspace that uses files from cloud storage (Amazon S3 or Google Cloud Storage), you configure the cloud storage output location where Structural writes the transformed files. The configuration includes the required credentials to use.

    For more information, go to Configuring the file connector storage type and output options.

    hashtag
    Running data generation

    After you complete the workspace and generator configuration, you can run your first data generation.

    The data generation process uses the assigned generators to transform the source data. It writes the transformed data to the configured destination location.

    For a local files workspace, it writes the files to the Structural application database.

    hashtag
    Starting the generation

    The Generate Data option is at the top right of the Tonic heading.

    When you click Generate Data, Structural displays the Confirm Generation panel.

    The Confirm Generation panel provides access to the current destination configuration, along with other advanced generation options such as subsetting and upsert.

    It also indicates if there are any issues that prevent you from starting the data generation. For example, if the workspace does not have a configured destination, then Structural cannot run the data generation.

    To start the data generation, click Run Generation. For more information about running data generation, go to Running data generation jobs.

    hashtag
    Viewing the job details

    To view the job status and details:

    1. Click Jobs.

    2. In the list, click the data generation job.

    hashtag
    Next steps for free trial users

    The first time that you complete all of the steps in a checklist, Structural displays a panel with options to chat with our sales team, schedule a demo, or purchase a subscription.

    You can also continue to get to know Structural and experiment with other Structural features such as subsetting or using composite generators to mask more complex values such as JSON or XML.

    If your free trial has expired, to get an extension, you can reach out to us using either the in-app chat or an email message.


    Privacy Hub

    hashtag
    About Privacy Hub

    Privacy Hub tracks the current protection status of source data columns based on:

    • Column sensitivity, either from the most recent sensitivity scan or from manual assignments

    • Assigned table modes

    • Assigned generators

    To display Privacy Hub, either:

    • On the workspace management view, in the workspace navigation bar, click Privacy Hub.

    • On Workspaces view, click the workspace name.

    From Privacy Hub, you can:

    • Review and apply the recommended generators for all detected sensitive columns

    • View the current protection status of columns

    • Manually mark columns as sensitive or not sensitive

    • Configure protection for sensitive columns

    • Download a preview Privacy Report

    • Run a new sensitivity scan

    You can also track the history of changes to column sensitivity and the assigned column generators. For more information, go to Tracking changes to workspaces, generator presets, and sensitivity rules.

    hashtag
    Viewing the count of detected sensitive columns that are not protected

    The sensitivity scan detects specific types of sensitive data.

    If your workspace contains any columns that the sensitivity scan identified, and for which you have not either:

    • Assigned a generator

    • Marked as not sensitive

    Then Tonic Structural displays a Sensitivity Recommendations banner that contains a count of those columns.

    The count only includes sensitive columns that the sensitivity scan detects. If you manually mark a column as sensitive, it is not included in the list.

    On the banner, the Review Recommendations option allows you to review the detected columns and the recommended generators for each detected sensitive data type.

    You can then apply the recommended generators or ignore the recommendations. When you ignore a recommendation, you either:

    • Indicate to remove the generator recommendation for the column.

    • Indicate that the column data is not sensitive.

    For more information, go to Reviewing and applying recommended generators.

    hashtag
    Viewing the protection status for each column

    The protection status panels at the top of Privacy Hub provide an overview of the current protection status of the columns in the source data.

    Each panel displays:

    • The number of columns that are in that category.

    • The estimated percentage of columns that are in that category.

    Note that for a JSON column that uses Document View, the protection status displays a separate box for each combination of JSON path and data type.

    From each panel, you can display details for and configure protection for each column.

    The column counts do not include columns that do not have data in the destination database. For example, if a table is assigned Truncate table mode, then Privacy Hub ignores the columns in that table.

    The information on these panels updates automatically as you change whether columns are sensitive and assign generators to columns.

    hashtag
    At-Risk Columns

    The At-Risk Columns panel reflects columns that:

    • Are populated in the destination database.

    • Are marked as sensitive.

    • Have the generator set to Passthrough, which indicates that Structural does not perform any transformation on the data.

    For each column, the At-Risk Columns panel also indicates the sensitivity confidence, from full confidence (completely red) to low confidence (a small percentage of red).

    The goal is to have 0 at-risk columns.

    When you click Open in Database View, you navigate to Database View. The column list is filtered to show columns that are at risk.

    hashtag
    Protected Columns

    The Protected Columns panel reflects columns that:

    • Are populated in the destination database.

    • Are assigned a generator other than Passthrough.

    It includes both sensitive and non-sensitive columns.

    Note that a column is considered protected based solely on the assigned generator. Some more complex generators, such as JSON Mask or Conditional, allow you to apply different generators to specific portions of a value or based on a specific condition. However, the protection status does not reflect these sub-generators. An applied sub-generator could be Passthrough.

    When you click Open in Database View, you navigate to Database View. The column list is filtered to show all included columns that are protected.

    hashtag
    Not Sensitive Columns

    The Not Sensitive Columns panel reflects columns that:

    • Are populated in the destination database.

    • Are marked as not sensitive.

    • Have the generator set to Passthrough.

    When you click Open in Database View, you navigate to Database View. The column list is filtered to show included columns that are not sensitive and are not protected.
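
    The three categories can be thought of as a function of two per-column facts: whether the column is marked sensitive, and whether its assigned generator is Passthrough. The following Python sketch illustrates that categorization for a few hypothetical columns; it is a simplification, not Structural's implementation, and it skips columns that are not populated in the destination database.

    from dataclasses import dataclass

    @dataclass
    class Column:
        name: str
        sensitive: bool
        generator: str          # "Passthrough" or the name of an assigned generator
        populated: bool = True  # False for columns in, for example, truncated tables

    def protection_status(column):
        if not column.populated:
            return None                     # not counted on the panels
        if column.generator != "Passthrough":
            return "Protected"              # a generator is assigned, sensitive or not
        return "At-Risk" if column.sensitive else "Not Sensitive"

    columns = [
        Column("customers.Email", sensitive=True, generator="Passthrough"),
        Column("customers.First_Name", sensitive=True, generator="Name"),
        Column("customers.Plan", sensitive=False, generator="Passthrough"),
    ]
    for column in columns:
        print(column.name, protection_status(column))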

    hashtag
    Viewing the protection status for each table

    The Database Tables list shows the protection status for each table in the source database. You can view the number of columns that have each protection status, and update the column configuration.

    The list does not include tables where the table mode is Truncate or Preserve Destination. Truncated tables are not populated in the destination database. For Preserve Destination tables, the existing data in the destination database does not change.

    hashtag
    Information in the list

    For each table, Database Tables provides the following information:

    • Name - The table name. For a file connector workspace, each table corresponds to a file group. Each JSON column that uses Document View is also in a separate row. For JSON columns, the Name column displays both the table name and the column name.

    • Not Sensitive - The number of not sensitive columns in the table. Not sensitive columns are not marked as sensitive and have Passthrough as the generator. When you click the value, you navigate to Database View, filtered to display the not sensitive columns for the table.

    • Protected - The number of protected columns in the table. Protected columns have an assigned generator. A protected column can be either sensitive or not sensitive. When you click the value, you navigate to Database View, filtered to display the protected columns for the table.

    • At-Risk - The number of at-risk columns in the table. These columns are marked as sensitive, but have Passthrough as the generator. The goal is to have 0 unprotected sensitive columns. When you click the value, you navigate to Database View, filtered to display the at-risk columns for the table.

    • Privacy Status - Indicates the current protection status of the columns in the table. It provides the same view and configuration options as the protection status panels at the top of Privacy Hub.

    hashtag
    Filtering the list

    You can filter the Database Tables list either by the table name or by the schema.

    hashtag
    Filtering by table name

    To filter the list by table name, in the filter field, begin to type text that is in the table name. As you type, Structural updates the list to only display matching tables.

    hashtag
    Filtering by schema

    To filter the list to only include tables that belong to a specific schema:

    1. Click Filter by Schema.

    2. From the schema dropdown list, select the schema.

    When you select a schema, Structural adds it to the filter field.

    hashtag
    Sorting the list

    You can sort the Database Tables list by any column except for the Privacy Status column.

    To sort by a column, click the column heading. To reverse the sort order, click the heading again.

    hashtag
    Managing columns from the table list

    The Privacy Status column in the Database Tables list indicates the protection status of the columns in the table.

    This column provides the same options to view and configure columns as the protection status panels at the top of Privacy Hub, but it is limited to the columns in a specific table.

    hashtag
    Viewing and configuring columns

    hashtag
    Navigating through columns and viewing column details

    Each protection status panel displays a series of boxes to represent the columns that apply to that status. For example, if the source data contains four columns that are at-risk, then the At-Risk Columns panel displays four boxes, one for each column.

    The Privacy Status column in the Database Tables list displays the same set of boxes for the columns in an individual table.

    If the number of columns is too large to fit, then the last box shows the number of additional columns that apply. For example, if there are 15 columns that don't fit, then the last box is labeled +15.

    When you hover over a box, the column name displays in a tooltip.

    When you click a box, the details panel for that column displays.

    When you click the box for remaining columns, the details panel for the first column in the remaining columns displays.

    You can use the next and previous icons at the bottom right of the details panel to display the details for the next or previous column.

    The column details panel opens to the settings view. The settings view contains the following information:

    • The table and column name.

    • Whether the column is flagged as sensitive.

    • The type of sensitive data that the column contains.

    • The data type for the column data.

    • The generator that is assigned to the column.

    • For a child workspace, whether the column configuration is inherited from the parent workspace. For columns that have overrides, you can reset to the parent configuration.

    hashtag
    Indicating whether a column is sensitive

    circle-info

    Required workspace permission: Configure column sensitivity

    From the settings view of the column details, you can configure the column sensitivity.

    You cannot change the sensitivity of columns in a child workspace. A child workspace always inherits the sensitivity from its parent workspace. For more information, go to About workspace inheritance.

    As you change the column sensitivity, Structural updates the protection status panels.

    To change whether the column is sensitive, toggle the Sensitive option. The column is moved if needed to reflect its new status. However, you remain on the current panel.

    For example, from the At-Risk Columns panel, you change a column to be not sensitive. The column is moved to the Not Sensitive Columns panel. When you click the next or previous icons, you view the details for the next or previous column on the At-Risk Columns panel.

    hashtag
    Selecting and configuring a generator for the column

    circle-info

    Required workspace permission: Configure column generators

    From the column details, you can assign and configure the column generator.

    When you change the column generator, Structural updates the protection status panels.

    If the column generator was previously Passthrough, then the column is moved to the Protected Columns panel. However, you remain on the current panel. For example, you assign a generator to a column that is on the At-Risk Columns panel. The column is moved to the Protected Columns panel, but when you click the next or previous icons, you view the details for the next or previous column on the At-Risk Columns panel.

    hashtag
    Selecting the generator

    For sensitive columns that are not protected, Structural displays the recommended generator as a button.

    For self-hosted instances that have an Enterprise license, the recommended generator is the built-in generator preset.

    To assign the recommended generator to the column, click the button.

    Otherwise, select the generator from the Generator Type dropdown list.

    For more information about selecting a generator, go to Assigning and configuring generators.

    hashtag
    Configuring the generator

    If the selected generator requires additional configuration, then below the Generator Type dropdown list is an Edit Generator Options link.

    To display the configuration fields for the generator, click Edit Generator Options.

    For information about configuring a selected generator or generator preset, go to Assigning and configuring generators.

    After you configure the generator, to return to the settings view, click Back.

    hashtag
    Displaying sample data for a column

    circle-info

    Required workspace permission:

    • Source data: Preview source data

    • Destination data: Preview destination data

    From the column details, you can display sample data for the column. The sample data allows you to compare the source and destination versions of the column values.

    To display the sample data, click the view sample (magnifying glass) icon.

    On the sample data view of the column details:

    • The Original Data tab shows the values in the source data.

    • The Protected Output tab shows the values that the generator produced.

    hashtag
    Enabling Document View for JSON columns

    circle-info

    Supported only for the file connector and PostgreSQL.

    For a JSON column, instead of assigning a generator, you can enable Document View.

    From Document View, you can view the JSON schema structure and assign generators to individual JSON fields. For more information, go to Using Document View for JSON columns.

    To enable Document View, on the column details panel, toggle Use Document View to the on position. When Document View is enabled, the generator dropdown is replaced with the Open in Document View option.

    hashtag
    Commenting on a column

    circle-info

    Required license: Professional or Enterprise

    From the column details, you can view and add comments on the column. You might use a comment to explain why you selected a particular generator or marked a column as sensitive or not sensitive.

    From the column details, to display the comments for the column, click the comment icon.

    The comments view displays any existing comments on the column. The most recent comment is at the bottom of the list. Each comment includes the name of the user who made the comment.

    To add the first comment to a column, type the comment into the comment text area, then click Comment.

    To add an additional comment, type the comment into the comment text area, then click Reply.

    hashtag
    Downloading a preview Privacy Report

    circle-info

    Required license: Enterprise

    The Privacy Report files that you download from Privacy Hub or the workspace download menu provide an overview of the current protection status based on the current configuration.

    This is different from the Privacy Report files that you download from the data generation job details, which show the protection status for the data produced by that data generation.

    You can download either:

    • The Privacy Report .csv file, which provides details about the table columns, the column content, and the current protection configuration.

    • The Privacy Report PDF file, which provides charts that summarize the privacy ranking scores for the table columns. It also includes the table from the .csv file.

    For more information about the Privacy Report files and their content, go to Using the Privacy Report to verify data protection.

    hashtag
    From workspace management view

    To download the report from the workspace management view, click the download icon. In the download menu:

    • To download the Privacy Report PDF file, click Download Privacy Report PDF.

    • To download the Privacy Report .csv file, click Download Privacy Report CSV.

    hashtag
    From Privacy Hub

    To download the report from Privacy Hub, click Reports and Logs, then:

    • To download the Privacy Report .csv file, click Privacy Report CSV.

    • To download the Privacy Report PDF file, click Privacy Report PDF.

    hashtag
    Running a new sensitivity scan on the data

    circle-info

    Required workspace permission: Run sensitivity scan

    Privacy Hub provides an option to manually start a new sensitivity scan. For example, you might want to run a new sensitivity scan when:

    • You add columns to the source database. The new scan identifies whether the new columns contain sensitive data.

    • The data in a column changes significantly, and a column that Structural originally marked as not sensitive might now contain sensitive data.

    You cannot run a sensitivity scan on a child workspace. Child workspaces always inherit the sensitivity results from their parent workspace.

    To run a new sensitivity scan, click Run Sensitivity Scan.

    When Structural runs a new sensitivity scan:

    • Structural analyzes and determines the sensitivity of any new columns.

    • It does not change the sensitivity of existing columns that you marked as sensitive or not sensitive.

    • For existing columns that you did not change the sensitivity of:

      • Structural does not change the sensitivity of columns that the original scan marked as sensitive.

      • It can change the sensitivity of columns that the original scan marked as not sensitive.

    The protection status panels are updated to reflect the results of the new scan.
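
    In other words, a new scan adds sensitivity information for new columns and can upgrade columns that were previously considered not sensitive, but it never overrides a manual assignment and never downgrades a column that an earlier scan marked as sensitive. The following Python sketch illustrates that merge rule; it is a simplification, not Structural's implementation.

    def merge_scan_results(existing, new_scan):
        """
        existing: column name -> {"sensitive": bool, "manually_set": bool}
        new_scan: column name -> bool (whether the new scan detected sensitive data)
        """
        merged = {column: dict(state) for column, state in existing.items()}
        for column, detected in new_scan.items():
            if column not in merged:
                # New column: use the new scan's result.
                merged[column] = {"sensitive": detected, "manually_set": False}
            elif merged[column]["manually_set"] or merged[column]["sensitive"]:
                # Manual assignments and previously detected sensitive columns do not change.
                continue
            else:
                # Columns previously marked not sensitive can become sensitive.
                merged[column]["sensitive"] = detected
        return merged

    existing = {
        "customers.Email": {"sensitive": True, "manually_set": False},
        "customers.Notes": {"sensitive": False, "manually_set": False},
    }
    print(merge_scan_results(existing, {"customers.Notes": True, "customers.SSN": True}))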


    Structural license plans

    Tonic Structural provides different license plans to accommodate organizations that are of different sizes and that have more or less complex data architectures.

    hashtag
    Basic license

    The Basic license is designed for very small organizations that have a very simple data architecture. It provides access to Structural's core de-identification and data generation features.

    hashtag
    Users

    The Basic license allows access for a single user, with an option to purchase an additional two users.

    There is no access to single sign-on (SSO).

    hashtag
    Data connectors

    With a Basic license, you can create workspaces for one data connector type. The data connector type must be one of the following:

    • PostgreSQL

    • MySQL

    hashtag
    Concurrent jobs

    With a Basic license, your Structural instance can have only one Structural worker. This means that only one sensitivity scan or data generation job can run at a time.

    hashtag
    Structural features

    With a Basic license, you can create and configure workspaces, and run data generation for those workspaces.

    You can use Privacy Hub to view the current sensitivity status based on the current workspace configuration.

    The Basic license does NOT provide access to the following features:

    hashtag
    Structural API

    With a Basic license, you only have access to the basic version of the Structural API.

    You cannot use the basic Structural API to perform the following API tasks, which require the advanced API:

    • Assigning table modes to tables

    • Assigning generators to columns

    hashtag
    Professional license

    The Professional license is designed for larger organizations that have more complex data architectures. The organization might have a larger team that supports multiple databases.

    The Professional license is also granted to pay-as-you-go subscriptions on Structural Cloud.

    The Professional license provides access to a larger set of Structural features than the Basic license.

    hashtag
    Users

    The Professional license allows up to 10 users. You can purchase access for unlimited users as an add-on.

    You can use single sign-on (SSO) to manage your Structural users.

    hashtag
    Data connectors

    With a Professional license, you can create workspaces for up to two types of data connectors. You can purchase one additional data connector type as an add-on.

    Those data connectors can be of any type except for Oracle and Db2 for LUW.

    hashtag
    Concurrent jobs

    With a Professional license, your Structural instance can have more than one Structural worker.

    This means that you can run multiple jobs from different workspaces at the same time. You can never run multiple jobs from the same workspace at the same time.

    hashtag
    Structural features

    With a Professional license, you can do the following:

    • Create and configure workspaces, and run data generation for those workspaces.

    • Use Privacy Hub to view the current sensitivity status for your workspace configuration.

    • Grant other users Manager and Editor access to your workspaces. The Professional license does not allow you to assign the built-in Viewer and Auditor permission sets.

    The Professional license does NOT provide access to the following features:

    hashtag
    Structural API

    With a Professional license, you only have access to the basic version of the Structural API.

    You cannot use the basic Structural API to perform the following API tasks, which require the advanced API:

    • Assigning table modes to tables

    • Assigning generators to columns

    hashtag
    Enterprise license

    The Enterprise license is ideal for very large organizations that have multiple teams that support very large and complex data structures, and that might have more requirements related to scale and compliance.

    It provides full access to all Structural features.

    hashtag
    Users

    An Enterprise instance does not limit the number of users.

    hashtag
    Data connectors

    You can use any number of any of the available data connectors.

    The Enterprise license provides exclusive access to the Oracle and Db2 for LUW data connectors.

    hashtag
    Structural features

    The following features are exclusive to the Enterprise license:

    hashtag
    Structural API

    The Enterprise license provides exclusive access to the advanced API.

    The advanced Structural API provides access to all of the available API tasks, including the following tasks that are not available in the basic API:

    • Assigning table modes to tables

    • Assigning generators to columns

    hashtag
    Feature comparison across Structural license plans

    The following table compares the available features for the Structural license plans.

    Feature
    Basic
    Professional
    Enterprise
    Custom sensitivity rules
  • Column commenting and email notifications

  • Audit Trail

  • Privacy Report

  • Virtual foreign keys - Can view foreign keys from the data, but cannot add virtual foreign keys

  • Subsetting

  • Upsert

  • Post-job scripts

  • Webhooks

  • Alerts for non-conflicting schema changes

  • Custom generators

  • Custom value processors

  • Custom permission sets

  • Make comments on table columns. The comments can trigger email notifications.

  • Run post-job scripts and configure webhooks.

  • Use subsetting to generate a smaller destination database.

  • Create and manage generator presets.

  • Create and manage custom sensitivity rules.

  • Create virtual foreign keys.

  • Use upsert to add destination database records and update existing destination database records, but keep unchanged destination database records in place. The Professional license does not allow you to connect to migration scripts.

  • Use Schema Changes view to view and address both conflicting and non-conflicting changes to the source data schema.

  • Use Structural data encryption to have Structural decrypt source data, encrypt destination data, or both.

  • Request custom value processors, which are primarily developed to preserve encryption that can't be managed using Structural data encryption. You can also purchase custom generators.

  • Privacy Report
  • Custom permission sets

  • Global permission set assignment

  • Workspace inheritance
  • Secrets managers for database connections

  • Upsert migration service

  • Managing custom sensitivity rules

    Manager, Editor

    Manager, Editor, Auditor, Viewer

    Custom generators

    Available for purchase

    2 included; additional ones available for purchase

    ✓

    ✓

    ✓

    ✓

    ✓

    ✓

    ✓

    ✓

    ✓

    ✓

    ✓

    ✓

    ✓

    ✓

    ✓

    ✓

    ✓

    ✓

    ✓

    ✓

    ✓

    ✓

    ✓

    ✓

    ✓

    ✓

    ✓

    ✓

    ✓

    ✓

    ✓

    ✓

    ✓

    ✓

    ✓

    ✓

    ✓

    ✓

    ✓

    ✓

    ✓

    ✓

    ✓

    ✓

    ✓

    ✓

    ✓

    ✓

    Concurrent jobs (more than 1 worker)

    ✓

    ✓

    ✓

    ✓

    ✓

    ✓

    ✓

    Structural API:

    • Basic - Basic API

    • Professional - Basic API

    • Enterprise - Advanced API

    Number of users:

    • Basic - 1 (2 additional users available as add-ons)

    • Professional - 10 (unlimited users available as an add-on)

    • Enterprise - Unlimited

    Data connectors:

    • Basic - 1 data connector (PostgreSQL or MySQL)

    • Professional - 2 data connectors (1 additional data connector available as an add-on); any data connector except for Oracle or Db2 for LUW

    • Enterprise - Unlimited number from any available data connector

    Workspace permission sets (built-in)

    single sign-on (SSO)
    PostgreSQL
    MySQL
    create and configure workspaces
    run data generation
    Privacy Hub
    Workspace inheritance
    Workspace sharing
    Generator presets
    Assigning table modes to tables
    Assigning generators to columns
    pay-as-you-go subscriptions on Structural Cloud
    single sign-on (SSO)
    Oracle
    Db2 for LUW
    Create and configure workspaces
    run data generation
    Privacy Hub
    Grant other users Manager and Editor access to your workspaces
    Workspace inheritance
    Secrets managers for database connections
    Audit Trail
    Assigning table modes to tables
    Assigning generators to columns
    Oracle
    Db2 for LUW
    Granting Viewer and Auditor access to workspaces
    Audit Trail
    Privacy Report
    Assigning table modes to tables
    Assigning generators to columns
    Managing generator presets

    Manager

    Create workspaces
    View Privacy Hub
    Run sensitivity scans
    Assign table modes
    Assign generators to columns
    Run data generation
    View job details
    Schema change monitoring (conflicting changes)
    Schema change monitoring (non-conflicting changes)
    Single sign-on (SSO)
    Commenting and notifications
    Structural data encryption
    Custom value processors
    Custom sensitivity rules
    Generator presets
    Virtual foreign keys
    Subsetting
    Upsert
    Post-job scripts
    Webhooks
    Workspace inheritance
    Configure secrets managers for database connections
    Privacy Report
    Audit Trail
    Custom permission sets

    Generator summary

    The following table summarizes the available generators. The table includes generator characteristics that you might take into account when you select the generator to use for a column.

    Generator hints and tips also provides some suggestions for generators to use for specific use cases.

    hashtag
    Information in the table

    The generator summary includes the following columns:

    • Generator - The name of the generator, linked to the entry in the .

    • Description - An overview description of the generator.

    • Supported features - Includes the following information:

      • The that the generator supports

      • Whether the generator is a or a

    Generator
    Description
    Supported features
    The generator privacy ranking

    Within an array, replaces letters with random other letters, and numbers with random other numbers. Preserves punctuation and whitespace.

    Consistency - Self only Privacy ranking: - 3 if not consistent - 4 if consistent

    API:

    Used to transform array values in JSON.

    To identify values to transform, you provide a list of JSONPaths. For each JSONPath, you assign a sub-generator to apply to matching values.

    Composite generator. Feature support is based on the sub-generators. Privacy ranking: 5

    API:

    Used to transform values in an array. To identify values to transform, you provide a regular expression. For each capture group in an expression, you assign a sub-generator to apply to matching values.

    Composite generator. Feature support is based on the sub-generators. Privacy ranking: 5

    API:

    Generates unique alpha-numeric strings based on any printable ASCII characters. You can optionally exclude lowercase letters from the generated values. The replacement value does not preserve the length of the original value.

    Consistency - Self only Primary key generator Unique columns allowed Format-preserving encryption (FPE) Privacy ranking: - 3 if not consistent - 4 if consistent

    API:

    Generates a random company name-like string.

    Consistency - Self or other Differential privacy if not consistent Data-free if not consistent Privacy ranking: - 1 if not consistent - 4 if consistent

    API:

    Shuffles the original values for a column to different rows. Maintains the overall frequency of each value. For example, a column contains the values Small (3 times), Medium (4 times), and Large (5 times). In the transformed data, each value appears the same number of times, but the values are shuffled to different rows.

    Linkable Differential privacy is configurable Privacy ranking: - 2 with differential privacy - 3 without differential privacy

    API:

    Replaces letters with random other letters and numbers with random other numbers. Preserves punctuation, whitespace, and mathematical symbols.

    Consistency - Self only Privacy ranking: - 3 if not consistent - 4 if consistent

    API:

    Replaces characters with other random characters. Preserves punctuation, capitalization, and whitespace. A replacement character is always from within the same Unicode Block as the source character. A source character is always mapped to the same destination character. For example, M might always map to V.

    Always self-consistent Unique columns allowed Privacy ranking: 4

    (Deprecated) API:

    This generator is deprecated. Use the generator instead. Generates a random company name-like string.

    Consistency - Self or other Differential privacy if not consistent Data-free if not consistent Privacy ranking: - 1 if not consistent - 4 if consistent

    API:

    Applies different generators to rows conditionally based on the column value. For example, apply the Character Scramble generator for values other than Test. You configure a list of conditions. Each condition performs a check against the column value. For each condition, you assign a sub-generator to apply to matching values.

    Unique columns allowed Composite generator. Other feature support is based on the sub-generators. Privacy ranking: If a fallback generator is selected, then the lower of 5 or the fallback generator. 5 if no fallback generator is selected.

    API:

    Uses a single specified value to replace all of the values in the column. The replacement value must be compatible with the column data type.

    Differential privacy Data-free Privacy ranking: 1

    API:

    Generates a continuous distribution to fit the underlying data. Can link to other columns to create multivariate distributions. Can also be partitioned by other columns.

    Linkable Differential privacy is configurable Privacy ranking: - 2 with differential privacy - 3 without differential privacy

    API:

    Populates the column using the sum of values from a column in another table. To select the rows to use, uses a foreign key value that matches the primary key value for the current row. For example, to transform the Total_Sales column in the Customers table, from the Transactions table, use the sum of the Amount values for rows where the Customer_ID value matches the primary key value for the current customer.

    Privacy ranking: 3

    API:

    Used to mask text in a delimited format.

    Parses the text as a row where the columns are delimited by a specified character. For each index, you assign a sub-generator to apply to the index value.

    Composite generator. Feature support is based on the sub-generators. Privacy ranking: 5

    API:

    Replaces the original column value with a value from a list of values that you provide.

    Consistency - Self and other Linkable Differential privacy if not consistent Data-free if not consistent Privacy ranking: - 1 if not consistent - 4 if consistent

    API:

    Truncates dates or timestamps to a specific date or time component. For example, you might truncate a date value to the month or a timestamp to the hour.

    Privacy ranking: 5

    API:

    Scrambles characters in an email address.

    Preserves the formatting and keeps the @ sign and the dots (.). You can identify specific email domains to not scramble.

    Consistency - Self only Privacy ranking: - 3 if not consistent - 4 if consistent

    API:

    Generates timestamps that fit an event distribution. You can link columns to create a sequence of events across multiple columns. You can also partition the generator by other columns.

    Linkable Privacy ranking: 3

    API:

    Scrambles characters in a file name.

    Preserves the formatting and the file extension.

    Consistency - Self only Privacy ranking: - 3 if not consistent - 4 if consistent

    API:

    Replaces all instances of the find string with the replace string. For the find string, you can optionally provide a regular expression.

    Privacy ranking: 5

    API:

    Generates a valid Finnish Personal Identity Code (PIC).

    You configure the date range during which the PIC was issued.

    Consistency - Self only

    Data-free if not consistent

    Unique columns allowed

    Format-preserving encryption (FPE)

    Privacy ranking:

    • 1 if not consistent

    API:

    Transforms Norwegian national identity numbers. You can optionally preserve the gender and birthdate portions of the identifier values.

    Consistency - Self and other Unique columns allowed Privacy ranking: - 3 if not consistent - 4 if consistent

    API:

    Used to transform columns that contain latitude and longitude values.

    Linkable Unique columns allowed Privacy ranking: 3

    API:

    Can be used to generate cities, states, zip codes, and latitude/longitude values that follow HIPAA guidelines for safe harbor.

    Consistency - Self only Privacy ranking: - 3 if not consistent - 4 if consistent

    API:

    Generates random host names, based on the English language.

    Consistency - Self and other Differential privacy if not consistent Data-free if not consistent Privacy ranking: - 1 if not consistent - 4 if consistent

    API:

    Used to transform values in an HStore column in a PostgreSQL database. You specify a list of keys for which to transform the values. For each key, you assign a generator to apply to the key value.

    Composite generator. Feature support is based on the sub-generators. Privacy ranking: 5

    API: HStoreMaskGenerator

    Used to transform columns that contain HTML content. To identify the values to transform, you provide a list of path expressions. For each path expression, you assign a generator to apply to the matching value.

    Composite generator. Feature support is based on the sub-generators. Privacy ranking: 5

    API: HtmlMaskGenerator

    Generates unique integer values.

    By default, the generated values are within the range of the column’s data type.

    You can also specify a range for the generated values. The source values must be within that range.

    Consistency - Self only
    Differential privacy if not consistent
    Data-free if not consistent
    Primary key generator
    Unique columns allowed
    Format-preserving encryption (FPE)
    Privacy ranking:
    • 1 if not consistent
    • 4 if consistent

    API: IntegerPkGenerator

    For Canadian mailing addresses, can generate:

    • Street name

    • Postal code

    For United Kingdom (UK) mailing addresses, can generate:

    • City
    • County
    • District
    • Country
    • Postal code

    Consistency - Self only
    Differential privacy if not consistent
    Data-free if not consistent
    Privacy ranking:
    • 1 if not consistent
    • 4 if consistent

    API: InternationalAddressGenerator

    Generates a random IP address-formatted string. You specify the percentage of IPv4 addresses. The remaining addresses are IPv6.

    Consistency - Self or other
    Differential privacy if not consistent
    Data-free if not consistent
    Privacy ranking:
    • 1 if not consistent
    • 4 if consistent

    API: IPAddressGenerator

    Used to transform values in JSON columns. To identify values to transform, you provide a list of JSONPaths.

    For each JSONPath, you assign a sub-generator to apply to matching values.
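
    The following Python sketch illustrates the idea with a deliberately simplified dotted path rather than a full JSONPath implementation; the mask_path helper is hypothetical and is not part of Tonic.

        # Illustrative sketch: apply a masking function to the value selected by a
        # simplified dotted path.
        import json

        def mask_path(document: dict, dotted_path: str, sub_generator) -> dict:
            keys = dotted_path.split(".")
            node = document
            for key in keys[:-1]:
                node = node[key]
            node[keys[-1]] = sub_generator(node[keys[-1]])
            return document

        doc = {"user": {"name": "Jane Doe", "plan": "pro"}}
        print(json.dumps(mask_path(doc, "user.name", lambda v: "REDACTED")))
        # {"user": {"name": "REDACTED", "plan": "pro"}}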

    Composite generator. Feature support is based on the sub-generators. Privacy ranking: 5

    API: JsonMaskGenerator

    Generates a random MAC address formatted string.

    Consistency - Self only
    Differential privacy if not consistent
    Data-free if not consistent
    Format-preserving encryption (FPE)
    Privacy ranking:
    • 1 if not consistent
    • 4 if consistent

    API: MACAddressGenerator

    Generates unique MongoDB ObjectId values. Can be assigned to text columns that contain MongoDB ObjectId values. The column value must be 12 bytes long.

    Consistency - Self only
    Privacy ranking:
    • 3 if not consistent
    • 4 if consistent

    API: ObjectIdPkGenerator

    Generates a random name string from a dictionary of first and last names. You specify the name format. For example, a column might contain only a first name, or a full name that is last name first.

    Consistency - Self or other
    Differential privacy if not consistent
    Data-free if not consistent
    Privacy ranking:
    • 1 if not consistent
    • 4 if consistent

    API: NameGenerator

    Masks values in numeric columns.

    Either adds random noise to the original value or multiplies the original value by random noise.
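
    A minimal Python sketch of the two noise modes follows. It is illustrative only; the magnitude and percent parameters are hypothetical names.

        # Illustrative sketch: additive noise shifts the value by a random offset,
        # multiplicative noise scales it by a random factor.
        import random

        def add_noise(value: float, magnitude: float) -> float:
            return value + random.uniform(-magnitude, magnitude)

        def multiply_noise(value: float, percent: float) -> float:
            return value * random.uniform(1 - percent, 1 + percent)

        salary = 85000.0
        print(add_noise(salary, 500.0))      # within +/- 500 of the original
        print(multiply_noise(salary, 0.05))  # within +/- 5% of the original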

    Consistency - Self or other
    Privacy ranking:
    • 3 if not consistent
    • 4 if consistent

    API: NoiseGenerator

    Replaces all of the column values with NULL values.

    Differential privacy
    Data-free
    Unique columns allowed
    Privacy ranking: 1

    API: NullGenerator

    Generates unique numeric strings of the same length as the input numeric string.

    Consistency - Self only
    Primary key generator
    Unique columns allowed
    Format-preserving encryption (FPE)
    Privacy ranking:
    • 3 if not consistent
    • 4 if consistent

    API: NumericStringPkGenerator

    Default generator. Does not perform any transformation on the source data.

    Unique columns allowed
    Privacy ranking: 6

    API: PassthroughGenerator

    Generates a random telephone number that matches the country or region and the format of the input telephone number. For invalid telephone numbers, either replaces the individual digits or generates a valid replacement number.

    Consistency - Self only
    Privacy ranking: 3

    API: USPhoneNumberGenerator

    Generates a random boolean value. You specify the percentage of true values. The remaining values are false.

    Differential privacy
    Data-free
    Privacy ranking: 1

    API: RandomBooleanGenerator

    Generates a random double number that is between the specified minimum (inclusive) and maximum (exclusive) values.

    Differential privacy
    Data-free
    Privacy ranking: 1

    API: RandomDoubleGenerator

    Generates a random hash string.

    Differential privacy
    Data-free
    Privacy ranking: 1

    API: RandomStringGenerator

    Returns a random integer that is between the specified minimum (inclusive) and maximum (exclusive) values.

    Differential privacy
    Data-free
    Privacy ranking: 1

    API: RandomIntegerGenerator

    Generates random dates, times, and timestamps that fall within a specified range.

    Differential privacy
    Data-free
    Privacy ranking: 1

    API: RandomTimestampGenerator

    Generates a new random UUID string.

    Differential privacy
    Data-free
    Unique columns allowed
    Privacy ranking: 1

    API: UUIDGenerator

    To identify values to transform, you provide a regular expression.

    For each capture group in an expression, you assign a sub-generator to apply to matching values.
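
    A Python sketch of per-capture-group masking follows. It is illustrative only; the group_generators mapping is a hypothetical stand-in for the configured sub-generators.

        # Illustrative sketch: each capture group gets its own replacement function,
        # and text outside the groups is preserved. Groups are replaced from right
        # to left so that earlier spans stay valid.
        import re

        def regex_mask(value: str, pattern: str, group_generators: dict) -> str:
            def replace(match: re.Match) -> str:
                result = match.group(0)
                offset = match.start(0)
                for index in sorted(group_generators, reverse=True):
                    start, end = match.span(index)
                    masked = group_generators[index](match.group(index))
                    result = result[: start - offset] + masked + result[end - offset:]
                return result

            return re.sub(pattern, replace, value)

        print(regex_mask("Order 12345 for jane", r"Order (\d+) for (\w+)",
                         {1: lambda s: "00000", 2: lambda s: "customer"}))
        # Order 00000 for customer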

    Unique columns allowed
    Composite generator. Other feature support is based on the sub-generators.
    Privacy ranking: 5

    API: RegexMaskGenerator

    Generates a column of unique integer values that start with a specified value, and then increment by 1 for each processed row.

    Linkable
    Unique columns allowed
    Privacy ranking: 3

    API: UniqueIntegerGenerator

    Generates ISO 6346-compliant shipping container codes. The codes are all in the freight ("U") category.

    Consistency - Self or other
    Differential privacy if not consistent
    Data-free if not consistent
    Privacy ranking:
    • 1 if not consistent
    • 4 if consistent

    API: ShippingContainerGenerator

    Generates a new valid Canadian Social Insurance Number. Preserves the formatting from the original value.

    Consistency - Self only
    Data-free if not consistent
    Unique columns allowed
    Format-preserving encryption (FPE)
    Privacy ranking:
    • 1 if not consistent
    • 4 if consistent

    API: SINGenerator

    Generates a new valid United States Social Security Number. For numeric columns, the dashes (xxx-xx-xxxx) are always excluded. Otherwise, you can specify the percentage of values for which to include the dashes.

    Consistency - Self or other
    Differential privacy if not consistent
    Data-free if not consistent
    Privacy ranking:
    • 1 if not consistent
    • 4 if consistent

    API: SsnGenerator

    Used to transform StructFields within a StructType in Databricks data. To identify the StructField values to transform, you provide path expressions. For each path expression, you assign a sub-generator to apply to the matching values.

    Composite generator. Feature support is based on the sub-generators. Privacy ranking: 5

    API: StructMaskGenerator

    Shifts timestamps by a random amount of a specific unit of time, within a set range. The range can start before the original value.
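
    A minimal Python sketch of a bounded random shift follows. It is illustrative only; the min_days and max_days parameters are hypothetical names, and the consistency option is not shown.

        # Illustrative sketch: shift a timestamp by a random number of days within
        # a window that can start before the original value.
        import random
        from datetime import datetime, timedelta

        def shift_timestamp(ts: datetime, min_days: int = -30, max_days: int = 30) -> datetime:
            return ts + timedelta(days=random.randint(min_days, max_days))

        admitted = datetime(2023, 3, 10, 14, 30)
        print(shift_timestamp(admitted))  # between 2023-02-08 and 2023-04-09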

    Consistency - Self or other
    Privacy ranking:
    • 3 if not consistent
    • 4 if consistent

    API: TimestampShiftGenerator

    Generates unique email addresses.

    Replaces the username with a randomly generated GUID, and masks the domain with a character scramble.

    Consistency - Self only
    Unique columns allowed
    Privacy ranking:
    • 3 if not consistent
    • 4 if consistent

    API: UniqueEmailGenerator

    Used to transform URLs. Preserves the formatting. Keeps the URL scheme and top-level domain intact.

    Unique columns allowed
    Privacy ranking: 3

    API: UrlGenerator

    Generates UUIDs.

    Consistency - Self only
    Primary key generator
    Unique columns allowed
    Format-preserving encryption (FPE)
    Privacy ranking:
    • 3 if not consistent
    • 4 if consistent

    API: UuidPkGenerator

    Used to transform values in XML columns. To identify the values to transform, you provide XPaths. For each XPath, you assign a sub-generator to apply to the matching values.

    Composite generator. Feature support is based on the sub-generators. Privacy ranking: 5

    API: XmlMaskGenerator

    Address API: AddressGenerator

    Generates replacement values for U.S. mailing addresses. You select the address component or format for the replacement values. For example, the column might only contain a street address or a postal code, or it might contain a full address.

    Consistency - Self and other
    Linkable
    Differential privacy if not consistent
    Data-free if not consistent
    Privacy ranking:
    • 1 if not consistent
    • 4 if consistent

    Algebraic API: AlgebraicGenerator

    Identifies the algebraic relationship between 3 or more numeric values, including at least one non-integer. Based on the relationship, generates new values to match. If there is no relationship, uses the Categorical generator.
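
    As an illustration of preserving an algebraic relationship such as total = price * quantity, consider the Python sketch below. It is illustrative only and is not Tonic's implementation.

        # Illustrative sketch: if the source rows satisfy total = price * quantity,
        # replace price and quantity with new values and recompute total so that
        # the relationship still holds.
        import random

        def regenerate_row(row: dict) -> dict:
            new_price = round(random.uniform(1.0, 100.0), 2)
            new_quantity = random.randint(1, 20)
            return {
                "price": new_price,
                "quantity": new_quantity,
                "total": round(new_price * new_quantity, 2),
            }

        print(regenerate_row({"price": 19.99, "quantity": 3, "total": 59.97}))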

    Linkable - linking is required
    Privacy ranking: 3

    Alphanumeric String Key API: AlphaNumericPkGenerator

    Generates unique alphanumeric strings of the same length as the input. For example, for the origin value ABC123, the output value is a six-character alphanumeric string such as D24N05.
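
    A Python sketch of a length-preserving alphanumeric replacement follows. It is illustrative only and does not show the uniqueness, consistency, or FPE guarantees.

        # Illustrative sketch: keep the length of the input, but replace every
        # character with a random uppercase letter or digit.
        import random
        import string

        def alphanumeric_key(value: str) -> str:
            alphabet = string.ascii_uppercase + string.digits
            return "".join(random.choice(alphabet) for _ in value)

        print(alphanumeric_key("ABC123"))  # e.g. D24N05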

    Consistency - Self only
    Primary key generator
    Unique columns allowed
    Format-preserving encryption (FPE)
    Privacy ranking:
    • 3 if not consistent
    • 4 if consistent

    The generators and their API names:

    Address: AddressGenerator
    Algebraic: AlgebraicGenerator
    Alphanumeric String Key: AlphaNumericPkGenerator
    Array Character Scramble: ArrayTextMaskGenerator
    Array JSON Mask: ArrayJsonMaskGenerator
    Array Regex Mask: ArrayRegexMaskGenerator
    ASCII Key: AsciiPkGenerator
    Business Name: BusinessNameGenerator
    Categorical: CategoricalGenerator
    Character Scramble: TextMaskGenerator
    Character Substitution: StringMaskGenerator
    Company Name: CompanyNameGenerator (see Business Name)
    Conditional: ConditionalGenerator
    Constant: ConstantGenerator
    Continuous: GaussianGenerator
    Cross Table Sum: CrossTableAggregateGenerator
    CSV Mask: CsvMaskGenerator
    Custom Categorical: CustomCategoricalGenerator
    Date Truncation: DateTruncationGenerator
    Email: EmailGenerator
    Event Timestamps: EventGenerator
    File Name: FileNameGenerator
    Find and Replace: FindAndReplaceGenerator
    Finnish Personal Identity Code: FinnishPicGenerator
    FNR: FnrGenerator
    Geo: GeoGenerator
    HIPAA Address: HipaaAddressGenerator
    Hostname: HostnameGenerator
    HStore Mask: HStoreMaskGenerator
    HTML Mask: HtmlMaskGenerator
    Integer Key: IntegerPkGenerator
    International Address: InternationalAddressGenerator
    IP Address: IPAddressGenerator
    JSON Mask: JsonMaskGenerator
    MAC Address: MACAddressGenerator
    Mongo ObjectId Key: ObjectIdPkGenerator
    Name: NameGenerator
    Noise Generator: NoiseGenerator
    Null: NullGenerator
    Numeric String Key: NumericStringPkGenerator
    Passthrough: PassthroughGenerator
    Phone: USPhoneNumberGenerator
    Random Boolean: RandomBooleanGenerator
    Random Double: RandomDoubleGenerator
    Random Hash: RandomStringGenerator
    Random Integer: RandomIntegerGenerator
    Random Timestamp: RandomTimestampGenerator
    Random UUID: UUIDGenerator
    Regex Mask: RegexMaskGenerator
    Sequential Integer: UniqueIntegerGenerator
    Shipping Container: ShippingContainerGenerator
    SIN: SINGenerator
    SSN: SsnGenerator
    Struct Mask: StructMaskGenerator
    Timestamp Shift: TimestampShiftGenerator
    Unique Email: UniqueEmailGenerator
    URL: UrlGenerator
    UUID Key: UuidPkGenerator
    XML Mask: XmlMaskGenerator