Using the Structural API to perform tasks

The examples and documentation are based on the most recently released version of the Tonic Structural API.

For more detailed information about the Structural API endpoints, parameters, and responses, go to the Structural generated API documentation.

Configure environment settings

Included in the Basic API.

You can use the Tonic Structural API to configure selected environment settings. The settings that you can configure are the same settings that are listed on the Environment Settings tab of Structural Settings.

Identify the settings that you can configure

To identify the settings that you can use the API to configure, use:

For each setting, the response includes the setting name, the current value, and the default value.

Update an environment setting

To update the value of an environment setting, use:

The request body provides the new value.

Restore an environment setting to the default value

For an environment setting that you can configure from the API, the default value is either:

The setting value in the .yaml file
If the setting is not configured in the .yaml file, the Structural default value

To restore the default value, use:

Manage generator presets

Generator presets are part of the Advanced API, which requires an Enterprise license.

A generator preset is a saved configuration for a generator. Each generator has a built-in generator preset, which provides the default configuration for that generator.

You can also create custom generator presets. For example, you can create different versions of the Address generator that use different configuration options.

Retrieving the list of generator presets

When using the API to update, delete, or assign generator presets to columns, you use the generator preset identifier.

To retrieve the list of generator presets and their current configuration, use:

The response contains an array of preset objects.

Structure of a generator preset

The preset object is used in requests and responses to provide the details of a generator preset.

High-level structure overview

At a very high level, the general structure of a generator preset is:

Each generator preset object contains:

Identifying information for the generator preset
Information about who most recently modified the generator preset
The generator configuration for the generator preset
When the generator preset was created and most recently modified

Identifying information for the generator preset

id is the generator preset identifier. When you create a generator preset, Tonic Structural provides the identifier for it in the response. You use the identifier to select a generator preset to update, delete, or assign to a column.

name is the name that you assign to the generator preset, and description allows you to provide additional information about the generator preset.

generatorId identifies the type of generator for the generator preset.

isBuiltIn indicates whether this is a built-in generator preset. It is only included in responses. For a built-in generator preset, id and generatorId are the same value. You can update a built-in generator preset, but you cannot delete it.
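
As a minimal sketch of how these fields might be used together, the following Python helper picks out the generator presets that can be deleted from a list of preset objects. The sample data is invented for illustration. Only the field names id, generatorId, and isBuiltIn come from the description above.

```python
def deletable_presets(presets):
    """Return the presets that are custom (not built in) and so can be deleted."""
    return [p for p in presets if not p.get("isBuiltIn", False)]


# Invented sample data. A real list would come from the API response.
sample_presets = [
    {"id": "NameGenerator", "generatorId": "NameGenerator", "isBuiltIn": True},
    {"id": "hypothetical-custom-id", "generatorId": "NameGenerator", "isBuiltIn": False},
]

custom = deletable_presets(sample_presets)
print([p["id"] for p in custom])
```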

Information about who most recently modified the generator preset

lastModifiedByUser identifies who most recently modified the generator preset configuration.

The user information includes the user's identifier, first name, last name, and email address.

Generator configuration

The metadata object contains the generator configuration for the generator preset.

Generator presets do not include the following configuration options:

Linking
Consistency with another column
Partitioning
Custom value processors

You always set those configuration options in the column configuration.

Creation and update dates

createdDate and lastModifiedDate indicate when the generator preset was created and when it was most recently updated.

Creating a custom generator preset

To create a new generator preset, use:

POST api/Presets

In the request, you provide a preset object with the details for the new generator preset.

The following example preset object creates a generator preset for the Name generator. This version of the Name generator is intended for values in the format Last Name, First Name. For example, Smith, John. It preserves capitalization and is not consistent.

{
  "name": "Name - Last Name, First Name",
  "description": "Use for values in the format Last Name, First Name.",
  "generatorId": "NameGenerator",
  "metadata": {
    "links": {
      "metadata": {
        "nameType": "LastCommaFirstName",
        "preserveCapitalization": true,
        "isConsistent": false
      }
    }
  }
}

For a successful request, the response contains the full generator preset details, including the identifier of the new generator preset.
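
The following Python sketch shows one way the request might be assembled with the standard library. It only builds the request and does not send it. The base URL, the API key value, and the Authorization header format are assumptions made for illustration, so check them against your own Structural instance.

```python
import json
import urllib.request

# Assumptions: placeholder base URL and API key, and an assumed header format.
BASE_URL = "https://structural.example.com"
API_KEY = "YOUR_API_KEY"

# Preset object copied from the example in this section.
preset = {
    "name": "Name - Last Name, First Name",
    "description": "Use for values in the format Last Name, First Name.",
    "generatorId": "NameGenerator",
    "metadata": {
        "links": {
            "metadata": {
                "nameType": "LastCommaFirstName",
                "preserveCapitalization": True,
                "isConsistent": False,
            }
        }
    },
}


def build_create_preset_request(preset, base_url=BASE_URL, api_key=API_KEY):
    """Build (but do not send) a POST request for api/Presets."""
    return urllib.request.Request(
        url=f"{base_url}/api/Presets",
        data=json.dumps(preset).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Apikey {api_key}",  # assumed header format
        },
    )


request = build_create_preset_request(preset)
print(request.get_method(), request.full_url)
```

To send the request, you could pass it to urllib.request.urlopen and read id from the parsed JSON response.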

Updating an existing generator preset

To update the configuration of an existing generator preset, use:

PUT api/Presets/{preset identifier}

In the request, you provide a preset object with the updated generator preset configuration.

For a successful request, the response contains the updated details for the generator preset.

Deleting a generator preset

You can only delete custom generator presets. You cannot delete a built-in generator preset.

To delete a generator preset, use:

For a successful request, the response contains the details for the deleted generator preset.

Manage custom sensitivity rules

Custom sensitivity rules are part of the Advanced API, which requires an Enterprise license.

A custom sensitivity rule allows you to define a sensitivity type that is not included in the Structural built-in types. For example, your data might include values that are specific to your organization.

Get the list of custom sensitivity rules

To retrieve the list of custom sensitivity rules and their current configuration, use:

For each rule, the response provides an object that includes:

Rule name
Rule description
The data type for matching columns
The regular expression used to identify matching column names. Even if the sensitivity rule was configured in the Structural application to use text matching rules, the matching rules are converted to a regular expression.
The identifier of the generator preset to recommend for matching columns
The user who most recently updated the rule
The timestamp when the rule was most recently updated
The sequence number for the rule. The sequence numbers control the order in which Structural applies the sensitivity rules.

Create a sensitivity rule

To create a sensitivity rule, use:

Rule name
An optional rule description
The data type for matching columns
The regular expression to use to identify matching column names. When you use the API to create a sensitivity rule, you must provide a regular expression. You cannot use text matching rules.
The identifier of the generator preset to recommend for matching columns

Structural automatically adds the sensitivity rule to the end of the list. To assign a sequence number, you must edit the sensitivity rule.

Update a sensitivity rule

To update an existing sensitivity rule, use:

Any changes to the sensitivity rule configuration are not applied until the next sensitivity scan.

When you change the sequence number, Structural automatically adjusts the sequence numbers of the other existing sensitivity rules.

Delete a sensitivity rule

To delete an existing sensitivity rule, use:

Create a workspace

Included in the Basic API.

To create a workspace, use:

In the request body, you provide the name and description for the new workspace:

If the request to create the workspace is successful, the response returns id, the workspace identifier.

You need the workspace identifier to make requests to update or run data generation jobs on the workspace.

For information on how to use the Structural application or API to retrieve the workspace identifier for an existing workspace, go to .

Connect to source and destination data

Included in the Basic API.

Connect to the source database

To configure the connection to the source database for the workspace, use:

You specify the data connector type, and provide the required connection details for that data connector type.

Connect to the destination database

Before you can run data generation, you must configure the connection to the destination database. To do this, use:

The destination database must use the same data connector type as the source database. You provide the required connection details based on the data connector type.

Manage file groups in a file connector workspace

Included in the Basic API.

The file connector requires a Professional or Enterprise license.

In a file connector workspace (dataType is Files), you use file groups to identify the data and configure data generation. Each file group contains a set of files that are of the same type (.csv, .json, .xml) and that have an identical format.

For a file group that contains .csv files, you also provide information about the file structure and parsing. This includes the escape character, quote character, null character, and delimiter.

When setting the .csv parsing options, you must properly escape any escapable characters. For example, in the API request body, to use \N as the null character, you must use an extra \ to escape the value. So instead of "nullChar": "\N", the request body would include "nullChar": "\\N".
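
If you build the request body in code, a JSON serializer normally applies this escaping for you. The following Python sketch illustrates the point with the standard json module. The nullChar field name comes from the text above, and the rest is illustrative.

```python
import json

# A Python raw string that holds two characters: a backslash and N.
null_char = r"\N"

# json.dumps escapes the backslash when it serializes the value.
body = json.dumps({"nullChar": null_char})
print(body)  # {"nullChar": "\\N"}

# Parsing the body gives back the original two-character value.
assert json.loads(body)["nullChar"] == null_char
```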

Getting the list of file groups

To retrieve the list of file groups in a file connector workspace, use:

GET /api/FileGroup

You provide the identifier of the workspace.

The results contain an array of FileGroupResponseModel objects. Each object represents a file group.

fileType identifies the type of file in the group.

For a file group that contains .csv files, the object includes the delimiter configuration (escapeChar, quoteChar, hasHeader, delimiter, nullChar). If the files have a header row, then the object includes the header list.

For a file group that contains files from Amazon S3 or Google Cloud Storage, the files object contains bucketKeyPair objects.

{
  "id": "string",
  "name": "string",
  "files": [
    {
      "bucketKeyPair": {
        "bucketName": "string",
        "key": "string"
      },
      "expectedFileType": "csv"
    }
  ],
  "createdDate": "2023-08-17T16:24:46.8443973Z",
  "workspaceId": "string",
  "escapeChar": "string",
  "quoteChar": "string",
  "hasHeader": true,
  "delimiter": "string",
  "nullChar": "string",
  "fileType": "string",
  "csvHeaderColumns": [
    "string"
  ]
}

For a file group that contains files from a local file system, the files object contains localFile objects, and also indicates whether there are available generated files to download:

{
  "id": "string",
  "name": "string",
  "files": [
    {
      "localFile": {
        "fileName": "string",
        "oid": 0
      },
      "expectedFileType": "string"
    }
  ],
  "createdDate": "2023-08-17T16:24:46.8443973Z",
  "workspaceId": "string",
  "escapeChar": "string",
  "quoteChar": "string",
  "hasHeader": true,
  "delimiter": "string",
  "nullChar": "string",
  "fileType": "string",
  "csvHeaderColumns": [
    "string"
  ],
  "hasGeneratedFilesAvailable": true
}
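
As a rough sketch, client code could tell the two response shapes apart by checking which key each entry in files carries. The helper name and the sample data below are hypothetical. The keys bucketKeyPair, localFile, and files are taken from the responses shown above.

```python
def file_group_source(file_group):
    """Classify a file group as 'cloud', 'local', 'mixed', or 'empty'."""
    files = file_group.get("files", [])
    if not files:
        return "empty"
    if all("bucketKeyPair" in f for f in files):
        return "cloud"
    if all("localFile" in f for f in files):
        return "local"
    return "mixed"


# Invented sample objects that follow the two shapes shown above.
cloud_group = {
    "files": [
        {"bucketKeyPair": {"bucketName": "b", "key": "k.csv"}, "expectedFileType": "csv"}
    ]
}
local_group = {
    "files": [{"localFile": {"fileName": "k.csv", "oid": 0}, "expectedFileType": "csv"}],
    "hasGeneratedFilesAvailable": True,
}

print(file_group_source(cloud_group), file_group_source(local_group))
```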

Creating a file group (Amazon S3 or Google Cloud Storage)

For a file connector workspace that uses Amazon S3 or Google Cloud Storage, to create a file group, use:

POST /api/FileGroup

You identify the workspace and provide the file group name. You also provide the list of files and, for a file group that contains .csv files, the delimiter configuration.

{
  "name": "string",
  "workspaceId": "string",
  "escapeChar": "string",
  "quoteChar": "string",
  "hasHeader": true,
  "delimiter": "string",
  "nullChar": "string",
  "files": [
    {
      "bucketKeyPair": {
        "bucketName": "string",
        "key": "string"
      },
      "expectedFileType": "string"
    }
  ]
}

Updating a file group (Amazon S3 or Google Cloud Storage)

To change the files in a file group that contains files from Amazon S3 or Google Cloud Storage, use:

PUT /api/FileGroup

In the request, you provide a FileGroupDefinitionModel object that contains a revised list of files. You cannot change the file group name, file type, or the .csv delimiter configuration.

Creating a file group (local files)

To create a file group that contains files from a local file system, use:

POST /api/FileGroup/create_local_filegroup

In the request, you provide a FileGroupDefinitionModel object that identifies the workspace and provides the file group name. If the file group contains .csv files, you also provide the delimiter configuration.

{
  "name": "string",
  "workspaceId": "string",
  "escapeChar": "string",
  "quoteChar": "string",
  "hasHeader": true,
  "delimiter": "string",
  "nullChar": "string"
}

Adding a file to a file group (local files)

To add a file to a file group from a local file system, use:

POST /api/FileGroup/upload_local_file

You identify the file group to add the file to, and include the file.

Deleting a file group

To remove a file group from a file connector workspace, use:

DELETE /api/FileGroup

In the request, you identify the workspace and the file group.

Downloading generated files (local files)

For a file group that contains files from a local file system, both the source files and the generated output files are stored in the Tonic Structural application database.

To identify file groups that have available output files to download, use:

GET /api/FileGroup/generated_file_availability

In the request, you provide the workspace identifier.

The response contains a list of the file group identifiers for which there are available generated files to download.

To download the available files for a file group, use:

GET /api/FileGroup/download/{workspaceId}/{fileGroupId}
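
The two calls can be chained. The sketch below assumes that the availability response has already been parsed into a Python list of file group identifier strings, and builds the download path for each one. The function name and the sample identifiers are hypothetical.

```python
from urllib.parse import quote


def download_paths(workspace_id, available_group_ids):
    """Build one download path per file group that has generated files."""
    workspace = quote(workspace_id, safe="")
    return [
        f"/api/FileGroup/download/{workspace}/{quote(group_id, safe='')}"
        for group_id in available_group_ids
    ]


# Invented identifiers. Real values come from the API responses.
paths = download_paths("ws-1", ["fg-a", "fg-b"])
for path in paths:
    print(path)
```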

Assign generators to columns

Requires the Advanced API. The Advanced API requires an Enterprise license.

Getting the generator IDs and available metadata

Requires the Advanced API. The Advanced API requires an Enterprise license.

When using the API to assign generators, you use the generator identifier.

To retrieve the list of generators, use:

In the results, the message body is an array of objects.

The information for each generator includes the generator ID. It also specifies whether the generator supports configuration options such as linking, consistency, differential privacy configuration, and partitioning.

Updating generator configurations

Requires the Advanced API. The Advanced API requires an Enterprise license.

Getting the generator configuration for a table

To get the current generator configuration, use:

GET /api/Workspace/{workspace ID}/replacements

The message body contains a set of replacement objects for columns in the specified table that have an assigned generator other than Passthrough. Columns that are assigned the Passthrough generator are not included in the results.

Replacing the generator configuration for a table

By default, columns are assigned the Passthrough generator, which copies the data as is from the source database to the destination database.

To specify and configure the assigned generators for columns in a specified table, use:

PUT /api/Workspace/{workspaceId}/update_replacements

When you use this endpoint, the request must include the configuration for every column in the specified table that should use a generator other than Passthrough.

The request replaces all of the current column configuration in the specified table with the configuration that is in the request.

For columns that are not in the request, the assigned generator reverts to Passthrough.
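
Because the request replaces the whole table configuration, a common pattern is to read the current replacements, apply your changes locally, and send the merged list. The following Python sketch shows only the local merge step. It assumes that a replacement can be identified by its schema, table, and name, which is an assumption made for illustration, and the sample data is invented.

```python
def merge_replacements(current, changes):
    """Return the current replacements with the changes applied on top.

    Replacements are matched on (schema, table, name). Unmatched changes
    are appended, and untouched replacements are kept as they are.
    """
    def key(replacement):
        return (replacement["schema"], replacement["table"], replacement["name"])

    merged = {key(r): r for r in current}
    for change in changes:
        merged[key(change)] = change
    return list(merged.values())


# Invented sample data in the shape of replacement objects.
current = [
    {"name": "city", "schema": "public", "table": "locations", "links": ["old city config"]},
    {"name": "zip", "schema": "public", "table": "locations", "links": ["zip config"]},
]
changes = [
    {"name": "city", "schema": "public", "table": "locations", "links": ["new city config"]},
]

request_body = merge_replacements(current, changes)
print([r["name"] for r in request_body])
```

Sending request_body instead of changes alone keeps the zip column from reverting to Passthrough.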

Updating a single generator configuration

To update a single generator configuration, use:

PUT /api/Workspace/{workspace ID}/replacement

The message body is a single replacement object. You must provide the entire replacement.

For linked columns, the replacement includes the configuration for all of the columns.

For a composite generator, the replacement includes the link objects for all of the sub-generators.

Removing the generator configuration for a column

When you remove a replacement, the column reverts to the Passthrough generator. To remove a replacement, use:

DELETE /api/Workspace/{workspace ID}/replacement/{replacement ID}

If the replacement contains linked columns, then all of those columns revert to the Passthrough generator. To restore the configuration for any of the columns, you must create a new replacement.

Structure of a generator assignment

In the Tonic Structural API, a generator assignment is referred to as a replacement.

A group of replacements makes up the message body both for the response to a request for generator configuration details and for a request to update those details.

For details and examples of replacements for each Structural generator, go to .

Replacement structure

At a very high level, the structure of a replacement object is:

The name of the replacement
The schema and table where the configured columns are located
Link objects for generator and sub-generator configurations
Columns to use for partitioning

Link object structure

For fallBackLinks, the link object contains the generator configuration for the fallback generator.

Column identification

In the link object, to identify the column, you provide the schema name, table name, and column name.

The schema and table values in the link object must match the schema and table values for the replacement.

For MongoDB, you also provide the data type.

Note that even if there isn't a schema (for example, for the Databricks data connector), you must still provide an empty value for schema.
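
A simple client-side check can catch mismatched schema and table values before a request is sent. The helper below is a sketch with a hypothetical name. It relies only on the schema, table, column, and links fields described above.

```python
def link_mismatches(replacement):
    """Return the columns whose link schema or table differs from the replacement's."""
    return [
        link.get("column")
        for link in replacement.get("links", [])
        if link.get("schema") != replacement.get("schema")
        or link.get("table") != replacement.get("table")
    ]


# Invented sample: the second link points at the wrong table.
replacement = {
    "name": "city,id",
    "schema": "public",
    "table": "locations",
    "links": [
        {"schema": "public", "table": "locations", "column": "city"},
        {"schema": "public", "table": "users", "column": "id"},
    ],
}

print(link_mismatches(replacement))
```

Because a missing schema key compares as different from an empty string, the check also flags links that omit schema entirely.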

Metadata

In the link object, the metadata object identifies the generator and generator preset, and provides the generator configuration.

Generator and preset identification

Generator presets require an Enterprise license. For Basic and Professional licenses, only generatorId is provided.

For the built-in preset for a generator, presetId and generatorId are the same. If during configuration the generator preset specified by presetId is not available - for example, if the generator preset was deleted - then the baseline version of the generator specified by generatorId is applied.
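
The fallback rule can be modeled in a few lines. This Python sketch is an illustration of the described behavior, not Structural's implementation. The function name and the set of available preset identifiers are hypothetical.

```python
def effective_preset(metadata, available_preset_ids):
    """Return the preset ID that would apply under the fallback rule.

    If presetId is missing or no longer available, fall back to generatorId,
    which is also the identifier of the generator's built-in preset.
    """
    preset_id = metadata.get("presetId")
    if preset_id in available_preset_ids:
        return preset_id
    return metadata["generatorId"]


available = {"AddressGenerator", "hypothetical-custom-id"}

kept = effective_preset(
    {"presetId": "hypothetical-custom-id", "generatorId": "AddressGenerator"}, available
)
fallback = effective_preset(
    {"presetId": "deleted-preset-id", "generatorId": "AddressGenerator"}, available
)
print(kept, fallback)
```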

Generator configuration

metadata can also contain additional objects and fields from generator-specific metadata objects.

Structural data encryption

Custom value processors

Sub-generator configuration

In the metadata object, for composite generators other than Array Regex, Regex, or Conditional, pathExpression identifies the value within the column to apply a sub-generator to.

The subGeneratorMetadata object then identifies and configures the generator to apply to that value:

Within subGeneratorMetadata:

presetId identifies the generator preset to apply.
generatorId identifies the type of generator.
customValueProcessor identifies the custom value processor to apply.

subGeneratorMetadata also contains any other fields used to configure the selected sub-generator.

Partition column list

In the replacement object, the partitions field contains a comma-separated list of columns to partition by.

Generator API reference

In this reference, each generator is identified by its name and, in parentheses, its generator ID. You use the generator ID to identify the generator in the API.

For each generator, this reference shows the structure of a link object, and provides an example of a replacement object.

Address (AddressGenerator)

The Address generator replaces the source value with a random string based on the type of address data that the column contains.

Link object structure

The Address generator can be self-consistent or consistent with another column. You cannot configure differential privacy. It can be linked to other columns.

The metadata object is populated from AddressMetadata.

For the Address generator, you specify the type of address value that is in the source column. Here is the basic structure of a link object for the Address generator.

{
  "schema": "string",
  "table": "string",
  "column": "string",
  "dataType": "string",  //MongoDB only
  "metadata": {  
    "presetId": "string",
    "generatorId": "AddressGenerator",
    "encryptionProcessor": "x-on", //To use configured Structural data encryption
    "customValueProcessor": "string", //If custom value processor applied
    "addressType": "enum",
    "isConsistent": boolean,
    "consistencyColumn": "string"
  }
}

Example replacement

The following example replacement shows two linked columns that are assigned the built-in generator preset for the Address generator. One column contains city names, and the other contains zip codes.

Both columns have consistency disabled.

{
  "name": "city,id",
  "schema": "public",
  "table": "locations",
  "links": [
    {
      "column": "city",
      "table": "locations",
      "schema": "public",
      "metadata": {
        "presetId": "AddressGenerator",
        "generatorId": "AddressGenerator",
        "addressType": "City",
        "isConsistent": false
      }
    },
    {
      "column": "id",
      "table": "locations",
      "schema": "public",
      "metadata": {
        "generatorId": "AddressGenerator",
        "addressType": "ZipCode",
        "isConsistent": false
      }
    }
  ]
}
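
Replacement objects like this one can be assembled programmatically. The helpers below are a sketch with hypothetical names. They assume that the replacement name is the comma-separated list of column names, which matches the example above but is otherwise an assumption.

```python
def make_link(schema, table, column, generator_id, preset_id=None, **config):
    """Build a link object for one column."""
    metadata = {"generatorId": generator_id, **config}
    if preset_id is not None:
        metadata["presetId"] = preset_id
    return {"schema": schema, "table": table, "column": column, "metadata": metadata}


def make_replacement(schema, table, links):
    """Build a replacement whose name joins the linked column names (assumed convention)."""
    return {
        "name": ",".join(link["column"] for link in links),
        "schema": schema,
        "table": table,
        "links": links,
    }


links = [
    make_link("public", "locations", "city", "AddressGenerator",
              preset_id="AddressGenerator", addressType="City", isConsistent=False),
    make_link("public", "locations", "id", "AddressGenerator",
              addressType="ZipCode", isConsistent=False),
]
replacement = make_replacement("public", "locations", links)
print(replacement["name"])
```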

Algebraic (AlgebraicGenerator)

The Algebraic generator identifies the algebraic relationship between three or more numeric values and generates new values to match. At least one of the values must be a non-integer.

The Algebraic generator must be linked to at least two other columns.

Link object structure

The Algebraic generator does not support consistency. You cannot configure differential privacy.

There is no generator-specific configuration.

Example replacement

The following example replacement contains three linked columns that are assigned the Algebraic generator.

Alphanumeric String Key (AlphaNumericPkGenerator)

The Alphanumeric String Key generator can be applied to primary key columns. It generates unique alphanumeric strings of the same length as the input. For example, for the original value ABC123, the output value is a six-character alphanumeric string such as D24N05.

Link object structure

The Alphanumeric String Key generator can be self-consistent, but not consistent with another column. You cannot configure differential privacy.

The metadata object is populated from the object.

There is no generator-specific configuration.

Example replacement

The following example replacement configures a column to use the built-in generator preset for the Alphanumeric String Key generator. The generator is not consistent.

Array Character Scramble (ArrayTextMaskGenerator)

The Array Character Scramble generator is intended for array values. It replaces letters with random other letters, and numbers with random other numbers. It preserves punctuation and whitespace from the original value.

Link object structure

The Array Character Scramble generator can be self-consistent, but not consistent with another column. You cannot configure differential privacy.

The metadata object is populated from the BaseMetadata object.

There is no generator-specific configuration.

{
  "schema": "string",
  "table": "string",
  "column": "string",
  "metadata": {
    "presetId": "string",
    "generatorId": "ArrayTextMaskGenerator",
    "isConsistent": boolean,
    "encryptionProcessor": "x-on", //To use configured Structural data encryption
    "customValueProcessor": "string" //If custom value processor applied
  }
}

Example replacement

The following example replacement configures a column to use the built-in generator preset for the Array Character Scramble generator. The generator is not consistent.

{
  "name": "config_options",
  "schema": "public",
  "table": "products",
  "links": [
    {
      "schema": "public",
      "table": "products",
      "column": "config_options",
      "metadata": {
        "presetId": "ArrayTextMaskGenerator",
        "generatorId": "ArrayTextMaskGenerator",
        "isConsistent": false
      }
    }
  ]
}

Array JSON Mask (ArrayJsonMaskGenerator)

The Array JSON Mask generator is a composite generator. It is a version of the JSON Mask generator that can be used for array values. It runs a selected generator on values that match a specified JSONPath.

Link object structure

For the Array JSON Mask generator, you provide a link object for each sub-generator configuration.

The generator does not itself support consistency or differential privacy.

The metadata object is populated from the object. For the Array JSON Mask generator, metadata includes:

pathExpression, which is the path expression that identifies the value to apply the sub-generator to.
The types of values to apply the sub-generator to.
The subGeneratorMetadata object, which identifies and configures the sub-generator.

Here is the basic structure of a link object for an Array JSON Mask sub-generator.

Example replacement

The following example replacement applies the built-in generator preset for the Geo generator to the value at the specified path expression.

The configuration for the Geo generator indicates that it is a latitude value.

Array Regex Mask (ArrayRegexMaskGenerator)

The Array Regex Mask generator is a version of the Regex Mask generator that can be used for array values.

It uses regular expressions to parse strings and replace specified substrings with the output of specified generators. The parts of the string to replace are specified inside unnamed top-level capture groups.

Link object structure

In the Array Regex Mask generator, each link object identifies a regular expression and the generators to apply to the resulting capture groups.

The generator does not in itself support consistency or allow you to configure differential privacy.

The metadata object for each link object is populated from the object, and includes:

Whether to replace all matches or only the first match.
The regular expression used to identify the capture groups to replace.
The list of generator types to apply to each capture group. The first sub-generator is applied to the first capture group, the second generator to the second group, and so on.
In the captureGroupMetadata object, the configuration for each generator in captureGroupSubGenerators. The sequence of the entries in captureGroupMetadata must match the sequence of the generators in captureGroupSubGenerators.
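
Because the two sequences must line up, a length check before sending the request can catch a common mistake. The sketch below assumes that captureGroupSubGenerators and captureGroupMetadata are both JSON arrays, which the text suggests but does not state outright. The function name and the sample values are hypothetical.

```python
def capture_group_length_problem(metadata):
    """Return a message if the two capture group lists differ in length, else None."""
    generators = metadata.get("captureGroupSubGenerators", [])
    configs = metadata.get("captureGroupMetadata", [])
    if len(generators) != len(configs):
        return f"{len(generators)} sub-generators but {len(configs)} metadata entries"
    return None


# Placeholder values. Real entries hold generator IDs and their configuration.
balanced = {
    "captureGroupSubGenerators": ["HypotheticalGeneratorId"],
    "captureGroupMetadata": [{}],
}
unbalanced = {
    "captureGroupSubGenerators": ["HypotheticalGeneratorId", "AnotherHypotheticalId"],
    "captureGroupMetadata": [{}],
}

print(capture_group_length_problem(balanced))
print(capture_group_length_problem(unbalanced))
```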

Example replacement

The following example provides a regex pattern that produces a single capture group.

For that capture group, the Constant generator is applied. The capture group value is replaced with test_value.

ASCII Key (AsciiPkGenerator)

The ASCII Key generator can be applied to primary key columns. It generates unique alphanumeric strings based on any printable ASCII characters.

Link object structure

The ASCII Key generator can be configured to be self-consistent, but not consistent with another column. You cannot configure differential privacy.

The metadata object is populated from the BaseMetadata object.

There is no generator-specific configuration.

{
  "schema": "string",
  "table": "string",
  "column": "string",
  "dataType": "string",  //MongoDB only
  "metadata": {
    "presetId": "string",
    "generatorId": "AsciiPkGenerator",
    "excludeLowercaseAlphabet": boolean,
    "isConsistent": boolean,
    "encryptionProcessor": "x-on", //To use configured Structural data encryption
    "customValueProcessor": "string"   //If custom value processor applied
  }
}

Example replacement

In the following example replacement for the ASCII Key generator, consistency is disabled. The output values do not include lowercase letters.

{
  "name": "userid",
  "schema": "test",
  "table": "users",
  "links": [
    {
      "schema": "test",
      "table": "users",
      "column": "userid",
      "metadata": {
        "presetId": "AsciiPkGenerator",
        "generatorId": "AsciiPkGenerator",
        "excludeLowercaseAlphabet": true,
        "isConsistent": false
      }
    }
  ]
}

Categorical (CategoricalGenerator)

The Categorical generator creates values at the same frequency and using the same values, including NULL values, as the underlying data. In other words, it shuffles the existing values within a field.
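
The shuffling idea can be illustrated with a toy Python function. This is a sketch of the general technique only, not Structural's implementation, and it ignores differential privacy. The column values are invented, with None standing in for NULL.

```python
import random
from collections import Counter


def shuffle_column(values, seed=None):
    """Return the same values in a random order, leaving the input unchanged."""
    rng = random.Random(seed)
    shuffled = list(values)
    rng.shuffle(shuffled)
    return shuffled


source = ["red", "blue", "red", None, "green", "red"]
output = shuffle_column(source, seed=7)

# Every value, including None, appears exactly as often as before.
print(Counter(output) == Counter(source))
```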

Link object structure

The Categorical generator does not support consistency. You can configure differential privacy. You can link columns.

The metadata object is populated from the object. It contains the epsilon field, which provides the .

Example replacement

The following example replacement shows a single, un-linked column. Differential privacy is enabled, and epsilon is set to 1.

Character Scramble (TextMaskGenerator)

The Character Scramble generator replaces letters with random other letters and numbers with random other numbers. It preserves punctuation, whitespace, and mathematical symbols.
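
To make the behavior concrete, here is a toy Python version of a character scramble. It is only an illustration of the idea and not Structural's implementation. It handles ASCII letters and digits only, and keeping the case of each letter is an assumption of this sketch.

```python
import random
import string


def scramble(text, seed=None):
    """Replace each ASCII letter or digit with a different one of the same kind."""
    rng = random.Random(seed)
    result = []
    for ch in text:
        if ch.isascii() and ch.isalpha():
            pool = string.ascii_uppercase if ch.isupper() else string.ascii_lowercase
        elif ch.isascii() and ch.isdigit():
            pool = string.digits
        else:
            # Punctuation, whitespace, and symbols pass through unchanged.
            result.append(ch)
            continue
        result.append(rng.choice([c for c in pool if c != ch]))
    return "".join(result)


print(scramble("Ab-12 c!", seed=1))
```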

Link object structure

You can configure the Character Scramble generator to be self-consistent, but not consistent with another column. You cannot configure differential privacy.

The metadata object is populated from the BaseMetadata object.

There is no generator-specific configuration.

{
  "schema": "string",
  "table": "string",
  "column": "string",
  "dataType": "string",  //MongoDB only
  "metadata": {
    "presetId": "string",
    "generatorId": "TextMaskGenerator",
    "isConsistent": boolean,
    "encryptionProcessor": "x-on", //To use configured Structural data encryption
    "customValueProcessor": "string" //If custom value processor applied
  }
}

Example replacement

The following replacement for a Character Scramble generator has consistency disabled.

{
  "name": "occupation",
  "schema": "test",
  "table": "users",
  "links": [
    {
      "schema": "test",
      "table": "users",
      "column": "group_identifier",
      "metadata": {
        "presetId": "TextMaskGenerator",
        "generatorId": "TextMaskGenerator",
        "isConsistent": false
      }
    }
  ]
}

Character Substitution (StringMaskGenerator)

The Character Substitution generator performs a random character replacement that preserves formatting (spaces, capitalization, and punctuation).

Characters are replaced with other characters from within the same Unicode Block.

Link object structure

The Character Substitution generator is implicitly consistent. You cannot configure consistency or differential privacy. There is no generator-specific configuration.

{
  "schema": "string",
  "table": "string",
  "column": "string",
  "dataType": "string",  //MongoDB only
  "metadata": {
    "generatorId": "StringMaskGenerator",
    "encryptionProcessor": "x-on", //To use configured Structural data encryption
    "customValueProcessor": "string" //If custom value processor applied
  }
}

Example replacement

The following example replacement assigns the Character Substitution generator to a column.

{
  "name": "age",
  "schema": "public",
  "table": "users",
  "links": [
    {
      "schema": "public",
      "table": "users",
      "column": "age",
      "metadata": {
        "generatorId": "StringMaskGenerator"
      }
    }
  ]
}

Conditional (ConditionalGenerator)

The Conditional generator applies different generators to the value conditionally based on any value in the table.

Link object structure

You do not configure consistency or differential privacy for the Conditional generator.

The metadata object is populated from the object.

The defaultGenerator object specifies the generator to apply if none of the conditions are met. It includes a condition object with an empty list of conditions.

The conditionalGenerators object contains the generators to apply based on one or more conditions. For each entry in conditionalGenerators, you identify and configure the generator, and provide the conditions to meet in order to apply that generator. The conditions can be joined by AND or OR.

Each condition identifies the column for which to check the value, the type of check, the value to check, and the type of data in the column that is checked.

Example replacement

In the following example replacement for the Conditional generator, the default generator is the Address generator, which is configured with the zip code as the address type.

The column being configured is column1.

If column1 contains the value VALUE and column2 is not NULL, then the Random Integer generator is applied to column1. It applies a value between 0 and 10.

If column4 contains a value that matches the regular expression .*, then the Categorical generator is applied to column1. epsilon is 1, and differential privacy is disabled.

Continuous (GaussianGenerator)

The Continuous generator generates a continuous distribution to fit the underlying data.

Link object structure

The Continuous generator supports linking. It cannot be made consistent, but you can configure differential privacy.

The metadata object is populated from the BaseMetadata object.

The Continuous generator does support partitioning, which is configured in the partitions object outside of the links object.

There is no generator-specific configuration.

{
  "schema": "string",
  "table": "string",
  "column": "string",
  "dataType": "string", //MongoDB only
  "metadata": {
    "presetId": "string",
    "generatorId": "GaussianGenerator",
    "isDifferentiallyPrivate": boolean,
    "encryptionProcessor": "x-on", //To use configured Structural data encryption
    "customValueProcessor": "string" //If custom value processor applied
  }
}

Example replacement

In this example replacement for the Continuous generator, differential privacy is enabled. The capital-gain column is partitioned by the native-country and income columns.

{
  "name": "capital-gain",
  "schema": "public",
  "table": "user-income",
  "links": [
    {
      "schema": "public",
      "table": "user-income",
      "column": "capital-gain",
      "metadata": {
        "presetId": "GaussianGenerator",
        "generatorId": "GaussianGenerator",
        "isDifferentiallyPrivate": true
      }
    }
  ],
  "partitions": [
    "native-country",
    "income"
  ]
}

Cross Table Sum (CrossTableAggregateGenerator)

The generator sets the value of the column to the sum of the values of another column aggregated across rows that have a foreign key value that matches the primary key in the current record.

For example, in a users table, a total_transactions value is obtained from the transactions table by combining all of the transaction_amount values from rows that have a user_id value that matches the primary key value for the current users record.

Link object structure

The Cross Table Sum generator does not support linking or consistency. You cannot configure differential privacy.

The metadata object is populated from the object.

The generator-specific configuration includes:

The schema and table that contain the column to sum against.
The foreign key column to compare against the primary key for the current table.
The column that contains the values to sum.
The primary key column in the current table.

Example replacement

In the following example replacement for the Cross Table Sum generator, the value of total_transactions in the users table is set to the sum of the values of the amount column in the transactions table for rows where user_id has the same value as the id column in the current users table row.
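
The aggregation that this example describes can be sketched in Python as follows. The in-memory rows and the function name are hypothetical and only illustrate the logic; Structural performs the calculation against the database tables.

```python
from collections import defaultdict


def cross_table_sum(users, transactions):
    """Set total_transactions on each user to the sum of matching transaction amounts."""
    totals = defaultdict(int)
    for tx in transactions:
        # user_id is the foreign key that points at users.id
        totals[tx["user_id"]] += tx["amount"]
    return [{**user, "total_transactions": totals.get(user["id"], 0)} for user in users]


users = [{"id": 1}, {"id": 2}, {"id": 3}]
transactions = [
    {"user_id": 1, "amount": 10},
    {"user_id": 1, "amount": 5},
    {"user_id": 2, "amount": 7},
]
summed = cross_table_sum(users, transactions)
```

In this sketch, a user that has no matching transactions gets a total of 0. That choice is an assumption, not documented behavior.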

CSV Mask (CsvMaskGenerator)

The CSV Mask generator allows you to assign specific generators to specific indexes. You can also designate the generator that is assigned to a specific index as the default. The default generator is applied to every index that does not have an assigned generator.

Link object structure

For the CSV Mask generator, there is a link object for each index to assign a generator to.

The generator does not itself support consistency or differential privacy.

The metadata object is populated from the CsvMaskMetadata object, and includes:

pathExpression, which is the index to apply the sub-generator to.
The delimiter used to separate the CSV values.
Whether to apply that generator to indexes that are not assigned a generator.
The subGeneratorMetadata object, which identifies and configures the sub-generator.

Here is the basic structure of a link object for a CSV Mask sub-generator.

{
  "schema": "string",
  "table": "string",
  "column": "string",
  "metadata": {
    "generatorId": "CsvMaskGenerator",
    "presetId": "CsvMaskGenerator",
    "customValueProcessor": "string", //If custom value processor applied to the generator
    "pathExpression": "string",
    "delimiter": "string",
    "isDefaultGenerator": boolean,
    "subGeneratorMetadata": {
      "presetId": "string",
      "generatorId": "string",
      //Metadata for the selected sub-generator
      "customValueProcessor": "string" //If custom value processor applied to the sub-generator
    }
  }
}

Example replacement

This example replacement for the CSV Mask generator assigns generators to index 0 and index 1 of the column value. The delimiter is a comma.

For index 0, the Address generator is assigned, with an address type of City and consistency disabled.

For index 1, the Company Name generator is assigned, with consistency disabled.

Neither sub-generator is assigned as the default generator for other indexes.

{
  "name": "location",
  "schema": "public",
  "table": "customers",
  "links": [
    {
      "schema": "public",
      "table": "customers",
      "column": "location",
      "metadata": {
        "generatorId": "CsvMaskGenerator",
        "presetId": "CsvMaskGenerator",
        "delimiter": ",",
        "pathExpression": "0",
        "isDefaultGenerator": false,
        "subGeneratorMetadata": {
          "presetId": "AddressGenerator",
          "generatorId": "AddressGenerator",
          "addressType": "City",
          "isConsistent": false
        }
      }
    },
    {
      "table": "customers",
      "schema": "public",
      "column": "location",
      "metadata": {
        "generatorId": "CsvMaskGenerator",
        "presetId": "CsvMaskGenerator",
        "delimiter": ",",
        "pathExpression": "1",
        "isDefaultGenerator": false,
        "subGeneratorMetadata": {
          "presetId": "CompanyNameGenerator",
          "generatorId": "CompanyNameGenerator",
          "isConsistent": false
        }
      }
    }
  ]
}
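
Because every index needs its own link object that repeats the same generator and delimiter settings, it can be convenient to build the replacement programmatically. The following Python sketch assembles the structure shown above. The helper name and its parameters are hypothetical, and using the column name as the replacement name is an assumption.

```python
import json


def build_csv_mask_replacement(schema, table, column, delimiter, sub_generators, default_index=None):
    """Build a CSV Mask replacement with one link object per index.

    sub_generators maps an index (int) to the subGeneratorMetadata object for that index.
    """
    links = []
    for index, sub_metadata in sorted(sub_generators.items()):
        links.append({
            "schema": schema,
            "table": table,
            "column": column,
            "metadata": {
                "generatorId": "CsvMaskGenerator",
                "presetId": "CsvMaskGenerator",
                "delimiter": delimiter,
                "pathExpression": str(index),
                "isDefaultGenerator": index == default_index,
                "subGeneratorMetadata": sub_metadata,
            },
        })
    # Assumption: the replacement name matches the column name.
    return {"name": column, "schema": schema, "table": table, "links": links}


replacement = build_csv_mask_replacement(
    "public", "customers", "location", ",",
    {
        0: {"presetId": "AddressGenerator", "generatorId": "AddressGenerator",
            "addressType": "City", "isConsistent": False},
        1: {"presetId": "CompanyNameGenerator", "generatorId": "CompanyNameGenerator",
            "isConsistent": False},
    },
)
payload = json.dumps(replacement, indent=2)
```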

Custom Categorical (CustomCategoricalGenerator)

The generator is a version of the Categorical generator that selects from values that you provide instead of shuffling the original values.

Link object structure

The Custom Categorical generator supports linking. It can be made self-consistent or consistent with another column. You cannot configure differential privacy.

The metadata object is populated from the object. You use the customCategories field to provide a list of the values to use for the column in the destination database. The values are provided on a single line, separated with newline characters (\n). For example, "Small\nMedium\nLarge". To include NULL as an available value, use {NULL}.

Example replacement

In this example replacement for the Custom Categorical generator, the values to use are Red, Yellow, Blue, and White. The generator is not linked.

Consistency is disabled.
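
The customCategories format is easy to get wrong by hand, so the following Python sketch builds the newline-separated string from a list. The helper name is hypothetical, and the isConsistent field is assumed by analogy with the other generators in this reference.

```python
def build_custom_categories(values):
    """Join category values with newline characters, writing None as {NULL}."""
    return "\n".join("{NULL}" if value is None else str(value) for value in values)


metadata = {
    "generatorId": "CustomCategoricalGenerator",
    "customCategories": build_custom_categories(["Red", "Yellow", "Blue", "White"]),
    "isConsistent": False,  # Assumed field name, as used by other generators
}
```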

Date Truncation (DateTruncationGenerator)

The generator truncates a date value or a timestamp to a specific part. For a date or a timestamp, you can truncate to the year, month, or day. For a timestamp, you can also truncate to the hour, minute, or second.

Link object structure

The Date Truncation generator does not support linking or consistency. You cannot configure differential privacy.

The metadata object is populated from the object. The generator-specific configuration includes the part of the datetime value to truncate to, and whether to change all dates that are more than 90 years before the generation date to a date exactly 90 years before the generation date.

Example replacement

In the following example replacement for the Date Truncation generator, the values are truncated to the year. Date values that are older than 90 years before the generation date are not changed.
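
To make the two behaviors concrete, here is a minimal Python sketch of truncation and of the 90-year rule. The function names are hypothetical, and the sketch only illustrates the described behavior; it is not the Structural implementation and does not state the order in which Structural applies the two steps.

```python
from datetime import datetime

_PARTS = ["year", "month", "day", "hour", "minute", "second"]


def truncate(value: datetime, part: str) -> datetime:
    """Keep the fields up to and including `part`; reset the rest to their minimum."""
    keep = _PARTS.index(part) + 1
    fields = [value.year, value.month, value.day, value.hour, value.minute, value.second]
    defaults = [1, 1, 1, 0, 0, 0]
    return datetime(*(fields[:keep] + defaults[keep:]))


def cap_at_90_years(value: datetime, generation_date: datetime) -> datetime:
    """Move dates that are more than 90 years old to exactly 90 years before generation."""
    try:
        cutoff = generation_date.replace(year=generation_date.year - 90)
    except ValueError:
        # February 29 does not exist in the target year; fall back to February 28.
        cutoff = generation_date.replace(year=generation_date.year - 90, day=28)
    return max(value, cutoff)


truncated = truncate(datetime(2021, 7, 15, 13, 45, 30), "year")
capped = cap_at_90_years(datetime(1900, 1, 1), datetime(2024, 6, 1))
```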

Email (EmailGenerator)

The Email generator scrambles the characters in an email address. It preserves formatting and keeps the @ and . characters.

Link object structure

The Email generator does not support linking. It can be self-consistent, but not consistent with another column. You cannot configure differential privacy.

The metadata object is populated from the EmailMetadata object. You can configure:

The domain to use for all of the email addresses in the destination database.
Domains for which to keep the email addresses as is in the destination database.
Whether to replace invalid email addresses with valid ones.

{
  "schema": "string",
  "table": "string",
  "column": "string",
  "dataType": "string",  //MongoDB only
  "metadata": {
    "presetId": "string",
    "generatorId": "EmailGenerator",
    "domain": "string",
    "excludedDomain": "string",
    "replaceInvalidEmails": boolean,
    "isConsistent": boolean,
    "encryptionProcessor": "x-on", //To use configured Structural data encryption
    "customValueProcessor": "string"  //If custom value processor applied
  }
}

Example replacement

In the following example replacement for the Email generator, all of the destination email addresses use gmail.com as the domain. Source email addresses from yahoo.com are not changed. Invalid email addresses are replaced. The generator is not consistent.

{
  "name": "email_address",
  "schema": "public",
  "table": "users",
  "links": [
    {
      "schema": "public",
      "table": "users",
      "column": "email_address",
      "metadata": {
        "presetId": "EmailGenerator",
        "generatorId": "EmailGenerator",
        "domain": "gmail.com",
        "excludedDomain": "yahoo.com",
        "replaceInvalidEmails": true,
        "isConsistent": false
      }
    }
  ]
}

Event Timestamps (EventGenerator)

The Event Timestamps generator generates timestamps that fit an event distribution. The source timestamp must include a date. It cannot be a time-only value.

Link object structure

You can link columns to create a sequence of events across multiple columns. This generator can be partitioned by other columns.

The Event Timestamps generator does not support consistency. You cannot configure differential privacy.

The metadata object is populated from the EventMetadata object. You use eventOrder to specify the sequence of the generated datetime values in the linked columns.

The Event Timestamps generator does support partitioning, which is configured in the partitions object outside of the links object.

{
  "schema": "string",
  "table": "string",
  "column": "string",
  "metadata": {
    "generatorId": "EventGenerator",
    "eventOrder": integer,
    "encryptionProcessor": "x-on", //To use configured Structural data encryption
    "customValueProcessor": "string"  //If custom value processor applied
  }
}

Example replacement

In this example replacement for the Event Timestamps generator, the date_event1 and date_event2 columns are linked. date_event1 occurs first, and date_event2 occurs second. The values are not partitioned.

{
  "name": "date_event1,date_event2",
  "schema": "public",
  "table": "events",
  "links": [
    {
      "schema": "public",
      "table": "events",
      "column": "date_event1",
      "metadata": {
        "generatorId": "EventGenerator",
        "eventOrder": 1
      }
    },
    {
      "schema": "public",
      "table": "events",
      "column": "date_event2",
      "metadata": {
        "generatorId": "EventGenerator",
        "eventOrder": 2
      }
    }
  ],
  "partitions": []
}
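
When several columns are linked, the eventOrder values must follow the intended sequence. The following Python sketch derives them from an ordered list of column names and produces the same structure as the example above. The helper name and its parameters are hypothetical.

```python
def build_event_replacement(schema, table, ordered_columns, partitions=()):
    """Build a linked Event Timestamps replacement; eventOrder follows the list order."""
    links = [
        {
            "schema": schema,
            "table": table,
            "column": column,
            "metadata": {"generatorId": "EventGenerator", "eventOrder": order},
        }
        for order, column in enumerate(ordered_columns, start=1)
    ]
    return {
        "name": ",".join(ordered_columns),
        "schema": schema,
        "table": table,
        "links": links,
        "partitions": list(partitions),
    }


event_replacement = build_event_replacement("public", "events", ["date_event1", "date_event2"])
```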

File Name (FileNameGenerator)

The File Name generator scrambles characters while preserving formatting and keeping the file extension intact.

Link object structure

The File Name generator does not support linking. It can be self-consistent, but not consistent with another column. You cannot configure differential privacy.

The metadata object is populated from the BaseMetadata object.

There is no generator-specific configuration.

{
  "schema": "string",
  "table": "string",
  "column": "string",
  "dataType": "string",  //MongoDB only
  "metadata": {
    "presetId": "string",
    "generatorId": "FileNameGenerator",
    "isConsistent": boolean,
    "encryptionProcessor": "x-on", //To use configured Structural data encryption
    "customValueProcessor": "string"  //If custom value processor applied
  }
}

Example replacement

In this example replacement for the File Name generator, consistency is enabled.

{
  "name": "data_file",
  "schema": "public",
  "table": "products",
  "links": [
    {
      "schema": "public",
      "table": "products",
      "column": "data_file",
      "metadata": {
        "presetId": "FileNameGenerator",
        "generatorId": "FileNameGenerator",
        "isConsistent": true
      }
    }
  ]
}

Find and Replace (FindAndReplaceGenerator)

The generator replaces all instances of a specified find string with a specified replace string.

Link object structure

The Find and Replace generator does not support linking or consistency. You cannot configure differential privacy.

The metadata object is populated from the object. The generator-specific configuration includes:

The find string
Whether the find string is a regular expression
The replace string

Example replacement

In this example replacement for the Find and Replace generator, the value yes is replaced by the value no. The find string is not a regular expression.
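
The difference between a literal find string and a regular expression can be sketched in Python as follows. The function and parameter names are hypothetical, and Python's regular expression syntax might differ from the syntax that Structural accepts.

```python
import re


def find_and_replace(value: str, find: str, replace: str, is_regex: bool = False) -> str:
    """Replace every occurrence of `find`, treated as a literal string or a pattern."""
    if is_regex:
        return re.sub(find, replace, value)
    return value.replace(find, replace)


literal_result = find_and_replace("yes, yes", "yes", "no")
regex_result = find_and_replace("a1b22", r"\d+", "#", is_regex=True)
```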

FNR (FnrGenerator)

The generator transforms Norwegian national identity numbers.

Link object structure

The metadata object is populated from the object.

preserveDate indicates whether to preserve the birthdate values from the source database in the destination database. If the birthdate values are not preserved, the destination values are still within the same range as the source values.

preserveGender indicates whether the destination value should reflect the same gender as the source value.

Example replacement

In the following example replacement for the FNR generator, the birthdate values in the source database are not preserved in the destination database.

The destination values use the same gender as the source values.

The generator is consistent with the name column.

Geo (GeoGenerator)

The Geo generator is used to mask latitude or longitude values.

Link object structure

The Geo generator supports linking. Typically, the Geo generator is assigned to a latitude and longitude column and then the columns are linked.

The Geo generator does not support consistency. You cannot configure differential privacy.

The metadata object is populated from the GeoMetadata object. geoType indicates the type of value (latitude or longitude) that is in the column.

{
  "schema": "string",
  "table": "string",
  "column": "string",
  "dataType": "string",  //MongoDB only
  "metadata": {
    "presetId": "string",
    "generatorId": "GeoGenerator",
    "geoType": "enum",
    "encryptionProcessor": "x-on", //To use configured Structural data encryption
    "customValueProcessor": "string"  //If custom value processor applied
  }
}

Example replacement

In this example replacement for the Geo generator, the lat and long columns are assigned the Geo generator and linked.

{
  "name": "lat,long",
  "schema": "public",
  "table": "latlong",
  "links": [
    {
      "schema": "public",
      "table": "latlong",
      "column": "lat",
      "metadata": {
        "presetId": "GeoGenerator",
        "generatorId": "GeoGenerator",
        "geoType": "Latitude"
      }
    },
    {
      "schema": "public",
      "table": "latlong",
      "column": "long",
      "metadata": {
        "presetId": "GeoGenerator",
        "generatorId": "GeoGenerator",
        "geoType": "Longitude"
      }
    }
  ]
}

HIPAA Address (HipaaAddressGenerator)

The HIPAA Address generator can be used to generate cities, states, zip codes, and latitude/longitude values that follow HIPAA guidelines for safe harbor.

Link object structure

The HIPAA Address generator can be linked. It can be self-consistent, but not consistent with another column. You cannot configure differential privacy.

The metadata object is populated from the HipaaAddressMetadata object, which includes:

The type of address value that is in the column.
How to generate zip codes. You can generate zip codes that replace the last two digits with zeros, or use a real zip code from the same state.

{
  "schema": "string",
  "table": "string",
  "column": "string",
  "dataType": "string",  //MongoDB only
  "metadata": {
    "presetId": "string",
    "generatorId": "HipaaAddressGenerator",
    "replaceTruncatedZerosInZipCode": boolean,
    "addressType": "enum",
    "isConsistent": boolean,
    "encryptionProcessor": "x-on", //To use configured Structural data encryption
    "customValueProcessor": "string"  //If custom value processor assigned
  }
}

Example replacement

The following example replacement for the HIPAA Address generator contains a single, unlinked column that contains a zip code value. Consistency is disabled, and the generator is configured to not use zeros in the generated zip code values.

{
  "name": "zip-code",
  "schema": "public",
  "table": "users",
  "links": [
    {
      "schema": "public",
      "table": "users",
      "column": "zip-code",
      "metadata": {
        "presetId": "HipaaAddressGenerator",
        "generatorId": "HipaaAddressGenerator",
        "replaceTruncatedZerosInZipCode": true,
        "addressType": "ZipCode",
        "isConsistent": false
      }
    }
  ]
}

Hostname (HostnameGenerator)

The Hostname generator generates random host names, based on the English language.

Link object structure

The Hostname generator does not support linking. It can be either self-consistent or consistent with another column. You cannot configure differential privacy.

The metadata object is populated from the BaseMetadata object.

There is no generator-specific configuration.

{
  "schema": "string",
  "table": "string",
  "column": "string",
  "dataType": "string",  //MongoDB only
  "metadata": {
    "presetId": "string",
    "generatorId": "HostnameGenerator",
    "isConsistent": boolean,
    "consistencyColumn": "string",
    "encryptionProcessor": "x-on", //To use configured Structural data encryption
    "customValueProcessor": "string"  //If custom value processor assigned
  }
}

Example replacement

In the following example replacement for the Hostname generator, consistency is disabled.

{
  "name": "hostname",
  "schema": "public",
  "table": "events",
  "links": [
    {
      "schema": "public",
      "table": "events",
      "column": "hostname",
      "metadata": {
        "presetId": "HostnameGenerator",
        "generatorId": "HostnameGenerator",
        "isConsistent": false
      }
    }
  ]
}

HStore Mask (HStoreMaskGenerator)

The generator runs selected generators on specified key values in an HStore column in a PostgreSQL database. HStore columns contain a set of key-value pairs.

Link object structure

For the HStore Mask generator, there is a link object for each path expression value to assign a generator to.

The generator does not itself support consistency or differential privacy.

The metadata object is populated from the object. It includes:

pathExpression, which is the path expression that identifies the value to apply the sub-generator to.
The subGeneratorMetadata object, which identifies and configures the sub-generator.

Example replacement

In the following example replacement for the HStore Mask generator:

The Random Integer generator is assigned to the value of the pages path expression. The generator uses values between 300 and 500.
The Character Scramble generator is assigned to the value of the title path expression. Consistency is disabled.

HTML Mask (HtmlMaskGenerator)

The generator masks text columns by parsing the contents as HTML, and applying sub-generators to specified path expressions.

If applying a sub-generator fails because of an error, the generator selected as the fallback generator is applied instead.

Link object structure

For the HTML Mask generator, there is a link object for each XPath expression value to assign a generator to.

The generator does not itself support consistency or differential privacy.

The metadata object is populated from the object. It includes:

pathExpression, which is the XPath expression that identifies the value to apply the sub-generator to.
The subGeneratorMetadata object, which identifies and configures the sub-generator.

Example replacement

In the following example replacement for the HTML Mask generator:

The Character Scramble generator is assigned to the value of the XPath expression //p. Consistency is disabled.
The Company Name generator is assigned to the value of the XPath expression //p/@data. Consistency is disabled.
In the case of an error applying either of those generators, the fallback generator is the Constant generator, which sets the value to 10.

Integer Key (IntegerPkGenerator)

The generator generates integer values that are between 0 and 2^32 - 1. The input values must be in the range 0 to 2^31 - 1.

Link object structure

The Integer Key generator does not support linking. It can be self-consistent, but not consistent with another column. You cannot configure differential privacy.

The metadata object is populated from the object, which includes:

The minimum value
The maximum value
The underlying data type for the source values (for MySQL and MongoDB)

Example replacement

In the following example replacement for the Integer Key generator, the generator produces a value between 10 and 20. The original values are Int64. Consistency is enabled.

International Address (InternationalAddressGenerator)

The generator can generate the following international address values:

Canadian street name
Canadian postal code
United Kingdom (UK) postal code

Link object structure

The International Address generator can be self-consistent. You cannot configure differential privacy. It cannot be linked to other columns.

The metadata object is populated from .

For the International Address generator, you specify the country and the type of address value that is in the source column.

Example replacement

The following example replacement shows a column that is assigned the built-in generator preset for the International Address generator. The column contains a Canadian postal code. The fallback value is K1A.

The column has consistency disabled.

IP Address (IPAddressGenerator)

The generator generates a random string that is formatted as an IP address.

Link object structure

The IP Address generator does not support linking. It can be self-consistent or consistent with another column. You cannot configure differential privacy.

The metadata object is populated from the object. The ratio field specifies, as a decimal value, the percentage of values to format as IPV4. The remaining values are formatted as IPV6.

Example replacement

In the following example replacement for the IP Address generator, 90% of the generated addresses are IPV4. Consistency is disabled.
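
The effect of the ratio value can be illustrated with a short Python sketch that picks the address format at random. The function name is hypothetical, and the sketch shows only the split between the two formats, not how Structural generates or makes the values consistent.

```python
import ipaddress
import random


def random_ip(ratio: float, rng: random.Random) -> str:
    """Return an IPv4 address with probability `ratio`, otherwise an IPv6 address."""
    if rng.random() < ratio:
        return str(ipaddress.IPv4Address(rng.getrandbits(32)))
    return str(ipaddress.IPv6Address(rng.getrandbits(128)))


rng = random.Random(42)
sample = [random_ip(0.9, rng) for _ in range(1000)]
ipv4_share = sum(ipaddress.ip_address(ip).version == 4 for ip in sample) / len(sample)
```

With a ratio of 0.9, roughly 90% of the sampled addresses are IPv4.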

JSON Mask (JsonMaskGenerator)

The generator runs a selected generator on values that match a specified JSONPath.

Link object structure

For the JSON Mask generator, you provide a link object for each sub-generator configuration.

The generator does not itself support consistency or differential privacy.

The metadata object is populated from the object, and includes:

pathExpression, which is the JSONPath that identifies the value to apply the sub-generator to.
The types of values to apply the sub-generator to.
The subGeneratorMetadata object, which identifies and configures the sub-generator.

Here is the basic structure of a link object for a JSON Mask sub-generator.

Example replacement

In the following example replacement for the JSON Mask generator:

The Date Truncation generator is applied to all values of the JSONPath expression $[*].start. The value is truncated to the year, and the birthdate flag is off.
The Email generator is applied to all values of the JSONPath expression $[0].email. The generated email addresses all use gmail.com as the domain, and no domains are excluded. Invalid email addresses are replaced. Consistency is disabled.
If there is an error applying those generators, then the fallback generator is the Null generator.

Mongo ObjectId Key (ObjectIdPkGenerator)

The Mongo ObjectId Key generator generates values to de-identify fields that contain MongoDB ObjectId values. The column value must be 12 bytes long.

Link object structure

The Mongo ObjectId Key generator does not support linking. It can be self-consistent, but not consistent with another column. You cannot configure differential privacy.

The metadata object is populated from the ObjectIdPkMetadata object. preserveTimestampAndCounter indicates whether to change only the random value portion of the identifier, and keep the timestamp and incremented counter portions.

preserveTimestampAndCounter is the only generator-specific configuration.

{
  "schema": "string",
  "table": "string",
  "column": "string",
  "dataType": "string", //MongoDB only
  "metadata": {
    "presetId": "string",
    "generatorId": "ObjectIdPkGenerator",
    "isConsistent": boolean,
    "preserveTimestampAndCounter": boolean,
    "encryptionProcessor": "x-on", //To use configured Structural data encryption
    "customValueProcessor": "string"  //If custom value processor applied
  }
}

Example replacement

In the following example replacement for the Mongo ObjectId Key generator, consistency is disabled. Only the random value portion of the identifier is changed.

{
  "name": "user_id",
  "schema": "public",
  "table": "users",
  "links": [
    {
      "schema": "public",
      "table": "users",
      "column": "user_id",
      "dataType": "ObjectId",
      "metadata": {
        "presetId": "ObjectIdPkGenerator",
        "generatorId": "ObjectIdPkGenerator",
        "isConsistent": false,
        "preserveTimestampAndCounter": true
      }
    }
  ]
}
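
A MongoDB ObjectId is 12 bytes: a 4-byte timestamp, a 5-byte random value, and a 3-byte incrementing counter. The following Python sketch shows what preserving the timestamp and counter means at the byte level. The function name is hypothetical, and the sketch uses plain random bytes; it does not reproduce how Structural generates or makes the values consistent.

```python
import os


def mask_object_id(object_id_hex: str, preserve_timestamp_and_counter: bool) -> str:
    """Randomize an ObjectId, optionally keeping its timestamp and counter bytes."""
    raw = bytes.fromhex(object_id_hex)
    if len(raw) != 12:
        raise ValueError("An ObjectId must be exactly 12 bytes long")
    if preserve_timestamp_and_counter:
        # Bytes 0-3: timestamp, bytes 4-8: random value, bytes 9-11: counter
        masked = raw[:4] + os.urandom(5) + raw[9:]
    else:
        masked = os.urandom(12)
    return masked.hex()


source_id = "507f1f77bcf86cd799439011"
masked_id = mask_object_id(source_id, preserve_timestamp_and_counter=True)
```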

Name (NameGenerator)

The Name generator generates a random name string from a dictionary of first and last names.

Link object structure

The Name generator cannot be linked. It can be self-consistent, but not consistent with another column. You cannot configure differential privacy.

The metadata object is populated from the NameClassifierMetadata object, which includes:

The type of name value.
Whether to preserve the capitalization from the source value.

{
  "schema": "string",
  "table": "string",
  "column": "string",
  "dataType": "string",  //MongoDB only
  "metadata": {
    "presetId": "string",
    "generatorId": "NameGenerator",
    "nameType": "enum",
    "preserveCapitalization": boolean,
    "isConsistent": boolean,
    "encryptionProcessor": "x-on", //To use configured Structural data encryption
    "customValueProcessor": "string"  //If custom value processor applied
  }
}

Example replacement

In the following example replacement for the Name generator, the name format is Last, First (Smith, John). Capitalization is preserved. Consistency is disabled.

{
  "name": "fullname",
  "schema": "public",
  "table": "users",
  "links": [
    {
      "schema": "public",
      "table": "users",
      "column": "fullname",
      "metadata": {
        "presetId": "NameGenerator",
        "generatorId": "NameGenerator",
        "nameType": "LastCommaFirstName",
        "preserveCapitalization": true,
        "isConsistent": false
      }
    }
  ]
}

Noise Generator (NoiseGenerator)

The Noise Generator masks values in numeric columns. It adds or multiplies the original value by random noise.

Link object structure

The Noise Generator does not support linking. It can be either self-consistent or consistent with another column. You cannot configure differential privacy.

The metadata object is populated from the NoiseMetadata object. The generator configuration includes:

Whether to use additive or multiplicative noise.
For additive noise, the percentage of the underlying value to scale the noise to.
For multiplicative, the minimum and maximum value for the scaling factor.

{
  "schema": "string",
  "table": "string",
  "column": "string",
  "dataType": "string", //MongoDB only
  "metadata": {
    "presetId": "string",
    "generatorId": "NoiseGenerator",
    "noiseStrategy": "enum",
    "min": numeric,  //For multiplicative
    "max": numeric,  //For multiplicative
    "ratio": numeric,  //For additive
    "isConsistent": boolean,
    "consistencyColumn": "string",
    "encryptionProcessor": "x-on", //To use configured Structural data encryption
    "customValueProcessor": "string"  //If custom value processor applied
  }
}

Example replacement

In this example replacement for the Noise Generator, the additive noise strategy is used. It scales the noise to 10% of the underlying value. The generator is consistent with the name column.

{
  "name": "age",
  "schema": "public",
  "table": "users",
  "links": [
    {
      "schema": "public",
      "table": "users",
      "column": "age",
      "metadata": {
        "presetId": "NoiseGenerator",
        "generatorId": "NoiseGenerator",
        "noiseStrategy": "Additive",
        "ratio": 0.1,
        "isConsistent": true,
        "consistencyColumn": "name"
      }
    }
  ]
}
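
The two noise strategies can be sketched in Python as follows. This is only one plausible reading of the settings: the function names are hypothetical, and the uniform noise distribution is an assumption, because this reference does not state which distribution Structural uses.

```python
import random


def additive_noise(value: float, ratio: float, rng: random.Random) -> float:
    """Add noise scaled to `ratio` of the value (assumed uniform in [-ratio, +ratio])."""
    return value + rng.uniform(-1.0, 1.0) * ratio * value


def multiplicative_noise(value: float, minimum: float, maximum: float, rng: random.Random) -> float:
    """Multiply the value by a scaling factor drawn between `minimum` and `maximum`."""
    return value * rng.uniform(minimum, maximum)


rng = random.Random(1)
noisy_age = additive_noise(40, 0.1, rng)           # Stays within 36 to 44
scaled = multiplicative_noise(10, 0.5, 2.0, rng)   # Stays within 5 to 20
```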

Null (NullGenerator)

The Null generator generates NULL values to fill the rows of the specified column.

Link object structure

The Null generator does not support linking or consistency. You cannot configure differential privacy.

There is no generator-specific configuration.

{
  "schema": "string",
  "table": "string",
  "column": "string",
  "dataType": "string",  //MongoDB only
  "metadata": {
    "generatorId": "NullGenerator",
    "customValueProcessor": "string"  //If custom value processor applied
  }
}

Example replacement

The following example replacement applies the Null generator to a column.

{
  "name": "occupation",
  "schema": "public",
  "table": "users",
  "links": [
    {
      "column": "occupation",
      "table": "users",
      "schema": "public",
      "metadata": {
        "generatorId": "NullGenerator"
      }
    }
  ]
}

Numeric String Key (NumericStringPkGenerator)

The Numeric String Key generator generates unique numeric strings of the same length as the input value.

Link object structure

The Numeric String Key generator does not support linking. It can be self-consistent, but not consistent with another column. You cannot configure differential privacy.

The metadata object is populated from the BaseMetadata object.

There is no generator-specific configuration.

{
  "schema": "string",
  "table": "string",
  "column": "string",
  "dataType": "string",  //MongoDB only
  "metadata": {
    "presetId": "string",
    "generatorId": "NumericStringPkGenerator",
    "isConsistent": boolean,
    "encryptionProcessor": "x-on", //To use configured Structural data encryption
    "customValueProcessor": "string"  //If custom value processor applied
  }
}

Example replacement

In the following example replacement for the Numeric String Key generator, consistency is disabled.

{
  "name": "userid",
  "schema": "public",
  "table": "users",
  "links": [
    {
      "schema": "public",
      "table": "users",
      "column": "userid",
      "metadata": {
        "presetId": "NumericStringPkGenerator",
        "generatorId": "NumericStringPkGenerator",
        "isConsistent": false
      }
    }
  ]
}

Passthrough (PassthroughGenerator)

The Passthrough generator is the default. It passes through the value from the source database to the destination database without masking it.

You do not usually retrieve or provide a replacement that assigns the Passthrough generator to a column. However, you might explicitly assign Passthrough as a sub-generator for a composite generator.

When you use the GET api/Workspace/{workspace ID}/replacements/{schema}/{table} endpoint to get the column configuration for a table, columns that are assigned Passthrough are not included in the results.

For the PUT /api/Workspace/{workspaceId}/update_replacements/{schema}/{table} endpoint, which replaces the configuration for an entire table, any column that is not included in the message body is automatically assigned Passthrough.

To revert an individual column to Passthrough, you use the DELETE api/Workspace/{workspace ID}/replacement/{replacement ID} endpoint to remove the replacement that contains the column configuration.
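
The effect of the table-level PUT on omitted columns can be sketched as follows. This is a minimal illustration, not Structural code. The function name `columns_reverting_to_passthrough` and the sample column list are hypothetical, and the sketch assumes that the message body is a list of replacement objects whose links name the configured columns, as in the examples in this document.

```python
def columns_reverting_to_passthrough(table_columns, replacements):
    """Return the columns that a table-level PUT would leave as Passthrough.

    table_columns: every column name in the table (hypothetical input).
    replacements: the replacement objects sent in the PUT body (assumed shape).
    """
    configured = {
        link["column"]
        for replacement in replacements
        for link in replacement.get("links", [])
    }
    # A column that no link mentions is omitted from the body,
    # so it is automatically assigned Passthrough.
    return sorted(set(table_columns) - configured)


body = [
    {
        "name": "occupation",
        "schema": "public",
        "table": "users",
        "links": [
            {
                "schema": "public",
                "table": "users",
                "column": "occupation",
                "metadata": {"generatorId": "NullGenerator"},
            }
        ],
    }
]

passthrough = columns_reverting_to_passthrough(
    ["userid", "occupation", "email"], body
)
print(passthrough)
```

A check like this before you send the PUT can help you avoid accidentally unmasking a column that you left out of the body.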

Phone (USPhoneNumberGenerator)

The Phone generator generates a random telephone number that matches the country or region of the input telephone number while maintaining the format.

Link object structure

The Phone generator does not support linking. It can be made self-consistent, but not consistent with another column. You cannot configure differential privacy.

The metadata object is populated from the PhoneNumberMetadata object, which includes a setting to indicate whether to replace invalid telephone numbers with valid telephone numbers.

{
  "schema": "string",
  "table": "string",
  "column": "string",
  "dataType": "string",  //MongoDB only
  "metadata": {
    "presetId": "string",
    "generatorId": "USPhoneNumberGenerator",
    "replaceInvalidNumbers": boolean,
    "isConsistent": boolean,
    "encryptionProcessor": "x-on", //To use configured Structural data encryption
    "customValueProcessor": "string"  //If custom value processor applied
  }
}

Example replacement

In the following replacement for the Phone generator, invalid phone numbers are replaced. Consistency is disabled.

{
  "name": "cell-phone",
  "schema": "public",
  "table": "users",
  "links": [
    {
      "schema": "public",
      "table": "users",
      "column": "cell-phone",
      "metadata": {
        "presetId": "USPhoneNumberGenerator",
        "generatorId": "USPhoneNumberGenerator",
        "replaceInvalidNumbers": true,
        "isConsistent": false
      }
    }
  ]
}

Random Boolean (RandomBooleanGenerator)

The Random Boolean generator assigns a random boolean value.

Link object structure

The Random Boolean generator does not support linking or consistency. You cannot configure differential privacy.

The metadata object is populated from the RatioMetadata object. The ratio field indicates the percentage (as a decimal value between 0 and 1.0) of values to set to true.

{
  "schema": "string",
  "table": "string",
  "column": "string",
  "dataType": "string",  //MongoDB only
  "metadata": {
    "presetId": "string",
    "generatorId": "RandomBooleanGenerator",
    "ratio": numeric,
    "encryptionProcessor": "x-on", //To use configured Structural data encryption
    "customValueProcessor": "string"  //If custom value processor applied
  }
}

Example replacement

In the following example replacement for the Random Boolean generator, 40% of the destination values are true, and 60% are false.

{
  "name": "is-available",
  "schema": "public",
  "table": "products",
  "links": [
    {
      "schema": "public",
      "table": "products",
      "column": "is-available",
      "metadata": {
        "presetId": "RandomBooleanGenerator",
        "generatorId": "RandomBooleanGenerator",
        "ratio": 0.4
      }
    }
  ]
}
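
To make the meaning of ratio concrete, here is a small simulation. It is only a sketch of the semantics described above (the fraction of values set to true), not Structural's implementation. The function name `random_booleans` and the seed are hypothetical.

```python
import random


def random_booleans(count, ratio, seed=None):
    """Produce `count` booleans where roughly `ratio` of them are True."""
    rng = random.Random(seed)
    # rng.random() is uniform on [0.0, 1.0), so the comparison
    # succeeds with probability equal to ratio.
    return [rng.random() < ratio for _ in range(count)]


values = random_booleans(10_000, 0.4, seed=1)
share_true = sum(values) / len(values)
print(f"share of true values: {share_true:.2f}")
```

With a ratio of 0.4, the share of true values in a large sample is close to 40%, which matches the example replacement.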

Random Double (RandomDoubleGenerator)

The Random Double generator generates a random double number between the specified minimum (inclusive) and maximum (exclusive).

Link object structure

The Random Double generator does not support linking or consistency. You cannot configure differential privacy.

The metadata object is populated from the ContinuousDistributionMetadata object. You specify the minimum and maximum values.

{
  "schema": "string",
  "table": "string",
  "column": "string",
  "dataType": "string",  //MongoDB only
  "metadata": {
    "presetId": "string",
    "generatorId": "RandomDoubleGenerator",
    "min": double,
    "max": double,
    "encryptionProcessor": "x-on", //To use configured Structural data encryption
    "customValueProcessor": "string"  //If custom value processor applied
  }
}

Example replacement

In this example replacement for the Random Double generator, the generator is configured to produce numbers between 2.5 and 10.75.

{
  "name": "number-column",
  "schema": "public",
  "table": "test",
  "links": [
    {
      "schema": "public",
      "table": "test",
      "column": "number-column",
      "metadata": {
        "presetId": "RandomDoubleGenerator",
        "generatorId": "RandomDoubleGenerator",
        "min": 2.5,
        "max": 10.75
      }
    }
  ]
}

Random Hash (RandomStringGenerator)

The Random Hash generator generates a random hash string.

Link object structure

The Random Hash generator does not support linking or consistency. You cannot configure differential privacy.

There is no generator-specific configuration.

{
  "schema": "string",
  "table": "string",
  "column": "string",
  "dataType": "string",  //MongoDB only
  "metadata": {
    "generatorId": "RandomStringGenerator",
    "encryptionProcessor": "x-on", //To use configured Structural data encryption
    "customValueProcessor": "string"  //If custom value processor applied
  }
}

Example replacement

Here is an example replacement for the Random Hash generator.

{
  "name": "key",
  "schema": "public",
  "table": "cohorts",
  "links": [
    {
      "schema": "public",
      "table": "cohorts",
      "column": "key",
      "metadata": {
        "generatorId": "RandomStringGenerator"
      }
    }
  ]
}

Random Integer (RandomIntegerGenerator)

The Random Integer generator returns a random integer between a specified minimum (inclusive) and maximum (exclusive).

Link object structure

The Random Integer generator does not support linking or consistency. You cannot configure differential privacy.

The metadata object is populated from DiscreteDistributionMetadata, which includes the minimum and maximum values.

{
  "schema": "string",
  "table": "string",
  "column": "string",
  "dataType": "string",  //MongoDB only
  "metadata": {
    "presetId": "string",
    "generatorId": "RandomIntegerGenerator",
    "min": integer,
    "max": integer,
    "encryptionProcessor": "x-on", //To use configured Structural data encryption
    "customValueProcessor": "string"  //If custom value processor applied
  }
}

Example replacement

In this example replacement for the Random Integer generator, the returned value is between 0 and 5. Because max is exclusive, the highest possible value is 4.

{
  "name": "number-of-children",
  "schema": "public",
  "table": "users",
  "links": [
    {
      "schema": "public",
      "table": "users",
      "column": "number-of-children",
      "metadata": {
        "presetId": "RandomIntegerGenerator",
        "generatorId": "RandomIntegerGenerator",
        "min": 0,
        "max": 5
      }
    }
  ]
}
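
The inclusive minimum and exclusive maximum behave like a half-open range. The following sketch illustrates that range only. It does not reproduce how Structural draws values, and the function name `random_integer` is hypothetical.

```python
import random


def random_integer(min_value, max_value, rng):
    """Draw an integer from [min_value, max_value), like the min/max settings."""
    # randrange excludes the upper bound, matching the exclusive max.
    return rng.randrange(min_value, max_value)


rng = random.Random(0)
samples = {random_integer(0, 5, rng) for _ in range(1000)}
print(sorted(samples))
```

With min set to 0 and max set to 5, every drawn value falls in the set 0 through 4, and 5 never appears.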

Sequential Integer (UniqueIntegerGenerator)

The Sequential Integer generator returns integer values that increment by 1 for each row in the destination database.

Link object structure

The Sequential Integer generator can be linked. You provide a link object for each linked column. The generator does not support consistency. You cannot configure differential privacy.

The metadata object is populated from the UniqueIntegerMetadata object. startingPoint provides the first integer to apply.

{
  "schema": "string",
  "table": "string",
  "column": "string",
  "dataType": "string",  //MongoDB only
  "metadata": {
    "presetId": "string",
    "generatorId": "UniqueIntegerGenerator",
    "startingPoint": integer,
    "encryptionProcessor": "x-on", //To use configured Structural data encryption
    "customValueProcessor": "string"  //If custom value processor applied
  }
}

Example replacement

The following example replacement for the Sequential Integer generator configures a single unlinked column. The values start with 4.

{
  "name": "user-number",
  "schema": "public",
  "table": "users",
  "links": [
    {
      "schema": "public",
      "table": "users",
      "column": "user-number",
      "metadata": {
        "presetId": "UniqueIntegerGenerator",
        "generatorId": "UniqueIntegerGenerator",
        "startingPoint": 4
      }
    }
  ]
}
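
As a quick illustration of startingPoint, the following sketch produces the values that a five-row table would receive when the starting point is 4. The function name `sequential_values` is hypothetical, and the sketch covers only the single unlinked column case.

```python
from itertools import count


def sequential_values(starting_point, row_count):
    """Return one integer per row, starting at starting_point and stepping by 1."""
    counter = count(starting_point)
    return [next(counter) for _ in range(row_count)]


values = sequential_values(4, 5)
print(values)
```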

Shipping Container (ShippingContainerGenerator)

The Shipping Container generator generates ISO 6346-compliant shipping container codes. All generated codes are in the freight category ("U").

Link object structure

The Shipping Container generator does not support linking. It can be self-consistent or consistent with another column. You cannot configure differential privacy.

The metadata object is populated from the BaseMetadata object.

There is no generator-specific configuration.

{
  "schema": "string",
  "table": "string",
  "column": "string",
  "dataType": "string",  //MongoDB only
  "metadata": {
    "presetId": "string",
    "generatorId": "ShippingContainerGenerator",
    "isConsistent": boolean,
    "consistencyColumn": "string",
    "encryptionProcessor": "x-on", //To use configured Structural data encryption
    "customValueProcessor": "string"  //If custom value processor applied
  }
}

Example replacement

In the following example of a replacement for the Shipping Container generator, consistency is disabled.

{
  "name": "container",
  "schema": "public",
  "table": "shipments",
  "links": [
    {
      "schema": "public",
      "table": "shipments",
      "column": "container",
      "metadata": {
        "presetId": "ShippingContainerGenerator",
        "generatorId": "ShippingContainerGenerator",
        "isConsistent": false
      }
    }
  ]
}
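
To spot-check generated values in the destination database, you can validate the ISO 6346 check digit yourself. The following is a sketch of the standard check digit calculation, not Structural code. The function names are hypothetical, and CSQU3054383 is a commonly cited sample code rather than generator output.

```python
import string


def _letter_values():
    """Map A-Z to the ISO 6346 values 10..38, skipping multiples of 11."""
    values = {}
    value = 10
    for letter in string.ascii_uppercase:
        if value % 11 == 0:
            value += 1
        values[letter] = value
        value += 1
    return values


LETTER_VALUES = _letter_values()


def check_digit(first_ten):
    """Compute the check digit from the first 10 characters of a code."""
    if len(first_ten) != 10:
        raise ValueError("expected 4 letters followed by 6 digits")
    total = 0
    for position, char in enumerate(first_ten.upper()):
        value = LETTER_VALUES[char] if char.isalpha() else int(char)
        # Each position is weighted by an increasing power of 2.
        total += value * (2 ** position)
    # A remainder of 10 is written as 0.
    return total % 11 % 10


def is_valid_container_code(code):
    """Check the length, the freight category letter, and the check digit."""
    return (
        len(code) == 11
        and code[3].upper() == "U"
        and code[10].isdigit()
        and check_digit(code[:10]) == int(code[10])
    )


print(is_valid_container_code("CSQU3054383"))
```

The category check on the fourth character reflects the statement above that all generated codes are in the freight category.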

SSN (SsnGenerator)

The SSN generator generates a new valid United States Social Security Number.

Link object structure

The SSN generator does not support linking. It can be self-consistent or consistent with another column. You cannot configure differential privacy.

The metadata object is populated from the RatioMetadata object. For the SSN generator, ratio indicates the percentage of values to format with dashes (123-45-6789). The percentage is provided as a decimal value between 0 and 1.0. The remaining values are formatted as 123456789.

{
  "schema": "string",
  "table": "string",
  "column": "string",
  "dataType": "string",  //MongoDB only
  "metadata": {
    "presetId": "string",
    "generatorId": "SsnGenerator",
    "ratio": numeric,
    "isConsistent": boolean,
    "consistencyColumn": "string",
    "encryptionProcessor": "x-on", //To use configured Structural data encryption
    "customValueProcessor": "string"  //If custom value processor applied
  }
}

Example replacement

In the following example replacement for the SSN generator, the generator is consistent with the name column. Because ratio is 0, none of the values are formatted with dashes.

{
  "name": "employee-number",
  "schema": "public",
  "table": "employees",
  "links": [
    {
      "schema": "public",
      "table": "employees",
      "column": "employee-number",
      "metadata": {
        "presetId": "SsnGenerator",
        "generatorId": "SsnGenerator",
        "ratio": 0,
        "isConsistent": true,
        "consistencyColumn": "name"
      }
    }
  ]
}

Struct Mask (StructMaskGenerator)

The Struct Mask generator applies selected generators to specific StructFields within a StructType in a Spark database.

Link object structure

For the Struct Mask generator, there is a link object for each path expression value to assign a sub-generator to.

The generator does not itself support consistency or differential privacy.

The metadata object is populated from the JsonMaskMetadata object, and includes:

pathExpression, which is the expression that identifies the value to apply the sub-generator to.
jsonFilterTypes, which lists the types of values to apply the sub-generator to.
The subGeneratorMetadata object, which identifies and configures the selected sub-generator.

{
  "schema": "string",
  "table": "string",
  "column": "string",
  "metadata": {
    "generatorId": "StructMaskGenerator",
    "customValueProcessor": "string",  //If custom value processor applied
    "pathExpression": "string",
    "jsonFilterTypes": [ enum ],
    "subGeneratorMetadata": {
      "presetId": "string",
      "generatorId": "string",
      //Metadata for the selected sub-generator
      "customValueProcessor": "string"  //If custom value processor applied to the sub-generator
    }
  }
}

Example replacement

In the following example replacement for the Struct Mask generator:

The value at the path expression $.address.city is assigned the Address generator. The generator is configured to produce a city value. Consistency is disabled.
The value at the path expression $.address.zip is also assigned the Address generator. The generator is configured to produce a zip code value. Consistency is disabled.

{
  "name": "value",
  "schema": "",
  "table": "simple_struct",
  "links": [
    {
      "schema": "",
      "table": "simple_struct",
      "column": "value",
      "metadata": {
        "generatorId": "StructMaskGenerator",
        "presetId": "StructMaskGenerator",
        "pathExpression": "$.address.city",
        "jsonFilterTypes": [
          0
        ],
        "subGeneratorMetadata": {
          "presetId": "AddressGenerator",
          "generatorId": "AddressGenerator",
          "addressType": "City",
          "isConsistent": false
        }
      }
    },
    {
      "schema": "",
      "table": "simple_struct",
      "column": "value",
      "metadata": {
        "generatorId": "StructMaskGenerator",
        "presetId": "StructMaskGenerator",
        "pathExpression": "$.address.zip",
        "jsonFilterTypes": [
          0
        ],
        "subGeneratorMetadata": {
          "presetId": "AddressGenerator",
          "generatorId": "AddressGenerator",
          "addressType": "ZipCode",
          "isConsistent": false
        }
      }
    }
  ]
}

Timestamp Shift (TimestampShiftGenerator)

The Timestamp Shift generator shifts timestamps by a random amount of a specific unit of time within a set range.

Link object structure

The Timestamp Shift generator does not support linking. It can be self-consistent or consistent with another column. You cannot configure differential privacy.

The metadata object is populated from the TimestampShiftMetadata object, which includes:

For text source columns, the format of the datetime values in the original data.
For integer source columns (Unix timestamps), the unit to use.
The part of the timestamp to shift.
The minimum amount to shift the value by. Use negative numbers to move the value earlier.
The maximum amount to shift the value by.

{
  "schema": "string",
  "table": "string",
  "column": "string",
  "dataType": "string",  //MongoDB only
  "metadata": {
    "presetId": "string",
    "generatorId": "TimestampShiftGenerator",
    "dateFormat": "enum",  //For text source fields
    "unixTimestampFormat": "enum",  //For integer source fields
    "datePart": "enum",
    "minShiftValue": numeric,
    "maxShiftValue": numeric,
    "isConsistent": boolean,
    "consistencyColumn": "string",
    "encryptionProcessor": "x-on", //To use configured Structural data encryption
    "customValueProcessor": "string"  //If custom value processor applied
  }
}

Example replacement

The following example replacement for the Timestamp Shift generator updates text timestamps in the format yyyy-MM-dd. The generator shifts the day by a random amount, from 3 days earlier to 3 days later than the original value. The generator is consistent with the order column.

{
  "name": "start",
  "schema": "public",
  "table": "events",
  "links": [
    {
      "schema": "public",
      "table": "events",
      "column": "start",
      "metadata": {
        "presetId": "TimestampShiftGenerator",
        "generatorId": "TimestampShiftGenerator",
        "dateFormat": "yyyy-MM-dd",
        "datePart": "Day",
        "minShiftValue": -3,
        "maxShiftValue": 3,
        "isConsistent": true,
        "consistencyColumn": "order"
      }
    }
  ]
}
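
The shift in this example can be pictured with a short sketch. It is an illustration under stated assumptions, not Structural's algorithm: the function name `shift_day` is hypothetical, the yyyy-MM-dd format is mapped to Python's %Y-%m-%d, both bounds are treated as inclusive, and consistency is not modeled.

```python
import random
from datetime import datetime, timedelta

PYTHON_FORMAT = "%Y-%m-%d"  # Python equivalent of yyyy-MM-dd


def shift_day(value, min_shift, max_shift, rng):
    """Shift a yyyy-MM-dd string by a random number of days."""
    original = datetime.strptime(value, PYTHON_FORMAT)
    # Assumption: min_shift and max_shift are both inclusive.
    offset = rng.randint(min_shift, max_shift)
    return (original + timedelta(days=offset)).strftime(PYTHON_FORMAT)


rng = random.Random(7)
shifted = shift_day("2024-03-01", -3, 3, rng)
print(shifted)
```

A negative minShiftValue moves the value earlier, so with -3 and 3 the result for 2024-03-01 falls between 2024-02-27 and 2024-03-04.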

Unique Email (UniqueEmailGenerator)

The Unique Email generator generates unique email addresses. It replaces the username with a randomly generated GUID, and either uses a specified domain or masks the domain with a character scramble.

Link object structure

The Unique Email generator does not support linking. It can be self-consistent, but not consistent with another column. You cannot configure differential privacy.

The metadata object is populated from the EmailMetadata object. You can configure:

The domain to use for all of the email addresses in the destination database. If not specified, a character scramble is applied to the domains.
Domains for which to keep the email addresses as is in the destination database.
Whether to replace invalid email addresses with valid ones.

{
  "schema": "string",
  "table": "string",
  "column": "string",
  "dataType": "string",  //MongoDB only
  "metadata": {
    "presetId": "string",
    "generatorId": "UniqueEmailGenerator",
    "domain": "string",
    "replaceInvalidEmails": boolean,
    "isConsistent": boolean,
    "encryptionProcessor": "x-on", //To use configured Structural data encryption
    "customValueProcessor": "string"  //If custom value processor applied
  }
}

Example replacement

In the following example replacement for the Unique Email generator, consistency is enabled. tonic.ai is used as the domain for all of the email addresses, and invalid email addresses are not replaced.

{
  "name": "email",
  "schema": "public",
  "table": "users",
  "links": [
    {
      "schema": "public",
      "table": "users",
      "column": "email",
      "metadata": {
        "presetId": "UniqueEmailGenerator",
        "generatorId": "UniqueEmailGenerator",
        "domain": "tonic.ai",
        "replaceInvalidEmails": false,
        "isConsistent": true
      }
    }
  ]
}
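
The shape of the output for this configuration can be sketched as follows. This only illustrates the description above (a GUID username plus the configured domain). The function name `unique_email` is hypothetical, the exact GUID formatting that Structural uses is not specified here, and consistency is not modeled.

```python
import uuid


def unique_email(domain):
    """Build an address with a random GUID as the username."""
    # Assumption: the GUID is rendered as 32 hex characters without dashes.
    return f"{uuid.uuid4().hex}@{domain}"


emails = [unique_email("tonic.ai") for _ in range(1000)]
print(emails[0])
```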

XML Mask (XmlMaskGenerator)

The XML Mask generator runs a selected generator on values that match a user-specified path expression.

Link object structure

For the XML Mask generator, there is a link object for each path expression value to assign a sub-generator to.

The generator does not itself support consistency or differential privacy.

The metadata object is populated from the XmlMaskMetadata object. It includes:

pathExpression, which is the expression that identifies the value to apply the sub-generator to.
The subGeneratorMetadata object, which identifies and configures the sub-generator.

{
  "schema": "string",
  "table": "string",
  "column": "string",
  "dataType": "string",  //MongoDB only
  "metadata": {
    "generatorId": "XmlMaskGenerator",
    "customValueProcessor": "string",  //If custom value processor applied
    "pathExpression": "string",
    "subGeneratorMetadata": {
      "presetId": "string",
      "generatorId": "string",
      //Metadata for the selected sub-generator
      "customValueProcessor": "string"  //If custom value processor applied to the sub-generator
    }
  }
}

Example replacement

In the following example replacement for the XML Mask generator:

The Name generator is assigned to the path expression //view/item-descriptor//@display-name. The value is in the format First Name Last Name (John Smith), and capitalization is not preserved. Consistency is disabled.
The Constant generator is assigned to the path expression //view//object-class. The constant value is object-class.

{
  "name": "xml_data",
  "schema": "public",
  "table": "xml_me",
  "links": [
    {
      "schema": "public",
      "table": "xml_me",
      "column": "xml_data",
      "metadata": {
        "generatorId": "XmlMaskGenerator",
        "presetId": "XmlMaskGenerator",
        "pathExpression": "//view/item-descriptor//@display-name",
        "subGeneratorMetadata": {
          "presetId": "NameGenerator",
          "generatorId": "NameGenerator",
          "nameType": "FirstThenLastName",
          "preserveCapitalization": false,
          "isConsistent": false
        }
      }
    },
    {
      "schema": "public",
      "table": "xml_me",
      "column": "xml_data",
      "metadata": {
        "generatorId": "XmlMaskGenerator",
        "presetId": "XmlMaskGenerator",
        "pathExpression": "//view//object-class",
        "subGeneratorMetadata": {
          "presetId": "ConstantGenerator",
          "generatorId": "ConstantGenerator",
          "constant": "object-class"
        }
      }
    }
  ]
}
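
Because each path expression needs its own link object, it can be convenient to build the replacement programmatically. The following sketch assembles the same structure as the example above. The function name `xml_mask_replacement` is hypothetical, and the field values are copied from this document's example.

```python
import json


def xml_mask_replacement(schema, table, column, sub_generators):
    """Build a replacement with one XML Mask link per path expression.

    sub_generators maps each path expression to its subGeneratorMetadata.
    """
    links = []
    for path_expression, sub_metadata in sub_generators.items():
        links.append(
            {
                "schema": schema,
                "table": table,
                "column": column,
                "metadata": {
                    "generatorId": "XmlMaskGenerator",
                    "presetId": "XmlMaskGenerator",
                    "pathExpression": path_expression,
                    "subGeneratorMetadata": sub_metadata,
                },
            }
        )
    return {"name": column, "schema": schema, "table": table, "links": links}


replacement = xml_mask_replacement(
    "public",
    "xml_me",
    "xml_data",
    {
        "//view/item-descriptor//@display-name": {
            "presetId": "NameGenerator",
            "generatorId": "NameGenerator",
            "nameType": "FirstThenLastName",
            "preserveCapitalization": False,
            "isConsistent": False,
        },
        "//view//object-class": {
            "presetId": "ConstantGenerator",
            "generatorId": "ConstantGenerator",
            "constant": "object-class",
        },
    },
)
print(json.dumps(replacement, indent=2))
```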

Configure subsetting

Requires the Advanced API. The Advanced API requires an Enterprise license.

You can use the Tonic Structural API to configure subsetting for a workspace. You can also enable or disable subsetting for a workspace.

You can configure a table to be a target table or a lookup table. You can also remove the configuration.

Subsetting configuration format

You apply a subsetting configuration to individual tables.

A subsetting configuration identifies a table as a target table or a lookup table. For a target table, it also indicates how to identify the records to include in the subset.

The configuration includes:

The name of the schema that contains the table. For requests to update or delete a subsetting rule, the schema name is a request parameter and not in the request body.
The name of the table. For requests to update or delete a subsetting rule, the table name is a request parameter and not in the request body.
Whether the table is a lookup table (IgnoreUpstreamTables is true). If the table is not a lookup table, then it is a target table.
For a target table, to identify the records to include, either:
- A WHERE clause to filter the source records
- A percentage of source records
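
The rules in this list can be captured in a small helper that builds a configuration object. This is a sketch, not part of the Structural API. The function name `subsetting_config` and its parameters are hypothetical, and setting Percent to 100 for a lookup table simply mirrors the lookup table example in this document.

```python
def subsetting_config(schema, table, *, percent=None, where_clause=None, lookup=False):
    """Build a subsetting configuration for a target table or a lookup table."""
    if lookup:
        # A lookup table sets IgnoreUpstreamTables to true.
        return {
            "Schema": schema,
            "Table": table,
            "Percent": 100,
            "IgnoreUpstreamTables": True,
        }
    # A target table uses either a percentage or a WHERE clause.
    if (percent is None) == (where_clause is None):
        raise ValueError("provide exactly one of percent or where_clause")
    config = {"Schema": schema, "Table": table}
    if percent is not None:
        config["Percent"] = percent
    else:
        config["WhereClause"] = where_clause
    config["IgnoreUpstreamTables"] = False
    return config


target_by_percent = subsetting_config("public", "sales", percent=5)
target_by_where = subsetting_config("public", "sales", where_clause="amount < 100")
lookup_table = subsetting_config("public", "states", lookup=True)
print(target_by_percent)
```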

Subsetting configuration examples

Target table with percentage

The following example configures a table as a target table for which to include 5 percent of the records:

{
  "Schema": "public",
  "Table": "sales",
  "Percent": 5,
  "IgnoreUpstreamTables": false
}

Target table with WHERE clause

The following example configures a table as a target table for which to include records where the value of amount is less than 100.

{
  "Schema": "public",
  "Table": "sales",
  "WhereClause": "amount < 100",
  "IgnoreUpstreamTables": false
}

Lookup table

The following example configures a table as a lookup table:

{
  "Schema": "public",
  "Table": "states",
  "Percent": 100,
  "IgnoreUpstreamTables": true
}

Enabling or disabling subsetting for a workspace

To enable subsetting for a workspace, use:

POST /api/workspace/{workspaceId}/subsetting/enable

To disable subsetting for a workspace, use:

POST /api/workspace/{workspaceId}/subsetting/disable

Retrieving the subsetting configurations for a workspace

To get the current list of subsetting configurations for a workspace, use:

GET api/workspace/{workspaceId}/subsetting

Adding a subsetting configuration

To add a subsetting configuration, use:

POST api/workspace/{workspaceId}/subsetting

The request includes the subsetting configuration to apply.

Structural checks whether the specified table has a current subsetting configuration.

If it does, then Structural returns an error.

If it does not, then Structural adds the subsetting configuration for the table.

Adding or updating a subsetting configuration

To either add or update a subsetting configuration, use:

PUT api/workspace/{workspaceId}/subsetting/{schema}/{table}

The request body provides the subsetting configuration to apply to the specified table.

Structural checks whether the specified table has a current subsetting configuration.

If it does, then Structural uses the provided configuration to update it.

If it does not, then Structural adds the subsetting configuration for the table.

Removing a subsetting configuration

To remove a subsetting configuration, use:

DELETE api/workspace/{workspaceId}/subsetting/{schema}/{table}

Structural removes the subsetting configuration for the specified table.
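
The subsetting endpoints in this section can be wrapped in a small request builder. The following sketch only constructs the requests and does not send them. The host name, the workspace identifier, and the function names are hypothetical. Authentication is not covered in this section, so the sketch leaves any required headers to the caller.

```python
import json
from urllib.parse import quote
from urllib.request import Request

BASE_URL = "https://structural.example.com"  # hypothetical host


def table_suffix(schema, table):
    """Return the /{schema}/{table} part of a table-specific endpoint."""
    return f"/{quote(schema, safe='')}/{quote(table, safe='')}"


def subsetting_request(method, workspace_id, suffix="", body=None, headers=None):
    """Build (but do not send) a request to a subsetting endpoint."""
    url = f"{BASE_URL}/api/workspace/{quote(workspace_id)}/subsetting{suffix}"
    all_headers = dict(headers or {})  # the caller supplies authentication
    data = None
    if body is not None:
        data = json.dumps(body).encode("utf-8")
        all_headers["Content-Type"] = "application/json"
    return Request(url, data=data, headers=all_headers, method=method)


workspace = "workspace-id"  # hypothetical identifier
config = {
    "Schema": "public",
    "Table": "sales",
    "Percent": 5,
    "IgnoreUpstreamTables": False,
}

enable = subsetting_request("POST", workspace, "/enable")
list_all = subsetting_request("GET", workspace)
upsert = subsetting_request("PUT", workspace, table_suffix("public", "sales"), body=config)
remove = subsetting_request("DELETE", workspace, table_suffix("public", "sales"))

for request in (enable, list_all, upsert, remove):
    print(request.get_method(), request.full_url)
```

To send a request, you could pass it to `urllib.request.urlopen` after you add the headers that your Structural instance requires.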