Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
This generator reference provides the details for each of the the supported generators in Tonic Structural.
The generators are in alphabetical order by the generator name.
Here are some groupings to help to identify generators that are used for different types of values. Generator hints and tips also provides some suggestions for generators to use for specific uses cases.
This is a composite generator.
A version of the JSON Mask generator that can be used for array values.
Runs a selected generator on values that match a user-specified JSONPath.
Consistency
Determined by the specified sub-generators.
Linking
Determined by the specified sub-generators.
Differential privacy
Determined by the specified sub-generators.
Data-free
Determined by the specified sub-generators.
Allowed for primary keys
No
Allowed for unique columns
No
Uses format-preserving encryption (FPE)
No
Privacy ranking
5
Generator ID (for the API)
To assign a generator to a path expression:
Under Sub-generators, click Add Generator. On the sub-generator configuration panel, the Cell JSON field contains a sample value from the source database. You can use the previous and next icons to page through different values.
In the Path Expression field, type the JSONPath expression to identify the value to apply the generator to. To populate a path expression, you can also click a value in the Cell JSON field. Matched JSON Values shows the result from the value in Cell JSON.
By default, the selected generator is applied to any value that matches the expression. To limit the types of values to apply the generator to, from the Type Filter, specify the applicable types. You can select Any, or you can select any combination of String, Number, and Null.
From the Generator Configuration dropdown list, select the generator to apply to the path expression. You cannot select another composite generator.
Configure the selected generator. You cannot configure the selected generator to be consistent with another column.
To save the configuration and immediately add a generator for another path expression, click Save and Add Another. To save the configuration and close the add generator panel, click Save.
From the Sub-Generators list:
To edit a generator assignment, click the edit icon.
To remove a generator assignment, click the delete icon.
To move a generator assignment up or down in the list, click the up or down arrow.
If Structural data encryption is enabled, then to use it for this column, toggle Use data encryption process to the on position.
A version of the Character Scramble generator that can be used for array values.
This generator replaces letters with random other letters, and numbers with random other numbers. Punctuation and whitespace are preserved.
For example, for the following array value:
["ABC.123", 3, "last week"]
The output might be something like:
["KFR.860", 7, "sdrw mwoc"]
This generator securely masks letters and numbers. There is no way to recover the original data.
Consistency
Yes, can be made self-consistent.
Linking
No, cannot be linked.
Differential privacy
No
Data-free
No
Allowed for primary keys
No
Allowed for unique columns
No
Uses format-preserving encryption (FPE)
No
Privacy ranking
3 if not consistent
4 if consistent
Generator ID (for the API)
To configure the generator, toggle the Consistency setting to indicate whether to make the generator self-consistent.
By default, the generator is not consistent.
If Structural data encryption is enabled, then to use it for this column, toggle Use data encryption process to the on position.
The algebraic generator identifies the algebraic relationship between three or more numeric values and generates new values to match. At least one of the values must be a non-integer.
If a relationship cannot be found, then the generator defaults to the Categorical generator.
This generator can be linked with other Algebraic generators.
Consistency
No, cannot be made consistent.
Linking
Yes, can be linked.
Differential privacy
No
Data-free
No
Allowed for primary keys
No
Allowed for unique columns
No
Uses format-preserving encryption (FPE)
No
Privacy ranking
3
Generator ID (for the API)
To configure the generator, from the Link To dropdown list, select the columns to link this column to. You can select other columns that are assigned the Algebraic generator.
You must select at least three columns.
The column values must be numeric. At least one of the columns must contain a value other than an integer.
If Structural data encryption is enabled, then to use it for this column, toggle Use data encryption process to the on position.
Generates a random address-like string.
You can indicate which part of an address string that the column contains. For example, the column might contain only the street address or the city, or it might contain the full address.
Consistency
Yes, can be made self-consistent or consistent with another column.
Linking
Yes, can be linked.
Differential privacy
Yes, if consistency is not enabled.
Data-free
Yes, if consistency is not enabled.
Allowed for primary keys
No
Allowed for unique columns
No
Uses format-preserving encryption (FPE)
No
Privacy ranking
1 if not consistent
4 if consistent
Generator ID (for the API)
To configure the generator:
From the Link To dropdown list, select the columns to link this column to. You can link columns that use the Address generator to mask one of the following address components:
City
City State
Country
Country Code
State
State Abbreviation
Zip Code
Latitude
Longitude
Note that when linked to another address column, a country or country code is always the United States.
From the address component dropdown list, select the address component that this column contains. The available options are:
Building Number
Cardinal Direction (North, South, East, West)
City
City Prefix (Examples: North, South, East, West, Port, New)
City Suffix (Examples: land, ville, furt, town)
City with State (Example: Spokane, Washington)
City with State Abbr (Example: Houston, TX)
Country (Examples: Spain, Canada)
Country Code (Uses the 2-character country code. Examples: ES, CA)
County
Direction (Examples: North, Northeast, Southwest, East)
Full Address
Latitude (Examples: 33.51, 41.32)
Longitude (Examples: -84.05, -74.21)
Ordinal Direction (Examples: Northeast, Southwest)
Secondary Address (Examples: Apt 123, Suite 530)
State (Examples: Alabama, Wisconsin)
State Abbr (Examples: AL, WI)
Street Address (Example: 123 Main Street)
Street Name (Examples: Broad, Elm)
Street Suffix (Examples: Way, Hill, Drive)
US Address
US Address with Country
Zip Code (Example: 12345)
Toggle the Consistency setting to indicate whether to make the column consistent. By default, the consistency is disabled.
If consistency is enabled, then by default, the generator is self-consistent. To make the generator consistent with another column, from the Consistent to dropdown list, select the column. When the Address generator is consistent with itself, then the same value in the source database is always mapped to the same destination value. For example, for a column that contains a state name, Alabama is always mapped to Illinois. When the Address generator is consistent with another column, then the same value in the other column always results in the same destination value for the address column. For example, if the address column is consistent with a name column, then every instance of John Smith in the name column in the source database has the same address value in the destination database.
If Structural data encryption is enabled, then to use it for this column, toggle Use data encryption process to the on position.
For the Address generator, Spark workspaces (Amazon EMR, Databricks, and self-managed Spark clusters) only support the following address parts:
Building Number
City
Country
Country Code
Full Address
Latitude
Longitude
State
State Abbr
Street Address
Street Name
Street Suffix
US Address
US Address with Country
Zip Code
The Categorical generator shuffles the existing values within a field while maintaining the overall frequency of the values. It disassociates the values from other pieces of data. Note that NULL is considered a separate value.
For example, a column contains the values Small
, Medium
, and Large
. Small
appears 3 times, Medium
appears 4 times, and Large
appears 5 times. In the output data, each value still appears the same number of times, but the values are shuffled to different rows.
This generator is optimized for categories with fewer than 10,000 unique values. If your underlying data has more unique values (for example, your field is populated by freeform text entry), we recommend that you use the Character Scramble or Custom Categorical generator.
Consistency
No, cannot be made consistent.
Linking
Yes, can be linked.
Differential privacy
Configurable
Data-free
No
Allowed for primary keys
No
Allowed for unique columns
No
Uses format-preserving encryption (FPE)
No
Privacy ranking
2 if differential privacy enabled
3 if differential privacy not enabled
Generator ID (for the API)
To configure the generator:
From the Link To dropdown, select the columns to link to the current column. You can select from other columns that use the Categorical generator.
Toggle the Differential Privacy setting to indicate whether to make the output data differentially private. By default, differential privacy is disabled.
If Structural data encryption is enabled, then to use it for this column, toggle Use data encryption process to the on position.
Generates unique alphanumeric strings of the same length as the input.
For example, for the origin value ABC123
, the output value is a six-character alphanumeric string such as D24N05
.
Consistency
Yes, can be made self-consistent.
Linking
No, cannot be linked.
Differential privacy
No
Data-free
No
Allowed for primary keys
Yes
Allowed for unique columns
Yes
Uses format-preserving encryption (FPE)
Yes
Privacy ranking
3 if not consistent
4 if consistent
Generator ID (for the API)
To configure the generator, toggle the Consistency setting to indicate whether to make the generator self-consistent.
By default, the generator is not consistent.
If Structural data encryption is enabled, then to use it for this column, toggle Use data encryption process to the on position.
Generates a random company name-like string.
To configure the generator, toggle the Consistency setting to indicate whether to make the generator consistent.
By default, the generator is not consistent.
If consistency is enabled, then by default it is self-consistent. To make the generator consistent with another column, from the Consistent to dropdown list, select the column.
When the generator is consistent with itself, then a given source value is always mapped to the same destination value. For example, My Business is always mapped to New Business.
When the generator is consistent with another column, then a given source value in that other column always results in the same destination value for the company name column. For example, if the company name column is consistent with a name column, then every instance of John Smith in the name column in the source database has the same company name in the destination database.
This generator replaces letters with random other letters and numbers with random other numbers. Punctuation, whitespace, and mathematical symbols are preserved.
For example, for the following input string:
ABC.123 123-456-789 Go!
The output would be something like:
PRX.804 296-915-378 Ab!
This generator securely masks letters and numbers. There is no way to recover the original data.
Character Scramble is similar to , with a couple of key differences.
While you can enable consistency for the entire value, Character Scramble does not always replace the same source character with the same destination character. Because there is no guarantee of unique values, you cannot use Character Scramble on unique columns.
Character Substitution, however, does always map the same source character to the same destination character. Character Substitution is always consistent, which makes it less secure than Character Scramble. You can use Character Substitution on unique columns.
To configure the generator, toggle the Consistency setting to indicate whether to make the generator self-consistent.
By default, the generator is not consistent.
Generates a continuous distribution to fit the underlying data.
This generator can be linked to other Continuous generators to create multivariate distributions and can be partitioned by other columns.
To configure the generator:
From the Link To drop-down list, select the other Continuous generator columns to link to. The linking creates a multivariate distribution.
From the Partition By drop-down list, select one or more columns to use to partition the data. The selected columns must have the generator set to either Passthrough or Categorical. For more information about partitioning and how it works, go to .
Toggle the Differential Privacy setting to indicate whether to make the output data differentially private. By default, the generator is not differentially private.
If is enabled, then to use it for this column, toggle Use data encryption process to the on position.
Performs a random character replacement that preserves formatting (spaces, capitalization, and punctuation).
Characters are replaced with other characters from within the same Unicode Block. A given source character is always mapped to the same destination character. For example, M
might always map to V
.
For example, for the following input string:
Miami Store #162
The output would be something like:
Vgkjg Gmlvf #681
Note that for a numeric column, when a generated number starts with a 0, the starting 0 is removed. This could result in matching output values in different columns. For example, one column is changed to 113 and the other to 0113, which also becomes 113.
Character Substitution is similar to , with a couple of key differences. Because Character Substitution always maps the same source character to the same destination character, it is always consistent. It also can be used for unique columns.
In Character Scramble, the character mapping is random, which makes Character Scramble slightly more secure. However, Character Scramble cannot be used for unique columns.
Links columns in two tables. This column value is the sum of the values in a column in another table.
This generator does not provide a preview. The sums are not computed until the other table is generated.
For example, a Customers table contains a Total_Sales column. The Transactions table uses a foreign key Customer_ID column to identify the customer who made the transaction, and an Amount column that contains the amount of the sale. The Customer_ID value in the Transactions table is a value from the ID primary key column in the Customers table.
You assign the Cross Table Sum generator to the Total_Sales column. In the generator configuration, you indicate that the value is the sum of the Amount values for the Customer_ID value that matches the primary key ID value for the current row.
For the Customers row for ID 123
, the Total_Sales column contains the sum of the Amount column for Transactions rows where Customer_ID is 123
.
To configure the generator:
From the Foreign Table dropdown list, select the table that contains the column for which to sum the values.
From the Foreign Key dropdown list, select the foreign key. The foreign key identifies the row from the current table that is referred to in the foreign table.
From the Sum Over dropdown list, select the column for which to sum the values.
From the Primary Key dropdown list, select the primary key for the current table.
A version of the generator that selects from values that you provide instead of shuffling the original values.
To configure the generator:
From the Link To dropdown list, select the columns to link this column to. You can only select other columns that use the Custom Categorical generator.
In the Custom Categories text area, enter the list of values that the generator can choose from.
Put each value on a separate line.
To add a NULL value to the list, use the keyword {NULL}
.
Toggle the Consistency setting to indicate whether to make the column consistent. By default, consistency is disabled.
If you enable consistency, then by default the generator is self-consistent. To make the generator consistent with another column, from the Consistent to dropdown list, select the column. When a generator is self-consistent, then a given value in the source database is always mapped to the same value in the destination database. When a generator is consistent with another column, then a given source value in that column always results in the same value for the current column in the destination database. For example, a department column is consistent with a username column. For each instance of User1 in the source database, the value in the department column is the same.
If is enabled, then to use it for this column, toggle Use data encryption process to the on position.
Generates unique alpha-numeric strings based on any printable ASCII characters. The length of the source string is not preserved. You can choose to exclude lowercase letters from the generated values.
To configure the generator:
To exclude lowercase letters from the generated values, toggle Exclude Lowercase Alphabet to the on position.
Toggle the Consistency setting to indicate whether to make the generator consistent. By default, the generator is not consistent.
If is enabled, then to use it for this column, toggle Use data encryption process to the on position.
This is a .
Masks text columns by parsing the values as rows whose columns are delimited by a specified character.
You can assign specific generators to specific indexes. You can also use the generator that is assigned to a specific index as the default. This applies the generator to every index that does not have an assigned generator.
The output value maintains the quotes around the index values.
For example, a column contains the following value:
"first","second","third"
You assign the Character Scramble generator to index 0 and assign Passthrough to index 2. You select index 0 as the index to use for the default generator.
In the output, the first and second values are masked by the Character Scramble generator. The third value is not masked. The output looks something like:
"wmcop", "xjorsl", "third"
In the Delimiter field, type the delimiter that is used as a separator for the value.
For example, for the value "first","second","third"
, the delimiter is a comma.
You can configure a generator for any or all of the indexes. To add a sub-generator for an index:
Under Sub-Generators, click Add Generator. On the add generator dialog, the Cell CSV field contains a sample value from the source data. You can use the navigation icons to page through the values.
In the CSV Index field, type the index to assign a generator to. The index numbers start with 0. You cannot use an index that already has an assigned generator. Matched CSV values shows the value at that index for the current sample column value.
Under Generator Configuration, from the Select a Generator dropdown list, select the generator to use for the selected index. You cannot select another composite generator. To remove the selection, click the delete icon.
Configure the selected generator. You cannot configure the selected generator to be consistent with another column.
To save the configuration and immediately add a generator for another index, click Save and Add Another. To save the configuration and close the add generator panel, click Save.
From the Sub-Generators list:
To edit a generator assignment, click the edit icon.
To remove a generator assignment, click the delete icon.
To move a generator assignment up or down in the list, click the up or down arrow.
After you configure a generator for at least one index, the Default Link dropdown list is displayed.
From the Default Link dropdown list, select the index to use to determine how to mask values for indexes that do not have an assigned generator.
For example, you assign the Character Scramble generator to index 2. If you set Default Link to 2, then all indexes that do not have an assigned generator use the Character Scramble generator.
This generator is deprecated. Use the generator instead.
Generates a random company name-like string.
To configure the generator, toggle the Consistency setting to indicate whether to make the generator consistent.
By default, the generator is not consistent.
If consistency is enabled, then by default it is self-consistent. To make the generator consistent with another column, from the Consistent to dropdown list, select the column.
When the generator is consistent with itself, then a given source value is always mapped to the same destination value. For example, My Company is always mapped to New Company.
When the generator is consistent with another column, then a given source value in that other column always results in the same destination value for the company name column. For example, if the company name column is consistent with a name column, then every instance of John Smith in the name column in the source database has the same company name in the destination database.
This is a .
Applies different generators to the value conditionally based on any value in the table.
For example, a Users table contains Name, Username, and Role columns. For the Username column, you can use a conditional generator to indicate that if the value of Role is something other than Test, then use the Character Scramble generator for the Username value. For Test users, the name is not masked.
The generator consists of a list of options. Each option includes the required conditions and the generator to use if those conditions are met.
The generator always contains a Default option. The Default option is used if the value does not meet any of the conditions. To configure the Default option:
From the Default dropdown list, select the generator to use by default.
Configure the selected generator.
To add a condition option:
Click + Conditional Generator.
To add a condition:
Click + Condition.
From the column list, select the column for which to check the value.
Select the comparison type.
Enter the column value to check for.
To remove a condition, click the delete icon for the condition.
From the Generator dropdown list, select the generator to run on the current column if the conditions are met. You cannot select another composite generator.
Choose the configuration options for the selected generator.
To view details for and edit a condition option, click the expand icon for that option.
To remove a condition option, click the delete icon for the option.
This is a .
A version of the generator that can be used for array values.
Uses regular expressions to parse strings and replace specified substrings with the output of specified generators. The parts of the string to replace are specified inside unnamed top-level capture groups.
To add a regular expression:
Click Add Regex. On the configuration panel, Cell Value shows a sample value from the source database. You can use the previous and next options to navigate through the values.
By default, Replace all matches is enabled. To only match the first occurrence of a pattern, toggle Replace all matches to the off position.
In the Pattern field, enter a regular expression. If the expression is valid, then Structural displays the capture groups for the expression.
For each capture group, to select and configure the generator to apply, click the selected generator. You cannot select another composite generator.
To save the configuration and immediately add a generator for another path expression, click Save and Add Another. To save the configuration and close the add generator panel, click Save.
From the Regexes list:
To edit a regex, click the edit icon.
To remove a regex, click the delete icon.
Uses a single value to mask all of the values in the column.
For example, you can replace every value in a string column with the String1
. Or you can replace every value in a numeric column with the value 12345
.
To configure the generator, in the Constant Value field, provide the value to use.
The value must be compatible with the field type. For example, you cannot provide a string value for an integer column.
If is enabled, then to use it for this column, toggle Use data encryption process to the on position.
If is enabled, then to use it for this column, toggle Use data encryption process to the on position.
If is enabled, then to use it for this column, toggle Use data encryption process to the on position.
If is enabled, then to use it for this column, toggle Use data encryption process to the on position.
If is enabled, then to use it for this column, toggle Use data encryption process to the on position.
If is enabled, then to use it for this column, toggle Use data encryption process to the on position.
Consistency
Yes, can be made self-consistent or consistent with another column.
Linking
No, cannot be linked.
Differential privacy
Yes, if consistency is not enabled.
Data-free
Yes, if consistency is not enabled.
Allowed for primary keys
No
Allowed for unique columns
No
Uses format-preserving encryption (FPE)
No
Privacy ranking
1 if not consistent
4 if consistent
Generator ID (for the API)
Consistency
No, cannot be made consistent.
Linking
No, cannot be linked.
Differential privacy
No
Data-free
No
Allowed for primary keys
No
Allowed for unique columns
No
Uses format-preserving encryption (FPE)
No
Privacy ranking
3
Generator ID (for the API)
Consistency
Yes, can be made self-consistent
Linking
No, cannot be linked
Differential privacy
No
Data-free
No
Allowed for primary keys
No
Allowed for unique columns
No
Uses format-preserving encryption (FPE)
No
Privacy ranking
3 if not consistent
4 if consistent
Generator ID (for the API)
Consistency
No, cannot be made consistent.
Linking
Yes, can be linked.
Differential privacy
Configurable
Data-free
No
Allowed for primary keys
No
Allowed for unique columns
No
Uses format-preserving encryption (FPE)
No
Privacy ranking
2 if differential privacy enabled
3 if differential privacy not enabled
Generator ID (for the API)
Consistency
This generator is implicitly self-consistent. You do not specify whether the generator is consistent. Every occurrence of a character always maps to the same substitute character. Because of this, it can be used to preserve a join between two text columns, such as a join on a name or email.
Linking
No, cannot be linked.
Differential privacy
No
Data-free
No
Allowed for primary keys
No
Allowed for unique columns
Yes
Uses format-preserving encryption (FPE)
No
Privacy ranking
4
Generator ID (for the API)
Consistency
Yes, can be made self-consistent or consistent with another column.
Linking
Yes, can be linked.
Differential privacy
Yes, if consistency is not enabled.
Data-free
Yes, if consistency is not enabled.
Allowed for primary keys
No
Allowed for unique columns
No
Uses format-preserving encryption (FPE)
No
Privacy ranking
1 if not consistent
4 if consistent
Generator ID (for the API)
Consistency
Yes, can be made self-consistent.
Linking
No, cannot be linked.
Differential privacy
No
Data-free
No
Allowed for primary keys
Yes
Allowed for unique columns
Yes
Uses format-preserving encryption (FPE)
Yes
Privacy ranking
3 if not consistent
4 if consistent
Generator ID (for the API)
Consistency
Determined by the selected sub-generators.
Linking
Determined by the selected sub-generators.
Differential privacy
Determined by the selected sub-generators.
Data-free
Determined by the selected sub-generators.
Allowed for primary keys
No
Allowed for unique columns
No
Uses format-preserving encryption (FPE)
No
Privacy ranking
5
Generator ID (for the API)
Consistency
Yes, can be made self-consistent or consistent with another column.
Linking
No, cannot be linked.
Differential privacy
Yes, if consistency is not enabled.
Data-free
Yes, if consistency is not enabled.
Allowed for primary keys
No
Allowed for unique columns
No
Uses format-preserving encryption (FPE)
No
Privacy ranking
1 if not consistent
4 if consistent
Generator ID (for the API)
Consistency
Determined by the selected generators.
Linking
Determined by the selected generators.
Differential privacy
Determined by the selected generators.
Data-free
Determined by the selected generators.
Allowed for primary keys
No
Allowed for unique columns
Yes
Uses format-preserving encryption (FPE)
No
Privacy ranking
If a fallback generator is selected, then the lower of either 5 or the fallback generator.
5 if no fallback generator is selected
Generator ID (for the API)
Consistency
Determined by the selected sub-generators.
Linking
Determined by the selected sub-generators.
Differential privacy
Determined by the selected sub-generators.
Data-free
Determined by the selected sub-generators.
Allowed for primary keys
No
Allowed for unique columns
No
Uses format-preserving encryption (FPE)
No
Privacy ranking
5
Generator ID (for the API)
Consistency
No, cannot be made consistent.
Linking
No, cannot be linked.
Differential privacy
Yes
Data-free
Yes
Allowed for primary keys
No
Allowed for unique columns
No
Uses format-preserving encryption (FPE)
No
Privacy ranking
1
Generator ID (for the API)
Scrambles the characters in an email address. It preserves formatting and keeps the @
and .
characters.
For example, for the following input value:
johndoe@company.com
The output value would be something like:
brwomse@xorwxlt.slt
By default, the generator scrambles the domain. You can configure the generator to not mask specific domains. You can also specify a domain to use for all of the output email addresses.
For example, if you configure the generator to not scramble the domain company.com
, then the output for johndoe@company.com
would look something like:
brwomse@company.com
This generator securely masks letters and numbers. There is no way to recover the original data.
If your email addresses include name values - for example, John.Smith@mycompany.com - then you can use the Regex Mask generator to produce email addresses that are tied to name values in the same table. For information on how to do this, go to #generator-tips-email-name-alignment.
Consistency
Yes, can be made self-consistent.
Linking
No, cannot be linked.
Differential privacy
No
Data-free
No
Allowed for primary keys
No
Allowed for unique columns
No
Uses format-preserving encryption (FPE)
No
Privacy ranking
3 if not consistent
4 if consistent
Generator ID (for the API)
To configure the generator:
In the Email Domain field, enter a domain to use for all of the output values.
For example, use @mycompany.com
for all of the generated values. The generator scrambles the content before the @
.
In the Excluded Email Domains field, enter a comma-separated list of domains for which email addresses are not masked in the output values. This allows you, for example, to maintain internal or testing email addresses that are not considered sensitive.
Toggle the Replace invalid emails setting to indicate whether to replace an invalid email address with a generated valid email address. By default, invalid email addresses are not replaced. In the replacement values, the username is generated. If you specify a value for Email Domain, then the email addresses use that domain. Otherwise, the domain is generated.
Toggle the Consistency setting to indicate whether to make the column self-consistent. By default, consistency is disabled.
The FNR generator transforms Norwegian national identity numbers. In Norwegian, the term for national identity number abbreviates to FNR.
The first six digits of an FNR reflects the person's birthdate. You can choose to preserve the birthdates from the source values in the destination values. If you do not preserve the source values, the destination values are still within the same date range as the source values.
Another digit in an FNR indicates whether the person is male or female. You can specify whether to preserve in the generated value the gender indicated in the source value.
The last digits in an FNR are a checksum value. The last digits in the destination value are not a checksum - the values are random.
Consistency
Yes, can be made self-consistent or consistent with another column.
Linking
No, cannot be linked
Differential privacy
No
Data-free
No
Allowed for primary keys
No
Allowed for unique columns
Yes
Uses format-preserving encryption (FPE)
No
Privacy ranking
3 if not consistent
4 if consistent
Generator ID (for the API)
To configure the generator:
To preserve the gender from the source value in the destination value, toggle Preserve Gender to the on position.
To preserve the birthdate from the source value in the destination value, toggle Preserve Birthdate to the on position.
Toggle the Consistency setting to indicate whether to make the generator consistent. By default, consistency is disabled.
If you enable consistency, then by default the generator is self-consistent. To make the generator consistent with another column, from the Consistent to dropdown list, select the column. When a generator is self-consistent, then a given value in the source database is always mapped to the same value in the destination database. When a generator is consistent with another column, then a given value for that other column in the source database results in the same value in the destination database. For example, if the FNR column is consistent with a Name column, then every instance of John Smith in the source database results in the same FNR in the destination database.
If Structural data encryption is enabled, then to use it for this column, toggle Use data encryption process to the on position.
This generator can be used to mask columns of latitude and longitude.
The Geo generator divides the globe into grids that are approximately 4.9 x 4.9 km. It then counts the number of points within each grid.
During data generation, each (latitude, longitude) pair is mapped to its grid.
If the grid contains a sufficient number of points to preserve privacy, then the generator returns a randomly chosen point in that grid.
If the grid does not contain enough points to preserve privacy, then the generator returns a random coordinate from the nearest grid that contains enough points.
Consistency
No, cannot be made consistent.
Linking
Yes, can be linked.
Differential privacy
No
Data-free
No
Allowed for primary keys
No
Allowed for unique columns
Yes
Uses format-preserving encryption (FPE)
No
Privacy ranking
3
Generator ID (for the API)
To configure the generator:
From the Link To dropdown list, select the column to link to this one. You typically assign the Geo generator to both the latitude and longitude column, then link those columns.
From the value type dropdown, select whether this column contains a latitude value or a longitude value.
If Structural data encryption is enabled, then to use it for this column, toggle Use data encryption process to the on position.
Generates timestamps fitting an event distribution. The source timestamp must include a date. It cannot be a time-only value.
Link columns to create a sequence of events across multiple columns. This generator can be partitioned by other columns.
Consistency
No, cannot be made consistent.
Linking
Yes, can be linked.
Differential privacy
No
Data-free
No
Allowed for primary keys
No
Allowed for unique columns
No
Uses format-preserving encryption (FPE)
No
Privacy ranking
3
Generator ID (for the API)
To configure the generator:
From the Link To dropdown list, select the other Event Timestamps generator columns to link this column to. Linking creates a sequence across multiple columns.
From the Partition drop-down list, select one or more columns to use to partition the data. The selected columns must have their generator set to either Passthrough or Categorical. For more information about partitioning and how it works, go to Partitioning a column.
The Options list displays the current column and linked columns. Use the Up and Down buttons to configure the column sequence.
If Structural data encryption is enabled, then to use it for this column, toggle Use data encryption process to the on position.
This generator scrambles characters while preserving formatting and keeping the file extension intact.
For example, for the following input value:
DataSummary1.pdf
The output value would look something like:
RsnoPwcsrtv5.pdf
This generator securely masks letters and numbers. There is no way to recover the original data.
Consistency
Yes, can be made self-consistent.
Linking
No, cannot be linked.
Differential privacy
No
Data-free
No
Allowed for primary keys
No
Allowed for unique columns
No
Uses format-preserving encryption (FPE)
No
Privacy ranking
3 if not consistent
4 if consistent
Generator ID (for the API)
To configure the generator, toggle the Consistency setting to indicate whether to make the generator self-consistent.
By default, the generator is not consistent.
If Structural data encryption is enabled, then to use it for this column, toggle Use data encryption process to the on position.
This generator can be used to generate cities, states, and zip codes that follow HIPAA guidelines for safe harbor.
How the HIPAA Address generator handles zip codes is based on whether the Replace zeros in truncated Zip Code toggle in the generator configuration is off or on.
By default, the setting is off. In this case, the last two digits of the zip code in the column are replaced with zeros, unless the zip code is a low population area as designated by the current census. For a low population area, all of the digits in the zip code are replaced with zeros.
If the setting is on, then the generator selects a real zip code that starts with the same three digits as the original zip code. For a low population area, if a state is linked, then the generator selects a random zip code from within that state. Otherwise the generator selects a random zip code from the United States.
When a zip code column is not linked, a random city is chosen in the United States. When a zip code is already added to the link, a city is chosen at random that has at least some overlap with the zip code.
If the original zip code is designated as a low population area then a random city is chosen within the state, this is done only if the user has linked a State column. If they have not, a random city within the United States is chosen.
For example, if the original city and zip code were (Atlanta, 30305), the zip code would be replaced with 30300. There are many cities that contain zip codes beginning in 303 such as Atlanta, Decatur, Chamblee, Hapeville, Dunwoody, College Park, etc.). One of these cities is chosen at random so that our final value is (Chamblee, 30300), for example.
HIPAA guidelines allow for information at the state level to be kept. Therefore, these values are passed through.
GPS coordinates are randomly generated in descending order of dependence of the linked HIPAA address components:
If a zip code is linked, a random point within the same 3-digit zip code prefix is generated, if the 3-digit zip code prefix is not designated a low population area. If it is a low population area, use the linked state.
If a state is available and a zip code and city are not, or the zip code or city are in a 3-digit zip code prefix that is designated a low population area, then a random GPS coordinate is generated somewhere within the state.
If no zip code, city, or state is linked, or one or more of them were provided, but there was a problem generating a random GPS coordinate within the linked areas, then a GPS coordinate is generated at a random location within the United States.
Note: If the city component of the HIPAA address is linked with latitude and/or longitude, the GPS coordinate components are randomly generated independently of the city.
All other address parts are generated randomly. The output value is not influenced at all by the underlying value in the column.
To configure the generator:
From the Link To dropdown list, select the other columns to link to. You can only select columns that are also assigned the HIPAA Address generator.
From the address part dropdown list, select the type of address value that is in the column.
Toggle the Replace zeros in truncated Zip Code setting how to generate zip codes. If the setting is off, then the last two digits are replaced with zero. For low population areas, the entire zip code is populated with zeroes. If the setting is on, then a real zip code is selected that starts with the first three digits of the original zip code. For low population areas, if a state is linked, a random zip code from the state is used. Otherwise, a random zip code from the United States is used.
Toggle the Consistency setting to indicate whether to make the column self-consistent. By default, consistency is disabled.
For the HIPAA Address generator, Spark workspaces (Amazon EMR, Databricks, and self-managed Spark clusters) only support the following address parts:
City
City with State
City with State Abbr
State
State Abbr
US Address
US Address with Country
Zip Code
Generates unique integer values. By default, the generated values are within the range of the column’s data type.
You can also specify a range for the generated values. The source values must be within that range.
This generator cannot be used to transform negative numbers.
To configure the generator:
In the Minimum field, enter the minimum value to use for an output value. The minimum value cannot be larger than any of the values in the source data.
In the Maximum field, enter the maximum value to use for an output value. The maximum value cannot be smaller than any of the values in the source data.
Toggle the Consistency setting to indicate whether to make the column self-consistent. By default, consistency is disabled.
If is enabled, then to use it for this column, toggle Use data encryption process to the on position.
This generator replaces all instances of the find string with the replace string.
For example, you can indicate to replace all instances of abc with 123.
To configure the generator:
In the Find field, type the string to look for in the source column value.
To use a regular expression to identify the source value, check the Use Regex checkbox.
If you use a regular expression, use backslash ( \
) as the escape character.
In the Replace field, type the string to replace the matching string with.
If is enabled, then to use it for this column, toggle Use data encryption process to the on position.
This is a .
Masks text columns by parsing the contents as HTML, and applying sub-generators to specified path expressions.
If applying a sub-generator fails because of an error, the generator selected as the fallback generator is applied instead.
Path expressions are defined using the .
For example, for the following HTML:
To get the value of h1
, the expression is //h1/text(
).
To get the value of the first list item, the expression is //ul/li[1]/text()
.
To assign a generator to a path expression:
Under Sub-generators, click Add Generator. On the sub-generator configuration panel, the Cell HTML field contains a sample value from the source database. You can use the previous and next icons to page through different values.
In the Path Expression field, type the path expression to identify the value to apply the generator to. Matched HTML Values shows the result from the value in Cell HTML.
From the Generator Configuration dropdown list, select the generator to apply to the path expression. You cannot select another composite generator.
Configure the selected generator. You cannot configure the selected generator to be consistent with another column.
To save the configuration and immediately add a generator for another path expression, click Save and Add Another. To save the configuration and close the add generator panel, click Save.
From the Sub-Generators list:
To edit a generator assignment, click the edit icon.
To remove a generator assignment, click the delete icon.
To move a generator assignment up or down in the list, click the up or down arrow.
From the Fallback Generator dropdown list, select the generator to use if the assigned generator for a path expression fails.
The options are:
This is a .
Runs selected generators on specified key values in an HStore column in a PostgreSQL database. HStore columns contain a set of key-value pairs.
To assign a generator to a key:
Under Sub-generators, click Add Generator. On the sub-generator configuration panel, the Cell HStore field contains a sample value from the source database. You can use the previous and next icons to page through different values.
Under Enter a key, enter the name of a key from the column value.
For example, for the column value:
"pages"=>"446", "title"=>"The Iliad", "category"=>"mythology"
To apply a generator to the title, you would enter title
as the key.
Matched HStore Values shows the result from the value in Cell HStore.
From the Generator Configuration dropdown list, select the generator to apply to the key value. You cannot select another composite generator.
Configure the selected generator. You cannot configure the selected generator to be consistent with another column.
To save the configuration and immediately add a generator for another key, click Save and Add Another. To save the configuration and close the add generator panel, click Save.
From the Sub-Generators list:
To edit a generator assignment, click the edit icon.
To remove a generator assignment, click the delete icon.
To move a generator assignment up or down in the list, click the up or down arrow.
Generates random host names, based on the English language.
To configure the generator, toggle the Consistency setting to indicate whether to make the generator consistent.
By default, the generator is not consistent.
If you enable consistency, then by default the generator is self-consistent. To make the generator consistent with another column, from Consistent to, select the column.
When the generator is consistent with itself, then a given value in the source database is mapped to the same value in the destination database. For example, Host123 in the source database always produces MyHostABC in the destination database.
When the generator is consistent with another column, then a given source value in the other column results in the same host name value in the destination database. For example, a host name column is consistent with a department column. Every instance of Sales in the source data is given the same host name in the destination database.
Generates unique object identifiers.
Can be assigned to text columns that contain MongoDB ObjectId
values. The column value must be 12 bytes long.
To configure the generator:
A MongoID object identifier consists of an epoch timestamp, a random value, and an incremented counter. To only change the random value portion of the identifier, but keep the timestamp and counter portions, toggle Preserve Timestamp and Incremental Counter to the on position.
Toggle the Consistency setting to indicate whether to make the generator self-consistent. By default, the generator is not consistent.
If is enabled, then to use it for this column, toggle Use data encryption process to the on position.
Generates a random name string from a dictionary of first and last names.
You specify the name information that is contained in the column. A column might only contain a first name or last name, or might contain a full name. A full name might be first name first or last name first.
For example, a Name column contains a full name in the format Last, First. For the input value Smith, John
, the output value would be something like, Jones, Mary
.
To configure the generator:
From the name format dropdown list, select the type of name value that the column contains:
First. This also is commonly used for standalone middle name fields.
Last
First Last
First Middle Last
First Middle Initial Last
Last, First
Last, First Middle
Middle Initial
Toggle the Preserve Capitalization setting to indicate whether to preserve the capitalization of the column value. By default, the capitalization is not preserved.
Toggle the Consistency setting to indicate whether to make the column consistent. By default, consistency is disabled.
If you enable consistency, then by default the generator is self-consistent. To make the generator consistent with another column, from the Consistent to dropdown list, select the column.
If is enabled, then to use it for this column, toggle Use data encryption process to the on position.
Generates an address-like string to replace either:
For a Canadian postal address, the street name or postal code
For a United Kingdom (UK) mailing address, the postal code
To replace a Canadian postal code:
The generator selects a real postal code that starts with the same three digits - has the same Forward Sortation Area (FSA) - as the original postal code, but that has a different Local Delivery Unit (LDU).
For a postal code whose FSA is not on the list that the generator uses, you can provide a fallback value to use.
To replace a UK postal code, the generator selects a real postal code.
To configure the generator:
From the Generator Type dropdown list, select International Address.
From the Country dropdown list, select the country (Canada or United Kingdom).
From the Address Component dropdown list, select the address component that this column contains. For Canada, the available options are:
Street Name
Postal Code
For the UK, the only option is to generate a postal code.
For a Canadian postal code, in the Fallback Value field, type the FSA to use if the value in the data does not exist.
For example, the FSA in the data might be new and not yet in the list that Tonic uses, or the FSA might be invalid.
By default, the fallback value is NULL
, meaning that the postal code value will also be string literal "NULL" in the destination data.
Toggle the Consistency setting to indicate whether to make the column consistent. By default, consistency is disabled.
If Tonic data encryption is enabled, then to use it for this column, toggle Use data encryption process to the on position.
This is a .
Runs a selected sub-generator on values that match a user specified . You can only search for and apply sub-generators to individual key values. You cannot apply a sub-generator to an object or to an array.
If an error occurs, the selected fallback generator is used for the entirety of the JSON value.
Sub-generators are applied sequentially, from the sub-generator at the top of the list to the sub-generator at the bottom of the list.
If multiple JSONPath expressions point to the same key, the most recently added generator takes priority.
JSON paths can also contain regular expressions and comparison logic, which allows the configured sub-generators to be applied only when there are properties that satisfy the query.
For example, a column contains this JSON:
[ { file_name: "foo.txt", b: 10 }, ... ]
The following JSON path only applies to array elements that contain a file_name
key for which the value ends in .txt
:
$.[?(@.file_Name =~ /^.*.txt$/)]
A JSON path can also be used to point to a key name recursively. For example, a column contains this JSON:
The following JSON path applies to all properties for which the key is first_name
:
$..first_name
To assign a generator to a path expression:
Under Sub-generators, click Add Generator. On the sub-generator configuration panel, the Cell JSON field contains a sample value from the source database. You can use the previous and next icons to page through different values.
In the Path Expression field, type the path expression to identify the value to apply the generator to. To create a path expression, you can also click the value in Cell JSON that you want the expression to point to. The path expression must identify a key value. You cannot apply sub-generators to an object or to an array. Matched JSON Values shows the result from the value in Cell JSON.
By default, the selected generator is applied to any value that matches the expression. To limit the types of values to apply the generator to, from the Type Filter, specify the applicable types. You can select Any, or you can select any combination of String, Number, Boolean, and Null.
From the Generator Configuration dropdown list, select the generator to apply to the path expression. You cannot select another composite generator.
Configure the selected generator. You cannot configure the selected generator to be consistent with another column.
To save the configuration and immediately add a generator for another path expression, click Save and Add Another. To save the configuration and close the add generator panel, click Save.
From the Sub-Generators list:
To edit a generator assignment, click the edit icon.
To remove a generator assignment, click the delete icon.
To move a generator assignment up or down in the list, click the up or down arrow.
From the Fallback Generator dropdown list, select the generator to use if the assigned generator for a path expression fails.
The options are:
Generates a random MAC address formatted string.
To configure the generator:
In the Bytes Preserved field, enter the number of bytes to preserve in the generated address.
Toggle the Consistency setting to indicate whether to make the column self-consistent. By default, consistency is disabled.
If is enabled, then to use it for this column, toggle Use data encryption process to the on position.
Masks values in numeric columns. Adds or multiplies the original value by random noise.
The additive noise generator draws noise from an interval around 0 scaled to the magnitude of original value. For example, the default scale is 10% of the underlying value. The larger the value, the larger the amount of noise that is added.
The multiplicative noise generator multiplies the original value by a random scaling factor that falls within a specified range.
You can use either the additive noise generator or the multiplicative noise generator, then set the other generator settings.
To use the additive noise generator:
From the dropdown list, choose Additive.
In the Relative noise scale field, type the percentage of the underlying value to scale the noise to. The default value is 10
.
Tonic samples the additive noise from a range between [-{
scale
/100} * |
value
|, {
scale
/ 100} * |
value
|)
, where scale
is the noise scale, and value
is the original data value.
The lower value of the range is inclusive, and the upper value of the range is exclusive.
For example, for the default noise scale of 10
, and a data value of 20
, the additive noise range would be [-.1 * 20, .1 * 20)
. In other words, between -2 (inclusive) and 2 (exclusive).
To use the multiplicative noise generator:
From the dropdown list, choose Multiplicative.
In the Min field, type the minimum value for the scaling factor. The minimum value is inclusive. The default value is 0.5
.
In the Max field, type the maximum value for the scaling factor. The maximum value is exclusive. The default value is 5
.
Tonic scales the original value from a range between [
min
,
max
)
, where min
is the minimum scaling factor, and max
is the maximum scaling factor.
For example, for the default values of 0.5
and 5
, Tonic multiplies the original data value by a value from between 0.5 (inclusive) and 5 (exclusive).
To configure the generator consistency and data encryption:
Toggle the Consistency setting to indicate whether to make the column consistent. By default, the consistency is disabled.
If you enable consistency, then by default the generator is self-consistent. To make the generator consistent with another column, from the Consistent to dropdown list, select the column. If the generator is self-consistent, then a given value in the source database is masked in exactly the same way to produce the value in the destination database. If the generator is consistent with another column, then for a given value in that other column, the column that is assigned the Noise generator is always masked in exactly the same way in the destination database. For example, a field containing a salary value is assigned the Noise Generator and is consistent with the username field. For each instance of User1, the Noise Generator masks the salary value in exactly the same way.
Generates a random boolean value.
To configure the generator, in the Percent True field, enter the percentage of values to set to True
in the output.
For example, if you set this to 60
, then 60% of the output values are True
, and 40% of the output values are False
.
If you set this to 100
, then all of the output values are True
.
If you set this to 0
, then all of the output values are False
.
If is enabled, then to use it for this column, toggle Use data encryption process to the on position.
If is enabled, then to use it for this column, toggle Use data encryption process to the on position.
The provides support for additional address parts in Spark workspaces.
If is enabled, then to use it for this column, toggle Use data encryption process to the on position.
If is enabled, then to use it for this column, toggle Use data encryption process to the on position.
Consistency
Determined by the selected sub-generators.
Linking
Determined by the selected sub-generators.
Differential privacy
Determined by the selected sub-generators.
Data-free
Determined by the selected sub-generators.
Allowed for primary keys
No
Allowed for unique columns
No
Uses format-preserving encryption (FPE)
No
Privacy ranking
5
Generator ID (for the API)
Consistency
Yes, can be made self-consistent.
Linking
Yes, can be linked.
Differential privacy
No
Data-free
No
Allowed for primary keys
No
Allowed for unique columns
No
Uses format-preserving encryption (FPE)
No
Privacy ranking
3 if not consistent
4 if consistent
Generator ID (for the API)
Consistency
Yes, can be made self-consistent or consistent with another column.
Linking
No, cannot be linked.
Differential privacy
Yes, if consistency is not enabled.
Data-free
Yes, if consistency is not enabled.
Allowed for primary keys
No
Allowed for unique columns
No
Uses format-preserving encryption (FPE)
No
Privacy ranking
1 if not consistent
4 if consistent
Generator ID (for the API)
Consistency
Yes, can be made self-consistent.
Linking
No, cannot be linked.
Differential privacy
Yes, if consistency is not enabled.
Data-free
Yes, if consistency is not enabled.
Allowed for primary keys
No
Allowed for unique columns
No
Uses format-preserving encryption (FPE)
No
Privacy ranking
1 if not consistent
4 if consistent
Generator ID (for the API)
Consistency
Determined by the selected sub-generators.
Linking
Determined by the selected sub-generators.
Differential privacy
Determined by the selected sub-generators.
Data-free
Determined by the selected sub-generators.
Allowed for primary keys
No
Allowed for unique columns
No
Uses format-preserving encryption (FPE)
No
Privacy ranking
5
Generator ID (for the API)
Consistency
Yes, can be made self-consistent or consistent with another column.
Linking
No, cannot be linked.
Differential privacy
No
Data-free
No
Allowed for primary keys
No
Allowed for unique columns
No
Uses format-preserving encryption (FPE)
No
Privacy ranking
3 if not consistent
4 if consistent
Generator ID (for the API)
Consistency
No, cannot be made consistent.
Linking
No, cannot be linked.
Differential privacy
Yes
Data-free
Yes
Allowed for primary keys
No
Allowed for unique columns
Yes
Uses format-preserving encryption (FPE)
No
Privacy ranking
1
Generator ID (for the API)
Consistency
No, cannot be made consistent.
Linking
No, cannot be linked.
Differential privacy
No
Data-free
No
Allowed for primary keys
No
Allowed for unique columns
Yes
Uses format-preserving encryption (FPE)
No
Privacy ranking
6
Generator ID (for the API)
Consistency
Yes, can be made self-consistent.
Linking
No, cannot be linked.
Differential privacy
Yes, if consistency is not enabled.
Data-free
Yes, if consistency is not enabled.
Allowed for primary keys
Yes
Allowed for unique columns
Yes
Uses format-preserving encryption (FPE)
Yes
Privacy ranking
1 if not consistent
4 if consistent
Generator ID (for the API)
Consistency
No, cannot be made consistent.
Linking
No, cannot be linked.
Differential privacy
No
Data-free
No
Allowed for primary keys
No
Allowed for unique columns
No
Uses format-preserving encryption (FPE)
No
Privacy ranking
5
Generator ID (for the API)
Consistency
Determined by the selected sub-generators.
Linking
Determined by the selected sub-generators.
Differential privacy
Determined by the selected sub-generators.
Data-free
Determined by the selected sub-generators.
Allowed for primary keys
No
Allowed for unique columns
No
Uses format-preserving encryption (FPE)
No
Privacy ranking
5
Generator ID (for the API)
Consistency
Yes, can be made self-consistent
Linking
No, cannot be linked
Differential privacy
No
Data-free
No
Allowed for primary keys
No
Allowed for unique columns
No
Uses format-preserving encryption (FPE)
No
Privacy ranking
3 if not consistent
4 if consistent
Generator ID (for the API)
Consistency
Yes, can be made self-consistent or consistent with another column. Note that all Name generator columns that have the same consistency configuration are automatically consistent with each other. The columns must either be all self-consistent or all consistent with the same other column. For example, you can use this to ensure that a first name and last name column value always match the first name and last name in a full name column.
Linking
No, cannot be linked.
Differential privacy
Yes, if consistency is not enabled.
Data-free
Yes, if consistency is not enabled.
Allowed for primary keys
No
Allowed for unique columns
No
Uses format-preserving encryption (FPE)
No
Privacy ranking
1 if not consistent
4 if consistent
Generator ID (for the API)
Consistency
Yes, can be made self-consistent.
Linking
No, cannot be linked.
Differential privacy
Yes, if consistency is not enabled.
Data-free
Yes, if consistency is not enabled.
Allowed for primary keys
No
Allowed for unique columns
No
Uses format-preserving encryption (FPE)
Yes
Privacy ranking
1 if not consistent
4 if consistent
Generator ID (for the API)
Consistency
No, cannot be made consistent.
Linking
No, cannot be linked.
Differential privacy
Yes
Data-free
Yes
Allowed for primary keys
No
Allowed for unique columns
No
Uses format-preserving encryption (FPE)
No
Privacy ranking
1
Generator ID (for the API)
Generates a random IP address formatted string.
Consistency
Yes, can be made self-consistent or consistent with another column.
Linking
No, cannot be linked.
Differential privacy
Yes, if consistency is not enabled.
Data-free
Yes, if consistency is not enabled.
Allowed for primary keys
No
Allowed for unique columns
No
Uses format-preserving encryption (FPE)
No
Privacy ranking
1 if not consistent
4 if consistent
Generator ID (for the API)
To configure the generator:
In the Percent IPv4 field, type the percentage of output values that are IPv4 addresses.
For example, if you set this to 60
, then 60% of the generated IP addresses are IPv4 addresses, and 40% of the generated IP addresses are IPv6 addresses.
If you set this to 100
, then all of the generated IP addresses are IPv4 addresses.
If you set this to 0
, then all of the generated IP addresses are IPv6 addresses.
Toggle the Consistency setting to indicate whether to make the column consistent. By default, consistency is disabled.
If you enable consistency, then by default the generator is self-consistent. To make the generator consistent with another column, from the Consistent to dropdown list, select the column. When a generator is self-consistent, then a given value in the source database is always mapped to the same value in the destination database. When a generator is consistent with another column, then a given source value in that column always results in the same IP address value in the destination database. For example, an IP address column is consistent with a username column. For each instance of User1 in the source database, the value in the IP address column is the same.
If Structural data encryption is enabled, then to use it for this column, toggle Use data encryption process to the on position.
Generates a random phone number that matches the country or region of the input phone number while maintaining the format. For example, (123) 456-7890 or 123-456-7890.
If the input is not a valid phone number, the generator randomly replaces numeric characters. You can also replace invalid numbers with valid numbers.
By default, the numbers are United States phone numbers. Generated numbers pass Google's libphonenumber
verification if the input is a valid phone number or if you replace invalid numbers.
Consistency
Yes, can be made self-consistent.
Linking
No, cannot be linked.
Differential privacy
No
Data-free
No
Allowed for primary keys
No
Allowed for unique columns
No
Uses format-preserving encryption (FPE)
No
Privacy ranking
3
Generator ID (for the API)
To configure the generator:
Toggle the Replace invalid numbers setting to indicate whether to replace invalid input values with a valid output value. By default, the generator does not replace invalid values. It randomly replaces numeric characters.
Toggle the Consistency setting to indicate whether to make the generator self-consistent. By default, consistency is disabled.
If Structural data encryption is enabled, then to use it for this column, toggle Use data encryption process to the on position.
Generates unique numeric strings of the same length as the input value.
For example, for the input value 123456
, the output value would be something like 832957
.
You can apply this generator only to columns that contain numeric strings.
Consistency
Yes, can be made self-consistent.
Linking
No, cannot be linked.
Differential privacy
No
Data-free
No
Allowed for primary keys
Yes
Allowed for unique columns
Yes
Uses format-preserving encryption (FPE)
Yes
Privacy ranking
3 if not consistent
4 if consistent
Generator ID (for the API)
To configure the generator, toggle the Consistency setting to indicate whether to make the generator self-consistent.
By default, the generator is not consistent.
If Structural data encryption is enabled, then to use it for this column, toggle Use data encryption process to the on position.
This is a composite generator.
Uses regular expressions to parse strings and replace specified substrings with the output of specified generators. The parts of the string to replace are specified inside unnamed top-level capture groups.
Defining multiple expressions allows you to attach completely different sets of sub-generators to to a given cell, depending on the cell's value.
If multiple regular expressions match a given string, the regular expressions and their associated generators are applied in the order that they are specified. The first expression defined that matches has the selected sub-generators applied.
With the Replace all matches option, the Regex Mask generator behaves similarly to a traditional regex parser. It matches all occurrences of a pattern before the next pattern is encountered. For example, the pattern ^(a)$
applied to the string aaab
matches every occurrence of the letter a
, instead of just the first.
Note that for Spark-based data connectors, depending on your environment, there might be slight differences in the regular expression support.
To ensure consistent results across all data connectors, use regular expression patterns that are compatible with both Java and C#.
For more information about regular expressions in C#, go to this reference. For more information about regular expressions in Java, go to this reference.
In a cell that contains the string ProductId:123-BuyerId:234
, to mask the substrings 123
and 234
, specify the regular expression:
^ProductId:([0-9]{3})-BuyerId:([0-9]{3})$
This captures the two occurrences of three-digit numbers in the pattern ProductId:xxx-BuyerId:xxx
. This makes it possible to define a sub-generator on neither, either, or both of these captured substrings.
The following regular expression defines a broader capture that matches more cell values:
^(\w+).(\d+).(\w+).(\d+)$
This captures pairs of words ((\w+)
) and numbers ((\d+)
) if there is a single character of any value between them, instead of the relatively more specific pattern of the first expression.
Consistency
Determined by the selected sub-generators.
Linking
Determined by the selected sub-generators.
Differential privacy
Determined by the selected sub-generators.
Data-free
Determined by the selected sub-generators.
Allowed for primary keys
No
Allowed for unique columns
Yes
Uses format-preserving encryption (FPE)
No
Privacy ranking
5
Generator ID (for the API)
To add a regular expression:
Click Add Regex. On the configuration panel, Cell Value shows a sample value from the source database. You can use the previous and next options to navigate through the values.
By default, Replace all matches is enabled. To only match the first occurrence of a pattern, toggle Replace all matches to the off position.
In the Pattern field, enter a regular expression. If the expression is valid, then Tonic displays the capture groups for the expression.
For each capture group, to select and configure the generator to apply, click the selected generator. You cannot select another composite generator.
To save the configuration and immediately add a generator for another path expression, click Save and Add Another. To save the configuration and close the add generator panel, click Save.
From the Regexes list:
To edit a regex, click the edit icon.
To remove a regex, click the delete icon.
Generates values of ISO 6346 compliant shipping container codes. All generated codes are in the freight category ("U").
Consistency
Yes, can be made self-consistent or consistent with another column.
Linking
No, cannot be linked.
Differential privacy
Yes, if consistency is not enabled.
Data-free
Yes, if consistency is not enabled.
Allowed for primary keys
No
Allowed for unique columns
No
Uses format-preserving encryption (FPE)
No
Privacy ranking
1 if not consistent
4 if consistent
Generator ID (for the API)
To configure the generator, toggle the Consistency setting to indicate whether to make the generator consistent.
By default, the generator is not consistent.
If you enable consistency, then by default the generator is self-consistent. To make the generator consistent with another column, from the Consistent to dropdown list, select the column.
When the generator is self-consistent, then a given value in the source database is always mapped to the same value in the destination database.
When the generator is consistent with another column, then a given value for the other column in the source database always results in the same shipping container code value in the destination database. For example, a shipping container column is consistent with an owner column. Every instance of an owner column from the source database has the same shipping container value in the destination database.
If Structural data encryption is enabled, then to use it for this column, toggle Use data encryption process to the on position.
Returns a random integer between the specified minimum (inclusive) and maximum (exclusive).
For example, for a column that contains a percentage value, you can indicate to use a value between 0
and 101
.
Consistency
No, cannot be made consistent.
Linking
No, cannot be linked.
Differential privacy
Yes
Data-free
Yes
Allowed for primary keys
No
Allowed for unique columns
No
Uses format-preserving encryption (FPE)
No
Privacy ranking
1
Generator ID (for the API)
To configure the generator:
In the Minimum field, type the minimum value to use in the output values. The minimum value is inclusive. The output values can be that value or higher.
In the Maximum field, type the maximum value to use in the output values. The maximum value is exclusive. The output values are lower than that value.
If Structural data encryption is enabled, then to use it for this column, toggle Use data encryption process to the on position.
Generates UUIDs on primary key columns.
All foreign key columns that reference the configured column automatically have their UUID values masked.
Consistency
Yes, can be made self-consistent.
Linking
No, cannot be linked.
Differential privacy
No
Data-free
No
Allowed for primary keys
Yes
Allowed for unique columns
Yes
Uses format-preserving encryption (FPE)
Yes
Privacy ranking
3 if not consistent
4 if consistent
Generator ID (for the API)
To configure the generator:
To preserve the version and variant bits from the source UUID in the output value, toggle Preserve Version and Variant to the on position.
Toggle the Consistency setting to indicate whether to make the generator self-consistent. By default, the generator is not consistent.
If Structural data encryption is enabled, then to use it for this column, toggle Use data encryption process to the on position.
This is a substitution cipher that preserves formatting, but keeps the URL scheme and top-level domain intact.
For example, for the following input value:
http://www.example.com/products/clothes
The output value would be something like:
http://www.example.com/sowrmsl/kwctlsn
This mask is not secure.
Consistency
No, cannot be made consistent.
Linking
No, cannot be linked.
Differential privacy
No
Data-free
No
Allowed for primary keys
No
Allowed for unique columns
Yes
Uses format-preserving encryption (FPE)
No
Privacy ranking
3
Generator ID (for the API)
If Structural data encryption is enabled, then to use it for this column, toggle Use data encryption process to the on position.
This is a composite generator.
Applies selected generators to specific StructFields within a StructType in a Spark database (Databricks and Amazon EMR).
For example, for the following StructType:
To get the value of the occupation
field, you would use the expression root.occupation
.
Consistency
Determined by the selected sub-generators.
Linking
Determined by the selected sub-generators.
Differential privacy
Determined by the selected sub-generators.
Data-free
Determined by the selected sub-generators.
Allowed for primary keys
No
Allowed for unique columns
No
Uses format-preserving encryption (FPE)
No
Privacy ranking
5
Generator ID (for the API)
To assign a generator to a path expression:
Under Sub-generators, click Add Generator. On the sub-generator configuration panel, the Cell Struct field contains a sample value from the source database. You can use the previous and next icons to page through different values.
In the Path Expression field, type the path expression to identify the value to apply the generator to. Matched Struct Values shows the result from the value in Cell Struct.
From the Generator Configuration dropdown list, select the generator to apply to the path expression. You cannot select another composite generator.
Configure the selected generator. You cannot configure the selected generator to be consistent with another column.
To save the configuration and close the add generator panel, click Save.
From the Sub-Generators list:
To edit a generator assignment, click the edit icon.
To remove a generator assignment, click the delete icon.
To move a generator assignment up or down in the list, click the up or down arrow.
Generates a new valid United States Social Security Number.
You specify the percentage of values for which to include the dashes. For columns that have a numeric data type, Structural automatically excludes the dashes.
Consistency
Yes, can be made self-consistent or consistent with another column.
Linking
No, cannot be linked.
Differential privacy
Yes, if consistency is not enabled.
Data-free
Yes, if consistency is not enabled.
Allowed for primary keys
No
Allowed for unique columns
No
Uses format-preserving encryption (FPE)
No
Privacy ranking
1 if not consistent
4 if consistent
Generator ID (for the API)
To configure the generator:
In the Percent with -'s field, type the percentage of output values for which to include dashes in the format.
For example, if you set this to 60
, then 60% of the output values are formatted 123-45-6789
, and 40% are formatted 123456789
.
If you set this to 100
, then all of the output values are formatted 123-45-6789
.
If you set this to 0
, then all of the output values are formatted 12345679
.
For columns that have a numeric data type, Structural automatically generates the output values without dashes.
Toggle the Consistency setting to indicate whether to make the generator consistent. By default, consistency is disabled.
If you enable consistency, then by default the generator is self-consistent. To make the generator consistent with another column, from the Consistent to dropdown list, select the column. When a generator is self-consistent, then a given value in the source database is always mapped to the same value in the destination database. When a generator is consistent with another column, then a given value for that other column in the source database results in the same SSN in the destination database. For example, if the SSN column is consistent with a Name column, then every instance of John Smith in the source database results in the same SSN in the destination database.
If Structural data encryption is enabled, then to use it for this column, toggle Use data encryption process to the on position.
Generates unique email addresses. Replaces the username with a randomly generated GUID, and masks the domain with a character scramble.
This generator only guarantees uniqueness if the underlying column is unique.
Consistency
Yes, can be made self-consistent.
Linking
No, cannot be linked.
Differential privacy
No
Data-free
No
Allowed for primary keys
No
Allowed for unique columns
Yes
Uses format-preserving encryption (FPE)
No
Privacy ranking
3 if not consistent
4 if consistent
Generator ID (for the API)
To configure the generator:
In the Email Domain field, enter a domain to use for all of the output values.
For example, use @mycompany.com
for all of the generated values.
If you do not provide a value, then the generator uses a character scramble on the domain.
In the Excluded Email Domains field, enter a comma-separated list of domains for which email addresses are not masked in the output values. This allows you, for example, to maintain internal or testing email addresses that are not considered sensitive.
Toggle the Replace invalid emails setting to indicate whether to replace an invalid email address with a generated valid email address. By default, invalid email addresses are not replaced. In the replacement values, the username is generated. If you specify a value for Email Domain, then that value is used for the domain. Otherwise, the domain is generated.
Toggle the Consistency setting to indicate whether to make the generator self-consistent. By default, consistency is disabled.
If Structural data encryption is enabled, then to use it for this column, toggle Use data encryption process to the on position.
Generates a random hash string.
Consistency
No, cannot be made consistent.
Linking
No, cannot be linked.
Differential privacy
Yes
Data-free
Yes
Allowed for primary keys
No
Allowed for unique columns
No
Uses format-preserving encryption (FPE)
No
Privacy ranking
1
Generator ID (for the API)
If Structural data encryption is enabled, then to use it for this column, toggle Use data encryption process to the on position.
Generates random dates, times, and timestamps that fall within a specified range.
For example, you might want the output dates to all fall within a specific year or month.
Consistency
No, cannot be made consistent.
Linking
No, cannot be linked.
Differential privacy
Yes
Data-free
Yes
Allowed for primary keys
No
Allowed for unique columns
No
Uses format-preserving encryption (FPE)
No
Privacy ranking
1
Generator ID (for the API)
To configure the generator, in the Range fields, provide the start and end dates, times, or timestamps to use for the output values.
If Structural data encryption is enabled, then to use it for this column, toggle Use data encryption process to the on position.
Generates a random double number between the specified minimum (inclusive) and maximum (exclusive).
Consistency
No, cannot be made consistent.
Linking
No, cannot be linked.
Differential privacy
Yes
Data-free
Yes
Allowed for primary keys
No
Allowed for unique columns
No
Uses format-preserving encryption (FPE)
No
Privacy ranking
1
Generator ID (for the API)
To configure the generator:
In the Minimum field, type the minimum value to use in the output values. The minimum value is inclusive. The output values can be that value or higher.
In the Maximum field, type the maximum value to use in the output values. The maximum value is exclusive. The output values are lower than that value.
If Structural data encryption is enabled, then to use it for this column, toggle Use data encryption process to the on position.
Generates a column of unique integer values. The values increment by 1.
Consistency
No, cannot be made consistent.
Linking
Yes, can be linked.
Differential privacy
No
Data-free
No
Allowed for primary keys
No
Allowed for unique columns
Yes
Uses format-preserving encryption (FPE)
No
Privacy ranking
3
Generator ID (for the API)
To configure the generator:
From the Link To dropdown list, select the other columns to link to the current column. You can only select columns that also use the Sequential Integer generator.
In the Starting Point field, type the number to use as the starting point.
By default, the starting point is 0
. This means that the column value in the first processed row is 0
. The value in the next processed row is 1
. The generator continues to increment the value by 1 in each row that it processes.
If Structural data encryption is enabled, then to use it for this column, toggle Use data encryption process to the on position.
Generates a new valid Canadian Social Insurance Number that preserves the formatting of the original value.
For example, the original value might be 123456789
, 123 456 789
, or 123-456-789
. The output value uses the same format.
Consistency
Yes, can be made self-consistent.
Linking
No, cannot be linked.
Differential privacy
No, cannot be made differentially private.
Data-free
Yes, if consistency is not enabled.
Allowed for primary keys
No
Allowed for unique columns
Yes
Uses format-preserving encryption (FPE)
Yes
Privacy ranking
1 if not consistent
4 if consistent
Generator ID (for the API)
To configure the generator, toggle the Consistency setting to indicate whether to make the generator self-consistent.
By default, the generator is not consistent.
If Structural data encryption is enabled, then to use it for this column, toggle Use data encryption process to the on position.
Generates a random new UUID string.
Consistency
No, cannot be made consistent.
Linking
No, cannot be linked.
Differential privacy
Yes
Data-free
Yes
Allowed for primary keys
No
Allowed for unique columns
Yes
Uses format-preserving encryption (FPE)
No
Privacy ranking
1
Generator ID (for the API)
If Structural data encryption is enabled, then to use it for this column, toggle Use data encryption process to the on position.
Shifts timestamps by a random amount of a specific unit of time within a set range.
For date-only values, the Timestamp Shift Generator supports the following date formats. The example values are all for February 23, 2021.
MM/dd/yyyy
- 02/23/2021
MM/dd/yy
- 02/23/21
MM-dd-yyyy
- 02-23-2021
yyyyMMdd
- 20210223
yyyy/MM/dd
- 2021/02/23
MMddyyyy
- 02232021
To configure the generator:
From the Date Part dropdown list, select the unit of time to use for the minimum and maximum shift.
In the Minimum Shift field, type the minimum amount the value can be shifted from the original value.
Use negative numbers to indicate to shift the date to the past.
For example, assume that the date part is Day. -3
indicates that the day cannot be shifted earlier than 3 days before the original day. 3
indicates that the date cannot be shifted earlier than 3 days after the original day.
In the Maximum Shift field, type the maximum amount by which the value can be shifted from the original value.
For example, assume that the date part is Day. 5
indicates that the date cannot be shifted later than 5 days after the original day.
Toggle the Consistency setting to indicate whether to make the generator consistent. By default, consistency is disabled.
If you enable consistency, then by default the generator is self-consistent. To make the generator consistent with another column, from the Consistent to dropdown list, select the column. When a column is consistent with itself, then the same date part value is always shifted by the same amount.
When a column is consistent with another column, then for the same value in the other column, the date part value is always shifted by the same amount. For example, for the same value of username, the birthdate column value is always shifted by the same amount.
If multiple columns that use the Timestamp Shift generator are consistent with the same other column, then for those columns, the date part value shifts by the same amount. For example, the startdate
and enddate
columns are both consistent with the username
column. Both startdate
and enddate
use the Timestamp Shift generator. For the same value of username
, both startdate
and enddate
are shifted by the same amount.
This is a .
Runs a selected generator on values that match a user specified path expression.
Path expressions are defined using the .
For example, for the following XML content:
To get the first_name
value, you would use /household/member/first_name
.
You can also select a fallback generator to run on the entire XML value if there is any error during data generation.
To assign a generator to a path expression:
Under Sub-generators, click Add Generator. On the sub-generator configuration panel, the Cell XML field contains a sample value from the source database. You can use the previous and next icons to page through different values.
In the Path Expression field, type the path expression to identify the value to apply the generator to. Matched XML Values shows the result from the value in Cell XML.
From the Generator Configuration dropdown list, select the generator to apply to the value at the path expression. You cannot select another composite generator.
Configure the selected generator. You cannot configure the selected generator to be consistent with another column.
To save the configuration and immediately add a generator for another path expression, click Save and Add Another. To save the configuration and close the add generator panel, click Save.
From the Sub-Generators list:
To edit a generator assignment, click the edit icon.
To remove a generator assignment, click the delete icon.
To move a generator assignment up or down in the list, click the up or down arrow.
From the Fallback Generator dropdown list, select the generator to use if any error occurs in the generation.
The fallback generator is then used for the entire XML value.
The options are:
If is enabled, then to use it for this column, toggle Use data encryption process to the on position.
Consistency
Yes, can be made self-consistent or consistent with another column.
Linking
No, cannot be linked.
Differential privacy
No
Data-free
No
Allowed for primary keys
No
Allowed for unique columns
No
Uses format-preserving encryption (FPE)
No
Privacy ranking
3 if not consistent
4 if consistent
Generator ID (for the API)
Consistency
Determined by the selected sub-generators.
Linking
Determined by the selected sub-generators.
Differential privacy
Determined by the selected sub-generators.
Data-free
Determined by the selected sub-generators.
Allowed for primary keys
No
Allowed for unique columns
No
Uses format-preserving encryption (FPE)
No
Privacy ranking
5
Generator ID (for the API)
Truncates a date value or a timestamp to a specific part.
For a date or a timestamp, you can truncate to the year, month, or day.
For a timestamp, you can also truncate to the hour, minute, or second.
Consistency
No, cannot be made consistent.
Linking
No, cannot be linked.
Differential privacy
No
Data-free
No
Allowed for primary keys
No
Allowed for unique columns
No
Uses format-preserving encryption (FPE)
No
Privacy ranking
5
Generator ID (for the API)
To configure the generator:
From the dropdown list, select the part of the date or timestamp to truncate to. For both date and timestamp values, you can truncate to the year, month, or day. When you select one of these options, the time portion of a timestamp is set to 00:00:00. For the date, the values below the selected truncation value are set to 01. For example, when you truncate to month, the day value is set to 01, and the timestamp is set to 00:00:00. For a timestamp value, you also can truncate to the hour, minute, or second. The date values remain the same as the original data. The time values below the selected truncation value are set to 00. For example, when you truncate to minute, the seconds value is set to 00.
Toggle the Birth Date option. When you enable Birth Date, the generator shifts dates that are more than 90 years before the generation date to the date exactly 90 years before the generation date. For example, a generation occurs on January 1, 2023. Any date that occurs before January 1, 1933 is changed to January 1, 1933.
This is mostly intended for birthdate values, to group birthdates for everyone who is older than 89 into a single year. This is used to comply with HIPAA Safe Harbor.
If Structural data encryption is enabled, then to use it for this column, toggle Use data encryption process to the on position.
Here are examples of date and time values and how the selected truncation affects the output:
Original value
2021-12-20
2021-12-20 13:42:55
Truncate to year
2021-01-01
2021-01-01 00:00:00
Truncate to month
2021-12-01
2021-12-01 00:00:00
Truncate to day
2021-12-20
2021-12-20 00:00:00
Truncate to hour
Not applicable
2021-12-20 13:00:00
Truncate to minute
Not applicable
2021-12-20 13:42:00
Truncate to second
Not applicable
2021-12-20 13:42:55