Generator summary

The following table summarizes the available generators. The table includes generator characteristics that you might take into account when you select the generator to use for a column. Generator hints and tips also provides some suggestions for generators to use for specific use cases.

Information in the table

The generator summary includes the following columns:

  • Generator - The name of the generator, linked to the entry in the generator reference.

  • Description - An overview description of the generator. Also includes:

    • For generators that can be data-free, whether the generator is always data-free, or only data-free when consistency is disabled. Data-free means that the output data is completely unrelated to the source data. For more information, go to Data-free generators.

    • The possible privacy rankings for the generator. The privacy ranking indicates the level of protection that the generator provides. For information about the available privacy rankings, go to Privacy ranking.

  • Consistency - Whether the generator supports consistency. Consistency bases the output values on the source values. For generators that support consistency, the Consistency column also indicates the supported type of consistency - self consistency or consistency with another column. For more information, go to Enabling consistency.

  • Linking - Whether you can link columns that are assigned the generator. When you link columns, you indicate that the columns have a relationship. For more information, go to Linking generators.

  • Differential Privacy - Whether the generator supports differential privacy, which ensures that the source value cannot be reverse engineered from the output value. For more information, go to Differential privacy.
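
The difference between a consistent generator and a data-free generator can be illustrated with a small sketch. This is not the product's implementation. It assumes that consistency means the same source value always produces the same output value, and it uses a keyed hash to get that behavior. The names `NAMES`, `consistent_name`, `data_free_name`, and the `secret` parameter are hypothetical and exist only for this illustration.

```python
import hashlib
import hmac
import random

# Hypothetical replacement dictionary used only for this illustration.
NAMES = ["Avery", "Blake", "Casey", "Devon", "Emery", "Finley"]


def consistent_name(key_value: str, secret: bytes = b"demo-secret") -> str:
    """Map a key value to a replacement name deterministically.

    The same key value always yields the same output, which is the
    assumed meaning of "consistent" in this sketch.
    """
    digest = hmac.new(secret, key_value.encode("utf-8"), hashlib.sha256).digest()
    index = int.from_bytes(digest[:8], "big") % len(NAMES)
    return NAMES[index]


def data_free_name(rng: random.Random) -> str:
    """Pick a replacement name without reading the source value at all."""
    return rng.choice(NAMES)


rng = random.Random()
source_column = ["Alice", "Bob", "Alice"]

# Self consistency: the output depends on the column's own source value.
consistent_output = [consistent_name(value) for value in source_column]

# Data-free: the output is unrelated to the source data.
data_free_output = [data_free_name(rng) for _ in source_column]

print(consistent_output)
print(data_free_output)
```

In this sketch, the first and third entries of `consistent_output` always match because both come from the same source value, while `data_free_output` carries no such guarantee. To imitate consistency with another column, you would pass that other column's value as `key_value` instead of the column's own value.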

Generator | Description | Consistency | Linking | Differential Privacy

Generates a random string to replace a specific part of a mailing address. Data-free if not consistent. Privacy ranking: - 1 if not consistent - 4 if consistent

Yes - Self or other

Yes

Yes if not consistent

Uses deep neural networks for high-fidelity data mimicking. By default, not available. Privacy ranking: 3

No

No

No

Identifies the algebraic relationship between 3 or more numeric values (at least one non-integer) and generates new values to match. Privacy ranking: 3

No

Yes

No

Generates unique alphanumeric strings of the same length as the input. Privacy ranking: - 3 if not consistent - 4 if consistent

Yes - Self

No

No

Within an array, replaces letters with random other letters, and numbers with random other numbers. Privacy ranking: - 3 if not consistent - 4 if consistent

Yes - Self

No

No

Runs a selected generator on values that match a user-specified JSONPath. Privacy ranking: 5

--

--

--

Runs a selected generator on values that match a regular expression. Privacy ranking: 5

--

--

--

Generates unique alphanumeric strings based on any printable ASCII characters. Privacy ranking: - 3 if not consistent - 4 if consistent

Yes - Self

No

No

Generates a random company name-like string.

Data-free if not consistent. Privacy ranking: - 1 if not consistent - 4 if consistent

Yes - Self or other

No

Yes if not consistent

Creates values at the same frequency as the values in the underlying data. Privacy ranking: - 2 if differential privacy enabled - 3 if differential privacy not enabled

No

Yes

Configurable

Replaces letters with random other letters and numbers with random other numbers. Privacy ranking: - 3 if not consistent - 4 if consistent

Yes - Self

No

No

Replaces characters randomly, but preserves formatting. Privacy ranking: 4

Yes - Implicitly consistent

No

No

Company Name (Deprecated)

This generator is deprecated. Use the Business Name generator instead. Generates a random company name-like string. Data-free if not consistent. Privacy ranking: - 1 if not consistent - 4 if consistent

Yes - Self or other

No

Yes if not consistent

Applies different generators to rows conditionally based on any value in the table. Privacy ranking: - If a fallback generator is selected, the lower of 5 and the privacy ranking of the fallback generator - 5 if no fallback generator is selected

No

No

No

Uses a single specified value to mask all values in the column. Data-free. Privacy ranking: 1

No

No

Yes

Generates a continuous distribution to fit the underlying data. Privacy ranking: - 2 if differential privacy enabled - 3 if differential privacy not enabled

No

Yes

Configurable

Populates the column using the sum of the values in other columns. Privacy ranking: 3

No

No

No

Masks a text column.

Parses the text as a row for which the columns are delimited by a specified character. Privacy ranking: 5

--

--

--

Selects from values you provide. Data-free if not consistent. Privacy ranking: - 1 if not consistent - 4 if consistent

Yes - Self or other

No

Yes if not consistent

Truncates dates or timestamps to a specific date or time part. Privacy ranking: 5

No

No

No

Scrambles characters in an email address.

Preserves the formatting and keeps the @ and . characters. Privacy ranking: - 3 if not consistent - 4 if consistent

Yes - Self

No

No

Generates timestamps that fit an event distribution. Privacy ranking: 3

No

Yes

No

Scrambles characters in a file name.

Preserves the formatting and the file extension. Privacy ranking: - 3 if not consistent - 4 if consistent

Yes - Self

No

No

Replaces all instances of the find string with the replace string. Privacy ranking: 5

No

No

No

Transforms Norwegian national identity numbers. Privacy ranking: - 3 if not consistent - 4 if consistent

Yes - Self or other

No

No

Masks columns that contain latitude and longitude values. Privacy ranking: 3

No

No

No

Can be used to generate cities, states, zip codes, and latitude/longitude values that follow HIPAA guidelines for safe harbor. Privacy ranking: - 3 if not consistent - 4 if consistent

Yes - Self

No

No

Generates random host names, based on the English language. Data-free if not consistent. Privacy ranking: - 1 if not consistent - 4 if consistent

Yes - Self or other

No

Yes if not consistent

Runs selected generators on specified key values in an HStore column in a PostgreSQL database. Privacy ranking: 5

--

--

--

Masks text columns.

Parses the contents as HTML, and applies sub-generators to the specified path expressions. Privacy ranking: 5

--

--

--

Generates unique integer values.

By default, the generated values are within the range of the column’s data type.

You can also specify a range for the generated values. The source values must be within that range. Data-free if not consistent. Privacy ranking: - 1 if not consistent - 4 if consistent

Yes - Self

No

Yes if not consistent

For Canadian mailing addresses, can generate:

  • Street name

  • Postal code

For United Kingdom (UK) mailing addresses, can generate postal codes.

Privacy ranking: - 1 if not consistent - 4 if consistent

Yes - Self

No

No

Generates a random IP address-formatted string. Data-free if not consistent. Privacy ranking: - 1 if not consistent - 4 if consistent

Yes - Self or other

No

Yes if not consistent

Runs a generator on values that match a user-specified JSONPath. Privacy ranking: 5

--

--

--

Generates a random MAC address-formatted string. Data-free if not consistent. Privacy ranking: - 1 if not consistent - 4 if consistent

Yes - Self

No

Yes if not consistent

Generates unique MongoDB objectId values. Privacy ranking: - 3 if not consistent - 4 if consistent

Yes - Self

No

No

Generates a random name string from a dictionary of first and last names. Data-free if not consistent. Privacy ranking: - 1 if not consistent - 4 if consistent

Yes - Self or other

No

Yes if not consistent

Masks values in numeric columns.

Adds random noise to the original value, or multiplies the original value by random noise. Privacy ranking: - 3 if not consistent - 4 if consistent

Yes - Self or other

No

No

Generates NULL values to fill the rows of the specified column. Data-free. Privacy ranking: 1

No

No

Yes

Generates unique numeric strings of the same length as the input. Privacy ranking: - 3 if not consistent - 4 if consistent

Yes - Self

No

No

Default generator. Does not perform any action on the source data. Privacy ranking: 6

No

No

No

Generates a random phone number that matches the country or region and format of the input phone number. Privacy ranking: 3

Yes - Self

No

No

Generates a random boolean value. Data-free. Privacy ranking: 1

No

No

Yes

Generates a random double number between the specified min and max. Data-free. Privacy ranking: 1

No

No

Yes

Generates a random hash string. Data-free. Privacy ranking: 1

No

No

Yes

Returns a random integer between the specified min and max. Data-free. Privacy ranking: 1

No

No

Yes

Generates random dates, times, and timestamps. Data-free. Privacy ranking: 1

No

No

Yes

Generates a random new UUID string. Data-free. Privacy ranking: 1

No

No

Yes

Uses regular expressions to parse strings.

Replaces specified substrings with output from selected sub-generators. Privacy ranking: 5

--

--

--

Generates a column of unique integer values that start with a specified value and increment by 1. Privacy ranking: 3

No

Yes

No

Generates values of ISO 6346 compliant shipping container codes. Data-free if not consistent. Privacy ranking: - 1 if not consistent - 4 if consistent

Yes - Self or other

No

Yes if not consistent

Generates a new valid Canadian Social Insurance Number. Data-free if not consistent. Privacy ranking: - 1 if not consistent - 4 if consistent

Yes - Self

No

Yes if not consistent

Generates a new valid United States Social Security Number. Data-free if not consistent. Privacy ranking: - 1 if not consistent - 4 if consistent

Yes - Self or other

No

Yes if not consistent

Can apply other generators on specific StructFields within a StructType in Spark databases (Databricks and Amazon EMR). Privacy ranking: 5

--

--

--

Shifts timestamps by a random amount of a specific unit of time, within a set range. Privacy ranking: - 3 if not consistent - 4 if consistent

Yes - Self or other

No

No

Generates unique email addresses.

Replaces the username with a randomly generated GUID, and masks the domain with a character scramble. Privacy ranking: - 3 if not consistent - 4 if consistent

Yes - Self

No

No

A substitution cipher that preserves formatting and keeps the URL scheme and top-level domain intact. Privacy ranking: 3

No

No

No

Generates UUIDs on primary key columns. Privacy ranking: - 3 if not consistent - 4 if consistent

Yes - Self

No

No

Runs a selected generator on values that match a user-specified XPath. Privacy ranking: 5

--

--

--
