Generator summary

The following table summarizes the available generators. It indicates whether each generator can be made consistent, can be linked, and is differentially private.

In the Consistency column, the table also indicates whether the generator can be made self-consistent only, or can be made either self-consistent or consistent with another column.

The Description column includes:

  • For generators that can be data-free, whether the generator is always data-free, or only data-free when consistency is disabled.

  • The possible privacy rankings for the generator. For details about the available privacy rankings, go to Privacy ranking.
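As a reading aid, the configuration-dependent privacy rankings listed in the table can be modeled as a small lookup. This is only an illustrative sketch: the `RankingRule` class, its field names, and the three constants are hypothetical and are not part of any product API. The numeric values are copied from three ranking patterns that appear in the table.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RankingRule:
    """Hypothetical model of one 'Privacy ranking' entry from the table."""

    default: int                                     # ranking with no option enabled
    if_consistent: Optional[int] = None              # ranking when consistency is enabled
    if_differentially_private: Optional[int] = None  # ranking when differential privacy is enabled

    def ranking(self, consistent: bool = False, differential_privacy: bool = False) -> int:
        # Return the option-specific ranking when the entry defines one,
        # otherwise fall back to the default ranking.
        if consistent and self.if_consistent is not None:
            return self.if_consistent
        if differential_privacy and self.if_differentially_private is not None:
            return self.if_differentially_private
        return self.default


# Three patterns taken from the table (the constant names are made up for this sketch).
DATA_FREE_UNLESS_CONSISTENT = RankingRule(default=1, if_consistent=4)  # "1 if not consistent, 4 if consistent"
SCRAMBLE_STYLE = RankingRule(default=3, if_consistent=4)               # "3 if not consistent, 4 if consistent"
DISTRIBUTION_STYLE = RankingRule(default=3, if_differentially_private=2)  # "2 if differential privacy enabled, 3 if not"
```

For example, `DATA_FREE_UNLESS_CONSISTENT.ranking(consistent=True)` evaluates to 4, which matches the rows that are data-free only when consistency is disabled.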

Generator | Description | Consistency | Linking | Differential Privacy

Generates a random string to replace a specific part of a mailing address. Data-free if not consistent. Privacy ranking: - 1 if not consistent - 4 if consistent

Yes - Self or other

Yes

Yes if not consistent

Uses deep neural networks for high-fidelity data mimicking. By default, not available. Privacy ranking: 3

No

No

No

Identifies the algebraic relationship between 3 or more numeric values (at least one non-integer) and generates new values to match. Privacy ranking: 3

No

Yes

No

Generates unique alphanumeric strings of the same length as the input. Privacy ranking: - 3 if not consistent - 4 if consistent

Yes - Self

No

No

Within an array, replaces letters with random other letters, and numbers with random other numbers. Privacy ranking: - 3 if not consistent - 4 if consistent

Yes - Self

No

No

Runs a selected generator on values that match a user-specified JSONPath. Privacy ranking: 5

--

--

--

Runs a selected generator on values that match a regular expression. Privacy ranking: 5

--

--

--

Generates unique alphanumeric strings based on any printable ASCII characters. Privacy ranking: - 3 if not consistent - 4 if consistent

Yes - Self

No

No

Generates a random company name-like string.

Data-free if not consistent. Privacy ranking: - 1 if not consistent - 4 if consistent

Yes - Self or other

No

Yes if not consistent

Creates values at the same frequency as the values in the underlying data. Privacy ranking: - 2 if differential privacy enabled - 3 if differential privacy not enabled

No

Yes

Configurable

Replaces letters with random other letters and numbers with random other numbers. Privacy ranking: - 3 if not consistent - 4 if consistent

Yes - Self

No

No

Replaces characters randomly, but preserves formatting. Privacy ranking: 4

Yes - Implicitly consistent

No

No

Company Name (Deprecated)

This generator is deprecated. Use the Business Name generator instead. Generates a random company name-like string. Data-free if not consistent. Privacy ranking: - 1 if not consistent - 4 if consistent

Yes - Self or other

No

Yes if not consistent

Applies different generators to rows conditionally based on any value in the table. Privacy ranking: 5 if no fallback generator is selected. If a fallback generator is selected, the lower of 5 and the ranking of the fallback generator.

No

No

No

Uses a single specified value to mask all values in the column. Data-free. Privacy ranking: 1

No

No

Yes

Generates a continuous distribution to fit the underlying data. Privacy ranking: - 2 if differential privacy enabled - 3 if differential privacy not enabled

No

Yes

Configurable

Populates the column using the sum of the values in other columns. Privacy ranking: 3

No

No

No

Masks a text column.

Parses the text as a row for which the columns are delimited by a specified character. Privacy ranking: 5

--

--

--

Selects from values you provide. Data-free if not consistent. Privacy ranking: - 1 if not consistent - 4 if consistent

Yes - Self or other

No

Yes if not consistent

Truncates dates or timestamps to a specific date or time part. Privacy ranking: 5

No

No

No

Scrambles characters in an email address.

Preserves the formatting and keeps the @ and . characters. Privacy ranking: - 3 if not consistent - 4 if consistent

Yes - Self

No

No

Generates timestamps that fit an event distribution. Privacy ranking: 3

No

Yes

No

Scrambles characters in a file name.

Preserves the formatting and the file extension. Privacy ranking: - 3 if not consistent - 4 if consistent

Yes - Self

No

No

Replaces all instances of the find string with the replace string. Privacy ranking: 5

No

No

No

Transforms Norwegian national identity numbers. Privacy ranking: - 3 if not consistent - 4 if consistent

Yes - Self or other

No

No

Masks columns that contain latitude and longitude values. Privacy ranking: 3

No

No

No

Generates cities, states, zip codes, and latitude/longitude values that follow the HIPAA Safe Harbor guidelines. Privacy ranking: - 3 if not consistent - 4 if consistent

Yes - Self

No

No

Generates random host names, based on the English language. Data-free if not consistent. Privacy ranking: - 1 if not consistent - 4 if consistent

Yes - Self or other

No

Yes if not consistent

Runs selected generators on specified key values in an HStore column in a PostgreSQL database. Privacy ranking: 5

--

--

--

Masks text columns.

Parses the contents as HTML, and applies sub-generators to the specified path expressions. Privacy ranking: 5

--

--

--

Generates unique integer values.

By default, the generated values are within the range of the column’s data type.

You can also specify a range for the generated values. The source values must be within that range. Data-free if not consistent. Privacy ranking: - 1 if not consistent - 4 if consistent

Yes - Self

No

Yes if not consistent

Generates a random IP address-formatted string. Data-free if not consistent. Privacy ranking: - 1 if not consistent - 4 if consistent

Yes - Self or other

No

Yes if not consistent

Runs a generator on values that match a user-specified JSONPath. Privacy ranking: 5

--

--

--

Generates a random MAC address-formatted string. Data-free if not consistent. Privacy ranking: - 1 if not consistent - 4 if consistent

Yes - Self

No

Yes if not consistent

Generates unique MongoDB objectId values. Privacy ranking: - 3 if not consistent - 4 if consistent

Yes - Self

No

No

Generates a random name string from a dictionary of first and last names. Data-free if not consistent. Privacy ranking: - 1 if not consistent - 4 if consistent

Yes - Self or other

No

Yes if not consistent

Masks values in numeric columns.

Adds random noise to the original value, or multiplies the original value by random noise. Privacy ranking: - 3 if not consistent - 4 if consistent

Yes - Self or other

No

No

Generates NULL values to fill the rows of the specified column. Data-free. Privacy ranking: 1

No

No

Yes

Generates unique numeric strings of the same length as the input. Privacy ranking: - 3 if not consistent - 4 if consistent

Yes - Self

No

No

Default generator. Does not perform any action on the source data. Privacy ranking: 6

No

No

No

Generates a random phone number that matches the country or region and format of the input phone number. Privacy ranking: 3

Yes - Self

No

No

Generates a random boolean value. Data-free. Privacy ranking: 1

No

No

Yes

Generates a random double number between the specified min and max. Data-free. Privacy ranking: 1

No

No

Yes

Generates a random hash string. Data-free. Privacy ranking: 1

No

No

Yes

Returns a random integer between the specified min and max. Data-free. Privacy ranking: 1

No

No

Yes

Generates random dates, times, and timestamps. Data-free. Privacy ranking: 1

No

No

Yes

Generates a random new UUID string. Data-free. Privacy ranking: 1

No

No

Yes

Uses regular expressions to parse strings.

Replaces specified substrings with output from selected sub-generators. Privacy ranking: 5

--

--

--

Generates a column of unique integer values that start with a specified value and increment by 1. Privacy ranking: 3

No

Yes

No

Generates values of ISO 6346 compliant shipping container codes. Data-free if not consistent. Privacy ranking: - 1 if not consistent - 4 if consistent

Yes - Self or other

No

Yes if not consistent

Generates a new valid Canadian Social Insurance Number. Data-free if not consistent. Privacy ranking: - 1 if not consistent - 4 if consistent

Yes - Self

No

Yes if not consistent

Generates a new valid United States Social Security Number. Data-free if not consistent. Privacy ranking: - 1 if not consistent - 4 if consistent

Yes - Self or other

No

Yes if not consistent

Applies other generators to specific StructFields within a StructType in Spark databases (Databricks and Amazon EMR). Privacy ranking: 5

--

--

--

Shifts timestamps by a random amount of a specific unit of time, within a set range. Privacy ranking: - 3 if not consistent - 4 if consistent

Yes - Self or other

No

No

Generates unique email addresses.

Replaces the username with a randomly generated GUID, and masks the domain with a character scramble. Privacy ranking: - 3 if not consistent - 4 if consistent

Yes - Self

No

No

Applies a substitution cipher that preserves formatting and keeps the URL scheme and top-level domain intact. Privacy ranking: 3

No

No

No

Generates UUIDs on primary key columns. Privacy ranking: - 3 if not consistent - 4 if consistent

Yes - Self

No

No

Runs a selected generator on values that match a user-specified XPath. Privacy ranking: 5

--

--

--
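Several rows above describe character-level scrambling ("replaces letters with random other letters, and numbers with random other numbers") that can optionally be made self-consistent. The sketch below shows one way such behavior could work. It is not the product's implementation. The function name `character_scramble`, the `secret` parameter, the HMAC-based seeding used for consistency, and the choice to preserve letter case are all assumptions made for illustration.

```python
import hashlib
import hmac
import random
import string


def character_scramble(value: str, consistent: bool = False,
                       secret: bytes = b"demo-secret") -> str:
    """Replace each ASCII letter with a different letter and each digit with a different digit.

    When consistent=True, the random source is seeded from an HMAC of the input,
    so the same input always produces the same output (assumed approach).
    """
    if consistent:
        seed = hmac.new(secret, value.encode("utf-8"), hashlib.sha256).digest()
        rng = random.Random(seed)
    else:
        rng = random.Random()

    out = []
    for ch in value:
        if ch.isascii() and ch.isalpha():
            # Keep the case, but never return the original letter.
            pool = string.ascii_lowercase if ch.islower() else string.ascii_uppercase
            out.append(rng.choice(pool.replace(ch, "")))
        elif ch.isascii() and ch.isdigit():
            # Never return the original digit.
            out.append(rng.choice(string.digits.replace(ch, "")))
        else:
            # Punctuation, whitespace, and non-ASCII characters pass through unchanged.
            out.append(ch)
    return "".join(out)
```

Calling `character_scramble("Ab1-Zz9", consistent=True)` twice returns the same string both times, while calls with `consistent=False` generally differ from one run to the next. In both modes the output keeps the length of the input and leaves the hyphen in place.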
