Generator summary
The following table summarizes the available generators. The table includes generator characteristics that you might take into account when you select the generator to use for a column.
For suggestions on which generators to use for specific use cases, see Generator hints and tips.
Generator | Description | Supported features |
---|---|---|
Address API: AddressGenerator | Generates replacement values for U.S. mailing addresses. You select the address component or format for the replacement values. For example, the column might only contain a street address or a postal code, or it might contain a full address. | Consistency - Self and other Linkable Differential privacy if not consistent Data-free if not consistent Privacy ranking: - 1 if not consistent - 4 if consistent |
| | Identifies the algebraic relationship between 3 or more numeric values, including at least one non-integer. Based on the relationship, generates new values to match. If there is no relationship, uses the Categorical generator. | Linkable - linking is required Privacy ranking: 3 |
| | Generates unique alphanumeric strings of the same length as the input. For example, for the origin value | Consistency - Self only Primary key generator Unique columns allowed Format-preserving encryption (FPE) Privacy ranking: - 3 if not consistent - 4 if consistent |
| | Within an array, replaces letters with random other letters, and numbers with random other numbers. Preserves punctuation and whitespace. | Consistency - Self only Privacy ranking: - 3 if not consistent - 4 if consistent |
| | Used to transform array values in JSON. To identify values to transform, you provide a list of JSONPaths. For each JSONPath, you assign a sub-generator to apply to matching values. | Composite generator. Feature support is based on the sub-generators. Privacy ranking: 5 |
| | Used to transform values in an array. To identify values to transform, you provide a regular expression. For each capture group in an expression, you assign a sub-generator to apply to matching values. | Composite generator. Feature support is based on the sub-generators. Privacy ranking: 5 |
| | Generates unique alphanumeric strings based on any printable ASCII characters. You can optionally exclude lowercase letters from the generated values. The replacement value does not preserve the length of the original value. | Consistency - Self only Primary key generator Unique columns allowed Format-preserving encryption (FPE) Privacy ranking: - 3 if not consistent - 4 if consistent |
| | Generates a random company name-like string. | Consistency - Self or other Differential privacy if not consistent Data-free if not consistent Privacy ranking: - 1 if not consistent - 4 if consistent |
| | Shuffles the original values for a column to different rows. Maintains the overall frequency of each value. For example, a column contains the values | Linkable Differential privacy is configurable Privacy ranking: - 2 with differential privacy - 3 without differential privacy |
| | Replaces letters with random other letters and numbers with random other numbers. Preserves punctuation, whitespace, and mathematical symbols. | Consistency - Self only Privacy ranking: - 3 if not consistent - 4 if consistent |
| | Replaces characters with other random characters. Preserves punctuation, capitalization, and whitespace. A replacement character is always from within the same Unicode Block as the source character. A source character is always mapped to the same destination character. For example, | Always self-consistent Unique columns allowed Privacy ranking: 4 |
Company Name (Deprecated) API: CompanyNameGenerator | This generator is deprecated. Use the Business Name generator instead. Generates a random company name-like string. | Consistency - Self or other Differential privacy if not consistent Data-free if not consistent Privacy ranking: - 1 if not consistent - 4 if consistent |
| | Applies different generators to rows conditionally based on the column value. For example, apply the Character Scramble generator for values other than Test. You configure a list of conditions. Each condition performs a check against the column value. For each condition, you assign a sub-generator to apply to matching values. | Unique columns allowed Composite generator. Other feature support is based on the sub-generators. Privacy ranking: - If a fallback generator is selected, the lower of 5 or the ranking of the fallback generator - 5 if no fallback generator is selected |
| | Uses a single specified value to replace all of the values in the column. The replacement value must be compatible with the column data type. | Differential privacy Data-free Privacy ranking: 1 |
| | Generates a continuous distribution to fit the underlying data. Can link to other columns to create multivariate distributions. Can also be partitioned by other columns. | Linkable Differential privacy is configurable Privacy ranking: - 2 with differential privacy - 3 without differential privacy |
| | Populates the column using the sum of values from a column in another table. To select the rows to use, uses a foreign key value that matches the primary key value for the current row. For example, to transform the Total_Sales column in the Customers table, from the Transactions table, use the sum of the Amount values for rows where the Customer_ID value matches the primary key value for the current customer. | Privacy ranking: 3 |
CSV Mask API: CsvMaskGenerator | Used to mask text in a delimited format. Parses the text as a row where the columns are delimited by a specified character. For each index, you assign a sub-generator to apply to the index value. | Composite generator. Feature support is based on the sub-generators. Privacy ranking: 5 |
| | Replaces the original column value with a value from a list of values that you provide. | Consistency - Self and other Linkable Differential privacy if not consistent Data-free if not consistent Privacy ranking: - 1 if not consistent - 4 if consistent |
| | Truncates dates or timestamps to a specific date or time component. For example, you might truncate a date value to the month or a timestamp to the hour. | Privacy ranking: 5 |
Email API: EmailGenerator | Scrambles characters in an email address. Preserves the formatting and keeps the | Consistency - Self only Privacy ranking: - 3 if not consistent - 4 if consistent |
| | Generates timestamps that fit an event distribution. You can link columns to create a sequence of events across multiple columns. You can also partition the generator by other columns. | Linkable Privacy ranking: 3 |
| | Scrambles characters in a file name. Preserves the formatting and the file extension. | Consistency - Self only Privacy ranking: - 3 if not consistent - 4 if consistent |
| | Replaces all instances of the find string with the replace string. For the find string, you can optionally provide a regular expression. | Privacy ranking: 5 |
FNR API: FnrGenerator | Transforms Norwegian national identity numbers. You can optionally preserve the gender and birthdate portions of the identifier values. | Consistency - Self and other Unique columns allowed Privacy ranking: - 3 if not consistent - 4 if consistent |
Geo API: GeoGenerator | Used to transform columns that contain latitude and longitude values. | Linkable Unique columns allowed Privacy ranking: 3 |
| | Can be used to generate cities, states, zip codes, and latitude/longitude values that follow HIPAA guidelines for safe harbor. | Consistency - Self only Privacy ranking: - 3 if not consistent - 4 if consistent |
| | Generates random host names, based on the English language. | Consistency - Self and other Differential privacy if not consistent Data-free if not consistent Privacy ranking: - 1 if not consistent - 4 if consistent |
| | Used to transform values in an HStore column in a PostgreSQL database. You specify a list of keys for which to transform the values. For each key, you assign a generator to apply to the key value. | Composite generator. Feature support is based on the sub-generators. Privacy ranking: 5 |
| | Used to transform columns that contain HTML content. To identify the values to transform, you provide a list of path expressions. For each path expression, you assign a generator to apply to the matching value. | Composite generator. Feature support is based on the sub-generators. Privacy ranking: 5 |
| | Generates unique integer values. By default, the generated values are within the range of the column’s data type. You can also specify a range for the generated values. The source values must be within that range. | Differential privacy if not consistent Data-free if not consistent Primary key generator Unique columns allowed Format-preserving encryption (FPE) Privacy ranking: - 1 if not consistent - 4 if consistent |
| | For Canadian mailing addresses, can generate: For United Kingdom (UK) mailing addresses, can generate postal codes. | Consistency - Self only Differential privacy if not consistent Data-free if not consistent Privacy ranking: - 1 if not consistent - 4 if consistent |
| | Generates a random IP address-formatted string. You specify the percentage of IPv4 addresses. The remaining addresses are IPv6. | Consistency - Self or other Differential privacy if not consistent Data-free if not consistent Privacy ranking: - 1 if not consistent - 4 if consistent |
| | Used to transform values in JSON columns. To identify values to transform, you provide a list of JSONPaths. For each JSONPath, you assign a sub-generator to apply to matching values. | Composite generator. Feature support is based on the sub-generators. Privacy ranking: 5 |
| | Generates a random MAC address-formatted string. | Consistency - Self only Differential privacy if not consistent Data-free if not consistent Format-preserving encryption (FPE) Privacy ranking: - 1 if not consistent - 4 if consistent |
| | Generates unique MongoDB ObjectId values. Can be assigned to text columns that contain MongoDB ObjectId values. The column value must be 12 bytes long. | Consistency - Self only Privacy ranking: - 3 if not consistent - 4 if consistent |
Name API: NameGenerator | Generates a random name string from a dictionary of first and last names. You specify the name format. For example, a column might contain only a first name, or a full name that is last name first. | Consistency - Self or other Differential privacy if not consistent Data-free if not consistent Privacy ranking: - 1 if not consistent - 4 if consistent |
| | Masks values in numeric columns. Either adds random noise to the original value or multiplies the original value by random noise. | Consistency - Self or other Privacy ranking: - 3 if not consistent - 4 if consistent |
Null API: NullGenerator | Replaces all of the column values with | Differential privacy Data-free Unique columns allowed Privacy ranking: 1 |
| | Generates unique numeric strings of the same length as the input numeric string. | Consistency - Self only Primary key generator Unique columns allowed Format-preserving encryption (FPE) Privacy ranking: - 3 if not consistent - 4 if consistent |
| | Default generator. Does not perform any transformation on the source data. | Unique columns allowed Privacy ranking: 6 |
| | Generates a random phone number that matches the country or region and format of the input phone number. For invalid phone numbers, either replaces individual numbers or generates a valid replacement number. | Consistency - Self only Privacy ranking: 3 |
| | Generates a random boolean value. You specify the percentage of true values. The remaining values are false. | Differential privacy Data-free Privacy ranking: 1 |
| | Generates a random double number that is between the specified minimum (inclusive) and maximum (exclusive) values. | Differential privacy Data-free Privacy ranking: 1 |
| | Generates a random hash string. | Differential privacy Data-free Privacy ranking: 1 |
| | Returns a random integer that is between the specified minimum (inclusive) and maximum (exclusive) values. | Differential privacy Data-free Privacy ranking: 1 |
| | Generates random dates, times, and timestamps that fall within a specified range. | Differential privacy Data-free Privacy ranking: 1 |
Random UUID API: UUIDGenerator | Generates a random new UUID string. | Differential privacy Data-free Unique columns allowed Privacy ranking: 1 |
| | To identify values to transform, you provide a regular expression. For each capture group in an expression, you assign a sub-generator to apply to matching values. | Unique columns allowed Composite generator. Other feature support is based on the sub-generators. Privacy ranking: 5 |
| | Generates a column of unique integer values that start with a specified value, and then increment by 1 for each processed row. | Linkable Unique columns allowed Privacy ranking: 3 |
| | Generates ISO 6346-compliant shipping container codes. The codes are all in the freight ("U") category. | Consistency - Self or other Differential privacy if not consistent Data-free if not consistent Privacy ranking: - 1 if not consistent - 4 if consistent |
SIN API: SINGenerator | Generates a new valid Canadian Social Insurance Number. Preserves the formatting from the original value. | Consistency - Self only Data-free if not consistent Unique columns allowed Format-preserving encryption (FPE) Privacy ranking: - 1 if not consistent - 4 if consistent |
SSN API: SsnGenerator | Generates a new valid United States Social Security Number. For numeric columns, the dashes (xxx-xx-xxxx) are always excluded. Otherwise, you can specify the percentage of values for which to include the dashes. | Consistency - Self or other Differential privacy if not consistent Data-free if not consistent Privacy ranking: - 1 if not consistent - 4 if consistent |
| | Used to transform StructFields within a StructType in Spark databases (Databricks and Amazon EMR). To identify the StructField value to transform, you provide a path expression. For each path expression, you assign a sub-generator to apply to the matching values. | Composite generator. Feature support is based on the sub-generators. Privacy ranking: 5 |
| | Shifts timestamps by a random amount of a specific unit of time, within a set range. The range can start before the original value. | Consistency - Self or other Privacy ranking: - 3 if not consistent - 4 if consistent |
| | Generates unique email addresses. Replaces the username with a randomly generated GUID, and masks the domain with a character scramble. | Consistency - Self only Unique columns allowed Privacy ranking: - 3 if not consistent - 4 if consistent |
URL API: UrlGenerator | Used to transform URLs. Preserves the formatting. Keeps the URL scheme and top-level domain intact. | Unique columns allowed Privacy ranking: 3 |
UUID Key API: UuidPkGenerator | Generates UUIDs. | Consistency - Self only Primary key generator Unique columns allowed Format-preserving encryption (FPE) Privacy ranking: - 3 if not consistent - 4 if consistent |
XML Mask API: XmlMaskGenerator | Used to transform values in XML columns. To identify the values to transform, you provide XPaths. For each XPath, you assign a sub-generator to apply to the matching values. | Composite generator. Feature support is based on the sub-generators. Privacy ranking: 5 |
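
The table mentions a generator that produces ISO 6346 compliant shipping container codes in the freight ("U") category. To illustrate what "compliant" means in practice, the following is a minimal Python sketch of the ISO 6346 check digit calculation. It is not the product's implementation. The function names `check_digit` and `random_container_code` are hypothetical, and the sketch assumes that a random three-letter owner code is acceptable, which ignores whether the owner code is actually registered.

```python
import random
import string


def _letter_values() -> dict[str, int]:
    """Build the ISO 6346 letter values: A=10 upward, skipping multiples of 11."""
    values, n = {}, 10
    for letter in string.ascii_uppercase:
        if n % 11 == 0:
            n += 1
        values[letter] = n
        n += 1
    return values


LETTER_VALUES = _letter_values()


def check_digit(first_ten: str) -> int:
    """Compute the check digit for the first 10 characters of a container code."""
    if len(first_ten) != 10:
        raise ValueError("expected owner code, category, and 6-digit serial")
    total = 0
    for position, char in enumerate(first_ten):
        # Uppercase letters use the lookup table; digits use their own value.
        value = LETTER_VALUES[char] if char.isalpha() else int(char)
        # Each position is weighted by a power of 2, starting at 2**0.
        total += value * (2 ** position)
    # A remainder of 10 maps to a check digit of 0.
    return total % 11 % 10


def random_container_code(rng: random.Random) -> str:
    """Hypothetical helper: random owner code, category "U", valid check digit."""
    owner = "".join(rng.choice(string.ascii_uppercase) for _ in range(3))
    serial = "".join(rng.choice(string.digits) for _ in range(6))
    body = f"{owner}U{serial}"
    return f"{body}{check_digit(body)}"
```

For example, `check_digit("CSQU305438")` returns `3`, so the full code is `CSQU3054383`. Passing a seeded `random.Random` instance to `random_container_code` makes the output repeatable, which is one simple way to approximate the self-consistency option that the table lists for this generator.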