Categorical

The Categorical generator shuffles the existing values within a field while maintaining the overall frequency of each value. Because the values move to different rows, they are disassociated from the other data in those rows. Note that NULL is treated as a separate value.

For example, a column contains the values Small, Medium, and Large. Small appears 3 times, Medium appears 4 times, and Large appears 5 times. In the output data, each value still appears the same number of times, but the values are shuffled to different rows.
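The following is a minimal Python sketch of that behavior, not the generator's actual implementation. It assumes differential privacy is disabled, and the `shuffle_column` helper and the `seed` parameter are hypothetical names introduced only for illustration.

```python
import random
from collections import Counter


def shuffle_column(values, seed=None):
    """Return a shuffled copy of a column (hypothetical helper).

    Every value, including None (standing in for NULL), keeps its
    original count; only the row positions change.
    """
    rng = random.Random(seed)
    shuffled = list(values)  # copy so the input column is not modified
    rng.shuffle(shuffled)
    return shuffled


# Column from the example: Small x3, Medium x4, Large x5.
sizes = ["Small"] * 3 + ["Medium"] * 4 + ["Large"] * 5
shuffled_sizes = shuffle_column(sizes, seed=7)

# The per-value frequencies are identical before and after the shuffle.
assert Counter(shuffled_sizes) == Counter(sizes)
```

Comparing `Counter(sizes)` with `Counter(shuffled_sizes)` is a quick way to confirm that only the row assignment changed, not the distribution.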

This generator is optimized for categories with fewer than 10,000 unique values. If your underlying data has more unique values (for example, your field is populated by freeform text entry), we recommend that you use the Character Scramble or Custom Categorical generator.
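Before you assign the generator, it can help to check how many distinct values a column holds. The sketch below is one way to do that on a sample of the column in Python. The function names are hypothetical, and the only figure taken from the text above is the 10,000 threshold.

```python
# Threshold quoted in the text above; the names here are illustrative only.
CATEGORICAL_LIMIT = 10_000


def distinct_count(values):
    """Count distinct values, treating None (NULL) as its own value."""
    return len(set(values))


def suits_categorical(values, limit=CATEGORICAL_LIMIT):
    """Return True when the column has fewer distinct values than the limit."""
    return distinct_count(values) < limit


sizes = ["Small", "Medium", "Large", None, "Small"]
print(distinct_count(sizes))      # 4 distinct values, counting None
print(suits_categorical(sizes))   # True
```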

Characteristics

Consistency: No, cannot be made consistent.

Linking: Yes, can be linked.

Differential privacy: Configurable

Data-free: No

Allowed for primary keys: No

Allowed for unique columns: No

Uses format-preserving encryption (FPE): No

Privacy ranking:

  • 2 if differential privacy enabled

  • 3 if differential privacy not enabled

Generator ID (for the API)

How to configure

To configure the generator:

  1. From the Link To dropdown, select the columns to link to the current column. You can select from other columns that use the Categorical generator.

  2. Toggle the Differential Privacy setting to indicate whether to make the output data differentially private. By default, differential privacy is disabled.

  3. If Structural data encryption is enabled, then to use it for this column, toggle Use data encryption process to the on position.
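To illustrate what linking columns in step 1 might mean in practice, here is a hedged Python sketch. It assumes that linked columns are shuffled together, so that each combination of values that appears in the source rows still appears in the output. That assumption, the `shuffle_linked` helper, and the sample rows are illustrative only and do not describe the product's internals.

```python
import random
from collections import Counter


def shuffle_linked(rows, linked_columns, seed=None):
    """Shuffle the linked columns as a unit (hypothetical helper).

    Each row's combination of linked values moves to another row intact,
    so value pairs that never occur in the source are not created.
    """
    rng = random.Random(seed)
    combos = [tuple(row[col] for col in linked_columns) for row in rows]
    rng.shuffle(combos)

    result = []
    for row, combo in zip(rows, combos):
        new_row = dict(row)  # unlinked columns stay on their original row
        new_row.update(zip(linked_columns, combo))
        result.append(new_row)
    return result


rows = [
    {"id": 1, "size": "Small", "fit": "Slim"},
    {"id": 2, "size": "Medium", "fit": "Regular"},
    {"id": 3, "size": "Large", "fit": "Relaxed"},
    {"id": 4, "size": "Large", "fit": "Regular"},
]
linked = shuffle_linked(rows, ["size", "fit"], seed=3)

# The same (size, fit) combinations exist before and after the shuffle.
before = Counter((r["size"], r["fit"]) for r in rows)
after = Counter((r["size"], r["fit"]) for r in linked)
assert before == after
```

In this sketch, shuffling each column independently could instead produce pairs such as Small with Relaxed that never occur in the source data.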
