Tonic
Consistency
Consistency is an option on some generators. When it is turned on, the same input maps to the same output across an entire database. For example, if consistency is turned on for a name generator, the same input name (e.g. Albert Einstein) always maps to the same output name (e.g. Richard Feynman).
The primary reasons for using consistency are to:
    Enable joining on columns that have explicit database constraints in the schema. This is often seen with things like email addresses. With consistency, you can completely anonymize an email address and still use it in a join.
    Preserve the cardinality of a column. For example, if a city column contains 50 different cities and you want to randomize the column but still end up with 50 cities, you can use consistency to maintain the cardinality.
    Match duplicated data across one or more databases. For example, you might have a user database that contains a username in both a column and a JSON blob, as well as another database that contains the users' website activity, identified by the same username values. To anonymize the username but keep it the same in all locations and databases, use consistency.
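The join use case above can be sketched in a few lines of Python. This is an illustration only, not how Tonic implements consistency: the `consistent_email` helper, the `SECRET` key, and the sample rows are all hypothetical. It uses a keyed hash so that the same input email always yields the same masked email, which keeps rows joinable after masking. The names are left in the clear here only to make the join result easy to read.

```python
import hashlib
import hmac

# Hypothetical key for the demo; it is not a Tonic setting.
SECRET = b"demo-secret"


def consistent_email(value: str) -> str:
    """Deterministically map an email to a masked one (illustrative only)."""
    digest = hmac.new(SECRET, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return f"user_{digest[:16]}@example.com"


users = [
    {"email": "albert@example.org", "name": "Albert"},
    {"email": "marie@example.org", "name": "Marie"},
]
activity = [
    {"email": "marie@example.org", "page": "/pricing"},
    {"email": "albert@example.org", "page": "/home"},
    {"email": "marie@example.org", "page": "/docs"},
]

# Mask the email column independently in each table.
masked_users = [{**row, "email": consistent_email(row["email"])} for row in users]
masked_activity = [{**row, "email": consistent_email(row["email"])} for row in activity]

# The masked emails still line up, so a join on email still works.
users_by_email = {row["email"]: row for row in masked_users}
joined = [(users_by_email[row["email"]]["name"], row["page"]) for row in masked_activity]
print(joined)
```

Because each table is masked on its own, nothing in the sketch relies on a foreign key being declared between the two tables.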
Consistency can be enabled by simply toggling the 'Consistency' switch when adding a generator to a column. Note that not all generators support consistency.
Address Generator with Consistency Switch
Consistency is a function of both the data type and the value. For example, a numeric field with the value 123 and a string/varchar field with the value "123" will not produce matching outputs, even when both have consistent generators applied.
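One way to picture this type sensitivity is a mapping that hashes the type together with the value. The sketch below is an assumption made for illustration; `consistent_token` is a hypothetical helper and the document does not describe Tonic's internal scheme.

```python
import hashlib


def consistent_token(value) -> str:
    """Derive a token from both the type and the value (illustrative only)."""
    # The type name is part of the hashed material, so 123 and "123" differ.
    material = f"{type(value).__name__}:{value!r}".encode("utf-8")
    return hashlib.sha256(material).hexdigest()


numeric = consistent_token(123)
text = consistent_token("123")

print(numeric == consistent_token(123))  # same type and value: same token
print(numeric == text)                   # different type: different token
```

In practice this means a column that should stay joinable with another column needs the same data type on both sides before the generator is applied.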

Consistency Example

Non-consistent Address Generator (City) output
Consistent Address Generator (City) output
The first image shows an address generator being used in a non-consistent fashion. Notice that the city of Atlanta is initially mapped to San Diego, but future occurrences of Atlanta are mapped to different (random) cities such as Grand Junction, Long Beach, and Phoenix.
In the second image we use an address generator in a consistent manner. Now, the city of Atlanta is consistently mapped to San Diego.
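The difference between the two images can be imitated with a small lookup-table sketch. This is a stand-in technique chosen for readability, not a description of Tonic's generator: `mask_random`, `mask_consistent`, and the `CITIES` pool are hypothetical. The consistent version remembers the first replacement chosen for each input and reuses it.

```python
import random

# Hypothetical replacement pool for the demo.
CITIES = ["San Diego", "Grand Junction", "Long Beach", "Phoenix"]


def mask_random(column, rng):
    """Non-consistent: every occurrence gets an independent random city."""
    return [rng.choice(CITIES) for _ in column]


def mask_consistent(column, rng):
    """Consistent: the first choice for each input is remembered and reused."""
    seen = {}
    masked = []
    for value in column:
        if value not in seen:
            seen[value] = rng.choice(CITIES)
        masked.append(seen[value])
    return masked


column = ["Atlanta", "Boston", "Atlanta", "Atlanta"]
rng = random.Random()

non_consistent = mask_random(column, rng)
consistent = mask_consistent(column, rng)
print(non_consistent)  # Atlanta may map to a different city on each row
print(consistent)      # every Atlanta row shows the same city
```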

Consistency does not imply uniqueness

A consistent generator ensures that the same input always gives the same output. It does not guarantee that two different inputs will yield different outputs. In other words, consistent generators are not 1:1 mappings.
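A short sketch makes the distinction concrete. Assuming a hypothetical `consistent_city` helper that picks from a pool of only three cities, ten distinct inputs must share outputs, yet each input still maps to the same output every time. The hashing scheme is an illustration, not Tonic's implementation.

```python
import hashlib

# Hypothetical output pool, deliberately smaller than the input set.
CITIES = ["San Diego", "Phoenix", "Long Beach"]


def consistent_city(value: str) -> str:
    """Deterministically pick a city from the pool (illustrative only)."""
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(CITIES)
    return CITIES[index]


inputs = [f"City {i}" for i in range(10)]
outputs = {name: consistent_city(name) for name in inputs}
distinct_outputs = set(outputs.values())

# Ten inputs, at most three outputs: some inputs necessarily collide.
print(len(inputs), len(distinct_outputs))
```

If downstream logic needs distinct outputs for distinct inputs, consistency alone is not enough to provide that.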

Across an entire database

A column that uses a consistent generator is consistent with every other column that uses the same consistent generator, regardless of which table each column resides in. However, by default, consistency is not guaranteed between data generation runs (whether on the same database or not). To enable cross-database and cross-run consistency, see the next section.

Across runs, or multiple databases

To ensure consistency across data generation runs, add an environment variable to both the tonic_worker and tonic_webserver containers. The environment variable to add is:
TONIC_STATISTICS_SEED: <ANY 32 BIT SIGNED INTEGER>
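To see why a fixed seed gives repeatable results, consider the sketch below. It assumes, purely for illustration, that the seed is mixed into a deterministic mapping; how Tonic actually uses `TONIC_STATISTICS_SEED` is not described here. The `read_seed` and `consistent_city` helpers and the `CITIES` pool are hypothetical. The sketch also checks that the value fits in a 32-bit signed integer, that is, between -2147483648 and 2147483647.

```python
import hashlib
import os

INT32_MIN, INT32_MAX = -2**31, 2**31 - 1

# Hypothetical replacement pool for the demo.
CITIES = ["San Diego", "Grand Junction", "Long Beach", "Phoenix"]


def read_seed(env=None) -> int:
    """Read and range-check the seed from an environment mapping."""
    env = os.environ if env is None else env
    seed = int(env["TONIC_STATISTICS_SEED"])
    if not INT32_MIN <= seed <= INT32_MAX:
        raise ValueError("seed must fit in a 32-bit signed integer")
    return seed


def consistent_city(value: str, seed: int) -> str:
    """Pick a city from the seed and the input value (illustrative only)."""
    material = f"{seed}:{value}".encode("utf-8")
    digest = hashlib.sha256(material).digest()
    return CITIES[int.from_bytes(digest[:8], "big") % len(CITIES)]


# Two separate "runs" that share the same seed produce the same output.
run_1 = consistent_city("Atlanta", read_seed({"TONIC_STATISTICS_SEED": "42"}))
run_2 = consistent_city("Atlanta", read_seed({"TONIC_STATISTICS_SEED": "42"}))
print(run_1 == run_2)
```

The same reasoning suggests using the same seed value in every environment whose outputs need to match.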

Generators that can be made Consistent

The following generators have the option to be made consistent: