Privacy Hub automatically finds your most sensitive information and recommends how to mask it, all in a few clicks. To guarantee process integrity, Privacy Hub tracks all actions in an immutable log.
When you first connect Tonic to a new datasource, Privacy Hub scans the datasource and displays any potentially risky data. You can then use the automatically suggested replacements or customize your transformations. Once you’re satisfied with your transformations, you can generate a new dataset with confidence.
Tonic uses a variety of signals to identify PII (personally identifiable information). For example, Tonic analyzes column metadata such as the data type and column name, as well as the uniqueness of the column values. Tonic also scans the actual data, using a combination of regex matching, dictionary lookups, and NER (named entity recognition) algorithms to help identify PII. Like all model-based approaches, this process is not flawless and cannot guarantee perfect precision or recall. We strongly recommend that a human review the results of the privacy scan, as well as the broader dataset, to ensure that nothing sensitive has been missed.
State and two-letter abbreviation
Social Security Number
National ID number
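The way several signals can combine to flag a column can be illustrated with a minimal Python sketch. This is not Tonic's implementation: the function `classify_column`, the pattern, the name hints, the sample state dictionary, and the 80% threshold are all hypothetical, and the NER and uniqueness signals are left out for brevity.

```python
import re

# Hypothetical pattern, hints, and dictionary, chosen for illustration only.
SSN_PATTERN = re.compile(r"\d{3}-\d{2}-\d{4}")
NAME_HINTS = {"ssn": "Social Security Number", "state": "State"}
US_STATES = {"CA", "NY", "TX", "WA"}  # tiny sample, not a full list
MATCH_THRESHOLD = 0.8  # assumed share of values that must match


def classify_column(name, values):
    """Return a guessed sensitive-data type for a column, or None."""
    non_null = [str(v).strip() for v in values if v is not None]
    if not non_null:
        return None

    # Signal 1: column metadata (here, only the column name).
    tokens = set(re.split(r"[^a-z0-9]+", name.lower()))
    for hint, label in NAME_HINTS.items():
        if hint in tokens:
            return label

    # Signal 2: regex matching against the actual values.
    ssn_hits = sum(1 for v in non_null if SSN_PATTERN.fullmatch(v))
    if ssn_hits / len(non_null) >= MATCH_THRESHOLD:
        return "Social Security Number"

    # Signal 3: dictionary lookup against known values.
    state_hits = sum(1 for v in non_null if v.upper() in US_STATES)
    if state_hits / len(non_null) >= MATCH_THRESHOLD:
        return "State"

    return None
```

A column whose name gives no hint, such as `col_a`, can still be flagged when most of its values match the pattern. A column that matches none of the signals returns `None`, which is one reason a human review of the results remains important.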
When the privacy scan finishes, you will see the results page with the following components:
A table listing the columns flagged as potentially containing sensitive data that have not yet been protected or marked as not sensitive
The name (including schema and table path) of the column that has been flagged
The suggested generator to replace data in this column
An option to override the suggested generator with another generator
An X button that marks the column as not containing sensitive data and removes it from the list. Once a column is marked as not sensitive, re-running the privacy scan does not flag it again.
The immutable audit log of all modifications to columns marked as sensitive. This includes applying generators to these columns, whether in Privacy Hub or elsewhere, and marking columns as not sensitive.
The time the privacy scan was last run, with an option to trigger another scan manually
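The behavior described above (protected and dismissed columns drop out of the list, dismissals survive a rescan, and every change is logged) can be modeled with a short Python sketch. The class names, method names, and the generator name `"SSN"` in the usage example are hypothetical and do not reflect Tonic's internals.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditEntry:
    """One log record; frozen so it cannot be changed after creation."""
    timestamp: datetime
    column: str
    action: str


@dataclass
class ReviewState:
    """Hypothetical model of the review state kept between scans."""
    not_sensitive: set = field(default_factory=set)
    generators: dict = field(default_factory=dict)
    _log: list = field(default_factory=list)

    def _record(self, column, action):
        self._log.append(AuditEntry(datetime.now(timezone.utc), column, action))

    def apply_generator(self, column, generator):
        self.generators[column] = generator
        self._record(column, f"applied generator {generator}")

    def mark_not_sensitive(self, column):
        self.not_sensitive.add(column)
        self._record(column, "marked not sensitive")

    def pending(self, detected):
        """Columns a scan flagged that are neither protected nor dismissed."""
        return [c for c in detected
                if c not in self.not_sensitive and c not in self.generators]

    @property
    def audit_log(self):
        # Expose a tuple so callers cannot edit the history in place.
        return tuple(self._log)
```

For example, if a scan detects `public.users.ssn`, `public.users.state`, and `public.users.bio`, then applying a generator to the first and dismissing the third leaves only `public.users.state` pending. Running `pending` again with the same detected columns gives the same result, because the dismissal is stored outside the scan itself.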