Using the Privacy Report to verify data protection

About the Privacy Report

Data privacy in Tonic is measured by the sensitivity of the data and the level of protection applied.
Another consideration is the use case, or the purpose and audience for the data. This is external to Tonic, but influences the protective actions that you take in Tonic.
The Privacy Report captures details about the level of data protection that is in place with Tonic. It is used at the following points in the de-identification process.
  • You can use a preview as a checkpoint while you configure the data protection, to review the generators that you applied or to look for at-risk data. You also can export the preview from Tonic before you run a generation, to increase your confidence or to confirm that the de-identification configuration is complete.
  • Every time you run a data generation job to populate the destination database with de-identified data, Tonic creates a new Privacy Report. The Privacy Report records the protection status of the data that is associated with that run.
The Privacy Report helps to answer the following questions:
  • What is the value of Tonic?
  • How do I know the data is safe for use?
  • How was the data protected?
The Privacy Report consists of summary statistics and field-level details in a downloadable .csv file. Here is a stylized version of the report that shows the column groupings:
Example Privacy Report with the column groups labeled

How to display the Privacy Report

You can display a preview version of the Privacy Report from Privacy Hub.
When you run a data generation job, Tonic generates a version of the Privacy Report that reflects the generation results.

Preview in Privacy Hub

The preview of the Privacy Report is a snapshot of the current generator configuration in the workspace.
You can use the Privacy Hub to track your progress, toggling sensitivity and applying generators until all of the necessary fields are masked.
When you are ready to generate, or at any point during this process, you can export a preview from Tonic.
Option to download the Privacy Report preview on Privacy Hub
You can use the exported .csv file to review the configuration. You can also share it with others to obtain approval before you run a generation. When you are comfortable with the generators that are in place, you can run the data generation.
Note that the preview is not tied to any version of output data in the destination database. It only reflects Tonic's state at a point in time.

Privacy Report in Job Details

The Privacy Report captures the protection status of the data for a particular generation job. Tonic creates a snapshot for each generation, so the Privacy Report for a data generation job reflects the output data in the destination database at the time of that job.
The job details for each generation job provide access to the Privacy Report for the job.
The Privacy Report tab of the Generation Job Details page displays the following summary statistics for the data protection:
  • At-Risk - The number of columns that are sensitive, but that have Passthrough as the assigned generator.
  • Protected - The number of columns that have a generator other than Passthrough assigned. This includes both sensitive and non-sensitive columns.
  • Non-Sensitive - The number of columns that are not sensitive, and that have Passthrough as the assigned generator. Also includes columns that are in truncated tables, where no data is copied to the destination database.
Job details page for a data generation job
To export the full details of the Privacy Report, click Download Privacy Report CSV.
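As a rough illustration of how the three summary statistics relate to the rows in the downloaded file, the sketch below recomputes them from a small in-memory CSV. The literal values `Passthrough` and `Truncate`, the sample rows, and the choice to check truncated tables first are assumptions made for this example, not confirmed export values.

```python
import csv
import io

# Assumed literal values; compare them with a real export before relying on them.
PASSTHROUGH = "Passthrough"  # assumed GeneratorId value for the Passthrough generator
TRUNCATE = "Truncate"        # assumed TableMode value for a truncated table

# Hypothetical rows; a real export contains more columns than shown here.
SAMPLE_CSV = """Schema,Table,TableMode,Column,IsSensitive,GeneratorId
public,users,DeIdentify,email,TRUE,Passthrough
public,users,DeIdentify,name,TRUE,Name
public,users,DeIdentify,id,FALSE,Passthrough
public,audit,Truncate,payload,TRUE,Passthrough
"""


def summarize(rows):
    """Count columns per summary category, following the definitions above."""
    counts = {"At-Risk": 0, "Protected": 0, "Non-Sensitive": 0}
    for row in rows:
        sensitive = row["IsSensitive"].strip().upper() == "TRUE"
        passthrough = row["GeneratorId"].strip() == PASSTHROUGH
        if row["TableMode"].strip() == TRUNCATE:
            # Assumption: truncated tables are checked first, because no
            # data is copied to the destination database.
            counts["Non-Sensitive"] += 1
        elif not passthrough:
            counts["Protected"] += 1
        elif sensitive:
            counts["At-Risk"] += 1
        else:
            counts["Non-Sensitive"] += 1
    return counts


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
summary = summarize(rows)
print(summary)
```

To run this against a real export, replace `io.StringIO(SAMPLE_CSV)` with an open file handle for the downloaded report.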

Report definition

The fields for each row in the Privacy Report fall into the following categories.

Schema

The Privacy Report includes all of the schema detail that is viewable in the Tonic application (such as in Database View and Table View). The schema in the source matches the destination.
The schema information is contained in the following columns:
  • Schema - Schema name from the source database.
  • Table - Table name from the source database.
  • TableMode - The table mode that is currently applied to the table.
  • Column - Column name from the source database.
  • DataType - Data type that is detected in the source database.
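A minimal sketch of reading the schema columns listed above with Python's standard `csv` module and grouping them by table. It assumes that the CSV header uses exactly these column names; the sample rows and data types are hypothetical.

```python
import csv
import io
from collections import defaultdict

# Column names as listed above; assumed to match the CSV header exactly.
SCHEMA_COLUMNS = ("Schema", "Table", "TableMode", "Column", "DataType")

# Hypothetical rows for illustration only.
SAMPLE_CSV = """Schema,Table,TableMode,Column,DataType
public,users,DeIdentify,id,integer
public,users,DeIdentify,email,text
public,orders,DeIdentify,total,numeric
"""


def load_report(text):
    """Parse report text and fail early if a schema column is missing."""
    reader = csv.DictReader(io.StringIO(text))
    header = reader.fieldnames or []
    missing = [name for name in SCHEMA_COLUMNS if name not in header]
    if missing:
        raise ValueError(f"report is missing columns: {missing}")
    return list(reader)


def columns_by_table(rows):
    """Map (schema, table) to a list of (column, data type) pairs."""
    tables = defaultdict(list)
    for row in rows:
        key = (row["Schema"], row["Table"])
        tables[key].append((row["Column"], row["DataType"]))
    return dict(tables)


tables = columns_by_table(load_report(SAMPLE_CSV))
for (schema, table), columns in tables.items():
    print(f"{schema}.{table}: {columns}")
```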

Data sensitivity

Data sensitivity reflects attributes such as:
  • Whether the data includes personally identifiable information (PII)
  • Whether the data is regulated by law
  • Whether the data is business confidential
It affects decisions on how to protect the data.
During the privacy scan, Tonic identifies suspected sensitive fields. You can also manually indicate that a column is sensitive or not sensitive.
The data sensitivity information is contained in the following columns:
  • IsSensitive - Indicates whether the column is currently flagged as sensitive. TRUE indicates that the column is currently flagged as sensitive. This includes columns that Tonic detected automatically, and fields that you flagged as sensitive. FALSE indicates that the column is currently flagged as not sensitive. This includes columns that Tonic flagged as sensitive, but that you changed to not sensitive.
  • SensitiveType - For fields that Tonic identifies as sensitive, the detected data type. For example, Tonic detects a field of type Address that might be sensitive. Manually flagged fields do not have a value for SensitiveType.
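Because manually flagged fields have no SensitiveType value, the two columns together can separate detected fields from manually flagged ones. The sketch below shows one way to do that. It assumes that an empty SensitiveType cell on a sensitive row means the flag was set manually; the sample rows are hypothetical.

```python
import csv
import io

# Hypothetical rows for illustration only.
SAMPLE_CSV = """Table,Column,IsSensitive,SensitiveType
users,street,TRUE,Address
users,nickname,TRUE,
users,id,FALSE,
"""


def split_sensitive(rows):
    """Return (detected, manual) lists for columns currently flagged sensitive."""
    detected, manual = [], []
    for row in rows:
        if row["IsSensitive"].strip().upper() != "TRUE":
            continue
        name = f'{row["Table"]}.{row["Column"]}'
        sensitive_type = (row.get("SensitiveType") or "").strip()
        if sensitive_type:
            detected.append((name, sensitive_type))
        else:
            # Assumption: no SensitiveType on a sensitive row means a manual flag.
            manual.append(name)
    return detected, manual


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
detected, manual = split_sensitive(rows)
print("detected:", detected)
print("manual:", manual)
```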

Protection

Tonic's generators are the core feature that protects sensitive information in ways that retain utility for downstream data consumers.
The protection section of the Privacy Report provides key details to Tonic users or external stakeholders about how the masking transformations protect data.
The protection information is contained in the following columns:
  • GeneratorId - The generator that is currently applied to the column. For information about how each generator transforms data, see the Generator reference.
  • IsConsistent - Indicates whether consistency is enabled for a given field. Consistency ensures that a given input always results in the same output. It retains data utility, but at the cost of a lower level of protection. When consistency is on, ColumnPrivacyStatus is Masked instead of Anonymized. For more information, see Privacy status.
  • ConsistencyColumn - In some cases, a column is configured to be consistent to another column. If the consistency is to another column, then ConsistencyColumn contains the name of that column.
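As an illustration of how IsConsistent and ConsistencyColumn can be read together, the sketch below lists each consistent column and the column it is consistent to. It assumes that an empty ConsistencyColumn cell means the column is consistent to itself; the sample rows and generator names are hypothetical.

```python
import csv
import io

# Hypothetical rows for illustration only.
SAMPLE_CSV = """Column,GeneratorId,IsConsistent,ConsistencyColumn
first_name,Name,TRUE,
display_name,Name,TRUE,first_name
notes,Scramble,FALSE,
"""


def consistency_targets(rows):
    """Map each consistent column to its consistency column.

    A value of None means that no other column is named, which this sketch
    treats as self-consistency (an assumption).
    """
    targets = {}
    for row in rows:
        if row["IsConsistent"].strip().upper() != "TRUE":
            continue
        other = (row.get("ConsistencyColumn") or "").strip()
        targets[row["Column"]] = other or None
    return targets


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
targets = consistency_targets(rows)
print(targets)
```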

Privacy status

Privacy status reflects the level of privacy associated with a given protection mechanism.
The following columns are linked to the selected generator and the configured generator settings.
  • IsDifferentiallyPrivate - Indicates whether the assigned generator supports differential privacy and has differential privacy enabled. TRUE indicates that both of these are true. FALSE indicates that either the assigned generator does not support differential privacy, or that differential privacy is not enabled. Differential privacy guarantees the highest level of privacy, and eliminates the ability to re-identify the data.
  • IsDataFree - Indicates whether the assigned generator uses the underlying data. If the output data is completely unlinked to the source data, the generator is considered data-free, with a high degree of protection.
The ColumnPrivacyStatus column provides the overall privacy status of the column. The possible privacy status values are:
  • Unprotected - Applied to columns that are flagged as sensitive, but that have Passthrough as the assigned generator.
  • Masked - Applied to columns that have a generator other than Passthrough assigned. The selected generator provides some protection against seeing source data. If both IsDifferentiallyPrivate and IsDataFree are FALSE, then ColumnPrivacyStatus is Masked. Consistency decreases the protection level. If consistency is enabled, then ColumnPrivacyStatus is Masked.
  • Anonymized - Applied to columns for which the assigned generators and the generator configuration are guaranteed against reverse engineering. The assigned generator either uses differential privacy, or is considered data-free, where the output data is completely unlinked from the source data. The assigned generator does not have consistency enabled.
  • NonSensitive - Applied to columns that are flagged as non-sensitive and that have Passthrough as the assigned generator. Also applied to columns in tables that are truncated.
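The status rules above can be sketched as a small decision function. This is an interpretation of the rules as written here, not Tonic's implementation. The `table_truncated` parameter, the `Passthrough` literal, the order of the checks, and the generator names in the usage lines are assumptions made for the example.

```python
def privacy_status(is_sensitive, generator_id, is_consistent,
                   is_differentially_private, is_data_free,
                   table_truncated=False, passthrough="Passthrough"):
    """Derive a ColumnPrivacyStatus value from the other report fields."""
    if table_truncated:
        # Columns in truncated tables are reported as NonSensitive.
        return "NonSensitive"
    if generator_id == passthrough:
        # Passthrough leaves the data unchanged.
        return "Unprotected" if is_sensitive else "NonSensitive"
    if is_consistent:
        # Consistency lowers the protection level, so the status is Masked.
        return "Masked"
    if is_differentially_private or is_data_free:
        return "Anonymized"
    return "Masked"


# Hypothetical generator names, used only to exercise each branch.
print(privacy_status(True, "Passthrough", False, False, False))
print(privacy_status(True, "Name", True, False, True))
print(privacy_status(True, "Categorical", False, True, False))
print(privacy_status(False, "Passthrough", False, False, False))
```

A function like this can be used to cross-check the ColumnPrivacyStatus value in each exported row against the other fields in that row.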