Tonic Structural uses sensitivity scans to identify source data columns that contain sensitive information. You can also manually mark a column as sensitive.
Structural runs sensitivity scans automatically. You can also run a manual sensitivity scan.
Structural automatically runs a sensitivity scan when you create a completely new workspace and connect a data source.
Structural also runs a new sensitivity scan when you change the data connection details for the source database.
For a file connector workspace, Structural runs a sensitivity scan when you add a file group.
A child workspace always inherits the sensitivity designations from its parent workspace.
When you copy a workspace, Structural runs a new sensitivity scan on the copy to identify sensitive columns. However, it keeps the sensitivity designation for columns that you specifically marked as sensitive or not sensitive.
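The copy behavior described above amounts to a merge in which manual designations take precedence over the new scan results. The following is a minimal sketch of that rule, not Structural's implementation. The function name, the column keys, and the dictionary representation are all hypothetical and chosen only for illustration.

```python
from typing import Dict


def designations_for_copy(
    manual: Dict[str, bool],
    scan_results: Dict[str, bool],
) -> Dict[str, bool]:
    """Combine a new scan on the copy with manual designations from the original.

    manual: column -> True (marked sensitive) or False (marked not sensitive),
        containing only columns that a user explicitly marked.
    scan_results: column -> sensitivity detected by the new scan on the copy.
    """
    merged = dict(scan_results)  # start from the new scan results
    merged.update(manual)        # manual designations override the scan
    return merged


# Hypothetical columns, for illustration only.
manual = {"users.nickname": True, "users.zip": False}
scan = {
    "users.email": True,
    "users.zip": True,
    "users.nickname": False,
    "users.id": False,
}
result = designations_for_copy(manual, scan)
```

In this example, users.zip stays not sensitive and users.nickname stays sensitive, even though the new scan disagrees with both manual designations.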
In addition to the automatic scans, from Privacy Hub, you can start a sensitivity scan manually.
To determine whether a column contains sensitive information, Structural looks at the data type, column name, and column values. To help identify sensitive column values, the scan uses regex matching and dictionary lookups.
This process cannot guarantee perfect precision and recall. We strongly recommend that a human reviews the sensitivity scan results and the broader dataset to ensure that nothing sensitive was missed.
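To make the combination of signals concrete, here is a small sketch of how a scan could weigh data type, column name, and sampled values using regex matching and a dictionary lookup. It is an illustration under stated assumptions, not Structural's detection logic. The patterns, the tiny first-name dictionary, the 0.8 threshold, and the function name detect_type are all hypothetical.

```python
import re
from typing import Iterable, Optional

# Hypothetical column-name hints (regex matching on the column name).
NAME_HINTS = {
    "email": re.compile(r"e[-_]?mail", re.IGNORECASE),
    "ssn": re.compile(r"ssn|social[-_ ]?security", re.IGNORECASE),
    "first_name": re.compile(r"first[-_ ]?name|fname", re.IGNORECASE),
}

# Hypothetical value patterns (regex matching on column values).
VALUE_PATTERNS = {
    "email": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
    "ssn": re.compile(r"^\d{3}-\d{2}-\d{4}$"),
}

# Hypothetical dictionary for lookups; a real dictionary would be far larger.
FIRST_NAMES = {"alice", "bob", "carol"}


def detect_type(
    column_name: str,
    data_type: str,
    values: Iterable[Optional[str]],
    threshold: float = 0.8,
) -> Optional[str]:
    """Return a sensitivity type for the column, or None if nothing matches."""
    if data_type != "text":
        return None  # this sketch only considers text columns

    samples = [v.strip() for v in values if v and v.strip()]

    if samples:
        # Value evidence first: the share of samples that match a pattern.
        for label, pattern in VALUE_PATTERNS.items():
            hits = sum(1 for v in samples if pattern.match(v))
            if hits / len(samples) >= threshold:
                return label
        # Dictionary lookup: the share of samples found in the name list.
        hits = sum(1 for v in samples if v.lower() in FIRST_NAMES)
        if hits / len(samples) >= threshold:
            return "first_name"

    # Fall back to hints in the column name.
    for label, pattern in NAME_HINTS.items():
        if pattern.search(column_name):
            return label
    return None
```

A heuristic like this produces both false positives and false negatives, for example a notes column that happens to contain a name, which is why human review of the results remains important.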
Structural identifies the following types of sensitive values. These include some information types that are covered by many privacy standards and frameworks, such as HIPAA, GDPR, CCPA, and PCI.
Names
First
Last
Full
Organization
Location
Street address
ZIP
PO Box
City
State and two-letter abbreviation
Country
Postal code
Contact information
Email address
Phone number
Password
Financial information
Credit card number
International bank account number (IBAN)
SWIFT code for bank transfers
BTC (Bitcoin) address
Identification
Social Security Number
Birth dates
Gender
Network location
IP address
IPv6 address
MAC address
International Mobile Equipment Identity (IMEI)
Vehicle identification number (VIN)
ICD-9 and ICD-10 codes (used to identify diseases)
To download the log of the most recent sensitivity scan:
On the workspace management view, from the download menu, select Download Sensitivity Scan Log.
On Privacy Hub, click Download, then select Scan Log.
The log tracks the progress of the scan.
For improved performance, sensitivity scans can use parallel processing.
For relational databases such as PostgreSQL and SQL Server, to configure parallel processing, you use the environment setting TONIC_PII_SCAN_PARALLELISM_RDBMS. The default value is 4.
For document-based databases such as MongoDB, you use the environment setting TONIC_PII_SCAN_PARALLELISM_DOCUMENTDB. The default value is 1.
For information about how to configure environment settings, go to Configuring environment settings.
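As a rough illustration of what a parallelism setting controls, the sketch below resolves one of the two settings named above, falls back to the documented default when the setting is absent, and uses the result to cap the number of concurrent workers. How Structural reads and applies these settings internally is not described here, so the helper names scan_parallelism and scan_tables, and the choice of one table per task, are assumptions made for illustration.

```python
import os
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Mapping, Sequence

# Setting names and default values as documented above.
DEFAULTS = {
    "TONIC_PII_SCAN_PARALLELISM_RDBMS": 4,
    "TONIC_PII_SCAN_PARALLELISM_DOCUMENTDB": 1,
}


def scan_parallelism(setting: str, env: Mapping[str, str] = os.environ) -> int:
    """Resolve a parallelism setting, using the documented default if unset."""
    raw = env.get(setting)
    if raw is None or not raw.strip():
        return DEFAULTS[setting]
    value = int(raw)
    if value < 1:
        raise ValueError(f"{setting} must be at least 1, got {value}")
    return value


def scan_tables(
    tables: Sequence[str],
    scan_one: Callable[[str], str],
    workers: int,
) -> List[str]:
    """Hypothetical scan loop: process tables with at most `workers` threads."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in the same order as the input tables.
        return list(pool.map(scan_one, tables))
```

A higher value lets more work run at the same time, at the cost of more concurrent load on the source database.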
For each type of detected sensitive data, Structural suggests a recommended generator. For example, for a Social Security Number, Structural recommends the SSN generator. For a first name, Structural recommends the Name generator configured with First as the value type.
From Privacy Hub, you can review and apply the recommended generators to columns that the sensitivity scan detected.
For more information, go to Reviewing and applying recommended generators.
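Conceptually, the recommendation is a lookup from a detected sensitivity type to a generator and its configuration. The sketch below encodes only the two examples given above. The type keys, the Recommendation class, and the value_type option name are hypothetical labels, not identifiers from the Structural product or API.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass(frozen=True)
class Recommendation:
    generator: str
    options: Dict[str, str] = field(default_factory=dict)


# Only the two examples from the text; keys and option names are hypothetical.
RECOMMENDED = {
    "ssn": Recommendation("SSN"),
    "first_name": Recommendation("Name", {"value_type": "First"}),
}


def recommend(sensitivity_type: str) -> Optional[Recommendation]:
    """Return the recommended generator for a detected type, if one is known."""
    return RECOMMENDED.get(sensitivity_type)
```

A type with no entry returns None, which corresponds to a column that needs a generator chosen manually.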
The sensitivity scan provides an initial assessment of which column values are sensitive.
You can also indicate manually that a column is sensitive or not sensitive.
Privacy Hub, Database View, and Table View all provide options to indicate whether a column is sensitive or not sensitive.
The Structural API also provides endpoints to designate columns as sensitive or not sensitive.