Structural identifies the following types of sensitive values. These include some of the information types that are addressed by many privacy standards and frameworks, such as HIPAA, GDPR, CCPA, and PCI.
For more information about the HIPAA and Safe Harbor information types that Structural detects, go to the Tonic.ai guide Using Tonic Structural and the Safe Harbor method to de-identify PHI.
Names
First
Last
Full
Organization
Location
Street address
ZIP
PO Box
City
State and two-letter abbreviation
Country
Postal code
GPS coordinates
Contact information
Email address
Telephone number
User credentials
Username
Password
Financial information
Credit card number
International bank account number (IBAN)
SWIFT code for bank transfers
Money amount
BTC (Bitcoin) address
Identification
Social Security Number
Passport number
Driver's license number
Birth date
Gender
Biometric identifier, such as a fingerprint or voiceprint
Full face photographic images and similar images
Medical information
ICD-9 and ICD-10 codes (used to identify diseases)
Medical record number
Health plan beneficiary number
Admission date
Discharge date
Date of death
Other personal information
Marital status
Accounts and licenses
Account number
Certificate or license number
Network and web location
IP address
IPv6 address
MAC address
Web URL
International Mobile Equipment Identity (IMEI)
Vehicle information
Vehicle identification number (VIN)
License plate number
You can also manually indicate that a column is sensitive or not sensitive.
For example, the sensitivity scan might incorrectly identify a column as sensitive. Or a column might contain data that you consider sensitive but that does not match a detected sensitivity type.
When you manually change a column from not sensitive to sensitive, Structural marks the sensitivity detection as full confidence.
For information on how to change whether a column is sensitive:
For Privacy Hub, go to .
For Database View, go to:
For a single column,
For multiple selected columns,
For Table View, go to .
The Structural API also provides endpoints to designate columns as sensitive or not sensitive.
Required license: Enterprise
Required global permission: Create and manage sensitivity rules
By default, when a Structural sensitivity scan runs on a workspace, it looks for the built-in sensitivity types.
Your data might also include sensitive values that are specific to your organization. To identify those values and recommend a generator for them, you can define custom sensitivity rules.
Each custom sensitivity rule specifies:
The data type for matching columns.
Text matching criteria for the names of matching columns.
The recommended generator preset.
To display the current list of sensitivity rules, in the Structural navigation menu, click Sensitivity Rules.
The list contains the sensitivity rules for a self-hosted Structural instance or a Structural Cloud organization.
For each rule, the list includes:
The rule name and description
The recommended generator preset
When the rule was most recently modified
You can filter the rule list by the following:
Rule name
Rule description
Generator preset name
Name of the user who most recently updated the rule
In the filter field, start to type text from any of those values. As you type, the list is filtered to only include matching rules.
Note that when the list is filtered, you cannot change the display sequence of the rules.
Structural applies the rules based on their display order in the list.
If a column matches more than one rule, Structural applies the first matching rule.
To change the display order of a rule, drag and drop it to the new location in the list.
Note that you cannot change the rule sequence when the list is filtered.
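The ordering behavior described above can be sketched as follows. This is a minimal illustration, not Structural's implementation: the `Rule` class, the `first_matching_rule` function, and the example rules are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Rule:
    """Hypothetical stand-in for a custom sensitivity rule."""
    name: str
    matches: Callable[[str], bool]  # predicate on the column name


def first_matching_rule(column_name: str, rules: list[Rule]) -> Optional[Rule]:
    """Return the first rule, in list order, that matches the column."""
    for rule in rules:
        if rule.matches(column_name):
            return rule
    return None


# Rules are evaluated top to bottom, so the more specific rule is listed first.
rules = [
    Rule("Employee ID", lambda col: "employee_id" in col.lower()),
    Rule("Generic ID", lambda col: col.lower().endswith("_id")),
]

applied = first_matching_rule("employee_id", rules)
print(applied.name if applied else None)
```

In this sketch, the column `employee_id` satisfies both rules, but only the rule that appears first in the list is applied. If the two rules were swapped, the more general rule would be applied instead.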
To create a sensitivity rule:
On the Sensitivity Rules view, click New Custom Rule.
Click Save.
To change the configuration of a sensitivity rule:
On the Sensitivity Rules view, click the edit icon for the rule.
Click Save.
Note that any changes to a sensitivity rule do not take effect until the next sensitivity scan.
In the Name field, type the name of the sensitivity rule. The rule name becomes the sensitivity type for matching columns. The rule name must be unique, and also cannot match the name of a built-in sensitivity type.
Optionally, in the Description field, type a longer description of the sensitivity rule.
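The naming constraints can be illustrated with a short check. This is only a sketch: the function name, the exact-string comparison, and the sample names are assumptions made for the example, and Structural's own validation might compare names differently.

```python
def validate_rule_name(name, existing_rule_names, built_in_type_names):
    """Return a list of problems with a proposed rule name (empty if none).

    Assumption: names are compared as exact strings after trimming whitespace.
    """
    problems = []
    candidate = name.strip()
    if candidate in existing_rule_names:
        problems.append("name is already used by another rule")
    if candidate in built_in_type_names:
        problems.append("name matches a built-in sensitivity type")
    return problems


# Hypothetical sample data for illustration only.
existing = {"Employee Badge Number"}
built_in = {"Email address", "Telephone number"}

print(validate_rule_name("Loyalty Card Number", existing, built_in))
print(validate_rule_name("Email address", existing, built_in))
```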
From the Data Type dropdown list, select the data type for matching columns. For example, a rule might only be used for columns that contain text.
The available data types are general types that map to specific data types in a given database. The available types are:
Array
Binary
Boolean
Continuous Numerical
Date Range
Datetime
Integer
JSON
MAC Address
Network Address
Text
UUID
XML
Under Column Name Match, provide the criteria to identify matching columns based on the column name.
Note that a matching column must match both the data type and the column name criteria.
When you provide a list of text matching conditions, a matching column must match all of the conditions. In other words, the conditions are joined by AND.
To apply the same generator preset to columns that have completely different names, you must create separate sensitivity rules.
To create a list of text matching conditions:
Click Text Match.
To add a column name condition, click Add String Match.
For each condition:
From the comparison type dropdown list, select the type of comparison. For example, Contains, Starts with, or Ends with.
In the comparison text field, provide the text to check for.
The comparison text is case insensitive. For example, if you set a condition to match column names that contain the text term, it also matches column names that contain TERM or Term or tErM.
To remove a column name condition, click its delete icon.
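The following sketch illustrates how a list of text matching conditions behaves when the conditions are joined by AND and compared without regard to case. The function name and the `contains`, `starts_with`, and `ends_with` keys are hypothetical labels for this example, not Structural identifiers.

```python
def matches_all_conditions(column_name, conditions):
    """Check a column name against (comparison, text) pairs joined by AND.

    Both the column name and the comparison text are lowercased,
    which makes every comparison case insensitive.
    """
    name = column_name.lower()
    checks = {
        "contains": lambda text: text in name,
        "starts_with": lambda text: name.startswith(text),
        "ends_with": lambda text: name.endswith(text),
    }
    return all(checks[comparison](text.lower()) for comparison, text in conditions)


# "term" matches "TERM" because the comparison ignores case.
print(matches_all_conditions("Customer_TERM_Code", [("contains", "term")]))

# Every condition must hold, so one failed condition rejects the column.
print(matches_all_conditions(
    "Customer_TERM_Code",
    [("contains", "term"), ("ends_with", "id")],
))
```

Because every condition must hold, a single rule built this way cannot cover column names that have nothing in common, which is why those cases need separate rules.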
To use a regular expression to identify matching columns based on the column name:
Click Regular Expression.
In the field, provide the regular expression.
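As a rough illustration of how a regular expression and the data type work together, the sketch below requires both to match. It uses Python's `re` module with a partial, case-sensitive search. The regular expression dialect, anchoring, and case handling that Structural applies are not stated here, so treat those details, along with the function name and the sample pattern, as assumptions.

```python
import re


def column_matches_rule(column_name, column_type, rule_type, pattern):
    """Return True only when both the data type and the name pattern match."""
    if column_type != rule_type:
        return False
    # Assumption: a partial match anywhere in the name is enough,
    # unless the pattern itself uses ^ and $ anchors.
    return re.search(pattern, column_name) is not None


pattern = r"^emp_.*_no$"  # hypothetical pattern for employee number columns

print(column_matches_rule("emp_badge_no", "Text", "Text", pattern))
print(column_matches_rule("emp_badge_no", "Integer", "Text", pattern))
print(column_matches_rule("badge_number", "Text", "Text", pattern))
```

Before you rely on a pattern, you can check it against real column names by using the Test Results preview on the rule configuration.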
From the Recommended Generator Preset dropdown list, select the generator preset that is the recommended generator for matching columns.
To search for a specific preset, begin to type the generator preset name.
Required global permission: Create and manage generator presets
When you configure a sensitivity rule, you can also create a new generator preset or update the configuration of the selected generator preset.
To create a new generator preset, click Create Preset. On the generator preset details panel, provide the generator preset configuration, then click Create.
To edit the selected generator preset, click Edit Current Preset. On the generator preset details panel, update the generator preset configuration, then click Save and Apply.
If you have access to a workspace, then you can use the workspace to preview the sensitivity rule results.
Under Test Results, from the workspace dropdown list, select the workspace to use.
Structural searches the workspace schema for matching columns based on the sensitivity rule configuration.
It displays any matching columns. You can filter the matching columns based on the table or column name.
For each matching column, the list includes:
The column name and table
A sample value from the source data. To see the sample source value, you must have the Preview source data permission for the workspace.
A sample replacement value, based on the selected generator preset for the sensitivity rule. To see the sample replacement value, you must have the Preview destination data permission for the workspace.
To delete a sensitivity rule, on the Sensitivity Rules view, click the delete icon for the rule.
Note that existing generator recommendations for the rule remain in place until the next sensitivity scan.
Structural runs sensitivity scans automatically based on specific events. You can also run manual sensitivity scans on demand.
On a self-hosted instance, sensitivity scans can also run automatically at the same time each day.
Structural automatically runs a sensitivity scan when you:
Create a completely new workspace and connect a data source.
Change the data connection details for the source database.
Add a file group to a file connector workspace.
A child workspace always inherits the sensitivity designations from its parent workspace.
When you copy a workspace, Structural runs a new sensitivity scan on the copy to identify sensitive columns. However, it keeps the sensitivity designation for columns that you specifically marked as sensitive or not sensitive.
In addition to the automatic scans, from Privacy Hub, you can .
On self-hosted instances, Structural can also run scheduled daily sensitivity scans in the background.
The daily scans only run on the 10 workspaces that had the most recent activity. Activity includes:
Data generation jobs.
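The selection of workspaces for the daily scan can be pictured as a "most recent first, keep 10" ranking. The sketch below is an illustration only. How Structural records activity is not described here, so the `last_activity` mapping and the function name are hypothetical.

```python
from datetime import datetime, timedelta


def workspaces_for_daily_scan(last_activity, limit=10):
    """Pick the workspaces with the most recent activity.

    last_activity maps a workspace name to the datetime of its latest activity.
    """
    ranked = sorted(last_activity, key=last_activity.get, reverse=True)
    return ranked[:limit]


# Twelve hypothetical workspaces, each active one day earlier than the previous one.
now = datetime(2024, 1, 15)
activity = {f"workspace-{i}": now - timedelta(days=i) for i in range(12)}

selected = workspaces_for_daily_scan(activity)
print(len(selected), selected[0], selected[-1])
```

In this example, the two workspaces with the oldest activity fall outside the limit, so the scheduled scan would skip them until they become active again.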
By default, Structural runs the sensitivity scans each day at midnight.
TONIC_ENABLE_SCHEDULED_SENSITIVITY_SCAN - Boolean to indicate whether to enable the scheduled daily sensitivity scans.
The default value is true. To disable the scheduled daily scan, set this to false.
TONIC_SENSITIVITY_SCAN_HOUR - When scheduled scans are enabled, the hour at which to run the scans. The setting uses the local time zone.
The value is an integer between 0 and 23, where 0 is midnight and 23 is 11:00 PM. For example, a value of 14 indicates to run the job at 2:00 PM.
The default value is 0.
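To make the value ranges concrete, the following sketch reads the two settings from a dictionary of environment values, applies the documented defaults, and converts the hour to a clock time. The helper functions are hypothetical and are not part of Structural. Only the setting names, defaults, and the 0 to 23 range come from the descriptions above.

```python
def read_scan_schedule(env):
    """Return (enabled, hour) from a mapping of environment settings."""
    raw_enabled = env.get("TONIC_ENABLE_SCHEDULED_SENSITIVITY_SCAN", "true")
    enabled = raw_enabled.strip().lower() == "true"
    hour = int(env.get("TONIC_SENSITIVITY_SCAN_HOUR", "0"))
    if not 0 <= hour <= 23:
        raise ValueError("TONIC_SENSITIVITY_SCAN_HOUR must be between 0 and 23")
    return enabled, hour


def describe_hour(hour):
    """Convert an hour from 0 to 23 into a 12-hour clock label."""
    suffix = "AM" if hour < 12 else "PM"
    return f"{hour % 12 or 12}:00 {suffix}"


# With no settings present, the defaults apply: enabled, at midnight.
print(read_scan_schedule({}))

# A value of 14 corresponds to 2:00 PM in the local time zone.
enabled, hour = read_scan_schedule({"TONIC_SENSITIVITY_SCAN_HOUR": "14"})
print(enabled, describe_hour(hour))
```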
For improved performance, sensitivity scans can use parallel processing.
For document-based databases such as MongoDB, you use the environment setting TONIC_PII_SCAN_PARALLELISM_DOCUMENTDB. The default value is 1.
The Structural sensitivity scan uses the following rules and processes to:
Identify sensitive columns.
Indicate its confidence that an identified column is sensitive and is of the detected sensitivity type.
Note that this process cannot guarantee perfect precision and recall. We strongly recommend that a human review the sensitivity scan results and the broader dataset to ensure that nothing sensitive was missed.
This part of the sensitivity scan uses regular expression matching and dictionary lookups. It produces high, medium, or low confidence detections.
When this part of the sensitivity scan determines that a column contains sensitive data, it:
Marks the column as sensitive
Assigns the sensitivity type to the column
Recommends the generator configuration for the identified sensitivity type. Note that if the recommended generator is not compatible with the column, then Structural discards the recommendation.
Marks the sensitivity detection as high, medium, or low confidence. The confidence level is based on a calculation of how well the column matched the applicable rules.
The sensitivity scan also looks for any columns that match custom sensitivity types that you define in your custom sensitivity rules.
Custom sensitivity rules always produce full confidence detections.
When a column matches a custom sensitivity rule, Structural:
Marks the column as sensitive.
Assigns the sensitivity rule name as the sensitivity type.
Recommends the generator preset from the sensitivity rule.
Marks the sensitivity detection as full confidence.
To identify additional sensitive columns that might not be captured by the other parts of the scan, the sensitivity scan uses an artificial intelligence (AI) model. Note that the model is pre-trained. Structural does not use customer data to train the model, and it does not send any customer data externally.
This part of the scan produces medium or low confidence detections for built-in entity types.
The model considers the table and column name. If the combination of table and column name is similar in meaning to a sensitivity type that Structural has a recommended generator for, then Structural:
Marks the column as sensitive.
Assigns the sensitivity type to the column.
Recommends the generator configuration for that sensitivity type.
Uses AI to compare the table name and column name combination to the sensitivity type, and produces a semantic similarity score.
Based on the semantic similarity score, marks the sensitivity detection as either medium or low confidence.
To download the log of the most recent sensitivity scan, either:
On the workspace management view, from the download menu, select Download Sensitivity Scan Log.
On Privacy Hub, click Reports and Logs, then select Scan Log.
The log tracks the progress of the scan.
On the Create Custom Rule view, .
On the Edit Custom Rule view, .
For more information about generator preset configuration, go to .
User-initiated updates that are included in the .
To enable and configure the daily sensitivity scans, use the following environment settings. You can add these settings to the Environment Settings list on Structural Settings.
For relational databases such as PostgreSQL and SQL Server, to configure parallel processing, you use the environment setting TONIC_PII_SCAN_PARALLELISM_RDBMS. The default value is 4.
Recommend generators for those columns. For information about applying recommended generators to columns, go to .
To identify that a column contains sensitive information for a built-in sensitivity type, Structural looks at the data type, column name, and column values.
Custom sensitivity rules are based on the column data type and column name. For more information about custom sensitivity rules, go to .
Run the Structural sensitivity scan
Run, configure, and get the results of the sensitivity scan.
Set column sensitivity manually
Options to override the results of the sensitivity scan.
Built-in sensitivity types
Types of sensitive data that the sensitivity scan can identify.
Configure custom sensitivity rules
Set up rules to enable the scan to identify other sensitive columns based on the column data types and names.