During workspace creation, under Connection Type, select MongoDB.
The Source Settings section contains the connection information for the source database.
To provide the connection details, you can either populate the connection fields or use a connection string.
By default, Use connection string is off, and you provide the connection values in the individual fields:
In the Server field, provide the host name or IP address of the MongoDB instance.
In the Database field, provide the name of the MongoDB database.
In the Port field, provide the port number to connect to the server host.
In the Username field, provide the username of a MongoDB user in your authentication database.
In the Password field, provide the password of the specified MongoDB user.
In the Authentication Database field, provide the database where the MongoDB user that you authenticate with is stored. The default is often the admin database.
To test the connection to the source database, click Test Source Connection.
To use a connection string to connect to the source database:
Toggle Use Connection String to the on position.
In the Connection String field, provide a MongoDB connection string.
For the password, use <password> as a placeholder value.
In the Database field, provide the name of the MongoDB database.
In the Password field, provide the password to use to replace <password>.
To test the connection to the source database, click Test Source Connection.
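The placeholder substitution described in the steps above can be sketched in Python. This is an illustration only, not Structural's implementation: the function name `fill_password`, the host, and the user name are hypothetical. It assumes the password is percent-encoded when it is inserted, which the MongoDB connection string format requires for reserved characters such as `@` and `/`. The source does not say whether Structural does this encoding for you.

```python
from urllib.parse import quote_plus

PLACEHOLDER = "<password>"


def fill_password(connection_string: str, password: str) -> str:
    """Replace the <password> placeholder with a percent-encoded password."""
    if PLACEHOLDER not in connection_string:
        raise ValueError("connection string has no <password> placeholder")
    # Reserved characters such as '@', ':' and '/' must be percent-encoded
    # inside the credentials part of a MongoDB URI.
    return connection_string.replace(PLACEHOLDER, quote_plus(password))


# Hypothetical connection string; host and user are placeholders.
template = "mongodb://app_user:<password>@db.example.com:27017/?authSource=admin"
print(fill_password(template, "p@ss/word"))
```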
The Use Srv setting indicates whether you are connecting to a DNS seed list.
By default, the toggle is in the off position.
If you connect to a DNS seed list, then toggle the setting to the on position.
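As a rough illustration of how the individual connection fields and the Use Srv toggle correspond to a standard MongoDB connection string, consider the following sketch. The function `build_uri` and all of the sample values are hypothetical, and this is not necessarily how Structural assembles the connection internally. It relies on two facts of the MongoDB URI format: a DNS seed list uses the `mongodb+srv` scheme and does not allow a port, and the `authSource` option names the authentication database.

```python
from urllib.parse import quote_plus, urlencode


def build_uri(server, database, username, password,
              auth_db="admin", port=27017, use_srv=False):
    """Assemble a MongoDB URI from individual connection fields."""
    scheme = "mongodb+srv" if use_srv else "mongodb"
    # With a DNS seed list, the port comes from the SRV record,
    # so the URI must not contain one.
    host = server if use_srv else f"{server}:{port}"
    credentials = f"{quote_plus(username)}:{quote_plus(password)}"
    options = urlencode({"authSource": auth_db})
    return f"{scheme}://{credentials}@{host}/{database}?{options}"


# Hypothetical values for a self-hosted instance and for a DNS seed list.
print(build_uri("db.example.com", "sales", "app_user", "s3cret"))
print(build_uri("cluster0.example.net", "sales", "app_user", "s3cret", use_srv=True))
```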
The Enable SSL/TLS setting indicates whether to encrypt the source database authentication.
By default, the toggle is in the on position. We strongly recommend that you do not turn off this setting.
To indicate that Tonic Structural should trust the server certificate, toggle Trust Server Certificate to the on position.
To specify your own client certificate for authentication:
Click the expand icon for Client certificate settings.
For Client Cert, choose the client certificate file.
For Client Key, choose the key file for the client certificate.
For Root Cert, choose the root certificate file.
By default, data generation is not blocked as long as schema changes do not conflict with your workspace configuration.
To block data generation when there are any schema changes, regardless of whether they conflict with your workspace configuration, toggle Block data generation on schema changes to the on position.
The Destination Settings section contains the connection information for the destination database.
To copy the connection and authentication details from the source database:
Click Copy Settings from Source.
In the Password field, provide the password.
To test the connection to the destination database, click Test Destination Connection.
If you don't copy the details from the source database, then you can either populate the connection fields or use a connection string.
By default, Use connection string is off, and you provide the connection values in the individual fields:
In the Server field, provide the host name or IP address of the MongoDB instance.
In the Database field, provide the name of the MongoDB database.
In the Port field, provide the port number to connect to the server host.
In the Username field, provide the username of a MongoDB user in your authentication database.
In the Password field, provide the password of the specified MongoDB user.
In the Authentication Database field, provide the database where the MongoDB user that you authenticate with is stored. The default is often the admin database.
To test the connection to the destination database, click Test Destination Connection.
To use a connection string to connect to the destination database:
Toggle Use Connection String to the on position.
In the Connection String field, provide a MongoDB connection string.
For the password, use <password> as a placeholder value.
In the Database field, provide the name of the MongoDB database.
In the Password field, provide the password to use to replace <password>.
To test the connection to the destination database, click Test Destination Connection.
The Use Srv setting indicates whether you are connecting to a DNS seed list.
By default, the toggle is in the off position.
If you connect to a DNS seed list, then toggle the setting to the on position.
By default, SSL is enabled, and Enable SSL/TLS is in the on position. We strongly recommend that you do not turn off SSL.
To indicate that Structural should trust the server certificate, toggle Trust Server Certificate to the on position.
To specify your own client certificate for authentication:
Click the expand icon for Client certificate settings.
For Client Cert, choose the client certificate file.
For Client Key, choose the key file for the client certificate.
For Root Cert, choose the root certificate file.
Tonic Structural supports MongoDB server version 3.6 and above.
You can also use the MongoDB data connector to connect to an Amazon DocumentDB database.
MongoDB is an open source NoSQL database. MongoDB does not store data in tables according to a rigid schema. Instead, MongoDB stores data in documents that have a flexible schema or no schema at all.
Tonic Structural can work with MongoDB data that is either hosted on Atlas or is self-hosted.
When you configure subsetting in a MongoDB workspace, you can filter a target collection using either a percentage or a filter query.
For example, a customers collection contains the customer first name, last name, and membership status, as shown in the following sample records:
To filter the records to only include members, the filter query would be:
As another example, the following filter query checks whether the eighth character in an object ID called _id is equal to 9.
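The two kinds of filter described above might look like the following when written as Python dictionaries in standard MongoDB query syntax. This is a sketch under assumptions: the field name `membership_status` and the value `member` are hypothetical, and the `$expr` form is only one way to test a character of `_id`. It is not confirmed that Structural expects exactly these forms. The helper `eighth_char_matches` mirrors the second filter locally, so that the zero-based index is easy to check.

```python
import json

# Hypothetical field name and value; the real ones depend on your collection.
members_only = {"membership_status": "member"}

# $toString turns the ObjectId into its 24-character hex string.
# $substrCP takes a zero-based start index, so index 7 is the eighth character.
eighth_char_is_9 = {
    "$expr": {
        "$eq": [{"$substrCP": [{"$toString": "$_id"}, 7, 1]}, "9"]
    }
}


def eighth_char_matches(object_id_hex: str, expected: str = "9") -> bool:
    """Local check that mirrors the second filter on a hex ObjectId string."""
    return len(object_id_hex) >= 8 and object_id_hex[7] == expected


print(json.dumps(members_only))
print(json.dumps(eighth_char_is_9))
print(eighth_char_matches("507f1f79bcf86cd799439011"))  # True
```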
Required license: Professional or Enterprise
The data structure for MongoDB is different from that of the relational data connectors.
Instead of schemas, tables, and columns, a MongoDB database consists of collections and fields. A field might be an object that is made up of other fields. In other words, a collection can have a tree structure that is similar to a JSON document.
For workspaces that use MongoDB, the Tonic Structural application has the following differences.
On the following Structural views, the term "collection" replaces the term "table".
References to columns are also replaced:
On Privacy Hub, the protection status panels refer to "fields" instead of "columns".
On the Schema Changes view, the change lists refer to "paths" instead of "columns".
For MongoDB workspaces, Structural must scan each collection to determine the fields and data types within that collection. Until a scan is performed, you cannot configure the collection modes and generators.
For MongoDB workspaces, Privacy Hub includes an additional Latest Collection Scan section that shows the most recent time that a scan was performed on each scanned collection.
For more information, go to Performing scans on collections.
For MongoDB workspaces, there are no options to download a Privacy Report CSV or PDF.
MongoDB workspaces do not support workspace inheritance.
For MongoDB workspaces, there is no Database View or Table View. Instead, MongoDB workspaces have a Collection View.
This view allows you to perform the same functions as Table View, but the display is more like Database View. For more information, go to Using Collection View.
Collection mode is the term for table mode in MongoDB workspaces.
MongoDB only supports De-Identify, Truncate, and Preserve Destination modes.
MongoDB workspaces cannot use the following generators:
Algebraic
Alphanumeric Key
Array Character Scramble
Array JSON Mask
Array Regex Mask
Cross-Table Sum
CSV Mask
Event Timestamps
HTML Mask
JSON Mask
SIN
Timestamp Shift
URL
MongoDB workspaces do not support upsert.
For MongoDB workspaces, you cannot write the destination data to container artifacts.
For MongoDB workspaces, you cannot write the destination data to an Ephemeral snapshot.
For MongoDB workspaces, there is no option to run post-job scripts after a job.
You can create webhooks that are triggered by data generation jobs.
Required workspace permission: Run collection scan
When you first connect to a MongoDB database, Tonic Structural performs a scan to determine the available fields in each collection, the field types, and how prevalent the fields are. It performs this scan at the same time as the initial sensitivity scan.
For each collection, Structural creates a hybrid document, which is a superset of all of the fields contained in the collection documents.
By default, for each collection:
The scan includes all of the documents in the collection, and continues until the scan is finished.
Every unique path (field+data type) in the collection is added to the hybrid document.
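To make the idea of a hybrid document concrete, the following sketch collects every unique path (field plus data type) across a list of documents. It is an illustration only: the function names are hypothetical, Python type names stand in for BSON types, and arrays are not descended into. The real scan is likely to be more involved.

```python
def collect_paths(value, prefix=""):
    """Yield (path, type_name) for every field in a nested document."""
    if isinstance(value, dict):
        for key, child in value.items():
            path = f"{prefix}.{key}" if prefix else key
            yield path, type(child).__name__
            # Recurse so that fields of embedded objects get dotted paths.
            yield from collect_paths(child, path)


def build_hybrid(documents):
    """Return the sorted superset of (path, type) pairs across all documents."""
    hybrid = set()
    for doc in documents:
        hybrid.update(collect_paths(doc))
    return sorted(hybrid)


# Hypothetical sample documents with differing structures.
docs = [
    {"name": "Ada", "age": 36},
    {"name": "Lin", "age": "unknown", "address": {"city": "Oslo"}},
]
for path, type_name in build_hybrid(docs):
    print(path, type_name)
```

In this sample, `age` appears twice in the result, once as `int` and once as `str`, because each combination of field and type is a separate path.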
You can change the default scan behavior. To change the scan configuration, use the following environment settings. You can add these settings manually to the Environment Settings list on Structural Settings.
The following options control the number of documents that Structural scans in a collection.
These options allow you to limit the number of scanned documents when the additional documents will not add fields to the hybrid document. For large homogeneous collections, where all or most documents have the same structure, configuring these options can improve performance.
TONIC_DOCUMENT_SCAN_MAX_DOCS_COUNT
The maximum number of documents to scan for each schema in a collection. For example, if this is 10, then Structural scans up to 10 documents, and ignores the remaining documents. When this value is empty, Structural scans all of the documents.
TONIC_DOCUMENT_SCAN_MAX_TIME_SECONDS
The maximum amount of time in seconds to scan a schema. For example, if this is 360, then Structural scans a schema for up to 360 seconds. When this value is empty, Structural continues the scan until it is complete.
If you set both options, then the scan completes when it reaches either limit. For example, if the maximum document count is 10 and the maximum scan time is 360 seconds, then the scan completes either after 10 documents or after 360 seconds, whichever comes first.
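The "whichever comes first" behavior can be sketched as a loop with two optional limits. This is a simplified model, not Structural's code: the function `scan` and its parameters are hypothetical stand-ins for the two environment settings, and `None` represents an empty setting.

```python
import time


def scan(documents, max_docs=None, max_seconds=None, clock=time.monotonic):
    """Count how many documents are scanned before either limit is reached."""
    start = clock()
    scanned = 0
    for _doc in documents:
        # Stop when the document limit is reached.
        if max_docs is not None and scanned >= max_docs:
            break
        # Stop when the time limit is reached.
        if max_seconds is not None and clock() - start >= max_seconds:
            break
        scanned += 1  # a real scan would inspect the document here
    return scanned


print(scan(range(1000), max_docs=10, max_seconds=360))  # 10
print(scan(range(1000)))  # 1000, no limits set
```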
Typically, the number of unique fields in a collection is small relative to the number of documents. However, in some cases the number of fields is similar to or greater than the number of documents. This most commonly occurs when documents have "data as keys", such as keys that are ObjectIds, UUIDs, or incrementing integers.
In these cases, adding every unique field to the hybrid document can result in a large hybrid document that has an undesirable structure.
Structural offers configuration options to "collapse" fields within the hybrid document. This shrinks the size of the hybrid document. It also allows you to assign a generator to the collapsed group instead of to each unique key.
By default, Structural does not collapse fields.
To enable this, set TONIC_MONGO_OBJECT_ID_COLLAPSE_THRESHOLD to the number of ObjectId keys that an object can contain before Structural collapses the object schema into a single key.
For example, if this is 10, then any object that has 10 or more ObjectId keys is collapsed into a single key.
A negative value indicates to not collapse the keys. The default value is -1.
To enable Structural to collapse fields, you provide a regular expression to identify the fields that can be collapsed into the same field. You then configure the number of matches that must exist before Structural collapses the fields.
To configure how the fields are collapsed:
TONIC_DOCUMENT_COLLAPSE_FIELDS_REGEX
The regular expression that identifies the fields that can be collapsed into a single field. By default, this value is empty.
TONIC_DOCUMENT_COLLAPSE_FIELDS_REGEX_THRESHOLD
The number of fields that match the regular expression before Structural collapses the fields into a single field. For example, if this is 5, then once Structural finds 5 fields that match the regular expression, it collapses all of the matching fields into a single field. A negative value indicates to not collapse the fields. The default value is -1.
For example:
To collapse keys that are integer values, use the regular expression [0-9]+ or \d+
To collapse keys that are UUIDs, use the regular expression [0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}
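The interaction between the regular expression and the threshold can be modeled as follows. This is a sketch under assumptions: the function `collapse_keys` and the `<collapsed>` key name are hypothetical, the collapsed group is represented as a list of the original values, and the pattern is assumed to match the whole key. The source does not state whether Structural uses whole-key or partial matching.

```python
import re


def collapse_keys(obj, pattern, threshold, collapsed_key="<collapsed>"):
    """Collapse keys that match the pattern once enough of them exist."""
    if threshold < 0:
        return dict(obj)  # a negative threshold disables collapsing
    matching = {k for k in obj if re.fullmatch(pattern, k)}
    if len(matching) < threshold:
        return dict(obj)  # not enough matches, keep every key
    result = {k: v for k, v in obj.items() if k not in matching}
    # All matching keys share one entry, which could carry one generator.
    result[collapsed_key] = [v for k, v in obj.items() if k in matching]
    return result


# Hypothetical object that uses incrementing integers as keys.
scores = {"1": 10, "2": 20, "3": 30, "label": "x"}
print(collapse_keys(scores, r"\d+", threshold=3))
print(collapse_keys(scores, r"\d+", threshold=5))
```

With a threshold of 3, the three integer keys collapse into one entry. With a threshold of 5, the object is returned unchanged.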
On Privacy Hub, the Latest Collection Scan table shows the most recent scans on each scanned collection.
The Build Schema option runs a new scan on the collection.
When the source database has a new collection, then on Collection View, you are prompted to run a scan either on that collection or on all collections.
For MongoDB, Collection View replaces Database View and Table View. From Collection View, you can view the fields in a selected collection. You can then assign a collection mode to the collection, and assign generators to fields.
From the Collection dropdown list, select the collection to view.
The collection mode is the term used in MongoDB for table mode. The collection mode determines at the collection level how the data for the collection is used to generate the destination database.
By default, the collection mode is De-Identify. In this mode, Structural uses the assigned generators to transform the source database into the destination database.
For MongoDB, the only other options are Truncate and Preserve Destination. Truncate means that only the collection structure is included in the destination database. The collection has no data in the destination database. Preserve Destination means that Structural does not change the data that is currently in the destination database.
To assign the collection mode:
Click the Collection Mode dropdown list.
On the panel, click the current collection mode.
From the drop-down list, select the mode to use.
You can view a collection either as a hybrid document or as single documents. From the View dropdown list, select the view to use.
The default view is Hybrid Document. For the hybrid document view, the key list reflects all of the permutations of every field from every document. For example, a field might sometimes be a datetime value and sometimes a string. Hybrid document view lists both types.
Single Document view displays a single document at a time. You can then page through up to 100 documents. For each document, you see the structure for that particular document.
For each field, Collection View always displays:
The field name and type.
For fields that you configured as primary or foreign keys, a key icon.
The assigned generator.
An example value. For the hybrid view, you can use the magnifying glass icon to display additional example values.
For the hybrid document view, there is also a Field Freq column. Field Freq shows the percentage of documents that contain that permutation of field and type. For example, you might see that a field is Null 33% of the time and contains a numeric value 67% of the time. Or a field value is an Int32 value 3% of the time and an Int64 value 6% of the time. The percentages apply to the first 100 documents.
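A percentage like Field Freq could be computed along these lines. This is only a sketch: the function `field_frequencies` is hypothetical, it looks at top-level fields only, and Python type names stand in for BSON types such as Null, Int32, and Int64.

```python
from collections import Counter


def field_frequencies(documents, sample_size=100):
    """Percentage of sampled documents that contain each (field, type) pair."""
    sample = documents[:sample_size]
    if not sample:
        return {}
    counts = Counter()
    for doc in sample:
        for field, value in doc.items():
            counts[(field, type(value).__name__)] += 1
    return {key: round(100 * n / len(sample)) for key, n in counts.items()}


# Hypothetical sample: the field is null in one of three documents.
docs = [{"score": None}, {"score": 4}, {"score": 7}]
print(field_frequencies(docs))
# {('score', 'NoneType'): 33, ('score', 'int'): 67}
```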
The Preview toggle at the top right of Collection View allows you to choose whether to display original source data or the transformed data. You can switch back and forth to see exactly how Tonic Structural transforms the data based on the collection and field configuration.
By default, the Preview toggle is in the on position, and the displayed data reflects the selected collection mode and the assigned generators. For collections that use Truncate mode, the preview data is empty. Truncated collections do not have data in the destination database.
To display the original source data, toggle Preview to the off position.
In the single document view, you can filter the fields by either the field name or the field value.
In the hybrid document view, you can filter the fields based on either the field name or field properties.
You can filter single document view to only display fields that have specific text in either the field name or the field value.
To filter by value, toggle Search by Value to the on position.
After you select the filter type, in the search field, type text that is in the field name or value. As you type, Structural filters the list to only include fields that contain the filter text.
To filter hybrid view by field name, in the search field, begin typing text that is in the field name. As you type, Structural filters the list to only include fields with names that include the filter text.
From the hybrid document view, you can filter the fields based on field properties.
To display the Filters panel, click Filters.
To search for a filter or a filter value, in the search field, start to type the value. The search looks for text in the individual settings.
To add a filter, depending on the filter type, either check the checkbox or select a filter option. As you add filters, Structural applies them to the field list.
Above the list, Structural displays tags for the selected filters.
To clear all of the currently selected filters, click Clear All.
The Filters panel in hybrid view includes the following filters.
An at-risk field:
Is marked as sensitive
Is assigned the Passthrough generator.
To only display at-risk fields, on the Filters panel, check At-Risk Field.
When you check At-Risk Field, Structural adds the following filters under Privacy Settings:
Sets the sensitivity filter to Sensitive
Sets the protection status filter to Not protected
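The at-risk definition is the combination of two conditions, which the following sketch expresses as a predicate. The dictionary keys and the sample fields are hypothetical; they only model the two properties named above.

```python
def is_at_risk(field: dict) -> bool:
    """A field is at risk when it is sensitive and still uses Passthrough."""
    return bool(field["sensitive"]) and field["generator"] == "Passthrough"


# Hypothetical field list.
fields = [
    {"name": "email", "sensitive": True, "generator": "Passthrough"},
    {"name": "first_name", "sensitive": True, "generator": "Name"},
    {"name": "order_total", "sensitive": False, "generator": "Passthrough"},
]
print([f["name"] for f in fields if is_at_risk(f)])  # ['email']
```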
You can filter the fields based on the field sensitivity.
On the Filters panel, under Privacy Settings, the sensitivity filter is by default set to All, which indicates to display both sensitive and non-sensitive fields.
To only display sensitive fields, click Sensitive.
To only display non-sensitive fields, click Not sensitive.
Note that when you check At-Risk Field, Structural automatically selects Sensitive.
You can filter the fields based on whether they have any generator other than Passthrough assigned.
On the Filters panel, under Privacy Settings, the field protection filter is by default set to All, which indicates to display both protected and not protected fields.
To only display fields that have an assigned generator, click Protected.
To only display fields that do not have an assigned generator, click Not protected.
Note that when you check At-Risk Field, Structural automatically selects Not protected.
When Structural detects that a field is sensitive, it can also determine a recommended generator.
For example, when it detects a name value, it also recommends the Name generator.
You can filter the fields to display the fields that have recommended generators.
On the Filters panel, under Recommended Generators, check the checkbox next to each recommended generator whose fields you want to display.
You can filter the fields by the field data type. For example, you might only display fields that contain either numeric or integer values.
To only display fields that have specific data types, on the Filters panel, under Database Data Types, check the checkbox for each data type to include.
The list of data types only includes data types that are present in the currently displayed fields and that are compatible with other applied filters.
To search for a specific data type, in the Filters search field, begin to type the data type.
When the source database schema changes, you might need to update the configuration to reflect those changes. If you do not resolve the schema changes, then the data generation might fail. The data generation fails if there are unresolved conflicting changes, or if you configure Structural to always fail data generation when there are any unresolved changes.
To only display fields that have unresolved schema changes, on the Filters panel, check Unresolved Schema Changes.
For detected sensitive fields, the sensitivity type indicates the type of data that was detected. Examples of sensitivity types include First Name, Address, and Email.
To only display fields that contain specific sensitivity types, on the Filters panel, under Sensitivity Type, check the checkbox for each sensitivity type to include.
The list of sensitivity types only includes sensitivity types that are present in the currently displayed fields.
To search for a specific sensitivity type, in the Filters search field, type the sensitivity type.
When the Structural sensitivity scan identifies a value as belonging to a sensitivity type, it also determines how confident it is in that determination.
You can filter the fields based on the confidence level.
To only display fields that have a specific confidence level, on the Filters panel, under Sensitivity confidence, check the checkbox next to each confidence level to include.
You can filter the field list to indicate whether to include:
Fields that are not primary or foreign keys.
Fields that are foreign keys.
Fields that are primary keys.
On the Filters panel, under Field Type:
To display fields that are neither a primary key nor a foreign key, check Non-keyed.
To display fields that are primary keys, check Primary key.
To display fields that are foreign keys, check Foreign key.
The commenting feature requires an Enterprise license.
You can add comments to fields. For example, you might use a comment to explain why you selected a particular generator or marked a field as sensitive or not sensitive.
If a field does not have any comments, then to add a comment:
Click the comment icon.
In the comment field, type the comment text.
Click Comment.
When a field has existing comments, the comment icon is green. To add comments:
Click the comment icon. The comments panel shows the previous comments. Each comment includes the comment user and timestamp.
In the comment field, type the comment text.
Click Reply.
On the field configuration panel, the sensitivity toggle at the top right indicates whether the field is marked as sensitive.
To mark a field as sensitive, toggle the setting to the Sensitive position.
To mark a field as not sensitive, toggle the setting to the Not Sensitive position.
You can assign a generator to each combination of field and type. For example, depending on the document, the data type for a field might be either string or integer. You can indicate to use the Character Scramble generator when the field type is a string and the Random Integer generator when the field type is integer.
In hybrid document view, the Null type reflects when the field value is Null. You do not assign a generator to it.
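Conceptually, assigning generators per combination of field and type works like a lookup keyed on both. The sketch below is illustrative only: `character_scramble` and `random_integer` are simplified stand-ins written for this example, not the actual Structural generators, and the field name `zip` is hypothetical.

```python
import random
import string


def character_scramble(value: str, rng: random.Random) -> str:
    """Stand-in: swap letters and digits for random ones, keep punctuation."""
    out = []
    for ch in value:
        if ch.isdigit():
            out.append(rng.choice(string.digits))
        elif ch.isalpha():
            out.append(rng.choice(string.ascii_letters))
        else:
            out.append(ch)
    return "".join(out)


def random_integer(value: int, rng: random.Random) -> int:
    """Stand-in: return an unrelated random integer."""
    return rng.randint(0, 1_000_000)


# One generator per (field, type) combination.
GENERATORS = {
    ("zip", "str"): character_scramble,
    ("zip", "int"): random_integer,
}


def transform(field, value, rng):
    if value is None:
        return None  # Null values have no generator
    generator = GENERATORS.get((field, type(value).__name__))
    # Fields without an assigned generator pass through unchanged.
    return generator(value, rng) if generator else value


rng = random.Random(0)
print(transform("zip", "AB-12", rng))
print(transform("zip", 12345, rng))
print(transform("zip", None, rng))
```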
To assign a generator:
Click the generator value for the field.
On the configuration panel, from the Generator Type dropdown list, select the generator.
By default, Structural retrieves 100 documents. It then uses the data in these documents to populate example values in the hybrid document.
For sparsely populated collections, where less common fields are not present in those 100 documents, Structural retrieves extra documents until it has example values for all fields. For very sparsely populated collections, this might cause the collection view to load slowly, because it must retrieve many documents.
To disable examples for sparse collections, set TONIC_MONGO_DISABLE_EXTRA_EXAMPLES to true. You can add this setting manually to the Environment Settings list on Structural Settings.
When this setting is true, fields that do not have a retrieved value use a dummy default value that is based on the data type.
For more information about schema changes, go to .
Configure the generator options. For details about the available configuration options for each generator, go to the .