Release notes provide a brief description of changes to Tonic Structural, including:
New and improved features
Fixes to issues that affect customers
The release notes are divided into groups of roughly 50 releases, with more recent releases at the top. The grouping is purely to manage the volume of releases. There is no other significance to the groups.
Structural averages approximately one release per day. A higher or lower volume of updates results in more or fewer releases.
A new release notes entry is added each week, and includes the release notes for all of the versions that were released that week.
April 12, 2024
During the free trial, Structural now displays next step highlights to indicate the next recommended action. When you hover over the recommended action, Structural displays an explanatory tooltip.
A new environment setting, TONIC_DB_MAX_POOL_SIZE, sets the connection pool size for the Structural application database. The default value is 3.
Fixed an issue where the preview data in the JSON Mask generator editor did not respect the applied table filter.
File connector
For workspaces that connect to Amazon S3, you can now specify different credentials for the source and output locations.
For cloud storage workspaces, fixed a regression in Tonic v1146 where the file explorer sometimes displayed incorrect results and files could not be added.
You can now add Parquet files to a file connector file group, either from cloud storage or a local file system. You cannot select the following Parquet file types: HalfFloat, Struct, Union, Dictionary, Map, List, FixedSizeList, or arrays of any type. There also is no file preview for Parquet files.
MongoDB
Fixed an issue that prevented the use of regex conditions in the conditional generator.
You can now select the UUID Key generator as a sub-generator when configuring a composite generator.
PostgreSQL
Fixed an issue where Preserve Destination tables sometimes were not preserved when data generation failed.
April 5, 2024
Fixed an issue in Table View where characters were sometimes represented inaccurately. For example, a lowercase x would become a multiplication symbol.
Fixed an issue where data generation to Tonic Ephemeral Cloud failed with the error “Ephemeral URL not found”.
For a schema change that adds a new column, both the Schema Changes view and the API response now include the data type for the new column.
For the Structural free trial, Structural now displays a checklist for each workspace. There are slightly different checklists for database-based and file connector workspaces.
Added support to run Structural in dual-stack networks and IPv6-only network environments.
On the workspace details view, fixed an issue where an Ephemeral API key appeared to be populated when no value was provided.
Fixed an issue that caused data generation to Ephemeral to fail with “Ephemeral output must be configured”.
Fixed an issue where data generation to Ephemeral failed after a first successful run.
Amazon Redshift
You can now enable TLS for workspace data connections.
MySQL
Fixed an authentication issue that prevented output from being written to an Ephemeral snapshot.
SQL Server
You can now configure a SQL Server workspace to write output to an Ephemeral snapshot.
March 29, 2024
Writing output to a Tonic Ephemeral snapshot - For database types that Tonic Ephemeral supports (currently PostgreSQL and MySQL), you can now write the output to an Ephemeral user snapshot. This replaces the option to write the output to an Ephemeral database, except for workspaces in the Structural free trial. In Ephemeral, you can use the user snapshot to start new Ephemeral databases.
Other updates
For the UUID Key generator, added a new configuration option, Preserve Version and Variant. By default, the setting is turned off. When turned on, the version and variant bits from the source UUID are preserved in the output value. For the API, the new setting is preserveVersionAndVariant.
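As a rough illustration of what preserving the version and variant bits means, the following Python sketch copies those bits from a source UUID into a replacement value. It assumes the standard RFC 4122 bit layout (version in bits 76-79, variant in the top bits of octet 8). The function name and masks are hypothetical and are not taken from Structural; this is not how the UUID Key generator is implemented.

```python
import uuid

# Bit positions follow the RFC 4122 layout used by Python's uuid module.
VERSION_MASK = 0xF000 << 64   # 4 version bits (bits 76-79)
VARIANT_MASK = 0xE000 << 48   # top 3 bits of octet 8 (bits 61-63)
PRESERVED_MASK = VERSION_MASK | VARIANT_MASK


def replace_preserving_version_and_variant(source: uuid.UUID,
                                           replacement: uuid.UUID) -> uuid.UUID:
    """Hypothetical helper: take every bit from `replacement` except the
    version and variant bits, which are copied from `source`."""
    kept = source.int & PRESERVED_MASK
    rest = replacement.int & ~PRESERVED_MASK
    return uuid.UUID(int=kept | rest)


# Usage: a version 1 source keeps version 1 even when replaced by random bits.
source = uuid.UUID("c232ab00-9414-11ec-b3c8-9e6bdeced846")
masked = replace_preserving_version_and_variant(source, uuid.uuid4())
print(masked.version, masked.variant)
```

With the setting turned off, the equivalent of this sketch would simply return the replacement value, so the output could report a different version than the source.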
In the Tonic Structural free trial, the sample workspace now by default writes the output to a Tonic Ephemeral database.
Fixed an issue where vertical scrolling was sometimes blocked.
You can now configure the allowed SSL/TLS protocols and ciphers on the Tonic Web Server. To configure the protocols and ciphers, use the environment settings TONIC_WEB_SERVER_TLS_PROTOCOLS and TONIC_WEB_SERVER_TLS_CIPHERS.
File Connector
Data generation jobs no longer fail when they encounter a configured file group file that no longer exists in the source cloud storage location.
For local file workspaces, fixed a regression from Tonic v1136 where users could not upload additional local files into a file group.
MongoDB
Fixed an issue that caused generations to fail when documents contained empty arrays or document fields.
MySQL
Improved job warnings around missing tablespaces in the destination database.
March 22, 2024
The Structural API now includes endpoints to get and set the assigned table modes and table filters for a workspace.
Fixed an issue where the workspace audit trail displayed generator preset events that occurred before the workspace was created.
Fixed an issue where an error was returned when users tried to export selected files from a file group.
Improved error message when Structural cannot write output to Ephemeral because Ephemeral does not have a compatible base image for the database.
Databricks
When a workspace is configured with an unsupported version of Databricks, the error message now suggests the supported Databricks versions.
Updated the default spark_version to 14.3.
Google BigQuery
Fixed an issue with applying some generators to numeric columns on large tables.
Oracle
Fixed an issue where rows that contained NULL values in a VARCHAR2(1) column were dropped during data generation.
March 15, 2024
Output to a Tonic Ephemeral database - Tonic Ephemeral is a separate Tonic.ai product that allows you to create temporary databases. On Tonic Cloud, for data connectors that Ephemeral supports (currently PostgreSQL and MySQL), you can configure the workspace to write the destination data to an Ephemeral database. This is the default option for data connectors that Ephemeral supports.
The database belongs to your Ephemeral account. If you do not already have an Ephemeral account, then Tonic automatically creates a two-week Ephemeral free trial account for you. The Tonic data generation job details provide access to the database connection details.
Free trial checklist - During the free trial, the sample workspace now includes a checklist to help users get through the required steps to complete their first data generation.
Other updates
Free trial users can no longer use a public email address to create an account.
Fixed an issue where password reset links led to a blank page.
Fixed an issue where pay-as-you-go users would see the countdown for a free trial.
In the sample workspace, fixed an issue where a faulty destination database template caused an error when a user tried to update it.
Google BigQuery
Tonic can now de-identify snapshot tables.
Oracle
For databases that use non-default database character sets, fixed an issue where rows that contained character data sometimes failed to be written to the destination database.
For databases that use non-default database character sets, fixed an issue where passthrough CLOB data types had extra bytes inserted when they were copied to the destination database.
PostgreSQL
If you do not use extensions, then the destination database no longer requires a super user.
Snowflake
Subsetting is now supported on Snowflake workspaces. The table filtering option is still available.
March 8, 2024
You can now manually add selected environment settings to the Environment Settings list on Tonic Settings.
Improved the performance of data previews in the Tonic application.
For workspaces that write output data to a container repository, fixed an issue that prevented GAR credentials from being saved.
Google BigQuery
Improved performance for data preview.
Snowflake
The workspace configuration now includes a Trust Server Certificate option for the source and destination connections. When the option is enabled, certificate revocation checks are bypassed.
Data generation jobs are now more resilient when they encounter views that have dependencies that Tonic does not have permissions for.
Destination privileges can now exclude schema creation. If the user does not have schema creation permissions, then the schemas must already exist in the destination database.
SQL Server
The upsert function now supports tables that have identity columns but do not have primary keys.
March 1, 2024
New Db2 for LUW data connector - Tonic now has a data connector for IBM Db2 for Linux, Unix, and Windows (Db2 for LUW). Tonic supports Db2 for LUW version 11.5. For more information, go to Db2 for LUW.
Other updates
When the AI Synthesizer is used in a workspace, Tonic now verifies before data generation that the AI Synthesizer does not use more than the maximum allowed categories.
Amazon EMR
Improved data generation performance when using Spark 2.4.0-3.2.x.
Databricks
Fixed an issue in Databricks 11.3+ where a generator that was consistent on another column received modified input values if a generator was applied to the consistency column. Generators that are consistent on another column now correctly receive unmodified input values.
Improved data generation performance when using Spark 2.4.0-3.2.x.
Google BigQuery
You can now de-identify views, except for views that are written in Google Legacy SQL. On the workspace settings view, to enable de-identification of views, toggle De-identify Views to the on position. You can then assign table modes and generators to the views. In the table lists, views are identified by (view) after the view name.
For each view, Tonic:
Writes the de-identified data to a table called <view name>_tonic_table.
Creates a view that has the same name and metadata as the view in the source data, but is populated from the destination table.
Fixed an issue where Passthrough tables that had table constraints failed to copy to the destination database.
Fixed an issue where cloned tables did not appear in the Tonic application, but caused job failures. Cloned tables now display within Tonic and can be managed in the same way as any other table.
Oracle
For Oracle 12, improved the schema remapping across multiple object types and configurations.
Fixed an issue where Tonic failed to copy database link objects that were configured using passwords. Tonic no longer processes database links. If required, users must manually create database links in the destination schema.
Fixed an issue where tables that contained computed columns dropped rows during data generation.
Snowflake
You can now assign the JSON Mask generator to Snowflake variant columns.
February 23, 2024
Privacy Report PDF file
We added a new Privacy Report PDF that you can download from Privacy Hub and the job details view. The Privacy Report PDF contains a summary of the privacy ranking values, visualizations to summarize the workspace column privacy rankings based on the applied generators, and a summary table that contains the .csv Privacy Report data.
To accommodate the new file, on Privacy Hub and the job details page, the available downloads are combined into a Download menu.
Assigning recommended generators from Database View
On Database View, when an unprotected column has a recommended generator, the generator name tag now displays the type of sensitive data that was detected.
When you click the generator name tag, Tonic displays a panel that displays the sensitivity type, the recommended generator, and sample source and output data based on the recommended generator. The panel provides options to either apply or ignore the recommendation.
Other updates
Fixed an issue where changing the configuration of a generator preset did not accurately update the count of occurrences of the preset.
Oracle
Fixed an issue with Oracle 12.1 and 12.2 where constraints that had an index creation statement failed to apply.
Tonic can now de-identify external tables. External tables display in the Tonic application and can be assigned table modes and generators. During data generation, external tables are created as relational tables in the destination schema.
April 26, 2024
Added a new API endpoint to resolve all schema changes in a workspace. You can choose whether to resolve only conflicting changes, only notifications, or all of the schema changes.
Fixed an issue that caused the Tonic Structural PyML Service to be unreachable in IPv4-only containers.
Added a new conflicting schema change when a column that has an assigned generator becomes a foreign key. Foreign key columns must inherit the generator from the primary key.
Structural can now generate data with subsetting when a primary key table is truncated, as long as the foreign keys that reference the primary key are nullable.
Amazon Redshift
Added support to pass through varbyte, geometry, and hllsketch types.
File connector
For CSV, XML, and JSON files, fixed issues with the data preview in Database View and Table View. The preview no longer includes extra rows, and the preview now correctly reflects the Skip first N rows setting.
Fixed the validation of file groups that only contained compressed files.
Snowflake
For subsetting, added support for virtual foreign keys.
SQL Server
Fixed an issue that caused data generations to fail for versions of SQL Server older than SQL Server 2016.
April 19, 2024
For the notifications image, replaced alpine with ubuntu.
File connector
For a local files workspace, the job details view for a successful data generation now includes an option to download the transformed files that were produced by that job.
For workspaces that use files from cloud storage, you can now include prefix patterns in the file group definition. You can also provide file extension filters. The file group then includes all of the files that match a prefix pattern and the file extension filters. The file group details now include the content type, file extension filter, and prefix patterns.
For cloud storage workspaces, users who do not have permission to view all buckets can once again specify the bucket to view.
For cloud storage workspaces, added a configuration option to only process new files during data generation. For existing file groups, the new configuration is off by default. For new file groups, the configuration is enabled by default.
MongoDB
Fixed an issue where environment setting updates from the UI required a restart to take effect.
MySQL
For a workspace that is configured to write output to Ephemeral, you can now provide a custom configuration file.
Snowflake
Fixed an issue where the Subsetting tab disappeared after you edited a workspace.
SQL Server
For versions 2017, 2019, and 2022, you can now configure workspaces to write output to a container repository.
Releases V1051 - V1053 were removed from quay because of a regression.
December 15, 2023
Enable administration functions in Tonic Cloud - For Tonic Cloud customers, the new Account Admin permission set provides access to Tonic administration functions for their organization. The Account Admin can reset passwords, delete users, copy and share all workspaces, and download the usage report. The Account Admin permission set is initially granted to the first user in the organization.
Databricks
For the Test Cluster Connection option, added an error message when the specified cluster cannot describe all the tables in the specified catalog + database.
Improved error messaging for the Test Cluster Connection option.
File connector
When you use the API to add files to a file group, Tonic now validates that the file exists and does not duplicate a file that is already in the group.
MySQL
Added a new environment setting, TONIC_MYSQL_MAX_CONCURRENT_INDEX_CREATION, to limit the number of concurrent indexes that are created. The default value is 0, which indicates that there is no limit.
SQL Server
External tables are no longer displayed on the application or carried over to the destination database.
December 8, 2023
When you select the option to write destination data to container artifacts, you can now use Google Artifact Registry (GAR) authorization using Google Cloud Platform (GCP) service account keys.
For the JSON Mask and XML Mask generators, fixed the data preview for JSON or XML field samples that are larger than 120MB by generating a smaller subset of the field.
The Name generator now supports consistency with other columns.
Added new API endpoints to retrieve and set table replacements. These new endpoints are compatible with workspaces for data connectors that do not have schemas, such as Spark-based databases and the file connector. The existing endpoints, which require you to provide a schema, eventually will be deprecated.
Amazon EMR
You can now write data generation output either to a single, static location or to a job-specific folder. Previously, all data generation output was exported to a job-specific folder.
File connector
You can now use the Continuous generator in file connector workspaces.
MySQL
When writing destination data to a container repository, you can now provide the content of a my.cnf file that contains custom configuration.
Fixed an issue where indexes that use expressions in their definition caused an error during data generation.
Oracle
To fix an issue where turning off database links caused triggers to break, we now run the ALTER TRIGGER statements after the TRIGGERS are validated.
PostgreSQL
For Azure PostgreSQL, fixed an issue with the destination database connection test.
For Azure PostgreSQL, fixed the connection test check for schemas that are not admin owned.
SQL Server
Data generation jobs that are canceled during the schema gathering phase now cancel more promptly.
Snowflake
On Tonic Cloud, the Snowflake with Azure option is now enabled correctly.
Free trial users can now use Snowflake on AWS and Snowflake on Azure.
December 1, 2023
Added an environment setting, TONIC_DELETE_COLUMN_SCHEMA_ON_WORKSPACE_DELETE. If the setting is true, then when a workspace is deleted, Tonic also deletes the associated rows from the ColumnSchemas table in the Tonic application database.
The new environment setting TONIC_NOTIFICATION_SMTP_TRUST_CERTIFICATE indicates whether to allow the SMTP server certificate to be trusted.
Improved the performance of previewing data in Privacy Hub.
Fixed an issue where SSO groups were not removed when the value of TONIC_SSO_GROUP_FILTER_REGEX changed in a way that excluded previously imported groups. The removed groups are removed from any workspaces that they were granted access to.
For the Timestamp Shift Generator, added Month and Year as options for the date part to use to set the allowed range.
When writing data to container artifacts, Tonic now first shuts down the temporary database before it begins to write data to the container.
Amazon EMR
Fixed an issue where, when the Athena workgroup used engine version 3, you could not preview data for tables that contained struct data types. Tonic now supports both Athena engine versions 2 and 3.
Fixed an issue where Professional plan customers could not save workspaces.
Fixed an issue where the Struct Mask generator editor did not display a preview of the value.
Enabled support for the Email generator as a sub-generator within the Struct Mask generator.
Databricks
Fixed an issue where the Struct Mask generator editor did not display a preview of the value.
The FNR generator is now supported.
Enabled support for the Email generator as a sub-generator within the Struct Mask generator.
MongoDB
Fixed an issue where erroneous schema change items were reported when the frequency of a field changed.
Oracle
Fixed an issue where, when Oracle database links were not enabled (TONIC_ORACLE_DBLINK_ENABLED is false), privileges were not copied from the source to the destination.
November 22, 2023
Amazon EMR
Fixed an issue where jobs failed if Tonic could not retrieve the tracking link for the EMR Step logs. To ensure maximum compatibility with all AWS regions, tracking links are now valid for 7 days instead of 14 days.
Fixed an issue with assigning the Passthrough generator as a sub-generator for the Regex Mask generator.
Databricks
Fixed an issue with assigning the Passthrough generator as a sub-generator for the Regex Mask generator.
MySQL
On destination databases, Tonic no longer checks for REPLICATION CLIENT and REPLICATION SLAVE grants.
Spark SDK
Fixed an issue with assigning the Passthrough generator as a sub-generator for the Regex Mask generator.
Spark with Livy
Fixed an issue with assigning the Passthrough generator as a sub-generator for the Regex Mask generator.
November 17, 2023
When an email notification contains a link to a comment, clicking the link now correctly navigates to and displays the comment.
Added a flag to mark log files that might contain sensitive information.
You can now assign the Timestamp Shift generator to primary key columns, including columns that are part of a composite primary key.
When writing destination data to a container repository, based on the size of the source database, Tonic attaches resource requests to the datapacker pod to verify that the cluster has sufficient resources for the data.
Tonic now supports writing destination data to container registries on Amazon Elastic Container Registry (Amazon ECR).
Fixed an issue where users could not assign access to permission sets when they did not have access to create and manage custom permission sets.
For foreign keys that have multiple primary keys, Tonic now prevents data generation if the assigned generation configuration for the primary key columns is inconsistent.
Standardized the display format of the heading area of the workspace management views.
File connector
Fixed an issue with deleting multiple files from a file group.
MongoDB
From Privacy Hub, you can now use the option to review and apply recommended generators to all of the detected sensitive fields.
When there are schema changes that are not yet scanned, such as a new collection or new fields in a collection, they are now handled as Passthrough. The schema changes no longer cause data generation to fail.
MySQL
You can now configure a MySQL workspace to write the destination data to a container repository.
PostgreSQL
Fixed an issue where, when an excluded schema did not exist in the source database, the test connection option returned an error.
Customers are now by default automatically enrolled in Data Pipeline V2 mode, and do not see the option to enable or disable it. The ability to enable or disable Data Pipeline V2 is granted to individual customers as needed.
Snowflake
Fixed an issue where, when an excluded schema did not exist in the source database, the test connection option returned an error.
November 10, 2023
API endpoints for subset configuration - The Tonic API now includes endpoints for subsetting configuration. You can use the endpoints to retrieve the subsetting configurations for a workspace, update subsetting configuration, and remove subsetting configuration. A subsetting configuration identifies a table as either a target table (percentage or WHERE clause) or a lookup table.
Improved how Tonic identifies values as names, to reduce false positives.
For upsert data generation, fixed an issue that caused failures on tables that contain foreign keys but no primary keys.
File connector
Sensitivity scans no longer discard the most recently generated set of downloadable generated files, which are generated from files uploaded from a local file system.
MongoDB
Fixed an issue that caused duplicate or inaccurate schema issues to be displayed on Schema Changes view.
Oracle
Tonic now validates destination database objects, such as views and packages, that were invalidated by the data generation. A warning is issued for objects that fail to validate. Validation failure can be caused by insufficient permissions for the Tonic user on the destination schema.
PostgreSQL
For Data Pipeline V2 data generation, fixed a race condition that could cause data generation to fail.
November 3, 2023
Toggle between database server and container repository - On the workspace configuration view, if the data connector supports writing to a container repository, you can now switch between writing to a database server and writing to a container repository. Tonic saves the information you provide for each option.
Workspace override for statistics seed - From the workspace configuration view, you now can override the Tonic-wide statistics seed that is set as the value of the TONIC_STATISTICS_SEED environment setting. You can either provide a custom seed value for the workspace, or disable consistency across data generation runs for the workspace. Consistency also applies across workspaces that have the same custom seed value. For more information, go to Enabling consistency across runs or multiple databases.
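The effect of a shared seed can be pictured with a keyed hash: the same source value and the same seed always map to the same output, regardless of which run or workspace performs the mapping. The sketch below is a conceptual illustration only. The function name, the name list, and the use of HMAC-SHA256 are assumptions made for the example and do not describe how Structural derives values from the statistics seed.

```python
import hashlib
import hmac

# Hypothetical pool of replacement values, for illustration only.
REPLACEMENT_NAMES = ["Alice", "Bob", "Carol", "Dave", "Erin"]


def consistent_name(source_value: str, seed: int) -> str:
    """Map a source value to a replacement deterministically for a given seed."""
    digest = hmac.new(str(seed).encode("utf-8"),
                      source_value.encode("utf-8"),
                      hashlib.sha256).digest()
    index = int.from_bytes(digest[:8], "big") % len(REPLACEMENT_NAMES)
    return REPLACEMENT_NAMES[index]


# Two "workspaces" that share seed 42 agree on the output for the same input.
workspace_a = consistent_name("John Smith", seed=42)
workspace_b = consistent_name("John Smith", seed=42)
print(workspace_a == workspace_b)
```

Under this picture, disabling consistency across runs would correspond to drawing a fresh seed for every run, so repeated runs would no longer be guaranteed to agree.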
New telemetry URL - Tonic telemetry now routes through https://telemetry.tonic.ai/ instead of https://api2.amplitude.com/. The following IP addresses must be allowed:
75.2.74.76
99.83.246.105
The following IP addresses no longer need to be allowed: 52.43.241.47, 54.186.140.101, 54.203.75.164, 44.236.122.176, 34.215.78.194, 54.149.61.206, 54.191.147.220, 52.24.22.222, 52.37.168.36, 54.213.191.53, 54.68.108.104, 52.10.121.164, 52.27.184.186, 44.239.225.209, 54.148.216.233, 52.88.224.247
Other updates
Fixed an issue in the Kubernetes client configuration that caused Tonic to reject the SSL certificate of a Kubernetes context.
Fixed an issue where, during subset configuration, an error was returned incorrectly if the user had permission to edit subsetting but not to view source data.
Improved the error message that displays when an uploaded virtual foreign key file is invalid.
Fixed an issue that prevented the Character Substitution generator from being used on primary and foreign key columns when subsetting.
When a workspace import encounters an unhandled exception, the error now displays correctly.
For the Address generator, the Country and Country Code options can now be linked. When linked, the country and country code are either “United States” or “US” to match the other linkable components, which are locations in the United States.
Fixed an issue where SSO login using Okta did not work when a custom authorization server is used.
Fixed an issue where, when using data science mode on Tonic Cloud, users could not download CSV files that contained synthetic data.
Data generation jobs that write to a container now include the datapacker logs, if the worker has permissions to read the pods and logs.
Fixed an issue with uploading container output to Harbor.
For string data type columns:
Added yyyy/MM/dd as a valid format for the Timestamp Shift generator.
Added yyyy/MM/dd and MMddyyyy as available output formats for the Random Timestamp generator.
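To make the two format names concrete, here is a small Python sketch that parses and writes date strings in those layouts. The mapping from the format names to Python strftime directives, and both helper functions, are assumptions for illustration; they are not part of Structural and do not reproduce the generators' behavior.

```python
from datetime import datetime, timedelta

# Assumed Python equivalents of the format names in the release note.
FORMATS = {
    "yyyy/MM/dd": "%Y/%m/%d",
    "MMddyyyy": "%m%d%Y",
}


def shift_date_string(value: str, days: int, pattern: str = "yyyy/MM/dd") -> str:
    """Parse a date string, shift it by `days`, and write it back in the same format."""
    fmt = FORMATS[pattern]
    parsed = datetime.strptime(value, fmt)
    return (parsed + timedelta(days=days)).strftime(fmt)


def format_timestamp(moment: datetime, pattern: str) -> str:
    """Render a datetime in one of the named output formats."""
    return moment.strftime(FORMATS[pattern])


print(shift_date_string("2023/11/03", 3))                      # 2023/11/06
print(format_timestamp(datetime(2023, 11, 3), "MMddyyyy"))     # 11032023
```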
Databricks
For Databricks 11.3 and later, the Databricks data connector now supports the Regex Mask generator. The regular expressions might work slightly differently than on non-Spark data connectors.
Spark SDK
The Spark SDK data connector now supports the Regex Mask generator. The regular expressions might work slightly differently than on non-Spark data connectors.
Spark with Livy
For Spark 2.3.x and 2.4.2, the Spark with Livy data connector now supports the Regex Mask generator. The regular expressions might work slightly differently than on non-Spark data connectors.
SQL Server
Added a new worker environment setting, SQL_SERVER_SCRIPT_CROSS_DATABASE_REFERENCES. The default is true, which preserves the existing behavior. To prevent scripting of objects that are defined in other databases that the source database references, change the setting to false.
Fixed issues where SQL Server connections did not honor the TONIC_SQL_COMMAND_TIMEOUT environment setting. The default is 0, which indicates an infinite timeout.
October 27, 2023
Apply recommended generators to multiple detected sensitive columns - On Privacy Hub, a new banner displays the number of detected (not manually designated) sensitive columns that are not protected. From the banner, you can display the list of columns, grouped by sensitivity type. For each sensitivity type, for the selected columns, you can:
Apply the recommended generator
Ignore the generator recommendation
Mark the columns as not sensitive
There is also an option to apply the recommended generators to all of the selected columns across all of the sensitivity types. For more information, go to Reviewing and applying recommended generators.
Writing destination data to container registries (beta feature) - For data connectors that support it, you can now configure a workspace to write destination data to container artifacts on a container registry instead of to a database server. The Job History view provides access to the generated artifacts for each job. For more information, go to Writing data generation output to a container repository and Viewing and downloading container artifacts.
Other updates
Fixed an issue where generated API documentation did not exclude endpoints and schemas from API versions other than the version being viewed.
Fixed an issue where the Protection Audit Trail displayed the incorrect override status of columns in child workspaces.
On the System Status tab of Tonic Settings, fixed an issue where the Data Sharing section incorrectly showed logging as unable to connect.
Databricks
On the table mode selection panel, for workspaces that write to Databricks Delta tables, disabled the Error on Overwrite table configuration for tables. The setting is not used in this case.
File connector
In the file group configuration, added options to skip lines from the beginning or end of CSV files.
MongoDB
On Collection View, fixed an issue that prevented array fields from displaying in the Hybrid Document View of a collection.
Oracle
For new workspaces, the Preserve source database file storage preferences setting is now off by default instead of on. Existing workspaces are not affected by this change.
October 20, 2023
Configure Tonic environment settings from Tonic Settings - On the Tonic Settings view, a new Environment Settings tab allows you to configure a subset of Tonic environment settings (previously referred to as Tonic environment variables). To use the Environment Settings tab to configure settings, you must have the new Manage environment settings global permission. When you change the values from Tonic Settings, you do not need to restart Tonic. For details, see Configuring environment settings.
Other updates
For the Regex Mask generator, fixed an issue where quotes in regular expressions were malformed.
Fixed an issue where the Test Webhook option failed when it should have succeeded.
Fixed an issue where the Confirm Data Generation panel erroneously indicated that Tonic could not connect to the destination database when it actually was only unable to connect to the source database.
For the Address generator, removed an inappropriate value from the possible street names.
Improved error messages when Tonic containers fail to start because of missing or incorrect environment variable values.
Databricks
Fixed source catalog workspace handling when the Databricks Unity source catalog contains a table with a key constraint.
For Databricks 10.4 and earlier, generations to output Databricks tables now correctly create a new destination Databricks database if the specified one is not found.
File connector
For file connector workspaces, the post-job scripts option is now hidden. The file connector does not support post-job scripts.
For Amazon S3, fixed an issue where Tonic did not delete a temporary file that was created to test permissions.
Google BigQuery
External tables are now supported with some restrictions. Destination tables must be native BigQuery tables and cannot be external tables, whether masked or passthrough. Performance might be affected because of Google’s implementation of external tables.
MongoDB
Fixed an issue where an error was thrown when generators were applied.
Snowflake on AWS and Azure
Added an option to use key pair authentication to connect to the source and destination databases.
Releases V897 and V898 were removed from quay because of a regression.
July 7, 2023
On Subsetting view, Graph View now displays a loading animation as new data is loaded.
Improved performance for the UUID Key and Integer Key generators.
File connector
Fixed an issue where files sometimes did not upload completely.
Google BigQuery
Improved performance of destination database writes during data generation.
MySQL
Improved performance for destination database writes during data generation.
Oracle
Improved error handling when the tablespace in the source database is missing from the destination database.
June 30, 2023
File connector
The new file connector data connector allows you to use files from either Amazon S3, Google Cloud Storage, or a local file system as the source data. The file connector supports .csv, .json, and .xml files. Within a file connector workspace, you create file groups. Each file group contains files that have an identical format and structure. A file group is treated as a table for the purposes of table mode and generator configuration. The file connector is available with the Professional and Enterprise license plans.
Generic OIDC SSO Support
Tonic now supports authentication through a generic OpenID Connect (OIDC) provider.
Tonic API versioning
We have introduced a versioning scheme for the Tonic API. API versions are released more or less quarterly, with the version identifier in the format vYYYY.MM.P (Year.Month.Patch). The current release candidate (v.RC) contains API updates in progress.
You should now specify an API version in your API requests. The System Status tab of the Admin Panel lists the latest available version. You can also select the version to use when you do not provide a version in the request. If you do not provide an API version in a request or select a default API version, then until January 31, 2024, Tonic automatically uses the latest version. After January 31, 2024, Tonic will return an error from the request.
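As a rough illustration of the vYYYY.MM.P format described above, the following sketch parses version identifiers and picks the newest one from a list. The function names and sample identifiers are hypothetical, and the sketch assumes a four-digit year, a two-digit month, and a numeric patch. It does not show how a version is passed in an API request, because these notes do not specify that.

```python
import re
from typing import NamedTuple

# Assumed shape of a released version identifier: vYYYY.MM.P
# (four-digit year, two-digit month, numeric patch).
VERSION_PATTERN = re.compile(r"v(\d{4})\.(\d{2})\.(\d+)")


class ApiVersion(NamedTuple):
    year: int
    month: int
    patch: int


def parse_api_version(identifier: str) -> ApiVersion:
    """Parse a vYYYY.MM.P identifier into a tuple that sorts chronologically."""
    match = VERSION_PATTERN.fullmatch(identifier)
    if match is None:
        raise ValueError(f"not a vYYYY.MM.P identifier: {identifier!r}")
    year, month, patch = (int(part) for part in match.groups())
    if not 1 <= month <= 12:
        raise ValueError(f"month out of range in {identifier!r}")
    return ApiVersion(year, month, patch)


def latest_version(identifiers: list[str]) -> str:
    """Return the newest identifier from a non-empty list of identifiers."""
    return max(identifiers, key=parse_api_version)
```

Comparing the parsed tuples instead of the raw strings keeps the ordering correct if the patch number ever reaches two digits. The release candidate identifier (v.RC) does not follow the pattern, so this sketch rejects it.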
Other updates
Fixed an issue where Tonic incorrectly returned the error No Destination DB has been configured for this Workspace for workspaces that used Preserve Destination.
For subsetting Graph View, updated the default zoom level to allow users to see more of the graph.
The Keycloak SSO provider now supports PKCE challenge.
Fixed an issue where deep links did not work for SAML SSO.
MongoDB
For Mongo Queries, Tonic can now use disk as well as memory.
Oracle
Tonic now refreshes materialized views even when SKIP_CREATE_DB is set to true.
PostgreSQL
In the job progress steps, fixed an issue that caused the number of rows in a table to display as -1.
When the Data Pipeline V2 processing is enabled, tables are now processed by size, with larger tables processed first.
SQL Server
Fixed support for Kerberos authentication.
Spark
Added support for the UUID Key generator to Livy and Databricks workspaces that use Spark 2.3.x and above. Added support for the UUID Key generator to the Tonic Java SDK.
June 23, 2023
The ASCII Key generator now includes an Exclude Lowercase Alphabet option to exclude lowercase letters from the destination values.
Fixed an issue that prevented free trial signups.
Data generation no longer fails when Tonic is unable to retrieve the destination database size.
Updated the FNR generator to prevent a possible leakage of PII.
Added a Date column to the usage report. The date column provides the date and time when the data generation job was completed.
Fixed an issue in the JSON Mask generator where it incorrectly changed the format of timestamps.
Subsetting is no longer prevented when a table that is not in the subset is assigned Preserve Destination mode.
Databricks
On the workspace details view, you can now optionally specify the catalog where the source database is located. If you do not specify a catalog, then the default catalog is used.
MongoDB
Fixed an issue where the collection statistics failed because the statistics object became too large.
Updated to write collection records in batches.
Tonic now continues to retrieve documents after a failure.
Oracle
Removed the uniqueness check for individual columns that are part of a composite index.
PostgreSQL
Fixed an issue where there was duplicated data from parent tables of inherited tables.
Improved performance for the query that retrieves tables and columns.
Snowflake
Improved performance for data generation.
Updated how Tonic uploads files to Amazon S3 to reduce memory usage.
SQL Server
Fixed DNS resolution for Kerberos.
Fixed an issue where Kerberos authentication failed with an error that the destination array was not long enough.
June 16, 2023
The new usage report summarizes the data processed for each table for data generation jobs. The report is a .csv file that you download from Tonic. To download the report, on the Admin Panel, click Download Usage Report.
When certain sensitive loggers are enabled, Tonic now disables log collection.
Fixed an issue on Database View where a column configuration panel would close unexpectedly.
For the TONIC_ADMINISTRATORS environment variable, you can now specify the names of SSO groups to grant administrator privileges to. Previously you could only specify user email addresses.
The Tonic SDK Javadoc now displays correctly.
Restored the ability to import a workspace configuration from Workspaces view.
MongoDB
Added support for the DBRef datatype.
Improved performance for collection management.
PostgreSQL
For the beta Data Pipeline V2 data generation, adjusted the logging level for telemetry-related log messages to DEBUG.
For the beta Data Pipeline v2 processing, improved parallel processing for constraints that cross tables.
For the beta Data Pipeline V2 process, reduced the default maximum number of source and destination connections to 8. These are set as the values of the TONIC_JOBFLOW_MAX_SOURCE_CONNECTIONS and TONIC_JOBFLOW_MAX_DESTINATION_CONNECTIONS environment variables. We recommend that you set each value to the number of CPUs on the corresponding database.
Snowflake
Fixed an issue where preserve destination tables were removed during data generation.
June 9, 2023
The new FNR generator transforms Norwegian national identity numbers. The FNR generator was added in V857. It included options to specify a range of birthdates and preserve the indicated gender. In V866, removed the date range configuration options. The destination values are now always within the same date range as the source values. The FNR generator also now can be used for columns that have uniqueness constraints. The final digits in the destination value are not a valid checksum.
For the beta Data Pipeline V2 processing, fixed an issue where jobs would hang if they were canceled before the job started.
Fixed an issue where the Foreign Keys view would freeze.
Fixed an issue where when you typed @ to add a user mention to a comment, suggestions for the user did not display.
Upgraded to use .NET 7.
"Data science modeling" is changed to "data science mode".
When you configure the SSH tunnel settings for a workspace, Tonic now obscures the SSH passphrase.
MongoDB
Added a configuration to prevent Tonic from retrieving other information about collections when retrieving a collection list. Addresses an issue where retrieving the collection list took a very long time. To disable the retrieval of other collection information, set the environment variable TONIC_MONGO_DISABLE_COLLECTION_INFORMATION_FETCHING to false.
Fixed an issue with collection scanning that caused application pages to not load.
Collection scans are now able to continue when the scan for an individual collection fails. The job logs include warnings for each failed collection.
For collection scans, for each schema Tonic now limits the number of documents to scan and the length of time for the scan.
PostgreSQL
Tonic now uses the estimated row count from PostgreSQL statistics to determine the parallelism for a table. Customers should ensure that they have up-to-date statistics for their source tables, especially for large ones.
Snowflake
For source tables that are assigned Preserve Destination mode, Tonic no longer attempts to add existing constraints to the destination tables.
Fixed a syntax error in the post-generation processing.
Fixed an issue where data generation failed with the error "Unable to determine AWS Region".
Fixed an issue where preserve destination tables were removed during data generation.
SQL Server
During data generation, Tonic now warns users when a filegroup that exists in the source database does not exist in the destination database.
April 21, 2023
New PostgreSQL option to test the beta version of the new data generation process
Tonic is working on an improved version of the data generation process. The new process is designed to be more stable.
A beta version of this new process is available for you to try for PostgreSQL workspaces. On the Confirm Generation panel, to use the new data generation process, toggle Data Pipeline V2 to the on position.
Setting the baseline configuration for composite generators
For composite generators, from the column configuration panel, you can now update the baseline configuration of the generator preset, and reset the configuration to the current baseline. Note that you cannot configure generator presets for composite generators from the Generator Presets view, which does not have access to data to use to create path expressions.
Other updates
For Tonic data encryption, you can now configure different custom initialization vectors for encryption and decryption.
Workspace export and import now include columns that you manually marked as sensitive or not sensitive.
When you change or reset a user's Tonic password, Tonic now ends all of the existing Tonic sessions for that user.
Made the initial connection to databases more resilient to transient network issues.
Fixed a Privacy Hub issue where the sensitivity scan was shown as running when it was actually complete.
The SIN generator now masks all numeric values, and ignores other values. Previously, if the column contained non-numeric characters, the same value was applied to all of the rows.
Fixed an issue where on instances that do not use EC2 as the underlying server, workspaces created in versions 787 through 795 did not work correctly on versions 796 through 802. Issues included not being able to run data generation and not being able to view Table View. Those workspaces do work correctly as of version 803.
Fixed an issue where subsetting data generation might fail when there is a high degree of parallelism.
Amazon Redshift
Workspace views now load more quickly.
MongoDB
Fixed an issue where MongoDB failed to run jobs when a document contained a duplicate key.
Snowflake
Fixed an issue where entering an invalid connection URL caused the workspace settings page to not be able to load.
Workspace views now load more quickly.
Spark
For the Address and HIPAA Address generators, the list of address types now only includes types that are supported by Spark-based data connectors.
April 14, 2023
Release 795 was removed from quay because of a regression that was fixed in later releases. The issue caused new and edited workspaces to be broken.
Fixed an issue where the job status on the Job History view occasionally claimed incorrectly that a job was over 50 years old.
Fixed an issue where the web server and worker failed to launch with a StackOverflowException.
Snowflake
Updated how Tonic retrieves and writes to DDLs to ensure that the tasks are in the correct order.
Spark
For Amazon EMR and Spark workspaces, when the SerDe cannot be identified, Tonic now attempts to use the Glue table classification to determine the output file type.
Fixed an issue where Tonic incorrectly replaced the header row in output files with transformed data.
April 7, 2023
Release 790 was removed from quay because of a regression that was fixed in later releases.
In Database View, when you select multiple columns that apply the same generator, you can now choose to reset all of those columns to the baseline configuration for the built-in generator preset.
Tonic Docker containers no longer run as root.
Added warnings when a generated value for a varchar column exceeds the column length.
Tonic now correctly identifies when the destination database details match the source database.
Fixed an issue with the container version check during data generation when TONIC_WEB_URL is set to a custom value that includes a trailing slash (/).
Fixed an issue where a removed WHERE clause for an upstream filter was interpreted as an empty value, which caused subsetting data generation to fail.
Improved the security of the Tonic web server by limiting connections to a more secure set of TLS protocols and ciphers.
Improved performance for the Alphanumeric Key generator.
Fixed an issue where the column details panel incorrectly indicated that Tonic data encryption was disabled.
Fixed an issue where sensitivity scan results sometimes changed the sensitivity of a column that was manually marked by a user.
MySQL
Fixed a regression in performance.
PostgreSQL
Updated to automatically create a public schema if it is not present.
Improved handling of custom data types.
Snowflake
Reworked the Snowflake data generation to not drop schemas.
March 31, 2023
The Workspaces view now correctly updates the available actions for a workspace when your role for the workspace changes.
The workspace search filter field is now cleared correctly after you select a workspace from the list or close the list.
Filters that are applied when navigating from Privacy Hub to Database View now differentiate between tables that had the same name but were in different schemas.
Fixed a display issue on Privacy Hub where elements did not display well on large screens.
On Database View, the Filter button for table modes is now highlighted when you change the table mode filter from its default state.
The Foreign Keys view and Table View now correctly find and display foreign key columns that refer to a primary key value from the same table.
Table View
Fixed an issue where an updated table filter did not display correctly until the page was refreshed.
When Preview is enabled, null values no longer prevent users from being able to assign generators.
Fixed an issue where duplicate errors were displayed.
Subsetting
Clicking a table in Graph View no longer zooms in on the selected table.
For tables that contain no rows, the initial row count now shows 0 instead of NaNundefined.
When loading Subsetting view, Tonic no longer marks all of the tables as Out of Subset until the view finishes loading.
The subset size is no longer incorrectly displayed as 0.
Corrected the setting of job start and end times to be null instead of the epoch start time before the job starts or ends.
Foreign Keys
When a database contains multiple schemas, the Foreign Keys view now always shows the schema for each table. Previously, if the existing foreign keys were within a single schema, we did not display the schema name, only the table name.
Fixed an issue where certain user selections persisted when the user switched to a different workspace.
Improved the scrolling behavior.
Generators
For the JSON Mask generator, added Boolean to the available options for the Type Filter dropdown list. Type Filter identifies the types of values to apply a selected sub-generator to.
Improved validation of the Percent True configuration for the Random Boolean generator.
For the Regex Mask generator, improved handling of regular expressions that match empty strings.
Fixed an issue where when the source database connection became invalid, users were not able to view the workspace configuration in order to correct the connection details.
The ObjectId Key generator is now called the Mongo ObjectId Key generator. It can be assigned to any text column that contains a 12-byte MongoDB object identifier value. Previously, it could only be used for ObjectId type fields in MongoDB databases. A new configuration option indicates whether to preserve the timestamp and increment counter portions of the object identifier, and only transform the random value portion.
For errors returned from SSO requests to Okta, we now display both the error and the error description, instead of only the error.
Improved performance for loading the job details view.
For webhooks, we now provide a built-in Content-Type header with a default value of application/json.
Upgraded libraries and clients to address security vulnerabilities.
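To make the Mongo ObjectId Key option above more concrete: a MongoDB ObjectId is 12 bytes, made up of a 4-byte timestamp, a 5-byte random value, and a 3-byte incrementing counter. The sketch below keeps the timestamp and counter and replaces only the random portion. The function name is hypothetical, and the SHA-256 hash is only a stand-in transformation chosen for illustration. It is not the transformation that Tonic applies.

```python
import hashlib


def mask_object_id(object_id_hex: str, preserve_timestamp_and_counter: bool = True) -> str:
    """Return a masked ObjectId as a 24-character hex string.

    Layout of the 12 bytes: 4-byte timestamp, 5-byte random value,
    3-byte incrementing counter.
    """
    raw = bytes.fromhex(object_id_hex)
    if len(raw) != 12:
        raise ValueError("an ObjectId is exactly 12 bytes (24 hex characters)")

    timestamp, random_part, counter = raw[:4], raw[4:9], raw[9:]

    if preserve_timestamp_and_counter:
        # Stand-in transformation (assumption): hash only the random portion.
        new_random = hashlib.sha256(random_part).digest()[:5]
        return (timestamp + new_random + counter).hex()

    # Otherwise replace all 12 bytes.
    return hashlib.sha256(raw).digest()[:12].hex()
```

Because the hash is deterministic, the same source identifier always maps to the same masked identifier in this sketch, and the creation time encoded in the first four bytes stays readable when the option is on.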
MongoDB
The Regex Mask generator now identifies all of the matches when there are multiple regular expressions, and not just the first match.
MySQL
Preserve Destination mode now works correctly.
Fixed an issue where escape characters in a query were not processed correctly, which caused the Subsetting view to not display.
PostgreSQL
Fixed an issue where subsetting generation failed with the error "stack depth limit exceeded".
March 24, 2023
Fixed an issue where the column configuration panel on Privacy Hub incorrectly allowed users to change the configuration of foreign key columns.
The option to create a completely new workspace is no longer available from the workspace management view. You can only copy that workspace and create a child workspace. To create a completely new workspace, use the Create New Workspace option on the Workspaces view.
Fixed an issue where the generator for a column in a table that uses Scale mode could have a configuration option that is invalid for that mode.
Upgraded libraries to address security vulnerabilities.
Fixed an issue where when you applied the Random Timestamp generator to a column, a Bad format string error was returned.
Improved display of long table names in the table details panel on Subsetting view.
MongoDB
Introduced a new configuration to collapse child fields into a single field based on a regular expression, to reduce the size of the schema. TONIC_DOCUMENT_COLLAPSE_FIELDS_REGEX provides the regular expression to check the field keys against. TONIC_DOCUMENT_COLLAPSE_FIELDS_REGEX_THRESHOLD specifies the number of matching fields that causes the fields to be collapsed. A value of 0 indicates to not collapse the fields.
On Collection View, fixed an issue where toggling the data preview changed the frequency values for documents within documents.
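The field-collapsing configuration described above (TONIC_DOCUMENT_COLLAPSE_FIELDS_REGEX and TONIC_DOCUMENT_COLLAPSE_FIELDS_REGEX_THRESHOLD) can be pictured with a small sketch. This is an illustration under assumptions, not Tonic's implementation: the function name and the placeholder key are hypothetical, and the sketch assumes that collapsing starts once the number of matching keys reaches the threshold.

```python
import re


def collapse_fields(field_keys: list[str], pattern: str, threshold: int,
                    placeholder: str = "<collapsed>") -> list[str]:
    """Collapse the keys that match `pattern` into one placeholder key.

    A threshold of 0 disables collapsing. Assumption: keys are collapsed
    when the number of matches is at least the threshold.
    """
    if threshold == 0:
        return list(field_keys)

    regex = re.compile(pattern)
    match_count = sum(1 for key in field_keys if regex.search(key))
    if match_count < threshold:
        return list(field_keys)

    collapsed: list[str] = []
    placeholder_added = False
    for key in field_keys:
        if regex.search(key):
            # Emit the placeholder once, at the position of the first match.
            if not placeholder_added:
                collapsed.append(placeholder)
                placeholder_added = True
        else:
            collapsed.append(key)
    return collapsed
```

For example, a document whose child fields are keyed by generated names such as metric_1, metric_2, and metric_3 would contribute a single schema field instead of three once the threshold is met.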
Releases V684 through V695 were removed from quay because of a regression that was fixed in later releases.
February 3, 2023
Update to PII detection - PII container removed
As of V697, updated how Tonic performs PII detection. The name detection process now also scans for international names with origins from 103 different countries.
The PII detection process now runs from the Tonic worker instead of in a separate container.
To avoid errors when you upgrade, future versions up to V999 will have a placeholder PII container. The container is not active and is not used to run Tonic.
Before you upgrade to V1000 or later, you must remove the PII container from your instance. To remove the container:
For a Kubernetes deployment:
Remove the file tonic-pii-detection-deployment.yaml.
Remove the file tonic-pii-detection-service.yaml.
Update tonic-worker-deployment.yaml to remove the entry for TONIC_PII_DETECTION_URL.
Update values.sample.yaml to remove the entry for pii_detection.
If you do not remove the PII container before you upgrade to V1000 or later, then when you upgrade Tonic, you will encounter image pull errors.
Other updates
Improved error messaging on the Tonic UI for failed data generation jobs.
Reduced the telemetry logging for PII detection during sensitivity scans to reduce the performance impact on self-hosted instances with machines that do not have internet access.
Fixed an issue that prevented users from configuring the Random Timestamp generator Date Format when the generator was selected as a sub-generator of the JSON Mask and other composite generators.
Made a minor correction to a data generation log message.
Reduced Tonic slowdowns caused by telemetry.
MySQL
Fixed an issue that caused data connections to fail with the error The ConnectionString property has not been initialized.
Fixed an issue where chained foreign keys caused subsetting jobs to fail.
Oracle
Fixed an issue where long column names caused data generation to return the error ORA-00972: identifier is too long.
Fixed an issue where chained foreign keys caused subsetting jobs to fail.
PostgreSQL
Fixed an issue where chained foreign keys caused subsetting jobs to fail.
SQL Server
Users can now assign generators to timestamp and rowversion columns.
Fixed an issue where a database ID mismatch in the master and database tables prevented data generation from starting.
Fixed an issue where chained foreign keys caused subsetting jobs to fail.
January 27, 2023
Enhancements
When sharing a workspace, free trial users can now invite other users with the same corporate email domain to start their own free trial.
Other updates
Added messaging to the Tonic application about changes to the Tonic license plans.
Fixed an issue with the JSON Mask generator where when users deleted a sub-generator, a different sub-generator was deleted.
Fixed cases where data generation jobs remained in the queued state indefinitely.
Fixed a performance regression that affected workspace loading.
Oracle
Fixed an issue where data generation failed when there were index-organized tables.
January 20, 2023
Enhancements
When creating new virtual foreign keys, you can now use the top level field name check box to select or deselect all of the fields that have that name.
Other updates
Fixed an issue where a warning did not display correctly when a subset table was configured with an invalid table mode for subsetting.
For the Custom Categorical generator, we no longer treat newlines as empty strings on numeric columns. Newlines are still treated as empty strings on string columns.
Fixed an issue where an error occurred when a column was assigned the Custom Value Processor generator.
Fixed an issue where data generation with subsetting failed with the following error: Could not load type from assembly 'Allos.Generators'.
Redesigned the generator selection dropdown to better separate the suggested generators from the other applicable generators.
Oracle
Improved error logging during data generation.
SQL Server
Fixed an issue where views that reference preserved tables (table mode is Preserve Destination) were not created.
Added verification to ensure that all of the required tables and objects are created in the destination database before data generation.
January 13, 2023
Enhancements
Tonic now displays warnings at 30, 15, and 7 days before a Tonic license expires.
Other updates
After a one-click update of Tonic, containers for Docker Compose customers now include the version number in the name.
Addressed an issue for Docker Compose customers where Tonic did not restart properly after a one-click update.
Tonic can now download SAML IDP metadata from a URL. To configure the URL, set the environment variable TONIC_SSO_SAML_IDP_METADATA_XML_URL.
Added helper text to indicate the value format for a database server.
Added the ability to configure the SAML request issuer. To configure the issuer, set the environment variable TONIC_SSO_SAML_ENTITY_ID.
Improved error messaging for WHERE clause validation in subsetting configuration.
Moved some of the temporary files used for data generation from /tmp to /tmp/tonic.
MySQL
Fixed an issue where data generation occasionally failed with the error "Allos.Core.Exceptions.TonicException: No databases selected to overwrite".
Oracle
Updated to ensure that system tables are ignored.
PostgreSQL
Fixed an issue with data generation for partition tables.
January 6, 2023
Enhancements
New command-line tool for Tonic installation - Tonic now offers the Tonic Installation Manager (TIM), a command-line tool to deploy Tonic on either Amazon EKS or a VM.
Other updates
Updated how Tonic performs PII detection. Added additional name values to the information that the detection process looks for.
Fixed an issue where navigating to the workspace edit page sometimes threw an error.
Fixed an issue where SSO account creation erroneously returned errors even though the account was created successfully.
Improved our generator recommendations that are based on the security scan to prevent memory issues for larger databases.
Improved error messaging when Tonic-hosted users attempt to connect to a database on a local network.
Fixed an issue where job log timestamps displayed the wrong month value.
Amazon Redshift
Corrected a permissions check issue that caused connection tests to fail.
MySQL
Upgraded MySqlConnector to version 2.2.5.
Fixed an issue where subsetting jobs failed with a "No database selected" error.
Oracle
Fixed an issue where data generation jobs did not stop when they were canceled.
Fixed an issue with applying the Event Timestamps generator to timestamps that have timezone values.
Updated to prevent Tonic from running out of space during subsetting.
Fixed an issue where materialized views and materialized view logs were not torn down after data generation.
When TONIC_ORACLE_SKIP_CREATE_DB is true, we now properly truncate tables.
PostgreSQL
Upgraded the Npgsql library to version 7.0. This is used for both workspace data and the Tonic application database.
Snowflake
Corrected a permissions check issue that caused connection tests to fail.
December 23, 2022
Enhancements
On the Subsetting page, you can now sort the tables based on whether they are in or out of the subset.
Other updates
Improved UI support for timeouts when loading schema information for a source database.
Updated the subsetting process to ensure that a small percentage-based target table contributes at least one row to the subset.
The environment variable TONIC_SUBSET_PARALLELISM is deprecated. Tonic now uses the environment variable TONIC_TABLE_PARALLELISM to control parallel processing for subsetting.
When a data preview or workspace loading process is no longer needed, such as when there are network issues or the user leaves the application, Tonic now attempts to cancel the process.
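The subsetting change above, where a small percentage-based target table contributes at least one row, amounts to putting a floor on the computed row count. The sketch below is a minimal illustration. The function name is hypothetical, and rounding down before applying the floor is an assumption, because these notes do not describe how Tonic rounds.

```python
from decimal import Decimal, ROUND_FLOOR


def target_row_count(table_rows: int, percentage: str) -> int:
    """Return the number of rows that a percentage-based target contributes.

    `percentage` is passed as a string (for example "0.001") so that Decimal
    keeps the configured value exact. Assumption: the raw count is rounded
    down, and a non-empty table then contributes at least one row.
    """
    pct = Decimal(percentage)
    if pct <= 0 or pct > 100:
        raise ValueError("percentage must be greater than 0 and at most 100")
    if table_rows <= 0:
        return 0

    raw = Decimal(table_rows) * pct / Decimal(100)
    rows = int(raw.to_integral_value(rounding=ROUND_FLOOR))
    return max(1, rows)
```

Without the floor, 0.001 percent of a 1,000-row table rounds down to zero rows, and the target table would drop out of the subset.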
MySQL
Improved error messaging when a database connection times out.
Oracle
Updated to ensure that data generation fails when drop table statements fail.
PostgreSQL
Improved error messaging when a database connection times out.
SQL Server
Fixed an issue where some tables were not created in the destination database.
Improved error messaging when a database connection times out.
December 16, 2022
Other updates
Fixed an issue where containers did not start and the following error message was returned: "The directory named as part of the path /var/run/supervisor/supervisord.pid does not exist".
Corrected an issue where in some cases when the job failed because of a source connection issue, the job status was not set to Failed.
In the subsetting configuration, you can now configure a target table percentage value with up to 3 decimal places.
Resolved a potential database connection leak when canceling data generation jobs.
Improved error messaging when a deleted SSO user tries to log into Tonic.
Improved error display for subsetting WHERE clause validation.
Fixed an issue where running a query to validate a subsetting WHERE clause could cause the application to slow down.
Fixed an issue where jobs could fail when subsetting parallelism was enabled.
The Conditional generator now allows you to configure a selected sub-generator to be consistent with other columns. The selected generator must support consistency with other columns.
Databricks
Updated the Databricks job names to use the format tonic_workspaceName/workspaceId+jobID.
Improved performance for page loads.
Google BigQuery
Added support for the JSON data type.
MongoDB
Fixed a minor flickering issue on the Collection view.
MySQL
Improved logging of errors that produce data generation warnings.
Oracle
Corrected an issue where we displayed foreign keys that referenced a schema outside of the workspace.
PostgreSQL
Fixed an issue with handling HStore types during data generation.
Improved logging of errors that produce data generation warnings.
Improved management of partition tables.
Spark
Improved logging for data generation.
SQL Server
Fixed authentication handling for Active Directory. Tonic supports NTLM v2 authentication.
March 17, 2023
Added a confirmation step when updating the baseline configuration of a generator preset from a column configuration panel.
Fixed an issue where job failures sometimes caused the worker process to crash in a way that prevented new jobs from running.
To improve Tonic web application security, added the X-Frame-Options and X-Content-Type-Options headers.
Databricks
Corrected an issue on the workspace configuration page so that read-only items are no longer clickable.
PostgreSQL
You can now use connection strings to connect to the source and destination databases.
SQL Server
Fixed an issue where data generation failed when all URNs in the database have no dependencies.
March 14, 2023
Setting the default configuration for generators (Requires an Enterprise license)
The Generator Presets view allows you to configure the default configuration for generators. The current configuration is used whenever that generator is assigned to a column.
To update the configuration, you must be an owner or editor of a workspace in the instance.
In the configuration for a column, you can override the saved default configuration, which we call the baseline configuration. You can also revert to the current baseline configuration or save your configuration as the new baseline configuration.
Other updates
On the workspace management view, the workspace options are now included in the collapsed version of the heading.
For free trial users, the option to create a workspace is always visible in the Tonic heading.
For SAML SSO, if the value of NameID is not an email address, Tonic uses the email claim in the SAML response.
Databricks
Added a TONIC_WORKSPACE_DEFAULT_ERROR_ON_OVERRIDE environment variable to determine whether ErrorOnOverride is enabled by default for new Databricks tables. The default value is true, which means that ErrorOnOverride is enabled for new tables.
SQL Server
Changed the default value of BYPASS_MS_XML_PARSING from false to true. The variable indicates whether to convert XML columns to nvarchar(max) to avoid potential XML parsing bugs.
March 10, 2023
Releases 742 and 743 were removed from quay because of a regression that was fixed in later releases.
Graph View for subsetting
The Configuration tab of Subsetting view now includes a toggle to switch between Table View and Graph View.
Table View is the existing tabular list of tables.
Tonic data encryption
Tonic data encryption uses AES encryption.
Tonic data encryption requires you to set environment variables for the decryption key (TONIC_DATA_DECRYPTION_KEY) and encryption key (TONIC_DATA_ENCRYPTION_KEY). Both keys must use the same key size: either 128, 192, or 256 bits.
Admin users configure Tonic data encryption from the Data Encryption tab of the Admin Panel.
When you enable Tonic data encryption, the generator configuration includes a setting to indicate whether to use it for that column.
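The key-size rule above can be sketched as a small pre-flight check. This is only an illustration: the notes do not say how Tonic encodes the keys in the environment variables, so the hypothetical helper `check_key_pair` works on raw key bytes and leaves decoding to the caller.

```python
VALID_KEY_BITS = (128, 192, 256)  # AES key sizes named in the release note


def check_key_pair(decryption_key: bytes, encryption_key: bytes) -> int:
    """Return the shared key size in bits, or raise ValueError.

    Hypothetical helper: both arguments are raw key bytes. How the
    environment variable values are encoded is not stated in the notes.
    """
    dec_bits = len(decryption_key) * 8
    enc_bits = len(encryption_key) * 8
    for name, bits in (("decryption", dec_bits), ("encryption", enc_bits)):
        if bits not in VALID_KEY_BITS:
            raise ValueError(f"{name} key is {bits} bits; expected one of {VALID_KEY_BITS}")
    if dec_bits != enc_bits:
        raise ValueError(f"key sizes differ: {dec_bits} vs {enc_bits} bits")
    return dec_bits
```

A check like this could run before a deployment, so that a mismatched pair is caught before a job starts.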
Other updates
Fixed an issue where navigating to Database View from Schema Changes view for a table-specific issue did not apply the correct filters.
Fixed an issue where workspace import could inadvertently add multiple subset targets in the same table.
Added support for ssh-rsa for data connections.
Improved error logging for the Timestamp Shift generator.
Fixed the display for the import workspace dialog.
MongoDB
Moved the scanning of MongoDB collections into the Tonic worker.
PostgreSQL
Fixed an issue that caused data generation to fail with the error canceling statement due to statement timeout.
SQL Server
Fixed an issue where a bug in the SQL Server Management Objects (SMO) library caused data generation to fail when using partitioning in Azure.
Fixed an issue that caused data generation to fail when system tables were included.
March 3, 2023
When you log back in to Tonic, it now displays the workspace management view for the most recently viewed workspace.
Improved error messaging for rarely occurring user authentication issues.
Fixed display issues on the Subsetting view where part of the subset results graph did not display and the Configuration tab did not scroll.
Amazon Redshift
MongoDB
Fixed an issue where indexes created with an expireAfterSeconds value were not created in the destination database.
Fixed a regression where unscanned collections did not display in workspaces.
Fixed an issue that prevented Tonic from being able to connect to MongoDB-compatible DocumentDB databases.
Fixed an issue where DocumentDB jobs failed when run from Windows.
MySQL
Updated to ensure that TONIC_WRITE_PARALLELISM is always 1, to prevent lock timeouts.
Increased resilience around possible failures when retrieving a MySQL schema during data generation.
Snowflake
SQL Server
Improved performance when reading large fields.
Fixed an issue that caused data generation to fail with the error Sequence contains no matching element.
Fixed an issue that caused data generation to fail on SQL Server 2014.
February 24, 2023
Fixed an issue where authentication using Azure SSO failed if the user was a member of a large number of IdP groups.
Improved detection during sensitivity scans of columns that contain birth dates.
MongoDB
Fixed an issue where adding a new null type field to the data caused data generation to fail.
Oracle
Fixed an issue where data generation returned the following error: Specified argument was out of the range of valid values. (Parameter 'minSize').
PostgreSQL
Fixed an issue where tables with generated columns, but no generator assignments, were not correctly copied to the destination database.
Improved handling and messaging for connection timeouts.
Snowflake
Fixed an issue where retrying a connection multiple times returned an incorrect error.
SQL Server
Fixed issues where users could not configure data generation for tables that contained hierarchyid columns.
Fixed an issue that prevented Preserve Destination from being used on schema-bound tables.
Fixed an issue that prevented Incremental and Preserve Destination table modes on memory-optimized tables.
Fixed an issue where column attributes such as the nullability of a column were not correctly reflected in the destination database.
Fixed an issue where schema binding was not maintained in the destination database.
February 17, 2023
Redesigned Tonic navigation
On Workspaces view, the first column is now a checkbox to select workspaces to which to apply an option from the Actions menu.
The Tonic application header contains links to Workspaces view and the Admin Panel.
New table filtering for Google BigQuery
Other updates
Fixed an issue where Tonic workers did not start after an upgrade when a read-only file system was mounted for custom value processors.
Improved the detection of vehicle identification numbers (VINs) in source data.
Oracle
Improved generator recommendations for columns that are part of a compound unique constraint.
SQL Server
Fixed an issue that caused jobs to fail when none of the tables were assigned De-Identify mode.
February 10, 2023
Fixed an issue where the subsetting steps displayed incorrectly when subsetting used parallel processing.
Improved the error message that is displayed when unsupported data types are present in a table.
On the Subsetting view, on the row count popup, provided a clearer explanation when the destination row count is larger than the source row count.
Fixed an issue where simultaneously updating workspace permissions for multiple users failed to apply the updates.
Made some small performance improvements to the Regex Mask and Array Regex Mask generators.
Improved the job cancellation logic to ensure that selecting the cancel option actually cancels the job.
When a SAML SSO login is initiated, Tonic now redirects the browser to the correct URL.
When Tonic detects an invalid or deprecated generator, it no longer returns the error message Unexpected generator id {generatorId}, create a dedicated Metadata class for this generator.
MySQL
Fixed an issue that caused lockouts and timeouts on the destination database.
Oracle
Reduced the number of destination database permissions that are required for data generation.
PostgreSQL
When you use the Limit Schemas feature, and you choose to filter to only include specific schemas, Tonic no longer includes all of the extensions in the database. It only includes extensions that are related to the included schemas.
Values are now truncated to fit the char type before they are written to the destination database.
Generators that support consistency can now be made consistent with columns that are user-defined enum values.
The Conditional generator can now use columns that are user-defined enums as conditions to apply sub-generators.
Tonic now correctly determines whether to truncate a unicode string when fitting the string into a data type that is close in size.
Improved resiliency to destination database connection failures.
December 9, 2022
Enhancements
Other updates
Corrected an issue for instances using a Docker version before 20.10.10 that prevented Tonic processes from starting after an upgrade from version 642 or below to version 643 through 647.
Corrected an issue where estimated data generation time was logged as a negative number.
Improved the performance and resiliency of the subset WHERE clause validation.
Improved the cleanup of database connections after they are used.
Improved error message when the source database version is newer than the destination database version.
Google BigQuery
Fixed a user interface issue that prevented a new workspace from being created.
Oracle
The minimum supported version for the Oracle data connector is now Oracle 12c. Tonic no longer supports 11g. As a result of this change, the Oracle Helper is no longer needed. After upgrading, you should remove Oracle Helper from the list of containers to fetch.
Snowflake
For database connections, the server name should not include https://. Tonic now detects and removes https:// if it is entered.
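As a rough illustration of the cleanup described above, the following sketch strips a leading https:// from a server name. The function name and the sample account host are hypothetical, and Tonic's actual handling (for example, of letter case or surrounding whitespace) is not described in the notes.

```python
def normalize_server_name(server: str) -> str:
    """Strip surrounding whitespace and a leading https:// from a server name.

    Hypothetical helper; the case-insensitive match is an assumption.
    """
    server = server.strip()
    prefix = "https://"
    if server.lower().startswith(prefix):
        server = server[len(prefix):]
    return server


# "myaccount" is a placeholder account name.
print(normalize_server_name("https://myaccount.snowflakecomputing.com"))
```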
Spark
Eliminated a vulnerability that could allow users to see the results of statistics jobs that they did not have access to.
SQL Server
Fixed an issue with loading tables in SQL Server 2012 and below.
December 2, 2022
Enhancements
Other updates
You can now assign the SIN generator to fields that have uniqueness constraints.
The Address generator now only allows you to link columns that contain the following types of address values - City, City State, Country, Country Code, State, State Abbreviation, Zip Code, Latitude, Longitude.
Improved memory handling for subset processing of downstream tables.
On the Subsetting view, when the database returns an unrealistic value for the subset size, we replace the value with a warning message.
Fixed an issue where linking columns that have the same name as columns in another table caused the generators to be removed from those other columns.
Made minor memory improvements throughout the Tonic application.
Corrected an issue where Tonic displayed a "subsetting in progress" message before the processing started.
Amazon Redshift
Fixed an issue where data generation failed when the name of one table was part of the name of another table (for example, Test and Test1).
MongoDB
Generators are now applied correctly to documents that are nested in arrays.
Oracle
Removed the permissions check for reading truncated tables.
Fixed an issue where DBLinks created for data generation were not removed when the generation failed or was canceled.
Snowflake
The Categorical generator no longer returns errors when applied to binary, time, and date columns.
Improved the performance for loading the user interface when the data contains a large number of foreign keys.
Spark
The Audit Trail entries on Privacy Hub now display the correct table and column names.
Corrected an issue where sub-generators for the Conditional generator were not applied correctly.
Improved error messages when testing the connection to a Databricks database.
November 23, 2022
Updates
Improved user signup experience for the hosted version of Tonic.
Made some small styling updates to Database View. Removed the dropdown to uncheck or check all of the tables. Removed the highlighting on the table mode dropdown for tables that contain columns with assigned generators. Changed the Filter button labels to Filters.
For subsetting, Tonic now verifies that primary key fields are not assigned a non-primary key generator.
Amazon Redshift
Improved error messaging for permissions for data generation.
Improved performance for data generation.
MongoDB
Fixed an issue that caused errors when applying the Timestamp Shift generator.
Snowflake
Improved error messaging for permissions for data generation.
Improved performance for data generation.
Spark
Improved error messaging for unsupported versions.
SQL Server
Improved handling of alphanumeric sequences.
Case-sensitive databases now connect correctly to Tonic.
Corrected how indexing recommendations for subsetting are determined.
November 18, 2022
Enhancements
Other updates
Tonic no longer offers DB2 as a data connector type.
On the Job History list, changed the job type "Privacy Scan" to "Sensitivity Scan".
Corrected a display issue in Privacy Hub where the column details panel extended past the bottom of the page.
For the JSON Mask generator, the path expression selection tool now works for arrays and for keys that contain spaces or special characters.
Tonic now provides a more meaningful error when Preserve Destination mode is assigned to a table in a workspace that does not have a defined destination database.
Added a message to notify users when queries used to validate a subset WHERE clause run for a long time.
Tonic now continues to record logs when a job fails.
Improved memory usage when running data generation.
MongoDB
Corrected a subsetting issue that occurred when multiple foreign keys in one collection referred to the same primary key in another collection.
MySQL
Updated the MySqlConnector driver to 2.2.0.
Oracle
Downgraded the Oracle drivers to 3.21.70 after encountering issues with 3.21.80.
Corrected an issue with the Event Timestamps generator.
PostgreSQL
When executing psql and pg_dump commands, we now properly enforce the requirement to use SSL for connections.
Corrected connection issues by escaping special characters in password files.
Redshift
Fixed an issue where Tonic failed to display an error for an invalid generator assignment.
Snowflake
Fixed an issue where Tonic failed to display an error for an invalid generator assignment.
Fixed an issue where large tables caused data generation to fail with "Error: Authentication token has expired. The user must authenticate again."
Spark
You can now apply the Timestamp Shift generator to string fields that contain datetime values.
November 11, 2022
Enhancements
Tonic can now integrate with Keycloak for SSO authentication.
Other updates
Fixed an issue where users received the error HTTP/1.1 415 Unsupported Media Type from the /api/GenerateData/start endpoint.
Improved performance for the /api/job endpoint. As part of this update, the endpoint only returns the most recent 100 jobs.
Improved loading time for the Job History view, which now displays only the 100 most recent jobs.
Updated how Tonic determines whether to use SSL for connections.
The Add Foreign Key Relationships tab on the Foreign Keys view now expands to fill the height of the browser.
Fixed an issue where a failed cleanup step incorrectly marked a Tonic update as failed.
MongoDB
When performing a schema scan, Tonic now displays a progress indicator.
Improved subsetting performance.
Fixed an issue where subsetting failed because of duplicate keys.
Corrected how Tonic handles views.
Redshift
Improved memory management.
Snowflake
Fixed an issue where the Tonic user interface could not load when the database included foreign keys.
Improved memory management.
November 4, 2022
Enhancements
If data generation is not blocked on all schema changes, Tonic now displays a dismissible warning when there are non-conflicting schema changes. Conflicting schema changes always block data generation.
For the GenerateData API endpoint, added an optional clientResourceId query parameter. When you provide a value, then jobs that have the specified clientResourceId run serially instead of in parallel. The check applies to all jobs across the instance, regardless of whether they belong to the same workspace.
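A minimal sketch of attaching the parameter to a request URL follows. Only the /api/GenerateData/start path (mentioned in a later entry) and the clientResourceId name come from these notes. The host, the workspaceId parameter, and both values are hypothetical placeholders, and whether this exact path accepts the parameter is an assumption.

```python
from urllib.parse import urlencode, urlsplit, urlunsplit


def with_client_resource_id(start_url: str, client_resource_id: str) -> str:
    """Append clientResourceId to an existing GenerateData URL."""
    parts = urlsplit(start_url)
    extra = urlencode({"clientResourceId": client_resource_id})
    query = f"{parts.query}&{extra}" if parts.query else extra
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, parts.fragment))


# Hypothetical host and workspaceId value, used only for illustration.
url = with_client_resource_id(
    "https://tonic.example.com/api/GenerateData/start?workspaceId=abc123",
    "nightly-refresh",
)
print(url)
```

Jobs started with the same clientResourceId value would then queue behind one another, per the note above, even when they belong to different workspaces.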
Other changes
Fixed an issue that made it difficult to click the Cancel Job button in the Job History list.
On the workspace configuration view, the source and destination database details are now populated correctly when you refresh the page.
When granting access to a workspace, improved how we display long names and email addresses.
For tables that use Scale mode, removed Passthrough from the generator selection dropdown list. Previously, the option displayed in the dropdown list even though it couldn't be selected.
Improved performance for looking up tables and columns.
Corrected an issue where the same environment variable with different cases caused Tonic to crash.
When logging in using single sign-on (SSO), when the email address uses a different case from an existing username-password account, it now resolves to the same user.
For instances deployed using Docker Compose, Tonic now cleans up old, unused images.
Improved performance when running generators on de-identified tables.
Corrected an issue where the Event Timestamps generator produced unexpected values for linked columns.
Improved error handling for account creation.
MongoDB
Updated MongoDB.Driver and MongoDB.Bson to 2.18.0.
Subsetting jobs with foreign keys can now run on collections that have lower permissions. To enable this, set the environment variable TONIC_BYPASSDOCUMENTVALIDATION_ON_DOWNSTREAM_KEY_MERGES to false. The default value is true, which means that the destination database requires the dbAdmin role in addition to readWrite.
Oracle
Tonic now automatically uses the optimal table prefix for schema queries. The TONIC_ORACLE_DICTIONARY_TABLE_PREFIX environment variable is removed.
Improved error handling for missing permissions for table size queries.
SQL Server
Improved performance for table queries.
October 28, 2022
Enhancements
The Subsetting view now shows for each table the percentage of data that is included in the destination database.
Other updates
The workspaces view no longer briefly flashes a message indicating that the workspace cannot be found.
Added the ability to display trace information in the log files. To enable the trace information, set the environment variable TONIC_LOG_TRACES to true.
Fixed an issue that caused an incorrect warning to display when linking columns that were assigned the Custom Categorical generator.
Fixed a data type error in AI Synthesizer for models that only contain categorical data.
For the Event Timestamps generator, Tonic now prevents the generator from being assigned to a time-only value.
Amazon Redshift
Added ServerCompatibilityMode to the Redshift connection string to prevent connection errors.
Databricks
Added support for Databricks 10.x.
MongoDB
Fixed an issue where Tonic generated collection names in Preserve Destination mode that exceeded the maximum length limit.
MySQL
The MySQL data connector now supports the ability to use the DELIMITER command in post-job scripts.
Oracle
Improved performance for schema queries.
Improved performance for databases without database links.
Restored missing information for loader errors.
Improved performance for applying indexes and constraints.
Added the environment variable TONIC_ORACLE_REDO_LOG_ENABLED, which by default disables recovery information writes to REDO LOG files.
Improved performance for retrieving tables and columns.
Fixed an issue with loading custom types.
Fixed an issue where multiple indexes on the same column caused data generation to fail. This corrects a regression from V611.
When fetching constraints, Tonic can now support primary key indexes that have the same name.
Excluded views from table queries.
PostgreSQL
Updated the process of loading tables to improve performance and eliminate an error indicating that PostgreSQL types could not be found.
SQL Server
Fixed an issue where Preserve Destination and Incremental table mode could not be used in tables with foreign key relationships.
Tonic now provides a more informative error when data generation fails because the database is not accessible.
Tonic now properly handles identifiers with single quotes.
Fixed an issue that caused a worker process to crash when using Azure to connect to the database.
Added additional logging of the available databases.
Improved handling of schema and table names in privacy scans.
October 21, 2022
Enhancements
You can now generate DEBUG level logs for the Tonic API. To do this, set the environment variable TONIC_CONSOLE_LOG_LEVEL to DEBUG.
Tonic now supports logging for long-running queries. The environment variable TONIC_LONG_RUNNING_QUERY_LOGGING_INTERVAL provides the interval in minutes for logging queries. By default, Tonic generates a log entry for a long-running query every 10 minutes. To see this information, TONIC_CONSOLE_LOG_LEVEL must be set to DEBUG.
Other updates
Fixed an issue that prevented the use of Preserve Destination and Truncation modes on tables and collections that had names that were close to the maximum length limit.
You can now assign the Integer Key generator as a sub-generator for string values in composite generators.
For Google SSO, fixed an issue where users who did not have a group membership could not access Tonic.
Improved performance for format-preserving encryption (FPE), which is primarily used for key generators.
Improved generation performance for tables that use De-Identify table mode.
Improved error display when reading data from the source database.
Fixed an issue where values for columns that were assigned the Passthrough generator were being masked with 1s and 0s.
The User Settings page no longer displays the password change option for SSO users.
For the JSON Mask generator, fixed an issue where a large matching value caused the buttons to move off of the configuration dialog.
Fixed an issue where data connection pooling caused data generation to fail because of colliding queries.
Google BigQuery
Fixed an issue where source data was not loading properly into the Tonic user interface.
MongoDB
Fixed an issue where the generator configuration panel closed unexpectedly after making edits.
Oracle
Added a warning when a generator produces a value that is truncated in the destination database.
When Tonic retrieves table sizes, the results now include the size of the associated large objects (LOBs).
To help with troubleshooting, Tonic can now determine which Oracle patches were applied to a database.
Fixed an issue that caused errors when reading a large object array.
Added additional checks to the data connection tests.
Updated our Oracle Driver to 3.21.80.
Fixed an issue where primary keys could not be copied to another index.
Additional performance improvements.
Corrected how we check the current user's database privileges.
PostgreSQL
Added an option to not refresh materialized views. To turn off the refresh, set the environment variable TONIC_POSTGRES_REFRESH_MATERIALIZED_VIEWS to false.
Fixed an issue with parsing PostgreSQL schemas in Windows environments.
Fixed an issue where data generation failed when the data contained read-only arrays.
Snowflake
Fixed an issue where data generation failed when the schema definition (DDL) contained row access policies.
SQL Server
Added permission checks to the data connection tests.
Fixed an issue where Tonic processes failed.
For a Docker deployment, in docker-compose.yaml, remove the tonic_pii_detection section.
In a Databricks workspace, you can now choose to write all of the output tables to one of the following formats: Avro, JSON, Parquet, Delta, CSV, ORC. This setting replaces the previous option to write all of the output tables to Databricks Delta.
New endpoints and expanded API documentation for generators - For generators, the now contains descriptions for each endpoint and for all of the model properties. You can . You can now , instead of having to provide the configuration for an entire table. There is also a new endpoint to .
Tonic now supports .
Graph View is a diagram view that displays the tables and the relationships between them. Similar to Table View, when you click a table in Graph View, the table details panel displays for that table. You can configure subsetting from either view.
On Graph View, Tonic adds a marker when the subset configuration for a table changed since the last subsetting data generation. A table might be added to the subset, removed from the subset, or modified within the subset.
The new feature, available for Professional and Enterprise licenses, allows you to set up a configuration to decrypt source data before applying a generator, encrypt transformed data before writing it to the destination database, or both.
The new allows you to de-identify object ID columns.
Added the ability to provide .
Added the ability to provide .
As of V724, the AI Synthesizer generator is by default not available. To enable the AI Synthesizer generator, set TONIC_NN_GENERATOR_ENABLED to true. If the AI Synthesizer generator is assigned in a table, but the environment variable is false, then data generation fails.
During subset processing, when Tonic encounters a circular dependency, to break the circle, it nullifies the values of one of the foreign key columns. By default, it nullifies all of the foreign key column values. A new environment setting, TONIC_SUBSETTING_AGGRESSIVELY_NULL_CYCLICAL_FKS, controls the behavior. By default, the value is true. If you set the value to false, then Tonic only nullifies foreign key values that do not exist as primary key values in the other table.
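The two modes can be illustrated with a small sketch. This is not Tonic's implementation. The function `break_cycle`, its arguments, and the sample rows are all hypothetical, and only the true and false behaviors come from the note above.

```python
def break_cycle(rows, fk_column, existing_pks, aggressive=True):
    """Return copies of rows with fk_column nulled to break a circular dependency.

    aggressive=True  -> null every foreign key value in the column.
    aggressive=False -> null only values that are absent from existing_pks,
                        the primary key values present in the other table.
    """
    result = []
    for row in rows:
        row = dict(row)  # copy so the input rows are not modified
        if aggressive or row[fk_column] not in existing_pks:
            row[fk_column] = None
        result.append(row)
    return result


# Hypothetical subset: customer 99 did not make it into the destination.
orders = [
    {"id": 1, "customer_id": 10},
    {"id": 2, "customer_id": 99},
]
customer_pks = {10, 11}

print(break_cycle(orders, "customer_id", customer_pks, aggressive=True))
print(break_cycle(orders, "customer_id", customer_pks, aggressive=False))
```

With the aggressive mode off, the first order keeps its reference because customer 10 exists, and only the dangling reference to customer 99 is nulled.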
As of V719, we have redesigned the Tonic navigation. The left navigation menu is removed. On Workspaces view, click the workspace name to display the workspace management view for that workspace. The workspace management view contains a horizontal navigation bar to provide access to the workspace configuration and generation tools, and a heading menu to provide access to other workspace actions.
As of V717, Google BigQuery supports table filtering. On the table mode selection panel, for tables that use De-Identify mode, you can provide a WHERE clause to filter the records that are included in the destination data.
From the Subsetting view, you can now view the table configuration and results summary for previous subsetting data generation runs. On the , you select the run to view the details for. Note that you can only view the details for jobs that you run after you upgrade.
Tonic now supports post-job scripts in BigQuery workspaces. Post-job scripts run inside of transactions. They are limited to .
Overriding table mode and generator configuration in child workspaces - The feature now allows child workspaces to override the table mode and generator configuration from the parent workspace. Database View and Table View indicate when the parent configuration is overridden, and provide options to reset the configuration.
When using Helm to deploy Tonic via Kubernetes, integer environment variable values that are longer than 6 digits might be converted to scientific notation. To avoid this issue, enclose the values in quotes. Tonic can now parse scientific notation to better handle this behavior for values that are not in quotes.
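The parsing behavior can be sketched as follows. This is an illustration, not Tonic's parser: `parse_int_setting` is a hypothetical name, and using Python's `decimal` module (to avoid floating-point rounding on large values) is a design choice made here.

```python
from decimal import Decimal, InvalidOperation


def parse_int_setting(raw: str) -> int:
    """Parse an integer given in plain ("1000000") or scientific ("1e+06") notation."""
    try:
        value = Decimal(raw.strip())
    except InvalidOperation:
        raise ValueError(f"not a number: {raw!r}") from None
    # Reject NaN/Infinity and values with a fractional part such as "1.5e0".
    if not value.is_finite() or value != value.to_integral_value():
        raise ValueError(f"not an integer: {raw!r}")
    return int(value)


print(parse_int_setting("1e+06"))    # an unquoted 1000000 may arrive in this form
print(parse_int_setting("1000000"))  # a quoted value arrives unchanged
```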
Added filters for upstream subset records - In the subsetting configuration, for upstream related tables, you can now filter the records to include based on either a date value or a WHERE clause. Upstream tables contain data that has a foreign key that references a primary key in a target table. Upstream records are optional, and are not needed for referential integrity.
The Confirm Generation panel now provides access to tips to improve data generation performance.
The now requires the s3:ListBucket permission.
Terminology change - In the Tonic documentation, we have changed the term "mask generator" to "".
June 17, 2022
When users create a new password, Tonic now displays a panel with the password requirements, and indicates whether the password meets those requirements.
Improved the parallelization and concurrency for processing foreign key constraints.
Databricks and Spark EMR
Improved the performance of the Noise generator.
MongoDB
Improved display of longer key values in the Key column. Widened the column and added truncation.
Optimized queries against the Tonic database.
June 10, 2022
Features/Enhancements
For Postgres databases, Tonic now supports name and char data types.
For Tonic single sign-on, Tonic now supports Azure Active Directory.
From the administration screen, administrators for customers that run Tonic on Kubernetes and Docker can now download logs from all containers that run Tonic.
For Spark-powered integrations, Tonic now supports the Address generator as a sub-generator.
Bugs
Improved performance for:
Random Hash generator on Spark and Databricks
UUID Generator on Spark
Subsetting
Improved the browsing experience on low resolution displays.
May 27, 2022
Features/Enhancements
Skip batch instead of failing generations on Postgres in some cases when values fail to be inserted
Improvement to Synthesis Reports for AI Synthesizer generated data
Changing Lambda deployment from ECR to ZIP for integrations using Lambda functions. Images no longer need to be manually deployed to ECR.
Bugs
Fix for One-Click Tonic update when the postgres application database is deployed in a Docker container
May 25, 2022
Features/Enhancements
Support for arrays with Struct Mask generator on EMR/Spark and Databricks
Bugs
Fix issue causing privacy scans on Dremio to fail
Minor UI fixes to workspace page
Fix for data generation issue on Db2 iSeries when destination is empty
More robust handling of Databricks host URLs
May 24, 2022
Features/Enhancements
Adds data types to all keys in Mongo Collection view and tooltip popovers to key names
Updates to data generation UI
Bugs
Fix shaky dialog when table mode resizes
May 23, 2022
Features/Enhancements
Add data type advanced search for Oracle, Redshift, and BigQuery
Performance improvements for constraint restoration
Improved performance of Random Integer generator on Spark
Bugs
Fix issue on SQL Server preserve destination mode when table names contain a ".".
Postgres: Fix issue copying arrays with trailing slashes
May 20, 2022
Features/Enhancements
Allow the console log level to be configured on the Web Server and Worker
User passwords can now be reset by an Admin in the Admin Panel
May 19, 2022
Features/Enhancements
Added compatibility with SQL Server 2012
Spark SDK: Support for validation of workspace when processing a dataframe
Bugs
Minor fixes for MongoDB integration
May 17, 2022
Features/Enhancements
Support graceful cancellation of privacy metric computation for AI Synthesizer
Java SDK documentation available from the SDK Setup dialog
Bugs
Only show SSO and Enterprise licensing information to users when appropriate
Minor UI fixes for scrolling on the workspace view table
Minor UI fixes for the Edit Workspace page
May 17, 2022
Features/Enhancements
For S3/EMR data source, support separately testing each component of the connection (Glue catalog, EMR cluster, and S3 bucket)
The Additive Noise Generator is renamed to "Noise Generator" with two noise options "Additive" (existing) and "Multiplicative" (new)
Improve Javadocs for Spark SDK and other enhancements
Bugs
Fixes for the SQL beautifier on post-job scripts, including scenarios which may have crashed the app on Safari
Skip processing Temporal Types when not supported (SQL Server 2012) to prevent job failures
May 16, 2022
Features/Enhancements
Add hourly heartbeat/status log message for web server and worker
Group database/integrations on the create workspace screen by type
May 13, 2022
Features/Enhancements
Add support for Azure Datalake Storage Gen 2 as an output destination for Databricks
Don't include statistics for Oracle schema copy
MongoDB: Support uploading foreign key file, support deleting configured foreign keys
Bugs
Do not stop generation in SQL Server if schema pre-fetch fails
Improved deletion of old logs to prevent timeouts
May 13, 2022
Features/Enhancements
Adds support for Spark 3.1
Improvements to Dremio for data type searching, Struct support, and a new getVersion method on SDK
Bugs
Fix workspace name not updating in dropdown after rename
Don't allow non-editor/owner to open table mode selector
May 12, 2022
Features/Enhancements
Update subsetting tooltip help text
Don't check destination recovery mode on Azure SQL
Improve performance of table constraint application when subsetting by applying single and cross table constraints in parallel
Bugs
Fix logo navigation redirect when signed in as a user with no workspaces
Fix issue with setting a partition filter on Spark
Fix issue with collection picker on MongoDB showing removed collections
Fix excessive memory usage of Hostname generator
May 10, 2022
Features
Improves performance of Custom Categorical generator on Spark and Databricks
Add support for Azure Databricks
Display SQL Server schema generation progress as a percentage
Bugs
Do not fetch column data on configuration popover open to reduce queries against the source database
Reduce the number of queries to the source database for table/column details during generation
May 10, 2022
Features
Spark 2.4 and Databricks Library Improvements
Add ability to remove users in the Admin Panel
Bugs
Fixed issue with the displayed port when connecting to MongoDB via connection string
Fix for data table column resizing not rendering properly
Support random latitude and longitudes with HIPAA Address generator
May 9, 2022
Features
New design for the workspace dropdown selector
Add warning when previewing AI Synthesizer on text which does not appear to be categorical
Bugs
MongoDB: Collections no longer uncollapse when a generator is applied in hybrid view
May 6, 2022
Features
Add support for Spark SDK on Spark v2.4
Bugs
Fix UI crash when selecting incremental as Table Mode
Fix AI synthesizer when no categorical columns present
Ensure AI Synthesizer model is not trained if only one column is selected
Render HTML Mask in the generator model panel on table view
Fix issue with not being able to set incremental table mode column if you refresh after setting it to incremental
May 5, 2022
Features
Added a pre-job check to guard against cycles in applied generators
Better support for environments where websockets are not supported
Add environment flag to disable websockets on API
Gracefully fail if websockets can't connect
May 3, 2022
Bugs
Fixed issue preventing connecting to Postgres databases using an SSH tunnel
Fixed nullability error in US Phone generator
May 3, 2022
Bugs
Fix issue where collection loads were attempted for non-MongoDB databases in some instances
Write pg_dump results directly to file instead of reading them via stdout to prevent process locks
Fixed issue where copying a workspace on the Workspace View doesn't update the table of workspaces
May 2, 2022
Features
Support applying generator values for PK fields in MongoDB to their FK fields
Bugs
Reduce Frequency of Upstream table filtering logs
Remove double scroll bar on foreign key page when viewing foreign keys
Improved database introspection queries on IBM Db2 iSeries
April 29, 2022
Consistently clear local storage on authorization failures
Aggregate dropped row exceptions for more efficient logging
April 28, 2022
Features
Improved performance for schema scans on MongoDB
Bugs
Minor fixes to workspace view
Fix for concurrent lambda function update issues
Fix brief non-ideal state while loading collections on MongoDB
April 27, 2022
Features
Allow skipping of privacy scan on MongoDB using TONIC_TABLE_SKIP_REGEX environment variable
August 12, 2022
Added a new Max Categorical Dimension parameter to the AI Synthesizer configuration. This parameter controls the dimension of each column that has categorical or location encoding. If a column contains more distinct categories than this parameter, the most frequent categories are embedded as distinct one-hot vectors. The remaining categories are combined into a single one-hot vector.
Improved error identification for an invalid where clause in subsetting configuration.
On the subsetting page, for tables that were not previously in the subset, the row count is now correctly represented as unknown instead of 0.
Fixed an issue with the Tonic update option in the Tonic application.
Configuring a subset target table to include 100% of the records no longer causes an error during data generation.
Removed connection pooling from Tonic workers to address database connection issues during data generation.
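The Max Categorical Dimension behavior described above can be illustrated with a minimal sketch. This is not Tonic's implementation. The function names are hypothetical, and the sketch assumes that the shared vector for the remaining categories occupies the last slot, so that a column never uses more than the configured number of dimensions.

```python
from collections import Counter


def build_category_index(values, max_categorical_dimension):
    """Map each category to a one-hot slot (hypothetical helper).

    Returns (index, dimension), where index maps category -> slot number.
    """
    counts = Counter(values)
    if len(counts) <= max_categorical_dimension:
        # Few enough distinct categories: each one gets its own slot.
        ordered = [cat for cat, _ in counts.most_common()]
        return {cat: i for i, cat in enumerate(ordered)}, len(ordered)

    # Assumption: the last slot is reserved for all of the remaining categories.
    top = [cat for cat, _ in counts.most_common(max_categorical_dimension - 1)]
    index = {cat: i for i, cat in enumerate(top)}
    shared_slot = max_categorical_dimension - 1
    for cat in counts:
        index.setdefault(cat, shared_slot)
    return index, max_categorical_dimension


def one_hot(value, index, dimension):
    """Encode a single value as a one-hot list of 0s and 1s."""
    vec = [0] * dimension
    vec[index[value]] = 1
    return vec


# Four distinct categories, but only three dimensions allowed.
column = ["a"] * 5 + ["b"] * 3 + ["c"] * 2 + ["d"]
index, dim = build_category_index(column, max_categorical_dimension=3)
encoded_a = one_hot("a", index, dim)  # frequent category, own vector
encoded_d = one_hot("d", index, dim)  # rare category, shared vector
```

In this example, "a" and "b" are the most frequent categories and keep distinct vectors, while "c" and "d" share one.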
MongoDB
Added support for partial indexes.
Fixed an issue where the configured generators were not applied.
Indexes that use the collation option are now properly recreated in the destination database.
The subsetting user interface now uses the correct terminology for MongoDB.
Improved performance for Mongo subsetting when handling downstream tables.
Fixed percentage-based subsetting for Mongo versions before 4.4.2.
MySQL
Added support for HASH partition parallelization.
Running data generation on masked and passthrough tables with ranged sub-partitions no longer results in duplicated data.
Added support for parallel uploads with ordering.
Oracle
Added new environment variables (ORACLE_TRACE_LEVEL, ORACLE_TRACE_FILE_LOCATION, ORACLE_TRACE_FILE_MAX_SIZE, and ORACLE_TRACE_OPTION) to enable Oracle tracing.
Added support for datetime components in composite keys for subsetting.
Unique indexes are now detected and users cannot apply generators that might violate the enforced uniqueness.
Improved performance for data generation.
Improved handling of schema names during data generation.
Users are no longer incorrectly removed from the destination database.
PostgreSQL
Upgraded npgsql to address an issue with cross-schema types.
Spark
Added support for Kerberos authentication for HDFS with Spark / Livy.
Added support for repartition and coalesce options for Spark EMR and Livy.
Repartition and coalesce options can now be saved on Databricks.
Added support in Hive for varchar and char fields that have lengths.
August 5, 2022
Enhancements
On Privacy Hub, the list of unprotected sensitive columns is replaced with a Database Tables list. The Database Tables list summarizes the protection status of each table in the source database. The Privacy Status column summarizes the protection status of the columns in the table. It provides access to the same column details and configuration options as the protection status panels at the top of Privacy Hub.
You can now filter the Workspaces view based on your assigned role in the workspace. For example, you can only display workspaces for which you are an Owner or an Editor. To use the role filter, click the filter icon in the Role column heading. For admin users, the role filter includes a Role assigned toggle to allow them to only see workspaces that they have a role in.
Other updates
Customers in the Basic license tier can now transfer ownership of workspaces and assign workspace roles to other users.
New users are now logged into Tonic immediately after they create their Tonic account.
For AI Synthesizer:
The modeling now incorporates static features for each entity across a sequence of events.
AI Synthesizer now queries to retrieve the actual minimum/maximum values of relevant columns, instead of taking the minimum and maximum of a sample of the data.
To update a license key, self-hosted instances that do not have an admin user can set the license key as the value of the TONIC_LICENSE environment variable. Tonic ignores the variable in instances that have an admin user.
The Workspaces view no longer waits for users to finish applying several new filters in quick succession before it fetches new results.
On the Job History view, fixed an issue where the copy job ID and download logs icons flashed on hover. Removed an error that flashed when job details were displayed.
Fixed the parallel processing for subsetting.
MongoDB
Indexes are now copied to the destination database.
Percentage-based subsetting is now supported in versions earlier than 4.4.2.
Deleted collections are no longer displayed.
MySQL
Improved memory handling for uploaded CSV files.
Oracle
Long and long raw columns are no longer converted to blob or clob.
Spark
Added Maven repository and artifact information to use the SDK Launch method to download the Spark SDK.
Improved the performance of data generation on Databricks when using Job Cluster. Added support for SQL Warehouses on Databricks.
July 29, 2022
Added Sequence Length Loss Factor and Order Column Loss Factor model configuration options for events data to the AI Synthesizer. Sequence Length Loss Factor indicates the importance of realistic sequence lengths in the model. Order Column Loss Factor indicates the importance of realistic column value ordering in the model.
For the Categorical generator, differential privacy is now off by default.
Increased the amount of time after which an inactive job is assumed to be canceled.
MongoDB
Fixed a subsetting issue that caused errors when there was missing data.
MySQL
Tonic now handles the BIT data type correctly.
Oracle
Decimal values that are larger than the dotnet decimal data type can now be handled.
Redshift
Fixed an issue that caused the system to crash when you clicked Test Connection during workspace creation.
Fixed an issue that prevented more than one generation run per version.
Fixed a HIPAA resource issue that caused data generation to fail.
Snowflake
Fixed an issue that caused the system to crash when you clicked Test Connection during workspace creation.
Fixed an issue that prevented more than one generation run per version.
Fixed a HIPAA resource issue that caused data generation to fail.
Spark
Fixed an issue with applying the Null generator to a struct array column.
SQL Server
Synonyms are now created correctly during data generation.
Fixed an issue with subsetting based on a percentage of a target table.
July 22, 2022
New features and enhancements
Redesigned the user experience for the subsetting feature. The new subsetting view displays a list of the source database tables and indicates whether each table is in the subset. When you click a table, the table details display in a new right-hand panel. From the details panel, you can configure target tables. You can also identify lookup tables (previously referred to as reference tables), and indicate how to handle tables that are not in the subset. For details, see About subsetting.
If subsetting is configured, when you run a data generation job, you can enable or disable subsetting. For details, see Running a data generation job.
Added latitude and longitude processing to the HIPAA Address generator.
You can now filter Database View based on the applied generator. On the filter panel, in the Applied Generator field, you can provide the list of generators to include.
Redesigned the Schema Changes page. The Actions list is now called Conflicting Schema Issues. Tonic provides clearer warnings when a schema change resolution will result in a change to the workspace configuration. Resolving a conflicting issue or bulk dismissing non-conflicting issues now includes a confirmation step. See Viewing and resolving schema changes.
Other updates
Renamed the Events generator to Event Timestamps.
Fixed some small display issues in the new Privacy Hub and Subsetting displays.
Improved event generation for the AI Synthesizer.
Improved how we retrieve CloudWatch logs to include the job ID and to use the correct Tonic version.
Corrected the processing of downstream, multi-index tables during subsetting. Corrected an issue in the initial fix.
Improved memory usage during data generation.
PostgreSQL
Corrected how permissions are validated for Heroku.
Oracle
Tonic now prevents data generation jobs from running when the Oracle versions for the source and destination database do not match.
Improved the performance of writes to the destination database.
After deleting a table in the destination database, Tonic now also clears out the recycle bin.
Fixed generation for LONG and LONG RAW fields.
Snowflake
Fixed test connection and generation in response to an AWS API change.
Redshift
Corrected the handling of time zones in timestamps.
Fixed test connection and generation in response to an AWS API change.
MongoDB
UUIDs larger than 16 bytes are now truncated.
Improved the use of MongoDB resources.
Improved the display format of LUUIDs.
July 15, 2022
New features and enhancements
We enhanced Privacy Hub to add expanded top-level metric panels. These panels show the number of unprotected sensitive columns, protected columns, and unprotected non-sensitive columns. From these panels, you can display column details, select and configure the column generator, view sample data, and add column comments. For more information, see Privacy Hub.
You can now add or update a Tonic license from the Tonic application. For a new instance, you are prompted to provide the license key before Tonic displays the login screen. Tonic displays a message when the current license is expired. The Admin Panel includes an option to update the license key. For more information, see Entering and updating your license key.
For the Address generator, added City with State and City with State Abbr to the available options for the column format. You use these options for column values such as San Francisco, California or Boston, MA.
Tonic now supports subsetting for MongoDB databases.
Workspaces can now have a description (up to 200 characters) as well as a name. Use the description field to provide additional context for the workspace and how it is used.
In the Tonic API, you can now sort workspaces based on the last generation date.
Other updates
Tonic now prevents a job from running when the worker determines that the server is running a different version of Tonic.
Refactored the underlying implementation of the subsetting feature.
Made some small memory improvements for data generation.
Implemented performance improvements when applying parallel constraints.
Corrected errors for edge cases related to the Audit Trail.
When the selected workspace changes, the identifier in the URL is now updated correctly.
The workspace configuration is migrated to a data type that enables more efficient querying.
Tonic now validates uploaded foreign keys against the table definitions.
Oracle
A destination database can now have more than 1000 tables that have the table mode set to Preserve Destination.
PostgreSQL
Batch sizes are now set dynamically based on the average row size.
Memory improvements for PostgreSQL data generation that involve large rows.
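One way to set a batch size from the average row size, as described in the PostgreSQL notes above, is to aim for a fixed number of bytes per batch. The sketch below only illustrates that idea. The function name, the 8 MiB target, and the row limits are all hypothetical and are not taken from Tonic.

```python
def dynamic_batch_size(avg_row_bytes,
                       target_batch_bytes=8 * 1024 * 1024,  # hypothetical target
                       min_rows=1,                          # hypothetical floor
                       max_rows=10_000):                    # hypothetical cap
    """Choose a row count so that one batch stays near a target size in bytes."""
    if avg_row_bytes <= 0:
        # No usable size information: fall back to the cap.
        return max_rows
    rows = int(target_batch_bytes // avg_row_bytes)
    # Clamp so that very large rows still make progress
    # and very small rows do not produce huge batches.
    return max(min_rows, min(max_rows, rows))


small_rows = dynamic_batch_size(100)               # capped at max_rows
medium_rows = dynamic_batch_size(1024)             # 8 MiB / 1 KiB
huge_rows = dynamic_batch_size(16 * 1024 * 1024)   # one row per batch
```

With this approach, tables with large rows are processed in smaller batches, which bounds memory use per batch.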
Spark
Improved support for foreign keys.
Added support for Apache Livy and HDFS.
Improved performance and added SDK support for the Integer Key generator.
MongoDB
MongoDB aggregations can now use temporary files on disk to store data that exceeds the MongoDB size limit. This expands the possible generations for a MongoDB database.
Corrected the generation and display of UUIDs.
ObjectIds can now be used as primary keys for subsetting in MongoDB.
Improved the usability of the Add Foreign Key Relationships tab on the Foreign Key Relationships view.
Oracle
Improved how Tonic handles maximum lengths when it generates the following data types:
NCHAR, NVARCHAR2, CHAR, VARCHAR2
July 5, 2022
SQL Server
Eliminated duplicate default constraint URNs during database creation.
July 1, 2022
Refreshed the Audit Trail user interface on Privacy Hub. The new Protection Audit Trail provides a paginated list of the updates to the sensitivity designation and generator assignments.
Deep links now work correctly when you use Google SSO to authenticate.
Error messages from Oracle are now displayed in response to invalid where clauses in subset configuration.
Made minor memory improvements to the Address generator.
Snowflake
Reduced the frequency of schema change detection on Snowflake databases. This can result in cost savings on Snowflake clusters, because the clusters can sleep more often.
Snowflake generation now works correctly when there are foreign key constraints.
SQL Server
Added support for security policies, sequences, check constraints, and system versioned temporal tables.
June 24, 2022
Improved cross-tab support for automatic logouts when you configure an inactivity period.
The Update option in the actions menu now takes you directly to the System tab on the Admin Panel instead of the Users tab.
Corrected the password length requirement to be 12 characters or greater instead of greater than 12 characters.
Improved the estimated row progress for scaled tables.
Eliminated a race condition that occurred when applying constraints.
SQL Server:
The Categorical generator can now support more than 2 billion rows in a category.
Databricks:
Can now run concurrent jobs that use different versions of Databricks.
Tonic now supports ORC and Hive tables in Databricks.
April 27, 2022
Features
Improve usability of processing with Java SDK
Bugs:
Close connection to application database when not being used
Removed excess querying of database version
April 26, 2022
Features
Improves performance of Name Generator on Spark / Databricks
Change Dremio schema input to Tag Input and add an informational popover
Bugs:
Do not null FKs between reference tables
Allow foreign keys to be added by editors
April 26, 2022
Features
Improves performance of the SSN Generator on EMR Spark / Databricks
Bugs:
Keep the password field empty when copying from source database settings to destination database
Improved exception handling when masking and preserving destination tables
More quickly alert user when a cycle has been found in their relationships
Handle Postgres interval data types with a null boundary
Disable Add Foreign Keys tab for non-workspace owner users
April 25, 2022
Features
The Workspaces view now supports bulk actions. Share, transfer, delete, and leave multiple workspaces by checking the boxes at the right end of each row, and choosing the desired action from the Actions menu.
Bugs:
Fixed an issue where Safari would fail to load the application
Fixes to constraint application parallelism
Breaking Changes
The response for the Workspace deletion endpoint (/api/workspace/{workspaceId}) has been modified slightly
April 22, 2022
Features
Enable Timestamp Generator To Work Without Statistics For Spark Object Masks
When a mongo foreign key field has a primary key with a generator applied, replace the example data with a link to the collection with the primary key
Bugs
Fix for MySql generation hanging with empty password
April 21, 2022
Features
Improves performance of Name Generator on EMR and Databricks
Bugs
Fix Display Bug in AI Synthesizer Configuration Panel
Fix application of constraints in Redshift
April 20, 2022
Features
Add the version of the worker that ran a job to the job details
On the Mongo Collection View for foreign/primary keys: Added a hover tooltip for key icons, replaced the generator with a label
Refresh the last collection visited when a new foreign key file is uploaded and optimize single_doc foreign key
Bugs
Disable date truncation generator as a subgenerator for Spark
Fixes Notification service consuming all disk space when crashing
Show the appropriate Conditional sub-generator label instead of passthrough
Fixed in memory table query for SQL Server
April 19, 2022
Features
Teardown and database creation performance improved for SQL Server
Skip preserve destination tables (just like truncated) for privacy scan
Bugs
Fix issue where generators were not allowed on arrays that were scanned prior to v439
April 18, 2022
Features
Adds support for events (dependent rows) to AI Synthesizer (formerly known as Smart Linking)
Allow wildcard (%) in schema name for Dremio
Bugs
Support Date Truncation and Timestamp Shift on Snowflake TIMESTAMP_TZ and Redshift TIMESTAMPTZ columns
Disable the generator dropdown in the UI for columns with both primary and foreign keys
MongoDB - Fix errors when linking Categorical Generators
Bug fixes for synthesis report
Fix data type mappings for Dremio Integer and Varchar
April 14, 2022
Features
Display whether webhooks were used in job details
Terminology changes
Masked is changing to De-identify
Synthesized is changing to Scale
Truncated is changing to Truncate
Generator name change: Smart Linking is changing to AI Synthesizer
Bugs
Fixes to SQL script beautifier
April 14, 2022
Features
Move single connections on connection pools behind a feature flag
Do not drop indexes on auto increment columns in MySQL to improve performance
Add type filter to JSON Mask generator
Remove unique constraint on workspace names and change default workspace name to "Untitled Workspace"
Display key icons for user uploaded foreign/primary key fields in Mongo Single document view
Bugs
Fix issue where 0 results in Workspaces table sometimes let you navigate to a negative page number
Fix issue where workspace permissions didn't update in the UI until a refresh
Fixes authentication error on Postgres when username has special characters
Fixed reopening open connections when TONIC_ENABLE_SINGLE_CONNECTION is false
Remove owner section from exported workspace body
Improve readability of workspace tags in small-width window sizes
April 13, 2022
Features
Apply foreign key constraints serially on a different thread to avoid deadlocks & improve performance
Bugs
Clear bulk column search query on filters reset
Memory improvements to Mongo schema serialization
Better handling of schema name in Java SDK
April 12, 2022
Features
Allow changes to order of JSON Mask sub-generators
Running a job is now disabled when the license expiration date has passed
Separate Workspace Sharing and Workspace Role-based Workspace access into different license features
Improve SQL Server performance by prefetching all tables & views
Bugs
When using Java SDK only show Java supported generators
April 11, 2022
Features
Improves performance of Email Generator on EMR and Databricks
Add support for Dremio with Spark
Bugs
Allowing Kubernetes ImagePullBackOff for up to 5 minutes before throwing error when updating through UI
Fixes inability to run a data generator on Spark with default database as the source
April 8, 2022
Features
Display estimated time remaining on row-based tasks in job details page
Display key icons for user uploaded foreign/primary key fields in Mongo hybrid documents
Bugs
Copy generator in MongoDB: Fix bug where copying a field that was not in the document threw an exception
Add exception handling for SQL Server datetime columns to drop invalid records, better SQL Server XML type exception handling
Create source pool just before processing tables to avoid any timeouts in SQL Server
April 8, 2022
Features
Display row based progress on jobs whenever possible
Statistics Jobs for the Java Spark SDK now appear in Jobs Table/Jobs Detail Page
Skip pre-job health check for PyML service when not needed
April 7, 2022
Features
Added database metric sharing for AWS users
Don't null self-referential foreign keys on reference tables when subsetting
April 7, 2022
Features
Improves performance of Email Generator on EMR and Databricks
Bugs
Better small-width responsiveness of tables in UI
April 6, 2022
Features
Add synthesis report
Support for update via UI with Docker
Bugs
Prevent unexpected behavior of progress tracker when the system clock shifts
Fixes error when creating partition function with null range on SQL Server
Fix issues with loading screen showing when not needed
April 5, 2022
Features
Add Remove Whitespace Transformer to Copy generator
Add ability to generate JSONPath expression for JSON Mask generator by clicking on preview JSON
Bugs
Improved support for large decimal numeric types in Postgres
April 4, 2022
Features
Add exception handling for invalid SSH Tunnel private key format and list valid formats in tooltip
April 4, 2022
Features
Apply constraints in parallel to data generation in Postgres
Bugs
Ensure workspace tag input closes when focus is lost from outside click
Remove foreign key checks when applying constraints in MySQL
April 1, 2022
Features
Invalidate refresh tokens on logout, log in or out all open sessions (tabs) on login or logout, cross-tab inactivity timeout support
Improve load times on workspace pages
Bugs
Fix linking of non-consistent generators in Spark
API update
Deprecated the GET /api/workspace endpoint in favor of GET /api/workspace/search
March 31, 2022
Features
Include workspace ID in URLs
Parallelize writes on MySQL passthrough tables
Allow decimals in subsetting percentages
Bugs
Don't allow duplicate entries of tables in table mode settings for workspace
Handle null values with Copy generator on MongoDB
Database query performance improvements
March 30, 2022
Features
Improve MySQL generation performance by applying constraints in parallel
Bugs
Fix issue where indexes fail to be applied to converted columns on SQL Server
Fix issue with SSO account creation failing
March 29, 2022
Features
Support Copy generator on MongoDB
Support parallelism of MongoDB schema scan
Support truncation mode on MongoDB
Bugs
Set Command Timeout in Postgres to prevent timeouts
March 25, 2022
Features
Change password requirements
Length >= 12 characters
Must include number
Must include lowercase letter
Must include uppercase letter
Must include non-alphanumeric character
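The password rules listed above can be expressed as a small checker. This is a sketch for illustration only. The function name and messages are hypothetical, and Tonic's own validation may treat edge cases such as non-ASCII characters differently.

```python
def password_problems(password):
    """Return the list of rules that the password fails (empty if it passes)."""
    problems = []
    if len(password) < 12:
        problems.append("must be at least 12 characters")
    if not any(c.isdigit() for c in password):
        problems.append("must include a number")
    if not any(c.islower() for c in password):
        problems.append("must include a lowercase letter")
    if not any(c.isupper() for c in password):
        problems.append("must include an uppercase letter")
    if not any(not c.isalnum() for c in password):
        problems.append("must include a non-alphanumeric character")
    return problems


ok = password_problems("Correct-Horse-42")    # satisfies every rule
missing_symbol = password_problems("Abcdefghijk1")  # exactly 12 characters, no symbol
```

A password of exactly 12 characters passes the length rule, because the requirement is 12 characters or greater.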
Bugs
Fixes to transfer workspace text
Allow opening Admin Panel when user has no workspaces
March 25, 2022
Features
Improve constraint restoration performance on MySQL
Allow incremental tables in MySQL where all columns are keys
Bugs
Prevent app from crashing when admin status of user is unknown
March 24, 2022
Features
Add login rate limiting
Bugs
Fix issue with functions and procedures not being created in the destination on Oracle databases
March 23, 2022
Features
On EMR and Databricks, write JSON files to JSON instead of Parquet
Improved handling of Latitude/Longitude with Smart Linking Generator
Add support for enum arrays columns in Postgres
Bugs
Fix issue with EMR cross account data generation failing to resolve the Glue database
March 17, 2022
Features
Admin users can now view all workspaces in Tonic. They can copy, share, and transfer any workspace, regardless of whether they have a role in the workspace.
Multiple enhancements for workspaces. Users can now leave a workspace. Improved the workspace share and transfer functions. Added the ability to filter workspaces by owner.
Parse as bigint if decimal fails
Oracle Whitelist support in BuildDb
Initialize after log sink reset
Replace slashes with underscores in build versioning
Add 'No generators' filter for bulk table view
Add progress tracker for SQL server post-data script generation
Allow support for views without fully qualified tables in Redshift
Improve performance of listing workspaces
Improve performance of the Constant Generator on EMR and Databricks
Sensitive columns now appear in the privacy hub for Mongo workspaces
Allow processing of decimals with a scale of 1 up to 9223372036854775807 in Postgres
Check for cloudsqlsuperuser permission on GCP Cloud SQL destination postgres databases
Bugs
Address Lambda Function Setup bug
Improve Regex Mask Display of Empty Matches and values with new lines
Do not fail completely when SSO groups do not resolve
Misc Redshift View fixes
Fix issue where column can't be in schema changes and show as sensitive
Fix issue where some downstream tables are missing their foreign keys in the destination database when using subsetting
Fix issue with shifting table size on data preview in database view
Prevent error from occurring when MongoDB collection size is too large
March 10, 2022
Features
Added support for user defined enum types as keys in subsetting
Simplify schema diff changes error toast on generation
Show more unprotected columns button mongo privacy hub
Added filtering on custom value generator dropdown
Add ConsistentOn to AdditiveNoiseGenerator
Enabling Timeshift Generator on Compound Primary Keys
Regex Mask Generator supports matching empty capture groups
Accommodate OFFSET in where clauses in SQL Server subsetting
Miscellaneous workspace config improvements
Add support for Azure SQL with SqlServerDatabaseCreator
Add progress tracking for mongo schema scans
Add syntax highlighting for where clauses in subsetting
Add support for integer primary keys on smallint
Bug fixes
Fixed error message for MySQL data type not found
Fix Postgres Test Connection bug
Fix Workspace patches after rename or modify tags
Fix Unix Timestamp empty column bug
Throw OperationCanceledException instead of CancelTonicJobException on cancellation request
Fix Redshift view replication missing schema
February 24, 2022
Features
Added the ability to configure Tonic administrators. Admin users have access to the new Admin Panel. The Admin Panel contains Tonic usage metrics, a table of all Tonic users, and the ability to manage other admin users.
For Kubernetes, admin users can now use the Admin Panel to update Tonic
Oracle perf improvements
Add new range options for integer pk generator
CNPJ and CPF generator performance improvements
Updates Regex Mask Generator to support replacing all matches
Start the progress token for passthrough tables when there are no passthrough tables with references
Add support for Timestampshift for spark
Add warning system to Privacy Scan
Regex mask generator with Custom Value Processor
Check for cancellation and report progress on text copy in Postgresql
Miscellaneous perf improvements
Bug fixes
Start the progress token for passthrough tables when there are no passthrough tables with references
Do not treat non-standard Oracle error code 5 as an error
Fix grabbing workspace users for comments
Dispose the source connection instead of just closing in Postgres
Fix for test discovery exception
Fix postgres sequences behavior
Object mask bulk view fix
Miscellaneous typos fixed
Fixed database type for BigQuery test connection
September 27, 2021
Features
Jobs now auto-stop if the worker crashes
Support for ENUM Arrays in Postgres.
Categorical generator can now be used on non-string types in MongoDB.
Allow Conditional Generator on JSON, JSONB, and Boolean columns.
CustomCategorical and Categorical Generator support for MongoDB arrays.
Add optional 'use compression' flag for MySQL.
Bug Fixes
Fixed issue when swapping between MongoDB workspaces
Fix TimestampShift Generator for mongo.
Subsetting enhancements
EMR partitioned table UI bug fix.
Fixed an issue where metadata for the conditional generator did not always save correctly
Fix EMR Test Connection issue.
September 20, 2021
Features
JSON Mask for Spark.
Make bulk applying generators more friendly on columns with disparate types.
Generated columns are now supported for Postgres.
EMR will output CSV files if the input file is a CSV.
Added support for unsigned integer columns to the Integer Key Generator.
Performance improvements for MySQL.
Bug Fixes
Improved iSeries UI response time.
Fixed copy workspace issue where some fields were empty until a page refresh was performed.
EMR Hive DDL fix for Csv.
Fixed issue with dropping indexes on auto-incrementing composite keys in MySQL
Improve DB2 post data order.
Improved MySql performance for CopyTable.
September 10, 2021
Features
Add Pagination to ECR Image search.
Workspace Tags.
Optimized the Privacy Scan for Oracle DBs.
Add custom categorical generator to Mongo DB.
Support for EMR cross account.
Bug Fixes
Fix issues with resizable panes in Database and Table view.
Allow non-null self references in a table in subsetting only if the table is a reference table.
Security enhancements
Fixed issues in MongoDB workspaces with dismissing all schema diffs.
Better handling of Check constraints in MySQL
September 3, 2021
Features
Added TLS support for iSeries Db2.
Cloudwatch logs are available inside Tonic for AWS Lambda based jobs.
Support for 'consistent on column' in Regex Mask Generator.
Bug Fixes
Fix SIN generator being applied to non-sin columns.
Optimized Schema construction for Destination DB in DB2 LUW.
Truncated tables now handled properly for AWS Lambda based jobs.
Fix rendering of tables with double digit replacements on bulk view.
"Consistent On" drop down is now alphabetized.
August 30, 2021
Features
Added destination database connection summary to job start confirmation.
Subsetting and post-job action indicators have been added to the workspaces view.
Added workspace name as a webhook content option.
Bug Fixes
Fixed issue with Postgres index restoration and error handling.
Fixes for Tonic UI when connected to MongoDB
August 23, 2021
Features
Added option to Preserve Lambda S3 files
Added view support for Db2 LUW.
Improved RegexMaskGenerator performance.
Bug Fixes
Various fixes and improvements for Lambda generation.
Add additional logging to Lambda.
Change Lambda default config options (timeouts, memory).
August 19, 2021
Features
Added support for incremental table mode on Postgres.
Enable TimeStampShiftGenerator for text columns.
Updates to the Privacy Hub for MongoDB
Add support for object, variant types in snowflake.
HIPAA Address Generator now handles extended zip codes that don't contain a hyphen.
Inactivity timeout and authorization/refresh token timeout can now be configured.
MySQL now copies over routines and events (if permissions are set) to the destination database.
Bug Fixes
Empty schemas in Db2 output database are now handled properly.
Http now forwards to https on the healthcheck endpoint of the PII Scanner container.
Resolved issue in the subset preview when the estimated row count of the source is unavailable.
August 11, 2021
Features
New UI to view all workspaces.
Updates to Privacy Hub to track progress in protecting sensitive fields.
Db2 support for triggers and stored procedures.
Prevent Db2 summary tables and views from showing in the UI.
Support for Regex, Json, and XML generators on Snowflake and Redshift.
Allow Content-Type Header in webhook requests.
Ability to cancel data generation jobs from the job details page.
Schema differences can now be detected on a Mongo collection.
Bug Fixes
Memory usage reduction on MySQL and Oracle.
ErrorOnOverwrite for Databricks Table Mode.
Usability improvements with MongoDB workspaces.
August 2, 2021
Features
Added ability to process Redshift and Snowflake using AWS Lambda.
Update Databricks assets in parallel.
Updated Sandbox Terms, analytics, licensed features.
Bug Fixes
Updated connection to Google Cloud PostgreSQL databases.
July 29, 2021
Features
User information has been added to the Privacy Hub Audit Trail.
Bug Fixes
Improve usability when working with Db2.
Refactor queries to restore indexes and add timeout logic in MySQL.
Add batch size logic for subsetter when handling larger rows of data.
July 22, 2021
Features
Users can now download job logs directly through the Tonic application.
Integer primary key generator allows more control.
Add index restoration parallelism for PostgreSQL.
Add name case consistency.
Bug Fixes
Prevent Safari from crashing.
No longer print environment variables when launching the PII detection or machine learning containers.
Improve API performance for endpoint api/schemadiff.
Mongo RegexMaskEditor now gets column values.
July 15th, 2021
Features
Smart Linking generators added - train neural networks to mimic the implicit relationships across columns.
Webhooks now support sending JSON object literals.
MongoDB expanded generators on allowed data types and consistency for multiple types of paths.
Clarified error messages in the UI.
Constant generator will now show a true/false drop down when added to a boolean field.
Bug Fixes
Improved performance of schema change detection on large workspaces.
Handle long generator names in the UI.
Allow notifications container to start up without TONIC_URL env variable.
Updated TLS cipher negotiation between Tonic and AWS Aurora Mysql.
Full name will now be consistent with first and last name.
Null values in Sql Server xml fields are now handled properly.
Optimized memory usage on very large Sql Server rows.
Reduced the number of times statistics have to be calculated for Json Mask Generator, Xml Mask Generator, and Regex Mask Generator.
Fixed issue where S3 + Spark jobs would not start.
June 21st, 2021
Features
Preserve N bytes for MAC address generation
Modeling panel in table view can now be resized
Repartition and coalescing added for Databricks
Subset preview on small screen improved
Speed up PII Scan for MySQL
Updated subsetting logs
Increment default fingerprint schema version
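The "Preserve N bytes for MAC address generation" entry above can be pictured with a minimal sketch. It assumes the preserved bytes are the leading ones and that addresses are colon-separated; the function name `mask_mac` is hypothetical and is not part of Tonic.

```python
import random

def mask_mac(mac, preserve_bytes, rng=None):
    """Keep the first `preserve_bytes` octets and randomize the rest.

    Assumption: "preserve N bytes" means the leading octets are kept.
    """
    rng = rng or random.Random()
    parts = mac.split(":")
    kept = parts[:preserve_bytes]
    # Replace every remaining octet with a random two-digit hex value.
    randomized = [f"{rng.randrange(256):02x}" for _ in parts[preserve_bytes:]]
    return ":".join(kept + randomized)

masked = mask_mac("00:1a:2b:3c:4d:5e", preserve_bytes=3, rng=random.Random(0))
```

Passing a seeded `random.Random` makes the sketch repeatable, which is convenient for experimenting.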
Bug Fixes
Disable Generate Data during page load
Fix constant generator timestamp issue
Handle case when Spark not installed on EMR cluster
Increase Big Query limit of 10GB
Memory optimized tables fixed when tearing down SQL Server
Shift + Select in Bulk in MySQL fixed
June 11th, 2021
Features
Webhooks can now be created to alert external systems when a job has finished.
Magnifying glass now appears in Collection View in Mongo to show additional values for a given path.
Post Job Scripts can now be put in a specific order by user.
MongoDB expanded to support comments, notifications, current date generator, collection search.
Generators can be found by metadata now as well as name.
Copy workspace action limited to owner only.
Remove tutorial video.
Show UI warnings for Oracle when rows are rejected by sqlldr.
Bug Fixes
Data in identity columns is now properly inserted into Redshift databases.
User defined types are now dropped in the correct order when tearing down a SQL Server database.
Job status API no longer returns information about other jobs.
Make the polling for jobs resilient to failed checks.
Even better upstream null handling.
May 27th, 2021
Features
Added support for Db2 iSeries
Added support for Delta Table on Databricks
Added support for reading IAM roles off Databricks machines instead of providing IAM credentials
Added a UI notification when your Tonic version is 10+ versions behind
Added support for MongoDB 2.4 and 3.4
Added the ability to automatically skip tables that match a regular expression via an environment variable
Added the ability to obfuscate values inside an array in Mongo
Added the ability for the JSON Mask Generator to parse json objects containing escaped characters and surrounded by quotes
Added the ability to automatically cast for mismatched types during downstream subsetting
Added a Regex Generator
Added a Generative Adversarial Network Generator
Bug Fixes
Refreshes subset preview when table mode is changed
Improved handling of IAM credentials on Databricks when an IAM instance profile is present
Hide "Other" in generator dropdown when there are no additional generators
April 29th, 2021
Features
Added support for MongoDB
Added support for Amazon Redshift
Delta Table support on Databricks
Upgrades the subset preview with UI improvements
Bug Fixes
Better handling of adding foreign keys pointing to non-primary keys
Checks for Postgres version mismatch
Remove generator button now works in synthesis mode and on key columns
Added support to break subsetting cycle when both Foreign Key and Primary Key are nullable
April 20th, 2021
Features
Subset Preview
Support for Google SSO, including ability to read group membership
Java UDFs on Spark for Character Scramble and Mac Address generators
Added custom value processor extension framework
Bug Fixes
Auto-increment on MySql bug fix
No longer removes auto-increment on primary keys or on preserved tables
Fixes composite foreign key issue with one nullable column in key
Incremental mode now works when a rowversion or timestamp column is on the table
Fixed collation issue for MySQL 8
November 25th, 2020
Features
Better logging during job execution
Support for Spark + AVRO files
Various Spark improvements
November 11th, 2020
Features
Undo/Redo capabilities for workspace changes
More information in Job Details
New Strict Mode available when generating data via the API
Now supports AVRO files
Now supports EMR Steps API
Bug Fixes
Error logging enhancements for Spark
Various UI improvements
October 28th, 2020
Features
Added support for Spark/S3 as a data source and destination
Improved Company Name generator
Added support for text and ntext types in SQL Server
Added support for Google Big Query as a data source and destination
Workspace ownership can now be transferred from user to user
Bug Fixes
AWS Commons extension no longer breaks data generations
JSON and XML mask generator configuration fixes
Minor display fixes
October 14th, 2020
Features
Find and Replace Generator introduced
Tonic saves a copy of the workspace configuration to job history every time a generation is started
You can now share workspaces with your SSO defined groups
Bug Fixes
Corrected rendering issue for column sort option in the Database View
September 29th, 2020
Features
Support for PingID SSO
Option to download foreign key file
Social Insurance Generator added
Bug Fixes
Test Connection errors now show why the connection failed
September 17th, 2020
Features
Added the ability to create post job scripts that will be executed against the destination server at the end of the generation
Added Cross Table Sum generator. Allows summing of rows from another table by partition
Added single sign on support
Added Enterprise license key support
Added a check to validate the foreign key file before a generation executes
Allow maximum for Integer Key generator
Removed dependency on RabbitMQ
Added ability to have SQL Server and Postgres trust server certificates in workspace editor
Additional logging was added around relationship integrity during subsetting
Added adjustable batch sizes for Oracle
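The Cross Table Sum entry in the list above ("summing of rows from another table by partition") can be sketched roughly as follows. This is an illustration of the general idea only, not Tonic's implementation; the table shapes, the column names, and the `total` output field are hypothetical.

```python
from collections import defaultdict

def cross_table_sum(parent_rows, child_rows, key, value):
    """Attach to each parent row the sum of `value` over child rows with the same `key`."""
    totals = defaultdict(int)
    for row in child_rows:
        totals[row[key]] += row[value]
    # Parents with no matching child rows get a total of 0.
    return [{**parent, "total": totals.get(parent[key], 0)} for parent in parent_rows]

customers = [{"customer_id": 1}, {"customer_id": 2}, {"customer_id": 3}]
orders = [
    {"customer_id": 1, "amount": 10},
    {"customer_id": 1, "amount": 5},
    {"customer_id": 2, "amount": 7},
]
summed = cross_table_sum(customers, orders, key="customer_id", value="amount")
```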
Bug Fixes
Fixed several rendering issues in the table view
Tries multiple connection methods for Oracle
Supports Extended VarChar2s for Oracle
No longer hides foreign key column headings for synthesize mode
Job descriptions for jobs cancelled before they started running are more accurate
Testing the source database connection now works for viewers and auditors
Workspaces are now sorted alphabetically in the workspace drop down
Changing a linked generator now properly breaks the link
Foreign key columns now show up properly without needing a refresh after setting the primary key table to synthesized mode
Expand long column names on hover in the Privacy Hub's audit trail
Workspaces with broken connections to their database can once again be edited
Views no longer show in the list of tables for Oracle databases.
August 20th, 2020
Features
Workspace Sharing - allows sharing workspaces between multiple users with different access controls (This is an Enterprise Plan feature)
Added 'In' operator to the Conditional Generator.
Added environment variable to bypass certificate validation when connecting to databases with self signed certificates.
Schema change detection will now flag columns with null generators that have been made not-null.
Performance
Significantly improved the speed that information is gathered about the source database affecting overall UI performance and schema change detection.
Added additional diagnostic information around query execution times to the logs.
Bug Fixes
Fixed issue where switching table mode to synthesized mode in Database view would not work.
Increase accuracy of error messages on Sql Server when tables fail to get created.
Integer Key Generator in Oracle for Number columns is now based on the column precision, which prevents overflow.
Resolved name collision when subsetting with a table that has more than one foreign key to the same primary key.
Fixed preview on tables that use the sequential integer generator.
Changing synthesized row counts from database view now saves properly.
Popover of column name on table view no longer flashes.
Large constant values no longer cause display issues in Database view.
Fixed issue with connection test sometimes failing when using MySql.
Character Scramble now preserves null values when operating on JSON.
Fixed OID collision issue on Oracle.
Preview no longer fails when using complicated partitioning strategies.
Constant Generator no longer loses preview when you click away.
July 31st, 2020
Features
ASCII Character Primary Key Generator which supports a wider range of characters in the column.
New API Endpoint for getting information on a single data generation job
Key Generators now work on Unique Columns
Improvements to PII Detection:
Added detection for columns containing passwords, postal codes, and VINs.
Reduced false positives for SWIFT codes, ICD9, ICD10, US Cities
Switched to a new MySQL driver
Make XML generator more human readable
Allow synthesize mode on Conditional Generator and Unique Email Generator
Add event generator in Oracle
Performance Improvements
UI performance for resolving schema difference
Bug Fixes
Better checking of uniqueness requirements for columns
Fixed constraints on large tables timing out in SqlServer
Fixed conversion issue with XML columns outside of subsets during subsetting
Block subsetting with synthesized tables
Do not allow source and destination to be the same in MySQL
Make preview stay up to date when generator is removed from different view
Fixed vulnerability with lodash
The following releases were removed from quay because of regressions:
V1054 - V1061
V1074
February 16, 2024
Fixed an issue with the Name generator where capitalization was not preserved if consistency was disabled.
For Table View, fixed an issue where the delete button to remove the generator assignment was sometimes hidden.
Oracle
Tonic now supports Oracle Advanced Queues. Queue tables and queues are created in the destination database. Messages are not copied from the source database to the destination database. Queue subscribers are not currently supported. You must add them to the destination database manually.
PostgreSQL
Tonic can now de-identify the ltree data type.
SQL Server
Fixed an issue where indexes were not created on schema bound views.
February 9, 2024
Redesigned Database View
We redesigned Database View to improve the display and the filtering.
In the updated columns list, the Column column contains the schema, table, and column name, and the column data type. It provides access to the data preview option.
The Applied Generator column shows the applied generator. Applied Generator indicates when a column is unprotected, when the column is a primary or foreign key, and when the configuration overrides the parent workspace. If the table mode is not De-Identify, it shows the table mode. It provides access to the commenting option.
Filters other than the column name filter are moved under the Filters option. There are also new filters for the sensitivity type (the type of sensitive data that Tonic detected in the column) and whether the column has a recommended generator.
Privacy Report updates
In the Privacy Report, a new column, Column Privacy Rank, indicates the privacy ranking for a column based on the assigned generator and generator configuration. The generator summary and generator reference include the possible privacy ranking values for each generator.
Added a new column, Tonic Detected Sensitivity, that indicates whether the Tonic sensitivity scan identified the column as sensitive. Renamed the Is Sensitive column to Current Sensitivity. Current Sensitivity indicates whether the column is currently marked as sensitive.
Also corrected an earlier issue with the order of the columns.
Other updates
Fixed an issue that caused all subset runs to record the percentage of rows in the subset as 100%. Subset runs that occur after updating to this version display the correct percentage.
The option to write output data to a container repository is out of beta.
Databricks
For Databricks 13+, writing output to delta tables now uses the table’s liquid clustering configuration to cluster the insertion.
Fixed an issue with loading tables into the Tonic application.
Google BigQuery
Improved batch size calculations to reduce memory pressure during data generation jobs.
Fixed an issue with applying the Timestamp Shift and Date Truncation generators on datetime or timestamp data when TONIC_GRPC_ENABLED was false.
For the Timestamp Shift and Date Truncation generators, improved support for time values that include fractional seconds.
Fixed an issue where the Character Substitution generator failed when assigned to a NUMERIC or BIGNUMERIC column.
Oracle
Tonic now supports IDENTITY columns. Before this change, IDENTITY columns caused errors during destination database creation.
February 2, 2024
For the Custom Categorical generator, you can now add a NULL value to the available custom category values. To indicate a NULL value, use the keyword {NULL}.
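As a rough illustration of the {NULL} keyword in the Custom Categorical entry above, the sketch below turns a list of category values into Python values, mapping the keyword to `None`. The newline-separated input format and the function name `parse_categories` are assumptions made for the example, not a description of how Tonic stores the values.

```python
NULL_KEYWORD = "{NULL}"  # keyword named in the release note

def parse_categories(raw):
    """Split newline-separated category values; the {NULL} keyword becomes None."""
    values = []
    for line in raw.splitlines():
        item = line.strip()
        if not item:
            continue  # ignore blank lines
        values.append(None if item == NULL_KEYWORD else item)
    return values

categories = parse_categories("red\n{NULL}\nblue\n")
```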
Made the following API updates to better accommodate users of the previous version of the API:
jobs/{id}/workspace_snapshot now returns the WorkspaceDataModel object.
Fixed an issue where workspace_snapshot could return empty replacements.
Added a new endpoint, GET jobs/{id}/workspace_snapshot?api-version=v2023_07_00, that returns V17WorkspaceDataModel.
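To make the two workspace_snapshot variants above concrete, here is a small sketch that only builds the request URLs. The base URL `https://tonic.example.com/api/` and the job ID are placeholders, and the `/api/` prefix is an assumption; the path and the `api-version` query parameter come from the entry above.

```python
from typing import Optional
from urllib.parse import urlencode, urljoin

def snapshot_url(base_url: str, job_id: str, api_version: Optional[str] = None) -> str:
    """Build the workspace_snapshot URL, optionally pinned to an API version."""
    url = urljoin(base_url.rstrip("/") + "/", f"jobs/{job_id}/workspace_snapshot")
    if api_version:
        url += "?" + urlencode({"api-version": api_version})
    return url

BASE = "https://tonic.example.com/api/"  # hypothetical host and prefix
current = snapshot_url(BASE, "1234")                        # WorkspaceDataModel
previous = snapshot_url(BASE, "1234", "v2023_07_00")        # V17WorkspaceDataModel
```

Authentication headers are omitted because the entry does not describe them.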
Databricks
When writing output to delta tables, the destination tables should now retain the TBLPROPERTIES
from the source delta table, including 'delta.feature.allowColumnDefaults'
.
January 26, 2024
Redesigned data model for generator assignments - The new version of the Tonic API includes a redesign of the data model for generator assignments. To use the previous version of the generator assignment data model, make sure that your API calls specify version 2023.07.0.
The generator assignment data model redesign includes the following changes:
For all generators, moved the following fields to the metadata object in the link object:
presetId
generatorId
customValueProcessor
encryptionProcessor
For the <value type> Mask composite generators:
Moved pathExpression to the metadata object in the link object.
Removed the following fields from the link object:
subPresetId
subGeneratorId
customSubGeneratorValueProcessor
Added the following fields to the subGeneratorMetadata object under metadata:
presetId
generatorId
customValueProcessor
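The field moves listed above can be pictured as a small transformation from the earlier link shape to the redesigned one. This is a sketch under assumptions: it treats the link as a plain dictionary, and it assumes that each removed sub-generator field corresponds to the similarly named field under subGeneratorMetadata. The `column` key and all values are hypothetical.

```python
# Fields that move from the link object into link["metadata"].
MOVED = ("presetId", "generatorId", "customValueProcessor",
         "encryptionProcessor", "pathExpression")

# Assumed mapping from removed link fields to subGeneratorMetadata fields.
SUB_FIELDS = {
    "subPresetId": "presetId",
    "subGeneratorId": "generatorId",
    "customSubGeneratorValueProcessor": "customValueProcessor",
}

def migrate_link(old_link: dict) -> dict:
    """Return a copy of `old_link` rearranged into the redesigned shape."""
    link = {k: v for k, v in old_link.items()
            if k not in MOVED and k not in SUB_FIELDS}
    metadata = dict(link.get("metadata") or {})
    for field in MOVED:
        if field in old_link:
            metadata[field] = old_link[field]
    sub = {new: old_link[old] for old, new in SUB_FIELDS.items() if old in old_link}
    if sub:
        metadata["subGeneratorMetadata"] = sub
    link["metadata"] = metadata
    return link

old = {"column": "profile", "generatorId": "G1", "presetId": "P1",
       "pathExpression": "$.name", "subGeneratorId": "G2", "subPresetId": "P2"}
new = migrate_link(old)
```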
Other updates
Updated the sensitivity scan to better identify company and organization names and suggest the appropriate generator.
Databricks
The Conditional Generator is now supported on version 11.3+.
Google BigQuery
Fixed some cases where a copied View did not have its definition updated to reference resources within the destination dataset. This could result in failures when attempting to copy the View into the destination dataset.
Added support for Preserve Destination table mode.
Fixed an issue where sub-queries in table filter clauses did not work if TONIC_GRPC_ENABLED was set to true.
MySQL
You can now enable diagnostic logging for a MySQL data generation job.
PostgreSQL
Improved the destination database permissions check on Google Cloud SQL to handle additional superuser authentication setups.
SQL Server
Data generation now preserves hidden properties of columns.
January 19, 2024
If your instance of Tonic is deployed on Docker, you can now use an external Kubernetes cluster to enable the option to write destination data to container artifacts.
You can now assign the Integer Key generator to a column with a decimal data type. The actual column values must still be integers.
Fixed an issue in Table View where an error displayed if you changed the selected table while the data was loading.
Databricks
For cluster versions 11.3 and above, Tonic now displays accurate lists of available and suggested generators.
File connector
Fixed an issue that prevented users from adding files to existing local file groups.
SQL Server
Fixed an issue that caused errors when creating inline functions in the destination database.
January 12, 2024
The Enable Diagnostic Logging global permission is now granted to the built-in Account Admin permission set.
Databricks
CREATE CATALOG or CREATE SCHEMA permissions are no longer required if the destination catalog or schema already exists.
For Databricks 11.3 and higher, added support for the Random Double generator.
For Databricks 11.3 and higher, added support for the IP Address generator.
January 5, 2024
Diagnostic logging for data generation - By default, Tonic now redacts sensitive data in data generation log files.
When users start a data generation or upsert job, if they have the new global permission Enable diagnostic logging, they can choose to enable diagnostic logging, which does not redact the logs. The Enable diagnostic logging permission is also required to download the diagnostic logs. By default, the permission is only granted to the Admin and Admin (Environment) global permission sets.
In addition to the option for individual jobs, there are environment settings that enable diagnostic logging for specific data connectors.
Other updates
In the Release Candidate version of the API, the response model for the GET /api/workspace/minimal endpoint has been updated for more straightforward de-serialization.
Fixed an issue where a non-unique composite primary key column could only be assigned unique generators.
Users can now press Enter to finish copying a workspace or a generator preset, instead of having to click Copy.
File connector
Added support for the Conditional generator for file groups that contain CSV files.
Google BigQuery
Post-job scripts now support DDL and EXPORT.
Oracle
Fixed an issue with the permissions check that prevented connecting to Amazon RDS for Oracle databases.
SQL Server
In Data Definition Language (DDL) that applies to the destination database, Tonic now strips WITH INLINE clauses from definitions of user-defined functions (UDF). Inlining does not require these clauses. WITH INLINE clauses in UDF definitions that do not meet the requirements for inlining can prevent the UDF from being restored properly in the destination database.
Fixed an issue where the order of XML columns was changed in the destination database.
December 29, 2023
For the OpenID Connect (OIDC) SSO integration, Tonic now supports authentication by client secret that uses HTTP basic authentication (client_secret_basic). To provide the client secret, configure the TONIC_SSO_CLIENT_SECRET environment setting.
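For readers unfamiliar with client_secret_basic, the sketch below shows how an OIDC client generally forms the HTTP Basic Authorization header for the token endpoint: the client ID and secret are form-URL-encoded, joined with a colon, and Base64-encoded. It illustrates the standard method, not Tonic's internal code, and the client ID and secret values are made up.

```python
import base64
from urllib.parse import quote_plus

def basic_auth_header(client_id: str, client_secret: str) -> str:
    """Build the Authorization header value used by client_secret_basic."""
    # Both parts are form-URL-encoded before they are joined and Base64-encoded.
    credentials = f"{quote_plus(client_id)}:{quote_plus(client_secret)}"
    token = base64.b64encode(credentials.encode("utf-8")).decode("ascii")
    return f"Basic {token}"

# Hypothetical values; in a deployment the secret would come from configuration.
header = basic_auth_header("my-client", "s3cret/value")
```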
SQL Server
A new environment setting, TONIC_SQL_SERVER_SKIP_CREATE_DB, indicates whether to skip schema creation for the destination database. If true, then Tonic does not create the schema. It uses the existing schema to populate the destination database. The default is false. You can configure this environment setting from the Environment Settings tab on Tonic Settings.
December 22, 2023
During free trial signup, the data connector options now include an option to use local files for the source data. This creates a file connector workspace for local files, and displays the File Groups view to allow the free trial user to start to add file groups to the workspace.
Added an environment setting, Tonic Test Connection Timeout In Seconds (TONIC_TEST_CONNECTION_TIMEOUT_IN_SECONDS), that you can set from the Environment Settings tab on Tonic Settings. This setting configures the timeout for testing a database connection. Previously, connection test attempts timed out after 5 seconds. The new default is 15 seconds.
When you configure a workspace to write the output to container artifacts, you can now specify custom resources for the Kubernetes pod, including the ephemeral storage, memory, and CPU millicores.
Improved performance when marking a large number of columns as not sensitive.
Fixed an issue that caused Tonic workers that are deployed on Docker to crash unexpectedly.
For numeric columns that support arbitrary precision and scale, when the scale is 0 (for example, NUMERIC(N,0)), or when the underlying values are all integers, these columns are now supported as primary keys for the purpose of subsetting.
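The eligibility rule for numeric primary keys described above can be expressed as a short check. This is only a sketch of the stated condition (scale of 0, or all values integral); how Tonic inspects the column is not described here, and the function name is hypothetical.

```python
from decimal import Decimal

def usable_as_subset_key(scale: int, values) -> bool:
    """True if the column has scale 0 or every non-null value is a whole number."""
    if scale == 0:
        return True
    for value in values:
        if value is None:
            continue  # assumption: nulls do not affect eligibility
        number = Decimal(str(value))
        if number != number.to_integral_value():
            return False
    return True

ok = usable_as_subset_key(2, ["12.00", "7", Decimal("3.0")])
not_ok = usable_as_subset_key(2, ["12.50", "7"])
```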
Amazon EMR
You can now use the Timestamp Shift generator as a sub-generator within the Struct Mask generator.
Amazon EMR and Databricks
When files are saved to non-job-specific file destinations, the new environment setting TONIC_WORKSPACE_DEFAULT_SAVE_MODE indicates the mode to use. If set to a value other than null (Ignore, Append, Overwrite), this setting takes precedence over TONIC_WORKSPACE_DEFAULT_ERROR_ON_OVERRIDE.
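The precedence between the two settings above can be sketched as follows. The sketch assumes that an unset save mode is represented by a missing key and that the older setting is a "true"/"false" string; it deliberately does not guess how the older setting maps to a save mode, and only reports which setting wins.

```python
SAVE_MODES = ("Ignore", "Append", "Overwrite")  # values listed in the release note

def effective_save_setting(env: dict) -> tuple:
    """Return which setting applies and its value, following the stated precedence."""
    mode = env.get("TONIC_WORKSPACE_DEFAULT_SAVE_MODE")
    if mode in SAVE_MODES:
        return ("save_mode", mode)
    # Fall back to the older setting only when no save mode is set.
    flag = env.get("TONIC_WORKSPACE_DEFAULT_ERROR_ON_OVERRIDE", "")
    return ("error_on_override", flag.strip().lower() == "true")

both_set = effective_save_setting({
    "TONIC_WORKSPACE_DEFAULT_SAVE_MODE": "Append",
    "TONIC_WORKSPACE_DEFAULT_ERROR_ON_OVERRIDE": "true",
})
legacy_only = effective_save_setting({"TONIC_WORKSPACE_DEFAULT_ERROR_ON_OVERRIDE": "true"})
```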
Google BigQuery
Fixed an issue introduced in v982 where data generation failed with an HTTP 404 Not Found error for the destination table when the source and destination are in different BigQuery projects.
MongoDB
Added an environment setting, TONIC_DOCUMENT_MAX_DEPTH, to configure the maximum depth of JSON document that can be handled. The default value, which is also the recommended minimum value, is 32.
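To get a feel for what a depth limit such as TONIC_DOCUMENT_MAX_DEPTH means, the sketch below measures the nesting depth of a JSON-like value and compares it with the default of 32. How Tonic counts depth is not specified in the entry; here each nested object or array adds one level and scalars add none, which is an assumption.

```python
DEFAULT_MAX_DEPTH = 32  # default value given in the release note

def json_depth(node) -> int:
    """Nesting depth: each dict or list adds one level; scalars count as 0."""
    if isinstance(node, dict):
        return 1 + max((json_depth(v) for v in node.values()), default=0)
    if isinstance(node, list):
        return 1 + max((json_depth(v) for v in node), default=0)
    return 0

def within_max_depth(document, max_depth: int = DEFAULT_MAX_DEPTH) -> bool:
    return json_depth(document) <= max_depth

def nested(levels: int):
    """Build a document with exactly `levels` nested objects."""
    doc = 0
    for _ in range(levels):
        doc = {"k": doc}
    return doc
```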
For bulk apply, fields with the same selection path but different data types no longer share a selection state.
SQL Server
Fixed an issue to properly restore schema-bound views in dependency order.
Releases V980 - V982 were removed from quay because of an issue that was fixed in later releases.
October 13, 2023
Logging and telemetry connection status - On the System Status tab of Tonic Settings, a new Data Sharing section provides a summary of the logging and telemetry connectivity to the Tonic backend. The new section indicates:
Whether sending logs and telemetry to Tonic.ai is enabled
If they are enabled, whether Tonic is able to connect in order to send the logs and telemetry
Other updates
Tonic now checks for invalid virtual foreign keys for all data generation. Previously it only ran the check for subsetting data generation.
Fixed an issue where the Continuous generator failed when a column contained only NULL values.
Improved performance for the Continuous Generator.
Fixed an issue where the Cell Count in the Usage Report could throw an integer overflow error.
Fixed a subsetting issue where Tonic displayed the error “Error fetching Subset preview” when users navigated to a workspace.
For subsetting, updated the Graph View display to make it easier to see the connections between the tables.
Improved messaging when Tonic cannot reach the database when you test a database connection.
Databricks
On the workspace settings view, Test Cluster Connection no longer requires that you re-enter your API key.
On the workspace settings view, the default job cluster specification now recommends a Databricks 14.0 Spark version.
Databricks is now enabled for Professional and Enterprise users on Tonic Cloud, as well as for free trial users.
Fixed source catalog workspace handling when the source catalog contains a table with a key constraint.
Added a warning to the workspace settings view to prevent specifying the same source and destination locations.
File connector
Fixed the validation used to prevent duplicate files in Amazon S3 file groups.
PostgreSQL
Improved handling when the database does not contain a public schema.
Improved subsetting performance for the Data Pipeline V2 data generation process.
Improved performance for the Data Pipeline V2 data generation process.
SQL Server
Fixed how Tonic handles system time periods.
Improved messaging when Tonic cannot access SQL objects.
October 6, 2023
The new TONIC_ENABLE_SECURE_COOKIES setting indicates whether to enable the "Secure" attribute on Tonic authorization and analytics cookies. The default value is false. Do not set this to true if you access Tonic over an HTTP connection. When TONIC_HTTPS_ONLY is set to true, the "Secure" attribute is always enabled on Tonic authorization and analytics cookies, and the value of TONIC_ENABLE_SECURE_COOKIES is ignored.
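The interaction between the two cookie settings above reduces to a simple rule: the Secure attribute is set when either TONIC_HTTPS_ONLY or TONIC_ENABLE_SECURE_COOKIES is true. The sketch below applies that rule with Python's standard `http.cookies` module; the cookie name and value are hypothetical, and this is not Tonic's code.

```python
from http.cookies import SimpleCookie

def build_cookie(name: str, value: str, https_only: bool,
                 enable_secure_cookies: bool = False) -> str:
    """Render a cookie string, adding Secure per the rule in the release note."""
    cookie = SimpleCookie()
    cookie[name] = value
    # HTTPS-only always implies Secure; otherwise the explicit setting decides.
    if https_only or enable_secure_cookies:
        cookie[name]["secure"] = True
    return cookie[name].OutputString()

plain = build_cookie("session", "abc", https_only=False)
forced = build_cookie("session", "abc", https_only=True, enable_secure_cookies=False)
opted_in = build_cookie("session", "abc", https_only=False, enable_secure_cookies=True)
```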
Updated to prevent simultaneous updates to the same workspace configuration.
For the Constant generator, fixed an issue for JSON columns where setting the constant to an empty string caused data generation to fail without setting the job status to failed.
The upsert pre-job check that validates the constraints on the intermediate and destination databases no longer fails when a database has constraints with duplicate signatures.
Fixed an issue where an empty upstream filter WHERE clause caused subsetting to fail if the schema changed so that the table was no longer upstream.
To use the API to obtain data encryption settings, the API user must now have the required global permission.
When users log out of Tonic, we now automatically invalidate any JSON Web Tokens (JWTs) that are not expired.
The API endpoint /api/permission-sets now requires the ManageUserAccess (Manage user access to Tonic and to any workspace) permission. Added a new endpoint /api/permission-sets/public, which returns the subset of the data needed by users who do not have that permission.
Amazon EMR
Fixed an issue where some generators, including the Categorical generator, did not work when data generation was run from the SDK.
Databricks
You can now configure a Databricks workspace to write destination data to Databricks Delta tables.
File connector
On the file group details for CSV files, added configuration options to quote spaces and trim whitespace.
Fixed an issue where Tonic was not able to display a preview of extremely large files.
MongoDB
On the Collection View, the preview icon is now hidden for types that cannot be previewed.
PostgreSQL
For Data Pipeline V2 data generation, fixed an issue where we did not correctly truncate destination tables.
Fixed an upsert issue for tables that have generated identity columns.
Snowflake
For a Snowflake on AWS workspace, you can now provide specific AWS credentials for the file storage locations (S3 buckets or external stages).
The Snowflake data connectors are now available to Tonic Cloud users with a Professional or Enterprise license. They are not available to free trial users.
September 29, 2023
New monthly pay-as-you-go plan on Tonic Cloud - Tonic now offers a pay-as-you-go subscription plan for Tonic Cloud. Free trial users are offered the option to use a credit card to purchase a pay-as-you-go license. The monthly subscription grants a Professional level license. With the pay-as-you-go plan, you can configure generators for up to 20 tables across all of your workspaces. Tonic bills you separately for each additional table that you configure. The license renews automatically each month. On Tonic Settings, a new Billing tab displays the next renewal date.
Data migration option for upsert - For upsert, Enterprise users can now connect a workspace to their own data migration script or tool to ensure that schema changes are automatically reflected in the intermediate database.
Other updates
Timestamp Shift is now the suggested generator for birthdate fields. Previously, the suggested generator was Random Timestamp.
Fixed an issue where editing workspace settings could cause you to be logged out.
File connector
On Tonic Cloud, you can now use Amazon S3 as a source of files. Previously, Tonic Cloud only supported Google Cloud Storage and uploaded local files.
For the Categorical generator, linked columns are now in the correct order.
Google BigQuery
Tonic now supports arrays of a supported type. The STRUCT and INTERVAL types are still not supported.
MongoDB
Fixed issue where a UI refresh was required in order to show automatic de-identification of foreign keys from de-identified primary keys.
Oracle
The Download SqlLdr Files workspace permission is now assigned to the built-in Manager and Editor workspace permission sets.
Downloaded sqlldr files no longer include .bad files.
PostgreSQL
Fixed an upsert issue for tables that have generated identity columns.
SQL Server
The Constant generator can now handle bit values.
September 22, 2023
New global permission to view organization users - A new global permission, View organization users, determines whether a user is able to see the lists of users and groups in the organization. This permission is required in order to use the Tonic application to grant access to and transfer ownership of a workspace, and to grant access to global permission sets. It is not required when you use the Tonic API to perform these tasks. The permission is granted to the built-in Admin, Admin (Environment), and General User permission sets. When you upgrade, Tonic automatically grants this permission to your custom global permission sets.
Other updates
On the workspace details view, added a new upsert processing option, Warn on Mismatched Constraints. When this is enabled, Tonic treats mismatched foreign key and unique constraints between the source and destination databases as warnings instead of errors, so that the upsert job does not fail.
Tonic now accepts all AWS RDS certificate authorities. Previously, we only accepted rds-ca-2019. The accepted certificates include:
rds-ca-rsa2048-g1
rds-ca-rsa4096-g1
rds-ca-ecc384-g1
When job log recording (used to download job logs from the Tonic application) fails, it no longer creates a recording retry loop.
File connector
Additional fixes to skip and log invalid rows instead of failing the data generation.
Fixed an issue where, when you added a file to an existing file group and any column name contained a leading or trailing space, Tonic incorrectly displayed a schema mismatch error.
You can now add .gzip files to a file group in a file connector workspace. The original file that was compressed must have the same format and structure as the other files in the file group. .gzip files are only supported for workspaces that use files from Amazon S3 or Google Cloud Storage. They are not supported in workspaces that use local files.
PostgreSQL
During upsert, improved performance when de-conflicting unique constraints.
September 15, 2023
Create virtual foreign keys from Subsetting view - On Subsetting view, from a table details panel, you can now add a virtual foreign key to that table. To add a virtual foreign key, you select the foreign key column from the current table, then select the primary key column from the other table.
Other updates
Fixed an issue with TLSv1 and TLSv1.1 support in Tonic.
Improved performance of downstream processing during subsetting.
Fixed an issue where subsetting failed when a composite foreign key included a Boolean value.
On Table View:
Added a column warning when the assigned generator fails to generate preview data.
Increased the default width of the preview data column.
File connector
Tonic now skips invalid CSV file rows and logs a warning instead of failing the entire file.
Added a warning when the same source file is added to multiple file groups in the same workspace.
Added JSON Mask and XML Mask as the recommended generators for JSON and XML files.
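The skip-and-log behavior for invalid CSV rows noted above can be pictured with a minimal Python sketch. This is not Tonic's implementation. It assumes that "invalid" means a row whose column count differs from the header, and the function name read_valid_rows and the logger name are hypothetical.

```python
import csv
import io
import logging
from typing import List

logger = logging.getLogger("csv_loader")  # hypothetical logger name


def read_valid_rows(text: str) -> List[List[str]]:
    """Return the data rows whose column count matches the header.

    Assumption: a row is "invalid" when its column count differs from
    the header. Invalid rows are logged as warnings and skipped, so one
    bad row does not fail the whole file.
    """
    reader = csv.reader(io.StringIO(text))
    header = next(reader, None)
    if header is None:
        return []  # empty input: nothing to read
    rows = []
    for row in reader:
        if len(row) != len(header):
            logger.warning("Skipping invalid row near line %d: %r", reader.line_num, row)
            continue
        rows.append(row)
    return rows


sample = "id,name\n1,Ada\n2\n3,Grace\n"
valid = read_valid_rows(sample)
```

In this sample, the row that contains only 2 is skipped with a warning, and the other two rows are returned.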
MongoDB
You can now assign the Conditional and Null generators to binary fields.
MySQL
Fixed an issue where Tonic did not clean up temporary file uploads after it wrote the data.
Oracle
For a data generation job that ran SQL Loader (sqlldr), if sqlldr either failed or succeeded with errors, the job details include an option to download the sqlldr log files.
PostgreSQL
Fixed an upsert issue where jobs failed with a unique constraint violation if the table contained a unique index but not a unique constraint.
During upsert, improved performance when de-conflicting unique constraints.
Snowflake on Azure
Fixed a bug that occurred when specifying the Azure Storage Account URL.
September 8, 2023
Removed the requirement that the authentication cookie goes over HTTPS (Secure Cookie). This fixed an issue where users could no longer log into Tonic over HTTP, but they could still log in over HTTPS.
Fixed an issue where users could not log out of Tonic from the email confirmation page.
Fixed an issue where upsert failed because of foreign key violations. Also improved upsert performance.
MongoDB
When generator configurations are updated in single document view, Tonic now generates the preview data without re-fetching the data and refreshing the page.
Fixed an issue that caused fields to disappear from hybrid view when the Null generator was applied.
September 1, 2023
In Table View, when a generator cannot be applied to a column in order to produce the preview data, the error message now includes the name of the column.
Expanded the table and collection dropdown lists to accommodate longer names.
The Privacy Report now marks a column as consistent when the generator is always consistent.
Fixed a migration issue with file connector files that were added before V920.
Fixed an issue where data generation could not run because of the permissions hierarchy.
Fixed a security issue related to JWT authentication.
Fixed an issue where webhooks sometimes did not start when a job was canceled.
Improved error message when Tonic cannot display a date value.
PostgreSQL
For the Tonic Data Pipeline V2 processing, Tonic now stops job execution after the initial error.
Fixed an issue where check constraints failed to be applied in the destination database.
Fixed an issue where views that depend on both a table and a view at the same time were not created in the destination database.
Tonic no longer uses the TONIC_PAGE_PARALLELISM and TONIC_PARALLEL_READ_RANGES_TABLES environment variables for parallel processing.
SQL Server
Fixed a caching issue that occurred when connecting to SQL Server.
Improved error messaging when a view cannot be created.
Improved the readability of SMO error messages.
December 23, 2021
Features
Ability to specify a fallback generator on XML and JSON generators
Ability to exclude multiple email domains in the email generator
Update column schema to preserve dropped-date and isDropped status
Change job details checkmarks to green
Prevent idle session timeout in Postgres
Drop rows if decimal parse fails in Postgres
Add pre-job check logging
Ability to connect to MongoDB via connection string
Allow incremental mode on tables where the PK is also an FK
Retry polling job updates and cancel job checks when there are connectivity issues with the Tonic DB
Add a warning message when a sequence fails to restore on destination data
Always show job status warning icon when applicable
Bug Fixes
Fixed an issue where importing workspace JSON caused schema change alerts
Fixed Timestamp Shift for DateTimeOffset
Fixed how we count the number of partitions on tables in SQL Server
Properly escape MySQL Username/Password encoding
December 9, 2021
Features
Update Workspace Users/Shares via web sockets
Provide job summary updates via web sockets
Make jump to page in Workspaces table more discoverable
Display which Tonic tier the customer is on
Wrap DNS resolution in try-catch block to avoid errors upstream
Expose ports 443 and 80 in Dockerfiles for applicable containers
Update session variables to enforce autocommit for MySQL
Parallelization of row processing for improved job performance
Bug Fixes
Removed random sampling in ConsistentOn generators
Fixed newly edited Tags of active workspace not being exported
Updated Oracle writer path for job improvement
Improved error message when test connection fails
Made error message more clear for generators that perform frequency samples
Fixed DB2 connection spinning indefinitely via timeout
December 3, 2021
Features
Added incremental mode support for MySQL
Added Workspace filter/sort improvements
Added Recovery Mode check for SQL Server Destination databases
Added support for connecting to mongo via connection string
PII Report Jobs are now canceled when Zombied/Abandoned
Added button to resolve all schema changes
Added the ability to generate full US address
Bug Fixes
Faster sampling for large tables when running privacy scan
Improved handling of null exceptions when subsetting
Made performance improvements to get all workspaces
Made performance improvements to only load job summaries when needed
Improved data generation for latitude/longitude with Smart Linking
Improved formatting of error messages on job details
November 24, 2021
Features
Extend timestamp format to microseconds
Restrict workspace generation details to Owners
Allow webhooks on all DB Types
Added Struct support on Spark
Bug Fixes
Fixed lingering column on bulk view when configuring generators
State abbreviation privacy check requires some variability
Added handling of when custom categories are null in the metadata for Custom Categorical generator
Fixed date parsing for event generator
Smart linking bulk edit + metadata
Do not add AND clause if there are no non-filtering columns (subsetting)
Improved Sidebar indicators
Fixed inability to generate over tables which contain invisible tables in Oracle
Added null check for error message in subsetting
Hide Subsetting/Post-Job Actions in Workspaces view when not applicable
Added used packages explicitly and removed unused package dependencies
Primary key columns in synthesized mode should not have generator picker
November 18, 2021
Features
Random Timestamp output format selection
Redshift allow users to preserve source database owners
Support binary types as primary keys in subsetting
Log and skip over unresolved URNs instead of failing the job
Ignore VIEWS for Databricks and EMR
Released Additive Noise Generator
Added generator for creating CPF numbers
Bypassing XML validation on SQL Server XML columns is now optional
Added support for Windows authentication on SQL Server
Bug Fixes
Editors should have read only on workspace edit dialog
If no user has been created, show create account page
Increased dialog width to prevent overflow when EMR spark is selected.
Added leading zero to 4 digit zip codes
Removed unavailable features from UI for Spark
November 12, 2021
Features
Improved user experience for workspaces view
Custom value processors can now be applied to not-null non-replaceable data types.
Update subsetting toggle on/off via web sockets
Bulk store column info for performance improvements
Disable generators for generated columns in Oracle
Added support for DML table-level triggers
Preserve file preferences in Oracle
Bug Fixes
Login page no longer displays error on page load when SSO is not configured.
SQL Server numeric columns now preserve their decimal places when using the continuous generator.
Continuous generator no longer fails on missing fields in MongoDB.
Fixed issue where Duo login button was disabled when REQUIRE_SSO_AUTH is true.
Connection test for Snowflake and Redshift no longer reports incorrect permission error.
Fixed issue where you could not edit a workspace if you lacked permissions on the active workspace.
Tonic no longer attempts to write generated columns in Postgres to the destination database.
Fixed Spark generations by no longer attempting to instantiate a SparkSession in the UDF
Fixed parsing issue in Databricks Jobs API where we were treating an Int64 value as an Int32
Fixed issue where continuous generator would fail when the precision of a numeric column is unavailable
Removed security vulnerabilities from Notifications container
Added signed tokens to Redshift load and unload statements
November 8, 2021
Features
Improved message when database job finishes
Added dependency sort on tables set to Incremental mode
Added HStoreMaskGenerator specific to Postgres hstore types
Added conditional generator support to MongoDB.
Added support for AWS SSO
Better error messages on the front-end when out-of-range IntPkGenerator
Bug Fixes
Removed linking from Random boolean generator
EMR should show more than 100 tables when appropriate
Preserve Destination disabled in bulk on EMR and Databricks
Upgraded Redis to resolve CVE-2021-32762
Database type now shows properly for newly created workspaces in the workspace view
Changed the download logs button to be more intuitive for jobs that don't use SmartLinking
Consolidate Copy vs Duplicate terminology
Job end times now show properly when jobs terminate unexpectedly
Break long workspace names to avoid overflow
Fix Enable Log Collection for Lambda Functions
Fix scaled uuid foreign key generation for unique columns
Fixed data type for numeric in SQL Server
Fix varbinary truncation in MySQL
November 1, 2021
Features
Added UUID primary key support for synthesized mode.
Added support for Duo 2FA SSO.
Updates for Oracle XML, RAW Types.
Handle triggers and grants when creating DBs for Oracle.
Null Generator now supported as Spark UDF.
Handle spatial (geometry/geography) types in SqlServer.
Bug Fixes
Update tracking URL in Databricks.
Leave temp schemas alone.
Fix ConsistentOn in RegexGenerator.
Security updates for Linux.
October 18, 2021
Features
Update Database view filter behavior
Added configuration option to disable account creation.
Regex mask generator is now supported on unique columns.
Warnings and errors for Redshift and Snowflake are now displayed on the job details page.
Updated job queued status.
Add Cloudwatch Log Filtering by Log Level.
Bug Fixes
Performance improvement for hybrid document creation for MongoDB.
Fixed issue where constraints could be duplicated when using the same Oracle server for source and destination.
Enhanced Oracle log messages when copying data.
October 7, 2021
Features
PII Scan will continue on error and log all issues.
Added unique phone number generator.
Added support for Integer Key generator on decimal columns where precision is zero.
Rename "Output" Database to "Destination".
Added Table Mode descriptions on hover
Added TLS support for Db2 iSeries connections.
Bug Fixes
TONIC_MYSQL_USE_COMPRESSION environment variable was removed.
Slow api responses that are still in process when a workspace switch occurs will no longer cause stale data to show in the application.
Properly set last visited collection for Mongo workspaces.
Fix workspace row alignment.
Disallow applying generators on postgres generated columns.
Fixed logging issues with PII scan errors.
Leave destination schema alone on Db2.
Fix subsetting of FKs with same name.
July 15th, 2020
Features
Tables on Sql Server can now be updated in an incremental fashion where only the changes since the last generation are processed. More details here.
Added support for custom categorical generator in synthesize mode.
Date truncation generator has been added so you can truncate dates to a specified date part.
Added support for timestamp ranges in Postgres.
Phone number generator now supports multinational phone numbers. The output phone number will match the country/region of the input phone number.
Integer key generator can now operate on an Int64/Long column.
Bugs
Fixed issue where a specifically sized payload could cause the API request to fail.
Geo generator properly passes through null values.
Added back remove table API.
Tables containing unicode characters now work in preserve destination mode on Sql Server.
Improved handling of transaction scopes for Sql Server.
Fixed issue where generators applied through the conditional generator sometimes used their default options during generation.
Generators added through autodetect are now able to be marked consistent/differentially private.
Resolved multithreading issue using the key generators on tables with linked foreign keys.
Sql server date columns no longer show time in the table view.
July 1st, 2020
Features
Added ability for full name generators to be consistent with partial name generators.
Generator popovers now scroll after reaching a certain height.
Add Random Double generator.
Add ability to delete Tonic account.
Improved job progress tracking for Oracle.
Allow workspaces to be imported and exported.
Added support for conditional generator on Sql Server image columns.
Negative numbers can now be used in the constant generator with numeric columns.
Added support for timezone arrays in Postgres.
Performance Improvements
Significant performance increase when using the continuous generator.
Bug Fixes
No longer displaying out of date conditional generator configuration in the collapsed view.
Fixed issue with not being able to backspace the constant value on a constant generator applied to a numeric column.
Users can no longer duplicate subset targets on the subset configuration page.
Removed conditional generator as a sub-generator choice when using the xml or json mask generators.
Fixed issue where consistency could not be applied to a generator inside the conditional generator.
Loading indicators on the table view are now more consistent.
Continuous generator now works with smaller partitions when generating statistics.
Job details page now scrolls when number of steps is too large for the screen area.
Fixed issue where adding a custom categorical generator inside the JSON mask generator could cause an error.
Removing a table from the Schema Changes page now properly removes it from the subsetting configuration.
Disabled the option to include tables that are outside the subset when subsetting is not enabled.
Fixed broken link for API documents on Job Details page.
Fixed issue where sequential integer generator didn't reset itself between generations.
Forced browser to get new version of assets after each release.
Add sql injection safety to generators that partition.
Fixed issue with switching between workspaces on the jobs view.
Moved popovers in database view to not be blocking other columns.
Fix issue with attribute info generator failing on empty attributes.
Stopped job progress from updating after job is complete.
Fixed issue with canceled jobs still running if they were cancelled before they could run.
June 17th, 2020
Features
Added the ability to specify a domain and enable consistency on the unique email generator
Conditional generator can now be used with the unique email generator
Added confirmation step for canceling a generation.
Added support for JSON and JSONB arrays in Postgres.
Performance Improvements
Updates to your workspace are now done through Json Patch.
Bug Fixes
Fixed issue where a table was unable to be switched to synthesized mode.
Fixed issue where min and max on the random integer generator were not editable.
Fixed issue where multiple changes might not be saved properly if executed within a few milliseconds of each other.
Fixed issue where constraints failed to apply to a large Sql server database.
Fixed display issue where the password input appeared to be filled in on the destination database connection screen.
Swagger docs no longer report enum fields as integers.
Fixed issue where edit, copy, and delete workspace buttons were still clickable even when disabled.
Decreased clickable area on consistency and differential privacy switches.
Fixed issue with NaN values in Postgres double fields.
May 29th, 2020
Features
A new subsetting option allows you to process tables that are not included in your subset.
Several Conditional Generator improvements:
Conditional Generator can now filter rows by a regular expression.
Conditional Generator now supports the following additional generators: Categorical, Custom Categorical, Alphanumeric Key, Numeric String Key, Integer Key.
'Is null' and 'is not null' operators added to conditional generator
Custom Categorical Generator now supports numeric types.
Random timestamp generator can now be added to text columns.
Password managers are now prevented from interfering with the database connection form.
Improved logging for constraint application on Sql server.
Renamed 'private' to 'sensitive' when referring to a column with data that needs to be protected.
Column headers are now red when a column is sensitive but not protected.
Performance Improvements
Workspace updates no longer happen on keystroke and will now wait until you exit the field or popup.
Changed the way the application loads workspaces to increase performance for large workspaces.
Bug Fixes
Fix display issue with Custom Categorical Generator when there were no categories.
Fixed error message when applying random timestamp generator.
Fixed issue with column editor size changing when marking a column as sensitive.
Fixed an issue with xml columns on Sql server
May 12th, 2020
Features
Tables can now be filtered by their current mode in the database view.
Added support for the 'contains' operator on the conditional generator.
Company name generator now supports consistency.
Added version check when editing a workspace to better support multitab/multibrowser use. Multiple users can now edit the same workspace without worrying about race conditions. Note: this is a first step, more multiuser features are in development.
Performance Improvements
Improved speed of tables in mask mode by fixing issue introduced in v76.
Bug Fixes
Fixed issue with applying a large number of constraints with Sql Server.
Random boolean, IP address, and random integer generators now work correctly with the conditional generator
Tasks that complete immediately now display correctly in the job details page.
Fixed display issue with data preview button on privacy hub.
May 8th, 2020
Features
Added a generator for shipping container codes.
Custom Categorical Generator now works on json fields.
Destination database names can now differ from the source for Sql Server.
Job completion time has been added to the Job Details page.
Additional diagnostic tools added to docker containers.
Performance
Column search performance in bulk editor significantly improved.
Removed Intercom
Bug Fixes
Preserving a partitioned table on Sql Server no longer causes an error.
Fixed issue with the Categorical Generator when the table is empty.
Fixed issue with the last line break sometimes being removed from the category list on the Custom Category Generator.
Fix preserve tables for non-Oracle databases.
Styling fixes for Conditional Generator when used in Firefox.
Fixed issue where insert into an XML column can fail on Sql Server when the payload has a large attribute, too many nodes, or too much nesting.
Apr 28, 2020
Features
You can now apply generators conditionally based on the values within the column.
Added support for array types in Postgres
Added a custom categorical generator, which allows you to specify a list of categories for a given column
Starting a data generation via the API will now return the ID of the job
Primary key generators can now be applied in the bulk edit view.
Added additional information on the job details page and in the logging, especially around subsetting.
Return existing refresh tokens if they exist to better enable multiple open sessions for the same account.
Performance
Improved the speed of the integer primary key generator
Improved subsetting performance when subset targets are referenced by large tables
Bug Fixes
Tonic will now ignore tables created by unsupported database add-ons that previously would require the user to truncate.
You may see these tables show up as a schema change alert. Most common ones are spatial_ref_sys and sysdiagrams.
Skip some of the relationship checks needed for synthesis when table mode is not set to synthesize.
Fixed display issue where primary key would show as needing a generator for synthesis mode.
Fixed issue where the table drop down on the subset configuration view sometimes rendered in the wrong location.
Json path generators will no longer overwrite objects or arrays.
Fixed issue where identity sequence was not being set correctly on the destination table.
Fix issue when empty strings are present in a column with the numeric string or alphanumeric string primary key generators attached
Apr 14, 2020
Features
Ability to add generators to primary keys
Support hex strings for inserting into binary columns in Sql Server
Bug Fixes
Replaced browser alert with a toast notification when a job fails to be scheduled
Oracle now returns columns in the order of their ordinal position
Adjusted sizing on subset dropdown
Fixed issue where the progress tracker could fail to write events near the end of a generation.
Apr 6, 2020
Features
Full Oracle support is out of beta
Users can now generate API tokens to more easily authenticate with Tonic's APIs.
Completion times now show on job details.
Subset configuration area now scrollable.
Added a check to ensure destination version is not older than source version for MS Sql Server.
Security
Upgraded Java version on docker images to resolve security issues.
Bug Fixes
Fixed issue where consistency did not work across multiple workers.
Ensure that subset configuration is validated when the subset configuration page has not been loaded yet.
Fixed issue where Privacy Hub could continue to refresh even after a scan was complete.
Incorrect generators are no longer showing in the MS Sql Server image data type.
Fixed magnifying glass placement on Privacy Hub on field hover.
Fixed various issues with toast notifications.
Reset scrollbar position on table selection change in bulk editor.
Mar 25, 2020
Features
Improved logging when testing a connection
Added MySql 8 compatibility for our collated queries.
Complete overhaul of progress tracking and the job details page. Jobs Details now has progress bars, spinners, and various icons to present the current status of a job.
Performance
Index restoration parallelized on MySql.
Foreign keys added to a MySql table are now added with 'foreign_key_checks=0'.
Tables marked as 'Preserve Destination' now complete significantly faster on SQL Server.
Security
Added TLS support allowing you to use a custom certificate on the Tonic web server.
Encrypted all traffic between containers.
Update third party packages to fix various security vulnerabilities.
Bug Fixes
Fixed issue in MySql Restore Foreign Key query
Fixed issue when user changes a JsonMask or XmlMask generator to other generator types
Tiny Int not treated as Boolean in MySQL
Fixed MySql issue with preserved tables having the same name in different schemas
PII Scanner now picks up on zip code data by column name.
PII detection using lookup tables now removes leading and trailing whitespaces and converts all values to lowercase. This fixes our PII detection for cities and states.
Test Connection button no longer holds on to table locks on MySql databases.
Workspace Name field now receives focus when workspace editor is opened.
Enter key can now be used to save Workspace Editor.
Allow UUID primary keys as downstream type in subsetting mode.
Fix issue with GUID primary keys when subsetting is enabled.
Long column names now render properly on Privacy Hub.
Fixed issue with bulk adding consistent generators.
Account creation no longer hangs when server has no internet connection.
Fixed spacing issue with JSON Mask Generator interface.
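The lookup-table normalization described in the PII detection fix above (trim leading and trailing whitespace, then lowercase) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names normalize, KNOWN_CITIES, and looks_like_city are hypothetical, and the tiny city list stands in for whatever lookup tables Tonic actually uses.

```python
def normalize(value: str) -> str:
    """Strip leading and trailing whitespace and convert to lowercase."""
    return value.strip().lower()


# Hypothetical lookup table. Entries are normalized once, up front,
# so that lookups compare like with like.
KNOWN_CITIES = {normalize(city) for city in ["San Francisco", "Atlanta"]}


def looks_like_city(value: str) -> bool:
    """Return True when the normalized value appears in the lookup table."""
    return normalize(value) in KNOWN_CITIES
```

Without the normalization step, a value such as "  ATLANTA " would not match the lookup entry "atlanta", which is the kind of miss that the fix addresses for cities and states.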
Mar 10, 2020
Features
Bulk operations on multiple columns in the Database View:
Add and remove generators
Change privacy status
UI now distinguishes between primary (gold) and foreign keyed (black) columns
Advanced search for columns in the Database view.
Generator select popover on database view is now at feature parity with same popover in Table View.
Ability to manually mark columns as private
Columns with private data that are not protected are flagged as red (Database View only) and have a warning in the generator selector
tonic_worker now has a /health endpoint
Phone Number Generator supports integer columns
Differential Privacy warnings now show for Categorical Generator when being used on JSON and XML columns
Ability for user to provide their own secret to be used for encrypting both JWTs and database connection info
Ability to change password (under User Settings in the hamburger menu located on the right of the menu bar)
Performance
Preserved tables in MySQL are no longer copied into the preserve schema via SELECT INTO. Instead they are RENAME'd
No longer drop and restore primary keys and indexes in MySQL
Bug Fixes
Better support for 3 byte unicode characters when writing data to masked tables in Postgres
Fixed rare concurrency exception being thrown in progress tracker when jobs are running
Fixed malformed query in Sql Server which is used when subsetting
MySQL handling of character
Additional logging of MySQL index handling
Feb 19, 2020
Features
Bulk operations on multiple tables in the Database View
Where clause in subset - in addition to a target %, you can now specify a custom where clause as the target for subsetting
URL Generator is more private
PII type is shown in Privacy Hub
Bugs
Styling issues for Chrome and Firefox
Event Ordering defaults corrected
Columns with nothing but nulls gave bad data previews
Feb 11, 2020
Features
Health Endpoints /health added for web server and PII detection server
Non-null values are no longer shown during data preview
Sort by column name in privacy hub
Ability to delete a workspace
Columns part of truncated tables are now labelled as such in the Database view
PII Detection now uses the Character Scramble on suspected phone numbers
Support for Noda time in PostgreSQL and throughout the product
Other columns no longer shows in the database view
Email Generator now has an excluded domain filter
Subset with Foreign Key Upload File
Performance
Database View UI rendering improvements
UI improvements for MySQL databases with 1000s of tables
Ability to process tables in parallel across all databases. This setting is controlled by the TONIC_TABLE_PARALLELISM environment variable
Faster reads from the source database. This affects SQL Server, MySQL, and Postgres, which should all see improved read performance
Optimization for SQL Server to improve writes
Better distribution of load to workers
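To picture how an environment variable such as TONIC_TABLE_PARALLELISM can bound the number of tables processed at once, here is a hedged Python sketch. It does not reflect how Tonic reads or applies the setting. The fallback value of 1, the helper names, and the placeholder process_table function are all assumptions made for illustration.

```python
import os
from concurrent.futures import ThreadPoolExecutor
from typing import Iterable, List


def table_parallelism(default: int = 1) -> int:
    """Read the worker count from TONIC_TABLE_PARALLELISM.

    Assumption: fall back to `default` when the variable is unset or
    is not an integer. The real default is not stated in these notes.
    """
    raw = os.environ.get("TONIC_TABLE_PARALLELISM", "")
    try:
        return max(1, int(raw))
    except ValueError:
        return default


def process_table(name: str) -> str:
    """Placeholder for per-table work (hypothetical)."""
    return f"processed {name}"


def process_all(tables: Iterable[str]) -> List[str]:
    """Process tables concurrently, bounded by the configured parallelism."""
    with ThreadPoolExecutor(max_workers=table_parallelism()) as pool:
        # pool.map returns results in the same order as the input
        return list(pool.map(process_table, tables))
```

A call such as process_all(["users", "orders"]) runs at most table_parallelism() tables at a time and returns the results in input order.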
Bug Fixes
Fixed bug in XML generation for Postgres
Reduce rate of false positive when detecting zip codes
Subsetting now properly supports self-referential foreign keys
SQL Server uniqueidentifier columns now properly work with the UUID generator
Privacy hub has better styling for long column names
Fixed issues related to change of focus on Target % input on subset configuration page
Jan 2, 2020
Features
Support consistency for the email generator on a unique column
Rename and consolidation of docker images. Please contact support@tonic.ai before upgrading for a new docker-compose file
Massive refactor of subsetting, mostly behind the scenes. User facing improvements:
Support for all databases (not just Postgres and MySQL)
Heroku no longer requires a temporary database
Performance boost
Categorical generator now supports differential privacy (https://www.tonic.ai/post/differential-privacy-comes-to-tonic/)
Bug Fixes
Fix for certain unicode characters in table/schema names
Fix for character scramble on numerical columns
Several fixes for air gapped deployments
Significant performance improvements for the privacy scan
Dec 27, 2019
Features
Privacy Hub. More details here: https://tonic.ai/post/introducing-privacy-hub/
Support for SQL Server partition functions and partition schemes
Data Preview in Database View
Suggested Generators now show privacy scan suggested generator when applicable
Bug Fixes
Better support for SSH Tunneling over private IPs
Nov 15, 2019
Features
Generator improvements:
Email Generator can be used on columns with a unique constraint
Character scramble now supports consistency
Filename and Email generators make use of character scramble as opposed to character substitution (more secure). They both also now support consistency.
More improvements to PII Detection, specifically speed, resilience, and better name detection
Ability to duplicate workspaces
Show generators applied on column in bulk view when table is truncated
Support for SSH Bastion when connecting to source and output database
Added new ENV variable for controlling parallelism for MySQL generation
Bug Fixes
Fixed bad description on FK Relationships upload help text
Fixed layout issue in Replacement Panel
Fixed issue with ICD10 PII detection
Fixed routing issues on Heroku
Releases V899 - V901 were removed from quay because of a regression that was fixed in later releases.
August 25, 2023
You can now export individual topics from the Tonic documentation to PDF files. To export a topic to PDF, click the actions menu next to the topic title, then click Export as PDF.
Fixed an issue with removing unique constraint conflicts in upsert where rows that didn’t have a conflict were excluded from the upsert process.
File connector
For Amazon S3 and Google Cloud Storage, the permission to list all buckets is no longer required. However, if that permission is not present, users must manually type in the bucket name where the file is located.
When you copy a file connector workspace, Tonic now copies the file groups to the new workspace.
August 18, 2023
Upsert data generation (beta feature)
Previously, the data generation process always replaced the entire destination database. The new upsert data generation option (currently in beta) allows you to add new records and update existing records without touching any of the other records in the destination database. For example, you might have a regular set of records that you use for testing that you want to maintain.
Upsert requires a connection to an intermediate database. When you run data generation with upsert, the initial data generation writes the transformed data to the intermediate database. It replaces the intermediate database, similar to regular data generation. After the generation to the intermediate database, the upsert process identifies the records to add to or update in the destination database. It ignores other records in the destination database.
Upsert requires a Professional or Enterprise license, and is only supported for the following data connectors:
MySQL
Oracle
PostgreSQL
SQL Server
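As a rough illustration of the upsert behavior described above, the following Python sketch models the intermediate and destination databases as dictionaries keyed by primary key. The function name and the in-memory model are assumptions made for illustration. They do not reflect how Tonic implements upsert.

```python
def upsert(destination, intermediate):
    """Add new records and update existing ones; leave all others untouched.

    Both arguments map a primary key to a record. This in-memory model is a
    hypothetical stand-in for real database tables.
    """
    added, updated = [], []
    for key, record in intermediate.items():
        if key not in destination:
            added.append(key)
        elif destination[key] != record:
            updated.append(key)
        destination[key] = record
    return added, updated


destination = {1: "alice (old)", 2: "bob", 99: "hand-made test record"}
intermediate = {1: "alice (new)", 3: "carol"}

added, updated = upsert(destination, intermediate)
print(added, updated)   # [3] [1]
print(destination[99])  # hand-made test record
```

Records 2 and 99 exist only in the destination, so the sketch leaves them unchanged. This corresponds to the case where you maintain a regular set of test records.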
New AI-enhanced documentation search option
The Tonic documentation now provides access to Lens, an AI-based search option. Instead of searching for specific words, you can ask questions such as "How do I create a workspace?". Lens searches the documentation for the answer. It generates a response that includes links to the topics that contain the information it used.
To use the Lens search, click the search field. At the top right of the search panel, click Lens. Then ask your question.
Other updates
The Custom Categorical generator now supports consistency with other columns. Previously, the generator only supported self-consistency.
Tonic now prevents you from starting data generation for a workspace that does not have a destination database specified.
Fixed a display issue where the generator preset details panel briefly showed the occurrences for the previously selected generator preset.
Tonic now suggests the Name generator for columns that Tonic detects as containing names, when the detection is based on the sampled data in the column. By default, the Name generator uses the First Last format.
A new configuration option allows webhooks to bypass SSL certificate validation and trust the server certificate.
File connector
You cannot start data generation for a file connector workspace that has no source files specified.
For file groups that contain JSON or XML files, the Table View data preview no longer displays an extra row that contains the value 0 above the real data records.
Fixed an issue where generators such as the Categorical generator unexpectedly could not be used as sub-generators.
Improved error handling when a file group is incorrectly configured as having a header row.
Fixed an issue that caused data generation to fail for XML files because of missing metadata.
Fixed an issue where Database View displayed duplicate columns.
Table View no longer displays an extra data row.
MongoDB
Fixed an issue that caused unscanned collections to not display on Collection View.
The sensitivity scan no longer marks Null fields as sensitive.
PostgreSQL
Improved performance when refreshing materialized views in PostgreSQL v11, v12 and v13.
August 11, 2023
On the generator configuration panel, changed the label of the Save As menu to Preset Options. The menu contains options related to configuring generator presets.
Free trial users now have access to the file connector.
Tonic now displays the error that occurs when an Algebraic generator configuration does not include any floating-point values.
For composite generators, the generator preset details panel now provides a clearer explanation that presets for composite generators must be configured from within a workspace.
Improved performance for the Address generator and the HIPAA Address generator.
Amazon Redshift
Fixed an issue with clearing temporary tables.
File connector
Fixed how Tonic handles EOF characters.
MongoDB
Corrected the order of the available generator presets for a field.
Snowflake
Fixed an issue where data generation returned the error "The specified bucket does not exist".
SQL Server
Database connections can now use the MultiSubnetFailover option.
Improved error messaging when a database cannot be created because of permissions issues.
August 4, 2023
Removed support for TIM. The Tonic Installation Manager (TIM) command-line tool, which was used to install and configure Tonic, is no longer available.
Free trial users can now use a public email address to create the free trial account. Users with public email addresses cannot invite other users or share workspaces. Public accounts are only allowed for free trials.
Users on a Professional instance can now share the Manager workspace permission set with users and groups.
Improved error handling and validation messages for the foreign key file upload process.
Counts of generator preset occurrences no longer include occurrences in deleted workspaces.
On the bulk update panel in Database View, the consistency and differential privacy options now display correctly.
Fixed an issue where you could not select Passthrough as a sub-generator in a composite generator.
Fixed an issue where custom presets could not be deleted.
Fixed a display issue where long post-job action names overflowed into the next column.
Fixed an issue where you could not assign Random Timestamp as a sub-generator for the Conditional generator.
Fixed an issue where the generator configuration panel displayed the generator preset options when the user did not have the Manage generator presets global permission.
Fixed an issue where data generation failed when a constraint could not be applied.
Improved display when users who do not have the Manage generator presets permission try to display the Occurrences tab on the preset details panel.
Improved how we handle unavailable options for workspace actions in Workspaces view and in the Tonic navigation options.
For the Conditional generator, Tonic now correctly compares MySQL date values.
Databricks
Tonic cluster initialization scripts are now uploaded as workspace files instead of DBFS files. The new, optional Workspace Path setting for Databricks workspaces controls the parent directory where Tonic uploads initialization scripts. The default value is /Shared.
File connector
The file connector can now support .txt files that contain CSV, XML, or JSON content.
Fixed an issue where Tonic incorrectly identified how a file connector file was encoded.
Improved error messages when uploading files for the file connector.
Fixed an issue when configuring a file group from Amazon S3 where users saw the error "Failed to fetch files from S3. The continuation token provided is incorrect." but could still see the list of files in the S3 bucket.
Tonic now correctly updates the file configuration for file groups. Previously, users could not add files that did not match the default configuration.
Tonic now displays an error when it is unable to read files from Amazon S3.
The file explorer for Amazon S3 can now list the files in folders that have names that contain special characters.
Improved encoding detection and file parsing.
Tonic now correctly handles EOF characters in .csv files.
Tonic now preserves the encoding of .csv files.
MongoDB
Fixed an issue where the protection status information at the top of Privacy Hub did not update correctly after a new sensitivity scan.
July 28, 2023
Custom generator presets
Earlier this year, for Enterprise instances, we introduced the concept of generator presets. A generator preset is a saved configuration of a generator. You can assign generator presets to columns.
The initial release only included built-in generator presets, which allowed you to set the default configuration for Tonic generators.
This update in V924 introduces custom generator presets, which allow you to set up multiple configurations of the same generator. You can create custom generator presets from Generator Presets view. From a generator configuration panel, you can also save the current configuration as a new custom generator preset.
Generator preset occurrences
From Generator Presets view, you can see how often each preset was used in a workspace configuration.
The Occurrences column of the generator presets list shows:
The number of times the baseline configuration was used
The number of times the baseline configuration was overridden, meaning that a user selected the generator preset and then made a change to the generator configuration
On the generator preset details panel, the Occurrences tab displays both the number of occurrences and the specific workspaces and columns where the generator preset was used. You cannot see workspace and column details for workspaces that you do not have access to.
Other updates
Tonic can now integrate with GitHub for SSO authentication.
To manage generator presets, users must now have the Manage generator presets global permission. Previously, you could also manage generator presets if you had the Manager or Editor workspace permission set for any workspace.
Fixed an issue where the table data in Table View was not updated correctly when switching the table mode to or from Scale mode.
Improved performance for the Regex Mask and Conditional generators.
MongoDB
Fixed an issue where the subsetting Graph View did not display virtual foreign key relationships.
You can now add collections to a subsetting rule before the sensitivity scan completes.
PostgreSQL
Fixed an issue where certain database constraints were not handled correctly, which resulted in job warnings about the failure to add those constraints.
July 26, 2023
Permissions and permission sets
As of V922, Tonic now uses permissions and permission sets to manage access to Tonic features and functions.
A permission controls access to a single feature or function. A permission set is a saved collection of permissions.
Tonic provides built-in global and workspace permission sets. You cannot change the configuration of built-in permission sets. Enterprise instances can create custom permission sets.
Global permission sets contain global permissions, which control access to actions outside the context of a specific workspace. The built-in Admin global permission set grants access to all global permissions. Users and groups configured in the TONIC_ADMINISTRATORS environment variable are granted the Admin (Environment) global permission set, which also grants access to all global permissions. These global permission sets replace the previous admin user concept.
The built-in General User global permission set is assigned to all Tonic users, and grants access to create workspaces. You can also designate a different global permission set to assign to all Tonic users.
Workspace permission sets are assigned in the context of a specific workspace. They provide access to workspace permissions, which are associated with workspace management functions. The built-in workspace permission sets (Manager, Editor, Viewer, Auditor) mirror the previous workspace roles. Similar to the previous Owner role, the Manager workspace permission set is granted access to all workspace permissions. However, unlike the Owner role, the Manager workspace permission set can be assigned to any user or group. You use the workspace sharing function to assign workspace permission sets within a workspace.
Each workspace has a single owner. The user who creates the workspace is the initial owner. All owners are by default granted the Manager workspace permission set. You can also designate a different workspace permission set to assign to workspace owners. You use the transfer ownership function to select a different owner for a workspace.
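The relationship between permissions and permission sets can be pictured with a small Python sketch. The class, the permission names, and the contents of each set are hypothetical and chosen only for illustration. They are not Tonic's actual permission identifiers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PermissionSet:
    """A saved collection of permissions."""
    name: str
    permissions: frozenset
    built_in: bool = False

    def allows(self, permission):
        return permission in self.permissions


# Hypothetical permission names, for illustration only.
WORKSPACE_PERMISSIONS = frozenset(
    {"view_workspace", "configure_generators", "run_generation"}
)

# In this sketch, Manager holds every workspace permission.
manager = PermissionSet("Manager", WORKSPACE_PERMISSIONS, built_in=True)
viewer = PermissionSet("Viewer", frozenset({"view_workspace"}), built_in=True)

print(manager.allows("run_generation"))  # True
print(viewer.allows("run_generation"))   # False
```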
On Tonic Settings view, the User Management tab is replaced by the Access Management tab. From the Access Management tab, you can:
If you use SSO, view a list of SSO groups
View, configure, and manage access to global permission sets
API endpoint to track user access and permissions
A new API endpoint tracks the following events related to user access and permissions:
A user account is created.
A user account is removed.
A user logs in to Tonic.
A permission is added to or removed from a permission set.
A permission set is assigned to or removed from a user. This might be a global permission set, or a workspace permission set in the context of a specific workspace.
A generator preset is updated.
The endpoint is:
GET /api/audit-events/search
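A minimal Python sketch of how a request to this endpoint might be built is shown here. The host name, the API key value, and the Authorization header format are assumptions, and the sketch does not show any query parameters or the response structure. Check the Tonic API reference for the actual details.

```python
import urllib.request

BASE_URL = "https://tonic.example.com"  # hypothetical host
API_KEY = "your-api-key"                # hypothetical credential


def build_audit_request(base_url, api_key):
    """Build (but do not send) a GET request for the audit events endpoint."""
    url = base_url.rstrip("/") + "/api/audit-events/search"
    # The header format is an assumption; confirm it in the API reference.
    headers = {"Authorization": "Apikey " + api_key}
    return urllib.request.Request(url, headers=headers, method="GET")


request = build_audit_request(BASE_URL, API_KEY)
print(request.get_method(), request.full_url)
```

To send the request, you could pass the object to urllib.request.urlopen.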
Other updates
You can now assign the Business Name generator as a sub-generator for the Regex Mask generator.
For subsetting, Graph View and the table details panel now display information about cycle breaks, when the subsetting process needs to break a circular dependency.
Databricks
Fixed an issue that prevented the use of partition filters in Databricks Unity Catalog workspaces.
File connector
When a file connector workspace is deleted, Tonic now deletes files that were uploaded from a local file system.
Spark
Fixed an issue with data generation for workspaces that use Hive.
July 21, 2023
The Admin Panel is renamed to Tonic Settings.
Tonic now provides a more meaningful error when Preserve Destination mode is assigned to a table in a workspace that does not have a defined destination database.
Fixed an issue where Tonic opened too many connections to the application database.
Fixed an issue with timestamps in the Tonic API specification.
Fixed a Tonic Cloud issue where using a different email domain to update a Tonic license caused issues with Tonic logging.
Enhanced the performance of the HIPAA Address generator.
The Data Pipeline V2 data generation process now respects the TONIC_PROCESS_PARALLELISM environment variable.
Improved performance for subsetting, especially for data that contains a large number of foreign key relationships.
Made a small performance improvement to primary key generators.
File connector
Improved error messages when uploading files for the file connector.
The file connector now supports extended ASCII-encoded files.
Fixed an issue where the file preview omitted the first row of the file.
MongoDB
On the Jobs view, you can now filter for the Collection Schema Scan job type.
Fixed an issue where sensitivity scans failed on large collections.
Oracle
Tonic now provides a more meaningful error when it loads Table View for a table that the database account does not have access to query data from.
PostgreSQL
Corrected the job history entries for subsetting jobs that run using the Data Pipeline V2 process.
Spark
Fixed an issue that caused jobs to fail when an invalid Repartition or Coalesce value was specified.
July 14, 2023
Tonic Data Pipeline V2 for PostgreSQL ends beta
During the first half of 2023, Tonic ran a beta program for PostgreSQL for our new Data Pipeline V2. The beta program is now ending. Thank you to all of those who provided feedback.
Starting with version V905, Tonic.ai will progressively enable Data Pipeline V2 for all customers. To ensure a smooth transition for all our PostgreSQL customers, Tonic.ai controls the rollout remotely.
The remote rollout mechanism is controlled by an HTTPS request from your instance to https://feature-flag.tonic.ai. A JSON payload with a unique identifier for your deployment is sent, and the status of Data Pipeline V2 is returned. This request happens at the start of each data generation. If your Tonic server cannot reach https://feature-flag.tonic.ai, then the check is skipped.
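The check described above can be sketched in Python as follows. The payload and response field names (deploymentId, dataPipelineV2) and the function names are hypothetical. Only the service URL and the skip-when-unreachable behavior come from the description.

```python
import json


def data_pipeline_v2_enabled(deployment_id, post_json):
    """Return the rollout status, or None when the check is skipped.

    post_json is a caller-supplied function that sends the payload to the
    feature flag service and returns the decoded JSON response.
    """
    payload = {"deploymentId": deployment_id}  # hypothetical field name
    try:
        response = post_json("https://feature-flag.tonic.ai", json.dumps(payload))
    except OSError:
        # The service could not be reached, so the check is skipped.
        return None
    return bool(response.get("dataPipelineV2"))  # hypothetical field name


def unreachable(url, body):
    """Simulates a server that cannot reach the feature flag service."""
    raise ConnectionError("no route to host")


def enrolled(url, body):
    """Simulates a response for an enrolled deployment."""
    return {"dataPipelineV2": True}


print(data_pipeline_v2_enabled("deployment-123", unreachable))  # None
print(data_pipeline_v2_enabled("deployment-123", enrolled))     # True
```

Passing the transport in as a function keeps the sketch runnable without network access.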
What to expect for the enrollment:
Before your instance is enrolled in Data Pipeline V2, Tonic Customer Success will contact you to confirm your enrollment.
After your instance is enrolled, the Data Pipeline V2 toggle on the Confirm Generation panel is removed. All PostgreSQL jobs run using Data Pipeline V2.
For jobs that run on V2:
The job type is Data Pipeline Generation instead of Data Generation.
Jobs should run faster. Data Pipeline V2 has better resource handling, and can provide more parallel execution, especially for large tables and when it applies constraints. Not all jobs will be faster, but for some jobs there should be a significant improvement.
We will continue to improve Data Pipeline V2 as we expand coverage to other data connectors.
Expanded Graph View for subsetting
The subsetting Graph View is expanded to use more of the available screen space. The Configure Subset panel, which includes the Options and Latest Results tabs, no longer displays on Graph View. It only displays on Table View.
Other updates
Fixed an issue where when a data generation job failed, tables that used Preserve Destination table mode were not restored.
The generated Tonic API documentation now includes the endpoints for managing file groups for file connector workspaces.
Fixed an issue that caused jobs for some workspaces to fail with the exception "Cannot modify workspace whose schema is not the latest version".
Fixed an issue where the job details view displayed incorrect information.
Updated the /api/DataSource endpoint to not contain secure data such as the API key.
Updated our OpenAPI documents to ensure that all values of operationId are unique.
Improved error messages for failed data generation.
Databricks
The Test Connection button on the workspace details view now works correctly.
MongoDB
Fixed an issue where schema scans timed out even when a timeout was not configured.
Fixed an issue where MongoDB workspaces did not support MongoDB versions below v4.4.
MySQL
Tonic data generation now supports generated columns in de-identified tables.
Oracle
To improve the resiliency of Oracle commands during data generation, fixed the retry logic.
PostgreSQL
Tonic no longer waits to process a job cancellation until after it finishes the constraint application that was in progress when the job was canceled.
Fixed an issue where Tonic returned an "insufficient space" exception when writing numeric values to a destination database.
Snowflake
Improved performance for Snowflake on Azure.
Spark
Job tracking URLs now display correctly on the job details page.
SQL Server
Fixed an issue where Tonic did not retrieve all of the compound keys from the source data.
April 8th, 2021
Features
Add environment variable for altering MySQL batch size during CopyRows
Ability to search label on Foreign Key viewer
CSV Generator support for tab/multi delimiters
Better performance tracking in the UI for upstream and downstream tables
Sunset Classic Subsetting in favor of the now default Full algorithm
Bug Fixes
Fix auto_increment issue on bigint columns
Don't fail if foreign key column has been removed
Fix Auto Increment on MySQL when dropping indexes
Better support for JSON during MySQL generation
March 31st, 2021
Features
Support added for non-primary key auto-incrementing columns
MySQL no longer requires locks on the source DB during generations
Generators are now easier to remove in the UI
Reference tables can now be defined in subsetting configurations
Enhancements to the Foreign Key UI
Added a clickable link to the Job Start notification
New Character Separated Value generator added
Bug Fixes
Performance optimizations for the Address generator
Improved UI experience for workspaces with thousands of tables
Optimized memory usage for large workspaces during data generation jobs
March 12th, 2021
Features
Subsetting upstream has better handling for table relations with multiple constraint groups where one of the constraint groups is often null
Differential Privacy is now available for Continuous generator
Added new Tonic logos
Generators are not allowed on foreign key columns that are also primary keys
Subsetting is now functional when Primary Key Generators are applied to primary keys
Bug Fixes
Resolved data upload for MySQL
Oracle 19 helper improvements
HIPAA address generator fix for zip codes
Refresh data table after Foreign Keys update
March 3rd, 2021
Features
Password reset functionality added
Search and Sort columns when adding Foreign keys in Tonic
Nullable Foreign keys are now an option when adding them in Tonic
New Date Shift generator added
Foreign Keys are now sorted alphabetically by default
Ability to set a starting point in the sequential integer generator
Bug Fixes
Fix in how we handle multi-column primary keys in subsetting
@ mentioning style improvements for Commenting
UI fixes for Foreign Keys section
Enhancements to statistics generation with conditional + categorical generators
Copying Databricks workspace works as expected
February 17th, 2021
Features
Support for Db2 LUW added
New option to set strictness for schema changes
Added support for pg_repack extension
Added support for more Key generators in Spark
Added support for tinyint and smallint data types in subsetting
Bug Fixes
Better cache handling for subsetting
Improvements for NULL checking in Consistency On
Fixed issues in Subsetting Preview
Reduced collisions in Unique Email generator
Workers no longer crash when unable to obtain a queued job
Empty Post Job Scripts now throw a warning rather than failing the job
Better log messages when primary key generators fail
Fixed arithmetic overflow error when calculating SQL Server database size
Fixed upstream exhaustion in subsetting
February 3rd, 2021
Features
Protection Audit trail now logs enable/disable of Differential Privacy toggle
Support for Memory Optimized tables in SQL Server
Ability to add new relationships in the Foreign Keys section
NULL generator can now be used on columns with uniqueness constraints
New Subsetting option: Full Algorithm (default is still Classic Algorithm)
Bug Fixes
Fixed issue with retrieving column names when adding FK in Foreign Keys section
Resolved conflict when setting generator to consistent on a column with a constant generator
Switching between email generator and unique email generator now clears state as expected
MySQL point columns no longer halt generation
January 22nd, 2021
Features
New UUID Key generator
Support for PostgreSQL Client keys
Options to preserve OUIs and Uniqueness in MAC address generator
New HIPAA Address generator
Email generator can be used on synthesized tables
Custom Categorical generators can now be linked
Workspace ID copy button added
Column output data can now be made consistent on another column's data
Spark filtering of tables in Databricks
Improved logging
Bug Fixes
Better handling of times without timezone values in Postgres
UI fixes
Subscriptions and Publications no longer copied in Postgres
December 23rd, 2020
Features
Improved performance of synthesized tables
Added new Foreign Key Viewer + Remover
Bug Fixes
Fixed issue with display of password reset errors
Fixed rendering issue with boundary of some popovers
December 11th, 2020
Features
API endpoint test_destination_db_connection added
User can now filter tables by schema in the Database View
Support for Hive + Spark datasource added
Support for synthesis on datetime primary keys
Improvements/Bug Fixes
Improved error messages displayed in the UI
Logging improvements
Synthesis mode improvements
Fixed UI issue with First connection wizard
February 15, 2022
Features
Relax unique index constraint on username
Added tonic version in containers as an environment variable
Adding logic to bypass Fallback generators for Spark databases
Support case insensitive name consistency
Disable dropping replacements on missing columns for mongo
Update Oracle driver
Enable additive noise generation for strings in mongo
Add support for unsigned 64-bit integer columns in integer key generator
Add random timestamp and timestamp shift generators as options for integer columns (for unix timestamp columns)
Add checks for min/maxTime on timestamp and min/maxDate on dates
Adding null ref check for fallback generation option check during preview
Revamped workspace configuration screen
Support for Decimal on Spark (when using C# UDFs)
Allow Regex Generator on PKs
Added new environment variable to filter tables ingested by UI for Oracle
Misc perf improvements
Bug fixes
Fix mongo view more values issues
Disable dropping replacements on missing columns for mongo
Fix Workspace specified via API without DBs can be modified in UI
Fixing Random Timestamp generator on Java
Date truncation generator bug fixes
February 8, 2022
Features
Better error exception handling for number conversions in Mongo
Added loading indicator when viewing single Mongo document
Improved Mongo performance and logging
Added table filter validation for EMR
Handling numbers as doubles instead of integers in Snowflake
Added aiven.io system schema to Postgres
Added the Geo Generator for latitude and longitude data
Added support for materialized view in Redshift
Bug fixes
Fixed array regex mask generator with correct subgenerator options
Fixed conditional generator by adding binding
February 4, 2022
Features
Updated to dotnet 6
Moved FK import to FK page
Added support for tinyint in MySQL as a primary key
Added table partition filtering for EMR
Enabled Regex Mask generator on EMR
Added support for decimals with precision and scale for EMR
Moved Tonic version number to hamburger menu
Added job sorting by start time
Ability to retrieve Kubernetes logs via API
Improved behavior of upstream reference tables for subsetting
Improved logging for subset filtering
Updated sensitivity scan to better reflect job status
Improved privacy scan performance for Mongo
Introduced automated schema checks (not for Mongo)
Added HTML Mask generator
Added Additive Noise Generator
Added CNPJ generator
Added Array Regex Mask generator
Deprecated endpoints
Updates to the foreign key feature impact the API endpoints used to fetch, upload, and delete the Foreign Key file. Two previous API endpoints are now marked as deprecated:
DELETE /api/datasource/delete_fkupload has been marked as deprecated and replaced by DELETE /api/workspace/{workspaceId}/foreign-key/
GET /api/datasource/download_fkupload/ has been marked as deprecated and replaced by GET /api/workspace/{workspaceId}/foreign-key/
The main change between both APIs is the switch to using workspaceId to fetch the Foreign Key file instead of the datasourceId.
Both are slated to be removed after 2022-06-01.
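For scripts that still call the deprecated endpoints, a small lookup like the following Python sketch could help with the migration. The paths come from the deprecation notice above. The helper name and the example workspace ID are hypothetical.

```python
# Replacement endpoints, taken from the deprecation notice above.
REPLACEMENTS = {
    ("DELETE", "/api/datasource/delete_fkupload"):
        "/api/workspace/{workspaceId}/foreign-key/",
    ("GET", "/api/datasource/download_fkupload/"):
        "/api/workspace/{workspaceId}/foreign-key/",
}


def replacement_path(method, deprecated_path, workspace_id):
    """Return the replacement path for a deprecated foreign key endpoint."""
    template = REPLACEMENTS[(method.upper(), deprecated_path)]
    return template.format(workspaceId=workspace_id)


print(replacement_path("get", "/api/datasource/download_fkupload/", "abc123"))
# /api/workspace/abc123/foreign-key/
```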
Bug fixes
Fix number parsing issues in JSON mask generator
Fix MySQL parsing of auto_increment columns
Eliminated re-querying of data type on EMR Spark
Fixed issue viewing Mongo workspace when there's an empty collection
Fix for TruncatedDateGenerator when used as SubGenerator
Fixed data passed to SubGenerator in RegexMaskGenerator
January 21, 2022
Features
Show tooltip for sidebar menu items when collapsed
Allow clicking of Queue Status and Download Job logs buttons on Jobs page
Added privacy scans to Jobs page
Added quick navigation to Jobs page from Workspaces tables
Added support for random timestamp generator for EMR
Improved Hstore generation performance
Added job-specific URLs
Webhooks are triggered when jobs are manually cancelled
Added warning for missing use statements in MySQL post-job scripts
Make JSON mask generator respect ordering with multiple subgenerators
Bug fixes
Fix Postgres binary timeout issues
Improved error handling & logging when applying Foreign Keys on subsetted SQL Server
Filter out constraints with same name as foreign key
January 7, 2022
Features
Added Privacy Report CSV download for Preview (to Privacy Hub) and Generation Report (to Job Details).
Added support for preserve destination in Oracle.
Added support for temporal tables on SQL Server.
Added support for read parallelism via paginated queries.
Updated fallback generators to filter by column details and to remove text-mask generator.
Disable page parallelism when pageSize too small.
Add styles for signup divider.
Bug Fixes
Fix AuroraMySql hang.
Clarify Subsetting UI.
SQL Server Constraint Performance Improvements.
Fixes for PII Scan.
Fix Jobs Foreign Key Migration Issues.
Log missing env var once per process.
Improvements to JSON Mask Performance using JToken creation to improve data type handling.
Text mask performance improvements.
Fix for Polling TS error.
Releases V849 through V854 were removed from quay because of a regression that was fixed in later releases.
June 2, 2023
Removed caching of AWS credentials.
For the beta Data Pipeline V2 job processing (available for PostgreSQL only):
When a job fails, Tonic no longer tries to fall back to the current job processing. In most cases, jobs fail for reasons that are not connected to the processing type. Falling back to the current job processing is not effective.
Improved performance for subsetting.
Adjusted the logging level for telemetry-related log messages to DEBUG.
On the Foreign Keys view, when you filter the keys, click the select all option, and then clear the filter, only the matching keys are selected. Previously, the select all option always selected all of the keys.
Fixed an issue where CSV files could not be uploaded to data science mode workspaces.
You can now configure parallelism for sensitivity scans. For relational databases, you use the environment variable TONIC_PII_SCAN_PARALLELISM_RDMBS, and the default is 4. For document-based databases, you use the environment variable TONIC_PII_SCAN_PARALLELISM_DOCUMENTDB, and the default is 1.
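The way these two settings and their defaults fit together can be sketched in Python. The variable names are spelled exactly as they appear in this entry, and the defaults are the documented ones. The lookup function is a hypothetical illustration, not how Tonic reads its configuration.

```python
import os

# Documented defaults for sensitivity scan parallelism.
DEFAULTS = {
    "TONIC_PII_SCAN_PARALLELISM_RDMBS": 4,       # relational databases
    "TONIC_PII_SCAN_PARALLELISM_DOCUMENTDB": 1,  # document-based databases
}


def scan_parallelism(name, environ=None):
    """Return the configured parallelism, or the default when unset."""
    environ = os.environ if environ is None else environ
    raw = environ.get(name)
    return DEFAULTS[name] if raw is None else int(raw)


print(scan_parallelism("TONIC_PII_SCAN_PARALLELISM_RDMBS", {}))  # 4
print(scan_parallelism(
    "TONIC_PII_SCAN_PARALLELISM_DOCUMENTDB",
    {"TONIC_PII_SCAN_PARALLELISM_DOCUMENTDB": "2"},
))  # 2
```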
On the table configuration panel for subsetting, Tonic no longer displays a count of post-subset rows before a subset is generated.
Fixed an issue where subsetting data from one workspace appeared in a different workspace.
You can now use UUID columns in the conditions for the Conditional generator.
Fixed an issue where when you deleted a linked column from a column configuration panel, the other linked columns were deleted.
The Timestamp Shift generator can now be assigned to columns where the values use the date format MMddyyyy.
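To show what values in the MMddyyyy format look like, the following Python sketch parses such a value, shifts it, and writes it back in the same format. The fixed shift in days is an assumption made to keep the example deterministic. It does not reproduce how the Timestamp Shift generator chooses its shift.

```python
from datetime import datetime, timedelta

PYTHON_FORMAT = "%m%d%Y"  # Python equivalent of MMddyyyy


def shift_date(value, days):
    """Shift an MMddyyyy value by a number of days, preserving the format."""
    parsed = datetime.strptime(value, PYTHON_FORMAT)
    return (parsed + timedelta(days=days)).strftime(PYTHON_FORMAT)


print(shift_date("06022023", 3))  # 06052023
print(shift_date("12312023", 1))  # 01012024
```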
Fixed a regression that caused NPGSQL logging to occur even when it was disabled.
Google BigQuery
Improved error handling when rows contain invalid data. Tonic now provides a method to look at the data that caused the error. Fixed the handling of rows to prevent errors on certain data.
Fixed how we create views in the destination database.
MongoDB
Fixed an issue where Tonic was unable to get the schema for the source data.
Oracle
Fixed an issue where retries of Oracle commands from transient errors failed.
Snowflake
Improved resource handling during data generation in order to enable parallel processing.
Snowflake on Azure no longer requires CREATE SCHEMA permissions in order to support Preserve Destination mode for tables. Snowflake on AWS with Lambda processing continues to require CREATE SCHEMA permissions in order to support Preserve Destination mode.
When Tonic is unable to create a view in the destination database, it now returns a warning instead of an error.
Implemented a more accurate method to detect hexadecimal values.
SQL Server
Tonic now displays the correct generators for columns that are part of a composite unique index.
May 26, 2023
Fixed an issue on Table View where users could not use the delete icon to remove a generator assignment for a linked column.
Fixed an issue where the job details view for a subsetting job did not always show all of the steps as completed.
Updated the version of pytorch, which is used for data science mode. This new version addresses some security vulnerabilities.
Fixed an issue where when the Tonic server was air gapped, the Admin Panel did not correctly display the current Tonic version.
Fixed an issue where jobs took longer than expected to complete.
Fixed an issue where the subsetting Graph View did not show how table participation in the subset changed since the most recent subsetting data generation.
Fixed an issue where after a single failure to write logs, the Download Job Logs feature stopped refreshing the logs. Tonic now continues to try to upload logs.
Google BigQuery
Fixed an issue with the test connection function for the destination database.
Materialized views and routines from the source database are now copied to the destination database.
MongoDB
Improved performance for sensitivity scans.
MySQL
Fixed an issue where subsetting failed because of a data type mismatch between a primary key and a foreign key.
Oracle
When TONIC_ORACLE_SKIP_CREATE_DB = true, foreign keys are now correctly enabled on the destination database.
PostgreSQL
Fixed an issue where the source database permissions check provided a false error about insufficient privileges for sequences.
For the beta Data Pipeline V2 data generation process, fixed an issue where the data generation process continued even after the job failed.
Snowflake
Fixed an issue where the presence of comments caused data generation to fail.
Improved performance for sensitivity scans.
SQL Server
Added support for user-defined types.
May 19, 2023
Generator presets are now supported for Enterprise licenses on Tonic Cloud.
For Tonic data encryption, fixed an issue where the previous encryption key environment variable value was saved in the application database, which caused Tonic to use those values even after they were removed.
The Tonic diagnostic logs now include the Tonic worker ID.
Fixed an issue where the Tonic web server would not launch unless the Tonic application database used PostgreSQL v13 or later.
For data connectors other than MongoDB, the sensitivity scan is now parallelized.
Google BigQuery
Data generation now works correctly when the region that hosts Google BigQuery for the destination database is different from the region for the source database.
Fixed an issue where failed data generation jobs were incorrectly reported as successful.
Tonic now handles the TIME data type correctly.
MySQL
For Incremental mode, fixed an issue where the values of timestamp columns on modified rows were not updated in the destination from the source.
Oracle
When TONIC_ORACLE_SKIP_CREATE_DB=true, fixed an issue where the truncation of tables violated foreign key dependencies, which caused jobs to fail.
Added an option to enable the TCPS protocol for Oracle database connections. Previously, only TCP was supported. If you enable TCPS, you must also provide a wallet file.
Tonic now cleans up temporary destination database tables that were created during subsetting data generation.
PostgreSQL
For Incremental mode, fixed an issue where the values of timestamp columns on modified rows were not updated in the destination from the source.
During data generation, Tonic now warns users when an extension that the destination database needs is unavailable for installation.
Snowflake
For both Snowflake on AWS and Snowflake on Azure, you can now configure workspaces to limit the schemas to include.
May 12, 2023
Fixed an issue where users could only select generators that supported uniqueness constraints for columns that were not unique.
Fixed an issue where admin users who did not have edit permissions on any workspaces could not edit presets from the Generator Presets view.
Improved data generation resiliency against transient failures.
Removed erroneous error messages.
To add AWS credentials to containers, you can now mount to ~/.aws/credentials.
Improved error messaging for Table View.
Fixed a display issue where the column configuration panel was too narrow and required horizontal scrolling.
Exporting or copying a workspace no longer requires the workspace to have a valid source database connection.
Reduced the amount of memory needed to run the Tonic web server.
MongoDB
Better handling of errors that involve invalid UUIDs.
Oracle
Updated the required permissions for destination database connections. If SELECT ANY DICTIONARY or SELECT_CATALOG_ROLE cannot be granted, then Tonic can use a selection of ALL_ views (not recommended).
If TONIC_ORACLE_SKIP_CREATE_DB=true, then external tables are now excluded from the table list in Tonic. Tonic does not process those tables.
PostgreSQL
Fixed an issue where the Data Pipeline V2 flow would hang.
Fixed an issue where extensions such as pgcrypto were not transferred when data generation included schema filtering.
Improved performance when handling constraints.
Snowflake on AWS
As of V823, you can choose whether to use the Lambda process for data generation, which was previously the only option. By default, Snowflake on AWS uses a new, more resilient data generation process. You only need to use the Lambda data generation process for extremely large volumes of data (hundreds of gigabytes to terabytes). For existing workspaces, for versions before V826, the new default process is used. To use the Lambda data generation process, you must update your workspace configuration. As of V826, existing workspaces use the Lambda data generation process.
For the temporary CSV files used to retrieve and write source and destination data, you can now specify to use an external stage instead of an S3 bucket. The option to use an external stage is not available when you use the Lambda data generation process.
You can now specify different file storage locations for the temporary source and destination data files. In other words, you can have different S3 buckets or different external stages. Note that this option is not available when you use the Lambda data generation process.
For the new data generation process, fixed an issue where data generation jobs would hang instead of failing.
Snowflake on Azure
Before it runs a data generation, Tonic now verifies that there is a valid value for the Azure Blob Storage account key, which is set as the value of the environment variable TONIC_AZURE_BLOB_STORAGE_ACCOUNT_KEY.
Fixed an issue where data generation jobs would hang instead of failing.
May 5, 2023
For Tonic data encryption, Tonic now only verifies the key for the enabled process. If you only enable decryption, then Tonic only verifies the value of TONIC_DATA_DECRYPTION_KEY. If you only enable encryption, then Tonic only verifies the value of TONIC_DATA_ENCRYPTION_KEY.
Upgraded our Docker images from Ubuntu 20 to Ubuntu 22.04.
Updated to ensure that the Tonic URL reflects the currently active workspace.
Recently started jobs no longer display a start time that occurred several years ago.
On the Data Encryption tab, the option to provide custom initialization vectors is now a toggle instead of radio buttons.
Resolved an issue where Tonic took an extremely long time to load.
Oracle
Reduced the permissions required to test database connections.
Changed the required permissions to better support when TONIC_ORACLE_DBLINK_ENABLED is false. For the source database user, you can either grant SELECT ANY DICTIONARY, grant SELECT_CATALOG_ROLE, or (not recommended) grant access to the ALL_* views.
PostgreSQL
Improved the error messaging when testing the connection to the destination database.
Snowflake
You can now use a connection string to connect to the source and destination databases. Also added support for proxy connections.
April 28, 2023
Releases 805 and 806 were removed from quay because of a regression that was fixed in later releases. The issue caused performance degradation in data generation.
A copied workspace now includes manual sensitivity designations. A manual sensitivity designation is when you change the sensitivity designation that was assigned by the sensitivity scan to either sensitive or not sensitive.
When the configured encryption or decryption key is not valid - for example, the key is not configured or uses the incorrect size - then Tonic does not allow you to configure Tonic data encryption.
Tonic Cloud now correctly enforces the supported data connectors.
Improved error messaging when subsetting data generation fails because the generator cannot be used with subsetting.
When you change the type of Tonic data encryption (decryption, encryption, or both), Tonic no longer clears the decryption and encryption text fields.
MongoDB
Fixed an issue where workspaces that contained views did not load.
Oracle
Fixed an issue where the default Oracle NUMBER type was not compatible with the Integer Key Generator.
Removed an invalid error that was returned when users tested data connections.
PostgreSQL
For the beta Data Pipeline V2 generation process, improved the error logic to prevent jobs from hanging when errors occur.
SQL Server
Improved the resilience of data generation to transient failures.
Oct 29, 2019
Features
Support for JSON datatype in MySQL
More easily remove a generator from a column in the bulk view
Improved PII detection
Ability to change table modifiers on Database View
Subsetting preview of included and excluded tables
Performance
Increase speed of restoring indexes for MySQL
Bug Fixes
Fixed bad timestamp for jobs when still in queued state
Fixed bug found when using some versions of MySQL where dropping an index on a FK column results in an error
Fixed description for algebraic generator
Resolves an issue that was causing the job details page to fail to load.
Fixed MySQL failing to create foreign keys because the db/schema wasn't specified in all cases
More logging
Oct 14, 2019
Features
XML generator - use XPath to target one or more values for a generator. This implementation is very similar to the current JSON generator
Partitioning is out of beta. The Continuous and Event generators can be partitioned by either a Categorical or Passthrough generator
Support for computed columns in SQL Server
Re-sizing of autodetect and bulk view table trees, for long table names
Schema change notifications now use small icons
The active page is now highlighted on the sidebar, and the sidebar state is remembered
Check for correct user permissions on source database for SQL Server when first connecting (MySQL and Postgres already have this)
API responses are now compressed with gzip or Brotli (br)
Performance
Faster Schema generation for SQL Server
Only SELECT required columns when gathering rows for data generation
Improved UI perf for SQL Server databases with a large number of objects
Bug Fixes
Properly send ALL logs to the log viewer in the Jobs UI
Continue when (most) errors occur while running pre-data script on SQL Server
Fixed issue blocking successful CSV data generation
No longer incorrectly filter out tables in the DBO schema in SQL Server
Improved logging around some SMO tasks for SQL Server
Sep 20, 2019
Features
New UI and Schema View
Ability to generate in place (MySQL only)
Support consistency across databases and runs via the new TONIC_STATISTICS_SEED env variable
Performance
Many MySQL performance improvements
Bug Fixes
Fixed potential deadlock issue in MySQL
No longer re-loads session when fetching a new refresh token
Better handling of comments inside JSON blobs that are being masked
Sep 5, 2019
Features
Auto-detect now supports booleans
Users can now click on data cells to see large text strings
PII reports
Performance
Additional multi-threading added in various places for data generation
Process generated tables before passthrough to encourage fast failure
New environment variables so users can change the connection timeout, and min and max pool size
Bug Fixes
Support for Constant Generator on MySQL Blob columns
Support for the MySQL set datatype
Improved logging for PII detection
Multiple fixes involving switching between synthesize, excluded, and masked
Progress tracking added to BigQuery
POST to /api/autodetect/config will generate a default config first if one does not exist.
Multiple other small fixes
Features
PII Detection is now available locally
Column widths on the table are now saved between sessions
Truncating tables is now prevented when it will lead to FK violations when generating data
Ability to more quickly add JSONPaths
Schema Diff now auto-fixes issues with model due to schema changes
Performance improvements to data generation, and JSON Generator in particular which is now multi-threaded
Bug Fixes
Ability to handle NULL database values in the JSON Mask Generator
JSON Paths that start with the same prefix, e.g. $.value and $.value1 no longer cause an error
No longer cutting off list of tables in Auto Detect dialog
Better clean up of excluded_tables and scaled_tables in Fingerprint when tables are dropped from DB
Fixed issue preventing data generation on Postgres Standby DBs (i.e. Read replicas in RDS nomenclature)
Features
Can now hit 'Enter' on any input in the Database connection form and it will submit the form
Bug Fixes
Better layout for JSON Paths with long names
Fixed a 'missing key' prop issue in front end
Fixed an issue where too many escapes were added to MySQL data when writing to CSV (prior to upload).
We now automatically truncate MySQL tables before data generation begins
Features
JSON generator now supports consistency
Better JSON Mask UI
Name Generator now has consistency option
Can run auto detect without sending logs
API Documentation
Better tracking of Allos Console logs
Better progress tracking
Bug Fixes
Intermittent failures of data table are fixed
GetAllTablesAndColumns no longer fails when table has columns with multiple constraints on them
The front end no longer crashes when a subset target table no longer exists
Use Custom Data button location has been fixed
Performance improvements
Fixed a myriad of issues in the First Connection Experience when using a custom data source
Additional fixes for the JSON Mask Generator
Fixed issue created in v16 where changing a table's mode to 'TRUNCATE' wasn't being saved properly
Features
Insight into whether database generation job is running or queued (was not distinguished in prior versions)
Ability to cancel a currently running database generation job
Ability to assign generators by jsonpath for the JSON Mask Generator
Tonic now prevents users from entering identical source and destination database information
When connecting to a database Tonic now defaults the Port to the standard port for the database selected.
Tonic now checks if the db account has necessary permissions and warns if it does not.
Subset button in header now reflects the state of your subset configuration
Google BigQuery now allows user to specify an input and output Dataset for generation.
Bug Fixes
Better escaping of schema and table names in MySQL
Workspace edit dialog file upload inputs (Foreign Key upload and BigQuery Service Account upload) were out of sync
No longer show stale data in the table UI when switching between tables quickly
We were not properly handling queries to pg_catalog tables where columns had recently been dropped from tables
Fix to allow synthesizing on MySQL tables that have unusual characters in the table name
Fix error message which occurs when user checks permissions on a Postgres database with 0 tables
Fix to properly keep in sync the source and destination database names in SQL Server
Constant no longer appears twice in the generator list for JSON columns in Postgres
Table search dropdown now clears search query once user hits 'enter' or closes dialog
CreatedDate in allos_db tables now has correct values by default
Ability to Filter data table based on SQL WHERE clauses
Support for Google BigQuery
Continuous generator now supports nullable columns
Remember last visited Workspace and Table and navigate directly to them when reloading page
Added on-premises installation option of one Docker image
FK Columns are greyed out in UI
And several additional bug fixes
note: we've transitioned from major.minor.hotfix to simple integer based versioning
Features
Autodetect Generators (beta) - this feature scans the source dataset and, by analyzing both the structural properties (data types, column names, foreign key constraints) and the content of each data field, takes a first pass at picking generators and linking generators.
Ability to define custom foreign key relationships, this addresses the issue where a database doesn't have any FK constraints or it's missing some
Algebraic Generator - when you link 3 or more columns (A, B, C, ...) with the algebraic generator applied it will search the space of functions (A + B = C, A / B = C, ... ) to find the function that best describes the algebraic relationship between the linked columns.
We now support ability to specify via an ENV variable which schemas to include
Email generator now supports a custom email domain
Renamed Gaussian generator to Continuous
Added TIN generator
First two digits are always 00
Guaranteed uniqueness. We use each of the 10M possibilities once and only once except when needing to be consistent (see above). This generator cannot be used in tables with more than 10M unique rows
Format preserving. If the cell value uses a hyphen we add a hyphen, otherwise we do not.
Upgraded pg_dump to v11
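The function search described above for the Algebraic Generator can be pictured with a small Python sketch. It is only an illustration under stated assumptions, not Tonic's implementation: it is limited to exactly three columns, it tries only the four basic binary operations, and it scores each candidate by total absolute error. The names CANDIDATES and best_relationship are hypothetical.

```python
from itertools import permutations
import operator

# Candidate binary operations to try. This set is an assumption made for the
# example; the search space that the product actually uses is not documented here.
CANDIDATES = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}


def best_relationship(rows):
    """Return (description, error) for the candidate "X op Y = Z" with the lowest error.

    rows is a list of dicts that map three column names to numbers.
    The error is the sum of absolute differences over all rows.
    """
    columns = sorted(rows[0])
    best = None
    for x, y, z in permutations(columns, 3):
        for symbol, op in CANDIDATES.items():
            try:
                error = sum(abs(op(r[x], r[y]) - r[z]) for r in rows)
            except ZeroDivisionError:
                continue  # skip candidates that divide by zero on this data
            # Strict comparison keeps the first candidate found when errors tie.
            if best is None or error < best[1]:
                best = (f"{x} {symbol} {y} = {z}", error)
    return best


rows = [
    {"A": 2.0, "B": 3.0, "C": 6.0},
    {"A": 4.0, "B": 5.0, "C": 20.0},
    {"A": 1.5, "B": 2.0, "C": 3.0},
]
description, error = best_relationship(rows)
print(description, error)
```

For this sample data, the product relationship between A and B fits every row exactly, so it is the candidate that the sketch reports.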
October 14, 2022
Updates
Improved the performance of Table View, particularly when scrolling.
Reworked data generation to better group generator errors.
On the Subsetting view:
Improved accuracy of the row counts for small tables.
The table list shows whether each table is in the subset and the number of pre-subset and post-subset rows even when subsetting is not enabled for data generation.
The health check for the PyML container can now use HTTP instead of HTTPS.
MongoDB
Collection View now updates after the first scan without having to be refreshed.
MySQL
Improved performance when loading workspaces.
Oracle
Fix to improve query performance.
Added permission checks when testing database connections.
PostgreSQL
Tonic now applies the source database check constraints on the destination database after data generation is complete.
Snowflake
Fixed an issue where data generation failed when there were tables that used Preserve Destination table mode.
October 7, 2022
Updates
Reverted an update from v580 that caused slow performance when retrieving the list of tables for a workspace. In some cases, this caused Tonic to indicate that no tables were available.
Fixed an issue where custom value processors interfered with the display of configuration options for the SSN generator.
Fixed an issue where the dropdown lists to select a column generator periodically scrolled to the top of the screen.
Improved performance when navigating among Tonic views.
Improved the list of suggested sub-generators for the Conditional generator.
Improved the data generation process to prevent jobs from hanging when an error occurs.
Fixed an issue where the post-subset generation row count always returned 0.
Updated the Tonic logging framework.
Google BigQuery
Improved handling of timestamp and datetime fields.
MongoDB
Fixed an issue where workspaces crashed when running data generation on new collections.
MySQL
Corrected the handling of the BIT data type.
Improved performance when loading workspaces.
Oracle
The Constant generator now works correctly for RAW columns.
Stopped using source database constraints that were in the recycle bin to populate the destination database.
Improved performance when retrieving schema information and loading workspaces.
Added the OracleDriverAnalyzer tool to analyze Oracle performance.
PostgreSQL
Display an error when an excluded or included schema does not exist in the source database.
Fixed a permissions error that occurred when creating a new workspace using Azure PostgreSQL for the destination database.
Spark
Fixed an issue with the workspace configuration view that prevented users from creating and updating Spark-based workspaces.
SQL Server
Fixed an issue that caused a data generation error when XML indexes had dependencies on other indexes.
Upgraded SqlClient to 5.0.0.
September 30, 2022
Enhancements
Complete list of blocking issues for data generation - When data generation is blocked, the generation panel now displays all of the blocking issues. This allows you to correct all of the blocking issues before you attempt to run data generation again.
Other updates
Made some visual updates to the Tonic navigation pane and the Tonic login panel.
On the Job History page, the details popover for queued jobs now points to the correct job.
In Database View, the generator list for the Applied Generators filter is now correctly alphabetized.
Fixed an issue where the dropdown arrow for the table mode selector was not always clickable.
When importing a workspace, Tonic now validates that columns do not have multiple generators assigned to them.
Updated to provide clearer error messages when there is an issue with an assigned sub-generator.
Fixed an issue where Privacy Hub sometimes did not reload after a new sensitivity scan.
The database type filter now includes all of the available Spark database types.
Corrected the link to the Tonic privacy policy.
Fixed an issue where jobs failed when multiple tables with the same name in different schemas were assigned Preserve Destination table mode.
Timestamp Shift is now the recommended generator for Date and Timestamp columns.
Corrected the display of available buttons on the Tonic application.
MongoDB
Improved error handling for database summary queries.
Added a warning about subsetting performance for percentage subsetting target tables on MongoDB versions before 4.4.2.
Fixed a memory usage issue with loading Collection View when the data contains large arrays.
Oracle
A new environment variable, TONIC_ORACLE_DATA_PUMP_PARALLELISM, allows you to choose the maximum number of threads for parallelization for Oracle Data Pump.
Optimized queries for better performance.
Spark
Added the ability to specify the proxy user when using Livy.
Fixed an issue where users could change the configuration type when editing a Spark workspace.
September 23, 2022
Enhancements
When the job began and ended.
The percent reduction from the original source data to the subset destination data.
The percentage of the source data that is included in the subset destination data.
The volume of data in the source data and the subset destination data.
Other updates
Fixed an issue that prevented users from deleting more than one tag from the Edit Workspace view.
Added the ability to run Tonic workers, the Tonic web server, and Tonic notifications on Heroku.
Updates to improve handling of canceled jobs, both when users cancel jobs and when jobs fail.
Improvements to data generation memory handling and performance.
Improved the Synthesis Report for AI Synthesizer.
Fixed an issue where adding constraints to a destination database resulted in deadlocks.
Oracle
Update to allow the JSON Mask generator to be used on user-defined types (UDTs).
PostgreSQL
Fixed a regression that caused sequence fetching errors in v9.6 and earlier.
Spark
Improved performance for the Continuous generator.
Enabled partition filter validation on Hive.
SQL Server
Added support for full text catalogs in Server Management Objects (SMOs).
Fixed an issue to enable the correct handling of schema, table, and column names that contain single quotes.
Made a fix to correctly display error messages.
September 16, 2022
Enhancements
Links from Schema Changes to Database View - Added links from Schema Changes entries to Database View. The links automatically filter Database View to only include the affected column or table. The links only display for columns and tables that are in the source database. Removed columns or tables do not have links to Database View.
Schema filtering for PostgreSQL - For PostgreSQL workspaces, when you create or edit the workspace, you can specify a list of schemas to either include or exclude from the source database.
If the setting is off, the generator uses the current process, which replaces the last two digits of the zip code with zeros. For low population areas, the zip code is all zeros.
If the setting is on, then the generator selects a real zip code that starts with the same three digits as the original zip code. For low population areas, if a state is provided in the data, the generator selects a random zip code from that state. Otherwise it selects a random zip code from the United States.
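As a rough illustration of the behavior when the setting is off, the following Python sketch replaces the last two digits of a zip code with zeros and returns all zeros for a low population area. The function name, the idea of identifying low population areas by three-digit prefix, and the sample prefix set are assumptions made for this example. They are not taken from the product.

```python
# Hypothetical sample of three-digit prefixes that are treated as low population
# areas. The list that the generator actually uses is not stated in these notes.
LOW_POPULATION_PREFIXES = {"036", "059"}


def generalize_zip(zip_code, low_population_prefixes=LOW_POPULATION_PREFIXES):
    """Sketch of the setting-off behavior for a zip code string.

    Low population areas become all zeros. Other zip codes keep their
    leading digits and have the last two digits replaced with zeros.
    """
    if zip_code[:3] in low_population_prefixes:
        return "0" * len(zip_code)
    return zip_code[:-2] + "00"


print(generalize_zip("30301"))
print(generalize_zip("03601"))
```

The sketch does not cover the setting-on behavior, which depends on a lookup of real zip codes by prefix and by state.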
Other updates
Minor updates to the Subsetting view. New icons for the subsetting summary and the inbound and outbound relationship counts. Added a Use subsetting toggle to indicate to use the subset configuration for data generation. This toggle is synchronized with the same toggle on the Confirm Generation panel.
Fixed an issue where Tonic logged users out between browser sessions more often than expected.
Improved performance for the JSON Mask generator.
Fix to ensure that the AI Synthesizer is canceled when a data generation job that includes AI Synthesizer is canceled.
Fixed an issue where a data generation job would hang if there was a failure.
Fix to address an issue where after a one-click update, the Tonic version unexpectedly regressed to an earlier version.
Fix to ensure that cross-table commands do not run in parallel, which could cause deadlocks in the database.
MongoDB
Fixed an issue that prevented the Regex Mask generator from being applied to fields in arrays.
Updated to require 3.6 as the minimum server version for MongoDB.
PostgreSQL
Sequence values are now copied over correctly.
Spark
Fixed a data generation error in Hive and Livy that occurred when a source table had non-lowercase column names.
September 9, 2022
Enhancements
Automatically resolving schema changes - Schema changes are now resolved or dismissed automatically when you update the table or column configuration.
For a new table, the schema change is dismissed when you set the table mode to Truncate or Preserve Destination.
For a new column, the schema change is dismissed when you assign a generator to the column.
For a column that has a conflicting data type or nullability change, the schema change is resolved when you assign a different generator to the column.
Other updates
Corrected the number of rows for out-of-subset tables. Tonic no longer shows 0 when out-of-subset tables are processed.
Improved error messaging when a where clause for a subset target table is invalid.
Corrected an issue where the subsetting table configuration was not handled correctly.
The job types filter on the Job History view now only shows valid types for the workspace.
Improved performance for data generation.
Improved performance for the AI Synthesizer.
Improved reporting of subsetting progress when parallelism is enabled.
MongoDB
You can now assign the Preserve Destination mode to a collection.
Subsetting and foreign key management are now enabled for all customers.
Spark
For workspaces that use Livy, you can now assign the Continuous generator and Noise Generator as sub-generators for a mask generator.
Improved performance for the Categorical generator.
Improved performance for the HIPAA Address generator.
September 2, 2022
Enhancements
Other updates
Made a couple of small improvements to the AI Synthesizer generator.
Fixed an issue where users could not change the percentage on a subsetting target table.
Improved how we sort tables for parallel processing to improve efficiency.
MongoDB
The Categorical, Current Date, Date Truncation, HIPAA Address, and Unique Email generators now work on Mongo array fields.
Linking for the Continuous generator now works correctly.
Spark
Added support for Spark 2.3 on Livy.
August 26, 2022
Enhancements
Other updates
Improved the display of the Database View advanced filter for smaller screens.
Fixed an issue with the generation of the API reference.
Began to log latencies each hour for source and destination databases.
MongoDB
In Collection View, improved the display of key columns on smaller screens.
Corrected an issue that resulted in duplicate schema change results.
Fixed the collection scan status when a new scan is started.
Single document view no longer reloads when you apply a generator.
Corrected an issue where selecting a different collection to view applied updates from the previous collection.
Fixed an issue where the Null generator could not be removed from a field.
MySQL
Fixed an issue where the Tonic application could not load when table and row size estimates were not available.
Increased the connection resiliency for write operations.
Improved handling of different character sets in MySQL.
August 19, 2022
When you create a sample workspace, it now includes a tag called Sample Tag.
Fixed an issue where the Update Tonic button was not displayed correctly.
Improved the user interface for activating new hosted accounts.
Improved message to notify users that the subset configuration changed since the last subsetting data generation.
The HIPAA Address generator now works correctly for US addresses.
MongoDB
You can now link fields in MongoDB collections.
Conditional generators can now be applied to array elements.
The Random Timestamp generator now works correctly on datetime columns.
Oracle
Data generation can now work without DBLink.
PostgreSQL
Completed additional updates to support cross-schema types.
The new Business Name generator produces realistic names of businesses or companies. The Business Name generator can be consistent with itself or with other columns. It improves on and is intended to replace the Company Name generator, which is now deprecated.
The new data connector uses Azure Blob Storage to store interim uploaded and generated files.
Assigning generators from Schema Changes view - On Schema Changes view, new columns, changes to column data type, and changes to column nullability have a Select dropdown list that includes an option to assign a different generator to the column and then resolve the issue.
Subsetting results - The Subsetting view now displays the results of the most recent subsetting data generation run. The information includes:
Schema changes filter on Database View - On Database View, the advanced filters now include an option to only display columns that have unresolved schema changes. This filter is not combined with other filters. When you filter for unresolved schema changes, the other column filters are disabled.
Larger where clause editor for subsetting - For where clause target tables, you can now display an editor with a larger text area for entering the where clause. This provides better support for longer, more complex where clauses.
Workspace inheritance - For enterprise customers, the workspace inheritance function allows you to create child workspaces that automatically inherit source data and Tonic configuration from their parent workspace. Changes to the parent workspace configuration are copied to the child workspace. You can override the subsetting configuration and post-job scripts in a child workspace.
User profile pictures - From the User Settings view, Tonic users can now upload a user image for their account. For SSO providers that support user images, the image from the SSO is used by default.
Zip code configuration for HIPAA Address generator - A new configuration setting for the HIPAA Address generator allows you to determine how the generator sets the zip code.
Filtering workspaces by tags - On the Workspaces view, you can filter the workspaces by the assigned tags. In the Tags column heading, click the filter icon to display the list of applied tags. Check the checkbox for each tag to include. The list is filtered to include workspaces that have at least one of the selected tags.
You can now enable parallel processing for subsetting. The TONIC_SUBSETTING_PARALLELISM environment variable sets the number of steps to process in parallel.
On Privacy Hub, the term "Unprotected Sensitive" is replaced with "At-Risk". The protection status counts exclude columns that are not included in the destination database.
On Database View, added an option to filter by whether a column is included in the destination database. A new At-Risk toggle provides a shortcut to filter for columns that are included, marked as sensitive, and not assigned a generator. The Column Type filters, which filter columns based on whether they are a primary or foreign key, are changed from checkboxes to toggles.
In the Privacy Report, added Not Included as a value for ColumnPrivacyStatus, to identify columns that are not included in the destination database. The value Protected replaces the current values Masked and Anonymized, which are moved to a new ProtectionType column.
Features
A handful of generators now support a notion of consistency across the database. In short, when consistency is turned on for a specific generator, the same input value maps to the same output value across the entire database (wherever consistency is turned on). Consistency can also be used to preserve the cardinality of the source dataset in the generated data.
We now use row count instead of scale factor for synthesizing data
Synthesize mode now supports starting with a table that's empty
Better handling of DateTimeOffset data type in SQL Server
Added hostname generator
Refactored how generators are executed during data generation
We now show a tutorial video the first time a user logs in
Renamed Table Mode 'Excluded' to 'Truncated'
Build scripts now build win10-x64 assets
Changed the (!) failure icon in the jobs dropdown to a button so it's more obvious
UI and server versions now show when you click on the Tonic logo
CSVs with headers now map to column names
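The consistency idea in the Features list above can be sketched in Python as a keyed, deterministic mapping from each input value to a replacement value. This is a generic illustration, not how Tonic implements consistency. The replacement pool, the secret, and the function name consistent_name are all hypothetical.

```python
import hashlib
import hmac

# Hypothetical pool of replacement values, used only for this illustration.
FIRST_NAMES = ["Alice", "Bruno", "Chen", "Dalia", "Emeka", "Freya"]


def consistent_name(value, secret=b"example-secret"):
    """Map an input string to a replacement name deterministically.

    The same input and secret always select the same replacement, so the
    mapping is stable across tables and across runs.
    """
    digest = hmac.new(secret, value.encode("utf-8"), hashlib.sha256).digest()
    index = int.from_bytes(digest[:8], "big") % len(FIRST_NAMES)
    return FIRST_NAMES[index]


# The same source value produces the same output wherever it appears.
users_table = [consistent_name(v) for v in ["john", "maria", "john"]]
orders_table = [consistent_name(v) for v in ["john"]]
print(users_table[0] == users_table[2] == orders_table[0])
```

With a pool this small, different inputs can map to the same replacement, so this sketch keeps values consistent but does not by itself preserve cardinality. Preserving cardinality needs a mapping that does not produce collisions.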
Bug Fixes
No longer require a file to exist for the google_application_credentials env variable
Fixed an issue where data could not be generated when the MAC address generator was used with colons
Support for MySQL datetimes with 0000-00-00 00:00:00
Improved onboarding experience for first time users
Dropped walkthrough tutorial from 2.3.0
Renamed platform from Allos to Tonic
Additional support for date fields
Added Random Timestamp generator
Added Event generator
Partitioning for select generators (behind a feature flag) - contact support@tonic.ai if you want to try this feature
Added walkthrough tutorial and demo dataset for new users
IP Generator now uses 100% IPv4 by default
Fixed bad UX with Min/Max in Random Integer generator
Improved Gaussian Generator performance
Added user menu
The distribution of nulls in the source dataset is now persisted in transformed columns in the generated dataset
Added SSN Generator
Improved Address Generator (Zip-> City->State hierarchy preserved when columns using the address generator are linked)
Support for CSV files as a datasource
Support for arrays in Postgres
Improved PII detection
Support for Postgres databases that don’t have passwords
Renamed several generators
Categorical —> Shuffle
Random String —> Hash
String Mask —> Character Substitution
Text Mask —> Text Scramble
Added generator description callouts
Support for SQL Server as a datasource
JSON masking generator
Subsetting (Postgres only) - we integrated our open source subsetting tool
Support for MySQL as a datasource
Synthesized Mode (beta) - in addition to masking, you can now synthesize any number of rows in a table while preserving foreign key relationships