Using table filtering for data warehouses and Spark-based data connectors

Some Tonic Structural data connectors do not support subsetting. For those connectors, you can instead add table filters to generate a smaller set of data.

Other data connectors support both subsetting and table filtering.

You can only filter tables that use De-identify table mode. The filter identifies the rows from the source database to process and include in the destination database.

Note that unlike subsetting, table filters do not guarantee referential integrity.
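
For example, if a filter on a parent table excludes rows that a child table still references, the destination database contains orphaned foreign keys. A minimal sketch, using hypothetical users and orders tables:

    -- Filter applied to the users table:
    created_at >= '2024-01-01'

    -- The orders table is not filtered. Any orders row whose user_id
    -- references a users row from before 2024-01-01 becomes an orphaned
    -- reference in the destination database.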

To add a filter, go to the table mode panel, enter the WHERE clause for the filter in the Table Filter text area, and then click Apply.
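
For example, a filter that limits a hypothetical transactions table to recent United States activity might look like the following; the table and column names are illustrative, not part of the product:

    -- Entered in the Table Filter text area as the filter's WHERE clause.
    -- Structural processes only the source rows that match the condition.
    transaction_date >= '2024-01-01' AND region = 'US'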

For Databricks workspaces where the source database uses Delta files, the filter WHERE clause can only refer to columns that the table is partitioned on.
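
For example, for a hypothetical Delta table that is partitioned by event_date and country, a valid filter references only those columns:

    -- Valid: event_date and country are partition columns.
    event_date >= '2024-06-01' AND country = 'DE'

    -- Invalid for this table: user_id is not a partition column.
    -- user_id > 1000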

For Amazon EMR, Google BigQuery, and Spark with Livy, the filter WHERE clause can also refer to columns that are not partition columns. However, performance is better when the referenced columns are partition columns.

On the workspace configuration for Amazon EMR, Databricks, and Spark with Livy, the Enable partition filter validation toggle determines whether Structural validates the WHERE clause when you create it. By default, the toggle is in the on position, and the WHERE clause is validated.

For Amazon Redshift, Google BigQuery, and Snowflake, Structural always validates the WHERE clause.
