Using table filtering for data warehouses and Spark-based data connectors
Last updated
Some Tonic Structural data connectors do not support subsetting. For those connectors, to generate a smaller set of data, you can instead add table filters.
The following data connectors support both subsetting and table filtering:
You can only filter tables that use De-identify table mode. The filter identifies the rows from the source database to process and include in the destination database.
Note that unlike subsetting, table filters do not guarantee referential integrity.
To add a filter, in the Table Filter text area on the table mode panel, provide the WHERE clause for the filter, then click Apply.
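For example, a filter might restrict a table to recent rows for a single region. This is a sketch only: the column names (created_date, region) are hypothetical, and the exact syntax you can use depends on the SQL dialect of your source database.

```sql
-- Hypothetical filter: keep only 2023 rows for the US region
created_date >= '2023-01-01' AND region = 'US'
```

Only the rows that match the condition are processed and written to the destination database.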
For Databricks workspaces where the source database uses Delta files, the filter WHERE clause can only refer to columns that are partitioned.
For Amazon EMR and Google BigQuery, the filter WHERE clause can refer to columns that are not partitioned. However, performance is better when the referenced columns are partitioned.
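Partitioned columns perform better because a filter on a partition column lets the engine skip entire partitions rather than scanning every row. For a table partitioned by a hypothetical event_date column, a filter such as the following only reads the matching partitions:

```sql
-- Hypothetical filter on a partition column (event_date)
event_date >= '2024-01-01'
```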
On the workspace configuration for Amazon EMR and Databricks, the Enable partition filter validation toggle determines whether Structural validates the WHERE clause when you create it. By default, the toggle is in the on position, and the WHERE clause is validated.
For Amazon Redshift, Google BigQuery, and Snowflake, Structural always validates the WHERE clause.