Required license: Professional or Enterprise
Not available on Structural Cloud.
Amazon EMR workspaces do not support workspace inheritance.
You can only assign the De-Identify or Truncate table modes.
In Truncate mode, the table is ignored completely and does not exist in the destination database.
Amazon EMR workspaces cannot use the following generators:
Algebraic
Array Character Scramble
Array JSON Mask
Array Regex Mask
Cross-Table Sum
CSV Mask
Event Timestamps
HTML Mask
JSON Mask
SIN
The following generators are supported, but with restrictions:
Character Scramble is only supported for text columns.
Timestamp Shift is only supported on date column types.
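The generator limitations above can be sketched as a small pre-flight check. This is an illustrative sketch only, not part of the product: the function name `check_generator` and the simplified column type labels (`"text"`, `"date"`) are assumptions, and the generator names are copied from the lists above.

```python
from typing import Optional

# Generators listed above as unavailable for Amazon EMR workspaces.
UNSUPPORTED_GENERATORS = {
    "Algebraic",
    "Array Character Scramble",
    "Array JSON Mask",
    "Array Regex Mask",
    "Cross-Table Sum",
    "CSV Mask",
    "Event Timestamps",
    "HTML Mask",
    "JSON Mask",
    "SIN",
}

# Generators that are supported only for certain column types.
# The type labels are simplified placeholders (assumption).
RESTRICTED_GENERATORS = {
    "Character Scramble": {"text"},
    "Timestamp Shift": {"date"},
}


def check_generator(generator: str, column_type: str) -> Optional[str]:
    """Return a problem description, or None if the assignment looks allowed."""
    if generator in UNSUPPORTED_GENERATORS:
        return f"{generator} is not available for Amazon EMR workspaces"
    allowed = RESTRICTED_GENERATORS.get(generator)
    if allowed is not None and column_type not in allowed:
        kinds = ", ".join(sorted(allowed))
        return f"{generator} is only supported for {kinds} columns"
    return None


# Example usage with hypothetical column assignments.
for generator, column_type in [
    ("JSON Mask", "text"),
    ("Character Scramble", "integer"),
    ("Timestamp Shift", "date"),
]:
    problem = check_generator(generator, column_type)
    print(generator, "->", problem or "OK")
```

A check like this could run before a workspace is configured, so that unsupported assignments are caught before a job is started.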
Amazon EMR workspaces do not support subsetting.
However, for tables that use the De-Identify table mode, you can provide a WHERE clause to filter the table. For details, go to Using table filtering for data warehouses and Spark-based data connectors.
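To illustrate what a table filter does, the sketch below applies a WHERE clause to a single table and keeps only the matching rows. It uses Python's built-in `sqlite3` module purely to show the semantics. For an Amazon EMR workspace the filter is evaluated on the Spark side, and the exact syntax accepted there may differ. The `orders` table, its columns, and the `filtered_rows` helper are hypothetical.

```python
import sqlite3

# In-memory stand-in for a source table (hypothetical schema and data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, region TEXT, order_date TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        (1, "US", "2024-03-05"),
        (2, "EU", "2024-04-01"),
        (3, "US", "2023-11-20"),
    ],
)

# The filter is only the condition; it refers to columns of this one table.
table_filter = "region = 'US' AND order_date >= '2024-01-01'"


def filtered_rows(connection, table, where_clause):
    """Return the rows of `table` that satisfy `where_clause`."""
    # Trusted, hard-coded inputs only: this builds SQL by string formatting.
    query = f"SELECT * FROM {table} WHERE {where_clause}"
    return connection.execute(query).fetchall()


rows = filtered_rows(conn, "orders", table_filter)
print(rows)
```

Unlike subsetting, a filter of this kind applies to one table in isolation and does not follow relationships to other tables.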
Amazon EMR workspaces do not support upsert.
For Amazon EMR workspaces, you cannot write the destination data to container artifacts.
For Amazon EMR workspaces, you cannot write the destination data to an Ephemeral snapshot.
The logging of Spark jobs on the job details page is more limited than it is for other data connectors. This is because of how Spark clusters are distributed and managed.
The Jobs view provides information about the job's status as it runs.
After the job starts, it provides a tracking URL. The tracking URL leads to the Spark management portal, where you can find additional, more detailed logs.