Regex Mask
This is a composite generator.
Uses regular expressions to parse strings and replace specified substrings with the output of specified generators. The parts of the string to replace are specified inside unnamed top-level capture groups.
Defining multiple expressions allows you to attach completely different sets of sub-generators to a given cell, depending on the cell's value.
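The idea of replacing only the captured substrings can be sketched in a few lines of Python. This is an illustration of the concept, not Structural's implementation: the function name mask_groups and the SubGenerator callable type are hypothetical, and Python's re module stands in for the regular expression engine.

```python
import re
from typing import Callable, Optional, Sequence

# Hypothetical stand-in for a sub-generator: takes the captured text,
# returns the replacement text.
SubGenerator = Callable[[str], str]


def mask_groups(
    pattern: str,
    value: str,
    generators: Sequence[Optional[SubGenerator]],
) -> str:
    """Replace each top-level capture group of the first match with the
    output of the generator at the same position. A generator of None
    leaves that group unchanged."""
    match = re.search(pattern, value)
    if match is None:
        return value  # no match: the value passes through untouched

    pieces = []
    cursor = 0
    # Top-level groups do not overlap and appear left to right,
    # so the value can be rebuilt in a single pass.
    for index, generator in zip(range(1, match.re.groups + 1), generators):
        if generator is None or match.start(index) == -1:
            continue  # unassigned group, or group did not participate
        start, end = match.span(index)
        pieces.append(value[cursor:start])
        pieces.append(generator(match.group(index)))
        cursor = end
    pieces.append(value[cursor:])
    return "".join(pieces)
```

For example, mask_groups(r"^user-(\d+)-(\w+)$", "user-42-alice", [lambda s: "0" * len(s), str.upper]) rewrites only the two captured parts and keeps the surrounding text.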
How regular expressions are applied
If multiple regular expressions match a given string, Structural evaluates the expressions in the order that they are specified, and applies the selected sub-generators for only the first matching expression.
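A minimal sketch of this first-match-wins ordering, assuming a hypothetical rule list in which each rule pairs a pattern with a fixed replacement for its first capture group. The names RULES and apply_first_match are illustrative only and are not part of Structural.

```python
import re

# Hypothetical rules, in the order they were specified.
# Each rule: (pattern, replacement text for capture group 1).
RULES = [
    (r"^ID-([0-9]+)$", "000"),   # specific rule, listed first
    (r"^(\w+)-[0-9]+$", "XX"),   # broader rule, listed second
]


def apply_first_match(value, rules=RULES):
    """Apply only the first rule whose pattern matches the value."""
    for pattern, replacement in rules:
        match = re.search(pattern, value)
        if match:
            start, end = match.span(1)
            return value[:start] + replacement + value[end:]
    return value  # no rule matched: leave the value unchanged
```

The value ID-123 matches both patterns, but only the first rule is applied. The value SKU-9 matches only the second rule.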
With the Replace all matches option, the Regex Mask generator behaves similarly to a traditional regular expression parser. It matches all occurrences of a pattern before it moves on to the next pattern. For example, the pattern (a) applied to the string aaab matches every occurrence of the letter a, instead of only the first one.
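The difference between replacing all matches and replacing only the first one can be illustrated with Python's re.sub and its count argument. This is an analogy under the assumption of an unanchored pattern (a); it does not show how Structural implements the option, and the function name replace_a is hypothetical.

```python
import re


def replace_a(value: str, replace_all: bool) -> str:
    """Replace matches of the unanchored pattern (a) with 'X'.

    count=0 tells re.sub to replace every match;
    count=1 stops after the first match.
    """
    count = 0 if replace_all else 1
    return re.sub(r"(a)", "X", value, count=count)
```

With replace_all=True, every a in aaab is replaced. With replace_all=False, only the first a is replaced.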
Regular expression compatibility
Note that for Spark-based data connectors, depending on your environment, there might be slight differences in the regular expression support.
To ensure consistent results across all data connectors, use regular expression patterns that are compatible with both Java and C#.
For more information about regular expressions in C#, see the Microsoft .NET regular expression documentation. For more information about regular expressions in Java, see the documentation for the java.util.regex.Pattern class.
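One example of an engine difference is the \d shorthand. By default, .NET treats \d as any Unicode decimal digit, while Java treats it as the ASCII digits 0 through 9. An explicit class such as [0-9] behaves the same in both. The sketch below uses Python only to illustrate the distinction (Python's \d is also Unicode-aware for text patterns); the helper name matches is hypothetical.

```python
import re

SHORTHAND = r"^\d{3}$"      # meaning of \d varies between engines
PORTABLE = r"^[0-9]{3}$"    # explicit ASCII range, same everywhere

# Three Arabic-Indic digits: Unicode decimal digits, but not ASCII 0-9.
ARABIC_INDIC = "\u0661\u0662\u0663"


def matches(pattern: str, text: str) -> bool:
    """Return True if the pattern matches the text."""
    return re.search(pattern, text) is not None
```

Here the shorthand pattern matches the Arabic-Indic digits and the explicit range does not, while both patterns match the ASCII string 123.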
Example expressions
In a cell that contains the string ProductId:123-BuyerId:234, to mask the substrings 123 and 234, specify the regular expression:
^ProductId:([0-9]{3})-BuyerId:([0-9]{3})$
This captures the two occurrences of three-digit numbers in the pattern ProductId:xxx-BuyerId:xxx. This makes it possible to define a sub-generator on neither, either, or both of these captured substrings.
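To check what this expression captures, it can be tried against the sample value in Python. The sketch also shows a replacement on only the second group, as an example of assigning a sub-generator to one capture and not the other. The function mask_buyer_only and the fixed replacement 000 are hypothetical and only stand in for a real sub-generator.

```python
import re

PATTERN = re.compile(r"^ProductId:([0-9]{3})-BuyerId:([0-9]{3})$")

# The two capture groups hold the two three-digit numbers.
captured = PATTERN.search("ProductId:123-BuyerId:234").groups()


def mask_buyer_only(value: str) -> str:
    """Replace only the second captured group; leave the first as-is."""
    match = PATTERN.search(value)
    if match is None:
        return value  # value does not fit the pattern: unchanged
    start, end = match.span(2)
    return value[:start] + "000" + value[end:]
```

A value such as ProductId:1234-BuyerId:234 does not match, because the expression requires exactly three digits, so it is returned unchanged.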
The following regular expression defines a broader capture that matches more cell values:
^(\w+).(\d+).(\w+).(\d+)$
This captures pairs of words (\w+) and numbers (\d+) if there is a single character of any value between them, instead of the relatively more specific pattern of the first expression.
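A quick way to see how much broader this expression is: run both expressions against values with different labels and separators. The sketch below uses Python's re module for illustration, and the sample value order 77/item 5 is made up for this example.

```python
import re

SPECIFIC = re.compile(r"^ProductId:([0-9]{3})-BuyerId:([0-9]{3})$")
BROAD = re.compile(r"^(\w+).(\d+).(\w+).(\d+)$")


def groups_or_none(pattern, value):
    """Return the captured groups as a tuple, or None if no match."""
    match = pattern.search(value)
    return match.groups() if match else None


original_value = "ProductId:123-BuyerId:234"
other_value = "order 77/item 5"  # made-up value with other separators
```

The broad expression matches both values and captures the words as well as the numbers, so it yields four groups. The specific expression matches only the first value and yields two groups.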
Characteristics
Consistency: Determined by the selected sub-generators.
Linking: Determined by the selected sub-generators.
Differential privacy: Determined by the selected sub-generators.
Data-free: Determined by the selected sub-generators.
Allowed for primary keys: No
Allowed for unique columns: Yes
Uses format-preserving encryption (FPE): No
Privacy ranking: 5
Generator ID (for the API)
How to configure
Adding a regular expression
To add a regular expression:
Click Add Regex. On the configuration panel, Cell Value shows a sample value from the source database. You can use the previous and next options to navigate through the values.
By default, Replace all matches is enabled. To only match the first occurrence of a pattern, toggle Replace all matches to the off position.
In the Pattern field, enter a regular expression. If the expression is valid, then Structural displays the capture groups for the expression.
For each capture group, to select and configure the generator to apply, click the selected generator. You cannot select another composite generator.
To save the configuration and immediately add a generator for another regular expression, click Save and Add Another. To save the configuration and close the add generator panel, click Save.
Managing the regular expressions
From the Regexes list:
To edit a regular expression, click the edit icon.
To remove a regular expression, click the delete icon.