Tonic uses a variety of signals to identify PII. It analyzes column metadata, such as the data type, the column name, and the uniqueness of the column values. It also scans the actual data, using a combination of regex matching, dictionary lookups, and NER (named entity recognition) algorithms. Like all model-based approaches, this process is not flawless: the models cannot guarantee perfect precision or recall. We strongly recommend that a human review both the results of the privacy scan and the broader dataset to ensure that nothing sensitive has been missed.
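
To make the idea of combining signals concrete, here is a minimal sketch of how metadata and data-level checks might be combined for a single column. It is not Tonic's implementation. The function name `pii_signals`, the patterns, the keyword list, the name dictionary, and the thresholds are all hypothetical and chosen only for illustration, and the NER step is left out because it would need a trained model.

```python
import re

# Hypothetical patterns and word lists, chosen only for illustration.
EMAIL_RE = re.compile(r"^[\w.+-]+@[\w-]+(\.[\w-]+)+$")
SSN_RE = re.compile(r"^\d{3}-\d{2}-\d{4}$")
NAME_HINTS = ("email", "ssn", "phone", "name", "address")
FIRST_NAMES = {"alice", "bob", "carol", "david"}


def pii_signals(column_name, values):
    """Return the set of heuristic signals that fire for one column."""
    signals = set()
    non_null = [str(v) for v in values if v is not None]
    if not non_null:
        return signals

    # Metadata signal: the column name contains a suggestive keyword.
    lowered = column_name.lower()
    if any(hint in lowered for hint in NAME_HINTS):
        signals.add("column_name")

    # Metadata signal: mostly unique values often indicate identifiers.
    # The 0.9 threshold is an assumption.
    if len(set(non_null)) / len(non_null) >= 0.9:
        signals.add("high_uniqueness")

    # Data signal: most values match a known PII pattern.
    # The 0.8 threshold is an assumption.
    for label, pattern in (("email", EMAIL_RE), ("ssn", SSN_RE)):
        hits = sum(1 for v in non_null if pattern.match(v))
        if hits / len(non_null) >= 0.8:
            signals.add("regex:" + label)

    # Data signal: most values appear in a dictionary of known first names.
    hits = sum(1 for v in non_null if v.lower() in FIRST_NAMES)
    if hits / len(non_null) >= 0.8:
        signals.add("dictionary:first_name")

    return signals
```

For example, calling `pii_signals("user_email", ["a@example.com", "b@example.org", "c@example.net"])` fires the column-name, uniqueness, and email-pattern signals together, while a column of repeated first names under an uninformative column name fires only the dictionary signal. No single signal is decisive on its own, which is one reason a human review of the results remains important.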