Previewing file output

Required dataset permission: Preview redacted dataset files

For image files, you can preview PNG and JPG files. You cannot preview TIF files.

Displaying a dataset file preview

To display the preview for a file, from the file list, do one of the following:

  • Click the file name.

  • Click the options menu, then click Preview.

Options menu for a dataset file with the Preview option

File preview for a redacted file

For a dataset that generates output files of the same type as the original file:

  • On the left, the preview displays the original data. The detected entity values are highlighted.

  • On the right, the preview displays the data with replacement values that are based on the dataset configuration for the detected entity types.

Format of redacted values

Note that in the preview, the redacted values do not include the identifier. They only include the entity type. For example, NAME_GIVEN instead of NAME_GIVEN_1d9w5. The identifiers are included when you download the files.

File preview with the original text data and the redacted and synthesized text data
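To make the difference between the preview and the downloaded file concrete, here is a minimal Python sketch that maps a token from a downloaded file to the label that the preview displays. It is an illustration only, not part of Textual. The function name `preview_label` is hypothetical, and the sketch assumes that every input token ends with an underscore followed by an identifier that does not itself contain an underscore.

```python
def preview_label(token: str) -> str:
    """Drop the trailing identifier from a redaction token.

    Assumption: the token always has the form <ENTITY_TYPE>_<identifier>,
    and the identifier contains no underscore.
    """
    entity_type, sep, _identifier = token.rpartition("_")
    # If the token has no underscore at all, return it unchanged.
    return entity_type if sep else token


print(preview_label("NAME_GIVEN_1d9w5"))  # NAME_GIVEN
```

Because the sketch splits on the last underscore, it should not be applied to a label that has no identifier. For example, NAME_GIVEN would be shortened to NAME.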

Preview for PDF and image files

For a PDF or image file, for entity types that use the Redact handling option:

  • If there is space to display the entity type, then it is displayed.

  • Otherwise, the value is covered by a black box.

File preview for a redacted PDF file

When you hover over a black box, the entity type displays in a tooltip:

Entity type tooltip for a redacted value in a PDF file

You can also zoom in on the file to view the entity type labels.

Zoomed in version of a PDF preview that displays entity types

The preview for a PDF file also reflects any manual overrides.

Selecting entity type handling options from the preview

You can use the preview to select the entity type handling option for each entity type. The options are:

  • Redact - This is the default option. Textual replaces the value with the name of the entity type followed by a unique identifier. For example, the first name John is replaced with NAME_GIVEN_12345. Note that the identifier is only visible in the downloaded file. It does not display in the preview.

  • Synthesize - Textual replaces the value with a realistic generated value. For example, the first name John is replaced with the first name Michael. The replacement values are consistent, which means that a given value always has the same replacement. For example, Michael is always the replacement value for John.

  • Off - Textual ignores the value and copies it as is to the output file.
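The following Python sketch illustrates how the three options behave for a single value. It is not how Textual generates replacement values. The name pool, the `short_id` helper, and the hash-based selection are all assumptions chosen only to make the Synthesize output consistent for a given input. The sketch also derives the Redact identifier from the value, which the text above does not specify.

```python
import hashlib

# Hypothetical pool of replacement first names, for illustration only.
FAKE_GIVEN_NAMES = ["Michael", "Sarah", "David", "Priya", "Elena"]


def short_id(value: str) -> str:
    """Derive a stable 5-character identifier from a value (invented scheme)."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()[:5]


def apply_handling(value: str, entity_type: str, option: str) -> str:
    """Return the output text for one detected value."""
    if option == "Redact":
        # Entity type name followed by an identifier.
        return f"{entity_type}_{short_id(value)}"
    if option == "Synthesize":
        # Hashing the value means the same input always selects the same name.
        digest = hashlib.sha256(value.encode("utf-8")).digest()
        return FAKE_GIVEN_NAMES[digest[0] % len(FAKE_GIVEN_NAMES)]
    if option == "Off":
        # Copy the value to the output unchanged.
        return value
    raise ValueError(f"Unknown handling option: {option}")


for option in ("Redact", "Synthesize", "Off"):
    print(option, "->", apply_handling("John", "NAME_GIVEN", option))
```

Calling `apply_handling("John", "NAME_GIVEN", "Synthesize")` repeatedly returns the same name each time, which mirrors the consistency described for the Synthesize option.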

To select the entity type handling option:

  1. In the results panel, click a detected value.

  2. On the panel, click the entity type handling option. Textual applies the same option to all entity values of that type.

Selecting an entity type handling option

From the preview, you can only select the entity type handling option. You cannot configure the synthesis options that apply when an entity type uses the Synthesize option. You must configure those options from the dataset details page. For more information, go to Configuring synthesis options.

Ignoring specific instances in PDF files

From the PDF preview, you can also choose to ignore a specific value.

To configure whether to ignore a specific detected value:

  1. In the results panel, click the value.

  2. On the panel, to ignore the value, toggle Ignore Redaction to the on position.

Panel with the option to ignore a PDF value

File preview for a JSON output file

For a dataset that generates JSON output:

  • On the left is the original content. For files other than .txt files, you can toggle between generated Markdown and the rendered file.

  • On the right is a set of tabs that summarize the results.

File preview for a text file in a JSON output dataset

File preview for a PDF file in a JSON output dataset

Entities tab - Detected entities in the file

The Entities tab displays the file content with the detected entity values in context.

The actual values are followed by the type labels. For example, the given name John is displayed as John NAME_GIVEN.
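As a rough illustration of this display format, the Python sketch below appends a type label after each detected value. The `annotate` function and the `(start, end, label)` span tuples are hypothetical and do not reflect the structure of the Textual output. The sketch assumes that the spans do not overlap.

```python
def annotate(text: str, spans: list[tuple[int, int, str]]) -> str:
    """Insert each entity type label directly after its detected value."""
    parts = []
    cursor = 0
    # Process spans in order of start offset. Spans must not overlap.
    for start, end, label in sorted(spans):
        parts.append(text[cursor:end])  # text up to the end of the value
        parts.append(f" {label}")       # the type label that follows it
        cursor = end
    parts.append(text[cursor:])         # any text after the last value
    return "".join(parts)


spans = [(0, 4, "NAME_GIVEN"), (14, 21, "LOCATION_CITY")]
print(annotate("John lives in Atlanta.", spans))
# John NAME_GIVEN lives in Atlanta LOCATION_CITY.
```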

JSON tab - Output JSON for the file

The JSON tab contains the content of the actual output file.

For details about the JSON output structure for the different types of files, go to Structure of JSON output files.

JSON tab on a file preview for a JSON output dataset

Tables tab - Tables in a PDF or image file

For a PDF or image file that contains one or more tables, the Tables tab displays the tables. If the file does not contain any tables, then the Tables tab does not display.

Key-Value Pairs tab - Key-value pairs in a PDF or image file

For a PDF or image file that contains key-value pairs, the Key-Value Pairs tab displays the key-value pairs. If the file does not contain key-value pairs, then the Key-Value Pairs tab does not display.
