Viewing pipeline files, runs, and statistics

Required pipeline permission: View pipeline settings

For a file upload pipeline, the Files tab contains the list of all of the pipeline files.

For cloud storage pipelines, you use the pipeline details page to track processed files and pipeline runs.

For pipelines that are configured to also redact files, you can configure the redaction for the detected entity types. For more information, go to Selecting the handling option for entity types.

Viewing the list of all files for a pipeline

For file upload pipelines, a file is added to the file list automatically when you add it to the pipeline.

For cloud storage pipelines, the file list is not populated until you run the pipeline. The list contains only processed files.

Viewing file statistics for the pipeline

The statistics panels at the right of the pipeline details page provide a summary of information about the pipeline files, the detected entities, and the detected topics.

Summary file statistics

The File Statistics panel displays the following values.

Total # of files - The number of files in the pipeline.
Total # of words - The number of words that the files contain.
Entities detected - The number of entity types for which Textual detected values in the files.
Topics detected - The number of topics that the files contain. A topic is a subject area that is common across multiple files. If the pipeline files contain completely unrelated content, then Textual might not detect any topics.

Entity type value counts

The entity types panel displays the 5 entity types that have the largest number of values in the pipeline files.

For each entity type, the panel displays the value count.

If there are more than 5 detected entity types, to display the full list of detected entity types, click View All.
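The ranking that this panel describes can be sketched in a few lines of Python. This is only an illustration of the "5 largest value counts" logic, not how Textual computes it. The `entity_value_counts` data, the entity type names in it, and the `top_entity_types` helper are all hypothetical.

```python
from collections import Counter

# Hypothetical value counts per entity type (illustrative data, not Textual output).
entity_value_counts = {
    "NAME_GIVEN": 120,
    "NAME_FAMILY": 95,
    "EMAIL_ADDRESS": 40,
    "PHONE_NUMBER": 33,
    "LOCATION_CITY": 21,
    "DATE_TIME": 8,
}


def top_entity_types(counts, limit=5):
    """Return up to `limit` (entity_type, value_count) pairs, largest count first."""
    return Counter(counts).most_common(limit)


top = top_entity_types(entity_value_counts)

# When more entity types exist than the panel shows, the full list is behind View All.
has_more = len(entity_value_counts) > len(top)

for entity_type, value_count in top:
    print(f"{entity_type}: {value_count}")
```

In this sample data there are 6 entity types, so the sixth one appears only in the full list.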

Topics list

The topics panel displays the 5 topics that are present in the most files.

For each topic, the panel displays the number of files that include that topic.

If there are more than 5 detected topics, to display the full list of detected topics, click View All.
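Note that topics are ranked by the number of files that include them, not by how often they occur in the text. A minimal sketch of that counting, assuming a hypothetical `file_topics` mapping from file name to the topics found in that file (the data and the helper names are illustrative and are not part of Textual):

```python
from collections import Counter

# Hypothetical mapping of file name to the topics detected in that file.
file_topics = {
    "invoice_01.txt": {"billing", "support"},
    "invoice_02.txt": {"billing"},
    "ticket_17.txt": {"billing", "support", "shipping"},
}


def topic_file_counts(topics_by_file):
    """Count, for each topic, the number of files that include it."""
    counts = Counter()
    for topics in topics_by_file.values():
        # set() ensures that a file contributes at most 1 to each topic.
        counts.update(set(topics))
    return counts


def top_topics(topics_by_file, limit=5):
    """Return up to `limit` (topic, file_count) pairs, most files first."""
    return topic_file_counts(topics_by_file).most_common(limit)


for topic, file_count in top_topics(file_topics):
    print(f"{topic}: {file_count} file(s)")
```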

Viewing the list of pipeline runs

On the pipeline details page for a cloud storage pipeline, the Pipeline Runs tab displays the list of pipeline runs.

Required pipeline permission: View pipeline settings

For each run, the list includes:

Run identifier
When the run was started
The current status of the pipeline run. The possible statuses are:
- Queued - The pipeline run has not yet started.
- Running - The pipeline run is in progress.
- Completed - The pipeline run completed successfully.
- Failed - The pipeline run failed.

Viewing the list of pipeline run files

For a pipeline run, to display the list of files that the pipeline run includes, click View Run.

Information in a file list

For each file, the list includes the following information:

File name
For cloud storage files, the path to the file
The status of the file processing. The possible statuses are:
- Unprocessed - The file is added, but a pipeline run to process it has not yet started. This applies only to uploaded files that were added since the most recent pipeline run.
- Queued - A pipeline run was started but the file is not yet processed.
- Running - The file is being processed.
- Completed - The file was processed successfully.
- Failed - The file could not be processed.
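If you copy a file list into a script to review it, the statuses above can be modeled as a small enumeration. The following is a sketch under assumptions: the `FileStatus` class, the `(file name, status label)` input shape, and the helper functions are hypothetical and do not correspond to a Textual API. Only the five status labels come from the list above.

```python
from collections import Counter
from enum import Enum


class FileStatus(Enum):
    """The file processing statuses, keyed by the label shown in the file list."""
    UNPROCESSED = "Unprocessed"
    QUEUED = "Queued"
    RUNNING = "Running"
    COMPLETED = "Completed"
    FAILED = "Failed"


# Statuses that indicate that processing of the file has ended.
FINISHED = {FileStatus.COMPLETED, FileStatus.FAILED}


def summarize(files):
    """Count files per status. `files` is an iterable of (file_name, status_label)."""
    return Counter(FileStatus(label) for _, label in files)


def not_finished(files):
    """Return the names of files that are not yet Completed or Failed."""
    return [name for name, label in files if FileStatus(label) not in FINISHED]


# Hypothetical file list, for illustration only.
files = [
    ("contract.pdf", "Completed"),
    ("notes.txt", "Failed"),
    ("report.docx", "Running"),
    ("new_upload.txt", "Unprocessed"),
]

summary = summarize(files)
waiting = not_finished(files)
```

Because `FileStatus(label)` raises a `ValueError` for an unknown label, a typo in the input is reported instead of being silently counted.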

