Migrating a pipeline to a JSON output dataset
Tonic Textual has added pipeline functionality to datasets and has deprecated the pipelines feature. The pipelines feature will eventually be removed.
To recreate a pipeline as a JSON output dataset, follow these steps.
Create a JSON output dataset
When you create a dataset, the Output Format option prompts you to choose between the Original and JSON output formats.

The JSON option creates the same output as a pipeline.
Unlike a pipeline, a dataset cannot produce both JSON output and synthesized files. To create synthesized files from the same data, create a separate dataset that uses the Original output format.
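If you consume the pipeline JSON output programmatically, you can read the dataset JSON output in the same way. The following is an illustrative sketch only; the file name and the field names (entities, label, start, end) are assumptions, not the exact Textual output schema.

```python
import json

# Illustrative only: "entities", "label", "start", and "end" are assumed
# field names, not the exact Textual output schema.
with open("example_document.json", "r", encoding="utf-8") as f:
    doc = json.load(f)

# List each detected entity and where it appears in the source text.
for entity in doc.get("entities", []):
    print(f'{entity.get("label")}: characters {entity.get("start")}-{entity.get("end")}')
```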
You then choose whether to upload files from a local file system or to use files from a cloud storage location. For a cloud storage dataset, Textual writes the output to a selected output location. For local files, you download the output files from Textual.
For cloud storage, the credential fields are identical to those on the pipeline creation form. You can provide the same credential values.
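For example, if your pipeline uses Amazon S3, you can keep the shared values in one place, such as environment variables, and enter the same values in both forms. The variable names below are placeholders for illustration, not Textual configuration keys.

```python
import os

# Placeholder names; reuse the same values that you entered on the pipeline
# creation form. Reading from environment variables avoids hard-coding secrets.
aws_access_key_id = os.environ["AWS_ACCESS_KEY_ID"]
aws_secret_access_key = os.environ["AWS_SECRET_ACCESS_KEY"]
aws_region = os.environ.get("AWS_REGION", "us-east-1")
```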
Cloud storage - select output location and files
For a cloud storage dataset, you specify where Textual generates the output, and select the files to work with.
Specify the output location
Note the output location that you configured for your pipeline.
On the Dataset Settings page for your dataset, you can select the same output location for the dataset. For more information, go to Changing cloud storage credentials and output location.
Select files
In your pipeline settings, note the selected files, file types, and prefixes.
In your dataset, to select the same files, click Select Files.

The Select Files dialog provides the same options to select file types, files, and folders. For more information, go to Selecting cloud storage files.
After you select the files, Textual scans the files to detect entities. You can then review the results and configure the entity type handling.
When you finish the configuration, you generate the output files.
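After Textual generates the output files, they appear in the configured output location. As a minimal sketch, assuming that the output location is an Amazon S3 bucket and prefix (the bucket and prefix names below are placeholders), you can list the generated JSON files with boto3:

```python
import boto3

# Placeholders: use the bucket and prefix that you selected as the dataset
# output location on the Dataset Settings page.
bucket = "my-textual-output-bucket"
prefix = "textual/json-output/"

s3 = boto3.client("s3")
paginator = s3.get_paginator("list_objects_v2")

# List the JSON files that Textual wrote to the output location.
for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
    for obj in page.get("Contents", []):
        if obj["Key"].endswith(".json"):
            print(obj["Key"])
```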
Local files - upload files
For a local files dataset, you upload the same files that you used in your pipeline.
After you select the files, Textual scans the files to detect entities. You can then review the results and configure the entity type handling.
When you finish the configuration, you download the output files.
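To get a quick summary of the detected entity types across the downloaded output, you can tally them with a short script. This is a sketch that assumes the output files were downloaded to a local folder named textual_output and that each JSON file contains an entities list with a label field; those names are assumptions, not the exact Textual schema.

```python
import json
from collections import Counter
from pathlib import Path

# Assumptions: output files were downloaded to ./textual_output, and each
# JSON file has an "entities" list whose items carry a "label" field.
counts = Counter()
for path in Path("textual_output").glob("*.json"):
    doc = json.loads(path.read_text(encoding="utf-8"))
    for entity in doc.get("entities", []):
        counts[entity.get("label", "UNKNOWN")] += 1

# Print entity types from most to least frequent.
for label, count in counts.most_common():
    print(f"{label}: {count}")
```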