Adding and removing dataset files

Supported file types for datasets

Tonic Textual can process the following types of files:

txt
csv
tsv
docx
xlsx
pdf
png
tif or tiff
jpg or jpeg
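
If you collect files with a script before you upload them, it can help to filter out unsupported types first. The following is a minimal sketch, not part of Textual itself. The names `SUPPORTED_EXTENSIONS`, `is_supported`, and `split_by_support` are hypothetical. It checks only the file name extension against the list above, and it assumes that extension case does not matter. It does not inspect the file contents.

```python
from pathlib import Path

# Extensions copied from the supported file types list above.
SUPPORTED_EXTENSIONS = {
    ".txt", ".csv", ".tsv", ".docx", ".xlsx",
    ".pdf", ".png", ".tif", ".tiff", ".jpg", ".jpeg",
}


def is_supported(path):
    """Return True if the file name ends in a listed extension.

    Assumption: the comparison is case-insensitive (".PDF" matches ".pdf").
    """
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS


def split_by_support(paths):
    """Split paths into (supported, unsupported) lists, preserving order."""
    supported, unsupported = [], []
    for p in paths:
        (supported if is_supported(p) else unsupported).append(p)
    return supported, unsupported
```

For example, `split_by_support(["a.txt", "b.doc", "c.jpeg"])` places `a.txt` and `c.jpeg` in the first list and `b.doc` in the second, so that you can review the unsupported files before you start the upload.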

On a self-hosted instance, you can configure an S3 bucket where Textual stores the files. This is the same S3 bucket that is used for uploaded file pipelines.

For more information, go to Setting the S3 bucket for file uploads and redactions.

For an example of an IAM role with the required permissions, go to Example IAM role for file uploads and redactions.

Adding files to the dataset

Required dataset permission: Upload files to a dataset

To add files to the dataset from the dataset details page:

In the panel at the top left, click Upload Files.

Search for and select the files.

Tonic Textual uploads and then processes the files.

Do not leave the page while files are uploading. If you leave the page before the upload is complete, then the upload stops.

You can leave the page while Textual processes the files.

On a self-hosted instance, when a file fails to upload, you can download the associated logs. To download the logs, click the options menu for the file, then select Download Logs.

Removing files from the dataset

Required dataset permission: Delete files from a dataset

To remove a file from the dataset:

In the file list, click the options menu for the file.
In the options menu, click Delete File.

