Manage datasets
When true, excludes synthesized datasets from results
true
OK
Lightweight dataset summary containing only core metadata used in list views and search results.
The original upload location of source files (Local, S3, Azure, SharePoint, OneLake, or SDK).
A point in time represented as an ISO 8601 timestamp string.
A point in time represented as an ISO 8601 timestamp string.
OK
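The summary's timestamp fields are ISO 8601 strings, so a client will usually want to parse them before comparing or sorting. A minimal sketch, assuming the strings may end in "Z" or carry an explicit offset; the function name and the sample values are illustrative and do not come from the API:

```python
from datetime import datetime, timezone


def parse_timestamp(value: str) -> datetime:
    """Parse an ISO 8601 timestamp string into a timezone-aware datetime."""
    # Python versions before 3.11 do not accept a trailing "Z",
    # so normalize it to an explicit UTC offset first.
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    parsed = datetime.fromisoformat(value)
    # Assumption: a timestamp without an offset is treated as UTC.
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


# Illustrative values only; real responses supply their own timestamps.
created = parse_timestamp("2024-05-01T12:30:00Z")
updated = parse_timestamp("2024-05-02T08:00:00+00:00")
print(updated > created)  # True
```

Parsing into timezone-aware values avoids errors when comparing timestamps that were serialized with different offsets.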
Returns the dataset specified by datasetId.
OK
Possible values:
- Redact: Run images through OCR and redact sensitive text
- Ignore: Leave images alone
- Remove: Cover image with opaque black box
Possible values:
- Redact: Cover signature with opaque black box
- Ignore: Do not attempt to detect signature
Possible values:
- Remove: Remove all comments from the file
- Ignore: Leave comments alone
Possible values:
- Redact: Treat table content normally and feed it into the redaction process.
- Remove: Replace all characters and symbols in the table with a placeholder.
The dataset cannot be found
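The "Possible values" lists above can be modeled as enumerations so that a client rejects an unsupported value before it sends a request. A rough sketch: the enum members mirror the listed values, but the class names and the field names (`imagePolicy`, `signaturePolicy`, `commentPolicy`, `tablePolicy`) are hypothetical, since this page does not show the property names.

```python
from enum import Enum


class ImagePolicy(Enum):
    REDACT = "Redact"   # run images through OCR and redact sensitive text
    IGNORE = "Ignore"   # leave images alone
    REMOVE = "Remove"   # cover image with opaque black box


class SignaturePolicy(Enum):
    REDACT = "Redact"   # cover signature with opaque black box
    IGNORE = "Ignore"   # do not attempt to detect signature


class CommentPolicy(Enum):
    REMOVE = "Remove"   # remove all comments
    IGNORE = "Ignore"   # leave comments alone


class TablePolicy(Enum):
    REDACT = "Redact"   # feed table content into the redaction process
    REMOVE = "Remove"   # replace table content with a placeholder


# Hypothetical field names; check the real schema before relying on them.
ALLOWED = {
    "imagePolicy": ImagePolicy,
    "signaturePolicy": SignaturePolicy,
    "commentPolicy": CommentPolicy,
    "tablePolicy": TablePolicy,
}


def validate_policies(config: dict) -> list:
    """Return a list of human-readable problems; empty means valid."""
    errors = []
    for field, value in config.items():
        enum_cls = ALLOWED.get(field)
        if enum_cls is None:
            errors.append(f"unknown field: {field}")
            continue
        allowed = [member.value for member in enum_cls]
        if value not in allowed:
            errors.append(f"{field}: {value!r} is not one of {allowed}")
    return errors


print(validate_policies({"imagePolicy": "Redact", "signaturePolicy": "Remove"}))
```

Note that "Remove" is valid for images, comments, and tables but not for signatures, which is the kind of mistake this check catches.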
Creates a new dataset with the specified configuration. You must specify a unique, non-empty dataset name.
OK
Possible values:
- Redact: Run images through OCR and redact sensitive text
- Ignore: Leave images alone
- Remove: Cover image with opaque black box
Possible values:
- Redact: Cover signature with opaque black box
- Ignore: Do not attempt to detect signature
Possible values:
- Remove: Remove all comments from the file
- Ignore: Leave comments alone
Possible values:
- Redact: Treat table content normally and feed it into the redaction process.
- Remove: Replace all characters and symbols in the table with a placeholder.
The dataset name must be specified
Dataset name is already in use
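The two name errors above can be anticipated on the client side before the create request is sent. A minimal sketch, assuming that a whitespace-only name counts as empty and that uniqueness is an exact, case-sensitive match; neither assumption is confirmed by this page, and the server remains the authority. The function name is illustrative.

```python
from typing import Iterable, Optional


def check_dataset_name(name: Optional[str], existing_names: Iterable[str]) -> Optional[str]:
    """Return the expected error message, or None if the name looks acceptable."""
    # Assumption: whitespace-only names are treated the same as empty names.
    if name is None or not name.strip():
        return "The dataset name must be specified"
    # Assumption: names are compared exactly (case-sensitive).
    if name in set(existing_names):
        return "Dataset name is already in use"
    return None


existing = ["contracts", "invoices"]
print(check_dataset_name("", existing))
print(check_dataset_name("invoices", existing))
print(check_dataset_name("claims", existing))
```

A check like this only shortens the feedback loop; another client can still take the name between the check and the request, so the error responses must be handled as well.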
Updates a dataset to use the specified configuration.
When true, triggers a rescan of dataset files after the update
Request to update an existing dataset's configuration, redaction policies, and entity settings.
Possible values:
- Redact: Run images through OCR and redact sensitive text
- Ignore: Leave images alone
- Remove: Cover image with opaque black box
Possible values:
- Redact: Cover signature with opaque black box
- Ignore: Do not attempt to detect signature
Possible values:
- V1: Original mode with incorrect font, size and style
- V2: New mode
Possible values:
- Remove: Remove all comments from the file
- Ignore: Leave comments alone
Possible values:
- Redact: Treat table content normally and feed it into the redaction process.
- Remove: Replace all characters and symbols in the table with a placeholder.
Possible values:
- Disabled: Do not use LLM for structured data classification
- Enabled: Use LLM to classify structured data for PII detection
Possible values:
- Disabled: Do not use LLM for structured data classification
- Enabled: Use LLM to classify structured data for PII detection
The OCR engine used for text extraction from images and scanned documents.
OK
Full dataset details including files, configuration, and permission information.
The output format for redacted files: Original preserves the source format, Markdown produces a markdown version.
A point in time represented as an ISO 8601 timestamp string.
A point in time represented as an ISO 8601 timestamp string.
Possible values:
- Redact: Run images through OCR and redact sensitive text
- Ignore: Leave images alone
- Remove: Cover image with opaque black box
Possible values:
- Redact: Cover signature with opaque black box
- Ignore: Do not attempt to detect signature
Possible values:
- V1: Original mode with incorrect font, size and style
- V2: New mode
Possible values:
- Remove: Remove all comments from the file
- Ignore: Leave comments alone
Possible values:
- Redact: Treat table content normally and feed it into the redaction process.
- Remove: Replace all characters and symbols in the table with a placeholder.
Possible values:
- Disabled: Do not use LLM for structured data classification
- Enabled: Use LLM to classify structured data for PII detection
Possible values:
- Disabled: Do not use LLM for structured data classification
- Enabled: Use LLM to classify structured data for PII detection
The original upload location of source files (Local, S3, Azure, SharePoint, OneLake, or SDK).
The OCR engine used for text extraction from images and scanned documents.
The dataset cannot be found
Dataset name is already in use
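An update call combines a rescan flag with a body that carries the new settings. The sketch below only assembles the URL and JSON payload and sends nothing. The host, the path, the query parameter name `shouldRescan`, and the body field names are all hypothetical placeholders, because this page does not show them; substitute the names from the actual API schema.

```python
import json
from urllib.parse import urlencode


def build_update_request(base_url: str, dataset_id: str, settings: dict, rescan: bool = True):
    """Assemble (url, json_body) for a dataset update without sending it."""
    # Hypothetical parameter name; booleans are serialized as "true"/"false".
    query = urlencode({"shouldRescan": str(rescan).lower()})
    # Hypothetical path; the real endpoint may differ.
    url = f"{base_url}/api/dataset?{query}"
    # Hypothetical body shape: the dataset id plus the settings to change.
    body = json.dumps({"id": dataset_id, **settings})
    return url, body


url, body = build_update_request(
    "https://textual.example.invalid",          # placeholder host
    "dataset-123",                              # placeholder id
    {"name": "invoices", "tablePolicy": "Remove"},
    rescan=False,
)
print(url)
print(body)
```

Keeping request construction separate from sending makes the payload easy to inspect and test before any dataset is modified.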