Viewing workspace jobs and job details
Tonic Structural runs the following types of jobs on a workspace:
Sensitivity scans, which analyze the source database to identify sensitive data.
Collection scans, which analyze the source data for a MongoDB workspace to determine the available fields in each collection, the field types, and how prevalent the fields are.
Data generation, data pipeline generation, and containerized generation jobs, which generate the destination data from the source data.
Upsert data generation jobs, which generate the intermediate database from the source database.
Upsert jobs, which use data from the intermediate database to add new rows to and update changed rows in the destination database. If the migration process is enabled, then the migration runs as a step in the upsert job.
SDK table statistics jobs. These jobs only run when you use the SDK to generate data in a Spark workspace, and the assigned generators require the statistics.
You can view a list of jobs that ran on the workspace, and view details for individual jobs.
The Jobs view displays the list of jobs that ran on the workspace. The list includes the 100 most recent jobs.
To display the Jobs view:
On the workspace management view, in the workspace navigation bar, click Jobs.
On the Workspaces view, from the dropdown menu in the Name column, select Jobs.
For each job, the job list includes the following information:
Job ID - The identifier of the job. To copy the job ID, click the icon at the left of the row.
Type - The type of job.
Submitted - The date and time when the job was submitted.
Completed - The date and time when the job finished running.
Status - The current status of the job, and how long ago the job reached that status. When you hover over the status, a tooltip displays the actual timestamp for the status change and the length of time that the job ran. For queued jobs, to display a panel with information about why the job is queued, click the status value.
A job can have one of the following statuses:
Queued - The job is queued to run, but has not yet started. A job is queued for one of the following reasons:
Another job is currently running on the same workspace. For example, you cannot run a sensitivity scan and a data generation, or multiple data generations, at the same time on the same workspace. This is true regardless of the number of workers on the instance. On Structural Cloud, there is also a limit on the number of concurrent running jobs for each organization. When that maximum is reached, a new job remains queued until a current running job completes.
There isn't an available worker on the instance to run the job. A Structural instance with one worker can only run one job at a time. If a job from one workspace is currently running, a job from another workspace cannot start until the first job is finished.
To view information about why a job is queued, click the status value.
Running - The job is in progress.
Canceled - The job was canceled.
Completed - The job completed successfully.
Failed - The job failed to complete.
Each of these statuses has a corresponding "with warnings" status, such as Running with warnings or Completed with warnings. A "with warnings" status indicates that the job had produced at least one warning as of that point.
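These status values are also what you would key on if you track jobs outside of the Jobs view, for example from a script that polls for completion. The following is a minimal sketch of such a loop; the endpoint path, authorization header, and response field are illustrative assumptions, not the documented Structural API:

```python
import time
import requests  # pip install requests

# Terminal statuses, per the list above: Canceled, Completed, and Failed,
# each with a possible "with warnings" variant.
TERMINAL = {base + suffix
            for base in ("Canceled", "Completed", "Failed")
            for suffix in ("", " with warnings")}

def wait_for_job(base_url: str, api_key: str, job_id: str,
                 poll_seconds: float = 30) -> str:
    """Poll a job until it reaches a terminal status."""
    while True:
        resp = requests.get(
            f"{base_url}/api/jobs/{job_id}",                  # hypothetical endpoint
            headers={"Authorization": f"Apikey {api_key}"},   # hypothetical auth scheme
        )
        resp.raise_for_status()
        status = resp.json()["status"]                        # hypothetical field name
        if status in TERMINAL:
            return status
        time.sleep(poll_seconds)
```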
You can filter the list by either the type or the status.
To filter the list by the job type:
Click the filter icon in the Type column heading. By default, all types are included, and none of the checkboxes are checked.
To only include specific types of jobs, check the checkbox next to each type to include. Checking all of the checkboxes has the same effect as unchecking all of the checkboxes.
To filter the list by the job status:
Click the filter icon in the Status column heading. The status panel displays all of the statuses that are currently in the list. For example, if there are no Queued jobs, then the Queued status is not in the list. By default, all of the statuses are included, and none of the checkboxes are checked.
To only include jobs that have specific statuses, check the checkbox next to each status to include. Checking all of the checkboxes has the same effect as unchecking all of the checkboxes.
You can sort the jobs by either the submission or completion timestamp.
To sort by submission date, click the Submitted column heading. To reverse the sort order, click the heading again.
To sort by completion date, click the Completed column heading. To reverse the sort order, click the heading again.
For jobs other than Queued jobs, you can display details about the workspace and the job progress.
From the Jobs view, to display the details for a job, click the job row.
The left side of the job details view contains the workspace information.
For a sensitivity scan, the workspace information is limited to the owner, database type, and worker version.
For a data generation job, the workspace information also includes:
Whether subsetting, post-job scripts, or webhooks are used.
The number of schemas, tables, and columns in the source database.
The number of schemas, tables, and columns in the destination database.
The Job Log tab shows the start date, start time, and duration of the job, followed by the list of job process steps.
For data generation jobs, the Privacy Report tab displays the number of at-risk, protected, and not sensitive columns in the source database.
At-risk columns contain sensitive data, but still have Passthrough as the assigned generator.
Protected columns have an assigned generator other than Passthrough.
Not sensitive columns have Passthrough as the assigned generator, but do not contain sensitive data.
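These three categories follow from two facts about a column: whether it contains sensitive data, and whether a generator other than Passthrough is assigned. A small sketch of that classification logic (the function and its inputs are illustrative, not part of Structural):

```python
def privacy_category(is_sensitive: bool, generator: str) -> str:
    """Categorize a column the way the Privacy Report tab does."""
    if generator != "Passthrough":
        return "Protected"       # any non-Passthrough generator protects the column
    if is_sensitive:
        return "At risk"         # sensitive data is still passed through unchanged
    return "Not sensitive"       # passed through, but nothing sensitive to protect

assert privacy_category(True, "Passthrough") == "At risk"
assert privacy_category(True, "Name") == "Protected"
assert privacy_category(False, "Passthrough") == "Not sensitive"
```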
A workspace can write output to a Tonic Ephemeral snapshot, with an option to preserve the temporary Ephemeral database that is used to create the snapshot.
For data generation jobs that write to Ephemeral, the Data available in Tonic Ephemeral panel displays. It contains a link to Ephemeral, and access to either the snapshot or the database.
When the temporary database is not preserved, the Data available in Tonic Ephemeral panel provides access to the snapshot.
To navigate to Ephemeral and view the details for an Ephemeral snapshot, click View Snapshot in Tonic Ephemeral.
When the temporary database is preserved, the Data available in Tonic Ephemeral panel provides access to the database.
To display the connection details for the Ephemeral database, click View connection info.
For an Ephemeral database, the connection details include:
The database location and credentials. Each field contains a copy icon to allow you to copy the value.
SSH tunnel information, including instructions on how to create an SSH tunnel from your local machine to the Ephemeral database.
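Those connection details map onto a standard SSH local port forward. As a sketch, the following shells out to the ssh client to forward a local port to the Ephemeral database; every value here is a placeholder, so substitute the location, credentials, and tunnel details from the panel:

```python
import os
import subprocess

# Placeholder values: substitute the real ones from the connection details panel.
SSH_HOST = "tunnel.example.com"       # SSH tunnel host shown in the panel
SSH_USER = "ephemeral"                # SSH user shown in the panel
SSH_KEY = os.path.expanduser("~/.ssh/ephemeral_key")
DB_HOST = "database.internal"         # Ephemeral database host
DB_PORT = 5432                        # Ephemeral database port
LOCAL_PORT = 15432                    # any free port on your machine

# Forward localhost:15432 to the Ephemeral database through the SSH host.
# Connect your database client to localhost:15432 while this runs.
subprocess.run([
    "ssh", "-N",
    "-i", SSH_KEY,
    "-L", f"{LOCAL_PORT}:{DB_HOST}:{DB_PORT}",
    f"{SSH_USER}@{SSH_HOST}",
])
```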
The job ID is a unique identifier for the job. To copy the job ID, either:
From the Jobs view, click the copy icon in the leftmost column.
From the job details view, click the copy icon next to the job ID.
You can cancel Queued or Running jobs.
For jobs with those statuses, the rightmost column in the job list contains a cancel icon.
To cancel the job, click the icon.
For workspaces that are configured to write destination data to container artifacts, the Jobs view also provides access to those artifacts. For more information, go to Viewing and downloading container artifacts.
Required workspace permission: Download job logs
To download diagnostic logs, you must have the Enable diagnostic logging global permission.
For all jobs, the job logs provide detailed information about the job processing. Tonic.ai support might request the job logs to help diagnose issues.
For a failed data generation to Ephemeral, the job logs include the Ephemeral logs and the destination database pod logs.
For upsert jobs where the migration process is enabled and you configured the GET Schema Change Logs endpoint, the upsert job logs include the migration process logs.
You can download the job logs from the Jobs view or the job details view. The download includes up to 1MB of log entries.
On the Jobs view, to download the logs for a job, click the download icon in the rightmost column.
On the job details view, to download the logs for a job, click Reports and Logs, then select Job Logs.
By default, Structural redacts sensitive values from the job logs. To help support troubleshooting, you can configure data connectors or an individual data generation job to create unredacted versions of the log files, referred to as diagnostic logs. For more information, go to Redacted and diagnostic (unredacted) logs.
To access diagnostic log files, you must have the Enable diagnostic logging global permission. If you do not have that permission, then you cannot download the logs for a job that produced diagnostic logs. The download option is disabled.
Required workspace permission: View and download Privacy Report
From the job details view, you can download a Privacy Report file that provides an overview of the current protection status of the database columns based on the workspace configuration at the time that the job ran.
You can download either:
The Privacy Report .csv file, which provides details about the table columns, the column content, and the current protection configuration.
The Privacy Report PDF file, which provides charts that summarize the privacy ranking scores for the table columns. It also includes the table from the .csv file.
To display the download options, click Reports and Logs. In the menu:
To download the Privacy Report .csv file, click Privacy Report CSV.
To download the Privacy Report PDF file, click Privacy Report PDF.
For more information about the Privacy Report files and their content, go to Using the Privacy Report to verify data protection.
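Because the .csv file is plain tabular data, it is straightforward to post-process. For example, the following sketch tallies columns by their protection status; the "Protection Status" column name is an assumption, so check the header row of your own download and adjust:

```python
import csv
from collections import Counter

# Tally columns by protection status from a downloaded Privacy Report .csv.
with open("privacy_report.csv", newline="") as f:
    counts = Counter(row.get("Protection Status", "Unknown")
                     for row in csv.DictReader(f))

for status, count in sorted(counts.items()):
    print(f"{status}: {count} columns")
```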
For a workspace that writes the output to a container repository, the job includes the following additional logs:
Database logs - Logs for the database container that is used as the destination.
Datapacker logs - Logs for creating the OCI artifact and uploading it to an OCI registry.
To download these logs for a data generation job, on the job details view, click Reports and Logs, then select Database Logs or Datapacker Logs.
For workspaces that are connected to Amazon Redshift or Snowflake on AWS databases, the data generation job requires multiple calls to a Lambda function. For these data generation jobs, the CloudWatch logs track the progress of these Lambda function calls and record any errors that they produce.
To download the CloudWatch logs for a data generation job, on the job details view, click Reports and Logs, then select CloudWatch Logs.
The CloudWatch Logs option only displays for Amazon Redshift and Snowflake on AWS data generation jobs.
Required workspace permission: Download SqlLdr Files
For an Oracle data generation, if both of the following are true:
The data generation job ran SQL*Loader (sqlldr).
sqlldr either failed or succeeded with errors.
Then you can download the sqlldr log files. To do so, click Reports and Logs, then select sqlldr Logs.
For a data generation from a file connector workspace that uses local files, you can download the transformed files for that job.
The download is a .zip file that contains the files for a selected file group.
On the job details view, when files are available to download, the Data available for file groups panel displays.
To download the files for a file group:
Click Download Results.
From the list, select the file group. Use the filter field to filter the list by the file group name.
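The download is a standard .zip archive, so any zip tool can unpack it. For example, a short sketch that lists and extracts the transformed files (the archive name is a placeholder for your actual download):

```python
import zipfile

# List and extract the transformed files from a downloaded file group archive.
with zipfile.ZipFile("file_group_results.zip") as archive:
    for name in archive.namelist():
        print(name)
    archive.extractall("transformed_files")
```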
Required workspace permission: Download job logs
For workspaces that use the newer data generation processing, you can configure a data generation job to also generate performance metrics, usually for troubleshooting purposes.
On the job details view, to download the performance metrics for the job, click Reports and Logs, then click Performance Metrics.