Viewing workspace jobs and job details
Tonic runs the following types of jobs on a workspace:
- Sensitivity scans, which analyze the source database to identify sensitive data.
- Collection scans, which analyze the source data for a MongoDB workspace to determine the available fields in each collection, the field types, and how prevalent the fields are.
- Data generation and data pipeline generation jobs, which generate the destination database from the source database.
- Upsert data generation jobs, which generate the intermediate database from the source database.
- Upsert jobs, which use data from the intermediate database to add new rows to and update changed rows in the destination database.
- SDK table statistics jobs. These jobs only run when you use the SDK to generate data in a Spark workspace, and the assigned generators require the statistics.
- Model training jobs. These jobs only run on data science mode workspaces. A model training job shows the results of a model being trained. A trained model can be used to generate synthetic data.
You can view a list of jobs that ran on the workspace, and view details for individual jobs.
The Job History page displays the list of jobs that ran on the workspace. The list includes the 100 most recent jobs.
Job History view
To display the Job History view:
- On the workspace management view, in the workspace navigation bar, click Jobs.
- On the Workspaces view, from the dropdown menu in the Name column, select Jobs.
For each job, the job list includes the following information:
- Job ID - The identifier of the job. To copy the job ID, click the icon at the left of the row.
- Type - The type of job.
- Status - The current status of the job, and how long ago the job reached that status. When you hover over the status, a tooltip displays the actual timestamp for the status change, and a summary of how long the job ran. For queued jobs, to display a panel with information about why the job is queued, click the status value.
- Submitted - The date and time when the job was submitted.
- Completed - The date and time when the job finished running.
A job can have one of the following statuses:
- Queued - The job is queued to run, but has not yet started. To view information about why a job is queued, click the status value. A job is queued for one of the following reasons:
  - Another job is currently running on the same workspace. For example, you cannot run a sensitivity scan and a data generation, or multiple data generations, at the same time on the same workspace. This is true regardless of the number of workers on the instance.
  - There isn't an available worker on the instance to run the job. A Tonic instance with one worker can only run one job at a time. If a job from one workspace is currently running, a job from another workspace cannot start until the first job is finished.
- Running - The job is in progress.
- Canceled - The job is canceled.
- Completed - The job completed successfully.
- Failed - The job failed to complete.
Each of these statuses has a corresponding "with warnings" status, for example Running with warnings or Completed with warnings. A "with warnings" status indicates that the job had at least one warning at the time of the request.
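The two queueing rules above can be sketched as a small decision function. This is only an illustrative model, not Tonic's scheduler: the `Job` class, the `queue_reason` function, and the `worker_count` parameter are hypothetical names, and the sketch assumes that each worker runs one job at a time (the text states this only for a one-worker instance).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Job:
    # Hypothetical record used for illustration; not a Tonic data structure.
    job_id: str
    workspace: str
    status: str  # for example "Queued", "Running", "Running with warnings"


def queue_reason(job: Job, all_jobs: List[Job], worker_count: int) -> Optional[str]:
    """Return why `job` would have to wait, or None if it could start."""
    # "Running with warnings" still counts as a job in progress.
    running = [
        j for j in all_jobs
        if j.status.startswith("Running") and j.job_id != job.job_id
    ]
    # Rule 1: one job at a time per workspace, regardless of worker count.
    if any(j.workspace == job.workspace for j in running):
        return "another job is running on the same workspace"
    # Rule 2 (assumption): every running job occupies exactly one worker.
    if len(running) >= worker_count:
        return "no worker is available on the instance"
    return None
```

In this model, adding workers only helps with the second reason. A job that waits on another job in the same workspace stays queued however many workers the instance has.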
You can filter the list by either the type or the status.
To filter the list by the job type:
Job type filter options for the Job History view
1. Click the filter icon in the Type column heading. By default, all types are included, and none of the checkboxes are checked.
2. To only include specific types of jobs, check the checkbox next to each type to include. Checking all of the checkboxes has the same effect as unchecking all of the checkboxes.
To filter the list by the job status:
Job status filter options for the Job History view
1. Click the filter icon in the Status column heading. The status panel displays all of the statuses that are currently in the list. For example, if there are no Queued jobs, then the Queued status is not in the list. By default, all of the statuses are included, and none of the checkboxes are checked.
2. To only include jobs that have specific statuses, check the checkbox next to each status to include. Checking all of the checkboxes has the same effect as unchecking all of the checkboxes.
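The checkbox behavior for both filters (nothing checked means nothing is filtered out) can be modeled as follows. This is a minimal sketch of the described semantics only. The `filter_by` function and the dictionary-based job records are hypothetical and do not correspond to a Tonic API.

```python
def filter_by(jobs, column, checked):
    """Keep jobs whose `column` value is checked.

    An empty `checked` set means that no filter is applied, which is why
    checking every box gives the same list as checking none.
    """
    if not checked:
        return list(jobs)
    return [job for job in jobs if job[column] in checked]


# Hypothetical job records for illustration.
jobs = [
    {"type": "Sensitivity scan", "status": "Completed"},
    {"type": "Data generation", "status": "Failed"},
    {"type": "Data generation", "status": "Completed"},
]

only_failed = filter_by(jobs, "status", {"Failed"})
everything = filter_by(jobs, "type", set())
```

Here `only_failed` contains the single failed data generation job, and `everything` contains all three jobs.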
You can sort the jobs by either the submission or completion timestamp.
To sort by submission date, click the Submitted column heading. To reverse the sort order, click the heading again.
To sort by completion date, click the Completed column heading. To reverse the sort order, click the heading again.
For jobs other than Queued jobs, you can display details about the workspace and the job progress.
From the Job History view, to display the details for a job, click the job row.
Job details page for a data generation job
The left side of the job details view contains the workspace information.
For a sensitivity scan, the workspace information is limited to the owner, database type, and worker version.
For a data generation job, the workspace information also includes:
- Whether subsetting, post-job scripts, or webhooks are used.
- The number of schemas, tables, and columns in the source database.
- The number of schemas, tables, and columns in the destination database.
The Job Log tab shows the start date, start time, and duration of the job, followed by the list of job process steps.
For data generation jobs, the Privacy Report tab displays the number of at-risk, protected, and not sensitive columns in the source database.
Privacy Report tab on the job details page
At-risk columns contain sensitive data, but still have Passthrough as the assigned generator.
Protected columns have an assigned generator other than Passthrough.
Not sensitive columns have Passthrough as the assigned generator, but do not contain sensitive data.
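The three categories follow directly from two facts about each column: whether it contains sensitive data, and whether its assigned generator is Passthrough. The sketch below restates those definitions as code. The `privacy_category` function and the sample column list are hypothetical and are not part of the product. Only the generator name Passthrough comes from the text above.

```python
from collections import Counter


def privacy_category(is_sensitive: bool, generator: str) -> str:
    """Classify a column using the Privacy Report definitions."""
    # Any generator other than Passthrough makes the column Protected.
    if generator != "Passthrough":
        return "Protected"
    # Passthrough columns are At-risk only if they hold sensitive data.
    return "At-risk" if is_sensitive else "Not sensitive"


# Hypothetical columns: (name, contains sensitive data, assigned generator).
# "SomeGenerator" is a placeholder for any non-Passthrough generator.
columns = [
    ("email", True, "Passthrough"),
    ("first_name", True, "SomeGenerator"),
    ("row_version", False, "Passthrough"),
]

summary = Counter(privacy_category(s, g) for _, s, g in columns)
```

With these sample columns, `summary` holds one column in each of the three categories.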
The job identifier is a unique identifier for the job. To copy the job ID, either:
- From the Job History view, click the copy icon in the leftmost column.
- From the job details view, click the copy icon next to the job ID.
You can cancel Queued or Running jobs.
For jobs with those statuses, the rightmost column in the job list contains a cancel icon.
Job list with a Running job that can be canceled
To cancel the job, click the icon.
Required workspace permission: Download job logs
For all jobs, the job logs provide detailed information about the job processing. Tonic support might request the job logs to help diagnose issues.
You can download the job logs from the Job History view or the job details view. The download includes up to 1 MB of log entries.
On the Job History view, to download the logs for a job, click the download icon in the rightmost column.
On the job details view, to download the logs for a job, click Download Job Logs.
Required workspace permission: View and download Privacy Report
For data generation jobs, the Privacy Report summarizes the protection status of the database columns.
From the job details view, to download the Privacy Report, click Download Privacy Report CSV.
For workspaces that are connected to Amazon Redshift or Snowflake on AWS databases, the data generation job requires multiple calls to a Lambda function. For these data generation jobs, the CloudWatch logs track the progress of these Lambda function calls and display any errors from them.
To download the CloudWatch logs for a data generation job, on the job details view, click Download CloudWatch Logs.
The Download CloudWatch Logs button only displays for Amazon Redshift and Snowflake on AWS data generation jobs.
Required workspace permission: Download SqlLdr Files
For an Oracle data generation job, you can download the sqlldr log files when both of the following are true:
- The data generation job ran SQL Loader (sqlldr).
- sqlldr either failed or succeeded with errors.
To download the sqlldr log files, click Download sqlldr Logs.