Reviewing the training results

Required workspace permission: Configure, train, and export models

The model details view displays the list of training jobs that were run against the model. The job list can include information about the job itself, as well as about the model configuration that was in place when the job was run.

For jobs that are running, queued, or failed, you can view the job details. For queued and running jobs, you can cancel the job.

For completed jobs, you can view a visual summary of the results for a specific job, and compare jobs.

Managing the training jobs list

Configuring the displayed columns

You can configure the columns to include in the jobs list. By default, the jobs list includes:

  • The job identifier

  • The job status

  • The model version that the job ran against. When you change either the query or the column types, Tonic Structural updates the model version. If the updated configuration matches an existing version, the model is assigned that existing version number. Structural assigns new version numbers only to unique configurations. For training jobs that ran before model versioning was introduced, the model version is always 0.

  • When the job was submitted

  • When the job was completed

  • The general model parameter values that were used

For a tabular model, you can also display the tabular-specific parameters. For an event-driven model, you can also display the event-specific parameters.

To manage the displayed columns, click the list icon at the top right of the table. The column list contains the full list of available columns, and indicates whether each column is currently displayed.

To change whether a column is currently displayed, click the column name.
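
The model version rule in the default column list (a changed query or column types gets a new version number, unless the configuration matches an existing version) can be illustrated with a small sketch. This is not how Structural implements versioning. The `ModelVersionRegistry` class, its method name, and the choice to start numbering at 1 are all assumptions made for illustration.

```python
import json


class ModelVersionRegistry:
    """Hypothetical registry that maps each unique configuration to a version number."""

    def __init__(self):
        # Serialized configuration -> version number.
        self._versions = {}

    def version_for(self, query, column_types):
        # Serialize deterministically so that identical configurations compare equal.
        key = json.dumps({"query": query, "column_types": column_types}, sort_keys=True)
        if key not in self._versions:
            # Assumption: numbering starts at 1, because 0 is reserved for
            # jobs that ran before model versioning existed.
            self._versions[key] = len(self._versions) + 1
        return self._versions[key]


registry = ModelVersionRegistry()
v1 = registry.version_for("SELECT * FROM customers", {"tenure": "numeric"})
v2 = registry.version_for("SELECT * FROM customers", {"tenure": "categorical"})
# Reverting the column type reproduces the first configuration,
# so the existing version number is reused.
v3 = registry.version_for("SELECT * FROM customers", {"tenure": "numeric"})
```

In this sketch, `v1` and `v3` are the same version because their configurations are identical, while `v2` receives a new number.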

Filtering the list

You can use the job status to filter the list. To filter the list:

  1. In the Training Status column heading, click the filter icon.

  2. On the filter panel, check the checkbox for each status to include.

As you check and uncheck the checkboxes, Structural updates the list.

Sorting the list

You can use the following columns to sort the list:

  • Status

  • Model version

  • Job submission

  • Job completion

To sort by a column, click the column heading. To reverse the sort order, click the column heading again.

Displaying the visual summary of a job's results

The Model Synthesis Report for a completed model training job provides a visual summary of the training results. It allows you to see how well the values in the generated data correspond to those in the original data. This indicates how realistic the generated data is.

Displaying the Model Synthesis Report

Structural produces a Model Synthesis Report for each completed model training job.

From the model details view, to display the Model Synthesis Report for a previous training job, click the Synthesis Report option for that job. The option is only available for completed jobs.

From the job details view, to display the Model Synthesis Report for the job, click Synthesis Report.

Information in the Model Synthesis Report

The Model Synthesis Report contains the following sets of visualizations.

Categorical

For each categorical column, the Categorical section shows how often each value occurs in both the original data and the generated data.

For example, the possible values for a contract column are Month-to-month, Two year, and One year. In the Categorical section, the visualization for contract would show the number of real and generated rows that have each value.

The closer the value counts match, the more realistic the generated data.
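
As a rough illustration of the comparison that the Categorical section visualizes, the following sketch counts how many original and generated rows hold each value of a column. The function name and the sample data are hypothetical, and the report itself may compute or normalize the counts differently.

```python
from collections import Counter


def value_counts(real, generated):
    """Return {value: (real_count, generated_count)} for one categorical column."""
    real_counts = Counter(real)
    generated_counts = Counter(generated)
    # Include values that appear in only one of the two data sets.
    values = sorted(set(real_counts) | set(generated_counts))
    return {v: (real_counts[v], generated_counts[v]) for v in values}


# Hypothetical sample values for the contract column.
real_contract = ["Month-to-month", "Two year", "One year", "Month-to-month"]
generated_contract = ["Month-to-month", "One year", "One year", "Month-to-month"]

counts = value_counts(real_contract, generated_contract)
for value, (real_n, generated_n) in counts.items():
    print(f"{value}: real={real_n}, generated={generated_n}")
```

In this sample, Month-to-month matches exactly, while Two year never appears in the generated rows, which would show up as a visible gap in the chart.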

Continuous

For each numeric column, the Continuous section shows the distribution of values in the original data and the generated data.

The closer the distributions match, the more realistic the generated data.
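
One way to picture the comparison is to bin both data sets over a shared range and compare the share of rows in each bin. The sketch below does this with plain Python. The bin count, function name, and sample values are assumptions, and the report may use a different method such as a smoothed density.

```python
def shared_histograms(real, generated, bins=4):
    """Bin both value lists over a common range and return the share of rows per bin."""
    low = min(min(real), min(generated))
    high = max(max(real), max(generated))
    # Fall back to a width of 1.0 when every value is identical.
    width = (high - low) / bins or 1.0

    def histogram(values):
        counts = [0] * bins
        for v in values:
            # The maximum value belongs in the last bin, not one past it.
            index = min(int((v - low) / width), bins - 1)
            counts[index] += 1
        return [c / len(values) for c in counts]

    return histogram(real), histogram(generated)


# Hypothetical sample values for a numeric column.
real_values = [10, 20, 30, 40, 50]
generated_values = [12, 22, 28, 41, 50]

real_hist, generated_hist = shared_histograms(real_values, generated_values)
```

Using the same bin edges for both data sets is what makes the two histograms directly comparable.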

Correlations

The Correlations section contains a correlation matrix for the original data and a correlation matrix for the generated data.

Each correlation matrix shows how the values in each numeric column correspond to the values in the other numeric columns. For example, as the tenure for a customer increases, does their bill amount also increase?

The correlation is displayed using a color code that represents a value between -1 and 1. -1 indicates that an increase in one value always corresponds to a decrease in the other value. 0 indicates that there is no correlation between the values. 1 indicates that an increase in one value always corresponds to an increase in the other value.

The blocks that correlate a column to itself always have a correlation of 1.

The more similar the two correlation matrices are, the more realistic the generated data.
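
The following sketch shows how a correlation matrix of this kind could be computed and compared. It assumes the Pearson correlation coefficient, which the source does not confirm. The function names and sample columns are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric lists (result is in [-1, 1])."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance_x = sum((x - mean_x) ** 2 for x in xs)
    variance_y = sum((y - mean_y) ** 2 for y in ys)
    # Note: a column with a single repeated value has zero variance,
    # so this division would fail for it.
    return covariance / math.sqrt(variance_x * variance_y)


def correlation_matrix(columns):
    """Build {column: {column: correlation}} for every pair of numeric columns."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


def max_difference(matrix_a, matrix_b):
    """Largest absolute gap between corresponding cells of two matrices."""
    return max(
        abs(matrix_a[a][b] - matrix_b[a][b]) for a in matrix_a for b in matrix_a[a]
    )


# Hypothetical sample data: bill amount rises with tenure in both data sets.
original = {"tenure": [1, 2, 3, 4, 5], "bill": [20, 42, 58, 81, 100]}
generated = {"tenure": [1, 2, 3, 4, 5], "bill": [25, 38, 61, 79, 99]}

original_matrix = correlation_matrix(original)
generated_matrix = correlation_matrix(generated)
gap = max_difference(original_matrix, generated_matrix)
```

A small `gap` means that the generated data preserves the relationships between columns, which is the same judgment the color-coded matrices support visually.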

Measure of Privacy

The Measure of Privacy section shows how closely each generated record matches the most similar original record. It also plots how closely each original record matches the most similar other original record.

While the overall shape of the data should be similar between the original and generated data, the generated data should not replicate actual records.
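
To make the idea concrete, the sketch below computes, for each generated record, the distance to its closest original record, and for each original record, the distance to its closest other original record. It assumes numeric records and Euclidean distance. The source does not state which similarity measure the report uses, and the function names and sample records are hypothetical.

```python
import math


def nearest_distance(record, candidates):
    """Distance from one record to the most similar record in candidates."""
    return min(math.dist(record, c) for c in candidates)


def distances_to_closest(original, generated):
    # Generated records are compared against every original record.
    generated_to_original = [nearest_distance(g, original) for g in generated]
    # Original records are compared against every *other* original record.
    original_to_original = [
        nearest_distance(o, original[:i] + original[i + 1:])
        for i, o in enumerate(original)
    ]
    return generated_to_original, original_to_original


# Hypothetical two-column numeric records.
original_records = [(0, 0), (3, 4), (6, 8)]
generated_records = [(0, 0), (3, 0)]

gen_distances, orig_distances = distances_to_closest(original_records, generated_records)
```

In this sample, the first generated record has a distance of 0 because it replicates an original record exactly, which is the situation that the generated data should avoid.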

Comparing jobs

To compare the model configuration and results for multiple jobs:

  1. Check the checkbox for each job to include in the comparison.

  2. Click Compare Jobs.

The comparison page displays a panel for each job.

At the top of the panel are the job start and end times.

Below that are tabs that summarize the results and contain the configuration that was in place when the training job ran:

  • Parameters shows the model version and the model parameter values

  • Schema contains the data schema

  • Query contains the query used to produce the model data

From the actions menu at the top right of the panel, you can:

  • Display the job details

  • View the Model Synthesis Report for the job
