Creating and training models for a model-based entity type

After you select your training data, on the Model training page, you create one or more trained models.

For each model, you select the version of the guidelines to use. Textual first uses those guidelines to annotate the training data. Based on how well the guidelines identified the values in the training data, you decide whether to start the model training.

When the training is complete, the model scans the test data. The model is scored based on how well it detected the definitive values that you confirmed in the test data.
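This page does not describe how the score is calculated. As an illustration only, the sketch below assumes that the score compares the values the model detected against the values you confirmed, using precision, recall, and F1. These are common metrics for entity detection, but they are an assumption here, not Textual's documented formula. The function name `benchmark_scores` and the sample values are hypothetical.

```python
def benchmark_scores(confirmed: set[str], detected: set[str]) -> dict[str, float]:
    """Compare detected values against confirmed values.

    ASSUMPTION: precision/recall/F1 are used here only to illustrate the idea
    of scoring a model against confirmed test data. Textual's actual benchmark
    score may be computed differently.
    """
    # Values that the model detected and that were also confirmed.
    true_positives = len(confirmed & detected)

    # Share of detected values that were correct.
    precision = true_positives / len(detected) if detected else 0.0
    # Share of confirmed values that the model found.
    recall = true_positives / len(confirmed) if confirmed else 0.0
    # Harmonic mean of precision and recall.
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0

    return {"precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Hypothetical sample values for a custom entity type.
    confirmed_values = {"ACME-1001", "ACME-1002", "ACME-1003", "ACME-1004"}
    detected_values = {"ACME-1001", "ACME-1002", "ACME-9999"}
    print(benchmark_scores(confirmed_values, detected_values))
```

In this sketch, a model that misses confirmed values has lower recall, and a model that detects values you did not confirm has lower precision.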

Information on the model list

Model training page

For each model, the model list includes:
  • Model - The model name. Models are automatically named Model n, where n is the number of the model. For example, the first model you create is Model 1, the second is Model 2, and so on.

  • Status - The model status. The possible statuses are:

    • Annotating - The model is using the selected guidelines to annotate the training data.

    • Ready for training - The annotation is complete. For models with this status, Textual displays a Review option to allow you to review the annotations.

    • Training - The training is in progress. Textual displays the percentage of training data that the model has trained on.

    • Ready - The model is trained. You can select any trained model as the active model for the entity type.

  • Guideline version - The version of the guidelines used for the model. To view the guidelines text, click the view icon.

  • Benchmark score - A score that indicates how well the model performed when it annotated the test data after training.

  • Detected entities - The number of entity values that the model detected in the training data.

  • # of files - The number of training files that were used for the annotation and model training.

Starting a new model

To start a new model:

  1. Click Create new model.

  2. On the Create new model panel, from the Guideline version dropdown list, select the version of the guidelines to use for the model.

  3. Click Save.

Create new model panel to select the guidelines version for the model

Textual adds the model to the list and uses the selected guidelines version to annotate the training data files.

Reviewing the annotations for a model

Before you train the model, review the annotations to see how well the guidelines identified the entity values in the training data.

To review the annotations, click the model name. Models that are ready to review also display a Review and Train link next to the model name.

On the model details page:

  • On the left is the list of training data files, with the number of entities detected in each file.

  • On the right is the list of the entities in the training files, in descending order by the number of occurrences.

Model details page with the list of detected values

To display the content of a file with the annotations highlighted, click the file name.

Model details page with the content of an annotated training file

If you are not satisfied with the annotation results, you can return to the guidelines refinement. To do this:

  1. In the model list, in the Guideline version column, click the view icon.

  2. On the guidelines panel, click Go to guidelines refinement.

Guidelines panel for a model, with the option to return to the guidelines refinement

For a model that is not trained yet, the model details page also displays a Modify guidelines option.

Textual displays the Guidelines Refinement page with that guidelines version selected. You can then edit the guidelines to create a new version, and then create a new model that uses the new version.

Training the model

If you are satisfied with the annotation results, then on the model details page, to start the training, click Train model.

Train model option for a model

Downloading a data package for a model

To help troubleshoot issues with a trained model, you can download a model data package to send to Tonic.ai.

The data package is a .zip file that contains the following:

  • General information about the custom entity type and model, including the entity type name, the entity type identifier, and the model identifier.

  • The set of test files, including the established entity values that you identified.

  • The set of training files, including the entity values that the model identified.

To download the data package, either:

  • On the Model training page, click the download icon for the model.

  • On the model details page, click Download Training Data.

Download Training Data option on the model details for a trained model
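Before you send the package, you might want to check what it contains. The following is a minimal sketch that lists the files in a downloaded .zip file, using only the Python standard library. The file name `model-data-package.zip` and the function name `summarize_package` are hypothetical. The actual file name and internal layout of the package are not described on this page.

```python
import zipfile
from pathlib import Path


def summarize_package(package_path: str) -> list[tuple[str, int]]:
    """Return (file name, uncompressed size in bytes) for each file in the .zip."""
    with zipfile.ZipFile(package_path) as package:
        return [
            (info.filename, info.file_size)
            for info in package.infolist()
            if not info.is_dir()  # Skip directory entries.
        ]


if __name__ == "__main__":
    # Hypothetical path. Replace it with the location of your downloaded package.
    path = Path("model-data-package.zip")
    if path.exists():
        for name, size in summarize_package(str(path)):
            print(f"{size:>10}  {name}")
    else:
        print(f"Package not found: {path}")
```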
