Creating and training models for a model-based entity type

After you select your training data, you create and train one or more models on the Model Training page.

For each model, you select the version of the guidelines to use. Textual first uses those guidelines to annotate the training data. Based on how well the guidelines identified the values in the training data, you decide whether to start the model training.

When the training is complete, the model scans the test data. The model is scored based on how well it detected the definitive values that you confirmed in the test data.
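The exact scoring metric is not specified here. Purely as an illustration, the sketch below assumes an F1-style score that compares the values a model detected against the values that you confirmed in the test data. The function name `benchmark_score` and the sample values are hypothetical and are not part of Textual.

```python
from __future__ import annotations


def benchmark_score(confirmed: set[str], detected: set[str]) -> float:
    """Hypothetical F1-style score; the metric Textual actually uses may differ."""
    if not confirmed and not detected:
        return 1.0  # nothing to find and nothing reported
    true_positives = len(confirmed & detected)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(detected)  # share of detections that were correct
    recall = true_positives / len(confirmed)    # share of confirmed values that were found
    return 2 * precision * recall / (precision + recall)


# Made-up sample values: two of three confirmed values found, one false detection.
confirmed = {"ACME-1001", "ACME-1002", "ACME-1003"}
detected = {"ACME-1001", "ACME-1002", "ACME-9999"}
score = benchmark_score(confirmed, detected)
print(f"{score:.2f}")  # prints 0.67
```

Under this assumption, a model is penalized both for confirmed values that it misses and for values that it detects incorrectly.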

Information on the model list

For each model, the model list includes:

Model - The model name. Models are automatically named Model n, where n is the number of the model. For example, the first model you create is Model 1, the second is Model 2, and so on.
Status - The model status. The possible statuses are:
- Annotating - The model is using the selected guidelines to annotate the training data.
- Ready for training - The annotation is complete. For models with this status, Textual displays a Review option to allow you to review the annotations.
- Training - The training is in progress. Textual displays the percentage of training data that the model has trained on.
- Ready - The model is trained. You can select any trained model as the active model for the entity type.
Guideline version - The version of the guidelines used for the model. To view the guidelines text, click the view icon.
Benchmark score - A score that indicates how well the model performed when it annotated the test data after training.
Detected entities - The number of entity values that the model detected in the training data.
# of files - The number of training files that were used for the annotation and model training.
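The statuses above suggest a simple linear lifecycle. The following sketch models it for illustration only. The `ModelStatus` enum, the `NEXT_STATUS` table, and the helper functions are hypothetical names, and the strictly linear order is an assumption inferred from the status descriptions.

```python
from enum import Enum


class ModelStatus(Enum):
    ANNOTATING = "Annotating"
    READY_FOR_TRAINING = "Ready for training"
    TRAINING = "Training"
    READY = "Ready"


# Assumed progression: annotate, wait for review, train, then ready.
NEXT_STATUS = {
    ModelStatus.ANNOTATING: ModelStatus.READY_FOR_TRAINING,
    ModelStatus.READY_FOR_TRAINING: ModelStatus.TRAINING,
    ModelStatus.TRAINING: ModelStatus.READY,
}


def can_review(status: ModelStatus) -> bool:
    """The Review option is described for models that are ready for training."""
    return status is ModelStatus.READY_FOR_TRAINING


def can_activate(status: ModelStatus) -> bool:
    """Only a trained model can be selected as the active model."""
    return status is ModelStatus.READY


status = ModelStatus.ANNOTATING
while status in NEXT_STATUS:
    print(f"{status.value} -> {NEXT_STATUS[status].value}")
    status = NEXT_STATUS[status]
```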

Starting a new model

To start a new model:

Click Create new model.

On the Create new model panel, from the Guideline version dropdown list, select the version of the guidelines to use for the model.
Click Save.

Textual adds the model to the list and uses the selected guidelines version to annotate the training data files.

Reviewing the annotations for a model

Before you train the model, review the annotations to see how well the selected guidelines identified the entity values in the training data.

To review the annotations, click the model name. Models that are ready to review also display a Review and Train link next to the model name.

On the model details page:

On the left is the list of training data files, with the number of entities detected in each file.
On the right is the list of the entities in the training files, in descending order by the number of occurrences.

To display the content of a file with the annotations highlighted, click the file name.

After you review the annotations, if you are not satisfied with the results, to return to the guidelines refinement:

In the model list, in the Guideline version column, click the view icon.
On the guidelines panel, click Go to guidelines refinement.

For a model that is not yet trained, the model details page also displays a Modify guidelines option.

Textual displays the Guidelines Refinement page with that guidelines version selected. You can then edit the guidelines to create a new version, and create a new model that uses the new version.

Training the model

If you are satisfied with the annotation results, click Train model on the model details page to start the training.

Downloading a data package for a model

To help troubleshoot issues with a trained model, you can download a model data package to send to Tonic.ai.

The data package is a .zip file that contains the following:

General information about the custom entity type and model. Includes the entity type name, the entity type identifier, and the model identifier.
The set of test files, including the established entity values that you identified.
The set of training files, including the entity values that the model identified.

To download the data package, either:

On the Model Training page, click the download icon for the model.
On the model details page, click Download Training Data.
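Before you send the package, you might want to check what it contains. Because the package is a standard .zip file, the Python standard library can list its contents. This is a minimal sketch: the file name `model-data-package.zip` is a placeholder, and the internal layout of the package is not documented here, so the code only prints whatever file names it finds.

```python
from __future__ import annotations

import zipfile
from pathlib import Path


def list_package_contents(package_path: str) -> list[str]:
    """Return the sorted names of the files stored in a .zip data package."""
    with zipfile.ZipFile(package_path) as package:
        return sorted(
            info.filename for info in package.infolist() if not info.is_dir()
        )


# Placeholder path; replace it with the location of the downloaded package.
package_file = Path("model-data-package.zip")
if package_file.exists():
    for name in list_package_contents(str(package_file)):
        print(name)
```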

Last updated 1 month ago
