
# Overview of the process to create a model-based custom entity type

For a model-based custom entity type, the overall process is as follows:

<figure><img src="/files/FvKq28aqPqaNbuvYLU5X" alt=""><figcaption><p>Flow to create a model-based custom entity type</p></figcaption></figure>

## Select and annotate test files <a href="#flow-test-data" id="flow-test-data"></a>

The first step is to identify the entity values that are in a small set of test files. The test files and established values are used both to iterate over the model guidelines and to assess how well your trained models perform.

1. When you create the model-based custom entity type, you [provide an initial description of the entity type](/textual/entity-types/entity-type-custom-model/model-entity-type-new.md). For example, "Scientific names of health conditions". \
   \
   The description is the first version of the model guidelines. The guidelines tell the model how to identify the entity type values.
2. You then [select a small set of short test files that contain entity values](/textual/entity-types/entity-type-custom-model/selecting-and-reviewing-test-data.md). For example, if you typically use Textual to redact values in patient appointment reports, then you might upload a few of those reports to use as test files.\
   \
   Each file should contain no more than 5,000 words.
3. Textual uses your initial guidelines to identify values in the files.
4. You then review and correct the annotations to identify the definitive set of entity values that the test files contain.
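
Before you upload test files, a quick local check can confirm that each file is within the 5,000-word limit. The following is a minimal sketch and is not part of Textual. The helper names `count_words` and `files_over_limit` are hypothetical, and the sketch assumes plain-text UTF-8 files and counts whitespace-separated words, which might differ from how Textual counts words.

```python
from pathlib import Path

MAX_WORDS = 5000  # per-file limit stated in this guide


def count_words(text: str) -> int:
    """Count whitespace-separated tokens (an assumption; Textual's count may differ)."""
    return len(text.split())


def files_over_limit(paths, max_words=MAX_WORDS):
    """Return (path, word_count) pairs for the files that exceed max_words."""
    over = []
    for path in paths:
        words = count_words(Path(path).read_text(encoding="utf-8"))
        if words > max_words:
            over.append((str(path), words))
    return over
```

For example, `files_over_limit(["report1.txt", "report2.txt"])` returns an empty list when both files are within the limit.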

## **Iterate over model guidelines** <a href="#flow-model-guidelines" id="flow-model-guidelines"></a>

After you establish the entity values in your test files, you [iterate over the guidelines for the model](/textual/entity-types/entity-type-custom-model/model-entity-type-guidelines.md).

For each version of the guidelines, Textual uses the guidelines to detect entity values in the test files.

Textual then compares the values that the guidelines version detects against the values that you established when you annotated the test files.

Textual generates scores to identify how well that version of the guidelines performed. If you are not satisfied with the results, you can update the guidelines to create a new version.

Textual automatically generates suggestions to improve the guidelines, based on how well the current guidelines identified the values. For example, it might suggest more specific wording or additional text to describe exceptions.
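
To make the comparison between detected and established values concrete, the following sketch computes exact-match precision, recall, and F1 over two sets of annotated spans. This page does not state which metrics Textual uses, so treat the choice of metrics as an assumption. The `Span` class, the `score` function, and the sample values are hypothetical and for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    """A hypothetical annotated value: the file, character offsets, and text."""
    file: str
    start: int
    end: int
    text: str


def score(detected: set, established: set) -> dict:
    """Exact-match precision, recall, and F1 between two sets of spans."""
    true_pos = len(detected & established)
    precision = true_pos / len(detected) if detected else 0.0
    recall = true_pos / len(established) if established else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"precision": precision, "recall": recall, "f1": f1}


# Values that a reviewer established when annotating the test files.
established = {
    Span("report1.txt", 10, 22, "hypertension"),
    Span("report1.txt", 40, 57, "diabetes mellitus"),
}
# Values that one guidelines version detected: one match, one false positive.
detected = {
    Span("report1.txt", 10, 22, "hypertension"),
    Span("report1.txt", 70, 78, "headache"),
}
result = score(detected, established)
# result == {"precision": 0.5, "recall": 0.5, "f1": 0.5}
```

In this sketch, a missed value lowers recall and an extra detection lowers precision, which is one way to decide whether the guidelines need more specific wording or additional exceptions.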

## **Select training data** <a href="#flow-training-data" id="flow-training-data"></a>

When you have guidelines that you are satisfied with, you [select a larger set of data to use for model training](/textual/entity-types/entity-type-custom-model/selecting-the-training-data-for-your-models.md).

The training data should contain at least 1,000 entity values. Each file should still be relatively small, with no more than 5,000 words.

For example, when you set up a custom entity type to identify health conditions, you might use five or six appointment reports for your test data, but several hundred reports for your training data.

## **Train models** <a href="#flow-model-training" id="flow-model-training"></a>

When you [create a model](/textual/entity-types/entity-type-custom-model/creating-and-training-models-for-a-model-based-type.md#starting-a-new-model), you select the guidelines version to use for it.

The model uses the guidelines to annotate the training data. In other words, it detects entity values in the training files. You [review the annotation results](/textual/entity-types/entity-type-custom-model/creating-and-training-models-for-a-model-based-type.md#reviewing-the-annotations-for-a-model) to determine whether you are satisfied with the detections.

If you are not satisfied, you can:

1. Return to guidelines refinement and edit the guidelines.
2. Create a new guidelines version.
3. Create a model that uses the new version.

If you are satisfied, then you can [start the model training](/textual/entity-types/entity-type-custom-model/creating-and-training-models-for-a-model-based-type.md#training-the-model). Depending on the data, model training can take a long time, sometimes hours or days.

When the model finishes training, it scans and identifies values in the original test data. Each trained model receives a score to identify how well its detections matched the definitive values that you established.

## **Select a model to use** <a href="#flow-select-model" id="flow-select-model"></a>

To make the entity type available, you [select the trained model to use](/textual/entity-types/entity-type-custom-model/entity-type-model-set-active.md).

The custom entity type is then active and can be [enabled or disabled within individual datasets](/textual/entity-types/enabling-entity-types-for-datasets.md).
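
When you have several trained models, it can help to compare their scores side by side before you choose one. The following sketch is illustrative only. The `TrainedModel` record, the `best_model` helper, and the sample scores are hypothetical, and the sketch assumes that a higher score is better. You make the actual selection in Textual.

```python
from typing import NamedTuple


class TrainedModel(NamedTuple):
    """A hypothetical record of a trained model and its score on the test data."""
    name: str
    guidelines_version: int
    score: float  # assumed: higher is better


def best_model(models):
    """Return the model with the highest score, or None if the list is empty."""
    return max(models, key=lambda m: m.score, default=None)


# Placeholder values for illustration only, not real results.
candidates = [
    TrainedModel("model-a", guidelines_version=2, score=0.81),
    TrainedModel("model-b", guidelines_version=3, score=0.88),
]
chosen = best_model(candidates)
```

Here `chosen` is the `model-b` record. A single score might not capture everything that matters for your data, so the highest score is a starting point for the decision, not a rule.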

