# Enabling model-based custom entity types

To enable model-based custom entity types on a self-hosted instance, you must configure the connection to the LLM that you want to use.

Our overall recommendation is to use **Gemini 2.5 Flash on Vertex AI**.

Tonic Textual also supports the following configuration options:

* **Amazon Bedrock -** Anthropic Claude models on Amazon Bedrock.
* **OpenAI-compatible API -** Any OpenAI-compatible endpoint. We recommend Claude Opus 4.5.

## Connecting to Gemini 2.5 Flash on Vertex AI

The following provides information on how to connect to Gemini 2.5 Flash on Vertex AI.

### Enabling Vertex AI

To enable Vertex AI, set the environment variable `MODEL_BASED_ENTITIES_LLM_USE_VERTEX_AI_API` to `true`.

### Configuring the connection to the model

To configure the connection, set the following environment variables.

* `MODEL_BASED_ENTITIES_LLM_URL` - The endpoint URL for the LLM service.\
  \
  For Gemini 2.5 Flash, the URL is `https://<region>-aiplatform.googleapis.com`. For example, `https://us-central1-aiplatform.googleapis.com`.
* `MODEL_BASED_ENTITIES_LLM_MODEL_NAME` - The name of the model to use.\
  \
  For Gemini 2.5 Flash, the name is `gemini-2.5-flash`.
* `MODEL_BASED_ENTITIES_LLM_API_KEY` - The API key to use for authentication. For more information about obtaining an API key for Gemini with Vertex AI, go to [this information in the Google documentation](https://docs.cloud.google.com/vertex-ai/generative-ai/docs/start/api-keys?usertype=standard).

### Example configuration

Here is an example configuration for Gemini 2.5 Flash on Vertex AI.

{% code overflow="wrap" %}

```
MODEL_BASED_ENTITIES_LLM_USE_VERTEX_AI_API=true

MODEL_BASED_ENTITIES_LLM_URL=https://us-central1-aiplatform.googleapis.com
MODEL_BASED_ENTITIES_LLM_MODEL_NAME=gemini-2.5-flash
MODEL_BASED_ENTITIES_LLM_API_KEY=your-gemini-flash-key
```

{% endcode %}
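To see how the URL and model name fit together, here is an illustrative sketch of how a regional Vertex AI `generateContent` request path can be assembled from these variables. This follows the public Vertex AI REST pattern for publisher models; the exact path that Textual constructs internally is not documented here and may differ (for example, it may include project and location segments).

```python
def vertex_generate_content_url(env: dict) -> str:
    """Build a Vertex AI generateContent URL from the configured values.

    Illustrative only: assumes the publisher-model REST path shape.
    """
    base = env["MODEL_BASED_ENTITIES_LLM_URL"].rstrip("/")
    model = env["MODEL_BASED_ENTITIES_LLM_MODEL_NAME"]
    return f"{base}/v1/publishers/google/models/{model}:generateContent"

url = vertex_generate_content_url({
    "MODEL_BASED_ENTITIES_LLM_URL": "https://us-central1-aiplatform.googleapis.com",
    "MODEL_BASED_ENTITIES_LLM_MODEL_NAME": "gemini-2.5-flash",
})
print(url)
```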

## Connecting to Anthropic Claude on Amazon Bedrock

The following provides information on how to connect to Anthropic Claude models on Amazon Bedrock. This option uses an extended thinking prompt that is optimized for Claude's reasoning capabilities.

### Enabling Amazon Bedrock and selecting the model

Configure the following environment variables:

* `MODEL_BASED_ENTITIES_LLM_USE_AWS_BEDROCK` - To enable Amazon Bedrock, set this to `true`.
* `MODEL_BASED_ENTITIES_LLM_MODEL_NAME` - The identifier of the Amazon Bedrock model. For example, `anthropic.claude-opus-4-5-20251101-v1:0`.

### Recommended Anthropic models

To enable cross-Region inference, we strongly recommend using the `global.` prefix in model identifiers. For more information, go to [Claude on Amazon Bedrock](https://docs.anthropic.com/en/api/claude-on-amazon-bedrock).

* `global.anthropic.claude-opus-4-5-20251101-v1:0` - Recommended. Highest quality, although slowest and most expensive.
* `global.anthropic.claude-sonnet-4-5-20250929-v1:0` - Cheaper. Good but inconsistent quality.
* `global.anthropic.claude-haiku-4-5-20251001-v1:0` - Cheapest and fastest option.

To list all of the available Anthropic models in your AWS Region:

```
aws bedrock list-foundation-models \
  --by-provider Anthropic \
  --region us-west-2 \
  --query "modelSummaries[*].modelId"
```

Note that some models might require that you request access through the Amazon Bedrock console before they can be used.
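A Bedrock model identifier encodes several fields. The hypothetical helper below (not part of Textual) splits out the cross-Region inference prefix, the provider, and the model portion, which can be useful when checking which variant a configuration points at:

```python
def parse_bedrock_model_id(model_id: str) -> dict:
    """Split a Bedrock model ID into optional geo prefix, provider, and model."""
    parts = model_id.split(".")
    # Cross-Region inference profiles start with a geo prefix such as "global" or "us".
    prefix = parts[0] if parts[0] in ("global", "us", "eu", "apac") else None
    rest = parts[1:] if prefix else parts
    return {"prefix": prefix, "provider": rest[0], "model": ".".join(rest[1:])}

info = parse_bedrock_model_id("global.anthropic.claude-opus-4-5-20251101-v1:0")
print(info)  # {'prefix': 'global', 'provider': 'anthropic', 'model': 'claude-opus-4-5-20251101-v1:0'}
```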

### Setting the AWS Region

To configure the AWS Region, use one of the following:

* `MODEL_BASED_ENTITIES_LLM_AWS_BEDROCK_REGION` - The specific AWS Region to use for Amazon Bedrock calls. This is the recommended configuration.
* `AWS_DEFAULT_REGION` - If you do not specify a Region for Amazon Bedrock, then this Region is used.

### Limiting the response output tokens

For Amazon Bedrock, you can optionally also limit the response tokens:

* `MODEL_BASED_ENTITIES_LLM_AWS_BEDROCK_MAX_TOKENS` - The maximum number of output tokens for Amazon Bedrock responses. The default value is 64000.

### Providing AWS credentials

The Amazon Bedrock client uses the standard AWS SDK credential chain. To configure the credentials, use one of these methods:

* **Environment variables:**
  * `AWS_ACCESS_KEY_ID`
  * `AWS_SECRET_ACCESS_KEY`
* **Shared credentials file:** `~/.aws/credentials`
* **IAM role:** When running on AWS infrastructure (Amazon EC2, Amazon ECS, Amazon EKS)

The credentials must have permission to invoke the `bedrock:InvokeModel` action for the specified model.

Note that the backend reads from `~/.aws/credentials`. If a `[default]` profile is configured, it takes precedence over environment variables. For more information about the credential resolution order, go to [Credential and profile resolution - AWS SDK for .NET (V3)](https://docs.aws.amazon.com/sdk-for-net/v3/developer-guide/creds-assign.html).
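The note above describes a precedence that can be surprising, so here is a toy sketch of it. This is not the SDK's actual resolver, just an illustration of the documented order: a `[default]` profile in the shared credentials file wins over environment variables, and an instance role is the fallback.

```python
def resolve_credentials(env: dict, profiles: dict) -> str:
    """Toy illustration of the documented resolution order (not the real SDK logic)."""
    if "default" in profiles:
        return "shared credentials file ([default] profile)"
    if "AWS_ACCESS_KEY_ID" in env and "AWS_SECRET_ACCESS_KEY" in env:
        return "environment variables"
    return "instance role (EC2/ECS/EKS metadata)"

# Even with env vars set, a [default] profile takes precedence.
source = resolve_credentials(
    env={"AWS_ACCESS_KEY_ID": "AKIA...", "AWS_SECRET_ACCESS_KEY": "..."},
    profiles={"default": {"aws_access_key_id": "AKIA..."}},
)
print(source)  # shared credentials file ([default] profile)
```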

### Example configuration

Here is an example configuration for Amazon Bedrock:

```
# Enable Amazon Bedrock
MODEL_BASED_ENTITIES_LLM_USE_AWS_BEDROCK=true

# Model ID - use global prefix for cross-Region inference
MODEL_BASED_ENTITIES_LLM_MODEL_NAME=global.anthropic.claude-sonnet-4-5-20250929-v1:0

# Region
MODEL_BASED_ENTITIES_LLM_AWS_BEDROCK_REGION=us-west-2

# Optional: Adjust max tokens (default is 64000)
MODEL_BASED_ENTITIES_LLM_AWS_BEDROCK_MAX_TOKENS=64000

# AWS credentials (if not using an IAM role or credentials file)
AWS_ACCESS_KEY_ID=AKIA...
AWS_SECRET_ACCESS_KEY=...
```

## Connecting to an OpenAI-compatible API

The following provides information on how to connect to an LLM provider that has an OpenAI-compatible API.

For an OpenAI-compatible API, we recommend Claude Opus 4.5.

### Configuring the connection to the model

To configure the connection, set the following environment variables:

* `MODEL_BASED_ENTITIES_LLM_URL` - The endpoint URL for the LLM service.
* `MODEL_BASED_ENTITIES_LLM_MODEL_NAME` - The name of the model to use. For example, `claude-opus-4-5-20251101`.
* `MODEL_BASED_ENTITIES_LLM_API_KEY` - The API key to use for authentication.

Also make sure that the following settings are either set to `false` or removed from the configuration file:

* `MODEL_BASED_ENTITIES_LLM_USE_VERTEX_AI_API`
* `MODEL_BASED_ENTITIES_LLM_USE_AWS_BEDROCK`
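A hypothetical pre-flight check (not part of Textual) for the requirement above might look like this: it returns `False` if either provider-specific flag is still set to `true`, which would override the OpenAI-compatible path.

```python
import os

def openai_compatible_mode_is_clean() -> bool:
    """True when neither provider-specific flag is set to "true"."""
    for flag in ("MODEL_BASED_ENTITIES_LLM_USE_VERTEX_AI_API",
                 "MODEL_BASED_ENTITIES_LLM_USE_AWS_BEDROCK"):
        if os.environ.get(flag, "false").lower() == "true":
            return False
    return True

os.environ["MODEL_BASED_ENTITIES_LLM_USE_VERTEX_AI_API"] = "false"
os.environ.pop("MODEL_BASED_ENTITIES_LLM_USE_AWS_BEDROCK", None)
print(openai_compatible_mode_is_clean())  # True
```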

### Example configurations

Here is an example configuration for Claude Opus 4.5:

{% code overflow="wrap" %}

```
# Claude Opus 4.5
MODEL_BASED_ENTITIES_LLM_URL=https://api.anthropic.com/v1/
MODEL_BASED_ENTITIES_LLM_MODEL_NAME=claude-opus-4-5-20251101
MODEL_BASED_ENTITIES_LLM_API_KEY=your-anthropic-api-key
```

{% endcode %}

## Common configuration options

The following optional environment variables apply to all of the connection options:

* `MODEL_BASED_ENTITIES_LLM_TIMEOUT_IN_SECONDS` - The request timeout in seconds. The default value is 300.
* `MODEL_BASED_ENTITIES_LLM_RETRY_COUNT` - The number of times to retry when a request fails. The default value is 5.
* `MODEL_BASED_ENTITIES_LLM_SLOW_REQUEST_THRESHOLD_SECONDS` - If the request exceeds this number of seconds, then log a warning. The default value is 120.
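As an illustration of the defaults above, the following sketch (a hypothetical helper, not Textual code) reads the three shared variables and falls back to the documented default values when they are unset:

```python
import os

def llm_common_settings() -> dict:
    """Read the shared tuning variables, falling back to the documented defaults."""
    def read_int(name: str, default: int) -> int:
        return int(os.environ.get(name, default))

    return {
        "timeout_seconds": read_int("MODEL_BASED_ENTITIES_LLM_TIMEOUT_IN_SECONDS", 300),
        "retry_count": read_int("MODEL_BASED_ENTITIES_LLM_RETRY_COUNT", 5),
        "slow_request_threshold_seconds": read_int(
            "MODEL_BASED_ENTITIES_LLM_SLOW_REQUEST_THRESHOLD_SECONDS", 120),
    }

print(llm_common_settings())
```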

## Verifying the configuration

After you configure the environment variables, restart the Textual Worker service.

The service is initialized the first time that it is used, for example when you create an entity type and add a test or training file.

You can then check the worker logs to verify that the connection is successful.

For Vertex AI:

{% code overflow="wrap" %}

```
Initializing Vertex AI HTTP client with timeout: <timeout>s
Vertex AI HTTP client initialized
```

{% endcode %}

For OpenAI-compatible LLMs:

```
Initializing OpenAI SDK using model: <model-name>
OpenAI SDK client initialized
```

For Amazon Bedrock:

```
Initializing Amazon Bedrock runtime client in region: <region> with timeout: <timeout>s
Amazon Bedrock runtime client initialized
```

If configuration is missing or invalid:

* The model-based custom entity type feature is disabled.
* A warning is added to the logs.
