Enabling model-based custom entity types
To enable model-based custom entity types on a self-hosted instance, you must configure the connection to the LLM that you want to use.
Our overall recommendation is to use Gemini 2.5 Flash on Vertex AI.
Tonic Textual also supports the following configuration options:
Amazon Bedrock - Anthropic Claude models on Amazon Bedrock.
OpenAI-compatible API - Any OpenAI-compatible endpoint. We recommend Opus 4.5.
Connecting to Gemini 2.5 Flash on Vertex AI
The following provides information on how to connect to Gemini 2.5 Flash on Vertex AI.
Indicating to use Vertex AI
To indicate that Textual should use Vertex AI, set the environment variable MODEL_BASED_ENTITIES_LLM_USE_VERTEX_AI_API to true.
Configuring the connection to the model
To configure the connection, set the following environment variables.
MODEL_BASED_ENTITIES_LLM_URL - The endpoint URL for the LLM service. For Gemini 2.5 Flash, the URL is https://<region>-aiplatform.googleapis.com. For example, https://us-central1-aiplatform.googleapis.com.
MODEL_BASED_ENTITIES_LLM_MODEL_NAME - The name of the model to use. For Gemini 2.5 Flash, the name is gemini-2.5-flash.
MODEL_BASED_ENTITIES_LLM_API_KEY - The API key to use for authentication. For more information about obtaining an API key for Gemini with Vertex AI, go to this information in the Google documentation.
Example configuration
Here is an example configuration for Gemini 2.5 Flash on Vertex AI.
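A minimal sketch using the environment variables described above. The URL and model name match the documented Vertex AI values; the API key value is a placeholder.

```shell
# Enable the Vertex AI API path
export MODEL_BASED_ENTITIES_LLM_USE_VERTEX_AI_API=true
# Regional Vertex AI endpoint (us-central1 shown as an example)
export MODEL_BASED_ENTITIES_LLM_URL=https://us-central1-aiplatform.googleapis.com
export MODEL_BASED_ENTITIES_LLM_MODEL_NAME=gemini-2.5-flash
# Replace with your actual API key
export MODEL_BASED_ENTITIES_LLM_API_KEY="your-api-key"
```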
Amazon Bedrock (Anthropic Claude)
The following provides information on how to connect to Anthropic Claude models on Amazon Bedrock. This option uses an extended thinking prompt that is optimized for Claude's reasoning capabilities.
Enabling Amazon Bedrock and selecting the model
Configure the following environment variables:
MODEL_BASED_ENTITIES_LLM_USE_AWS_BEDROCK - To enable Amazon Bedrock, set this to true.
MODEL_BASED_ENTITIES_LLM_MODEL_NAME - The identifier of the Amazon Bedrock model. For example, anthropic.claude-opus-4-5-20251101-v1:0.
Recommended Anthropic models
To enable cross-Region inference, we strongly recommend using the global. prefix on model identifiers. For more information, go to Claude on Amazon Bedrock.
global.anthropic.claude-opus-4-5-20251101-v1:0 - Recommended. Highest quality, although slowest and most expensive.
global.anthropic.claude-sonnet-4-5-20250929-v1:0 - Cheaper. Good but inconsistent quality.
global.anthropic.claude-haiku-4-5-20251001-v1:0 - Cheapest and fastest option.
To list all of the available Anthropic models in your AWS Region:
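One way to do this, assuming the AWS CLI is installed and credentials are configured (the Region value is an example):

```shell
# List the Anthropic model identifiers available in the given Region
aws bedrock list-foundation-models \
  --by-provider anthropic \
  --region us-east-1 \
  --query "modelSummaries[].modelId" \
  --output text
```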
Note that some models might require that you request access through the Amazon Bedrock console before they can be used.
Setting the AWS Region
To configure the AWS Region, use one of the following:
MODEL_BASED_ENTITIES_LLM_AWS_BEDROCK_REGION - The specific AWS Region to use for Amazon Bedrock calls. This is the recommended configuration.
AWS_DEFAULT_REGION - If you do not specify a Region for Amazon Bedrock, then this Region is used.
Limiting the response output tokens
For Amazon Bedrock, you can also optionally limit the response tokens:
MODEL_BASED_ENTITIES_LLM_AWS_BEDROCK_MAX_TOKENS - The maximum number of output tokens for Amazon Bedrock responses. The default value is 64000.
Providing AWS credentials
The Amazon Bedrock client uses the standard AWS SDK credential chain. To configure the credentials, use one of these methods:
Environment variables: AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY
Shared credentials file: ~/.aws/credentials
IAM role: When running on AWS infrastructure (Amazon EC2, Amazon ECS, Amazon EKS)
The credentials must have permission to invoke the bedrock:InvokeModel action for the specified model.
Note that the backend reads from ~/.aws/credentials. If a [default] profile is configured, it takes precedence over environment variables. For more information about the credential resolution order, go to Credential and profile resolution - AWS SDK for .NET (V3).
Example configuration
Here is an example configuration for Amazon Bedrock:
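A minimal sketch using the variables described above. The Region value and the credential placeholders are examples; credentials can instead come from the shared credentials file or an IAM role.

```shell
# Enable Amazon Bedrock and select the recommended cross-Region model
export MODEL_BASED_ENTITIES_LLM_USE_AWS_BEDROCK=true
export MODEL_BASED_ENTITIES_LLM_MODEL_NAME=global.anthropic.claude-opus-4-5-20251101-v1:0
# Example Region; replace with the Region where you have model access
export MODEL_BASED_ENTITIES_LLM_AWS_BEDROCK_REGION=us-east-1
# Only needed if you use environment-variable credentials
export AWS_ACCESS_KEY_ID="your-access-key-id"
export AWS_SECRET_ACCESS_KEY="your-secret-access-key"
```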
Connecting to an OpenAI-compatible API
The following provides information on how to connect to an LLM provider that has an OpenAI-compatible API.
For an OpenAI-compatible API, we recommend Claude Opus 4.5.
Configuring the connection to the model
To configure the connection, set the following environment variables:
MODEL_BASED_ENTITIES_LLM_URL - The endpoint URL for the LLM service.
MODEL_BASED_ENTITIES_LLM_MODEL_NAME - The name of the model to use. For example, claude-opus-4-5-20251101.
MODEL_BASED_ENTITIES_LLM_API_KEY - The API key to use for authentication.
Also make sure that the following settings are either set to false or removed from the configuration file:
MODEL_BASED_ENTITIES_LLM_USE_VERTEX_AI_API
MODEL_BASED_ENTITIES_LLM_USE_AWS_BEDROCK
Example configurations
Here is an example configuration for Opus 4.5:
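A minimal sketch using the variables described above. The endpoint URL and API key are placeholders for your provider's values; the model name is the documented example.

```shell
# Make sure the other provider flags are off
export MODEL_BASED_ENTITIES_LLM_USE_VERTEX_AI_API=false
export MODEL_BASED_ENTITIES_LLM_USE_AWS_BEDROCK=false
# Replace with your provider's OpenAI-compatible endpoint
export MODEL_BASED_ENTITIES_LLM_URL="https://your-provider.example.com/v1"
export MODEL_BASED_ENTITIES_LLM_MODEL_NAME=claude-opus-4-5-20251101
export MODEL_BASED_ENTITIES_LLM_API_KEY="your-api-key"
```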
Common configuration options
The following optional environment variables apply to all of the connection options:
MODEL_BASED_ENTITIES_LLM_TIMEOUT_IN_SECONDS - The request timeout in seconds. The default value is 300.
MODEL_BASED_ENTITIES_LLM_RETRY_COUNT - The number of times to retry when a request fails. The default value is 5.
MODEL_BASED_ENTITIES_LLM_SLOW_REQUEST_THRESHOLD_SECONDS - If the request exceeds this number of seconds, then log a warning. The default value is 120.
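As a sketch, setting these variables explicitly to the documented defaults looks like this:

```shell
# Optional tuning values; these match the documented defaults
export MODEL_BASED_ENTITIES_LLM_TIMEOUT_IN_SECONDS=300
export MODEL_BASED_ENTITIES_LLM_RETRY_COUNT=5
export MODEL_BASED_ENTITIES_LLM_SLOW_REQUEST_THRESHOLD_SECONDS=120
```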
Verifying the configuration
After you configure the environment variables, restart the Textual Worker service.
The service is initialized the first time that it is used, for example when you create an entity type and add a test or training file.
You can then check the worker logs to verify that the connection is successful.
For Vertex AI:
For OpenAI-compatible LLMs:
For Amazon Bedrock:
If configuration is missing or invalid:
The model-based custom entity type feature is disabled.
A warning is added to the logs.