Enabling model-based custom entity types
To enable model-based custom entity types on a self-hosted instance, you must configure the connection to the LLM that you want to use.
Tonic Textual supports the following configuration options:
- OpenAI-compatible API - OpenAI, Azure OpenAI, Google Gemini, or any OpenAI-compatible endpoint.
- Amazon Bedrock - Anthropic Claude models accessed through Amazon Bedrock.
Connecting to an OpenAI-compatible API
This section describes how to connect to an LLM provider that exposes an OpenAI-compatible API, including OpenAI, Azure OpenAI, and Google Gemini.
Configuring the connection to the model
To configure the connection, set the following environment variables:
- MODEL_BASED_ENTITIES_LLM_URL - The endpoint URL for the LLM service.
- MODEL_BASED_ENTITIES_LLM_MODEL_NAME - The name of the model to use. For example, gemini-2.5-pro or o3.
- MODEL_BASED_ENTITIES_LLM_API_KEY - The API key to use for authentication.
Recommended models
We recommend the following Google Gemini models:
- gemini-2.5-pro - Best balance of quality and cost.
- gemini-2.5-flash - Faster and cheaper option.
Example configurations
Here are example configurations for OpenAI-compatible APIs:
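For example, connections to OpenAI and to Google Gemini through its OpenAI-compatibility endpoint might look like the following sketch. The base URLs are the providers' documented endpoints; the model names and key values are illustrative placeholders:

```shell
# OpenAI (illustrative values; replace the key with your own).
export MODEL_BASED_ENTITIES_LLM_URL="https://api.openai.com/v1"
export MODEL_BASED_ENTITIES_LLM_MODEL_NAME="o3"
export MODEL_BASED_ENTITIES_LLM_API_KEY="<your-openai-api-key>"

# Google Gemini via its OpenAI-compatible endpoint (illustrative values).
export MODEL_BASED_ENTITIES_LLM_URL="https://generativelanguage.googleapis.com/v1beta/openai/"
export MODEL_BASED_ENTITIES_LLM_MODEL_NAME="gemini-2.5-pro"
export MODEL_BASED_ENTITIES_LLM_API_KEY="<your-gemini-api-key>"
```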
Amazon Bedrock (Anthropic Claude)
This section describes how to connect to Anthropic Claude models through Amazon Bedrock. This option uses an extended thinking prompt that is optimized for Claude's reasoning capabilities.
Enabling Amazon Bedrock and selecting the model
Configure the following environment variables:
- MODEL_BASED_ENTITIES_LLM_USE_AWS_BEDROCK - To enable Amazon Bedrock, set this to true.
- MODEL_BASED_ENTITIES_LLM_MODEL_NAME - The identifier of the Amazon Bedrock model. For example, anthropic.claude-opus-4-5-20251101-v1:0.
Recommended Anthropic models
To enable cross-Region inference, we strongly recommend using the global. prefix for model identifiers. For more information, go to Claude on Amazon Bedrock.
- global.anthropic.claude-sonnet-4-5-20250929-v1:0 - Recommended. Best balance of quality and cost.
- global.anthropic.claude-haiku-4-5-20251001-v1:0 - Cheapest and fastest option.
- global.anthropic.claude-opus-4-5-20251101-v1:0 - Highest quality, slowest, and most expensive.
To list all of the available Anthropic models in your AWS Region:
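One way to list them is with the AWS CLI. This sketch assumes the AWS CLI is installed and has credentials for the Region; replace us-east-1 with your Region:

```shell
# List the identifiers of all Anthropic foundation models in the Region.
aws bedrock list-foundation-models \
  --region us-east-1 \
  --by-provider anthropic \
  --query "modelSummaries[].modelId" \
  --output table
```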
Note that some models might require that you request access through the Amazon Bedrock console before they can be used.
Setting the AWS Region
To configure the AWS Region, use one of the following:
- MODEL_BASED_ENTITIES_LLM_AWS_BEDROCK_REGION - The specific AWS Region to use for Amazon Bedrock calls. This is the recommended configuration.
- AWS_DEFAULT_REGION - If you do not specify a Region for Amazon Bedrock, then this Region is used.
Limiting the response output tokens
For Amazon Bedrock, you can optionally also limit the response tokens:
- MODEL_BASED_ENTITIES_LLM_AWS_BEDROCK_MAX_TOKENS - The maximum number of output tokens for Amazon Bedrock responses. The default value is 64000.
Providing AWS credentials
The Amazon Bedrock client uses the standard AWS SDK credential chain. To configure the credentials, use one of these methods:
- Environment variables: AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY
- Shared credentials file: ~/.aws/credentials
- IAM role: When running on AWS infrastructure (Amazon EC2, Amazon ECS, or Amazon EKS)
The credentials must have permission to invoke the bedrock:InvokeModel action for the specified model.
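As an illustration, a minimal identity-based IAM policy that grants this permission might look like the following sketch. The wildcard Resource is an assumption for brevity; in practice, scope it to the ARNs of the models and inference profiles you actually use:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "bedrock:InvokeModel",
      "Resource": "*"
    }
  ]
}
```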
Note that the backend reads from ~/.aws/credentials. If a [default] profile is configured, it takes precedence over environment variables. For more information about the credential resolution order, go to Credential and profile resolution - AWS SDK for .NET (V3).
Example configuration
Here is an example configuration for Amazon Bedrock:
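The following sketch shows illustrative values; substitute your own Region, model identifier, and credentials:

```shell
# Enable Amazon Bedrock and select a cross-Region Claude model.
export MODEL_BASED_ENTITIES_LLM_USE_AWS_BEDROCK="true"
export MODEL_BASED_ENTITIES_LLM_MODEL_NAME="global.anthropic.claude-sonnet-4-5-20250929-v1:0"
export MODEL_BASED_ENTITIES_LLM_AWS_BEDROCK_REGION="us-east-1"

# Static credentials shown for illustration only; an IAM role or a
# shared credentials file works as well (see Providing AWS credentials).
export AWS_ACCESS_KEY_ID="<access-key-id>"
export AWS_SECRET_ACCESS_KEY="<secret-access-key>"
```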
Common configuration options
The following optional environment variables apply to both configuration options:
- MODEL_BASED_ENTITIES_LLM_TIMEOUT_IN_SECONDS - The request timeout in seconds. The default value is 300.
- MODEL_BASED_ENTITIES_LLM_RETRY_COUNT - The number of times to retry when a request fails. The default value is 3.
- MODEL_BASED_ENTITIES_LLM_SLOW_REQUEST_THRESHOLD_SECONDS - If a request takes longer than this number of seconds, a warning is logged. The default value is 60.
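For example, to set all three options explicitly to their documented default values:

```shell
export MODEL_BASED_ENTITIES_LLM_TIMEOUT_IN_SECONDS="300"
export MODEL_BASED_ENTITIES_LLM_RETRY_COUNT="3"
export MODEL_BASED_ENTITIES_LLM_SLOW_REQUEST_THRESHOLD_SECONDS="60"
```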
Verifying the configuration
After you configure the environment variables, restart the Textual Worker service. Check the worker logs to verify that the connection is successful.
The log messages differ depending on whether you use an OpenAI-compatible LLM or Amazon Bedrock.
If the configuration is missing or invalid:
- The model-based custom entity type feature is disabled.
- A warning is added to the logs.