Language support in Textual

Tonic Textual supports languages in addition to English. Textual automatically detects the language and applies the correct model.

On self-hosted instances, you configure whether to support multiple languages, and can optionally provide auxiliary language models.

Supported languages

Textual can detect values in the following languages:

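| Name | Code |
| --- | --- |
| Afrikaans | af |
| Albanian | sq |
| Amharic | am |
| Arabic | ar |
| Armenian | hy |
| Assamese | as |
| Azerbaijani | az |
| Basque | eu |
| Belarusian | be |
| Bengali | bn |
| Bengali Romanized | |
| Bosnian | bs |
| Breton | br |
| Bulgarian | bg |
| Burmese | my |
| Burmese (alternative) | |
| Catalan | ca |
| Chinese (Simplified) | zh |
| Chinese (Traditional) | zh |
| Croatian | hr |
| Czech | cs |
| Danish | da |
| Dutch | nl |
| English | en |
| Esperanto | eo |
| Estonian | et |
| Filipino | tl |
| Finnish | fi |
| French | fr |
| Galician | gl |
| Irish | ga |
| Georgian | ka |
| German | de |
| Greek | el |
| Gujarati | gu |
| Hausa | ha |
| Hebrew | he |
| Hindi | hi |
| Hindi Romanized | |
| Hungarian | hu |
| Icelandic | is |
| Indonesian | id |
| Italian | it |
| Japanese | ja |
| Javanese | jv |
| Kannada | kn |
| Kazakh | kk |
| Khmer | km |
| Korean | ko |
| Kurdish (Kurmanji) | ku |
| Kyrgyz | ky |
| Lao | lo |
| Latin | la |
| Latvian | lv |
| Lithuanian | lt |
| Macedonian | mk |
| Malagasy | mg |
| Malay | ms |
| Malayalam | ml |
| Marathi | mr |
| Mongolian | mn |
| Nepali | ne |
| Norwegian | no |
| Oriya | or |
| Oromo | om |
| Pashto | ps |
| Persian | fa |
| Polish | pl |
| Portuguese | pt |
| Punjabi | pa |
| Romanian | ro |
| Russian | ru |
| Sanskrit | sa |
| Scottish Gaelic | gd |
| Serbian | sr |
| Sinhala | si |
| Sindhi | sd |
| Slovak | sk |
| Slovenian | sl |
| Somali | so |
| Spanish | es |
| Sundanese | su |
| Swahili | sw |
| Swedish | sv |
| Tamil | ta |
| Tamil Romanized | |
| Telugu | te |
| Telugu Romanized | |
| Thai | th |
| Turkish | tr |
| Ukrainian | uk |
| Urdu | ur |
| Urdu Romanized | |
| Uyghur | ug |
| Uzbek | uz |
| Vietnamese | vi |
| Welsh | cy |
| Western Frisian | fy |
| Xhosa | xh |
| Yiddish | yi |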

Self-hosted instances

On a self-hosted instance, you configure whether Textual supports multiple languages.

You can also optionally provide auxiliary language models.

Enabling multi-language support

Multi-language support is controlled by the TEXTUAL_MULTI_LINGUAL environment variable. The setting is used by the machine learning container.
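As an illustration only, the following sketch shows one way to check an environment file before deployment and confirm that TEXTUAL_MULTI_LINGUAL is set to true. The helper names, the sample file contents, and the case-insensitive comparison are assumptions made for this example. Only the variable name and the value true come from this page, and the sketch does not describe how Textual itself reads the variable.

```python
def parse_env_text(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    env: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        env[key.strip()] = value.strip().strip('"').strip("'")
    return env


def multilingual_enabled(env: dict[str, str]) -> bool:
    """Return True only when TEXTUAL_MULTI_LINGUAL is set to true."""
    # Assumption: this check ignores case. The documented value is `true`.
    return env.get("TEXTUAL_MULTI_LINGUAL", "").lower() == "true"


# Hypothetical environment file contents for the machine learning container.
SAMPLE_ENV = """\
# Enable detection for languages other than English
TEXTUAL_MULTI_LINGUAL=true
"""

if __name__ == "__main__":
    print(multilingual_enabled(parse_env_text(SAMPLE_ENV)))
```

A check like this can run in a deployment script before the machine learning container starts, so that a missing or misspelled variable is caught early.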

Providing auxiliary language model assets

You can provide additional language model assets for Textual to use.

By default, Textual looks for model assets in the machine learning container, in /usr/bin/textual/language_models. The default Helm and Docker Compose configurations include the volume mount.
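The sketch below is a hedged example of how a deployment check might resolve the model asset directory and confirm that a volume mount covers it. It assumes that an unset or empty TEXTUAL_LANGUAGE_MODEL_DIRECTORY means the default location applies. The function names and the list of mount targets are hypothetical and are not part of Textual.

```python
import os
from pathlib import PurePosixPath

# Default location of model assets in the machine learning container.
DEFAULT_MODEL_DIR = "/usr/bin/textual/language_models"


def resolve_model_dir(env=None) -> PurePosixPath:
    """Return the configured model directory, or the default if none is set."""
    env = os.environ if env is None else env
    # Assumption: an empty value is treated the same as an unset variable.
    value = env.get("TEXTUAL_LANGUAGE_MODEL_DIRECTORY", "").strip()
    return PurePosixPath(value or DEFAULT_MODEL_DIR)


def is_under_mount(model_dir: PurePosixPath, mount_targets) -> bool:
    """Return True if model_dir is at or below one of the mount targets."""
    for target in mount_targets:
        target_path = PurePosixPath(target)
        if model_dir == target_path or target_path in model_dir.parents:
            return True
    return False


if __name__ == "__main__":
    model_dir = resolve_model_dir()
    # Hypothetical container-side mount targets from a deployment config.
    mounts = [DEFAULT_MODEL_DIR]
    print(model_dir, is_under_mount(model_dir, mounts))
```

If the directory is changed and the check reports that no mount covers it, the volume mounts in the Helm or Docker Compose configuration need to be updated to match.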


To enable support for languages other than English, set the environment variable TEXTUAL_MULTI_LINGUAL=true.

To choose a different location for the model assets, set the environment variable TEXTUAL_LANGUAGE_MODEL_DIRECTORY. If you change the location, you must also modify your volume mounts.

For help with installing model assets, contact Tonic.ai support (support@tonic.ai).