Language support in Textual

Tonic Textual supports languages in addition to English. Textual automatically detects the language and applies the correct model.

On self-hosted instances, you configure whether to support multiple languages, and can optionally provide auxiliary language models.

Supported languages

Textual can detect values in the following languages:

Name
Code

Afrikaans

af

Albanian

sq

Amharic

am

Arabic

ar

Armenian

hy

Assamese

as

Azerbaijani

az

Basque

eu

Belarusian

be

Bengali

bn

Bengali Romanized

Bosnian

bs

Breton

br

Bulgarian

bg

Burmese

my

Burmese (alternative)

Catalan

ca

Chinese (Simplified)

zh

Chinese (Traditional)

zh

Croatian

hr

Czech

cs

Danish

da

Dutch

nl

English

en

Esperanto

eo

Estonian

et

Filipino

tl

Finnish

fi

French

fr

Galician

gl

Irish

ga

Georgian

ka

German

de

Greek

el

Gujarati

gu

Hausa

ha

Hebrew

he

Hindi

hi

Hindi Romanized

Hungarian

hu

Icelandic

is

Indonesian

id

Italian

it

Japanese

ja

Javanese

jv

Kannada

kn

Kazakh

kk

Khmer

km

Korean

ko

Kurdish (Kurmanji)

ku

Kyrgyz

ky

Lao

lo

Latin

la

Latvian

lv

Lithuanian

lt

Macedonian

mk

Malagasy

mg

Malay

ms

Malayalam

ml

Marathi

mr

Mongolian

mn

Nepali

ne

Norwegian

no

Oriya

or

Oromo

om

Pashto

ps

Persian

fa

Polish

pl

Portuguese

pt

Punjabi

pa

Romanian

ro

Russian

ru

Sanskrit

sa

Scottish Gaelic

gd

Serbian

sr

Sinhala

si

Sindhi

sd

Slovak

sk

Slovenian

sl

Somali

so

Spanish

es

Sundanese

su

Swahili

sw

Swedish

sv

Tamil

ta

Tamil Romanized

Telugu

te

Telugu Romanized

Thai

th

Turkish

tr

Ukrainian

uk

Urdu

ur

Urdu Romanized

Uyghur

ug

Uzbek

uz

Vietnamese

vi

Welsh

cy

Western Frisian

fy

Xhosa

xh

Yiddish

yi

Self-hosted instances

On a self-hosted instance, you configure whether Textual supports multiple languages.

You can also optionally provide auxiliary language models.

Enabling multi-language support

To enable support for languages other than English, set the environment variable TEXTUAL_MULTI_LINGUAL=true.

The setting is used by the machine learning container.

Providing auxiliary language model assets

You can provide additional language model assets for Textual to use.

By default, Textual looks for model assets in the machine learning container, in /usr/bin/textual/language_models. The default Helm and Docker Compose configurations include the volume mount.

To choose a different location, set the environment variable TEXTUAL_LANGUAGE_MODEL_DIRECTORY. Note that if you change the location, you must also modify your volume mounts.

For help with installing model assets, contact Tonic.ai support ([email protected]).

Last updated

Was this helpful?