Language support in Textual
Tonic Textual supports languages in addition to English. Textual automatically detects the language and applies the correct model.
On self-hosted instances, you configure whether to support multiple languages, and can optionally provide auxiliary language models.
Supported languages
Textual can detect values in the following languages. Each entry gives the language name followed by its language code, where one is listed:
Afrikaans: af
Albanian: sq
Amharic: am
Arabic: ar
Armenian: hy
Assamese: as
Azerbaijani: az
Basque: eu
Belarusian: be
Bengali: bn
Bengali Romanized
Bosnian: bs
Breton: br
Bulgarian: bg
Burmese: my
Burmese (alternative)
Catalan: ca
Chinese (Simplified): zh
Chinese (Traditional): zh
Croatian: hr
Czech: cs
Danish: da
Dutch: nl
English: en
Esperanto: eo
Estonian: et
Filipino: tl
Finnish: fi
French: fr
Galician: gl
Irish: ga
Georgian: ka
German: de
Greek: el
Gujarati: gu
Hausa: ha
Hebrew: he
Hindi: hi
Hindi Romanized
Hungarian: hu
Icelandic: is
Indonesian: id
Italian: it
Japanese: ja
Javanese: jv
Kannada: kn
Kazakh: kk
Khmer: km
Korean: ko
Kurdish (Kurmanji): ku
Kyrgyz: ky
Lao: lo
Latin: la
Latvian: lv
Lithuanian: lt
Macedonian: mk
Malagasy: mg
Malay: ms
Malayalam: ml
Marathi: mr
Mongolian: mn
Nepali: ne
Norwegian: no
Oriya: or
Oromo: om
Pashto: ps
Persian: fa
Polish: pl
Portuguese: pt
Punjabi: pa
Romanian: ro
Russian: ru
Sanskrit: sa
Scottish Gaelic: gd
Serbian: sr
Sinhala: si
Sindhi: sd
Slovak: sk
Slovenian: sl
Somali: so
Spanish: es
Sundanese: su
Swahili: sw
Swedish: sv
Tamil: ta
Tamil Romanized
Telugu: te
Telugu Romanized
Thai: th
Turkish: tr
Ukrainian: uk
Urdu: ur
Urdu Romanized
Uyghur: ug
Uzbek: uz
Vietnamese: vi
Welsh: cy
Western Frisian: fy
Xhosa: xh
Yiddish: yi
Self-hosted instances
On a self-hosted instance, you configure whether Textual supports multiple languages.
You can also optionally provide auxiliary language models.
Enabling multi-language support
To enable support for languages other than English, set the environment variable TEXTUAL_MULTI_LINGUAL=true.
The machine learning container reads this setting.
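As a rough illustration, the variable might be set in a Docker Compose file along these lines. This is a minimal sketch, not the shipped configuration: the service name textual-ml is a hypothetical placeholder for the machine learning container, so use the name that appears in your own deployment files.

```yaml
# Hypothetical Docker Compose fragment.
# "textual-ml" is an assumed service name for the machine learning container.
services:
  textual-ml:
    environment:
      # Enables support for languages other than English.
      - TEXTUAL_MULTI_LINGUAL=true
```

In a Helm deployment, the same variable would presumably be passed to the machine learning container through its environment settings; the exact values key depends on your chart.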
Providing auxiliary language model assets
You can provide additional language model assets for Textual to use.
By default, Textual looks for model assets in the machine learning container, in /usr/bin/textual/language_models. The default Helm and Docker Compose configurations include the volume mount.
To choose a different location, set the environment variable TEXTUAL_LANGUAGE_MODEL_DIRECTORY. If you change the location, you must also update your volume mounts to match.
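The following Docker Compose fragment sketches how a custom location and its volume mount could be kept in sync. Everything except the variable name is an assumption made for illustration: the service name textual-ml, the container path /opt/textual/language_models, and the host directory ./language_models are all hypothetical.

```yaml
# Hypothetical Docker Compose fragment.
# The service name and both paths are assumptions, not defaults.
services:
  textual-ml:
    environment:
      # Tell Textual where to look for model assets inside the container.
      - TEXTUAL_LANGUAGE_MODEL_DIRECTORY=/opt/textual/language_models
    volumes:
      # Mount the host directory at the same container path set above.
      - ./language_models:/opt/textual/language_models
```

The point of the sketch is that the container-side path in the volume mount matches the value of the environment variable; if the two differ, Textual would look in a directory that the mount does not populate.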
For help with installing model assets, contact Tonic.ai support (support@tonic.ai).