You install a self-hosted instance of Tonic Textual on either:
A VM or server that runs Linux and on which you have superuser access.
A local machine that runs macOS, Windows, or Linux.
At minimum, we recommend that the server or cluster that you deploy Textual to has access to the following resources:
NVIDIA GPU with 16 GB of GPU RAM. We recommend at least 6 GB of GPU RAM for each textual-ml worker.
If you use only a CPU and no GPU, then we recommend an AWS m5.2xlarge instance. However, without a GPU, performance is significantly slower.
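As a rough illustration of the sizing guidance above, the following sketch estimates how many textual-ml workers fit in a given amount of GPU RAM. It assumes that GPU memory is the only constraint and that each worker needs the recommended 6 GB. The max_workers helper is hypothetical and is not part of Textual.

```python
# Rough sizing sketch: how many textual-ml workers fit in a given amount of GPU RAM.
# Assumption: GPU memory is the only limit, at the recommended 6 GB per worker.

GPU_RAM_PER_WORKER_GB = 6  # recommended minimum GPU RAM per textual-ml worker


def max_workers(gpu_ram_gb: float, per_worker_gb: float = GPU_RAM_PER_WORKER_GB) -> int:
    """Return how many workers fit in gpu_ram_gb (hypothetical helper)."""
    if gpu_ram_gb < 0 or per_worker_gb <= 0:
        raise ValueError("gpu_ram_gb must be >= 0 and per_worker_gb must be > 0")
    # Floor division: a partial worker does not count.
    return int(gpu_ram_gb // per_worker_gb)


if __name__ == "__main__":
    for ram in (16, 24):
        print(f"{ram} GB GPU RAM -> up to {max_workers(ram)} textual-ml worker(s)")
```

Under these assumptions, the recommended 16 GB of GPU RAM leaves room for two workers. Actual capacity may be lower if other processes share the GPU.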
The number of words per second that Textual processes depends on many factors, including:
The hardware that runs the textual-ml container
The number of workers that are assigned to the textual-ml container
The auxiliary model, if any, that is used in the textual-ml container
To optimize both the throughput and the cost of Textual, we recommend that the textual-ml container runs on modern hardware with GPU compute. If you use AWS, we recommend a g5 instance with 1 GPU.
To use GPU resources:
Ensure that the correct NVIDIA drivers are installed for your instance.
If you use Kubernetes to deploy Textual, follow the instructions in the NVIDIA GPU operator documentation.
If you use Minikube, follow the instructions in Using NVIDIA GPUs with Minikube.
If you use Docker Compose to deploy Textual, follow these steps to install the nvidia-container-runtime.
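Before you deploy, it can help to confirm that the host can see its GPU. The following is a minimal sketch that calls the nvidia-smi tool that ships with the NVIDIA driver. The list_gpus helper is hypothetical. It checks only the host, so it does not confirm that containers can reach the GPU.

```python
import shutil
import subprocess
from typing import List, Optional


def list_gpus() -> Optional[List[str]]:
    """Return one 'name, total memory' line per GPU, or None if nvidia-smi is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return None  # driver tools are not installed or not on PATH
    try:
        result = subprocess.run(
            ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader"],
            capture_output=True,
            text=True,
            check=True,
            timeout=30,
        )
    except (subprocess.CalledProcessError, subprocess.TimeoutExpired, OSError):
        return None  # nvidia-smi exists but could not query the driver
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


if __name__ == "__main__":
    gpus = list_gpus()
    if not gpus:
        print("No NVIDIA GPU detected on this host.")
    else:
        for gpu in gpus:
            print(gpu)
```

If the script reports no GPU on a machine that has one, the driver installation is the first thing to check.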