Data that Tonic.ai collects

Tonic.ai collects telemetry data from the Tonic Textual application. Textual telemetry provides information about how the application is being used. It is primarily used to generate analytics for application usage and performance, but can also be used for debugging, tracking, and troubleshooting.

Textual telemetry includes:

  • Analytics - Data about end-user interactions with our application to understand how the application is used. Used for product research, roadmap development, debugging, and account management. Collected through Amplitude and Sentry.

  • Logs - Information generated by the application to record its progress and status as it performs its functions. Includes information such as completed tasks and errors. Generally used for tracking and troubleshooting. Collected through Amazon Web Services and Sentry.

In addition to telemetry data, Tonic.ai collects other information during the course of its interactions with customers.

The following information provides more detail about the types of data that Tonic.ai does and does not collect.

Self-hosted Textual data collection

Most customers self-host the Textual application in their own VPC. In a self-hosted deployment, customer data never leaves the customer's environment.

The Textual application transmits telemetry data to Tonic.ai to enable us to perform the following tasks:

  • Manage our accounts

  • Accurately invoice for usage

  • Provide customer support

  • Investigate errors within our application

  • Understand usage to improve product development

Data that Tonic.ai does NOT collect, process, or store

On self-hosted instances, Tonic.ai never sees the following data:

  • Customer data

    The content of files that the Textual application de-identifies.

  • Dataset connection details

    • URI or IP address of the dataset

    • Credentials (password)

Analytics telemetry - end-user interactions

Tonic.ai collects data about end-user interactions with our application to understand how the application is used. We use this data for product research, roadmap development, debugging, and account management.

Tonic.ai collects the following information about end-user interactions:

  • End-user identity

    • First and last name

    • Email address

  • End-user interaction with the Textual application

    • Last seen

    • First seen

    • Usage time

    • Total sessions

    • Total number of events initiated. Events can include jobs, configuration updates, downloads, and interactions with the dataset.

  • Application environment

    • Features enabled

    • Application version

    • License tier

  • Language

  • Browser used to access the application

    • Platform (iOS, Android, Web)

    • Device type (iPhone 13, MacBook Pro)

    • Carrier (AT&T, Verizon)

    • Browser (Chrome, Safari)

    • Browser version

  • Network and technical identifiers

    • Unique device identifier

Application delivery

To build and deploy software, Tonic.ai uses a container registry hosted on Quay.io. This container registry maintains information about access to these containers.

The registry maintains a list of authorized users (organizational accounts). It maintains, collects, and stores the following information:

  • Network and technical identifiers

    • Unique device identifier

Customer support and account management

Tonic.ai collects, processes, and stores information about end users:

  • When they interact with our customer support and success staff during account implementation (scoping sessions, implementation calls).

  • Throughout the life of the account, during customer support interactions (support emails, shared Slack channels).

Tonic.ai uses several tools to allow our customers to get the support they need quickly, including:

  • Chat support

  • Video training and implementation calls over web conferencing solutions

  • Email support

We aggregate requests from these tools into our Customer Management System (CMS) and our centralized customer support management portal. Aggregating these requests helps us to ensure responsiveness and quality, and to more easily integrate requests into our development process.

We collect the following information related to customer requests:

  • End-user identity

    • First and last name

    • Email address

    • Title

    • Avatar image

    • Images, video, or audio from participating in live training over a video or audio conference

    • Other personal information that the service provider collects and shares.

    For example, Google Mail collects voluntary directory information that it shares with email recipients. For an email interaction, Tonic.ai receives any information that is configured to be shared externally.

    Slack has configurable profiles that contain additional personal information such as pronouns and honorifics.

  • Network and technical identifiers

    • IP address

    • Unique device identifier

This data is collected from your organization and users through communication with our staff. The Textual application does not collect this data.

Debugging and application performance management

Tonic.ai engineers monitor the application performance and errors. They use this information to maintain, repair, and improve the application.

For these purposes, Tonic.ai collects the following information:

  • End-user identity

    • First and last name

    • Email address

  • Environment details

    • Name

    • Application version

  • Requests made by the application

    • URLs

    • Header information

    • HTTP POST parameters

  • Stack traces and exceptions

    • Method arguments

    • Classes called

    • Processing time

    • CPU usage

  • Location of error (application, file, and line)

    • File types

  • Network and technical identifiers

    • Hostname

    • Unique device identifier

  • Operating system logs

Textual Cloud analytics data collection

Customers who do not self-host Textual can use the hosted option, Textual Cloud.

Textual Cloud collects, processes, and stores data to support the Textual application.

Textual Cloud stores information about end users, configuration, hashed passwords, and datastore connections.
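
The document states only that passwords are stored hashed; Textual's actual hashing scheme is not specified here. As an illustrative sketch of salted password hashing in general (every name below is hypothetical, not Textual's implementation):

```python
import hashlib
import hmac
import os

ITERATIONS = 600_000  # illustrative PBKDF2 work factor

def hash_password(password: str) -> tuple[bytes, bytes]:
    # A random per-user salt ensures that identical passwords do not
    # produce identical stored hashes.
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return salt, digest

def verify_password(password: str, salt: bytes, digest: bytes) -> bool:
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    # Constant-time comparison avoids leaking timing information.
    return hmac.compare_digest(candidate, digest)

salt, digest = hash_password("s3cret")
print(verify_password("s3cret", salt, digest))  # True
print(verify_password("wrong", salt, digest))   # False
```

The point of storing only a salt and digest is that the plaintext password is never persisted and cannot be recovered from what is stored.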

Customer data

Textual Cloud processes and stores customer documents and data to provide services for document analysis, sensitive value detection, and redaction.

During analysis and redaction, Textual Cloud temporarily processes customer document content, and might send documents to external AI services for advanced processing. Customer data is not used for analytics or logs. It is only used to process files.

When files are deleted from Textual, the file content is permanently deleted.

Textual Cloud collects the following customer data:

  • End-user identity

    • First and last name

    • Email address

    • Phone number

    • Avatar image

    • Organization domain (extracted from email address)

  • Document and file information

    • Original filename and file metadata

    • File size, type, and creation timestamps

    • Document content and extracted text

    • Detected entity types and confidence scores

    • Document structure (tables, paragraphs, form fields)

    • Processing results and redaction coordinates

    • File storage paths and object identifiers

  • Application environment

    • Document processing features enabled

    • Application version being run

    • License tier and subscription status

    • Machine learning model versions and configurations

    • Processing job types and frequencies

  • Usage and analytics data

    • Word count processing volumes

    • API usage patterns and request frequencies

    • Feature interaction events

    • Error rates and processing performance metrics

    • Session duration and user activity patterns

  • Browser and client information

    • User agent strings for SDK detection

    • Platform identification (Python SDK, Web client)

    • Request metadata and HTTP headers

  • Network and technical identifiers

    • Request timestamps and processing duration

    • Job execution details and status

    • System performance metrics

  • External service credentials (encrypted)

    • Cloud storage access keys (AWS, Azure, Google Cloud)

    • Database connection strings

    • Third-party API authentication tokens

    • Document processing service credentials

  • External document processing services

    • Amazon Textract - Document images and PDFs for text extraction and layout analysis

    • Azure Document Intelligence - Document files for form processing and data extraction

    • Google Document AI - Document content for classification and entity recognition

    • OpenAI GPT models - Extracted text for advanced content analysis and entity detection
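
One derived field in the list above, the organization domain, comes directly from the end user's email address. As an illustrative sketch (not Textual's actual code), the extraction amounts to:

```python
def organization_domain(email: str) -> str:
    # Hypothetical helper: the domain is the part after the last "@",
    # lower-cased so "Example.com" and "example.com" match.
    return email.rsplit("@", 1)[1].lower()

print(organization_domain("jane.doe@Example.com"))  # example.com
```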

Additional analytics from Textual Cloud

Organizations on Textual Cloud might also have additional analytics data collected, processed, and stored. This additional data allows Tonic.ai to replay user sessions to better understand usage patterns.

Sensitive data is redacted from these collections on the end-user device.

Tonic.ai does not collect this data from self-hosted customers.

Textual Cloud collects the following additional analytics data:

  • Usage patterns

    • Clicks

    • Mouse movements

    • Scrolling

    • Typing - Excludes data that is typed in sensitive fields such as password fields

  • Navigation

    • Pages visited

    • Referrers

    • URL parameters

  • Session duration

Textual log data

Textual log files are stored in an S3 bucket for one year.

Tonic.ai uses a log aggregator to make the log files searchable. On the log aggregator, job logs are deleted after six months. API logs are deleted after 60 days.

Usage information

Textual logs include many types of usage information, including information related to:

  • Actions in the user interface, based on the web requests that the web server receives.

  • Details related to output generation.

Performance data

Tonic.ai collects detailed performance data for the generation process, including data transfer rates and code profiler results.

Viewing the Textual logs that are sent to Tonic.ai

Textual writes all of its logs to STDOUT. To see exactly which logs are collected and shared, inspect that output stream.

If the Textual container runs in Docker, you can run:

docker logs solar_worker
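
Beyond the basic command above, the standard Docker CLI options can help when you review or share logs. This sketch assumes the container is named solar_worker, as in the example; adjust the name to match your deployment:

```shell
# Assumes the Textual container is named solar_worker; adjust to your deployment.
# Guarded so the script is a no-op on machines without Docker installed.
if command -v docker >/dev/null 2>&1; then
  # Show the most recent 100 log lines, each prefixed with a timestamp
  docker logs --timestamps --tail 100 solar_worker || true

  # Save the full log stream (stdout and stderr) to a file, e.g. for a support ticket
  docker logs solar_worker > textual-logs.txt 2>&1 || true
fi
```

To stream new log lines as they are written, add the --follow (-f) flag and press Ctrl+C to stop.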
