Data that Tonic.ai collects
Tonic.ai collects telemetry data from the Tonic Textual application. Textual telemetry provides information about how the application is being used. It is primarily used to generate analytics for application usage and performance, but can also be used for debugging, tracking, and troubleshooting.
Textual telemetry includes:
Analytics
Data about end-user interactions with our application to understand how the application is used. Used for product research, roadmap development, debugging, and account management.
Amplitude, Sentry
Logs
Information generated by the application to record its progress and status as it performs its functions. Includes information such as completed tasks and errors. Generally used for tracking and troubleshooting.
Amazon Web Services, Sentry
In addition to telemetry data, Tonic.ai collects other information during the course of its interactions with customers.
The following information provides more detail about the types of data that Tonic.ai does and does not collect.
Self-hosted Textual data collection
Most customers self-host the Textual application in their own VPC. Customer data does not leave the customer's environment.
The Textual application transmits telemetry data to Tonic.ai to enable us to perform the following tasks:
Manage our accounts
Accurately invoice for usage
Provide customer support
Investigate errors within our application
Understand usage to improve product development
Data that Tonic.ai does NOT collect, process, or store
On self-hosted instances, Tonic.ai never sees the following data:
Customer data
The content of files that the Textual application de-identified.
Dataset credentials
URI or IP address of the dataset
Credentials (password)
Analytics telemetry - end-user interactions
Tonic.ai collects data about end-user interactions with our application to understand how the application is used. We use this data for product research, roadmap development, debugging, and account management.
Tonic.ai collects the following information about end-user interactions:
End-user identity
First and last name
Email address
End-user interaction with the Textual application:
Last seen
First seen
Usage time
Total sessions
Total number of events initiated. Events can include jobs, configuration updates, downloads, and interactions with the dataset.
Application environment
Features enabled
Application version
License tier
Language
Browser used to access the application
Platform (iOS, Android, Web)
Device type (iPhone 13, MacBook Pro)
Carrier (AT&T, Verizon)
Browser (Chrome, Safari)
Browser version
Network and technical identifiers
Unique device identifier
Application delivery
To build and deploy software, Tonic.ai uses a container registry that is run by Quay.io. This container registry maintains information about access to these containers.
The registry maintains a list of authorized users (organizational accounts). It maintains, collects, and stores the following information:
Network and technical identifiers
Unique device identifier
Customer support and account management
Tonic.ai collects, processes, and stores information about end users:
When they interact with our customer support and success staff during account implementation (scoping sessions, implementation calls).
Throughout the life of the account, during customer support interactions (support emails, shared Slack channels).
Tonic.ai uses several tools to allow our customers to get the support they need quickly, including:
Chat support
Video training and implementation calls over web conferencing solutions
Email support
We aggregate requests from these tools into our Customer Management System (CMS) and our centralized customer support management portal. Aggregating these requests helps us to ensure responsiveness and quality, and to more easily integrate requests into our development process.
We collect the following information related to customer requests:
End-user identity
First and last name
Email address
Title
Avatar image
Images, video, or audio from participating in live training over a video or audio conference
Other personal information that the service provider collects and shares.
For example, Google Mail collects voluntary directory information that it shares with email recipients. For an email interaction, Tonic.ai receives any information that is configured to be shared externally.
Slack has configurable profiles that contain additional personal information such as pronouns and honorifics.
Network and technical identifiers
IP address
Unique device identifier
This data is collected from your organization and users through communication with our staff. The Textual application does not collect this data.
Debugging and application performance management
Tonic.ai engineers monitor the application performance and errors. They use this information to maintain, repair, and improve the application.
For these purposes, Tonic.ai collects the following information:
End-user identity
First and last name
Email address
Environment details
Name
Application version
Requests made by the application
URLs
Header information
HTTP POST parameters
Stack traces and exceptions
Method arguments
Classes called
Processing time
CPU usage
Location of error (application, file, and line)
File types
Network and technical identifiers
Hostname
Unique device identifier
Operating system logs
Textual Cloud analytics data collection
Customers who do not self-host Textual can use the hosted option, Textual Cloud.
Textual Cloud collects, processes, and stores data to support the Textual application.
Textual Cloud stores information about end users, configuration, hashed passwords, and datastore connections.
Customer data
Textual Cloud processes and stores customer documents and data to provide services for document analysis, sensitive value detection, and redaction.
During analysis and redaction, Textual Cloud temporarily processes customer document content, and might send documents to external AI services for advanced processing. Customer data is not used for analytics or logs. It is only used to process files.
When files are deleted from Textual, the file content is permanently deleted.
Textual Cloud collects the following customer data:
End-user identity
First and last name
Email address
Phone number
Avatar image
Organization domain (extracted from email address)
Document and file information
Original filename and file metadata
File size, type, and creation timestamps
Document content and extracted text
Detected entity types and confidence scores
Document structure (tables, paragraphs, form fields)
Processing results and redaction coordinates
File storage paths and object identifiers
Application environment
Document processing features enabled
Application version being run
License tier and subscription status
Machine learning model versions and configurations
Processing job types and frequencies
Usage and analytics data
Word count processing volumes
API usage patterns and request frequencies
Feature interaction events
Error rates and processing performance metrics
Session duration and user activity patterns
Browser and client information
User agent strings for SDK detection
Platform identification (Python SDK, Web client)
Request metadata and HTTP headers
Network and technical identifiers
Request timestamps and processing duration
Job execution details and status
System performance metrics
External service credentials (encrypted)
Cloud storage access keys (AWS, Azure, Google Cloud)
Database connection strings
Third-party API authentication tokens
Document processing service credentials
External document processing services
Amazon Textract - Document images and PDFs for text extraction and layout analysis
Azure Document Intelligence - Document files for form processing and data extraction
Google Document AI - Document content for classification and entity recognition
OpenAI GPT models - Extracted text for advanced content analysis and entity detection
Additional analytics from Textual Cloud
Organizations on Textual Cloud might also have additional analytics data collected, processed, and stored. This additional data allows Tonic.ai to replay user sessions to better understand usage patterns.
Sensitive data is redacted from these collections on the end-user device.
Tonic.ai does not collect this data from self-hosted customers.
Textual Cloud collects the following additional analytics data:
Usage patterns
Clicks
Mouse movements
Scrolling
Typing - Excludes data that is typed in sensitive fields such as password fields
Navigation
Pages visited
Referrers
URL parameters
Session duration
Textual log data
Textual log files are stored in an S3 bucket for one year.
Tonic.ai uses a log aggregator to make the log files searchable. On the log aggregator, job logs are deleted after six months. API logs are deleted after 60 days.
Usage information
Textual logs include many types of usage information, including information related to:
Actions in the user interface, from web requests that the web server sees.
Details related to output generation.
Performance data
Tonic.ai collects detailed performance data for the generation process, including data transfer rates and code profiler results.
Viewing the Textual logs that are sent to Tonic.ai
Textual writes all logs to STDOUT. To view the exact logs that are collected and shared, view what is written to STDOUT.
If the Textual container runs in Docker, you can run:
docker logs solar_worker
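To check whether a particular value appears in that output, you can pipe the logs through a small filter. The following is only a sketch: `count_matching_lines` is a hypothetical helper name, not part of Textual, and it assumes the container name from the command above. It reads log lines on standard input, so it works with `docker logs` output or with a saved log file.

```shell
# Hypothetical helper (not part of Textual): count the log lines that
# contain a given pattern, ignoring case. Reads log lines from stdin.
count_matching_lines() {
  # $1: pattern to search for, for example "error" or an email address.
  # grep -c prints the number of matching lines; it exits non-zero when
  # there are no matches, so "|| true" keeps the function's status at 0.
  grep -i -c -- "$1" || true
}

# Example usage (assumes a running container named solar_worker):
#   docker logs --since 1h solar_worker 2>&1 | count_matching_lines "error"
```

The `2>&1` redirection is there because `docker logs` replays the container's STDERR stream on its own STDERR, so without it only the STDOUT lines would reach the filter.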