About data science mode

Tonic data science mode allows you to create data models that provide views of underlying data from SQL query results. A model represents a downstream analysis or a data science task to answer a specific question.
Based on the defined model parameters, the model training process generates a set of synthesized data with values that correspond to those in the original data.
You can export the results to a Jupyter notebook, and use Jupyter analysis and visualizations to verify that the synthesized data corresponds accurately to the source data. You can also export the generated model data to a CSV file to use the generated data in other analysis.
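As a rough illustration of the kind of check you might run on exported data, the following sketch compares per-column means and standard deviations between source data and synthesized data. It uses only the Python standard library and does not call any Tonic API. The CSV contents, the column names (`age`, `income`), and the helper functions `load_numeric_columns` and `compare_summaries` are all hypothetical stand-ins, not part of the product.

```python
import csv
import io
import statistics

# Hypothetical stand-ins: in practice these would be the source query
# results and the CSV file exported from the trained model.
SOURCE_CSV = """age,income
34,52000
45,61000
29,48000
52,75000
"""
SYNTH_CSV = """age,income
36,53500
43,60000
31,47000
50,77500
"""


def load_numeric_columns(text):
    """Parse CSV text into a dict of {column name: list of floats}."""
    reader = csv.DictReader(io.StringIO(text))
    columns = {name: [] for name in reader.fieldnames}
    for row in reader:
        for name, value in row.items():
            columns[name].append(float(value))
    return columns


def compare_summaries(source, synth):
    """Collect the mean and sample standard deviation of each column."""
    report = {}
    for name in source:
        report[name] = {
            "source_mean": statistics.mean(source[name]),
            "synth_mean": statistics.mean(synth[name]),
            "source_stdev": statistics.stdev(source[name]),
            "synth_stdev": statistics.stdev(synth[name]),
        }
    return report


report = compare_summaries(
    load_numeric_columns(SOURCE_CSV),
    load_numeric_columns(SYNTH_CSV),
)
for name, stats in report.items():
    print(name, stats)
```

Close agreement between the source and synthesized statistics is only a first sanity check. Distribution plots and correlation comparisons in the notebook give a fuller picture.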

Overview of the Tonic data science mode workflow

In Tonic, the data science mode workflow involves the following steps:
[Figure: Overview diagram of the Tonic data science mode workflow]
  1. To get started, you create a data science mode workspace. After you create the workspace, to identify it as a data science mode workspace, toggle Enable data science mode to the on position. In the workspace configuration, you identify the source of the data to use to create the model. You can connect to an existing database, or you can upload CSV files that contain the data.
  2. Next, you create and configure the model. The model configuration starts with a SQL query to retrieve the set of data to use in the model. You then configure the model parameters to guide the model training. You can also adjust the column data types in the query results.
  3. After you complete the model configuration, you train the model. When it trains a model, Tonic uses the model configuration to generate new, de-identified data that is based on the SQL query results.
  4. You then analyze the resulting model data. The Model Synthesis Report contains visualizations that provide insight into how well the generated data replicates the shape of the original data.
  5. You can export the model to use for further analysis. The exported model allows you to generate samples of synthetic data in your Python workflow. You can export the model to a Jupyter notebook that is based on a template that Tonic provides. You can export a code snippet to use as a starting point for your own Jupyter notebook. You can also generate and export a CSV file containing the generated model data. From the Jupyter notebook or CSV file, you can sample the generated model data to use in other analysis tools.
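To make the data flow in steps 2 and 5 more concrete, here is a minimal sketch that runs a SQL query to select the data for a model, adjusts one column's data type, and then draws a random sample of rows. It does not use Tonic or its exported model. An in-memory SQLite database stands in for the workspace's source database, the `customers` table and its columns are hypothetical, and the sample is drawn from the query results purely to illustrate what sampling rows looks like.

```python
import random
import sqlite3

# Hypothetical source data: an in-memory SQLite table stands in for the
# database that a workspace would connect to.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers "
    "(id INTEGER, signup_year TEXT, plan TEXT, monthly_spend REAL)"
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?, ?)",
    [
        (1, "2019", "basic", 12.5),
        (2, "2020", "pro", 49.0),
        (3, "2020", "basic", 0.0),
        (4, "2021", "pro", 55.0),
        (5, "2021", "basic", 15.0),
        (6, "2022", "team", 120.0),
    ],
)

# Step 2: a model configuration starts from a SQL query that selects
# the rows and columns to model.
MODEL_QUERY = (
    "SELECT signup_year, plan, monthly_spend "
    "FROM customers WHERE monthly_spend > 0"
)
cursor = conn.execute(MODEL_QUERY)
column_names = [description[0] for description in cursor.description]
rows = [dict(zip(column_names, values)) for values in cursor.fetchall()]

# Step 2, continued: adjust a column data type. Here the year is stored
# as text but should be treated as an integer.
for row in rows:
    row["signup_year"] = int(row["signup_year"])

# Step 5: draw a sample of rows for use in another analysis tool.
# A fixed seed keeps the sample reproducible.
rng = random.Random(0)
sample = rng.sample(rows, k=3)
for row in sample:
    print(row)

conn.close()
```

Narrowing the query to only the columns and rows that the analysis needs keeps the example small. The same idea applies when you choose what data a model is trained on.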