Using the Spark SDK to run data generation
// Sets a statistics seed for the data generation
val baseStatisticsSeed = 489465;
// Identifies the workspace and provides the API token
val workspace = Workspace.createWorkspace("https://path/to/tonic", "<<api-token>>", "<<workspace-id>>", baseStatisticsSeed);
// Retrieves the source data
val sourceDf = spark.read.parquet("s3://parquet/source/users")
// Defines the output in Spark
val processedDf = workspace.processDataframe("users", sourceDf);
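
The snippet above stops once the output DataFrame is defined. As a minimal sketch of a possible next step, assuming `processDataframe` returns a standard Spark `DataFrame` (the name `processedDf` suggests this, but the page does not state it), the result could be written back to Parquet with Spark's own `DataFrameWriter`. The output path is hypothetical and not part of the original example.

```scala
import org.apache.spark.sql.SaveMode

// Assumption: processedDf is a standard Spark DataFrame returned by processDataframe.
// The output path below is hypothetical; replace it with your own destination.
processedDf.write
  .mode(SaveMode.Overwrite) // replaces any data that already exists at the path
  .parquet("s3://parquet/output/users")
```

`SaveMode.Overwrite` is only one option. `SaveMode.ErrorIfExists`, which is Spark's default, is safer when the destination must not be replaced.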