# Using the Spark SDK to run data generation

To use the Structural SDK to de-identify data in a Spark SDK workspace, you write and run a Spark program that calls the SDK.

For details about the SDK, go to the [Tonic SDK Javadoc](https://app.tonic.ai/javadocs/).

The following basic example uses the SDK to run data generation on a workspace and return the output as a DataFrame:

<pre class="language-scala" data-overflow="wrap"><code class="lang-scala">// Sets a statistics seed for the data generation
val baseStatisticsSeed = 489465

// Identifies the workspace and provides the API token
val workspace = Workspace.createWorkspace("https://path/to/tonic", "&#x3C;&#x3C;api-token>>", "&#x3C;&#x3C;workspace-id>>", baseStatisticsSeed)

// Retrieves the source data (spark is an existing SparkSession)
val sourceDf = spark.read.parquet("s3://parquet/source/users")

// Runs data generation on the source data and returns the output as a DataFrame
val processedDf = workspace.processDataframe("users", sourceDf)
</code></pre>
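
The example above leaves the de-identified rows in `processedDf` without persisting them. As a minimal sketch of one possible next step, assuming you want to store the result as Parquet, the following helper writes a DataFrame to a target location. `OutputWriter`, `writeProcessed`, and the output path are hypothetical names used for illustration and are not part of the SDK. The helper uses only the standard Spark `DataFrameWriter` API.

```scala
import org.apache.spark.sql.{DataFrame, SaveMode}

// Hypothetical helper, not part of the Tonic SDK.
object OutputWriter {

  // Persists a processed DataFrame as Parquet at the given location.
  def writeProcessed(processedDf: DataFrame, outputPath: String): Unit = {
    processedDf.write
      .mode(SaveMode.Overwrite) // replaces any existing output at outputPath
      .parquet(outputPath)
  }
}

// Example call; the output location is an assumed placeholder:
// OutputWriter.writeProcessed(processedDf, "s3://parquet/output/users")
```

`SaveMode.Overwrite` is one choice among several. If the destination must not be replaced, `SaveMode.ErrorIfExists` (the Spark default) might be a safer fit.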
