# Types of data generation

## Simple data generation <a href="#data-generation-types-simple" id="data-generation-types-simple"></a>

In the simplest type of data generation, Tonic Structural uses the configured table modes and generators to transform data in the source database and then write the transformed data to the destination location. The destination location is usually a database server, but might also be:

* A storage location such as an S3 bucket.
* A container repository.

For a [file connector](https://docs.tonic.ai/app/setting-up-your-database/file-connector) workspace, the data generation job uses the configured generators for each file group to transform the data in the source files. The transformed data is used to create output files that correspond to the source files.

<figure><img src="https://3378426797-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-LSQCLFQ4bslJ-HYc8c3%2Fuploads%2FYDDfFMraUQT6ZrYqoEx9%2FDataGenerationRegular.png?alt=media&#x26;token=cb39c13d-4430-46ac-9726-d846d77a52d0" alt=""><figcaption><p>Simple data generation process</p></figcaption></figure>
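
The flow above can be pictured with a minimal Python sketch. It is an illustration only, not Structural's implementation or API: `run_generation`, `mask_email`, and the in-memory row lists are hypothetical names invented for this example. The sketch also assumes that a column with no configured generator passes through unchanged.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]
ColumnGenerator = Callable[[object], object]


def mask_email(value: object) -> object:
    """Hypothetical generator: hide the local part, keep the domain."""
    local, _, domain = str(value).partition("@")
    return f"user{len(local)}@{domain}"


def run_generation(
    source: List[Row], generators: Dict[str, ColumnGenerator]
) -> List[Row]:
    """Transform every source row and return the rows for the destination."""
    destination: List[Row] = []
    for row in source:
        transformed: Row = {}
        for column, value in row.items():
            generator = generators.get(column)
            # Assumption for this sketch: columns without a generator
            # are copied as-is.
            transformed[column] = generator(value) if generator else value
        destination.append(transformed)
    return destination


source_rows = [
    {"id": 1, "email": "alice@example.com"},
    {"id": 2, "email": "bob@example.com"},
]
destination_rows = run_generation(source_rows, {"email": mask_email})
```

The source rows are never modified. Only the transformed copies reach the destination.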

## Subsetting data generation <a href="#data-generation-types-subsetting" id="data-generation-types-subsetting"></a>

When [subsetting](https://docs.tonic.ai/app/generation/subsetting) is enabled, Structural first identifies the tables and rows to include in the subset. It then uses the configured table modes and generators to transform only that data, and writes the transformed data to the destination location.

<figure><img src="https://3378426797-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-LSQCLFQ4bslJ-HYc8c3%2Fuploads%2F1diH24EnfSeleTMhW6SL%2FDataGenerationSubset.png?alt=media&#x26;token=a0a51ad0-ffd1-409f-b351-4804f4a2d0eb" alt=""><figcaption><p>Data generation process with subsetting</p></figcaption></figure>
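
The order of operations matters here: rows are selected first and only the selected rows are transformed. The following Python sketch shows that ordering under simplifying assumptions. `run_subset_generation`, `redact_name`, and the `in_subset` predicate are hypothetical names for this example and do not reflect how Structural actually identifies a subset.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]


def redact_name(value: object) -> object:
    """Hypothetical generator: replace any value with a fixed token."""
    return "REDACTED"


def run_subset_generation(
    source: List[Row],
    in_subset: Callable[[Row], bool],
    generators: Dict[str, Callable[[object], object]],
) -> List[Row]:
    """Select the subset first, then transform only the selected rows."""
    subset = [row for row in source if in_subset(row)]
    return [
        {
            column: generators[column](value) if column in generators else value
            for column, value in row.items()
        }
        for row in subset
    ]


customers = [
    {"id": 1, "region": "EU", "name": "Alice"},
    {"id": 2, "region": "US", "name": "Bob"},
    {"id": 3, "region": "EU", "name": "Chen"},
]
eu_only = run_subset_generation(
    customers,
    lambda row: row["region"] == "EU",
    {"name": redact_name},
)
```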

## Upsert data generation <a href="#data-generation-types-upsert" id="data-generation-types-upsert"></a>

{% hint style="info" %}
**Required license:** Professional or Enterprise
{% endhint %}

When [upsert](https://docs.tonic.ai/app/workspace/managing-workspaces/workspace-configuration-settings/workspace-config-upsert) is enabled, Structural runs a data generation job that writes the transformed data to an intermediate database. The data generation can include subsetting.

After the initial data generation, Structural runs an upsert job that adds new records and updates existing records in the destination database, based on the contents of the intermediate database. The upsert job only adds and updates records. It does not remove any records that previous data generation jobs wrote to the destination database.

<figure><img src="https://3378426797-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-LSQCLFQ4bslJ-HYc8c3%2Fuploads%2Fav3RDfU41BRVveGNiX6b%2FDataGenerationUpsert.png?alt=media&#x26;token=47d3a9b2-6132-42da-98e9-d0701be6daf2" alt=""><figcaption><p>Data generation process with upsert</p></figcaption></figure>
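
The add-or-update behavior can be sketched in a few lines of Python. This is a conceptual model, not Structural's upsert job: the `upsert` function is hypothetical, and each database is represented as a dictionary keyed by an assumed primary key.

```python
from typing import Dict

Row = Dict[str, object]
Table = Dict[int, Row]  # assumed primary key -> row


def upsert(intermediate: Table, destination: Table) -> Table:
    """Add new rows and update existing ones; never delete anything."""
    for key, row in intermediate.items():
        # A key already in the destination is updated, a new key is added.
        destination[key] = dict(row)
    return destination


# Destination after an earlier generation job.
destination_db: Table = {
    1: {"name": "old-a"},
    3: {"name": "kept"},
}
# Intermediate database written by the current generation job.
intermediate_db: Table = {
    1: {"name": "new-a"},
    2: {"name": "new-b"},
}

upsert(intermediate_db, destination_db)
```

After the call, key `1` holds the updated row, key `2` is newly added, and key `3` is still present even though the current job did not produce it.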

Before Structural can run an upsert job, the destination database must already exist and have the correct schema defined. To initialize the destination database:

1. Disable upsert.
2. Run a regular data generation job.
3. Re-enable upsert.
