Unstructured Data

The Unstructured Data generator uses an LLM to populate a column with unstructured data, such as text or JSON, that can include values from other columns in the table.

For information about the API data model, go to Unstructured Data in the column attributes.

Configuring the access token and models

This configuration is only available on the cloud application.

On a self-hosted instance of Fabricate, to enable all of the Fabricate AI features, including the Unstructured Data generator, you configure the LLM provider in fabricate.env.

On the cloud application, on the My Account page, the Unstructured Data Generator section contains settings for the OpenAI access token and the models to use to generate the content.

To display the My Account page, click the user menu, then click My Account.

Providing an access token for the LLM provider

On the cloud application, you must provide an access token for the LLM provider that you plan to use. By default, Fabricate uses OpenAI.

  1. In the Access Token field, provide the token.

  2. Click Save Changes.

Providing the base URL for the API

By default, Fabricate uses OpenAI as the LLM provider.

To use a different API, in the URI base field, provide the base URL for an OpenAI-compatible API such as OpenRouter.

Selecting the model for each content type

The Unstructured Data generator can produce either free text, a conversation, or JSON.

By default, Fabricate uses:

  • For free text and conversations, gpt-4.1-mini

  • For JSON, gpt-4.1

For each type of content, you can configure the model to use.

  1. In the Model for Free Text field, provide the model to use for free text.

  2. In the Model for Conversations field, provide the model to use for conversations.

  3. In the Model for JSON field, provide the model to use for JSON.

Testing the model connections

To verify that Fabricate can connect to the models, click Test Settings.

Configuring the generator

On the column details panel, you select the format of the generated content and provide prompts to describe it.

Selecting the type of content to generate

From the Type dropdown list, select the type of unstructured content to generate:

  • Free text generates unstructured plain text or Markdown.

  • Conversation generates a transcript of a conversation between two personas. For example, you might use this option to replicate a customer support call.

  • JSON generates data in JSON format.

Providing a prompt to describe the content

In the Prompt field, provide an overview description of the content.

The prompt can include other columns from the table. To include a column, use {column_name}.

For example:

  • "Generate notes for a patient visit. The patient is named {patient_name}, is {age} years old, and was diagnosed with {diagnosis}."

  • "Generate a conversation between an airline customer named {customer_name} who is trying to get help with a cancelled flight and a support agent named {agent_name}."

  • "Generate a JSON object to summarize the customer information. Include {customer_name}, {mailing_address}, and {telephone_number}.

Describing personas for a conversation

For the conversation type of unstructured content, you provide a description of each persona in the conversation.

The personas can include other columns from the table. To include a column, use {column_name}.

  1. In the Persona 1 field, provide a description of the first persona.

  2. In the Persona 2 field, provide a description of the second persona.

For example:

  • "An airline customer {customer_name} who is upset about a canceled flight."

  • "A customer support representative who is trying to help the customer {customer_name}."

Defining the schema for JSON content

For the JSON type of unstructured content, you provide the JSON schema.

In the Schema field, provide the JSON schema.

You can type or paste the schema directly, or you can have Fabricate generate the schema based on an example object.

To generate the schema from an example:

  1. Click Derive JSON schema from an example.

  2. On the panel, in the Example JSON field, provide an example of a JSON object that reflects the schema. For example:

Example JSON to derive a JSON schema for the Unstructured Data generator
  1. Click Derive JSON Schema.

Setting the randomness factor for the content

In the Temperature field, provide a number to indicate how varied or creative the content is.

The higher the number, the more creative the output.

A typical value is between 1.0 and 1.2. A value higher than 1.2 starts to produce garbled or nonsensical values.

Selecting the output format

For free text and conversations, you select the format to use for the output.

From the Format dropdown list:

  • To generate plain text, select Plain Text.

  • To use Markdown format, select Markdown.

Generating example output

After you configure the required fields, to verify that the results are what you expect, you can generate an example entry.

To generate the example, click Generate a sample result.

After you generate an initial example, to generate a new example, click Regenerate. For example, you might generate a new example after you make adjustments to a prompt or schema.

Displaying an unstructured column value

For a column that uses the Unstructured Data generator, to display the content, click the content icon.

Database column that uses the Unstructured Data generator

Last updated