Configuring Spark SDK workspace data connections

The SDK requires a connection to the Tonic Structural web server. In Structural, you create a workspace that connects to a Spark database.

To retrieve the table information and data, you provide a connection to a Hive catalog database.

When you create the workspace, do the following to indicate that you are using the Spark SDK:

  1. Select Spark as the connection type.

  2. Select Self-managed as the cluster type.

Providing the connection details

Under Catalog Database, to connect to a Hive catalog using the SDK:

  1. In the Hive Catalog Database field, enter the name of the database.

  2. In the Server field, provide the server where the database is located.

  3. In the Port field, provide the port to use to connect to the database.

  4. In the Username field, provide the username for the account to use to connect to the database.

  5. In the Password field, provide the password for the specified user.

  6. To test the connection to the Hive catalog database, click Test Hive Connection.
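
Before you click Test Hive Connection, it can help to confirm that the values are complete and that the server accepts connections on the port. The following Python sketch is one way to do that from a machine with similar network access. It is not part of Structural or the SDK: the HiveCatalogConnection class and the is_reachable function are hypothetical names used only for illustration. The sketch checks TCP reachability only; it does not authenticate or verify the database name.

```python
import socket
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class HiveCatalogConnection:
    """Holds the values entered under Catalog Database.

    Hypothetical container for illustration; not a Structural API.
    """
    database: str
    server: str
    port: int
    username: str
    password: str

    def missing_fields(self) -> List[str]:
        """Return the names of fields that are empty or out of range."""
        problems = []
        for name in ("database", "server", "username", "password"):
            if not getattr(self, name).strip():
                problems.append(name)
        # Valid TCP ports are 1 through 65535.
        if not 1 <= self.port <= 65535:
            problems.append("port")
        return problems


def is_reachable(server: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to server:port can be opened."""
    try:
        with socket.create_connection((server, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

To use it, build a HiveCatalogConnection from the same values that you plan to enter in the workspace, check that missing_fields() returns an empty list, and then call is_reachable with the server and port. A False result usually points to a network or firewall issue rather than to the credentials.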

Enabling validation of table filters

For Spark workspaces, you can provide WHERE clauses to filter tables. For details, go to Applying a filter to tables.

The Enable partition filter validation setting indicates whether Structural validates those filters when you create them.

By default, the setting is in the on position, and Structural validates the filters. To disable the validation, toggle Enable partition filter validation to the off position.
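
As an illustration of the kind of filter that this section describes, the following Python sketch builds a date-range condition for a partition column. The function name partition_date_filter and the column name event_date are hypothetical. The sketch produces only the condition text. Whether the filter field expects the WHERE keyword itself is not covered here, so check Applying a filter to tables for the exact format.

```python
from datetime import date


def partition_date_filter(column: str, start: date, end: date) -> str:
    """Build a half-open date-range condition: start <= column < end."""
    # Reject anything that is not a plain identifier, such as text with spaces or quotes.
    if not column.isidentifier():
        raise ValueError(f"unexpected column name: {column!r}")
    if start >= end:
        raise ValueError("start must be earlier than end")
    return (
        f"{column} >= '{start.isoformat()}' "
        f"AND {column} < '{end.isoformat()}'"
    )


# Example with a hypothetical partition column named event_date.
january = partition_date_filter("event_date", date(2024, 1, 1), date(2024, 2, 1))
print(january)
```

A half-open range (inclusive start, exclusive end) avoids counting the boundary date twice when you apply consecutive ranges.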
