Configuring Spark SDK workspace data connections

The SDK requires a connection to the Tonic Structural web server. To access the SDK experience from within Structural, you must create a workspace that connects to a Spark database.

When you configure the SDK, the Structural application requires a connection to a catalog database, which it uses to retrieve table information and data.

Structural supports Hive and Dremio catalog databases.

In the workspace configuration, select Spark as the connection type, then select Self-managed as the cluster type.

Connecting to a Hive catalog database

Providing the connection details

To connect to a Hive catalog using the SDK, under Catalog Database:

Under Catalog Type, click Hive.
In the Hive Catalog Database field, enter the name of the database.
In the Server field, provide the server where the database is located.
In the Port field, provide the port to use to connect to the database.
In the Username field, provide the username for the account to use to connect to the database.
In the Password field, provide the password for the specified user.
To test the connection to the Hive catalog database, click Test Hive Connection.
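
Before you click Test Hive Connection, it can help to confirm that the values are complete and that the server and port are reachable from your network. The following is a minimal sketch of such a pre-check. It is not part of Structural or its SDK: the HiveCatalogSettings class and the can_reach function are hypothetical names that only mirror the fields listed above, and an open TCP port does not prove that the credentials or the database name are valid.

```python
from __future__ import annotations

import socket
from dataclasses import dataclass


@dataclass(frozen=True)
class HiveCatalogSettings:
    """Hypothetical container that mirrors the Hive fields in the workspace form."""

    database: str   # Hive Catalog Database
    server: str     # Server
    port: int       # Port
    username: str   # Username
    password: str   # Password

    def problems(self) -> list[str]:
        """Return readable problems; an empty list means the values look usable."""
        found = []
        for name in ("database", "server", "username", "password"):
            if not getattr(self, name).strip():
                found.append(f"{name} is empty")
        if not 1 <= self.port <= 65535:
            found.append(f"port {self.port} is outside 1-65535")
        return found


def can_reach(server: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to server:port opens within the timeout."""
    try:
        with socket.create_connection((server, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False
```

For example, build a HiveCatalogSettings from the values you plan to enter, check that problems() returns an empty list, and then call can_reach(settings.server, settings.port). If that call returns False, the connection test in Structural is likely to fail for network reasons rather than because of the credentials.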

Enabling validation of table filters

For Spark workspaces, you can provide WHERE clauses to filter tables. For details, go to Applying a filter to tables.

The Enable partition filter validation setting indicates whether Structural validates those filters when you create them.

By default, the setting is in the on position, and Structural validates the filters. To disable the validation, toggle Enable partition filter validation to the off position.

Connecting to a Dremio catalog database

To connect to a Dremio catalog using the SDK:

Under Catalog Type, click Dremio.
Under Connection Method, select either Legacy ODBC or Arrow Flight.
In the Server field, provide the name of the server.
In the Port field, provide the port to use to connect to the database.
In the Username field, provide the name of the user to use to connect to the database.
In the Password field, provide the password for the specified user.
By default, the source data contains all of the schemas. To limit the data to specific schemas, in the Schema(s) field, enter the list of schemas.
If you selected Legacy ODBC as the connection method, then in the Delegation Username field, enter the name of the delegation user.
By default, SSL is enabled, and Enable SSL/TLS is in the on position. We strongly recommend that you do not turn off SSL.
To test the connection to the Dremio catalog, click Test Dremio Connection.
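
The Dremio settings above have a few defaults and dependencies: all schemas are included unless you list specific ones, the delegation username is described only for the Legacy ODBC connection method, and SSL is on by default. The sketch below models those rules so that you can review a planned configuration before you enter it. It is illustrative only. The DremioCatalogSettings and ConnectionMethod names are hypothetical and are not part of Structural or Dremio.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class ConnectionMethod(Enum):
    LEGACY_ODBC = "Legacy ODBC"
    ARROW_FLIGHT = "Arrow Flight"


@dataclass
class DremioCatalogSettings:
    """Hypothetical container that mirrors the Dremio fields in the workspace form."""

    method: ConnectionMethod
    server: str
    port: int
    username: str
    password: str
    schemas: list[str] = field(default_factory=list)  # empty list = all schemas
    delegation_username: Optional[str] = None         # described for Legacy ODBC only
    enable_ssl: bool = True                           # on by default

    def schema_scope(self) -> str:
        """Describe which schemas the source data would include."""
        return ", ".join(self.schemas) if self.schemas else "all schemas"

    def warnings(self) -> list[str]:
        """Return notes about settings that conflict with the guidance above."""
        notes = []
        if self.method is ConnectionMethod.ARROW_FLIGHT and self.delegation_username:
            notes.append("delegation username is described only for Legacy ODBC")
        if not self.enable_ssl:
            notes.append("SSL/TLS is off; keeping it on is strongly recommended")
        return notes
```

A settings object created with only the required fields reports "all schemas" from schema_scope() and returns no warnings, which matches the defaults described in the steps.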

