Tonic supports running Spark jobs via Databricks on AWS. To run Tonic on Databricks on Azure, please reach out to [email protected].
Tonic supports Spark 2.4.x and Spark 3+, with the exception of Spark 2.4.2, which is not supported. Any version of Databricks running one of these Spark versions should be compatible; however, Tonic has only been tested against Databricks v7.4.
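As an illustration of the version rule above, the following is a minimal sketch of a pre-flight check you might run before pointing Tonic at a cluster. The function name `is_supported_spark_version` is hypothetical and is not part of Tonic or Databricks; it only encodes the rule as stated here (2.4.x except 2.4.2, or any 3+ release).

```python
import re


def is_supported_spark_version(version: str) -> bool:
    """Return True if `version` matches the rule stated above.

    Hypothetical helper, not a Tonic API. Accepts strings such as
    "2.4.5", "3.0.1", or "3.1.2-SNAPSHOT".
    """
    match = re.match(r"^(\d+)\.(\d+)(?:\.(\d+))?", version.strip())
    if match is None:
        raise ValueError(f"Unrecognized Spark version string: {version!r}")

    major = int(match.group(1))
    minor = int(match.group(2))
    # The patch component may be absent, e.g. "2.4"
    patch = int(match.group(3)) if match.group(3) is not None else None

    if major >= 3:
        return True
    if (major, minor) == (2, 4):
        # Every 2.4.x release is accepted except 2.4.2
        return patch != 2
    return False


if __name__ == "__main__":
    for candidate in ("2.4.5", "2.4.2", "3.0.1", "2.3.0"):
        print(candidate, is_supported_spark_version(candidate))
```

On a running cluster, the version string would typically come from `spark.version` in a notebook. Passing this check does not guarantee compatibility, since Tonic has only been tested against Databricks v7.4.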
Tonic supports the following data providers:
Note that source data stored as JSON is written out as Parquet files.
Databricks supports both MANAGED and EXTERNAL tables. MANAGED tables store all of their data within Databricks, whereas EXTERNAL tables store their data on a separate file system (often S3). Tonic can read from both table types, but when writing output data it only writes to EXTERNAL tables.
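To make the distinction concrete: in Spark SQL, a `CREATE TABLE` statement that includes a `LOCATION` clause produces an EXTERNAL table whose files live at that path. The sketch below builds such a statement as a string. It is illustrative only and does not describe how Tonic creates its output tables; the helper name, table name, columns, and S3 path are all hypothetical.

```python
def external_table_ddl(table, columns, location, fmt="PARQUET"):
    """Build a Spark SQL statement for a table backed by files at `location`.

    Hypothetical helper for illustration. `columns` maps column names to
    Spark SQL types, e.g. {"id": "BIGINT"}.
    """
    column_list = ", ".join(f"{name} {dtype}" for name, dtype in columns.items())
    # Escape single quotes so the path stays inside the SQL string literal
    escaped_location = location.replace("'", "\\'")
    return (
        f"CREATE TABLE IF NOT EXISTS {table} ({column_list}) "
        f"USING {fmt} LOCATION '{escaped_location}'"
    )


if __name__ == "__main__":
    # Table name and bucket are placeholders, not real resources
    ddl = external_table_ddl(
        "analytics.users_masked",
        {"id": "BIGINT", "email": "STRING"},
        "s3://example-bucket/tonic-output/users_masked",
    )
    print(ddl)
```

In a Databricks notebook, the resulting string could be passed to `spark.sql(...)`. Omitting the `LOCATION` clause would instead create a MANAGED table, with the data stored inside Databricks.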