Using Databricks CLI to work with Databricks system tables
The Databricks CLI, released last year, is a versatile tool. It can do everything the Databricks REST APIs can, but with much less effort, because the CLI handles authentication, pagination of results, and many other details that are hard to get right on the first attempt. The list of available commands grows as new REST APIs are released.
One regular task for administrators is enabling new schemas for Databricks system tables: new functionality is released constantly, giving customers more insights.
With the Databricks CLI (plus jq), it's easy to automate the discovery of system schemas that are not yet enabled and to enable them. You just need a few simple steps:
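# Select the CLI configuration profile to use
export DATABRICKS_CONFIG_PROFILE=...
# Look up the ID of the metastore assigned to the current workspace
export DB_METASTORE_ID=$(databricks metastores current | jq -r .metastore_id)
# List the system schemas that are available but not yet enabled
databricks system-schemas list "$DB_METASTORE_ID" | jq -cr '.[] | select( .state == "AVAILABLE" ) | .schema'
# Enable each of them
for schema in $(databricks system-schemas list "$DB_METASTORE_ID" | jq -cr '.[] | select( .state == "AVAILABLE" ) | .schema'); do
  echo "Enabling $schema"
  databricks system-schemas enable "$DB_METASTORE_ID" "$schema"
done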
That's all!
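If this runs unattended, it may help to keep going when one schema fails and to report the failure at the end. Below is a minimal sketch under that assumption: the function name `enable_system_schemas` is hypothetical (not part of the CLI), and it expects schema names on stdin, one per line, as produced by the jq filter above.

```shell
# Hypothetical helper (not part of the Databricks CLI):
# enables every schema name read from stdin, one per line.
# Usage: ... | enable_system_schemas <metastore-id>
# Returns non-zero if at least one schema could not be enabled.
enable_system_schemas() {
  metastore_id="$1"
  failed=0
  while IFS= read -r schema; do
    # Skip empty lines
    [ -n "$schema" ] || continue
    echo "Enabling $schema"
    # </dev/null keeps the CLI from consuming the schema list on stdin
    if ! databricks system-schemas enable "$metastore_id" "$schema" </dev/null; then
      echo "Failed to enable $schema" >&2
      failed=1
    fi
  done
  return "$failed"
}
```

It could be fed from the listing step, for example: `databricks system-schemas list "$DB_METASTORE_ID" | jq -r '.[] | select( .state == "AVAILABLE" ) | .schema' | enable_system_schemas "$DB_METASTORE_ID"`.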
And if necessary, we can disable system schemas that we don't need, but this isn't a very frequent operation...
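Disabling works the same way through the `system-schemas disable` command. A minimal sketch, assuming you pass the schema names explicitly; the function name `disable_system_schemas` and the schema names in the usage note are illustrative only:

```shell
# Hypothetical helper (not part of the Databricks CLI):
# disables the named system schemas in one metastore.
# Usage: disable_system_schemas <metastore-id> <schema> [<schema> ...]
disable_system_schemas() {
  metastore_id="$1"
  shift
  for schema in "$@"; do
    echo "Disabling $schema"
    databricks system-schemas disable "$metastore_id" "$schema"
  done
}
```

For example: `disable_system_schemas "$DB_METASTORE_ID" schema_a schema_b`, where `schema_a` and `schema_b` stand in for schemas you no longer need.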
Hi Alex, how frequently are schemas added? And how often should this be run?
Keerti Bafna
Amazing
Awesome tips, Alex! I love how easy system tables make it to promote observability across workspaces. The Databricks CLI really can do it all!