Databricks Apps - changing the Data Platform game

In the first post in this series, I described how Databricks is changing the game in data platforms and data analytics with its new developments. Today I will look at Databricks Apps and show why they are such a powerful addition to an organization's quiver.

Databricks Apps lets you extend the data warehouse with custom applications. These applications are hosted on Databricks' infrastructure and let you leverage your existing governance structure and connections. The apps can be set up using either Python or Node.js, and support common frameworks like Streamlit, Gradio, and Dash.

As an example, with Streamlit, bringing data into an application for editing is as simple as writing the following:

edited_df = st.data_editor(df)        

Connecting the app to Unity Catalog is similarly easy: a small function runs a select query over a connection pointing at a Databricks SQL endpoint.

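import pandas as pd

def read_table(table_name: str, conn) -> pd.DataFrame:
    """Read a table from Databricks and return it as a DataFrame."""
    with conn.cursor() as cursor:
        # table_name is interpolated directly, so pass only trusted names
        cursor.execute(f"SELECT * FROM {table_name}")
        # Fetch the result as an Arrow table and convert it to pandas
        return cursor.fetchall_arrow().to_pandas()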
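
Because the query above pastes the table name straight into the SQL text, it may be worth checking the name before it reaches the cursor. The following is a minimal sketch, not part of the original app: the helper name quote_table_name is hypothetical, and it assumes table names follow the three-level catalog.schema.table form with plain alphanumeric or underscore parts.

```python
import re

# Assumption: each part of catalog.schema.table is a plain identifier
# (letters, digits, underscores, not starting with a digit).
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def quote_table_name(table_name: str) -> str:
    """Validate a catalog.schema.table name and return it backtick-quoted."""
    parts = table_name.split(".")
    if len(parts) != 3 or not all(_IDENTIFIER.fullmatch(p) for p in parts):
        raise ValueError(f"Unexpected table name: {table_name!r}")
    return ".".join(f"`{p}`" for p in parts)
```

The result could then be passed to read_table in place of the raw string, so a value such as "main.sales.customers; DROP TABLE x" is rejected before any query runs.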
        

Similarly, we can write back to the underlying table with any type of SQL statement we want, whether that be insert, update, or merge.
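
As a rough illustration of the write-back step, the sketch below builds an UPDATE statement for one edited row. The function names build_update and write_row are hypothetical, and the sketch assumes a connector version that accepts named parameter markers (:name) with a dictionary of values; check the version you have installed before relying on that.

```python
from typing import Any


def _quote(column: str) -> str:
    """Backtick-quote a column name, doubling any embedded backticks."""
    return "`" + column.replace("`", "``") + "`"


def build_update(table: str, key_column: str, row: dict[str, Any]) -> tuple[str, dict[str, Any]]:
    """Build a parameterized UPDATE for one edited row.

    Values travel as named parameters (p0, p1, ...) rather than being
    pasted into the SQL text. The table name is assumed to be trusted.
    """
    columns = [c for c in row if c != key_column]
    if key_column not in row or not columns:
        raise ValueError("row needs the key column and at least one other column")
    params = {f"p{i}": row[c] for i, c in enumerate(columns)}
    params["key"] = row[key_column]
    assignments = ", ".join(f"{_quote(c)} = :p{i}" for i, c in enumerate(columns))
    sql = f"UPDATE {table} SET {assignments} WHERE {_quote(key_column)} = :key"
    return sql, params


def write_row(conn, table: str, key_column: str, row: dict[str, Any]) -> None:
    """Run the UPDATE through a DB-API style connection (assumed interface)."""
    sql, params = build_update(table, key_column, row)
    with conn.cursor() as cursor:
        cursor.execute(sql, params)
```

Keeping the statement builder separate from the execution step makes it possible to inspect or test the generated SQL without a live SQL endpoint. A merge or insert could be built along the same lines.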

We can also add a dropdown selector to the data editor that restricts the category column to a preset list of options, with the following code.

CATEGORY_OPTIONS = ["Enterprise", "Premium", "Growth", "Standard"]

edited_df = st.data_editor(
    df,  # the DataFrame to edit; st.data_editor requires it as the first argument
    column_config={
        "Customer ID": st.column_config.TextColumn(),
        "Customer Name": st.column_config.TextColumn(),
        "Country": st.column_config.TextColumn(),
        "Orders": st.column_config.NumberColumn(),
        "Total Value (€)": st.column_config.NumberColumn(),
        "Suggested Category": st.column_config.TextColumn(),
        "Budget Value": st.column_config.NumberColumn(),
        "Currency": st.column_config.TextColumn(),
        # Dropdown limited to the preset category labels
        "Category": st.column_config.SelectboxColumn(options=CATEGORY_OPTIONS),
    },
    hide_index=True,
    use_container_width=True,
    num_rows="fixed",  # rows cannot be added or deleted
)

        
Category selector as configured in the app

Now that we have a good method to connect from the Databricks app to Unity Catalog, and can change both numeric and categorical data in our data editor and save it back to a Unity Catalog table, we can consider which additional features to include in the application. We can also consider how governance should be handled: what data we see, how it is handled, and what is done with it afterwards.

Logic can be added to validate the user input, to add graphs, or to show changes in the data table, but at its base we've now created a business application. If you want to try a similar exercise yourself, Databricks has a good tutorial to get you started: https://docs.databricks.com/gcp/en/dev-tools/databricks-apps/tutorial-streamlit
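
To make the validation idea a little more concrete, here is a small sketch of what such a check could look like. The function validate_rows and its two rules (the category must be one of the preset options, and the budget must be a non-negative number) are illustrative assumptions, not rules from the original app. It works on rows as plain dictionaries, which is one way the edited data could be handed over.

```python
import math

CATEGORY_OPTIONS = ["Enterprise", "Premium", "Growth", "Standard"]


def validate_rows(rows: list[dict]) -> list[str]:
    """Return human-readable problems found in the edited rows."""
    problems = []
    for i, row in enumerate(rows):
        category = row.get("Category")
        if category not in CATEGORY_OPTIONS:
            problems.append(f"Row {i}: unknown category {category!r}")
        budget = row.get("Budget Value")
        # Reject missing values, non-numbers, NaN, and negative amounts
        if (
            not isinstance(budget, (int, float))
            or math.isnan(budget)
            or budget < 0
        ):
            problems.append(f"Row {i}: budget value must be a non-negative number")
    return problems
```

In the app, the returned list could be shown to the user before anything is saved, and the write-back skipped whenever the list is not empty.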

In the next article, we will extend this logic to introduce agentic actions into our app.
