Impact Analysis for Dashboards with SQL Code in Initial SQL — What Tableau's Lineage Can't See

What Is Impact Analysis in Tableau?

If you've ever been asked: "We're renaming a column in the database — which dashboards will break?" — you've done impact analysis.

Impact analysis is the process of tracing upstream dependencies — identifying every workbook, data source, and dashboard that touches a specific database table, column, or stored procedure. It's essential for:

  • Database migrations — renaming tables, moving schemas, deprecating columns
  • Data model changes — knowing which reports break before you push changes
  • Compliance and auditing — documenting what data flows where
  • Decommissioning — safely retiring tables nobody uses anymore

How Impact Analysis Is Typically Done

1. Tableau Server / Tableau Cloud UI & Built-in Options

The most common approach: click through each workbook, open the data source tab, and manually check what tables or SQL it uses. For a project with 5 workbooks, this works. For 40+ workbooks across multiple projects — it doesn't scale.

Lineage View (available in Tableau Cloud and Server with Data Management Add-on) provides a visual dependency graph. You can navigate from a database → tables → datasources → workbooks. It's useful, but:

  • Requires the Data Management Add-on
  • Only shows dependencies that Tableau's internal cataloging has indexed
  • Does not show Initial SQL — these statements are completely invisible to lineage

Beyond the UI, there are several programmatic approaches depending on your environment:

Tableau Server — PostgreSQL Repository

Tableau Server stores all of its metadata in an internal PostgreSQL database (the Tableau Repository). If you have admin access, you can connect directly to this repository and write queries to perform impact analysis. Key tables include _workbooks, _datasources, _views, and _data_connections. For example, you can query _data_connections to find every workbook connected to a specific database or server. This is a powerful approach for on-premise Tableau Server environments, but it's not available in Tableau Cloud — Salesforce does not expose the underlying PostgreSQL repository to cloud customers.
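As a sketch, an impact-analysis query against the repository might look like the following. The `workgroup` database name and `readonly` user are standard for the Tableau repository, but the join columns on `_data_connections` (`owner_type`/`owner_id`) are assumptions to verify against your Server version:

```python
# Hedged sketch: a repository query for on-prem Tableau Server. The repository
# is the "workgroup" PostgreSQL database (readonly user, typically port 8060
# once repository access is enabled). The owner_type/owner_id join columns
# are assumptions -- verify against your Server version before relying on them.
IMPACT_QUERY = """
SELECT w.name  AS workbook,
       dc.server,
       dc.dbname,
       dc.dbclass
FROM _workbooks w
JOIN _data_connections dc
  ON dc.owner_type = 'Workbook'
 AND dc.owner_id   = w.id
WHERE dc.dbname = %(dbname)s
ORDER BY w.name;
"""

# Run it with any PostgreSQL client, e.g.:
#   psql -h tableau-server -p 8060 -U readonly workgroup
print(IMPACT_QUERY)
```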

Tableau Cloud — GraphQL Metadata API via External Assets

In Tableau Cloud, the equivalent approach is through External Assets in the web UI, where you can browse databases, tables, and their downstream lineage. For programmatic access, Tableau Cloud exposes the Metadata API — a GraphQL endpoint at /api/metadata/graphql. You can write GraphQL queries to traverse the lineage graph: start from a database or table and find every data source and workbook downstream. This is the closest equivalent to querying the PostgreSQL repository on Server, and it works well for relationship tables and Custom SQL — but as we'll see, it has a critical blind spot.



REST API

Both Tableau Server and Tableau Cloud support the REST API, which gives you straightforward access to the functionality behind Tableau data sources, projects, workbooks, site users, sites, flows, and more. You can use this access to build custom applications or to script interactions with Tableau resources.

2. Tableau Metadata API (GraphQL)

Tableau provides a GraphQL-based Metadata API at /api/metadata/graphql (with an interactive GraphiQL explorer) that programmatically exposes lineage. You can query for databases, tables, and their downstream data sources and workbooks:

{
  databases(filter: { name: "my_database" }) {
    name
    tables {
      name
      schema
      downstreamDatasources {
        name
        projectName
        downstreamWorkbooks {
          name
          projectName
          owner { name }
        }
      }
    }
  }
}
        

This is powerful and gets you 80% of the way there. But the Metadata API has a critical blind spot: it cannot see Initial SQL.
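To send that query programmatically rather than through the GraphiQL explorer, you POST it to the endpoint with the same X-Tableau-Auth token used for the REST API. A minimal sketch (the helper name here is ours, not part of any Tableau library):

```python
def build_metadata_request(tableau_server, auth_token, query):
    """Assemble the pieces of a Metadata API call (POST /api/metadata/graphql)."""
    return {
        "url": f"{tableau_server}/api/metadata/graphql",
        "json": {"query": query},
        "headers": {"X-Tableau-Auth": auth_token,
                    "Content-Type": "application/json"},
    }

# Usage (requires the requests package):
#   import requests
#   req = build_metadata_request(TABLEAU_SERVER, auth_token, LINEAGE_QUERY)
#   data = requests.post(**req).json()["data"]
```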

3. The Blind Spot: Initial SQL (Invisible to All Built-in Tools)

Tableau supports a feature called Initial SQL (also called one-time SQL). This is SQL code that runs once when the workbook opens — before any data query executes. It's commonly used for:

  • Creating temp tables (CREATE TABLE #staging ...)
  • Setting session variables (SET @fiscal_year = 2026)
  • Running INSERT INTO or MERGE statements to stage data



Here's the problem: Initial SQL references tables that Tableau never registers as dependencies. The Metadata API doesn't index them. The Lineage View doesn't show them. They're invisible.

Our Approach: A Fully Automated Python Pipeline

We built an end-to-end pipeline using the REST API and XML parsing to find every dependency — including the ones Tableau hides:

  1. REST API — authenticate and bulk-download all workbooks (.twbx files) from a Tableau project
  2. TWB XML parsing — extract .twb XML from the archives, then parse specific XML tags to find every SQL statement and table reference
  3. SQL parsing — use regex to extract actual table names from the SQL code
  4. Excel export — push the complete inventory to a formatted Excel workbook

The result: a complete inventory with every SQL statement and every table name, mapped to the workbook and dashboard that uses it.

Architecture Overview:

┌─────────────────────────────────────────────────────────────┐
│                    Tableau Cloud / Server                    │
│                                                             │
│  REST API ──► Download .twbx files (workbook archives)      │
└──────────────────────┬──────────────────────────────────────┘
                       │
                       ▼
┌─────────────────────────────────────────────────────────────┐
│                    Python Pipeline                           │
│                                                             │
│  1. Authenticate via PAT (Personal Access Token)            │
│  2. Download all workbooks in target project (.twbx)        │
│  3. Extract .twb XML from .twbx ZIP archives                │
│  4. Parse XML tags for four types of dependencies:          │
│     • <relation one-time-sql="...">                         │
│     • <relation name="Custom SQL Query" type="text">        │
│     • <relation type="table" table="[dbo].[TableName]">     │
│     • <datasource name="sqlproxy..." caption="DS Name">     │
│  5. Extract table names from SQL using regex                │
│  6. Export to formatted Excel                               │
└─────────────────────────────────────────────────────────────┘        


Step-by-Step Walkthrough

Step 1: Authenticate and Download Workbooks via REST API

We authenticate using a Personal Access Token (PAT) and download every workbook in the target project as .twbx files.

import os
import zipfile

import requests
import xml.etree.ElementTree as ET

# os and zipfile are used by later snippets; requests and ET are used here.

def authenticate():
    """Sign in with a Personal Access Token; returns (auth_token, site_id)."""
    url = f"{TABLEAU_SERVER}/api/{API_VERSION}/auth/signin"
    payload = {
        "credentials": {
            "personalAccessTokenName": PAT_NAME,
            "personalAccessTokenSecret": PAT_SECRET,
            "site": {"contentUrl": SITE_NAME}
        }
    }
    response = requests.post(url, json=payload, headers={"Content-Type": "application/json"})
    response.raise_for_status()
    # The response body is namespaced XML, so use the {*} wildcard (Python 3.8+)
    root = ET.fromstring(response.text)
    auth_token = root.find(".//{*}credentials").attrib["token"]
    site_id = root.find(".//{*}site").attrib["id"]
    return auth_token, site_id
        

We then list all workbooks in the project with pagination and download each one:

def get_workbooks_in_project(auth_token, site_id, project_name):
    """Paginated fetch of all workbooks in a project."""
    all_workbooks = []
    page_number = 1
    while True:
        url = (f"{TABLEAU_SERVER}/api/{API_VERSION}/sites/{site_id}/workbooks"
               f"?filter=projectName:eq:{project_name}"
               f"&pageSize=100&pageNumber={page_number}")
        # The REST API returns XML by default; ask for JSON explicitly
        data = requests.get(url, headers={"X-Tableau-Auth": auth_token,
                                          "Accept": "application/json"}).json()
        workbooks = data.get("workbooks", {}).get("workbook", [])
        if not workbooks:
            break
        all_workbooks.extend(workbooks)
        if page_number * 100 >= int(data["pagination"]["totalAvailable"]):
            break
        page_number += 1
    return all_workbooks

def download_workbook(auth_token, site_id, workbook_id, name, folder):
    """Download a workbook as a .twbx file; returns the local path."""
    url = f"{TABLEAU_SERVER}/api/{API_VERSION}/sites/{site_id}/workbooks/{workbook_id}/content"
    response = requests.get(url, headers={"X-Tableau-Auth": auth_token}, stream=True)
    response.raise_for_status()
    filepath = os.path.join(folder, f"{name}.twbx")
    with open(filepath, "wb") as f:
        for chunk in response.iter_content(chunk_size=8192):
            f.write(chunk)
    return filepath
        

Gotcha we discovered: Some .twbx files downloaded from the REST API are actually plain XML — not ZIP archives. The file has a .twbx extension but starts with <?xml. Our pipeline handles this gracefully:

twb_xml = None
try:
    with zipfile.ZipFile(path, "r") as z:
        # Normal case: extract the .twb XML from the ZIP archive
        for name in z.namelist():
            if name.endswith(".twb"):
                twb_xml = z.read(name)
                break
except zipfile.BadZipFile:
    # Fallback: the file is plain XML disguised as .twbx
    with open(path, "rb") as f:
        header = f.read(10)
    if header.startswith(b"<?xml") or header.startswith(b"<work"):
        with open(path, "rb") as f:
            twb_xml = f.read()  # the file itself is the .twb XML
        
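A cheaper pre-check, if you prefer to branch up front instead of catching the exception, is zipfile.is_zipfile(), which inspects the file's magic bytes:

```python
import os
import tempfile
import zipfile

# zipfile.is_zipfile() checks the ZIP magic bytes without raising, so it can
# route a downloaded file before any extraction is attempted.
with tempfile.TemporaryDirectory() as d:
    fake = os.path.join(d, "fake.twbx")
    with open(fake, "wb") as f:
        f.write(b"<?xml version='1.0'?><workbook/>")
    is_zip = zipfile.is_zipfile(fake)

print(is_zip)  # False -> treat the file as plain .twb XML
```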

Step 2: Parse TWB XML — The Core Engine

A .twb file is XML. Every datasource connection, every SQL statement, and every table reference is embedded in the XML structure. We parse it to extract the dependency types below:
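The parsing snippets below rely on a small strip_ns helper (not shown in the excerpts) that removes the XML namespace prefix ElementTree attaches to tag names; a minimal version:

```python
import xml.etree.ElementTree as ET

def strip_ns(tag):
    """'{http://www.tableausoftware.com/xml/user}relation' -> 'relation'."""
    return tag.split("}")[-1]

root = ET.fromstring("<workbook><relation one-time-sql='SELECT 1'/></workbook>")
print([strip_ns(el.tag) for el in root.iter()])  # ['workbook', 'relation']
```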

Type 1: One-Time SQL

Stored as a one-time-sql attribute on <relation> elements:

 <connection application-intent='false' authentication='sspi' class='sqlserver' dbname='db_name' minimum-driver-version='SQL Server Native Client 10.0' multi-subnet-failover='false' odbc-native-protocol='yes' one-time-sql='Declare @year int&#10;set @year=year(GETDATE())&#10;--select @year,@year-1&#10;&#10;select lID,&#10;&#9;&#9;SID,&#10;&#9;&#9;Level1_Name,&#10;&#9;&#9;Level3_Name,&#10;&#9;&#9;rhc.Level4,&#10;&#9;&#9;rhc.Level6_Name ,&#10;&#9;&#9;rhc.fileId &#10;&#10;&#9;&#9;into #Hierarchy&#10;from&#10;&#9;&#9;sales (NOLOCK)&#10;&#9;LEFT JOIN orders rhc(nolock) on&#10;&#9; 
        
for rel in root.iter():
    if strip_ns(rel.tag) == "relation":
        one_time = rel.attrib.get("one-time-sql", "")
        if one_time.strip():
            found.append(("One-time SQL", one_time))
        

Type 2: Custom SQL Queries

Text content inside <relation name="Custom SQL Query" type="text">:

<relation name="Custom SQL Query" type="text">
    SELECT e.StudentID, e.ProgramCode, d.AcademicYear
    FROM dbo.program e
    JOIN dbo.date_ay d ON e.DateKey = d.DateKey
</relation>
        
for rel in root.iter():
    if strip_ns(rel.tag) == "relation":
        name = rel.attrib.get("name", "")
        if name.lower() == "custom sql query":
            sql_text = (rel.text or "").strip()
            if sql_text:
                found.append(("Custom SQL", sql_text))
        

Type 3: Relationship Tables (Direct Table Bindings)

When a workbook uses a direct table connection (drag-and-drop in the data source pane), it appears as:

<relation type="table" table="[dbo].[orders]" connection="sqlserver.abc123">
        

We resolve the connection attribute back to the database name using a lookup map we build from <named-connection> elements:

# Build connection → database lookup
conn_map = {}
for nc in root.iter():
    if strip_ns(nc.tag) == "named-connection":
        nc_name = nc.attrib.get("name", "")
        for child in nc:
            if strip_ns(child.tag) == "connection":
                conn_map[nc_name] = {
                    "dbname": child.attrib.get("dbname", ""),
                    "server": child.attrib.get("server", ""),
                }

# Extract table references
for rel in root.iter():
    if strip_ns(rel.tag) == "relation" and rel.attrib.get("type") == "table":
        table = rel.attrib.get("table", "").replace("[", "").replace("]", "")
        conn_ref = rel.attrib.get("connection", "")
        db = conn_map.get(conn_ref, {}).get("dbname", "")
        full_name = f"{db}.{table}" if db else table
        found.append(("Relationship Table", full_name))
        
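A quick self-contained check of the lookup-and-resolve logic above, run on a toy .twb fragment:

```python
import xml.etree.ElementTree as ET

def strip_ns(tag):
    return tag.split("}")[-1]

twb = """
<workbook>
  <named-connection name='sqlserver.abc123'>
    <connection dbname='SalesDB' server='db01'/>
  </named-connection>
  <relation type='table' table='[dbo].[orders]' connection='sqlserver.abc123'/>
</workbook>
"""
root = ET.fromstring(twb)

# Build connection -> database lookup
conn_map = {}
for nc in root.iter():
    if strip_ns(nc.tag) == "named-connection":
        for child in nc:
            if strip_ns(child.tag) == "connection":
                conn_map[nc.attrib.get("name", "")] = child.attrib.get("dbname", "")

# Resolve each table relation to its fully qualified name
for rel in root.iter():
    if strip_ns(rel.tag) == "relation" and rel.attrib.get("type") == "table":
        table = rel.attrib.get("table", "").replace("[", "").replace("]", "")
        db = conn_map.get(rel.attrib.get("connection", ""), "")
        full_name = f"{db}.{table}" if db else table
        print(full_name)  # SalesDB.dbo.orders
```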

Step 3: Parse Table Names from Initial SQL

Initial SQL often contains complex T-SQL with CREATE TABLE, INSERT INTO, EXEC, FROM, and JOIN references. We use regex to extract every table name:

import re

# Words that can follow FROM/JOIN/etc. but are not table names
SQL_KEYWORDS = {"SELECT", "WHERE", "ON", "AS", "SET", "VALUES", "NOLOCK"}

def extract_tables_from_sql(sql_text):
    """Parse SQL and extract all referenced table names."""
    # Matches [db].[schema].[table], schema.table, #temp, @var, etc.
    table_pattern = r'[\[\"]?[A-Za-z0-9_#@]+[\]\"]?(?:\.[\[\"]?[A-Za-z0-9_#@]+[\]\"]?){0,2}'
    
    patterns = [
        rf'\bFROM\s+({table_pattern})',
        rf'\bJOIN\s+({table_pattern})',
        rf'\bINTO\s+({table_pattern})',
        rf'\bUPDATE\s+({table_pattern})',
        rf'\bCREATE\s+TABLE\s+({table_pattern})',
        rf'\bEXEC(?:UTE)?\s+({table_pattern})',
        rf'\bMERGE\s+(?:INTO\s+)?({table_pattern})',
    ]
    
    found = set()
    for pattern in patterns:
        for match in re.findall(pattern, sql_text, re.IGNORECASE):
            clean = match.replace("[", "").replace("]", "")
            if clean.upper() not in SQL_KEYWORDS:
                found.add(clean)
    return sorted(found)

def classify_table(name):
    if name.startswith("#"):
        return "Temp Table"
    elif name.startswith("@"):
        return "Table Variable"
    else:
        return "Permanent Table"
        

This catches dependencies like:

  • dbo.sales (permanent table — invisible to Metadata API)
  • #temp_staging (temp table — created and consumed within the session)
  • @report_params (table variable)
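To see the extraction in action, here is a self-contained run of the same regex logic on a small Initial-SQL-style snippet (with a trimmed-down keyword list and only three of the patterns):

```python
import re

# Minimal stand-ins for the full pattern and keyword lists above
SQL_KEYWORDS = {"SELECT", "WHERE", "ON", "AS", "NOLOCK"}
table_pattern = r'[\[\"]?[A-Za-z0-9_#@]+[\]\"]?(?:\.[\[\"]?[A-Za-z0-9_#@]+[\]\"]?){0,2}'
patterns = [rf'\bFROM\s+({table_pattern})',
            rf'\bJOIN\s+({table_pattern})',
            rf'\bINTO\s+({table_pattern})']

sql = """
SELECT s.id INTO #staging
FROM dbo.sales s
JOIN dbo.orders o ON s.id = o.sale_id
"""

found = set()
for p in patterns:
    for m in re.findall(p, sql, re.IGNORECASE):
        clean = m.replace("[", "").replace("]", "")
        if clean.upper() not in SQL_KEYWORDS:
            found.add(clean)

print(sorted(found))  # ['#staging', 'dbo.orders', 'dbo.sales']
```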


Step 4: Export to Formatted Excel

Everything is exported to an Excel workbook with two sheets:

Sheet 1 — SQL Inventory: Every SQL statement and table reference, mapped to its workbook and dashboard.

Sheet 2 — Table References: Every actual database table involved, with the type of reference and the source datasource.
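As a sketch of the export step with pandas and openpyxl (the row dictionaries and filename here are illustrative, not the pipeline's actual schema):

```python
import pandas as pd

# Illustrative rows only; the real pipeline accumulates these while parsing
sql_rows = [
    {"Workbook": "Sales Overview", "Type": "One-time SQL",
     "SQL": "select ... into #Hierarchy from sales ..."},
]
table_rows = [
    {"Workbook": "Sales Overview", "Table": "dbo.sales",
     "Kind": "Permanent Table", "Datasource": "Sales DS"},
]

# One workbook, two sheets -- mirrors the structure described above
with pd.ExcelWriter("tableau_sql_inventory.xlsx", engine="openpyxl") as writer:
    pd.DataFrame(sql_rows).to_excel(writer, sheet_name="SQL Inventory", index=False)
    pd.DataFrame(table_rows).to_excel(writer, sheet_name="Table References", index=False)
```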

Getting Started

The full pipeline requires:

  • Python 3.x with requests, pandas, openpyxl, and standard library modules
  • A Tableau Personal Access Token (PAT) with site-level access
  • The Tableau REST API (v3.15+) and Metadata API endpoints

The pipeline is project-agnostic — change the TARGET_PROJECT_NAME variable and re-run to scan any project on your Tableau site.
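The snippets above assume a handful of module-level constants; all values below are placeholders:

```python
# Assumed module-level configuration used throughout the snippets above.
TABLEAU_SERVER = "https://10ax.online.tableau.com"  # or your on-prem Server URL
API_VERSION = "3.15"
SITE_NAME = "mysite"             # the site contentUrl ("" for the default site)
PAT_NAME = "impact-analysis"
PAT_SECRET = "<personal-access-token-secret>"
TARGET_PROJECT_NAME = "Finance Dashboards"
```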

