From EDA to Impact: Understanding Variable Interactions Before Modeling

Suhasini S

Published Mar 25, 2025

A large dataset helps a model stay reliable and generalize patterns across diverse populations. That reliability depends on having enough data, ensuring high data quality, and including relevant features that truly reflect the problem space. Large volumes of clean, well-structured data help the model learn real-world variability and make robust predictions. However, when the goal is to explore how different variables interact without predefined outcomes, a smaller, focused dataset makes patterns easier to spot and relationships among metrics easier to interpret.

As a foundation for my unsupervised machine learning project this semester, I began by working with a smaller, structured healthcare dataset to explore patterns in hospital operations and patient behavior. This allowed me to examine key interactions between variables such as patient demographics, wait times, satisfaction scores, and referral types, all of which are highly relevant to respiratory-related hospital visits. These early insights helped inform which features to focus on as my team and I scale up to the larger Weekly Hospital Respiratory Data (HRD) Metrics by Jurisdiction dataset from the National Healthcare Safety Network (NHSN).

While working with this emergency room dataset, I uncovered two important takeaways that highlight both the challenges and potential in analyzing healthcare data:

One key insight was the presence of missing values in patient satisfaction records. Many entries had no rating, which makes it difficult to get a complete picture of patient experience. This underscores the importance of data imputation techniques, especially when building machine learning models, where incomplete records could skew clustering or prediction results.
Another major aspect I recognized, but couldn’t fully explore in this dashboard, was the interaction between appointment types, weekdays versus weekends, and department referrals. Understanding how scheduled and walk-in appointments behave across different days and departments could reveal patterns that affect staffing needs and wait time predictions.
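
Both takeaways can be sketched outside Power BI as well. The snippet below is a minimal Python illustration, not the workflow used for the dashboard. The records, field names (`department`, `appointment`, `weekend`, `wait_min`, `satisfaction`), and values are all hypothetical. It measures how much of the satisfaction field is missing, fills the gaps with a per-department median (one simple imputation choice among many) while flagging the imputed rows, and summarizes wait time by appointment type and weekend status.

```python
from collections import defaultdict
from statistics import mean, median

# Hypothetical ER visit records; every field name and value here is made up.
visits = [
    {"department": "General Practice", "appointment": "walk-in",   "weekend": False, "wait_min": 40, "satisfaction": 6},
    {"department": "General Practice", "appointment": "scheduled", "weekend": False, "wait_min": 20, "satisfaction": None},
    {"department": "General Practice", "appointment": "walk-in",   "weekend": True,  "wait_min": 60, "satisfaction": 8},
    {"department": "Orthopedics",      "appointment": "scheduled", "weekend": False, "wait_min": 30, "satisfaction": None},
    {"department": "Orthopedics",      "appointment": "walk-in",   "weekend": True,  "wait_min": 50, "satisfaction": 4},
    {"department": "Orthopedics",      "appointment": "walk-in",   "weekend": True,  "wait_min": 70, "satisfaction": None},
]


def missing_share(records, field):
    """Return the fraction of records whose `field` is None."""
    return sum(r[field] is None for r in records) / len(records)


def impute_median_by_group(records, field, group_field):
    """Return new records with missing `field` filled by the group median.

    Falls back to the overall median when a group has no observed values.
    Adds a `<field>_imputed` flag so filled values stay distinguishable.
    """
    observed = defaultdict(list)
    for r in records:
        if r[field] is not None:
            observed[r[group_field]].append(r[field])
    all_values = [v for values in observed.values() for v in values]
    overall = median(all_values) if all_values else None

    filled = []
    for r in records:
        was_missing = r[field] is None
        value = r[field]
        if was_missing:
            group_values = observed.get(r[group_field])
            value = median(group_values) if group_values else overall
        filled.append({**r, field: value, f"{field}_imputed": was_missing})
    return filled


def mean_wait_by(records, *keys):
    """Return the mean wait time for each combination of the given keys."""
    groups = defaultdict(list)
    for r in records:
        groups[tuple(r[k] for k in keys)].append(r["wait_min"])
    return {combo: mean(waits) for combo, waits in groups.items()}


share = missing_share(visits, "satisfaction")
filled = impute_median_by_group(visits, "satisfaction", "department")
wait_table = mean_wait_by(visits, "appointment", "weekend")

print(f"Missing satisfaction ratings: {share:.0%}")
for combo, avg_wait in sorted(wait_table.items()):
    print(combo, avg_wait)
```

Keeping the imputed flag matters for later clustering: it lets you check whether a cluster is driven by filled-in values rather than by real ratings. Adding `"department"` as a third key to `mean_wait_by` gives the full three-way breakdown, although the groups become small quickly on a dataset of this size.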

Two core Power BI features played a major role in shaping my analysis and insights: Power Query for data cleaning, and DAX for creating custom measures such as average wait time and satisfaction scores.

I look forward to building on these insights and learning even more as I dive deeper into the next phase of the project!

Regita Zakia 1y

Great insights. Kudos to you!


Ajay Vishnu Addala 1y

I love the color palette and the way insights have been shown. Great dashboard Suhasini Singh Indeed, making sense of smaller datasets and building dashboards out of it is pretty hard. Great job 👏
