EXPLORATORY DATA ANALYSIS (EDA) in Data Work
I have come to strongly advocate for something dear to my heart: EXPLORATORY DATA ANALYSIS (EDA) in data work. Ever heard of it and wondered what it really is?

It’s one thing to know how to use tools and software. It’s another to appreciate the process data must go through to produce reliable insights. EDA is a step many people are either not aware of or choose to rush.

But first things first. Data analysis does not start with dashboards but with curiosity, skepticism, and discipline, and that’s where EDA comes in.

So, what is EDA?

Technically, EDA is an approach to doing due diligence on your data before drawing conclusions. It involves:
>Understanding the structure of your data
>Identifying errors, gaps, and inconsistencies
>Checking data quality (missing values, duplicates, outliers)
>Summarizing patterns and trends
>Comparing groups
>Most importantly, exploring relationships
>Visualizing data clearly and honestly

Unlike traditional hypothesis testing, where we choose a model before seeing the data, EDA comes before modeling and may not involve statistical modeling at all. Without EDA, averages will lie and outliers will distort reality. So EDA asks things like: Does this data make sense? What is missing? What is unusual?

A simple example

When analyzing mobile data usage, EDA might reveal insights like:
>Most users consume small amounts of data, and a few heavy users inflate the average
>Some regions consistently use more data
>High usage doesn’t always mean high revenue

From this, the company can decide to:
>Design affordable low-data bundles instead of overpricing plans most users won’t use
>Report median usage, not only averages, for more honest performance tracking
>Prioritize network investment and marketing in high-usage regions

EDA doesn’t aim to prove causation, but it lays a foundation for reliable and responsible analysis.
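To make the mobile data example concrete, here is a minimal sketch in Python using only the standard library. Everything in it is an assumption for illustration: the `records` list, its values, and the `summarize_usage` helper are made up, not real company data or a prescribed method. It runs two of the quality checks mentioned above (duplicates and missing values) and then compares the mean with the median.

```python
from statistics import mean, median

# Hypothetical records: (user_id, region, monthly_usage_gb).
# None marks a missing reading; every value here is made up for illustration.
records = [
    ("u01", "North", 1.2),
    ("u02", "North", 0.8),
    ("u03", "South", 1.5),
    ("u04", "South", None),   # missing value
    ("u05", "North", 2.0),
    ("u06", "South", 1.1),
    ("u07", "North", 0.9),
    ("u08", "South", 45.0),   # heavy user
    ("u09", "North", 1.4),
    ("u02", "North", 0.8),    # exact duplicate of an earlier row
    ("u10", "South", 60.0),   # heavy user
]


def summarize_usage(rows):
    """Run basic quality checks, then compare the mean with the median."""
    # Drop exact duplicate rows while keeping the original order.
    unique_rows = list(dict.fromkeys(rows))
    # Keep only rows that actually have a usage reading.
    usage = [gb for _, _, gb in unique_rows if gb is not None]
    avg = mean(usage)
    return {
        "duplicate_rows": len(rows) - len(unique_rows),
        "missing_values": len(unique_rows) - len(usage),
        "mean_gb": avg,
        "median_gb": median(usage),
        "users_below_mean": sum(1 for gb in usage if gb < avg),
        "users_counted": len(usage),
    }


summary = summarize_usage(records)
for name, value in summary.items():
    print(f"{name}: {value}")
```

In this made-up sample, the median stays close to what a typical user consumes, while the mean is pulled far upward by the two heavy users, so most users sit below the "average". That is the kind of gap EDA is meant to surface before anyone prices a bundle around the mean.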
(We’ll talk about correlation vs causation in a later post, as well as other data analysis approaches and when to use each.)

If we work with data without prioritizing EDA, we are just guessing. That’s the difference between analysis and analytics. Has this been insightful?

Hi, I'm Faith, on a journey to being a data professional who makes analysis not just beautiful, but truthful, for holistic learning and strategic action. LET'S CONNECT📩. For collaborations, consulting, or engagement: follow my page, DM, or email faithalimo.official@gmail.com.
Good read it was. FAITH ALIMO, the example made it more insightful.
Wonderful piece. Thank you Faith.