🚢 Moving Beyond the “Iceberg” Narrative: A Strategic EDA of the Titanic

I recently completed a deep-dive Exploratory Data Analysis (EDA) on the Titanic dataset as part of my Data Science coursework, and I challenged myself to go beyond surface-level insights.

Instead of relying on automated tools, I followed a fully manual, interpretation-driven workflow using Pandas, NumPy, Matplotlib, and Seaborn, so that every visualization was backed by meaningful analysis, not just code.

🔍 What made this project different?

🔹 Data-Centric Thinking, Not Just Plotting
Every chart was accompanied by a written interpretation, focusing on why patterns exist, not just what they show.

🔹 Strategic Data Cleaning & Feature Engineering
Imputed missing age values using group-based medians (Sex + Class)
Engineered features like family_size, travel_group, and age_group to uncover behavioral patterns
Dropped columns with too many missing values (e.g., deck) to preserve statistical integrity

🔹 Key Insight: The “Large Family” Penalty
A powerful multivariate pattern emerged:
👉 Small groups (2–4 members) had the highest survival rates
👉 Large families (5+), especially in 3rd class, faced near-zero survival
This suggests that logistical constraints during evacuation can outweigh even strong social bonds.

🔹 Beyond “Women and Children First”
By analyzing survival across class, gender, and age simultaneously, I found that this narrative does not hold equally across all passenger classes, which points to deeper socio-economic inequalities.

🔹 Narrative-Driven Visualization
Created an annotated storytelling chart to communicate insights clearly, applying principles from data journalism, not just analytics.

🔹 Interactive Dashboard Development
Built a dynamic dashboard using Streamlit to turn static EDA into an interactive decision-support tool with real-time filtering and KPIs.

💡 Key Takeaway: Data visualization is not about creating charts; it’s about generating insight.
This project reinforced that the real value of EDA lies in asking better questions and uncovering the hidden stories within the data.

🔗 Explore my work:
📂 GitHub Repository: https://lnkd.in/dVrijnKR
✍️ Medium Blog: https://lnkd.in/dp67aH9T

#DataScience #EDA #Python #DataVisualization #MachineLearning #Streamlit #Analytics #Seaborn #Matplotlib #TitanicDataset
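
🧪 For anyone curious what the cleaning and feature-engineering steps above might look like in Pandas, here is a minimal sketch. It is not the code from the repository. The rows are made-up toy data (not real Titanic records), the column names sex, pclass, sibsp, and parch are assumed to follow the common Seaborn version of the dataset, and both the family_size formula (sibsp + parch + 1) and the "alone" bucket are assumptions on my part.

```python
import numpy as np
import pandas as pd

# Toy rows for illustration only; these are NOT real Titanic records.
df = pd.DataFrame({
    "sex":    ["female", "female", "male", "male", "male", "female"],
    "pclass": [1, 1, 3, 3, 3, 3],
    "age":    [30.0, np.nan, 20.0, 40.0, np.nan, 8.0],
    "sibsp":  [0, 1, 0, 4, 0, 3],
    "parch":  [0, 1, 0, 2, 0, 2],
    "deck":   ["C", np.nan, np.nan, np.nan, np.nan, np.nan],
})

# Impute each missing age with the median age of the passenger's
# (sex, pclass) group; transform() keeps the result aligned to df's rows.
group_median = df.groupby(["sex", "pclass"])["age"].transform("median")
df["age"] = df["age"].fillna(group_median)

# Assumed definition: siblings/spouses + parents/children + the passenger.
df["family_size"] = df["sibsp"] + df["parch"] + 1

# Bucket family_size into alone (1), small (2-4), and large (5+).
# The labels are hypothetical; only the 2-4 and 5+ ranges come from the post.
df["travel_group"] = pd.cut(
    df["family_size"],
    bins=[0, 1, 4, np.inf],
    labels=["alone", "small (2-4)", "large (5+)"],
)

# Drop the mostly-missing deck column instead of imputing it.
df = df.drop(columns=["deck"])

print(df)
```

Using transform("median") rather than a merge keeps the imputation to two lines, and grouping by sex and class means a missing age is filled from passengers who are most comparable rather than from the overall median.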
