𝐃𝐚𝐲 15 | 50 𝐃𝐚𝐲𝐬 𝐨𝐟 𝐃𝐚𝐭𝐚 𝐀𝐧𝐚𝐥𝐲𝐬𝐢𝐬 𝐰𝐢𝐭𝐡 𝐏𝐲𝐭𝐡𝐨𝐧

Today’s focus was on analyzing structured data from a CSV using NumPy, with emphasis on slicing and conditional logic.

✔️ Imported CSV data using genfromtxt() and transposed the array
✔️ Extracted names, ages, and genders into separate arrays using slicing
✔️ Filtered records based on age conditions
✔️ Counted subsets of data using Boolean logic
✔️ Isolated records by gender and calculated group averages
✔️ Computed average ages for specific names

Key takeaway: with proper slicing and conditions, NumPy can efficiently handle real datasets and answer practical analytical questions without higher-level libraries.

Day 15 complete. Momentum continues.

𝐎𝐬𝐭𝐢𝐧𝐚𝐭𝐨 𝐑𝐢𝐠𝐨𝐫𝐞

#Python #NumPy #DataAnalysis #DataScience #MachineLearning #ArtificialIntelligence #DataAnalytics #LearnInPublic #GitHub #Data #TechCommunity #DailyPractice #Consistency #DataDriven #50_days_of_data_analysis_with_python #ostinatorigore
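The steps above can be sketched with NumPy alone. A minimal example, using made-up names and ages since the original CSV isn't shown:

```python
import io
import numpy as np

# Hypothetical sample standing in for the post's CSV file.
csv_data = io.StringIO("Alice,25,F\nBob,32,M\nCara,41,F\nDan,29,M")

# genfromtxt() with dtype=str keeps every field as text; .T transposes
# so each row of the result holds one column of the file.
arr = np.genfromtxt(csv_data, delimiter=",", dtype=str).T

# Slice the transposed array into separate column arrays.
names, ages, genders = arr[0], arr[1].astype(int), arr[2]

# Conditional filtering and counting with Boolean masks.
over_30 = names[ages > 30]              # names of people older than 30
n_female = int((genders == "F").sum())  # count a subset via Boolean logic

# Group average: mean age within one gender.
avg_f_age = ages[genders == "F"].mean()
```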
Analyzing CSV Data with NumPy
More Relevant Posts
𝐃𝐚𝐲 22 | 50 𝐃𝐚𝐲𝐬 𝐨𝐟 𝐃𝐚𝐭𝐚 𝐀𝐧𝐚𝐥𝐲𝐬𝐢𝐬 𝐰𝐢𝐭𝐡 𝐏𝐲𝐭𝐡𝐨𝐧

Today was about doing the unseen but critical work that makes analysis reliable: preprocessing.

✔️ Counted missing values across columns to understand data quality
✔️ Compared two strategies for handling missing data: dropping vs. imputing with column means
✔️ Updated existing data by adding new attributes and reshaping it from wide to long format
✔️ Used melt() to make the dataset more analysis-friendly
✔️ Applied conditional filtering with where() to isolate valid records
✔️ Standardized column headers for consistency and readability

Key insight: preprocessing decisions directly shape the quality of insights you can extract. How you handle missing values, structure data, and standardize formats often matters more than the analysis itself.

𝐎𝐬𝐭𝐢𝐧𝐚𝐭𝐨 𝐑𝐢𝐠𝐨𝐫𝐞

#Python #NumPy #DataAnalysis #DataScience #MachineLearning #ArtificialIntelligence #DataAnalytics #LearnInPublic #GitHub #Data #TechCommunity #DailyPractice #Consistency #DataDriven #50_days_of_data_analysis_with_python #ostinatorigore
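A compact sketch of those preprocessing steps in pandas, with a small invented wide-format table standing in for the real dataset:

```python
import numpy as np
import pandas as pd

# Hypothetical wide-format scores with gaps, standing in for the post's data.
df = pd.DataFrame({"Name": ["Ann", "Ben", "Cy"],
                   "Score A": [10.0, np.nan, 14.0],
                   "Score B": [7.0, 9.0, np.nan]})

missing_per_col = df.isna().sum()  # data-quality check: NaNs per column

dropped = df.dropna()                            # strategy 1: drop incomplete rows
imputed = df.fillna(df.mean(numeric_only=True))  # strategy 2: column-mean impute

# Wide -> long with melt(), then standardize the headers.
long_df = df.melt(id_vars="Name", var_name="metric", value_name="score")
long_df.columns = [c.strip().lower().replace(" ", "_") for c in long_df.columns]

# where() keeps values matching a condition and masks the rest with NaN.
valid = long_df["score"].where(long_df["score"] > 8)
```

Note how the two missing-data strategies trade off: dropping loses whole records, while mean imputation keeps them at the cost of flattening variance.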
𝐃𝐚𝐲 18 | 50 𝐃𝐚𝐲𝐬 𝐨𝐟 𝐃𝐚𝐭𝐚 𝐀𝐧𝐚𝐥𝐲𝐬𝐢𝐬 𝐰𝐢𝐭𝐡 𝐏𝐲𝐭𝐡𝐨𝐧

Today’s focus was on working with real-world datasets and feature engineering using Pandas.

✔️ Imported and explored a running dataset from CSV
✔️ Created derived columns by converting units (km to miles)
✔️ Calculated speed using existing data
✔️ Merged multiple datasets to enrich the analysis
✔️ Transformed time data using apply()
✔️ Filtered and reshaped data using Series.map()
✔️ Identified the runner who covered the longest distance

Key takeaway: transforming data with functions like apply() or map() allows for consistent calculations across the dataset.

𝐎𝐬𝐭𝐢𝐧𝐚𝐭𝐨 𝐑𝐢𝐠𝐨𝐫𝐞

#Python #NumPy #DataAnalysis #DataScience #MachineLearning #ArtificialIntelligence #DataAnalytics #LearnInPublic #GitHub #Data #TechCommunity #DailyPractice #Consistency #DataDriven #50_days_of_data_analysis_with_python #ostinatorigore
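These feature-engineering steps might look roughly like this; the running log and ages table below are invented for illustration:

```python
import pandas as pd

# Hypothetical running log, standing in for the CSV from the post.
runs = pd.DataFrame({"runner": ["Ana", "Bo", "Ana"],
                     "km": [5.0, 10.0, 21.1],
                     "minutes": [27, 55, 118]})

runs["miles"] = runs["km"] * 0.621371                    # derived unit column
runs["speed_kmh"] = runs["km"] / (runs["minutes"] / 60)  # speed from existing data

# apply() for element-wise transforms of time data, map() for simple mappings.
runs["hours"] = runs["minutes"].apply(lambda m: round(m / 60, 2))
runs["initial"] = runs["runner"].map(lambda n: n[0])

# Enrich the analysis by merging a second dataset on the shared key.
ages = pd.DataFrame({"runner": ["Ana", "Bo"], "age": [31, 44]})
enriched = runs.merge(ages, on="runner")

# Runner covering the longest total distance.
top = enriched.groupby("runner")["km"].sum().idxmax()
```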
𝐃𝐚𝐲 20 | 50 𝐃𝐚𝐲𝐬 𝐨𝐟 𝐃𝐚𝐭𝐚 𝐀𝐧𝐚𝐥𝐲𝐬𝐢𝐬 𝐰𝐢𝐭𝐡 𝐏𝐲𝐭𝐡𝐨𝐧

Today’s focus was on exploring data and visualizing insights using Pandas and Matplotlib.

✔️ Created a DataFrame to organize product data
✔️ Identified the most profitable product and visualized it with a bar plot
✔️ Determined the least profitable product and calculated the profit difference
✔️ Plotted costs and profits across all products using a line chart
✔️ Calculated average cost and average profit per product

Insight: using Pandas for both analysis and quick visualizations, alongside Matplotlib for more detailed plots, makes it easier to interpret data and communicate insights effectively.

Day 20 complete.

𝐎𝐬𝐭𝐢𝐧𝐚𝐭𝐨 𝐑𝐢𝐠𝐨𝐫𝐞

#Python #NumPy #DataAnalysis #DataScience #MachineLearning #ArtificialIntelligence #DataAnalytics #LearnInPublic #GitHub #Data #TechCommunity #DailyPractice #Consistency #DataDriven #50_days_of_data_analysis_with_python #ostinatorigore
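A sketch of that analysis-plus-plotting workflow; the product figures are invented, and the Agg backend is selected only so the script also runs without a display:

```python
import matplotlib
matplotlib.use("Agg")  # draw off-screen so the script runs headless too
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical product table standing in for the post's DataFrame.
df = pd.DataFrame({"product": ["A", "B", "C"],
                   "cost": [20, 35, 20],
                   "profit": [12, 30, 6]})

best = df.loc[df["profit"].idxmax(), "product"]   # most profitable product
worst = df.loc[df["profit"].idxmin(), "product"]  # least profitable product
gap = df["profit"].max() - df["profit"].min()     # profit difference

# Pandas plotting wraps Matplotlib: a bar plot of profit per product,
# then a line chart of costs and profits across all products.
df.plot.bar(x="product", y="profit")
df.plot.line(x="product", y=["cost", "profit"])
plt.savefig("products.png")

avg_cost = df["cost"].mean()      # average cost per product
avg_profit = df["profit"].mean()  # average profit per product
```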
𝐑𝐮𝐧𝐧𝐞𝐫𝐬 𝐀𝐧𝐝 𝐈𝐧𝐜𝐨𝐦𝐞 𝐃𝐚𝐭𝐚 𝐀𝐧𝐚𝐥𝐲𝐬𝐢𝐬
𝐃𝐚𝐲 35: 50 𝐃𝐚𝐲𝐬 𝐨𝐟 𝐃𝐚𝐭𝐚 𝐀𝐧𝐚𝐥𝐲𝐬𝐢𝐬 𝐰𝐢𝐭𝐡 𝐏𝐲𝐭𝐡𝐨𝐧

Today’s work focused on cleaning and analyzing a combined runners and income dataset using pandas and NumPy.

✔️ Inspected dataset structure, shape, and missing values
✔️ Handled NaNs by dropping empty rows and imputing remaining values
✔️ Used describe() to summarize data and extract key statistics
✔️ Calculated total miles run using NumPy operations
✔️ Filtered individuals based on income thresholds
✔️ Created and exported a clean subset of the data for reuse

This session reinforced the importance of data inspection, basic preprocessing, and targeted filtering before moving into deeper analysis.

𝐎𝐬𝐭𝐢𝐧𝐚𝐭𝐨 𝐑𝐢𝐠𝐨𝐫𝐞

#Python #NumPy #DataAnalysis #DataScience #MachineLearning #ArtificialIntelligence #DataAnalytics #LearnInPublic #GitHub #Data #TechCommunity #DailyPractice #Consistency #DataDriven #50_days_of_data_analysis_with_python #SQL #Learning #ostinatorigore
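A minimal end-to-end sketch of that cleaning pipeline, with a small invented table in place of the real runners-and-income file:

```python
import numpy as np
import pandas as pd

# Hypothetical combined runners-and-income table with gaps,
# standing in for the dataset used in the post.
df = pd.DataFrame({"name": ["Ana", "Bo", "Cy", None],
                   "miles": [12.0, np.nan, 20.0, np.nan],
                   "income": [52000.0, 61000.0, np.nan, np.nan]})

shape = df.shape           # inspect structure: rows x columns
missing = df.isna().sum()  # missing values per column

df = df.dropna(how="all")  # drop rows that are entirely empty
df = df.fillna({"miles": df["miles"].mean(),     # impute what remains
                "income": df["income"].mean()})  # with column means

summary = df.describe()                       # key statistics
total_miles = np.sum(df["miles"].to_numpy())  # NumPy operation on the column

high = df[df["income"] > 55000]               # income-threshold filter
high.to_csv("high_earners.csv", index=False)  # export clean subset for reuse
```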
𝐃𝐚𝐲 16 | 50 𝐃𝐚𝐲𝐬 𝐨𝐟 𝐃𝐚𝐭𝐚 𝐀𝐧𝐚𝐥𝐲𝐬𝐢𝐬 𝐰𝐢𝐭𝐡 𝐏𝐲𝐭𝐡𝐨𝐧

Today’s focus was on Pandas Series and basic DataFrame creation, exploring how Series can simplify analysis and preparation of data.

✔️ Created a Pandas Series and counted the occurrences of each item
✔️ Checked for the presence of specific values in the Series
✔️ Extracted all unique values from the Series
✔️ Updated the Series by inserting new items at specific indices
✔️ Converted the Series into a DataFrame and inspected its shape and dimensions

Key takeaway: Pandas Series provide a flexible structure for handling labeled data, and converting them to DataFrames allows for more advanced analysis.

Day 16 complete. Building fluency with Pandas step by step.

𝐎𝐬𝐭𝐢𝐧𝐚𝐭𝐨 𝐑𝐢𝐠𝐨𝐫𝐞

#Python #NumPy #DataAnalysis #DataScience #MachineLearning #ArtificialIntelligence #DataAnalytics #LearnInPublic #GitHub #Data #TechCommunity #DailyPractice #Consistency #DataDriven #50_days_of_data_analysis_with_python #ostinatorigore
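Those Series operations can be sketched as follows; the item list is invented, and concat is one of several ways to add an item at a chosen index:

```python
import pandas as pd

# Hypothetical items standing in for the Series from the post.
s = pd.Series(["apple", "pear", "apple", "fig"])

counts = s.value_counts()              # occurrences of each item
has_fig = bool(s.isin(["fig"]).any())  # presence check for a specific value
uniques = s.unique()                   # all distinct values

# One way to insert a new item at a specific index: concat a
# single-element Series carrying the desired label.
s = pd.concat([s, pd.Series(["kiwi"], index=[4])])

# Promote to a DataFrame and inspect shape and dimensionality.
df = s.to_frame(name="item")
```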
𝐃𝐚𝐲 17 | 50 𝐃𝐚𝐲𝐬 𝐨𝐟 𝐃𝐚𝐭𝐚 𝐀𝐧𝐚𝐥𝐲𝐬𝐢𝐬 𝐰𝐢𝐭𝐡 𝐏𝐲𝐭𝐡𝐨𝐧

Today’s focus was on creating and modifying Pandas DataFrames, a core skill for cleaning, transforming, and querying structured data.

✔️ Created a DataFrame from a nested list with labeled columns
✔️ Accessed and updated specific values using the .at accessor
✔️ Built a DataFrame from real-world style records
✔️ Used .loc to retrieve and modify values based on labels
✔️ Filtered the DataFrame to extract subsets of data
✔️ Identified the oldest individual within a filtered group

Key takeaway: precise indexing and modification are what make DataFrames powerful tools for real analytical workflows.

Day 17 complete. Steady progress, stronger foundations.

𝐎𝐬𝐭𝐢𝐧𝐚𝐭𝐨 𝐑𝐢𝐠𝐨𝐫𝐞

#Python #NumPy #DataAnalysis #DataScience #MachineLearning #ArtificialIntelligence #DataAnalytics #LearnInPublic #GitHub #Data #TechCommunity #DailyPractice #Consistency #DataDriven #50_days_of_data_analysis_with_python #ostinatorigore
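A short sketch of the .at/.loc workflow on an invented nested-list DataFrame:

```python
import pandas as pd

# DataFrame from a nested list with labeled columns; records are invented.
df = pd.DataFrame([["Ana", 34, "NY"], ["Bo", 51, "LA"], ["Cy", 46, "NY"]],
                  columns=["name", "age", "city"])

# .at reads/writes a single cell by row label and column name.
df.at[0, "age"] = 35

# .loc filters by label-based conditions to extract a subset.
ny = df.loc[df["city"] == "NY"]

# Oldest individual within the filtered group.
oldest = ny.loc[ny["age"].idxmax(), "name"]
```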
𝐓𝐢𝐦𝐞 𝐒𝐞𝐫𝐢𝐞𝐬 𝐃𝐚𝐭𝐚 𝐀𝐧𝐚𝐥𝐲𝐬𝐢𝐬
𝐃𝐚𝐲 43: 50 𝐃𝐚𝐲𝐬 𝐨𝐟 𝐃𝐚𝐭𝐚 𝐀𝐧𝐚𝐥𝐲𝐬𝐢𝐬 𝐰𝐢𝐭𝐡 𝐏𝐲𝐭𝐡𝐨𝐧

Today’s work focused on exploring time series data by resampling, visualizing trends with line and bar plots, and applying a rolling average to smooth short-term fluctuations and highlight longer-term patterns.

𝐎𝐬𝐭𝐢𝐧𝐚𝐭𝐨 𝐑𝐢𝐠𝐨𝐫𝐞

#Python #NumPy #DataAnalysis #DataScience #MachineLearning #ArtificialIntelligence #DataAnalytics #LearnInPublic #GitHub #Data #TechCommunity #DailyPractice #Consistency #DataDriven #50_days_of_data_analysis_with_python #SQL #Learning #ostinatorigore
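Resampling and rolling averages in pandas might look like this; the synthetic daily series below (trend plus a sine wave) stands in for the post's data:

```python
import matplotlib
matplotlib.use("Agg")  # render plots off-screen so the script runs headless
import numpy as np
import pandas as pd

# Hypothetical daily series: upward trend plus short-term oscillation.
idx = pd.date_range("2024-01-01", periods=60, freq="D")
ts = pd.Series(np.sin(np.arange(60) / 5) + np.arange(60) * 0.1, index=idx)

monthly = ts.resample("MS").mean()     # downsample to one value per month
rolling = ts.rolling(window=7).mean()  # 7-day average smooths fluctuations

ts.plot()                 # raw series as a line plot
monthly.plot(kind="bar")  # resampled values as a bar plot
rolling.plot()            # smoothed longer-term trend
```

The rolling window trades responsiveness for smoothness: the first six positions are NaN because a full 7-day window isn't available yet.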
I recently updated a script for a rock-paper-scissors game to log every game outcome to a CSV file and conducted a small exploratory analysis using pandas. Here’s what I examined:

- Total rounds played
- Win counts and win percentages
- User choice distribution

This process reflects a fundamental data workflow: data generation → storage → analysis. Although it's a small dataset, the emphasis was on developing the habit of transforming raw events into structured, analyzable data.

You can find the updated repository on GitHub: https://lnkd.in/ea3fxBbi

#DataScience #DataAnalytics #Python #Pandas #LearningInPublic #GitHub
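The generation → storage → analysis loop can be sketched like this; the column names and sample rounds are invented, since the actual log format lives in the linked repo:

```python
import csv
import io
import pandas as pd

# Stand-in for the game's CSV log; in the real script this is a file
# that gains one row per round played.
buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow(["user_choice", "outcome"])  # hypothetical header
for row in [("rock", "win"), ("paper", "loss"),
            ("rock", "win"), ("scissors", "tie")]:
    writer.writerow(row)
buf.seek(0)

# Analysis side: load the log and compute the three metrics from the post.
games = pd.read_csv(buf)
total_rounds = len(games)
wins = int((games["outcome"] == "win").sum())
win_pct = 100 * wins / total_rounds
choice_dist = games["user_choice"].value_counts(normalize=True)
```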
Started exploring Pandas today and it finally clicked why it’s such a core tool in data work. I worked with Series and DataFrames, created structured data from lists and dictionaries, and then moved on to reading real data from a CSV file. Filtering rows based on conditions, adding derived columns, and calculating aggregates like mean salary made the data feel alive, not just rows and columns.

What stood out most was handling real-world messiness — grouping data to compute total sales per product and dealing with missing values using isnull() and fillna(). These are the exact steps that turn raw data into something usable for analysis and decision-making 📊

Still early, but this feels like a solid transition from pure Python into practical data handling.

#Python #Pandas #DataAnalysis #LearningInPublic #DataEngineeringBasics
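Those core moves (conditional filters, derived columns, groupby totals, isnull/fillna) can be sketched on a small invented sales table:

```python
import numpy as np
import pandas as pd

# Hypothetical sales table, standing in for the CSV read in the post.
df = pd.DataFrame({"product": ["pen", "book", "pen", "book"],
                   "price": [2.0, 15.0, 2.0, 15.0],
                   "qty": [10, 3, np.nan, 5]})

n_missing = int(df["qty"].isnull().sum())  # spot the gaps first
df["qty"] = df["qty"].fillna(0)            # then fill them

df["revenue"] = df["price"] * df["qty"]  # derived column
big_orders = df[df["revenue"] > 20]      # conditional row filter

totals = df.groupby("product")["revenue"].sum()  # total sales per product
```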
📊 Just completed a comprehensive Data Analysis project building custom Python functions for statistical analysis!

Built two powerful tools:
- quantDDA() - Extended descriptive statistics with 15+ metrics including outlier detection, skewness, and kurtosis
- vizDDA() - Automated visualization grids with smart plot selection and missing data heatmaps

Applied them to real-world datasets (restaurant tipping patterns & Titanic passenger data), uncovering interesting insights:
✓ Identified systematic missingness patterns (77% in Cabin field, 21% in Age)
✓ Detected heteroscedasticity in tipping behavior across party sizes
✓ Strong correlation between bill amount and tips

Tech stack: Python | pandas | NumPy | SciPy | matplotlib | seaborn

The framework is reusable for any dataset - perfect for initial exploratory data analysis before modeling.

Check out the code: https://lnkd.in/eQ85zTcP

#DataAnalysis #Python #Statistics #DataScience #MachineLearning #DataVisualization #EDA
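The internals of quantDDA() live in the linked repo, but the idea of an extended-describe helper can be sketched with pandas built-ins alone. Everything below (the quick_dda name, the metric selection, the 1.5×IQR outlier rule, the sample values) is a hypothetical stand-in, not the author's implementation:

```python
import pandas as pd

def quick_dda(s: pd.Series) -> dict:
    """Minimal sketch of an extended-describe helper: a few of the
    metrics the post mentions, computed with pandas built-ins."""
    q1, q3 = s.quantile([0.25, 0.75])
    iqr = q3 - q1
    # Classic 1.5*IQR fence for outlier detection.
    outliers = s[(s < q1 - 1.5 * iqr) | (s > q3 + 1.5 * iqr)]
    return {"mean": s.mean(), "std": s.std(),
            "skew": s.skew(), "kurtosis": s.kurt(),
            "n_outliers": len(outliers),
            "pct_missing": 100 * s.isna().mean()}

# One extreme value (40) should register as an outlier and pull skew positive.
stats = quick_dda(pd.Series([1.0, 2, 2, 3, 3, 3, 4, 40]))
```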