Pandas Cheat Sheet for Data Analysis Beginners

🐼 Ultimate Pandas Cheat Sheet for Data Analysis (Beginner → Intermediate)

If you're learning Data Analysis, Pandas is your strongest weapon. Here’s a structured cheat sheet I’m building while learning:

🔹 Import / Export Data
• read_csv(), read_excel(), read_sql() → load datasets
• to_csv(), to_excel() → export cleaned data
• read_json() → handle API data

🔹 Inspect Data
• head(), tail() → preview rows
• sample() → random data check
• shape → dataset size (rows, columns)
• columns → list of column names
• info() → data types + non-null counts
• describe() → stats summary

🔹 Data Cleaning (Core Skill)
• isnull(), notnull() → detect missing values
• fillna() → replace missing data
• dropna() → remove nulls
• astype() → change data types
• rename() → clean column names
• drop_duplicates() → remove duplicates

🔹 Column Operations
• df['col'] → select column
• df[['col1','col2']] → multiple columns
• apply() → custom functions
• map() → transform values
• value_counts() → frequency count

🔹 Filtering Data
• df[df['col'] > value] → basic filtering
• & (and), | (or) → multiple conditions (wrap each condition in parentheses)
• isin() → filter multiple values
• query() → SQL-like filtering

🔹 Sorting Data
• sort_values(by='col') → sort by one column
• ascending=False → descending order
• sort_values(by=['col1','col2']) → sort by multiple columns

🔹 Grouping & Aggregation
• groupby() → split data into groups
• agg() → multiple operations
• sum(), count(), mean() → common aggregations
• pivot_table() → advanced summaries

🔹 Merge & Join (Very Important)
• merge() → combine datasets on key columns
• join(), concat() → combine tables
• inner, left, right, outer joins (how=...) → real-world usage

🔹 String Operations
• str.lower(), str.upper() → change case
• str.replace() → replace text
• str.contains() → filtering text

🔹 Date & Time
• to_datetime() → convert to date
• dt.year, dt.month → extract features

🔹 Visualization
• plot.line(), plot.bar(), plot.hist() → basic charts
• plot.scatter() → relationships
• boxplot() → outliers
• plot.kde() → distribution

🔹 Performance Tips
• Use vectorized operations (avoid loops)
• Use .loc[] (label-based) and .iloc[] (position-based) properly
• Work with smaller samples for testing

🎯 What I’ve learned so far:
• Data cleaning takes most of the time
• Understanding data > writing complex code
• Real datasets teach more than tutorials
• Consistency is the real key

Still learning, but building step by step. If you're learning Pandas, save this for later.

#datascience #dataanalysis #python #pandas #learning #students
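Here is a minimal sketch of how a few of these calls can fit together, from cleaning through filtering, merging, and grouping. The `orders` and `regions` tables, and every column name and value in them, are made up for illustration and do not come from a real dataset. Filling missing amounts with the median is one possible choice, not a rule.

```python
import pandas as pd

# Hypothetical sample data, invented purely for illustration.
orders = pd.DataFrame({
    "order_id": [1, 2, 2, 3, 4, 5],
    "customer": ["ann", "Bob", "Bob", "ann", "Cara", None],
    "amount": [120.0, 80.0, 80.0, None, 200.0, 50.0],
    "order_date": ["2024-01-05", "2024-01-20", "2024-01-20",
                   "2024-02-11", "2024-02-15", "2024-03-01"],
})
regions = pd.DataFrame({
    "customer": ["ann", "bob", "cara"],
    "region": ["North", "South", "North"],
})

# Data cleaning: drop exact duplicate rows, drop rows with no customer,
# normalize text case, fill missing amounts, and parse dates.
clean = (
    orders
    .drop_duplicates()
    .dropna(subset=["customer"])
    .assign(
        customer=lambda d: d["customer"].str.lower(),
        amount=lambda d: d["amount"].fillna(d["amount"].median()),
        order_date=lambda d: pd.to_datetime(d["order_date"]),
    )
)

# Date & time: extract the month as a new feature column.
clean["month"] = clean["order_date"].dt.month

# Filtering: each condition sits in its own parentheses when combined with &.
big = clean[(clean["amount"] > 100) & (clean["customer"].isin(["ann", "cara"]))]

# Merge: a left join keeps every cleaned order, even without a matching region.
merged = clean.merge(regions, on="customer", how="left")

# Grouping & aggregation, then sorting in descending order.
summary = (
    merged
    .groupby("region", as_index=False)
    .agg(total=("amount", "sum"), orders=("order_id", "count"))
    .sort_values("total", ascending=False)
)

print(summary)
```

Every step here is vectorized, so no Python loop is needed. To try the same steps on a real file, swap the hand-built frames for `pd.read_csv(...)` and test on a small `sample()` first.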
