Exploratory Data Analysis (EDA)

Exploratory Data Analysis (EDA) is the process of visualizing and analyzing data to extract insights from it. In other words, EDA summarizes the important characteristics of a data set in order to gain a better understanding of it.

In statistics, EDA is an approach to analyzing data sets that summarizes their main characteristics, often with visual methods. A statistical model may or may not be used, but EDA is primarily for seeing what the data can tell us beyond the formal modeling or hypothesis testing task. Refer to CODEC: https://www.codecnetworks.com/


Purpose of EDA


  • Check for missing data and other mistakes.
  • Gain maximum insight into the data set and its underlying structure.
  • Uncover a parsimonious model, one which explains the data with a minimum number of predictor variables.
  • Check assumptions associated with any model fitting or hypothesis test.
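
The first purpose above, checking for missing data, can be sketched in a few lines. This is a minimal illustration using only the Python standard library; the records, the column names, and the count_missing helper are hypothetical and are not taken from the article.

```python
# Hypothetical records; None marks a missing value.
records = [
    {"age": 34, "income": 52000, "city": "Pune"},
    {"age": None, "income": 61000, "city": "Delhi"},
    {"age": 29, "income": None, "city": None},
    {"age": 41, "income": 58000, "city": "Mumbai"},
]


def count_missing(rows):
    """Return {column: number of rows where the value is None or the key is absent}."""
    columns = sorted({key for row in rows for key in row})
    return {col: sum(1 for row in rows if row.get(col) is None) for col in columns}


missing = count_missing(records)
for column, n in missing.items():
    print(f"{column}: {n} missing of {len(records)}")
```

On a real data set, a per-column count like this is usually the first thing to look at, since it shows which variables need cleaning before any further analysis.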


Types of Exploratory Data Analysis


  • Univariate non-graphical
  • Univariate graphical
  • Multivariate non-graphical
  • Multivariate graphical
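
The two non-graphical types can be illustrated with a small sketch: summary statistics for one variable (univariate) and a Pearson correlation between two variables (multivariate). The measurements and the describe and pearson helpers below are hypothetical, chosen only for illustration, and the code assumes nothing beyond the Python standard library.

```python
import statistics

# Hypothetical paired measurements for five people.
heights_cm = [150.0, 160.0, 170.0, 180.0, 190.0]
weights_kg = [52.0, 57.0, 68.0, 73.0, 83.0]


def describe(values):
    """Univariate non-graphical EDA: basic summary statistics of one variable."""
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
        "min": min(values),
        "max": max(values),
    }


def pearson(xs, ys):
    """Multivariate non-graphical EDA: Pearson correlation of two variables."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs) ** 0.5
    spread_y = sum((y - mean_y) ** 2 for y in ys) ** 0.5
    return cov / (spread_x * spread_y)


summary = describe(heights_cm)
r = pearson(heights_cm, weights_kg)
print(summary)
print(f"height/weight correlation: {r:.3f}")
```

The graphical types present the same kinds of information as plots, for example a histogram for one variable or a scatter plot for two.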

Merging Datasets

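
As a rough illustration of what merging two data sets means, the sketch below joins two small tables on a shared key, keeping only rows whose key appears in both (an inner join). The tables, the column names, and the inner_join helper are assumptions made for this example; in practice a library routine such as pandas' DataFrame.merge would normally be used instead.

```python
# Hypothetical tables that share the key "customer_id".
customers = [
    {"customer_id": 1, "name": "Asha"},
    {"customer_id": 2, "name": "Ravi"},
    {"customer_id": 3, "name": "Meera"},
]
orders = [
    {"customer_id": 1, "amount": 250},
    {"customer_id": 1, "amount": 120},
    {"customer_id": 3, "amount": 400},
    {"customer_id": 4, "amount": 90},
]


def inner_join(left, right, key):
    """Combine rows from both tables whenever their key values match."""
    # Index the left table by key so each right row is matched in one lookup.
    index = {}
    for row in left:
        index.setdefault(row[key], []).append(row)

    merged = []
    for right_row in right:
        for left_row in index.get(right_row[key], []):
            merged.append({**left_row, **right_row})
    return merged


merged = inner_join(customers, orders, "customer_id")
for row in merged:
    print(row)
```

Rows without a match on either side (customer 2 and the order for customer 4) are dropped, which is itself worth checking during EDA because it can reveal missing or inconsistent keys.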

Graphical Representations

1. Histogram

2. Box Plot

3. Scatter Plot

4. Violin Plot
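
The numbers behind the first two plots in this list can be computed without any plotting library: a histogram draws the count of values per bin, and a box plot draws the minimum, quartiles, and maximum. The sketch below is a minimal illustration with hypothetical data and an assumed bin width of 5. Note that statistics.quantiles uses its default "exclusive" method here, and plotting libraries may compute quartiles slightly differently.

```python
import statistics
from collections import Counter

# Hypothetical sample of one numeric variable.
data = [2, 3, 3, 5, 7, 8, 8, 9, 12, 15]

# Histogram: count how many values fall into each bin of width 5.
BIN_WIDTH = 5
bin_counts = Counter((value // BIN_WIDTH) * BIN_WIDTH for value in data)
for start in sorted(bin_counts):
    # One '#' per value gives a quick text rendering of the bar.
    print(f"[{start:>2}, {start + BIN_WIDTH:>2}) {'#' * bin_counts[start]}")

# Box plot: the five-number summary that the box and whiskers are drawn from.
q1, q2, q3 = statistics.quantiles(data, n=4)
five_number = {"min": min(data), "q1": q1, "median": q2, "q3": q3, "max": max(data)}
print(five_number)
```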

Handling Missing Values

Heatmap

A heatmap takes a rectangular grid of data as input and assigns a color intensity to each cell based on that cell's value. This makes patterns in the data, such as clusters of high or low values, easy to spot at a glance.
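
The mapping from cell value to color intensity can be sketched as a min-max rescaling of the grid to the range 0 to 1. The grid, the SHADES palette, and both helper functions below are hypothetical; the text rendering merely stands in for the colors a plotting library would draw.

```python
# Hypothetical rectangular grid of values.
grid = [
    [1.0, 2.0, 3.0],
    [4.0, 5.0, 6.0],
    [7.0, 8.0, 9.0],
]

# Characters ordered from lightest to darkest, standing in for a color scale.
SHADES = " .:-=+*#%@"


def to_intensity(values):
    """Rescale every cell to [0, 1] using the grid's minimum and maximum."""
    flat = [v for row in values for v in row]
    low, high = min(flat), max(flat)
    span = high - low
    if span == 0:
        # All cells are equal, so every cell gets the same (lowest) intensity.
        return [[0.0 for _ in row] for row in values]
    return [[(v - low) / span for v in row] for row in values]


def render(intensity):
    """Draw each cell as a shade character chosen by its intensity."""
    last = len(SHADES) - 1
    return "\n".join(
        "".join(SHADES[round(t * last)] * 2 for t in row) for row in intensity
    )


intensity = to_intensity(grid)
print(render(intensity))
```

A plotting library performs the same kind of normalization internally and then looks up each intensity in a color map rather than a character list.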


Author: Riya Goel

Mentor: Vishwa Prabhakar Singh




 

          
