Anscombe's Quartet
Introduction:
In data analytics it is a mistake to assume that summary statistics alone can be relied upon to reveal the shape of a dataset. To get an accurate picture you must graph the dataset as well as run the statistical analysis. The graphs below show a small dataset of X and Y values.
Statistical Analysis:
Drawing out the regression line gives Y = 3.00 + 0.500X. The correlation between X and Y (the Pearson correlation coefficient) is 0.816. This is fairly close to one, so you might think the data is tightly bunched around the regression line. Then you might find the means of X and Y, which are 9 and 7.5 respectively. Next you might get the variance of the X values, which is 11, so the standard deviation (the typical distance of the X data points from the mean) is 3.3 (root 11). After that you get the variance of the Y values, which is about 4.12, giving a standard deviation of 2.03 (root 4.12). You might put all these figures together in a table, which may make you pretty confident of the shape of the graph that you are going to sketch.
Unfortunately, you would probably have sketched the wrong graph for the data points.
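The figures quoted above can be checked with a few lines of Python. This is only a minimal sketch: the X and Y values are those of the first dataset as published in Anscombe's 1973 paper (they are not listed in this article), and the helper name `summarize` is made up for illustration. It uses sample variance (dividing by n - 1), which is the convention that gives a variance of 11 for X.

```python
from statistics import mean, variance

# Dataset 1 of Anscombe's Quartet, taken from the 1973 paper
# (an assumption: the article itself does not list the values).
x = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
y = [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]


def summarize(xs, ys):
    """Return the summary statistics discussed in the text as a dict."""
    mx, my = mean(xs), mean(ys)
    vx, vy = variance(xs), variance(ys)  # sample variance, n - 1 divisor
    # Sample covariance between X and Y.
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys)) / (len(xs) - 1)
    slope = cov / vx                     # least-squares slope
    return {
        "mean_x": mx,
        "mean_y": my,
        "var_x": vx,
        "var_y": vy,
        "sd_x": vx ** 0.5,               # standard deviation = root of variance
        "sd_y": vy ** 0.5,
        "r": cov / (vx * vy) ** 0.5,     # Pearson correlation coefficient
        "intercept": my - slope * mx,    # regression line: Y = intercept + slope * X
        "slope": slope,
    }


stats = summarize(x, y)
for name, value in stats.items():
    print(f"{name:>9}: {value:.3f}")
```

Nothing in this output says whether the points lie on a line, on a curve, or in a tight cluster with one outlier, which is exactly the trap described above.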
Analysis:
The errors in this line of thinking are demonstrated by the four graphs reproduced here, known as Anscombe’s Quartet, which F. J. Anscombe created for his classic 1973 paper, "Graphs in Statistical Analysis". All four datasets have nearly identical summary statistics (most agree to two decimal places). However, as these graphs demonstrate, and here is the big takeaway, summary statistics don’t tell us everything about a dataset.
To really understand a dataset you must obtain the summary statistics and you must also graph the data.
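To see how closely the four datasets agree, the same statistics can be computed side by side. The sketch below again assumes the values published in Anscombe's 1973 paper (they do not appear in this article); the names `QUARTET`, `summary` and `rows` are illustrative only.

```python
from statistics import mean, variance

# The four datasets as published by Anscombe (1973); assumed here,
# since the article does not list the values.
X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]  # X is shared by datasets 1 to 3
QUARTET = {
    1: (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    2: (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    3: (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    4: ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
        [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}

LABELS = ("mean_x", "mean_y", "var_x", "var_y", "r", "intercept", "slope")


def summary(xs, ys):
    """Return the statistics named in LABELS, in that order."""
    mx, my = mean(xs), mean(ys)
    vx, vy = variance(xs), variance(ys)  # sample variance, n - 1 divisor
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys)) / (len(xs) - 1)
    slope = cov / vx
    return (mx, my, vx, vy, cov / (vx * vy) ** 0.5, my - slope * mx, slope)


rows = {name: summary(xs, ys) for name, (xs, ys) in QUARTET.items()}

# One line per dataset, one column per statistic.
print("dataset " + " ".join(f"{label:>9}" for label in LABELS))
for name, row in rows.items():
    print(f"{name:>7} " + " ".join(f"{value:9.2f}" for value in row))
```

Every column should agree across the four rows to within about 0.01, even though the four scatter plots look nothing alike.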
-----------------------------------------------------------------------------------------------
The Graphs of the Datasets: Dataset 1 to 4
Note: in each dataset the red line is the regression line.
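A graph of this kind can be redrawn with a short script. The following is a rough sketch, not the code used for the original figures: it assumes the third-party matplotlib library is installed, the function names `regression_line` and `plot_dataset` are hypothetical, and the sample values are dataset 4 from Anscombe's 1973 paper.

```python
def regression_line(xs, ys):
    """Least-squares (intercept, slope) for the line Y = intercept + slope * X."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    slope = sxy / sxx
    return my - slope * mx, slope


def plot_dataset(xs, ys, title):
    """Scatter one dataset and overlay its regression line in red."""
    import matplotlib.pyplot as plt  # third-party; imported only when plotting

    intercept, slope = regression_line(xs, ys)
    ends = [min(xs), max(xs)]        # two points are enough to draw a line
    fig, ax = plt.subplots()
    ax.scatter(xs, ys)
    ax.plot(ends, [intercept + slope * x for x in ends], color="red")
    ax.set_title(title)
    ax.set_xlabel("X")
    ax.set_ylabel("Y")
    return fig


# Dataset 4 (assumed from the 1973 paper): ten points share X = 8, one sits at X = 19.
x4 = [8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8]
y4 = [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]
```

Calling `plot_dataset(x4, y4, "Dataset 4").savefig("dataset4.png")` would write the figure to a file; the file name is only an example.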
Does the Quartet play any tunes? But in all seriousness it's a good reminder.