5 Examples of Awful Data Visualization
Sylvain Guiheneuc - unsplash.com/sylvain_guiheneuc

This Friday I’ll be giving a short presentation on data visualization (alongside some top-notch speakers) at an event co-hosted by General Assembly and Keboola. Tickets are still available, and if you’re in Singapore you should stop by. The event starts at 7PM and is free! You can register here.

In anticipation of the event I’ve been thinking a lot about data visualization, design principles and storytelling. I love my job because I get to spend a fair amount of my time thinking about creative ways to communicate through data. To get my creative juices flowing I often look for inspiration in a few different places, including but not limited to Nathan Yau’s Flowing Data and David McCandless’s Information is Beautiful. Yau and McCandless are both leaders in this field who create and curate some of the best examples of data visualization you can find on the web today. Beyond their craft, they are also educators who advance a dialogue on best practices and principles for what I like to call empirical storytelling.

Flowing Data and Information is Beautiful can be great sources of inspiration if you're on the lookout for beautiful, creative and cutting-edge data visualization. But sometimes I also like to draw a little inspiration from the worst examples of dataviz. These are the kinds of charts and infographics that ignore every basic rule and design principle when it comes to visualizing data. From the deceptive to the confusing to the downright ugly monstrosities created in the name of statistics, sometimes it’s the lessons you learn from failure that are the most impactful.

Enter WTF Visualizations, a fabulous Tumblr blog that curates a collection of the most sinful dataviz blunders around. It’s as informative as it is amusing, and I thought it would be fun to take a look at a few recent WTF Viz submissions and break down what, exactly, makes them such a strain to both the eyes and the mind.

1 – Misleading labels and headers

What is it?

This is a snippet of a full graphic created by MPH Today, based on a recent peer-reviewed article which analyzed “79 studies on the effects of stress and the human body”.

What’s wrong with it?

Of the 5 examples we’ll run through today this is probably the least sinful of the group. Design-wise I actually think the graphic looks ok, though it has a little too much copy for my liking. That said, there is a problem with this section of the graphic, particularly the column titled Relationship. The horizontal bar chart is showing the volume of something, in this case the occurrence of each symptom relative to the workplace stressor. Sure, there is a relationship between the symptom and the stressor, but labeling the column “Relationship” is both confusing and misleading. The bar chart is showing either the total number of occurrences (a count) or the frequency at which the symptom occurs (a % of the total sample). I don’t know which because the graphic doesn’t tell me (and I couldn’t check because the journal article is behind a paywall). Either way, the column title should clearly state the unit of measure (e.g. count, sum of, % of, etc.) so the reader can easily understand what was measured and how to interpret it.

What chart should they have used?

In this case the horizontal bar chart was the right choice, but always remember to clearly and meaningfully label your chart or table axes and headers.
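As a rough illustration of the labeling point, here is a minimal Python sketch that renders a text-only horizontal bar chart whose header names both the measure and its unit. The symptom names and percentages are made up for the example (the study's real figures aren't available to me), and `render_bar_chart` is a hypothetical helper, not part of any library.

```python
def render_bar_chart(rows, measure, unit, width=40):
    """Render a horizontal bar chart as text, stating the unit in the header.

    rows: list of (label, value) pairs; values are assumed to be percentages (0-100).
    """
    label_width = max(len(label) for label, _ in rows)
    # The header names what was measured AND the unit, instead of a vague title.
    lines = [f"{'Symptom'.ljust(label_width)} | {measure} ({unit})"]
    for label, value in rows:
        bar = "#" * round(value / 100 * width)
        lines.append(f"{label.ljust(label_width)} | {bar} {value:g}{unit}")
    return "\n".join(lines)


# Hypothetical data, for illustration only.
rows = [("Headache", 35.0), ("Fatigue", 50.0), ("Insomnia", 20.0)]
chart = render_bar_chart(rows, measure="Respondents reporting symptom", unit="%")
print(chart)
```

A header like "Respondents reporting symptom (%)" tells the reader at a glance that the bars are shares of a sample, which a header like "Relationship" does not.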

2 - Charts within charts

What is it?

Now that we’re warmed up let’s jump right into the deep end. Here’s an example of data visualization gone wrong, terribly wrong. This graphic was created by a company named JBH who, by the way, create infographics for a living. I hate to name and shame, but seriously, if you’re going to tout infographic production as a core offering you need to understand the basic principles of data visualization and design. The graphic shows the “State of Social Media Marketing in 2015”, which includes a range of stats related to social media network usage and behaviour. The full graphic can be viewed here.

What’s wrong with it?

Honestly I had to stare at this graphic for about 5 minutes before I understood what was happening, and I'm still not sure I get it. The most problematic part of the graphic is the section shown above. It’s downright confusing.

The first problem is that they’ve presented a volume metric (Total Users) as a ratio metric (i.e. 99.48%), and it’s unclear to me what the data is showing here. Is this the % of total users who access each app on an Android device? If so, the only interpretation I can derive is that 99.48% of users of the YouTube mobile app in the USA are using an Android smartphone. But intuitively this can’t be true. It would leave only 0.52% for every other platform, and surely more than 0.52% of YouTube app users in the U.S. are on iOS. Apple has a market share of roughly 43.6% in the U.S. and YouTube mobile is a popular app, so this just doesn’t seem possible. That means either a) their data is wrong, b) they have twisted the interpretation so far that it’s impossible to read, or c) I’m completely misreading this. But given the statement “According to data for the USA from SimilarWeb, the share of total Android users was”, I’m just not sure how else this graphic can be read.
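The sanity check here is simple arithmetic, sketched below in Python. The 99.48% figure is as read from the infographic and the 43.6% market share is the rough figure quoted above; the factor-of-10 threshold is an arbitrary assumption of mine, only there to make the check explicit.

```python
android_share_claimed = 99.48  # % of YouTube app users on Android, as read from the infographic
apple_market_share = 43.6      # rough U.S. market share for Apple, as quoted in the text

# If the Android figure really were a share of all YouTube app users,
# every other platform (iOS included) would have to fit in the remainder.
remainder = round(100 - android_share_claimed, 2)
print(f"Share left for all non-Android platforms: {remainder}%")

# Hypothetical plausibility check: flag the reading when the remainder is
# far smaller than Apple's overall market share (threshold is arbitrary).
implausible = remainder < apple_market_share / 10
print("Reading looks implausible:", implausible)
```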

But the confusion doesn’t end there. The inner circle, which shows the % of active users, is also hugely problematic. My first question is: are active users a subset of total users? It seems logical that this should be true, and if so they’ve actually misinterpreted the data (i.e. by concluding that Twitter, Pinterest and LinkedIn have more active audiences). The graphic actually shows that YouTube and Facebook have the highest levels of activity. I think what they’ve done is incorrectly conclude that the level of active users for Twitter, Pinterest and LinkedIn, relative to the % of total users, means that those networks have higher rates of activity, which is totally wrong. Either way, this graphic is poorly constructed and unnecessarily confusing. You shouldn't have to think this much to consume and interpret the meaning of an infographic.

What chart should they have used?

Honestly, I don’t know where to begin. My first suggestion would be to never nest a pie chart within a pie chart, or within any other chart type for that matter. Beyond that, there are tons of other issues with the data they’ve used and how they have presented it (e.g. volume vs ratio metrics). Simply removing the pie within a pie isn’t going to solve this, so my suggestion would be to scrap this graphic completely and start over.

3 - The parts don’t add up to a whole

What is it?

This was created by a U.S.-based storage company named Sparefoot. The graphic is a snippet of the full infographic, which was based on a combination of U.S. census data and Gallup polls and was intended to show how American society is changing over time with respect to household living arrangements.

What’s wrong with it?

I’m a sucker for flat design and nice typography so I almost gave this one a pass. But the data visualization sin here is common enough that it needs calling out. In short, the chart creator has used multiple values that aren’t part of a whole in a single pie chart. If you look at the graphic you can see that each pie chart is related to a state (e.g. have children, don't have children, etc.), and the charts are supposed to show the change over time between 3 non-adjacent points in time (1990, 2003 and 2013). Quick tip: if you’re attempting to show change over time, a pie chart is never going to be the right choice. A line or bar chart would be better suited to the task.

Anyways, the main issue here is that the 3 data points (i.e. time intervals) aren’t part of a whole, but they've been presented as if they are. For example, the values attached to the “Have children” pie chart come from 3 distinct data sets, and they don't combine to make 100% of something. By presenting them in a pie chart, the creator has unintentionally changed the meaning of the numbers. You can see the difference between the actual vs charted values (what the data means in the pie chart) in the table below.

[Image: table comparing the actual values with the charted values]
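To see why a pie chart changes the meaning of the numbers, remember that each slice is sized by its share of the sum of all slices. The Python sketch below uses made-up percentages for the three years (they are not the infographic's real figures) and shows what the pie ends up implying.

```python
# Hypothetical values: % of households in one state (e.g. "have children"),
# measured independently in each year. NOT the infographic's real figures.
actual = {1990: 45.0, 2003: 40.0, 2013: 35.0}

# A pie chart rescales its inputs so the slices always sum to 100%.
total = sum(actual.values())
charted = {year: round(value / total * 100, 1) for year, value in actual.items()}

print("Year  Actual  Charted (slice size)")
for year in actual:
    print(f"{year}  {actual[year]:>5.1f}%  {charted[year]:>6.1f}%")
```

Each year's value is a share of that year's households, so the three numbers have no reason to sum to 100%. The pie forces them to, and in this sketch a 45% figure gets drawn as a 37.5% slice.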

What chart should they have used?

Although I mentioned above that line charts are typically better suited to showing change over time, I wouldn’t recommend a line chart here as the time intervals aren’t adjacent (year over year), so a bar chart would be the best way to go.

There are 2 more wonderfully bad visualizations to cover. Want to see them? Read the full post on my blog.
