Changing the message without changing the data

Two formats, two messages. Time for a new example?

If you’ve seen me present in the last 3 years, you’ll probably have seen me show the Iraq Bloody Toll chart. Then you’ve seen me turn it upside down to create an entirely different message (full post here).

I still love showing this example to new audiences. I love seeing the light bulb go off as they realise that data and a chart are just a method of communicating a message: facts are not neutral.

But it’s time to find a new example and for that I turn to you for help.

Have you got any other great examples of charts where the message can be transformed in as simple a way as this one? 

(Note: I’m only looking for examples that stay true to good practice. Truncating the y-axis doesn’t count!)

There are some older examples. Obama’s bikini chart was a cracker, described very well by Robert Kosara in 2012.

Do you know any others? If I can find enough, we could turn this into an entire blog post or webinar. Let me know in the comments or on Twitter.

The little of visualization design: grouping with colour

[I’m shamelessly stealing the idea for “the little of” from Andy Kirk. His series has been amazing]

Yesterday in MakeoverMonday we tackled the worldwide shipping industry. Today the Economist shared a chart about the same topic (above). I highly recommend you also read the full article as it’s about an industry trying to save itself through the use of better data.

What I like in this chart is the “Proposed alliances” blocks next to the company names. The primary question in the story is the size of the shipping companies. The secondary question, dealt with in the second half of the article, covers alliances. They designed the chart so you can get on with answering the primary question without the initial distraction of the alliance information.

If I’d designed this chart, I’d have probably coloured the whole bar according to the proposed alliance. I’ve mocked that up below.

coloured-bars

What’s the problem? You can’t escape the colours. You see the colours before you answer the primary question. What the Economist did was keep simple bars that allow the primary question to be answered first. Once that’s dealt with, you can examine the alliance colours and focus on the secondary question.

The perils of bubbles

Another day, and another tragic event. The events in Nice horrified us all. One excellent response is to reinforce that, should this turn out to be a terrorist attack, the individual does not represent all Muslims. The number of terrorist-minded Muslims is tiny. We can make that point using data.

This afternoon I saw the following chart tweeted by Ian Bremmer, which makes this point:

The point being made is great, but look at those circles: if you put that ISIS circle inside the Muslim circle, it’s kinda big. So I took the data from that chart, and redrew it in Tableau. Turns out, the circles were the wrong size. Here’s what the sizes really look like, if you draw the circles so that their AREA represents the value:

Muslim

 

Can you see the circles for Al Qaeda, ISIS, or the Taliban? They’re up there in the top right. Tiny, aren’t they?

The lesson here is that, if you’re going to use circles, size them according to the area. The people behind the original chart, MIIM Design, were trying to make a valid point, but the circle sizes misrepresented the values, which could cause confusion.
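To make the “size by area” rule concrete, here is a minimal sketch in Python. The function name and the numbers are my own assumptions for illustration; they are not the figures from the MIIM chart, and this is not how Tableau does it internally. The idea is simply that area grows with the square of the radius, so the radius should scale with the square root of the value.

```python
import math


def radius_for_value(value, largest_value, largest_radius):
    """Return a radius so that the circle's AREA is proportional to value."""
    # Area grows with the square of the radius, so scale by the square root.
    return largest_radius * math.sqrt(value / largest_value)


# Hypothetical values for illustration only (not the figures from the chart).
big, small = 100.0, 25.0
big_radius = 10.0
small_radius = radius_for_value(small, big, big_radius)

# The ratio of the two areas should match the ratio of the two values.
area_ratio = (math.pi * small_radius ** 2) / (math.pi * big_radius ** 2)
print(small_radius, area_ratio)
```

With these made-up numbers, a value a quarter the size gets a circle with half the radius, which covers a quarter of the area.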

Note: I used the same numbers MIIM used in their chart, taking the upper estimate of each category. I have not done research to check the validity of these numbers. My goal is to make a point about circle sizes, not a political point about the size of different religious or terrorist groups.

Update: I replaced the original image with one with a new title. One commenter suggested, rightly, that I might have implied “Muslim” is a terrorist organisation. That absolutely was not my intention or my belief.

 

Dataviz criticism: know the author’s intentions first

Which is most effective? Which do you prefer? Cole on the left, Steve on the right?

Stephen Few has been discussing 100% stacked bar charts. He’s opened a great debate with Cole Nussbaumer. I recommend you go and read the blog and the comments. In a nutshell, Steve prefers lines over stacked bars because they allow comparison of all data points at all places in the time series.

My problem is Steve’s assumption that Cole intended people to be able to compare all data points at all times in the chart. In one of the comments, he says this:

“The information and eventual understanding that we get from a graph is more important than any impression that it provides unless that impression is all that matters.”

What if ‘that impression’ IS all that matters?

What impression do you get from this?

Sure, Steve’s lines allow you to do more accurate comparison of more data points. However, why should that be the only purpose of a chart? That purpose is one he imposed on Cole’s chart. It might not be the objective Cole had in mind when designing it.

Maybe Cole’s objective was to highlight the % missed only (ie the red marks on each chart). The other bar segments are secondary information, not vital to her key objective in making a 100% stacked bar. The “impression” I get from her stacked bar is that Missed Targets have increased to 42%. If her intention was that I get that impression, then it’s a success.

I might argue that if that was her intention, she could have drawn just the % missed data, and ignored the rest. However, this is at the expense of secondary information. Here’s what that could look like:

Show just the data necessary

Steve also wanted other examples of multi-segment stacked bars. I found one from my UK Election project:

% of total

I acknowledge the problems with this chart: it’s very very hard to see which organisation tweeted most about UKIP, or the Lib Dems, or any of the other central segments.

But my intention was NOT to allow comparison of every segment. My intention (as shown by the title of the view) was to highlight which organisations were tweeting most about the Conservative party. That was my prime goal. The Conservatives are the left-most segment and sorted in descending order: it’s easy to see which orgs tweeted most about the Conservatives.
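For anyone who wants to try the same idea outside Tableau, here is a rough sketch in Python of the calculation behind that view: turn each organisation’s counts into a share of its own total, then sort by the one segment you care about. The organisation names and tweet counts are hypothetical, made up purely for illustration, and the function names are my own.

```python
def share_of_total(counts):
    """Convert raw counts into shares of the total (assumes a non-zero total)."""
    total = sum(counts.values())
    return {party: n / total for party, n in counts.items()}


def sort_by_share(tweets_by_org, party):
    """Return (org, shares) pairs, highest share of `party` first."""
    shares = {org: share_of_total(c) for org, c in tweets_by_org.items()}
    return sorted(shares.items(), key=lambda item: item[1][party], reverse=True)


# Hypothetical counts for illustration only.
tweets_by_org = {
    "Org A": {"Conservative": 30, "Labour": 50, "UKIP": 20},
    "Org B": {"Conservative": 60, "Labour": 30, "UKIP": 10},
}

ranked = sort_by_share(tweets_by_org, "Conservative")
for org, shares in ranked:
    print(org, f"{shares['Conservative']:.0%}")
```

Putting the sorted segment first in each bar gives it a common baseline, which is why that one comparison stays easy while the middle segments are compromised.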

Other information which can be learnt from this is secondary to my prime purpose and therefore was intentionally compromised by using stacked bars. An interactive version would add tooltips and more contextual data, allowing the curious to discover more in the chart.

I acknowledge the stacked bar isn’t perfect, but I don’t know how else I could have designed the chart so that it answered my prime intention (% of tweets about Conservatives) and allowed the viewer to see secondary information. How would you have redesigned it? If you wish, the data is here.

Remember, every visualisation is a compromise. And every visualisation has a prime intention which must be considered before critiquing it.

I wholly recommend the following posts for further reading:

 

Design Tips for Functional and Beautiful Dashboards

before after portrait

Over on the Tableau blog, we published a post on how easy it is to quickly format a key metrics dashboard. We then redesigned it in response to some valid criticisms of the original (click here). The changes made are subtle but effective. In this post I wanted to describe them to show how to maintain functionality when going minimal with your dashboards.

Example 1: the line charts

before and after line charts

What was wrong with the original? The lack of labels makes things clean, but stops you learning anything about the values:

why label

I changed the following:

  • Added the x-axis headers. I’m all for hiding headers (see “How to design an axis for maximum impact”), but I think it’s wrong to hide the x-axis.
  • Dual axes provide better design control in Tableau:
    • The Sales chart is an area/line chart. This gives more definition to the top of the area.
    • The profit chart is a line/circle chart. I find this gives more definition to the points than just choosing the “All markers” on the Colour shelf:
      Dual or all Markers
  • I wanted to avoid showing a y-axis but there still needs to be some way of seeing the magnitude of the measure. For this I added a maximum reference line, with the text aligned to the right. If a user filters the dashboard, the max will always be accurate.
  • All titles were changed to be left-aligned and single line. This creates better consistency across the dashboard. I aligned the Titles to the left and the Maximums to the right in order to prevent visual clutter and confusion. If they were closer together, it would be harder to tell which is which.
  • Note also that the zero line on the profit chart above is a Constant Reference Line, rather than an actual axis. I did this because it’s easier to control the formatting on a minimal chart.

Example 2: the Profit Charts

donuts

I don’t object to donut charts in all circumstances, but they are simply a bad choice when you make a donut for each Year (see p11 of Stephen Few’s classic, “Save the pies for dessert”). It’s nigh on impossible to see changes over time. I switched to a stacked area chart instead. This took up less space and allowed me to label the marks, too (removing the need for a y-axis).

Notice how the colour for Furniture is a really light salmon pink. That can be a problem against a white background, but I added borders to the marks to make them more obvious:
borders

BTW, I didn’t address a key risk with this kind of chart: what happens if there is negative profit?

Example 3: the highlight table

highlight table

I love highlight tables. However, I didn’t think this one worked for 2 reasons. First, the lack of labels creates a real problem here. Without them, it’s just a mosaic. I also was not convinced that you’d ever really want to see Sales by Week and Weekday. I changed it to Month and Day. The increase in sales towards the end of each year is now clear.

Example 4: General changes

  • Colour changes
    • I made extensive use of grey in the text in order to soften the labelling and accentuate the data marks.
    • I used more distinct colour palettes to differentiate Sales and Categories.
    • I switched from a floating layout to a grid layout. This way, Tableau controls the size of the grey borders between the charts.
    • I have to say, changing the colour was the single hardest thing on this. All the other decisions were kind of straightforward but every time I changed the colour, I felt like I’d made one thing better but something else worse. Also – the image on the Tableau blog is from before I added the borders around the marks. 
  • Layout
    • I changed the layout so that Sales-related charts are on the left, and Profits ones are on the right.

Conclusion

Would you have done something different? The dashboard is still not perfect. For example, the blue highlight table and the blue map are problematic. However, I hope this exercise has shown that it is possible to have both beauty and functionality in a Tableau dashboard.

The key lesson for me is that it’s not hard in Tableau to acknowledge feedback and then iterate fast to fix the problems.

Feel free to download the workbook and share your ideas.

Before and after…

Carlisle Flood Defences: the one chart I’d like to see

Carlisle flood

Carlisle, in the north of England, has been hit, again, by severe floods. I feel huge sympathy for those whose homes, again, have been damaged by floodwater. The £38 million spent on flood defences by the Environment Agency since the last major floods in 2005 is being described as ineffective. “How could all that money not have protected the houses?” they are being asked.

First of all, the EA never said the defences would protect against all future flooding.

What I’d like to see is a chart like the one above, but with real rain data. All the labelled years were genuine flooding years, but I do not know the actual rain values (apart from the 34.1cm-in-one-hour being reported for 2015). For the purposes of this, I’m using that value of 34 as the index value for 2015.

When setting a flood defence, an agency has to balance cost against the predicted maximum future level of flooding. If the EA predicted in 2005 that there would never be a 34cm deluge, then they wouldn’t waste public money building flood defences that high.

With real data in the above chart, we’d be able to see:

  1. How unprecedented is 2015? (in my chart, it’s 5cm higher than 2005)
  2. What level of rainfall are the flood defences set to protect against? (in my chart it’s higher than the 2005 and 1822 flood events)
  3. What predictions do they have for future precipitation levels and how regularly will they be higher than the flood defence level?

If 2015 truly is unprecedented and wholly exceptional, then blame shouldn’t be put on the flood defences.

This post was inspired by memories of the Fukushima disaster. Then, the tsunami which hit the power station was higher than the station’s sea wall.

What do you think? Would this chart help assess the success of the Environment Agency’s flood defences? What chart would help you answer the question?

NOTE: don’t forget, the data in the chart is not real. It’s for illustrative purposes only.

Is information visualisation research flawed?

From “Beyond Memorability…”

Stephen Few’s latest newsletter, “Information Visualization Research as Pseudo-Science” is a critique of the academic process in visualisation research. In it, he savages one paper in particular: “Beyond Memorability: Visualization Recognition and Recall.” He uses this as an example of what he thinks is a problem widespread in this field.

I agree there are problems in this paper. I agree with his suggestions for fixes.

However, I think it’s unfair to say this is a problem with visualisation research: it’s a problem with all research. In all fields, there are great studies and there are bad studies.

In this post, I’ll explain my own thoughts on the flaws of the paper, then the areas where I think Stephen is being unfair.

Here are my own thoughts on reading the paper (which I noted before I read Stephen’s article, as he instructed):

1. Stop publishing 2-column academic papers online!

Not the way to read papers online!

Why are academic papers STILL written in two columns? This is ridiculous in a time when most consumption is on screen. To read a 2-column PDF on my phone or tablet I need to do ridiculous down-up-right-down-left scrolling to follow the text. Come on academia: design for mobile!

2. Why are they measuring memorability?

I agreed with Stephen on the key problem: why are they measuring memorability? Isn’t it more important to understand the message of a visualisation?

3. Hang on, Steve! Problems with experimental technique are not unique to visualisation research

Stephen goes to town dismantling the study’s approach. For example, he criticises the small sample size and much of its methodology. I am not as expert as Stephen in this, but I find myself agreeing with most of this.

But where I differ is how he damns visualisation research as if the rest of research doesn’t have the same problems.

Let’s look at some:

i. Statistical unreliability

There is no shortage of academic papers with statistical problems caused by small samples. Here’s one on fish oil, dismantled by Ben Goldacre. Incidentally, the study he refers to also used 33 subjects.

He also outlines a statistical anomaly so extreme that half of all neuroscience studies are statistically wrong.

Conclusion? Statistical problems are not unique to visualisation research.

ii. Methodological misdirection

How many of the 53 landmark studies in cancer had results that could be replicated? 6.

Yes, 89% of landmark cancer studies have results which cannot be replicated. (source: this great article “When Science Goes Wrong” from The Economist)
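If you want to check where that 89% comes from, here is the arithmetic as a tiny Python sketch. It uses only the two numbers quoted above (53 studies, 6 replicated); the function name is my own.

```python
def share_not_replicated(total_studies, replicated):
    """Return the fraction of studies whose results could NOT be replicated."""
    return (total_studies - replicated) / total_studies


landmark_studies = 53
replicated = 6
not_replicated = landmark_studies - replicated
share = share_not_replicated(landmark_studies, replicated)

print(f"{not_replicated} of {landmark_studies} = {share:.0%}")
```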

Conclusion? Methodological problems exist in all science.

iii. Logical fallacies

Logical fallacies are hardly unique to visualisation research. For example, this list of the top 20 logical fallacies is a good example of how this is a problem in all science, not just visualisation research.

Part of this critique is surely just part of scientific rigor?

For a conclusion, I acknowledge that I’m not an academic and I don’t read many academic papers, so I am naive.

Part of me thinks that surely lots of this critique is just part of scientific research? Researchers publish papers and the world responds, positively and negatively. Future research then improves.

I assume Stephen’s frustration stems from the fact that many of these problems are perpetual and should have been fixed before the study started. I can’t disagree with that. But I don’t think the paper is “fundamentally flawed” as Stephen describes. Maybe memorability of the view is important? If so, this is a first step in the iterative, slow advance of academic research. The paper at the very least makes us consider the question of what it’s important to remember from looking at a visualisation. Having read it critically, I have considered the question and formed an opinion. That’s of value, surely?

I found it very interesting to sit and really read an academic paper in detail. I don’t do it often, and I respect people who can wade through the dense formulaic wording to get to the meaning.

[Updated 5pm 3 Dec to expand my summary]

 

What’s YOUR elevator pitch for data visualisation?

If you had to pitch dataviz, what would you say?

In this week’s #AskAndy Anything About Data, Chris Love (@chrisluv) asked Andy Kirk and me the following:

A great question to ask, but tougher to answer than you’d think. Andy K and I both gave our answers (go check the webinar recording to hear them). I then threw it back to Chris. He gamely took on the challenge. Here it is:  

Nice work, Chris.

NOW IT’S OVER TO YOU: what is YOUR elevator pitch for data visualization? Give yourself 30 seconds to 1 minute. How do you pitch data visualization? Tweet your efforts to me at @acotgreave. Hashtag: #datavizElevatorPitch

Image from LinkedIn Pulse

#AskAndy webinars: the director’s cut

Hello again. We didn’t manage to answer all the questions that were submitted in yesterday’s #AskAndy webinar. But don’t worry, we collected them all, and I’ve written some brief answers to each one.

Apurvaa Vijay Sharada: Do you think at some point the beauty of dataviz overtakes the actual intent behind it? If yes, how do we avoid that pitfall?

Can you have functionality and beauty in dataviz?

Yes – this can happen. You need to remember the purpose of your visualization. Always take a step back and ask if it’s actually possible to discern the meaning from the visualisation. Ask other people if they understand your viz. Adding beauty will make it more engaging, but unless you’re actually making data art, keep checking that the view actually works.

Melissa Black: What do you think are the most common mistakes made with visualisations?

Thinking of beauty before functionality! Many people get enticed by whizzy effects BEFORE they’ve learnt the basics about effective visual communication. On a practical note, I see many visualizations with disappointing titles. The title and caption are a chance to set out the exact purpose of the visualization. It takes moments to think about something appropriate and it can make a big difference to how it is interpreted.

Paul Banoub/Nicholas Bignell: Have you noticed any variations in dataviz trends / styles / usage across different global regions?

Great question. You know what, I don’t think I have. However, I do remember seeing a post somewhere on the blogosphere about this recently, but cannot find it. If anyone has that link, it was a useful read.

Imogen Robinson: Hi all, my question is: how can data visualisations be made more effective in informing decision-making?

The titles tell the story

  1. Make the titles of your views questions. Eg “Should we invest in Singapore?” The question reflects the decision you’re trying to reach. It also forces you to look at your view and ask yourself: does the view help me decide? If not, you have the wrong view.
  2. Use reference lines to show where thresholds are being beaten – anything above a line, for example, needs attention. You can also use simple colour schemes for this. eg grey for most things, and red for those that need action.
  3. Use scatterplots as a way of comparing measures. This helps you see whether one thing is related to another, though a relationship on its own doesn’t prove that one causes the other.
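The threshold idea in point 2 can be sketched in a few lines of Python. This is only an illustration of the logic, not how you would build it in Tableau (where a reference line and a calculated colour field do the job). The regions, values and threshold are hypothetical, and the function name is my own.

```python
def colour_for(value, threshold):
    """Red for anything above the threshold (needs attention), grey otherwise."""
    return "red" if value > threshold else "grey"


# Hypothetical figures for illustration only.
values_by_region = {"North": 120, "South": 80, "East": 150}
threshold = 100

colours = {
    region: colour_for(value, threshold)
    for region, value in values_by_region.items()
}
print(colours)
```

Keeping most marks grey means the handful of red ones are the first thing the decision-maker sees.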

Peter Wallis: Could you point us towards some other good blogs to help newbies?

I use and recommend Feedly for blog reading. You can import this OPML file into Feedly. It’s a list of all the Tableau and BI blogs I read.

Nicholas Bignell: Who are your favourite bloggers?

The Tableau blogging scene is huge (just check any month’s Best of Tableau Web roundups).

I’m more enthused by some of the amazing dataviz podcasts at the moment. Go check them out.

Suzanne Wilson: How do you define the relationship btw data, info, knowledge & intelligence?

I think there’s value in thinking of a wisdom funnel (this isn’t anything new – check out Google) but I have an aversion to funnels as metaphors: they suggest a linear process and a single direction. I’m more keen to think of data, info, knowledge and intelligence as part of a cycle of visual analysis. A bit of knowledge might make you seek out more data. More data gives you info, but only by sharing and acting with others do you get the intelligence.

Anton Lokov: What is your favourite tool for rapid dataviz prototyping?

Why, Tableau Desktop, of course! But seriously, I don’t think anything is as good for exploring data quickly to prototype views.

Anil Mistry: Following on from Biel’s question, I am new to Tableau, are there any specific beginner books you’d recommend?

A great new book is Storytelling With Data by Cole Nussbaumer. It covers the basics. I really enjoyed Information Dashboard Design by Stephen Few; it’s a fantastic introduction. For starting with Tableau, try Communicating Data With Tableau by Ben Jones.

Auzema Qureshi: What was/is the biggest challenge you have faced when using Tableau for data visualisation?

I remember doing a live “Analysts in the Hot Seat” at a Tableau conference in 2011. We were given a dataset, which we’d not seen before, and asked to explore it and build a dashboard in front of an audience. It wasn’t a great conference session. I spent 10 minutes semi-randomly dragging and dropping things around some worksheets, and duplicating lots of things. Then I realised that, uninspiring as it was to watch, this was actually EXACTLY why Tableau’s so amazing – I was throwing things around to see what stuck. What had so far felt embarrassing (“Really? This is how Andy Cotgreave plays with data?”) was actually the power of Tableau – drag, drop, fail fast, move on. After the 11th minute, all that exploration began to gel into something really powerful and I built a cool dashboard.

What charts would make your 5-a-side squad?

I just got off the fun webinar with Andy Kirk: AskAndy Anything About Data. It was fun to field questions from social media. I hope you all enjoyed it.

One of the questions was from Andy Kriebel:

We decided to take Tableau’s Show Me charts and do a squad selection 5-a-side. Andy picked one, then I picked from the remainder. Here are the “squads” we ended up with, in the order we picked them (ie Andy K picked line chart, I picked bar chart, he picked treemap, etc).

Which 5 would you choose?

I was surprised Andy K took the treemap as second choice. Andy suggested that maps are overrated (shock!). We both agreed that boxplots and bubbles could be happily left on the bench.

What do you think? If you could only use 5 chart types ever again, what would they be?

The webinar recording is available here.