The little of visualization design: grouping with colour

[I’m shamelessly stealing the idea for “the little of” from Andy Kirk. His series has been amazing]

Yesterday in MakeoverMonday we tackled the worldwide shipping industry. Today the Economist shared a chart about the same topic (above). I highly recommend you also read the full article as it’s about an industry trying to save itself through the use of better data.

What I like in this chart is the “Proposed alliances” blocks next to the company names. The primary question in the story is the size of the shipping companies. The secondary question, dealt with in the second half of the article, covers alliances. They designed the chart so you can get on with answering the primary question without the initial distraction of the alliance information.

If I’d designed this chart, I’d have probably coloured the whole bar according to the proposed alliance. I’ve mocked that up below.

coloured-bars

What’s the problem? You can’t escape the colours. You see the colours before you answer the primary question. What the Economist did was keep simple bars that allow the primary question to be answered first. Once that’s dealt with, you can examine the alliance colours and focus on the secondary question.

Design Tips for Functional and Beautiful Dashboards

before after portrait

Over on the Tableau blog, we published a post on how easy it is to quickly format a key metrics dashboard. We then redesigned it in response to some valid criticisms of the original (click here). The changes made are subtle but effective. In this post I want to describe them to show how to maintain functionality when going minimal with your dashboards.

Example 1: the line charts

before and after line charts

What was wrong with the original? The lack of labels makes things clean, but stops you learning anything about the values:

why label

I changed the following:

  • Added the x-axis headers. I’m all for hiding headers (see “How to design an axis for maximum impact”), but I think it’s wrong to hide the x-axis.
  • Dual axes provide better design control in Tableau:
    • The Sales chart is an area/line chart. This gives more definition to the top of the area.
    • The profit chart is a line/circle chart. I find this gives more definition to the points than just choosing the “All markers” on the Colour shelf:
      Dual or all Markers
  • I wanted to avoid showing a y-axis but there still needs to be some way of seeing the magnitude of the measure. For this I added a maximum reference line, with the text aligned to the right. If a user filters the dashboard, the max will always be accurate.
  • All titles were changed to be left-aligned and single line. This creates better consistency across the dashboard. I aligned the Titles to the left and the Maximums to the right in order to prevent visual clutter and confusion. If they were closer together, it would be harder to tell which is which.
  • Note also that the zero line on the profit chart above is a Constant Reference Line, rather than an actual axis. I did this because it’s easier to control the formatting on a minimal chart.
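The maximum reference line in the list above stands in for a y-axis: its label is recomputed from whatever rows survive the dashboard filters. Here is a minimal Python sketch of that idea, outside Tableau. The records, the `region` filter and the `max_label` helper are all hypothetical and only illustrate the behaviour.

```python
# Hypothetical records: (region, month, sales). Not data from the dashboard.
ROWS = [
    ("East", "2013-01", 120.0),
    ("East", "2013-02", 310.5),
    ("West", "2013-01", 95.0),
    ("West", "2013-02", 80.0),
]


def max_label(rows, region=None):
    """Return the label text for a maximum reference line.

    The maximum is taken over the filtered rows only, so the label
    stays accurate whatever filter the user applies.
    """
    values = [sales for r, _month, sales in rows if region is None or r == region]
    if not values:
        return "Max: n/a"  # nothing left after filtering
    return f"Max: {max(values):,.1f}"


print(max_label(ROWS))            # unfiltered
print(max_label(ROWS, "West"))    # filtered to one region
```

The point of the design is that the reader gets one trustworthy number for scale without the chart having to carry a full axis.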

Example 2: the Profit Charts

donuts

I don’t object to donut charts in all circumstances, but they are simply a bad choice when you make a donut for each Year (see p11 of Stephen Few’s classic, “Save the pies for dessert”). It’s nigh on impossible to see changes over time. I switched to a stacked area chart instead. This took up less space and allowed me to label the marks, too (removing the need for a y-axis).

Notice how the colour for Furniture is a really light salmon pink. That can be a problem against a white background, but I added borders to the marks to make them more obvious:
borders

BTW, I didn’t address a key risk with this kind of chart: what happens if there is negative profit?
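To make that risk concrete, here is a rough Python sketch of how stacked bands are usually computed: each category sits on a running baseline. The profit figures and the `stack_bands` and `inverted_bands` helpers are hypothetical, and this is not how Tableau itself handles negative values. It only shows why a simple stack stops being readable once a value dips below zero.

```python
def stack_bands(series):
    """Stack each series on a running baseline.

    series: dict of name -> list of values (all lists the same length).
    Returns dict of name -> list of (lower, upper) edges per period.
    """
    length = len(next(iter(series.values())))
    baseline = [0.0] * length
    bands = {}
    for name, values in series.items():
        band = []
        for i, value in enumerate(values):
            band.append((baseline[i], baseline[i] + value))
            baseline[i] += value
        bands[name] = band
    return bands


def inverted_bands(bands):
    """List (name, period index) pairs where a band's top is below its bottom."""
    return [
        (name, i)
        for name, band in bands.items()
        for i, (lower, upper) in enumerate(band)
        if upper < lower
    ]


# Hypothetical profit by category over three periods; Furniture dips negative.
PROFIT = {"Furniture": [10, -5, 8], "Technology": [20, 25, 30]}
bands = stack_bands(PROFIT)
print(inverted_bands(bands))
```

In the second period the Furniture band is upside down and the Technology band starts below zero, so the two areas overlap and the heights no longer read as a clean part-to-whole.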

Example 3: the highlight table

highlight table

I love highlight tables. However, I didn’t think this one worked, for two reasons. First, the lack of labels creates a real problem here: without them, it’s just a mosaic. Second, I was not convinced that you’d ever really want to see Sales by Week and Weekday, so I changed it to Month and Day. The increase in sales towards the end of each year is now clear.
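For anyone who wants to see the regrouping as data, here is a small Python sketch of a Month by Day grid. It assumes one row per calendar month (year and month) and one column per day of the month, which is my reading of the change rather than a statement of how the workbook is built. The orders and helper names are hypothetical.

```python
from collections import defaultdict
from datetime import date

# Hypothetical orders: (order date, sales). Illustrative values only.
ORDERS = [
    (date(2012, 3, 14), 100.0),
    (date(2012, 12, 14), 250.0),
    (date(2013, 12, 14), 300.0),
    (date(2013, 12, 24), 150.0),
]


def month_day_grid(orders):
    """Sum sales into one cell per ((year, month), day of month)."""
    grid = defaultdict(float)
    for order_date, sales in orders:
        grid[((order_date.year, order_date.month), order_date.day)] += sales
    return dict(grid)


def month_totals(grid):
    """Collapse the grid to one total per (year, month) row."""
    totals = defaultdict(float)
    for (month, _day), sales in grid.items():
        totals[month] += sales
    return dict(totals)


grid = month_day_grid(ORDERS)
print(month_totals(grid))
```

Each cell value would drive the colour in the highlight table, and the row totals make a year-end bump easy to check.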

Example 4: General changes

  • Colour changes
    • I made extensive use of grey in the text in order to soften the labelling and accentuate the data marks.
    • I used more distinct colour palettes to differentiate Sales and Categories.
    • I have to say, changing the colour was the single hardest thing on this. All the other decisions were kind of straightforward but every time I changed the colour, I felt like I’d made one thing better but something else worse. Also, the image on the Tableau blog is from before I added the borders around the marks.
  • Layout
    • I switched from a floating layout to a grid layout. This way, Tableau controls the size of the grey borders between the charts.
    • I changed the layout so that Sales-related charts are on the left, and Profits ones are on the right.

Conclusion

Would you have done something different? The dashboard is still not perfect. For example, the blue highlight table and the blue map are problematic. However, I hope this exercise has shown that it is possible to have both beauty and functionality in a Tableau dashboard.

The key lesson for me is that it’s not hard in Tableau to acknowledge feedback and then iterate fast to fix the problems.

Feel free to download the workbook and share your ideas.

Before and after...

The 5 most influential vizzes of all time

I was delighted to have another chance to deliver my 5 influential vizzes talk at Tableau’s European Customer Conference today. For those of you who couldn’t make it, or want to watch it again, here is a recording of the session I did for Data Science Central:
[iframe src=”http://player.vimeo.com/video/62299097″ width=”500″ height=”281″ frameborder=”0″ webkitAllowFullScreen mozallowfullscreen allowFullScreen][/iframe]

The 5 Most Influential Visualizations of All Time from Tim Matteson on Vimeo.

And the slides:

The 5 Most Influential Vizzes of All Time

[iframe class=”scribd_iframe_embed” src=”http://www.scribd.com/embeds/114444983/content?start_page=1&view_mode=scroll&show_recommendations=true” data-auto-height=”false” data-aspect-ratio=”undefined” scrolling=”no” id=”doc_40187″ width=”100%” height=”600″ frameborder=”0″][/iframe]

The right way to implement a bump chart

I am a great fan of bump charts. They show changes in rank very effectively. Two bump charts caught my attention this week; they teach us an interesting lesson on how to implement them. The first was based on the history of the Oxford Bumps and brought to my attention by a post on Infosthetics (click the image to go to the original).

The second I spotted in a tweet by Andy Kirk:

premiershipbump

Both visualise really interesting data. But one is vastly more successful than the other. The Oxford Bumps chart is far too cluttered and difficult to read. The Premiership chart is less cluttered and thus easier to read. Bump charts are, by their nature, like spaghetti and the appropriate reduction of clutter is vital. One way to solve this is to use well-implemented highlighting. This is where the Premiership bump succeeds and, unfortunately, the Oxford chart does not. Here are screen grabs of both bump charts with one item selected (St Edmund’s College on the left, Newcastle United on the right):

highlighted

The problem should be very clear. The highlighting on the Oxford chart does not create enough contrast: it is very hard to see the yellow highlighted chart against the background. Newcastle United, on the other hand, is clear as day – the background is faded out.

The Oxford bump chart could be easily improved by reducing the noise in the chart. If the highlight is done in conjunction with fading out the background, a bump chart is much easier to read.
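Here is a minimal Python sketch of the highlight-plus-fade rule, independent of any charting tool. The colours, opacity values and the `line_styles` and `draw_order` helpers are my own assumptions for illustration. They are not taken from either chart.

```python
def line_styles(series_names, selected, highlight="#d62728", faded="#cccccc"):
    """Give the selected series a strong style and push every other one back."""
    styles = {}
    for name in series_names:
        if name == selected:
            styles[name] = {"color": highlight, "alpha": 1.0, "linewidth": 2.5, "zorder": 2}
        else:
            styles[name] = {"color": faded, "alpha": 0.4, "linewidth": 1.0, "zorder": 1}
    return styles


def draw_order(styles):
    """Order names so the highlighted line is drawn last, on top of the faded ones."""
    return sorted(styles, key=lambda name: styles[name]["zorder"])


TEAMS = ["Arsenal", "Newcastle United", "Chelsea"]  # example names only
styles = line_styles(TEAMS, selected="Newcastle United")
print(draw_order(styles))
```

The highlight colour matters less than the fade: contrast comes from quietening everything that is not selected.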

What do you think? Feel free to comment below.

Simple v simplistic: if McCandless reworked Minard

There has been much interesting debate about David McCandless’ Information Is Beautiful this week, initiated by a well-written critical piece by Stephen Few. I am pleased he has challenged the orthodox view that McCandless is the answer to the data visualisation industry’s problems.

On both FlowingData and Stephen’s own blog, there is debate about the difference between “simple” and “simplistic” graphics. This is a hard point to describe, and there are two comments on Stephen’s blog (by DR, and Stephen himself) that made me realise that a picture would emphasise the difference. I wondered how McCandless, if the accusation of being simplistic is correct, would rework what’s often claimed to be one of the greatest data visualisations.

Consider the classic work by Charles Minard showing Napoleon’s disastrous 1812 march on Moscow. This graphic, I believe, fits Stephen’s definition of simple. Sure, there are many dimensions being displayed (soldier numbers, temperature, location, etc.) but once the viewer understands that, the powerful anti-war message is unavoidable. The waste of life is brutally clear and well contextualised by the time, location and temperature:

Minard's march on Moscow (from Wikipedia)

(image from Wikipedia: http://en.wikipedia.org/wiki/File:Minard.png)

If McCandless is to be accused of being simplistic, how might a simplistic version look? I think it would be like this:

Why? The simplistic approach tries to strip away as much data as it can. Minard’s main point was the loss of life. The version above shows just that and no more. One could argue that therefore it is more effective. But it isn’t. Minard’s simple graph gives much more context without any fluff and that, to me, is the difference between “simple” and “simplistic”.

Update: Fixed the spelling of David McCandless’ name. Sorry about the typos.

Lollipop charts: part two

In my previous post, I explained how I stumbled across the lollipop chart as a way of displaying data when the values are all very high. This post reveals how I did it. Those of you who are savvy with the new features in v6 will probably have immediately guessed it was using dual axes. Here’s what my lollipop chart looks like:

What’s the trick? It revolves around duplicating your measure on the columns shelf, as follows:

You need to make sure it’s dual axis. Right-click on the second Measure pill and choose “Dual Axis” to draw each AVG(Satisfaction) on the same pane. Then right-click on one of the axes and choose “Synchronise axis” to make sure they match completely. This step is important because if you add labels to one of the measures, Tableau might stretch one axis to fit the text of the label in.

The next step is to define multiple marks. This feature might be new to you, and doesn’t exactly jump out at you in the user interface. Click on the little drop-down arrow at the right of the Marks shelf and choose “Multiple Mark Types”:

The Mark shelf now has a new row: it will say “All” and have left/right arrows. With the multiple mark feature, you can format all measures at once, or each one individually. What we need to do is set one mark to be a bar (a very thin one) and the other to be a circle, as shown below:

In my example, I have also added colour to emphasise the Customer Segment dimension. You’ll see that each Mark uses a different Dimension. While experimenting with the lollipops, I discovered that if the connecting line is the same intensity as the circle, it overwhelms the circle. By using a lighter palette on the line, things look nicer. I achieved this by duplicating the Dimension, and assigning lighter colours of the same hue to its members.

To label the min/max I turned on labels for just the Circle mark.

That’s about all you need to do. I added some shading and subtle row divider lines to highlight the different states. Finally, I formatted the axes so they were pretty much hidden.
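If it helps to see the structure outside Tableau, here is a rough text-mode sketch in Python. The idea is the same as the dual axis: one value drives two marks on a shared scale, a thin stem and a circle at its end, with the label on the circle only. The state names, values and the `lollipop_rows` helper are hypothetical, and this illustrates the design rather than the Tableau steps themselves.

```python
def lollipop_rows(values, width=40, axis_max=1.0):
    """Render each value as a thin stem ('-') ending in a circle ('o').

    The axis runs from zero to axis_max. Both marks come from the same
    value, so they always line up (the role of the synchronised dual axis).
    """
    label_width = max(len(name) for name in values)
    rows = []
    for name, value in sorted(values.items(), key=lambda kv: kv[1], reverse=True):
        stem = round(value / axis_max * width)
        # Label sits next to the circle only, as in the chart.
        rows.append(f"{name.ljust(label_width)} |{'-' * stem}o {value:.0%}")
    return rows


# Hypothetical satisfaction rates, not the real dataset.
RATES = {"Ohio": 0.90, "Texas": 0.75}
rows = lollipop_rows(RATES, width=20)
print("\n".join(rows))
```

Even in plain text the stem keeps the circle tied to its row label, which is the whole job of the thin bar.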

I’ve started using this technique regularly as it engages users, and improves the data-ink ratio without sacrificing interpretation. What do you think?

Lollipop charts: the search for the perfect mark (part one)

Here’s the problem: I am visualising satisfaction rates over multiple dimensions. In almost all cases, satisfaction rates are high (between 70% and 100%). I want a visualisation that allows comparison over multiple dimensions that is also nice on the eye. Below is the result: a lollipop chart. Although I stumbled across this design by trial and error in Tableau, it is a chart type found elsewhere, e.g. on Chandoo’s excellent Excel blog. What I thought I would do in this post is explain why I think it’s a great chart in this situation. Note: in this post, I’m using the Superstore data, not my real dataset. In my next post, I’ll explain how to build a lollipop in Tableau. If you can’t wait that long, you could try it yourself as your homework 🙂


To me, it’s a great way to reduce the amount of ink (and so improve the data-ink ratio) while retaining readability. What do you think? Here’s how I arrived at this design.

Tableau’s default visualisation is the bar. What’s the problem with this? Well, when the bars are all very long (as is the case with my data), there’s just too much ink, and it creates an unpleasant Moire effect:

How can we solve this? Well, we can reduce the bars to wafer thin ones, but this looks, well, flakey:

Maybe we should push the size slider to the max (and add a border). This is what I would normally do in this situation. It removes the Moire effect, and isn’t too bad, but boy, there’s now a lot of ink being used:

Given there’s too much ink, maybe the bar itself is the problem. So how does a circle work? Well, the problem is that the circle is a long long way away from the label. When we try and foist this kind of thing on our users, they tell us it’s too hard to relate the circle to the name, even using shaded lines:

We can get round this distance problem in a couple of ways. One is to fix the axis so that its range is only as wide as the min/max values:

But we all know that an axis that doesn’t start at zero is a bad thing, right? Well, sometimes it isn’t a bad thing, but it sure makes the states at the bottom of the list look like poor performers, even though they’re actually only 0.6% lower than the top of the list. Best in this case to keep the axis starting at zero. Maybe we could label the circle directly instead:

This still isn’t right: all that white space at the left of the chart seems wrong.
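The axis-truncation point above is easy to check with a little arithmetic. This Python sketch uses two hypothetical satisfaction rates that differ by 0.6 percentage points and compares where the marks land on a zero-based axis and on an axis fixed to the min/max. The values and the `axis_position` helper are assumptions for illustration.

```python
def axis_position(value, axis_min, axis_max):
    """Fraction of the axis length at which a mark is drawn (0 = start, 1 = end)."""
    return (value - axis_min) / (axis_max - axis_min)


TOP, BOTTOM = 0.900, 0.894  # hypothetical best and worst states

# Axis from 0% to 100%: the two marks are almost on top of each other.
zero_based_gap = axis_position(TOP, 0.0, 1.0) - axis_position(BOTTOM, 0.0, 1.0)

# Axis fixed to the min/max: the same two marks sit at opposite ends.
truncated_gap = axis_position(TOP, BOTTOM, TOP) - axis_position(BOTTOM, BOTTOM, TOP)

print(f"zero-based axis: {zero_based_gap:.1%} of the axis apart")
print(f"truncated axis:  {truncated_gap:.0%} of the axis apart")
```

A tiny difference fills the whole width of the truncated axis, which is why the bottom of the list ends up looking like a poor performer.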

And this was when I had my brainwave. Thick bars are no good and lonely circles are no good. How about making a combination of them both? And that’s how I came up with a lollipop. I think it has the following benefits:

  1. Can be used when all dimension members have high values (i.e. long/tall bars in a bar chart)
  2. Greatly improves the data-ink ratio while maintaining a clear link to axis labels
  3. All the users I’ve shown it to so far have really engaged with it – they think it’s both pretty and easy to read

I also like the fact that it works if you add more dimensions to make small multiples:

Next time we’ll look at how to build it in Tableau. In the meantime, let me know your thoughts.





Oxford Geek Night: Pie charts

(scroll down to see the slides)

I gave a presentation at Oxford Geek Night #20 on Feb 9th. If you were there and have visited the blog because of that session, welcome and thanks for having a look around. If you are new to the blog, you can follow me on Twitter (@acotgreave) to keep up with Tableau and Data Viz related things. Or you can subscribe to the blog’s RSS feed.

My session, Pie charts: good or evil, was intended to show people how poor pie charts are at representing anything other than at-a-glance information. One could argue they fail even for that. This post is a collection of resources I used for the session, in case you want to do some further reading.

I didn’t get to cover some important points in the session. One common rebuttal to the argument is that you could add labels to the pie, showing the percentage. Well, yes, but that’s like admitting defeat. That is an acceptance that the chart alone does not convey the information correctly. If you need to label the slices, then surely you don’t even need the chart itself? And if that’s the case, then you’ve just proven to yourself that a ranked table of text is a better way to display the data, as shown below:
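As a quick illustration of the ranked-table alternative, here is a short Python sketch that turns the same numbers you would put in a pie into a ranked list with percentage shares. The traffic-source counts and the `ranked_table` helper are made up for the example.

```python
def ranked_table(counts):
    """Return text rows ranked by share of the total, largest first."""
    total = sum(counts.values())
    width = max(len(name) for name in counts)
    ranked = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
    return [
        f"{rank}. {name.ljust(width)}  {count / total:6.1%}"
        for rank, (name, count) in enumerate(ranked, start=1)
    ]


# Hypothetical visit counts by traffic source.
VISITS = {"Direct": 300, "Search": 450, "Referral": 250}
table = ranked_table(VISITS)
print("\n".join(table))
```

The order and the exact shares are both readable at once, with no angles to judge.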

Wikipedia’s pie chart page now has a prominent section describing why the pie is bad.

I also recommend Jorge Camoes’ great blog. He eloquently describes the problems in at least two posts (this one, and this one, for starters).

For more on the specifics of the problems with Google Analytics, visit Coda Hale’s blog.

Stephen Few provides some in-depth analysis of the problems with all things circular (pdf). He also describes the significant problem with comparing relative sizes of pies, something I didn’t have time to touch upon in my session. There is a really great description of why size/area of circle is a poor choice at Contrast’s blog.

If you want to find some bad examples of pies, well, that’s like shooting fish in a barrel. Here are just a few:

The Breakdown of the blogosphere. Six variations on the pie chart. None of them any use. EagerEyes features this infographic in his excellent March Chart Madness post.

I did mention there are some reasons a pie might be a good option. There are at least a couple. If you specifically need to compare several dimension members against other dimension members, a bar chart doesn’t make that easy. If your pie chart is going to be seen for a short period of time, such as during a presentation, and it has three or fewer slices, then a pie can make a point quickly. In this latter case, don’t put the pie in the handouts – give the user more info with a table or bar chart.

Finally, here are my slides. I’m not sure how useful they are without my words, mind: