Killing the paired bar chart

In which years did B outsell A?

Jon Schwabish just posted a nice solution to the problem of the side-by-side bar chart. I won’t go into why they’re A Bad Thing – he does that just fine.

I wanted to put this post together because it’s something I’ve been thinking about too. My solution is slightly different. Consider the side-by-side bar chart at the top showing sales of Product A and B over ten years. Too much ink! It’s confusing and impossible to interpret. It’s really hard to see anything.

How else can we show this info and ask “in which years did B outsell A?” Simple. Do something heretical and connect the dots using a line (what? Use a line to connect discrete values? But you can’t do that!):

side by side slope

Because we’re so well evolved to see slopes, we quickly and easily see the three years in which B outsold A:

side by side highlight
Click here to download a workbook with this chart in it

In this example, because it’s sales over time, I kept the years as separate panes.
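If you want to check what your eyes are telling you, the question "in which years did B outsell A?" is a simple comparison per year. Here is a minimal sketch in Python. The sales figures are invented purely for illustration (they are not the numbers behind my chart), and the function name is my own.

```python
# Hypothetical sales figures, invented for illustration only.
# Each year maps to the sales of Product A and Product B.
sales = {
    2005: {"A": 120, "B": 95},
    2006: {"A": 130, "B": 110},
    2007: {"A": 125, "B": 140},
    2008: {"A": 150, "B": 120},
    2009: {"A": 160, "B": 135},
    2010: {"A": 155, "B": 170},
    2011: {"A": 170, "B": 150},
    2012: {"A": 180, "B": 160},
    2013: {"A": 175, "B": 190},
    2014: {"A": 190, "B": 165},
}


def years_b_outsold_a(sales_by_year):
    """Return, in chronological order, the years where B's sales exceed A's.

    In the slope version of the chart these are the panes where the
    line from A to B rises instead of falls.
    """
    return [
        year
        for year, figures in sorted(sales_by_year.items())
        if figures["B"] > figures["A"]
    ]


print(years_b_outsold_a(sales))
```

With this made-up data, three years have a rising line, which is exactly the pattern the slopes let you spot at a glance.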

With slightly different data, you can achieve the same results using a categorical slope chart. I’m doing this as part of my analytics based around the UK General Election (http://impartialityuk.tumblr.com/).

SNP and Lib Dem Mentions

A new role: evangelist

The Ancient Mariner
I pass, like night, from land to land;
I have strange power of speech;
That moment that his face I see,
I know the man that must hear me:
To him my tale I teach.
– The Rime of the Ancient Mariner, Samuel Taylor Coleridge

Today’s my birthday and Tableau gave me a nice present: a new role. I’m now Technical Evangelist for the company. I’m humbled, thrilled and delighted with this.

In many ways it’s what I’ve been doing, unofficially, since first buying Tableau 7 years ago, starting this blog in 2010 (here’s my first post) and then joining Tableau in 2011.

I’ve kinda always felt a little like the Ancient Mariner. He went through a crazy experience and then felt driven to share it with everyone he sees. That’s how I feel about visual analysis and Tableau.

(Another reason for the quote? The Rime of the Ancient Mariner is my favourite poem. I even tried to learn it while riding a bike around New Zealand ten years ago. If you’ve not read it, I recommend you do. Then go read the hilarious version by Hunt Emerson. Finally, go listen to Iron Maiden’s epic tribute to the poem; go on, you know you want to!)

 

How is medal count affected by population at the Olympics?

USA and Russia dominate the Olympic Medal tables but is that simply because they are large countries? Are there countries who get more medals per million people in their country? Yes. Check out the dashboard below:

This dashboard is being used as part of my presentation about data visualisation in the media at Future Web Forum in Moscow on 28 November where I will be speaking with John Burn Murdoch.

Click here to see a bigger version of the dashboard.

NOTE: the population figures are for 2011/2012. When you filter and search for a different Olympic year, the dashboard still calculates medals per million using the most recent population data for that country.
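To make that caveat concrete, here is a rough sketch of the calculation in Python. The country, its population and its medal counts are all hypothetical, and the function name is mine; this is not the calculation as written in the Tableau workbook, just the same arithmetic.

```python
def medals_per_million(medals, population):
    """Medals won per million inhabitants."""
    if population <= 0:
        raise ValueError("population must be positive")
    return medals / (population / 1_000_000)


# Hypothetical country: one recent population figure (assumption),
# reused for every Olympic year, as the note above describes.
latest_population = 5_000_000
medals_by_games = {2004: 6, 2008: 10, 2012: 4}

rates = {
    year: medals_per_million(medals, latest_population)
    for year, medals in medals_by_games.items()
}

for year, rate in sorted(rates.items()):
    print(year, round(rate, 2))
```

Because the denominator never changes, the rate for older Games is only as good as the assumption that the population has not moved much since then.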

Are our defaults at fault?

Here are two dashboards. The bottom one uses all Tableau’s default formatting settings. The top one has at least 25 formatting changes or design decisions: these changes take a few hours to implement. Why bother? What’s wrong with Tableau’s default formatting?

My entry - click to see it bigger
At least 30 design choices were made to make this!  – click to see it bigger
100% default formatting
The same dashboard with zero formatting changes: it’s 100% Tableau default

The short answer is: nothing. The longer answer is more nuanced.

It’s Tableau Public’s Design Month so I wanted to do a series of related posts. In these posts, I’ll be focusing on the dashboard above (click here for bigger version). This was my entry into Tableau’s annual internal “VizWhiz” competition. In this round, we were given data about US road traffic accidents.

There are at least 25 design decisions I’ve made to produce that viz. That’s 25 changes I have made to the default Tableau formatting: some small, some large.

But first: why change the default formatting? What’s wrong with Tableau’s defaults?

Let’s look again at the default dashboard:

100% default formatting

Let me repeat: There is nothing inherently wrong with Tableau’s defaults.

My story can be understood from the default dashboard above. In fact, I am sure some people reading this will think the defaults are better than my design. If so, let me know in the comments below.

So why bother? Why would I spend hours tweaking what is already a perfectly-fine dashboard?

Chris Hoy (Image: BBC)

I take my inspiration from ex-British Cycling performance director Dave Brailsford. He led the British Cycling team to huge success through his “marginal gains” (click here to find out more). The principle is that you make all the small and large changes you can. Even the small changes, when aggregated, make a difference.

Each formatting tweak might only improve my viz a tiny bit.

The default formatting gives you a bronze-medal dashboard. There’s no shame in getting a bronze medal for something. If you’re in a business environment, producing dashboards for fast, iterative consumption, it is perfectly fine to leave the defaults as they are.

However, what if you want the Gold medal? In this case, you make all the changes you can. Even if they are small, the aggregate effect is significant. For that reason, I am happy to go the distance and grab every “marginal gain” that I can.

Over the next month, I’ll be describing most of the formatting and design decisions I made.

Humblebrag alert: I am pleased with my dashboard but am wide open to criticisms about it. I am sure that what I think are good decisions might well seem like bad ones to you.

 

Gifts for Data Geeks and a chance to support Movember

store image
Head along to http://www.zazzle.co.uk/acotgreave

Are you looking for a gift for the data geek in your life? Well, now’s your chance to get your hands on some pretty unique data viz gear. I’ve always printed my own t-shirts for Tableau conferences and I’m now making them all available via my Zazzle store.

In the UK: http://www.zazzle.co.uk/acotgreave

In the US: http://www.zazzle.com/acotgreave

For the month of November, ALL royalties I make will be donated to the “Tableau Mo Bros & Sistas” Movember team, raising money to support men’s health issues.

Mythbusters: Should you start your axes at zero?

I’ve written before about the problem of “rules” and “laws” in data visualization. A classic one is “Thou must start your axes at zero.” (If you’re reading my Brinton blog, go see what he had to say about it.)

In this post I want to dispel this myth. It’s a myth that’s close to my heart. In August 2009 I went on the record (by commenting on The Guardian’s Datablog) that I disliked one of their charts because their axes didn’t start at zero.

andy comment

Let’s take the data from that Guardian article in order to investigate this Rule. Here’s how it looks with zero included:

with zero

This is bad for at least 4 reasons:

  1. It doesn’t really expose the change of the record over time.
  2. It especially doesn’t highlight the impact Usain Bolt had on the record.
  3. It doesn’t make great use of the space – there’s lots of dead space.
  4. It’s boring.

What happens when you break the rule?

without zero

 

All the problems are removed. It’s engaging, Usain Bolt’s impact is clear and it makes great use of space.

There’s one final reason not to include zero in this case. I do not know what the ultimate fastest time a human being will run 100m, but I can guarantee it isn’t zero seconds. My point is that not all measures you are charting have real zeros. In this case, the “zero” might be 9 seconds or so.
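You can put a rough number on the "dead space" problem. The sketch below, in Python, works out how much of the axis the data actually occupies for a given axis range. The times are illustrative values in seconds, not the dataset from the Guardian article, and the function name and axis ranges are my own assumptions.

```python
def share_of_axis_used(values, axis_min, axis_max):
    """Fraction of the axis length that the data range actually spans."""
    if axis_max <= axis_min:
        raise ValueError("axis_max must be greater than axis_min")
    return (max(values) - min(values)) / (axis_max - axis_min)


# Illustrative 100m times in seconds (assumption, not the Guardian data).
times = [9.95, 9.86, 9.79, 9.74, 9.69, 9.58]

# Axis forced to start at zero.
with_zero = share_of_axis_used(times, axis_min=0.0, axis_max=10.0)

# Axis fitted around the data instead.
without_zero = share_of_axis_used(times, axis_min=9.5, axis_max=10.0)

print(f"with zero:    {with_zero:.1%} of the axis shows change")
print(f"without zero: {without_zero:.1%} of the axis shows change")
```

With these illustrative numbers, the zero-based axis squeezes all of the change into under 4% of the chart's height; fitting the axis to the data lets the same change fill roughly three quarters of it.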

Once you learn the guidelines, you’ll be able to fine tune your charts by bending or breaking them according to your use case and objective. Sticking to the rules means you will satisfy the 5 criteria Alberto Cairo defines for a successful chart (he discussed these at his 2014 Tapestry Keynote):

  1. Truthful
  2. Functional
  3. Beautiful
  4. Insightful
  5. Enlightening


Iraq’s Bloody Toll: control your message with title, colour and orientation

Consider these charts. They have completely different messages and yet the only difference is the title, the colour and the orientation of the bars.

Creating a totally different message with just a title, axis and colour change.

Watch the video below for an explanation, or go check out my slides from an older presentation, “Drive the message home with the right dashboard”. I also make this point in my Brinton talks, as he covered it too.

Manual data collection

My log of books/films/gigs

In the 21st century, the age of big data and the Internet of Things, it’s easy to get carried away logging everything you do in databases. I find there’s a charm and happiness in doing some data logging the old-fashioned way: on pen and paper.

What’s the log book above? I write down every book/film/gig/concert/play I read or see. I add a date and a score out of 5. I started the log in 2010. I was complaining to a friend, “Oh, I wish I’d kept a record of every book and gig I’d been to in my life.” It was probably the third time I’d had this conversation with her.

She replied, “Andy, quit bitching and just start one now.”

Good point. I did.

I love the physical object, and the easy nature of browsing back a few years. Sure, I could log this on Goodreads.com and equivalents, but it’s not the same. And part of me thinks it might be something I can share more easily with my kids one day. It’s also an anti-Tableau thing. I love Tableau but, you know what, sometimes I want my data to stay away from the screen. Unlike my music habits of course.

I wanted to share another one we received at work recently – this is data collected about Premier League players by someone when they were 8 years old:

IMG_0918

 

Do you collect data? Post a link in the comments below or on Twitter. Let’s share our manual data logs. Geeks of the world: Unite!

Why Twitter’s decision on Scraperwiki is bad for data democracy

Let’s say I need some basic Twitter data. What’s the difference between these two scenarios?

Scenario 1

scenario 1 developer

I pay a developer to write a script using Twitter’s free API and they push the data into a spreadsheet for me.

Scenario 2

scenario 2 scraper

I connect to Scraperwiki, who have automated the work of the developer, and use their tools to push tweets into a spreadsheet.

Scraperwiki is a great company. They have been able to provide Twitter data for the non-programmer. I’ve used their tools to do effective analysis that highlights the power of Twitter data. I am very disappointed that Twitter has forced them to turn off their Twitter tools. I appreciate Twitter is a business and has to make money, but this decision does not aid data democracy.

The decision favours developers and hinders data enthusiasts. Why? What makes developers a better class of person in Twitter’s eyes? Why can developers with their fancy-pants Python, PHP and Ruby skills get their hands on the data, but the rest of the world can’t?

Scraperwiki’s service, in my eyes, removed the middle man for those who don’t know how to code.

Apart from keeping developers in business, I don’t see the difference between the scenarios. Feel free to explain it to me in the comments.

Twitter, if you’re listening, why don’t you support data democracy?