Data visualisation and process

What was the process that led to this?*

What is your process when doing visual design? This was asked at David McCandless’ talk this week. It’s also something I and others get asked a lot.

David McCandless starts with a question and then finds the data to answer the question. Often he ends up finding many other questions to answer along the way!

In my job, I more often start with the data and then need to find the story.

Both are valid ways to start. Sometimes the approach is decided for you.

I wanted to go through my process, based on reflections on my US Fatalities dashboard. In this case, I’m explaining a situation where I have a ready-made dataset that I have not seen before. It’s my job to find and communicate anything interesting in it.

The dashboard is available for download if you want to dissect it and see lots of my workings and early iterations.

Get a sense of the data

How many records and for how long?

The first thing I always do is look at the number of records and what they look like over time:

fatalities over time

Why? So I can see how much data I have and what the trends are. Looks like things are trending downward and something odd happened in 2008/9. Economic crash, perhaps? Missing data?
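If you wanted to run the same first check outside Tableau, a minimal Python sketch might look like the one below. The dates are made up for illustration and are not from the fatalities dataset, and `records_per_year` is just a name I’ve picked.

```python
from collections import Counter
from datetime import date

# Hypothetical records: one date per fatality (not the real dataset).
records = [
    date(2008, 1, 1),
    date(2008, 7, 4),
    date(2009, 1, 1),
    date(2009, 7, 4),
    date(2009, 12, 25),
]


def records_per_year(dates):
    """Count how many records fall in each calendar year, sorted by year."""
    counts = Counter(d.year for d in dates)
    return dict(sorted(counts.items()))


per_year = records_per_year(records)
print(len(records), "records in total")
print(per_year)
```

A sudden dip or spike in one year’s count is the cue to ask whether it is a real event or missing data.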

I’ll delve into seasonality at this stage. If you’ve been reading my Design Month posts, you’ll know that I ultimately focussed on seasonality.

What dimensions do I have?

Second step is to break out the bar charts. I look for the ones that are interesting and have good data in them. “Light Conditions” was a pretty clean Dimension:

light conditions

I need to focus on a few things at this stage:

  • Is there a lot of Unknown/Null data? If so, it’s unlikely to be of interest
  • Do I need to go and find more data for context? Looks like most accidents happen during the day. But I would guess most driving is done during the day too, so it’s risky just reporting the raw numbers
  • Am I interested in the Dimension? If I’m not, I’m unlikely to care much about exploring it further
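The first check in the list above can be put into numbers. Here is a rough Python sketch, assuming the dimension is just a list of labels; the values and the `unknown_share` helper are hypothetical, not taken from the real data.

```python
def unknown_share(values, unknown_labels=("Unknown", "", None)):
    """Return the fraction of values that are unknown, blank or missing."""
    if not values:
        return 0.0
    unknown = sum(1 for v in values if v in unknown_labels)
    return unknown / len(values)


# Hypothetical "Light Conditions" values for eight records.
light_conditions = [
    "Daylight", "Dark", "Daylight", "Unknown",
    None, "Dusk", "Daylight", "Dark",
]

share = unknown_share(light_conditions)
print(f"{share:.0%} of records have no usable light condition")
```

If the share is high, the dimension probably isn’t worth pursuing. The second check, on exposure, needs outside data (how much driving happens in daylight, for example), so it can’t be answered from the dataset alone.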

Do dimensions and measures compare interestingly?

Next up, I’ll begin comparing dimensions. Is there a relationship between Light Conditions and Road Type and fatalities, for example? (answer: no)

show me

At this stage, I could not do without Show Me. If there’s one thing that puts Tableau above everything else, it’s this part of the process. Even by now, I’ve probably drawn 50 or so views, most of which I have looked at for less than 3 seconds. Each one gives me a sense of the data at virtually no cost of effort or time. For example, the chart below is a bit of a shocker but was a valuable part of my exploration. It existed only fleetingly.

Awful. But useful.

As I begin to really dive into the data to find the links, the ability to cycle through views and drop dimensions in new places is just amazing. It doesn’t matter if 99% of the views I make reveal nothing or are visually awful: I’m getting a sense of the data and I will eventually find the gems that form the story.

Geography?

Stories that answer “where” questions are engaging, so I’ll check out the quality of the geographic data. In this dataset, we had State available but it was pretty clear it just reflected population.

The wonderful XKCD (http://xkcd.com/1138/)

What level of detail is there?

I need to explore the level of detail. Is the story more interesting at aggregate level or down in the detail? In this dataset I didn’t find anything I wanted to proceed with. An example where the granular detail was of great interest was my Gooaaaal! viz during the World Cup.

Focussing on the story

By this stage I might have cycled through 250+ views of the data in maybe 30 minutes or so. I’ll now have a good sense of the data and will begin to find the story.

I kept some of the charts as you can see below.

find the story

Early on I had noticed that seasonality had a great story. The only dimension I’d found that really interested me was road type (rural/urban).

At this stage, I decided to go with seasonality.

Make a painting

From Queens University (link)

By now, I know what my data feels like and I have a sense of the story I want to tell: I wanted to focus on fatalities during the holiday seasons.

The next stage is like painting. I start again with a blank canvas and begin to add elements here and there. For a while there’s a mix of analysis, exploration, making calculations, tweaking designs, choosing fonts, colours, and changing layout. They all happen simultaneously, intuitively. These decisions are fully documented in my Design Month posts.

Finishing touches

Feedback

Finally I’ll be almost there. Now to open the door and share with others. I asked for feedback on dataviz communities, on Twitter, from my wife and kids, from colleagues. You cannot get enough feedback! You don’t need to listen to all of it, but if 80% of people tell you the same thing, you know that thing is a problem.

Walk away and come back

It’s important to walk away from your work for a day or so (as long as you can afford). This allows you to return to it with fresh eyes.

Know when to stop

Sometimes you will just know. Other times you could keep on tweaking forever. There’s often a deadline which will make the decision for you, but in the end you need to know when to stop.

Have a gin and tonic

Publish it and reward yourself.

My finished dashboard

What’s your process like? I would love to know how you do things.

* the image at the top is an infamous US diagram portraying the Afghan war. It was widely lambasted for being incomprehensible. I’ve always thought that was kind of the point, and they deliberately designed it like this to make the point.

Choosing the right colours for your visualizations

Colour in data visualisation: apparently easy but filled with pitfalls. There are volumes of posts about colour on the web. I’ve written about it before when discussing the Iraq’s Bloody Toll chart. And here’s a recent post about exploring and choosing potential palettes.

There were no accidents in the colour choices for this dashboard (click to see the interactive version)

For this post, one of a series supporting Tableau Design Month, I’ll explain the colour choices made in designing the dashboard above. There are three points I will highlight in this post:

  1. Simplify the colour scheme as much as you can
  2. Choose a colour that relates to your topic
  3. Soften the darker tones

Simplify the colour scheme

Let’s see what Tableau’s default colour scheme would have been:

100% default formatting

Tableau, or any visualisation tool, cannot know what the purpose of your visualisation is. Its default choices can therefore only be based on the chart being built. But in the above, the end result is overwhelming. There are colours everywhere.

It turns out that using just two colours, red and grey, you can tell the exact same story more clearly. You can even test your dashboard by trying it in greyscale: is the story still visible in the version below?

Get it right in black and white
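One way to sanity-check the greyscale test with numbers is to convert each colour to a single grey level and compare them. The sketch below uses the common Rec. 601 luma weights. The RGB values are hypothetical stand-ins, not the exact colours in my dashboard, and `to_grey` is a name made up for this example.

```python
def to_grey(rgb):
    """Approximate grey level (0-255) of an (R, G, B) colour, Rec. 601 weights."""
    r, g, b = rgb
    return 0.299 * r + 0.587 * g + 0.114 * b


highlight_red = (200, 30, 30)    # hypothetical highlight colour
context_grey = (190, 190, 190)   # hypothetical context colour

# A large gap between the two grey levels suggests the highlight still
# stands out once the colour is stripped away.
gap = abs(to_grey(highlight_red) - to_grey(context_grey))
print(round(to_grey(highlight_red)), round(to_grey(context_grey)), round(gap))
```

If two colours that are meant to contrast end up with nearly the same grey level, the story will disappear in black and white.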

Choose a colour that represents your topic

I chose red to evoke the emotional aspect of this dataset. Red is powerful and emphasises the reality of fatalities. What if I’d chosen a different colour? Blue, for example:

Going for neutrality

In this case the dashboard is much more neutral. It’s less provocative. It’s less opinionated.

Your colour choice should depend on your audience and your goal.

Soften the darker tones

You can choose palettes to emphasise just the parts of the data you want to. Tableau defaults to a perfectly serviceable green gradient palette.

My goal was to make the 3 most lethal holiday seasons (Jan 1, Jul 4, Dec 25) pop out. A simple red palette didn’t do it, so I tried red-black, but the black was too prominent. I settled on a red-white diverging palette as this really popped the days I wanted to focus on. All my choices can be seen below:

Iterating through different colour choices

I went a lot further in this dashboard to soften the dark tones. For example, all the fonts are softened from black to a lighter grey. As I write this post, I’m unsure now whether that was a successful choice. Check out the image below. Which do you think is more successful – the dark font or the light font?

Which do you prefer? The lighter tones on the left or the darker ones on the right?

Conclusion

Colour isn’t easy. In this post I’ve covered just 3 choices. You also need to consider cultural implications, colour-blindness, publication type, and much, much more. As always, I am very interested in your thoughts – let me know in the comments.

How do you communicate that people can interact with your designs?

If you publish something interactive to the web, how is your audience supposed to know it is interactive? And how do you instruct them what to do to interact?

what you write and what they read

When a user sees a dashboard for the first time, they need to learn how to read it and how to interact with it.

You can do this in many ways. Often I see people put the instructions somewhere on the viz or on an instructional tooltip. Here’s an example from a recent Viz of the Day:

"Click on a party" (click here to see the original)

That’s fine but there is one major problem: most people don’t read the text on your viz. They’ll probably read the title but not much else.

One way you can inform a user they can interact is through tooltips and that’s what I will cover here.

Interactivity can be divided into 3 types, all of which are available in Tableau. Something can be triggered when a user:

  • …hovers their mouse over something (how does this replicate on mobile? That’s a question for another post)
  • …clicks on a data point
  • …lassos and selects some marks of the interactive

In Tableau, these are defined as “Hover”, “Select” and “Menu”. If you’re new to actions, I recommend this post by Peter Gilks.

The menu action is always in hyperlink blue (click to see interactive version)

For my Design Month dashboard, I chose to go with a Menu action. I like the fact that when a user hovers their mouse over a mark, a nice customised tooltip with a call to action appears right where their eyes are looking.

I used a Menu action but a similar trick can be achieved with a Select action. Using a Select action gives you more control over the format of the Call to Action. I like this example from another recent Viz of the Day:

Select as Menu

This technique is not perfect:

  • There isn’t a Hover equivalent on a mobile interface.
  • What if the user DOESN’T move their mouse over the viz?

Which actions do you prefer in your dashboards? What else do you consider when instructing people about interactivity?

Less is more: improve chart clarity by removing borders and lines

Lines reduced as far as possible. Click to see and interact with the full dashboard.

When you design a chart, just how many borders and lines can you remove while maintaining clarity? Do you improve clarity by removal?

That’s what we’ll look at in this post. I’ll cover axis ranges and tick marks separately. In this post, I’m going to focus on what’s available from the formatting pane.

In the image above, you can see that my formatting approach is to reduce the lines as far as possible while retaining the meaning. Did I go too far? I think I got it about right.

Could I have gone any further? Sometimes I will hide the y-axis completely and just label the max value but I think that’s pushing it a little too far:

Removing the y-axis completely: a step too far?

Let’s look at how my end result compares with the defaults:

Default formats on the right, extreme reduction on the left.

There’s nothing wrong with the defaults – the gridlines and borders are very sensible choices for a default setting. I do think I have emphasised the data more by reducing the lines.

Here’s how you can reduce the borders and grid lines in Tableau:

Borders

Remove all of the outer borders by selecting Format…Borders from the menu and then turning off all dividers at the sheet level:

How to remove outer borders

Grid lines

I owe a hat-tip to Nelson Davis (@nelsondavis) for suggesting that it’s great to show only the horizontal grid lines in a view.

To achieve the effect, just go to the Format…Lines pane and set the Columns Grid Lines to None:

This setting leaves horizontal grid lines only

Conclusion

I like the end result: it’s very crisp. One bonus is that because there’s no border at the bottom, it’s less likely someone will think the y-axes start at zero.

You can see and download the full Fatalities dashboard here.

My entry – click to see it bigger

 

Are there terms for these two measure types?

Here’s a new-to-me problem: a dataset has two measures, and they’re both different types, but is there a term for the two types?

Here’s my data:

types of measures

“Sales” can be shown broken down by year or as a total because a sale is only counted in one year.

“Staff” cannot be totalled across all years because some of the staff are being counted in all years. I don’t have 135 staff, I have 60.

Is there a term for these two measures as they appear in a dataset? I’m thinking that “Staff” is cumulative, maybe? Sales are discrete? But that doesn’t sound right…
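To make the difference concrete, here is a small Python sketch. The figures and staff names are invented for illustration and are not the numbers in my table: sales can simply be added across years, but adding the yearly staff counts counts the same people more than once.

```python
# Hypothetical data: sales per year, and the set of staff employed each year.
sales_by_year = {2012: 100, 2013: 120, 2014: 90}
staff_by_year = {
    2012: {"ann", "bob"},
    2013: {"ann", "bob", "cat"},
    2014: {"bob", "cat"},
}

# Sales: each sale belongs to exactly one year, so a plain sum is correct.
total_sales = sum(sales_by_year.values())

# Staff: summing the yearly headcounts double-counts people who stayed.
naive_staff_total = sum(len(people) for people in staff_by_year.values())

# The true headcount is the number of distinct people across all years.
distinct_staff = len(set().union(*staff_by_year.values()))

print(total_sales, naive_staff_total, distinct_staff)
```

The catch is that the distinct count needs the underlying people, which a table of yearly totals on its own doesn’t give you.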

Help!

Layout: the hardest and most important thing to get right

What should your dashboard look like?

Should it be horizontal or vertical? How do you divide up the pieces?

There’s no single right answer. It depends. It depends on your intuition and it definitely depends on getting feedback from people before you publish it.

Let me say that again: GET FEEDBACK FROM PEOPLE BEFORE PUBLISHING ANYTHING.

The best post on this was written by Steve Wexler back in 2011: “Hey, your Tableau Viz is Ugly *and* Confusing”.

How did I choose the layout for my Fatalities dashboard, the focus of my Design Month posts?

Horizontal. The “best” layout?

The answer was I tried everything else and went with what I felt was best.

A vertical layout would have worked very well for a blog post:

Vertical gives much more space in a blog

How did I choose? I asked lots of people for their feedback. Some favoured horizontal, some favoured vertical. Their feedback was greatly appreciated. In the end I chose horizontal because left-to-right felt like a more comfortable way to read the story. Horizontal also allows you to compare across charts more easily.

You can make an okay decision about this on your own but it’s not until you share your work and get feedback that you can make an informed decision.

Having made the decision to go horizontal there was one more thing I needed to add – a vertical line between each chart. You can see them below.

Note the vertical lines

The lines allow each view to stand alone. Without the lines, the focus of the dashboard was more blurred. I created these lines separately and imported them as images.

Which do you prefer? Horizontal or vertical?

Should you add a border to marks on your visualizations?

Non-default borders. Click to see the interactive version.

The highlight table above has white borders around each mark. Why did I make this change to the defaults?

Tableau’s default mark border is None. In this post I will explain why I often find myself changing the defaults. Here’s what the chart looks like with the default border setting:

The highlight table with no border

As I mentioned in my first post about these charts, there’s nothing inherently wrong with having no border. It largely comes down to personal intuition. I think that changing the border to white brings the marks into more focus.

What do you think? Is my highlight table improved by using white borders?

I use this technique at other times with Tableau. I wrote about this when sharing my lollipop chart idea back in March 2011. I sometimes feel our default sizing, with gaps between the bars, creates something of a moire effect. This can be solved by upping the mark size to maximum and adding a white border to the marks:

Which is easiest on the eye?

Should you add borders to your marks at all times? No. Should you add them sometimes? Yes. How should you decide? The good news is that it is so easy to do this in Tableau you can easily try it with or without borders and go with how you feel.

Borders don’t apply just to marks. Should you add them to your legends too? I will leave that question and discussion to Jeffrey Shaffer and his excellent post on borders on legends.

All you need to do is change this section on the Mark Shelf:

default border is none

 

Try it out on your charts – do they look better with mark borders? Borders can potentially enhance all your chart types.

My entry – click to see it bigger

Designing the right tick marks for your date axes

Check out those date tick marks! Click to see the interactive version.

It’s no accident that the tick marks are like they are in the above chart. Check out how they’d have looked if I’d left them at their defaults:

1983? THREE? I work in FIVES.

I’d rather my tick marks were at round numbers (1975, 1980, etc) than at odd numbers like 1983. As I’ve said throughout this series of Design Month blogs, it’s not a big problem but it’s a nice one to solve.

Tick Origin

Let’s look first at the time series:

time series

 

Getting the tick marks to start at 1975 is achieved by using the “Tick Mark Origin” feature that appears when editing date axes. Set it to the value of the first major tick mark you want, as below:

tick origin on the time series
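To see why the origin matters, here is a rough Python sketch of how an origin and an interval together decide where major ticks fall. This is my own simplified model, not Tableau’s actual algorithm, and `tick_years` is a made-up helper.

```python
import math


def tick_years(axis_start, axis_end, origin, interval):
    """List the years on the axis that sit a whole number of intervals from the origin."""
    # Find the first multiple of the interval (counted from the origin)
    # that is not before the start of the axis.
    first_step = math.ceil((axis_start - origin) / interval)
    year = origin + first_step * interval
    ticks = []
    while year <= axis_end:
        ticks.append(year)
        year += interval
    return ticks


# Origin at 1975 with 5-year steps: ticks land on round numbers.
print(tick_years(1975, 2011, origin=1975, interval=5))

# Same steps but an origin of 1983: every tick is shifted onto "odd" years.
print(tick_years(1975, 2011, origin=1983, interval=5))
```

The interval is the same in both calls; only the origin changes which years get labelled.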

Fixing axes on a slope chart

My slope chart has a couple more tricks up its sleeve.

slope

You can find out more about making dynamic slope charts in a previous post. In this slope chart, I used continuous dates (you can use discrete or continuous depending on what your goal is).

I had to jump a couple of hurdles to show the tick marks correctly and leave space for the line labels at the right of each line.

First of all, I set the tick mark interval to 36 years in order to show only the years that represent the start and end of my slope (1975 and 2011).

slope fixed size

The second thing I did was fix the axis range so that there’s space on the right hand edge of the axis for the labels to fit correctly. I used trial-and-error to find the best fit (2022):

slope fixed range

Are these important? For this slope chart: absolutely. Here’s what it looks like if I reset the axis formats:

slope if it was default

 

 

Note the misleading tick marks on the x-axis and the misaligned line labels.
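The arithmetic behind those two settings is simple enough to sketch in Python. The variable names are mine, and this only mimics the effect of the settings rather than anything Tableau does internally.

```python
slope_start, slope_end = 1975, 2011   # first and last year of the slope
axis_end = 2022                       # fixed axis end, found by trial and error

# One interval spanning the whole slope means only two ticks can appear.
interval = slope_end - slope_start
ticks = list(range(slope_start, axis_end + 1, interval))

# Years of empty axis to the right of the last tick, where the labels sit.
label_room = axis_end - slope_end

print(interval, ticks, label_room)
```

The next tick after 2011 would fall well beyond the fixed end of the axis, so the extra space on the right stays free for the line labels.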

Conclusion

Slope charts are amazingly powerful and it only takes a few tweaks to the defaults to make them extremely effective in Tableau.

My entry – click to see it bigger

How to control your dates in your Tableau dashboards

You’ve spent hours crafting the perfect dashboard. It has some time-based data on it and you’ve used Tableau’s amazing date hierarchy to build it. Every pixel is perfect. You publish it, proud and delighted.

Beware! Don't click the button!
This button might cause pain. Click to see the full dashboard.

And then someone interacts. They see a plus button. They click it.

Boom! What happened? The view’s broken.

But I only clicked the button! I didn’t mean to break it.

The viewer leaves your view confused and somewhat disappointed.

Have no fear, this needn’t happen to you.

The plus button is designed so you can drill into and out of hierarchies (find out more here). It’s an amazing feature for exploration and works well on many dashboards.

Sometimes you don’t want this behaviour. In my highlight table above, for example, I don’t want my users to be able to drill into the date any further.

To get rid of the plus buttons when using dates, you can use custom dates. These are date fields focussed on just one level of a date hierarchy and cannot be expanded.

Right-click on your date field and choose Create Custom Date:

Create custom date with box
Create Custom Date

Then choose the date level you want. You can create discrete (“Date Part”) or continuous (“Date Value”) fields at any level from day to year.

Custom date box

You now have a new dimension available. Put the new date dimension onto your view instead of the original date and your view no longer has the plus symbol.

Default dates on the left. Custom date on the right.
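If the difference between a “Date Part” and a “Date Value” is new to you, this Python sketch shows the idea at the month level. It is an analogy using Python’s `datetime` module, not how Tableau implements custom dates, and the function names are made up.

```python
from datetime import date


def month_part(d):
    """Discrete flavour: just the month number (1-12); the year is dropped."""
    return d.month


def month_value(d):
    """Continuous flavour: the date truncated to the first day of its month."""
    return d.replace(day=1)


independence_day = date(2011, 7, 4)

print(month_part(independence_day))    # every July, in any year, shares this value
print(month_value(independence_day))   # this particular July, still a point in time
```

Either way the field is locked to a single level, which is why there is nothing left for the plus button to expand.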

Well done. You’ve just got control of your guided analytic dashboard back!

Comets are in the news. How about in dataviz?

The Rosetta mission is about to attempt to land on a comet. This is astonishing and exciting. Here are some incredible photos of the comet in the New York Times. In honour of this event, here’s a post about comet charts:

If only I’d gone vertical and not stayed with horizontal.

I saw this tweet today:

“Comet chart”? But… But…. But…. I came up with that idea in 2012. How dare they steal my idea.

What? You’ve not heard about my comet charts before?

That’s fair enough: they were a doomed experiment several years ago and only ever seen in a thread on our Tableau Community. Below, in its non-intuitive glory is my comet chart:

Horizontal fail

(Before I continue I’m happy to acknowledge other reasons you might decide the above dashboard doesn’t work)

Zen Armstrong’s version succeeds where I failed. Her up-down orientation fits in with one’s mental image of growth/decline and gravity. If only I’d thought about trying that. In order to make my chart more readable, all I needed to do was orient the marks differently:

The marks are more readable

I focused too much on horizontal orientation in order to ensure the labels were readable. Once I’d made that design choice, I was stuck with it and didn’t see the simple change I could have made. Orientation was even something I talked about in the thread where I posted this content.

What’s the lesson? Don’t get stuck in your viz too much. Be ready to keep trying changes. Get feedback and keep experimenting.

Congrats and thanks to Zen Armstrong for coming up with her approach.