MakeoverMonday: MakeoverMonday!

Cohort Analysis
Download the v10 workbook here.

We’re at halfway. Over 200 of you have posted more than 1,000 makeovers over 25 weeks. This is simply amazing.

Where else could you find a growing source of charts and data to play with?

Where else could you find so many different ways of telling data stories?

When Andy and I started this, we thought it’d just be us goofing around. But it’s you, the community, who have made this something more special than we could have imagined.

Thank you!

This week we’re making over MakeoverMonday data. I wanted to do some cohort analysis (check out a great post on using LOD calcs for this here). It turns out that those of you who’ve done the first 5 weeks of Makeovers contribute about 50% of all makeovers each week. Below is a percent-of-total view of the chart above:

Cohort Analysis (%)
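For anyone who wants to try the same percent-of-total idea outside Tableau, here is a minimal Python sketch. It is not the LOD calculation used in the workbook. It assumes the data is a list of (author, week) pairs, and it assumes "the cohort" means authors who posted in every one of weeks 1 to 5; both the function name and the sample rows are hypothetical.

```python
from collections import defaultdict


def cohort_share_by_week(submissions, cohort_weeks=(1, 2, 3, 4, 5)):
    """Return {week: fraction of that week's makeovers posted by the cohort}.

    submissions: iterable of (author, week) pairs (hypothetical layout).
    The cohort is assumed to be authors who posted in ALL of cohort_weeks.
    """
    submissions = list(submissions)

    # Collect the set of weeks each author took part in.
    weeks_by_author = defaultdict(set)
    for author, week in submissions:
        weeks_by_author[author].add(week)

    required = set(cohort_weeks)
    cohort = {a for a, weeks in weeks_by_author.items() if required <= weeks}

    # Count all makeovers and cohort makeovers per week.
    totals = defaultdict(int)
    cohort_counts = defaultdict(int)
    for author, week in submissions:
        totals[week] += 1
        if author in cohort:
            cohort_counts[week] += 1

    return {week: cohort_counts[week] / totals[week] for week in sorted(totals)}


# Made-up sample: "ann" posts every week, "bob" joins in week 6.
sample = [("ann", w) for w in range(1, 7)] + [("bob", 6)]
shares = cohort_share_by_week(sample)
```

With this sample, the cohort's share is 100% for weeks 1 to 5 and drops to 50% in week 6, when a newcomer joins.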

Keep it up, gang! I’m loving seeing all the incredible ideas you come up with each week.

quick snap

 

 

MakeoverMonday: Am I safe in Japan?

I’m very excited to be in Tokyo this week. I’ll be presenting the Tableau 10.0 roadshow and at a bunch of partner and customer events, too. I’m very thankful for this opportunity!

The makeover

This week, Andy K found a chart showing reported thefts in Japan. The chart we’re focusing on shows the number of reported thefts in Japan in 2012. I thought I’d make it over to make it personal. Personal to me. Could I use this to prove the reputation Japan has of being a safe country to visit?

Will I be safe in Japan dash
Download the workbook here. (Tableau v10)

I didn’t start off wanting to ask that question. My original starting point was to draw a straightforward treemap. I haven’t seen too many of these in MakeoverMonday so I thought they deserved some attention.

Treemap: only ok

The treemap was ok, but it didn’t amaze me, and I didn’t feel that I’d really hit on anything interesting for MakeoverMonday. All I’d done was take sectors of a circle and make them sub-rectangles of a bigger one.

As I interacted with the data, though, I realized I could look for patterns relevant to me. This week is my first visit to Japan. As the original article describes, Japan has a reputation for being safe. “Well then,” I thought, “which of these crimes could I fall prey to and are they common?”

That led to my makeover and a personal story to prove the relative safety of Japan, using available data. Given there are so few crimes, it’s fair to say that this data supports the reputation Japan has of being a safe place to visit.

Iterations and alternatives

Do I love this treemap? Not especially: in fact, with this dataset, I think a pie makes it easier to see the part-to-whole relationship than a treemap. Look at the figure below. In which chart is it easier to see that Vehicle Theft accounts for about 30% of reported theft? [Note: the previous statement comes with the normal caveats about pies and their problems. I know the problems with pies, you don’t have to tell me them; I’m just describing my thought process as I explored the data and built different views.]

Pie or Treemap?

If I was going for efficiency in the makeover, I’d probably have chosen a stacked bar or even a normal bar chart. These allow for the easiest lookup of data. Here they are below.

stacked bar bar

The original chart

Here’s the original chart and my thoughts on it:

What I liked

  1. Everything is labelled, so I can look up any value I want
  2. There’s a total in the middle so I can see the proportions and relate them to the entire number of reported thefts
  3. The labels are aligned, making them easier to look up than otherwise

What I didn’t like

  1. It’s a sunburst chart. There’s a certain pleasure in looking around and following shapes from the centre outwards, but it’s so slow and inefficient. A normal bar chart gets the job done quicker
  2. The inner label shows the actual total number of reported thefts, but the outer numbers show percentages. That’s not made clear.
  3. The outer level of the sunburst appears to be randomly sorted. It could have been in descending or alphabetical order.

 

MakeoverMonday: Women in the workplace

This week MakeoverMonday is LIVE at Tableau Conference on Tour. Check out the hashtags #makeovermonday and #data16 on Monday to follow things live.

Women in the workplace

For my makeover this week, I wanted to simplify the message. The differences between 2012 and 2015 weren’t that great. There are more women at each level, but the trends themselves haven’t changed. I decided to remove 2012 from my data to focus more clearly on the Pipeline story.

I liked the quote in the first paragraph of the original so lifted that for the title.

In our makeover about women in legislature, I extended the y-axis to 100% to emphasise the distance to parity with men. In this case, I decided to end the y-axis at 50%. To make it clear that the top of the chart is 50% I made the reference line stand out, and put the title beneath it. Did that succeed? Did you see the reference line?

The original

Well, it’s a pipeline. Of sorts.

The original chart wasn’t a great one this week.

What I liked:

  • There’s a table, so I can look up the numbers
  • The colour scheme is very easy to distinguish
  • They attempted to use a visual metaphor for a pipe

What could have been improved:

  • The mix of line chart and pipeline renders the chart pretty meaningless: it’s not possible to see what’s actually being shown in the chart
  • The designers appear to have drawn a straight line in the chart, but the data doesn’t quite drop the way it’s shown.

MakeoverMonday: Facebook’s Energy Footprint

MM
My makeover. Click here to download the workbook (requires Tableau v10)

A first for MakeoverMonday: I ended up pretty much remaking the original chart, with only small tweaks. Once I’d locked onto the story I wanted to tell, I couldn’t escape the fact that Facebook’s original version of this chart was pretty much just right. In fact, it’s possible that I’ve complicated the message with my version.

How did I get to my version?

The original.

I thought Facebook’s whole report was fascinating. I learnt a lot from this graphical report.

As I explored the data, and cross-referenced the report, it all seemed to home in on the amount of renewable energy being used. The power usage itself is interesting, but the ambition for Facebook is to get the CaRE up to 50% by 2018 (and ultimately 100%).

Too sparse

I tried to draw a slope chart first, but it looked too sparse. Also, it hid what I thought was some important information – the volatility of CaRE:

Clean and renewable is highly volatile

I wanted to pursue this volatility because it’s hard to say there’s a long-term trend upwards for CaRE when there was such a big trough in 2013. Unfortunately, I couldn’t find that information in the report.

Without the information on the volatility, I figured I’d accept Facebook’s word and focus on Facebook’s hitting the 25% CaRE by 2015 goal. As I drew different versions, it seemed that only a line chart or an area chart with CaRE along the baseline made the point. The other energy types are secondary information: as long as CaRE is going up, I don’t really care too much what’s happening to coal and nuclear.

What do I like about the original?
  1. Annotations on the marks explain the data
  2. Headline on the left summarises the point being made
  3. Forecast line has a different format
What don’t I like about the original?
  1. The x-axis year labels aren’t horizontal, and they don’t align very well to the marks themselves
  2. The y-axis % scale only goes up to 50%. On the one hand, this is fine, because it fits the range of the data. On the other hand, 35% renewable means that 65% of the energy is still not renewable.
My changes

MM

  • I used an area chart, with faded colours for all but CaRE to add context to the main story about CaRE usage
  • This choice also forced the y-axis to go from 0-100%. Now you can see that while goals are being hit, companies with huge data centres still have a long way to go.
  • I added a reference line for the 2015 goal. This helps imply that the goal is continuous. The goal doesn’t stop in 2015.

Joseph Priestley’s 1765 Big Data Case Study

Go check this out

Reading articles on the process of data visualization is enlightening. At the London Tableau User Group this week, for example, Andy Kirk explained the process behind his recent work showcasing Liverpool FC’s roller-coaster season. It was entertaining and educational.

You might think this kind of thing is new, only happening in the world of meetups and blogs. Wrong!

One of the earliest examples I know of is Joseph Priestley’s commentaries on his Charts of Biography and History.

In those books, he describes the same challenges and opportunities you would meet today. It amazes me to read this and realise that the challenges we face today are the same as Priestley faced 250 years ago.

Ground zero of visualizing time

Here he is talking through his ideas of representing time in a chart:

We have no distinct idea of length of time until we have conceived it in the form of some sensible thing that has length as of a line.

What’s most fascinating is that this pamphlet IS THE START of people thinking about how to visualize time. Priestley came up with the idea of a Gantt bar, with length, to represent time. I cannot stress the importance of this enough. Today we don’t think twice about using lines and bars. But in 1765, they were inventing these ideas.

Collecting and cleaning data: a pain back then too

I shall not mention the pains it has cost me to reconcile and adjust the different values I have met concerning great numbers of them.

He describes the challenges of finding and cleaning data, and the wondrous opportunities of discovering new insights as he went along.

Laborious and tedious as the compilation of this work has been… a variety of views were continually opening upon me during the execution of it

That one is my favourite.

Priestley quote

Aggressive dataviz police were a problem then too

He also tries to address potential criticism head on. Dealing with the dogmatic dataviz police is still a problem in 2016, which is a shame:

No human work of such a nature as this can be expected to be faultless. I hope no candid person will think … that they are either so numerous or so great as considerably to lessen the use of the whole

It’s hard to know when to stop

He also describes the difficulty of knowing when to stop. I’ve written about this when talking about the process of data visualization. It can be so much fun tweaking and pruning a data visualization that sometimes you just have to stop.

The many times I have altered my lists convinces me that I should never revise them without seeing some reason to make farther alterations. The many times that I have replaced the same names after having rejected them convinces me that farther alterations would have been of very little consequence

He provides his data

No data project is truly authentic if you can’t access the underlying data. Fortunately, Priestley provided the entire dataset for the Chart of Biography. If you want to remake the chart or challenge his assumptions, you can!

the data

Joseph Priestley  – a legend

Joseph Priestley was an amazing polymath. Not only was he great with data, he also discovered oxygen, caused riots, catalogued electricity and was friends with US presidents (I recommend this book about him). He paved the way for data visualization.

MakeoverMonday: Horizontal History

Chart of Biography
Interactive, downloadable version here

Today’s makeover sees me completing an ambition of 5 years: remake Joseph Priestley’s Chart of Biography in Tableau. Finally, all of my 5 Most Influential Vizzes have been remade in Tableau.

Here’s the source chart for this week’s Makeover:

See the others at Why Ask Why

It’s a horizontal history from the excellent site Why Ask Why. It’s a cool data experiment and exploration. The article inspired Yura Bagdanov to do a horizontal version. Of course, when I read the article, I saw only the Chart of Biography, what I think is the most influential chart of all time:

Go read more on Wikipedia.

Priestley’s chart was the first to condense time onto an x-axis which fit on a single page (Dubourg did something similar earlier, but his chart was 54ft long!). It’s also the first Gantt chart. And it directly influenced William Playfair as he created his statistical line charts. Boom! Check the blog tomorrow as my next post is all about Priestley’s own analysis of his chart.

This dataset doesn’t have the same names as Priestley’s, but it’s the same type of data: thousands of famous people with details of their life and death.

Making the chart: the 1765 version

Priestley created a Gantt chart. Could I do the same in Tableau? Well yes, but all my early efforts didn’t really work out:

Chart of Biography true to original
Too much detail in the Gantt

The problem was there are just too many names! I used jittering to randomly distribute the names in each pane but I wasn’t happy with the output. Time to rethink the view with a 2016 perspective.
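To illustrate what jittering means here, the sketch below gives each person a random vertical position so their lifespan bars do not all sit on the same row. It is a generic Python illustration, not the Tableau calculation used in the workbook; the function name, the field names and the single sample row are assumptions.

```python
import random


def jitter_rows(people, seed=0):
    """Give each person a Gantt bar (start, length) and a random y position.

    people: list of (name, birth_year, death_year) tuples (hypothetical layout).
    A fixed seed keeps the layout stable between runs.
    """
    rng = random.Random(seed)
    rows = []
    for name, born, died in people:
        rows.append({
            "name": name,
            "start": born,           # left edge of the bar
            "length": died - born,   # bar length in years
            "y": rng.random(),       # jitter: uniform in [0, 1)
        })
    return rows


# Illustrative input only: one lifespan, Joseph Priestley (1733 to 1804).
bars = jitter_rows([("Joseph Priestley", 1733, 1804)])
```

Random placement spreads the bars out, but with thousands of names they will still overlap, which matches the "too much detail" problem described above.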

Making the chart: A New Chart of History

I did have a go at recreating A New Chart of History too, but it only highlighted the inaccuracy of the data. 50% of famous people born in 1950-2000 came from North America? Not sure about that.

New Chart of History

Making the chart: the 2016 version

Interactivity gives us lots of options!

Tooltips and highlighting

Priestley labelled every single bar. Can you imagine how hard and tedious that must have been? All I did was make a nice tooltip!

tooltip

You can also see in the above image that the country is highlighted. Another advantage of modern interactive tools!

Filters

Just the women in this view

I also get the advantage of filtering. Above is the view with only the females in the dataset shown.

The view above also highlights the problem with the Pantheon project: it’s incomplete. Only 6 famous female explorers? 5 business women?

I am very grateful to the London Viz Club for getting the data from the Pantheon project – that was a cool little Alteryx task. I’ve waited 5 years for a dataset like this: today is a happy day!

MakeoverMonday: Militarization of the Middle East

Oh gang, I apologise but this is a super-brief Makeover this week. Work and life have multiple demands this week. Given the time squeeze, I gave myself 15 minutes to make a viz, with the rest of the time spent on this blog post.

15 minutes? What can you possibly focus on?

With only 15 minutes, I’m clearly not going to go deep into the data (even though it looks like it has some amazing detail).

0-5 minutes

Instead, I looked for one point being made in the article.

This caught my eye

The flag section caught my eye. Firstly because it was the hardest part to interpret. So many flags and arrows and words and icons. What does it all mean? Once I deciphered the meaning, it seemed pretty interesting: 3 Middle East nations are importing way more than previously.

That was interesting: is it called out in the article? It sure is: they claim that these three countries are arming themselves in response to unrest in Syria. I find this horrible and fascinating: military hardware companies will be rubbing their hands with glee at the conflicts around the world.

Could I remake this story using the data?

5-11 minutes: building the viz

I’d spent a few minutes digesting the story, so needed to see what the dataset revealed. If you only have a matter of minutes, and want to look at how a measure changes over time, go for a line chart. There’s no time for mucking around with different views.

I filtered down to those three countries, and there you go: Qatar, Saudi Arabia and UAE all going up.

11-15 minutes: formatting

How do you make a simple makeover look really fancy? Choose an unusual font and background colour. Instant respect from us all! Well, there’s no time for that today. All I could manage was a quick switch to Smooth theme and writing a nice title. For simple charts like this, I’ll always try to ask a question, giving the viewer the information they need to query the chart itself.

15 minutes and I’m done.

How does a 15-minute makeover feel?

The main feeling is of fraud. I didn’t do much detailed checking of other countries. I don’t feel like I have really tested the hypothesis that “Middle Eastern Countries are spending more because of Syria.” I feel like I’ve just accepted the point made by the journalist and made a chart which, kind of, supports that opinion.

I feel like I’ve used the data to support the story, rather than use the data to find the story.

 

 

MakeoverMonday: Global Warming is Spiralling Out of Control

[Note 1: Yesterday I had a classic MakeoverMonday experience. I wrote this post, and was ready to hit publish. I then realised I just needed to tweak one of my images. I went back to Tableau, had a brainwave, and ended up with a completely new idea. I NEVER would have come up with that idea had I not been able to drag drop and experiment so readily. I chose to keep this post for Tuesday]

[Note 2: The CORRECT baseline is 1961-1990. The charts in this post have incorrect titles. Download the workbook to see correct versions]

The world is indeed getting hotter. The data comes from the UK Met Office’s HadCRUT4 data: a global, gridded dataset of surface temperature anomalies.

The science behind the dataset is complex, but the data’s straightforward: the measure is going up over time. How should you best show an upward trend?

This week, three ideas came to mind before I explored the data. I implemented each one.

1. Straight line (with politics)

politics
Which presidential candidate do you trust on climate change?

You can’t beat a trend line. It’s visually the most straightforward and effective for displaying an upward trend. I chose to emphasise the moving average (red) with the actual anomalies in grey in the background.
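As a rough illustration of the smoothing behind a line like this, here is a minimal Python sketch of a trailing moving average. The window length used in the workbook isn't stated, so the 12-month default is an assumption, as is the function name.

```python
def moving_average(values, window=12):
    """Trailing moving average over `window` points.

    Positions that do not yet have a full window return None,
    so the output lines up one-to-one with the input.
    """
    averaged = []
    running_total = 0.0
    for i, value in enumerate(values):
        running_total += value
        if i >= window:
            # Drop the value that has just left the window.
            running_total -= values[i - window]
        averaged.append(running_total / window if i >= window - 1 else None)
    return averaged


# Tiny made-up series with a 2-point window.
smoothed = moving_average([1, 2, 3, 4], window=2)
```

Plotting the smoothed series in a strong colour over the raw monthly values in grey gives the same emphasis described above.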

The rising line chart conveniently leaves white space into which you can insert objects to further make your point. In this case I found two representative tweets from the likely US presidential candidates.

2. Bloomberg-inspired animation

global temps
This is a GIF – click the image to see the animation if it doesn’t begin

Bloomberg did an amazing visualisation with this data last year. Here was my excuse to recreate it. I think this is an especially good way to show the data because the animation brings drama to numbers. As the hottest year creeps ever upwards you have a sense of dread. “Wow, 1995 was hot. It can’t get hotter, can it? Oh. It did, 1998. Ouch. And again. And again. Yikes.”

This week’s chart was essentially exactly the same idea, spiralised. Personally, I think the radial display makes it much harder to see the extremes creeping ever higher.

3. The highlight table

Click image to see a bigger, hi-res version

I love a highlight table. This one lets you look up each month, should you wish to, but shows, right at the top, just how often records are being broken. It was fact-checking the rank calculation which led me to the idea of histograms for my actual MakeoverMonday, published yesterday.

I also quite like tall and thin, but in this case, I think there’s just too much detail. We’re really making the point that the most recent months are super-hot. The highlight table takes a lot of vertical space to make that point.
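The rank calculation mentioned above can be sanity-checked outside Tableau. The sketch below is a plain Python stand-in, not the table calculation from the workbook: it ranks values from hottest (1) to coolest and, as an assumption, breaks ties by original order so every value gets a unique rank.

```python
def rank_descending(values):
    """Return the rank of each value, where 1 is the largest.

    Ties are broken by position (earlier values rank first),
    which is an assumption made for this sketch.
    """
    # Indices sorted by value, largest first; Python's sort is stable.
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    ranks = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        ranks[index] = rank
    return ranks


# Made-up anomalies for three months.
ranks = rank_descending([0.2, 0.9, 0.5])
```

Here the middle value is the hottest, so it gets rank 1, and the first value is the coolest, so it gets rank 3.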

top 10 labels
Detail from the top of the chart: the most important stuff.

Click here to see the final Makeover post.

MakeoverMonday: Global temperature is spiralling out of control

Click the image to see a larger version. 10.0 only this week – click here to download a copy (when it asks you to locate the extract, point it to this one)

[Today I had a classic MakeoverMonday experience. I wrote my original post, and was ready to hit publish. I then realised I just needed to check my calculations were correct. I went back to Tableau. While checking the data, I came upon a completely new idea. I NEVER would have come up with that idea had I not been able to drag, drop and experiment so readily. I will publish the original post tomorrow.]

The world is getting hotter. This week’s data comes from the UK Met Office’s HadCRUT4 data: a global, gridded dataset of surface temperature anomalies. Note: the baseline for HadCRUT4 is 1961-1990, not 1850-1900 as stated in the original article. See the Met Office page for more details.

The science behind the dataset is complex, but the data’s straightforward: the measure is going up over time. How should you best show an upward trend? I had three ideas, which I implemented, and will publish tomorrow.

A final check of the data, though, led me to the idea of a histogram.

Are histograms good charts?

I really like my chart this week. It shows just how much the 21st century has been above average in an unusual way. The challenge with histograms though is that they aren’t as immediately understandable as a line chart. You’ll see in tomorrow’s posts that I was initially riffing on line charts. If you’re sharing your findings with people who don’t usually see many charts, or have much time, you might want to show a simpler chart. Or you could trust that your audience is in fact intelligent and go with this design.

Iterations

I built a histogram initially just to check whether one of my calculations was correct. I immediately realised it was an interesting way of showing the data. But which chart shape and at what level of granularity?

histogram bar with month detail orange

My first version showed every month as a separate mark (because that’s what I was trying to validate). However, it’s just too much detail and nobody really wants to know the specific value for a particular month in the 1990s. It’s the trend that’s important.

histogram area orange

I tried an area chart too. I like this as it shows the waves of the different time periods. However, it’s just one level of complexity too far. A histogram’s challenging enough without colouring it by groups and using area instead of bars.

histogram bars1 orange

Bars it was! My final step was to tell the story. I turned to colour here. My story is about the years since 2000, so I changed the palette to emphasise those years. Red for the recent years, greys for everything else:

colours
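The binning behind a histogram like this can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the workbook's calculation: the 0.1-degree bin width, the function name and the sample readings are all made up, and only the split at the year 2000 comes from the story above.

```python
import math
from collections import Counter


def histogram_by_period(records, bin_width=0.1, split_year=2000):
    """Count months per anomaly bin, split into two colour groups.

    records: iterable of (year, anomaly) pairs (hypothetical layout).
    Returns a Counter keyed by (bin_index, period); the bin covers
    anomalies from bin_index * bin_width up to the next bin.
    """
    counts = Counter()
    for year, anomaly in records:
        # round() guards against float noise such as 0.3 / 0.1 = 2.999...
        bin_index = math.floor(round(anomaly / bin_width, 9))
        period = "2000 onwards" if year >= split_year else "before 2000"
        counts[(bin_index, period)] += 1
    return counts


# Made-up readings, not real HadCRUT4 values.
sample = [(1995, 0.32), (1998, 0.38), (2015, 0.87), (1900, -0.25)]
bins = histogram_by_period(sample)
```

Colouring the "2000 onwards" counts red and the rest grey reproduces the emphasis described above: the recent months pile up in the highest bins.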

Unstacked area?

Finally, does an unstacked area work best of all? I think it might…

Helping the user understand a histogram

Here’s my biggest challenge with histograms: how do you help a reader understand it in as short a time as possible?

Custom labelling to aid the user
  • I created custom axis labelling as shown above
  • I annotated one of the marks
  • I used colour in the title to further explain what each mark showed

Did that work? How easy was it for you to interpret the chart?

The original chart

With 12k retweets at time of writing, people clearly love spirally climate data!

What I like
  • If you watch the animation, it clearly expands outwards
  • The colours pop out (although they seem arbitrary)
  • Spirals fit into a small space, like a tweet
What I would improve

Spirals?

This is a straightforward timeline and the radial nature simply does not show the growth over time. Bloomberg did a much more exciting animated version. A simple trendline shows growth better, too, in my opinion. Growth in a spiral is only visible by a vague awareness of an expanding circumference. Spikes in months or years are lost in the noise and confusion of the spiral.

But… Twelve Thousand Retweets? For all the problems of spirals, people engage with them. Is it better to get people thinking about the data, or be a chart purist? Bloomberg, when it tweeted about its story with a map, got only 192 retweets. From an account with THREE MILLION FOLLOWERS.

Conclusion? Spirals aren’t the “best” way to show the data, but they make people look at it.

MakeoverMonday: American women work way more than their European counterparts. [Really?]

TL;DR – just look at this chart. More details below.

[This week you have multiple ways to see my Makeover. It’s available here as a Tableau Story. Or read below for the story rendered as a post. Notes on the original chart are at the end, too]

Go see this as a Story in Tableau

The Makeover

Business Insider make a bold claim in their headline

2016-05-09_10-21-05
Click here to see the article

Actually, of the 21 EU countries in the dataset, women work more than the Americans in 9 of them.

2016-05-09_11-20-22

The Netherlands is an interesting outlier

2016-05-09_11-08-01

What is it about the Dutch?

Click here to see the article

Check out these great articles from The Economist and Slate on why Dutch women don’t work as much as women in other nations.

Finally, should we trust Labour Force Statistics which involve gender?

more or less2

Check out this week’s fantastic episode of More or Less for more information.

The Analysis

You can download my workbook here. It’s using v10 of Tableau. (in beta at time of writing)

As you’ve gathered, this week my makeover was inspired by questioning the original chart. The chart itself is ok as far as stacked bar charts go. I question the boldness of the claim, though.

First of all, lots of the EU countries have higher levels of work than the US.

Secondly, as More or Less discussed this week, there are many reasons why gender data in employment statistics might be incorrect. Or, if not incorrect, the surveys are biased against female employees. For example, surveys often ask about “primary” employment. This ignores second jobs, which more women have than men. Uganda changed its surveys and female employment numbers went up by hundreds of thousands!

What did I like about the original?

  • A stacked bar is pretty clear.
  • I can easily find the categories on the legend and compare countries

What didn’t I like?

  • Everything’s got the same intensity. I’d have softened the borders, axis lines and labels, so that the data is clearer
  • It’s a good job I know my country abbreviations. GER, ITA? Some people might not know what they mean. They may think OECD is a country of its own.
  • There are too many tick marks on the axis. I don’t need all that information.
  • The title, “Female hours worked relatively low” doesn’t make much sense. I don’t mind using abbreviated language in titles, but this one seems to have gone too far.