We’re at halfway. Over 200 of you have posted more than 1,000 makeovers over 25 weeks. This is simply amazing.
Where else could you find a growing source of charts and data to play with?
Where else could you find so many different ways of telling data stories?
When Andy and I started this, we thought it’d just be us goofing around. But it’s you, the community, who have made this something more special than we could have imagined.
This week we’re making over MakeoverMonday data. I wanted to do some cohort analysis (check out a great post on using LOD calcs for this here). It turns out that those of you who’ve done the first 5 weeks of Makeovers contribute about 50% of all makeovers each week. Below is a percent of total view of the chart above:
Keep it up, gang! I’m loving seeing all the incredible ideas you come up with each week.
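If you want to try the cohort calculation outside Tableau, here is a minimal Python sketch of the idea. It is not the LOD calc from the post: the function name, the `(author, week)` input shape and the sample data are all made up for illustration, and I am assuming the cohort means anyone who posted in at least one of the first five weeks.

```python
from collections import defaultdict


def cohort_share_by_week(submissions, cohort_weeks=range(1, 6)):
    """Return {week: share of that week's makeovers posted by the cohort}.

    submissions: iterable of (author, week) pairs, one per makeover.
    The cohort is every author who posted in any of cohort_weeks
    (an assumed definition; adjust it if you mean "all five weeks").
    """
    submissions = list(submissions)
    cohort_weeks = set(cohort_weeks)
    cohort = {author for author, week in submissions if week in cohort_weeks}

    totals = defaultdict(int)
    from_cohort = defaultdict(int)
    for author, week in submissions:
        totals[week] += 1
        if author in cohort:
            from_cohort[week] += 1

    return {week: from_cohort[week] / totals[week] for week in sorted(totals)}


# Made-up example data, not the real MakeoverMonday submissions.
sample = [
    ("ana", 1), ("ben", 3), ("ana", 10), ("cat", 10),
    ("ben", 11), ("dan", 11), ("eve", 11), ("ana", 11),
]
shares = cohort_share_by_week(sample)
```

In this toy data, the early joiners account for half of the makeovers in weeks 10 and 11, which is the same kind of percent of total view described above.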
I’m very excited to be in Tokyo this week. I’ll be presenting the Tableau 10.0 roadshow and at a bunch of partner and customer events, too. I’m very thankful for this opportunity!
This week, Andy K found a chart showing reported thefts in Japan. The chart we’re focusing on shows the number of reported thefts in Japan in 2012. I thought I’d make it over to make it personal. Personal to me. Could I use this to prove the reputation Japan has of being a safe country to visit?
I didn’t start off wanting to ask that question. My original starting point was to draw a straightforward treemap. I haven’t seen too many of these in MakeoverMonday so I thought they deserved some attention.
The treemap was ok, but it didn’t amaze me, and I didn’t feel that I’d really hit on anything interesting for MakeoverMonday. All I’d done was take sectors of a circle and make them sub-rectangles of a bigger one.
As I interacted with the data, though, I realized I could look for patterns relevant to me. This week is my first visit to Japan. As the original article describes, Japan has a reputation for being safe. “Well then,” I thought, “which of these crimes could I fall prey to and are they common?”
That led to my makeover and a personal story to prove the relative safety of Japan, using available data. Given there are so few crimes, it’s fair to say that this data supports the reputation Japan has of being a safe place to visit.
Iterations and alternatives
Do I love this treemap? Not especially: in fact, with this dataset, I think a pie makes it easier to see the part-to-whole relationship than a treemap. Look at the figure below. In which chart is it easier to see that Vehicle Theft accounts for about 30% of reported theft? [Note: the previous statement comes with the normal caveats about pies and their problems. I know the problems with pies, you don’t have to tell me about them; I’m just describing my thought process as I explored the data and built different views.]
If I was going for efficiency in the makeover, I’d probably have chosen a stacked bar or even a normal bar chart. These allow for the easiest lookup of data. Here they are below.
The original chart
Here’s the original chart and my thoughts on it:
What I liked
Everything is labelled, so I can look up any value I want
There’s a total in the middle so I can see the proportions and relate them to the entire number of reported thefts
The labels are aligned, making them easier to look up than otherwise
What I didn’t like
It’s a sunburst chart. There’s a certain pleasure in looking around and following shapes from the centre outwards, but it’s so slow and inefficient. A normal bar chart gets the job done quicker
The inner label shows the actual total number of reported thefts, but the outer numbers show percentages. That’s not made clear.
The outer level of the sunburst appears to be randomly sorted. It could have been in descending or alphabetical order.
This week MakeoverMonday is LIVE at Tableau Conference on Tour. Check out the hashtags #makeovermonday and #data16 during Monday to follow things live.
For my makeover this week, I wanted to simplify the message. The differences between 2012 and 2015 weren’t that great. There are more women at each level, but the trends themselves haven’t changed. I decided to remove 2012 from my data to focus more clearly on the Pipeline story.
I liked the quote in the first paragraph of the original so lifted that for the title.
In our makeover about women in legislature, I extended the y-axis to 100% to emphasise the distance to parity with men. In this case, I decided to end the y-axis at 50%. To make it clear that the top of the chart is 50% I made the reference line stand out, and put the title beneath it. Did that succeed? Did you see the reference line?
The original chart wasn’t a great one this week.
What I liked:
There’s a table, so I can look up the numbers
The colour scheme is very easy to distinguish
They attempted to use a visual metaphor for a pipe
What could have been improved:
The mix of line chart and pipeline renders the chart pretty meaningless: it’s not possible to see what’s actually being shown in the chart
The designers appear to have drawn a straight line in the chart, but the data doesn’t quite drop the way it’s shown.
A first for MakeoverMonday: I ended up pretty much remaking the original chart, with only small tweaks. Once I’d locked onto the story I wanted to tell, I couldn’t escape the fact that Facebook’s original version of this chart was pretty much just right. In fact, it’s possible that I’ve complicated the message with my version.
How did I get to my version?
I thought Facebook’s whole report was fascinating. I learnt a lot from this graphical report.
As I explored the data, and cross-referenced the report, it all seemed to home in on the amount of renewable energy being used. The power usage itself is interesting, but the ambition for Facebook is to get the CaRE up to 50% by 2018 (and ultimately 100%).
I tried to draw a slope chart first, but it looked too sparse. Also, it hid what I thought was some important information – the volatility of CaRE:
I wanted to pursue this volatility because it’s hard to say there’s a long-term trend upwards for CaRE when there was such a big trough in 2013. Unfortunately, I couldn’t find that information in the report.
Without the information on the volatility, I figured I’d accept Facebook’s word and focus on Facebook’s hitting the 25% CaRE by 2015 goal. As I drew different versions, it seemed that only a line chart or an area chart with CaRE along the baseline made the point. The other energy types are secondary information: as long as CaRE is going up, I don’t really care too much what’s happening to coal and nuclear.
What do I like about the original?
Annotations on the marks explain the data
Headline on the left summarises the point being made
Forecast line has a different format
What don’t I like about the original?
The x-axis year labels aren’t horizontal, and they don’t align very well to the marks themselves
The y-axis % scale only goes up to 50%. On the one hand, this is fine, because it fits the range of the data. On the other hand, 35% means that 65% of the energy is still not renewable.
I used an area chart, with faded colours for all but CaRE to add context to the main story about CaRE usage
This choice also forced the y-axis to go from 0-100%. Now you can see that while goals are being hit, companies with huge data centres still have a long way to go.
I added a reference line for the 2015 goal. This helps imply that the goal is continuous. The goal doesn’t stop in 2015.
You might think this kind of thing is new, only happening in the world of meetups and blogs. Wrong!
One of the earliest examples I know of is Joseph Priestley’s commentaries on his Charts of Biography and History.
In those books, he describes the same challenges and opportunities that you would describe today. It amazes me to read this and realise that the challenges we face today are the same as Priestley faced 250 years ago.
Ground zero of visualizing time
Here he is talking through his ideas of representing time in a chart:
We have no distinct idea of length of time until we have conceived it in the form of some sensible thing that has length as of a line.
What’s most fascinating is that this pamphlet IS THE START of people thinking about how to visualize time. Priestley came up with the idea of a Gantt bar, with length, to represent time. I cannot stress the importance of this enough. Today we don’t think twice about using lines and bars. But in 1765, they were inventing these ideas.
Collecting and cleaning data: a pain back then too
I shall not mention the pains it has cost me to reconcile and adjust the different values I have met concerning great numbers of them.
He describes the challenges of finding and cleaning data, and the wondrous opportunities of discovering new insights as he went along.
Laborious and tedious as the compilation of this work has been… a variety of views were continually opening upon me during the execution of it
The many times I have altered my lists convinces me that I should never revise them without seeing some reason to make farther alterations. The many times that I have replaced the same names after having rejected them convinces me that farther alterations would have been of very little consequence
He provides his data
No data project is truly authentic if you can’t access the underlying data. Fortunately, Priestley provided the entire dataset for the Chart of Biography. If you want to remake the chart or challenge his assumptions, you can!
Joseph Priestley – a legend
Joseph Priestley was an amazing polymath. Not only was he great with data, he also discovered oxygen, caused riots, catalogued electricity and was friends with US presidents (I recommend this book about him). He paved the way for data visualization.
Today’s makeover sees me completing an ambition of 5 years: remake Joseph Priestley’s Chart of Biography in Tableau. Finally, all of my 5 Most Influential Vizzes have been remade in Tableau.
Here’s the source chart for this week’s Makeover:
It’s a horizontal history from the excellent site Why Ask Why. It’s a cool data experiment and exploration. The article inspired Yura Bagdanov to do a horizontal version. Of course, when I read the article, I saw only the Chart of Biography, what I think is the most influential chart of all time:
Oh gang, I apologise but this is a super-brief Makeover this week. Work and life have multiple demands this week. Given the time squeeze, I gave myself 15 minutes to make a viz, with the rest of the time spent on this blog post.
15 minutes? What can you possibly focus on?
With only 15 minutes, I’m clearly not going to go deep into the data (even though it looks like it has some amazing detail).
Instead, I looked for one point being made in the article.
The flag section caught my eye. Firstly because it was the hardest part to interpret. So many flags and arrows and words and icons. What does it all mean? Once I deciphered the meaning, it seemed pretty interesting: 3 Middle East nations are importing way more than previously.
That was interesting: is it called out in the article? It sure is: they claim that these three countries are arming themselves in response to unrest in Syria. I find this horrible and fascinating: military hardware companies will be rubbing their hands with glee at the conflicts around the world.
Could I remake this story using the data?
5-11 minutes: building the viz
I’d spent a few minutes digesting the story, so needed to see what the dataset revealed. If you only have a matter of minutes, and want to look at how a measure changes over time, go for a line chart. There’s no time for mucking around with different views.
I filtered out the countries, and there you go: Qatar, Saudi Arabia and UAE all going up.
11-15 minutes: formatting
How do you make a simple makeover look really fancy? Choose an unusual font and background colour. Instant respect from us all! Well, there’s no time for that today. All I could manage was a quick switch to Smooth theme and writing a nice title. For simple charts like this, I’ll always try to ask a question, giving the viewer the information they need to query the chart itself.
15 minutes and I’m done.
How does a 15 minute makeover feel?
The main feeling is of fraud. I didn’t do much detailed checking of other countries. I don’t feel like I have really tested the hypothesis that “Middle Eastern Countries are spending more because of Syria.” I feel like I’ve just accepted the point made by the journalist and made a chart which, kind of, supports that opinion.
I feel like I’ve used the data to support the story, rather than use the data to find the story.
[Note 1: Yesterday I had a classic MakeoverMonday experience. I wrote this post, and was ready to hit publish. I then realised I just needed to tweak one of my images. I went back to Tableau, had a brainwave, and ended up with a completely new idea. I NEVER would have come up with that idea had I not been able to drag drop and experiment so readily. I chose to keep this post for Tuesday]
The world is getting hotter indeed. The data comes from the UK MetOffice’s HADCrut4 data: a global, gridded dataset of surface anomaly temperatures.
The science behind the dataset is complex, but the data’s straightforward: the measure is going up over time. How should you best show an upward trend?
This week, three ideas came to mind before I explored the data. I implemented each one.
1. Straight line (with politics)
You can’t beat a trend line. It’s visually the most straightforward and effective for displaying an upward trend. I chose to emphasise the moving average (red) with the actual anomalies in grey in the background.
The rising line chart conveniently leaves white space into which you can insert objects to further make your point. In this case I found two representative tweets from the likely US presidential candidates.
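For anyone curious about the red line, a moving average is easy to sketch in Python. This is only an illustration: I built mine in Tableau, the 12-month default window here is an assumption rather than the setting I used, and the numbers are invented, not real HADCrut4 values.

```python
def trailing_moving_average(values, window=12):
    """Mean of the last `window` values at each position.

    Positions without a full window yield None, so the smoothed line
    starts only once enough months are available.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    averages = []
    running_sum = 0.0
    for i, value in enumerate(values):
        running_sum += value
        if i >= window:
            # Drop the value that has just fallen out of the window.
            running_sum -= values[i - window]
        averages.append(running_sum / window if i >= window - 1 else None)
    return averages


# Illustrative anomalies in degrees C, not real data.
anomalies = [0.1, 0.3, 0.2, 0.6, 0.4, 0.8]
smoothed = trailing_moving_average(anomalies, window=3)
```

Plot `smoothed` in a strong colour over the raw `anomalies` in grey and you get the same emphasis: the noise stays in the background and the trend carries the story.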
2. Bloomberg-inspired animation
Bloomberg did an amazing visualisation with this data last year. Here was my excuse to recreate it. I think this is an especially good way to show the data because the animation brings drama to numbers. As the hottest year creeps ever upwards you have a sense of dread. “Wow, 1995 was hot. It can’t get hotter, can it? Oh. It did, 1998. Ouch. And again. And again. Yikes.”
This week’s chart was essentially exactly the same idea, spiralised. Personally, I think the radial display makes it much harder to see the extremes creeping ever higher.
3. The highlight table
I love a highlight table. This one lets you look up each month, should you wish to, but shows, right at the top, just how often records are being broken. It was fact-checking the rank calculation which led me to the idea of histograms for my actual MakeoverMonday, published yesterday.
I also quite like tall and thin, but in this case, I think there’s just too much detail. We’re really making the point that the most recent months are super-hot. The highlight table takes a lot of vertical space to make that point.
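One way to sanity-check the broken records outside Tableau is to flag every month that beats all the months before it. The sketch below uses a running maximum rather than the rank calculation I used in the workbook, and both the function name and the sample values are hypothetical.

```python
def record_breaking_flags(anomalies):
    """Flag each value that is strictly higher than every earlier value.

    The first value is always flagged, because nothing precedes it.
    """
    flags = []
    highest_so_far = float("-inf")
    for value in anomalies:
        is_record = value > highest_so_far
        flags.append(is_record)
        if is_record:
            highest_so_far = value
    return flags


# Invented monthly anomalies in time order, not real data.
monthly = [0.2, 0.1, 0.3, 0.3, 0.5, 0.4]
flags = record_breaking_flags(monthly)
```

A tie does not count as a new record here (the second 0.3 is not flagged). That is a design choice; switch `>` to `>=` if you would rather count ties.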
[Today I had a classic MakeoverMonday experience. I wrote my original post, and was ready to hit publish. I then realised I just needed to check my calculations were correct. I went back to Tableau. While checking the data, I came upon a completely new idea. I NEVER would have come up with that idea had I not been able to drag, drop and experiment so readily. I will publish the original post tomorrow.]
The world is getting hotter. This week’s data comes from the UK MetOffice’s HADCrut4 data: a global, gridded dataset of surface anomaly temperatures. Note: the baseline for HADCrut4 is 1961-1990, not 1850-1900 as stated in the original article. See the MetOffice page for more details.
The science behind the dataset is complex, but the data’s straightforward: the measure is going up over time. How should you best show an upward trend? I had three ideas, which I implemented, and will publish tomorrow.
A final check of the data, though, led me to the idea of a histogram.
Are histograms good charts?
I really like my chart this week. It shows just how much the 21st century has been above average in an unusual way. The challenge with histograms though is that they aren’t as immediately understandable as a line chart. You’ll see in tomorrow’s posts that I was initially riffing on line charts. If you’re sharing your findings with people who don’t usually see many charts, or have much time, you might want to show a simpler chart. Or you could trust that your audience is in fact intelligent and go with this design.
I built a histogram initially just to check whether one of my calculations was correct. I immediately realised it was an interesting way of showing the data. But which chart shape and at what level of granularity?
My first version showed every month as a separate mark (because that’s what I was trying to validate). However, it’s just too much detail and nobody really wants to know the specific value for a particular month in the 1990s. It’s the trend that’s important.
I tried an area chart too. I like this as it shows the waves of the different time periods. However, it’s just one level of complexity too far. A histogram’s challenging enough without colouring it by groups and using area instead of bars.
Bars it was! My final step was to tell the story. I turned to colour here. My story is about the years since 2000, so I changed the palette to emphasise those years. Red for the recent years, greys for everything else:
Finally, does an unstacked area work best of all? I think it might…
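If you want to rebuild the histogram logic away from Tableau, here is a rough Python sketch of the binning and the recent-versus-earlier split that drives the colour. The 0.1 degree bin width, the yearly granularity and the sample values are assumptions for illustration; only the split at 2000 comes from the story I wanted to tell.

```python
import math
from collections import Counter


def histogram_by_period(yearly_anomalies, bin_width=0.1, split_year=2000):
    """Count years per anomaly bin, split into 'recent' and 'earlier'.

    yearly_anomalies: iterable of (year, anomaly) pairs.
    Returns {bin_lower_edge: Counter({'recent': n, 'earlier': m})},
    ordered from the coldest bin to the hottest.
    """
    bins = {}
    for year, anomaly in yearly_anomalies:
        # Round before flooring so values that sit on a bin edge,
        # such as 0.3 with a width of 0.1, are not pushed down by float error.
        index = math.floor(round(anomaly / bin_width, 9))
        lower_edge = round(index * bin_width, 9)
        period = "recent" if year >= split_year else "earlier"
        bins.setdefault(lower_edge, Counter())[period] += 1
    return dict(sorted(bins.items()))


# Invented values, not real data.
sample_years = [(1950, -0.25), (1980, 0.05), (1999, 0.3), (2005, 0.32), (2015, 0.76)]
hist = histogram_by_period(sample_years)
```

Each bin becomes one bar, and the two counts inside it become the red and grey segments. Widening `bin_width` is the quickest way to trade detail for readability.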
Helping the user understand a histogram
Here’s my biggest challenge with histograms: how do you help a reader understand it in as short a time as possible?
I created custom axis labeling as shown above
I annotated one of the marks
I used colour in the title to further explain what each mark showed
Did that work? How easy was it for you to interpret the chart?
With 12k retweets at time of writing, people clearly love spirally climate data!
What I like
If you watch the animation, it clearly expands outwards
The colours pop out (although they seem arbitrary)
Spirals fit into a small space, like a tweet
What I would improve
This is a straightforward timeline and the radial nature simply does not show the growth over time. Bloomberg did a much more exciting animated version. A simple trendline shows growth better, too, in my opinion. Growth in a spiral is only visible by a vague awareness of an expanding circumference. Spikes in months or years are lost in the noise and confusion of the spiral.
But…. Twelve Thousand Retweets? For all the problems of spirals, people engage with them. Is it better to get people thinking about the data, or be a chart purist? Bloomberg, when it tweeted about its story with a map, got only 192 retweets. From an account with THREE MILLION FOLLOWERS.
As you’ve gathered, this week my makeover was inspired by questioning the original chart. The chart itself is ok as far as stacked bar charts go. I question the boldness of the claim, though.
First of all, lots of the EU countries have higher levels of work than the US.
Secondly, as More or Less discussed this week, there are many reasons why gender data in employment statistics might be incorrect. Or, if not incorrect, the surveys are biased against female employees. For example, surveys often ask about “primary” employment. This ignores second jobs, which more women have than men. Uganda changed its surveys and female employment numbers went up by hundreds of thousands!
What did I like about the original?
A stacked bar is pretty clear.
I can easily find the categories on the legend and compare countries
What didn’t I like?
Everything’s got the same intensity. I’d have softened the borders, axis lines and labels, so that the data is more clear
It’s a good job I know my country abbreviations. GER, ITA? Some people might not know what they mean. They may think OECD is a country of its own.
There are too many tick marks on the axis. I don’t need all that information.
The title, “Female hours worked relatively low”, doesn’t make much sense. I don’t mind using abbreviated language in titles, but this one seems to have gone too far.