I tried Naomi Robbins’ Graph Makeover Contest recently. After trying a few such competitions, I find I learn more after seeing the other entries if I first make the effort of producing an entry myself. Robbins presented the following table from the Bureau of Labor Statistics report on prices and spending [pdf] and asked for a graphic version.
The first step was to get the data and understand it. Ignoring the percent change, which is computed, you can think of the data as one continuous variable, annual spending, by three categorical variables: year, housing tenure and expenditure. Expenditure is tricky though. It’s really expenditure and sub-expenditure. The food expenditure is completely broken down into sub-expenditures, but transportation and healthcare are only partially subdivided. And on double-checking the data, it seems that some expenditures are missing altogether, because the listed ones don’t add up to the stated total.
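To make that structure concrete, here is a minimal Python sketch of the table in "long" form, one row per year, housing tenure and expenditure, together with the kind of check that exposes missing expenditures. The amounts are made-up placeholders, not the BLS figures, and the function name is purely illustrative.

```python
from collections import defaultdict

# Each row: (year, housing tenure, expenditure, annual spending).
# Amounts are hypothetical placeholders, NOT the BLS figures.
rows = [
    (1986, "Renter", "Total", 1000),
    (1986, "Renter", "Housing", 400),
    (1986, "Renter", "Food", 200),
    (1986, "Renter", "Transportation", 150),
    (2010, "Renter", "Total", 1000),
    (2010, "Renter", "Housing", 450),
    (2010, "Renter", "Food", 180),
    (2010, "Renter", "Transportation", 120),
]


def unexplained(rows):
    """For each (year, tenure), return the stated total minus the sum of
    the listed top-level expenditures."""
    stated = {}
    listed = defaultdict(int)
    for year, tenure, expenditure, amount in rows:
        if expenditure == "Total":
            stated[(year, tenure)] = amount
        else:
            listed[(year, tenure)] += amount
    return {key: stated[key] - listed[key] for key in stated}


gaps = unexplained(rows)
```

A nonzero gap for a year and tenure means the listed expenditures don't account for the whole stated total.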
The second step is to decide what message the graph should communicate, and that leads into the third step of selecting an appropriate visualization for the data and the message. The table presentation, for example, shows all the data with good accuracy, but shows some messages better than others. It’s not too hard to see that the total expenditures stayed about the same from 1986 to 2010 or that health insurance has by far the biggest percent increase. However, it’s harder to rank expenditures or to see that renters spend relatively more on housing than homeowners.
If I were writing the report, I would probably include several graphs to make several points about the data. Here’s a collection I experimented with.
This area chart does a good job of showing that total expenditures stayed about the same and how the totals compare across housing tenure levels. Less effectively, it shows how each expenditure changed over time and how the expenditures rank (larger expenses are at the bottom).
Note that I added in an “Other” category for the missing expenditures and replaced the umbrella expenditures with their sub-expenditures so summing would work out. That meant, for instance, replacing “Transportation” with the new sub-expenditure “Transportation (non-fuel)”.
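As a rough sketch of that adjustment in Python: swap each umbrella expenditure for its sub-expenditures plus a remainder, then put whatever is still unaccounted for into “Other”. The function, its parameters, the amounts and the sub-expenditure names other than “Transportation (non-fuel)” are all hypothetical placeholders, not the actual BLS categories or figures.

```python
def make_summable(total, top_level, sub_items, remainder_labels):
    """Return a flat dict of parts that sums to `total`.

    top_level:        expenditure -> amount
    sub_items:        umbrella expenditure -> {sub-expenditure -> amount}
    remainder_labels: umbrella expenditure -> label for its unlisted part
    """
    parts = {}
    for name, amount in top_level.items():
        subs = sub_items.get(name)
        if not subs:
            parts[name] = amount  # no breakdown, keep as-is
            continue
        parts.update(subs)  # replace the umbrella with its sub-expenditures
        remainder = amount - sum(subs.values())
        if remainder > 0:  # only partially subdivided
            parts[remainder_labels.get(name, f"{name} (other)")] = remainder
    other = total - sum(parts.values())
    if other > 0:  # expenditures missing from the table
        parts["Other"] = other
    return parts


# Hypothetical placeholder amounts and sub-expenditure names.
parts = make_summable(
    total=1000,
    top_level={"Housing": 400, "Food": 200, "Transportation": 150},
    sub_items={
        "Food": {"Food at home": 120, "Food away from home": 80},
        "Transportation": {"Fuel": 50},
    },
    remainder_labels={"Transportation": "Transportation (non-fuel)"},
)
```

Because every amount now appears exactly once, the parts can be stacked in an area chart without double counting.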
The area chart muddles the changes of the middle items though. This slope chart works better for seeing how each expenditure changed.
Coloring is another challenge with this many categories, and I didn’t find a great solution. Maybe only the interesting lines should have color…
We’ve lost the total, of course, and though we can see absolute amounts and changes, we still don’t see the giant relative increase in health insurance. For that I tried a log scale.
A log scale may not be appropriate for a general audience, but it does show relative change well because equal vertical distances represent equal multipliers.
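A quick Python check of that property, using arbitrary illustrative numbers: a doubling covers the same vertical distance on a log axis whether it starts from 100 or from 1,000, while on a linear axis the second jump would be ten times taller.

```python
import math


def log_distance(start, end):
    """Vertical distance between two values on a base-10 log axis."""
    return math.log10(end) - math.log10(start)


small_doubling = log_distance(100, 200)    # linear distance: 100
large_doubling = log_distance(1000, 2000)  # linear distance: 1000

# Both are log10(2): the same multiplier gives the same log distance.
same_height = math.isclose(small_doubling, large_doubling)
```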
If you want to concentrate only on relative change, this “spoke” chart I made by accident makes the point even better about the health insurance cost change.
It took me a while to decipher it, but it’s sort of like all the lines from the previous chart centered on the same location.
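One way to reproduce that effect deliberately, and this is my guess at what the chart amounts to rather than a description of how it was actually made, is to divide each series by its starting value so every line begins at 1. A Python sketch with placeholder numbers:

```python
def rebase(series):
    """Divide every value by the first, so the line starts at 1.0."""
    first = series[0]
    return [value / first for value in series]


# Hypothetical (start year, end year) amounts for two expenditures.
modest_growth = rebase([200, 300])
large_growth = rebase([50, 150])
```

Plotted on a log scale, the rebased lines all leave from the same point, and the steepness of each one reflects only its relative change.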
My actual entry used the log-scaled slope lines, except I changed the vertical axis to relative values and added the total spending numbers at the bottom, which adds the message about stable spending and provides a basis for the percentages. Using relative spending amounts makes it easier to see that renters spend relatively more on housing, which was one point raised in the prose of the original report.
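Converting to relative values is just dividing each expenditure by the total for its group. A small Python sketch, with made-up amounts and a hypothetical helper name:

```python
def shares(parts):
    """Convert absolute amounts to fractions of their combined total."""
    total = sum(parts.values())
    return {name: amount / total for name, amount in parts.items()}


# Hypothetical placeholder amounts, NOT the BLS figures.
renter = shares({"Housing": 400, "Everything else": 600})
owner = shares({"Housing": 300, "Everything else": 700})
```

With shares, the two tenure groups can be compared directly even when their total spending differs.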
I’m still not happy with having so many colors. With more time, perhaps I could have found better colors, mixed in different line styles, tried putting the labels on the lines, combined similar categories, …