Archive for September, 2007

Income Share Graph

Monday, September 24th, 2007

After my last graph analysis, a reader asked that I review the following graph from a New York Times columnist’s blog post on income disparity.

Income Share from Krugman

Overall, I think this graph is good. Everything is labeled, making the message clear, but there are several minor problems with the details.

  • The rotated year labels are hard to read, especially since they’re not on even multiples of 5 or 10.
  • The data points and connecting lines are fighting for attention and saying the same thing. Either de-emphasize/remove the points or replace the connected lines with a smoother.
  • The grid lines are too bold — competing with the data marks.
  • The labels use inconsistent capitalization (and there appears to be a missing space in “classAmerica”). It’d be nice if all the labels were within the graph frame, too.
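
As a minimal sketch of the first point, tick positions can be snapped to even multiples of a chosen step so the labels fall on round years. The function name `even_ticks` and the year range below are assumptions made for illustration; they are not taken from the original graph.

```python
import math


def even_ticks(lo, hi, step):
    """Return tick positions at whole multiples of `step` inside [lo, hi]."""
    first = math.ceil(lo / step) * step   # first multiple at or above lo
    last = math.floor(hi / step) * step   # last multiple at or below hi
    return list(range(first, last + 1, step))


# Hypothetical year range, chosen only for illustration.
print(even_ticks(1913, 2006, 10))
```

With labels on round decades there are fewer of them, so they can usually be set horizontally instead of rotated.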

I don’t know enough about economics to comment on the currency of the graph’s message. One commenter suggested that the exclusion of capital gains diminishes the value of the data. I found the original paper [pdf], and the absence of capital gains seems to stem from the way the data was collected from income tax records, though the paper justifies it by calling capital gains “lumpy” and “volatile,” and so presumably independent of long-term trends.

Here are a couple of attempts I made at reproducing the graph to fix the minor problems. I used GraphClick (great product) to get the data. (The original paper’s data only goes through 1998.)

Income Share (BW)

My first graph leaves off the labels, uses fainter data points and gridlines, and adds a spline smoother to show trends. Labels can be added in a variety of ways to highlight sections of interest.
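
The post does not say which tool or spline settings produced the smoother, so the sketch below does not try to reproduce it. As a simpler stand-in, it uses a centered moving average, which is a different technique from a spline but illustrates the same idea of drawing a trend line over faint raw points. The function name and the sample values are assumptions.

```python
def moving_average(values, window=5):
    """Centered moving average; the window shrinks near the two ends."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)                 # clip the window at the start
        hi = min(len(values), i + half + 1)   # clip the window at the end
        chunk = values[lo:hi]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


# Made-up values standing in for a yearly series; not the graph's data.
print(moving_average([1, 2, 3, 4, 5], window=3))
```

A wider window gives a smoother line at the cost of flattening short-lived peaks, which is the same trade-off a spline's smoothing parameter controls.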

Income Share (annotated)

In my second graph I experiment with labeling ranges more specifically than with just a single arrow. With an arrow, it’s unclear whether it’s pointing to a single event or a section. The shading fixes that problem but adds other distractions that probably aren’t worth the price.

Petraeus Chart

Sunday, September 16th, 2007

I can’t decide if this final chart from General Petraeus’s report is the worst ever or the best ever.

Petraeus Recommended Force Reductions Chart

What makes it the worst chart? Mainly the axes.

  • unlabeled Y axis — you have to rely on the text of the report to know the Y axis is brigades.
  • no Y origin — it’s standard for bar charts to start at 0 since their values are encoded as lengths. Here, not only do the bars float, but the axis origin is not even labeled. Is 0 at the base of the bars or at the blue line or elsewhere?
  • nonlinear Y axis — the distance between 15 and 20 is noticeably smaller than the distance from 10 to 15 and from 5 to 10. The distance from the presumed 0 to 5 is also very different.
  • irregular Y axis labels — it’s unusual to have labels at 5, 7, and 10 instead of evenly spaced labels such as 5, 7.5, and 10, but at least the 7 is closer to 5 than 10.
  • star images around brigade ranges add useless clutter.
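
The nonlinear-axis complaint can be checked mechanically: measure where each labeled tick sits on the page and compare the distance per unit of value between neighboring ticks. The sketch below assumes hypothetical pixel positions; they were not measured from the chart, and the function names are made up for illustration.

```python
def scale_per_interval(ticks):
    """For (value, position) pairs, return position units per value unit
    between each pair of neighboring ticks."""
    ordered = sorted(ticks)
    if len(ordered) < 2:
        raise ValueError("need at least two ticks")
    return [(p2 - p1) / (v2 - v1)
            for (v1, p1), (v2, p2) in zip(ordered, ordered[1:])]


def is_linear(ticks, tolerance=0.05):
    """True if every interval's scale is within `tolerance` of the mean scale."""
    scales = scale_per_interval(ticks)
    mean_scale = sum(scales) / len(scales)
    return all(abs(s - mean_scale) <= tolerance * abs(mean_scale) for s in scales)


# Hypothetical pixel positions, not measurements from the actual chart.
squeezed = [(5, 100), (10, 200), (15, 300), (20, 360)]
print(scale_per_interval(squeezed))
print(is_linear(squeezed))
```

On a linear axis every interval has the same scale, so any interval that deviates by more than the tolerance flags a distorted axis.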

What makes it the best chart? The purpose of a visualization is to communicate a message, and this chart somehow communicates the paradoxical message, “we have a detailed plan, but we just don’t know the details.” The definite bars and dated call-outs show certainty while the fuzzy axes and question marks show uncertainty. So in a sense, the chart communicates its message perfectly.

However, there’s something unsettling about the duplicity of the chart. The normal way of showing uncertainty is with confidence intervals, showing the lower and upper limits with 95% confidence. That’s hard to do with stacked bar charts like these, so I would split the roles into two kinds, Overwatch and Combat (Partnering and Leading). These two summary roles could be plotted separately, each with confidence bands.
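
For readers who have not computed one, here is a minimal sketch of a 95% confidence interval for a mean, using a normal approximation. The report gives no underlying samples, so the numbers below are invented purely to show the arithmetic, and the function name is an assumption.

```python
from statistics import NormalDist, mean, stdev


def confidence_interval(samples, level=0.95):
    """Normal-approximation confidence interval for the mean of `samples`."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    z = NormalDist().inv_cdf(0.5 + level / 2)        # about 1.96 for 95%
    center = mean(samples)
    margin = z * stdev(samples) / len(samples) ** 0.5  # z times standard error
    return center - margin, center + margin


# Invented estimates for one role at one date; not from the report.
low, high = confidence_interval([4, 5, 6])
print(low, high)
```

Plotting `low` and `high` at each date as a shaded band around the central estimate gives the confidence bands described above. With so few samples, a t-distribution would give a wider and more honest interval than this normal approximation.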