Displaying variability

There are many ways to display data. Some place more emphasis on visualizing averages, trends, proportional changes, and so on. The purpose of including error bars in the Community Charts graphs is to provide a sense of uncertainty. The error bars in this application will only ever provide a view of one of several sources of uncertainty in future climate projections: specifically, variability around an estimator used to measure the average of a handful of climate models across a period of ten years.

Even when the range of the models informs the error bars, the bars underestimate uncertainty, which is another argument against bothering with the substantially narrower standard deviation bars. If you look back at the error bars in the original Community Charts example graph, they suggest a level of precision regarding the future that invites criticism (and as mentioned in a previous side note, they may be flat out wrong). An informed viewer will know to disregard such narrow bars (so why have them?), if for no other reason than an awareness that the bars cannot possibly be intended to portray a complete profile of uncertainty about the future. Most viewers will interpret them as suggesting the colored bars are highly precise predictions of the future.

Overlay error bars

We have already seen one method for visualizing variability, which is to overlay error bars on top of the primary bars of the bar plot. This is in keeping with what was done in the original version (questionable precision aside). The only difference considered so far was which statistic to use as a measure of dispersion: the standard deviation or the range. Others could be considered, such as the interquartile range (IQR), but as mentioned previously, there are so few data points contributing to the estimation of variability that some statistics just won’t work.

To recap, and for comparison with subsequent plots, here are two plots with error bars using CRU 3.2 data. The first shows a range bar and the second shows +/- one standard deviation from the five-model average.
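As a rough illustration of the two statistics being compared, here is a minimal Python sketch that computes both kinds of error bar limits for one month. The five values are hypothetical decadal means invented for this example, not taken from the Community Charts data, and the function names `range_bar` and `sd_bar` are made up for illustration.

```python
from statistics import mean, stdev

# Hypothetical decadal mean temperatures (deg C) from five climate models
# for a single month. Illustrative values only, not real model output.
model_means = [-10.2, -8.9, -9.5, -11.0, -9.4]


def range_bar(values):
    """Return (low, high) spanning the full range of the values."""
    return min(values), max(values)


def sd_bar(values):
    """Return (low, high) one sample standard deviation either side of the mean."""
    center = mean(values)
    spread = stdev(values)  # sample standard deviation (n - 1 denominator)
    return center - spread, center + spread


center = mean(model_means)
range_low, range_high = range_bar(model_means)
sd_low, sd_high = sd_bar(model_means)

print(f"five-model mean: {center:.2f}")
print(f"range bar:       {range_low:.2f} to {range_high:.2f}")
print(f"+/- 1 SD bar:    {sd_low:.2f} to {sd_high:.2f}")
```

With these particular values the standard deviation bar is narrower than the range bar, which is the pattern discussed above. With only five points neither statistic is estimated well.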

Floating bars

The focus now is on plotting the data differently, in a way that will enhance the view of variability, while at the same time benefiting from complete removal of the bar baseline problem (for temperature). Whereas measures of spread or dispersion around a mean cannot be inferred from looking at a mean value (this is why we must add explicit error bars), measures of center can be inferred relatively easily from a plot of dispersion without having to add layers to the plot. Here are plots of the same data corresponding to the two plots above.
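The idea of a floating bar can be sketched in a few lines of Python. This is only an illustration under assumed inputs: the monthly values are hypothetical, and `floating_bar` is a made-up helper, not part of the Community Charts code. Each bar is described by a bottom and a height taken from the spread of the models, so nothing is anchored to a zero baseline. Many plotting libraries accept such a pair directly (for example, the `bottom` and `height` arguments of matplotlib's `bar`).

```python
from statistics import mean, stdev


def floating_bar(values, kind="range"):
    """Return (bottom, height) for a bar that floats over the spread of the values."""
    if kind == "range":
        low, high = min(values), max(values)
    elif kind == "sd":
        center, spread = mean(values), stdev(values)
        low, high = center - spread, center + spread
    else:
        raise ValueError("kind must be 'range' or 'sd'")
    # The bar starts at the low end of the spread, not at zero.
    return low, high - low


# Hypothetical five-model decadal means (deg C) for two months.
monthly_models = {
    "Jan": [-10.2, -8.9, -9.5, -11.0, -9.4],
    "Jul": [14.1, 15.3, 13.8, 14.9, 15.0],
}

bars = {
    month: {kind: floating_bar(vals, kind) for kind in ("range", "sd")}
    for month, vals in monthly_models.items()
}

for month, by_kind in bars.items():
    for kind, (bottom, height) in by_kind.items():
        print(f"{month} {kind:>5}: bottom={bottom:.2f}, height={height:.2f}")
```

For the standard deviation version, the midpoint of the bar is the five-model mean, so the center can be read straight off the bar. For the range version, the midpoint is the midrange, which only approximates the mean.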

Referring back to the discussion of CRU 3.2 data and error bars, it is clear how the ability to have error bars for the historical baseline period assists in this Community Charts plot. There is something to see. Without some measure of dispersion around the historical baseline average, the twelve spaces for historical bars in this plot would be blank.

With PRISM data we do not have a measure of inter-annual variability or any other measure of uncertainty. To give the user something to look at for mere visual orientation - which is after all the main goal of including the historical baseline despite it being based on a different data source - we have to fake an error bar to at least provide a little rectangle of sorts to remind them where the historical mean is located. I suppose this can also remind us that PRISM is not perfectly estimated either, even though the little rectangle means nothing. It’s basically still a point value, but I want it to be clearly visible.
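One way to build such a placeholder, sketched in Python under assumptions: the helper name `placeholder_bar` and the half-height of 0.25 are arbitrary choices for illustration and carry no statistical meaning. The rectangle is simply centered on the historical mean so that it shows up in the same way as the floating bars.

```python
def placeholder_bar(historical_mean, half_height=0.25):
    """Return (bottom, height) of a small rectangle centered on a point value.

    half_height is purely cosmetic. It is NOT a measure of uncertainty.
    """
    if half_height <= 0:
        raise ValueError("half_height must be positive")
    return historical_mean - half_height, 2 * half_height


# Hypothetical PRISM historical baseline mean (deg C) for one month.
prism_mean = -9.8
bottom, height = placeholder_bar(prism_mean)
print(f"placeholder rectangle: bottom={bottom:.2f}, height={height:.2f}")
```

Because the height is the same for every month, a viewer comparing it with the model bars has at least a visual hint that it is a marker and not an estimate of spread.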