Don’t Be That Analyst – Properly Setting Y-axis Values

When designing data-driven graphics and dashboards, I encourage analysts to keep three guiding principles in mind:

  • Avoid distorting what the data has to say (pretty much the Hippocratic oath for analysts)

  • Communicate key aspects of data in a more intuitive way

  • Stimulate attention and engagement

Sometimes these principles work against each other. Do I need to create some “shock and awe” to get your attention? Am I accurately depicting the critical trend or insight?


An analyst must carefully think about what the data is saying and how best to present it without increasing the chances that it will be misinterpreted or ignored. For many charts, your primary weapons of choice are Y-axis values, object scaling, and colors (the latter two of which I will address in a separate post).


Y-axis Values – The Mother of Data Viz Distortion

Nothing screws around with what data has to say more than how you set the values of the Y-axis. (David Yanofsky at Quartz has a great piece on this topic, a 3.5-minute read with some nice visuals.)


The big decision you must make is whether to go with a zero-based or a truncated Y-axis. Purists insist that all charts have a zeroed Y-axis, which provides consistency but sometimes ignores reality (some values never reach zero – like your body temperature or the National Debt – trust me) or creates a calming effect when we should be freaking out (Wait, what? Your temperature is up 6 degrees?!).


Truncated axes are OK with line charts when small movements are important, zero values are not realistic, or you are trying to emphasize a key point. I recommend using consistent Y-axis values when comparing the same data points between groups.


Zero-based axes are required for all column and bar charts because the size of each rectangle, stretching down to zero, is what establishes the ratio between data points. If you need to zoom in on small differences, plot the data as a line chart instead and truncate the Y-axis if appropriate.
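
To put a number on that ratio argument, here is a minimal Python sketch. The function name `visual_ratio` and the two sample values are made up for illustration; they are not taken from any dataset. It compares how much taller one bar looks than another when the bars start at zero versus at a truncated baseline.

```python
def visual_ratio(a, b, baseline=0.0):
    """Ratio of the drawn bar heights for values a and b when bars start at `baseline`."""
    if a <= baseline or b <= baseline:
        raise ValueError("both values must sit above the baseline")
    return (a - baseline) / (b - baseline)


# Illustrative values only (hypothetical, not from any dataset).
low, high = 56.0, 58.0

true_ratio = visual_ratio(high, low)              # bars start at zero
truncated_ratio = visual_ratio(high, low, 55.0)   # bars start at 55

print(f"zero-based axis: taller bar is {true_ratio:.2f}x the shorter one")
print(f"axis cut at 55:  taller bar is {truncated_ratio:.2f}x the shorter one")
```

With these sample numbers, a difference of under 4 percent is drawn as a bar three times as tall once the axis starts at 55, which is why truncation and bars do not mix.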


Data Distortion in Action

Let’s take a controversial topic like climate change (what topic isn’t controversial these days?) to illustrate how the use of Y-axis values can distort what the data has to say. The National Review (a conservative magazine founded in 1955 by William F. Buckley, Jr.) tweeted the following chart based on data from NASA’s Global Surface Temperature Analysis to support its view that concerns about rising temperatures and climate change are unfounded.

Source: National Review (using data from NASA’s Global Surface Temperature Analysis)

What's Wrong with This Picture?


The National Review used a zero-based Y-axis (they actually went down to -10 degrees F), which essentially flattens the line chart. The analyst also deployed another Y-axis trick: a high maximum value (120 degrees F), which shows less volatility, less growth, and a less steep line than a lower maximum value would.


This might be good enough for Homer Simpson, but not for me. My feeling is that the National Review analyst was either incompetent or wanted to support a preconceived narrative with some visual evidence. Come on! You’re better than this.


When analyzing annual average global surface temperatures, I think the consensus (common sense?) is that (1) small movements are important and (2) zero values are not realistic. Nor is a maximum value of 120 degrees (FYI – the average high temp in August in Death Valley, CA is 115 degrees). A truncated Y-axis for this line chart would avoid distorting what the data has to say, aid with interpretation, and stimulate attention.


Here is the same data charted with a Y-axis ranging from 55 to 60 degrees F. (Charted using Atlas, a very cool free website that you should check out.) Now we see that the Earth’s average annual temperature between 1880 and 1938 was pretty much at or slightly below 57 degrees. From 1939 to 2014 it was mostly above 57 degrees, and it has been above 58 degrees since 2000. With a spread of 2.17 degrees between the highest (2014 = 58.52) and lowest (1909 = 56.35) values, a truncated Y-axis is much more effective at illustrating the changes.
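
One rough way to see why the two charts feel so different is to ask how much of the chart's height the data actually occupies. The Python sketch below uses only the numbers quoted in this post (the high and low values and the two axis ranges); the function name `share_of_axis` is my own, and this is back-of-the-envelope arithmetic rather than a formal measure of distortion.

```python
def share_of_axis(data_min, data_max, axis_min, axis_max):
    """Fraction of the chart's vertical extent covered by the data's own range."""
    if axis_max <= axis_min:
        raise ValueError("axis_max must be greater than axis_min")
    return (data_max - data_min) / (axis_max - axis_min)


# Values quoted in the post: 1909 = 56.35 F (lowest), 2014 = 58.52 F (highest).
LOWEST, HIGHEST = 56.35, 58.52

wide = share_of_axis(LOWEST, HIGHEST, -10, 120)   # the National Review axis
tight = share_of_axis(LOWEST, HIGHEST, 55, 60)    # the truncated axis

print(f"-10 to 120 F axis: data fills {wide:.1%} of the chart height")
print(f" 55 to  60 F axis: data fills {tight:.1%} of the chart height")
```

On the -10 to 120 axis the whole 2.17-degree story is squeezed into about 1.7 percent of the vertical space; on the 55 to 60 axis it gets about 43 percent.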


Think About What the Data is Saying


A good analyst will examine the data from different perspectives and figure out how best to convey the key insights. Here is a chart showing the average annual global temperatures between 1880 and 2014 compared to the 100-year average (1901-2000). It is based on the same data as the previous charts but displayed in a column chart with zero representing the 100-year average temperature for the Earth. I used colors to highlight years below (blue) and above (red) the average. This chart gets my attention, is very easy to interpret, and does not distort the data.
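
The transformation behind that chart is simple: subtract the baseline average from each year's value and color by sign. Here is a minimal Python sketch. The two yearly values are the ones quoted in this post, but the baseline of 57.0 is a placeholder I am assuming for illustration (the real 1901-2000 average would come from the NASA data), and the function name `anomalies` is hypothetical.

```python
def anomalies(temps_by_year, baseline):
    """Return (year, difference from baseline, bar color) for each year, in order."""
    rows = []
    for year in sorted(temps_by_year):
        delta = round(temps_by_year[year] - baseline, 2)
        # Above the baseline is red, below is blue. A year exactly on the
        # baseline has a zero-height bar, so its color does not matter.
        color = "red" if delta > 0 else "blue"
        rows.append((year, delta, color))
    return rows


# The two values quoted in the post (degrees F).
temps = {1909: 56.35, 2014: 58.52}

# ASSUMPTION: placeholder baseline, not the actual 1901-2000 average.
ASSUMED_BASELINE = 57.0

rows = anomalies(temps, ASSUMED_BASELINE)
for year, delta, color in rows:
    print(f"{year}: {delta:+.2f} F ({color})")
```

Because the bars now measure distance from the average, zero is a meaningful baseline again, so a column chart is fair game.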

Now, I really don’t care about your politics. I do care about the data and how best to visualize it so that people can make better decisions – whether it is about the Earth’s surface temperature, how your marketing campaigns are performing, what is going on with your equipment installed base, or whatever other metrics are important to your business.


Don’t be stupid or conniving. Pick the right chart type and properly set the Y-axis values. Your colleagues and, maybe, the fate of the Earth are depending on it.
