Balancing Clarity and Creativity in Data Visualization

I’ve been reflecting on Elijah Meeks’ provocative essay, “3rd Wave Data Visualization”. In this post, I want to look at the tension between his first and third “waves,” which he calls Wave 1: Clarity and Wave 3: Convergence. I’ll refer to these as attitudes. (Meeks himself acknowledges that none of his “waves” have washed away. Each lives on.)

Having re-read his argument a few times, I believe we may usefully understand the contrast Meeks highlights as the tension between these two imperatives:

Attitude 1: Design with Clarity. (Make sure we don’t miss the message.)

Attitude 2: Bring back the Creativity and Fun. (Give us some enjoyment.)

I’ll talk about these attitudes in more detail in a later post.

For now, I’m going to spend some time evaluating a number of data visualizations, bearing in mind questions such as these:

  1. How clear is this visualization? How easy is it to understand and interpret? Is that a good or a bad thing?
  2. How creative and fun is this visualization? Am I motivated to explore it further? Why or why not?
  3. Are there times, places, and audiences for whom clarity is more important than creativity? And vice versa?

The Tableau Public Gallery is a good place to start. And there are many others.

I’d be interested in your responses below. Include a link to a relevant data visualization.

I’ll report back with an update to this post.

Once again, the NY Times demonstrates the value of interactive data visualization

This impressive interactive data visualization demonstrates the value of the format. More than merely interesting, or intriguing, or even fun — it massively amplifies the communicative power of its subject matter.

Check it out:

How to Cut U.S. Emissions Faster? Do What These Countries Are Doing.
By Brad Plumer and Blacki Migliozzi — FEB. 13, 2019


Hal Varian on the Need for Data Interpreters

Hal Varian, Google’s chief economist, gave a nice summary of a major need of our era.

Emphasis added:

“The ability to take data—to be able to understand it, to process it, to extract value from it, to visualize it, to communicate it—that’s going to be a hugely important skill in the next decades, not only at the professional level but even at the educational level for elementary school kids, for high school kids, for college kids. Because now we really do have essentially free and ubiquitous data. So the complementary scarce factor is the ability to understand that data and extract value from it.

“I think statisticians are part of it, but it’s just a part. You also want to be able to visualize the data, communicate the data, and utilize it effectively. … being able to access, understand, and communicate the insights you get from data analysis —are going to be extremely important.”

Hal Varian, Google’s Chief Economist, 2009

Strategy Tips for Kaggle Competitors

Martin O’Leary recently posted some sound advice for Kaggle competitors. You can find the three-graph version in the Kaggle wiki.

Here I’ll break it into four key points:

  1. Spend a while on visualization, making graphs of various properties of the data and trying to get a feel for how everything fits together.
  2. Test the performance of a variety of standard algorithms (random forests, SVMs, elastic net, etc.) to see how they compare. It’s often very informative to look at which data points are the least well predicted by standard algorithms, as this can give you a good idea of what direction to move in. (Be warned: Home-brew algorithms can be useful later on in a project, but in the early stages you want to try out as many things as possible, not get bogged down in the details of implementing a particular algorithm.)
  3. Then move into the nitty-gritty details once you have a sense for the lay of the land.
  4. Of course, all this assumes a certain kind of problem, where the data is already in numeric/categorical form. For more “interesting” datasets, such as the recent Automated Essay Scoring competition, a lot of the early work is in feature extraction — just looking for numbers which you can pull out of the data. That tends to be a bit more creative, and I use a variety of tools to see what works best. However, one of the joys of this kind of problem is that every one is different, so it’s hard to give general advice.
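
To make point 2 a little more concrete, here is a minimal sketch of the "compare several standard algorithms, then look at the worst-predicted points" workflow. Everything in it is my own assumption rather than O’Leary’s code: the data is synthetic, the helper names (fit_mean, fit_linear, fit_knn, evaluate, hardest_points) are hypothetical, and the three toy models are deliberately simple stand-ins for the random forests, SVMs, and elastic nets he mentions, which in practice you would take from a library.

```python
import random
import statistics


def fit_mean(train):
    """Baseline: always predict the mean target seen in training."""
    mu = statistics.fmean(y for _, y in train)
    return lambda x: mu


def fit_linear(train):
    """Ordinary least squares on a single feature."""
    xs = [x for x, _ in train]
    ys = [y for _, y in train]
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in train) / sxx
    return lambda x: my + slope * (x - mx)


def fit_knn(train, k=3):
    """k-nearest neighbours: average the targets of the k closest points."""
    def predict(x):
        nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
        return statistics.fmean(y for _, y in nearest)
    return predict


def evaluate(models, train, test):
    """Fit each model on train; return {name: (rmse, per-point abs errors)}."""
    results = {}
    for name, fit in models.items():
        predict = fit(train)
        errors = [abs(predict(x) - y) for x, y in test]
        rmse = (sum(e * e for e in errors) / len(errors)) ** 0.5
        results[name] = (rmse, errors)
    return results


def hardest_points(results, test, n=3):
    """Rank test points by their absolute error averaged over all models."""
    mean_err = [
        statistics.fmean(errs[i] for _, errs in results.values())
        for i in range(len(test))
    ]
    order = sorted(range(len(test)), key=lambda i: mean_err[i], reverse=True)
    return [(test[i], mean_err[i]) for i in order[:n]]


# Synthetic data (an assumption for illustration): y = 2x + 1 plus noise.
rng = random.Random(0)
points = []
for _ in range(60):
    x = rng.uniform(0, 10)
    points.append((x, 2.0 * x + 1.0 + rng.gauss(0, 0.5)))

train = points[:45]
# Hold out the rest, and plant one obvious outlier to see if it surfaces.
test = points[45:] + [(5.0, 40.0)]

models = {"mean": fit_mean, "linear": fit_linear, "knn": fit_knn}
results = evaluate(models, train, test)
worst = hardest_points(results, test)

for name, (rmse, _) in results.items():
    print(f"{name:>6}: RMSE = {rmse:.2f}")
for (x, y), err in worst:
    print(f"hard point x={x:.2f}, y={y:.2f}, mean abs error={err:.2f}")
```

The useful part is the last step: a point that every model gets wrong (here, the planted outlier) is usually worth a closer look before any time goes into a home-brew algorithm.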