Edward Tufte – Father of DataViz

It’s funny that literally within hours of relaunching this site, I had two or three friends ping me to ask if I was familiar with the works of Edward Tufte. As it tends to do, my mind went immediately to a favourite film quote: “People are always asking me if I know Tyler Durden.” Edward Tufte is an economist and statistician who pioneered the study of data visualization, and he is perhaps the person most responsible for elevating the field above merely giving a pretty face to a boring chart.

His books are the definitive texts on the effective presentation of statistical data. When he first explored the field in the early 80s, he surveyed hundreds of examples of data presentation, evaluating different methods of selecting and modelling data and how those methods work together to create effective or failed communication.

What really makes his work fascinating is the breadth of scope of his exploration. Tufte evaluates examples of data visualization from all across history and across fields as diverse as meteorology, photography, modern art, kinesthetics, cartography, epidemiology and, of course, more obvious ones like marketing and propaganda.

A very dear friend lent me a copy of Visual Explanations a year or so ago and I immediately became a disciple, so I owe as huge a debt of gratitude to her for turning me on to this brilliant source of inspiration as I do to him. It will truly be a challenge for me to share my own thoughts and views without seeming derivative of Tufte’s life work.


Data Visualization and Open Data

Last November, I had the great idea to start a blog devoted to excellence in data visualization. “DataViz” is an increasingly important field as the amount of raw data that we are exposed to and that we generate daily grows and grows. From searches on Google to trends on Twitter to likes on Facebook, more and more of us are involved in the generation and collateral consumption of large sets of data. We are consciously aware of only superficial manifestations of those activities, such as the accuracy of our search results or the popularity of certain topics of interest, but there is a gaping black hole of awareness between what we as social networkers contribute to these systems and what we harvest from them. DataViz is one of the tools available to us to try to illuminate that mystery for ourselves. It allows us to wrap information in (what is typically hoped to be) an aesthetically appealing presentation, either to derive meaning or to present a position on the basis of empirical data. To put it simply, it provides a picture of reality based on objective sampling in order to tell some kind of story. Hence the name of this blog – numbers made to be pretty, or “prettynumbers”.

One of the reasons that it has taken me so long to get this blog going is that I have become professionally involved in a pretty significant data-sharing initiative. The project deals with Open Data, or the presentation of publicly available sets of data for public consumption. In principle, it is the release into the public domain of the statistics and measurements that organizations track and use to make informed policy decisions, for the public to do with as it will. This data is managed or owned by governments, corporations, not-for-profit groups, schools, individuals – anyone at all with a collection of numbers. The reason that Open Data is so appealing is that if we can agree on a standard format for expressing sets of data, then we can also develop really useful tools to visualize any of those sets creatively, so as to make them comprehensible, appealing and, perhaps most importantly, relevant.
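To make that last point a little more concrete, here is a minimal Python sketch of why a shared format matters. It assumes, purely for illustration, that the agreed format is plain CSV; the function names and the sample rows are hypothetical and made up for this example, not taken from any real data set or toolkit. The point is only that one small, generic tool can count and chart any column of any data set that follows the same format.

```python
import csv
import io
from collections import Counter


def count_by_column(csv_text: str, column: str) -> Counter:
    """Count how often each value appears in one column of a CSV document."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row[column] for row in reader)


def text_bar_chart(counts: Counter, width: int = 30) -> str:
    """Render counts as horizontal text bars, largest first."""
    if not counts:
        return ""
    largest = max(counts.values())
    label_width = max(len(label) for label in counts)
    lines = []
    for label, n in counts.most_common():
        # Scale each bar relative to the largest count; always show at least one mark.
        bar = "#" * max(1, round(n / largest * width))
        lines.append(f"{label.ljust(label_width)} {bar} {n}")
    return "\n".join(lines)


# Made-up sample rows, for illustration only (not real data).
SAMPLE = """neighbourhood,permit_type
Downtown,Renovation
Downtown,New build
Strathcona,Renovation
Downtown,Renovation
"""

if __name__ == "__main__":
    print(text_bar_chart(count_by_column(SAMPLE, "neighbourhood")))
```

Nothing in the two functions knows what the data is about, so the same code would work on the "permit_type" column, or on any other CSV file with a header row.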

In my mind, it’s the difference between taking all of your clothes from your dressers and closets, throwing them on the floor in a huge messy pile and calling that your collective “wardrobe and personal sense of style”, versus choosing the representative outfits that best describe your wardrobe and style. Or better yet, allowing your friends or complete strangers to come into your room, rifle through your clothes themselves and draw their own conclusions! Undeniably, there is an opportunity for bias to be applied in the process in either case, but it makes the overall task of assessing the value of the data far more manageable (and in this example, far more entertaining). But we’ll deal with the bias issue more in subsequent posts.

Hands down, one of the most exciting partners in the movement to expand Open Data is a company called Socrata. They have created an unbelievable set of easy-to-use tools designed to simplify the conversion of raw data into useful web applications that make sense to humans, rather than to spreadsheet programs, and those tools have had tremendous uptake. One of my favourite implementations of the Socrata toolkit belongs to the government of my home town, the City of Edmonton. data.edmonton.ca offers over a hundred sundry data sets, all coupled with useful (to varying degrees) visualizations that encourage citizens to explore the data that has been captured, rather than relying on the accounts of news agencies or even the government itself. Giving people the purest, most raw form of data available, as well as the tools to explore and interact with that data, is the best way of removing bias from understanding reality that I can imagine, short of going out into the field and observing all of the phenomena for oneself. It is at once empowering and democratizing; it manifests real operational transparency; and it maximizes opportunities for discourse in a way that sitting in a crowded bar or pub and exchanging misinformed opinions can’t even begin to compete with.

Open Data as a concept has been around forever – since the first sentient being looked under a rock. However, the technology to make all of a government’s spending patterns available to every citizen is incredibly new. My smartphone has thousands of times the processing power that yesteryear’s supercomputers had, meaning that as easily as I can update my Facebook status, I can explore the population movements in my country over the past four decades, so long as that data is available. Open Data solves that last problem.

I can’t get excited enough about the possibilities of this technology! With all of the conceivable opportunities to misinform and misdirect public opinion in today’s mass media channels, Open Data stands as a force for unequivocal good in the search for truth in an increasingly complicated and confusing age. I hope to share more of my experiences, insights and examples with you over time. In the meantime, check out the thousands and thousands of examples available on Socrata sites like https://opendata.socrata.com/ or https://nycopendata.socrata.com/ to get a sense of the breadth of this cool new approach to sharing information.