Alumni

Chris Wilson

Summer School

Class of 2014

Bio

Chris Wilson is the interactive graphics editor at Time.com, where he writes data-driven stories and produces visualizations based on the results. He specializes in features driven by behavioral science and other social phenomena. Prior to joining TIME, Chris was a senior editor at Slate and a columnist at Yahoo News. He is the author of a textbook on web-based data visualization and a book of puns on the name “Barack Obama” that was written overnight using a 45-line program and a digital dictionary. That book has the distinction of being one of the worst-selling Slate books of all time.

Chris is a 2005 graduate of the University of Virginia and lives in Washington, DC, with his wife Susan and their two cats, Junebug and Martin.

Project: Computable News

Automatic reports on any topic from the news, based on the archives of TIME magazine.

For all their richness in information, the large majority of news articles are lost to history. This is not a failure of format. Even venerable publications like TIME, where I work, have successfully digitized their archives and made them searchable. The trouble is that there is no way to synthesize the many articles on a subject into a concise summary.

I propose to use the complete print archives of TIME Magazine, which amount to about 180,000 articles dating back to 1923, to generate automated reports on any topic that has been covered in the magazine with sufficient volume. The majority of these reports will be visual. For example, a report on “nuclear weapons” would likely include a map of all nations and locations implicated in the subject over time. A report on “innovation” would chart the astronomical rise in the term’s popularity, as well as its shifting connotations, as told by the evolution of the words it co-occurs with. (This is the subject of a recent New Yorker article.) The report framework will make decisions about the most relevant form of visualization based on the data.
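As a rough illustration of the popularity chart described above, the following Python sketch computes a term's share of all words per decade. It is not the project's code: the `SAMPLE` articles are invented placeholders, and the function name `term_share_by_decade` and the choice to normalize by total word count per decade are assumptions made for the example.

```python
import re
from collections import Counter

# Hypothetical (year, text) pairs standing in for archive articles.
SAMPLE = [
    (1955, "A new machine for the factory."),
    (1985, "Innovation drives the computer industry."),
    (2005, "Innovation, innovation, innovation: the word is everywhere."),
]

def term_share_by_decade(articles, term):
    """Return {decade: occurrences of term / total words in that decade}."""
    hits, totals = Counter(), Counter()
    for year, text in articles:
        decade = year // 10 * 10
        words = re.findall(r"[a-z]+", text.lower())
        totals[decade] += len(words)
        hits[decade] += words.count(term)
    # Sorting the decades gives a series that is ready to plot over time.
    return {d: hits[d] / totals[d] for d in sorted(totals)}

shares = term_share_by_decade(SAMPLE, "innovation")
for decade, share in shares.items():
    print(decade, round(share, 3))
```

Normalizing by the decade's total word count keeps a rise in the term from being confused with a rise in the volume of text published.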

On a technical level, this involves:

tokenizing all the words in the archives, which I have already collected
performing entity recognition where necessary using Wolfram|Alpha
indexing the data for fast retrieval of words and terms and their associated data (co-occurrences, etc.)
automating the analysis of that data
visualizing that data in charts, maps, word charts, and so forth
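
The tokenizing and indexing steps above can be sketched roughly as follows. This is a minimal illustration in Python, not the project's actual code: the `ARTICLES` sample, the function names `tokenize` and `build_index`, and the decision to count co-occurrence at the level of whole articles are all assumptions made for the example. Entity recognition and visualization are left out.

```python
import re
from collections import Counter, defaultdict

# Hypothetical stand-ins for archive articles: (year, text) pairs.
ARTICLES = [
    (1945, "The atomic bomb ended the war."),
    (1962, "Nuclear weapons in Cuba alarmed Washington."),
    (1986, "Nuclear weapons talks resumed in Geneva."),
]

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def build_index(articles):
    """Map each word to the articles containing it and to co-occurring words."""
    postings = defaultdict(set)      # word -> ids of articles that contain it
    cooccur = defaultdict(Counter)   # word -> counts of words sharing an article
    for doc_id, (_year, text) in enumerate(articles):
        words = set(tokenize(text))  # each word counts once per article
        for w in words:
            postings[w].add(doc_id)
            for other in words:
                if other != w:
                    cooccur[w][other] += 1
    return postings, cooccur

postings, cooccur = build_index(ARTICLES)
print(sorted(postings["nuclear"]))         # ids of articles mentioning "nuclear"
print(cooccur["nuclear"].most_common(3))   # words most often seen alongside it
```

Looking up a word in `postings` or `cooccur` is a single dictionary access, which is the kind of fast retrieval the indexing step is meant to provide. An archive of roughly 180,000 articles would likely need an on-disk store rather than in-memory dictionaries.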

Favorite Outer Totalistic r=1, k=2 2D Cellular Automaton

Rule 178734