cscheid, 14 Oct 2013

Good morning from Atlanta! This week holds the largest academic visualization conference of the year, with about 1,000 people all interested in understanding and creating meaningful and beautiful visual depictions of data. What’s not to like? Although the conference proper only starts tomorrow, there are interesting satellite events today. The two big associated symposia are BioVis and LDAV.


BioVis is arguably the hottest area in visualization these days. The combination of world-changing impact, legendarily messy data problems (and, let’s face it, better-than-average funding prospects) makes bioinformatics irresistible. Which is alright with me, because the following papers look great:

Large-Scale Multiple Sequence Alignment Visualization through Gradient Vector Flow Analysis, Khoa Tan Nguyen and Timo Ropinski. (PDF)

invis: Exploring High-dimensional Sequence Space of In Vitro Selection, Çağatay Demiralp, Eric Hayden, Jeff Hammerbacher, and Jeffrey Heer. (PDF)

Yes, I have a soft spot in my heart for high-dimensional data analysis.


On the LDAV side, I’m sad to have missed Dr. Guy Lebanon’s keynote yesterday. Lebanon has long been involved in bridging the visualization, data mining and machine learning worlds, which is also a favorite topic of mine. I’m happy to see his work featured in a keynote (does anyone have a link to his talk or slides?).

For today, I’m eager to hear about these two papers:

A Provably-Robust Sampling Method for Generating Colormaps of Large Data. David Thompson, C. Seshadhri, Ali Pinar, and Janine Bennett. (PDF; possibly a preprint — it's the only link I found.) Data-driven colormap design is, when you think about it, a straightforward idea, but one that appears to have received essentially no attention until the last couple of years.
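To make the idea concrete, here's a minimal sketch of one data-driven approach (my own toy illustration, not the paper's sampling method): place the colormap stops at data quantiles instead of at evenly spaced values, so each color band covers an equal share of the data rather than an equal share of the value range. With skewed data, a uniform-range colormap wastes most of its colors on a nearly empty tail.

```python
import numpy as np

def quantile_colormap_stops(values, colors):
    """Assign each color a stop at the corresponding data quantile,
    so every color band covers an equal fraction of the data."""
    qs = np.linspace(0.0, 1.0, len(colors))
    stops = np.quantile(values, qs)
    return list(zip(stops, colors))

# Skewed data: almost everything near 0, a tiny cluster near 50.
rng = np.random.default_rng(0)
data = np.concatenate([rng.normal(0, 1, 10000),
                       rng.normal(50, 1, 100)])

# A uniform-range colormap would put the middle color near 25,
# where there is no data; the quantile version puts it near 0.
stops = quantile_colormap_stops(data, ["blue", "white", "red"])
```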

Less After-the-Fact: Investigative Visual Analysis of Events from Streaming Twitter. Thomas Kraft, Xiaoyu Wang, Jeffery Delawder, Wenwen Dou, Li Yu, and William Ribarsky. (PDF) I'm curious about this because of its relationship with our nanocubes paper, and about how to scale an interactive tool to query hundreds of millions of data points at acceptable rates.