A Survey of Data and Encodings in Word Clouds

1. Abstract

Word clouds are an increasingly popular means of presenting statistical summaries of document collections, appearing frequently in digital humanities literature, newspaper articles, and social media. Despite their ubiquity and intuitive appeal, our ability to read such visualizations accurately is not yet fully understood. Past work has shown that readers perform poorly at certain tasks with word clouds, and that perceptual biases can affect their interpretation. To better understand the potential impacts of these biases, we present a survey of word cloud usage. Drawing on a corpus of literature from the fields of digital humanities, data visualization, and journalism, we record which data encodings are most commonly used (e.g., font size and position), what data are being presented, and which tasks the word clouds are meant to support. We offer design recommendations based on the most common tasks and biases, and point to future work needed to answer outstanding questions.

Muyang Shi (shim@carleton.edu), Carleton College, United States of America; Danielle Albers Szafir, University of Colorado Boulder, United States of America; and Eric Carlson Alexander, Carleton College, United States of America
