June 27 - June 30 2016
Stanford University, Stanford, California
The materials used in the tutorial are available here and here.
An interactive graphic invites the viewer to become an active partner in the analysis and allows for immediate feedback on how the data and results may change when inputs are modified. Interactive graphics can be extremely useful for exploratory data analysis, for teaching, and for reporting.
Because there are so many different kinds of interactive graphics, there has been an explosion in R packages that can produce them (e.g. animint, shiny, rCharts, rMaps, ggvis, htmlwidgets). A beginner with little knowledge of interactive graphics can thus easily be overwhelmed by two problems: (1) understanding what kinds of graphics are useful for what kinds of data, and (2) finding an R package that can produce the desired type of graphic. This tutorial addresses these two problems by (1) introducing a vocabulary of keywords for understanding the different kinds of graphics, and (2) explaining which R packages can be used for each kind of graphic.
Attendees will gain hands-on experience with using R to create interactive graphics. We will discuss several example data sets and several R interactive graphics packages. Attendees will learn a vocabulary that helps to understand the strengths and weaknesses of the many different packages which are currently available.
A vocabulary for understanding interactive graphics, 30 minutes
In this section we will give a high-level introduction to interactive graphics, without going into details about R code for specific packages.
Motivation: interactive graphics in exploratory data analysis, reporting results and teaching.
Vocabulary for describing interactive graphics:
interactive: user interaction changes what is displayed. Useful when there are many similar plots for different data subsets, but you don’t want to see them all at the same time.
direct manipulation: interacting with the plot elements themselves (lines, points, etc).
indirect manipulation: interacting via the keyboard or via mouse clicks on widgets (buttons, menus, etc).
animated: an animated graphic automatically advances over time, like a video. Animated graphics are most useful when data sets have a time dimension. The only interaction possible is moving forward and backward in time.
multi-layer: a multi-layer graphic uses several geometric elements to show several data sets and/or variables. Multi-layer plots are useful for showing relationships between data sets and/or variables.
multi-panel: a multi-panel graphic shows different things in different panels (sub-plots), each of which has its own axes (perhaps different from each other). Useful when there are many similar plots for different data subsets, and you do want to see them at the same time. Also useful for showing different plots with aligned axes.
Compare and contrast:
interactive vs animated: the only interaction possible in an animated graphic is moving forward and backward in time (animated graphics are thus a subset of interactive graphics).
interactive vs multi-panel: both are useful for many similar plots with different data subsets. Do you want to see all the subsets at the same time? (yes=multi-panel, no=interactive)
Quiz questions
The previous section introduced a vocabulary for describing interactive graphics. In the following section, after showing each new graphic, we will ask the audience to take 1 minute to discuss with their neighbour which vocabulary words can be used to describe that graphic.
Creating interactive graphics using R packages
In this section we will show specific R code examples from the various packages.
High-level interactive plotting packages, 30 minutes
Interactive graphics with shiny and plotly, 30 minutes
Multi-layer graphics, ggplot2 package, 15 minutes
Multi-panel graphics, facets in ggplot2, 15 minutes: useful in two different situations (many similar plots for different data subsets, and different plots with aligned axes).
Animated graphics, animation package, 15 minutes
Interactive + animated + multi-panel + multi-layer, 45 minutes: a few packages are able to produce complex graphics which can be described by several vocabulary words.
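As an illustration of two of the vocabulary words in the agenda above, here is a minimal sketch of a graphic that is both multi-layer and multi-panel in ggplot2. It is not taken from the tutorial materials: the built-in mtcars data set and the variable names cars and p are assumptions chosen only for this example.

```r
library(ggplot2)

# Assumption: the built-in mtcars data stands in for a tutorial data set.
# Convert the cylinder count to a factor so it can define the panels.
cars <- transform(mtcars, cyl = factor(cyl))

p <- ggplot(cars, aes(x = wt, y = mpg)) +
  # Layer 1: one point per car.
  geom_point() +
  # Layer 2: a least-squares line fitted separately in each panel.
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  # Multi-panel: one panel per cylinder count, with shared axes.
  facet_wrap(~ cyl)

# Draw the static graphic on the current graphics device.
print(p)
```

If the plotly package is installed, calling plotly::ggplotly(p) should convert this static graphic into an interactive web graphic, which is one way to add interactivity without writing HTML or JavaScript.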
Since we plan to present state-of-the-art interactive graphics, people should know how to use R data structures (lists, data.frames) and the ggplot2 package.
Even though many examples will be interactive web graphics, we will assume only knowledge of R, not HTML/JavaScript.
There are two classes of potential attendees:
Attendees should run the following R code to install the packages which are required for our tutorial:
source("http://tdhock.github.io/interactive-tutorial/packages.R")
source("http://biostatistics.dk/useR2016pack.R")
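After running the two scripts above, it can be useful to check that the packages are actually available. The following is a small sketch of such a check. The helper missing_packages is hypothetical, and the package list is an assumption based only on packages named in this description; the scripts themselves may install a different set.

```r
# Hypothetical helper: return the names of the packages in pkgs
# that cannot be loaded in the current R library.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Assumed list of packages, taken from the names mentioned in this description.
assumed <- c("ggplot2", "shiny", "plotly", "animation", "animint")
not_found <- missing_packages(assumed)

if (length(not_found) > 0) {
  message("Not installed: ", paste(not_found, collapse = ", "))
} else {
  message("All listed packages are installed.")
}
```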
Toby Dylan Hocking has designed and implemented several R graphics packages. He is the designer and primary maintainer of directlabels (Best Student Poster Prize at useR 2011) and animint (presented in an invited session at Joint Statistical Meetings 2015). He is also the original designer of the ggplot conversion feature of the plotly package.
Claus Thorn Ekstrøm is the creator of and contributor to a number of R packages (MESS, MethComp, SuperRanker) and is the author of "The R Primer" book. He has previously given tutorials on dynamic graphics in R and on the role of interactive graphics in teaching, and won the C. Oswald George prize for his article "Teaching ‘Instant Experience’ with Graphical Model Validation Techniques" in 2014.