Chapter 12 Reporting data results #3

Download a pdf of the lecture slides covering this topic.

Slides for the second half of these week are available to download in an HTML format. Save this file as HTML, and then you should be able to open the file in a web browser to see the presentation.

12.1 Shiny web apps

12.1.1 Resources for learning Shiny

There is an excellent tutorial to get you started here at RStudio. There are also several great sites that show you both Shiny examples and their code, here and here.

Many of the examples and ideas in the course notes this week come directly or are adapted from RStudio’s Shiny tutorial.

To start, Shiny has several example apps that you can try out. These are all available through your R session once you install the Shiny package. You can make them available to your R session using the command system.file():

install.packages("shiny")
library(shiny)
system.file("examples", package = "shiny")

12.1.2 Basics of Shiny apps

Once you have Shiny installed, you can run the examples using the runExample() command. For example, to run the first example, you would run:

runExample("01_hello")

This is a histogram that lets you adjust the number of bins using a slider bar. Other examples are: 02_text, 03_reactivity, 04_mpg, 05_sliders, 06_tabsets, 07_widgets, 08_html, 09_upload, 10_download, and 11_timer.

When you run any of these, a window will come up in your R session that shows the Shiny App, and your R session will pay attention to commands it gets from that application until you close the window. Notice that if you scroll down, you’ll be able to see the code that’s running behind the application:

Generally, each application takes two files: (1) a user interface file and (2) a server file. You can kind of think of the two elements of an R shiny app, the user interface and the server, as two parts of a restaurant.

The user interface is the dining area. This is the only place the customer every sees. It’s where the customer makes his order, and it’s also where the final product comes out for him to consume. The server is the kitchen. It takes the order, does all the stuff to make it happen, and then sends out the final product back to the dining area.

At its heart, an R shiny app is just a directory on your computer or a server with these two files (as well as any necessary data files) in it. For example, here’s a visual of an App I wrote to go with a paper:

This has the heart of the application (server.R and ui.R) plus a couple of R helper files and subdirectories with some figures and data that I’m using in the that application. If I open either of the main files in RStudio, I can run the application locally using a button at the top of the file called “Run App”. Once I have the App running, if I have an account for the Shiny server, I can choose to “Publish” the application to the Shiny server, and then anyone can access and use it online (this service is free up to a certain number of visitors per time– unless you make something that is very popular, you should be well within the free limit).

12.1.3 server.R file

The server file will be named server.R. This file tells R what code to run with the inputs it gets from a user making certain selections. For example, for the histogram example, this file tells R how to re-draw a histogram of the data with the number of bins that the user specified on the slider on the application. Once you get through all the code for what to do, this file also will have code telling R what to send back to the application for the user to see (in this case, a picture of a histogram made with the specified number of bars). Here is the code in the server.R file for the histogram example:

library(shiny)

# Define server logic required to draw a histogram
shinyServer(function(input, output) {

  # Expression that generates a histogram. The expression is
  # wrapped in a call to renderPlot to indicate that:
  #
  #  1) It is "reactive" and therefore should be automatically
  #     re-executed when inputs change
  #  2) Its output type is a plot

  output$distPlot <- renderPlot({
    x    <- faithful[, 2]  # Old Faithful Geyser data
    bins <- seq(min(x), max(x), length.out = input$bins + 1)

    # draw the histogram with the specified number of bins
    hist(x, breaks = bins, col = 'darkgray', border = 'white')
  })

})

Notice that some of the “interior” code here looks very familiar and should remind you of code we’ve been learning about this class. For example, this file has within it some code to figure out the breaks for histogram bins, based on how many total bins you want, and draw a histogram with those bin breaks:

x    <- faithful[, 2]  # Old Faithful Geyser data
bins <- seq(min(x), max(x), length.out = input$bins + 1)

# draw the histogram with the specified number of bins
hist(x, breaks = bins, col = 'darkgray', border = 'white')

This code is then “wrapped” in two other functions. First, this code is generating a plot that will be posted to the application, so it’s wrapped in a renderPlot function to send that plot as output back to the application:

output$distPlot <- renderPlot({
    x    <- faithful[, 2]  # Old Faithful Geyser data
    bins <- seq(min(x), max(x), length.out = input$bins + 1)

    # draw the histogram with the specified number of bins
    hist(x, breaks = bins, col = 'darkgray', border = 'white')
  })

Notice that this code is putting the results of renderPlot into a slot of the object output named distPlot. We could have used any name we wanted to here, not just distPlot, for the name of the slot where we’re putting this plot, but it is important to put everything into an object called output. Now that we’ve rendered the plot and put it in that slot of the output object, we’ll be able to refer to it by its name in the user interface file, when we want to draw it somewhere there.

All of this is wrapped up in another wrapper:

# Define server logic required to draw a histogram
shinyServer(function(input, output) {

  # Expression that generates a histogram. The expression is
  # wrapped in a call to renderPlot to indicate that:
  #
  #  1) It is "reactive" and therefore should be automatically
  #     re-executed when inputs change
  #  2) Its output type is a plot

  output$distPlot <- renderPlot({
    x    <- faithful[, 2]  # Old Faithful Geyser data
    bins <- seq(min(x), max(x), length.out = input$bins + 1)

    # draw the histogram with the specified number of bins
    hist(x, breaks = bins, col = 'darkgray', border = 'white')
  })

})

The server.R file also has a line to load the shiny package. You should think of apps as being like Rmd files– if there are any packages or datasets that you need to use in the code in that file, you need to load it within the file, because R won’t check in your current R session to find it when it runs the file.

12.1.4 ui.R file

The other file that a Shiny app needs is the user interface file (ui.R). This is the file that describes how the application should look. It will write all the buttons and sliders and all that you want for the application interface. This is also where you specify what you want to go where and put in any text that you want to show up.

For example, here is the ui.R file for the histogram example:

library(shiny)

# Define UI for application that draws a histogram
shinyUI(fluidPage(

  # Application title
  titlePanel("Hello Shiny!"),

  # Sidebar with a slider input for the number of bins
  sidebarLayout(
    sidebarPanel(
      sliderInput("bins",
                  "Number of bins:",
                  min = 1,
                  max = 50,
                  value = 30)
    ),

    # Show a plot of the generated distribution
    mainPanel(
      plotOutput("distPlot")
    )
  )
))

There are a few things to notice with this code. First, there is some code that tells the application to show the results from the server.R code. For example, the following code tells R to show the histogram that we put into the output object in the distPlot slot and to put that graph in the main panel of the application:

mainPanel(
      plotOutput("distPlot")
)

Other parts of the ui.R code will tell the application what kinds of choice boxes and sliders to have on the application, and what default value to set each to. For example, the following code tells the application that it should have a slider bar that can take a minimum value of 1 and a maximum value of 50. When you first open the application, its default value should be 30. It should be annotated with the text “Number of bins:”. Whatever value is selected should be saved to the bins slot of the input object (just like we’re using the output object to get things out of the server and printed to the application, we’re using the input object to get things that the user chooses from the application interface to the server where we can run R code).

sliderInput("bins",
            "Number of bins:",
             min = 1,
             max = 50,
             value = 30)

12.1.5 Making a Shiny app

The first step in making a Shiny app is to make a new directory somewhere and to create R scripts for that directory called server.R and ui.R. You can just make these two files the normal way– within RStudio, do “New File”, “R Script”, and then just save them with the correct names to the directory you created for the App. Once you save a file as ui.R, notice that you’ll have a button in the top right of the file called “Run App”. When you’re ready to run your application, you can either use this button or use the command runApp.

12.1.6 Starting with the ui.R file

Next, you’ll need to put code in these files. I would suggest starting with the ui.R files. This file is where you get to set up how the application looks and how people will be able to interact with it. That means that this is a good place to start because it’s both quickly fulfilling (I made something pretty!) and also because you need to have an idea of what inputs and outputs you need before you can effectively make the server file to tell R what to do. In truth, though, you’ll be going back and forth quite a bit between these two files as you edit your application.

In the ui.R file, everything needs to be wrapped in a shinyUI() function, and then most things will be wrapped in other functions within that to set up different panels. For example, here’s a very basic ui.R file (adapted directly from the RStudio tutorial) that shows a very basic set up for a user interface:

shinyUI(fluidPage(
        
        titlePanel("Tweets during Paris Attack"),
        
        sidebarLayout(
                sidebarPanel("Select hashtag to display"),
                mainPanel("Map of tweets")
        )
))

Notice that everything that I want to go in certain panels of the page are wrapped in functions like sidebarPanel and mainPanel and titlePanel. Everything in this file will be divided up by the place you want it to go in the final version.

As a note, the sidebar layout (a sidebar on one side and one main panel) is the simplest possible Shiny layout. You can do fancier layouts if you want by using different functions like fluidRow() and navBarPage(). RStudio has a layout help page with very detailed instructions and examples to help you figure out how to do other layouts.

If I run this ui.R, even if my server.R file only includes the line shinyServer(function(input, output) { }), I’ll get the following version application:

This doesn’t have anything interactive on it, and it isn’t using R at all, but it shows the basics of how the syntax of the ui.R file works. As a note, I don’t have all of the functions for this, like fluidPage and titlePanel memorized. When I’m working on this file, I’ll either look to example code from other Shiny apps of look at RStudio’s help for Shiny applications until I can figure out what syntax to use to do what I want to do.

12.1.7 Adding in widgets

Next, I’ll add in some cool things that will let the user interact with the application. In this case, I’d like to have a slider bar so people can chose the range of time for the tweets that are shown. I’d also like to have a selection box so that users can look at maps of specific hashtags or terms. To add these on (they won’t be functional, yet, but they’ll be there!), I can edit the ui.R script to the following:

shinyUI(fluidPage(
        
        titlePanel("Tweets during Paris Attack"),
        
        sidebarLayout(position = "right",
                sidebarPanel("Choose what to display",
                             sliderInput(inputId = "time_range",
                                         label = "Select the time range: ",
                                         value = c(as.POSIXct("2015-11-13 00:00:00",
                                                              tz = "CET"),
                                                   as.POSIXct("2015-11-14 12:00:00",
                                                              tz = "CET")), 
                                         min = as.POSIXct("2015-11-13 00:00:00", tz = "CET"),
                                         max = as.POSIXct("2015-11-14 12:00:00", tz = "CET"),
                                         step = 60,
                                         timeFormat = "%dth %H:%M",
                                         timezone = "+0100")),
                mainPanel("Map of tweets")
        )
))

The important part of this is the new sliderInput call, which sets up a slider bar that users can use to specify certain time ranges to look at. Here is what the interface of the app looks like now:

If I open this application, I can move the slider bar around, but I it isn’t actually sending any information to R yet.

The heart of this new addition to the ui.R file is this:

sliderInput(inputId = "time_range",
            label = "Select the time range: ",
            value = c(as.POSIXct("2015-11-13 00:00:00", tz = "CET"),
                      as.POSIXct("2015-11-14 12:00:00", tz = "CET")), 
            min = as.POSIXct("2015-11-13 00:00:00", tz = "CET"),
            max = as.POSIXct("2015-11-14 12:00:00", tz = "CET"),
            step = 60,
            timeFormat = "%dth %H:%M",
            timezone = "+0100")

This is all within the function sliderInput and is all being used to set up the slider bar. "time_range" is the name I’m giving the input I get from this. Later, when I write my server code, I’ll be able to pull the values that the user suggested from the time_range slot of the input object. The next thing is the label. This is what I want R to print right before it gives the slider bar. The value object says which values I want to be the defaults on the slider. This is where the slider positions will be when someone initially opens the application. min and max give the highest and lowest values that will show up on the slider bar. For this application, I’m calling them as POSIXct objects because I want to do this for times rather than numbers. step says how big of an increment I want the slider to advance by when someone is pulling it. If the values of the slider are times, then the default unit for this is seconds, so I’m saying to have a step size of one minute. The timeFormat says how I want the time to print out at the interface and the timezone says what time zone I want time values to display in.

Things like this slider bar are called “Control Widgets”, and there’s a whole list of them in the third lesson of RStudio’s Shiny tutorial. There are also examples online in the Shiny Gallery.

12.1.8 Creating output in server.R

Next, I’ll put some R code in the server.R file to create a figure and pass it through to the ui.R file to print out to the application interface. At first, I won’t make this figure “reactive”; that is, it won’t change at all when the user changes the slider bar. However, I will eventually add in that reactivity so that the plot changes everytime a user changes the slider bar.

I am going to create a map of all the Tweets that included certain hashtags or phrases and that were Tweeted (and geolocated) from within a five-mile radius of the center of Paris during the attacks last Friday. For this, I’m going to use data on Tweets I pulled using the TwitteR package, which syncs up with Twitter’s API. Here’s an example of what the data looks like (I’ve cleared out some of the extra columns I won’t use):

paris_twitter <- read.csv("data/App-1/data/final_tweets.csv", as.is = TRUE) %>%
        mutate(tag = factor(tag),
               created = ymd_hms(created, tz = "Europe/Paris"),
               text = iconv(text, to='ASCII//TRANSLIT'))
paris_twitter[1:2, ]
##                                                                                                                                           text
## 1 RT @forza_will2006: My heart aches for the people of France. #PorteOuverte #PrayForParis #DownWithTerrorism #Pompidou. https://t.co/KwtNPsQ.
## 2                                  Ensemble contre la haine #jesuisparis #porteouverte #paris @ Place de la Republique https://t.co/TZykpVJHwd
##               created longitude latitude           tag
## 1 2015-11-15 01:27:56        NA       NA #PorteOuverte
## 2 2015-11-14 22:19:47  2.364184 48.86747 #PorteOuverte

About an equal number of these have and don’t have location data:

table(!is.na(paris_twitter$longitude))
## 
## FALSE  TRUE 
## 10478 10683

Here is a table of the number of tweets under the five most-tweeted tags:

head(paris_twitter)
##                                                                                                                                           text
## 1 RT @forza_will2006: My heart aches for the people of France. #PorteOuverte #PrayForParis #DownWithTerrorism #Pompidou. https://t.co/KwtNPsQ.
## 2                                  Ensemble contre la haine #jesuisparis #porteouverte #paris @ Place de la Republique https://t.co/TZykpVJHwd
## 3                   Liberte, egalite, fraternite #Paris #TodosSomosParis #JeSuisParis #PorteOuverte #peace #paix #paz. https://t.co/AjYVrmCCnL
## 4                Stay With My French #prayforparis #parisattacks #porteouverte #france #riphumanity #liberte #egalite. https://t.co/9DH5diHmyz
## 5                                                                                                                                         <NA>
## 6                                                          RT @bodoi_music: J'ai peur je cherche un abri #porteouverte https://t.co/1DIr4VLOEa
##               created longitude latitude           tag
## 1 2015-11-15 01:27:56        NA       NA #PorteOuverte
## 2 2015-11-14 22:19:47  2.364184 48.86747 #PorteOuverte
## 3 2015-11-14 21:36:06  2.350800 48.85670 #PorteOuverte
## 4 2015-11-14 21:00:49  2.350800 48.85670 #PorteOuverte
## 5 2015-11-14 20:11:01  2.350800 48.85670 #PorteOuverte
## 6 2015-11-14 19:52:43        NA       NA #PorteOuverte
tweet_sum <- paris_twitter %>%
  dplyr::group_by_(~ tag) %>%
  dplyr::summarize_(n = ~ n(),
                   example = ~ gsub("[[:punct:]]", " ", 
                                 base::sample(text, 1))) %>%
  dplyr::arrange_(~ dplyr::desc(n))
## Warning: `arrange_()` was deprecated in dplyr 0.7.0.
## ℹ Please use `arrange()` instead.
## ℹ See vignette('programming') for more help
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.
## Warning: `summarise_()` was deprecated in dplyr 0.7.0.
## ℹ Please use `summarise()` instead.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.
## Warning: `group_by_()` was deprecated in dplyr 0.7.0.
## ℹ Please use `group_by()` instead.
## ℹ See vignette('programming') for more help
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.
knitr::kable(tweet_sum[1:5, ],
             col.names = c("Tag", "# of Tweets", "Example Tweet"))
Tag # of Tweets Example Tweet
#Paris 7502 RT taimaz Hommage aux victimes sur un panneau d affichage dans le 17e Paris AFP https t co jCt07gHZgv
#PrayForParis 7250 prayforparis https t co JBNwqGUzdC
#13novembre 1255 RT taimaz Perimetre de securite de 100m autour du bataclan AFP Paris 13novembre https t co A6Rp31ZqXN
#PorteOuverte 1147 RT fake rebel Si jamais quelqu un a besoin porteouverte Metro Maraichers L appart est vide car on a signe le bail il y a 1h mais on e
#fusillade 765 rayenoximore Bref je suis musulman et ma religion c est une religion de paix
Confondez pas ces gens n ont pas de religions fusillade

For the Tweets that are geolocated, it’s possible to map the tweet locations using the following code:

paris_map <- get_map("paris", zoom = 12, color = "bw")

paris_locations <- c("Stade de France", "18 Rue Alibert",
                     "50 Boulevard Voltaire", "92 Rue de Charonne",
                     "Place de la Republique")
paris_locations <- paste(paris_locations, "paris france")
paris_locations <- cbind(paris_locations, geocode(paris_locations))

# Plot contour map of tweet locations
plot_map <- function(tag = "all", df = paris_twitter){
        library(ggmap)
        library(dplyr)
        df <- dplyr::select(df, tag, latitude, longitude) %>%
                filter(!is.na(longitude)) %>%
                mutate(tag = as.character(tag))
        
        if(tag != "all"){
                if(!(tag %in% df$tag)){
                        stop(paste("That tag is not in the data. Try one of the following tags instead: ", paste(unique(df$tag), collapse = ", ")))
                }
                to_plot <- df[df$tag == tag, ]
        } else {
                to_plot <- df
        }
        
        hotel_de_ville <- to_plot$latitude == 48.85670 &
                to_plot$longitude == 2.350800
        n_hotel_de_ville <- sum(hotel_de_ville)
        
        if(n_hotel_de_ville == max(table(to_plot$latitude))){
                hdv_index <- sample((1:nrow(to_plot))[hotel_de_ville],
                            round(n_hotel_de_ville / 2))
                to_plot <- to_plot[-hdv_index, ]
        }
        
        my_map <- ggmap(paris_map, extent = "device") + 
                geom_point(data = to_plot, aes(x = longitude, y = latitude),
                           color = "darkgreen", alpha = 0.75) +
                geom_density2d(data = to_plot, 
                               aes(x = longitude, latitude), size = 0.3) + 
                stat_density2d(data = to_plot, 
                               aes(x = longitude, y = latitude,
                                   fill = ..level.., alpha = ..level..),
                               size = 0.01, bins = round(nrow(to_plot) / 3.3),
                               geom = "polygon") +
                scale_fill_gradient(low = "green", high = "yellow", guide = FALSE) + 
                scale_alpha(guide = FALSE) + 
                geom_point(data = paris_locations, aes(x = lon, y = lat),
                           color = "red", size = 5, alpha = 0.75) 
        return(my_map)
}

To print this out in the application, I’ll put all the code for the mapping function in a file called helper.R, source this file in the server.R file, and then I can just call the function within the server file. The application will look as follows after this step:

To complete this, I first changed the server.R file to look like this:

library(dplyr)
library(lubridate)

source("helper.R")

paris_twitter <- read.csv("data/final_tweets.csv", as.is = TRUE) %>%
        mutate(tag = factor(tag),
               created = ymd_hms(created, tz = "Europe/Paris"))

shinyServer(function(input, output) {
        output$twitter_map <- renderPlot({ plot_map() })
})

Notice a few things here:

  1. I’m loading the packages I’ll need for the code.
  2. I’m running all the code in the helper.R file (which includes the function I created to plot this map) using the source() command.
  3. I put the code to plot the map (plot_map()) inside the renderPlot({}) function.
  4. I’m putting the plot in a twitter_map slot of the output object.
  5. All of this is going inside the call shinyServer(function(input, output){ }).

One other change is necessary to get the map to print on the app. I need to add code to the ui.R file to tell R where to plot this map on the final interface. The full file now looks like this:

shinyUI(fluidPage(
        
        titlePanel("Tweets during Paris Attack"),
        
        sidebarLayout(position = "right",
                sidebarPanel("Choose what to display",
                             sliderInput(inputId = "time_range",
                                         label = "Select the time range: ",
                                         value = c(as.POSIXct("2015-11-13 00:00:00",
                                                              tz = "CET"),
                                                   as.POSIXct("2015-11-15 06:00:00",
                                                              tz = "CET")), 
                                         min = as.POSIXct("2015-11-13 00:00:00", tz = "CET"),
                                         max = as.POSIXct("2015-11-14 12:00:00", tz = "CET"),
                                         step = 60,
                                         timeFormat = "%dth %H:%M",
                                         timezone = "+0100")),
                mainPanel("Map of tweets",
                          plotOutput("twitter_map"))
        )
))

The new part is where I’ve added the code:

mainPanel("Map of tweets",
          plotOutput("twitter_map"))

This tells R to put the plot output twitter_map from the output object in the main panel of the Shiny app.

12.1.9 Making the output reactive

Now almost all of the pieces are in place to make this graphic reactive. First, I added some options to the function in helper.R to let it input time ranges and only plot the tweets within that range. Next, I need to use the values that the user selects from the slider in the call for plotting the map. To do this, I can use the values passed from the slider bar in the input object into the code in the server.R file. Here is the new code for the server.R file:

library(ggmap)
library(ggplot2)
library(dplyr)
library(lubridate)

source("helper.R")

paris_twitter <- read.csv("data/final_tweets.csv", as.is = TRUE) %>%
        mutate(tag = factor(tag))
paris_twitter$created <- as.POSIXct(paris_twitter$created, tz = "CET")

shinyServer(function(input, output) {
        output$twitter_map <- renderPlot({
                plot_map(start.time = input$time_range[1],
                         end.time = input$time_range[2])
                })
})

The only addition from before is to use the start.time and end.time options in the plot_map function and to set them to the first, [1], and second, [2], values in the time_range slot of the input object. Remember that we chose to label the input from the slider bar time_range when we set up the ui.R file.

This app is saved in the directory App-1 in this week’s directory if you’d like to play around with the code. I’ve deployed it on shinyapps here.

12.1.10 Fancier version

I’ve also created a (much) fancier version of a Shiny App looking at this Twitter data that you can check out here. All the code for that is here.

12.2 htmlWidgets

12.2.1 Overview of htmlWidgets

Very smart people have been working on creating interactive graphics in R for a long time. So far, nothing coded in R has taken off in a big way.

JavaScript has developed a number of interactive graphics libraries that can be for documents viewed in a web browser. There is now a series of R packages that allow you to create plots from these JavaScript libraries from within R.

There is a website with much more on these htmlWidgets at http://www.htmlwidgets.org.

Some of the packages availabe to help you create interactive graphics from R using JavaScript graphics libraries:

  • leaflet: Mapping (we’ll cover this next week)
  • dygraphs: Time series
  • plotly: A variety of plots, including maps
  • rbokeh: A variety of plots, including maps
  • networkD3: Network data
  • d3heatmap: Heatmaps
  • DT: Data tables
  • DiagrammeR: Diagrams and flowcharts

These packages can be used to make some pretty cool interactive visualizations for HTML output from R Markdown or Shiny (you can also render any of theme in RStudio).

There are, however, a few limitations:

  • Written by different people. The different packages have different styles as well as different interfaces. Learning how to use one package may not help you much with other of these packages.
  • Many are still in development, often in early development.

12.2.2 plotly package

From the package documentation:

“Easily translate ggplot2 graphs to an interactive web-based version and / or create custom web-based visualizations directly from R.”

  • Like many of the packages today, draws on functionality external to R, but within a package that allows you to work exclusively within R.
  • Allows you to create interactive graphs from R. Some of the functions extend the ggplot2 code you’ve learned.
  • Interactivity will only work within RStudio or on documents rendered to HTML.

The plotly package allows an interface to let you work with plotly.js code directly using R code.

plotly.js is an open source library for creating interactive graphs in JavaScript. This JavaScript library is built on d3.js (Data-Driven Documents), which is a key driver in interactive web-based data graphics today.

There are two main ways of create plots within plotly:

  • Use one of the functions to create a customized interactive graphic:
    • plot_ly: Workhorse of plotly, renders most non-map types of graphs
    • plot_geo, plot_mapbax: Specific functions for creating plotly maps.
  • Create a ggplot object and then convert it to a plotly object using the ggplotly function.
library(faraway) 
data(worldcup)
library(plotly)
plot_ly(worldcup, type = "scatter", x = ~ Time, y = ~ Shots)

Just like with ggplot2, the mappings you need depend on the type of plot you are creating.

For example, scatterplots (type = "scatter") need x and y defined, while a surface plot (type = "surface") can be created with a single vector of elevation (we’ll see an example in a few slides).

The help file for plot_ly includes a link with more documentation on the types of plots that can be made and the required mappings for each.

plot_ly(worldcup, type = "scatter", x = ~ Time, y = ~ Shots,
        color = ~ Position)

The plotly package is designed so you can pipe data into plot_ly and add elements by piping into add_* functions (this idea is similar to adding elements to a ggplot object with +).

worldcup %>%
  plot_ly(x = ~ Time, y = ~ Shots, color = ~ Position) %>%
  add_markers()

Some of the add_* functions include:

  • add_markers
  • add_lines
  • add_paths
  • add_polygons
  • add_segments
  • add_histogram

If you pipe to the rangeslider function, it allows the viewer to zoom in on part of the x range. (This can be particularly nice for time series.)

You should have a dataset available through your R session named USAccDeaths. This gives a montly county of accidental deaths in the US for 1973 to 1978. This code will plot it and add a range slider on the lower x-axis.

plot_ly(x = time(USAccDeaths), y = USAccDeaths) %>% 
  add_lines() %>%
  rangeslider()

For a 3-D scatterplot, add a mapping to the z variable:

worldcup %>%
  plot_ly(x = ~ Time, y = ~ Shots, z = ~ Passes,
          color = ~ Position, size = I(3)) %>%
  add_markers()

The volcano data comes with R and is in a matrix format. Each value gives the elevation for a particular pair of x- and y-coordinates.

dim(volcano)
## [1] 87 61
volcano[1:4, 1:4]
##      [,1] [,2] [,3] [,4]
## [1,]  100  100  101  101
## [2,]  101  101  102  102
## [3,]  102  102  103  103
## [4,]  103  103  104  104
plot_ly(z = ~ volcano, type = "surface")

Mapping with plotly can build on some data that comes with base R or other packages you’ve likely added (or can add easily, as with the map_data function from ggplot2). For example, we can map state capitals and cities with > 40,000 people using data in the us.cities dataframe in the maps package:

head(maps::us.cities, 3)
##         name country.etc    pop   lat    long capital
## 1 Abilene TX          TX 113888 32.45  -99.74       0
## 2   Akron OH          OH 206634 41.08  -81.52       0
## 3 Alameda CA          CA  70069 37.77 -122.26       0

Here is code you can use to map all of these cities on a US map:

ggplot2::map_data("world", "usa") %>%
   group_by(group) %>% filter(-125 < long & long < -60 &
                                25 < lat & lat < 52) %>%
   plot_ly(x = ~long, y = ~lat) %>%
   add_polygons(hoverinfo = "none") %>%
   add_markers(text = ~paste(name, "<br />", pop), hoverinfo = "text",
               alpha = 0.25,
     data = filter(maps::us.cities, -125 < long & long < -60 &
                     25 < lat & lat < 52)) %>%
   layout(showlegend = FALSE)

You can also make choropleths interactive. Remember that we earlier created a choropleth of US state populations with the following code:

library(choroplethr)
data(df_pop_state)
state_choropleth(df_pop_state)

You can use the following code with plotly to make an interactive choropleth instead:

us_map <- list(scope = 'usa',
  projection = list(type = 'albers usa'),
  lakecolor = toRGB('white'))
plot_geo() %>%
   add_trace(z = df_pop_state$value[df_pop_state$region != 
                                        "district of columbia"],
             text = state.name, locations = state.abb,
             locationmode = 'USA-states') %>%
  add_markers(x = state.center[["x"]], y = state.center[["y"]], 
    size = I(2), symbol = I(8), color = I("white"), hoverinfo = "none") %>%
  layout(geo = us_map)

The other way to create a plotly graph is to first create a ggplot object and then transform it into an interactive graphic using the ggplotly function.

The following code can be used to plot Time versus Shots for the World Cup date in a regular, non-interactive plot:

shots_vs_time <- worldcup %>%
  mutate(Name = rownames(worldcup)) %>%
  filter(Team %in% c("Netherlands", "Germany", "Spain", "Uruguay")) %>%
  ggplot(aes(x = Time, y = Shots, color = Position, group = Name)) + 
  geom_point() + 
  facet_wrap(~ Team)
shots_vs_time

To make the plot interactive, just pass the ggplot object to ggplotly:

ggplotly(shots_vs_time)

With R, not only can you pull things from another website using an API, you can also upload or submit things.

There is a function in the plotly library, plotly_POST, that lets you post a plot you create in R to https://plot.ly.

You need a plot.ly account to do that, but there are free accounts available.

The creator of the R plotly package has written a bookdown book on the package that you can read here. It provides extensive details and examples for using plotly.

Getting Started with D3 by Mike Dewar (a short book on D3 in JavaScript) is available for free here.

12.2.3 rbokeh package

The rbokeh package provides an R interface to a Python interactive visualization library, Bokeh.

There is a website with many more details on using the rbokeh package: https://hafen.github.io/rbokeh/

You can find out more about the original Python library, Bokeh, at http://bokeh.pydata.org/en/latest/.

Here is an example of an interactive scatterplot of the World Cup data made with the rbokeh package:

library(rbokeh)
figure(width = 600, height = 300) %>%
  ly_points(Time, Shots, data = worldcup,
    color = Position, hover = list(Time, Shots))

This package can also be used to create interactive maps. For example, the following dataset has data on Oregon climate stations, including locations:

orstationc <- read.csv("data/orstationc.csv")
head(orstationc, 3)
##   station    lat      lon elev tjan tjul tann pjan pjul pann  idnum
## 1     ANT 44.917 -120.717  846  0.0 20.2  9.6   41    9  322 350197
## 2     ARL 45.717 -120.200   96  0.9 24.6 12.5   40    6  228 350265
## 3     ASH 42.217 -122.717  543  3.1 20.8 11.1   70    7  480 350304
##                  Name
## 1 ANTELOPE 1 N USA-OR
## 2    ARLINGTON USA-OR
## 3  ASHLAND 1 N USA-OR

You can use the following code to create an interactive map of these climate stations:

gmap(lat = 44.1, lng = -120.767, zoom = 5, width = 500, height = 428) %>%
  ly_points(lon, lat, data = orstationc, alpha = 0.8, col = "red",
    hover = c(station, Name, elev, tann))

You can get very creative with this package. The following code comes directly from the help documentation for the package and shows how to use this to create an interactive version of the periodic table:

# prepare data
elements <- subset(elements, !is.na(group))
elements$group <- as.character(elements$group)
elements$period <- as.character(elements$period)

# add colors for groups
metals <- c("alkali metal", "alkaline earth metal", "halogen",
  "metal", "metalloid", "noble gas", "nonmetal", "transition metal")
colors <- c("#a6cee3", "#1f78b4", "#fdbf6f", "#b2df8a", "#33a02c",
  "#bbbb88", "#baa2a6", "#e08e79")
elements$color <- colors[match(elements$metal, metals)]
elements$type <- elements$metal

# make coordinates for labels
elements$symx <- paste(elements$group, ":0.1", sep = "")
elements$numbery <- paste(elements$period, ":0.8", sep = "")
elements$massy <- paste(elements$period, ":0.15", sep = "")
elements$namey <- paste(elements$period, ":0.3", sep = "")

# create figure
p <- figure(title = "Periodic Table", tools = c("resize", "hover"),
  ylim = as.character(c(7:1)), xlim = as.character(1:18),
  xgrid = FALSE, ygrid = FALSE, xlab = "", ylab = "",
  height = 445, width = 800) %>%

# plot rectangles
ly_crect(group, period, data = elements, 0.9, 0.9,
  fill_color = color, line_color = color, fill_alpha = 0.6,
  hover = list(name, atomic.number, type, atomic.mass,
    electronic.configuration)) %>%

# add symbol text
ly_text(symx, period, text = symbol, data = elements,
  font_style = "bold", font_size = "10pt",
  align = "left", baseline = "middle") %>%

# add atomic number text
ly_text(symx, numbery, text = atomic.number, data = elements,
  font_size = "6pt", align = "left", baseline = "middle") %>%

# add name text
ly_text(symx, namey, text = name, data = elements,
  font_size = "4pt", align = "left", baseline = "middle") %>%

# add atomic mass text
ly_text(symx, massy, text = atomic.mass, data = elements,
  font_size = "4pt", align = "left", baseline = "middle")

p

12.2.4 dygraphs package

The dygraphs package lets you create interactive time series plots from R using the dygraphs JavaScript library.

The main function syntax is fairly straightforward. Like many of these packages, it allows piping.

There is a website with more information on using dygraphs available at http://rstudio.github.io/dygraphs/index.html.

For example, here is the code to plot monthly deaths from lung diseases in the UK in the 1970s.

library(dygraphs)
lungDeaths <- cbind(mdeaths, fdeaths)
dygraph(lungDeaths) %>%
  dySeries("mdeaths", label = "Male") %>%
  dySeries("fdeaths", label = "Female")

12.2.5 DT package

The DT package provides a way to create interactive tables in R using the JavaScript DataTables library.

We’ve already seen some examples of this output in some of the Shiny apps I showed last week. You can also use this package to include interactive tables in R Markdown documents you plan to render to HTML.

There is a website with more information on this package at http://rstudio.github.io/DT/.

library(DT)
datatable(worldcup)

12.2.6 Creating your own widget

If you find a JavaScript visualization library and would like to create bindings to R, you can create your own package for a new htmlWidget.

There is advice on creating your own widget for R available at http://www.htmlwidgets.org/develop_intro.html.