1 An Introduction to Leaflet and Shiny in R:

From: http://github.com/SimonGoring/ShinyLeaflet-tutorial

This tutorial will help you get started using both leaflet and shiny in R to do basic data exploration with geospatial data. In this tutorial you will:

  1. Learn briefly about leaflet & shiny
  2. Install the leaflet and Shiny packages
  3. Produce a very basic leaflet map and learn some elementary functions
  4. Download a file from GitHub
  5. Build a simply Shiny app with a slider bar and drop down menu
  6. Explore a dataset.

I’m going to write code here using the dplyr pipe style. The RStudio leaflet tutorial uses the same styling. You don’t need to use it, you would otherwise just keep updating the map object, rather than piping data from one step to another.

2 What is Leaflet?

Leaflet is a JavaScript Library for mobile-friendly mapping. Leaflet is an alternative to OpenLayers, although doesn’t present all the functionality, and to the closed-source Google Maps API. Leaflet can provide functionality for web mapping, but can, in this instance, provide a flexible platform for data exploration.

Leaflet is very widely used, in part because of its flexibility and because of the broad coalition of users & developers. There are a large number of base layers available for mapping, it is modular, it supports interactive mapping, multiple geospatial data formats, and it’s fairly easy & intuitive to use.

This workshop will focus on using the leaflet package for R, but wrappers also exist for Python - folium, and leaflet is natively developed in JavaScript. Throughout this tutorial I’ll use the lowercase leaflet to refer to the R package and the uppercase Leaflet for the JavaScript module.

Leaflet provides a set of tools for easily visualizing geospatial data, and also for navigating and investigating individual records. First, a brief introduction to Leaflet.

2.1 Installing leaflet and Shiny

The R leaflet package has been developed by Joe Cheng with RStudio. The package itself lives on CRAN and GitHub, so getting it is fairly straightforward:

install.packages("leaflet")

# or #

devtools::install_github("rstudio/leaflet")

Since we’re going to be going a bit further with our tutorial (at the end), let’s also install shiny.

install.packages("shiny")

# or #

devtools::install_github("rstudio/shiny")

Fantastic. It was pretty easy to get started. Now all you need to do is write all the code. Freaking out? Don’t worry, the next section of this Tutorial will lead you through the next steps. At the end you’ll say to yourself “Self, you’re amazing, look at all you accomplished at the end of this tutorial!”, you’ll look off into the distance, and then get on with your, now bigger, now more fulfilling, life.

3 A Basic Leaflet Map

3.1 Some Terminology

Leaflet runs on “widgets”. The widget is the thing you interact with. When we think of a fully developed leaflet object in R or any other platform, like the one above, we are thinking of a container that is filled with things. A widget is a box, and it is a box that can contain:

  1. Navigation controls & decorations
  2. Underlying map tiles (a street map or topographic map)
  3. Popups or markers
  4. Geometric shapes (circles, lines, rectangles, polygons)

A leaflet widget is built piece by piece. We start by creating the widget, the box for all the other content we want to fill in. Once the box is made we can add pieces to it. We might want a box with a fixed street map, or without any underlying base layer. We might want all of our map points to generate popups, or we might only want to add polygons representing protected areas. Maybe we want to show networks of relatedness for genes, or degrees of similarity between the flavors served at ice cream parlors.

Regardless, in all cases we need to initialize our widget. We do that with leaflet().

leaflet() accepts a number of parameters, width, height and padding are parameters you can use to change the appearances of your leaflet object (all in pixels). You can also pass in data directly to the leaflet map using the data parameter. In practice I tend not to do this, prefering to pass in data directly using the set of add*() functions.

leaflet(width = 400, height = 200)

3.2 Making a Map

So, we start with an empty widget. If we want to do more with our widget, or make it more interactive later, it makes sense to assign it to a variable, and then add to it. We need to explicitly call the variable to make the map appear. We can add a baselayer with the addTiles() function, we’ll see a bit more about it later, and we can make sure that the initial view is reasonable. Using setView we can center the map using the lat/lng parameters and the zoom. There are a set of options that you could add, defined in the Leaflet documentation (as opposed to the leaflet documentation)

map <- leaflet(width = 400, height = 400)
map <- addTiles(map)
map <- setView(map, lng = -123.251,
               lat = 49.263,
               zoom = 6)

# Show the map:
map

Tiles are the baselayers for most web-mapping applications. They’re literally png tiles, at various zoom levels for a grid of latitude and longitude coordinates. There are an enormous number of publically available tile layers for you to use. You can check out Leaflet Providers for a pretty exhaustive list. For one of our webmapping exercises I used a smaller number of options. You can play with them by copying & running the code block. If you change the index of maptypes you can see what some different map types look like.

maptypes <- c("MapQuestOpen.Aerial",
               "Stamen.TerrainBackground",
               "Esri.WorldImagery",
               "OpenStreetMap",
               "Stamen.Watercolor")

# Change the index to see what the different tile-sets look like:
# Now we're into the magrittr formatting.  We're using "Provider" tiles here,
# not the default (OpenStreetMap) Tiles:

map <- leaflet() %>% 
  addProviderTiles(maptypes[1])

It’s possible to stack tiles as well, for example, adding OpenWeatherMap.Clouds to the map abovewould give us a pretty map with some semblance of weather. Because we’re in Vancouver we want to see the rain, and we want to see it close up. We could just look out the window, or we could setView to a fixed location, so that the widget knows where exactly we want to start (higher numbers for zooms mean more zoomed in):

map <- leaflet() %>% 
  addProviderTiles("Stamen.Watercolor") %>% 
  addProviderTiles("OpenWeatherMap.Rain") %>% 
  setView(lng = -123.251,
          lat = 49.263,
          zoom = 6)

Provider tiles ultimately come from unique web providers. The speed and reliability of a tile set is dependent both on your own platform and on the platform of the provider.

3.3 Adding Markers

Once a map is made we often want to add some sort of marker or object on top of the map. We could add someting like a marker, or a line. These functions all follow the add*() format, including addMarkers, addPopups, addCircleMarkers, addCircles and others.

Here we’re going to add a Marker and a CircleMarker. Markers are pretty useful, the “points” of the leaflet universe. We can modify the Marker further using its options parameter. The options accept a set of different markerOptions that allow you to drag, click, and hover over the markers:

map <- leaflet() %>% 
  addProviderTiles("Esri.WorldImagery") %>% 
  addMarkers(lng = -123.251,
                    lat = 49.263, 
                    popup = "You are here.",
             options = markerOptions(draggable = TRUE, riseOnHover = TRUE)) %>% 
  addCircleMarkers(lng = -123.261,
                    lat = 49.273, 
                    popup = "You aren't here.",
                   fillColor= "red", opacity = 1,
             options = markerOptions(draggable = FALSE, title = "Whoops")) %>% 
  setView(lng = -123.251,
          lat = 49.263,
          zoom = 13)

Most of these options are more fully described in the markerOptions help, or the markers documentation for Leaflet. The pointy marker above can be dragged around willy-nilly but the circle can’t. Try to switch the settings around here to drag the circle marker and leave the pointy marker fixed.

3.4 Getting Some Data

In February 2015 the University of Wisconsin’s excellent Cartography Lab held a Cartographic Design Challenge to take paleoecological data, records of fossil pollen, and turn them into map products that could tell a story through cartographic design. As part of the Scott Farley created a large csv file from the Neotoma Paleoecological Database. The data was hosted on the Design Challenge’s [GitHub page], but we’re just going to take the pollen data:

# Note, this file is 4MB, so it might take some time to download.  It comes from:
# https://raw.github.com/scottsfarley93/DesignChallengeData/master/plants_combined/all_pollen_final.csv
# but I've downloaded it and added it to the `data` folder:
pollen_data <- read.csv("data/pollen_data.csv", stringsAsFactors = FALSE)

# Everything gets loaded as a character string.
str(pollen_data)
## 'data.frame':    42553 obs. of  12 variables:
##  $ SiteName  : chr  "Devon Island Glacier" "Pittsburg Basin" "Portage Lake" "Portage Lake" ...
##  $ Latitude  : chr  "75.35" "38.904167" "47.081" "47.081" ...
##  $ Longitude : chr  "-82.5" "-89.1875" "-94.113" "-94.113" ...
##  $ ageYounger: chr  "" "" "" "" ...
##  $ AgeOlder  : chr  "" "" "" "" ...
##  $ Age       : chr  "2023" "" "423" "6807" ...
##  $ AgeType   : chr  "Radiocarbon years BP" "Radiocarbon years BP" "Radiocarbon years BP" "Radiocarbon years BP" ...
##  $ Taxon     : chr  "XANTHIUM" "XANTHIUM" "XANTHIUM" "XANTHIUM" ...
##  $ Value     : chr  "2" "2" "1" "1" ...
##  $ Pct       : chr  "4.166666667" "0.25" "0.284900285" "0.226244344" ...
##  $ Depth     : chr  "21950" "577" "35" "640" ...
##  $ Handle    : chr  "DEVON4" "PBBVANDA" "PORTAGE" "PORTAGE" ...

3.4.1 Very Basic Plotting

The table is pretty big, there’s 1110 unique sites with pollen and 84 pollen taxa. In paleoecological analysis we use the presence and proportion of various pollen types (e.g., Pinus or Abies) as a proxy for the plant that produces that pollen. We would assume that a sample containing high proportions of Pinus and Picea pollen would have high proportions of Pine and Spruce on the landscape, while a site with lots of Poaceae and Quercus pollen might represent an open oak savanna. Paleoecologists are often interested in patterns of change at broad spatial scales, so it would be interesting to see how things have changes in North America over the last 15,000 years.

To do this, let’s shrink the dataset down a bit:

  1. For varous reasons it’s worth being dubious about dates that are in “Radiocarbon years BP”, so let’s get rid of them.
  2. Let’s also get rid of any taxon that never has a proportion greater than 5%.
  3. There’s some columns that are extraneous. We really only need the SiteName, Latitude, Longitude, Age, Taxon & Pct.
  4. Lastly, let’s just look at samples between 15000 and -60 years BP. (1950 is 14C year zero because of atmospheric atomic bomb testing).
library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
# Get all taxa with proportions greater than 5%:
good_taxa <- pollen_data %>% 
  group_by(Taxon) %>% 
  summarise(max = max(Pct)) %>% 
  filter(max > 5) %>% 
  select(Taxon) %>% unlist

# Now subset the data and remove extraneous columns:
pollen_subset <- dplyr::filter(pollen_data, 
                               !AgeType %in% "Radiocarbon years BP" & Taxon %in% good_taxa) %>% 
                    select(SiteName, Latitude, Longitude, Age, Taxon, Pct) %>% 
  mutate_each(funs(as.numeric), Latitude, Longitude, Age, Pct) %>% na.omit

plot(Latitude ~ Longitude, 
     data = pollen_subset[!duplicated(pollen_subset$SiteName),])

So, plotting the unique points here is helpful. We can see some patterns, at the very least, we know where points are and where they aren’t. If we want to look at individual taxa, or individual time slices then we have to keep going back to the plot function. We can get a better handle on some other aspects of the data if we add these points to the map as markers. We have 13581 individual records, one for each pollen taxon at each site at each sampled time period. This is still a lot.

3.4.2 Plotting Data With Leaflet

If we plotted everything at once the computer would be unhappy. So we have two options. We could subset the data further using a call like p_small <- pollen_subset %>% filter(!duplicated(SiteName)) or we could aggregate the Markers into spatial clusters. We’re going to do both using the clusterOptions for the function addMarkers:

# If you add all the markers to the map at once your computer is going to be very unhappy :)
# This just adds each unique point to the dataset.

p_small <- pollen_subset %>% filter(!duplicated(SiteName))

# We're adding the popup here so I can click & see what sites I'm looking at.

map <- leaflet() %>% 
  addProviderTiles("Stamen.Watercolor") %>% 
  addMarkers(lat = p_small$Latitude, 
             lng = p_small$Longitude,
             clusterOptions = markerClusterOptions(),
             popup = as.character(p_small$SiteName))

map

Okay. This is as far as we’re going to go with the basic plotting. Let’s switch into Shiny to see how we can make this more interactive.

3.5 Adding Shiny to the Mix

One of the reasons this is so messy still is that we have unique Markers for each pollen taxon at each site, and at each time interval. This means that some sites have upwards of 300 markers at the same lat/long coordinates. That’s not really going to show me much. I need a way to navigate interactively through the taxa and time periods. That’s where Shiny comes in.

3.5.1 A Basic Shiny Setup

So, if we want to interactively switch between taxa or time periods, we need to add some controls. To do this we can move over to a Shiny App. At a bare bones level the structure is really straightforward. To build a Shiny app we need a user interface usually as a file called ui.R and code that is going to run our analysis on the server, called server.R usually. If we’re being very concise we can wrap everything into a function called shinyApp.

NOTE: This is a static webpage, not hosted on a Shiny Server. To properly run this document you need to uncomment a line in the YAML header, as indicated there. Neither this block, nor the next will render on this webpage.

#install.packages('shiny')
library(shiny)

shinyApp(ui = fluidPage(titlePanel("So Simple"),
                        sliderInput("slider", "Turn It Up", min = 0, max = 11, value = 5),
                        mainPanel(textOutput("MainPanel"))),
         server = function(input, output){
           output$MainPanel = renderText(input$slider)
         }
           )
Shiny applications not supported in static R Markdown documents

If we want shiny to accept inputs we need to define the type of inputs as part of the ui. In the example above we used a sliderInput, but you can have a checkboxInput (or checkboxGroupInput), dateInput, dateRangeInput, fileInput (for file uploads!), and all sorts of other inputs. You can use HTML formatting, add tool tips, go as crazy as you want.

If you want to add tool tips to your inputs (which is a nice thing to do for users), you could wrap each of your *Input functions with:

tags$div(title = "This is the tool tip.",
               selectInput( . . . ))

There’s also a useful package called shinydashboard that I’ve found helpful for making the apps look a bit prettier, and you can always embed a shiny app in a different webpage using an HTML iframe.

3.5.2 A Full Shiny/Leaflet Example

The code sample below can be divided into five sections:

  1. First, we put the pollen data into a global variable at the header.
  2. We wrap the whole Shiny App into a single function (this is for convenience and to simplify the example)
  3. The UI defines a fluidPage as above, this time with a slider for time and a selectInput with the unique pollen taxa (using scientific genera for the most part). The last component of the UI is the leafletOutput, which will look for an output, which is a leaflet widget named "MapPlot1" to render.
  4. The server section starts by rendering the leaflet widget, with the Stamen Watercolor tile set and setting it to output$MapPlot1. You can use whatever tileset you want here. The observe function is a reactive, but it’s a special case (you can read more about reactive & observe on the RStudio blog) that returns a rendered objct, not a list of values. In this case, the rendered object is the leaflet widget "MapPlot1", updated by first clearing any makers added using addMarkers, and then putting in new markers based on a subset of pollen data from a 500 year time band around the time defined in the sliderInput "time" and the selectInput choice "taxon".
  5. Finally, because the whole thing was rendering in a funny way I set the height option to 600 pixels so that everything would look nice.
pollen_subset <- dplyr::filter(pollen_data, Taxon %in% good_taxa) %>% 
                    select(SiteName, Latitude, Longitude, Age, Taxon, Pct) %>% 
  mutate_each(funs(as.numeric), Latitude, Longitude, Age, Pct) %>% na.omit

library(leaflet)
library(shiny)

shinyApp(
  ui = fluidPage(
    sliderInput(inputId = "time", 
                label = "Years Before Present:", 
                min = -50, max = 15000, value = 0, step = 500),
    tags$div(title = "This input has a tool tip",
             selectInput(inputId = "taxon", 
                label = "Taxon of Interest", 
                choices = sort(unique(pollen_subset$Taxon)))),
    leafletOutput("MapPlot1")
  ),
  
  server = function(input, output) {
    
    output$MapPlot1 <- renderLeaflet({
     leaflet() %>% 
       addProviderTiles("Stamen.Watercolor") %>% 
        setView(lng = -100, lat = 50, zoom = 2)
    })
    
    observe({
      
      age <- input$time
      taxon <- input$taxon
      
      sites <- pollen_subset %>% 
        filter(findInterval(pollen_subset$Age, c(age - 250, age + 250)) == 1 &
                            pollen_subset$Taxon %in% taxon)
      
      leafletProxy("MapPlot1") %>% clearMarkers() %>% 
        addCircleMarkers(lng = sites$Longitude,
                  lat = sites$Latitude,
                  opacity = sites$Pct)
    })
  },
  options = list(height = 600)
)
Shiny applications not supported in static R Markdown documents

Now you I explore my data easily. I can slide through time and see if there are sites that show up where they shouldn’t, or sites I know are there that don’t show up at all. I could make this even more interactive if I used addPopups, since the API for the Neotoma Paleoecological Database allows us to directly open the Neotoma Data Explorer from a URI, for example http://api.neotomadb.org/v1/data/publications?datasetid=1001. So I could generate HTML code within the popup that would bring me right to the explorer to look at the data more closely, or link the the publication DOI, or anything else I want to.

As a quick pitch (for paleo folks) EarthCube’s Cyber4Paleo Research Coordination Network is hosting a hackathon for the Neotoma Paleoecological Database and the Paleobiology Database in Boulder, CO. If you’re interested it’s worth checking out this link.

Obviously we can make some changes to the code. In general I start splitting my server and ui components up into two pieces to make them more manegable. You can customize your display more using fluidRows and the sidebarLayout, but in some senses, data exploration may not need all the bells and whistles, as long as you can interactively manipulate the variable elements you’re interested in seeing.