<html>
<head>
  <link rel="stylesheet" type="text/css" href="dl-record.css"/>
</head>
<body>
  <h1>Recording and Replaying the Graphics Engine Display List</h1>
  <p class="author">
    by Paul Murrell, Jeroen Ooms, and JJ Allaire
  </p>
  <p class="date">
<!--begin.rcode echo=FALSE, results="asis"
cat(format(Sys.Date(), "%A %d %B %Y"))
end.rcode-->    
  </p>

<!--begin.rcode echo=FALSE, results="hide"
opts_chunk$set(prompt=TRUE, comment=NA)

codeprep <- function(x) {
    y <- gsub("(#.+)", '<span class="comment">\\1</span>', x)
    paste("  ", y)
}

source("Rsession.R")
end.rcode-->

  <a name="overview"/>
  <h2>Overview</h2>
  <p>
    In the development version of R (to become R 3.3.0),
    it is now possible (again) to record the graphics engine display list
    in one R session and replay it in another R session.
  </p>
  <p>
    The current situation (in R 3.2.2) is demonstrated below.
    If we record a plot with <c>recordPlot()</c>, 
    save it to disk with <c>saveRDS()</c>, then quit R ... 
  </p>
<!--begin.rcode echo=FALSE, results="asis"
Rsession(R, "R-record-plot", '
image(matrix(runif(16), ncol=4), col=hcl(240, 60, 10*1:8))
rp <- recordPlot()
saveRDS(rp, "R-recordplot.rds")
')
end.rcode-->

  <p>
    ... then start a new R session and attempt
    to load the plot with <c>readRDS()</c>, 
    and replay the recorded plot with
    <c>replayPlot()</c>, the result is an error 
    (and no plot is produced).
  </p>

<!--begin.rcode echo=FALSE, results="asis"
Rsession(R, "R-replay-plot", '
rp <- readRDS("R-recordplot.rds")
replayPlot(rp)
', 
         eval=TRUE)
end.rcode-->

  <p>
    In the development version of R, recording a plot in one session ...
  </p>
<!--begin.rcode echo=FALSE, results="asis"
Rsession(Rdevel, "Rdevel-record-plot", '
image(matrix(runif(16), ncol=4), col=hcl(240, 60, 10*1:8))
rp <- recordPlot()
saveRDS(rp, "Rdevel-recordplot.rds")
')
end.rcode-->

  <p>
    ... then replaying it in a new R session now works fine.
  </p>

<!--begin.rcode echo=FALSE, results="asis"
Rsession(Rdevel, "Rdevel-replay-plot", '
rp <- readRDS("Rdevel-recordplot.rds")
replayPlot(rp)
')
end.rcode-->

  <p>
    The graphics engine display list in R was originally created to support
    redrawing of the on-screen graphics window when it was resized
    and to support copying between graphics devices (with <c>dev.copy()</c>).  
    When we draw a plot in R, low-level drawing operations
    are recorded on the display list and, when a graphics window is
    resized, or we copy a plot from one device to another,
    those drawing operations are replayed to reproduce the
    plot.
  </p>
  <p>
    The <c>recordPlot()</c> function was added later.  This takes a 
    copy of the display list, as a "recordedplot" object,
    and the <c>replayPlot()</c> function
    can be used to redraw the "recordedplot" object.
    This means that we can keep copies of more than
    just the current plot, which
    allows things like the "Plot History" feature
    in the RGui on Windows:  we can record all of the plots drawn
    on a device and browse backwards and forwards between them.
  </p>
  <p>
    Because <c>recordPlot()</c> returns the
    display list as an R object, we can also save a "recordedplot"
    to disk, for example, with <c>saveRDS()</c>.
    This means that we can save a plot from one R session and
    replay it in another (with <c>readRDS()</c> and <c>replayPlot()</c>).
  </p>
  <p>
    Recording a "recordedplot" in one R session
    and replaying it in another R session is actually a bit dangerous
    (for reasons we will come to later),
    but of course, because it was possible and in some cases quite useful, 
    people started doing it anyway.
  </p>
  <p>
    Then in R version 3.0.0, replaying "recordedplot"s from a different
    R session went from being dangerous to being outlawed.
    If you try this sort of thing in R, between version 3.0.0 and
    3.2.3, you just get an error
    message, as demonstrated in the opening section of this document.
  </p>
  <p>
    So now we were in a situation where
    there were a number of R projects and packages that depended on
    a feature that no longer worked.  
    Naturally, that produced a black market in "recordedplot"s;
    ingenious coders simply worked around the ban and carried on.
    But having projects and packages depending upon a feature that
    was both <em>dangerous</em> and
    <em>unsupported</em> was not ideal.
  </p>
  <p>
    For R 3.3.0, the workaround has been incorporated into R
    and additional support and defences have been added so that
    replaying a "recordedplot" from a different R session will be
    both <em>supported</em> and <em>not as dangerous as it was</em>.
  </p>
  <p>
    In the remainder of this document, we explore why this sort of
    recording and replaying of plots is useful, why it was broken, how it
    has been fixed, and what you can now do with the fixed system.
  </p>

  <a name="useful"/>
  <h2>Why reloading a "recordedplot" is useful</h2>
  <p>
    There are many ways to save R graphics output, so what is the
    benefit of using a "recordedplot"?
  </p>
  <ul>
    <li>
      Exporting R graphics to a file format such as PNG, PDF, or SVG
      is a one-way trip; we cannot get the graphics back into R.
    </li>
    <li>
      Saving an R object from 'lattice' or 'ggplot2' does allow reloading
      back into R, but only works for
      plots produced by 'lattice' or 'ggplot2'.
    </li>
    <li>
      Saving the R code that produced the plot allows us to run it again
      in R, but this is much slower than redrawing a "recordedplot",
      especially if the R code includes calculations to produce the
      data for plotting.
    </li>
  </ul>
  <p>
    This still leaves the question of why we might want to save 
    plots to disk and reload them, possibly in another R session.
    Examples of the use of "recordedplot"s include:
  </p>
  <ul>
    <li>
      <a href="https://www.rstudio.com/">RStudio</a>
      saves plot histories to disk rather than keeping them
      in memory.  This can involve multiple R sessions when using
      RStudio's "Build and Reload" feature for package development
      and when RStudio Server temporarily suspends then resumes
      an inactive R session.
    </li>
    <li>
      The <a href="http://yihui.name/knitr/">'knitr' package</a> 
      uses "recordedplot"s to record plot output
      from code chunks, which makes it more efficient
      to produce multiple formats for each plot, and can involve
      multiple sessions when caching is involved.
    </li>
    <li>
      The <a href="https://www.opencpu.org/">OpenCPU</a> 
      project provides an HTTP API that separates plot generation 
      from plot rendering.  A "recordedplot" provides
      a format-independent record of a plot that can be efficiently
      rendered to PNG, PDF, or SVG on request.
    </li>
  </ul>

  <a name="detail"/>
  <h2>The display list in more detail</h2>
  <p>
    This section describes the graphics engine display list in more detail,
    which will help to explain some of the problems and solutions that are
    described in later sections.
  </p>
  <p>
    There are three parts to the display list:  a set of low-level
    graphics operations, state information for the 'graphics' package,
    and state information for the 'grid' package.  
  </p>
  <p>
    The following code
    shows an example of the first low-level graphics operation for 
    an empty plot (the call to <c>dev.control()</c> ensures that the
    display list is on;  it is off by default for off-screen graphics
    devices).  This example shows that the low-level graphics operations that
    are recorded on the display list are essentially calls to C functions
    (in this case, <c>C_plot_new()</c>).  Note that part of the information
    that is recorded is a pointer to the memory address for the C function;
    this will be important later.
  </p>
<!--begin.rcode echo=FALSE, results="asis"
Rsession(R, "R-low-level-ops", '
dev.control("enable")
plot.new()
rp <- recordPlot()
rp[[1]][[1]]
',
         eval=TRUE, fig=FALSE)
end.rcode-->
  <p>
    The following code provides a glimpse at the state information
    for the 'graphics' package.  The main point here is that, as far
    as the "recordedplot" is concerned, the graphics state is just an
    opaque block of bytes.  Only the internal C code for the graphics
    engine has any idea what these bytes mean.  This opacity will be
    important later.
  </p>
<!--begin.rcode echo=FALSE, results="asis"
Rsession(R, "R-graphics-state", '
dev.control("enable")
plot.new()
rp <- recordPlot()
head(rp[[2]], 100)
',
         eval=TRUE, fig=FALSE)
end.rcode-->
  <p>
    The following code provides a glimpse at the state information
    for the 'grid' package (note that this code draws something with
    'grid').  The main point here is that this 
    value is <c>NULL</c>, i.e., there is no state information recorded
    for the 'grid' package.  This lack of information will be important 
    later.
  </p>
<!--begin.rcode echo=FALSE, results="asis"
Rsession(R, "R-grid-state", '
dev.control("enable")
library(grid)
grid.text("test")
rp <- recordPlot()
rp[[3]]
',
         eval=TRUE, fig=FALSE)
end.rcode-->

  <a name="outlawed"/>
  <h2>Why reloading a "recordedplot" was outlawed</h2>
  <p>
    Part of the information stored about a low-level graphics operation
    on the display list is a pointer to a memory address for a
    C function.  The code below focuses on just that memory address
    for our empty plot example.
  </p>
<!--begin.rcode echo=FALSE, results="asis"
Rsession(R, "R-low-level-ops", '
dev.control("enable")
plot.new()
rp <- recordPlot()
rp[[1]][[1]][[2]][[1]]$address
',
         eval=TRUE, fig=FALSE)
end.rcode-->
  <p>
    This information, a pointer to a memory address, is transient;  it
    is only meaningful for a single R session.  For example, the 
    code below shows exactly the same information for exactly the same 
    empty plot, but in a different R session,
    to show that the memory address for the C function is different.
  </p>
<!--begin.rcode echo=FALSE, results="asis"
Rsession(R, "R-low-level-ops", '
dev.control("enable")
plot.new()
rp <- recordPlot()
rp[[1]][[1]][[2]][[1]]$address
',
         eval=TRUE, fig=FALSE)
end.rcode-->
  <p>    
    Because of the transience of these pointers, when they are 
    saved to disk ("serialised"), they are set to <c>NULL</c>.
    The following code shows one of the memory addresses in
    a "recordedplot" from an earlier 
    example to show that the pointer has been erased.
  </p>
<!--begin.rcode echo=FALSE, results="asis"
Rsession(R, "R-graphics-ops-reload", '
rp <- readRDS("R-recordplot.rds")
rp[[1]][[1]][[2]][[1]]$address
',
         eval=TRUE, fig=FALSE)
end.rcode-->
  <p>
    This is why loading "recordedplot"s from a different R session
    was outlawed in R 3.0.0:  a saved "recordedplot"
    contains <c>NULL</c>ed 
    pointers to C functions, because the C pointers from a previous
    R session are meaningless in the current R session, so the
    recorded operations cannot be called.
  </p>

  <a name="enabled"/>
  <h2>Why reloading a "recordedplot" is allowed again</h2>
  <p>
    The "trick" to allowing a "recordedplot" to be loaded in a 
    different R session is based on the fact that it is possible,
    given the name of a C function and the name of the R package
    that the C function comes from, to determine the memory address for
    that C function in the current R session. 
  </p>
  <p>
    The following code gives a reminder that a "recordedplot" contains both the
    name of the C function for a low-level graphics operation and
    the name of the package for that C function (and that the 
    pointer to the C function is <c>NULL</c>).
  </p>
<!--begin.rcode echo=FALSE, results="asis"
Rsession(R, "R-graphics-ops-reload", '
rp <- readRDS("R-recordplot.rds")
rp[[1]][[1]][[2]][[1]]
',
         eval=TRUE, fig=FALSE)
end.rcode-->
  <p>
    The following code shows, rather opaquely, that the C function name
    and the package name can be used to fill in the information about
    the memory address for the C function (in the current R session).
  </p>
<!--begin.rcode echo=FALSE, results="asis"
Rsession(R, "R-graphics-ops-reload", '
rp <- readRDS("R-recordplot.rds")
dllName <- rp[[1]][[1]][[2]][[1]]$package[["name"]]
pkgDLL <- getLoadedDLLs()[[dllName]]
getNativeSymbolInfo(rp[[1]][[1]][[2]][[1]]$name, 
                    PACKAGE=pkgDLL)',
         eval=TRUE, fig=FALSE)
end.rcode-->
  <p>
    With that pointer to memory restored, the "recordedplot" can
    be used in the current R session, despite being recorded and saved in
    a different R session.  This is the main reason why 
    loading "recordedplot"s will work again for R version 3.3.0.
  </p>
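  <p>
    As a rough illustration, the restoration step shown above could be
    applied to every low-level operation on the display list with a loop
    like the one below.  This is only a sketch:  the function name
    <c>restoreSymbols()</c> is hypothetical, the code assumes the list
    layout shown in the earlier examples, the fallback to a <c>dll</c>
    component is a defensive assumption for R versions that do not use the
    name <c>package</c>, and the relevant packages must already be loaded.
    It is not the code that R itself runs.
  </p>
<!--begin.rcode eval=FALSE
# Hypothetical helper: refresh the C function reference in every
# low-level graphics operation of a reloaded "recordedplot"
restoreSymbols <- function(rp) {
    for (i in seq_along(rp[[1]])) {
        sym <- rp[[1]][[i]][[2]][[1]]
        # Only operations that call C code carry a native symbol
        if (inherits(sym, "NativeSymbolInfo")) {
            # Assumption: the DLL is described by 'package'
            # (as in the example above) or, failing that, by 'dll'
            dll <- if (!is.null(sym$package)) sym$package else sym$dll
            pkgDLL <- getLoadedDLLs()[[dll[["name"]]]]
            # Look the C function up again in the current session
            rp[[1]][[i]][[2]][[1]] <-
                getNativeSymbolInfo(sym$name, PACKAGE=pkgDLL,
                                    withRegistrationInfo=TRUE)
        }
    }
    rp
}
end.rcode-->
  <p>
    A reloaded plot would then be passed through the helper before any
    attempt to replay it, for example
    <c>restoreSymbols(readRDS("R-recordplot.rds"))</c>.
  </p>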

  <a name="danger"/>
  <h2>Why reloading a "recordedplot" was always dangerous</h2>
  <p>
    Even before reloading was outlawed, a "recordedplot" was not considered
    a valid long-term storage format for an R plot because
    the content of the R display list can (and does) change format 
    between R versions.
  </p>
  <p>
    The following sample sessions show an example, focusing on the
    'graphics' state information part of the display list.
    In R version 2.15.3, this state information was 3864 bytes in size ...
  </p>
<!--begin.rcode echo=FALSE, results="asis"
oldRsession(R2.15.3, "R2.15.3-graphics-state", '
dev.control("enable")
plot.new()
rp <- recordPlot()
length(rp[[2]])
')
end.rcode-->
  <p>
    ... but by R version 3.0.0, the state information had grown to 
    35992 bytes.  If we saved a "recordedplot" from R version 2.15.3
    and tried to replay it in R version 3.0.0, there would be a <em>lot</em>
    of missing 'graphics' state information.  Because this information
    is an opaque series of bytes, it is difficult to interpret or cope with
    any differences in size.
  </p>
<!--begin.rcode echo=FALSE, results="asis"
oldRsession(R3.0.0, "R3.0.0-graphics-state", '
dev.control("enable")
plot.new()
rp <- recordPlot()
length(rp[[2]])
')
end.rcode-->
  <p>
    Another example of changes in the display list format occurred 
    in the recording of low-level graphics operations.
    The following code shows that the first low-level graphics operation
    in R 2.15.3 is very different from what is recorded 
    from R 3.0.0 onwards.
  </p>
<!--begin.rcode echo=FALSE, results="asis"
oldRsession(R2.15.3, "R2.15.3-graphics-ops", '
dev.control("enable")
plot.new()
rp <- recordPlot()
rp[[1]][[1]]
')
end.rcode-->
  <p>
    More subtle incompatibilities are also possible.  For example,
    a "recordedplot" from one R version could contain a low-level
    graphics operation that does not exist in another R version.
    More subtle still, a "recordedplot" from one R version could
    include arguments for a low-level graphics operation that are
    incompatible with the argument list for the same low-level
    graphics operation in a different R version.
  </p>

  <a name="safer"/>
  <h2>Why reloading a "recordedplot" is now safer</h2>
  <p>
    In addition to re-enabling support for "recordedplot"s,
    the development version of R contains some extra defences
    against the problems that can arise.
  </p>
  <ul>
    <li>
      The R version is now recorded as part of a "recordedplot" and
      a warning is issued 
      if the "recordedplot" R version does not match the R version
      of the session attempting to replay the display list.
    </li>
    <li>
      The number of arguments to each low-level graphics operation is
      checked and an error is generated if that number does not
      match the expected number of arguments in the R session
      attempting to replay the display list. 
    </li>
    <li>
      The size (in bytes) of the graphics state information is 
      checked and an error is generated if that size does not match
      the size of graphics state information in the R session
      attempting to replay the display list.
    </li>
  </ul>
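  <p>
    The idea behind the size check can be imitated at the R level by
    comparing the length of the 'graphics' state in a "recordedplot"
    with the length recorded for a fresh empty plot in the current session.
    The sketch below is only an illustration:  the function name
    <c>sameStateSize()</c> is hypothetical, the real check is performed
    by R's internal code, and the sketch assumes that the second component
    of a "recordedplot" is the 'graphics' state, as in the examples earlier
    in this document.
  </p>
<!--begin.rcode eval=FALSE
# Hypothetical helper: does the 'graphics' state in 'rp' have the
# size that the current R session would record?
sameStateSize <- function(rp) {
    old <- dev.cur()
    pdf(NULL)                  # off-screen device, no file is written
    on.exit({
        dev.off()
        if (old > 1) dev.set(old)   # return to the previous device
    })
    dev.control("enable")      # display list is off by default here
    plot.new()
    length(rp[[2]]) == length(recordPlot()[[2]])
}
end.rcode-->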

  <a name="broken"/>
  <h2>Why reloading a "recordedplot" may still not work</h2>
  <p> 
    Even though it is possible to reload a "recordedplot" from a different
    R session, there are situations where the plot will not be redrawn
    correctly.
    The following code shows an example where a 'ggplot2'
    plot is recorded in one R session and then replayed in another 
    R session, with an undesirable result (this example uses
    'ggplot2' version 1.0.1).
  </p>
<!--begin.rcode echo=FALSE, results="hide", message=FALSE
# Ensure that this demo uses correct 'ggplot2' version
system(paste0(Rdevel, " -e 'install.packages(\"ggplot2_1.0.1.tar.gz\", repos=NULL)'"))
end.rcode-->
<!--begin.rcode echo=FALSE, results="asis"
Rsession(Rdevel, "Rdevel-record-ggplot2", '
dev.control("enable")
library(ggplot2)
df <- expand.grid(x=1:4, y=1:4)
df$z <- runif(16)
p <- ggplot(df) + geom_tile(aes(x=x, y=y, fill=z))
print(p)
rp <- recordPlot()
saveRDS(rp, "Rdevel-ggplot2.rds")
')
end.rcode-->
<!--begin.rcode echo=FALSE, results="asis"
Rsession(Rdevel, "Rdevel-replay-ggplot2", '
rp <- readRDS("Rdevel-ggplot2.rds")
replayPlot(rp)
')
end.rcode-->
<!--begin.rcode echo=FALSE, results="hide", message=FALSE
# Revert to default 'ggplot2' version
system(paste0(Rdevel, " -e 'install.packages(\"ggplot2\", repos=\"http://cran.stat.auckland.ac.nz\")'"))
end.rcode-->
  <p>
    The underlying problem here is that redrawing the plot requires
    functions that were present in the R session when the recording
    was performed, but are not present in the R session when the
    "recordedplot" is replayed.  
  </p>
  <p>
    The following code demonstrates a simplified (and more extreme) 
    version of the 
    problem: we assign a value to 'x', then record drawing on the 
    display list that makes use of 'x', but 'x' itself is not recorded on
    the display list ...
  </p>
<!--begin.rcode echo=FALSE, results="asis"
Rsession(Rdevel, "Rdevel-record-x", '
dev.control("enable")
x <- runif(16)
recordGraphics(image(matrix(x, ncol=4), 
                     col=hcl(240, 60, 10*1:8)),
               list(), getNamespace("graphics"))
rp <- recordPlot()
saveRDS(rp, "Rdevel-record-x.rds")
')
end.rcode-->
  <p>
    ... then when we go to replay the "recordedplot" in a new R session,
    the display list contains code that relies on 'x', but 'x' is not 
    defined, so the plotting fails.
  </p>
<!--begin.rcode echo=FALSE, results="asis"
Rsession(Rdevel, "Rdevel-replay-x", '
rp <- readRDS("Rdevel-record-x.rds")
replayPlot(rp)
',
         eval=TRUE, fig=FALSE)
end.rcode-->
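  <p>
    For this simplified case, one possible way to avoid the failure is to
    supply 'x' through the second argument of <c>recordGraphics()</c>,
    so that the values are stored with the recorded expression instead of
    being looked up when the plot is replayed.  The code below is a sketch
    under that assumption, not a general remedy;  the file name is
    hypothetical and the <c>pdf(NULL)</c> device is used only to make the
    example self-contained.
  </p>
<!--begin.rcode eval=FALSE
pdf(NULL)                      # off-screen device, no file is written
dev.control("enable")
x <- runif(16)
# Supplying 'x' in the list argument stores the values alongside
# the recorded expression, rather than leaving 'x' as a free variable
recordGraphics(image(matrix(x, ncol=4), 
                     col=hcl(240, 60, 10*1:8)),
               list(x=x), getNamespace("graphics"))
rp <- recordPlot()
dev.off()
saveRDS(rp, "record-x-list.rds")   # hypothetical file name
end.rcode-->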
  <p>
    Another problem with "recordedplot"s, at least for 
    those that contain 'grid' 
    output, is that a redraw does not repopulate the 'grid' display list.
    This means that, for example, it is not possible to use 
    <c>grid.edit()</c> to modify grobs within a redrawn plot.
    As a demonstration of this problem, in the following code, 
    we record a 'lattice' levelplot ...
  </p>
<!--begin.rcode echo=FALSE, results="asis"
Rsession(R, "R-lattice-plot", '
dev.control("enable")
library(lattice)
p <- levelplot(matrix(runif(16), ncol=4), 
               col.regions=hcl(240, 60, 10*1:8))
print(p)
rp <- recordPlot()
saveRDS(rp, "R-lattice-plot.rds")
')
end.rcode-->
  <p>
    ... then we load the "recordedplot" into a new R session,
    redraw it (with warnings because of the difference in R versions),
    try to edit the plot, and fail because there are no grobs to edit.
  </p>
<!--begin.rcode echo=FALSE, results="asis"
Rsession(Rdevel, "Rdevel-lattice-plot", '
rp <- readRDS("R-lattice-plot.rds")
replayPlot(rp)
library(grid)
grid.edit("plot_01.levelplot.rect.panel.1.1", gp=gpar(col="white", lwd=7))
',
         eval=TRUE, fig=FALSE)
end.rcode-->

  <a name="better"/>
  <h2>Why reloading a "recordedplot" is now better</h2>
  <p>
    In addition to the extra defences, the new support for
    "recordedplot"s includes some new features to reduce the
    chance of a replay producing the wrong result.
  </p>
  <ul>
    <li>
      For each set of graphics state information within
      a "recordedplot", the name of the relevant package,
      either "graphics" or "grid", is recorded and that package
      is automatically reloaded (if necessary) in the R session 
      attempting to replay the display list.
    </li>
    <li>
      The <c>recordPlot()</c> function has two new arguments,
      <c>load</c> and <c>attach</c>, and the <c>replayPlot()</c>
      function has one new argument, <c>reloadPkgs</c>.
      The new arguments in <c>recordPlot()</c> can be used to record
      the names of packages that will be needed to replay the display
      list correctly;  if <c>reloadPkgs</c> is <c>TRUE</c> then
      <c>replayPlot()</c> loads or attaches the relevant packages
      in the R session 
      attempting to replay the display list.
    </li>
  </ul>
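  <p>
    A minimal sketch of how these arguments might be used together is
    shown below.  It assumes that <c>attach</c> accepts a character vector
    of package names and that the development version of R is used for both
    steps;  the file name is hypothetical and the <c>pdf(NULL)</c> device
    is used only to make the example self-contained.
  </p>
<!--begin.rcode eval=FALSE
library(lattice)
pdf(NULL)                      # off-screen device, no file is written
dev.control("enable")
print(levelplot(matrix(runif(16), ncol=4)))
# Record that 'lattice' should be attached before replaying
rp <- recordPlot(attach="lattice")
dev.off()
saveRDS(rp, "lattice-with-pkgs.rds")   # hypothetical file name

# Later, in a new R session:
rp <- readRDS("lattice-with-pkgs.rds")
replayPlot(rp, reloadPkgs=TRUE)
end.rcode-->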
  <p>
    In addition, information is now recorded in the 'grid' state information
    part of a "recordedplot".  The following code draws, records, and saves
    a 'lattice' plot and shows the start of the 'grid' state information
    (a list of viewports and grobs) that is included in the
    "recordedplot". 
  </p>
<!--begin.rcode echo=FALSE, results="asis"
Rsession(Rdevel, "Rdevel-lattice-plot-record", '
dev.control("enable")
library(lattice)
p <- levelplot(matrix(runif(16), ncol=4), 
               col.regions=hcl(240, 60, 10*1:8))
print(p)
rp <- recordPlot()
saveRDS(rp, "R-lattice-plot-record.rds")
head(rp[[3]][[1]])
',
          eval=TRUE)
end.rcode-->
  <p>
    This means that if we redraw
    'grid' output, the result can now be edited (in the example below,
    each of the blue squares is modified to have a thick white border).
  </p>
<!--begin.rcode echo=FALSE, results="asis"
Rsession(Rdevel, "Rdevel-lattice-plot-replay", '
rp <- readRDS("R-lattice-plot-record.rds")
replayPlot(rp)
library(grid)
grid.edit("plot_01.levelplot.rect.panel.1.1", 
          gp=gpar(col="white", lwd=7))
')
end.rcode-->

  <a name="summary"/>
  <h2>Summary</h2>
  <p>
    In the development version of R (to be R 3.3.0), it is possible again
    to save the result of <c>recordPlot()</c> from one R session and
    then load it and replay it, with <c>replayPlot()</c>, in a different
    R session.  This recording and replaying of R plots across R sessions
    has also been made safer, with more warnings and errors in place
    to protect against incompatibilities between R versions, and
    it has been made better, with support for reloading 
    packages along with a "recordedplot", 
    and with support for reproducing the 'grid'
    display list when redrawing a "recordedplot" that contains 'grid'
    output.
  </p>

  <a name="acknowledgements"/>
  <h2>Acknowledgements</h2>
  <p>
    We would like to acknowledge the wider group of people
    who helped to discuss 
    and motivate the changes described in this document:  
    Yihui Xie, Gabriel Becker, Henrik Bengtsson,
    G&aacute;bor Cs&aacute;rdi, Gergely Dar&oacute;czi, and Winston Chang.
  </p>

  <a name="references"/>
  <h2>References</h2>
  <p>
    The original implementation of the graphics engine display list
    is described in Paul Murrell's PhD Thesis,
    <a href="https://researchspace.auckland.ac.nz/handle/2292/514">Investigations in Graphical Statistics</a>.
  </p>
  <p>
    The source code changes to enable reloading of "recordedplot"s in
    a new R session were based on 
    <a href="https://github.com/rstudio/rstudio/commit/eb5f6f1db4717132c2ff111f068ffa6e8b2a5f0b">this code</a> (by Jeroen Ooms and JJ Allaire).
  </p>
  <p>
    The motivation for "recordedplot"s in OpenCPU is outlined in
    <a href="https://escholarship.org/uc/item/4q6105rw">Jeroen Ooms' PhD Thesis</a>
    (Chapter 2, Section .2.3).
  </p>
  <p>
    The main development of the source code changes described in this 
    document occurred on the 
    <a href="https://svn.r-project.org/R/branches/R-DL/">R-DL branch</a> 
    of the R Project subversion repository.
    These changes were merged back into the main trunk of the repository
    in revision 69314 (and most subsequent clean ups refer to that 
    revision in their commit comment).
  </p>
  <p>
    A 
    <a href="https://github.com/pmur002/R-display-list">suite of tests</a> 
    of "recordedplot" saving and reloading is available on GitHub.
  </p>
      
</body>
</html>
