Recording and Replaying the Graphics Engine Display List

by Paul Murrell, Jeroen Ooms, and JJ Allaire

Monday 14 December 2015

Overview

In the development version of R (to become R 3.3.0), it is now possible (again) to record the graphics engine display list in one R session and replay it in another R session.

The current situation (in R 3.2.2) is demonstrated below. If we record a plot with recordPlot(), save it to disk with saveRDS(), then quit R ...

R version 3.2.2 (2015-08-14)

> image(matrix(runif(16), ncol=4), col=hcl(240, 60, 10*1:8))
> rp <- recordPlot()
> saveRDS(rp, "R-recordplot.rds")

... then start a new R session and attempt to load the plot with readRDS(), and replay the recorded plot with replayPlot(), the result is an error (and no plot is produced).

R version 3.2.2 (2015-08-14)

> rp <- readRDS("R-recordplot.rds")
> replayPlot(rp)

Error in replayPlot(rp): loading snapshot from a different session

In the development version of R, recording a plot in one session ...

R Under development (unstable) (2015-12-13 r69768)

> image(matrix(runif(16), ncol=4), col=hcl(240, 60, 10*1:8))
> rp <- recordPlot()
> saveRDS(rp, "Rdevel-recordplot.rds")

... then replaying it in a new R session now works fine.

R Under development (unstable) (2015-12-13 r69768)

> rp <- readRDS("Rdevel-recordplot.rds")
> replayPlot(rp)

The graphics engine in R was originally created to support redrawing of the on-screen graphics window when it was resized and to support copying between graphics devices (with dev.copy()). When we draw a plot in R, low-level drawing operations are recorded on the display list and, when a graphics window is resized, or we copy a plot from one device to another, those drawing operations are replayed to reproduce the plot.

The recordPlot() function was added later. This takes a copy of the display list, as a "recordedplot" object, and the replayPlot() function can be used to redraw the "recordedplot" object. This means that we can keep copies of more than just the current plot, which allows things like the "Plot History" feature in the RGui on Windows - we can record all of the plots drawn on a device and browse backwards and forwards between them.

Because recordPlot() returns the display list as an R object, we can also save a "recordedplot" to disk, for example, with saveRDS(). This means that we can save a plot from one R session and replay it in another (with readRDS() and replayPlot()).

Recording a "recordedplot" in one R session and replaying it in another R session is actually a bit dangerous (for reasons we will come to later), but of course, because it was possible and in some cases quite useful, people started doing it anyway.

Then, in R version 3.0.0, replaying "recordedplot"s from a different R session went from being dangerous to being outlawed. If you try this sort of thing in any R version from 3.0.0 to 3.2.3, you just get an error message, as demonstrated in the opening section of this document.

So now we were in a situation where there were a number of R projects and packages that depended on a feature that no longer worked. Naturally, that produced a black market in "recordedplot"s; ingenious coders simply worked around the ban and carried on. But having projects and packages depending upon a feature that was both dangerous and unsupported was not ideal.

For R 3.3.0, the workaround has been incorporated into R and additional support and defences have been added so that replaying a "recordedplot" from a different R session will be both supported and not as dangerous as it was.

In the remainder of this document, we explore why this sort of recording and replaying of plots is useful, why it was broken, how it has been fixed, and what you can now do with the fixed system.

Why reloading a "recordedplot" is useful

There are many ways to save R graphics output, so what is the benefit of using a "recordedplot"?

Exporting R graphics to a file format such as PNG, PDF, or SVG is a one-way trip; we cannot get the graphics back into R.
Saving an R object from 'lattice' or 'ggplot2' does allow reloading back into R, but only works for plots produced by 'lattice' or 'ggplot2'.
Saving the R code that produced the plot allows us to run it again in R, but this is much slower than redrawing a "recordedplot", especially if the R code includes calculations to produce the data for plotting.

This still leaves the question of why we might want to save plots to disk and reload them, possibly in another R session. Examples of the use of "recordedplot"s include:

RStudio saves plot histories to disk rather than keeping them in memory. This can involve multiple R sessions when using RStudio's "Build and Reload" feature for package development and when RStudio Server temporarily suspends then resumes an inactive R session.
The 'knitr' package uses "recordedplot"s to record plot output from code chunks, which makes it more efficient to produce multiple formats for each plot, and can involve multiple sessions when caching is involved.
The OpenCPU project provides an HTTP API that separates plot generation from plot rendering. A "recordedplot" provides a format-independent record of a plot that can be efficiently rendered to PNG, PDF, or SVG on request.
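
As a rough illustration of the "record once, render on request" idea, the following sketch records a plot and replays it onto several file devices. The helper name render_recorded() and the output file names are hypothetical, everything happens within a single R session for simplicity, and this is not meant to show how 'knitr' or OpenCPU are actually implemented.

```r
# Hypothetical helper: replay one "recordedplot" onto several file devices.
render_recorded <- function(rp, basename) {
  stopifnot(inherits(rp, "recordedplot"))
  files <- paste0(basename, ".pdf")
  pdf(files[1])
  replayPlot(rp)
  dev.off()
  # PNG and SVG output in this sketch rely on cairo,
  # which is not available in every build of R.
  if (capabilities("cairo")) {
    png_file <- paste0(basename, ".png")
    png(png_file, type = "cairo")
    replayPlot(rp)
    dev.off()
    svg_file <- paste0(basename, ".svg")
    svg(svg_file)
    replayPlot(rp)
    dev.off()
    files <- c(files, png_file, svg_file)
  }
  files
}

# Record a plot on a PDF device that produces no file.
pdf(NULL)
dev.control("enable")  # the display list is off by default off-screen
image(matrix(runif(16), ncol=4), col=hcl(240, 60, 10*1:8))
rp <- recordPlot()
dev.off()

# Render the same recording to every available format.
rendered <- render_recorded(rp, file.path(tempdir(), "recorded-plot"))
```

The plot is drawn only once; each output format is produced by replaying the recorded display list rather than by rerunning the plotting code.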

The display list in more detail

This section describes the graphics engine display list in more detail, which will help to explain some of the problems and solutions that are described in later sections.

There are three parts to the display list: a set of low-level graphics operations, state information for the 'graphics' package, and state information for the 'grid' package.

The following code shows an example of the first low-level graphics operation for an empty plot (the call to dev.control() ensures that the display list is on; it is off by default for off-screen graphics devices). This example shows that the low-level graphics operations that are recorded on the display list are essentially calls to C functions (in this case, C_plot_new()). Note that part of the information that is recorded is a pointer to the memory address for the C function; this will be important later.

R version 3.2.2 (2015-08-14)

> dev.control("enable")
> plot.new()
> rp <- recordPlot()
> rp[[1]][[1]]

[[1]]
function (.NAME, ..., PACKAGE)  .Primitive(".External2")

[[2]]
[[2]][[1]]
$name
[1] "C_plot_new"

$address
<pointer: 0x1ac8cf0>
attr(,"class")
[1] "RegisteredNativeSymbol"

$package
DLL name: graphics
Filename: /usr/lib/R/library/graphics/libs/graphics.so
Dynamic lookup: FALSE

$numParameters
[1] 0

attr(,"class")
[1] "ExternalRoutine"  "NativeSymbolInfo"

The following code provides a glimpse at the state information for the 'graphics' package. The main point here is that, as far as the "recordedplot" is concerned, the graphics state is just an opaque block of bytes. Only the internal C code for the graphics engine has any idea what these bytes mean. This opacity will be important later.

R version 3.2.2 (2015-08-14)

> dev.control("enable")
> plot.new()
> rp <- recordPlot()
> head(rp[[2]], 100)

  [1] 00 00 00 00 00 00 00 00 00 00 00 00 00 00 e0 3f 01 00 00 00 ff ff ff
 [24] ff 6f 00 00 00 00 00 00 00 00 00 00 00 00 00 f0 3f 00 00 00 00 00 00
 [47] f0 3f 00 00 00 ff 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 [70] 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ff 00 00 00 00
 [93] 00 00 00 00 00 00 00 00

The following code provides a glimpse at the state information for the 'grid' package (note that this code draws something with 'grid'). The main point here is that this value is NULL, i.e., there is no state information recorded for the 'grid' package. This lack of information will be important later.

R version 3.2.2 (2015-08-14)

> dev.control("enable")
> library(grid)
> grid.text("test")
> rp <- recordPlot()
> rp[[3]]

NULL

Why reloading a "recordedplot" was outlawed

Part of the information stored about a low-level graphics operation on the display list is a pointer to a memory address for a C function. The code below focuses on just that memory address for our empty plot example.

R version 3.2.2 (2015-08-14)

> dev.control("enable")
> plot.new()
> rp <- recordPlot()
> rp[[1]][[1]][[2]][[1]]$address

<pointer: 0x2449cf0>
attr(,"class")
[1] "RegisteredNativeSymbol"

This information, a pointer to a memory address, is transient; it is only meaningful for a single R session. For example, the code below shows exactly the same information for exactly the same empty plot, but in a different R session, to show that the memory address for the C function is different.

R version 3.2.2 (2015-08-14)

> dev.control("enable")
> plot.new()
> rp <- recordPlot()
> rp[[1]][[1]][[2]][[1]]$address

<pointer: 0x188fcf0>
attr(,"class")
[1] "RegisteredNativeSymbol"

Because of the transience of these pointers, when they are saved to disk ("serialised"), they are set to NULL. The following code shows one of the memory addresses in a "recordedplot" from an earlier example to show that the pointer has been erased.

R version 3.2.2 (2015-08-14)

> rp <- readRDS("R-recordplot.rds")
> rp[[1]][[1]][[2]][[1]]$address

<pointer: (nil)>
attr(,"class")
[1] "RegisteredNativeSymbol"

This is why loading "recordedplot"s from a different R session was outlawed in R 3.0.0: a saved "recordedplot" contains NULLed pointers to C functions (because the C pointers from a previous R session are meaningless in the current R session).

Why reloading a "recordedplot" is allowed again

The "trick" to allowing a "recordedplot" to be loaded in a different R session is based on the fact that it is possible, given the name of a C function and the name of the R package that the C function comes from, to determine the memory address for that C function in the current R session.

The following code gives a reminder that a "recordedplot" contains both the name of the C function for a low-level graphics operation and the name of the package for that C function (and that the pointer to the C function is NULL).

R version 3.2.2 (2015-08-14)

> rp <- readRDS("R-recordplot.rds")
> rp[[1]][[1]][[2]][[1]]

$name
[1] "C_plot_new"

$address
<pointer: (nil)>
attr(,"class")
[1] "RegisteredNativeSymbol"

$package
DLL name: graphics
Filename: /usr/lib/R/library/graphics/libs/graphics.so
Dynamic lookup: FALSE

$numParameters
[1] 0

attr(,"class")
[1] "ExternalRoutine"  "NativeSymbolInfo"

The following code shows, rather opaquely, that the C function name and the package name can be used to fill in the information about the memory address for the C function (in the current R session).

R version 3.2.2 (2015-08-14)

> rp <- readRDS("R-recordplot.rds")
> dllName <- rp[[1]][[1]][[2]][[1]]$package[["name"]]
> pkgDLL <- getLoadedDLLs()[[dllName]]
> getNativeSymbolInfo(rp[[1]][[1]][[2]][[1]]$name,
+                     PACKAGE=pkgDLL)

$name
[1] "C_plot_new"

$address
<pointer: 0x2b9185606980>
attr(,"class")
[1] "NativeSymbol"

$package
DLL name: graphics
Filename: /usr/lib/R/library/graphics/libs/graphics.so
Dynamic lookup: FALSE

$numParameters
[1] 0

attr(,"class")
[1] "ExternalRoutine"  "NativeSymbolInfo"

With that pointer to memory restored, the "recordedplot" can be used in the current R session, despite being recorded and saved in a different R session. This is the main reason why loading "recordedplot"s will work again for R version 3.3.0.
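
The lookup shown above can be applied to every operation on the display list. The following is a minimal sketch of that idea, not the code that was added to R; the function name restore_native_symbols() is hypothetical, and the fallback from the 'package' component to a 'dll' component is an assumption about how the symbol information may be named in other R versions. A serialise/unserialise round trip stands in for saving to disk and reloading.

```r
# Hypothetical helper: refresh the C function pointers in a "recordedplot".
restore_native_symbols <- function(rp) {
  stopifnot(inherits(rp, "recordedplot"))
  for (i in seq_along(rp[[1]])) {
    symbol <- rp[[1]][[i]][[2]][[1]]
    if (inherits(symbol, "NativeSymbolInfo")) {
      # Assumption: the DLL information is stored as either
      # 'package' (as in the output above) or 'dll'.
      dll <- if (is.null(symbol$package)) symbol$dll else symbol$package
      # The DLL must already be loaded in the current session.
      pkgDLL <- getLoadedDLLs()[[dll[["name"]]]]
      rp[[1]][[i]][[2]][[1]] <-
        getNativeSymbolInfo(symbol$name, PACKAGE = pkgDLL,
                            withRegistrationInfo = TRUE)
    }
  }
  rp
}

pdf(NULL)
dev.control("enable")
plot.new()
rp <- recordPlot()
# Serialising erases the pointers, as saveRDS() would.
saved <- unserialize(serialize(rp, NULL))
# Look the pointers up again, then replay.
fixed <- restore_native_symbols(saved)
replayPlot(fixed)
dev.off()
```

Operations whose first argument is not a native symbol (for example, those recorded by recordGraphics()) are left untouched by this sketch.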

Why reloading a "recordedplot" was always dangerous

Even before reloading was outlawed, a "recordedplot" was not considered a valid long-term storage format for an R plot, because the content of the R display list can (and does) change format between R versions.

The following sample sessions show an example, focusing on the 'graphics' state information part of the display list. In R version 2.15.3, this state information was 3864 bytes in size ...

R version 2.15.3 (2013-03-01)

> dev.control("enable")
> plot.new()
> rp <- recordPlot()
> length(rp[[2]])

[1] 3864

... but by R version 3.0.0, the state information had grown to 35992 bytes. If we saved a "recordedplot" from R version 2.15.3 and tried to replay it in R version 3.0.0, there would be a lot of missing 'graphics' state information and the fact that this information is an opaque series of bytes makes it difficult to interpret or cope with any differences in size.

R version 3.0.0 (2013-04-03)

> dev.control("enable")
> plot.new()
> rp <- recordPlot()
> length(rp[[2]])

[1] 35992

Another example of changes in the display list format occurred in the recording of low-level graphics operations. The following code shows that the first low-level graphics operation in R 2.15.3 is very different from what is recorded from R 3.0.0 onwards.

R version 2.15.3 (2013-03-01)

> dev.control("enable")
> plot.new()
> rp <- recordPlot()
> rp[[1]][[1]]

[[1]]
.Primitive("plot.new")

[[2]]
NULL

More subtle incompatibilities are also possible. For example, a "recordedplot" from one R version could contain a low-level graphics operation that does not exist in another R version. More subtle still, a "recordedplot" from one R version could include arguments for a low-level graphics operation that are incompatible with the argument list for the same low-level graphics operation in a different R version.

Why reloading a "recordedplot" is now safer

In addition to re-enabling support for "recordedplot"s, the development version of R contains some extra defences against the problems that can arise.

The R version is now recorded as part of a "recordedplot" and a warning is issued if the "recordedplot" R version does not match the R version of the session attempting to replay the display list.
The number of arguments to each low-level graphics operation is checked and an error is generated if that number does not match the expected number of arguments in the R session attempting to replay the display list.
The size (in bytes) of the graphics state information is checked and an error is generated if that size does not match the size of graphics state information in the R session attempting to replay the display list.
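
The real checks happen inside R when a "recordedplot" is replayed, but two of them can be imitated from R code as a kind of pre-flight check. The following sketch makes some assumptions: the function name check_recorded_plot() is hypothetical, and the attribute name "Rversion" is an assumption about where the recorded R version is stored (the result is NA if no such attribute is found).

```r
# Hypothetical pre-flight check for a "recordedplot".
check_recorded_plot <- function(rp) {
  stopifnot(inherits(rp, "recordedplot"))
  # Assumption: the recording R version is kept in this attribute.
  recorded <- attr(rp, "Rversion")
  same_version <- if (is.null(recorded)) NA else recorded == getRversion()
  # Record an empty plot in this session to learn the size of the
  # 'graphics' state information that this version of R expects.
  previous <- dev.cur()
  pdf(NULL)
  on.exit({
    dev.off()
    if (previous > 1) dev.set(previous)
  })
  dev.control("enable")
  plot.new()
  current <- recordPlot()
  list(same_version = same_version,
       same_state_size = length(rp[[2]]) == length(current[[2]]))
}

pdf(NULL)
dev.control("enable")
plot.new()
rp <- recordPlot()
dev.off()

chk <- check_recorded_plot(rp)
```

A check like this only reports a mismatch; it cannot translate state information from one format to another.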

Why reloading a "recordedplot" may still not work

Even though it is possible to reload a "recordedplot" from a different R session, there are situations where the plot will not be redrawn correctly. The following code shows an example where a 'ggplot2' plot is recorded in one R session and then replayed in another R session, with an undesirable result (this example uses 'ggplot2' version 1.0.1).

R Under development (unstable) (2015-12-13 r69768)

Loading required package: methods

> dev.control("enable")
> library(ggplot2)
> df <- expand.grid(x=1:4, y=1:4)
> df$z <- runif(16)
> p <- ggplot(df) + geom_tile(aes(x=x, y=y, fill=z))
> print(p)
> rp <- recordPlot()
> saveRDS(rp, "Rdevel-ggplot2.rds")

R Under development (unstable) (2015-12-13 r69768)

> rp <- readRDS("Rdevel-ggplot2.rds")
> replayPlot(rp)

The underlying problem here is that redrawing the plot requires functions that were present in the R session when the recording was performed, but are not present in the R session when the "recordedplot" is replayed.

The following code demonstrates a simplified (and more extreme) version of the problem: we assign a value to 'x', then record drawing on the display list that makes use of 'x', but 'x' itself is not recorded on the display list ...

R Under development (unstable) (2015-12-13 r69768)

> dev.control("enable")
> x <- runif(16)
> recordGraphics(image(matrix(x, ncol=4),
+                      col=hcl(240, 60, 10*1:8)),
+                list(), getNamespace("graphics"))
> rp <- recordPlot()
> saveRDS(rp, "Rdevel-record-x.rds")

... then when we go to replay the "recordedplot" in a new R session, the display list contains code that relies on 'x', but 'x' is not defined, so the plotting fails.

R Under development (unstable) (2015-12-13 r69768)

> rp <- readRDS("Rdevel-record-x.rds")
> replayPlot(rp)

Error in matrix(x, ncol = 4): object 'x' not found
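
For this simplified case, one possible way around the problem is to pass the data through the list argument of recordGraphics(), so that the values are recorded on the display list together with the expression. The sketch below assumes the data are small enough to store with the plot. The replay happens in the same session, after 'x' has been removed, as a stand-in for replaying in a new session.

```r
pdf(NULL)
dev.control("enable")
x <- runif(16)
# Supply 'x' via the list argument so that it is recorded
# alongside the expression that uses it.
recordGraphics(image(matrix(x, ncol=4),
                     col=hcl(240, 60, 10*1:8)),
               list(x=x), getNamespace("graphics"))
rp <- recordPlot()

# Remove 'x' to mimic a session in which it was never defined.
rm(x)
replayed <- tryCatch({
  replayPlot(rp)
  TRUE
}, error = function(e) FALSE)
dev.off()
```

This does not help with the 'ggplot2' example, where what is missing is functions from a package rather than a single data value.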

Another problem with "recordedplot"s, at least for those that contain 'grid' output, is that a redraw does not repopulate the 'grid' display list. This means that, for example, it is not possible to use grid.edit() to modify grobs within a redrawn plot. As a demonstration of this problem, in the following code, we record a 'lattice' levelplot ...

R version 3.2.2 (2015-08-14)

> dev.control("enable")
> library(lattice)
> p <- levelplot(matrix(runif(16), ncol=4),
+                col.regions=hcl(240, 60, 10*1:8))
> print(p)
> rp <- recordPlot()
> saveRDS(rp, "R-lattice-plot.rds")

... then we load the "recordedplot" into a new R session, redraw it (getting warnings because of the difference in R versions), try to edit the plot, and fail because there are no grobs to edit.

R Under development (unstable) (2015-12-13 r69768)

> rp <- readRDS("R-lattice-plot.rds")
> replayPlot(rp)

Warning in restoreRecordedPlot(x, reloadPkgs): snapshot recorded in
different R version (pre 3.3.0)

Warning in replayPlot(rp): snapshot recorded with different graphics engine
version (pre 11 - this is version 11)

> library(grid)
> grid.edit("plot_01.levelplot.rect.panel.1.1", gp=gpar(col="white", lwd=7))

Error in editDLfromGPath(gPath, specs, strict, grep, global, redraw): 'gPath' (plot_01.levelplot.rect.panel.1.1) not found

Why reloading a "recordedplot" is now better

In addition to the extra defences, the new support for "recordedplot"s includes some new features to reduce the chance of a replay producing the wrong result.

For each set of graphics state information within a "recordedplot", the name of the relevant package, either "graphics" or "grid", is recorded and that package is automatically reloaded (if necessary) in the R session attempting to replay the display list.
The recordPlot() function has two new arguments, load and attach, and the replayPlot() function has one new argument, reloadPkgs. The new arguments in recordPlot() can be used to record the names of packages that will be needed to replay the display list correctly; if reloadPkgs is TRUE then replayPlot() loads or attaches the relevant packages in the R session attempting to replay the display list.
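
The following sketch shows how the new arguments might be used together. It requires R 3.3.0 or later. The package name "stats" is only a stand-in for whatever package the plot really depends on, and the second R session is started with Rscript via system2() purely so that the example is self-contained; the script and file names are temporary files invented for the illustration.

```r
# Recording session: note the package needed to replay the plot.
pdf(NULL)
dev.control("enable")
plot(1:10)
rp <- recordPlot(load = "stats")  # "stats" is a stand-in package name
dev.off()
rds <- tempfile(fileext = ".rds")
saveRDS(rp, rds)

# Replaying session: a separate R process reads the file and replays
# the plot, asking for the recorded packages to be reloaded.
script <- tempfile(fileext = ".R")
writeLines(c(
  'args <- commandArgs(trailingOnly = TRUE)',
  'rp <- readRDS(args[1])',
  'pdf(NULL)',
  'replayPlot(rp, reloadPkgs = TRUE)',
  'invisible(dev.off())'
), script)
rscript <- file.path(R.home("bin"), "Rscript")
status <- system2(rscript, c(shQuote(script), shQuote(rds)))
```

An exit status of zero from the second process indicates that the replay ran without an error.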

In addition, information is now recorded in the 'grid' state information part of a "recordedplot". The following code draws, records, and saves a 'lattice' plot and shows the start of the 'grid' state information (a list of viewports and grobs) that is included in the "recordedplot".

R Under development (unstable) (2015-12-13 r69768)

> dev.control("enable")
> library(lattice)
> p <- levelplot(matrix(runif(16), ncol=4),
+                col.regions=hcl(240, 60, 10*1:8))
> print(p)
> rp <- recordPlot()
> saveRDS(rp, "R-lattice-plot-record.rds")
> head(rp[[3]][[1]])

[[1]]
viewport[ROOT] 

[[2]]
rect[plot_01.background] 

[[3]]
viewport[plot_01.toplevel.vp] 

[[4]]
viewport[plot_01.xlab.vp] 

[[5]]
text[plot_01.xlab] 

[[6]]
[1] 1
attr(,"class")
[1] "up"

This means that if we redraw 'grid' output, the result can now be edited (in the example below, each of the blue squares is modified to have a thick white border) ...

R Under development (unstable) (2015-12-13 r69768)

> rp <- readRDS("R-lattice-plot-record.rds")
> replayPlot(rp)
> library(grid)
> grid.edit("plot_01.levelplot.rect.panel.1.1",
+           gp=gpar(col="white", lwd=7))

Summary

In the development version of R (to be R 3.3.0), it is possible again to save the result of recordPlot() from one R session and then load it and replay it, with replayPlot(), in a different R session. This recording and replaying of R plots across R sessions has also been made safer, with more warnings and errors in place to protect against incompatibilities between R versions, and it has been made better, with support for reloading packages along with a "recordedplot", and with support for reproducing the 'grid' display list when redrawing a "recordedplot" that contains 'grid' output.

Acknowledgements

We would like to acknowledge the wider group of people who helped to discuss and motivate the changes described in this document: Yihui Xie, Gabriel Becker, Henrik Bengtsson, Gábor Csárdi, Gergely Daróczi, and Winston Chang.

References

The original implementation of the graphics engine display list is described in Paul Murrell's PhD Thesis, Investigations in Graphical Statistics.

The source code changes to enable reloading of "recordedplot"s in a new R session were based on this code (by Jeroen Ooms and JJ Allaire).

The motivation for "recordedplot"s in OpenCPU is outlined in Jeroen Ooms' PhD Thesis (Chapter 2, Section .2.3).

The main development of the source code changes described in this document occurred on the R-DL branch of the R Project subversion repository. These changes were merged back into the main trunk of the repository in revision 69314 (and most subsequent clean ups refer to that revision in their commit comment).

A suite of tests of "recordedplot" saving and reloading is available on github.