<html>
<head>
  <style type="text/css">
    @media print {
      body { }
    }
    @media screen {
      body { max-width: 800px; margin: auto }
      img.CC { display: inline }
    }
  </style>
</head>
<body>
  <h1>Debugging Display List Internals</h1>
  <p style="font-style: italic">by Paul Murrell</p>
  <p>
  <rcode id="date" echo="FALSE" results="asis"><![CDATA[
cat(format(Sys.time(), "%A %d %B %Y"))
  ]]></rcode>
  </p>

  <rcode id="init" echo="FALSE"><![CDATA[
opts_chunk$set(comment=" ", tidy=FALSE)
options(width=80)
  ]]></rcode>

  <hr/>

  <p>
    This report documents the process of debugging a problem
    with the recording and replaying of R plots from one R session to
    another.  The purpose of this report is to record the
    source of the problem, to record the solution to the problem, 
    to explain some of the internal details of recorded R plots, and
    to demonstrate the 'hexView' package for exploring binary
    blobs. 
  </p>

  <h2>The problem</h2>
  <p>
    The <code>recordPlot()</code> function allows a "snapshot" of the
    current R plot to be recorded as an R object (a "recordedplot"), 
    and the <code>replayPlot()</code> function can be used to
    redraw a "recordedplot".
    In the development version of R (to become R 3.3.0), 
    a "recordedplot" can be saved to disk with <code>saveRDS()</code>
    and then reloaded in a different R session with 
    <code>readRDS()</code>.  This should work between R sessions
    on different platforms (e.g., record on Linux and replay on Windows).
  </p>
  <p>
    Henrik Bengtsson discovered that, if we record a plot on Linux
    using the PNG device, it does not replay correctly on Windows (the
    replayed plot is blank).  He provided example "recordedplot"s <a
    href="R-recordplot_LinuxA.rds">from Linux</a> and <a
    href="R-recordplot_Windows.rds">from Windows</a> to demonstrate the
    problem.  The "recordedplot" created on Windows replayed correctly
    on both Linux and Windows, whereas the "recordedplot" created on Linux 
    replayed correctly on Linux, but not on Windows.
  </p>
  <p>
    Each "recordedplot" was created with something like the following R code ...
  </p>
  <rcode eval="FALSE"><![CDATA[
png("dummy.png")
dev.control("enable")
plot(1:10)
rp <- recordPlot()
saveRDS(rp, "R-recordplot_LinuxA.rds")
  ]]></rcode>
  <p>
    ... and then replayed with ...
  </p>
  <rcode eval="FALSE"><![CDATA[
rp <- readRDS("R-recordplot_LinuxA.rds")
replayPlot(rp)
  ]]></rcode>

  <h2>Debugging the problem</h2>
  <p>
    The debugging effort focused on looking for differences between the
    "recordedplot" created on Windows and the "recordedplot" created on Linux
    (because the former replayed fine and the latter did not).
  </p>
  <p>
    Each "recordedplot" is a list of two components: the first is a
    set of calls to internal C graphics functions ...
  </p>
  <rcode><![CDATA[
rp <- readRDS("R-recordplot_LinuxA.rds")
rp[[1]][[1]]
  ]]></rcode>
  <p>
    ... and the second is state
    information for the 'graphics' package.
  </p>
  <rcode><![CDATA[
head(rp[[2]], 100)
  ]]></rcode>
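
  <p>
    The same kind of inspection can be tried without the example files.
    The following is only a minimal sketch: it assumes a PDF device with 
    no output file (<code>pdf(NULL)</code>) as a stand-in for the PNG 
    device, the names <code>rpDemo</code> and <code>rpDemo2</code> are
    made up for illustration, and the number and 
    contents of the components may differ between versions of R.
  </p>
  <rcode eval="FALSE"><![CDATA[
# Record a plot on a file-less PDF device (an assumption; any device
# with the display list enabled should behave similarly)
pdf(NULL)
dev.control("enable")
plot(1:10)
rpDemo <- recordPlot()
invisible(dev.off())
# Round-trip through a temporary file with saveRDS()/readRDS()
tmp <- tempfile(fileext=".rds")
saveRDS(rpDemo, tmp)
rpDemo2 <- readRDS(tmp)
class(rpDemo2)
# Class of each top-level component
sapply(unclass(rpDemo2), function(x) class(x)[1])
  ]]></rcode>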

  <p>
    The following function was written to capture the printed display of
    a "recordedplot" to a text file ...
  </p>
  <rcode><![CDATA[
printRDS <- function(infile, outfile) {
    x <- readRDS(infile)
    displaylist <- capture.output(print(x[[1]]))
    graphicsContext <- capture.output(print(x[[2]]))
    writeLines(c(displaylist, graphicsContext), outfile)
}
  ]]></rcode>
  <p>
    ... and this function was used to create two text files that could be
    'diff'ed ...
  </p>
  <rcode message="FALSE" warning="FALSE" results="asis"><![CDATA[
printRDS("R-recordplot_Windows.rds", "printRDS_Windows.txt")
printRDS("R-recordplot_LinuxA.rds", "printRDS_LinuxA.txt")   
diff <- system("diff printRDS_Windows.txt printRDS_LinuxA.txt", intern=TRUE)
  ]]></rcode>
  <p>
    This revealed two sorts of differences. The first is not at all
    surprising: the information on internal C calls has paths to DLLs
    on Windows and paths to shared object (.so) files on Linux, for
    example, ...
  </p>
  <rcode><![CDATA[
diff[1:5]    
  ]]></rcode>
  <p>
    This difference is not a concern because these paths are rebuilt
    when a "recordedplot" is loaded into a new R session.
  </p>
  <p>
    However, the second difference is more of a concern: there were
    differences in the 'graphics' state information, for example ...
  </p>
  <rcode><![CDATA[
diff[41:44]    
  ]]></rcode>
  <p>
    These differences required further exploration, but to do that we need
    a better view of the state information.  We can see, if we look closely,
    that two bytes differ in the example above, but it is not
    very clear what that two-byte difference represents.  This is where
    the 'hexView' package comes in.
  </p>
  <p>
    The following function was used to create a binary file that just
    contained the 'graphics' state information ...
  </p>
  <rcode><![CDATA[
exportGraphicsContext <- function(infile, outfile) {
    x <- readRDS(infile)
    graphicsContext <- x[[2]]
    # as.vector() to strip attributes
    writeBin(as.vector(graphicsContext), outfile)
}
exportGraphicsContext("R-recordplot_Windows.rds", "graphicsContext_Windows.bin")
exportGraphicsContext("R-recordplot_LinuxA.rds", "graphicsContext_LinuxA.bin")
  ]]></rcode>
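  <p>
    As a rough cross-check, the positions at which two binary files of
    this kind differ can be found with base R alone.  The following is
    a sketch on two small synthetic files (hypothetical data, not the real
    'graphics' state), and it assumes that both files have the
    same length.
  </p>
  <rcode><![CDATA[
# Two synthetic 16-byte blobs that differ in their 7th and 8th bytes
# (hypothetical data, not the real graphics state)
a <- as.raw(rep(0, 16))
b <- a
b[7:8] <- as.raw(c(0x12, 0x34))
fileA <- tempfile(fileext=".bin")
fileB <- tempfile(fileext=".bin")
writeBin(a, fileA)
writeBin(b, fileB)
# Read each file back as a raw vector
rawA <- readBin(fileA, "raw", n=file.size(fileA))
rawB <- readBin(fileB, "raw", n=file.size(fileB))
# Zero-based offsets of the bytes that differ
diffOffsets <- which(rawA != rawB) - 1
diffOffsets
  ]]></rcode>
  <p>
    Zero-based offsets are used here because that is how byte positions
    within a file are usually counted.  This only says where
    the files differ, not what the differing bytes mean.
  </p>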
  <p>
    Now we can use the 'hexView' package to view the contents of the 
    binary files ...
  </p>
  <rcode><![CDATA[
library(hexView)
viewRaw("graphicsContext_Windows.bin", nbytes=100)
  ]]></rcode>
  <p>
    That is not a huge improvement because it is interpreting each byte
    in the file as an ASCII character.  We can do better if we tell 
    'hexView' how to interpret sequences of bytes within the file.
    To do that, we need a description of the 'graphics'
    state information data structure, which we can get from the
    "Graphics.h" file in R's source code
    (the <a href="#resources">Resources Section</a> has a link to 
    the online subversion repository).  
    A small snippet of that
    information is shown below.
  </p>
  <pre>
   typedef struct {
    /* Plot State */
    /*
       When the device driver is started this is 0
       After the first call to plot.new/perps it is 1
       Every graphics operation except plot.new/persp
       should fail if state = 0
       This is checked at the highest internal function
       level (e.g., do_lines, do_axis, do_plot_xy, ...)
    */

    int	state;		/* plot state: 1 if GNewPlot has been called
			   (by plot.new or persp) */
    Rboolean valid;	/* valid layout ?  Used in GCheckState &amp; do_playDL */

    /* GRZ-like Graphics Parameters */
    /* ``The horror, the horror ... '' */
    /* Marlon Brando - Appocalypse Now */

    /* General Parameters -- set and interrogated directly */

    double adj;		/* String adjustment */ 
  </pre>
  <p>
    Ignoring the colourful comments, we have the following information: 
    the state information starts with a 4-byte integer, followed by
    an "Rboolean" (another 4-byte integer), followed by an 8-byte double.
  </p>
  <p>
    We can describe that structure to 'hexView' as follows and the result
    is much easier to interpret (the integer zero, twice, then the numeric
    value 0.5) ...
  </p>
  <rcode><![CDATA[
GPar <- memFormat(state=integer4,
                  valid=integer4,
                  adj=real8)
viewFormat("graphicsContext_Windows.bin", GPar)
  ]]></rcode>
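  <p>
    For comparison, the same three fields can be read with base R's
    <code>readBin()</code>.  The following is a sketch on a synthetic
    16-byte blob rather than on the real file; the values are chosen to
    match those described above and little-endian byte order is assumed.
  </p>
  <rcode><![CDATA[
# Build a synthetic blob with the same layout as the start of the
# structure: int, int (Rboolean), double
# (hypothetical values, not read from a real "recordedplot")
blob <- c(writeBin(0L, raw(), size=4, endian="little"),
          writeBin(0L, raw(), size=4, endian="little"),
          writeBin(0.5, raw(), size=8, endian="little"))
con <- rawConnection(blob)
# Read the fields in declaration order; each read advances the connection
state <- readBin(con, "integer", n=1, size=4, endian="little")
valid <- readBin(con, "integer", n=1, size=4, endian="little")
adj <- readBin(con, "double", n=1, size=8, endian="little")
close(con)
c(state=state, valid=valid, adj=adj)
  ]]></rcode>
  <p>
    This works for a handful of fields, but it requires reading everything
    in order; a 'hexView' format keeps the whole description in one
    object and labels the output.
  </p>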
  <p>
    We can continue that process to create a complete description
    of the 'graphics' state information 
    (the <a href="#resources">Resources Section</a> has a link to 
    complete code).  
    The following shows just
    the part where we have seen a difference.
  </p>
  <rcode><![CDATA[
GParFragment <- memFormat(font=integer4,
                          gamma=real8,
                          lab=vectorBlock(integer4, 3))
viewFormat("graphicsContext_Windows.bin", GParFragment, offset=292)
  ]]></rcode>
  <p>
    The same view of the "recordedplot" from Linux shows that the difference
    is in the value of 'gamma', which is 1 on Windows and 0 on Linux.
  </p>
  <rcode><![CDATA[
GParFragment <- memFormat(font=integer4,
                          gamma=real8,
                          lab=vectorBlock(integer4, 3))
viewFormat("graphicsContext_LinuxA.bin", GParFragment, offset=292)
  ]]></rcode>
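  <p>
    As an aside, the encodings of these two values appear to be consistent with
    the two-byte difference noted earlier.  The following sketch shows
    the bytes of the doubles 1 and 0, assuming little-endian storage (as
    on x86 platforms).
  </p>
  <rcode><![CDATA[
# Little-endian byte encodings of the doubles 1 and 0
one <- writeBin(1, raw(), endian="little")
zero <- writeBin(0, raw(), endian="little")
one
zero
# Number of bytes that differ between the two encodings
sum(one != zero)
  ]]></rcode>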
  <p>
    This discovery led to the diagnosis that the Cairo graphics
    device on Linux was not initialising its 'startgamma' value, so it
    was defaulting to zero, and when the Windows graphics device
    applied that zero gamma value all of the drawing colours were
    converted to white.  As a result, a "recordedplot" that had been
    created on a Cairo graphics device on Linux was drawn
    white-on-white (which appears blank) when it was replayed on a
    Windows graphics device.
  </p>
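  <p>
    To see why a zero gamma could turn every colour white, consider the
    following sketch.  It assumes that gamma is applied to each 0-255
    colour channel as 255 * (v/255)^gamma; that formula and the
    <code>applyGamma()</code> helper are assumptions made for
    illustration, not code taken from the Windows device.
  </p>
  <rcode><![CDATA[
# Hypothetical helper: apply gamma to 0-255 colour channels as
# 255 * (v/255)^gamma (an assumed formula, not copied from R's sources)
applyGamma <- function(v, gamma) {
    round(255 * (v/255)^gamma)
}
channels <- c(black=0, grey=128, white=255)
# A gamma of 1 leaves the channels unchanged
applyGamma(channels, 1)
# A gamma of 0 raises every value to the power 0, which is 1,
# so every channel becomes 255
applyGamma(channels, 0)
  ]]></rcode>
  <p>
    Under that assumption, even black is mapped to 255 in every channel,
    which is white.
  </p>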
  <p>
    This problem was fixed in commit r70080 to the R subversion repository.
  </p>

  <h2>Data structure alignment</h2>
  <p>
    One interesting wrinkle arose when translating the C declaration
    of the 'graphics' state information into a 'hexView' format.
    Consider the following excerpt from "Graphics.h" ...
  </p>  
  <pre>
    double adj;		/* String adjustment */
    Rboolean ann;	/* Should annotation take place */
    rcolor bg;		/* **R ONLY** Background color */
    char bty;		/* Box type */
    double cex;		/* Character expansion */
    double lheight;     /* Line height
  </pre>
  <p>
    If we translate that naively, we get: 8-byte real, 4-byte integer,
    a sequence of four 1-byte unsigned integers, 1-byte character,
    8-byte real, and an 8-byte real.  However, that clearly is not
    right (both 'cex' and 'lheight' should be 1) ...
  </p>
  <rcode><![CDATA[
rcolor <- vectorBlock(atomicBlock("int", size=1, signed=FALSE), 4)
GParFragment <- memFormat(adj=real8,
                          ann=integer4,
                          bg=rcolor,
                          bty=ASCIIchar,
                          cex=real8,
                          lheight=real8)
viewFormat("graphicsContext_Windows.bin", GParFragment, offset=8)
  ]]></rcode>
  <p>
    The issue here is that the 8-byte real for 'cex' has to start in
    memory at an offset that is a multiple of 8 bytes, but the naive 
    translation places it at offset 25, which is not a multiple of 8.  
    To satisfy that requirement, the data structure is "padded" with 
    extra bytes.
    So the memory actually looks like this (32 is a multiple of 8) ...
  </p>
  <rcode><![CDATA[
GParFragment <- memFormat(adj=real8,
                          ann=integer4,
                          bg=rcolor,
                          bty=ASCIIchar,
                          padding=memBlock(7),
                          cex=real8,
                          lheight=real8)
viewFormat("graphicsContext_Windows.bin", GParFragment, offset=8)
  ]]></rcode>
  <p>
    The rules for padding data structures may be sensitive to both hardware
    and software platforms, so this may be a foreshadowing of future problems
    with moving a "recordedplot" between 32-bit and 64-bit systems and/or
    between i386 and other chip sets. 
  </p>
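  <p>
    The padding can be reproduced with a small offset calculation.  The
    following is a sketch under simple assumptions: every field is
    aligned to a multiple of a given number of bytes, 'bg' is treated as
    a single 4-byte value, and the fragment starts at offset 8.  The
    <code>fieldOffsets()</code> helper is hypothetical, and the second call
    shows what would happen on an assumed platform where doubles only need
    4-byte alignment.
  </p>
  <rcode><![CDATA[
# Hypothetical helper: given field sizes and alignments (in bytes),
# return the offset of each field, rounding each offset up to the
# next multiple of that field's alignment
fieldOffsets <- function(size, align, start=0) {
    offsets <- numeric(length(size))
    pos <- start
    for (i in seq_along(size)) {
        pos <- ceiling(pos/align[i]) * align[i]
        offsets[i] <- pos
        pos <- pos + size[i]
    }
    names(offsets) <- names(size)
    offsets
}
size <- c(adj=8, ann=4, bg=4, bty=1, cex=8, lheight=8)
# Natural alignment: each field aligned to a multiple of its own size
fieldOffsets(size, align=size, start=8)
# Assumed alternative: no field needs more than 4-byte alignment
fieldOffsets(size, align=pmin(size, 4), start=8)
  ]]></rcode>
  <p>
    With natural alignment, 'cex' lands at offset 32 (7 bytes of padding
    after 'bty'); with the assumed 4-byte rule it would land at offset
    28, so the same 'hexView' format would no longer line up.
  </p>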

  <h2>Acknowledgements</h2>  
  <p>
    Thanks to Henrik Bengtsson for reporting the problem, for providing
    nice "recordedplot"s to assist with the diagnosis, and for 
    helping to confirm that the fix works.
  </p>

  <a name="resources"><h2>Resources</h2></a>
  <ul>
    <li>
      The <a
      href="https://www.stat.auckland.ac.nz/~paul/Reports/DisplayList/dl-record.html">technical
      report</a> "Recording and Replaying the Graphics Engine Display
      List", which discusses the support in the development version of R 
      (to become version 3.3.0) for
      creating a "recordedplot" in one R session and replaying it in
      another R session.
    </li>
    <li>
      The <a
      href="https://svn.r-project.org/R/trunk/src/include/Graphics.h">Graphics.h</a>
      source file for R, which describes the 'graphics' state
      information data structure.
    </li>
    <li>
      The <a href="http://cran.stat.auckland.ac.nz/web/packages/hexView/index.html">'hexView' package</a>.
    </li>
    <li>
      The <a href="https://cran.r-project.org/doc/Rnews/Rnews_2007-1.pdf">'hexView' package article</a>.
    </li>
    <li>
      R code describing the <a href="format.R">'graphics' state information for 'hexView'</a>.
    </li>   
    <li>
      R code to <a href="explore.R">export a "recordedplot"</a> as text and
      extract the 'graphics' state information.
    </li>   
    <li>
      Example "recordedplot" objects (saved to file) for
      <a href="R-recordplot_Windows.rds">Windows</a> and 
      <a href="R-recordplot_LinuxA.rds">Linux</a> (from Henrik Bengtsson).
    </li>   
    <li>
      The <a
      href="https://en.wikipedia.org/wiki/Data_structure_alignment">Wikipedia
      article</a> where I got everything I know so far about data structure
      alignment.
    </li>   
    <li>
      The <a href="dl-bug.cml">raw source file</a> for this report, a
      <a href="dl-bug.xml">valid XML</a> transformation of the source
      file, a <a href="dl-bug.Rhtml">'knitr' document</a> generated
      from the XML file, two <a href="common.xsl">XSL</a> <a
      href="knitr.xsl">files</a> that are used to transform the XML to
      the 'knitr' document, and a <a href="Makefile">Makefile</a> that
      contains code for the other transformations and coordinates
      everything.
    </li>
  </ul>

  <p>
    This report was generated on Ubuntu 14.04 64-bit running 
    R Under development (unstable) (2016-02-03 r70080) with
    version 0.3-4 of <a href="https://github.com/pmur002/hexview">the 'hexView' package</a>.
  </p>

  <hr/>
  <p>
    <a rel="license"
    href="http://creativecommons.org/licenses/by/4.0/"><img class="CC"
    alt="Creative Commons License" style="border-width:0"
    src="https://i.creativecommons.org/l/by/4.0/88x31.png"/></a><br/><span
    xmlns:dct="http://purl.org/dc/terms/"
    property="dct:title">Debugging Display List Internals</span>
    by <span xmlns:cc="http://creativecommons.org/ns#"
    property="cc:attributionName">Paul
    Murrell</span> is licensed under a <a rel="license"
    href="http://creativecommons.org/licenses/by/4.0/">Creative
    Commons Attribution 4.0 International License</a>.
  </p>

</body>
</html>
