by Paul Murrell
      
         http://orcid.org/0000-0002-3224-8858
      
      
        http://orcid.org/0000-0002-3224-8858
      
    
Version 1:

This document
    by Paul
    Murrell is licensed under a Creative
    Commons Attribution 4.0 International License.
  
This document describes some lower-level, technical details of the CRAN package 'xdvir' for rendering LaTeX fragments as labels and annotations on R plots.
The 'xdvir' package for R (Murrell, 2025) allows LaTeX fragments to be used as labels and annotations on R plots, including 'ggplot2' plots. The package vignette describes basic high-level usage and an article submitted to the R Journal explains more about the standard usage of the package and has more complex examples. The purpose of this document is to provide a description of some more advanced features of the package, including ways to extend the package, and to provide a record of the lower-level design and implementation of the package.
The package startup detects and reports on which TeX engines are available on the current system. This report is built on a system with XeTeX and an acceptable version of LuaTeX, so both of those engines are available. This means that the default engine for building this report is LuaTeX.
Rendering a LaTeX fragment in R involves three steps:
This document begins by taking a closer look at each of those three steps.
       
    
    The author() function turns a LaTeX fragment into 
    a LaTeX document.  This consists of adding additional
    LaTeX code around the LaTeX fragment.
  
    There are some obvious additions, like \begin{document} and
    \end{document}.  
  
    The \documentclass is always
    standalone.  This is partly because that seems sensible - the
    LaTeX fragments are not complete documents, but typically small labels to be
    added to other drawing - but also because that will not produce, for
    example, any page numbering or headers and footers that you normally
    get from something like the standard article class.
  
    The varwidth option is set for the standalone
    class.  This is not always necessary and it may actually cause
    problems if the width of the label exceeds the text width of 
    a standard article.  However, the width argument
    to author() can be used to specify an explicit width
    if that is convenient and/or if that is necessary.
    This uses the varwidth option to set an explicit width.
    If given, the width is interpreted as a number of inches.
  
    The \usepackage{unicode-math} is there to force the TeX engine
    to use TrueType math fonts and to produce font file paths in DVI output.
    Without this, you tend to get fnt_def operations in the DVI
    output of the form cmmi10.  Those not only require 
    a potentially complicated further mapping to get to
    actual font files, but they resolve to Type 1 fonts.  That is a problem
    because the rendering of LaTeX fragments in R 
    requires a graphics device that has
    support for rendering glyphs
    
    and the main set of those is the Cairo devices and they 
    no longer support Type 1 fonts.
  
The comment on the first line records information about the version of 'xdvir', the TeX engine that was used to author the document, and the LaTeX packages that were specified. As noted previously, the engine defaults on package startup. This information is used in the typesetting step to check that compatible TeX engines and packages were used in the authoring step.
    We can use the engine and
    packages arguments to author() to explicitly set
    those values.  Notice that the addition of packages="xcolor"
    adds a new
    \usepackage line and it adds to
    the comment on the first line.  
    The engine just turns up in the first line of comments.
  
    Some packages will add more lines.  For example, the preview
    package adds further LaTeX code 
    to the document preamble and wraps the fragment
    within a preview environment.  
  
    Functions like grid.latex() and geom_latex()
    also add additional code to LaTeX fragments in order to 
    match the current font family, font face, and font size.
    We can easily see this if we use the texFile argument
    to control the name of the file that 'xdvir' writes a full LaTeX
    document into.
    For example, the following code draws a simple LaTeX fragment
    and automatically matches the R default font family (rather than
    using the default LaTeX Computer Modern font)
    and writes the full LaTeX document to the file "test.tex".
    The font matching is implemented by adding \setmainfont and
    \fontsize commands to the LaTeX document 
    (plus \usepackage{fontspec}).
  
    Where user-level functions add a lot
    of additional code to the LaTeX fragment, like the examples above, 
    there is a risk of 
    conflicting with the LaTeX fragment and/or a risk of 
    packages conflicting with each other.  This is one possible
    reason for authoring a LaTeX document directly rather than relying
    on the author() function.
    However, care must be taken in that case to retain features like
    \usepackage{unicode-math} so that 
    DVI output that is consumable by 'xdvir' 
    is produced during the typesetting step.
  
In other words, here be dragons. On the positive side, I have yet to irretrievably bite myself with this problem.
    The typeset() function processes a LaTeX document
    to a "DVI" object.
    The LaTeX document can be a "LaTeXdocument"
    object, as produced by author()
    or just a character vector (though the latter must be a complete
    LaTeX document, not just a LaTeX fragment).
  
    The typeset() function generates a DVI file by 
    writing the LaTeX document to a file and running a TeX engine
    to produce a DVI file.
    'xdvir' makes use of latexmk() from the 'tinytex'
    package for this step, which automatically takes care of 
    multiple runs of the TeX engine and even installing missing 
    LaTeX packages in at least some cases.
  
    The DVI file is read into R as a "DVI" object and
    the "DVI" object has a print method to show the contents
    of a DVI file in a human-readable format.
    Underneath, it is a list of "rawFormat"
    objects from the 'hexView' package, which are  labelled blocks of 
    raw bytes along with interpreted values (integers, strings, etc).
    For example, the first element of the dvi object
    is a DVI pre
    operation, which consists of a one-byte integer operation code
    (247), followed by a one-byte integer DVI version number (2),
    and so on.
  
    Notice that typeset() embeds a "signature" in
    the pre operation of the DVI file that records 
    information about the 'xdvir' version, the TeX engine used,
    and any LaTeX packages used.  This information is used in the
    rendering step 
    to check that compatible TeX engines and packages are used in each
    step.
    Warnings are given if information is missing or inconsistent.
  
    There is an engine argument to allow the TeX engine
    to be specified, but it will default from information in the
    LaTeX document if that exists.
  
    
    The typesetting step is probably where most user problems will occur, 
    most probably because the LaTeX fragment (or the LaTeX
    document) contains an error, so the call to a TeX engine fails.
    For example, the following code adds a $ to the 
    previous LaTeX fragment to produce an illegal LaTeX fragment.
  
When we build a complete LaTeX document and attempt to typeset it, we get an error from the TeX engine.
    The texFile argument is useful in cases like this because
    the .log file will be in the same location as the
    texFile, so we can easily see what the problem is.
    Using texFile is also a good way to
    easily get hold of the LaTeX document and run a TeX engine on it
    outside of R to debug the problem.
  
    'xdvir' automatically caches typsetting results. If the combination of
    LaTeX document, TeX engine, and LaTeX packages has been seen before
    then a cached "DVI" object will be used.
    Ths means that the expensive typesetting step
    should only occur once per unique TeX fragment (per session).
  
    For debugging problems, it can be useful to turn off caching 
    with options(xdvir.useDVIcache=FALSE).
  
    Because typeset() accepts a simple character vector,
    it is easy to author a LaTeX document outside of R and
    just readLines() that document to pass it to 
    typeset.
  
    It is also possible to perform the typesetting outside of R and
    just read a DVI file in using readDVI().
    The important thing here is to make sure that you produce a DVI file, e.g.,
    xelatex --no-pdf or
    lualatex --output-format=dvi.
  
    The render() function renders a "DVI"
    object.  This is an alias for grid.dvi() - the rendering
    happens within the current 'grid' viewport.
  
Rendering consists of two steps internally:
        We walk the list of DVI operations and build a list of 
        objects, mostly either "XDVIRglyphObjglyph" objects, 
        which describe
        glyphs, the fonts they come from, and where to draw them, 
        or "XDVIRruleObj" objects, 
        which correspond to rectangles to draw
        (the only core TeX drawing operation).        
      
This step also resolves font definitions and gathers bounding box information, which is vital for sizing and positioning the grobs that are produced in the second step.
        We walk the list of objects and build a gTree of grobs,
        mostly either "glyphgrob"s, "rect" grobs,
        or "segments" grobs.  Segments are used for
        very thin rectangles because very thin filled rectangles 
        tend to disappear on raster devices, including on screen.
      
        Consecutive glyph operations in the DVI file are collapsed into
        a single "glyphgrob", but we may end up with
        multiple "glyphgrob"s if, for example, 
        a mathematical equation includes a horizontal line.
        In that case, in order to preserve the order of drawing,
        we end up with a "glyphgrob" for one or more
        glyphs, before a "rect" for the line, then another 
        "glyphgrob" for one or more further glyphs.
      
    Justification of the rendering is complicated because we 
    potentially need to
    justify a combination of multiple "glyphgrob"s, plus
    lines and/or rectangles.
    The solution is to generate common "anchors" for all
    "glyphgrob"s based on the bounding box for the 
    rendering and to calculate an "offset" for lines and rectangles
    based on how much a "dummy" glyph positioned at the bottom-left of
    the bounding box would have to move to satisfy the justification.
  
    A margin and rotation can be specified
    for rendering and these are implemented via viewports on the
    overall gTree.
  
    It is possible to specify the packages and
    engine for rendering, though they will be taken
    from the "DVI" object if possible.
    Checks are made for inconsistency between the user specification
    and the signature in the "DVI" object, if there is one.
    Warnings are given if information is missing or inconsistent.
  
It is also possible to specify a font library, which is used to get glyph metrics (for calculating glyph positions and the overall bounding box). There is a default font library based on FreeType (The FreeType Project, 2025) and no other option is provided (on CRAN). Section Font libraries has more information.
    The render() function is vectorised.
    The dvi argument to render() can be
    a list of "DVI" objects, 
    which can each be drawn at different (x, y)
    locations, at different rotations,  and
    with different hjust and vjust
    justifications.
    The gp settings are also vectorised and applied
    to the corresponding "DVI" object.
    On the other hand, the packages, engine,
    and fontLib are fixed for all "DVI" objects
    in a single render() call.
  
By default, DVI positions and font metrics are used at maximum resolution (typically 72*2^16 scaled points per inch and 72*10*1000 units per inch, where 10 is the font point size, respectively) and R uses its default naive infinite resolution calculations to position glyphs and to determine dimensions, such as the bounding box.
    However, it is also possible to specify a dpi 
    resolution, in which case glyph locations and font dimensions
    are rounded to the nearest "pixel".  This uses an algorithm
    that produces identical results to dvitype 
    (Knuth, 1995)
    at multiple resolutions on an identical DVI file
    (see tests/dpi.R).
  
    Using dpi may make sense if you know the dpi of 
    the target graphics device, though I have yet to see convincing
    evidence of improvement.  This may be because the dpi
    rounding does
    not guarantee alignment with pixels on the device because there 
    are subsequent calculations on the rounded dpi locations
    to satisfy justification.
  
It is possible for a DVI file to contain multiple pages. While this is unlikely to be the typical case for LaTeX fragments, it could still occur if a LaTeX document is generated outside of R.
    In such a case, readDVI() will read all pages into
    a "DVI" object, but only the operations 
    on the page specified to render()
    will generate objects and operations on other pages will be ignored.
  
To be more accurate, some operations, such as positioning and setting of fonts will be listened to on all pages in case settings carry over across pages, but those operations will not generate any objects themselves.
    The creation of "glyphgrob"s is delayed, in the sense
    of happening within a makeContent() method,
    because the calculation of anchors and offsets for justification
    has to occur at drawing time, once the rendering viewport is known.
    This means that grob queries, such as grobWidth()
    incur a penalty because they have to generate final grobs
    (though not necessarily "glyphgrob"s).
  
    There can be a further layer of delay when calling 
    grid.latex() because the author step has to occur
    at drawing time if the width requires converting to
    inches and/or the current font has to be queried (so we need to
    know the final rendering viewport).
    This delay does not occur if width is not specified
    and/or it is in absolute units and gp
    is set to NULL (so the "current" font settings
    are ignored).  This additional delay results in additional
    penalties for grob queries.
  
'xdvir' provides predefined support for several LaTeX packages: fontspec, xcolor, preview, zref, and tikz (see the TikZ Section).
    There are obviously many more LaTeX packages that 
    users might want to make use of in their LaTeX fragments.
    One approach is to write an entire LaTeX document by hand, which
    means you can include whatever packages you need.
    However, this is not the most convenient and will not work,
    for example, with the 'ggplot2' interfaces geom_latex()
    and element_latex().
  
    So there is a mechanism for defining support for additional LaTeX packages
    in the form of the LaTeXpackage() function
    (plus the registerPackage() function).
  
    The name argument provides convenience.
    If the package is registered, then it can be referred to by this
    name (which is what previous examples have done with predefined packages).
  
    The preamble argument is LaTeX code that will be placed in
    the LaTeX document preamble in the authoring step.
    This is often just a \usepackage command, but 
    the preview package example earlier showed that other LaTeX
    code can be included.
  
    The prefix and suffix arguments are LaTeX code 
    that is wrapped around the LaTeX fragment in the authoring step.
    A typical use would be to wrap the LaTeX fragment in a LaTeX
    environment (like the preview package does).
  
    The special argument is a function.
    This is useful for packages that generate DVI specials, like xcolor.
    The following code authors and typesets a simple LaTeX fragment
    that draws text in red.
    The resulting DVI output contains DVI specials, which are marked
    as xxx1 operations.
    These have been  produced by the xcolor package
    and contain information about changes in colour.
    For example color push gray 0 sets the text colour
    to black and color push rgb 1 0 0 sets it to red.
  
    The rendering step calls the special function
    (if not NULL) for every registered package,
    providing the content of the special (e.g., color push gray 0)
    as a character value, plus a state object.
    The latter is just an environment that can be used to maintain 
    state variables during the walk of DVI operations.
    For example, the special function for the xcolor
    package (in 'xdvir') maintains a stack of colours
    based on specials that start with color (and ignores
    all other specials).
  
    The init and final arguments are also functions
    that are called at the start and end of walking the DVI operations.
    These are passed a state object and allow initialisation 
    of state variables and anything else the package wants to do.
  
    In summary, the LaTeXpackage() function allows the user
    to specify LaTeX code that will be added to a LaTeX fragment during the
    authoring step and a function to handle specials during the rendering
    step that the LaTeX package
    generated in DVI output during the typesetting step.
  
    The registerPackage() function is used to 
    register the package with 'xdvir' so that it can be referred to by name.
  
The support for the tikz package in 'xdvir' is several orders of magnitude more complex than the support for other packages, so deserves its own mention.
The LaTeX package TikZ (Tantau, 2013) provides a sophisticated graphics system within the LaTeX world. It generates graphical output by producing specials in the DVI output. For example, consider the TikZ fragment below (it is constructed in a slightly odd way to avoid 'knitr' errors). This describes a simple TikZ diagram that draws an arrow from "a" to "b".
    The following code augments that to a full LaTeX document,
    including additional LaTeX code to load the TikZ package
    and wrap the fragment within a tikzpicture
    environment.
    The \def\pgfsysdriver command is very important
    because this provides a backend for the TikZ package 
    that outputs TikZ specials designed for consumption by 'xdvir'.
    Without that command, TikZ would output specials consisting of
    PDF commands.
    This is another example of a detail that would have to be
    replicated if we author a complete LaTeX document (that included
    a TikZ picture) manually for use with 'xdvir'.
  
    Typesetting this document generates a large number of special
    operations in the DVI output, all starting with xdvir-tikz::
    so that the special function for the tikz package in 'xdvir' 
    will identify them and handle them.
  
    The special function for the tikz package in 'xdvir'
    calls a large number of other functions to maintain a lot of 
    state to keep track of where drawing needs to occur.
    It also generates a number of special "XDVIRtikz*"
    objects (that are not glyph or rule objects that normal 
    DVI operations produce) and 
    there are functions to convert those special tikz objects into
    'grid' grobs (in the second stage of the rendering step).
  
    At the high level, all of this internal activity is hidden and we can just
    call grid.latex() with the TikZ fragment and make sure to
    load the tikz package support, as shown below.
  
    The 'xdvir' package has predefined support for 
    the XeTeX engine and the LuaTeX engine, though it insists on
    a recent version of the latter so that DVI output contains
    font file paths in font definitions and glyph indices in 
    set_char operations.
  
    This means that 'xdvir' cannot work with DVI output that is 
    generated by, for example, pdfTeX or upTeX, because those engines generate
    DVI output with font definitions that require resolving more complex 
    font file mappings and set_char operations that
    use a variety of different encodings.
  
    The door is left ajar for future development of support for other
    engines by providing the TeXengine() function
    (and the registerEngine() function).
  
    The name argument provides an easy way to refer to the
    engine if it is registered with 'xdvir'.
  
    The version argument is a function (with no arguments)
    that should return the version of the engine.
    This is used to construct comments in the authoring step and
    signatures in the typesetting step to check for consistency
    between steps.
  
    The command is a character value that gives the
    engine command (e.g., "xelatex") and options
    are any additional options required to produce DVI output
    (e.g., "--no-pdf").
    dviSuffix gives the suffix used for DVI output files
    (e.g., .xdv).
  
isEngine is a function that is called with a
    DVI object and should return whether the engine produced that 
    DVI object (used to help check for consistency).
  
    The preamble is LaTeX code that should be added
    in the authoring step.
  
    The fontFile and glyphIndex arguments
    are functions called during the rendering step to
    convert DVI operations into objects.
    fontFile is called with the content of a DVI font definition 
    (as a character value) and should return a font file path.
    glyphIndex is called with the content of a 
    set_char operation (as a set of raw bytes) and should
    return an integer glyph index.
  
    The implementation of fontFile and glyphIndex
    for XeTeX and (acceptable) LuaTeX is relatively straightforward,
    but these functions offer a possible path for other engine support, even if
    that support may require much more complex implementations.
  
    The 'xdvir' package relies on FreeType to access glyph
    metrics (for glyph placement and for bounding box calculations).
    The FreeType support is defined via an internal 
    FontLibrary() function that requires three functions
    to return glyph metric information.
    That function is not currently exported, but 'xdvir' internally 
    defines a different font library based on TTX 
    (FontTools Project, 2025), which
    can be useful for debugging.
  
The 'xdvir' package provides a convenient high-level interface for using LaTeX fragments as plot labels and annotations in R. This document describes some of the details and internal design of the package, which may be helpful for diagnosing problems when things go wrong and possibly for extending the package.
Rendering is slow. The typesetting step requires at least one run of a TeX engine. Caching helps the second time around, but a plot that contains multiple unique TeX fragments can be glacial.
As mentioned previously, 'xdvir' only supports XeTeX and LuaTeX only if luaotfload-tool verion is > 3.15.
    'xdvir' depends on R >= 4.3 because it makes use of the
    glyph-rendering feature that was added in that version.
    This extends to requiring a graphics device that provides
    glyph-rendering, which currently includes
    pdf(),
    quartz(),
    the Cairo-based devices such as
    cairo_pdf() and
    png(type="cairo"),
    plus devices from the 'ragg' package (Pedersen and Shemanarev, 2024).
  
    The PDF produced by
    pdf() 
    does NOT embed the fonts, so viewers may show 
    garbage glyphs.
    The embedGlyphs() function can be used
    to embed glyphs, but a little effort is required to extract the
    glyph information that embedGlyphs() requires
    from the 'grid' gTree that rendering produces.
  
The 'xdvir' package is an evolution of the 'dvir' package (Murrell, 2018, Murrell, 2020b), which only exists on github. 'xdvir' has borrowed a lot of code from 'dvir' (especially for TikZ support; Murrell, 2020a), but streamlined some of the user interface, added integration with 'ggplot2', and massively simplified the handling of font definitions and glyph specifications in DVI output (by insisting that they contain font file paths and glyph indices, respectively). The 'xdvir' package also benefits from the glyph rendering support in R (from version 4.3), whereas 'dvir' had to struggle with the simplistic R text drawing of character values.
A number of R packages and other software perform a similar job to 'xdvir', but differ in various ways. The following list focuses on the negatives of these other approaches, but they almost all have the advantage of NOT having massive dependencies (like 'xdvir' does with its need for a TeX installation) and NOT being slow (like 'xdvir' is with its typesetting step).
'latex2exp' converts LaTeX (mathematical equation) fragments to R expressions, to be drawn using plotmath. This helps LaTeX users to write plotmath expressions, but the end result is plotmath, which produces an inferior result to real LaTeX. It also only supports LaTeX mathematical equations.
'tikzDevice' converts R graphics into LaTeX graphics. This is basically the inverse of 'xdvir'. If you want to end up in a LaTeX document, this might be the way to go. If you want to end up somewhere else, then 'xdvir' might be for you.
'marquee' typesets and renders markdown. The main difference is markdown rather than LaTeX. Markdown can be easier to type, but LaTeX gives more control.
There are some R packages that perform specific tasks, like 'geomtextpath' (text along a path) and 'ggpage' (paragraphs of text), but they tend to be much more limited in their scope.
In the Python world, there is a flag in matplotlib that switches all plot labelling to use LaTeX. This uses a similar approach to 'xdvir' by generating and reading DVI output. The main difference is where you want to produce your plots: Python/matplotlib versus R/ggplot2.
The dvi-decode javascript module renders DVI files in a web browser. It is quite general in theory, but faces challenges with resolving fonts. This is also not integrated with any plot-drawing library.
It is apparently possible to use LaTeX in plot titles (at least) in the COMSOL software, though this is a commercial product so I have not tried it myself.
The examples and discussion in this report relate to R version 4.4.2 and 'xdvir' version 0.1-2.
This report was generated within a Docker container (see Resources section below).
Murrell, P. (2025). "LaTeX Typesetting in R: The 'xdvir' Package" Technical Report 2025-01, Department of Statistics, The University of Auckland. Version 1. [ bib | DOI | http ]

This document
    by Paul
    Murrell is licensed under a Creative
    Commons Attribution 4.0 International License.