Using Computer Modern Fonts in R Graphics
by Paul Murrell

Many R users (especially on Linux) produce documents and reports using LaTeX for typesetting and PDF or PostScript as the final output format. If the document includes a plot produced by R, it can be useful to use the same font in the plot as is used in the main text of the document. The default font in LaTeX is Donald Knuth's Computer Modern, so the issue is how to use Computer Modern fonts in R (PDF and PostScript) graphics.

The Problem

R allows the user to specify special fonts for graphics (see "Fonts, Lines, and Transparency" in volume 4(2) of http://cran.r-project.org/doc/Rnews), but the Computer Modern fonts are special in a couple of ways:

  1. The Computer Modern fonts have special TeX-specific encodings.

    For example, character 60 [character 074 in octal, 0x3C in hex] in the "plain" text Computer Modern font is an upside-down exclamation mark, instead of the more typical ASCII less-than sign [<].

    Worse, many of the different Computer Modern fonts have their own special encoding: for the typewriter font, character 60 is the usual less-than sign; for the math symbol font, character 60 is the Fraktur R character.

    Special encodings are not necessarily a problem, because it is possible to tell R about the encoding for a font. What is a problem is that, for the symbol face in R graphics text (face 5), R assumes that the symbol font uses the Adobe Symbol encoding. If you supply a font for the symbol face which is not in Adobe Symbol encoding, the wrong characters will be drawn.

  2. Each Computer Modern font contains only 128 characters.

    For example, the plain text fonts do not contain a less-than sign, and the math symbol font does not contain an equals sign!

    This means that, in order to draw text using a very standard, unexotic set of characters, you have to use a combination of more than one font.

    In R graphics, it is only possible to specify a single font, and in fact only a single font face, for drawing a piece of text. This restriction is lifted somewhat when drawing mathematical annotations (see ?plotmath), where it is possible to use different font faces, and the symbol face is automatically employed to draw special mathematical symbols. But this is still not enough; even to draw quite normal text using a Computer Modern font, you need more than one font.

    The real killer is that, because the symbol face is assumed to be in Adobe Symbol encoding, the font provided for the symbol face must include all of the characters defined in the Adobe Symbol encoding. And yes, the Adobe Symbol encoding includes both a less-than sign and an equals sign (among many others).

In summary, the problem is that you need to combine several of the Computer Modern fonts together in order to produce even plain text, and if you want to produce any mathematical equations, you need to combine several Computer Modern fonts using Adobe Symbol encoding (i.e., with the characters combined in a very special order).
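
The encoding mismatch described above comes down to which glyph a font places at a given character code. As a small illustration in base R (nothing here touches the Computer Modern fonts themselves), the following sketch shows the code of the ASCII less-than sign in the three notations used above. The lookup table only restates the examples given in the text; its names are labels chosen for this illustration.

```r
# The ASCII less-than sign has character code 60.
code <- utf8ToInt("<")
code

# The same code in the octal and hex notations used in the text.
format(as.octmode(code))   # octal
format(as.hexmode(code))   # hex

# What a font draws at that code depends on the font's encoding.
# This table just restates the examples from the text; the names
# are labels made up for this illustration.
glyph_at_60 <- c(
  "ASCII"                = "less-than sign",
  "CM plain text (TeX)"  = "upside-down exclamation mark",
  "CM typewriter (TeX)"  = "less-than sign",
  "CM math symbol (TeX)" = "Fraktur R"
)
glyph_at_60
```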

The Solution

NOTE: There is a partial solution implemented for R PostScript output (see ?postscript), but that is an incomplete solution and is basically a nasty, special-case hack in the C code. It doesn't help at all for PDF output.

The solution for producing plain text is actually quite straightforward. All of the comments in the previous section about the peculiarity of the Computer Modern fonts were largely "straw man" comments (i.e., totally unfair). They are true of the Type 1 Computer Modern fonts that are distributed as part of a standard LaTeX setup (i.e., Computer Modern fonts designed for use with LaTeX, which of course has no problem with TeX-specific encodings and character sets), but those are not the only Type 1 Computer Modern fonts around.

The TeX package cm-lgc contains Type 1 Computer Modern fonts which are set up for use with non-TeX encoding schemes; they contain more complete character sets. All we need to do is download (and possibly install) these fonts; then we can produce normal text in R PostScript and PDF graphics output (see the examples below).

Another set of Computer Modern fonts with non-TeX encodings is provided by the cm-super package. Andrey Paramonov reports that this solves a problem with cm-lgc where Cyrillic encodings lack the "minus" character (thanks Andrey!).

The solution for producing mathematical annotation is harder, but is mostly solved. This has required creating a new, customised Computer Modern font containing appropriate Adobe Symbol characters in the Adobe Symbol encoding. You can read about the gory details if you like, but the important part is that you can download an AFM file and a PFB file for use as the symbol face in a Computer Modern family of fonts for producing mathematical annotation in PostScript and PDF graphics output (see the examples below).

Some Examples

Suppose that cm-lgc has been downloaded and unzipped in the current directory (this creates a directory called cm-lgc containing various other directories and files). Suppose also that the AFM and PFB files for the new cmsyase font are in the current working directory. The following code produces the mathematical annotation demo using Computer Modern fonts (updated 2010-06-11; tested on R 2.12.0).

# Define a Type 1 font family from five AFM metric files, in the order
# plain, bold, italic, bold italic, symbol.
CM <- Type1Font("CM",
                c("cm-lgc/fonts/afm/public/cm-lgc/fcmr8a.afm",
                  "cm-lgc/fonts/afm/public/cm-lgc/fcmb8a.afm",
                  "cm-lgc/fonts/afm/public/cm-lgc/fcmri8a.afm",
                  "cm-lgc/fonts/afm/public/cm-lgc/fcmbi8a.afm",
                  "./cmsyase.afm"))
# Draw the plotmath demo to a PDF file using the new font family.
pdf("destructiontest.pdf", family=CM)
demo(plotmath)
dev.off()
# Embed the fonts in the PDF file (this step runs ghostscript).
embedFonts("destructiontest.pdf",
           fontpaths=c("cm-lgc/fonts/type1/public/cm-lgc/", "."))

A complete test of the new cmsyase font is produced by this R code. The output using the standard Adobe Helvetica and Symbol fonts looks like this and the output using the cm-lgc and cmsyase fonts looks like this.
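
Because the font definition above depends on five metric files being in the assumed locations, it can help to check for them before defining the family. The following is a minimal sketch under the same directory layout as the example; missing_files() is a hypothetical helper written for this illustration, not a function provided by R.

```r
# Paths follow the directory layout assumed in the example above.
afm_dir <- "cm-lgc/fonts/afm/public/cm-lgc"
metrics <- c(file.path(afm_dir, c("fcmr8a.afm", "fcmb8a.afm",
                                  "fcmri8a.afm", "fcmbi8a.afm")),
             "./cmsyase.afm")

# missing_files() is a hypothetical helper (not part of R):
# it returns the subset of 'paths' that do not exist on disk.
missing_files <- function(paths) {
  paths[!file.exists(paths)]
}

absent <- missing_files(metrics)
if (length(absent) > 0) {
  message("Font metric files not found:\n  ",
          paste(absent, collapse = "\n  "))
} else {
  # Only define the font family once all five metric files are present.
  CM <- Type1Font("CM", metrics)
}
```

If any file is reported as missing, the paths probably need adjusting to wherever cm-lgc was actually unpacked.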

Embedding the Fonts

An extra complication arises with producing PDF and PostScript graphics output from R because R does not embed fonts. This is not a problem if you just use one of the standard 14 Adobe fonts (Helvetica, Times, Courier, etc.), but requires an additional step if you use a special font, like Computer Modern. This will not stop you producing a PDF or PostScript file from R, but it may prevent you from viewing or printing the file correctly, because the viewer or printer has to find the font somewhere else.

One way to view a file produced by R that contains a special font is to use ghostscript and just tell it where the font files are. For example ...

GS_FONTPATH=cm-lgc/fonts/type1/public/cm-lgc:. gs -sDEVICE=x11 yourfile.pdf

For printing (or inclusion in other documents), ghostscript can be used to do the embedding. For example ...

GS_FONTPATH=cm-lgc/fonts/type1/public/cm-lgc:. gs -dNOPAUSE -dBATCH -sDEVICE=pdfwrite -sOutputFile=yourfileembed.pdf yourfile.pdf

This creates a file that has the fonts embedded, so you can send it to a printer or to a friend and the printer or friend's viewer software will be able to find the fonts in the file.

R now has an embedFonts() function that will do this font embedding (via ghostscript) from within an R session.
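
As a rough R counterpart to the second ghostscript command above, the sketch below writes an embedded copy to a separate file using the outfile argument of embedFonts(). The wrapper embed_cm() is a hypothetical helper written for this illustration; it assumes the same font directories as the examples above and that ghostscript is installed where R can find it.

```r
# embed_cm() is a hypothetical wrapper around grDevices::embedFonts().
# It writes a copy of 'infile' with fonts embedded to 'outfile' and
# returns TRUE on success, or FALSE if the input file is missing or
# the embedding step fails.
embed_cm <- function(infile, outfile,
                     fontpaths = c("cm-lgc/fonts/type1/public/cm-lgc/", ".")) {
  if (!file.exists(infile)) {
    return(FALSE)
  }
  tryCatch({
    # embedFonts() runs ghostscript; it signals an error if ghostscript
    # cannot be found or exits with a failure status.
    embedFonts(infile, outfile = outfile, fontpaths = fontpaths)
    TRUE
  }, error = function(e) {
    message("Embedding failed: ", conditionMessage(e))
    FALSE
  })
}

embed_cm("yourfile.pdf", "yourfileembed.pdf")
```

Writing to a separate output file leaves the original graphic untouched, which is convenient if the fonts are later to be embedded once in an overall document instead.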

For inclusion of R graphics files in LaTeX documents, especially when working with a large report or a book, embedding the fonts in each graphics file is inefficient; it is better to embed the fonts once in the overall file. I had suspected that dvips or pdflatex could do the font embedding, and Daniel Sabanes Bove has subsequently provided a description of how to do this (I have not tried this out myself yet, but intend to soon; thanks Daniel!):

The *.afm metric files in cm-lgc/fonts/afm/public/cm-lgc (which I moved to the directory fonts/metrics) and the *.pfb outline files in cm-lgc/fonts/type1/public/cm-lgc (which I moved to the directory fonts/outlines) are necessary. The respective files of the CMSYASE font are moved to the new directories too. The encoding file cm-lgc/dvips/base/8r-mod.enc and the mapping file cm-lgc/dvips/config/cm-lgc.map go in the fonts directory, say. Everything else from the cm-lgc package is superfluous for our purpose.

We are almost done; only the mapping file needs some changes. The files have to be named relative to the compilation directory. Adding the prefixes fonts/ and fonts/outlines/ to 8r-mod.enc and *.pfb, respectively, accomplishes that. The CMSYASE font can be found by dvips if we add the line

CMSYASE        <fonts/outlines/cmsyase.pfb

to fonts/cm-lgc.map.

After configuring the correct paths in the font description in the Rnw file myfile.Rnw, we can run R with Sweave, run latex on the result, and process the dvi file with the command

dvips -Ppdf -u +fonts/cm-lgc.map myfile.dvi

Finally, we can ensure that ps2pdf embeds the necessary fonts by executing

ps2pdf14 -dSubsetFonts=true -dEmbedAllFonts=true myfile.ps
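
The changes to the mapping file described above could also be scripted. The following is only a sketch: add_map_prefixes() is a hypothetical helper, the sample map line is invented for illustration (the real entries in cm-lgc.map may look different), and it assumes that each file reference is written with no space after the < character.

```r
# add_map_prefixes() is a hypothetical helper (not part of any package).
# It rewrites file references in dvips map lines so that they are
# relative to the compilation directory.
add_map_prefixes <- function(lines) {
  # "<8r-mod.enc" becomes "<fonts/8r-mod.enc"
  lines <- gsub("<8r-mod.enc", "<fonts/8r-mod.enc", lines, fixed = TRUE)
  # "<name.pfb" becomes "<fonts/outlines/name.pfb"; references that
  # already contain a directory separator are left alone.
  gsub("<([^ </]+\\.pfb)", "<fonts/outlines/\\1", lines)
}

# An invented map line for illustration only; the real entries in
# cm-lgc.map may differ.
example <- 'fcmr8a FontName "Enc ReEncodeFont" <8r-mod.enc <fcmr8a.pfb'

# Prefix the existing entries and append the extra line for CMSYASE.
new_lines <- c(add_map_prefixes(example),
               "CMSYASE        <fonts/outlines/cmsyase.pfb")
writeLines(new_lines)
```

In practice the input would come from readLines("fonts/cm-lgc.map") and the result would be written back with writeLines(); it is worth inspecting the output by eye before overwriting the file.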

Sweave Tutorial

Here is a simple demonstration of how to use Computer Modern fonts within an Sweave document: the Sweave file and the resulting PDF document.

Type 1 Fonts

Type 1 fonts are the main sort of fonts that can be used for PostScript and PDF output in R. The fonts used by LaTeX to produce dvi output are actually in a different format, Metafont. LaTeX distributions provide Type 1 versions of the fonts because these give nicer results when you produce PostScript or PDF output from LaTeX (e.g., using dvips or pdflatex).