Slide 1: Slide 2: 'ggplot2' is an example of an R package that can be used to generate data visualisations. This is my "Bilbo Baggins" slide: start with something simple and uncontroversial that makes people comfortable. Slide 3: This diagram shows how the 'ggplot2' package fits into the ecosystem of R graphics. The most important feature of this diagram is the 'grDevices' package, which represents the R "graphics engine". The next most important feature is the fact that all other graphics packages pass through 'grDevices' to get to the graphics devices, which produce final output. For example, 'ggplot2' talks to the 'grid' package, which talks to the R graphics engine. This means that the R graphics engine represents a bottleneck; graphics packages can only do things that the R graphics engine allows them to do. Graphical output is limited to the (limited) vocabulary of the R graphics engine. Slide 4: 'ggplot2' constructs a scatter plot from rectangles ... Slide 5: ... lines ... Slide 6: ... points ... Slide 7: ... and text. Slide 8: This is just a repeat of the complete 'ggplot2' plot to show the final result of combining rectangles, lines, points, and text. Slide 9: The 'riverplot' package is another example of an R package that can be used to generate data visualisations. For example, it can be used to draw Sankey diagrams. Slide 10: Like 'ggplot2', the 'riverplot' package has to go through the R graphics engine to produce output on the graphics devices. In this case, 'riverplot' talks to the 'graphics' package, which talks to the R graphics engine. Slide 11: One of the features of the Sankey diagrams that the 'riverplot' package produces is the colour gradient that is used to fill the "edges" in the diagram. The 'riverplot' package must describe this colour gradient using the vocabulary of the graphics engine. Slide 12: The R graphics engine does (did) not understand colour gradients, but it does understand filling polygons with a solid colour. 
Because of the limitations of the R graphics engine vocabulary, the 'riverplot' package is forced to describe the colour gradient in terms of a series of small polygonal slices, each filled with a slightly different colour. Slide 13: It might make life easier for the 'riverplot' package if the R graphics engine vocabulary included the ability to fill a single polygon with a colour gradient. Slide 14: So what is new in the R graphics engine for R version 4.1.0? I think you might be able to guess at least part of what is coming ... Slide 15: Changes to the R graphics engine in R 4.1.0 mean that the R graphics engine has an expanded vocabulary. One new thing that we can do is to define a linear gradient fill. This means that we can fill a shape with a colour gradient. In this case, we have a gradient that transitions smoothly from white at the bottom-left corner to transparent at the top-right corner. The code shown in this section is just here to make it seem real. We will describe the code interface properly a bit later on. Slide 16: Another new thing that we can do is to define a radial gradient. Again, this means that we can fill a shape with a colour gradient, just a different style of gradient. In this case, we have a gradient that transitions smoothly from white at the centre of the filled region to transparent at the edges. Slide 17: Another new thing we can do is to define a pattern fill. This allows us to fill a shape with a repeating pattern. In this case, we define a pattern consisting of a single filled circle and fill a rectangular region by repeating that circle. Slide 18: Another new thing we can do is to define a clipping path. This means that we can limit the drawing of one shape to the region bounded by another shape. In this case, we define the clipping path to be a circle, then we draw a filled rectangle. With the clipping path in place, only the part of the rectangle that lies inside the circle is drawn. 
The significant thing here is that the clipping region is NOT a rectangle. The R graphics engine could always clip to a simple rectangle. Slide 19: The final new thing we can do is to define a mask, or more specifically, a transparency mask. This means that we can define the transparency of one shape based on the transparency of another shape. In this case, we define the mask to be a semi-transparent circle (on a fully transparent background) and we draw a filled rectangle. The transparency of the mask is transferred to the rectangle, so the parts of the rectangle that lie within the circle become semi-transparent and the parts of the rectangle that lie outside the circle become fully transparent. Masking can be viewed as more sophisticated clipping. Or clipping can be viewed as a simple form of masking. Slide 20: What do these new features get us? What can people do with the new R graphics engine vocabulary? Slide 21: We have already seen with the 'riverplot' package that there are graphical features that people want to produce even if they have to work quite hard to express what they want to do in terms of the limited R graphics engine vocabulary. There are other packages in the same boat as 'riverplot'. Here we see an example of a pattern fill from the 'ggpattern' package. Like 'riverplot', 'ggpattern' is forced to use polygons to achieve its result and it might benefit from the R graphics engine understanding pattern fills. Slide 22: In the case of 'riverplot' and 'ggpattern', when faced with the limitations of the R graphics engine, the solution has been to just work harder to achieve the desired result. Another option, when faced with the limitations of the R graphics engine, is to stop using R and start using something that can do what we want. 
When this option results in people manually adjusting an image in something like Adobe Illustrator, and NOT producing their image entirely in code, we lose a lot of good things, like reproducibility, version control, and sharing. This image, from the "American Soccer Analysis" website, is actually an example where the authors have worked very hard to produce the entire result from R code, albeit with some trial-and-error to fine-tune the positioning of some elements of the image. Expanding the vocabulary of the R graphics engine can help to make it easier for people to create complete images from R code. Slide 23: There are a number of R packages that allow us to import graphics from other systems and draw them as part of an R data visualisation. This can be difficult to achieve if an image from an external system contains a graphical feature that is beyond the limited R graphics engine vocabulary. In this image, on the left, we have the R logo, which contains a subtle colour gradient in both the "hoop" and the "R". On the right, we have added the R logo to a 'ggplot2' scatter plot, INCLUDING the subtle gradient fills. Slide 24: Expanding the R graphics engine vocabulary expands the expressiveness of the R graphics engine. It is not up to me to figure out what people will try to do with R graphics. This image makes use of masks based on semi-transparency gradients to "fade out" some of the curves. Perhaps the new graphics engine vocabulary will inspire even more creativity from Rtists like Danielle Navarro. Slide 25: In this section, we will look at the new functions in R that provide the user interface for the new graphics vocabulary. We will also look a bit at how these new graphical features are defined and how they work because they may not be familiar to everyone. Slide 26: If you want to try out the new R graphics engine vocabulary, for now you will have to work with the interface that is provided by the 'grid' package. 
Slide 27: In particular, there is no 'graphics' interface and there is no 'ggplot2' interface (yet). You can retroactively modify 'ggplot2' or 'graphics' data visualisations to add some of the new features, but that requires a good knowledge of 'grid', so we will not address that here; the "Resources" slide has links to documents that include some examples of this sort. Slide 28: We use the linearGradient() function to define a linear gradient and then we can provide the resulting object as the value for 'fill' in a call to gpar(). It is possible to fill a single shape or we can make the linear gradient the default fill by specifying it for a viewport. Slide 29: A linear gradient is defined by a start point and an end point (indicated by red dots), plus two or more colours at "stops" along the straight line between the end points (indicated by vertical red lines). In this case, the start point is one quarter of the width from the left edge of the region being filled and the end point is at the right edge of the region being filled. There are three stops, one at the start point, one half-way between the start and end points, and one at the end point. The three colours at the three stops are black, white, and black. The region being filled is a rectangle. If the end points are within the limits of the shape that is being filled, we can also control what happens outside the end points (e.g., pad, repeat, reflect, ...). Slide 30: We use the radialGradient() function to define a radial gradient and then we can provide the resulting object as the value for 'fill' in a call to gpar(). It is possible to fill a single shape or we can make the radial gradient the default fill by specifying it for a viewport. Slide 31: A radial gradient is defined by a start circle and an end circle (indicated by red circles), plus two or more colours at "stops" between the two circles. 
In this case, the start circle is very small and near the top-right corner of the region being filled and the end circle is the same diameter as and centred on the region being filled. There are just two stops, white at the start circle and black at the end circle. The region being filled is a square. Slide 32: We use the pattern() function to define a pattern and then we can provide the resulting object as the value for 'fill' in a call to gpar(). It is possible to fill a single shape or we can make the pattern the default fill by specifying it for a viewport. Slide 33: A pattern is defined by a 'grid' grob, which describes one or more shapes. We can also specify the size of the pattern within the region that is being filled. In this case, we are defining a pattern based on a single circle (shown by the red circle), but we specify the size of the pattern to be less than the diameter of that circle (shown by the red rectangle). This means that the pattern is actually four small arcs from the perimeter of the circle. We also specify that the pattern repeats within the region being filled and this produces a series of stars or diamonds within the filled region. Slide 34: We define a clipping path by specifying a 'grid' grob as the value for the 'clip' argument in a 'grid' viewport. This enforces the clipping path until we pop the viewport or push another viewport with another clipping path or with clipping turned off. Slide 35: A clipping path is defined by a 'grid' grob, which can be as simple as a single shape, but could also be as complicated as an entire data visualisation. In this case, we have a circle grob that describes two disjoint circles, so the clipping path is the union of the two circular regions (shown by two red circles). When we draw a rectangle filled with black (shown by the red rectangle), the result is just the parts of the filled rectangle that lie within one of the two circles. 
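As a rough companion to Slides 28 to 35, here is a minimal sketch of the four definitions described above. It is not the code from the slides: the positions, radii, and sizes are guesses chosen to resemble the descriptions, the variable names are made up for illustration, and it assumes R 4.1.0 or later. Output goes to a temporary PDF file because pdf() is one of the devices that supports the new vocabulary.

```r
library(grid)  # the functions below need R >= 4.1.0

# Slide 29: start point a quarter of the way in, end point at the right edge,
# with black, white, black stops at 0, 0.5 and 1 along that line.
lin <- linearGradient(colours = c("black", "white", "black"),
                      stops = c(0, 0.5, 1),
                      x1 = 0.25, y1 = 0.5, x2 = 1, y2 = 0.5,
                      extend = "pad")

# Slide 31: small start circle near the top-right, end circle centred on the
# region being filled (positions and radii are assumed values).
rad <- radialGradient(colours = c("white", "black"),
                      cx1 = 0.8, cy1 = 0.8, r1 = 0.02,
                      cx2 = 0.5, cy2 = 0.5, r2 = 0.5)

# Slide 33: a circle grob as the tile, with the tile smaller than the
# circle's diameter, repeated across the filled region.
tile <- pattern(circleGrob(r = unit(5, "mm"), gp = gpar(fill = "black")),
                width = unit(8, "mm"), height = unit(8, "mm"),
                extend = "repeat")

# Slide 35: one circle grob that describes two disjoint circles.
clip_path <- circleGrob(x = c(0.3, 0.7), r = 0.15)

out_file <- tempfile(fileext = ".pdf")
pdf(out_file)

# Page 1: three rectangles, one per kind of fill, passed via gpar(fill = ).
grid.newpage()
grid.rect(x = 1/6, width = 0.3, height = 0.8, gp = gpar(fill = lin))
grid.rect(x = 3/6, width = 0.3, height = 0.8, gp = gpar(fill = rad))
grid.rect(x = 5/6, width = 0.3, height = 0.8, gp = gpar(fill = tile))

# Page 2: a black rectangle drawn while the clipping path is in force.
grid.newpage()
pushViewport(viewport(clip = clip_path))
grid.rect(width = 0.8, height = 0.3, gp = gpar(fill = "black"))
popViewport()  # popping the viewport ends the clipping path

invisible(dev.off())
```

Opening out_file in a PDF viewer should show the fills on the first page and the clipped rectangle on the second.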
Slide 36: We define a mask by specifying a 'grid' grob as the value for the 'mask' argument in a 'grid' viewport. This enforces the mask until we pop the viewport or push another viewport with another mask or 'NULL' (which represents no mask). Slide 37: A mask is defined by a 'grid' grob, which can be as simple as a single shape, but could also be as complicated as an entire data visualisation. In this case, we have a circle grob that describes two disjoint circles, one with an opaque fill and one with a semi-transparent fill (shown by two red circles). When we draw a rectangle filled with black (shown by the red rectangle), the opacity of the circles is transferred to the rectangle so the result is just the parts of the filled rectangle that lie within one of the two circles, with one part opaque and the other part semi-transparent. Slide 38: The new vocabulary can be combined to produce interesting results. In this case, we define a linear gradient that transitions from transparent at the bottom of the region being filled to white at the top. We then define a mask that is based on a rectangle shape filled with the linear gradient. This produces a mask that transitions from transparent to opaque, from bottom to top. The mask is shown being applied to a 'ggplot2' scatter plot, which becomes more translucent towards the bottom of the scatter plot. Slide 39: The new vocabulary can be combined to produce interesting results. In this case, we define a clipping path based on two overlapping circles. We then define a mask based on a rectangle shape that is filled with a semi-transparent white and drawn within a viewport that enforces the clipping path. This produces a semi-transparent mask that is the shape of the union of the overlapped circles. The mask is shown being applied to a filled rectangle. The result is just the part of the rectangle that lies within the union of the overlapped circles, made semi-transparent. 
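The masks on Slides 37 to 39 can be sketched along the following lines. This is an illustration under assumptions rather than the code behind the slides: the coordinates and variable names are invented, random points stand in for the 'ggplot2' scatter plot on Slide 38, and R 4.1.0 or later is assumed. As before, the output is written to a temporary PDF file.

```r
library(grid)  # masks need R >= 4.1.0

out_file <- tempfile(fileext = ".pdf")
pdf(out_file)

# Slide 37: two disjoint circles, one opaque and one semi-transparent.
circle_mask <- circleGrob(x = c(0.3, 0.7), r = 0.15,
                          gp = gpar(col = NA,
                                    fill = c("black", rgb(0, 0, 0, 0.5))))
grid.newpage()
pushViewport(viewport(mask = circle_mask))
grid.rect(width = 0.8, height = 0.3, gp = gpar(fill = "black"))
popViewport()

# Slide 38: a rectangle filled with a gradient from transparent (bottom)
# to white (top) gives a mask that fades drawing out towards the bottom.
fade <- linearGradient(c("transparent", "white"),
                       x1 = 0.5, y1 = 0, x2 = 0.5, y2 = 1)
fade_mask <- rectGrob(gp = gpar(col = NA, fill = fade))
set.seed(1)
grid.newpage()
pushViewport(viewport(mask = fade_mask))
# Random points as a stand-in for the scatter plot in the slide.
grid.points(x = runif(50), y = runif(50), default.units = "npc", pch = 16)
popViewport()

# Slide 39: a semi-transparent white rectangle, drawn inside a viewport
# whose clipping path is two overlapping circles, used as the mask.
overlap_clip <- circleGrob(x = c(0.4, 0.6), r = 0.2)
clipped_mask <- rectGrob(gp = gpar(col = NA, fill = rgb(1, 1, 1, 0.5)),
                         vp = viewport(clip = overlap_clip))
grid.newpage()
pushViewport(viewport(mask = clipped_mask))
grid.rect(width = 0.8, height = 0.6, gp = gpar(fill = "black"))
popViewport()

invisible(dev.off())
```

Only the transparency of each mask matters here, which is why the fill colours of the mask grobs (black or white) do not show up in the result.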
Slide 40: Although there is no official 'ggplot2' interface to the new R graphics engine vocabulary, the 'gggrid' package (only on GitHub at present) makes it easy to combine the new 'grid' interface with 'ggplot2' data visualisations. In this case, we use the 'grid' interface to define a radial gradient that transitions from transparent at the centre of the region being filled to black at the edges. We then use the grid_panel() function from the 'gggrid' package to add a 'grid' rectangle filled with the radial gradient to the plot region of a 'ggplot2' scatter plot. Slide 41: This diagram attempts to show how the 'gggrid' package provides a way to combine the 'grid' interface to the new R graphics engine vocabulary with 'ggplot2' data visualisations. Slide 42: This section looks at work that remains to be done. This section could be skipped if there is insufficient time. Slide 43: The new R graphics engine vocabulary has only been implemented so far for Cairo-based devices and the pdf() device. All other graphics devices just ignore the new vocabulary. Slide 44: This diagram attempts to show that only some graphics devices currently support the new R graphics engine vocabulary. Some built-in graphics devices, like the postscript() device, and all third-party graphics devices, like 'ragg', just ignore the new vocabulary. Slide 45: It would be very useful to expand support for the new R graphics engine vocabulary to more graphics devices. The quartz() device would be a big one for macOS users. Support for the 'ragg' device would also be great, given that it is cross-platform and given its importance to the RStudio IDE. NOTE that there is no hope of adding support for the current windows() device for Windows users because the 'graphapp' system, upon which the windows() device is built, has its own limited vocabulary. There are a couple of large openings here for contributions from outside of R core! 
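Given the device support described on Slides 43 to 45, one practical habit is to send output explicitly to a device that is known to understand the new vocabulary rather than relying on the default screen device. The sketch below assumes R 4.1.0 or later; draw_demo() is a made-up helper, and svg() is used as one example of a Cairo-based device, guarded by capabilities("cairo") in case R was built without Cairo.

```r
library(grid)  # linearGradient() needs R >= 4.1.0

# Hypothetical helper: one rectangle with a gradient fill.
draw_demo <- function() {
    grid.newpage()
    grid.rect(gp = gpar(fill = linearGradient(c("white", "black"))))
}

# The pdf() device supports the new vocabulary.
pdf_file <- tempfile(fileext = ".pdf")
pdf(pdf_file)
draw_demo()
invisible(dev.off())

# Cairo-based devices, such as svg(), also support it, but they are only
# available when R has been built with Cairo.
if (capabilities("cairo")) {
    svg_file <- tempfile(fileext = ".svg")
    svg(svg_file)
    draw_demo()
    invisible(dev.off())
}
```

On a device that ignores the new vocabulary, the same draw_demo() call would simply not produce the gradient.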
Slide 46: There are still some things that the new R graphics engine vocabulary does not support. For example, if we apply a mask to a circle grob that describes two overlapping circles, the mask is applied to each circle, one after the other. The result is two overlapping semi-transparent circles, with a visible overlap. Slide 47: If we extend the R graphics engine vocabulary to include the concept of "isolated" groups, we can produce a different result. Here, the overlapping circles are drawn as a group BEFORE applying the mask. The result now is a semi-transparent version of the result of drawing the two circles. Work on this and related extensions is currently underway. Slide 48: The new R graphics engine vocabulary has undergone a battery of tests, but not all possible combinations have been explored. In this example, we have a mask consisting of an opaque rectangle and a semi-transparent rectangle being applied to a rectangle filled with a pattern. The result is not as expected because the mask is being applied to the pattern AS WELL AS being applied to the rectangle. The more people use the new R graphics engine vocabulary, the sooner we will find, and hopefully fix, these bugs. Slide 49: One area where the R graphics engine vocabulary is still severely lacking is in the production of text. The 'ragg' graphics device is leading the way here. Discussions are underway to investigate improvements to the R graphics engine in this area. Slide 50: Slide 51: Slide 52: