Improving the 'gridGraphviz' package in R

Ashley Noel Hinton ahin017@aucklanduni.ac.nz and Paul Murrell paul@stat.auckland.ac.nz

Department of Statistics, University of Auckland

Abstract

The gridGraphviz package renders node-and-edge graphs in R using the grid graphics package. Graphs are laid out using the Rgraphviz package to interface with the graph layout algorithms in graphviz. This article details the improvements made between gridGraphviz versions 0.2 and 0.3, including: support for "ellipse"- and "polygon"-shaped nodes; handling of edges in undirected graphs; support for various new arrow types; and support for edge labels. Version 0.3 also introduces a method to produce graphs with an overall size closer to graphviz's output.

Introduction

This article describes improvements made to the gridGraphviz [1] package for the R Project [2]. gridGraphviz is a package for rendering node-and-edge graphs with R's grid [2] graphics package. The graph layouts are produced by AT&T's graphviz [3] software, accessed through Bioconductor's Rgraphviz [4] package. Rgraphviz produces graphs which appear very different from those produced by graphviz directly. This project attempted to bring the features of gridGraphviz in line with those available in Rgraphviz, to bring the resulting graphs closer to those produced by graphviz, and to make use of the grid graphics model to allow further extensibility of the resulting graphs.

There are three stages to the plotting of graphs with gridGraphviz. First a graph object must be created; next the graph must be laid out; finally the graph is plotted.

The graph package provides several ways to create a graph. In the approach employed in this report, a graph is made up of a set of nodes and a list of edges between these nodes, combined to make a new "graphNEL" [5] object. The "graphNEL" constructor takes the nodes, the edge lists, and an edgemode, which is either directed or undirected.
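As a minimal sketch of this first stage, the following builds a small directed graph with the graph package. The node names and edges are illustrative assumptions, not necessarily the example used for Figure 1.

```r
library(graph)

# Illustrative node names and edge lists (hypothetical example data).
# Each list element names a node and gives the nodes its edges point to.
nodeNames <- c("a", "b", "c")
edgeList <- list(a = c("b", "c"),
                 b = "c",
                 c = character(0))

# Combine the nodes and edge lists into a directed "graphNEL" object
simpleGNEL <- new("graphNEL", nodes = nodeNames, edgeL = edgeList,
                  edgemode = "directed")
```

Printing `simpleGNEL` gives a short summary of the number of nodes and edges, which is a quick check that the edge lists were interpreted as intended.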

This graph is then laid out as an "Ragraph" object. Rgraphviz's agopen() function is used; it accepts a graph object, a name, and any attributes the graph should have, e.g. the shape of the nodes.

Finally the graph is rendered using gridGraphviz's grid.graph() function. Here we can specify whether to start a new page, or to plot over the previous output in the current device. Figure 1 shows the resulting plot.
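The layout and rendering stages might look like the sketch below. The graph, the layout name "simple" and the choice of circular nodes are illustrative assumptions; the `newpage` argument of grid.graph() is taken from the description above.

```r
library(graph)
library(Rgraphviz)
library(gridGraphviz)

# Stage 1: a small directed graph (hypothetical example data)
g <- new("graphNEL", nodes = c("a", "b", "c"),
         edgeL = list(a = c("b", "c"), b = "c", c = character(0)),
         edgemode = "directed")

# Stage 2: lay the graph out with graphviz, giving it a name and
# a node shape attribute
rag <- agopen(g, name = "simple",
              attrs = list(node = list(shape = "circle")))

# Stage 3: render the laid-out graph with grid, starting a new page
grid.graph(rag, newpage = TRUE)
```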

Figure 1: Example of a 'gridGraphviz' plot

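A set of eight examples of graphs was taken from http://graphs.grevian.org/example [6] to assess which areas needed to be improved in gridGraphviz at the beginning of the project. The examples demonstrated that gridGraphviz version 0.2 had problems in several areas: node shapes, edges on undirected graphs, arrow types and sizes, edge labels, and overall plot size.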

The following sections describe each of these problems in more detail and how they have been addressed in version 0.3 of gridGraphviz.

'Ellipse' shaped nodes

Graphviz produces ellipse-shaped nodes if no other node shape is given, while Rgraphviz defaults to circular nodes. As gridGraphviz had been built to make use of Rgraphviz for interfacing with graphviz, it had also been initially written to produce circular nodes by default. The "ellipse" shape was not handled at all and resulted in an error.

Node shapes are passed to graphviz as an attribute (attrs) when the graph is laid out. The following code creates a laid out "Ragraph" object with ellipse-shaped nodes. Figure 2 shows the improvement in the rendering of the graph between versions 0.2 and 0.3 of gridGraphviz.
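A minimal sketch of such a layout call is shown here, assuming a small illustrative graph. The shape is supplied in the `node` component of the `attrs` list.

```r
library(graph)
library(Rgraphviz)

# Hypothetical example graph
g <- new("graphNEL", nodes = c("a", "b", "c"),
         edgeL = list(a = c("b", "c"), b = "c", c = character(0)),
         edgemode = "directed")

# Lay out the graph, asking graphviz for ellipse-shaped nodes
ragEllipse <- agopen(g, name = "ellipses",
                     attrs = list(node = list(shape = "ellipse")))
```

The resulting object can then be passed to grid.graph() for rendering.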

Figure 2: Improvements in handling of ellipse-shaped nodes

Panels: ellipse-shaped nodes in version 0.2; ellipse-shaped nodes in version 0.3.

Other node shapes

The only node shapes initially supported by gridGraphviz were 'circle' and 'box'. The list of node shapes that graphviz supports is somewhat extensive, so only a limited subset of the possibilities has been added at this stage. Support was added for 'square', 'diamond', 'triangle', 'pentagon', 'hexagon', 'septagon', and 'octagon' node shapes. Limited support for the 'polygon' node shape was also added; as Rgraphviz does not currently pass through the node attribute 'sides', gridGraphviz displays the default four-sided polygon shape.

The code required to produce these node shapes is a simple variation on the ellipse code from the previous section. The following code shows the case for 'triangle'-shaped nodes. Figure 3 shows the gridGraphviz results for the newly-supported node shapes.
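For the triangle case, a sketch under the same assumptions as before (an illustrative three-node graph and a made-up layout name) only changes the value of the `shape` attribute.

```r
library(graph)
library(Rgraphviz)

# Hypothetical example graph
g <- new("graphNEL", nodes = c("a", "b", "c"),
         edgeL = list(a = c("b", "c"), b = "c", c = character(0)),
         edgemode = "directed")

# Same call as for ellipses, with only the node shape changed
ragTriangle <- agopen(g, name = "triangles",
                      attrs = list(node = list(shape = "triangle")))
```

Substituting any of the other shape names listed above for "triangle" should give the corresponding layout.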

Figure 3: Node shapes added to version 0.3

Panels: Triangle, Polygon, Pentagon, Hexagon, Septagon, Octagon, Square, Diamond.

All of the above node shapes are attempts to approximate the way graphviz renders node shapes. Future work in this area might attempt to pull the node rendering information from graphviz directly; nodes in graphviz are returned with the 'vertices' attribute containing the coordinates of the node's vertices. This attribute does not appear to be returned to Rgraphviz's "Ragraph" objects, but may be recoverable using the 'Ragraph-class' agraph function.

Phantom edges on undirected graphs

A "graphNEL" object is either directed or undirected. Our examples so far have all been directed. A graph can be declared undirected when the object is created using the 'edgemode' argument.
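A minimal sketch of an undirected graph follows. The nodes and edges are illustrative, and the edge lists are written symmetrically (each edge appears under both of its nodes) on the assumption that the graph package expects reciprocated edges for undirected graphs.

```r
library(graph)

# Hypothetical example data; every edge is listed from both ends
nodeNames <- c("a", "b", "c")
edgeList <- list(a = c("b", "c"),
                 b = c("a", "c"),
                 c = c("a", "b"))

# The 'edgemode' argument declares the graph undirected
undirectedGNEL <- new("graphNEL", nodes = nodeNames, edgeL = edgeList,
                      edgemode = "undirected")
```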

One noticeable problem with gridGraphviz was its rendering of phantom edges, especially on undirected graphs. Having been initially written with only directed edges in mind, gridGraphviz ran into problems with the way that edge layout locations are returned.

Each edge in a graphviz plot is described by a set of control points for a bezier curve. Edges on directed graphs also contain a location for an end point or a start point, depending on the direction of the edge in question. The gap between the final control point and the end point (or the first and start points) provides both a termination point and a direction for the edge's arrow.
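The relationship between control points, end point and arrow can be illustrated with grid alone. The coordinates below are made up for the purpose of the sketch and are not taken from a real layout; this is also not gridGraphviz's own drawing code.

```r
library(grid)

# Hypothetical edge: four Bezier control points plus a separate end point,
# all in normalised (npc) coordinates
controlX <- c(0.2, 0.3, 0.6, 0.7)
controlY <- c(0.8, 0.4, 0.6, 0.3)
endX <- 0.75
endY <- 0.2

grid.newpage()
# The body of the edge is the curve defined by the control points
grid.bezier(controlX, controlY)
# The arrow fills the gap between the last control point and the end point
grid.segments(controlX[4], controlY[4], endX, endY,
              arrow = arrow(length = unit(3, "mm"), type = "open"))

# Direction of the arrow in npc space, in degrees anticlockwise
# from the positive x-axis
arrowAngle <- atan2(endY - controlY[4], endX - controlX[4]) * 180 / pi
```

An undirected edge has no meaningful end point, so drawing the final segment for such an edge is what produces a stray line.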

gridGraphviz sensibly used these end points to render arrows on directed edges. However, as it did not discriminate between directed and undirected edges, gridGraphviz also rendered arrows between the edge's last control point and its end point on undirected edges; undirected edges do not contain sensible end or start point information, which led to the phantom edges.

gridGraphviz was modified so that it will only render arrows, at the start or end of edges, when the graph is directed and when the direction of a given edge calls for it. Figure 4 shows the problem in version 0.2 and the improved rendering in version 0.3.

Figure 4: Comparison of edge handling on undirected graphs

Panels: undirected edges in version 0.2; undirected edges in version 0.3.

Arrow types

Version 0.2 of gridGraphviz supported 'open' and 'normal' arrow types, with Rgraphviz selecting 'open' by default. 'Normal' was augmented to include the synonymous 'closed' type, and to bring its proportions closer to those in graphviz; 'vee' was added to the 'open' arrow type.

The arrow type for a graph can be changed from 'open' after the graph has been laid out. All of the forward arrows (the 'arrowhead' of each edge) can be set to "normal" with:
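One way to do this is sketched below. It assumes that the laid-out edges are stored in the `AgEdge` slot of the "Ragraph" object and that each edge has an `arrowhead` slot; the graph itself is illustrative, and this is not necessarily the approach used in the original example.

```r
library(graph)
library(Rgraphviz)

# Hypothetical example graph, laid out with default attributes
g <- new("graphNEL", nodes = c("a", "b", "c"),
         edgeL = list(a = c("b", "c"), b = "c", c = character(0)),
         edgemode = "directed")
rag <- agopen(g, name = "arrows")

# Assumption: each element of the AgEdge slot is an edge object
# with an 'arrowhead' slot holding the arrow type
rag@AgEdge <- lapply(rag@AgEdge, function(edge) {
    edge@arrowhead <- "normal"
    edge
})
```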

As with node shapes, the list of arrow types supported in graphviz is extensive so no attempt was made to support all of them. However, tentative support for 'box' and 'dot' shapes was added, along with their "open" versions. Figure 5 shows the gridGraphviz results for the newly-supported arrow types.

Figure 5: Arrow types added to version 0.3

Panels: Closed arrows, Dot arrows, Odot arrows, Box arrows, Obox arrows.

Arrow size

The size of the arrows on a graph can be set when the graph is laid out. The attribute 'arrowsize' can be set in the 'edge' attributes. It is a numeric scale factor (not necessarily an integer; values such as 0.5 are valid) which multiplies the default size of the graph's arrows.
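A sketch of setting the arrow size at layout time is given below. The graph and the value 2 are illustrative, and the value is given as a string on the assumption that agopen() accepts attribute values in that form.

```r
library(graph)
library(Rgraphviz)

# Hypothetical example graph
g <- new("graphNEL", nodes = c("a", "b", "c"),
         edgeL = list(a = c("b", "c"), b = "c", c = character(0)),
         edgemode = "directed")

# 'arrowsize' goes in the 'edge' component of the attrs list
# (assumed here to be accepted as a string)
ragBigArrows <- agopen(g, name = "bigArrows",
                       attrs = list(edge = list(arrowsize = "2")))
```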

Some work was done to improve the size of arrows to bring the rendering more in line with graphviz. The 'arrowsize' attribute on edges is a scaling factor for arrows. Its effect in gridGraphviz is dependent on arrow types being rendered at the same size as they would be in graphviz.

graphviz's 'dot' documentation gives the default length of arrows as 10 (pixels), which has allowed gridGraphviz to handle the scaling of 'open' and 'closed' arrow styles successfully. Figure 6 demonstrates how gridGraphviz version 0.3 renders different arrowsize values.
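The scaling itself is simple arithmetic. The helper below is a hypothetical illustration, not a gridGraphviz function; it treats 'arrowsize' as a multiplier on the default length of 10 quoted above.

```r
# Default arrow length from the text (at arrowsize 1)
defaultArrowLength <- 10

# Hypothetical helper: arrow length for a given arrowsize scale factor
arrowLength <- function(arrowsize, base = defaultArrowLength) {
    base * arrowsize
}

# Lengths for the three arrowsize values used in Figure 6
lengths <- arrowLength(c(0.5, 1, 2))
```

For the values 0.5, 1 and 2 this gives lengths of 5, 10 and 20 respectively.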

Figure 6: Arrowsize rendering in version 0.3

Panels: arrowsize 1, arrowsize 0.5, arrowsize 2.

Unfortunately information on default sizes for 'dot' and 'box' arrow styles has not been uncovered, and as such gridGraphviz handles their sizing rather less desirably. Currently their sizes are determined by the gap between the previously-mentioned last control point and end point for each edge.

Edge labels

Labels can be added to the edges of a graph by specifying 'label' values within the 'edgeAttrs' argument in the call to agopen().
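A minimal sketch of this follows, assuming an illustrative graph and made-up label text. The labels are given as a named vector in which each name identifies an edge in the "from~to" form used by Rgraphviz.

```r
library(graph)
library(Rgraphviz)

# Hypothetical example graph
g <- new("graphNEL", nodes = c("a", "b", "c"),
         edgeL = list(a = c("b", "c"), b = "c", c = character(0)),
         edgemode = "directed")

# One label per edge, named by edge ("from~to"); label text is made up
edgeLabels <- c("a~b" = "first", "a~c" = "second", "b~c" = "third")

# Pass the labels through the 'edgeAttrs' argument at layout time
ragLabelled <- agopen(g, name = "labelled",
                      edgeAttrs = list(label = edgeLabels))
```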

Support for edge labels was not available in gridGraphviz version 0.2; however, label information was readily available in the "Ragraph" object, so it was straightforward to add support for edge labels in version 0.3. Figure 7 shows the improvements in edge label rendering between gridGraphviz versions 0.2 and 0.3.

Figure 7: Comparison of edge label rendering

Panels: no edge labels in version 0.2; support for edge labels in version 0.3.

Plot size

One of the nice things about graphviz is that it produces clean and well-proportioned graphs; its algorithm is designed to produce a compact representation of the graph in a neat and readable layout. gridGraphviz, via Rgraphviz, was ignoring one of the key components of this neatness - the overall plot size. Figure 8 shows the difference in size between graphviz and gridGraphviz version 0.2 by default.

Figure 8: Comparison of plot sizes produced by graphviz and gridGraphviz version 0.2

Panels: default graphviz output size; default gridGraphviz version 0.2 output size.

By default, Rgraphviz sets the size of the graph being laid out to either the dimensions of the currently open graphics device, or to a default of seven by seven inches (according to a comment in the source code, this is to "prevent visual distortion when scaling down the image"). While it was possible to set the size to anything the user wanted, all size information from graphviz's algorithm was lost.

This led to the creation of the agopenTrue() function within gridGraphviz. Where Rgraphviz's agopen() returned a graph forced into certain dimensions, agopenTrue() lets graphviz determine the size of the resulting graph, while otherwise letting agopen() behave as it had before.

agopenTrue() can be used at the graph layout stage to create a laid out graph where the size is determined by graphviz:
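The sketch below assumes that agopenTrue() takes its first two arguments (a graph and a name) in the same order as agopen(); the graph and the layout name are illustrative.

```r
library(graph)
library(Rgraphviz)
library(gridGraphviz)

# Hypothetical example graph
g <- new("graphNEL", nodes = c("a", "b", "c"),
         edgeL = list(a = c("b", "c"), b = "c", c = character(0)),
         edgemode = "directed")

# Lay out the graph, letting graphviz choose the overall size
# (assumption: same leading arguments as agopen())
ragTrue <- agopenTrue(g, "trueSize")

# Render as usual
grid.graph(ragTrue, newpage = TRUE)
```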

Figure 9 shows the similarity in size between a graph produced by graphviz and one produced by gridGraphviz version 0.3 using agopenTrue().

Figure 9: Comparison of plot sizes produced by graphviz and gridGraphviz version 0.3

Panels: default graphviz output size; improved gridGraphviz version 0.3 output size.

gridGraphviz now also provides the functions graphWidth() and graphHeight(), which return the dimensions of a laid out graph in inches:
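A sketch of their use is given here, assuming that both functions take the laid-out graph as their only required argument and that agopenTrue() accepts a graph and a name as agopen() does. The graph is illustrative.

```r
library(graph)
library(Rgraphviz)
library(gridGraphviz)

# Hypothetical example graph, laid out at graphviz's preferred size
g <- new("graphNEL", nodes = c("a", "b", "c"),
         edgeL = list(a = c("b", "c"), b = "c", c = character(0)),
         edgemode = "directed")
rag <- agopenTrue(g, "sized")

# Dimensions of the laid-out graph
# (assumption: each function takes the "Ragraph" object)
w <- graphWidth(rag)
h <- graphHeight(rag)
```

These values could, for example, be used to open a graphics device of matching size before calling grid.graph().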

Grob labelling

gridGraphviz's use of the grid graphics system in R provides opportunities for expansion and interactivity. Grid graphics objects are editable, which means they can be manipulated and augmented after the initial rendering is done.

Initially many of the objects in gridGraphviz were named by grid's internal naming scheme:

To assist in the future development of plots produced by gridGraphviz, all the grid objects involved are now sensibly named:
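The names in use can be inspected on any rendered graph with grid's grid.ls() function. The sketch below uses an illustrative two-node graph and makes no assumption about what the listing contains.

```r
library(grid)
library(graph)
library(Rgraphviz)
library(gridGraphviz)

# Hypothetical two-node example graph
g <- new("graphNEL", nodes = c("a", "b"),
         edgeL = list(a = "b", b = character(0)),
         edgemode = "directed")
rag <- agopen(g, name = "named")

# Render the graph, then list the grobs on the current page
grid.graph(rag, newpage = TRUE)
grid.ls()

# The same names as a character vector, for use in further code
grobNames <- grid.ls(print = FALSE)$name
```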

This naming scheme makes it easier to work with the plot, and to pass it on to other R packages that use the grid graphics model. One example is the gridSVG [7] package, which outputs grid plots in the SVG [8] file format. The SVG file format is open, text-based, interactive, and is supported in modern web browsers. Through gridSVG one could use gridGraphviz to produce interactive graphs for viewing and manipulating online.

In the following example gridSVG is used to add a tooltip to the nodes whenever they are moused over. The tooltip is added using the grid.garnish() function, which identifies the node objects by their names, "a" and "b", that were assigned by gridGraphviz.

The following code creates an "Ragraph" with two nodes, named "a" and "b", and attaches the node labels "apple" and "banana" respectively. It then plots the graph with grid.graph() and displays the names of the resulting grobs with grid.ls(). Next the gridSVG library is loaded and grid.garnish() is applied to grobs "a" and "b" by name. The result is then exported as an SVG file using grid.export(). Figure 10 shows the resulting SVG file embedded in this page.

Figure 10: Hovering the mouse over a node will produce a tooltip

Current limitations in Rgraphviz

Rgraphviz is fundamental to the plotting done by gridGraphviz, as it does all of the heavy lifting in providing an interface to graphviz. Unfortunately it has several features which have proved a hindrance to the work of this project.

The following is a short wishlist of changes to Rgraphviz which might help in the improvement of this package.

Future improvements

Possible future work on gridGraphviz could focus on the following things:

Acknowledgements

The work in this report was undertaken as part of the University of Auckland's Faculty of Science Summer Studentship programme in the Department of Statistics.

Summary

Several improvements were made to the gridGraphviz package between versions 0.2 and 0.3. Support was added for "ellipse"-shaped nodes, as well as several variations of "polygon"-shaped nodes. gridGraphviz version 0.3 no longer produces "phantom" edges on undirected graphs, and can now produce a greater variety of arrow types. Support was added for rendering edge labels, and for producing plots that are close in size to those produced directly by graphviz. The naming of grobs in gridGraphviz version 0.3 was also improved.

Downloads

The latest revisions of gridGraphviz can be freely downloaded from R-Forge under a GPL license.

This article was produced with the knitr [9] package for R. The source code for the page can be downloaded here.

This work is licensed under a Creative Commons Attribution 4.0 International License.

References

  1. Paul Murrell and Ashley Noel Hinton (2014). gridGraphviz: Drawing Graphs with Grid. R package version 0.3. URL http://r-forge.r-project.org/projects/gridgraph/.
  2. R Core Team (2013). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL http://www.R-project.org/.
  3. Arif Bilgin, John Ellson, Emden Gansner, Yifan Hu, Stephen North, Yehuda Koren, Don Caldwell, Vladimir Alexiev, David Dobkin, Tim Dwyer, Eleftherios Koutsofios, Bruce Lilly, Glen Low, John Mocenigo, Jeroen Scheerder, Richard G. Daniel, Gordon Woodhull. Graphviz - Graph Visualization Software. URL http://www.graphviz.org/. Version 2.26.3.
  4. Jeff Gentry, Li Long, Robert Gentleman, Seth Falcon, Florian Hahne, Deepayan Sarkar and Kasper Daniel Hansen. Rgraphviz: Provides plotting capabilities for R graph objects. R package version 2.6.0. URL http://www.bioconductor.org/packages/release/bioc/html/Rgraphviz.html.
  5. R. Gentleman, Elizabeth Whalen, W. Huber and S. Falcon. graph: A package to handle graph data structures. R package version 1.40.1. URL http://www.bioconductor.org/packages/release/bioc/html/graph.html.
  6. Josh Hayes-Sheen. Examples. In GraphViz for discrete math students. Retrieved February 13, 2014, from http://graphs.grevian.org/example.
  7. Paul Murrell and Simon Potter (2013). gridSVG: Export grid graphics as SVG. R package version 1.3-1. http://CRAN.R-project.org/package=gridSVG.
  8. W3C (2011). Scalable Vector Graphics (SVG) 1.1 (Second Edition). http://www.w3.org/TR/SVG11/.
  9. Yihui Xie (2013). knitr: A general-purpose package for dynamic report generation in R. R package version 1.5. URL http://yihui.name/knitr/.