# Improving the 'gridGraphviz' package in R

Department of Statistics, University of Auckland

18 February 2014

## Abstract

The *gridGraphviz* package renders node-and-edge graphs
in *R* using the *grid* graphics package. Graphs
are laid out using the *Rgraphviz* package to interface
with the graph layout algorithms in *graphviz*. This
article details the improvements made
between *gridGraphviz* versions 0.2 and 0.3, including:
support for "ellipse"- and "polygon"-shaped nodes; handling of
edges in undirected graphs; support for various new arrow types;
and support for edge labels. Version 0.3 also introduces a
method to produce graphs with an overall size closer to
*graphviz*'s output.

## Introduction

This article describes improvements made to the
*gridGraphviz* [1]
package for the *R
Project* [2].
*gridGraphviz* is a package for rendering node and
edge graphs with *R*'s *grid* [2]
graphics package. The graph layouts
are produced by AT&T's
*graphviz*
[3]
software, accessed through Bioconductor's
*Rgraphviz* [4]
package. *Rgraphviz* produces graphs which appear very
different from those produced by *graphviz* directly. This
project attempted to bring the features of *gridGraphviz* in line
with those available in *Rgraphviz*, to bring the resulting graph
closer to those produced by *graphviz*, and to make use of the
grid graphics model to allow further extensibility of the
resulting graphs.

There are three stages to the plotting of graphs with
*gridGraphviz*.
First a graph object must be created; next the graph must be laid out;
finally the graph is plotted.

The *graph* package provides several ways to create a graph.
In the approach employed in this report,
a graph is made up of a set of nodes and a list of edges between
these nodes, combined to make a new "graphNEL" [5]
object. The "graphNEL"
takes in the nodes, edge lists, and an edgemode - directed or undirected.

```
library(graph)
nodes <- c("a", "b", "c", "d")
edgeList <- list(a=list(edges=c("b")),
                 b=list(edges=c("c")),
                 c=list(edges=c("d")),
                 d=list(edges=c("a")))
directedGraph <- new("graphNEL", nodes=nodes, edgeL=edgeList,
                     edgemode="directed")
directedGraph
```

```
## A graphNEL graph with directed edges
## Number of Nodes = 4
## Number of Edges = 4
```

This graph is then laid out as an "Ragraph" object. *Rgraphviz*'s
agopen() function is used; it accepts a graph object, a name, and any
attributes the graph should have, e.g. the shape of the nodes.

```
library(Rgraphviz)
Ragraph <- agopen(directedGraph, "myGraph")
Ragraph
```

```
## [1] "A graph with 4 nodes."
```

Finally the graph is rendered using *gridGraphviz*'s grid.graph()
function. Here we can specify whether to start a new page, or to plot
over the previous output in the current device.
Figure 1 shows the resulting plot.

```
library(gridGraphviz)
grid.graph(Ragraph, newpage=TRUE)
```

A set of eight examples of graphs
was taken from
http://graphs.grevian.org/example [6]
to assess which areas
needed to be improved in *gridGraphviz* at the beginning of the
project. The examples demonstrated that *gridGraphviz* version
0.2:

- could not produce 'ellipse' shaped nodes
- produced phantom edges on undirected graphs
- could not produce edge labels
- produced plots of a different size to
*graphviz*

The following sections describe each of these problems in more detail
and how they have been addressed in version 0.3 of *gridGraphviz*.

## 'Ellipse' shaped nodes

*Graphviz* produces ellipse shaped nodes if no other
node shape is given, while *Rgraphviz* defaults to
circular nodes. As
*gridGraphviz* had been built to make use of
*Rgraphviz* for interfacing with *graphviz*, it
had also been initially written to produce circular nodes by
default. The "ellipse" shape was not handled at all and
resulted in an error.

Node shapes are passed to *graphviz* as an attribute
(attrs) when the graph is laid out. The following code creates a laid
out "Ragraph" object with ellipse-shaped nodes.
Figure 2 shows the improvement in the rendering
of the graph between versions 0.2 and 0.3 of *gridGraphviz*.

```
Ragraph <- agopen(directedGraph, "myGraph",
                  attrs=list(node=list(shape="ellipse")))
```

Figure 2: Ellipse-shaped nodes in version 0.2 and in version 0.3

## Other node shapes

The only node
shapes initially supported by *gridGraphviz* were 'circle' and
'box'. The list of node shapes that *graphviz* supports
is extensive, so only a limited subset of the possibilities
has been added at this stage. Support
was added for 'square', 'diamond', 'triangle', 'pentagon',
'hexagon', 'septagon', and 'octagon' node shapes. Limited
support for the 'polygon' node shape was added; as *Rgraphviz*
does not currently pass through the node attribute 'sides',
*gridGraphviz* displays the default four-sided polygon shape.

The code required to produce these node shapes is a simple variation
on the ellipse code from the previous section. The following code shows
the case for 'triangle'-shaped nodes. Figure 3
shows the *gridGraphviz* results for the newly-supported
node shapes.

```
Ragraph <- agopen(directedGraph, "myGraph",
                  attrs=list(node=list(shape="triangle")))
```

Figure 3: Newly supported node shapes: triangle, polygon, pentagon, hexagon, septagon, octagon, square, and diamond

All of the above node shapes are attempts to approximate the
way *graphviz* renders node shapes. Future work in this area
might attempt to pull the node rendering information from
*graphviz* directly; nodes in *graphviz* are returned with the
'vertices'
attribute containing the coordinates of the node's
vertices. This attribute does not appear to be returned to
*Rgraphviz*'s "Ragraph" objects, but may be recoverable using
the 'Ragraph-class'
agraph function.
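As a rough illustration of how such shapes can be approximated, the sketch below computes the vertices of a regular polygon inscribed in a node's bounding ellipse. The helper name polygonVertices() and its arguments are hypothetical and are not taken from *gridGraphviz*; *graphviz* itself orients and proportions some shapes differently.

```r
# Hypothetical helper (not part of gridGraphviz): vertices of a regular
# polygon inscribed in an ellipse centred at (cx, cy) with horizontal
# radius rx and vertical radius ry.
polygonVertices <- function(sides, cx = 0, cy = 0, rx = 1, ry = 1) {
    stopifnot(sides >= 3)
    # Start at the top of the node and step around in equal angles
    theta <- pi / 2 + 2 * pi * (seq_len(sides) - 1) / sides
    data.frame(x = cx + rx * cos(theta),
               y = cy + ry * sin(theta))
}

# A hexagon for a node 0.75 wide and 0.5 high
hexagon <- polygonVertices(6, rx = 0.375, ry = 0.25)
hexagon
```

The resulting coordinates could be handed to grid's polygonGrob() to draw the node outline.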

## Phantom edges on undirected graphs

A "graphNEL" object is either directed or undirected. Our examples so far have all been directed. A graph can be declared undirected when the object is created using the 'edgemode' argument.

```
undirectedGraph <- new("graphNEL", nodes=nodes, edgeL=edgeList,
                       edgemode="undirected")
```

One noticeable problem with *gridGraphviz* was its rendering of
phantom edges, especially on undirected graphs. Having
been initially written with only directed edges in mind,
*gridGraphviz* ran into problems with the way that edge layout
locations are returned.

Each edge in a *graphviz* plot is described by a set of control points
for a Bézier curve. Edges on directed
graphs also contain a location for an end point or a start
point, depending on the direction of the edge in question. The
gap between the final control point and the end point (or the
first and start points) provides both a termination point and a
direction for the edge's arrow.
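To illustrate how the gap between the last control point and the end point yields both a direction and a length for the arrow, here is a minimal sketch. The helper names and the coordinates are made up for illustration; they are not part of *gridGraphviz* or *Rgraphviz*.

```r
# Hypothetical helpers: each point is a numeric vector c(x, y).

# Angle of the arrow, in degrees, measured anticlockwise from the x-axis
arrowAngle <- function(lastControl, endPoint) {
    delta <- endPoint - lastControl
    atan2(delta[2], delta[1]) * 180 / pi
}

# Length of the gap that the arrowhead has to fill
arrowGap <- function(lastControl, endPoint) {
    sqrt(sum((endPoint - lastControl)^2))
}

# An edge that arrives at its end point from directly above
lastControl <- c(27, 82)
endPoint    <- c(27, 72)
arrowAngle(lastControl, endPoint)  # -90, i.e. pointing straight down
arrowGap(lastControl, endPoint)    # 10
```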

*gridGraphviz* sensibly used these end points to render arrows
on directed edges. However, as it did not discriminate between
directed and undirected edges, *gridGraphviz* also rendered
arrows between the edge's last control point and its end point
on undirected edges; undirected edges do not contain sensible
end or start point information, which led to the phantom
edges.

*gridGraphviz* was modified so that it will only render arrows -
at the start or end of edges - when the graph is directed, and
when the direction of a given edge calls for it.
Figure 4 shows the problem in version 0.2 and
the improved rendering in version 0.3.

Figure 4: Undirected edges in version 0.2 and in version 0.3
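The rule for when to draw an arrowhead can be written down as a small decision function. This is only a sketch of the logic described in this section: the function name arrowEnds() is hypothetical, and the 'dir' values are assumed to follow *graphviz*'s edge direction names ("forward", "back", "both", "none").

```r
# Hypothetical helper: decide which ends of an edge get an arrowhead.
arrowEnds <- function(edgemode = c("directed", "undirected"),
                      dir = c("forward", "back", "both", "none")) {
    edgemode <- match.arg(edgemode)
    dir <- match.arg(dir)
    directed <- edgemode == "directed"
    # Undirected graphs never get arrows, whatever 'dir' says
    c(start = directed && dir %in% c("back", "both"),
      end   = directed && dir %in% c("forward", "both"))
}

arrowEnds("directed", "forward")    # arrow at the end only
arrowEnds("undirected", "forward")  # no arrows at all
```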

## Arrow types

Version 0.2 of *gridGraphviz* supported 'open' and 'normal' arrow
types, with *Rgraphviz* selecting 'open' by default. 'Normal' was
augmented to include the synonymous 'closed' type, and to
bring its proportions closer to those in *graphviz*; 'vee' was
added to the 'open' arrow type.

The arrow type for a graph can be changed from 'open' after the graph has been laid out. The forward arrow type, or 'arrowhead', of every edge can be set to "normal" with:

```
for (i in seq_along(AgEdge(Ragraph))) {
    AgEdge(Ragraph)[[i]]@arrowhead <- "normal"
}
```

As with node shapes, the list
of arrow
types supported in *graphviz* is extensive so no attempt was
made to support all of them. However, tentative support for
'box' and 'dot' shapes was added, along with their "open"
versions. Figure 5 shows the
*gridGraphviz* results for the newly-supported arrow types.

Figure 5: Newly supported arrow types: closed, dot, odot, box, and obox arrows

## Arrow size

The size of the arrows on a graph can be set when the graph is laid out. The attribute 'arrowsize' can be set in the 'edge' attributes. It is a number which scales the graph's arrows.

```
Ragraph <- agopen(directedGraph, "",
                  attrs=list(edge=list(arrowsize=1.5)))
```

Some work was done to improve the size of arrows to bring the
rendering more in line with *graphviz*. The 'arrowsize' attribute
on edges is a scaling factor for arrows. Its effect in
*gridGraphviz* is dependent on arrow types being rendered at the
same size as they would be in *graphviz*.

*graphviz*'s
'dot' documentation
provides the default length of arrows at 10
(pixels), which has allowed for *gridGraphviz* to successfully
handle the scaling of 'open' and 'closed' arrow styles.
Figure 6 demonstrates how *gridGraphviz*
version 0.3 renders different arrowsize values.

Figure 6: Arrows rendered with arrowsize values of 1, 0.5, and 2
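The scaling described above amounts to multiplying the documented default arrow length by the 'arrowsize' value. A minimal sketch of that arithmetic follows; the names baseArrowLength and scaledArrowLength() are hypothetical and the real drawing code in *gridGraphviz* may be organised differently.

```r
# Default arrow length taken from the 'dot' documentation, as quoted above
baseArrowLength <- 10

# Hypothetical helper: arrow length for a given 'arrowsize' scaling factor
scaledArrowLength <- function(arrowsize = 1) {
    baseArrowLength * arrowsize
}

# The three arrowsize values shown in Figure 6
scaledArrowLength(c(1, 0.5, 2))  # 10, 5 and 20
```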

Unfortunately information on default sizes for 'dot' and 'box'
arrow styles has not been uncovered, and as such *gridGraphviz*
handles their sizing rather less desirably. Currently their
sizes are determined by the gap between the previously-mentioned
last control point and end point for each edge.

## Edge labels

Labels can be added to the edges of a graph by specifying 'label' values within the 'edgeAttrs' argument in the call to agopen():

```
Ragraph <- agopen(directedGraph, "",
                  edgeAttrs=list(label=c("a~b"="first edge",
                                         "c~b"="another edge")))
```

Support for edge labels was not available in *gridGraphviz*
version 0.2; however, label information was readily available in
the "Ragraph" object, so it was straightforward to add support for
edge labels in version 0.3.
Figure 7 shows the improvements in edge label
rendering between *gridGraphviz* versions 0.2 and 0.3.

Figure 7: Edge labels in version 0.2 and in version 0.3

## Plot size

One of the nice things about *graphviz* is that it produces
clean and well-proportioned graphs; its
algorithm is designed to produce a compact representation of
the graph in a neat and readable layout. *gridGraphviz*, via
*Rgraphviz*, was ignoring one of the key components of this
neatness - the overall plot size. Figure 8
shows the difference in size between *graphviz* and
*gridGraphviz* version 0.2 by default.

Figure 8: Default size of *graphviz* output compared with the default size in *gridGraphviz* version 0.2

By default, *Rgraphviz* sets the size of the graph being laid out
to either the dimensions of the currently open graphics device,
or to a default of seven by seven inches
(according to a comment in the source code, this is to
"prevent visual distortion when scaling down
the image"). While it was possible
to set the size to anything the user wanted, all size
information from *graphviz*'s algorithm was lost.

This led to the creation of the agopenTrue() function within
*gridGraphviz*. Where *Rgraphviz*'s agopen()
returned a graph forced into certain dimensions, agopenTrue()
lets *graphviz* determine the size of the resulting
graph, while otherwise letting agopen() behave as it had
before.

agopenTrue() can be used at the graph layout stage to create a laid-out
graph where the size is determined by *graphviz*:

```
Ragraph <- agopenTrue(directedGraph, "")
Ragraph
```

```
## [1] "A graph with 4 nodes."
```

Figure 9 shows the similarity in size between
a graph produced by *graphviz* and one produced by
*gridGraphviz* version 0.3 using agopenTrue().

Figure 9: Default size of *graphviz* output compared with the improved size in *gridGraphviz* version 0.3

*gridGraphviz* now also provides the functions graphWidth() and
graphHeight(), which return the dimensions of a laid-out graph in inches:

```
graphWidth(Ragraph)
```

```
## [1] 0.8194
```

```
graphHeight(Ragraph)
```

```
## [1] 3.5
```
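One possible use of these dimensions is to size a graphics device so that it fits the graph closely. The sketch below uses only base R and grid, with the width and height copied from the output above; the margin value and the placeholder rectangle are assumptions made for illustration. In practice the two numbers would come from graphWidth(Ragraph) and graphHeight(Ragraph), and the rectangle would be replaced by the graph drawing itself.

```r
library(grid)

# Dimensions in inches, copied from the output above
w <- 0.8194
h <- 3.5
margin <- 0.25  # assumed margin on each side, in inches

# Open a PDF device just large enough for the graph plus margins
outFile <- tempfile(fileext = ".pdf")
pdf(outFile, width = w + 2 * margin, height = h + 2 * margin)

# A centred viewport with exactly the dimensions of the laid-out graph
pushViewport(viewport(width = unit(w, "inches"),
                      height = unit(h, "inches")))
grid.rect(gp = gpar(lty = "dashed"))  # placeholder for the graph
popViewport()
dev.off()
```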

## Grob labelling

*gridGraphviz*'s use of the grid graphics system in R provides opportunities for extension and interactivity. Grid graphics objects are editable, which means they can be manipulated and augmented after the initial rendering is done.

Initially many of the objects in *gridGraphviz* were named by
grid's internal naming scheme:

```
## GRID.gTree.98
## GRID.beziergrob.96
## GRID.segments.97
## GRID.gTree.101
## GRID.beziergrob.99
## GRID.segments.100
## GRID.gTree.104
## GRID.beziergrob.102
## GRID.segments.103
## GRID.gTree.107
## GRID.beziergrob.105
## GRID.segments.106
## a
## box
## label
## b
## box
## label
## c
## box
## label
## d
## box
## label
```

To assist in the future development of plots produced by
*gridGraphviz* all the grid objects involved are now sensibly
named:

```
## edge-a~b
## curve-a~b-1
## arrowhead-a~b
## edge-b~c
## curve-b~c-1
## arrowhead-b~c
## edge-c~d
## curve-c~d-1
## arrowhead-c~d
## edge-d~a
## curve-d~a-1
## arrowhead-d~a
## a
## box-a
## label-a
## b
## box-b
## label-b
## c
## box-c
## label-c
## d
## box-d
## label-d
```
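With predictable names, an individual grob can be addressed and edited directly. The sketch below shows the idea using grid alone: it builds a mock-up of a single node whose children carry the names from the listing above, then changes the fill of the box by name. The mock-up is an assumption made for illustration and is not the structure that grid.graph() actually builds.

```r
library(grid)

# Mock-up of node "a"; the real grobs are created by grid.graph()
node <- gTree(name = "a",
              children = gList(rectGrob(name = "box-a"),
                               textGrob("a", name = "label-a")))

# The children can be listed and addressed by name
childNames(node)

# Edit the box by name, leaving the label untouched
node <- editGrob(node, gPath("box-a"), gp = gpar(fill = "lightblue"))
getGrob(node, "box-a")$gp$fill
```

On a graph that has already been drawn, grid.edit() plays the same role as editGrob() does here.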

This naming scheme makes it easier to work with the plot, and to
pass on to other R packages that use the grid graphics
model. One example is the
*gridSVG* [7]
package, which outputs grid
plots in the SVG [8] file format.
The SVG file format is open,
text-based, interactive, and is supported in modern web
browsers. Through *gridSVG* one could use *gridGraphviz*
to produce
interactive graphs for viewing and manipulating online.

In the following example *gridSVG* is used to add a tooltip to the nodes
whenever
they are moused over. The tooltip is added using the grid.garnish()
function, which identifies the node objects by their names, "a"
and "b", that were assigned by *gridGraphviz*.

The following code creates an "Ragraph" with two nodes, named
"a" and "b", and attaches the node labels "apple" and "banana"
respectively. It then plots the graph with grid.graph() and
displays the names of the resulting grobs with grid.ls(). Next
the *gridSVG* library is loaded and grid.garnish() is
applied to grobs "a" and "b" by name. The result is then
exported as an SVG file using grid.export().
Figure 10 shows the resulting SVG file
embedded in this page.

```
graph <- new("graphNEL", nodes=c("a", "b"),
             edgeL=list(a=list(edges="b"), b=list()),
             edgemode="directed")
rag <- agopenTrue(graph, "",
                  nodeAttrs=list(label=c(a="apple", b="banana")))
grid.graph(rag, newpage=TRUE)
grid.ls()
```

```
## edge-a~b
## curve-a~b-1
## arrowhead-a~b
## a
## box-a
## label-a
## b
## box-b
## label-b
```

```
library(gridSVG)
grid.garnish("a", title="Apple Node", "pointer-events"="all")
grid.garnish("b", title="Banana Node", "pointer-events"="all")
grid.export("graph.svg")
```

Figure 10: Hovering the mouse over a node will produce a tooltip

## Current limitations in Rgraphviz

*Rgraphviz* is fundamental to the plotting done by
*gridGraphviz*, as it does all of the heavy lifting in providing
an interface to *graphviz*. Unfortunately it bears several
features which have proved a hindrance to the work of this project.

The following is a short wishlist of changes to *Rgraphviz*
that would help in the improvement of this package.

- Default graph size:

  As has already been mentioned, *Rgraphviz* sets the graph size to either the size of the current device or to a default of 7x7 inches. It would be nice to have the option of letting *graphviz* set the size.

  Workaround: agopenTrue() lets *graphviz* set the graph size.

- Edge weights:

  agopen() does not pass through edge weights when laying out graphs. This can have a dramatic effect on the layout of a graph, and can even lead to distorted-looking edges. It would be preferable for agopen() to handle edge weights.

  Workaround: agopenTrue() will pass through edge weights when laying out a graph.

- Node sizes:

  agopen() sets node sizes to width=0.75, height=0.5 and fixedsize=TRUE by default. This results in oversized "square"- and "circle"-shaped nodes. It might be preferable to let *graphviz* set these values if the user does not specify otherwise.

  Workaround: agopenTrue() will let *graphviz* set the node sizes if not otherwise specified.

- Arrow types:

  agopen() does not allow the specification of arrow types at layout. Changing arrow types after layout can cause visual problems.

## Future improvements

Possible future work on *gridGraphviz* could focus on the
following things:

- Extension of available node shapes
- Extension of available arrow types
- Improved handling of alternative arrow types (e.g., avoiding overlap of 'dot' or 'box' arrows with the node boundary)
- Rendering of graph labels and sublabels
- Rendering of subgraph labels and boxes
- Investigation of how *graphviz* handles fill and background colours compared to *Rgraphviz*

## Acknowledgements

The work in this report was undertaken as part of the University of Auckland's Faculty of Science Summer Studentship programme in the Department of Statistics.

## Summary

Several improvements were made to the *gridGraphviz* package
between versions 0.2 and 0.3. Support was added for "ellipse"-shaped
nodes, as well as several variations of "polygon"-shaped nodes.
*gridGraphviz* version 0.3 no longer produces "phantom" edges
on undirected graphs, and can now produce a greater variety of arrow
types. Support was added for rendering edge labels, and for producing
plots that are the same size as those produced directly by
*graphviz*. The naming of grobs in *gridGraphviz*
version 0.3 was also improved.

## Downloads

The latest revisions of *gridGraphviz* can be freely downloaded
from
R-Forge
under a GPL license.

This article was produced with the knitr [9] package for R. The source code for the page can be downloaded here.

