Generating Unique Names in gridSVG

Simon Potter simon.potter@auckland.ac.nz and Paul Murrell p.murrell@auckland.ac.nz

Department of Statistics, University of Auckland

Abstract: The gridSVG package exports grid images to the SVG image format for viewing on the web. This article describes the problems associated with retaining grid object names in SVG element id attributes. In addition, new features in gridSVG that allow manipulation and retrieval of generated id attributes are discussed. These new features allow for easier and more predictable development of interactivity in plots generated by gridSVG.

Introduction

grid is an alternative graphics system to the traditional base graphics system provided by R [1]. Two key features of grid distinguish it from the base graphics system: graphics objects and viewports.

Viewports are how grid defines a drawing context and plotting region. All drawing occurs relative to the coordinate system within a viewport. Viewports have a location and dimension and set scales on the horizontal and vertical axes. Crucially, they also have a name so we know how to refer to them.

Graphics objects (grobs) store information necessary to describe how a particular object is to be drawn. For example, a grid circleGrob contains the information used to describe a circle, in particular its location and its radius. As with viewports, graphics objects also have names.

The task that gridSVG [2] performs is to translate viewports and graphics objects into SVG [3] equivalents. In particular, the exported SVG image retains the naming information on viewports and graphics objects. The advantage of this is we can still refer to the same information in grid and in SVG. This means that interactivity can be added to specific named graphics objects to do things like adding tooltips or highlighting a point. In addition, we are able to annotate grid grobs to take advantage of SVG features such as hyperlinking and animation.

The fact that SVG is an XML-based [4] image format means that if we are to identify SVG output by name, we are required to produce SVG id attributes that are unique. This document describes how gridSVG retains the names associated with grobs and viewports, along with the difficulties in doing so.

Name Translation

When gridSVG exports the grid display list, it attempts to give SVG id attributes the same value as the name associated with a grob or viewport. However, the fact we require a unique id presents us with problems in maintaining these names for a few reasons, which will be discussed later. For now, we will first look at an image drawn in grid and what gridSVG produces from that grid scene.

A simple image will be drawn where we have two viewports and a circle is then drawn inside those viewports. The code to produce that image and the display list that grid records for the image are shown below:

What we can see is that there are three viewports, named ROOT, a, and b, one of which shares its name with a circle called a. The ROOT viewport is a viewport that grid creates by default that corresponds to the entire drawing canvas. This explains why ROOT exists in the display list without explicitly being created.

Ideally we would like to see that grid's viewport and grob names are mapped directly to SVG id attributes. However, because we are constrained to having our SVG element id attributes being unique, gridSVG must take action to ensure this is the case. Before explaining how gridSVG does this, let us first consider the simple example we just created by examining the relevant output from gridSVG.

We see here that none of the names we have in grid are mapped directly to SVG id attributes. The grid names are still being retained, albeit modified from the original names. The following name translations occurred:

This name translation is clearly evident. How gridSVG performs this translation will now be discussed.

Paths

In grid, both grobs and viewports can be constructed as a tree of viewports or a tree of grobs. To find a particular viewport or a grob within a tree, we need to use a path. This path is an ordered list of names, specifying parent-child relations. We will be focusing on viewport paths for simplicity, but the same principle applies to trees of graphics objects.

An example of a viewport path is shown below:

This viewport path describes that we first visit the viewport called first, followed by its child, second. Once in the second viewport, we then traverse to its child viewport third. We can see that the resulting path is simply a double-colon separated string of names.

It is possible to create a path where not all names in the path are unique.

In this example we create a viewport tree and push into it. We then observe our current viewport path to be a::b::a. Despite there being two a viewports in the path, they are in fact two completely different viewports. As a result, we cannot simply assign the name of each viewport in the path to SVG output because the id attribute may not be unique. In our simple example, if we were to do this, we would end up with two SVG elements named a. The relevant output is shown below:

This demonstrates that names alone are not sufficient for the requirement of unique id attributes. A potential solution is to use the path as the name of the element. The path would avoid repeating a in id attributes. This would produce output like the following:

This looks like an adequate solution as we have produced unique id attributes, despite having viewports with the same names. However, because viewports can be moved in and out of at any point, we cannot guarantee that the viewport tree is fixed while the plot is being drawn. Consider the following:

What is happening here is that we first push into our tree but then navigate back to the previous viewport path of a::b. A new viewport is then created called a, and we push into that viewport instead of the viewport that we were previously in. This creates an ambiguity because we have two different viewports that have been pushed into at the same path of a::b::a.

We can see here that despite using paths, they are not sufficient for uniqueness when generating an SVG id attribute. This problem is also present when revisiting the same viewports in a viewport path. To overcome this problem, we use an integer suffix that is incremented each time we encounter the same path. To ensure consistency, this integer suffix is applied to every path. The result is shown below:

What we can see here is that we visit the top a for the first time. We then traverse to the viewport path a::b for the first time. It is important to note however that we can see that we have traversed to a::b::a on two separate occasions.
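
The counting scheme described above can be sketched in a few lines of JavaScript. This is only an illustration of the idea, not gridSVG's implementation; the helper name makePathIdGenerator and its default separators are assumptions made for the example.

```javascript
// Hypothetical helper: returns a function that turns a viewport path
// (an array of names) into an id with an incrementing integer suffix.
function makePathIdGenerator(pathSep = "::", idSep = ".") {
  const visits = new Map(); // path string -> number of times seen so far
  return function nextId(names) {
    const path = names.join(pathSep);
    const count = (visits.get(path) || 0) + 1;
    visits.set(path, count);
    return path + idSep + count;
  };
}

const nextId = makePathIdGenerator();
const ids = [
  nextId(["a"]),           // first visit to a
  nextId(["a", "b"]),      // first visit to a::b
  nextId(["a", "b", "a"]), // first visit to a::b::a
  nextId(["a", "b", "a"])  // a different viewport pushed at the same path
];
console.log(ids);
```

Because the counter is keyed by the whole path, the second viewport pushed at a::b::a receives a new suffix rather than colliding with the first.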

By keeping track of viewport and grob paths we can ensure that their SVG id attributes are unique. In addition, their uniqueness allows us to easily retain viewport coordinate information (see 'Working with the gridSVG Coordinate System' [5]), because the coordinate information will be paired with the SVG id that was generated by gridSVG. This is necessary because each time a viewport path is visited there may be a different coordinate system in use. For example, consider the case where we create two different viewports that share the same name but use different coordinate systems:

Firstly, two viewports with the name a have been created. These two viewports have a different width and height to one another. This is important because each time the viewports are used the viewport path is the same despite different coordinate systems being used. The id attributes that gridSVG will generate in this situation should now be familiar:

The unique id attributes of a.1 and a.2 allow us to look up coordinate information based on these generated names. To see this in action, we only need to look at the relevant subset of coordinate information that has been exported to JSON [6], a structured data format.

This clearly illustrates that the viewport coordinate systems for both viewports have been retained. We know this is the case because the x, y, width and height attributes for the viewports are indeed different. By generating unique id attributes for viewport paths, we can guarantee that coordinate information is not only retained, but is also unambiguous.

Name Sharing

In grid, both viewports and grobs contain names. Indeed, we have seen they can also be referred to by a path. One problem that gridSVG has that grid isn't concerned with is that viewports and grobs can have the same name. Consider the following example:

We can see that a viewport has a name that is the same as a circle grob's name. grid is able to draw this scene without any issues but we are presented with a problem when exporting it using gridSVG. We want the SVG id attribute to be assigned with the name of the object that we're representing. However, in this example, because this is the first time each grob path and each viewport path is encountered, we can end up with non-unique elements. This is shown below:

The reason why both the viewport and circle grob are given the suffix of .1 is because it is both the first time that the viewport path has been visited and it is the first time that the grob path has been visited. This presents us with a case where id attributes between grobs and viewports are shared. To correct this, we not only need to track paths, but we also need to track the names and how often they have been assigned to both viewports and grobs. The solution currently used by gridSVG is shown below:

Now instead of just tracking how often each viewport or grob path has been used, we track how often each grid name has been used. This is shown in our output because when the circle grob is drawn, it is the second time that the name a has been encountered, so we end up with a suffix of .2.

In summary, by tracking the names that we attempt to apply to id attributes, we ensure that unique id attributes are generated by gridSVG by adding an integer suffix.
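
As a rough sketch of this name tracking (an illustration only, not gridSVG's own code), a single counter per grid name can be shared by viewports and grobs. The names nameCounts and uniqueId are hypothetical.

```javascript
// Hypothetical sketch: one counter per grid name, shared by viewports and grobs.
const nameCounts = new Map();

function uniqueId(name, idSep = ".") {
  const count = (nameCounts.get(name) || 0) + 1;
  nameCounts.set(name, count);
  return name + idSep + count;
}

const viewportId = uniqueId("a"); // the viewport named a is encountered first
const grobId = uniqueId("a");     // the circle grob then reuses the name a
console.log(viewportId, grobId);
```

Since both kinds of object draw from the same counter, the viewport and the circle can no longer end up with the same id.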

Sub-grobs

We have already seen that when a grob is drawn, we create an SVG <g> element. The contents of this grouping element are graphical elements (e.g. rectangles, circles, lines) that are drawn to an SVG canvas. The reason why grouping output is necessary is because there are cases where gridSVG cannot create a one-to-one mapping between a grid grob and SVG output. For example, while it is possible for a single grid circle to be drawn simply as an SVG <circle /> element, we cannot assume this to always be true. We can draw multiple circles using a single call to grid.circle. An example of this is shown below:

A single circle grob (as listed on the display list) has managed to draw three separate circles. These will be referred to as sub-grobs. It is clear that we cannot apply any name given to the grob to all of its sub-grobs because all sub-grobs would therefore have identical names. The solution gridSVG uses is to use an integer suffix to identify each sub-grob that is drawn. Using the example above, we will take a look at the SVG output that gridSVG produces.

Firstly, we can see that the original grob name has been changed to a.1 because it is the first time that we use the name a. However, its children also have an integer suffix applied. The first circle drawn (i.e. the circle with a radius of 0.1) is given the name a.1.1. The second and third circles are assigned the names a.1.2 and a.1.3 respectively.

This technique also applies to grobs where there is an id parameter. An example of such a grob is a polylineGrob.

A single call to grid.polyline has produced 5 distinct lines. This is because of grid.polyline's id parameter which determines the sub-grob that each line coordinate belongs to. The SVG output is shown below, and demonstrates that the same rule applies to grobs with vectorised parameters and to those with an id parameter.

The addition of an integer suffix to sub-grobs allows us to not only generate unique ids for SVG elements, but also allows us to identify each sub-grob that is being drawn in a consistent manner.
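
The sub-grob naming rule can be sketched as follows. This is a minimal illustration under assumed names (subGrobIds, the grob name lines, and the id vector are all hypothetical), not the code gridSVG uses.

```javascript
// Hypothetical sketch: derive sub-grob ids from the id of the enclosing group.
function subGrobIds(groupId, count, idSep = ".") {
  return Array.from({ length: count }, (_, i) => groupId + idSep + (i + 1));
}

// Three circles drawn by one grob whose group id is a.1.
const circleIds = subGrobIds("a.1", 3);

// For a grob with an id parameter, the number of sub-grobs is the number of
// distinct id values (this id vector is made up for the example).
const polylineIdParam = [1, 1, 2, 2, 3, 3, 4, 4, 5, 5];
const lineIds = subGrobIds("lines.1", new Set(polylineIdParam).size);

console.log(circleIds, lineIds);
```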

Controlling Output

This article has shown why gridSVG needs to modify names to produce unique output. One of the problems in doing this is that the SVG id attributes are now much harder to predict. This means that any name that is assigned to a grob or viewport in grid, when exported to SVG by gridSVG, does not map to an easily predictable SVG id attribute. However, to aid predictability, gridSVG does offer some options for controlling how it constructs id attributes.

The usePaths Option

It was discussed earlier why viewport paths are used as part of the exported ids. However, there are cases where this unnecessarily complicates the SVG output. Primarily this is the case when the names of each viewport, and therefore every viewport path, are unique. Using viewport paths as part of the generated id attributes is then not strictly necessary: we would be adding the complication of dealing with paths even though our viewport names are already sufficiently specific.

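The usePaths parameter for gridSVG's gridToSVG function allows us to determine whether paths are used when creating ids for grobs and viewports. There are four possible options: use paths for viewports only, use paths for grobs only, use no paths at all, or use paths for both viewports and grobs.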

To demonstrate the effect of these options, a simple image will be drawn, then we will examine the relevant SVG output that gridSVG generates from each option.

What has been drawn are two grobs, a circle and a rectangle. They are the only children in a single tree of grobs called gt. This tree has been drawn inside the viewport path a::b. Because we have trees of content, we can easily compare the effect of each option. We will first look at the output when we only want viewport paths to be used.

Only viewport paths are being used here. As a result each of the grob names are kept unchanged and instead of viewport names we use the viewport path. We know this is the case because there is an id that has been exported that can only belong to a viewport path, a::b.1. In addition, both the c and d grobs do not use paths for their names so are exported as c.1 and d.1 respectively.

The viewports are now being retained as just being the names, while we are now using paths for grobs. The viewport path a::b is consequently exported simply to b.1, which can only be the case if we ignore the path prefix of a. The grob path is being used in particular with the rectangle and circle grobs because they are now being exported with the ids of gt.1::c.1 and gt.1::d.1 respectively. This clearly indicates that they are children of the gTree named gt.

No paths are being used so we are only exporting the names of the viewports and grobs. This is particularly evident because the default path separator of :: is no longer present in any of our id attributes.

Finally, we observe the output created by exporting ids as both viewport paths and grob paths.
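
The choice each setting makes can be summarised with a small sketch. It is an illustration only: the function baseName is hypothetical, the option strings other than "both" are assumed here, and integer suffixes are left out to keep the example short.

```javascript
// Hypothetical sketch of how the four usePaths settings choose the base of an id.
// kind is "vp" or "grob"; names is the path as an array, e.g. ["a", "b"].
function baseName(names, kind, usePaths, pathSep = "::") {
  const wantPath =
    usePaths === "both" ||
    (usePaths === "vpPath" && kind === "vp") ||
    (usePaths === "gPath" && kind === "grob");
  // Either the whole path, or just the last name in it.
  return wantPath ? names.join(pathSep) : names[names.length - 1];
}

for (const option of ["vpPath", "gPath", "none", "both"]) {
  console.log(
    option,
    baseName(["a", "b"], "vp", option),
    baseName(["gt", "c"], "grob", option)
  );
}
```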

Custom Separators

When gridSVG exports paths as SVG ids, the result is that each name in the path is separated by ::. This is the default path separator used by grid. However, there may be situations where a custom path separator may be more appropriate. An example where this is the case is when using ids within CSS selectors [7]. This is because the colon character is a special character in CSS [8], as it prefixes a pseudo-selector. Therefore, if we were to use the default gridSVG path separator, we would need to escape it for use within a CSS selector. This would require modifying each instance of :: and replacing it with \:\:. Ideally we would like to avoid performing any escaping by using a different separator. This section discusses how custom separators can be used by gridSVG when it exports an SVG image.

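There are three types of separators that gridSVG uses: the separator between the names in a viewport path, the separator between the names in a grob path, and the id separator (id.sep) that sits between a name or path and its integer suffix.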

We can change the values of these separators, avoiding the need to escape them for use within CSS selectors. Another possible reason why using custom separators might be useful is if we have grob names containing . characters. By changing the id separator, we can make it easier to determine the grob or viewport name from the generated SVG id attribute.
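
To make the escaping problem concrete, the following sketch builds a CSS id selector from an id. The helper idToSelector is hypothetical and only escapes the two characters discussed in this article, the colon and the full stop.

```javascript
// Hypothetical helper: build a CSS id selector, escaping the characters that
// have a special meaning in selectors (here only ":" and ".").
function idToSelector(id) {
  return "#" + id.replace(/[:.]/g, "\\$&");
}

const escaped = idToSelector("a::b.1"); // default separators need escaping
const plain = idToSelector("a_b-1");    // underscore and dash pass through
console.log(escaped, plain);
```

With separators such as an underscore and a dash, the id can be used in a selector as it stands.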

gridSVG provides three functions that are useful for the purposes of changing the separators used when generating SVG id attributes: setSVGoptions, getSVGoptions, and getSVGoption. setSVGoptions allows us to change the separators, while getSVGoptions allows us to query gridSVG for all current separators. getSVGoption is a convenience function that gives us the value of a single separator. Example usage is shown below:

Now that we have changed the separators, we can examine the effect of these changes by drawing our earlier example again with usePaths being set to "both". The relevant output is shown below:

Notice how each of the grob and viewport paths now has underscore characters in it. Additionally, every id now has a dash as a separator to the integer suffix.
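
The effect of the two kinds of separator can be sketched like this. The function buildId and its option names are hypothetical; it only mimics how a path, a separator, and a suffix combine into an id.

```javascript
// Hypothetical sketch: assemble an id from a path, a suffix, and two separators.
function buildId(names, suffix, { pathSep = "::", idSep = "." } = {}) {
  return names.join(pathSep) + idSep + suffix;
}

const defaultId = buildId(["a", "b"], 1);
const customId = buildId(["a", "b"], 1, { pathSep: "_", idSep: "-" });
console.log(defaultId, customId);
```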

Unique Names

By default, to ensure valid SVG content, gridSVG adds an integer suffix for the purposes of making the generated id attribute unique. A consequence of this is that there is not a one-to-one mapping between grid names and SVG ids. This makes it hard to predict the SVG id that is generated for a grob or viewport, presenting challenges when we want to use the SVG output. For example, in JavaScript, if we want to change the colour of a grob as we hover our mouse over it, we first need to know the id of the SVG element that we are applying this effect to.

If a grid plot has been drawn that is known to have unique grob and viewport names, this procedure of adding an integer suffix is not required. gridSVG provides an option for controlling this process, uniqueNames, which is TRUE by default. In the case when this parameter is FALSE it is possible to produce valid SVG without the addition of any integer suffixes. This means that we can create a one-to-one mapping between grid grob names and the id attributes that gridSVG generates. This parameter only affects grob names because modifying viewport names could affect retention of coordinate information. A simple demonstration of the effect of uniqueNames is shown below:

We can see that the id generated for the grob named circle is still circle. One important thing to note is that gridSVG does not change its behaviour for sub-grobs. This is why the <circle /> element has an id of circle.1.

When the uniqueNames argument is set to FALSE, it is possible to generate invalid SVG. This may occur when grobs and/or viewports share names when exported to SVG. gridSVG will generate non-unique names, but it will provide a warning in this case because invalid SVG is being produced. See the following:

In this example, gridSVG is not checking whether the id a.1 already exists. The viewport is given the expected gridSVG name of a.1 because it is the first time that the a viewport path has been pushed into. Now when we come across a grob called a.1, no checking is occurring to see whether the id already exists. Additionally, because uniqueNames is set to FALSE, no integer suffix is added for the purpose of ensuring uniqueness. Therefore we end up with two id attributes that are the same, creating invalid SVG, which gridSVG is providing a warning message for.
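
A check for this situation is easy to sketch. The function duplicateIds below is hypothetical and is not how gridSVG produces its warning; it simply reports every id that occurs more than once in a list of exported ids.

```javascript
// Hypothetical check: report every id that occurs more than once.
function duplicateIds(ids) {
  const seen = new Set();
  const dupes = new Set();
  for (const id of ids) {
    if (seen.has(id)) dupes.add(id);
    seen.add(id);
  }
  return [...dupes];
}

// The viewport a.1, the grob named a.1, and that grob's single sub-grob.
const clashes = duplicateIds(["a.1", "a.1", "a.1.1"]);
console.log(clashes);
```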

Care should be taken when using this parameter because it is the only parameter which has the potential to produce invalid SVG documents. In fact, it is rarely necessary to change this parameter from the default of TRUE when we use mapping information.

Output Annotation

We have shown how gridSVG can be used to modify its resulting SVG output. These settings can be particularly useful for deriving the structure of a source image. For example, we might like to know whether an id attribute is part of a grob path or a viewport path. If we know the values of separators used when exporting, we can make more sense of a gridSVG generated SVG document. This is also particularly useful for debugging a gridSVG image.

Settings used for controlling SVG output are exported as metadata in the SVG that gridSVG creates. To demonstrate this, we will draw a simple grid image (not shown), but show only the SVG metadata that was exported by gridSVG.

The metadata shows us exactly how the image was drawn. In particular, the gridsvg:argument elements tell us how the id attributes were controlled. We can see that when this particular image was exported, unique names were being enforced and paths were being generated only for viewports. Additionally, we can also see the values of the separators that were in use when the image was exported.

Mappings

We have discussed the many ways in which gridSVG modifies grob and viewport names, including ways to control how that happens. However, the key issue with this name translation is that it is difficult to predict how to map the names that are used in grid with the output produced by gridSVG. A recent development in gridSVG is the ability to retain mapping information that provides us with information on how to map a grid grob or viewport name to an SVG id attribute.

It is useful to have mapping information available both in R, and in JavaScript [9]. In R, we might want to perform some post-processing on the XML nodes that a grob maps to. If the id can be retrieved easily then performing this task is far simpler than writing an XPath [10] expression. Similarly, if we want to perform some modification on an SVG image in the browser, using tools like D3 [11], then knowing what content we're trying to select is an important problem to solve.

We will first look at the mapping information that gridSVG is exporting. We start with the following image:

This exports mapping information as JSON, a structured data format that is convenient for use within a web browser. The mapping information from this plot is shown below:

This is showing that we store both viewport and grob mapping information. Within each category, we store the name of the object, which has three pieces of information associated with it. The first are the integer suffixes that the name has been mapped to. For our example, we used the grob name b twice, so there are two suffixes associated with the b grob. A suffix can be used to construct an id attribute by concatenating the name, the id.sep value, and the suffix.

Also included are CSS selectors and XPath expressions which target the same id. These are included for convenience, and special characters are already escaped. This means that if we use a JavaScript library like D3 or jQuery [12], we can select the content immediately by just using the exported CSS selector.

In order to make the mapping information easy to use, gridSVG provides convenience functions in both R and JavaScript. The primary function that is used is getSVGMappings, which is named the same in both R and JavaScript. To demonstrate, we will be building upon the mapping information shown earlier.

The first thing that occurs is that because mapping information is stored in a file as JSON, we need to read it into R. The readMappingsJS function takes the filename containing mapping information, and reads that file into R and parses it as a list. The result can then be given to gridSVGMappings. Once this has been done we can apply the mapping information by using getSVGMappings.

It is important to note that when a name has been used more than once, instead of getting a single id value, we can end up with multiple ids. This ambiguity cannot be resolved because of issues discussed earlier, but at least we can reduce the search to only the ids that have been returned from the function. Typically there are few instances where multiple results are returned.

The same example can be performed in a browser using JavaScript. The output is shown below:

JS> getSVGMappings("a", "vp");
["a.1"]

JS> getSVGMappings("b", "grob");
["b.1", "b.2"]

In this example, it is shown that the function always returns an array of values, even when there is only one matching result. This keeps the handling of single and multiple matching results the same.

To return a CSS selector or XPath expression instead of an id we just need to specify that in the optional third parameter. Again, this is the case in both the R and JavaScript implementations of the function. This is shown below:

JS> getSVGMappings("a", "vp", "selector");
["#a\.1"]
JS> getSVGMappings("a", "vp", "xpath");
["//*[@id='a.1']"]

JS> getSVGMappings("b", "grob", "selector");
["#b\.1", "#b\.2"]
JS> getSVGMappings("b", "grob", "xpath");
["//*[@id='b.1']", "//*[@id='b.2']"]
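
The lookups above can be imitated with a small stand-in. Everything here is an assumption made for illustration: the layout of the mappings object follows the description in this section rather than the exact exported JSON, and lookupMappings is a hypothetical substitute for getSVGMappings, not its real implementation.

```javascript
// Hypothetical mapping object: each name carries its integer suffixes plus
// ready-made CSS selectors and XPath expressions.
const mappings = {
  "id.sep": ".",
  vps: {
    a: { suffix: [1], selector: ["#a\\.1"], xpath: ["//*[@id='a.1']"] }
  },
  grobs: {
    b: {
      suffix: [1, 2],
      selector: ["#b\\.1", "#b\\.2"],
      xpath: ["//*[@id='b.1']", "//*[@id='b.2']"]
    }
  }
};

// Hypothetical stand-in for getSVGMappings; it always returns an array.
function lookupMappings(name, type, result = "id") {
  const group = type === "vp" ? mappings.vps : mappings.grobs;
  const entry = group[name];
  if (!entry) return []; // unknown names give an empty array in this sketch
  if (result === "selector") return entry.selector;
  if (result === "xpath") return entry.xpath;
  // Build each id as name + id.sep + suffix.
  return entry.suffix.map((n) => name + mappings["id.sep"] + n);
}

console.log(lookupMappings("a", "vp"));
console.log(lookupMappings("b", "grob"));
console.log(lookupMappings("b", "grob", "xpath"));
```

Returning an empty array for an unknown name is a choice made for this sketch only.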

An example where this becomes useful is if you want to use D3 to modify content, perhaps using a transition. All that is required is to get the appropriate selector and D3 can select the appropriate content based on that selector. For example, the following shows how this might occur:

JS> var sel = getSVGMappings("a", "vp", "selector")[0];

JS> d3.select(sel)
JS+     .transition()
JS+     ...

In R, the use of the XML package [13] is more familiar, so we can use XPath expressions instead of CSS selectors.

With the development of retaining name mapping information we can more easily manipulate SVG images that have been exported by gridSVG.

Conclusion

We have demonstrated that grid grob and viewport names are required to be modified as they are translated to SVG id attributes. We have also shown why this is necessary and the process gridSVG takes to ensure valid SVG is generated.

gridSVG also provides two parameters in its gridToSVG function which affect how the modification of grob and viewport names occurs. Despite the modification of names, it is straightforward to retrieve possible matching ids using convenience functions that access gridSVG's name mapping information.

Downloads

This document is licensed under a Creative Commons Attribution 3.0 New Zealand License. The code is freely available under the GPL. The described functionality of gridSVG is present in version 1.1-0. gridSVG is currently under development on GitHub but will be migrated to R-Forge soon.

References

  1. R Development Core Team (2013). R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0.
  2. Murrell, P. and Potter, S. (2013). gridSVG: Export grid graphics as SVG. http://r-forge.r-project.org/projects/gridsvg/. R package version 1.0-0.
  3. W3C (2011). Scalable Vector Graphics (SVG) 1.1 (Second Edition) Specification. http://www.w3.org/TR/SVG/.
  4. W3C (2008). Extensible Markup Language (XML) 1.0 (Fifth Edition). http://www.w3.org/TR/xml/.
  5. Potter, S. and Murrell, P. (2012). Working with the gridSVG Coordinate System. http://stattech.wordpress.fos.auckland.ac.nz/2012-6-working-with-the-gridsvg-coordinate-system/.
  6. JSON: JavaScript Object Notation. http://www.json.org/.
  7. W3C (2011). Selectors Level 3. http://www.w3.org/TR/css3-selectors/.
  8. W3C (2011). Cascading Style Sheets Level 2 Revision 1 (CSS 2.1) Specification. http://www.w3.org/TR/CSS2/.
  9. ECMA International (2011). Standard ECMA-262: ECMAScript Language Specification. http://www.ecma-international.org/publications/standards/Ecma-262.htm.
  10. W3C (1999). XML Path Language (XPath) Version 1.0. http://www.w3.org/TR/xpath/.
  11. Bostock, M. (2013). Data Driven Documents. http://d3js.org/.
  12. The jQuery Foundation (2013). jQuery: The write less, do more, JavaScript library. http://jquery.com/.
  13. Lang, D. T. (2013). XML: Tools for parsing and generating XML within R and S-Plus. http://www.omegahat.org/RSXML/. R package version 3.96-0.2.