'DOM' Version 0.4

by Paul Murrell http://orcid.org/0000-0002-3224-8858


opts_chunk$set(comment=" ", tidy=FALSE)
options(width=100)
# Use phantomjs 2.1
# (NOTE that this refers to a location within the Docker container that is
#  used to build the report)
phantomjs <- "/home/phantomjs-2.1.1-linux-x86_64/bin/phantomjs"
# phantomjs <- "/home/pmur002/Files/Research/Rstuff/AltEngine/PhantomJS/phantomjs-2.1.1-linux-x86_64/bin/phantomjs"
Sys.setenv(R_PHANTOMJSCMD=phantomjs)
# Generate PNG of web page
renderPage <- function(page, filestem) {
    writeLines(DOM:::getPage(page), paste0(filestem, ".html"))
    DOM:::render(page, paste0(filestem, ".png"))
}
library(DOM)
options(DOM.headless=TRUE)
options(DOM.width=300, DOM.height=100)

Creative Commons License
'DOM' version 0.4 by Paul Murrell is licensed under a Creative Commons Attribution 4.0 International License.


This report describes changes in version 0.4 of the 'DOM' package for R. The main change in this version is the addition of new functions that allow control over the Cascading Style Sheet (CSS) content of a web page. This provides programmatic control over the styling of HTML and SVG content on a page.

Element Style

For demonstration purposes, we will work with a web page consisting of a single paragraph (a more complex example is provided later).

library(DOM)
page <- htmlPage("<p>A paragraph</p>")
renderPage(page, "paraPage")

Because we will be working with this paragraph multiple times, the following code creates a pointer to the paragraph element. We will be able to use this to refer to the paragraph from now on.

p <- getElementsByTagName(page, "p", response=nodePtr())

A simple way to use CSS styling on an element on a web page is to define a style attribute for the element. The existing setAttribute function in the 'DOM' package already provides support for this. The following code sets the style attribute for the paragraph so that the text turns red.

setAttribute(page, p, "style", "color: red")
renderPage(page, "paraStyledPage")

However, this setAttribute approach is heavy-handed and does not provide fine control over the CSS styling because the entire style attribute has to be specified. For example, the following modification of the CSS styling replaces the previous setting; the text is now italic, but it is no longer red.

setAttribute(page, p, "style", "font-style: italic")
renderPage(page, "paraStyleAttrPage")
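To make the cost of this approach concrete, the following is a minimal base-R sketch of the bookkeeping needed to keep existing declarations when only the whole attribute can be written: the current attribute text has to be parsed, merged with the new declaration, and serialised again. The helpers parseStyle and mergeStyle are hypothetical names introduced here for illustration; they are not part of 'DOM', and the parsing is deliberately naive (it assumes that values contain no semi-colons).

```r
# Hypothetical helpers (not part of the 'DOM' package).
# Parse a style attribute such as "color: red; font-style: italic"
# into a named character vector c(color="red", "font-style"="italic").
parseStyle <- function(text) {
    decls <- trimws(strsplit(text, ";", fixed=TRUE)[[1]])
    decls <- decls[nzchar(decls)]
    # Split each declaration at its first colon only
    colon <- as.integer(regexpr(":", decls, fixed=TRUE))
    values <- trimws(substring(decls, colon + 1))
    names(values) <- trimws(substring(decls, 1, colon - 1))
    values
}

# Merge new declarations into an existing style attribute,
# overriding any properties that are already present.
mergeStyle <- function(existing, new) {
    merged <- parseStyle(existing)
    update <- parseStyle(new)
    merged[names(update)] <- update
    paste0(names(merged), ": ", merged, collapse="; ")
}

mergeStyle("color: red", "font-style: italic")
```

Under these assumptions, the merged text could be passed as the value in a call to setAttribute, but every change still rewrites the whole attribute.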

Properties versus Attributes

Another way to access the CSS styling on an element is through the style property of the element. In version 0.4 of 'DOM' there are two new functions, getProperty and setProperty, that allow us to access and modify element properties. The following code gets the style property for the paragraph.

style <- getProperty(page, p, "style")
style

The result is a DOM_CSSStyleDeclaration_ptr. Compare that result to what we get from getAttribute (another new function in version 0.4), which is just a character vector.

getAttribute(page, p, "style")

With getProperty, we get a pointer to a style object, rather than just the text value for a style attribute. The advantage of the style object is that we can access and set individual properties of that object. For example, the following code accesses the font-style property of the paragraph style.

getProperty(page, style, "font-style")

The following code sets the color property of the style. The advantage of this, compared to setting an attribute, is that we only set the color property of the style; the font-style property (italic) is untouched.

setProperty(page, style, "color", "red")
renderPage(page, "paraStylePropPage")

There is also a shorthand for getting and setting properties.

p$style$color
p$style$color <- "green"
renderPage(page, "paraShorthandPage")

In summary, with the new ability to get and set properties, we can easily access and modify individual CSS properties within the style property of an HTML element on a web page.

closePage(page)

Style sheets

Another way to use CSS styling on an element is to add a style sheet to the web page, with a CSS rule that targets the element. For examples in this section, we will start with a fresh page (because CSS styling via a style sheet has a lower priority than inline CSS styling via a style attribute).

page <- htmlPage("<p>A paragraph</p>")
renderPage(page, "paraPage")

A style sheet can be added to a page by adding a <style> element to the <head> element of the web page. Another option would be to add a <link> element (to point to an external style sheet). The existing appendChild function can do this for us.

appendChild(page, htmlNode('<style>p { color: red; }</style>'), parent=css("head"))
renderPage(page, "paraStyleSheetPage")

The style sheet consists of zero or more rules. In this case, there is a single rule:

p { color: red; }
  

Each rule consists of a selector and zero or more style declarations. The selector specifies the target of the rule (in this case, the selector p means that the rule will apply to all <p> elements in the page) and the style declarations have the same format as in the style attribute of an element: a CSS property name, followed by a colon, followed by a CSS property value (with a semi-colon between multiple style declarations).
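The structure of a rule described above can be sketched in a few lines of base R. The function cssRuleText is a hypothetical helper introduced only for illustration (it is not part of 'DOM'); it assembles the text of a rule from a selector and a named character vector of declarations, where the names are CSS property names.

```r
# Hypothetical helper (not part of the 'DOM' package).
# Build the text of a CSS rule from a selector and a named
# character vector of declarations (names are CSS property names).
cssRuleText <- function(selector, declarations=character()) {
    # A rule may have zero declarations
    if (length(declarations) == 0) {
        return(paste0(selector, " { }"))
    }
    body <- paste0(names(declarations), ": ", declarations, ";",
                   collapse=" ")
    paste0(selector, " { ", body, " }")
}

cssRuleText("p", c(color="red"))
cssRuleText("p", c(color="red", "font-style"="italic"))
cssRuleText("p")
```

The first call reproduces the single rule shown above; the last call shows that a rule with a selector and no declarations is still a valid rule.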

We can add more than one style sheet to a page and we can remove style sheets (with removeChild), but, as with style attributes, this is heavy-handed and does not allow fine control of the details of a style sheet.

CSS Rules

The new styleSheets function provides access to the current style sheets on a page. The result is a DOM_CSSStyleSheet_ptr, which is one or more pointers to the style sheet objects in the browser.

sheets <- styleSheets(page)
sheets

Having access to these style sheet objects is useful because we can use them with the new insertRule and deleteRule functions to add/remove individual rules to/from a style sheet. For example, the following code adds a new CSS rule, that also applies to <p> elements, so that the paragraph text becomes italic as well as red.

insertRule(page, sheets[1], "p { font-style: italic; }", 0)
renderPage(page, "paraInsertRulePage")

However, adding and removing entire rules is still a fairly coarse level of control. Even better would be control of the components of a rule: the selector and the style declarations.

CSS Style Rules

The cssRules property of a style sheet produces a DOM_CSSRule_ptr object: a vector of pointers to individual CSS rules. In this case, there are two CSS rules in the style sheet.

sheets[1]$cssRules

We can access the style property of a CSS style rule and that gives us a DOM_CSSStyleDeclaration_ptr (just like we got from accessing the style property of an HTML element). We can then get and set the properties of that object to access and modify the style declarations in the CSS rule in the style sheet. In the following code, we use CSS rule number 2 to get the rule that controls color, because the rule that we inserted above to control font-style was inserted at index 0 (i.e., BEFORE the color rule that was already in the style sheet). Note that the index given to insertRule is zero-based, whereas R subsetting of the rule pointers is one-based.

sheets[1]$cssRules[2]$style$color
sheets[1]$cssRules[2]$style$color <- "green"
renderPage(page, "paraCSSRulePage")
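The choice of rule number 2 can be checked with a small base-R model of the rule list. This sketch only models the indexing; it does not communicate with a browser. The function insertAt is a hypothetical stand-in for insertRule and takes a zero-based index.

```r
# Model the rules in the style sheet as a plain character vector.
rules <- c("p { color: red; }")

# Hypothetical stand-in for insertRule(): 'index' is zero-based,
# so index 0 places the new rule before all existing rules.
insertAt <- function(rules, rule, index) {
    append(rules, rule, after=index)
}

rules <- insertAt(rules, "p { font-style: italic; }", 0)
rules
# R subsetting is one-based: the original color rule is now element 2
rules[2]
```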

The function propertyNames can be used to get the names of all properties in a style declaration. This does not correspond to a DOM method; it is just a convenience function.

propertyNames(page, sheets[1]$cssRules[2]$style)

We can remove an existing property from a style declaration with the removeProperty function.

removeProperty(page, sheets[1]$cssRules[2]$style, "color")
propertyNames(page, sheets[1]$cssRules[2]$style)
renderPage(page, "paraRemovePropPage")

It is also possible to access the selector for a CSS rule, but this cannot be modified; if we want a rule to control a different target, we should make a new rule.

sheets[1]$cssRules[2]$selectorText

Similarly, we can view (but not edit) the full text for a CSS rule via the cssText property.

sheets[1]$cssRules[2]$cssText

In summary, several new functions, combined with the ability to get and set properties, allow us to access and modify entire style sheets for a web page. This means that we can programmatically control the appearance of entire sets of elements at once.

Building style from scratch

Most of the examples so far have involved working with a ready-made element with a style attribute or working with a ready-made style sheet. This section briefly demonstrates how to build a style sheet for a web page from the ground up.

We will again start with a web page containing a single paragraph and no CSS styling.

page <- htmlPage("<p>A paragraph</p>")
renderPage(page, "paraPage")

The first step is to create an empty style sheet. We can do this by creating an empty <style> element and adding that to the page.

styleElement <- createElement(page, "style")
appendChild(page, styleElement, parent=css("head"))

We can access the style sheet via the sheet property of the <style> element. The first thing we do with the style sheet is disable it so that we can build it up without affecting the page.

styleSheet <- styleElement$sheet
styleSheet$disabled <- TRUE

The next step is to add an empty rule to the style sheet. This allows us to specify just the selector for the rule.

insertRule(page, styleSheet, "p { }", 0)

We now create a short-cut to the new rule, to save on typing, and add style declarations to the rule.

rule1 <- styleSheet$cssRules[1]
rule1$style$color <- "red"
rule1$style$"font-style" <- "italic"

The last step is to enable the style sheet so that it can have an effect on the contents of the page.

styleSheet$disabled <- FALSE
renderPage(page, "paraGroundUpPage")

CSS and SVG

All of the examples so far have involved styling HTML elements. Styling SVG elements is very similar, but with the added complication that individual SVG elements have presentation attributes in addition to a style attribute.

For an HTML element, a style declaration in the style attribute will override any style declarations in a style sheet that target the element. For an SVG element, a style declaration in the style attribute will override any style declarations in a style sheet that target the element, which in turn will override any presentation attributes on the SVG element.
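The ordering just described can be written down as a small base-R sketch. The function resolveStyle is a hypothetical helper for illustration only (it is not part of 'DOM'), and it is a simplified model: it considers a single property and ignores selector specificity, !important, and inheritance.

```r
# Hypothetical helper (not part of the 'DOM' package).
# Each argument is the value supplied by that source, or NULL if
# the source does not set the property.
resolveStyle <- function(presentation=NULL, sheet=NULL, inline=NULL,
                         svg=TRUE) {
    # Presentation attributes only exist for SVG elements
    if (!svg) presentation <- NULL
    # Highest priority first: style attribute, style sheet,
    # presentation attribute
    candidates <- list(inline, sheet, presentation)
    for (value in candidates) {
        if (!is.null(value)) return(value)
    }
    NULL
}

resolveStyle(presentation="blue")
resolveStyle(presentation="blue", sheet="green")
resolveStyle(presentation="blue", sheet="green", inline="red")
```

In this model, a presentation attribute is used only when neither a style sheet rule nor a style attribute sets the same property.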

The following code demonstrates these rules. First of all, we add an SVG image to the page and then we set the presentation attribute fill on the element (so that it is filled blue).

appendChild(page, svgNode(' '), ns=TRUE)
circle <- getElementById(page, "c", response=nodePtr())
setAttribute(page, circle, "fill", "blue")
renderPage(page, "paraSVGPage")

Now if we add a style sheet to the page that targets that SVG element and has a style declaration for fill, it overrides the element's presentation attribute, and the circle turns green.

appendChild(page, htmlNode("