by Paul Murrell
The htmlPage function opens a new web page within
a browser window (or tab), optionally setting the initial content
for the page.
The function returns a unique identifier for the page, which
can be used in further calls to modify the page.
For example, the
appendChild function can be used
to add content and the
removeChild function can be
used to remove content.
The code below removes the first <p> element from the document.
The closePage function should be used once
we have finished with a page.
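The life cycle described above can be sketched as follows. This is a minimal sketch rather than the report's own listing. It assumes that the 'DOM' package is installed, that htmlPage accepts the initial HTML as its first argument, and that removeChild takes the page identifier followed by a CSS selector for the element to remove. The wrapper name demo_lifecycle is hypothetical.

```r
# Hypothetical wrapper; nothing runs until it is called explicitly.
demo_lifecycle <- function() {
  # Open a page with some initial content and keep its identifier.
  page <- DOM::htmlPage("<p>First paragraph</p>")
  # Add a second paragraph to the end of the page.
  DOM::appendChild(page, child = "<p>Second paragraph</p>")
  # Assumption: the second argument is a CSS selector for the element
  # to remove, so "p" is taken to identify the first <p> element.
  DOM::removeChild(page, "p")
  # Tell the package that we have finished with the page.
  DOM::closePage(page)
  invisible(NULL)
}

# Only attempt the demonstration in an interactive session with 'DOM' installed.
if (interactive() && requireNamespace("DOM", quietly = TRUE)) {
  demo_lifecycle()
}
```

Wrapping the calls in a function keeps the sketch harmless on machines without a browser or without the package.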
The arguments of the
appendChild function are described in turn below.
The first argument,
pageID, identifies the web page that
we are interacting with (as returned by htmlPage).
The second argument,
child, can be used to specify HTML
code for the new child element, as shown in the example from the
previous section. In that example, we specified a <p> element
explicitly, but there are many R packages that can help us to generate
HTML code (e.g., 'XML', 'htmltools', and 'xtable').
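Even without an extra package, base R string functions can build HTML code of the kind that the child argument expects. The following is only an illustrative sketch; the helper name make_paragraphs is hypothetical and the function does no escaping of special characters.

```r
# Hypothetical helper: wrap each piece of text in a <p> element.
make_paragraphs <- function(text) {
  sprintf("<p>%s</p>", text)
}

paragraphs <- make_paragraphs(c("Hello", "Goodbye"))
print(paragraphs)
```

Each element of the result is a single character value describing a single HTML element, so the elements would have to be passed to appendChild one at a time.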
The third argument to
appendChild, childRef,
provides an alternative way to specify the child element.
This argument can be used to specify a CSS selector for
an existing element in the web page. If we use this argument, we
can move an element from one place to another within the web
page. For example, the following code creates a web page with
several paragraphs.
The following call to
appendChild
moves the first paragraph to the end of
the page.
Exactly one of
child and childRef
must be specified. In the case of
child, the
value must be a single character value and it must describe a
single HTML element (though that element may have other elements
nested within it).
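The rule above can be expressed as a small check. This is a sketch of the rule only, not the package's own validation code; the function name check_child_args is hypothetical, and the check does not attempt to verify that the HTML describes a single element.

```r
# Hypothetical check mirroring the rule for 'child' and 'childRef'.
check_child_args <- function(child = NULL, childRef = NULL) {
  # Exactly one of the two arguments may be supplied.
  if (is.null(child) == is.null(childRef)) {
    stop("specify exactly one of 'child' and 'childRef'")
  }
  # HTML code for a new element must be a single character value.
  if (!is.null(child) && !(is.character(child) && length(child) == 1)) {
    stop("'child' must be a single character value")
  }
  invisible(TRUE)
}

check_child_args(child = "<p>New paragraph</p>")  # new element: accepted
check_child_args(childRef = "p")                  # existing element: accepted

# Supplying both arguments (or neither) is an error.
result <- tryCatch(check_child_args(child = "<p>a</p>", childRef = "p"),
                   error = function(e) conditionMessage(e))
print(result)
```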
The use of both
child and childRef
arguments reflects the fact that, in the real DOM API,
the corresponding
arguments are pointers to HTML elements.
In the 'DOM' package, we do not have pointers to HTML elements;
R communicates with the browser by passing JSON objects back and
forth over a websocket (thanks to the
'httpuv' package).
If we need to specify an HTML element that is not yet part of the
web page, we use HTML code (as in the
child argument).
If we need to specify an HTML element that is already part of the
web page, we use a CSS selector (as in the
childRef argument).
The fourth argument to
appendChild is called
parentRef. This is a CSS selector that specifies
the parent element that the child should be added to.
This argument defaults to
"body", which means
that the child is added to the end of the web page.
The following code adds a <span> element to the page
as a child of the second <p> element.
The fifth argument to
appendChild is called
css. This is a logical value specifying whether the
childRef and
parentRef arguments should be
interpreted as CSS selectors (the default), or as XPath expressions.
The following code uses XPath expressions to move the <span>
element from its current position to
be a child of the third <p> element.
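The difference between the two kinds of reference can be sketched as follows. The selector strings are assumptions chosen to be roughly equivalent for a simple page, the wrapper name move_span_with_xpath is hypothetical, and the call relies only on the childRef, parentRef, and css arguments described above.

```r
# Roughly equivalent ways to name the same elements (illustrative values).
selectors <- list(
  css   = c(child = "span",   parent = "p:nth-of-type(3)"),
  xpath = c(child = "//span", parent = "//p[3]")
)

# Hypothetical wrapper; 'page' is an identifier returned by htmlPage.
move_span_with_xpath <- function(page) {
  # css = FALSE asks for childRef and parentRef to be read as XPath.
  DOM::appendChild(page,
                   childRef = selectors$xpath[["child"]],
                   parentRef = selectors$xpath[["parent"]],
                   css = FALSE)
}
```

Calling move_span_with_xpath requires the 'DOM' package and an open page, so the sketch only defines the function.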
The sixth argument to
appendChild is called
async.
This is a logical value specifying whether the call is asynchronous.
By default, R will block until the web browser has responded with the
result of the
appendChild request. This allows us to
write requests to the browser in the familiar imperative programming
style (do A, then do B, etc).
However, requests to the browser are
inherently asynchronous, so it is also possible to send a request to
the browser and then run subsequent R code without waiting for
a response from the browser.
We will see a use for this when we get to
the section on calling R from the browser.
The seventh argument to
appendChild is either
NULL or an R function.
It can be used to
supply an R function that will be run once the web browser has responded
to the request.
In combination with
async, this can be used to execute
R code some unknown time in the future, whenever the web browser has
completed our request.
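The idea of sending a request, carrying on, and running a function when the reply arrives can be illustrated without a browser. The following is a toy simulation in plain R, not the 'DOM' package's mechanism; the names fake_browser, send_request, and deliver_responses are all hypothetical.

```r
# A toy stand-in for the browser: requests are queued and answered later.
fake_browser <- new.env()
fake_browser$pending <- list()

send_request <- function(html, callback = NULL) {
  # Queue the request and return immediately (asynchronous style).
  n <- length(fake_browser$pending)
  fake_browser$pending[[n + 1]] <- list(html = html, callback = callback)
  invisible(NULL)
}

deliver_responses <- function() {
  # Simulate the browser replying to every queued request, in order.
  for (request in fake_browser$pending) {
    if (is.function(request$callback)) request$callback(request$html)
  }
  fake_browser$pending <- list()
  invisible(NULL)
}

events <- character()
send_request("<p>Added later</p>",
             callback = function(value) {
               events <<- c(events, paste("callback received", value))
             })
events <- c(events, "R carried on without waiting")
deliver_responses()
print(events)
```

The callback is recorded after the line that follows the request, which is the ordering that an asynchronous call makes possible.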
The value returned by the
appendChild function
is the HTML code for
the child that was appended. If we specify the
child
as HTML code, the return value should be identical to that, as shown below.
However, if we specify
childRef, this returns the HTML code
for the child that we moved. The code below moves the first paragraph
on the page to the end of the page and returns the HTML code for
the element that was moved.
There is also an
appendChildCSS function that returns
a CSS selector for the child that was appended (or moved).
In the following code, we add a new paragraph to the end of the
page and the return value provides us with a CSS selector to
identify that new element.
When we specify a child to move, with a CSS selector in
childRef,
the return value gives the new position in the page,
not the old position, so the returned CSS selector will not be the same
as the one that we supplied.
Furthermore, the CSS selector that is returned is generated using the
"CSS Selector Generator" library, which attempts to
produce succinct CSS selectors, so it is difficult to predict
the format of the CSS selector result.
The 'DOM' package has so far only implemented a tiny part of the DOM interface, but several of the most common operations are possible: adding elements, removing elements, replacing elements, selecting elements (by ID or tag or class), setting attributes, etc.
In each case, where it makes sense, arguments are provided to
allow HTML elements to be specified as either HTML code (for new
elements) or
CSS selectors (for existing elements).
For example, it is possible
to replace an existing element with a new element ...
... or to replace an existing element with another existing element ...
Also, where it makes sense, there are function variations that
allow the return value to be either HTML code or CSS selectors.
For example, it is possible to get the results of
getElementsByTagNames as HTML code ...
... or as CSS selectors.
In addition to interacting with a web page that is initialised from
htmlPage, it is possible to interact with a
web page that already exists.
The filePage function can be used to open a web page
from the local filesystem.
Adding and removing content works just like before.
The urlPage function opens a web page from the given
URL (but only for the
http: protocol currently).
These functions allow us to manipulate a web page that we did not create (or do not want to have to go to the effort of creating).
The extra complication with these functions is that they only work if, in addition to having the 'DOM' package installed for R, we have installed the 'RDOM.user.js' user script in the browser. This script is included with the 'DOM' package, but must be manually installed, for example, using the Greasemonkey plug-in for Firefox. Furthermore, the settings for the user script will have to be modified to enable access to specific URLs (see the @include rules in 'RDOM.user.js').
It is also possible to manipulate a web page using
a headless browser (PhantomJS).
This works for each of
htmlPage, filePage, and
urlPage (without the need for a user script).
This somewhat ruins the point of dynamically modifying a web page because, with a headless browser, we cannot watch the changes being made to the web page on screen. However, the headless browser is extremely useful for testing.
Because R is communicating with the web browser via a websocket, it is also possible for the web browser to send requests to R, for example, in response to a user event, such as a mouse click.
This is done with a JavaScript function called
RDOM.Rcall. This function must, of course, be used
in JavaScript code, but we can write that code in R
and insert it in a web page using the R functions previously described.
The
RDOM.Rcall function takes three arguments: the name of
an R function, an HTML element, and a JavaScript
function (a callback) to run once the call to R has completed.
The R function given as the first argument to
RDOM.Rcall
must take 2 arguments. The first argument will contain HTML code
for the HTML element that was given as the second argument to
RDOM.Rcall
and the second argument will contain a CSS selector for that
element.
The following code provides a demonstration. First, we define an R
function, echo,
that takes two arguments and prints them to the screen.
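A function of this shape might look like the following sketch. The argument names elt and css, the output labels, and the sample values in the direct call are assumptions made for illustration; in real use the two values would be supplied by the browser.

```r
# Print the two values that accompany a call from the browser.
echo <- function(elt, css) {
  cat("HTML:", elt, "\n")  # HTML code for the element
  cat("CSS: ", css, "\n")  # CSS selector for the element
}

# Direct call with made-up values, to show the shape of the output.
echo("<span>special</span>", "p > span")
```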
Next, we open a browser window and append a paragraph with
a <span> element embedded in it.
Finally, we set the
onclick
attribute of the <span> element to be a call to
RDOM.Rcall, with the name of our R function,
"echo", as the first argument, the span element
itself (as
this) as the second argument, and
a callback as the third argument.
If we now click on the word "special" in the web browser, the R function
echo is called and we get
the following output in the R console:
It is important to note that the call from the web browser to R is
asynchronous. R is not blocked waiting for the call.
It is possible to include a request
to the browser in the R function that is called from the browser
(e.g., use R to modify an element in response to a mouse click in
the browser), but in that case, it is essential that the request from
R is also asynchronous (using the
async argument that was
described earlier).
The 'DOM' package provides a tool for generating and modifying the content of a web page on-the-fly. It does this through a web socket connection to a web browser, which allows R to send requests to the browser and allows the browser to send responses or even requests back to R.
In effect, the 'DOM' package uses a web browser as an interactive output device. We can write R code to produce output that is rendered by the browser. Furthermore, the browser can capture user events that occur on the output and call back to R.
Future development of the package will be aimed at allowing the generation and modification of SVG and CSS output so that the browser can also act as an interactive device for graphics and/or a mixture of graphics and textual content.
Several excellent packages already existed for manipulating web page content, but they did not provide exactly the right set of features: The 'XML' package, and more recently 'xml2', provide functions for manipulating XML and HTML content, but the document being modified is not associated with a web browser so changes are not dynamically visualised; packages such as 'Rapache' and 'Rook' allow R to act as a web server, but with a focus on supplying content on request from a web browser, not to allow R to drive the web browser; 'RSelenium' allows R to drive a web browser, but R cannot receive callbacks from the browser on user events; 'Shiny' allows us to create web content, including interactive elements that call back to R, but only via a higher-level framework that does not provide the level of fine control that we need. A very recent addition is the 'fiery' package, which is lower-level than 'Shiny', but very general-purpose. We may in the future explore 'fiery' as a possible basis for 'DOM' to build on (instead of 'httpuv').
The 'DOM' package (currently) has several important limitations:
only a tiny fraction of the DOM interface has been implemented so far;
the package has mostly only been tested on Linux, with Firefox
(there has been
one successful test of
htmlPage on Windows, with Chrome);
and the package
is only aimed at the case where R and the browser are running
together on the same machine.
Furthermore, the 'DOM' package is aimed at
a single-user, single R session scenario
(e.g., it is not difficult to clobber yourself
by running two R sessions that make use of the same port
for their websocket).
The 'DOM' package allows us to open a web page in a browser and to manipulate the content of the web page dynamically from R. It is also possible to arrange for R code to be run in response to user events in the browser.
The examples and discussion in this document relate to version 0.1 of the 'DOM' package.
This report was generated on Ubuntu 14.04 64-bit running and PhantomJS version 1.9.0.
"CSS Selector Generator", Riki Fridrich, https://github.com/fczbkk/css-selector-generator, date visited: 2016-07-26.
"Document Object Model (DOM)", World Wide Web Consortium (W3C), https://www.w3.org/DOM/, date visited: 2016-07-27.
"DOM", Web Hypertext Application Technology Working Group (WHATWG), https://dom.spec.whatwg.org/, date visited: 2016-07-27.
"Document Object Model (DOM)", Mozilla Developer Network (MDN), https://developer.mozilla.org/en-US/docs/Web/API/Document_Object_Model, date visited: 2016-07-27.
"fiery: A Lightweight and Flexible Web Framework", Thomas Lin Pedersen, https://cran.r-project.org/web/packages/fiery/index.html, https://github.com/thomasp85/fiery, date visited: 2016-07-26.
"htmltools: Tools for HTML", RStudio, Inc., https://cran.r-project.org/web/packages/htmltools/index.html, https://github.com/rstudio/htmltools, date visited: 2016-07-26.
"httpuv: HTTP and WebSocket Server Library", RStudio, Inc., https://cran.r-project.org/web/packages/httpuv/index.html, https://github.com/rstudio/httpuv, date visited: 2016-07-26.
"Rapache: R embedded inside Apache", Jeffrey Horner, https://github.com/jeffreyhorner/rapache, http://www.rapache.net/, date visited: 2016-07-26.
"Rook: a web server interface for R", Jeffrey Horner, https://cran.r-project.org/web/packages/Rook/index.html, date visited: 2016-07-26.
"RSelenium: R bindings for Selenium WebDriver", John Harrison, https://cran.r-project.org/web/packages/RSelenium/, http://ropensci.github.io/RSelenium, date visited: 2016-07-26.
"shiny: Web Application Framework for R", Winston Chang, Joe Cheng, JJ Allaire, Yihui Xie, and Jonathan McPherson, https://cran.r-project.org/web/packages/shiny/index.html, http://shiny.rstudio.com/, date visited: 2016-07-26.
"xml2: Parse XML", Hadley Wickham, James Hester, and Jeroen Ooms, https://cran.r-project.org/web/packages/xml2/index.html, https://github.com/hadley/xml2/, date visited: 2016-07-26.
"XML: Tools for Parsing and Generating XML Within R and S-Plus", Duncan Temple Lang and the CRAN Team, https://cran.r-project.org/web/packages/XML/index.html, http://www.omegahat.net/RSXML, date visited: 2016-07-26.
"xtable: Export Tables to LaTeX or HTML", David B. Dahl and David Scott, https://cran.r-project.org/web/packages/xtable/index.html, http://xtable.r-forge.r-project.org/, date visited: 2016-07-26.
An Introduction to the 'DOM' Package by Paul Murrell is licensed under a Creative Commons Attribution 4.0 International License.