

10.1 R syntax

R code consists of one or more expressions. This section describes several different sorts of expressions.

10.1.1 Constants

The simplest type of expression is just a constant value. The most common constant values in R are numbers and text. There are various ways to enter numbers, including using scientific notation and hexadecimal syntax. Text must be surrounded by double-quotes or single-quotes, and special characters may be included within text using various escape sequences. The help pages ?NumericConstants and ?Quotes provide a detailed description of the various possibilities.
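As a brief illustration of these forms, the following sketch enters the same number in three ways and builds a few text constants. The variable names and values are chosen only for this example.

```r
# Numeric constants: plain decimal, scientific notation, and hexadecimal
dec <- 1500
sci <- 1.5e3      # 1.5 x 10^3
hex <- 0x5DC      # hexadecimal for 1500

# Text constants: double-quotes and single-quotes are interchangeable,
# which makes it easy to include one kind of quote inside the other
dq <- "a 'quoted' word"
sq <- 'a "quoted" word'

# Escape sequences: \n is a newline and \t is a tab
esc <- "first line\nsecond\tline"
cat(esc, "\n")
```

All three numeric constants represent the same value, so dec == sci and sci == hex are both TRUE.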

Anything that does not start with a digit and is not within quotes is a symbol.

There are a number of reserved symbols with predefined meanings: NA (missing value), NULL (an empty data structure), NaN (Not a Number), Inf and -Inf (positive and negative infinity), TRUE and FALSE, and the symbols used for control flow as described below.
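The following sketch shows how some of these special values arise and can be tested for. The vector heights is a made-up example, and c(), is.na(), and length() are standard R functions that are not otherwise discussed in this section.

```r
# NA marks a missing value; is.na() tests for it
heights <- c(1.72, NA, 1.65)    # hypothetical data
is.na(heights)                  # FALSE TRUE FALSE

# NULL is an empty object with length zero
length(NULL)                    # 0

# Inf, -Inf, and NaN arise from arithmetic
1 / 0                           # Inf
-1 / 0                          # -Inf
0 / 0                           # NaN

# TRUE and FALSE are the logical constants
is.logical(TRUE)                # TRUE
```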

Section 10.1.5 will describe how to create new symbols.

10.1.2 Arithmetic operators

R has all of the standard arithmetic operators such as addition (+), subtraction (-), division (/), multiplication (*), and exponentiation (^). R also has operators for integer division (%/%) and remainder on integer division (%%; also known as modulo arithmetic).
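A short sketch of each operator, using the arbitrary values 7 and 2:

```r
7 + 2      # 9
7 - 2      # 5
7 * 2      # 14
7 / 2      # 3.5
7 ^ 2      # 49
7 %/% 2    # 3  (integer division)
7 %% 2     # 1  (remainder)

# Integer division and remainder together recover the original value
(7 %/% 2) * 2 + (7 %% 2)    # 7
```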

10.1.3 Logical operators

The comparison operators <, >, <=, >=, ==, and != test whether the values in one vector are less than, greater than, equal to, or not equal to the corresponding values in another vector. The %in% operator determines whether each value in the left operand can be matched with one of the values in the right operand. The result of all of these operators is a logical vector.
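For example, with two small vectors a and b (names and values invented for this sketch), the comparisons work element by element, whereas %in% checks each value of a against all of b:

```r
a <- c(1, 5, 10)
b <- c(3, 5, 7)

a < b       # TRUE FALSE FALSE
a == b      # FALSE TRUE FALSE
a != b      # TRUE FALSE TRUE
a >= b      # FALSE TRUE TRUE

# %in% asks, for each value of a, whether it appears anywhere in b
a %in% b    # FALSE TRUE FALSE
```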

The logical operators || (or) and && (and) can be used to combine two logical values and produce another logical value as the result. The operator ! (not) negates a logical value. These operators allow complex conditions to be constructed.

The operators | and & are similar, but they combine two logical vectors. The operation is performed element by element, so the result is also a logical vector.
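The difference between the two families of operators can be sketched as follows. The names p, q, and n are arbitrary and chosen only for this example.

```r
p <- c(TRUE, TRUE, FALSE)
q <- c(TRUE, FALSE, FALSE)

# Element-by-element operators work on whole vectors
p & q     # TRUE FALSE FALSE
p | q     # TRUE TRUE FALSE
!p        # FALSE FALSE TRUE

# && and || combine single logical values, as needed for a condition
n <- 5
n > 0 && n < 10      # TRUE
n < 0 || n > 100     # FALSE
```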

Section 10.3.4 describes several functions that perform comparisons.

10.1.4 Function calls

A function call is an expression of the form:

functionName(arg1, arg2)

A function can have any number of arguments, including zero. Every argument has a name.

Arguments can be specified by position or by name (name overrides position). Arguments may have a default value, which they will take if no value is supplied for the argument in the function call.

All of the following function calls are equivalent (they all generate a numeric vector containing the integers 1 to 10):

seq(1, 10)             # positional arguments
seq(from=1, to=10)     # named arguments
seq(to=10, from=1)     # names trump position
seq(1, 10, by=1)       # 'by' given explicitly; 1 is what it defaults to here

Section 10.3 provides details about a number of important functions for basic data processing.


10.1.5 Symbols and assignment

Any name that does not start with a digit and is not a reserved keyword is treated as a symbol. A value is assigned to a symbol using the <- operator; thereafter, any expression involving the symbol produces the value most recently assigned to it.

> x <- 1:10
> x
 [1]  1  2  3  4  5  6  7  8  9 10




10.1.6 Loops

A loop is used to repeatedly run a group of expressions.

A for loop runs expressions a fixed number of times. It has the following general form:

for (symbol in sequence) {
    expressions
}

The expressions are run once for each element in the sequence, with the relevant element of the sequence assigned to the symbol.
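As a minimal sketch, the following loop adds up the integers 1 to 5. The symbols i and total are names chosen for this example.

```r
total <- 0
for (i in 1:5) {
    # i takes the values 1, 2, 3, 4, 5 in turn
    total <- total + i
}
total    # 15
```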

A while loop runs expressions repeatedly for as long as a condition remains TRUE. It has the following general form:

while (condition) {
    expressions
}

The while loop repeats until the condition is FALSE. The condition is an expression that should produce a single logical value.
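For instance, the following sketch doubles a value until it reaches at least 100. The symbol n and the limit of 100 are arbitrary choices for this example.

```r
n <- 1
while (n < 100) {
    # runs while n is 1, 2, 4, ..., 64; stops once n reaches 128
    n <- n * 2
}
n    # 128
```

The number of repetitions is not fixed in advance; it depends on when the condition first becomes FALSE.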


10.1.7 Conditional expressions

A conditional expression is used to make expressions contingent on a condition.

A conditional expression in R has the following form:

if (condition) {
    expressions
}

The condition is an expression that should produce a single logical value, and the expressions are only run if the result of the condition is TRUE.

The curly braces are not necessary, but it is good practice to always include them; if the braces are omitted, only the first complete expression following the condition is run.

It is also possible to have an else clause.

if (condition) {
    trueExpressions
} else {
    falseExpressions
}
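A small sketch of the if and else form, using a made-up temperature value and threshold:

```r
temperature <- 31    # hypothetical value
if (temperature > 30) {
    label <- "hot"
} else {
    label <- "not hot"
}
label    # "hot"
```

Because the condition is TRUE, only the first set of expressions is run; with a temperature of 30 or less, label would instead be "not hot".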

Paul Murrell

This work is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 New Zealand License.