Code Development: Diffing R Code

2024-04-18

One of the purposes of writing a function is to avoid repetition in our code. In order to identify repetition, including what changes and what stays the same, it is useful to be able to generate a “diff” of two sets of code.

For example, the following output shows differences between two pieces of code that calculate moving averages. We can see that only two things change: the size of the “window” (the subset of the data over which each average is calculated) grows from 5 to 7, so the offset 4 becomes 6 in the code, and the name of the symbol that holds the averages changes from avg to avg7.

< window size 5                        > window size 7
@@ 1,6 @@                              @@ 1,6 @@
  n <- length(b1temp)                    n <- length(b1temp)
< avg <- numeric(n - 4)                > avg7 <- numeric(n - 6)
< for (i in 1:(n - 4)) {               > for (i in 1:(n - 6)) {
<     window <- i:(i + 4)              >     window <- i:(i + 6)
<     avg[i] <- mean(b1temp[window])   >     avg7[i] <- mean(b1temp[window])
  }                                      }
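
The diff hints at what a function to replace both pieces of code might look like: the lines that stay the same become the body and the value that changes becomes an argument. The following is only a sketch. The names movingAverage, x, width, and temps are assumptions made for illustration, and the function assumes that the data has at least width values.

```r
# Hypothetical function: 'x' stands in for b1temp and
# 'width' is the window size (5 or 7 in the diff above)
movingAverage <- function(x, width) {
    n <- length(x)
    offset <- width - 1              # 4 or 6 in the original code
    avg <- numeric(n - offset)
    for (i in 1:(n - offset)) {
        window <- i:(i + offset)
        avg[i] <- mean(x[window])
    }
    avg
}

# Made-up data, just to show the two calls
temps <- c(10, 12, 11, 13, 15, 14, 16, 18)
avg <- movingAverage(temps, 5)
avg7 <- movingAverage(temps, 7)
```

Each of the two original pieces of code then reduces to a single call that differs only in the window size.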

The {diffobj} package provides some nice functions for generating “diff”s like this. The only difficulty is how we pass {diffobj} the code that we want to compare:

If we have two functions to compare …
```
f1 <- function(x) {
    x
}
```
```
f2 <- function(x, y) {
    x + y
}
```
… then we can just call diffPrint() and pass it the function names …
```
diffobj::diffPrint(f1, f2)
```
```
< f1                     > f2
@@ 1,3 @@                @@ 1,3 @@
< function(x) {          > function(x, y) {
<     x                  >     x + y
  }                        }
```
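A problem may occur if R formats the function code differently from our original (for example, when source references are not kept, as with options(keep.source = FALSE), R prints a deparsed version of the function rather than the text we typed). In that case, we can save the code for our functions to separate text files and then compare the files with diffFile() …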
```
diffobj::diffFile("f1.R", "f2.R")
```
```
< f1.R                       > f2.R
@@ 1,3 @@                    @@ 1,3 @@
< f1 <- function(x) {        > f1 <- function(x, y) {
<     x                      >     x + y
  }                            }
```
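
One way to create the files from within R is sketched below. The file locations (under tempdir()) and the names f1File and f2File are assumptions made for illustration; any text editor would do just as well. Because the code is written out as plain text, R's own formatting of functions is never involved.

```r
# Hypothetical temporary files standing in for f1.R and f2.R
f1File <- file.path(tempdir(), "f1.R")
f2File <- file.path(tempdir(), "f2.R")

# Write the code exactly as we typed it, one string per line
writeLines(c("f1 <- function(x) {",
             "    x",
             "}"),
           f1File)
writeLines(c("f2 <- function(x, y) {",
             "    x + y",
             "}"),
           f2File)

# Compare the files (only attempted if {diffobj} is installed)
if (requireNamespace("diffobj", quietly = TRUE)) {
    print(diffobj::diffFile(f1File, f2File))
}
```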
If we have two code chunks to compare …
```
x <- 1
```
```
x <- 1:10
```
… we can call diffPrint(), but we need to quote() the code chunks …
```
code1 <- quote(x <- 1)
code2 <- quote(x <- 1:10)
```
```
diffobj::diffPrint(code1, code2)
```
```
< code1      > code2
@@ 1 @@      @@ 1 @@
< x <- 1     > x <- 1:10
```
… and if (as is likely) the code chunk has multiple expressions, we need to make the code chunks compound expressions (by adding curly brackets) …
```
code1 <- quote({
    x <- 1
    y <- 2
})
code2 <- quote({
    x <- 1:10
    y <- 2:20
})
```
```
diffobj::diffPrint(code1, code2)
```
```
< code1          > code2
@@ 1,4 @@        @@ 1,4 @@
  {                {
<     x <- 1     >     x <- 1:10
<     y <- 2     >     y <- 2:20
  }                }
```
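
To see what quote() gives us, and where the curly brackets in the diff come from, we can inspect one of the quoted chunks. This is a small sketch using only base R: quote() captures the code as a language object without running it, and diffPrint() compares how the two objects print.

```r
code1 <- quote({
    x <- 1
    y <- 2
})

class(code1)      # "{": a call to the curly bracket function
length(code1)     # 3: the bracket plus the two expressions
deparse(code1)    # the code as text, one string per line
```

The brackets are part of the quoted object, so they appear in the diff as unchanged lines around the expressions that differ.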
… or we can save the code chunks to separate text files and then compare the files with diffFile() …
```
diffobj::diffFile("code1.R", "code2.R")
```
```
< code1.R     > code2.R
@@ 1,2 @@     @@ 1,2 @@
< x <- 1      > x <- 1:10
< y <- 2      > y <- 2:20
```
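
If saving files is inconvenient, another possibility, which goes beyond the approaches above, is to hold each code chunk as a character vector with one string per line and compare the vectors with diffChr(), the {diffobj} function for character vectors. The names chunk1 and chunk2 are assumptions made for illustration.

```r
# Each code chunk as text, one string per line
chunk1 <- c("x <- 1",
            "y <- 2")
chunk2 <- c("x <- 1:10",
            "y <- 2:20")

# Compare the text directly (only attempted if {diffobj} is installed)
if (requireNamespace("diffobj", quietly = TRUE)) {
    print(diffobj::diffChr(chunk1, chunk2))
}
```

As with diffFile(), the code is treated as plain text, so no curly brackets are needed for chunks with multiple expressions.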

This work is licensed under a Creative Commons Attribution 4.0 International License.