Lab 6: Animation


The purpose of this lab is to practise generating animated data visualisations and to demonstrate the value of animation.

The Data Set

The data set is a CSV file, nzpolice-proceedings.csv, which was derived from “Dataset 5” of Proceedings (offender demographics) on the policedata.nz web site.

We can read the data into an R data frame with read.csv().

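# Read the raw proceedings data
crime <- read.csv("nzpolice-proceedings.csv")
# Date of each record, and the calendar year extracted from it
crime$Month <- as.Date(crime$Date)
crime$Year <- as.POSIXlt(crime$Date)$year + 1900
# Crime type as a factor, with levels ordered from least to most frequent
typeCount <- table(crime$ANZSOC.Division)
crime$Type <- factor(crime$ANZSOC.Division,
                     levels=names(typeCount)[order(typeCount)])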
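
As a rough illustration of what the last few lines do, the sketch below repeats the year extraction and the frequency-ordered factor on toy values. The crime types and dates here are made up for the example (they are not taken from the data set), and the dates are assumed to be in "YYYY-MM-DD" form.

```r
# Toy values only: hypothetical crime types and dates
division <- c("Theft", "Assault", "Theft", "Fraud", "Theft", "Assault")
dates <- c("2014-12-01", "2015-01-01")

# POSIXlt stores years as an offset from 1900, hence the "+ 1900"
year <- as.POSIXlt(dates)$year + 1900

# Count each type, then order the factor levels from rarest to most common
typeCount <- table(division)
type <- factor(division, levels = names(typeCount)[order(typeCount)])
levels(type)
```

Ordering the levels by frequency means that plots which follow the factor order (bar charts, for example) will arrange the crime types by overall count rather than alphabetically.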

For this lab we will drop the year 2014 (for which we only have partial data).

crime <- subset(crime, Year >= 2015)

Some questions will only look at data for youth crime.

youth <- subset(crime, Age.Lower <= 15)
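
One detail of subset() that may matter here: rows for which the condition evaluates to NA are dropped along with rows where it is FALSE. A minimal sketch on a toy data frame (the values are invented; only the column names follow the lab data):

```r
# Toy data frame: made-up values, column names borrowed from the lab data
toy <- data.frame(Year = c(2014, 2015, 2016, 2016),
                  Age.Lower = c(10, 15, 20, NA))

# Drop the partial year, then keep the youngest age groups
recent <- subset(toy, Year >= 2015)
youthToy <- subset(recent, Age.Lower <= 15)

# The row with a missing Age.Lower is not kept
nrow(youthToy)
```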

Questions of Interest

In this lab we are interested in a variety of questions (most data visualisations will only be relevant to one of these):

  • How does the proportion of male versus female crimes change over time? (Questions 1 and 2)
  • How does the number of incidents for each type of crime change over time? (Questions 3 and 4)
  • How does the number of incidents for each age group and each type of crime change over time? (Questions 5 and 6)

Data Visualisations

  1. Write R code using the ‘magick’ package to produce an animation of the proportion of female versus male youth crimes over time as shown below. You should produce an animation frame for each month of data.

    Comment on what this data visualisation tells us about the questions of interest.

    NOTE that there is also a text label that shows the changing date.

    NOTE that the animation only runs once, so you may need to reload the page, or view the animation image in a separate tab or window in order to see it run.

    The following code creates a matrix of proportions that can be used in this question.

    youthSexMonth <- t(apply(table(youth$Month, youth$SEX), 1, 
                             function(x) x/sum(x)))

    The following code defines the colours of the bars.

    female <- "#E46C0A"
    male <- "#0070C0"

  2. Write R code using the ‘gganimate’ package to produce a ‘ggplot2’ version of the animation from the previous question, as shown below.

    NOTE that the animation does NOT pause at each month AND there is a plot title that shows the changing month.

    The following code creates a data frame of proportions that can be used in this question.

    longYouthSexMonth <- reshape2::melt(youthSexMonth)
    names(longYouthSexMonth) <- c("Month", "Sex", "Prop")

  3. Write R code using the ‘gganimate’ package to produce an animation of the number of incidents for each type of crime per year, as shown below.

    Comment on what this data visualisation tells us about the questions of interest.

    NOTE that at each transition between years, the old bars fade out and the new bars fade in. There is also a label showing the changing year.

  4. Write R code to produce an animation of the number of crimes per year for different types of crime, with two of the crime types highlighted by a line and a point that move as the years change.

    Comment on what this data visualisation tells us about the questions of interest.

    It is up to you to decide whether to use ‘magick’ or ‘gganimate’ for this task.

    The following code generates a data frame of counts that may be used for this question.

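    # Count incidents for each combination of year and crime type
    crimeYearType <- as.data.frame(table(crime$Year, crime$ANZSOC.Division))
    # table() turns the years into factor levels, so convert them back to numbers
    crimeYearType$Year <- as.numeric(as.character(crimeYearType$Var1))
    crimeYearType$Type <- crimeYearType$Var2
    # Counts for 2022 only
    lastCount <- subset(crimeYearType, Year == 2022)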

  5. Write R code to produce a static visualisation of the number of incidents for each age group, for each type of crime and for each year, as shown below.

    Comment on what this data visualisation tells us about the questions of interest.

    NOTE that the y-axis scales are different on each row.

    NOTE that this figure is 8 inches high.

  6. Write R code using the ‘gganimate’ package to produce an animation of the number of incidents for each age group, for each type of crime, per year, as shown below.

    Comment on whether you can see any features in this animation that you cannot see in the static visualisation from the previous question.

    NOTE that the aspect ratio of the panels is 0.25.

    NOTE that this figure is 8 inches high.
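
The helper code supplied with Questions 1, 2 and 4 turns raw records into tables of counts or proportions, and then into "long" data frames. The sketch below walks through the same steps on a toy data frame. The values are invented, and it uses base R's prop.table() and as.data.frame() in place of apply() and reshape2::melt(); that substitution is an assumption about an equivalent route, not the method the lab requires.

```r
# Toy records: made-up values, column names borrowed from the lab data
toy <- data.frame(
    Month = as.Date(c("2015-01-01", "2015-01-01", "2015-01-01",
                      "2015-02-01", "2015-02-01")),
    SEX = c("Female", "Male", "Male", "Female", "Male"))

# Cross-tabulate, then divide each row by its total to get proportions
counts <- table(toy$Month, toy$SEX)
props <- prop.table(counts, 1)

# Long format: one row per (month, sex) combination
long <- as.data.frame(as.table(props))
names(long) <- c("Month", "Sex", "Prop")
# The months come back as factor levels, so convert them to dates again
long$Month <- as.Date(as.character(long$Month))
long
```

The long format has one row per bar to be drawn, which is the shape that 'ggplot2' (and therefore 'gganimate') expects; the wide matrix is more convenient when drawing one frame at a time.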

Challenge

  1. No marks will be given for this question.

    The data visualisation below packs a lot of information into a single image (although it is a little larger than normal at 8 inches square). Each panel shows data for one age group and type of crime, with a green-ish bar and a pink-ish bar for each month of data. The bars show the proportion of male versus female crimes. We can see all sorts of interesting differences between types of crime, between age groups, and over time. The only problem is the labelling of the types of crime.

    Can you design (if not implement) an animated data visualisation that would show this data at least as clearly, but allow for better labelling of the types of crime?

The Report

Your submission should consist of a knitted R Markdown document, in HTML format, submitted via Canvas.

Your report should include:

  • A brief description of the data and the questions we are trying to answer.
  • For each data visualisation, R code AND a brief text commentary.
  • A brief overall summary.

Don’t forget to also complete the Canvas Quiz!

Marking

Marks will be lost for:

  • Plagiarism.
  • A section of the report is missing.
  • The summary is too short or does not make sense.
  • Significantly poor R (or other) code.
  • Overly verbose code, output, or commentary.

Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License.