This lab is broken into two sections, requiring TWO submissions: a non-shiny part and a shiny part. This is the non-shiny part.

The Data Set

The data set is a CSV file, nzpolice-proceedings.csv, which was derived from “Dataset 5” of Proceedings (offender demographics) on the policedata.nz web site.

We can read the data into an R data frame with read.csv().

crime <- read.csv("nzpolice-proceedings.csv")
crime$Month <- as.Date(crime$Date)
crime$Year <- as.POSIXlt(crime$Date)$year + 1900
typeCount <- table(crime$ANZSOC.Division)
crime$Type <- factor(crime$ANZSOC.Division,
                     levels=names(typeCount)[order(typeCount)])

For this lab we will drop the year 2014 (for which we only have partial data).

crime <- subset(crime, Year >= 2015)
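
The year arithmetic above can be checked on a few toy values. This is a minimal sketch with invented dates, and it assumes the Date column holds text in year-month-day form (the actual CSV format is not shown here). The `$year` component of a POSIXlt object counts years since 1900, which is why 1900 is added back.

```r
# Toy dates, invented for illustration (assumed "YYYY-MM-DD" text)
dates <- c("2014-11-01", "2015-01-01", "2020-04-01")

# $year is the number of years since 1900
years <- as.POSIXlt(dates)$year + 1900

toy <- data.frame(Date = dates, Year = years)

# Same filter as in the lab: keep 2015 onwards
kept <- subset(toy, Year >= 2015)
print(kept)
```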

Some questions will use the “raw” crime data above, with one row per incident, and some questions will use the table-of-counts version of the data below, with one row per combination of crime type and month.

counts <- as.data.frame(table(crime$Type, crime$Month))
names(counts) <- c("Type", "Month", "Freq")
counts$Month <- as.Date(counts$Month)
counts$Abbrev <- counts$Type
levels(counts$Abbrev) <- sub("(.+?)(,|and|With|Offences|Endangering)(.+)",
                             "\\1", levels(counts$Abbrev))

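The shape of the table-of-counts version can be seen on a toy data frame. This is only a sketch: the rows are invented, and the two type labels are assumed to resemble the values in ANZSOC.Division. Note that table() produces a row for every combination of type and month, including combinations with a count of zero. The trimws() call is an addition that is not in the code above; it removes the trailing space that the substitution can leave behind when the match starts at " and".

```r
# Toy "raw" data: one row per incident (invented values)
toy <- data.frame(
  Type  = factor(c("Theft and Related Offences",
                   "Theft and Related Offences",
                   "Robbery, Extortion and Related Offences")),
  Month = as.Date(c("2021-01-01", "2021-02-01", "2021-01-01"))
)

# One row per combination of Type and Month, zero counts included
toyCounts <- as.data.frame(table(toy$Type, toy$Month))
names(toyCounts) <- c("Type", "Month", "Freq")
toyCounts$Month <- as.Date(toyCounts$Month)

# Abbreviate the labels: keep the text before the first separator word
toyCounts$Abbrev <- toyCounts$Type
levels(toyCounts$Abbrev) <- trimws(
  sub("(.+?)(,|and|With|Offences|Endangering)(.+)",
      "\\1", levels(toyCounts$Abbrev))
)
print(toyCounts)
```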
Questions of Interest

For each data visualisation in this Lab, we will be interested in answering the following question:

For specific data visualisations there may be additional specific questions of interest.

Data Visualisations

Direct interaction with ‘plotly’

library(plotly)

  1. The following code produces an interactive plot of crime frequencies over time for different types of crime.

    The abbreviated type labels are used in the legend to save space, but the text aesthetic is used to add the full type labels to the tooltips.

    gg <- ggplot(counts, aes(Month, Freq)) + 
        geom_line(aes(colour=Abbrev, group=Abbrev, text=Type))
    
    ggplotly(gg)

    Zooming and panning were used to explore the trends in less frequent crimes. A snapshot view is shown below.

    Interactions with the legend were used to isolate the trends for Public Order Offences and Dangerous or Negligent Acts. A snapshot view is shown below.

    Tooltips were used to identify that the sudden dip in Dangerous or Negligent Acts occurred in April 2020.

    The following interesting features and comparisons in trends were identified:

    • Dangerous or Negligent Acts have actually increased over time (while most other crimes have decreased).
    • Several other crimes have remained quite stable (Unlawful Entry, Prohibited, Sexual Assault, Robbery). These are all less common crimes.
    • There has been a recent drop in Illicit Drug Offences.
    • The sudden spikes in Offences Against Justice and Miscellaneous Offences both occurred in April 2020, the same month as the sudden dip in Dangerous or Negligent Acts.
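
A tooltip reading such as the April 2020 dip can be cross-checked without interaction by looking for the largest month-to-month drop in the counts. The sketch below uses made-up numbers and a shortened type label, so it illustrates the idea rather than reproducing the lab result; with the real data, the counts data frame and the full type label would be used instead.

```r
# Toy counts for one crime type (numbers are invented)
toyCounts <- data.frame(
  Type  = "Dangerous or Negligent Acts",
  Month = as.Date(c("2020-03-01", "2020-04-01", "2020-05-01")),
  Freq  = c(120, 40, 110)
)

one <- subset(toyCounts, Type == "Dangerous or Negligent Acts")
one <- one[order(one$Month), ]          # make sure months are in time order

# diff() gives the change from each month to the next;
# the most negative change marks the month where the dip lands
change   <- diff(one$Freq)
dipMonth <- one$Month[which.min(change) + 1]
print(dipMonth)
```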

Direct interaction with ‘plotscaper’

In this section we make use of ‘plotscaper’ to generate linked plots.

library(plotscaper)

For this question we will limit the exploration to just Jan 2021 onwards (to limit the data size).

crimeRecent <- subset(crime, Year >= 2021)

  1. The following code produces three (linked) interactive bar plots.

    layout <- matrix(c(1:3, 3), byrow=TRUE, ncol=2)
    schema <- create_schema(crimeRecent)
    bar <- add_barplot(schema, "Type")
    bar2 <- add_barplot(bar, "Age.Lower")
    bar3 <- add_barplot(bar2, "Date")
    plot <- set_layout(bar3, layout)
    render(plot, width=800, height=600)

    Using Q-hover, we can determine that the month with the highest crime count was March 2021.

    Using click-and-drag, we can highlight the youth (Age.Lower 15 and below) and view the corresponding bars in the Date plot. This shows no evidence of an increase in Youth Crimes in the second half of 2022 (see screen shot below).

    By selecting different age groups (and using tooltips to identify crime types) we can compare the distribution of crime types between age groups. This reveals, for example, that Theft is much more prominent among youth, whereas Traffic offences are more prominent in older age groups (see screen shots below).
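
The readings taken from the linked bar plots can also be cross-checked with static tables. This is a minimal sketch on invented rows; it assumes Age.Lower is a numeric lower bound of an age band and that Date is stored as text, neither of which is confirmed above. The Youth column and the shortened type labels are hypothetical names used only for this illustration.

```r
# Toy incidents (invented values)
toy <- data.frame(
  Type      = c("Theft", "Theft", "Theft", "Traffic", "Traffic", "Theft"),
  Age.Lower = c(10, 15, 15, 40, 40, 40),
  Date      = c("2021-03-01", "2021-03-01", "2021-04-01",
                "2021-03-01", "2021-05-01", "2021-03-01")
)

# Month with the highest count (the Q-hover reading)
busiest <- names(which.max(table(toy$Date)))

# Share of each crime type within youth and non-youth (the selection reading)
toy$Youth <- toy$Age.Lower <= 15
shares <- prop.table(table(toy$Youth, toy$Type), margin = 1)

print(busiest)
print(round(shares, 2))
```

Row proportions (margin = 1) are used so that the two age groups can be compared even when they contain very different numbers of incidents.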

Indirect interaction with ‘shiny’

See "interaction-shiny-model.Rmd"

Summary

In this lab we have experimented with producing both direct interactions (using ‘plotly’ and ‘plotscaper’) and indirect interactions (using ‘shiny’). Being able to modify and query the plots makes it easy to rapidly explore a wide range of different views of the data. Selections help to overcome overplotting, and tooltips save space that would otherwise be needed for labelling.

We have also seen some of the limitations of the different tools for producing interactive plots. Each tool has its own strengths and weaknesses.