The purpose of this lab is to practise generating maps and, in particular, adding representations of data values to a map.
In this lab, we have more code than usual for setting up the data set. This is because we need data about the map we want to draw (the outline of New Zealand Police regions) in addition to the crime data that we have been using throughout the course. We then need to create data sets that combine the map data and the crime data.
The data set is a CSV file, nzpolice-proceedings.csv, which was derived from “Dataset 5” of Proceedings (offender demographics) on the policedata.nz web site.
We can read the data into an R data frame with read.csv().
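A minimal sketch of that step, assuming the CSV file sits in the current working directory (the lab does not state a path) and that the result is stored in crime, the name used by the code further down. The file.exists() guard is only there so that the snippet does nothing when the file is absent.

```r
# Assumption: the CSV is in the working directory; adjust the path as needed.
csvFile <- "nzpolice-proceedings.csv"
if (file.exists(csvFile)) {
  crime <- read.csv(csvFile)
  # Check which columns are available before processing them
  str(crime)
}
```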
The following code generates a column of real dates, generates a Year column, and makes a tweak to the Police.District column (which will be useful later when we merge this crime data with the map outline data).
crime$Month <- as.Date(crime$Date)
crime$Year <- as.POSIXlt(crime$Date)$year + 1900
crime$Police.District <- gsub("Of", "of", crime$Police.District)
For this lab we will drop the year 2014 (for which we only have partial data).
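One way to drop those rows, sketched on a made-up three-row data frame rather than the real data. The lab's own code for this step is not shown, so the use of subset() here is an assumption; the Year calculation is the one given above.

```r
# Made-up rows standing in for the real crime data
crime <- data.frame(
  Date = c("2014-12-01", "2015-01-01", "2016-06-01"),
  Police.District = c("Bay Of Plenty", "Wellington", "Bay Of Plenty")
)
crime$Year <- as.POSIXlt(crime$Date)$year + 1900

# Keep every row except those from the partial year 2014
crime <- subset(crime, Year != 2014)
```

Filtering on the derived Year column, rather than on the raw Date strings, keeps the condition independent of how the dates happen to be formatted.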
The following code generates total counts per Police District.
The following code generates total counts for each year per Police District.
The following code generates total counts for each type of crime per Police District.
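The code for these three summaries is not reproduced here. A possible sketch using base R's aggregate() is shown on made-up data; the column names Count and Crime.Type are hypothetical stand-ins for whatever the real CSV calls its count and offence-type columns, while the result names match the ones used in the join code later in the lab.

```r
# Made-up data; Count and Crime.Type are assumed column names
crime <- data.frame(
  Police.District = c("Northland", "Northland", "Wellington", "Wellington"),
  Year = c(2015, 2016, 2015, 2015),
  Crime.Type = c("Theft", "Theft", "Theft", "Assault"),
  Count = c(10, 20, 5, 7)
)

# Total count per Police District
crimePerDistrict <- aggregate(Count ~ Police.District, data = crime, FUN = sum)

# Total count for each year per Police District
crimeYearPerDistrict <- aggregate(Count ~ Police.District + Year,
                                  data = crime, FUN = sum)

# Total count for each type of crime per Police District
crimeTypePerDistrict <- aggregate(Count ~ Police.District + Crime.Type,
                                  data = crime, FUN = sum)
```

Each result has one row per combination of the grouping variables, which is the shape needed for joining to one row per district in the map data.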
Map data for the Police Districts was obtained from [Koordinates](https://koordinates.com/layer/105480-nz-police-district-boundaries-29-april-2021/).
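The summary printed next is the kind of message that st_read() from the sf package produces. A minimal sketch of the call, assuming the shapefile and its companion files have been downloaded into the working directory (the path in the printed output is the author's own) and that the result is stored in districts, the name used by the join code later in the lab:

```r
library(sf)

# Assumption: the .shp file (plus its .dbf, .shx and .prj companions)
# is in the working directory; adjust the path as needed.
shpFile <- "nz-police-district-boundaries-29-april-2021.shp"
if (file.exists(shpFile)) {
  districts <- st_read(shpFile)
}
```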
Reading layer `nz-police-district-boundaries-29-april-2021' from data source
`/home/pmur002/Uni of Auckland Dropbox/r-project/Files/Teaching/STATS787/2025/Labs/nz-police-district-boundaries-29-april-2021.shp'
using driver `ESRI Shapefile'
Simple feature collection with 12 features and 3 fields
Geometry type: MULTIPOLYGON
Dimension: XYZ
Bounding box: xmin: 1067061 ymin: 4701317 xmax: 2114868 ymax: 6242140
z_range: zmin: 0 zmax: 0
Projected CRS: NZGD2000 / New Zealand Transverse Mercator 2000
The following code adds centroids per region (as X and Y).
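The lab's code for this step is not shown. One possible approach with the sf package is sketched below on two made-up rectangles rather than the real district outlines; only the column names DISTRICT_N, X and Y come from the lab.

```r
library(sf)

# Two made-up rectangular "districts" (not the real boundaries)
p1 <- st_polygon(list(rbind(c(0, 0), c(2, 0), c(2, 2), c(0, 2), c(0, 0))))
p2 <- st_polygon(list(rbind(c(2, 0), c(6, 0), c(6, 2), c(2, 2), c(2, 0))))
districts <- st_sf(DISTRICT_N = c("A", "B"), geometry = st_sfc(p1, p2))

# Compute one centroid per geometry and extract its coordinates
# as a matrix with columns named X and Y
xy <- st_coordinates(st_centroid(st_geometry(districts)))
districts$X <- xy[, "X"]
districts$Y <- xy[, "Y"]
```

Taking the centroid of the geometry column alone, rather than of the whole sf object, avoids the warning that sf gives about attributes being assumed constant over each geometry.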
The following code combines the map data with the different crime counts.
crimeDistricts <- inner_join(districts, crimePerDistrict,
                             by=join_by(DISTRICT_N == Police.District))
crimeTypeDistricts <- inner_join(districts, crimeTypePerDistrict,
                                 by=join_by(DISTRICT_N == Police.District))
crimeYearDistricts <- inner_join(districts, crimeYearPerDistrict,
                                 by=join_by(DISTRICT_N == Police.District))
Each data visualisation in this lab will address at least one of the following questions:
Write R code to produce a map of the New Zealand Police Districts, with a label for each district.
HINT: I used hjust to shift the labels out of each other's way.
Write R code to produce a map of the New Zealand Police Districts, with each region coloured to represent the number of incidents in the region.
Comment on what this map tells us about the questions of interest. Are the visual channels used in this data visualisation helping or hindering us?
Comment on the major substantive problem with this map (hint: we read about substantive problems in week 1).
Write R code to produce a map of the New Zealand Police Districts, with a dot within each region and the area of the dot representing the number of incidents.
NOTE: the dots are semitransparent.
HINT: the dots are drawn at the centroids of the regions.
Comment on what this map tells us about the questions of interest. Does the different visual channel help with answering the questions?
Write R code to produce an animated map that shows the number of incidents in each region over time (one frame per year).
NOTE: there is a year label above the map.
Comment on what this map tells us about the questions of interest.
Write R code to produce the data visualisation below of the number of incidents per region over time.
Comment on what this data visualisation tells us about the questions of interest. Are there features that are easier to see in this plot versus the animated map? Are there features that are easier to see in the animated map versus this plot?
Write R code to produce a facetted plot of regions with one facet for each type of crime.
NOTE: there is no legend, there are no axis ticks or labels, and the strip labels are left-aligned.
Comment on what this map tells us about the questions of interest. What could we do to the colour scale to improve the effectiveness of this data visualisation?
Write R code to produce a non-map data visualisation that shows the number of incidents in each region for each type of crime.
Comment on whether it is easier or harder to answer the questions of interest with this data visualisation compared to the previous question and explain why.
No marks will be given for this question.
Write R code to produce a map of Police Districts with a simple embedded line plot for each region that shows the number of incidents over time for each region.
Is this a better data visualisation than the ones in Questions 4 and 5? What visual channels are we employing here?
Your submission should consist of a knitted R Markdown document, in HTML format, submitted via Canvas.
Your report should include:
Don’t forget to also complete the Canvas Quiz!
Marks will be lost for:
This work is licensed under a Creative Commons Attribution 4.0 International License.