The Twitter bots

@tuureti (and on Mastodon) reports punctuality, cancellations, and headway statistics for Auckland Transport buses, from the AT GTFS feed. Here’s a typical tweet

The graph has a dot for every bus in the GTFS feed. A bus counts as punctual if it is no more than 5 minutes late and, when its last trip update is a departure, no more than 1 minute early. Headway is computed over all routes with at least four active buses: for each route it is the mean absolute difference in delay between adjacent buses, and the report gives the median and 90th percentile across routes. Cancellations are service alert events with NO_SERVICE or REDUCED_SERVICE as the effect code and “Cancellation” in the text. The bot name means “late” in te reo.
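As a rough illustration of these definitions (a sketch, not the bot's actual code), here is how the punctuality and headway numbers could be computed in base R. The data frame, its column names, and the function names are all made up for the example. It assumes delays are in seconds with positive meaning late, that the rows are already ordered along each route so that consecutive rows are adjacent buses, and that the median and 90th percentile are taken across routes.

```r
# Hypothetical snapshot: one row per active bus, ordered along each route.
# delay is in seconds (positive = late), the unit GTFS-realtime uses.
buses <- data.frame(
  route        = c("70", "70", "70", "70", "NX1", "NX1"),
  delay        = c(30, 400, -90, 120, 0, 310),
  is_departure = c(TRUE, FALSE, TRUE, FALSE, TRUE, TRUE)
)

# On time: at most 5 minutes late and, for departures only,
# at most 1 minute early.
is_on_time <- function(delay, is_departure,
                       late_limit = 5 * 60, early_limit = 60) {
  not_late  <- delay <= late_limit
  not_early <- !is_departure | delay >= -early_limit
  not_late & not_early
}

# Proportion of buses that are on time.
punctuality <- mean(is_on_time(buses$delay, buses$is_departure))

# Headway irregularity for one route: mean absolute difference in delay
# between consecutive (adjacent) buses.
headway_score <- function(delay) mean(abs(diff(delay)))

# Only routes with at least four active buses are scored.
route_sizes <- table(buses$route)
eligible    <- names(route_sizes)[route_sizes >= 4]
scores <- sapply(eligible, function(r) {
  headway_score(buses$delay[buses$route == r])
})

# Summary across the eligible routes: median and 90th percentile.
headway_summary <- quantile(scores, probs = c(0.5, 0.9))
```

In this toy snapshot only route 70 has four buses, so it is the only route that contributes to the headway summary.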

@head_ways reports punctuality and cancellations for Wellington-area buses, from the Metlink GTFS feed. Here’s a typical tweet

The punctuality definitions are the same as for Auckland. Cancellations are all service alert events with NO_SERVICE as the effect and “is cancelled” in the text description. The bot name is a reference to the concept of headway in transit and to the traditional name of the Wellington area, te upoko o te ika a Maui, the head of the fish of Maui.
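The two cancellation rules differ only in which effect codes and which phrase they look for, so they can be sketched as one filter with different arguments. This is an illustration rather than the bots' own code: the alerts, the column names `effect` and `text`, and the function `is_cancellation` are invented, and the case-sensitive literal match is an assumption.

```r
# Hypothetical service alerts, reduced to the two fields the rules inspect.
alerts <- data.frame(
  effect = c("NO_SERVICE", "NO_SERVICE", "DETOUR", "REDUCED_SERVICE"),
  text   = c("The 8:15am Route 2 service is cancelled",
             "Stop 5000 is closed",
             "The 7:40am Route 3 service is cancelled",
             "Cancellation: Route 70 at 9:05am")
)

# An alert counts as a cancellation when its effect is one of `effects`
# and its text contains `phrase` (literal, case-sensitive match).
is_cancellation <- function(effect, text, effects, phrase) {
  effect %in% effects & grepl(phrase, text, fixed = TRUE)
}

# Wellington rule: NO_SERVICE and "is cancelled".
wellington <- is_cancellation(alerts$effect, alerts$text,
                              effects = "NO_SERVICE",
                              phrase  = "is cancelled")

# Auckland rule: NO_SERVICE or REDUCED_SERVICE and "Cancellation".
auckland <- is_cancellation(alerts$effect, alerts$text,
                            effects = c("NO_SERVICE", "REDUCED_SERVICE"),
                            phrase  = "Cancellation")
```

Under the Wellington rule only the first alert counts: the third has the right wording but the wrong effect code.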

The web pages

For Auckland:

For Wellington:

Inconceivable

Some of the punctuality outliers look implausible. Some of these are actually true, but many appear to be errors. There are instances where a bus starts updating its position long before it starts a trip, or keeps updating long after it finishes one. There are also outliers that may be due to a driver entering the wrong trip information into the bus computer. The bots trust the GTFS feed, which probably trusts the buses, and like all automated data systems the results may contain nuts.

Under the hood

Everything is written in R and runs on a virtual machine at the University of Auckland. The code uses the httr and jsonlite packages to get data from the GTFS feeds, the beeswarm package to do the dotplots, the htmlwidgets and leaflet packages to make the maps, and the twitteR package to tweet. I needed free accounts with Auckland Transport and Metlink, and a free Twitter developer account. Everything is rate-limited to some extent, but not enough to be a problem (I hope).
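For anyone curious about the fetching step, a minimal sketch with httr and jsonlite might look like the following. It is not the bots' real code: the function names, the `x-api-key` header name, and the sample JSON are placeholders, and the sample is a heavily simplified stand-in for a real GTFS-realtime feed. Each provider's API documentation gives the actual URL and authentication header.

```r
# Fetch a feed as JSON text. The URL, header name and key are placeholders;
# the real values come from each provider's API documentation.
fetch_feed <- function(url, api_key, key_header = "x-api-key") {
  headers <- stats::setNames(api_key, key_header)
  resp <- httr::GET(url, httr::add_headers(.headers = headers))
  httr::stop_for_status(resp)  # fail loudly on HTTP errors
  httr::content(resp, as = "text", encoding = "UTF-8")
}

# Turn the JSON text into a flat data frame with one row per feed entity.
# Nested fields become dotted column names, e.g. trip_update.delay.
parse_entities <- function(json_text) {
  feed <- jsonlite::fromJSON(json_text)
  jsonlite::flatten(feed$entity)
}

# Simplified, made-up sample so the parsing step can be run offline.
sample_json <- '{"entity":[
  {"id":"a","trip_update":{"trip":{"route_id":"70"},"delay":120}},
  {"id":"b","trip_update":{"trip":{"route_id":"NX1"},"delay":-30}}
]}'

entities <- parse_entities(sample_json)
```

With a real key, `parse_entities(fetch_feed(url, api_key))` would chain the two steps. Keeping the download separate from the parsing makes it easy to test the parsing on a saved response.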

The R processes used to run continually, but they now stop at night and get restarted the next day by a cron job. I’m hoping this will be more robust. Every month the virtual machine gets rebooted and I need to set up some file mounts manually, so the system will stop for a while.