## Department of Statistics

# Seminars

**Fitting Markov chains to sampled aggregate data: modelling tree fern gametophyte growth under different conditions**

Speaker: Louise McMillan

Affiliation: The University of Auckland

When: Wednesday, 31 January 2018, 11:00 am to 12:00 pm

Where: 303-310

The alternate life-stage (gametophyte) of plants has received relatively little academic attention, and so the environmental effects on growth are not well understood. This talk focuses on a recent experiment by James Brock of the School of Biological Sciences in which he grew tree fern gametophytes in vitro, in different levels of light and phosphorus, and monitored their growth stages. I then worked on fitting Markov chains to the growth stage data, a task made more difficult by the fact that the data was incomplete aggregate data rather than following particular spores through each stage of their growth. This talk will cover the fitting methods used, and the results of the analysis.

**Bayesian nonparametric analysis of multivariate time series**

Speaker: Alexander Meier

Affiliation: Otto-von-Guericke University Magdeburg

When: Wednesday, 7 February 2018, 3:00 pm to 4:00 pm

Where: 303.310

While there is an increasing amount of literature about Bayesian time series analysis, only a few nonparametric approaches to multivariate time series exist. Many methods rely on Whittle's likelihood, involving the second order structure of a stationary time series by means of its spectral density matrix f. The latter is often modeled in terms of the Cholesky decomposition to ensure positive definiteness. However, asymptotic properties under these priors such as posterior consistency or posterior contraction rates are not known.

A different idea is to model f by means of random measures. This is in line with (1), who model the normalized spectral density of a univariate time series with a Dirichlet process mixture of beta densities. We use a similar approach, with matrix-valued mixture weights induced by a completely random matrix-valued measure (2,3). We use a class of infinitely divisible matrix Gamma distributions (4) for this purpose. While the procedure performs well in practice, we also establish posterior consistency and derive posterior contraction rates.

Authors:

Alexander Meier, Otto-von-Guericke University Magdeburg

Claudia Kirch, Otto-von-Guericke University Magdeburg

Renate Meyer, The University of Auckland

**An asymmetric measure of population differentiation based on the saddlepoint approximation method**

Speaker: Louise McMillan

Affiliation: The University of Auckland

When: Wednesday, 28 March 2018, 3:00 pm to 4:00 pm

Where: 303-310

In the field of population genetics there are many measures of genetic diversity and population differentiation. The best known is Wright's Fst, later expanded by Cockerham and Weir, which is very widely used as a measure of separation between populations. More recently a multitude of other measures have been developed, from Gst to D, all with different features and disadvantages. One thing these measures all have in common is that they are symmetric, which is to say that the Fst between population A and population B is the same as that between population B and population A. Following my work on GenePlot, a visualization tool for genetic assignment, I am now working on the development of an asymmetric measure, where the fit of A into B may not be the same as the fit of B into A. This measure will enable the detection of scenarios such as "subsetting", the relationship between a large, diverse population A and a smaller population B that has experienced genetic drift since being separated from A. The measure has several features that distinguish it from existing measures, and is constructed using the same saddlepoint approximation method that underlies GenePlot, which is used to approximate the multi-locus genetic distributions of populations.

**Smooth survival models**

Speaker: Mark Clements

Affiliation: Karolinska Institute, Stockholm, Sweden

When: Wednesday, 6 December 2017, 11:00 am to 12:00 pm

Where: 303.310

The R package rstpm2 includes implementations for two classes of smooth survival models. First, we have implemented generalized survival models, where g(S(t|x)) = eta(t,x) for a link function g, survival S at time t with covariates x and a linear predictor eta(t,x). We allow for penalized smoothers from the 'mgcv' package. These models include left truncation, right censoring, interval censoring, time-varying effects, gamma frailties and normal random effects. The models allow for the estimation of a variety of parameters, including time-dependent hazard ratios, survival differences, standardised survival and attributable fractions. Second, we have recently implemented smooth accelerated failure time models, such that g(S(t|x)) = eta0(log(t) - eta(t,x)) for a baseline linear predictor eta0. This model includes time-dependent acceleration factors and a variety of estimators.
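As an illustration of the generalized survival model form g(S(t|x)) = eta(t,x), two standard special cases are written out here. These are textbook results rather than content from the abstract, and the smooth baseline term $s(t)$ and the coefficient vector $\beta$ are notation assumed for this sketch.

$$
\begin{aligned}
g(S) &= \log(-\log S): & \log\{-\log S(t \mid x)\} &= s(t) + x^\top \beta && \text{(proportional hazards)} \\
g(S) &= \log\frac{1-S}{S}: & \log\frac{1 - S(t \mid x)}{S(t \mid x)} &= s(t) + x^\top \beta && \text{(proportional odds)}
\end{aligned}
$$

In the first case $\exp(s(t))$ plays the role of the baseline cumulative hazard, so $\exp(\beta)$ is a hazard ratio; in the second, $\exp(\beta)$ is an odds ratio for failure by time $t$.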

**Topological structures are consistently overestimated in functional complex networks**

Speaker: Javier Cano

Affiliation: King Juan Carlos University

When: Wednesday, 22 November 2017, 3:00 pm to 4:00 pm

Where: 303.610

Functional complex networks are a powerful way of representing connections between the elements composing a (complex) system, and of mapping the propagation of information between them; as such, they have successfully been used to improve our understanding of, for instance, the human brain. While usually not taken into account, links are characterised by some degree of uncertainty, which can affect the final structure we observe. In order to quantify this effect, this study introduces a Bayesian reconstruction framework and validates it with real electroencephalography brain data. We show that disregarding such uncertainty introduces a bias that results in an overestimation of all topological structures, especially when only short time series are available.

Authors:

Massimiliano Zanin, Seddik Belkoura, Javier Gomez, Cesar Alfaro, Javier Cano

**Two locality properties in two dimensions**

Speaker: Dr. Jesse Goodman

Affiliation: University of Auckland

When: Wednesday, 15 November 2017, 3:00 pm to 4:00 pm

Where: 303-610

In two dimensions, many self-interacting processes are described by the Schramm-Loewner Evolution SLE(kappa), a family of random fractal paths joining two boundary points of an underlying domain D. These continuous paths arise as the scaling limits of various discrete self-interacting paths, such as loop-erased random walk.

A self-interacting process has the locality property if it does not "feel" the boundary of its domain D until it hits the boundary. Among the two-dimensional processes known as Schramm-Loewner Evolution SLE(kappa), it is known that only one, SLE(6), satisfies the locality property. In this talk, I will describe the key properties that identify SLE(6) - the Domain Markov Property, conformal invariance, and the (classical) Locality Property - and introduce a "non-local" form of locality also satisfied by SLE(6), describing the behaviour of the process when it first encloses a target set.

**Analysis and Prediction of High-Dimensional Time Series**

Speaker: Fangyao Li

Affiliation: The University of Auckland

When: Wednesday, 8 November 2017, 3:00 pm to 4:00 pm

Where: 303-610

The need for predictions of high-dimensional time series arises with data from many applications, for instance, air pollution. To construct an overall predictor we use the matching pursuit algorithm (MPA), which selects a subset from a dictionary of possible predictors. Although there are theoretical results on the performance of MPA, there is no widely accepted stopping rule for the algorithm. We consider stopping rules using a new information theoretic (IT) criterion based on the degrees of freedom given by the trace of the hat matrix found at each MPA iteration. We compare the performance of IT criteria for different time series models using a simulation study.

We will also apply our IT criteria to model choice for air pollution data provided by the National Institute of Water and Atmospheric Research (NIWA).
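For readers unfamiliar with matching pursuit, the greedy selection step described in the abstract can be sketched as follows. This is a generic textbook version, not the speaker's implementation: the function name `matching_pursuit`, the simulated dictionary, and the fixed iteration count `n_iter` are all assumptions made for illustration, and the information-theoretic stopping rule from the talk is not implemented.

```python
import numpy as np

def matching_pursuit(D, y, n_iter):
    """Greedy matching pursuit over a dictionary D with unit-norm columns.

    At each step, pick the column most correlated with the current residual,
    add that correlation to its coefficient, and subtract the fitted part.
    """
    residual = y.astype(float).copy()
    coef = np.zeros(D.shape[1])
    selected = []  # column indices in order of first selection
    for _ in range(n_iter):
        corr = D.T @ residual
        j = int(np.argmax(np.abs(corr)))
        coef[j] += corr[j]
        residual -= corr[j] * D[:, j]
        if j not in selected:
            selected.append(j)
    return coef, selected, residual

# Simulated example (hypothetical data): the target uses only columns 4 and 17.
rng = np.random.default_rng(0)
D = rng.standard_normal((200, 30))
D /= np.linalg.norm(D, axis=0)          # normalise each predictor
y = 3.0 * D[:, 4] - 2.0 * D[:, 17]

coef, selected, residual = matching_pursuit(D, y, n_iter=25)
print("first selected columns:", selected[:2])
print("residual norm:", np.linalg.norm(residual))
```

Here the number of iterations is fixed in advance, which is exactly the choice a stopping rule is meant to replace: too few iterations underfit, too many start selecting noise predictors.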

**Predicting hotspots of nutrients in estuaries**

Speaker: Prof. Judi Hewitt

Affiliation: The University of Auckland

When: Wednesday, 1 November 2017, 3:00 pm to 4:00 pm

Where: 303-610

Making predictions of the impacts of stressors on ecological systems generally requires smart study designs and a range of statistical analyses. In particular, analyses need to be able to take into account information available in spatial (and temporal) variability and the likelihood of non-linear responses. I illustrate this with a study on how the ability of a system to deal with nutrients may change with increasing nutrient concentrations. The study design nested a manipulative experiment within a large scale spatial survey. The analyses included multiple regression, spatial pattern recognition and kriging to extrapolate results across an extensive area. The results demonstrate patchiness across a landscape in performance and the potential for changes in the location of hotspots with increasing nutrients.

**Estimating animal density with spatial capture-recapture**

Speaker: Dr. Ben Stevenson

Affiliation: University of Auckland

When: Wednesday, 25 October 2017, 3:00 pm to 4:00 pm

Where: 303-610

Spatial capture-recapture (SCR) methods emerged just over a decade ago, and quickly filled a niche in ecological statistics. SCR's versatility underlies its success: recordings of whale song across stretches of the Pacific Ocean, video images of leopards roaming the African plains, and genetic traces of passing rodents are all staple ingredients that SCR converts into estimates of animal density and distribution. In this talk, I outline some of my contributions to SCR methodology, with a particular focus on acoustic surveys.

**Computing Entropies with Nested Sampling**

Speaker: Dr. Brendon Brewer

Affiliation: University of Auckland

When: Wednesday, 18 October 2017, 3:00 pm to 4:00 pm

Where: 303-610

The Nested Sampling algorithm, invented in the mid-2000s by John Skilling, represented a major advance in Bayesian computation. Whereas Markov Chain Monte Carlo (MCMC) methods are usually effective for sampling posterior distributions, Nested Sampling also calculates the marginal likelihood integral used for model comparison, which is a computationally demanding task. However, there are other kinds of integrals that we might want to compute. Specifically, the entropy, relative entropy, and mutual information, which quantify uncertainty and relevance, are all integrals whose form is inconvenient in most practical applications. I will present my technique, based on Nested Sampling, for estimating these quantities for probability distributions that are only accessible via MCMC sampling. This includes posterior distributions, marginal distributions, and distributions of derived quantities. I will present an example from experimental design, where one wants to optimise the relevance of the data for inference of a parameter.
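As background, the basic Nested Sampling estimate of the marginal likelihood can be sketched on a toy problem. This is a minimal illustration of the standard algorithm only, not the entropy technique presented in the talk. The uniform prior on [0, 1], the Gaussian-shaped likelihood, the function names, and the use of plain rejection sampling to draw new points are all assumptions chosen to keep the example short.

```python
import numpy as np

def log_likelihood(theta, sigma=0.1):
    # Toy likelihood (assumed): unnormalised Gaussian bump centred at 0.5.
    return -0.5 * ((theta - 0.5) / sigma) ** 2

def nested_sampling(n_live=200, n_iter=1200, seed=1):
    """Estimate Z = integral of L(theta) over a Uniform(0, 1) prior."""
    rng = np.random.default_rng(seed)
    live = rng.uniform(0.0, 1.0, n_live)
    live_logl = log_likelihood(live)
    z = 0.0
    x_prev = 1.0  # prior mass still enclosed by the likelihood contour
    for i in range(1, n_iter + 1):
        worst = int(np.argmin(live_logl))
        l_star = live_logl[worst]
        x_i = np.exp(-i / n_live)            # expected shrinkage of prior mass
        z += np.exp(l_star) * (x_prev - x_i)
        x_prev = x_i
        # Replace the worst point by a prior draw with higher likelihood.
        # Rejection sampling is only practical for toy problems like this one.
        while True:
            cand = rng.uniform(0.0, 1.0)
            if log_likelihood(cand) > l_star:
                break
        live[worst] = cand
        live_logl[worst] = log_likelihood(cand)
    # Contribution of the remaining live points.
    z += x_prev * np.exp(live_logl).mean()
    return z

z_hat = nested_sampling()
z_true = 0.1 * np.sqrt(2.0 * np.pi)  # tails outside [0, 1] are negligible
print(f"estimate {z_hat:.4f}, analytic value {z_true:.4f}")
```

In realistic problems the rejection step is replaced by MCMC moves constrained to the region above the current likelihood threshold.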

**Externalities, optimization and regulation in queues**

Speaker: Prof. Moshe Haviv

Affiliation: Department of Statistics and the Federmann Center for the Study of Rationality, The Hebrew University of Jerusalem

When: Wednesday, 20 September 2017, 3:00 pm to 4:00 pm

Where: 303-610

The academic research on queues deals mostly with waiting. Yet the externalities, namely the added waiting time an arrival inflicts on others, are of no less, if not of more, importance. The talk will deal mostly with how the analysis of the externalities leads to the socially optimal behavior, while solving queueing dilemmas such as whether or not to join a queue, when to arrive at a queue, or from which server to seek service. Customers, being selfish, do not mind the externalities they impose on others. We show how in queues too, internalizing the externalities leads to self regulation. In this setting selecting the service regime is one of the tools for regulation. (Joint with Binyamin Oz)

**Australian initiatives for enticing next-gen statisticians**

Speaker: A/Prof Peter Howley

Affiliation: School of Mathematical and Physical Sciences/Statistics, The University of Newcastle

When: Friday, 8 September 2017, 2:00 pm to 3:00 pm

Where: 303-B05

This talk will be in two parts. The first will discuss recent initiatives to improve statistics education across Australia. The second will discuss collaborative research on health-care standards and improving health-care systems aided by Bayesian hierarchical modelling.

Peter Howley (https://www.newcastle.edu.au/profile/peter-howley) is Chair of the Statistical Society of Australia’s Statistical Education Section and Associate Professor in Statistics at Newcastle. He will describe recently developed national statistical initiatives and resources aimed at increasing access to and support within higher education. One of the initiatives was recently awarded the ISI’s 2017 Best Cooperative Project Award. The resources comprise short animated videos, interactive exercises and extension documents developing statistical threshold concepts, and tools to help primary and secondary school teachers and students engage with statistics via a national schools poster competition, including industry expert and ‘how to deliver’ videos. These aim to enable students and teachers to feel the interdisciplinary and pervasive nature and value of statistics, and make the field of statistics accessible. The exponentially increasing annual numbers of students participating and the positive feedback received are very promising. The teaming up of Sustainability, Statistics and STEM for a road trip to remote and rural NSW schools will be discussed, as will recent developments in teaching at The University of Newcastle.

Most of the medical work is done with the Australian Council on Healthcare Standards and Taipei Medical University.


**Combined nonparametric tests**

Speaker: Assoc. Prof. Stefano Bonnini

Affiliation: Department of Economics and Management, Center for Modelling Computing and Simulations, University of Ferrara (Italy)

When: Friday, 14 July 2017, 11:00 am to 12:00 pm

Where: 303-B07

In several application problems, the phenomena under study are multidimensional. Therefore, these phenomena are represented by multivariate variables. In multivariate inferential problems, such as tests of hypotheses for comparing two or more populations, where data are assumed to be determinations of random variables, standard parametric methods (e.g. likelihood ratio test, Hotelling's T² test, ...), when applicable, require stringent assumptions that make them non-robust and often inappropriate.

The main limits of these methods are:

1) the assumed underlying distribution is not always plausible or cannot be tested (especially for small sample sizes);

2) the dependence structure (apart from the infrequent case of independent variables) must be formally defined and estimated. For example, in the case of normal multivariate distributions, it is necessary to estimate the covariance matrix or correlation matrix.

The proposed combined nonparametric test is based on breaking the problem down into as many sub-problems as there are variables, and on applying a univariate permutation test to each sub-problem. The combination of the permutation significance level functions of each test provides a unique test statistic (and a unique p-value) to solve the multivariate problem.
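The combination step described above can be sketched for a two-sample comparison as follows. This is a minimal illustration under stated assumptions, not the speaker's procedure: the function name `npc_two_sample` is hypothetical, the partial statistic is taken to be the absolute difference in means, Fisher's combining function is used as one common choice among several, and the data are simulated.

```python
import numpy as np

def npc_two_sample(x, y, n_perm=999, seed=0):
    """Combine one permutation test per variable into a single global test.

    x, y: arrays of shape (n_x, k) and (n_y, k), one column per variable.
    Returns the k partial p-values and the global p-value.
    """
    rng = np.random.default_rng(seed)
    data = np.vstack([x, y])
    n_x, n = x.shape[0], data.shape[0]

    def partial_stats(d):
        # Assumed partial statistic: absolute difference in group means.
        return np.abs(d[:n_x].mean(axis=0) - d[n_x:].mean(axis=0))

    # Row 0 holds the observed statistics, rows 1..n_perm the permuted ones.
    stats = np.empty((n_perm + 1, data.shape[1]))
    stats[0] = partial_stats(data)
    for b in range(1, n_perm + 1):
        # Whole rows are permuted, so the dependence between variables is kept.
        stats[b] = partial_stats(data[rng.permutation(n)])

    # Significance level function: for every row b and variable k, the share
    # of rows whose statistic is at least as large as stats[b, k].
    p = (stats[None, :, :] >= stats[:, None, :]).mean(axis=1)

    # Fisher's combining function, applied to the observed and permuted rows.
    combined = -2.0 * np.log(p).sum(axis=1)
    global_p = float((combined >= combined[0]).mean())
    return p[0], global_p

# Simulated example: three variables, second group shifted by one unit.
rng = np.random.default_rng(42)
x = rng.normal(0.0, 1.0, size=(20, 3))
y = rng.normal(1.0, 1.0, size=(20, 3))
partial_p, global_p = npc_two_sample(x, y)
print("partial p-values:", partial_p)
print("global p-value:", global_p)
```

Because the same row permutation is applied to all variables at once, no model for the dependence between the partial tests has to be specified or estimated.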

The test is therefore distribution-free and the dependence of partial tests doesn