Department of Statistics


There are no seminars to show.

Past seminars

Integrative analysis of high-dimensional data with applications to soil microbiome data.

Speaker: Innocenter Amima

Affiliation: Department of Statistics, Auckland University

When: Wednesday, 2 December 2020, 11:00 am to 12:00 pm

Where: 303-310

The Vineyard Ecosystem (VE) project intends to determine the long-term effects of management practices on commercial grapevine productivity and vine longevity. This novel project uses an ecology approach to understand the interconnections between components within the vineyard ecosystems.

Microbiome count data are high dimensional, compositional and over-dispersed, with many zero abundances (~50-90% of all the values). Statistical methods have been proposed to perform dimension reduction; however, some of these methods fail to account for the properties of microbiome data. For our preliminary analysis, we used principal coordinate analysis (PCoA). Because no species-specific parameters were estimated, identifying the species present in the components was a challenge. In this talk, preliminary results from the VE project will be discussed together with future research plans. I will briefly introduce factor analysis (FA), a model-based approach, for dimension reduction and parsimonious modelling.

Bayes in the time of Big Data

Speaker: Andrew Holbrook

Affiliation: Department of Biostatistics, University of California, Los Angeles

When: Thursday, 12 November 2020, 1:00 pm to 2:00 pm

Where: 303-610

Abstract: Big Bayes is the computationally intensive co-application of big data and large, expressive Bayesian models for the analysis of complex phenomena in scientific inference and statistical learning. Standing as an example, Bayesian multidimensional scaling (MDS) can help scientists learn viral trajectories through space and time, but its computational burden prevents its wider use. Crucial MDS model calculations scale quadratically in the number of observations. We mitigate this limitation through massive parallelization using multi-core central processing units, instruction-level vectorization and graphics processing units (GPUs). Fitting the MDS model using Hamiltonian Monte Carlo, GPUs can deliver more than 100-fold speedups over serial calculations and thus extend Bayesian MDS to a big data setting. To illustrate, we employ Bayesian MDS to infer the rate at which different seasonal influenza virus subtypes use worldwide air traffic to spread around the globe. We examine 5392 viral sequences and their associated 14 million pairwise distances arising from the number of commercial airline seats per year between viral sampling locations. To adjust for shared evolutionary history of the viruses, we implement a phylogenetic extension to the MDS model and learn that subtype H3N2 spreads most effectively, consistent with its epidemic success relative to other seasonal influenza subtypes.
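The quadratic scaling mentioned in the abstract can be checked with a short calculation. The sketch below assumes that the "14 million pairwise distances" refers to all unordered pairs among the 5392 sequences; the function name is illustrative only and is not taken from the speaker's software.

```python
def pairwise_count(n: int) -> int:
    """Number of unordered pairs among n observations: n * (n - 1) / 2."""
    return n * (n - 1) // 2


n_sequences = 5392  # number of viral sequences quoted in the abstract
print(pairwise_count(n_sequences))  # 14534136, i.e. roughly 14 million

# Doubling the number of observations roughly quadruples the number of pairs,
# which is why the cost of the distance calculations grows quadratically.
print(pairwise_count(2 * n_sequences) / pairwise_count(n_sequences))
```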

Dr Holbrook is an Assistant Professor in the Department of Biostatistics, University of California, Los Angeles. His research interests include Bayesian statistics (theory and methods) and hierarchical modelling, computational statistics and high-performance computing, spatial epidemiology, and Alzheimer’s disease.

The population properties of binary black holes with Bayesian hierarchical modelling

Speaker: Eric Thrane

Affiliation: School of Physics and Astronomy, Monash University

When: Thursday, 22 October 2020, 1:00 pm to 2:00 pm

Where: 303-610

Bayesian inference finds elegant application in gravitational-wave astronomy thanks to the clear predictions of general relativity and the great simplicity with which gravitational-wave sources can be described. Gravitational-wave astronomers use Bayesian inference to solve a variety of problems, for example, to determine the masses of merging black holes and to work out the neutron star equation of state, which governs how matter behaves at the highest possible densities. As the catalog of gravitational-wave signals has grown to dozens, it is increasingly fruitful to apply the Bayesian method of hierarchical modelling to study the population properties of black holes and neutron stars. In this talk, which is pitched for a mixed audience of statisticians and physicists, I will discuss some of the most exciting discoveries from the field of gravitational-wave astronomy, and highlight how hierarchical modelling is used to answer emerging questions about the fate of massive stars and how black holes merge.

Prof Eric Thrane is a professor in the School of Physics and Astronomy at Monash University. His research interests include astrophysical inference using data from gravitational-wave observatories to answer questions such as: how do compact binaries form, what is the fate of massive stars, and what is the nature of matter at the highest possible densities?

To summarise or not to summarise: A comparison of likelihood-free methods with and without summary statistics

Speaker: Chris Drovandi

Affiliation: School of Mathematical Sciences at the Queensland University of Technology (QUT)

When: Thursday, 15 October 2020, 1:00 pm to 2:00 pm

Where: 303-610

Likelihood-free methods are useful for parameter estimation of complex simulable models with intractable likelihood functions. Such models are prevalent in many disciplines including genetics, biology, ecology and cosmology. Likelihood-free methods avoid explicit likelihood evaluation by finding parameter values of the model that generate data close to the observed data. The general consensus has been that it is most efficient to compare datasets on the basis of a low-dimensional informative summary statistic, accepting some loss of information in exchange for reduced dimensionality. More recently, researchers have explored various approaches for efficiently comparing empirical distributions in the likelihood-free context in an effort to avoid data summarisation. Here I will present some preliminary results comparing likelihood-free methods with and without data summarisation. This is joint work with David Frazier.
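As a rough illustration of the idea described above, the sketch below runs a simple rejection-style likelihood-free sampler twice: once comparing datasets through a summary statistic (the sample mean) and once comparing the full empirical distributions (the one-dimensional Wasserstein distance between equal-sized samples). The toy Gaussian model, the uniform prior, the "keep the closest draws" rule and all function names are assumptions made for illustration; they are not the methods compared in the talk.

```python
import random
import statistics


def simulate(theta, n, rng):
    """Toy simulator (assumed): n draws from Normal(theta, 1)."""
    return [rng.gauss(theta, 1.0) for _ in range(n)]


def summary_distance(x, y):
    """Compare datasets through a summary statistic: the sample mean."""
    return abs(statistics.fmean(x) - statistics.fmean(y))


def full_data_distance(x, y):
    """Compare empirical distributions without summarising.

    For two equal-sized 1-D samples, the Wasserstein-1 distance is the
    mean absolute difference between the sorted values.
    """
    return statistics.fmean([abs(a - b) for a, b in zip(sorted(x), sorted(y))])


def abc_rejection(observed, distance, n_draws=5000, keep=100, seed=0):
    """Keep the `keep` prior draws whose simulated data are closest to `observed`."""
    rng = random.Random(seed)
    n = len(observed)
    scored = []
    for _ in range(n_draws):
        theta = rng.uniform(-5.0, 5.0)  # assumed prior: Uniform(-5, 5)
        scored.append((distance(observed, simulate(theta, n, rng)), theta))
    scored.sort()
    return [theta for _, theta in scored[:keep]]


# "Observed" data generated with a true mean of 2.0 for the demonstration.
observed = simulate(2.0, 50, random.Random(1))
for name, dist in [("summary", summary_distance), ("full data", full_data_distance)]:
    accepted = abc_rejection(observed, dist)
    print(name, round(statistics.fmean(accepted), 2))
```

In this toy model the sample mean is a sufficient statistic, so little is lost by summarising; the trade-off discussed in the abstract matters for models where no low-dimensional sufficient summary is available.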

Dr Chris Drovandi is an Associate Professor in the School of Mathematical Sciences at the Queensland University of Technology (QUT), Australia. His research interests are in Bayesian algorithms for complex models, optimal Bayesian experimental design methods and the translation of Bayesian methods across many disciplines.

Subsampling MCMC: Bayesian inference for large data problems

Speaker: Matias Quiroz

Affiliation: School of Mathematical and Physical Sciences at the University of Technology Sydney

When: Thursday, 8 October 2020, 1:00 pm to 2:00 pm

Where: 303-610

Abstract: This talk reviews our work on subsampling MCMC, a posterior sampling algorithm that speeds up inference for large datasets by combining (i) data subsampling and (ii) the pseudo-marginal Metropolis-Hastings framework. I will outline the general methodology and discuss some recent developments that have enabled the method to (i) produce "exact" inference, (ii) be applied to more complex datasets, and (iii) use more efficient likelihood estimators.

Dr Quiroz is a lecturer in the School of Mathematical and Physical Sciences at the University of Technology Sydney. His research interests lie in the area of Bayesian statistics, particularly Bayesian computation, such as Monte Carlo methods and variational Bayes.

Bayesian functional regression for prediction and variable selection

Speaker: Daniel Kowal

Affiliation: Rice University

When: Thursday, 1 October 2020, 1:00 pm to 2:00 pm

Where: Zoom

As high-resolution monitoring and measurement systems generate vast quantities of complex and highly correlated data, functional data analysis has become increasingly vital for many scientific, medical, business, and industrial applications. Functional data are typically high dimensional, highly correlated, and may be measured concurrently with other variables of interest. In the presence of such complexity, Bayesian models are appealing: they can accommodate multiple sources of dependence concurrently, such as multivariate observations, covariates, and time-ordering, and provide full uncertainty quantification via the posterior distribution. However, key challenges remain: constructing scalable algorithms, providing sufficient modeling flexibility alongside much-needed regularization, and producing accurate predictions of functional data. In this talk, I will present new Bayesian models and algorithms for high-dimensional and dynamic functional regression. These methods are motivated by two applications: (1) selecting which—if any—items from a sleep questionnaire are predictive of intraday physical activity and (2) using dynamic macroeconomic variables to forecast interest rate curves. Model implementations are available in an R package at

Dr. Daniel Kowal is a Dobelman Family Assistant Professor at Rice University, US. His research areas include statistical methodology and algorithms for massive data sets with complex dependence structures, such as functional, time series, and spatial data.

Use of model reparametrization to improve variational Bayes

Speaker: Linda S. L. Tan

Affiliation: National University of Singapore

When: Thursday, 24 September 2020, 1:00 pm to 2:00 pm

Where: Zoom

We propose using model reparametrization to improve variational Bayes inference for hierarchical models whose variables can be classified as global (shared across observations) or local (observation specific). Posterior dependence between local and global variables is minimized by applying an invertible affine transformation on the local variables. The functional form of this transformation is deduced by approximating the posterior distribution of each local variable conditional on the global variables by a Gaussian density via a second order Taylor expansion. Variational Bayes inference for the reparametrized model is then obtained using stochastic approximation. Our approach can be readily extended to large datasets via a divide and recombine strategy. Using generalized linear mixed models, we demonstrate that reparametrized variational Bayes (RVB) provides improvements in both accuracy and convergence rate compared to state of the art Gaussian variational approximation methods.

On the Use of Dictionary Learning in Statistical Inference

Speaker: Xiaomeng Zheng

Affiliation: Department of Statistics, Auckland University

When: Wednesday, 16 September 2020, 11:00 am to 12:00 pm

Where: Zoom

Dictionary learning (DL) is an active research area in statistics and computer science. The problem addressed by DL is the representation of the data as a sparse linear combination of the columns of a matrix called the dictionary. Both the dictionary and the sparse representations are learned from the data.

In this talk, we show how DL can be employed in the imputation of univariate and multivariate time series. For univariate time series, we also introduce an iterative method that is better suited for long sequences of missing data. In the multivariate case, the main contribution consists in the use of a structured dictionary. In all DL imputation methods that we propose, the size of the dictionary and the sparsity level of the representation are selected by using information theoretic criteria. We also evaluate the effect of removing the trend/seasonality before applying DL. We present the results of an extensive experimental study on real-life data. The positions of the missing data are simulated by applying two strategies: (i) sampling without replacement, which leads to isolated occurrences of the missing data, and (ii) sampling via a Polya urn model, which is likely to produce long sequences of missing data. In all scenarios, the novel DL-based methods compare favourably with the state-of-the-art.

All these results have been obtained during the first year of the PhD studies done under the supervision of Dr. Ciprian Doru Giurcaneanu and Dr. Jiamou Liu. In the second part of the talk, we will present the plan of the research for the next two years, which is mainly focused on designing novel DL algorithms for the case when the representation errors have a non-Gaussian distribution.

Spectral and heritability analysis of EEG time series data using a nested Dirichlet process

Speaker: Mark Fiecas

Affiliation: Division of Biostatistics, University of Minnesota

When: Thursday, 13 August 2020, 1:00 pm to 2:00 pm

Where: Via Zoom

Abstract: In this talk, we will analyze the spectral features of resting-state EEG time series data collected from twins enrolled in the Minnesota Twin Family Study (MTFS). Our goal is to calculate the heritability of the spectral features of the resting EEG data. Due to the twin design of the MTFS, the time series will have similar underlying characteristics across individuals. To account for this, we develop a Bayesian nonparametric modeling approach for estimating the spectral densities of the EEG data. In our methodology, we use Bernstein polynomials and a Dirichlet process (DP) to estimate each subject-specific spectral density. In order to estimate the spectral densities for the entire sample, we nest this model using a nested DP. Thus, the top-level DP clusters individuals with similar spectral densities and the bottom-level dependent DP fits a functional curve to the individuals within that cluster. We then extract relevant spectral features from the estimates of the spectral densities and estimate their heritability. This is joint work with Dr. Brian Hart (UnitedHealthGroup), Dr. Michele Guindani (UC Irvine), and Dr. Stephen Malone (Univ. of Minnesota).

Mark is an assistant professor in the Division of Biostatistics, University of Minnesota. The focus of his research is to understand the structure and function of the human brain through the use of imaging technology. His research focuses on functional connectivity, imaging genetics and time series analysis.

Understanding surgical outcomes in New Zealand using large national data sets

Speaker: Luke Boyle

Affiliation: University of Auckland

When: Wednesday, 29 July 2020, 11:00 am to 12:00 pm

Where: 303-310

Weighing the benefits of surgery is complicated and involves many assumptions about future benefits and perceived risks. Shared decision making, where a patient and a clinician share responsibility for a clinical decision, is widely regarded as the best approach to weigh the risks and benefits of an operation. A clinician can provide clinical context and risk assessment, while a patient will explain their values and together, they reach a conclusion about how to proceed. A clinician’s assessment of risk largely comes from experience; however, it has been shown that models for mortality can better estimate risk than clinician assessments (Moonesinghe et al., 2013). Ultimately, the decision to undergo surgery needs to be the patient’s choice. Providing accurate and relevant information allows patients to make the best choice for themselves. We believe New Zealand is an ideal place to experiment with new ideas about risk communication and to develop new tools for assessing risk as we have access to high-quality, longitudinal, national datasets. This PhD project will investigate different approaches to providing information about risk to patients as well as considering how to fairly compare risk between groups.

