Department of Statistics


An Overview: Data Analysis for Space-based Gravitational Wave Observations

Speaker: Ollie Burke


When: Thursday, 29 February 2024, 3:00 pm to 4:00 pm

Where: 303-310


Current observations of gravitational waves (GWs) through ground-based detectors are having a pronounced effect on our understanding of the universe. Due to the presence of the earth, ground-based detectors are limited in sensitivity to lower frequency GWs, losing access to the rich science that can be reaped from higher mass black hole coalescences. The proposed space-based detector, the Laser Interferometer Space Antenna (LISA), eliminates sources of noise from the earth and will provide access to observations of GWs in the rich mHz frequency band, and thus to higher mass binaries. The aim of this talk is to be pedagogical in nature: reviewing GWs up to the first detection GW150914, and providing an overview of LISA-specific sources with a simple example of Bayesian inference applied to a toy GW model. We will finish on the prospects for the LISA instrument by discussing both current work and future challenges in the context of data analysis.

Healthcare and Public Health Monitoring and Management

Speaker: Kwok-Leung Tsui

Affiliation: Virginia Polytechnic Institute and State University

When: Monday, 4 March 2024, 11:00 am to 12:00 pm

Where: 303-310

Abstract :

Due to advances in computational power, sensor technologies, and data collection tools, the field of healthcare and public health monitoring and management has evolved over the past several decades under different names in different application domains, such as statistical process control (SPC), process monitoring, health surveillance, prognostics and health management (PHM), personalized medicine, etc. There are tremendous opportunities in interdisciplinary research on system monitoring through the integration of SPC, system informatics, data analytics, PHM, and personalized health management. In this talk we will present our views and experience in the evolution of systems monitoring and health management, its challenges and opportunities, as well as its applications in both healthcare surveillance and public health management.

About the speaker :

Kwok L Tsui is a professor in the Grado Department of Industrial and Systems Engineering at Virginia Polytechnic Institute and State University. Tsui’s current research interests include data science and data analytics, surveillance in healthcare and public health, personalized health monitoring, prognostics and systems health management, calibration and validation of computer models, process control and monitoring, and robust design and Taguchi methods.

Two Applications of Regression Averaging

Speaker: Norman Matloff

Affiliation: University of California

When: Thursday, 7 March 2024, 3:00 pm to 4:00 pm

Where: 303-310

Optimising Healthcare Pathways for Elderly Patients: Wellbeing Equity and Efficiency

Speaker: Yvonne Li

Affiliation: UoA

When: Monday, 19 February 2024, 2:00 pm to 3:00 pm

Where: 303-G14


This work explores enhancing healthcare for elderly patients in Aotearoa New Zealand through queueing theory and simulations, responding to the demographic shift towards an aging population. It addresses the need for more effective and equitable healthcare, considering workforce shortages and access disparities. By developing mathematical models for patient flow, waiting time, and resource allocation, this research underscores the necessity of models that adjust priorities and routing to alleviate service congestion, aiming to improve resource use, access equality, and elderly patient wellbeing.

This is Yvonne's PYR seminar.

Close-kin mark-recapture methods to estimate demographic parameters of mosquitoes

Speaker: John Marshall

Affiliation: University of California, Berkeley

When: Wednesday, 31 January 2024, 3:00 pm to 4:00 pm

Where: 303-310

Abstract :

Close-kin mark-recapture (CKMR) methods have recently been used to infer demographic parameters such as census population size and survival for fish of interest to fisheries and conservation. These methods have advantages over traditional mark-recapture methods as the mark is genetic, removing the need for physical marking and recapturing that may interfere with parameter estimation. For mosquitoes, the spatial distribution of close-kin pairs has been used to estimate mean dispersal distance, of relevance to vector-borne disease transmission and novel biocontrol strategies. Here, we extend CKMR methods to the life history of mosquitoes and comparable insects. We derive kinship probabilities for mother-offspring, father-offspring, full-sibling and half-sibling pairs, where an individual in each pair may be a larva, pupa or adult. A pseudo-likelihood approach is used to combine the marginal probabilities of all kinship pairs. To test the effectiveness of this approach at estimating mosquito demographic parameters, we develop an individual-based model of mosquito life history incorporating egg, larva, pupa and adult life stages. The simulation labels each individual with a unique identification number, enabling close-kin relationships to be inferred for sampled individuals. Using the dengue vector Aedes aegypti as a case study, we find the CKMR approach provides unbiased estimates of adult census population size, adult and larval mortality rates, and larval life stage duration for logistically feasible sampling schemes. Considering a simulated population of 3,000 adult mosquitoes, estimation of adult parameters is accurate when ca. 40 adult females are sampled biweekly over a three month period. Estimation of larval parameters is accurate when adult sampling is supplemented with ca. 120 larvae sampled biweekly over the same period. The methods are also effective at detecting intervention-induced increases in adult mortality and decreases in population size. 
As the cost of genome sequencing declines, CKMR holds great promise for characterizing the demography of mosquitoes and comparable insects of epidemiological and agricultural significance.

About the speaker :

John Marshall is a Professor in Residence of Biostatistics and Epidemiology whose research broadly supports efforts to control and eliminate mosquito-borne diseases such as malaria, dengue, and Zika virus.

Self-reinforced Knothe--Rosenblatt rearrangements for high-dimensional stochastic computation

Speaker: Tiangang Cui

Affiliation: University of Sydney

When: Wednesday, 31 January 2024, 11:00 am to 12:00 pm

Where: 303-310

Abstract :

Characterizing intractable high-dimensional random variables is a fundamental task in stochastic computation. It has broad applications in statistical physics, machine learning, uncertainty quantification and beyond. The recent surge of transport maps offers new insights into this task by constructing variable transformations that couple intractable random variables with tractable reference random variables. In this talk, we will present numerical methods that build the Knothe--Rosenblatt (KR) rearrangement of a family of transport maps in a triangular form in high dimensions. We first design function approximation tools to realize the KR rearrangement that ensures the order-preserving property with controlled statistical errors. We then introduce a self-reinforced procedure to adaptively precondition the construction of KR rearrangements to significantly expand their capability of handling random variables with complicated nonlinear interactions and concentrated density functions. We demonstrate the efficiency of the resulting self-reinforced KR rearrangements on applications in statistical learning and uncertainty quantification, including parameter estimation for dynamical systems, PDE-constrained inverse problems, and rare event estimation.

About the speaker :

Tiangang is a Senior Lecturer in the School of Mathematics and Statistics, University of Sydney. His research interests are broadly in computational mathematics for scientific machine learning and data science. He develops mathematically rigorous computational methods for statistical inverse problems, data assimilation and uncertainty quantification. These methods aim to optimally learn hidden structures and driving factors of complex mathematical models from data for issuing certified model predictions and making risk-averse decisions.

Modelling Self-Excitement in Ecology

Speaker: Alec van Helsdingen

Affiliation: UoA

When: Friday, 24 November 2023, 11:00 am to 12:00 pm

Where: 303-153

Self-excitement is a phenomenon where one event induces others to occur later (e.g. earthquake aftershocks), and may be modelled with the Hawkes process. This seminar will focus on two applications in ecological statistics.

The first application focuses on spatial capture-recapture (SCR), which is a method to estimate animal populations from sighting data (e.g. motion-triggered cameras). Standard SCR methods assume animal movements are temporally independent, but this is unrealistic as an animal is most likely to be close to where it was last seen. Our proposed solution is based on the Hawkes process and makes the rate of detection of a given animal a function not only of the location, but also of where and when the animal was last seen. This leads to a detection of an animal “self-exciting” detections at that or nearby cameras. We will show that our model gives more accurate results than traditional SCR in situations where SCR population estimates may be negatively biased.

The second application models cues emitted by sperm whales. The times of these cues form a temporal point pattern that is clearly self-exciting, but the Hawkes process is too restrictive because the clicks do not adhere to the assumptions of the Poisson process. Specifically, they are more evenly spaced (under-dispersed) than expected. Motivated by this example, we have developed a framework for incorporating under-dispersion and over-dispersion into the Hawkes process. We use a Weibull rather than an exponential distribution to model the time between events, giving more flexibility and a better fit. We use our new formulation to model the cues of an individual whale, confirm our intuition that the cues are under-dispersed, and quantify the relationship between the cue rate and covariates.
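For readers unfamiliar with the Hawkes process mentioned above, the following is a minimal sketch of simulating one by Ogata's thinning algorithm. It assumes the standard exponential-kernel form of the intensity, lambda(t) = mu + sum of alpha * beta * exp(-beta * (t - t_i)) over past events t_i; the function name and parameter values are illustrative only and are not taken from the speaker's models.

```python
import math
import random


def simulate_hawkes(mu, alpha, beta, horizon, seed=0):
    """Simulate event times on (0, horizon] with Ogata's thinning algorithm.

    mu    : baseline rate of events
    alpha : expected number of events directly triggered by one event (< 1)
    beta  : decay rate of the excitement after an event
    """
    rng = random.Random(seed)
    events = []
    t = 0.0
    excite = 0.0  # self-exciting part of the intensity at the current time t
    while True:
        # Between events the intensity only decays, so its current value
        # is a valid upper bound for proposing the next candidate time.
        bound = mu + excite
        wait = rng.expovariate(bound)
        t += wait
        if t > horizon:
            return events
        excite *= math.exp(-beta * wait)  # excitement decayed to time t
        # Accept the candidate with probability intensity(t) / bound.
        if rng.random() * bound <= mu + excite:
            events.append(t)
            excite += alpha * beta  # each accepted event raises the intensity


cue_times = simulate_hawkes(mu=0.5, alpha=0.6, beta=2.0, horizon=100.0, seed=1)
```

Setting `alpha=0` recovers an ordinary Poisson process with rate `mu`, which makes the effect of self-excitement easy to see by comparison.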

This is the PYR seminar.

An automatic method for the identification of cycles in Covid-19 time series data

Speaker: Miaotian (Vivian) Li

Affiliation: UoA

When: Thursday, 23 November 2023, 10:00 am to 11:00 am

Where: 303-148

Abstract :

All previous methods for the identification of cycles in Covid-19 daily and weekly data involve a subjective interpretation of the results. This poses difficulties for researchers interested in conducting a comprehensive study which analyzes the presence of the cycles for each country/territory/area (CTA). During the first year of PhD studies, we have designed an algorithm that automatically detects the fundamental period T0 and the harmonics T0/2,...,T0/5, where T0=7 days for daily data and T0=52 weeks for weekly data. We have tested the new algorithm by applying it to the time series from 236 CTAs for which the World Health Organization (WHO) collected Covid-19 data. The detection results we have obtained confirm the findings previously reported by other researchers.

In this talk, we will present all the details of our algorithm and comment on the results obtained in experiments with Covid-19 time series data. We will also discuss a proposal for evaluating the dissimilarity between the time series collected for two different CTAs.
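As a rough illustration of what detecting a weekly cycle involves, the sketch below computes a plain periodogram of a synthetic daily series and reports the period with the largest power. This is a textbook approach, not the speaker's algorithm; the synthetic data, the function names and the choice of 140 days are all assumptions made for the example.

```python
import cmath
import math
import random


def periodogram(series):
    """Return {k: power} at the Fourier frequencies k/n for k = 1..n//2."""
    n = len(series)
    mean = sum(series) / n
    centred = [x - mean for x in series]
    power = {}
    for k in range(1, n // 2 + 1):
        coef = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(centred)
        )
        power[k] = abs(coef) ** 2 / n
    return power


def dominant_period(series):
    """Period (in sampling units) of the strongest periodogram peak."""
    power = periodogram(series)
    k_best = max(power, key=power.get)
    return len(series) / k_best


# Synthetic daily counts: a 7-day cycle plus Gaussian noise (made-up data).
rng = random.Random(42)
n_days = 140
daily = [
    10 + 3 * math.sin(2 * math.pi * t / 7) + rng.gauss(0, 1)
    for t in range(n_days)
]
weekly_cycle = dominant_period(daily)
```

Harmonics such as T0/2 would show up as further peaks at integer multiples of the fundamental frequency; deciding which peaks are real without human judgement is the part that the talk addresses.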

This is the PYR seminar.

Using Convolutional Autoencoders for Signal Detection of Extreme Mass Ratio Inspirals Detected by the LISA Mission

Speaker: Amin Boumerdassi

Affiliation: UoA

When: Tuesday, 21 November 2023, 2:00 pm to 3:00 pm

Where: 303-148

Extreme Mass Ratio Inspirals (EMRIs) are gravitational wave (GW) events produced by the mergers of pairs of massive objects such as black holes and neutron stars whose mass ratio is >10,000. Generally, GWs are caused by the acceleration of masses, which makes the distance between two points oscillate in time. These oscillations can be detected by measuring the interference of light beams which propagate at a fixed speed but traverse varying distances when a GW passes through. Traditionally, the detection of GW events was performed through matched filtering, in which a detected signal would be compared to millions of variations of a template model to find the closest fit to the detected signal. In the case of EMRIs, this is computationally infeasible owing to the large parameter space of EMRI waveform models, years-long waveform duration, and large file size. My work attempts to overcome these problems by leveraging recent developments in the rapid generation of EMRI waveforms, paired with machine learning (ML) techniques which can perform a given task quickly and with little computational requirement. The ML model of choice is the convolutional autoencoder, which learns to map input data to a low-dimensional representation, and back into a reconstruction of the original input. A trained autoencoder is expected to poorly reconstruct data that is not of the same kind as its training data. Hence, the problem of detecting EMRI signals can be framed as an anomaly detection problem in which non-EMRI signals are treated as anomalies to be poorly reconstructed by an autoencoder trained to reproduce EMRIs. This could be used to analyse EMRIs detected by the space-based GW detector LISA, due to be launched in the early 2030s. Successful detections of EMRIs will open the door for novel tests of General Relativity, the theory of gravity which itself led to the prediction of GWs.

This is the PYR seminar.

Estimating Customer Impatience in a Service System With Unobserved Balking

Speaker: Prof. Michel Mandjes

Affiliation: Universiteit van Amsterdam

When: Friday, 3 November 2023, 11:00 am to 12:00 pm

Where: 303-G14


In this talk I'll discuss a service system in which arriving customers are provided with information about the delay they will experience. Based on this information, they decide to wait for service or leave the system. Specifically, every customer has a patience threshold, and they balk if the observed delay is above the threshold. The main objective is to estimate the parameters of the customers' patience-level distribution and the corresponding potential arrival rate, using knowledge of the actual queue-length process only. The main complication and distinguishing feature of our setup lies in the fact that customers who decide not to join are not observed, and remarkably, we manage to devise a procedure to estimate the underlying patience and arrival rate parameters. The underlying model is a multiserver queue with a Poisson stream of customers, enabling evaluation of the corresponding likelihood function of the state-dependent effective arrival process.

We establish strong consistency of the MLE and derive the asymptotic distribution of the estimation error. Several applications and extensions of the method are discussed. The performance is further assessed through a series of numerical experiments. By fitting parameters of hyperexponential and generalized hyperexponential distributions, our method provides a robust estimation framework for any continuous patience-level distribution.

The last part of the talk will discuss the setting in which the arrival process is not constant but follows a periodic pattern (say, following a daily pattern) -- in this setup various technical hurdles have to be overcome, primarily related to establishing an appropriate regeneration structure.
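The balking mechanism described in this abstract can be sketched as a small simulation. The code below is an illustration only, assuming a first-come-first-served multiserver queue with Poisson arrivals, exponential service times and exponentially distributed patience thresholds; the function name and all parameter values are hypothetical, and the estimation procedure from the talk is not reproduced.

```python
import heapq
import random


def simulate_balking_queue(arrival_rate, service_rate, servers,
                           mean_patience, n_arrivals, seed=0):
    """Return (joined, balked) counts for a queue with announced delays."""
    rng = random.Random(seed)
    free_at = [0.0] * servers  # min-heap of times at which servers free up
    t = 0.0
    joined = 0
    balked = 0
    for _ in range(n_arrivals):
        t += rng.expovariate(arrival_rate)  # next potential arrival
        # Delay announced to the customer: time until the first server is free.
        delay = max(0.0, free_at[0] - t)
        patience = rng.expovariate(1.0 / mean_patience)
        if delay > patience:
            balked += 1  # customer leaves; the data would not record this
            continue
        start = t + delay
        finish = start + rng.expovariate(service_rate)
        heapq.heapreplace(free_at, finish)  # earliest-free server takes the job
        joined += 1
    return joined, balked


joined, balked = simulate_balking_queue(
    arrival_rate=5.0, service_rate=1.0, servers=3,
    mean_patience=0.5, n_arrivals=2000, seed=7,
)
```

In the setting of the talk only the queue-length process generated by the `joined` customers would be available; the `balked` count is exactly what an analyst does not get to see.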

About the speaker : Prof. Mandjes is a professor in probability and operations research at the University of Leiden. His research interests include stochastic networks, queueing theory, stochastic processes, operations research, large deviations, simulation, and performance.

Modelling Warranty Claims using Geometric-like Processes

Speaker: Sarah Marshall

Affiliation: UoA

When: Wednesday, 1 November 2023, 2:00 pm to 3:00 pm

Where: 303-310


The geometric process can be used to model the occurrence of events with an underlying monotonic trend. This type of trend can be observed in many practical problems in reliability, in particular in the recurrent failures of ageing repairable systems. When both the operational and repair times are of interest and are impacted by ageing, the alternating geometric process can be used. Two approaches for computing the mean (i.e. the expected number of events by a given time) and the variance of the alternating geometric process are presented and applied to warranty cost analysis. Various extensions of the geometric process have been proposed to provide greater flexibility in situations involving trends. This talk provides an overview of the related geometric-like processes, focusing on the alpha series process. The alternating geometric process and the alternating alpha series process are applied to warranty data from an automotive manufacturer and are shown to be superior to an alternating renewal process.
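To make the monotonic trend concrete, here is a minimal simulation sketch. It assumes the usual definition of a geometric process with ratio a > 0, in which the scaled inter-event times a^(n-1) X_n are independent and identically distributed; the exponential distribution, the function name and the parameter values are assumptions made for the example rather than details from the talk.

```python
import random
from itertools import accumulate


def geometric_process_gaps(ratio, mean_first, n_events, seed=0):
    """Inter-event times X_1..X_n with X_n = Y_n / ratio**(n - 1).

    Y_1, Y_2, ... are i.i.d. exponential with mean `mean_first`.
    ratio > 1 gives shrinking gaps (e.g. operating times of an ageing system),
    ratio < 1 gives growing gaps (e.g. repair times), ratio == 1 is a
    renewal process.
    """
    rng = random.Random(seed)
    gaps = []
    for n in range(1, n_events + 1):
        y = rng.expovariate(1.0 / mean_first)
        gaps.append(y / ratio ** (n - 1))
    return gaps


operating_gaps = geometric_process_gaps(ratio=1.1, mean_first=100.0,
                                        n_events=20, seed=3)
failure_times = list(accumulate(operating_gaps))  # cumulative event times
```

The expected n-th gap under these assumptions is `mean_first / ratio**(n - 1)`, so with a ratio of 1.1 the mean time between failures falls by about 9% after each repair.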

About the speaker : Dr Sarah Marshall is a Senior Lecturer in the Department of Information Systems and Operations Management at the University of Auckland Business School, New Zealand. Her research focuses on the use of operations research to address problems of interest to business and industry. She has expertise in deterministic and stochastic modelling, simulation, and analytics, and has applied these across a variety of domains, such as remanufacturing, healthcare, rainfall modelling and water resource management. Currently, her work focusses on the use of geometric-like processes to model ageing repairable systems, with applications in reliability and warranty analysis.

An Automatic Finite-Sample Robustness Check: Can Dropping a Little Data Change Conclusions?

Speaker: Tamara Broderick

Affiliation: Massachusetts Institute of Technology

When: Wednesday, 1 November 2023, 11:00 am to 12:00 pm

Where: 303-310

Abstract : Practitioners will often analyze a data sample with the goal of applying any conclusions to a new population. For instance, if economists conclude microcredit is effective at alleviating poverty based on observed data, policymakers might decide to distribute microcredit in other locations or future years. Typically, the original data is not a perfect random sample from the population where policy is applied -- but researchers might feel comfortable generalizing anyway so long as deviations from random sampling are small, and the corresponding impact on conclusions is small as well. Conversely, researchers might worry if a very small proportion of the data sample was instrumental to the original conclusion. So we propose a method to assess the sensitivity of statistical conclusions to the removal of a very small fraction of the data set. Manually checking all small data subsets is computationally infeasible, so we propose an approximation based on the classical influence function. Our method is automatically computable for common estimators. We provide finite-sample error bounds on approximation performance and a low-cost exact lower bound on sensitivity. We find that sensitivity is driven by a signal-to-noise ratio in the inference problem, does not disappear asymptotically, and is not decided by misspecification. Empirically we find that many data analyses are robust, but the conclusions of several influential economics papers can be changed by removing (much) less than 1% of the data.
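The idea of approximating the effect of dropping data with influence scores can be shown on the simplest possible estimator, the sample mean. The sketch below is an illustration under that assumption and is not the authors' implementation; the function name is hypothetical. Each point's first-order effect on the mean is -(x_i - mean)/n, so the k most negative effects identify the subset whose removal is predicted to lower the mean the most, and the prediction can be checked against an exact recomputation.

```python
def most_influential_drop(xs, k):
    """Approximate and exact change in the mean from dropping k points.

    Returns (approx_change, exact_change, dropped_indices) for the subset
    predicted to decrease the sample mean the most.
    """
    n = len(xs)
    mean = sum(xs) / n
    # First-order (influence-function) effect of removing each point.
    effects = sorted((-(x - mean) / n, i) for i, x in enumerate(xs))
    chosen = effects[:k]  # most negative effects first
    approx_change = sum(effect for effect, _ in chosen)
    dropped = {i for _, i in chosen}
    kept = [x for i, x in enumerate(xs) if i not in dropped]
    exact_change = sum(kept) / len(kept) - mean  # refit without the subset
    return approx_change, exact_change, sorted(dropped)


data = [float(v) for v in range(1, 101)]
approx, exact, dropped = most_influential_drop(data, k=1)
```

For the mean the two quantities differ only by the factor n/(n - k), which is why the approximation is good when the dropped fraction is small; for other estimators the exact refit is what becomes expensive across all subsets.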

About the speaker : Tamara is an A/Professor in the Electrical Engineering and Computer Science Department, MIT, and has been awarded several honors, including the Evelyn Fix Memorial, the Savage Award, and a National Science Foundation Career Award. She works in the areas of machine learning and statistics, particularly in Bayesian statistics and graphical models with an emphasis on scalable, nonparametric, and unsupervised learning.

Cutting Feedback with copula models

Speaker: David Nott

Affiliation: National University of Singapore

When: Wednesday, 25 October 2023, 3:00 pm to 4:00 pm

Where: 303-310

Abstract : Complex models are often specified through a collection of coupled submodels or "modules". Bayesian inference for such models is attractive in principle, but the presence of a misspecified module can adversely affect inferences about parameters beyond those appearing in the problematic module. "Cutting feedback" is a modified Bayesian inference method which attempts to mitigate the impact of such suspect modules. In this talk we consider cutting feedback for copula models, which are multivariate models defined by separately specifying marginal distributions and the dependence structure through a copula function. We treat the marginals and the copula function as two distinct modules for modular Bayesian inference, and consider two types of cut posterior distributions. The first limits the influence of a misspecified copula on inference for the marginals, which is a Bayesian analogue of the popular Inference for Margins (IFM) estimator. The second limits the influence of misspecified marginals on inference for the copula parameters by using a rank likelihood to define the cut model. Properties of these cut posterior distributions and their computation will be discussed, and the efficacy of the new methodology demonstrated for a substantive multivariate time series application from macroeconomic forecasting. In the latter, cutting feedback from misspecified marginals improves posterior inference and predictive accuracy greatly, compared to conventional Bayesian inference. This is joint work with Weichang Yu, Michael Smith and David Frazier.

About the speaker : David is an A/Professor in the Department of Statistics and Applied Probability, the National University of Singapore. His research areas are Bayesian model selection, model misspecification, Bayesian nonparametrics, and approximations.

