Department of Statistics


Limiting distribution of particles near the frontier in the catalytic branching Brownian motion

Speaker: Sergey Bocharov

Affiliation: Zhejiang University

When: Wednesday, 17 April 2019, 3:00 pm to 4:00 pm

Where: 303-610

Catalytic branching Brownian motion (catalytic BBM) is a spatial population model in which individuals, referred to as particles, move in space according to the law of standard Brownian motion and split in two at a spatially-inhomogeneous branching rate \beta \delta_0(.), where \delta_0(.) is the Dirac delta measure and \beta > 0 is some constant.

We shall discuss various asymptotic results concerning the spatial spread of particles in such a model and in particular show that the distribution of particles near the frontier converges to a Poisson point process with a random intensity.

Data Science on Music Data

Speaker: Prof. Claus Weihs

Affiliation: TU Dortmund University, Germany

When: Wednesday, 17 April 2019, 11:00 am to 12:00 pm

Where: 303-310

The talk discusses the structure of the field of data science and substantiates the key premise that statistics is one of the most important disciplines in data science and the most important discipline to analyze and quantify uncertainty. As an application, the talk demonstrates data science methods on music data for automatic transcription and automatic genre determination, both on the basis of signal-based features from audio recordings of music pieces.


Claus Weihs and Katja Ickstadt (2018): Data Science: The Impact of Statistics; International Journal of Data Science and Analytics 6, 189–194

Claus Weihs, Dietmar Jannach, Igor Vatolkin and Günter Rudolph (Eds.) (2017): Music Data Analysis: Foundations and Applications; CRC Press, Taylor & Francis, 675 pages

Bayesian modelling of forensic evidence

Speaker: Jason Wen

Affiliation: Department of Statistics, University of Auckland

When: Wednesday, 27 March 2019, 3:00 pm to 4:00 pm

Where: 303-610

The term trace evidence is used to describe (usually non-biological) evidence left at a crime scene, or perhaps recovered from a person of interest (POI). There are many questions of interest, but they usually ultimately devolve to “How much more likely does this evidence make it that the accused is guilty?” The answer to this question, and the central quantity of interest for statisticians involved in the statistical interpretation of evidence, is the likelihood ratio. This is usually given as

LR = Pr(Evidence | Hp) / Pr(Evidence | Hd)

where Hp and Hd are competing explanations for the evidence. Evaluation of the LR depends on statistical models which represent forensic expert knowledge, and incorporate experimental and observational data. My thesis is concerned with developing and refining models in a number of trace evidence disciplines.
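
As a rough illustration of the ratio above, the following sketch evaluates an LR for a single continuous measurement. It is not the speaker's model: the normal distributions, the refractive-index-like numbers, and the names `normal_pdf`, `likelihood_ratio`, `hp_model` and `hd_model` are all assumptions made for this example, and densities stand in for the probabilities because the measurement is continuous.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and sd at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def likelihood_ratio(evidence, hp_model, hd_model):
    """LR = f(evidence | Hp) / f(evidence | Hd) for one measurement.

    Each model is a hypothetical (mean, sd) pair for a normal distribution.
    """
    numerator = normal_pdf(evidence, *hp_model)
    denominator = normal_pdf(evidence, *hd_model)
    return numerator / denominator


# Hypothetical values, chosen only for illustration.
hp_model = (1.5180, 0.0001)  # Hp: fragment comes from the scene source
hd_model = (1.5175, 0.0010)  # Hd: fragment comes from the wider population
lr = likelihood_ratio(1.5181, hp_model, hd_model)
print(f"LR = {lr:.2f}")
```

An LR above 1 means the evidence is more probable under Hp than under Hd; an LR below 1 means the reverse. In this sketch the denominator uses a single normal density, which is the part an infinite mixture model would replace.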

In this presentation, I will present my initial work on a Bayesian semi-parametric model (an infinite mixture model) for the distribution of recovered glass. This work can be seen as an expansion, or refinement, of models for the denominator of the likelihood ratio.

I will also outline the directions I intend to take in my future work.

Bayesian demographic estimation and forecasting

Speaker: John Bryant

Affiliation: Research Associate, University of Waikato

When: Wednesday, 27 March 2019, 11:00 am to 12:00 pm

Where: 303-310

Demographers face some tricky statistical problems. Users of demographic statistics are demanding estimates and forecasts that are as disaggregated as possible, including not just age and sex, but also ethnicity, area, income, and much else besides. Administrative data, and big data more generally, offer new possibilities. But these new data sources tend to be noisy, incomplete, and mutually inconsistent. The talk will describe a long-term project to develop Bayesian methods, and the associated software, to address these problems. We will look at some concrete examples, including a probabilistic demographic account for New Zealand.

About the Speaker:

John Bryant has worked at universities in New Zealand and Thailand, at the New Zealand Treasury, and (until February 2019) at Stats NZ. He has a PhD in Demography from the Australian National University. He is the author, with Junni Zhang, of the book Bayesian Demographic Estimation and Forecasting, published by CRC Press in 2018.

Empowering Signal Processing Using Machine Learning: Applications in Speech Reconstruction and Forensic Investigations

Speaker: Hamid Sharifzadeh

Affiliation: Senior Lecturer, School of Computing at Unitec Institute of Technology

When: Wednesday, 20 March 2019, 11:00 am to 12:00 pm

Where: 303-310

Advances in machine learning find rapid adoption in many fields, ranging from communications, signal processing, and the automotive industry to healthcare, law, and forensics. In this talk, we focus on two research projects revolving around machine learning and signal processing: a) forensic investigation through vein pattern recognition, b) speech reconstruction for aphonic patients. Both research projects apply cutting-edge machine learning algorithms to speech and image processing, one for the rehabilitation of post-laryngectomised patients and the other for helping law enforcement agencies identify criminals/victims in child exploitation material.

Bio: Hamid Sharifzadeh is currently a Senior Lecturer in the School of Computing at Unitec Institute of Technology. Hamid completed his Ph.D. at Nanyang Technological University (NTU), Singapore in 2012. Following the completion of his studies, he undertook two postdoctoral fellowships at NTU focussing on the areas of speech and image processing before joining Unitec as a Lecturer in 2014.

Multi-kernel linear mixed model with adaptive lasso for complex phenotype prediction

Speaker: Yalu Wen

Affiliation: The University of Auckland

When: Wednesday, 6 March 2019, 11:00 am to 12:00 pm

Where: 303-310

Linear mixed models (LMMs) and their extensions have been widely used for prediction purposes with high-dimensional genomic data. However, LMMs used to date have lacked theoretical justification for selecting disease predictive regions and failed to account for complex genetic architectures. In this work we present a multi-kernel linear mixed model with adaptive lasso (KLMM-AL) to predict phenotypes using high-dimensional data. We developed an efficient algorithm for parameter estimation and also established the asymptotic properties when only one dependent observation is available. The proposed KLMM-AL can account for heterogeneous effect sizes from different genomic regions, capture both additive and non-additive genetic effects, and adaptively and efficiently select predictive genomic regions. Through simulation studies, we demonstrate that KLMM-AL is overall comparable to, if not better than, most of the existing methods, and that it achieves high sensitivity and specificity in selecting predictive genomic regions. KLMM-AL is further illustrated by an application to a real data set.

Variable Selection and Dimension Reduction methods for high dimensional and Big-Data Set.

Speaker: Benoit Liquet

Affiliation: University of Pau & Pays de L'Adour, ACEMS (QUT)

When: Wednesday, 27 February 2019, 11:00 am to 12:00 pm

Where: 303-310

It is well established that incorporating prior knowledge of the structure existing in the data, for potential grouping of the covariates, is key to more accurate prediction and improved interpretability. In this talk, I will present new multivariate methods incorporating grouping structure in Bayesian and frequentist methodology for variable selection and dimension reduction to tackle the analysis of high dimensional and Big-Data sets.

Data depth and generalized sign tests

Speaker: Christine Mueller

Affiliation: Department of Statistics, University of Dortmund, Germany

When: Wednesday, 20 February 2019, 11:00 am to 12:00 pm

Where: 303-310

Data depth is one of the approaches to generalizing the outlier-robust median to more complex situations. In this talk, I show how this can be done for multivariate data and regression. The concepts of half space depth and simplicial depth are crucial for the generalization to multivariate data, while the concept of nonfit was used for defining regression depth and simplicial regression depth. Simplicial regression depth often leads to a so-called sign depth and corresponding generalized sign tests. These generalized sign tests can be used as soon as residuals are available and are much more powerful than the classical sign test. I demonstrate this for

Critical issues in recent guidelines

Speaker: Prof Markus Neuhaeuser

Affiliation: Dept. of Mathematics and Technology, Koblenz University of Applied Sciences, Remagen, Germany

When: Tuesday, 12 February 2019, 11:00 am to 12:00 pm

Where: 303-B09

To increase rigor and reproducibility, some medical journals provide detailed guidelines for experimental design and statistical analysis. Although this development is positive, quite a few recommendations are problematic because they reduce power or are indefensible from a statistical point of view. This is shown using two current examples, namely the checklist published in 2017 by the journal Circulation Research [Circulation Research. 2017; 121:472-9] and the guideline published in 2018 by the British Journal of Pharmacology [British Journal of Pharmacology. 2018; 175:987-93]. Topics discussed are the analysis of variance in case of heteroscedasticity, the question of balanced sample sizes, the power calculation including so-called post-hoc power analyses, minimum group sizes, and the t test for small samples.

Modelling block arrivals in the Bitcoin blockchain

Speaker: Peter Taylor

Affiliation: University of Melbourne

When: Wednesday, 30 January 2019, 3:00 pm to 4:00 pm

Where: 303-610

In 2009 the pseudonymous Satoshi Nakamoto published a short paper on the Internet, together with accompanying software, that proposed an 'electronic equivalent of cash' called Bitcoin. At its most basic level, Bitcoin is a payment system where transactions are verified and stored in a distributed data structure called the blockchain. The Bitcoin system allows electronic transfer of funds without the presence of a trusted third party. It achieves this by making it 'very hard work' to create the payment record, so that it is not computationally feasible for a malicious player to subsequently repudiate a transaction and create the forward history without it.

The Nakamoto paper contained a simple stochastic model, used to show that the above-mentioned malicious player would be very unlikely to succeed. Unfortunately, this calculation contained an error, which I shall discuss and show how to correct.

The Bitcoin payment record is stored in a data structure called the blockchain. Blocks are added to this structure by 'miners' working across a distributed peer-to-peer network to solve a computationally difficult problem. With reference to historical data, I shall describe the block mining process, and present a second stochastic model that gives insight into the block arrival process.

Finally, I shall make some brief comments about how stochastic modelling can be used to address the current concerns that the transaction processing rate of the Bitcoin system is not high enough.

