Professor Thomas Lumley

Chair in Biostatistics


Thomas Lumley attended Monash University (B.Sc.(Hons) in Pure Mathematics), the University of Oxford (M.Sc. in Applied Statistics) and the University of Washington, Seattle (PhD in Biostatistics). He spent twelve years on the faculty of the Department of Biostatistics at the University of Washington, and then moved to Auckland in 2010. He is still an Affiliate Professor at the University of Washington.

See Thomas Lumley's CV

Research | Current

Thomas' research interests include

  • Semiparametric models
  • Survey sampling
  • Statistical computing
  • Foundations of statistics
  • and whatever methodological problems his medical collaborators come up with -- currently, the design and analysis of a DNA resequencing study.


  • The survey package for R is a fairly comprehensive system for analysis of data from complex probability samples.
  • He has written a book on survey analysis, published by Wiley.

Postgraduate supervision

Potential student projects

  • Likelihood of the empirical distribution function as an approach to Bayesian analysis of survey data.
  • Two-phase subsampling designs for DNA sequencing: it is now feasible to do DNA sequencing of individual genes in thousands of people or whole exomes in hundreds of people. We need efficient ways to decide which people to sample from much larger existing studies.
  • Behavior of tests in the extreme tails of distributions, for genomics. Genome-wide association studies perform millions of separate association tests and so have a serious multiple-testing problem. We need to understand which tests have problems when run at thresholds like 0.00000005, and what fixes can be made.
  • Survey software: design and implementation of various things for the survey package in R. Graphics, probability distributions, regression models, multivariate methods...
  • Software for teaching: design and implementation of an interface to R for teaching introductory biostatistics.
  • How many categories? Data on, say, hospital visits can be categorized coarsely (e.g. lung problem) or much more finely (e.g. infection by penicillin-resistant Strep. pneumoniae). When the categories are too fine, the sample size in each one is too small to see patterns; when they are too coarse, the patterns are masked by events that don't really belong. The idea is to build a tree structure that uses all levels of categorization simultaneously in a Bayesian model.
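
As a rough illustration of the multiple-testing project above, the sketch below shows where a threshold like 0.00000005 can come from and how far into the tail it sits. It assumes a Bonferroni correction of a 0.05 family-wise error rate over one million tests and a two-sided test based on a standard normal statistic; both are illustrative assumptions rather than details from this page, and the function names are hypothetical.

```python
from statistics import NormalDist


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test threshold that keeps the family-wise error rate at or below alpha."""
    return alpha / n_tests


def two_sided_z(p: float) -> float:
    """Absolute standard normal cutoff |z| matching a two-sided p-value threshold."""
    # Work in the lower tail (p / 2) to avoid losing precision near 1.
    return -NormalDist().inv_cdf(p / 2.0)


# Assumed setting: 0.05 overall error rate spread over one million tests.
threshold = bonferroni_threshold(0.05, 1_000_000)
z_crit = two_sided_z(threshold)

print(f"per-test threshold: {threshold:.1e}")
print(f"two-sided z cutoff: {z_crit:.2f}")
```

A cutoff this far out is where approximations that work well near the centre of a distribution can become unreliable, which is the concern the project description raises.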

Selected publications and creative works (Research Outputs)

  • Li, A. H., Morrison, A. C., Kovar, C., Cupples, L. A., Brody, J. A., Polfus, L. M., ... Veeraraghavan, N. (2015). Analysis of loss-of-function variants and 20 risk factor phenotypes in 8,554 individuals identifies loci influencing chronic disease. Nature Genetics, 47 (6), 640-642. 10.1038/ng.3270
  • Lumley, T., & Scott, A. (2015). AIC and BIC for modeling with complex survey data. Journal of Survey Statistics and Methodology, 3 (1), 1-18. 10.1093/jssam/smu021
  • Sitlani, C. M., Rice, K. M., Lumley, T., McKnight, B., Cupples, L. A., Avery, C. L., ... Psaty, B. M. (2015). Generalized estimating equations for genome-wide association studies using longitudinal phenotype data. Statistics in Medicine, 34 (1), 118-130. 10.1002/sim.6323
  • Connolly, M. J., Boyd, M., Broad, J. B., Kerse, N., Lumley, T., Whitehead, N., & Foster, S. (2015). The Aged Residential Care Healthcare Utilization Study (ARCHUS): a multidisciplinary, cluster randomized controlled trial designed to reduce acute avoidable hospitalizations from long-term care facilities. Journal of the American Medical Directors Association, 16 (1), 49-55. 10.1016/j.jamda.2014.07.008
    Other University of Auckland co-authors: Susan Foster, Martin Connolly, Ngaire Kerse, Joanna Broad, Michal Boyd
  • Lin, H., Wang, M., Brody, J. A., Bis, J. C., Dupuis, J., Lumley, T., ... Reid, J. G. (2014). Strategies to design and analyze targeted sequencing data: Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) Consortium Targeted Sequencing Study. Circulation: Cardiovascular Genetics, 7 (3), 335-343. 10.1161/circgenetics.113.000350
  • Mercer, L., Wakefield, J., Chen, C., & Lumley, T. (2014). A comparison of spatial smoothing methods for small area estimation with sampling weights. Spatial Statistics. 10.1016/j.spasta.2013.12.001
  • Chen, H., Lumley, T., Brody, J., Heard-Costa, N. L., Fox, C. S., Cupples, L. A., & Dupuis, J. (2014). Sequence kernel association test for survival traits. Genetic Epidemiology, 38 (3), 191-197. 10.1002/gepi.21791
  • Lumley, T., & Scott, A. (2014). Tests for Regression Models Fitted to Survey Data. Australian and New Zealand Journal of Statistics, 56 (1), 1-14. 10.1111/anzs.12065

Contact details

Primary location

Level 3, Room 325
New Zealand