Dr Claudia Liliana Rivera Rodriguez

PhD Statistics


I was born in Colombia and completed my undergraduate studies at the National University of Colombia. After graduating in 2012, I was awarded the University of Auckland Doctoral Scholarship and moved to New Zealand. In May 2015, I finished my PhD and moved to Boston, USA, to take up a Research Fellow position in the Departments of Biostatistics and Epidemiology at Harvard University. In 2017, I moved back to the Department of Statistics at The University of Auckland.

Research | Current

My research interests follow three general themes, all strongly motivated by my PhD and post-doctoral research experience. My main interest is two-phase sampling designs and their application to a range of statistical problems. Currently I am particularly interested in developing valid statistical inference methodology for clustered correlated data or non-independent sampling designs. I am also interested in extending such methods to longitudinal or routinely collected data. I also have collaborative projects with researchers at the Schools of Medicine and the OUCRU. My second research interest is survival models such as proportional hazards models, and in particular the implementation of such methods in complex scenarios like two-phase designs. My third research interest is more practical. It involves proposing strategies that help investigators carry out survey sampling designs in non-ideal conditions, especially where several logistical issues have to be accounted for. This is particularly the case in studies conducted in developing countries. This part of my research is driven by collaborative research on immunization across different countries, with applications to Honduras and Brazil.

Teaching | Current

STATS740 and STATS730

Postgraduate supervision

Below is a list of potential PhD, Masters and Hons projects.

Title: Masters or Hons- What can we do with routinely collected data on patients with Avascular Necrosis?

Routinely collected data (RCD) have the potential to produce inexpensive research in a short period of time. RCD, however, should be used carefully when drawing conclusions about a population. This project aims to analyse data from health records on patients diagnosed with Avascular Necrosis.

The first objective is to characterize this population and additional common conditions in such patients. This requires searching through patients’ histories to identify such conditions.

The second objective is to compare models to study the risk/survival of hip replacements in such patients and to describe the utility of each of these models. Models include Cox models with the time-scale as time-on-study and Cox models with the time-scale as age.
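The difference between the two time-scales mentioned above can be written down as a short sketch. The notation here is an assumption introduced for illustration and is not taken from the project itself: T_j is the follow-up time of subject j, a_j^entry and a_j^exit are the ages at study entry and exit, and delta_i indicates an observed event. The form of the model is the same on both scales; what changes is the baseline hazard and who is counted in the risk set.

```latex
% Time-on-study scale: t is time since study entry
\lambda(t \mid x_i) = \lambda_0(t)\exp(\beta^\top x_i), \qquad
R(t) = \{ j : T_j \ge t \}

% Age scale: a is attained age; subjects join the risk set only
% after their entry age (left truncation)
\lambda(a \mid x_i) = \lambda_0(a)\exp(\beta^\top x_i), \qquad
R(a) = \{ j : a_j^{\text{entry}} < a \le a_j^{\text{exit}} \}

% Partial likelihood, same form on either scale,
% with s_i the event time (or event age) of subject i
L(\beta) = \prod_{i:\,\delta_i = 1}
  \frac{\exp(\beta^\top x_i)}{\sum_{j \in R(s_i)} \exp(\beta^\top x_j)}
```

On the age scale, two patients are compared at the same age rather than at the same time since entry, so age is adjusted for through the baseline hazard instead of through a covariate.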

A third objective is to evaluate the validity of the results. That is, to explore current literature on the area and to assess inexpensive approaches to evaluate the consistency of the data.

The project can be extended to a Masters project, although this would require additional work.

Requirements: R programming and survival data knowledge 


Title: Masters- Predicting the risk of hip and knee replacement in patients with arthritis using observational data.

Abstract: This project will analyze routinely collected data of patients with arthritis. For each of these patients, we will search future hospital discharges to identify those with knee and hip replacements and potential risk factors. Sampling weights will need to be incorporated in order to account for missingness. The project aims to use proportional hazards models (survival models) to study the risk of hip/knee replacement in patients with arthritis.
Requirements: R programming and survival data knowledge



Title: Masters or Hons- Evaluating case-control designs using routinely collected data.

Abstract: When conditions are uncommon, case-control studies generate a lot of information from relatively few subjects. This project aims to evaluate different case-control designs for the analysis of regression models using routinely collected data. The analysis will follow the RECORD guidelines (https://www.record-statement.org/checklist.php) for such data.

Initially, the evaluation will be done using simulation studies. Once this has been achieved, the methods will be applied to orthopaedics data of patients with osteonecrosis, a condition with fewer than 900 cases observed in 10 years in the available dataset.

Information on non-cases will be utilized to select a sample of controls using designs such as matching, counter matching, stratified and random sampling.
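As a rough illustration of the control-selection step described above, the sketch below simulates a cohort with a rare outcome and draws controls by simple random sampling and by a stratified (frequency-matched) design, attaching inverse-probability sampling weights. Everything in it is an assumption made for illustration: the function names, the stratum labels and the case probabilities are hypothetical, and counter-matching is not shown.

```python
import random
from collections import defaultdict


def simulate_cohort(n, case_prob_by_stratum, rng):
    """Hypothetical cohort: each subject gets a stratum label and a rare case indicator."""
    strata = list(case_prob_by_stratum)
    cohort = []
    for i in range(n):
        stratum = rng.choice(strata)
        is_case = rng.random() < case_prob_by_stratum[stratum]
        cohort.append({"id": i, "stratum": stratum, "case": is_case})
    return cohort


def random_controls(cohort, n_controls, rng):
    """Simple random sample of non-cases; weight = 1 / sampling fraction."""
    non_cases = [p for p in cohort if not p["case"]]
    n_controls = min(n_controls, len(non_cases))
    if n_controls <= 0:
        return []
    weight = len(non_cases) / n_controls
    return [dict(p, weight=weight) for p in rng.sample(non_cases, n_controls)]


def frequency_matched_controls(cohort, ratio, rng):
    """Within each stratum, sample `ratio` controls per case (capped by availability)."""
    cases_per_stratum = defaultdict(int)
    non_cases_per_stratum = defaultdict(list)
    for p in cohort:
        if p["case"]:
            cases_per_stratum[p["stratum"]] += 1
        else:
            non_cases_per_stratum[p["stratum"]].append(p)

    sampled = []
    for stratum, n_cases in cases_per_stratum.items():
        pool = non_cases_per_stratum[stratum]
        k = min(ratio * n_cases, len(pool))
        if k == 0:
            continue
        weight = len(pool) / k  # stratum-specific inverse sampling fraction
        sampled.extend(dict(p, weight=weight) for p in rng.sample(pool, k))
    return sampled
```

In a simulation study, each design would be applied repeatedly to cohorts generated this way, and the weighted regression estimates from the resulting case-control samples compared with the full-cohort estimates.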

Requirements: R programming, modelling and sampling  


Title: Masters- Understanding behavior change in a complex participatory community intervention to improve maternal and child survival in rural Malawi: self-esteem, empowerment, co-coverage and equity

Abstract: A large cluster randomized trial of a participatory community intervention with women’s groups was conducted in a rural district of Malawi (https://www.ncbi.nlm.nih.gov/pubmed/23683639). This intervention demonstrated an impact on maternal and child mortality, but the mechanism of behaviour change is still not understood. This project will develop the trial analysis by exploring two additional areas: 1) the impact of intervention participation on the “co-coverage” of key maternal, newborn and child health behaviours; and 2) the mediating role of empowerment and self-esteem. There is growing literature on the measurement of “co-coverage” for complex maternal and child health interventions [1]. This approach acknowledges that improvements in health and reductions in mortality due to complex public health interventions may arise from the adoption of multiple health behaviours. Indeed, public health interventions that target multiple behaviours simultaneously may have the most powerful effects. However, the analysis of trials typically focuses on the impact of interventions on separate individual behaviours. This project will involve a secondary analysis of data from the trial conducted in Malawi. The student will create a “co-coverage” index relevant to this study, and use this to measure the impact of the intervention on health behaviour. Further analyses will explore other ways of grouping health behaviour variables, such as calculation of a composite index.
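One simple way to build a “co-coverage” index of the kind described above is to count, for each respondent, how many of a chosen set of behaviours were adopted. The sketch below assumes that definition; the behaviour names, the `arm` field and the toy records are hypothetical and are not taken from the trial data.

```python
def co_coverage_index(record, behaviours):
    """Count how many of the listed behaviours a respondent reports (coded 1 = adopted)."""
    return sum(1 for b in behaviours if record.get(b) == 1)


def mean_index_by_arm(records, behaviours):
    """Average co-coverage index within each trial arm."""
    totals, counts = {}, {}
    for r in records:
        arm = r["arm"]
        totals[arm] = totals.get(arm, 0) + co_coverage_index(r, behaviours)
        counts[arm] = counts.get(arm, 0) + 1
    return {arm: totals[arm] / counts[arm] for arm in totals}


# Hypothetical behaviours and made-up records, for illustration only
BEHAVIOURS = ["antenatal_care", "skilled_birth_attendance", "exclusive_breastfeeding"]
TOY_RECORDS = [
    {"arm": "intervention", "antenatal_care": 1, "skilled_birth_attendance": 1, "exclusive_breastfeeding": 1},
    {"arm": "intervention", "antenatal_care": 1, "skilled_birth_attendance": 0, "exclusive_breastfeeding": 1},
    {"arm": "control", "antenatal_care": 1, "skilled_birth_attendance": 0, "exclusive_breastfeeding": 0},
    {"arm": "control", "antenatal_care": 0, "skilled_birth_attendance": 0, "exclusive_breastfeeding": 0},
]

print(mean_index_by_arm(TOY_RECORDS, BEHAVIOURS))
```

A raw comparison of arm means like this ignores the cluster randomization; an actual analysis would need to account for clustering, for example through a mixed model or GEE.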

There is a well-established link between self-esteem and mental health, and evidence for the role of empowerment in family planning and maternal health [2-4], but less is known about how self-esteem and empowerment may impact upon uptake of health services and childcare behaviours. The student will review the literature on measures of self-esteem and empowerment, and the mediating role of these on health behaviours. They will construct a self-esteem/empowerment scale using the questionnaire data collected, and use this to build a multivariable model looking at the relationship between group membership and health behaviour, then explore how self-esteem mediates this effect.


Requirements: R programming, modelling and principal components analysis/factor analysis.



Title: PhD- Methods for complex correlated survey data


Longitudinal data collected from health records usually contain patient information as repeated measures over time. These data are generally analyzed using mixed models or GEE (generalized estimating equations). When additional information on survival time (time until a diagnosis) is available, the analysis should account for both the time-to-event (survival) outcome and the longitudinal outcome. Joint models allow longitudinal data and survival outcomes to be modelled together. They combine a longitudinal mixed effects model with a time-to-event sub-model (e.g. a proportional hazards model).
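A common way to write such a joint model is the shared-parameter form sketched below. The notation is an assumption introduced here for illustration: y_i(t) is the longitudinal outcome of subject i, b_i are subject-level random effects, w_i are baseline covariates, and alpha links the two sub-models.

```latex
% Longitudinal sub-model (linear mixed effects) for subject i at time t
y_i(t) = m_i(t) + \varepsilon_i(t), \qquad
m_i(t) = x_i(t)^\top \beta + z_i(t)^\top b_i, \qquad
b_i \sim N(0, D), \quad \varepsilon_i(t) \sim N(0, \sigma^2)

% Time-to-event sub-model (proportional hazards), linked through m_i(t)
h_i(t) = h_0(t) \exp\{ \gamma^\top w_i + \alpha\, m_i(t) \}
```

Here alpha = 0 corresponds to no association between the longitudinal trajectory and the hazard, in which case the two sub-models could be fitted separately.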

This project seeks to develop methods for the analysis of joint models for complex survey data, where weights are unknown and estimated. The project will first develop methods for marginal survival models that account for the correlation between subjects using weighted GEE with estimated/calibrated weights. Second, the project aims to develop methods for the analysis of joint models for survey data with estimated/calibrated weights.


Title: PhD- Advanced WGEE under two-phase designs

Abstract: It is often the case that investigators have limited resources for their epidemiological or medical studies. Frequently there is access to large amounts of information, but the information may not be sufficient for the study purposes. One approach to solving this is to collect the crucial information only on a secondary sample of subjects. This sample can be the result of a complex selection process, where subjects share similarities, i.e. they are correlated.

A common example is health databases where patients may have repeated measures, or subjects can belong to the same family or clinic. Weighted GEE (WGEE) has gained increasing attention in the analysis of complex samples with correlated data. A key feature of WGEE is that it yields consistent estimates even when the working correlation structure is misspecified. So far, WGEE has been developed for independence (diagonal) working covariance structures. The method considers a weighted version of the working covariance matrix. This method, however, is not valid for more complex structures. In such cases, valid methods require estimation of the working precision matrix (inverse of the working covariance matrix) and estimation/knowledge of the pairwise sampling probabilities. This project aims to develop WGEE methods for complex covariance structures. Additionally, the project aims to implement methods for hypothesis testing for WGEE under two-phase designs.
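The role of the pairwise sampling probabilities can be seen by writing the estimating equations out. The following is one possible formulation, given as a sketch under assumed notation rather than as the project's method: R_ij indicates that observation j of cluster i is sampled at phase two, pi_ij = P(R_ij = 1), pi_ijk = P(R_ij = 1, R_ik = 1), and V_i is the working covariance matrix of the full cluster.

```latex
% Independence (diagonal) working covariance: each observation is weighted
% by the inverse of its own sampling probability \pi_{ij}
\sum_{i=1}^{K} \sum_{j=1}^{n_i} \frac{R_{ij}}{\pi_{ij}}
  \frac{\partial \mu_{ij}}{\partial \beta}\, v_{ij}^{-1}\,
  \bigl( y_{ij} - \mu_{ij}(\beta) \bigr) = 0

% Non-diagonal working covariance V_i: cross terms (j \ne k) need the
% pairwise probabilities \pi_{ijk}, with \pi_{ijj} = \pi_{ij}
\sum_{i=1}^{K} \sum_{j=1}^{n_i} \sum_{k=1}^{n_i} \frac{R_{ij} R_{ik}}{\pi_{ijk}}
  \frac{\partial \mu_{ij}}{\partial \beta}\, [V_i^{-1}]_{jk}\,
  \bigl( y_{ik} - \mu_{ik}(\beta) \bigr) = 0
```

With a diagonal V_i only the terms with j = k remain, and the second equation reduces to the first. Off-diagonal entries of the precision matrix bring in products R_ij R_ik, whose expectation is pi_ijk rather than pi_ij pi_ik unless observations are sampled independently.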






2009-2011: Scholarship from the Faculty of Sciences for academic performance, Colombia

2012: Summer Scholarship from The Institute of Applied and Pure Mathematics (IMPA), Rio de Janeiro, Brazil

2012-2015: The University of Auckland Doctoral Scholarship, New Zealand

2018: Worsley Early Career Research Award

Areas of expertise

Sampling, GLM, two-phase designs, biostatistics, GEE models, case-control, case-cohort, longitudinal data

Selected publications and creative works (Research Outputs)

  • Rivera-Rodriguez, C., Spiegelman, D., & Haneuse, S. (2019). On the analysis of two-phase designs in cluster-correlated data settings. Statistics in Medicine, 38 (23), 4611-4624. 10.1002/sim.8321
  • Rivera-Rodriguez, C., Haneuse, S., Wang, M., & Spiegelman, D. (2019). Augmented pseudo-likelihood estimation for two-phase studies. Statistical Methods in Medical Research. 10.1177/0962280219833415
  • Rivera-Rodriguez, C., Toscano, C., & Resch, S. (2019). Improved calibration estimators for the total cost of health programs and application to immunization in Brazil. PLoS ONE, 14 (3). 10.1371/journal.pone.0212401
  • Rivera-Rodriguez, C. L., Resch, S., & Haneuse, S. (2018). Quantifying and reducing statistical uncertainty in sample-based health program costing studies in low- and middle-income countries. SAGE Open Medicine, 6. 10.1177/2050312118765602
  • Haneuse, S., & Rivera-Rodriguez, C. (2018). On the analysis of case–control studies in cluster-correlated data settings. Epidemiology, 29 (1), 50-57. 10.1097/EDE.0000000000000763
  • Rivera, C. L., & Lumley, T. (2016). Using the entire history in the analysis of nested case cohort samples. Statistics in Medicine, 35 (18), 3213-3228. 10.1002/sim.6917
    Other University of Auckland co-authors: Thomas Lumley
  • Rivera, C., & Lumley, T. (2016). Using the whole cohort in the analysis of countermatched samples. Biometrics, 72 (2), 382-391. 10.1111/biom.12419
    Other University of Auckland co-authors: Thomas Lumley
  • Rivera Rodriguez, C. L. (2015). Weighted likelihood and calibration for countermatched designs. The University of Auckland. ResearchSpace@Auckland.
    URL: http://hdl.handle.net/2292/27300

Contact details

Primary office location

SCIENCE CENTRE 303 - Bldg 303
Level 3, Room 329
New Zealand
