Tutorial: Interval censored data analysis
Michael P. Fay, National Institute of Allergy and Infectious Diseases
(NIAID), USA
Abstract
Interval censored data analysis is important in biomedical statistics for any
type of time-to-event response where the time of response is not known exactly
but only known to occur between two assessment times, one before the event
occurred and one after the event occurred. Some examples are:
1. time until first positive HIV blood sample in an HIV vaccine trial,
2. time until first negative sputum culture in a TB treatment trial, and
3. time until cancer progression or death in a cancer treatment trial.
Standard survival methods (e.g., Kaplan-Meier curves, logrank tests,
accelerated failure time regression models) must be modified to properly account
for interval censoring. For example, naively imputing the failure time as
the midpoint of the interval and then performing the usual logrank test for
right-censored data can inflate the type I error rate. This topic is relevant to
the R users conference because, for some important methods for this type of data,
the only readily available software is in R packages. The goal of
this tutorial is to show why interval censored data methods are needed and
useful, and that some of them are easy to perform in R.
Outline
Topics will include:
- Types of interval censoring (non-informative vs. informative; Case 1, Case 2, Case k)
- Nonparametric maximum likelihood estimation (NPMLE) of the survival curve
- Right censored case (Kaplan-Meier). Graphical description of Efron's redistribution-to-the-right algorithm.
- Interval censored case. Graphical description of Turnbull's self-consistent algorithm (to give intuition on the NPMLE).
- Calculation of NPMLE in R
- survfit in the survival package, including a review of the Surv function and the different types of censoring
- interval package
- Icens package and its algorithms.
- Testing the difference between two groups
- Why we usually use rank tests for time-to-event responses
- Basic permutation tests
- Generalizing the Wilcoxon-Mann-Whitney test for survival data
- Likelihoods for interval censored data.
- Marginal likelihood of the ranks
- Grouped continuous model
- Weighted logrank tests as score tests on semiparametric models
- Logrank test (two versions) / proportional hazards
- Wilcoxon test / proportional odds
- Multiple imputation
- Why midpoint imputation can give bad type I errors
- What if the inspection process differs between treatment groups?
- Overview of type I error problems and different rank tests
- Weighted logrank tests in R using the interval package
- Choosing model/score
- Choosing method
- Regression
- Parametric models (accelerated failure time models)
- Examples using the survival R package
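To give a flavor of the self-consistent algorithm named in the outline, here is a minimal sketch of the iteration. It is written in plain Python for illustration and is not the code used by the survival, interval, or Icens R packages. The function name `turnbull_npmle` and its arguments are hypothetical. The sketch assumes each observation is an interval (left, right] with left < right, and, as a simplification, it spreads mass over every gap between consecutive endpoints rather than only over Turnbull's innermost intervals.

```python
import math


def turnbull_npmle(intervals, tol=1e-10, max_iter=10_000):
    """Self-consistent (EM-type) estimate of the mass on each candidate cell.

    intervals: list of (left, right) pairs meaning the event time lies in
    (left, right]; right may be math.inf for a right-censored observation.
    Assumes left < right for every pair.
    """
    # Candidate support: gaps between consecutive distinct endpoints.
    cuts = sorted({t for pair in intervals for t in pair})
    cells = list(zip(cuts[:-1], cuts[1:]))

    # alpha[i][j] is True when cell j lies inside observation i's interval.
    alpha = [
        [left <= a and b <= right for (a, b) in cells]
        for (left, right) in intervals
    ]

    n = len(intervals)
    p = [1.0 / len(cells)] * len(cells)  # start from equal mass on each cell

    for _ in range(max_iter):
        new_p = [0.0] * len(cells)
        for row in alpha:
            # Probability the current estimate gives to this observation.
            denom = sum(pj for pj, inside in zip(p, row) if inside)
            # Share this observation's weight (1/n) among its cells,
            # in proportion to the current mass on each cell.
            for j, inside in enumerate(row):
                if inside:
                    new_p[j] += p[j] / denom / n
        change = max(abs(a - b) for a, b in zip(p, new_p))
        p = new_p
        if change < tol:
            break
    return cells, p


# Four subjects: two events in (0, 1], one in (1, 2], one only known to be in (0, 2].
cells, p = turnbull_npmle([(0, 1), (0, 1), (1, 2), (0, 2)])
print(cells)                       # [(0, 1), (1, 2)]
print([round(x, 4) for x in p])    # [0.6667, 0.3333]
```

In this small example the likelihood is proportional to p1^2 * p2, which is maximized at p1 = 2/3, so the iteration can be checked by hand. The estimated survival curve drops by the mass of each cell somewhere inside that cell; where exactly within the cell is not determined by the data.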
Potential attendees
Potential attendees are those who analyze interval
censored data or plan clinical trials with endpoints of that type.
Required knowledge
Minimal knowledge of R is required. The tutorial will assume that participants have been exposed previously to standard right-censored
data analysis methods (Kaplan-Meier curves, logrank tests, etc.) although an
in-depth knowledge of those methods is not necessary.
Tutorial Materials
Slides are here.