Wednesday 24 August 2011

Maximum likelihood analysis of semicompeting risks data with semiparametric regression models

Yi-Hau Chen has a new paper in Lifetime Data Analysis which extends his 2010 JRSS B paper from competing risks to semi-competing risks data. In both cases the main idea is to model the dependence between the competing risks by assuming their event time distributions are related via some family of copulas. Mathematically this approach is quite elegant, as it allows regression models to be built on the marginal distributions of each failure time, with the inherent dependency in the censoring accounted for through the copula. From a practical perspective, particularly with semi-competing risks data in medical applications, one has to question both how sensible the model is and the objective of modelling marginal distributions.
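To make the copula idea concrete, here is a minimal sketch of how a joint survival function can be assembled from two marginal survival functions. The Clayton family, the exponential marginals and the rate values are all assumptions chosen for illustration; the post only says that "some family of copulas" is used, and covariates are left out entirely.

```python
import math

def clayton_joint_survival(s1, s2, theta):
    """Joint survival P(T1 > t1, T2 > t2) built from the marginal survival
    probabilities s1 = S1(t1) and s2 = S2(t2) with a Clayton copula.
    theta > 0 controls the strength of positive dependence."""
    if theta <= 0:
        raise ValueError("theta must be positive")
    if s1 == 0.0 or s2 == 0.0:
        return 0.0
    return (s1 ** -theta + s2 ** -theta - 1.0) ** (-1.0 / theta)

def exp_survival(t, rate):
    """Exponential marginal survival function (an illustrative assumption)."""
    return math.exp(-rate * t)

# Hypothetical marginal rates for illness and death, evaluated at t = 2
s_illness = exp_survival(2.0, 0.3)
s_death = exp_survival(2.0, 0.1)

independent = s_illness * s_death
dependent = clayton_joint_survival(s_illness, s_death, theta=2.0)
near_independent = clayton_joint_survival(s_illness, s_death, theta=1e-6)
```

The marginals stay exactly as specified whatever the value of theta, which is what lets the regression models sit on the marginal distributions. As theta shrinks towards zero the joint survival approaches the independence product.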

It seems most useful to follow Xu, Kalbfleisch and Tai and view semi-competing risks as an illness-death model. After accounting for covariates, a patient's illness time and death time can be related either through a shared frailty term, which it may be sensible to assume is determined from the outset, or through onset of illness causing death to occur sooner than it would otherwise have done. In the copula model these two distinct factors get pooled together. It is questionable how well the copula model would perform when the true process has a more event-determined dependence.

More importantly, the question has to be asked: why would you want to estimate the "illness free" survival distribution? This breaks Andersen and Keiding's guideline to "Stick to this world". Illness (or relapse) is never going to be eliminated. More sensible measures, like the cumulative incidence function (CIF) of death without illness having occurred, can of course be derived from Chen's copula model, although, analogously to the case of semi-parametric models on cause-specific hazards, the effect of covariates on the CIF may be complicated.

Monday 8 August 2011

Joint modelling of longitudinal outcome and interval-censored competing risk dropout in a schizophrenia clinical trial

Ralitza Gueorguieva, Robert Rosenheck and Haiqun Lin have a new paper in JRSS A. The paper concerns the joint modelling of a longitudinal outcome and an interval-censored competing risks outcome that explains drop-out. As is common with these joint longitudinal and survival models, the two processes are linked via a normally distributed vector of random effects. The novelty of the paper is that the survival part is a competing risks process and the event time is interval censored.

The authors adopt a parametric model for the competing risks, using the family of distributions proposed by Sparling et al (Biostatistics, 2006). This makes inference somewhat more straightforward than it would be if non-parametric baseline cause-specific hazards were used. As recently noted, parametric treatment of competing risks data is surprisingly rare. One problem faced by the authors is that the hazard family of Sparling, while allowing closed form expressions for interval-censored univariate survival data, does not result in closed form expressions for interval-censored competing risks data (except in special cases). Instead a numerical integral has to be computed. The presence of the overall random effects would mean the likelihood requires nested integration.

To avoid this problem the authors adopt an approximation to the true likelihood for competing risks data. If a patient is known to have had a failure of type j in the interval [t0,t1], the authors assume that the patient is censored for all risks except risk j at time t0. It is clear that this approximation will lead to systematic bias, as the time at risk from each failure type will be underestimated, so the hazards will tend to be overestimated. The amount of bias will depend on the typical length of the intervals [t0,t1].
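The effect of interval length on this approximation can be illustrated with a small sketch. It makes two assumptions that are not taken from the paper: the cause-specific hazards are constant (rather than from the Sparling family), so the exact contribution has a closed form, and the approximation is read as a product of "survive the other risks to t0" and "fail from risk j within [t0,t1] as if it were the only risk". The function names and rate values are hypothetical.

```python
import math

def exact_contribution(j, rates, t0, t1):
    """Exact probability of a type-j failure in [t0, t1] under constant
    cause-specific hazards `rates` (illustrative assumption)."""
    total = sum(rates)
    return rates[j] / total * (math.exp(-total * t0) - math.exp(-total * t1))

def approx_contribution(j, rates, t0, t1):
    """Approximate contribution: censored for every risk except j at t0,
    then interval censored on [t0, t1] for risk j alone."""
    others = sum(r for k, r in enumerate(rates) if k != j)
    censored_part = math.exp(-others * t0)
    interval_part = math.exp(-rates[j] * t0) - math.exp(-rates[j] * t1)
    return censored_part * interval_part

rates = [0.2, 0.1]  # hypothetical hazards for two drop-out types

# Ratio of approximate to exact contribution for a narrow and a wide interval
narrow_ratio = (approx_contribution(0, rates, 1.0, 1.1)
                / exact_contribution(0, rates, 1.0, 1.1))
wide_ratio = (approx_contribution(0, rates, 0.0, 5.0)
              / exact_contribution(0, rates, 0.0, 5.0))
```

Under these assumptions the approximate contribution is never smaller than the exact one, because exposure to the other risks during [t0,t1] is ignored. The ratio is close to 1 for the narrow interval and noticeably larger for the wide one, where taking t0 = 0 mimics the current status setting.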

For the CATIE data example the proposed approximation is probably not an issue. The drop-out (competing risks) part of the model is not the primary focus of the inference, and it is really the relative hazards of the different types of drop-out, rather than their absolute values, that are important in determining the trajectories of the longitudinal measure without drop-out. Indeed, the estimates for simulated data of a similar type are close to unbiased. However, in extreme cases like current status competing risks data the approximation will do extremely badly.

Current status observation of a three-state counting process with application to simultaneous accurate and diluted HIV test data

Karen McKeown and Nicholas Jewell have a new paper in Canadian Journal of Statistics as part of the Kalbfleisch and Lawless special issue. The paper considers non-parametric inference for three-state progressive models subject to current status observation. The authors note previous work by van der Laan and Jewell (Annals of Statistics, 2003) which shows that a naive estimator using only information on the first event cannot be improved upon in a fully non-parametric setting. The authors consider situations where additional assumptions are made about the waiting time (between the first and second events). In the motivating example, there is a fairly extreme case of current status data where all subjects are observed at the same time point (i.e. a cross-sectional sample). In this case, if the distribution of the first event time is assumed to be locally linear, then only the mean waiting time is relevant. The authors then consider a couple of different scenarios with a view to choosing an optimal assumed mean waiting time that minimizes the mean squared error of some quantity of interest relating to the first event time distribution (e.g. the mean cumulative hazard in some time interval).
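One way to see why only the mean waiting time matters under local linearity is a small numerical check. The sketch assumes the waiting time W is independent of the first event time T1, so that the probability of having reached the third state by the observation time c is E[F(c - W)], where F is the distribution function of T1. The particular linear and curved forms of F, and all the numbers, are hypothetical.

```python
import math

def prob_second_event_by(c, first_event_cdf, waiting_times, weights):
    """P(T1 + W <= c) for a discrete waiting time W independent of T1."""
    return sum(p * first_event_cdf(c - w)
               for w, p in zip(waiting_times, weights))

def linear_cdf(t, intercept=0.05, slope=0.02):
    """Hypothetical locally linear CDF, valid only over the range used here."""
    return intercept + slope * t

def curved_cdf(t, rate=0.1):
    """Hypothetical non-linear (exponential) CDF for contrast."""
    return 1.0 - math.exp(-rate * t)

c = 10.0
# Two waiting-time distributions with the same mean (1.0), different spread
fixed = ([1.0], [1.0])
spread = ([0.5, 1.5], [0.5, 0.5])

linear_fixed = prob_second_event_by(c, linear_cdf, *fixed)
linear_spread = prob_second_event_by(c, linear_cdf, *spread)
curved_fixed = prob_second_event_by(c, curved_cdf, *fixed)
curved_spread = prob_second_event_by(c, curved_cdf, *spread)
```

With the linear form the two waiting-time distributions give the same probability, since E[F(c - W)] reduces to F(c - E[W]). With the curved form they differ, so more than the mean of W would be needed.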