Computes consequences of priors chosen for the parameters hsd and hrsd in a flexible hazard model (survextrap) on an interpretable scale. This can be used to calibrate Gamma priors for these parameters to match interpretable beliefs.
Usage
prior_haz_sd(
  mspline,
  coefs_mean = NULL,
  prior_hsd = p_gamma(2, 1),
  prior_hscale = p_normal(0, 20),
  smooth_model = "exchangeable",
  prior_loghr = NULL,
  formula = NULL,
  cure = NULL,
  nonprop = NULL,
  newdata = NULL,
  prior_hrsd = NULL,
  tmin = 0,
  tmax = NULL,
  nsim = 1000,
  hq = c(0.1, 0.9),
  quantiles = c(0.025, 0.5, 0.975)
)

prior_hr_sd(
  mspline,
  coefs_mean = NULL,
  prior_hsd = p_gamma(2, 1),
  prior_hscale = p_normal(0, 20),
  smooth_model = "exchangeable",
  prior_loghr = NULL,
  formula = NULL,
  cure = NULL,
  nonprop = NULL,
  newdata = NULL,
  newdata0 = NULL,
  prior_hrsd = NULL,
  tmin = 0,
  tmax = 10,
  nsim = 100,
  hq = c(0.1, 0.9),
  quantiles = c(0.025, 0.5, 0.975)
)
Arguments
- mspline
A list of control parameters defining the spline model.
  knots: Spline knots. If this is not supplied, then the number of knots is taken from df, and their location is taken from equally-spaced quantiles of the observed event times in the individual-level data.
  add_knots: This is intended to be used when there are external data included in the model. External data are typically outside the time period covered by the individual data. add_knots would then be chosen to span the time period covered by the external data, so that the hazard trajectory can vary over that time.
  If there are external data, and both knots and add_knots are omitted, then a default set of knots is chosen to span both the individual and external data, by taking the quantiles of a vector defined by concatenating the individual-level event times with the start and stop times in the external data.
  df: Degrees of freedom, i.e. the number of parameters (or basis terms) intended to result from choosing knots based on quantiles of the data. The total number of parameters will then be df plus the number of additional knots specified in add_knots. df defaults to 10. This does not necessarily overfit, because the function is smoothed through the prior.
  degree: Polynomial degree used for the basis function. The default is 3, giving a cubic. This can only be changed from 3 if bsmooth is FALSE.
  bsmooth: If TRUE (on by default) the spline is smoother at the highest knot, by defining the derivative and second derivative at this point to be zero.
- coefs_mean
Spline basis coefficients that define the prior mean for the hazard function. By default, these are set to values that define a constant hazard function (see mspline_constant_coefs). They are normalised to sum to 1 internally (if they do not already).
- prior_hsd
Gamma prior for the standard deviation that controls the variability over time (or smoothness) of the hazard function. This should be a call to p_gamma(). The default is p_gamma(2,1). See prior_haz_sd for a way to calibrate this to represent a meaningful belief.
- prior_hscale
Prior for the baseline log hazard scale parameter (alpha or log(eta)). This should be a call to a prior constructor function, such as p_normal(0,1) or p_t(0,2,2). Supported prior distribution families are normal (parameters mean and SD) and t distributions (parameters location, scale and degrees of freedom). The default is a normal distribution with mean 0 and standard deviation 20.
Note that eta is not in itself a hazard, but it is proportional to the hazard (see the vignette for the full model specification).
"Baseline" is defined by the continuous covariates taking a value of zero and factor covariates taking their reference level. To use a different baseline, the data should be transformed appropriately beforehand, so that a value of zero has a different meaning. For continuous covariates, it helps for both computation and interpretation to define the value of zero to denote a typical value in the data, e.g. the mean.
- smooth_model
The default "exchangeable" uses independent logistic priors on the multinomial-logit spline coefficients, conditionally on a common smoothing variance parameter.
The alternative, "random_walk", specifies a random walk prior for the multinomial-logit spline coefficients, based on logistic distributions. See the methods vignette for full details.
In non-proportional hazards models, setting smooth_model also determines whether an exchangeable or random walk model is used for the non-proportionality parameters (\(\delta\)).
- prior_loghr
Priors for log hazard ratios. This should be a call to p_normal() or p_t(). A list of calls can also be provided, to give different priors to different coefficients, where the name of each list component matches the name of the coefficient, e.g. list("age45-59" = p_normal(0,1), "age60+" = p_t(0,2,3)).
The default is p_normal(0,2.5) for all coefficients.
- formula
A survival formula in standard R formula syntax, with a call to Surv() on the left hand side.
Covariates included on the right hand side of the formula will be modelled with proportional hazards, or if nonprop is TRUE then a non-proportional hazards model is used.
If data is omitted, so that the model is being fitted to external aggregate data alone, without individual data, then the formula should not include a Surv() call. The left-hand side of the formula will then be empty, and the right hand side specifies the covariates as usual. For example, formula = ~1 if there are no covariates.
- cure
If TRUE, a mixture cure model is used, where the "uncured" survival is defined by the M-spline model, and the cure probability is estimated.
- nonprop
Non-proportional hazards model specification. This is achieved by modelling the spline basis coefficients in terms of the covariates. See the methods vignette for more details.
If TRUE, then all covariates are modelled with non-proportional hazards, using the same model formula as formula.
If this is a formula, then this is assumed to define a model for the dependence of the basis coefficients on the covariates.
If this is NULL or FALSE (the default) then any covariates are modelled with proportional hazards.
- newdata
A data frame with one row, containing variables in the model formulae. Samples will then be drawn, for any covariate-dependent parameters, with covariates set to the values given here.
- prior_hrsd
Prior for the standard deviation parameters that smooth the non-proportionality effects over time in non-proportional hazards models. This should be a call to p_gamma() or a list of calls to p_gamma() with one component per covariate, as in prior_loghr. See prior_hr_sd for a way to calibrate this to represent a meaningful belief.
- tmin
Minimum plotting time. Defaults to zero.
- tmax
Maximum plotting time. Defaults to the highest knot.
- nsim
Number of simulations to draw.
- hq
Quantiles which define the "low" and "high" values of a time-varying quantity (the hazard in prior_haz_sd and the hazard ratio in prior_hr_sd). The ratio between the high and low values will be summarised, as a measure of time-dependence. By default, this is c(0.1, 0.9), so that the 10% and 90% quantiles are used respectively.
- quantiles
Quantiles used to summarise the implied prior distributions of the simulated quantities.
- newdata0
A data frame with one row, containing "reference" values of variables in the model formulae. The hazard ratio between the hazards at newdata and newdata0 will be returned.
Value
For prior_haz_sd, a data frame with columns sd_haz (SD of the hazard), sd_mean (SD of the inverse hazard) and hr (ratio between high and low hazards), and rows giving prior quantiles of these.
In prior_hr_sd, sd_hr is the SD of hazard ratios over time, and hrr is the ratio between high and low hazard ratios.
Details
The spline model in survextrap allows the hazard to change over time in an arbitrarily flexible manner. The prior distributions on the parameters of this model have implications for how much we expect the hazard to plausibly vary over time. These priors are hard to interpret directly, but this function can be used to compute their implications on a more easily-understandable scale.
This is done by:
(1) simulating a set of parameters from their prior distributions
(2) computing the hazard at a fine grid of equally-spaced points spanning the boundary knots
(3) calculating the empirical standard deviation of the set of hazards at these points
(4) repeatedly performing steps 1-3, and summarising the distribution of the resulting standard deviations. This is the implied prior for the hazard variability.
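The four steps above can be sketched in base R. This is only an illustration of the simulation idea, not the package's implementation: it assumes an "exchangeable" smoothing model, uses a B-spline basis from splines::bs() as a stand-in for the M-spline basis, treats the Gamma(2, 1) prior as shape 2 and rate 1, and uses hypothetical knots and a hypothetical Normal(0, 1) prior for the log hazard scale. The name sim_one is invented for this example.

```r
library(splines)  # ships with R; bs() is a stand-in for the M-spline basis

set.seed(1)
knots  <- c(1, 2, 5)   # hypothetical internal knots
bknots <- c(0, 10)     # hypothetical boundary knots

# Step 2 (set-up): fine grid of equally-spaced points spanning the boundary knots
tgrid  <- seq(bknots[1], bknots[2], length.out = 100)
basis  <- bs(tgrid, knots = knots, Boundary.knots = bknots,
             degree = 3, intercept = TRUE)
nbasis <- ncol(basis)

sim_one <- function() {
  # Step 1: simulate parameters from their priors (assumed forms)
  sigma   <- rgamma(1, shape = 2, rate = 1)   # smoothing SD
  log_eta <- rnorm(1, mean = 0, sd = 1)       # hypothetical log hazard scale prior
  x       <- sigma * rlogis(nbasis)           # logistic noise around equal coefficients
  x       <- x - max(x)                       # guard against overflow in exp()
  coefs   <- exp(x) / sum(exp(x))             # coefficients sum to 1
  # Step 2: hazard on the grid
  haz <- exp(log_eta) * as.vector(basis %*% coefs)
  # Step 3: empirical SD of the hazard over time
  sd(haz)
}

# Step 4: repeat and summarise the implied prior for the hazard variability
sd_haz <- replicate(1000, sim_one())
quantile(sd_haz, c(0.025, 0.5, 0.975))
```

With this B-spline stand-in, equal coefficients give a constant hazard, so a smoothing SD near zero corresponds to an SD over time near zero.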
prior_haz_sd computes the SD of the hazard, and the SD of the inverse hazard is also computed. The inverse hazard at time t is the expected time to the event given survival to t, if the hazard were to stay constant at its value at t.
The hazard ratio between a high and low value (defined by quantiles of values at different times) is also computed.
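As a rough illustration of these three summaries for a single hazard curve, the following base R sketch uses a hypothetical, exponentially increasing hazard. The object names mirror the output columns, but the use of quantile() with its default settings is an assumption, not necessarily how the package computes them.

```r
# Hypothetical hazard curve on a grid of times (for illustration only)
tgrid <- seq(0, 10, length.out = 100)
haz   <- 0.1 * exp(0.1 * tgrid)

hq <- c(0.1, 0.9)              # "low" and "high" quantiles, as in the hq argument

sd_haz  <- sd(haz)             # SD of the hazard over time
sd_mean <- sd(1 / haz)         # SD of the inverse hazard over time
q       <- quantile(haz, hq)   # low and high hazard values across the time grid
hr      <- unname(q[2] / q[1]) # ratio between the high and low hazards

c(sd_haz = sd_haz, sd_mean = sd_mean, hr = hr)
```

For a hazard that is constant over time, both SDs would be 0 and the ratio would be 1.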
prior_hr_sd computes the SD of the hazard ratio between two covariate values supplied by the user.
All of these SDs refer to the variability over time, e.g. a SD of 0 indicates that the hazard (or inverse hazard, or hazard ratio) is constant with time.