Constructor for a standardising population used for survextrap outputs
Source: R/standardise.R
Standardised outputs are outputs from models with covariates that are defined by marginalising (averaging) over the covariate values in a given population, rather than by conditioning on a single covariate value.
Usage
standardise_to(newdata, nstd = 1, random = FALSE)
standardize_to(newdata, nstd = 1, random = FALSE)
Arguments
- newdata
Data frame describing a population.
- nstd
Number of draws from the population distribution used per MCMC sample from the parameters when random=TRUE. With the default of 1, the value of the covariate vector \(X\) is essentially treated as if it were an additional parameter in the Bayesian model, drawn by Monte Carlo independently of the remaining parameters.
- random
By default this is FALSE, indicating that standardised samples should be obtained by concatenating the posterior samples for each covariate value in the standard population. The sample from the standardised posterior of parameters then has size niter times the number of rows in newdata, where niter is the number of MCMC iterations used in the original survextrap fit. Computing the resulting output function (e.g. RMST, which uses numerical integration) can then be computationally intensive if this sample size is large.
A quicker alternative is to sample a random row of the standard population for each MCMC iteration. The standardised sample from the posterior then has size niter. This is specified by using random=TRUE. If this is used, then the result depends on the random number seed, and it should be checked that the results are stable to within the required number of significant figures. If not, run survextrap with more MCMC iterations or increase nstd here.
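The difference between the two settings of random can be illustrated with a small base R sketch. This does not call survextrap and is not the package's implementation: the matrix post of simulated draws, the Beta parameters, and the way nstd is applied (as nstd sampled rows per iteration) are assumptions made purely for illustration.

```r
set.seed(1)

niter <- 1000                                    # hypothetical number of MCMC iterations
newdata <- data.frame(rx = c("Obs", "Lev+5FU"))  # standard population
nrows <- nrow(newdata)

# Simulated stand-in for posterior draws of an output (e.g. 5-year survival):
# one row per MCMC iteration, one column per row of newdata.
post <- cbind(rbeta(niter, 38, 62), rbeta(niter, 61, 39))

# Analogue of random = FALSE: concatenate the draws for every covariate value.
std_concat <- as.vector(post)                    # length niter * nrows

# Analogue of random = TRUE: pick a random row of the population for each
# iteration (assumed here: nstd picks per iteration).
nstd <- 1
iters <- rep(seq_len(niter), times = nstd)
rows <- sample.int(nrows, size = niter * nstd, replace = TRUE)
std_random <- post[cbind(iters, rows)]           # length niter * nstd

# Compare sample sizes and summaries of the two standardised samples.
c(concat = length(std_concat), random = length(std_random))
rbind(concat = quantile(std_concat, c(0.5, 0.025, 0.975)),
      random = quantile(std_random, c(0.5, 0.025, 0.975)))
```

Rerunning the sketch with a different seed changes only the std_random summaries, which is the kind of seed sensitivity worth checking before relying on random=TRUE.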
Value
A copy of newdata, but with attributes added to indicate that this should be used as a standard population. When this newdata is passed to survextrap's output functions, the outputs will then be presented as an average over the empirical distribution of covariate values described by newdata, rather than as one output per row of newdata (distinct covariate values).
Details
Standardised outputs are produced by generating a Monte Carlo sample from the joint distribution of parameters \(\theta\) and covariate values \(X\), \(p(X,\theta) = p(\theta|X)p(X)\), where \(p(X)\) is defined by the empirical distribution of covariates in the standard population.
Hence applying a vectorised output function \(g()\) (such as the RMST or survival probability) to this sample produces a sample from the posterior of \(\int g(\theta|X) p(X) dX\): the average RMST (say) for a heterogeneous population.
See the Examples vignette for some examples and notes on computation.
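To make the averaging explicit, suppose the standard population has \(n\) rows \(x_1, \ldots, x_n\) (the symbols \(n\) and \(x_i\) are notation introduced here for illustration). Under the empirical distribution each row carries weight \(1/n\), so the population average of the output reduces to a finite sum:

\[
\int g(\theta|X)\, p(X)\, dX = \frac{1}{n} \sum_{i=1}^{n} g(\theta|x_i)
\]

For a two-row population such as the one in the Examples, this is the equal-weight average of the output evaluated at each of the two covariate values.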
Examples
library(survextrap)  # load the package before running the example
rxph_mod <- survextrap(Surv(years, status) ~ rx, data=colons, fit_method="opt")
ref_pop <- data.frame(rx = c("Obs","Lev+5FU"))
# covariate-specific outputs
survival(rxph_mod, t = c(5,10), newdata = ref_pop)
#> # A tibble: 4 × 5
#>   rx          t median  lower upper
#>   <chr>   <dbl>  <dbl>  <dbl> <dbl>
#> 1 Obs         5  0.376 0.208  0.522
#> 2 Obs        10  0.228 0.0429 0.470
#> 3 Lev+5FU     5  0.608 0.407  0.736
#> 4 Lev+5FU    10  0.470 0.186  0.690
# standardised outputs
survival(rxph_mod, t = c(5,10), newdata = standardise_to(ref_pop))
#> # A tibble: 2 × 4
#>       t median  lower upper
#>   <dbl>  <dbl>  <dbl> <dbl>
#> 1     5  0.483 0.242  0.716
#> 2    10  0.345 0.0673 0.662