Bayesian multi-state models for intermittently-observed data

Fit a multi-state model to longitudinal data consisting of intermittent observations of a discrete state. Estimation is Bayesian and is carried out with the Stan software.

Usage

msmbayes(
  data,
  state,
  time,
  subject,
  Q,
  E = NULL,
  covariates = NULL,
  nphase = NULL,
  priors = NULL,
  fit_method = "sample",
  keep_data = FALSE,
  ...
)

Arguments

data

Data frame giving the observed data.

state

Character string naming the observed state variable in the data. This variable must be either an integer in 1, 2, ..., K, where K is the number of states, or a factor with these integers as level labels.

time

Character string naming the observation time variable in the data.

subject

Character string naming the individual ID variable in the data.
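As an illustration of the expected layout, the following sketch builds a small long-format data frame with one row per observation. The column names (id, months, status) and the values are hypothetical, chosen only for this example; note that the column names, not the columns themselves, are what would be passed as state, time and subject.

```r
# Hypothetical toy data: two individuals observed at irregular times.
dat <- data.frame(
  id     = c(1, 1, 1, 2, 2),      # individual ID variable
  months = c(0, 2.5, 7, 0, 4),    # observation times
  status = c(1L, 2L, 2L, 1L, 3L)  # observed state, an integer in 1, ..., K
)

K <- 3  # number of states assumed for this illustration

# The state variable may be an integer in 1, ..., K ...
states_ok <- all(dat$status %in% seq_len(K))

# ... or a factor whose level labels are those integers.
status_factor <- factor(dat$status, levels = seq_len(K))

# The variables are identified by name, as character strings.
args <- list(data = dat, state = "status", time = "months", subject = "id")
```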

Q

Matrix indicating the transition structure. A zero entry indicates that instantaneous transitions from the row state to the column state are disallowed. An entry of 1 (or any other positive value) indicates that the instantaneous transition is allowed. The diagonal of Q is ignored.

There is no need to "guess" initial values and put them here, as is sometimes done in msm. Initial values for fitting are determined by Stan from the prior distributions, and the specific values supplied for positive entries of Q are disregarded.
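A minimal sketch of such a matrix, assuming a hypothetical three-state structure (1 = healthy, 2 = ill, 3 = dead) that is not taken from this page. It also shows that only the zero/positive pattern carries information, so rescaling the positive entries describes the same structure.

```r
# Hypothetical illness-death structure; state 3 is absorbing.
Q <- rbind(
  c(0, 1, 1),  # from state 1: transitions to 2 and 3 allowed
  c(1, 0, 1),  # from state 2: transitions to 1 and 3 allowed
  c(0, 0, 0)   # from state 3: no transitions out
)

# List the allowed transitions as (row, col) index pairs,
# ignoring the diagonal.
allowed <- which(Q > 0 & row(Q) != col(Q), arr.ind = TRUE)

# Any positive value marks a transition as allowed, so this
# matrix has the same zero/positive pattern as Q.
Q_rescaled <- Q * 0.5
same_structure <- identical(Q_rescaled > 0, Q > 0)
```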

E

If NULL, a non-hidden Markov model is fitted. If non-NULL, this should be a matrix indicating the structure of allowed misclassifications, where rows are the true states and columns are the observed states. A zero \((r,s)\) entry indicates that true state \(r\) cannot be observed as state \(s\). A non-zero \((r,s)\) entry indicates that this misclassification is permitted, and gives an initial value for its probability. The diagonal of E is ignored.
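For illustration, a misclassification structure for a hypothetical three-state model might be written as below. The value 0.1 is an arbitrary initial value chosen for this sketch, not a recommendation.

```r
# Rows are true states, columns are observed states.
E <- rbind(
  c(0,   0.1, 0),  # true state 1 may be observed as state 2
  c(0.1, 0,   0),  # true state 2 may be observed as state 1
  c(0,   0,   0)   # true state 3 cannot be observed as any other state
)

# Number of permitted misclassifications (off-diagonal non-zero entries)
n_misclass <- sum(E != 0 & row(E) != col(E))
```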

covariates

Specification of covariates on transition intensities. This should be a list of formulae. Each formula should have a left-hand side of the form Q(r,s), and a right-hand side defining the regression model for the log of the transition intensity from state \(r\) to state \(s\).

For example,

covariates = list(Q(1,2) ~ age + sex, Q(2,1) ~ age)

specifies that the log of the 1-2 transition intensity is an additive linear function of age and sex, and the log 2-1 transition intensity is a linear function of age. You do not have to list all of the intensities here if some of them are not influenced by covariates.
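The list above is ordinary R: each element is a formula object, and the left-hand side Q(r,s) is kept as an unevaluated call, so no function named Q needs to exist when the list is created. The following sketch inspects such a list using base R only; the variable names age and sex are those of the example above.

```r
covs <- list(Q(1, 2) ~ age + sex, Q(2, 1) ~ age)

# Left-hand sides, as text: which transition each formula refers to
lhs <- vapply(covs, function(f) deparse(f[[2]]), character(1))

# Covariate names appearing on each right-hand side
rhs_vars <- lapply(covs, function(f) all.vars(f[[3]]))

all_formulas <- all(vapply(covs, inherits, logical(1), what = "formula"))
```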

nphase

For phase-type models, this is a vector with one element per state, giving the number of phases for that state. The element should be 1 for states that do not have phase-type sojourn distributions. Not required for non-phase-type models.
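As a small sketch, assuming a hypothetical three-state model in which only state 2 is given a phase-type sojourn distribution with two phases:

```r
# One element per state: states 1 and 3 keep a single phase,
# state 2 is given two phases.
nphase <- c(1, 2, 1)

# States that have a phase-type sojourn distribution
phased_states <- which(nphase > 1)

# Total number of phases across all states
n_phases_total <- sum(nphase)
```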

priors

A list specifying priors. Each component should be the result of a call to msmprior. Any parameters with priors not specified here are given default priors (normal with mean -2 and SD 2 for log intensities, and normal with mean 0 and SD 10 for log hazard ratios).

If only one parameter is given a non-default prior, a single msmprior call can be supplied here instead of a list.
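To see what the default priors imply on the natural scale, note that a normal prior on a log intensity (or log hazard ratio) is a log-normal prior on the intensity (or hazard ratio). The sketch below uses base R only and the default means and SDs stated above; it does not call msmprior.

```r
probs <- c(0.025, 0.5, 0.975)

# Default prior on a log intensity: normal with mean -2 and SD 2.
# Implied prior median and 95% interval for the intensity itself:
q_quantiles <- qlnorm(probs, meanlog = -2, sdlog = 2)

# Default prior on a log hazard ratio: normal with mean 0 and SD 10.
# Implied prior median and 95% interval for the hazard ratio:
hr_quantiles <- qlnorm(probs, meanlog = 0, sdlog = 10)

print(signif(q_quantiles, 3))
print(signif(hr_quantiles, 3))
```

The prior median intensity is exp(-2), about 0.135 transitions per unit of time, and the prior median hazard ratio is 1. If the time unit of the data makes these defaults implausible, that is a reason to supply priors explicitly.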

fit_method

Quoted name of a function from the cmdstanr package specifying the algorithm to fit the model. The default "sample" uses MCMC, via cmdstanr::sample(). Alternatives are cmdstanr::optimize(), cmdstanr::pathfinder(), cmdstanr::laplace() or cmdstanr::variational().

keep_data

If TRUE, a copy of the cleaned data is stored in the returned object. FALSE by default.

...

Other arguments to be passed to the function from cmdstanr that fits the model.
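The following sketch shows how sampler settings might be forwarded through the ... argument. It is wrapped in a function that is defined but not called here, since running it requires msmbayes and a working Stan installation. The column names are hypothetical, and chains, iter_warmup, iter_sampling and seed are assumed to be accepted by the cmdstanr sampling method that fit_method = "sample" selects.

```r
# Hypothetical helper: fit a model to a data frame `dat` that has
# columns "status", "months" and "id", with transition structure `Q`.
fit_sketch <- function(dat, Q) {
  msmbayes::msmbayes(
    data = dat,
    state = "status",
    time = "months",
    subject = "id",
    Q = Q,
    fit_method = "sample",
    # Everything below is passed on via `...` to the fitting function
    # (assumed here to be cmdstanr's sampling method).
    chains = 2,
    iter_warmup = 500,
    iter_sampling = 500,
    seed = 1
  )
}
```

Switching fit_method to one of the alternatives listed above would change which arguments are meaningful in ..., so the extra arguments should be checked against the documentation of the chosen cmdstanr method.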

Value

A data frame in the draws format of the posterior package, containing draws from the posterior of the model parameters.

Attributes are added to give information about the model structure, and a class "msmbayes" is appended.

To extract parameter estimates from the fitted model, use functions such as summary.msmbayes, qdf and hr.