Draw a number of bootstrap resamples, refit a `msm` model to the resamples, and calculate statistics on the refitted models.

## Arguments

- x
A fitted msm model, as output by `msm`.

- stat
A function to call on each refitted msm model. By default this is `pmatrix.msm`, returning the transition probability matrix in one time unit. If `NULL` then no function is computed.

- B
Number of bootstrap resamples.

- file
Name of a file in which to save partial results after each replicate. This is saved using `save` and can be restored using `load`, producing an object called `boot.list` containing the partial results. Not supported when using parallel processing.

- cores
Number of processor cores to use for parallel processing. Requires the doParallel package to be installed. If not specified, parallel processing is not used. If `cores` is set to the string `"default"`, the default methods of `makeCluster` (on Windows) or `registerDoParallel` (on Unix-like) are used.

- remove.errors
If `TRUE` then bootstrap refits which resulted in an error are removed from the returned list, and a message is returned which states the proportion of failed fits and the first error message. If `FALSE`, then the error message for failed refits is placed in the corresponding component of the returned list.

## Value

A list with `B` components, containing the result of calling function `stat` on each of the refitted models. If `stat` is `NULL`, then each component just contains the refitted model. If one of the `B` model fits was unsuccessful and resulted in an error, then the corresponding list component will contain the error message.

## Details

The bootstrap datasets are computed by resampling independent pairs of observations at successive times (for non-hidden models without censoring), or independent individual series (for hidden models or models with censoring). Therefore this approach doesn't work if, for example, the data for a HMM consist of a series of observations from just one individual, and is inaccurate for small numbers of independent transitions or individuals.
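The two resampling schemes described above can be illustrated on a toy dataset. This is a minimal base R sketch, not the code `boot.msm` uses internally; the data frame `toy`, the helper `make_pairs` and all column names are hypothetical.

```r
# Toy panel data: three subjects observed at irregular times (made-up values).
toy <- data.frame(
  subject = c(1, 1, 1, 2, 2, 3, 3, 3, 3),
  time    = c(0, 1, 3, 0, 2, 0, 1, 2, 4),
  state   = c(1, 2, 2, 1, 3, 1, 1, 2, 3)
)

# (a) Transition-level resampling: build one row per pair of successive
# observations within a subject, then sample those rows with replacement.
make_pairs <- function(d) {
  d <- d[order(d$subject, d$time), ]
  pieces <- lapply(split(d, d$subject), function(s) {
    n <- nrow(s)
    if (n < 2) return(NULL)  # a single observation yields no transition
    data.frame(subject = s$subject[-n],
               t_from = s$time[-n],  t_to = s$time[-1],
               s_from = s$state[-n], s_to = s$state[-1])
  })
  do.call(rbind, pieces)
}

set.seed(1)
pairs <- make_pairs(toy)
boot_pairs <- pairs[sample(nrow(pairs), replace = TRUE), ]

# (b) Subject-level resampling: sample whole series with replacement and
# relabel them, so a subject drawn twice counts as two individuals.
ids <- unique(toy$subject)
drawn <- sample(ids, length(ids), replace = TRUE)
boot_series <- do.call(rbind, lapply(seq_along(drawn), function(i) {
  s <- toy[toy$subject == drawn[i], ]
  s$subject <- i
  s
}))
```

With only one individual, scheme (b) can only ever reproduce the original series, which is why the approach breaks down in that case.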

Confidence intervals or standard errors for the corresponding statistic can be calculated by summarising the returned list of `B` replicated outputs. This is currently implemented for most of the output functions: `qmatrix.msm`, `ematrix.msm`, `qratio.msm`, `pmatrix.msm`, `pmatrix.piecewise.msm`, `totlos.msm` and `prevalence.msm`. For other outputs, users will have to write their own code to summarise the output of `boot.msm`.
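For outputs that need hand-written summaries, one possible pattern is sketched below. It assumes each replicate is a numeric matrix of the same dimension; `fake_boot` is a simulated stand-in for the list returned by `boot.msm`, so the numbers themselves are meaningless.

```r
set.seed(42)
B <- 200

# Stand-in for the output of boot.msm: B simulated 2 x 2 matrices.
fake_boot <- replicate(
  B,
  matrix(rnorm(4, mean = c(0.9, 0.1, 0.2, 0.8), sd = 0.02), 2, 2),
  simplify = FALSE
)

# Keep only components that are numeric matrices, in case some components
# hold something else, such as stored error messages from failed refits.
ok <- Filter(function(m) is.matrix(m) && is.numeric(m), fake_boot)

# Stack into a 2 x 2 x length(ok) array and summarise element-wise.
arr <- simplify2array(ok)
boot_se <- apply(arr, c(1, 2), sd)
boot_ci <- apply(arr, c(1, 2), quantile, probs = c(0.025, 0.975))
```

Here `boot_ci[1, , ]` holds the lower limits and `boot_ci[2, , ]` the upper limits of the percentile intervals.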

Most of msm's output functions present confidence intervals based on asymptotic standard errors calculated from the Hessian. These are expected to be underestimates of the true standard errors (Cramer-Rao lower bound). Some of these functions use a further approximation, the delta method (see `deltamethod`) to obtain standard errors of transformed parameters. Bootstrapping should give a more accurate estimate of the uncertainty.

An alternative method, which is faster though less accurate than bootstrapping but more accurate than the delta method, is to draw a sample from the asymptotic multivariate normal distribution implied by the maximum likelihood estimates (and covariance matrix), and summarise the transformed estimates. See `pmatrix.msm`.
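The idea behind this alternative can be sketched in base R for a generic two-parameter problem. The estimates `est`, the covariance matrix `covmat` and the transformed quantity (a ratio of two intensities) are hypothetical values chosen for illustration; they do not come from a fitted msm model, and this is not how `pmatrix.msm` is implemented.

```r
set.seed(7)

# Hypothetical maximum likelihood estimates of two log intensities
# and their estimated covariance matrix.
est <- c(log(0.10), log(0.25))
covmat <- matrix(c(0.04, 0.01,
                   0.01, 0.09), 2, 2)

# Draw from N(est, covmat) using the Cholesky factor R, where
# covmat = t(R) %*% R, so that z %*% R has covariance covmat.
n_draws <- 5000
R <- chol(covmat)
z <- matrix(rnorm(n_draws * length(est)), nrow = n_draws)
draws <- sweep(z %*% R, 2, est, "+")

# Transform each draw and summarise the transformed values.
ratio <- exp(draws[, 1]) / exp(draws[, 2])
ci <- quantile(ratio, c(0.025, 0.975))
```

Unlike bootstrapping, no model is refitted: the cost is one matrix factorisation plus the evaluation of the transformation on each draw.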

All objects used in the original call to `msm` which produced `x`, such as the `qmatrix`, should be in the working environment, or else `boot.msm` will produce an “object not found” error. This enables `boot.msm` to refit the original model to the replicate datasets. However, there is currently a limitation. In the original call to `msm`, the `"formula"` argument should be specified directly, as, for example,

`msm(state ~ time, data = ...)`

and not, for example,

`form = data$state ~ data$time`

`msm(formula=form, data = ...)`

otherwise `boot.msm` will be unable to draw the replicate datasets.

`boot.msm` will also fail with an incomprehensible error if the original call to `msm` used a user-defined object whose name is the same as a built-in R object, or an object in any other loaded package. For example, this happens if you have called a Q matrix `q`, since `q()` is the built-in function for quitting R.
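One way to guard against such a clash before fitting is to check whether a candidate object name is already defined in an attached package. The helper `clashes_with_attached` below is hypothetical and not part of msm; it only inspects the search path, so it is a rough screen rather than a guarantee.

```r
# Returns TRUE if `name` is already defined in any attached package or
# other entry on the search path, ignoring the user's own workspace.
clashes_with_attached <- function(name) {
  attached <- setdiff(search(), ".GlobalEnv")
  any(vapply(
    attached,
    function(pos) exists(name, where = pos, inherits = FALSE),
    logical(1)
  ))
}

clashes_with_attached("q")       # base R defines q(), so this name is risky
clashes_with_attached("psor.q")  # a more distinctive name for a Q matrix
```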

If `stat` is `NULL`, then `B` different `msm` model objects will be stored in memory. This is inadvisable: `msm` objects tend to be large, since they contain the original data used for the `msm` fit, so this will be wasteful of memory.

To specify more than one statistic, write a function that returns a list containing the results of the different function calls, for example,

`stat = function(x) list(pmatrix.msm(x, t=1), pmatrix.msm(x, t=2))`
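When `stat` returns a list like this, each component of the bootstrap output is itself a list, so the statistics have to be pulled apart before summarising. The sketch below uses a simulated stand-in (`boot_out`) in place of real `boot.msm` output, and `summarise_stat` is a hypothetical helper.

```r
set.seed(3)

# Stand-in for boot.msm output when stat returns two matrices per replicate.
fake_stat <- function() list(matrix(runif(4), 2, 2), matrix(runif(4), 2, 2))
boot_out <- replicate(50, fake_stat(), simplify = FALSE)

# Extract the k-th statistic from every replicate, stack the matrices
# into a 3-d array and compute element-wise bootstrap standard errors.
summarise_stat <- function(out, k) {
  arr <- simplify2array(lapply(out, `[[`, k))
  apply(arr, c(1, 2), sd)
}

se_first  <- summarise_stat(boot_out, 1)  # e.g. the t = 1 matrix
se_second <- summarise_stat(boot_out, 2)  # e.g. the t = 2 matrix
```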

## References

Efron, B. and Tibshirani, R.J. (1993) *An Introduction to
the Bootstrap*, Chapman and Hall.

## Examples

```
if (FALSE) {
  library(msm)

  ## Psoriatic arthritis example
  data(psor)
  ## Initial transition intensities for a progressive four-state model
  psor.q <- rbind(c(0, 0.1, 0, 0), c(0, 0, 0.1, 0), c(0, 0, 0, 0.1), c(0, 0, 0, 0))
  psor.msm <- msm(state ~ months, subject = ptnum, data = psor,
                  qmatrix = psor.q, covariates = ~ollwsdrt + hieffusn,
                  constraint = list(hieffusn = c(1, 1, 1), ollwsdrt = c(1, 1, 2)),
                  control = list(REPORT = 1, trace = 2), method = "BFGS")

  ## Bootstrap the baseline transition intensity matrix. This will take a long time.
  q.list <- boot.msm(psor.msm, function(x) x$Qmatrices$baseline)

  ## Stack the replicates into a 4 x 4 x (number of replicates) array.
  ## The third dimension is taken from the list rather than hard-coded,
  ## so that every returned replicate is used.
  q.array <- array(unlist(q.list), dim = c(4, 4, length(q.list)))

  ## Manipulate the resulting array to calculate bootstrap standard errors.
  apply(q.array, c(1, 2), sd)
  ## Similarly calculate a bootstrap 95% confidence interval
  apply(q.array, c(1, 2), function(x) quantile(x, c(0.025, 0.975)))

  ## Bootstrap standard errors are larger than the asymptotic standard
  ## errors calculated from the Hessian
  psor.msm$QmatricesSE$baseline
}
```