Limiting long-term illness. Ecological inference with one binary and one continuous exposure, and combined aggregate and individual data.

Model for limiting long-term illness among men aged from 45 to 59 years in London, in terms of individual ethnicity, household income and area deprivation. Aggregate data are taken from the UK census small area statistics, and include ngr = 255 wards in London. Individual data are taken from the Health Survey for England, including 412 individuals in London.

model {
## ecological data
for (i in 1:ngr) {

   ## The number of individuals per ward reporting limiting long term illness
   ## is modelled as binomial, with risk py[i] constant over the area.
   
N.llti[i] ~ dbin(py[i], N.pop[i])

   ## This risk is obtained by integrating the individual level risk, firstly over the binary exposure, non-white
   ## ethnicity

py[i] <- p.nonwhite[i] * pnw[i] + ( 1 - p.nonwhite[i] ) * pw[i]

   ## and secondly over the continuous variable, mean log income. Details of this calculation are given in
   ## Jackson et al., (2005)
Improving ecological inference using individual-level data.
logit(pw[i]) <- ( mu.base[i] + a.carstairs*carstairs[i] + a.lincome*mean.lincome[i] ) / sqrt(1 + pow(c*a.lincome, 2) / tau.x[i])
logit(pnw[i]) <- ( mu.base[i] + a.carstairs*carstairs[i] + a.nonwhite + a.lincome*mean.lincome[i] ) / sqrt(1 + pow(c*a.lincome, 2) / tau.x[i])
tau.x[i] <- 1 / pow(sd.lincome[i], 2)

   ## The risk includes a random area-specific effect mu.base[i], to account for heterogeneity between areas
   ## due to unobserved factors.
mu.base[i] ~ dnorm(base.mu, base.tau)
}

   ## Model for limiting long term illness in the individual-level data is just a logistic regression with the same
   ## coefficients
for (i in 1:n) {
llti[i] ~ dbern(pzy[i])
logit(pzy[i]) <- mu.base[ward[i]] + mu.scale +
a.carstairs*carstairs[ward[i]] + a.nonwhite * nonwhite[i] + a.lincome * lincome[i]
}

c <- 16*sqrt(3) / (15 * pi)
pi <- 3.141592654

   ## Log odds ratios for area level Carstairs deprivation index, non-white ethnicity and log income
a.carstairs ~ dnorm(0, 0.1)
a.nonwhite ~ dnorm(0, 1.48) # 95% prior belief that OR(a.nonwhite) is between 1/5 and 5
a.lincome ~ dnorm(0, 1.48)

   ## Odds ratios for area level Carstairs deprivation index, non-white ethnicity and log income
or.nonwhite <- exp(a.nonwhite)
or.lincome <- exp(a.lincome)
or.carstairs <- exp(a.carstairs)

base.mu ~ dlogis(0, 1)
base.tau ~ dgamma(1, 0.01)
base.sig <- 1 / sqrt(base.tau)
p.base <- exp(base.mu) / (1 + exp(base.mu))
mu.scale <- 0.7
# fixed difference in prevalence (logit scale) between HSE and census
}


Data: Click on the arrows to open the data

Inits for two parallel chains:
list(base.mu=0, base.tau=1, a.nonwhite=0, a.lincome=0, a.carstairs=0)
list(base.mu=1, base.tau=5, a.nonwhite=-2, a.lincome=2, a.carstairs=-1)

Results...
After 30000 iterations from two parallel chains, with the first 4000 from each discarded, we obtain the following
summary statistics:
   node   mean   sd   MC error   2.5%   median   97.5%   start   sample
   base.mu   -2.108   0.03027   0.001257   -2.171   -2.107   -2.052   4001   52000
   base.sig   0.1843   0.01172   1.438E-4   0.1622   0.184   0.2082   4001   52000
   or.carstairs   1.071   0.004904   1.836E-4   1.062   1.072   1.081   4001   52000
   or.lincome   0.5617   0.04406   0.00158   0.4785   0.5608   0.6513   4001   52000
   or.nonwhite   1.432   0.1649   0.007305   1.133   1.424   1.763   4001   52000