March 2008
Computation in Bayesian statistical models is often performed using sampling techniques such as Markov chain Monte Carlo (MCMC) or adaptive Monte Carlo methods. The convergence of the sampler to the posterior distribution is typically assessed using a set of standard diagnostics; recent draft Food and Drug Administration guidelines for the use of Bayesian statistics in medical device trials, for instance, advocate this approach for validating computations.
We give several examples showing that this approach may be insufficient when the posterior distribution is multimodal: lack of convergence due to posterior multimodality can go undetected by the standard convergence diagnostics, including the Gelman-Rubin diagnostic, which was introduced for exactly this problem. We show that the poor convergence can be detected by modifying a validation technique originally proposed for detecting coding errors in MCMC software (Cook, Gelman and Rubin 2006). The modified validation method can succeed where convergence diagnostics fail because it evaluates the convergence of the sampling algorithm for many data sets drawn from the model, rather than only for the particular data set under consideration.
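The general idea behind simulation-based validation can be sketched as follows. This is a minimal illustration under stated assumptions, not the modified procedure developed in the manuscript: it uses a hypothetical conjugate normal model (prior N(0, 1), unit-variance observations), and the names `posterior_quantiles`, `exact_sampler`, `stuck_sampler` and `extreme_fraction` are invented for this example. For each replication, a parameter is drawn from the prior, a data set is drawn from the model, the sampler is run, and the quantile of the true parameter among the sampler's draws is recorded. If the sampler targets the correct posterior, these quantiles should be approximately uniform on [0, 1].

```python
import numpy as np

def posterior_quantiles(sampler, n_replications=200, n_draws=500, n_obs=10, seed=0):
    """Quantile of the true parameter among sampler draws, over many simulated data sets."""
    rng = np.random.default_rng(seed)
    quantiles = np.empty(n_replications)
    for r in range(n_replications):
        theta_true = rng.normal(0.0, 1.0)              # parameter drawn from the assumed prior N(0, 1)
        y = rng.normal(theta_true, 1.0, size=n_obs)    # data drawn from the assumed model N(theta, 1)
        draws = sampler(y, n_draws, rng)               # sampler under test
        quantiles[r] = np.mean(draws < theta_true)     # empirical posterior quantile of the truth
    return quantiles

def exact_sampler(y, n_draws, rng):
    """Stand-in for a correct sampler: exact draws from the conjugate posterior."""
    post_var = 1.0 / (1.0 + y.size)
    post_mean = post_var * y.sum()
    return rng.normal(post_mean, np.sqrt(post_var), size=n_draws)

def stuck_sampler(y, n_draws, rng):
    """Deliberately wrong sampler: ignores the data and stays in a narrow region near 0."""
    return rng.normal(0.0, 0.1, size=n_draws)

def extreme_fraction(quantiles, eps=0.01):
    """Share of quantiles piled up at the edges; large values indicate non-uniformity."""
    return np.mean((quantiles < eps) | (quantiles > 1.0 - eps))
```

With a correct sampler, `extreme_fraction(posterior_quantiles(exact_sampler))` should be close to 0.02, whereas the deliberately wrong sampler piles most quantiles at 0 or 1. The point of the design is that the check averages over many data sets drawn from the model, so a sampler that fails only for some data sets can still be exposed.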
We first give the example of a mixture model with a bimodal posterior distribution in which one mode has a much smaller basin of attraction than the other. The narrower mode is extremely difficult to detect, both for a Gibbs sampler and for the Gelman-Rubin diagnostic applied to that sampler. Failing to detect the narrow mode then leads to overestimation of the posterior variance. We show that the same effect can occur for the popular stochastic search variable selection technique (George and McCulloch 1993), leading to incorrect inferences.
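A toy version of this failure can be sketched in a few lines. The example below is an assumption-laden stand-in, not the mixture-model posterior or Gibbs sampler studied in the manuscript: the target is a hypothetical one-dimensional density with a wide mode N(0, 1) and a very narrow mode N(10, 0.01^2) of equal weight, sampled with random-walk Metropolis, and `gelman_rubin` computes the basic potential scale reduction factor without the degrees-of-freedom correction. All function names are invented for illustration.

```python
import numpy as np

def log_target(x):
    """Log density of 0.5*N(0, 1) + 0.5*N(10, 0.01^2), a hypothetical bimodal target."""
    log_norm = -0.5 * np.log(2.0 * np.pi)
    wide = np.log(0.5) + log_norm - 0.5 * x**2
    narrow = np.log(0.5) + log_norm - np.log(0.01) - 0.5 * ((x - 10.0) / 0.01) ** 2
    return np.logaddexp(wide, narrow)

def metropolis(x0, n_steps, step, rng):
    """Random-walk Metropolis chain started at x0."""
    x, lp = x0, log_target(x0)
    out = np.empty(n_steps)
    for i in range(n_steps):
        prop = x + step * rng.normal()
        lp_prop = log_target(prop)
        if np.log(rng.uniform()) < lp_prop - lp:   # accept with probability min(1, ratio)
            x, lp = prop, lp_prop
        out[i] = x
    return out

def gelman_rubin(chains):
    """Basic potential scale reduction factor for an array of shape (n_chains, n_draws)."""
    n = chains.shape[1]
    within = chains.var(axis=1, ddof=1).mean()      # W: mean within-chain variance
    between = n * chains.mean(axis=1).var(ddof=1)   # B: between-chain variance
    var_hat = (n - 1) / n * within + between / n
    return np.sqrt(var_hat / within)

def run_demo(seed=0, n_steps=4000, step=1.0):
    """Run four chains from starts spread around the wide mode; report R-hat and narrow-mode visits."""
    rng = np.random.default_rng(seed)
    starts = [-3.0, -1.0, 1.0, 3.0]
    chains = np.array([metropolis(s, n_steps, step, rng) for s in starts])
    kept = chains[:, n_steps // 2:]                 # discard the first half as burn-in
    near_narrow = np.mean(np.abs(kept - 10.0) < 0.1)
    return gelman_rubin(kept), near_narrow
```

In this sketch the chains agree closely with one another, so `run_demo()` should return an R-hat very close to 1 while essentially none of the retained draws fall near the narrow mode, even though that mode carries half of the target's mass. The diagnostic compares chains with each other, so it cannot flag a region that every chain has missed.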
We then argue that the modified validation technique should be widely applied when using sampling methods such as MCMC for computation in models where posterior unimodality is not guaranteed, including stochastic search variable selection. Such checks would be most important in contexts where accuracy is imperative, such as medical device trials.
Keywords: convergence diagnostic, Markov chain convergence, Markov chain validation, mixture model, stochastic search variable selection.
The manuscript is available in PDF format.