Accounting for Model Uncertainty in Prediction of Chlorophyll a in Lake Okeechobee

E. Conrad Lamon III and Merlise A. Clyde

Long-term eutrophication data, along with water quality measurements (total phosphorus and total nitrogen) and other physical environmental variables such as lake level (stage), water temperature, and wind speed and direction, were used to develop a model to predict chlorophyll $a$ concentrations in Lake Okeechobee. The model included each of the potential explanatory variables as linear predictors, regression spline predictors, or product spline interactions to allow for nonlinear relationships. A Gibbs sampler was used to traverse the model space, and Bayesian model averaging (BMA) over the sampled models was used to produce predictions that incorporate uncertainty about which variables, and in which functional forms, should enter the model. Non-parametric regression with Bayesian model averaging and spline interactions provides a flexible framework for addressing the problems of nonlinearity and counterintuitive total phosphorus function estimates identified in previous statistical models. The use of regression splines allows nonlinear effects to be manifest, while their extension to product splines allows inclusion of interactions whose mathematical form cannot be specified a priori. Prediction intervals for the BMA predictions provided better coverage for new observations than those calculated for single backward-selected ordinary least squares models or generalized additive models.
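To illustrate the general idea of averaging predictions over candidate models, the following is a minimal sketch and not the authors' method. It assumes a small set of linear predictors so that all subsets can be enumerated (instead of sampled with a Gibbs sampler), equal prior model probabilities, and the common BIC approximation to posterior model probabilities. It uses no spline terms. The function name `bma_predict` and the synthetic data are hypothetical and are not taken from the Lake Okeechobee study.

```python
import itertools
import numpy as np

def bma_predict(X, y, X_new):
    """Average OLS predictions over all predictor subsets.

    Weights are proportional to exp(-BIC/2), an approximation to the
    posterior model probabilities under equal prior model probabilities.
    """
    n, p = X.shape
    preds, bics = [], []
    for k in range(p + 1):
        for subset in itertools.combinations(range(p), k):
            cols = list(subset)
            # Design matrices: intercept plus the chosen columns.
            A = np.column_stack([np.ones(n), X[:, cols]])
            A_new = np.column_stack([np.ones(len(X_new)), X_new[:, cols]])
            beta, *_ = np.linalg.lstsq(A, y, rcond=None)
            rss = float(np.sum((y - A @ beta) ** 2))
            # Gaussian-likelihood BIC, up to an additive constant.
            bics.append(n * np.log(rss / n) + A.shape[1] * np.log(n))
            preds.append(A_new @ beta)
    bics = np.array(bics)
    weights = np.exp(-0.5 * (bics - bics.min()))  # shift for numerical stability
    weights /= weights.sum()
    return weights, weights @ np.array(preds)

# Synthetic data for illustration only (not real lake measurements).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = 1.0 + 2.0 * X[:, 0] + rng.normal(scale=0.5, size=100)
weights, y_hat = bma_predict(X, y, X[:5])
```

Here `weights` holds one approximate posterior probability per candidate model and `y_hat` is the model-averaged point prediction. Prediction intervals of the kind compared in the abstract would additionally require mixing the per-model predictive distributions, which this sketch omits.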
