MODEL SELECTION FOR NEURAL NETWORK CLASSIFICATION

Herbert Lee
Duke University

June 2000

Out-of-sample classification rates can often be improved by applying model selection when fitting a model to the training data. Using correlated predictors or fitting a model of too high a dimension can lead to overfitting, which in turn leads to poor out-of-sample performance. I discuss a methodology based on the Bayesian Information Criterion (BIC) of Schwarz (1978) that can search over large model spaces and find appropriate models that reduce the danger of overfitting. The methodology can be interpreted either as a frequentist method with a Bayesian inspiration or as a Bayesian method based on noninformative priors.
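As a rough illustration of BIC-based selection, the sketch below scores every subset of candidate predictors and keeps the one with the lowest BIC, using the common form BIC = -2 log L + k log n (k fitted parameters, n observations). Several things here are assumptions made for brevity and are not taken from the manuscript: a logistic regression stands in for the neural network classifier, the model space is enumerated exhaustively rather than searched, and the names fit_logistic, bic, and best_subset_by_bic are hypothetical.

```python
import itertools
import numpy as np


def fit_logistic(X, y, n_iter=50, ridge=1e-8):
    """Fit a logistic regression by Newton's method.

    Stand-in classifier (assumption): the manuscript concerns neural networks.
    Returns (coefficients, maximized log-likelihood).
    """
    n, p = X.shape
    beta = np.zeros(p)
    for _ in range(n_iter):
        eta = X @ beta
        mu = 1.0 / (1.0 + np.exp(-eta))
        w = mu * (1.0 - mu)
        grad = X.T @ (y - mu)
        # Tiny ridge term only keeps the linear solve numerically stable.
        hess = X.T @ (X * w[:, None]) + ridge * np.eye(p)
        step = np.linalg.solve(hess, grad)
        beta = beta + step
        if np.max(np.abs(step)) < 1e-10:
            break
    eta = X @ beta
    log_lik = np.sum(y * eta - np.logaddexp(0.0, eta))
    return beta, log_lik


def bic(log_lik, n_params, n_obs):
    """BIC in the form -2 log L + k log n; lower is better."""
    return -2.0 * log_lik + n_params * np.log(n_obs)


def best_subset_by_bic(X, y):
    """Score every predictor subset (with an intercept) and return the best.

    Exhaustive enumeration (assumption): feasible only for a few predictors.
    Returns (lowest BIC, tuple of selected column indices).
    """
    n, p = X.shape
    intercept = np.ones((n, 1))
    best = None
    for size in range(p + 1):
        for subset in itertools.combinations(range(p), size):
            design = np.hstack([intercept, X[:, list(subset)]])
            _, log_lik = fit_logistic(design, y)
            score = bic(log_lik, design.shape[1], n)
            if best is None or score < best[0]:
                best = (score, subset)
    return best


if __name__ == "__main__":
    # Simulated data (assumption): column 0 is informative, column 1 is a
    # near-copy of it (a correlated predictor), and column 2 is pure noise.
    rng = np.random.default_rng(0)
    n = 400
    x0 = rng.normal(size=n)
    x1 = x0 + 0.1 * rng.normal(size=n)
    x2 = rng.normal(size=n)
    X = np.column_stack([x0, x1, x2])
    prob = 1.0 / (1.0 + np.exp(-1.5 * x0))
    y = (rng.uniform(size=n) < prob).astype(float)
    score, subset = best_subset_by_bic(X, y)
    print("selected columns:", subset, "BIC:", round(score, 2))
```

The log n penalty grows with the sample size, so each added predictor has to improve the log-likelihood by more than (log n)/2 to be kept. This is what discourages redundant, correlated predictors and overly large models. With many candidate predictors, enumeration becomes infeasible and some form of search over the model space is needed instead.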

Key Words: Model Averaging, Bayesian Random Searching


The manuscript is available in PostScript and PDF formats.