INTEGRATED MODELLING OF CLINICAL AND GENE EXPRESSION INFORMATION FOR
PERSONALIZED PREDICTION OF DISEASE OUTCOMES
Jennifer Pittman, Erich Huang, Holly Dressman, Cheng-Fang Horng, Skye H Cheng,
Mei-Hua Tsou, Chii-Ming Chen, Andrea Bild, Edwin S Iversen, Ming Liao,
Andrew T Huang, Joseph R Nevins and Mike West
Duke University
Original Version: February 2003
Published in Proceedings of the National Academy of Sciences USA (PNAS), May 2004
We describe a comprehensive modeling approach to combining genomic and clinical
data for prediction of disease outcomes in individual patients. Statistical analysis, using
predictive classification tree models, evaluates the contributions of multiple forms of
data, both clinical and genomic; the latter makes use of metagenes, gene expression
signatures derived from microarray analyses. In a breast cancer recurrence study, we
demonstrate that multiple metagenes are far more powerful in predicting outcomes than
any single metagene. Further, combining metagenes with clinical risk factors proves
most accurate at the individual patient level. This framework for combining multiple
forms of data provides a platform for development of models for personalized prognosis.
The published manuscript and supplementary tables and figures are available at
the PNAS web site.
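
The abstract does not say how metagenes are computed or how the tree models are fit, so the following is only a minimal illustrative sketch, not the authors' method. It assumes a metagene can be approximated by the average of standardized expression values across a cluster of genes, and it shows a single tree split chosen from a pool that mixes metagenes with a clinical risk factor. The function names (metagene, best_split), the predictor names, and the toy numbers are all hypothetical.

```python
from statistics import mean, pstdev


def metagene(expression_rows):
    """Summarize a cluster of genes as one score per patient.

    expression_rows: list of per-gene lists, one value per patient.
    Assumption (not from the paper): the summary is the mean of
    z-scored expression values across the genes in the cluster.
    """
    standardized = []
    for gene in expression_rows:
        mu, sd = mean(gene), pstdev(gene)
        # Genes with no variation carry no signal; map them to zero.
        standardized.append([(x - mu) / sd if sd > 0 else 0.0 for x in gene])
    n_patients = len(expression_rows[0])
    return [mean(g[i] for g in standardized) for i in range(n_patients)]


def best_split(predictors, outcomes):
    """Pick the (predictor, threshold) split with the fewest errors.

    predictors: dict mapping a predictor name to per-patient values;
                clinical and genomic predictors share the same pool.
    outcomes:   list of 0/1 labels (for example, 1 = recurrence).
    Returns (errors, name, threshold), or None if no split is possible.
    """
    best = None
    for name, values in predictors.items():
        # Drop the largest value so the right-hand branch is never empty.
        for threshold in sorted(set(values))[:-1]:
            left = [y for v, y in zip(values, outcomes) if v <= threshold]
            right = [y for v, y in zip(values, outcomes) if v > threshold]
            # Each branch predicts its majority class.
            errors = (min(sum(left), len(left) - sum(left))
                      + min(sum(right), len(right) - sum(right)))
            if best is None or errors < best[0]:
                best = (errors, name, threshold)
    return best


# Toy, made-up values for six patients (illustration only).
toy_predictors = {
    "metagene_1": metagene([[1, 2, 3, 4, 5, 6], [2, 4, 6, 8, 10, 12]]),
    "clinical_risk_factor": [0, 1, 0, 1, 0, 1],
}
toy_outcomes = [0, 0, 0, 1, 1, 1]
print(best_split(toy_predictors, toy_outcomes))
```

A full tree would apply best_split recursively within each branch, which is how a later split on a clinical factor can refine a group first defined by a metagene. The misclassification count used here is a stand-in for whatever split criterion the published analysis actually uses.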