INTEGRATED MODELLING OF CLINICAL AND GENE EXPRESSION INFORMATION FOR PERSONALIZED PREDICTION OF DISEASE OUTCOMES

Jennifer Pittman, Erich Huang, Holly Dressman, Cheng-Fang Horng, Skye H Cheng,
Mei-Hua Tsou, Chii-Ming Chen, Andrea Bild, Edwin S Iversen, Ming Liao,
Andrew T Huang, Joseph R Nevins and Mike West

Duke University

Original Version: February 2003
Published in Proceedings of the National Academy of Sciences USA (PNAS), May 2004

We describe a comprehensive modeling approach to combining genomic and clinical data for prediction of disease outcomes in individual patients. Statistical analysis, using predictive classification tree models, evaluates the contributions of multiple forms of data, both clinical and genomic; the latter makes use of metagenes, gene expression signatures derived from microarray analyses. In a breast cancer recurrence study, we demonstrate that multiple metagenes are far more powerful in predicting outcomes than any single metagene. Further, combining metagenes with clinical risk factors proves most accurate at the individual patient level. This framework for combining multiple forms of data provides a platform for development of models for personalized prognosis.
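
As a rough illustration of the idea in the abstract, the sketch below summarises a cluster of genes as a single metagene value per patient and then lets one tree split choose between that metagene and a clinical risk factor. It is a minimal sketch under stated assumptions, not the procedure used in the paper: the metagene here is simply the average of z-scored genes, the split criterion is Gini impurity, and the function names, variable names and toy data are all hypothetical.

```python
from statistics import mean, pstdev

def metagene(expr_rows):
    """Summarise a gene cluster (genes x patients) as one value per patient.

    Assumption: average of z-scored genes. This is a stand-in for
    illustration only, not the paper's metagene construction.
    """
    z_rows = []
    for row in expr_rows:
        mu, sd = mean(row), pstdev(row)
        z_rows.append([(x - mu) / sd if sd > 0 else 0.0 for x in row])
    n_patients = len(expr_rows[0])
    return [mean(z[j] for z in z_rows) for j in range(n_patients)]

def gini(labels):
    """Gini impurity of a list of 0/1 outcomes."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)

def best_split(predictors, outcome):
    """Return (name, threshold, weighted impurity) of the best single split.

    predictors maps a predictor name (metagene or clinical factor) to one
    value per patient; both kinds of predictor compete on equal terms.
    """
    n = len(outcome)
    best = None
    for name, values in predictors.items():
        for t in sorted(set(values))[:-1]:
            left = [y for v, y in zip(values, outcome) if v <= t]
            right = [y for v, y in zip(values, outcome) if v > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[2]:
                best = (name, t, score)
    return best

# Hypothetical toy data: two genes in one cluster, six patients.
cluster = [
    [1.0, 1.2, 0.9, 3.0, 3.2, 2.9],
    [0.5, 0.4, 0.6, 2.0, 2.1, 1.9],
]
lymph_nodes = [0, 5, 1, 2, 8, 0]   # hypothetical clinical risk factor
recurrence = [0, 0, 0, 1, 1, 1]    # hypothetical 0/1 outcome

mg = metagene(cluster)
best = best_split({"lymph_nodes": lymph_nodes, "metagene_1": mg}, recurrence)
print(best[0], round(best[2], 3))
```

In this toy data the metagene separates the two outcome groups and is chosen for the split. A fuller model would grow further splits within each branch, so that metagenes and clinical factors can both contribute to a patient's prediction.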


The published manuscript and supplementary tables and figures are available at the PNAS web site.

Supporting materials and data are available here, and here is the full list of 498 metagenes underlying the reported analysis.