Classification and subtype prediction of soft tissue sarcoma by functional genomics and support vector machine analysis

Neil H. Segal, Paul Pavlidis, Cristina R. Antonescu, Robert G. Maki, William Stafford Noble, James M. Woodruff, Jonathan J. Lewis, Murray F. Brennan, Alan N. Houghton and Carlos Cordon-Cardo

American Journal of Pathology. 163:691-700, 2003.


Adult soft tissue sarcomas are a heterogeneous group of tumors. They include subtypes that are well described by histologic and genotypic criteria, as well as pleomorphic tumors typically characterized by non-recurrent genetic aberrations and karyotypic heterogeneity. The latter pose a diagnostic challenge, even to experienced pathologists. We proposed that gene expression profiling in soft tissue sarcoma would identify a genomic-based classification scheme that is useful in diagnosis. RNA samples from 51 pathologically confirmed cases, representing nine different histologic subtypes of adult soft tissue sarcoma, were examined using the Affymetrix U95A GeneChip. Statistical tests were performed on experimental groups identified by cluster analysis to find discriminating genes that could subsequently be applied in a support vector machine algorithm. Synovial sarcomas, round cell/myxoid liposarcomas, clear cell sarcomas and gastrointestinal stromal tumors displayed remarkably distinct and homogeneous gene expression profiles. Pleomorphic tumors were heterogeneous. Notably, a subset of malignant fibrous histiocytoma, a controversial histologic subtype, was identified as a distinct genomic group. The support vector machine algorithm provided a genomic basis for diagnosis, with both high sensitivity and high specificity. In conclusion, we showed that gene expression profiling is useful in classification and diagnosis, provides insights into pathogenesis and points to potential new therapeutic targets in soft tissue sarcoma.
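As a rough illustration of two steps mentioned in the abstract, the sketch below ranks genes by how well they separate one subtype from the rest, then scores a set of predicted labels by sensitivity and specificity. The abstract does not say which statistical tests were used, so Welch's t-statistic is an assumption here. The gene names, subtype labels and expression values are hypothetical toy data, and the support vector machine training step itself is not shown.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Welch's t-statistic for two lists of expression values.

    Assumption: each group has at least two samples and the groups are
    not both constant (otherwise the standard error would be zero).
    """
    standard_error = sqrt(
        variance(group_a) / len(group_a) + variance(group_b) / len(group_b)
    )
    return (mean(group_a) - mean(group_b)) / standard_error


def rank_genes(expression, labels, target):
    """Rank genes by |t| between `target` samples and all other samples.

    expression: dict mapping gene name -> list of values, one per sample.
    labels: list of subtype labels, one per sample, in the same order.
    """
    scores = {}
    for gene, values in expression.items():
        inside = [v for v, lab in zip(values, labels) if lab == target]
        outside = [v for v, lab in zip(values, labels) if lab != target]
        scores[gene] = welch_t(inside, outside)
    # Most discriminating genes first, regardless of direction of change.
    return sorted(scores, key=lambda g: abs(scores[g]), reverse=True)


def sensitivity_specificity(true_labels, predicted_labels, target):
    """One-vs-rest sensitivity and specificity for the `target` subtype."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == target and p == target)
    fn = sum(1 for t, p in pairs if t == target and p != target)
    tn = sum(1 for t, p in pairs if t != target and p != target)
    fp = sum(1 for t, p in pairs if t != target and p == target)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy data: three "SS" samples followed by three "GIST" samples.
labels = ["SS", "SS", "SS", "GIST", "GIST", "GIST"]
expression = {
    "GENE_A": [9.1, 8.7, 9.4, 2.0, 2.3, 1.8],  # high in SS
    "GENE_B": [5.0, 5.2, 4.9, 5.1, 4.8, 5.3],  # uninformative
    "GENE_C": [3.0, 3.4, 2.9, 6.1, 5.8, 6.4],  # low in SS
}
ranked = rank_genes(expression, labels, "SS")

# Hypothetical classifier output with one SS sample misclassified.
predicted = ["SS", "SS", "GIST", "GIST", "GIST", "GIST"]
sens, spec = sensitivity_specificity(labels, predicted, "SS")
```

In a fuller pipeline, the top-ranked genes would be the input features for the classifier, and the sensitivity and specificity would be computed on held-out samples rather than on the training set.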