Repository: Freie Universität Berlin, Math Department

EMT network-based feature selection improves prognosis prediction in lung adenocarcinoma

Shao, Borong and Bjaanæs, Maria and Helland, Åslaug and Schütte, Ch. and Conrad, T. O. F. (2019) EMT network-based feature selection improves prognosis prediction in lung adenocarcinoma. PLoS ONE, 14 (1). ISSN 1932-6203

Full text not available from this repository.


Various feature selection algorithms have been proposed to identify cancer prognostic biomarkers. In recent years, however, their reproducibility has been criticized. The performance of feature selection algorithms has been shown to depend on the datasets, the underlying networks, and the evaluation metrics. One cause is the curse of dimensionality, which makes it hard to select features that generalize well to independent data. Even the integration of biological networks does not mitigate this issue, because the networks are large and many of their components are not relevant to the phenotype of interest. With the availability of multi-omics data, integrative approaches are being developed to build more robust predictive models. In this scenario, the higher data dimensionality creates even greater challenges. We proposed a phenotype-relevant network-based feature selection (PRNFS) framework and demonstrated its advantages in lung cancer prognosis prediction. We constructed cancer-prognosis-relevant networks based on epithelial-mesenchymal transition (EMT) and integrated them with different types of omics data for feature selection. Using less than 2.5% of the total dimensionality, we obtained EMT prognostic signatures that achieved strong prediction performance (average AUC values above 0.8), highly significant sample stratifications, and meaningful biological interpretations. In addition to finding EMT signatures at different omics data levels, we combined these single-omics signatures into multi-omics signatures, which significantly improved sample stratification. Both single- and multi-omics EMT signatures were tested on independent multi-omics lung cancer datasets, and significant sample stratifications were obtained.
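
The general idea of restricting the feature space to a phenotype-relevant network before scoring features can be illustrated with a minimal sketch. This is not the PRNFS implementation described in the article. The function names (`restrict_to_network`, `auc`, `rank_features_by_auc`), the placeholder gene identifiers, and the toy values are all assumptions made for illustration, and univariate AUC ranking is used here only as a simple stand-in for a real feature selection step.

```python
from typing import Dict, List, Sequence, Set, Tuple


def restrict_to_network(features: Dict[str, List[float]],
                        network_genes: Set[str]) -> Dict[str, List[float]]:
    """Keep only the features (genes) that are nodes of the given network."""
    return {gene: values for gene, values in features.items()
            if gene in network_genes}


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the fraction of (positive, negative) pairs in which the
    positive sample has the higher score; ties count as 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def rank_features_by_auc(features: Dict[str, List[float]],
                         labels: Sequence[int]) -> List[Tuple[str, float]]:
    """Rank features by univariate AUC, ignoring direction
    (an AUC of 0.1 is treated as being as informative as 0.9)."""
    scored = []
    for gene, values in features.items():
        a = auc(values, labels)
        scored.append((gene, max(a, 1.0 - a)))
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy data (hypothetical): three features measured on four samples.
features = {
    "emt_gene_1": [0.1, 0.2, 0.8, 0.9],
    "emt_gene_2": [0.9, 0.7, 0.3, 0.1],
    "other_gene_1": [0.5, 0.4, 0.6, 0.5],
}
labels = [0, 0, 1, 1]                       # hypothetical outcome labels
network = {"emt_gene_1", "emt_gene_2"}      # hypothetical network nodes

restricted = restrict_to_network(features, network)
ranking = rank_features_by_auc(restricted, labels)
print(ranking)
```

In this sketch the network acts purely as a filter that shrinks the candidate feature set before any scoring takes place, which is one simple way to reduce dimensionality using prior biological knowledge.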

Item Type: Article
Subjects: Medicine and Dentistry > Clinical Medicine
Mathematical and Computer Sciences > Statistics > Applied Statistics
Mathematical and Computer Sciences > Artificial Intelligence > Machine Learning
Divisions: Department of Mathematics and Computer Science > Institute of Mathematics
Department of Mathematics and Computer Science > Institute of Mathematics > Comp. Proteomics Group
ID Code: 2251
Deposited By: Admin Administrator
Deposited On: 17 May 2018 19:45
Last Modified: 01 Feb 2019 07:54
