ABSTRACT

Recent advances in high-throughput biotechnologies have provided an unprecedented opportunity for biomarker discovery, which, from a statistical point of view, can be cast as a variable selection problem. This problem is challenging because omics data are high-dimensional and nonlinear, and, in general, it poses three difficulties: (i) the functional form of the nonlinear system is unknown, (ii) variable selection consistency must be achieved, and (iii) the computation is highly demanding. To address the first difficulty, we employ a feed-forward neural network to approximate the unknown nonlinear function, motivated by its universal approximation ability. To address the second difficulty, we conduct structure selection for the neural network, which induces variable selection, by choosing appropriate prior distributions that lead to the consistency of variable selection. To address the third difficulty, we implement the population stochastic approximation Monte Carlo algorithm, a parallel adaptive Markov chain Monte Carlo algorithm, on the OpenMP platform, which provides a speedup for the simulation that is linear in the number of cores of the computer. The numerical results indicate that the proposed method works well for identifying relevant variables in high-dimensional nonlinear systems. The proposed method is successfully applied to identifying genes associated with anticancer drug sensitivities, based on the data collected in the Cancer Cell Line Encyclopedia study. Supplementary materials for this article are available online.
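
To illustrate how structure selection for a network can induce variable selection, the following minimal sketch may help. It is not the authors' implementation: it assumes a single hidden layer with tanh activations and no shortcut connections, and all names (`forward`, `selected_variables`, `gamma_in`, `gamma_out`) are hypothetical. Each connection carries a 0/1 indicator, and an input variable counts as selected only if at least one active path links it to the output.

```python
import math

def forward(x, w_in, gamma_in, w_out, gamma_out):
    """Output of a one-hidden-layer tanh network (assumed architecture).

    w_in[j][k] is the weight from input j to hidden unit k, and
    gamma_in[j][k] is its 0/1 indicator; w_out[k] and gamma_out[k]
    play the same roles for the hidden-to-output connections.
    """
    hidden = []
    for k in range(len(w_out)):
        s = sum(gamma_in[j][k] * w_in[j][k] * x[j] for j in range(len(x)))
        hidden.append(math.tanh(s))
    return sum(gamma_out[k] * w_out[k] * hidden[k] for k in range(len(w_out)))

def selected_variables(gamma_in, gamma_out):
    """Indices of inputs with at least one active path to the output."""
    return [
        j
        for j, row in enumerate(gamma_in)
        if any(g_in and g_out for g_in, g_out in zip(row, gamma_out))
    ]

# Hypothetical structure: 3 inputs, 2 hidden units.
w_in = [[0.5, -1.0], [2.0, 0.3], [0.7, 1.5]]
gamma_in = [[1, 0], [0, 0], [0, 1]]
w_out = [1.2, -0.8]
gamma_out = [1, 0]

chosen = selected_variables(gamma_in, gamma_out)
```

In this toy structure, input 2 is connected to hidden unit 1, but that unit's link to the output is switched off, so only input 0 is selected; changing inputs 1 or 2 leaves the network output unchanged.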

Acknowledgments

The authors thank G. Miao and V. Sundaresan for their help with data preprocessing, and thank the Editor, Associate Editor, and two referees for their constructive comments, which have led to significant improvements of this article.

Supplementary Material

Supplement description: (i) Proofs of Theorem 1 and Lemma 1, and (ii) the Pop-SAMC algorithm used for simulating from the posterior distribution of the Bayesian neural network.

Package

The proposed BNN method has been implemented as an R package, which will be made publicly available on CRAN upon acceptance of the article.

Additional information

Funding

Liang’s research was supported in part by the NSF grants DMS-1612924 and DMS/NIGMS R01-GM117597.
