Pages 549-569
Received 01 Aug 2013
Accepted author version posted online: 18 Apr 2015
Published online: 10 May 2016
 

High-dimensional low sample size (HDLSS) data are becoming increasingly common in statistical applications. When the data can be partitioned into two classes, a basic task is to construct a classifier that can assign objects to the correct class. Binary linear classifiers have been shown to be especially useful in HDLSS settings and preferable to more complicated classifiers because they are easy to interpret. We propose a computational tool called direction-projection-permutation (DiProPerm), which rigorously assesses whether a binary linear classifier is detecting statistically significant differences between two high-dimensional distributions. The basic idea behind DiProPerm is to work directly with the one-dimensional projections of the data induced by a binary linear classifier. Theoretical properties of DiProPerm are studied under the HDLSS asymptotic regime, in which the dimension diverges to infinity while the sample size remains fixed. We show that certain variations of DiProPerm are consistent and that consistency is a nontrivial property of tests in the HDLSS asymptotic regime. The practical utility of DiProPerm is demonstrated on HDLSS gene expression microarray datasets. Finally, an empirical power study compares DiProPerm with several alternative two-sample HDLSS tests to clarify the advantages and disadvantages of each method.
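The three steps named in DiProPerm (direction, projection, permutation) can be illustrated with a minimal Python sketch. It is not the authors' implementation. It rests on several assumptions that the abstract does not specify: the linear classifier is taken to be the simple mean-difference direction, the univariate statistic is the absolute difference of projected class means, the direction is refit for every relabeling, and the function name `diproperm_sketch` and its parameters are hypothetical.

```python
import numpy as np


def diproperm_sketch(X, y, n_perm=1000, seed=0):
    """Illustrative direction-projection-permutation test.

    X : (n, d) data matrix, rows are observations.
    y : length-n array of binary class labels (truthy = class 1).
    Returns the observed statistic, a permutation p-value, and the
    permutation null sample.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y).astype(bool)

    def statistic(labels):
        # Direction: normalized mean-difference vector (assumed classifier).
        w = X[labels].mean(axis=0) - X[~labels].mean(axis=0)
        norm = np.linalg.norm(w)
        if norm == 0.0:
            return 0.0
        w /= norm
        # Projection: one-dimensional scores along the direction.
        scores = X @ w
        # Univariate two-sample statistic on the projected scores (assumed).
        return abs(scores[labels].mean() - scores[~labels].mean())

    observed = statistic(y)

    # Permutation: shuffle the labels, refit the direction, recompute.
    null = np.array([statistic(rng.permutation(y)) for _ in range(n_perm)])

    # Add-one correction keeps the p-value strictly positive.
    p_value = (1 + np.sum(null >= observed)) / (1 + n_perm)
    return observed, p_value, null


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    n, d = 10, 200  # small sample size, high dimension
    X = np.vstack([rng.normal(0.0, 1.0, (n, d)),
                   rng.normal(1.0, 1.0, (n, d))])
    y = np.array([0] * n + [1] * n)
    stat, p, _ = diproperm_sketch(X, y, n_perm=200)
    print(f"statistic={stat:.3f}, p-value={p:.4f}")
```

Refitting the direction inside each permutation matters in this sketch: in high dimensions almost any labeling can be separated along some direction, so the null distribution has to reflect the separation that arises from fitting alone. Other linear classifiers or projected-score statistics could be substituted in `statistic` without changing the overall structure.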

ACKNOWLEDGMENTS

The work presented in this article was supported in part by the NSF Graduate Fellowship and NIH grant T32 GM067553-05S1.

Additional information

Notes on contributors

Susan Wei, Department of Statistics and Operations Research, University of North Carolina - Chapel Hill, NC 27599-3260 (E-mail: susanwe@live.unc.edu).

Chihoon Lee, Assistant Professor, Department of Statistics and Operations Research, University of North Carolina - Chapel Hill, NC 27599-3260 and currently at Department of Statistics, Colorado State University, Fort Collins, CO 80523-1877 (E-mail: chihoon@stat.colostate.edu).

Lindsay Wichers, Department of Environmental Sciences and Engineering, School of Public Health, University of North Carolina - Chapel Hill, NC 27599-3260 and currently at Environmental Media Assessment Group, MD B243-01, National Center for Environmental Assessment, Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, NC 27711 (E-mail: wichers.lindsay@epa.gov).

J. S. Marron, Department of Statistics and Operations Research, University of North Carolina - Chapel Hill, NC 27599-3260 (E-mail: marron@email.unc.edu).

