
Statistical procedures for variable selection have become integral elements in any analysis. Successful procedures combine high predictive accuracy with interpretable models and computational efficiency. Penalized methods that perform coefficient shrinkage have been shown to be successful in many cases. Models with correlated predictors are particularly challenging to tackle. We propose a penalization procedure that performs variable selection while clustering groups of predictors automatically. The oracle properties of this procedure, including consistency in group identification, are also studied. The proposed method compares favorably with existing selection approaches in both prediction accuracy and model discovery, while retaining its computational efficiency. Supplementary materials are available online.

ACKNOWLEDGMENTS

The authors are grateful to the editor, an associate editor, and two anonymous referees for their valuable comments. Sharma's research was supported in part by the NIH (National Institutes of Health) grant P01-CA-134294. Bondell's research was supported in part by the NSF (National Science Foundation) grant DMS-1005612 and NIH grants R01-MH-084022 and P01-CA-142538. Zhang's research was supported in part by the NSF grant DMS-0645293 and NIH grants R01-CA-085848 and P01-CA-142538.