Abstract

Proper data transformation is an essential part of analysis. Choosing appropriate transformations for variables can enhance visualization, improve the efficacy of analytical methods, and increase data interpretability. However, determining appropriate transformations of variables from high-content imaging data poses new challenges. Imaging data produce hundreds of covariates from each of thousands of images in a corpus. Each of these covariates has a different distribution and potentially needs a different transformation, so determining an appropriate transformation for each of them by hand is infeasible. In this article, we explore simple, robust, and automatic transformations of high-content image data. A central application of our work is to microenvironment microarray bio-imaging data from the NIH LINCS program. We show that our robust transformations enhance visualization and improve the discovery of substantively relevant latent effects. These transformations enhance the analysis of individual image features and also improve data integration approaches that combine multiple features. We anticipate that the advantages of this work will also be realized in the analysis of data from other high-content and highly multiplexed technologies such as Cell Painting or Cyclic Immunofluorescence. Software and further analysis can be found at gjhunt.github.io/rr. Supplementary materials for this article are available online.
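
To make the idea of choosing a transformation separately for each covariate concrete, the following is a minimal sketch and not the method developed in the article. It assumes a small, hypothetical candidate set (identity, square root, logarithm) and picks, for each column, the candidate that minimizes the absolute quartile (Bowley) skewness, a quantile-based measure that is resistant to outliers. The names `CANDIDATES`, `quartile_skewness`, `choose_transformation`, and `transform_features` are illustrative only.

```python
import numpy as np

# Hypothetical candidate set for illustration; the article's transformations may differ.
CANDIDATES = {
    "identity": lambda x: x,
    "sqrt": np.sqrt,
    "log": np.log,
}


def quartile_skewness(x):
    """Bowley's quartile skewness: an asymmetry measure based only on quartiles."""
    q1, q2, q3 = np.percentile(x, [25, 50, 75])
    spread = q3 - q1
    if spread == 0:
        return 0.0  # degenerate column: treat as symmetric
    return (q3 + q1 - 2.0 * q2) / spread


def choose_transformation(x):
    """Return the name of the candidate that makes the column most symmetric."""
    x = np.asarray(x, dtype=float)
    if np.any(x <= 0):
        # log is undefined for non-positive values; this sketch simply falls back.
        return "identity"
    scores = {name: abs(quartile_skewness(f(x))) for name, f in CANDIDATES.items()}
    return min(scores, key=scores.get)


def transform_features(X):
    """Choose and apply a transformation independently for every column of X."""
    X = np.asarray(X, dtype=float)
    names = [choose_transformation(X[:, j]) for j in range(X.shape[1])]
    Z = np.column_stack([CANDIDATES[n](X[:, j]) for j, n in enumerate(names)])
    return names, Z
```

Calling `transform_features` on an images-by-features matrix returns the chosen transformation name for each feature together with the transformed matrix, so no per-feature choice has to be made by hand. A quantile-based criterion is used here so that a few extreme values cannot dominate the choice; other robust criteria would fit the same structure.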

Additional information

Funding

The authors gratefully acknowledge support from the National Science Foundation (grant no. DMS-1646108) and the National Institutes of Health (NIH grant nos. U54HG008100 and 1U54CA209988).
