Identification of new transmembrane proteins concentrated at the nuclear envelope using organellar proteomics of mesenchymal cells

ABSTRACT The double membrane nuclear envelope (NE), which is contiguous with the ER, contains nuclear pore complexes (NPCs) – the channels for nucleocytoplasmic transport, and the nuclear lamina (NL) – a scaffold for NE and chromatin organization. Since numerous human diseases linked to NE proteins occur in mesenchyme-derived cells, we used proteomics to characterize NE and other subcellular fractions isolated from mesenchymal stem cells and from adipocytes and myocytes. Based on spectral abundance, we calculated enrichment scores for proteins in the NE fractions. We demonstrated by quantitative immunofluorescence microscopy that five little-characterized proteins with high enrichment scores are substantially concentrated at the NE, with Itprip exposed at the outer nuclear membrane, Smpd4 enriched at the NPC, and Mfsd10, Tmx4, and Arl6ip6 likely residing in the inner nuclear membrane. These proteins provide new focal points for studying the functions of the NE. Moreover, our datasets provide a resource for evaluating additional potential NE proteins.


Introduction
The nuclear envelope (NE), which forms the membrane boundary of the nucleus, segregates the genome and chromosome-associated metabolism from the cytoplasm. It is a specialized endoplasmic reticulum (ER) sub-domain containing an outer nuclear membrane (ONM) that has continuity and functional similarity with the peripheral ER, and an inner nuclear membrane (INM) with distinctive properties [1,2]. The lipid bilayers of the ONM and INM are joined at nuclear pore complexes (NPCs), massive supramolecular protein assemblies that provide passageways for molecular transport across the NE [3,4]. The NPC is formed from multiple copies of ~30 polypeptides (termed nucleoporins or Nups). A subset of Nups provide scaffolding for the NPC, whereas others, particularly those containing Phe-Gly repeats (termed 'FG Nups'), form a diffusion barrier across the NPC and provide binding sites for nuclear transport receptors [3-5].
In higher eukaryotic cells, the INM is lined by the nuclear lamina (NL), a protein scaffold whose backbone contains a polymer of nuclear lamins, type V intermediate filament proteins [6,7]. Three major lamin subtypes are expressed in the majority of mammalian cells: lamins A/C, B1 and B2 [6,7]. In addition to lamins, at least 20 widely expressed polypeptides are concentrated at the INM [1,6,7]. The NL has been implicated in nuclear structure and mechanics, tethering of heterochromatin and the cytoplasmic cytoskeleton to the NE, and regulation of signaling and gene expression [1,6,7]. Consistent with this wide array of functions, mutations in the genes for lamins and associated proteins have been found to cause a spectrum of human diseases (termed 'laminopathies') [8,9]. Many of these diseases target specific tissues, commonly of mesenchymal origin.
Most of the known INM-enriched proteins have one or more transmembrane (TM) segments [1,10]. Following insertion in the peripheral ER and ONM, these TM proteins are thought to become concentrated at the INM by lateral diffusion in the lipid bilayer around the NPC, in conjunction with binding to the NL and/or other intranuclear components [10,11]. Movement is bi-directional, and the degree to which specific NE proteins are localized to the INM vs peripheral ER can vary with different cell types or physiological states [12,13]. Consistent with this diffusion-retention model, ~1/3 of transmembrane proteins in the yeast genome have the ability to reach the INM, even though most do not appear to be concentrated there or to have nucleus-specific functions [14]. The principles of the diffusion-retention model also appear to specify NE localization of the LINC complex, an interconnected assembly of TM proteins spanning the INM (SUN-domain proteins) and ONM (nesprins) that is responsible for attaching cytoplasmic cytoskeletal filaments to the NL [15,16].
The detailed protein composition of the NL/INM in mammals remains incompletely understood, and it is likely that low-abundance proteins and/or those with cell type-selective expression patterns remain to be revealed. Proteomics analysis has identified numerous TM proteins in isolated NE fractions of different cell types [17-19], but it remains unclear whether most of these proteins are concentrated at the NE relative to the peripheral ER, or are more general ER residents that by default can diffuse into the contiguous nuclear membranes. An important goal that remains is a comprehensive characterization of proteins that are concentrated at the NE relative to the peripheral ER, as these proteins de facto are likely to have specific functions for the nucleus.
In this study, we used proteomics to characterize NEs and other subcellular fractions isolated from cultured mesenchymal stem cells (MSCs), and from correspondingly differentiated adipocytes and myocytes. We implemented a scoring system with the datasets to describe the relative enrichment of individual proteins in the NE fraction. This system accurately represented most of the TM proteins known to be concentrated at the NE, supporting its predictive value for new candidates. We selected five of the high-scoring new candidates expressed in all three mesenchymal cell types for direct evaluation by quantitative immunofluorescence microscopy. Our results revealed that all of these are substantially concentrated at the NE: one is enriched at the NPC, one occurs in the ONM, and the remainder appear to be localized to the INM. The sequence homologies and other features of these proteins indicate that they are new windows for understanding the functions and dynamics of the NE. Our datasets provide a resource for evaluating the potential NE localization of membrane proteins detected in proteomics and other screens, and should facilitate the identification of additional NE-concentrated proteins.

Results
The frequent manifestation of laminopathies in cells of mesenchymal origin [8,9] prompted us to carry out NE proteomics on the murine C3H10T1/2 (C3H) MSC line and differentiated derivatives. Using undifferentiated C3H cells (U), together with differentiated adipocytes (A) and myocytes (M), we isolated three subcellular fractions for proteomic analysis: NE, nuclear contents (NC) and cytoplasmic membranes (CM) (Figure 1; Materials and Methods). The NE and NC fractions were obtained by nuclease digestion of isolated nuclei followed by treatment with 0.5 M NaCl and sedimentation to yield the NE (pellet) and NC (supernatant) fractions. The CM fraction was obtained by flotation of membranes from a postnuclear supernatant to a low-density zone of a sucrose gradient, a procedure that enriches for secretory pathway organelles (Golgi, plasma membrane, and endosomes/lysosomes). We optimized our cell lysis and fractionation methods using Western blotting to follow marker proteins for various organelles (see Materials and Methods). The proteomics analysis of the fractions provided a detailed measure of the relative abundance of benchmark and contaminant proteins in each fraction, as considered below.
We used multidimensional protein identification technology (MudPIT [20]; see Materials and Methods) for analyzing the fractions from the three cell types. Collectively this involved 3-4 mass spectrometry runs for each fraction and cell type, and identified 7938 proteins (Table S1). Approximately 60% of these were detected in all three cell types, whereas 6-8% were uniquely found in only one of the three cells (Figure 2(a); Table S1). As expected, proteins diagnostic of differentiated adipocytes (e.g. long chain fatty acid CoA ligase 1, perilipin 1) and myocytes (myosin 3 heavy chain, titin) were strongly induced in the respective differentiated cells, based on NSAF (normalized spectral abundance factor [21]) values (Table S1). To evaluate the abundance of individual proteins in the NE fraction relative to NC and CM, we calculated a NE enrichment score (termed 'score' below) based on NSAF values (see Materials and Methods). With this method, proteins that were detected only in the NE fraction had a score of 1, proteins that were found only in NC and/or CM had a score of 0, and proteins found both in the NE and in other fractions had intermediate scores. Scores were calculated only for proteins that were detected with 5 or more spectral counts in a particular cell type, since predictions are less reliable with low spectral detection.
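The scoring logic described above can be illustrated with a minimal sketch. The NSAF calculation follows the standard definition (spectral counts divided by protein length, normalized over all proteins in a run). The exact score formula is given in Materials and Methods and is not reproduced here, so the ratio below (NE signal as a fraction of the summed NE, NC and CM signal) is only an assumed form that satisfies the stated boundary conditions of 1 for NE-only and 0 for NC/CM-only proteins. The function names and the `min_counts` parameter are hypothetical.

```python
from typing import Dict, Optional


def nsaf(spectral_counts: Dict[str, float], lengths: Dict[str, int]) -> Dict[str, float]:
    """Normalized spectral abundance factor for each protein in one run.

    NSAF_i = (SpC_i / L_i) / sum_j (SpC_j / L_j)
    """
    saf = {p: spectral_counts[p] / lengths[p] for p in spectral_counts}
    total = sum(saf.values())
    return {p: (v / total if total > 0 else 0.0) for p, v in saf.items()}


def ne_enrichment_score(
    nsaf_ne: float,
    nsaf_nc: float,
    nsaf_cm: float,
    total_spectral_counts: int,
    min_counts: int = 5,
) -> Optional[float]:
    """ASSUMED score: fraction of a protein's summed NSAF found in the NE fraction.

    Returns None when the protein falls below the spectral count threshold
    (5 in the text) or was not detected in any fraction.
    """
    if total_spectral_counts < min_counts:
        return None
    total = nsaf_ne + nsaf_nc + nsaf_cm
    if total == 0:
        return None
    return nsaf_ne / total
```

Under this assumed form, a protein with equal NSAF in the NE and in the combined NC and CM fractions would score 0.5, which is the intermediate behavior described in the text.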
Only 3% of the proteins in U cells had scores ≥ 0.7 (Figure 2(b), top). By contrast, ~15-20% of the proteins scored in this range in the A and M cells. When only proteins with annotated TM segments were considered (roughly 20% of all proteins detected), the protein percentage scoring ≥ 0.7 was more similar in the three cell types (between 4-9%; Figure 2(b), bottom). Thus, the high-scoring protein set of U cells is relatively enriched in TM proteins, as compared to those of A and M cells. These differences are correlated with higher levels of tubulin in A and M cells and less efficient extraction of non-TM cytoplasmic and intranuclear proteins from the NE fraction of these cells, as compared to U (Table S1).
We evaluated our scoring system using sets of benchmark proteins with well-defined membrane localizations (Table S2). The NE benchmarks, comprising 30 Nups and 23 INM proteins, included 22 proteins with TM segments. Other benchmarks involved sets of ~15-20 abundant TM proteins enriched in different cytoplasmic membrane compartments: peripheral ER, Golgi, mitochondria and the plasma membrane/endosome/lysosome system (Figure 2(c) and Table S2). Most NE proteins with TM segments had a high score (> 0.7) in all three cell types. Notable exceptions with lower scores (~0.5-0.7) were emerin and Tmem43 (LUMA), which are known to be partially localized to the peripheral ER in certain cell types and/or physiological states [12,13,22,23]. The benchmark TM proteins of the peripheral ER had scores ranging from 0.2-0.7, clustering around a mean value of ~0.5, although a few proteins characteristic of sheet ER (e.g. Sec11α, Sec61β) [24] had higher scores (~0.7-0.8) in one or more cell types. TM proteins of Golgi, plasma membrane/endosome/lysosome and mitochondria mostly had scores between 0.1-0.5, consistent with Western blot analysis of the NE fractionation (data not shown).
Many of the benchmark NE proteins lacking annotated TM segments also had high scores (> 0.7) in the three cell types, including B-type lamins, the NL-associated proteins Prr14 [25] and Gmcl1 [26], many Nups and the NPC-associated protein Mcm3ap [27]. NE proteins with somewhat lower scores included A-type lamins and ~10 Nups lacking TM segments, which correspondingly were relatively abundant in the NC fraction. These results are consistent with the well-established existence of intranuclear pools of lamins A/C [28] and Nups [29] separate from the NL and NPC, respectively. The scores for almost all NE markers in A were lower than their corresponding scores in U and M cells (Table S2), coinciding with relatively higher levels of these proteins in the NC fraction (Table S1). This may reflect greater fragility of the NE of A cells, resulting in release of NE fragments to NC during fractionation. The non-TM protein datasets included low-abundance components with high scores in one or two cell types. Many of these have known regulatory, enzymatic or structural roles in the nuclear interior, cytosol or extracellular matrix. Although some of these also may function at the NE, we expect that most are not strongly concentrated at the NE in situ. The high scores could reflect either intrinsic limitations of MudPIT proteomics (see Discussion), adsorption to the NE during isolation, or association with co-fractionating structures such as the intermediate filament protein synemin or the extracellular matrix components collagen or fibronectin.
The above considerations indicate that our scoring system can most effectively predict new NE-concentrated proteins with TM segments. We performed unsupervised hierarchical clustering to further analyze the set of 243 TM proteins with scores higher than 0.5, sorting the proteins from the three cell types into eight clusters (Figure 3(a) and Table S3). The cluster with high scores in all three cell types (Figure 3(a)) was dominated by well-established NE-concentrated proteins (see Table S2). It also included another five proteins not previously known to be concentrated at the NE, which represent new candidates. We selected four members of this group to analyze: Arl6ip6, Mfsd10, Smpd4 and Tmx4 (Figure 3(b)). We added a fifth high-scoring protein found in the three cell types (Itprip) to the query group, since it is predicted to have a TM domain by the CCTOP algorithm and is homologous to Itpripl1 and Itpripl2, which both contain a curated TM segment (Uniprot). We also analyzed another member of the cluster that recently was shown to be concentrated at the NE, Vrk2 [30]. A diagrammatic representation of these proteins, with the position of the epitope tag and predicted TM segments, is shown in Figure 3(c). The specific peptides detected for these proteins are listed in Table S3.
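The clustering step can be sketched as follows, treating each protein as a three-component vector of scores (U, A, M) and merging clusters until a target number remains. The text does not state the distance metric or linkage method, so Euclidean distance and average linkage are assumptions here, and the function names are hypothetical. This simple version is meant for illustration on small inputs rather than as an efficient implementation.

```python
from math import dist
from typing import List, Sequence


def average_linkage(a: List[int], b: List[int], points: Sequence[Sequence[float]]) -> float:
    """Mean Euclidean distance between all pairs of members of clusters a and b."""
    return sum(dist(points[i], points[j]) for i in a for j in b) / (len(a) * len(b))


def agglomerate(points: Sequence[Sequence[float]], k: int) -> List[List[int]]:
    """Agglomerative clustering: repeatedly merge the two closest clusters.

    Returns k clusters, each a list of indices into `points`.
    """
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > k:
        best = None  # (distance, index_i, index_j) of the closest pair so far
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = average_linkage(clusters[i], clusters[j], points)
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Hypothetical score vectors (U, A, M): two proteins high in all three
# cell types and two proteins high only in U.
example_scores = [
    (0.90, 0.90, 0.80),
    (0.95, 0.85, 0.90),
    (0.90, 0.20, 0.10),
    (0.85, 0.10, 0.20),
]
example_clusters = agglomerate(example_scores, k=2)
```

With eight clusters requested instead of two, the same procedure would correspond to the partition described in the text, under the stated assumptions about metric and linkage.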
To evaluate whether the five candidates are concentrated at the NE relative to the peripheral ER, we prepared populations of C3H cells stably transduced with lentiviral vectors expressing epitope-tagged versions of these proteins. After verifying that the V5-tagged recombinant proteins migrated at their predicted sizes by Western blotting (Figure 3(c), Table S5), we examined the behavior of the ectopically expressed proteins using a one-step fractionation of cell homogenates to obtain a low speed pellet enriched in nuclei, and a supernatant containing cytoplasmic membranes (Fig. S1). Quantification by Western blotting showed that ectopic Sec61β was enriched in the post-nuclear supernatant, whereas ectopic Lem2 was concentrated in the nuclear pellet. Like Lem2, ectopic Itprip, Mfsd10 and Smpd4 were significantly enriched in the nuclear fraction. However, ectopic Arl6ip6, Tmx4 and Vrk2 were distributed roughly evenly between the two fractions. This could reflect the substantial amounts of these proteins in the peripheral ER seen by immunolocalization in some growth and expression conditions (below) and/or hypothetical release from NE binding sites and redistribution to the peripheral ER during the hypotonic swelling of cells preceding homogenization.
We used immunofluorescence staining and confocal microscopy to more incisively analyze the subcellular localization of the ectopically expressed proteins. We compared the ectopic proteins to two endogenous markers, lamin A and the pan-ER transmembrane protein calnexin [24] (Figure 4(a) and Figs. S2-S4). With appropriate placement of the V5 epitope tag (summarized in Table S5) and transduction of cells with a low lentiviral MOI (multiplicity of infection), we observed that all five of the new candidates were substantially more concentrated at the NE than in cytoplasmic regions in most cells (Figure 4(a) and Fig. S3). This resembled the localization of ectopically expressed Lem2 and emerin (Figure 4(a) and Fig. S2), and contrasted with the pan-ER distribution of ectopic Sec61β [24] (Fig. S2). The control protein Vrk2 clearly was more concentrated at the NE than calnexin, but it was present at relatively high levels in the peripheral ER as well, consistent with previous work (Figure 4(a) and Fig. S3) [30]. Also, the NE-concentrated staining pattern shown for Arl6ip6 (Figure 4(a) and Fig. S3) was typically observed in moderately dense cell cultures. With lower cell densities, considerably higher levels of peripheral ER staining were seen, in addition to the NE labeling (Fig. S4D-E).
We implemented an unbiased method to quantify the levels of NE localization of ectopic constructs, focusing on cells representing the lower half of the expression spectrum. The method involved comparing the fluorescence intensity ratio of the epitope tag/endogenous calnexin at the NE, to the epitope tag/calnexin ratio in a peripheral ER zone surrounding the nucleus (see Materials and Methods). This approach revealed that ectopically expressed emerin and Lem2 were ~3-6-fold concentrated at the NE relative to the peripheral ER (Figure 4(b)). The five NE candidates were 1.7-3.5-fold concentrated at the NE, whereas Vrk2 showed a lower (1.3-fold) but statistically significant NE concentration (Figure 4(b)). If anything, the peripheral ER levels calculated for the ectopic constructs over-represent the native levels of these proteins in the peripheral ER, due to potential artefacts of ectopic protein over-expression (discussed below). Unfortunately, we were unable to compare the expression of the ectopically expressed candidates and their endogenous counterparts by Western blotting, due to the lack of convincing detection with commercial antibodies (see Materials and Methods).
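The ratio-of-ratios comparison described above can be sketched as a short calculation. This is only an illustration: the segmentation of the NE and peripheral ER zones and any background correction are described in Materials and Methods, and the use of mean pixel intensities per region, along with the function and parameter names, is an assumption.

```python
from statistics import mean
from typing import Sequence


def ne_concentration(
    tag_ne: Sequence[float],
    calnexin_ne: Sequence[float],
    tag_er: Sequence[float],
    calnexin_er: Sequence[float],
) -> float:
    """Fold concentration of an epitope-tagged protein at the NE.

    Each argument holds pixel intensities (assumed background-subtracted)
    for one channel in one region. The tag/calnexin ratio at the NE is
    divided by the tag/calnexin ratio in the peripheral ER zone, so a value
    of 1.0 means the tag follows calnexin and values above 1.0 indicate
    NE concentration.
    """
    ne_ratio = mean(tag_ne) / mean(calnexin_ne)
    er_ratio = mean(tag_er) / mean(calnexin_er)
    return ne_ratio / er_ratio
```

Normalizing to calnexin in both regions corrects for differences in membrane density between the nuclear rim and the surrounding ER, which is why a marker distributed evenly across both membrane systems is needed.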
During the course of this analysis, it became evident that the localization patterns of the ectopic proteins changed with their expression levels, with strongly NE-selective labeling associated preferentially with low expression. This was particularly conspicuous for Tmx4 and Smpd4. Whereas Tmx4 was highly concentrated at the NE compared to the peripheral ER in cell populations expressing comparatively low ectopic protein, it was uniformly localized throughout the ER/NE system in cells expressing high levels (Fig. S4A-C). Also, the selective NE targeting of ectopic Smpd4 that was evident with low expression at early times after lentiviral transduction (Figure 4(a) and Fig. S5) was largely obscured by numerous cytoplasmic Smpd4 foci that accumulated in long-term expressing cells (Fig. S5). Consistent with the diffusion-retention model for NE localization, these results suggest that saturation of NE binding sites by overexpression of ectopic constructs results in net redistribution to the contiguous peripheral ER and/or appearance in cytoplasmic aggregates.
We used cell permeabilization with low/high concentrations of digitonin to analyze whether the newly identified NE-concentrated proteins are exposed at the ONM, or reside in a sequestered space at the INM or NPC (Figure 5(a)). With this technique (Fig. S6A), low concentrations of digitonin permeabilize the plasma membrane and allow antibody access to the cytosolic space and ONM, but leave the ER and NE intact [31]. Conversely, high concentrations of digitonin fully permeabilize the NE and allow antibody access to proteins of the INM and NPC-associated membrane as well. We validated this method in C3H cells by antibody labeling of calnexin and Lem2 (Fig. S6B): only the cytosol-exposed epitopes recognized by the calnexin antibody were accessible with low digitonin, but both calnexin and ectopically expressed Lem2 (concentrated at the INM) were labeled after cell treatment with either high digitonin or Triton X-100. When applied to analysis of the five new NE proteins and Vrk2, low digitonin treatment yielded strong NE staining only for Itprip. In addition, labeling of the relatively minor peripheral ER pools of Arl6ip6, Mfsd10, Tmx4 and Vrk2 also was evident (Figure 5), suggesting that the V5 epitope tag on these proteins was exposed to the cytosolic/nucleoplasmic space. This topology inference also was supported by phosphorylation site data for these proteins (see legend for Fig. S10). Treatment with high digitonin yielded strong NE labeling of the latter four proteins, as well as strong staining of Smpd4 at both the NE and in cytoplasmic foci. These results indicate that Itprip is exposed at the ONM, and suggest that the remaining proteins are located at membrane-sequestered NE regions. Since Arl6ip6, Mfsd10 and Tmx4 showed relatively uniform nuclear rim staining similar to Vrk2, it is likely that these proteins are localized at the INM.
The NE staining seen for both Smpd4 and Itprip was conspicuously less uniform than that of the other proteins analyzed, particularly in tangential views. We found that Smpd4 was localized to small puncta at the nuclear surface (Figure 5(b) and Fig. S7). In substantial part, these puncta co-localized with NPCs, as detected by an antibody to FG-repeat Nups (Figure 5(b); Pearson correlation coefficient R = 0.5). However, the Smpd4 intensity in different NPC puncta varied considerably, and some of the Smpd4 puncta at the NE had little or no Nup staining. Surprisingly, many of the Smpd4 foci found in the cytoplasm of both transiently transduced and stably expressing cell populations also were strongly labeled with the antibody to FG Nups (Fig. S5). This suggests that FG Nups may be recruited to ectopic cytoplasmic foci containing Smpd4. This is reminiscent of experiments involving ectopic overexpression of the transmembrane Nup Pom121, which induces cytoplasmic foci containing both Pom121 and FG Nups [32]. These results, together with data revealing interactions of Smpd4 with several Nups in pull-down assays [33], support a physiologically relevant interaction between Smpd4 and Nups, and strongly suggest that at least much of Smpd4 is associated with the NPC. The non-uniform co-localization of Smpd4 and FG Nups at the nuclear surface in part could reflect uneven association of ectopic Smpd4 with different populations of NPCs, or assembly of the ectopic protein into non-native NPC-related structures at the nuclear surface.
Figure 5. Immunofluorescence analysis of the localization of target proteins with respect to NE substructure. (a) Representative images obtained with antibody staining after cell permeabilization with low or high concentrations of digitonin (DG). C3H cells ectopically expressing the depicted targets were incubated with antibodies to the V5 tag, calnexin and lamin A. Merged images of V5 and calnexin staining are shown on the right. (b, c) High resolution immunofluorescence images, obtained with Airyscan, of cells expressing ectopic Smpd4 or Itprip as indicated. Cells were co-labeled with antibodies to V5 and RL1, a monoclonal antibody recognizing FG repeat Nups of the NPC [63], after fixation and permeabilization with standard conditions. Pearson's correlation coefficient R (indicated) shows substantial co-localization between V5-Smpd4 and the NPC but no significant co-localization between Itprip-V5 and the NPC. Right panels show higher magnification views of areas indicated by boxes. In cells expressing V5-Smpd4, examples of foci co-labeled with RL1 and anti-V5 are indicated (arrowheads). All panels: scale bars, 10 μm.
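For reference, the Pearson correlation coefficient R used to assess co-localization of the V5 and FG Nup signals can be computed from paired pixel intensities of the two channels. The sketch below shows only the standard formula. How the region of interest was chosen and which software produced the reported values are not stated in this section, so the function name and input format are assumptions.

```python
from math import sqrt
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between two equally sized lists of pixel intensities.

    x and y hold the intensities of the two channels at the same pixels,
    for example V5 and RL1 staining within a nuclear surface region.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant channel")
    return cov / sqrt(var_x * var_y)
```

Values near 1 indicate that the two channels rise and fall together across pixels, values near 0 indicate no linear relationship, and intermediate values such as the R = 0.5 reported for Smpd4 indicate partial co-localization.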
The distribution of Itprip on the nuclear surface was qualitatively different from the Smpd4 staining (Figure 5(b-c) and Fig. S7), as it commonly appeared in linear arrays of puncta instead of the more distributed NPC-like pattern. The Itprip puncta did not colocalize with FG Nups (Figure 5(c)), or with nesprin-1, nesprin-2, nesprin-3 or Sun2 (Fig. S8). Nonetheless, due to the potential limitations of antibody-based localization, it remains possible that Itprip is associated with a subset of poorly detected LINC complex components.
We next analyzed whether the five newly identified NE proteins and Vrk2 are concentrated at the NE in differentiated adipocytes and myocytes (Figure 6), as suggested by their high proteomics scores. We were able to visualize Arl6ip6, Itprip, Smpd4, Tmx4 and Vrk2 in differentiated adipocytes, and in all cases obtained strong labeling of the NE with little or no peripheral ER/cytoplasmic staining. We achieved myogenic differentiation of cells stably transduced with Tmx4, Itprip and Arl6ip6, but not with the other proteins (see Materials and Methods). In all three cases, we observed robust NE-concentrated staining. As an additional model for NE targeting, we analyzed stably transduced populations of the human U2OS osteosarcoma cell line. In all cases, we observed strong targeting of the candidates to the NE (Fig. S9). Together these results indicate that the five newly characterized proteins have the capacity to concentrate at the NE in a variety of different cell types, and likely are widespread NE-enriched components.

Discussion
Here we used MudPIT analysis of subcellular fractions to identify five previously unrecognized TM proteins that are strongly concentrated at the NE: Arl6ip6, Itprip, Mfsd10, Smpd4 and Tmx4. Immunofluorescence microscopy with epitope accessibility analysis revealed that Itprip is located at the ONM and that much of Smpd4 is concentrated at the NPC (Fig. S10). The experiments also suggested that the other three proteins reside in the INM (Fig. S10). Although we focused our analysis on mesenchymal stem cells, adipocytes and myocytes, we found that these proteins also target strongly to the NE in U2OS cells. RNA-seq databases indicate that the five proteins are expressed broadly in human and mouse tissues, and all have been detected in HeLa cells by proteomics [33]. Thus, these probably are widespread NE components. The proteomics scoring system we used to evaluate NE enrichment was validated by examination of benchmark proteins for several cytoplasmic membrane compartments, and provided a strong framework for our efforts. One caveat of this strategy is that MudPIT proteomics only semiquantitatively represents the relative abundance of a particular protein in different subcellular fractions, with higher spectral detection increasing the reliability [34]. Thus, scores based on relatively low spectral counts should be interpreted cautiously.
We found that validation of NE proteins by ectopic expression and confocal microscopy was most accurately achieved using stably transduced cell populations, except in the case of Smpd4 (see Materials and Methods). We deem it essential to quantify the relative concentration of the ectopic target at the NE vs the peripheral ER using an internal TM marker that is evenly represented in both membrane systems, such as calnexin. The quantification method we employed (Materials and Methods) provides greater sampling depth than the commonly used line-scanning of confocal sections, which may be subject to user bias. We emphasize that high levels of ectopic protein overexpression in some cases can mask NE localization, as we have observed for Tmx4. This can explain the discrepancy between our demonstration that Tmx4 is concentrated at the NE with relatively low ectopic expression, and previously published work revealing a pan-ER distribution of ectopic Tmx4 [35,36], a pattern we also observed with high Tmx4 expression.
Aside from the sample set we analyzed, we consider it likely that some additional TM proteins with high scores in our data will turn out to be NE-concentrated, even in cases where database annotations suggest otherwise. For example, Pigb (not detected in our analysis) is known to be a mannosyltransferase involved in synthesis of the GPI anchor precursor and is annotated in UniProtKB as a general ER protein, but was recently shown to be strongly concentrated at the NE in Drosophila [37]. Here we analyzed proteins with high NE enrichment scores in all three cell types, which appeared in one of the clusters. Some of the proteins in other clusters, which have high scores in one or two of the cell types, might be concentrated at the NE in a differentiation state-selective pattern.
The group of high-scoring non-TM proteins in our datasets also is likely to contain proteins concentrated at the NE in mesenchymal cells. NE association has been suggested for some of these, such as Akap8l [38], the prostaglandin synthase Ptgs2 [39] and the choline phosphate cytidylyltransferase Pcyt1a [40,41]. However, we consider our datasets and scoring system most useful for the analysis of TM proteins.
Figure 6. Immunofluorescence analysis of the localization of target proteins in adipocytes and myocytes. (a) C3H cells stably transduced with the indicated constructs were differentiated into adipocytes and co-labeled with antibodies to the V5 epitope tag, calnexin and lamin A. DNA staining (DAPI) and a merge of V5 and calnexin labeling are shown in the right panels. (b) C2C12 myoblasts that were stably transduced with the indicated constructs were differentiated into myotubes and labeled as in (a). Scale bars, 10 μm.
Sequence analysis of the newly identified NE proteins suggests potential roles in nuclear regulation and membrane dynamics. The most evolutionarily conserved protein of this group is Mfsd10, a member of the ancient Major Facilitator Superfamily of membrane solute transporters. Mfsd10 was proposed to be a cellular efflux pump for organic anions and nonsteroidal anti-inflammatory drugs [42], although its transport properties have not been directly analyzed. Interestingly, the highest Psi-BLAST scores for Mfsd10 involve tetracycline efflux pumps of Gram-negative bacteria (e.g. 31% identity/45% similarity over 90% of Mfsd10 sequence with the TetA gene of S. marcescens). This raises the possibility that Mfsd10 may transport toxic metabolites and/or xenobiotics across the INM to the ER lumen, as a means of efficiently funneling deleterious compounds out of the nuclear environment.
Unexpectedly, we found the sphingomyelin phosphodiesterase Smpd4 [43] to be concentrated at the NPC. Smpd4 releases ceramide, a signaling molecule itself and biosynthetic precursor to the signaling lipid S1P [44] that is known to act in the nucleus to inhibit HDACs [45,46] and to stabilize telomerase [47]. The NPC association of Smpd4 raises the possibility that the production of S1P in the nucleus might be linked to transport activity at the NPC. Smpd4 also might have a role in lipid bilayer dynamics at the NE. For example, if Smpd4 were localized on the INM side of the NPC, sphingomyelin hydrolysis could reduce lipid head group packing on the nucleoplasmic leaflet of the INM to drive the membrane association and activation of Pcyt1a [40] or the pro-inflammatory phospholipase A2 [48]. Sphingomyelin hydrolysis also could promote local concave membrane curvature, which accompanies the process of NPC insertion in the interphase NE [49].
Itprip was the only protein of the group found to be localized to the ONM. A previous study reported that Itprip binds the inositol trisphosphate receptor calcium channels and negatively regulates their activity in vitro [50]. Thus, Itprip could potentially function in localized regulation of calcium fluxes near the nucleus. Interestingly, a ~300-residue region of Itprip comprises a Mab-21 nucleotidyltransferase fold (Uniprot) found in multiple proteins [51] including cGAS, a cytosolic enzyme involved in the sensing of cytoplasmic DNA in innate immunity. The concentration of Itprip at the NE could be explained most simply by an interaction with one or more nesprins, which themselves are concentrated at the NE due to transmembrane associations with SUN-domain proteins [15,16]. However, we were unable to detect colocalization with nesprin-1, nesprin-2, nesprin-3 or Sun2 using the antibodies that were available. Nonetheless, a potential interaction of Itprip with LINC components merits further analysis.
The properties of Tmx4 and Arl6ip6 are consistent with a role in regulating NE structure. The thioredoxin domain of Tmx4 is likely localized to the NE lumen, since it occurs between an N-terminal signal sequence and the single TM segment. This suggests a potential role in regulating the luminal aspects of NE-specific structures, such as the LINC complex and associated torsinA, both of which may be regulated by disulfide oxidation/reduction [52,53]. Arl6ip6 is a susceptibility locus for ischemic stroke [54]. It lacks enzyme-related domains, but does show physical interactions in proteome-wide pull-down screens with a number of proteins involved in membrane vesicle formation/targeting [33,55], a process that is involved in NE resealing and repair.
In conclusion, the set of new NE proteins identified in this study provides new avenues for studying the dynamics and functions of the NE. It will be useful to extend the methodology used in this study to the analysis of other cell types, where we expect that additional NE-concentrated proteins with interesting properties will be identified.
For C3H myogenesis, cells were grown in DMEM supplemented with 10% FBS and 1% P/S/G. C3H cells were transfected with 10 μg of a doxycycline-inducible MyoD piggyBac transposon vector [56]. After 48 hours, cells with positive integration of the vector were selected using 2 μg/mL puromycin (Invivogen) for 24 hours. To initiate myotube differentiation, stably integrated populations of C3H cells were grown on 500 cm² plates to 70% confluency and induced with 20 ng/mL doxycycline (Sigma) for 24 hours. Cells were then changed to differentiation medium containing DMEM with 2% donor equine serum (HyClone), 1% ITS Liquid Media Supplement (Sigma), and 1% P/S/G. Differentiation medium was replaced every 24-48 hours until terminal differentiation (about 3 days).
For C2C12 myogenesis, cells were plated and allowed to reach 90-95% confluency. The medium was then changed to DMEM supplemented with 1% donor equine serum (HyClone), 1% P/S/G, and 1% ITS Liquid Media Supplement (Sigma). Medium was replaced every 48 hours until terminal differentiation (4 to 5 days after initiation of myogenesis).

Subcellular fractionation
For subcellular fractionation, C3H cells were seeded in 500 cm² plates and allowed to reach 90% confluency. Plates were rinsed three times with ice-cold PBS, and then three times with ice-cold homogenization buffer (HB) (10 mM HEPES pH 7.8, 10 mM KCl, 1.5 mM MgCl₂, 0.1 mM EGTA) containing 1 mM DTT, 1 mM PMSF, and 1 μg/mL each of pepstatin, leupeptin, and chymostatin. After these washes, cells were incubated in HB for 15 minutes on ice. Cells were then scraped off the plates and further disrupted by Dounce homogenization with 18-20 strokes. The whole cell homogenate was then layered on top of a 2 mL shelf of 0.8 M sucrose in HB and centrifuged at 2000 rpm for 10 minutes at 4°C in a JS5.2 rotor with no brake to yield a crude nuclear pellet and postnuclear supernatant. The postnuclear supernatant, comprising the zone above the sucrose shelf, and the pelleted nuclei were each resuspended in 1.8 M sucrose (final concentration) in HB using a cannula. The resuspended nuclei and postnuclear supernatant were layered in separate ultra-clear 13.2 mL nitrocellulose centrifuge tubes on top of a 1 mL layer of 2.0 M sucrose in HB. For the nuclear gradient, HB was layered over the loading zone to fill the nitrocellulose tube. For the postnuclear supernatant gradient, 1 mL of 1.4 M sucrose in HB was layered on top of the loading zone, followed by HB to fill the tube. The gradients were then centrifuged at 35,000 rpm (210,000 × g) for 1 hour at 4°C with no brake in an SW41Ti rotor. Nuclei that pelleted through the 2.0 M sucrose were resuspended in HB and Dounce homogenized with 2 strokes to disperse aggregates. For the postnuclear supernatant gradient, the HB/1.4 M sucrose interface was collected and saved as 'cytoplasmic membranes' (CM). Nuclei were then incubated with 1 mM CaCl₂ and 100 kU/mL micrococcal nuclease (New England Biolabs) in HB at 37°C for 15 minutes. Digested nuclei were then placed on ice and NaCl was added to a final concentration of 500 mM.
The digested nuclei sample was layered on top of a 1 mL shelf of 0.8 M sucrose in HB and centrifuged at 4000 rpm for 10 minutes at 4°C in a JS5.2 rotor. A sample comprising the region above the 0.8 M sucrose layer was collected and saved as 'nuclear contents' (NC). The NE fraction, comprising the pellet, was collected by resuspension in HB. During development of the fractionation method, we monitored different organelles and cellular components at progressive steps of the isolation with antibodies to the following markers: lamin B1 and the INM resident LAP2β for the NE, histone H2B for chromatin, calnexin for sheet and tubular ER [24], Tim23 for mitochondria and Pex14 for peroxisomes.
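As a check on the ultracentrifugation step described above, rotor speed can be converted to relative centrifugal force with the standard relation RCF = 1.118 × 10⁻⁵ × r (cm) × rpm². The sketch below is illustrative only. The 153.1 mm maximum radius is an assumption taken from the nominal specification of an SW 41 Ti rotor and is not stated in the text; the function name is hypothetical.

```python
def rcf_from_rpm(rpm: float, radius_mm: float) -> float:
    """Relative centrifugal force (x g) at a given rotor speed and radius."""
    radius_cm = radius_mm / 10.0
    # Standard conversion: RCF = 1.118e-5 * radius (cm) * rpm^2
    return 1.118e-5 * radius_cm * rpm ** 2


# ASSUMPTION: nominal maximum radius of an SW 41 Ti rotor (not given in the text).
SW41TI_RMAX_MM = 153.1

rcf = rcf_from_rpm(35_000, SW41TI_RMAX_MM)
print(f"{rcf:,.0f} x g")
```

Under this assumed radius the result is close to the 210,000 × g quoted for 35,000 rpm, which suggests that the stated force refers to the bottom of the tube.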
For Western blot analysis of epitope-tagged proteins with one-step fractionation, stably transduced C3H cells were seeded in 15-cm plates and allowed to reach 80% confluency. Cells were trypsinized, rinsed and swollen as described above. Cells were then passed through a 25G 1.5" needle 18-20 times with steady force. The homogenate was then separated into a crude nuclear pellet and postnuclear supernatant as described above. The quality of homogenization was evaluated by Western blotting to detect H2B and calnexin prior to the densitometry analysis of V5-tagged proteins. More than three fractionation experiments were performed for each construct.
Analysis was performed using an Agilent 1200 HPLC pump and a Thermo LTQ-Orbitrap Velos Pro with an in-house-built electrospray stage. MudPIT experiments were performed with steps of 0%, 10%, 20%, 30%, 50%, 70%, 80%, 90%, 100% buffer C and 90/10% buffer C/B [20], each run for 5 min at the beginning of each gradient of buffer B. Electrospray was performed directly from the analytical column by applying the ESI voltage at a tee (150 μm ID) (Upchurch Scientific) [20]. Electrospray directly from the LC column was done at 2.5 kV with an inlet capillary temperature of 325°C. Data-dependent acquisition of tandem mass spectra was performed with the following settings: MS/MS on the 20 most intense ions per precursor scan; 1 microscan; reject unassigned charge state and charge state 1; dynamic exclusion repeat count, 1; repeat duration, 30 seconds; exclusion list size, 500; and exclusion duration, 90 seconds.
To calculate a NE enrichment score, the sums of the NSAF scores for each protein for NE, CM, and NC fractions were calculated. The following equation was used to determine the NE enrichment score, where e = experimental run and p = protein ID:
The clustering analysis was carried out only on annotated TM proteins with an enrichment score of greater than 0.5 in at least one of the three cell types (U, A, M). These proteins were then clustered on the basis of their enrichment scores in all three cell types. Unsupervised hierarchical clustering was applied to the Euclidean distances between the score triplets using the 'hclust' R function with the 'complete' agglomeration method. Eight clusters were selected; they are plotted in Figure 3(a) and listed in Table S3.
The proteomics datasets have been deposited in the public proteomics repository MassIVE (Mass Spectrometry Interactive Virtual Environment), part of the ProteomeXchange consortium [63], with the identifier MSV000083166 (and PXD011856 for ProteomeXchange), and are available through the following link: ftp://massive.ucsd.edu/MSV000083166.
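The filtering and clustering steps can be illustrated with a minimal standard-library Python sketch. It is not the R code used in the study: it reimplements complete-linkage agglomeration on Euclidean distances, which is the rule that 'hclust' applies with the 'complete' method, and stops when the requested number of clusters remains. The protein names and score triplets are hypothetical, and two clusters are requested here only because the toy dataset is small (the study selected eight).

```python
from itertools import combinations
from math import dist


def cluster_distance(points, a, b):
    """Complete linkage: largest Euclidean distance between members of a and b."""
    return max(dist(points[i], points[j]) for i in a for j in b)


def complete_linkage(points, n_clusters):
    """Merge the two closest clusters repeatedly until n_clusters remain.

    Returns a sorted list of clusters, each a sorted list of indices into points.
    """
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > max(n_clusters, 1):
        a, b = min(combinations(clusters, 2),
                   key=lambda pair: cluster_distance(points, *pair))
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(sorted(a + b))
    return sorted(clusters)


# HYPOTHETICAL enrichment scores per protein in the order (U, A, M);
# these are illustrative values, not data from the study.
scores = {
    "protA": (0.90, 0.85, 0.88),
    "protB": (0.88, 0.90, 0.86),
    "protC": (0.60, 0.10, 0.15),
    "protD": (0.55, 0.12, 0.20),
    "protE": (0.30, 0.20, 0.40),  # never exceeds 0.5, so it is excluded
}

# Keep proteins scoring above 0.5 in at least one of the three cell types.
kept = {name: s for name, s in scores.items() if max(s) > 0.5}
names = list(kept)

clusters = complete_linkage([kept[n] for n in names], n_clusters=2)
groups = [[names[i] for i in c] for c in clusters]
print(groups)
```

With real data, an established routine such as SciPy's `scipy.cluster.hierarchy.linkage` with `method='complete'` would normally be preferred over this hand-written loop.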
All genes were inserted into pLV-EF1a-IRES-Puro (gift from Tobias Meyer, Addgene plasmid #85132) using ligation-independent cloning (LIC). Two LIC-compatible sites, containing either an N-terminal V5 tag or a C-terminal V5 tag, were designed synthetically and inserted into pLV-EF1a-IRES-Puro using restriction enzymes BamHI and MluI. The primers used to insert the genes of interest into the LIC-compatible pLV-EF1a-IRES-Puro vector are listed in Supplementary Table S4. The portion of the primer that aligns to the gene of interest is underlined, and the portion that is required for T4 Polymerase (T4P) digestion during LIC is not underlined.
Genes of interest were amplified by PCR using Phusion High-Fidelity DNA Polymerase (New England Biolabs). pLV-EF1a-IRES-Puro LIC-compatible vectors were digested with SrfI (New England Biolabs). PCR fragments and SrfI-digested vector were treated with T4 Polymerase (New England Biolabs) in the presence of either dTTP (PCR products) or dATP (vector). T4P-digested vector and inserts were mixed at room temperature for 5 minutes, and then NEB Stable Competent Cells (New England Biolabs) were transformed with the product. Cells were incubated at 30°C for 24 hours, and then colonies were picked for clone validation. All cDNA clones were confirmed by complete DNA sequencing of the ORF in both 5'-3' and 3'-5' directions.
C3H, C2C12, and U2OS cells were changed to DMEM with 10% FBS, 1% glutamine, and 1% NEAA without antibiotics. Cells were diluted to 5 × 10⁴ cells/mL and polybrene (EMD Millipore) was added to a final concentration of 10 μg/mL. Cells were transduced with different viral loads (ranging from 1 to 500 μL of viral supernatant per 1 mL of cells) to obtain cell populations with different multiplicities of infection (MOIs). After 3 days of viral transduction, cells were treated with puromycin (Invivogen) to select for cells that had successfully integrated viral DNA. C3H and C2C12 cells were treated with 5 μg/mL puromycin, and U2OS cells with 1 μg/mL, for up to 1 week. Cell populations were further expanded and grown for fractionation, Western blotting, and immunofluorescence.

Western blotting
For Western blotting, cells were resuspended in 2X Laemmli buffer (4% SDS, 10% 2-mercaptoethanol, 20% glycerol, 0.004% bromophenol blue, and 0.125 M Tris-HCl pH 6.8) and boiled for 5 minutes. Samples were run on a Novex Tris-Glycine gel (Life Technologies) using FASTRun Buffer (Fisher Scientific). Samples were then transferred to a nitrocellulose membrane (Life Technologies). Membranes were rinsed twice with Tris-buffered saline (TBS) with 0.1% Tween-20 (Tw) and then blocked with 5% bovine serum albumin (BSA) in TBS/Tw. Membranes were incubated with primary antibody diluted in 0.5% BSA in TBS/Tw overnight at 4°C. Membranes were then washed 6 times with TBS/Tw and incubated with HRP-conjugated secondary antibodies in TBS/Tw for 1 hour at room temperature. Signals were then developed for 5 minutes using an enhanced chemiluminescence kit (Thermo Fisher) and captured with a UVP digital imaging system.

Immunofluorescence
For immunofluorescence staining, cells were plated on sterile glass coverslips and allowed to grow overnight. Twenty-four hours after plating, cells were rinsed with Dulbecco's phosphate buffered saline (DPBS with calcium and magnesium) and fixed using 2% paraformaldehyde (PFA) (Electron Microscopy Sciences) in DPBS for 20 minutes. Samples were rinsed three times with phosphate buffered saline (PBS) and blocked for 15 minutes in PBS with 5% goat serum (Jackson ImmunoResearch Laboratories) and 0.5% Triton X-100 (Tx) (Fisher Scientific). Samples were then incubated with primary antibody diluted in PBS with 1% goat serum and 0.1% Tx overnight at 4°C. After washing with PBS/Tx (0.1%) 4 times, samples were incubated with Alexa Fluor-conjugated secondary antibody diluted in PBS/Tx (0.1%) at room temperature for one hour. Samples were finally washed twice with PBS/Tx (0.1%), incubated with DAPI at room temperature for 10 minutes, and then washed twice with PBS and mounted on glass slides using Aqua-Poly Mount (Polysciences).
For digitonin permeabilization of C3H cells, 4 × 10⁴ cells were plated on sterile glass coverslips coated with 0.2% gelatin in a 24-well plate. 24 hours later, cells were fixed and treated with either 40 μg/mL or 1 mg/mL digitonin in PBS at room temperature for 5 minutes. Samples were then washed 3 times with PBS and blocked using PBS with 5% goat serum at room temperature for 15 minutes. Samples were incubated with primary antibody diluted in PBS with 0.5% goat serum overnight at 4°C. The next morning, samples were washed 4 times with PBS and incubated with secondary antibody in PBS for 1 hour at room temperature. Samples were then stained with DAPI, washed with PBS and mounted on glass slides using Aqua-Poly Mount.

Light microscopy and quantification
Confocal images were acquired on a Zeiss 780 or a Zeiss 880 Airyscan laser-scanning confocal microscope with a 63X PlanApo 1.4 NA objective. Contrast adjustment of the representative images was performed with ZEN software (Zeiss). Ten or more images were randomly chosen from each stably or transiently transduced cell population with lower expression levels, and the NE/ER ratio was quantified. Lamin A staining was used to outline the nucleus, and the NE and ER areas were defined as −0.5 to 0 μm (NE) and +0.5 to +1 μm (ER) relative to the edge of the nucleus using the 'Enlarge' function in ImageJ (NIH). Total fluorescence intensities of V5 staining in both areas were measured and normalized to the calnexin staining of the same area. The NE/ER ratio was then calculated by dividing the normalized V5 signal in the NE by the normalized V5 signal in the ER.
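The arithmetic of the NE/ER ratio can be written out as a short Python sketch. The function name and the intensity values are hypothetical; in practice the inputs would be the total intensities measured in ImageJ for the NE and ER bands of each cell.

```python
from statistics import mean


def ne_er_ratio(v5_ne, calnexin_ne, v5_er, calnexin_er):
    """V5 signal in the NE band relative to the ER band.

    Each V5 intensity is first normalized to the calnexin intensity
    measured in the same band.
    """
    normalized_ne = v5_ne / calnexin_ne
    normalized_er = v5_er / calnexin_er
    return normalized_ne / normalized_er


# HYPOTHETICAL total intensities for three cells:
# (V5 in NE, calnexin in NE, V5 in ER, calnexin in ER)
cells = [
    (1200.0, 400.0, 900.0, 600.0),
    (1500.0, 500.0, 800.0, 800.0),
    (1000.0, 500.0, 600.0, 600.0),
]

ratios = [ne_er_ratio(*cell) for cell in cells]
print(ratios, mean(ratios))
```

A ratio above 1 indicates that the V5-tagged protein is more concentrated at the NE than in the adjacent peripheral ER after correcting for local membrane abundance via calnexin.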
The co-localization analysis was performed with the 'Coloc2' function in ImageJ. Where necessary, raw images were processed using the rolling-ball 'background subtraction' function in ImageJ. Control and test images were processed with identical parameters. Representative images were prepared with automatic Airyscan processing in ZEN.
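Coloc 2 reports several co-localization coefficients, one of which is Pearson's correlation between the pixel intensities of two channels. The text does not state which coefficient was used, so the following Python sketch of Pearson's coefficient is given only as an illustration of the kind of measure involved. The function name and pixel values are hypothetical.

```python
from math import sqrt


def pearson(channel_1, channel_2):
    """Pearson correlation between two equally sized lists of pixel intensities."""
    n = len(channel_1)
    if n == 0 or n != len(channel_2):
        raise ValueError("channels must be non-empty and of equal length")
    mean_1 = sum(channel_1) / n
    mean_2 = sum(channel_2) / n
    covariance = sum((a - mean_1) * (b - mean_2)
                     for a, b in zip(channel_1, channel_2))
    variance_1 = sum((a - mean_1) ** 2 for a in channel_1)
    variance_2 = sum((b - mean_2) ** 2 for b in channel_2)
    if variance_1 == 0 or variance_2 == 0:
        raise ValueError("a channel with constant intensity has no defined correlation")
    return covariance / sqrt(variance_1 * variance_2)


# HYPOTHETICAL background-subtracted pixel intensities from one region of interest.
test_channel = [10.0, 20.0, 30.0, 40.0]
marker_channel = [25.0, 45.0, 65.0, 85.0]   # rises linearly with the test channel
inverse_channel = [85.0, 65.0, 45.0, 25.0]  # falls linearly with the test channel

r_same = pearson(test_channel, marker_channel)
r_opposite = pearson(test_channel, inverse_channel)
print(r_same, r_opposite)
```

Values near +1 indicate that the two signals vary together across pixels, values near 0 indicate no linear relationship, and values near −1 indicate mutual exclusion.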