Membrane models for molecular simulations of peripheral membrane proteins

ABSTRACT Peripheral membrane proteins (PMPs) bind temporarily to the surface of biological membranes. They also exist in a soluble form and their tertiary structure is often known. Yet, their membrane-bound form and their interfacial-binding site with membrane lipids remain difficult to observe directly. Their binding and unbinding mechanism, the conformational changes of the PMPs and their influence on the membrane structure are notoriously challenging to study experimentally. Molecular dynamics simulations are particularly useful to fill some knowledge-gaps and provide hypothesis that can be experimentally challenged to further our understanding of PMP-membrane recognition. Because of the time-scales of PMP-membrane binding events and the computational costs associated with molecular dynamics simulations, membrane models at different levels of resolution are used and often combined in multiscale simulation strategies. We here review membrane models belonging to three classes: atomistic, coarse-grained and implicit. Differences between models are rooted in the underlying theories and the reference data they are parameterized against. The choice of membrane model should therefore not only be guided by its computational efficiency. The range of applications of each model is discussed and illustrated using examples from the literature. Graphical abstract


Introduction
Peripheral membrane proteins (PMP) are soluble proteins that bind transiently to the surface of biological membranes ( Figure 1) where they accomplish their functions [1]. Unlike transmembrane (TM) proteins, they do not span the entire lipid bilayer but interact with the membrane through a collection of protruding hydrophobic residues located on loops or amphipathic helices [2][3][4][5][6] which, according to the text-book model, are often surrounded by positively charged amino acids lysines and arginines [7]. The term peripheral protein encompasses proteins performing various critical functions including enzymes involved in lipid metabolism or blood coagulation, membrane remodelling machines such as BAR domains [8] or ESCRTIII [9], lipid-transfer proteins and well-known membranetargeting domains such as C1, C2, FYVE, PH, PX, ENTH and GLA [10,11], to name a few.
Peripheral proteins bind biological membranes with an exquisite resolution in time and space, often thanks to a fine-tuned lipid specificity achieved using amino acid motifs exposed on their surface. The residence times of peripheral proteins vary depending on their function, and for example the residence time of a bacterial phospholipase C on small unilamellar vesicles was estimated to 379 ± 49 ms, in agreement with earlier estimations for a highly related PLC [13] or for human phospholipase Cγ2 [14]. The turnover rates for membrane-associated proteins in clathrin-mediated endocytosis have been estimated to a few seconds [15]. Accordingly, PMPs bind membrane through different mechanisms. Some bind a lipid head to a deep pocket, some bind thanks to a covalent lipid anchor and some have neither, relying on displaying on their surface the necessary structural and sequence motif to bind to membrane lipids. In this review, the region interacting with the membrane lipids is referred to as the interfacial-binding site or IBS. The associated amino acids motifs on the surface of membrane-targeting domains have been the subject of numerous studies [11] but other PMPs are not that well characterized. Besides phosphoinositides a number of other lipids are also recognized by membrane targeting domains (and other PMPs) but we are only starting to grasp the chemical complexity of these interactions as the complexity of biological membranes is slowly uncovered [16,17].
Because they most often exist in a soluble form, obtaining structures of PMPs has historically been easier than for TM proteins. While progresses in structure determination of TM proteins has led to a rapid increase of the number of structures available in public databases, the number of structures of proteins with weaker membrane association is comparatively small [18]. This applies to bitopic and monotopic TM proteins, and to PMPs. Due to the transient nature of their interaction to membranes, most structural biology methods are excluded as a direct source of information, and we lack information on the location of the IBS, and about potential conformational changes happening upon membrane association for example. Moreover, the IBSs of peripheral proteins are more challenging to predict from sequence or structure than transmembrane protein segments. The structure and amino acid composition of these interfaces is not as characteristic as those of transmembrane regions, which often consist in helical bundles enriched in hydrophobic residues, making them easily predictable from amino acid sequence only [19]. The prototypical IBS is described as a combination of lysines and arginines and only a few hydrophobic amino acids [1]. Peripheral proteins are indeed restricted in terms of the hydrophobic patches they can expose on their surface and still retain solubility. Such a mix of polar and hydrophobic residues on the surface of the protein is not easy to discriminate from the rest of the protein surface and even less so, from the primary sequence. Moreover, even if the role of unspecific electrostatic interactions has been demonstrated for several protein families [7,20], recent studies report more complex and subtle recognition mechanisms for some proteins. That includes weak electrostatic interactions, cation-pi interactions ( Figure 2) [3,21,22] between aromatic amino acids and phosphatidylcholine-containing lipids, roles of ions in mediating lipid-protein interactions [23], or the importance of macrodipoles [20,24,25]. In summary, while we have high quality structural data of some peripheral proteins in their soluble forms, their IBS is seldomly clearly identified and their binding mechanism remain elusive. Molecular dynamics (MD) simulations have proven useful in predicting IBS of PMPs and modelling PMP binding to lipid bilayers [26][27][28]. MD simulations simulate the behavior of a molecular system by solving the Newton's equation of motion for each particle constituting the system. The system simulated usually consists of several molecules and in MD simulations of PMPs, it is natural to include at least the protein itself, a bilayer formed by lipids relevant to the cellular membrane being modeled, water molecules solvating both sides of the membrane and relevant ions. For a regular size enzyme like a phospholipase D from recluse spider venom (285 amino acids) that we recently simulated, 256 POPC lipids and 22,613 water molecules were needed plus 4 ions, so a system totaling nearly 110,000 atoms.
Interactions between particles are described using a so-called force field. In molecular mechanics (MM) atomistic force fields, each atom is represented by a single particle carrying a point charge, following the Born Oppenheimer approximation. As we will see in this review, there also exists force fields using a coarser description of the simulated molecules where a group of atoms can be represented by a single particle for the sake of the simulation.
A generic atomistic FF for biomolecules expresses the potential of the system as a sum of bonded terms and non-bonded terms. Bonded terms include harmonic potentials for bond stretching and valence angle bending, and a cosine function for torsions. Non-bounded interactions are accounted for by pairwise terms; a coulombic potential for interactions between point charges and a Lennard Jones potential for van der Waals interactions (Cf equation 1).
Non-bounded interactions are the computational bottleneck in MD simulations, as they scale in principle as a function of N 2 , where N is the number of particles simulated. In practice, the use of cut-off schemes and neighbour lists reduces the number of pairwise interactions to be estimated at each integration step and decreases the computational cost associated with the calculation of the coulombic and Lennard Jones terms. Even though MD simulations remain expensive for large systems and on long timescales, such as PMP association to -or desorption from-a membrane. For the 110,000 atoms system described above, using NAMD (v 2.13) [29] and the CHARMM36 force field [30][31][32], we simulated 100 nanoseconds per day on 1024 cores of the newest high-performance computer in Norway at the time of writing (Betzy: BullSequana XH2000, ranked 61 st on the top500 list in November 2020 1 ). A number of schemes exist to more efficiently explore the phase space; enhanced sampling and biasing methods such as replica exchange MD [33], accelerated MD [34], umbrella sampling [35] and steered MD [36] to name a few. In steered MD for example, an external force, represented by a new term in the potential, is applied to a group of atoms.
Atomistic MD simulations are particularly useful to describe interactions between the protein residues and the interfacial lipid groups [2,3,[37][38][39][40]. Moreover, the improvements brought to atomistic force fields in the last decade have among enabled simulations of a wider array of lipids [41]. Yet capturing the spontaneous membrane association of some PMPs is elusive with conventional atomistic MD simulations. One of the reasons is the slow lateral lipid diffusion (D) which is typically on the order of 10 −8 cm 2 .s −1 at ambient temperature [39,42]. Therefore, during a regular atomistic MD simulation (hundreds of nanosecond to microseconds), the bilayer may appear relatively static preventing efficient sampling of lipid-PMP interactions. In order to overcome this problem, different membrane models can be used. There exist indeed a variety of membrane models amenable to simulations. It is common to use a multiscale strategy taking advantage of the computational efficiency of the earliest membrane models, and of the more recent improved accuracy of lipids force fields. The first membrane model we are aware of dates back to 1988 and was used by Edholm and Jähnig to simulate a transmembrane helix from glycophorin [43]. They could simulate one time-step in 0.3 s on a CRAY-1 supercomputer. That model and the following ones represent the membrane as a hydrophobic slab with a polar layer on either sides and are meant to compute the energetic of solvation of membrane proteins in or at the surface of membranes. In essence, coarse-grained (CG) models are somewhat related to the spirit of the implicit models as they too are parameterized against partition coefficients, but CG models present the clear advantage of treating the lipids as individual molecules. CG models are computationally efficient but lack the ability to describe the fine details of protein-lipid interactions that might be needed to explain selective protein-membrane binding. For this, one needs atomistic force fields [41,44].
Biological membranes consist mainly of glycerophospholipids but contain also other molecules, including cholesterol, proteins, etc. [45]. Lipid bilayers form a complex chemical environment, with a hydrophobic core and a polar interfacial region, and their physico-chemical and mechanical properties influence, and are influenced by the membrane-associated proteins. Accounting for the chemical complexity of the interfacial region, lipid diffusion, lateral pressure, curvature stress, electrostatic properties (transmembrane, surface and dipole potentials) and more, is extremely challenging for models that also need to be computationally efficient. In what follows we describe membrane models relevant for simulations of peripheral membrane proteins and how the different levels of resolution of these models are suited to address different aspects of the mechanism of peripheral membrane binding of proteins. We illustrate the use of these methods with a number of selected examples from which peptides and transmembrane proteins are excluded. The latter have been extensively reviewed elsewhere recently [27,46,47].

Atomistic models
The main atomistic force fields for MD simulations of lipid bilayers are CHARMM, AMBER, OPLS-AA, Slipids, and GROMOS. They are derived from atomistic force fields for proteins and other biomolecules and represent lipid molecules in all-atom (AA) detail or in a united-atom (UA) fashion where hydrogens on non-polar chemical groups are not explicitly represented, thus speeding up the simulations. The reader is referred to e.g. Refs. [41,44] and references therein for a detailed account of the development efforts. Here, we briefly overview the most popular AA and UA force fields [48], including the atomistic Highly Mobile Membrane-Mimetic (HMMM) [49].

All-atoms force fields
The CHARMM (Chemistry at HARvard Macromolecular Mechanics) force field for lipids is widely used in simulations of lipid bilayers, and compatible with the CHARMM force field for proteins [31,32,50]. CHARMM describes Figure 3. Phenol-tetramethylammonium potential energy surface obtained with quantum mechanics calculations, the CHARMM36 force field and the CHARMM-WYF modification of the Lennard-Jones potential. Reprinted from [56], with permission from ACS Publications. hydrogen atoms explicitly. The CHARMM force field for lipids has been continuously improved from the initial CHARMM22 lipid parameter set [51,52] developed in the 1990s to the most recent update CHARMM36 [30]. Briefly CHARMM27 [53] and CHARMM27r [54,55] suffered from a number of limitations, including too high torsional barriers along the lipid tails (C27), overstabilization of gauche states (C27), and the tendency of saturated phosphatidylcholine bilayers to transition into the gel phase above the experimental phase transition temperature (C27r). This could be avoided by applying surface tension. Extensive parameterization efforts of the lipid parameter set have led to CHARMM36 which can be used without the need for an external surface tension, and faithfully reproduce the physico-chemical properties of a wide variety of lipid bilayers including glycerophospholipids (saturated, unsaturated, branched, containing cycles) with various headgroups (phosphatidylcholine PC, phosphatidylacid PA, phosphatidylglycerols PG, phosphatidylinositol PI, phosphatidylserine PS), glycolipids, lipopolysaccharides, sphingolipids, ceramides, and a range of sterols. Noteworthy are also our efforts to improve the description of cationπ interactions between choline heads and aromatic amino acids [56,57] observed between PC lipids and tyrosine or tryptophane residues of several peripheral protein [2,3,21,58,59]; modifications of the CHARMM36 Lennard-Jones potential between choline and aromatics resulted in a better agreement for cation-pi interactions (see Figure 3). The parameters are available in the CHARMM-WYF parameter set.
AMBER (Assisted Model Building with Energy Refinement), like CHARMM, treats all hydrogen atoms explicitly [60]. Compared to the CHARMM force fields, AMBER force fields were generally less used for membrane protein simulations due to the lack of a specific parameter set for lipids [61]. The general AMBER force field, known as GAFF (Generalized Amber Force Field) [62] and meant for simulation of arbitrary organic molecules was extended to include lipid parameters in 2007 [63]. This parameter set suffered from some of the same limitations as the CHARMM27 parameter set. In 2012, two somewhat improved force fields were released; GAFFlipid [64] and Lipid11 [65] (Lipid11 was released with AMBER12). Subsequent efforts focusing on head and tail groups charges, Lennard-Jones and dihedral angles parameters resulted in 2014 in Lipid14 [66] a modular lipid force field for tensionless lipid phospholipid simulations and compatible with the AMBER protein force field. Slipids [67,68] (Stockholm Lipids) is another force field compatible with AMBER, but it originates from efforts [69,70] to improve the CHARMM27r parameter set. The Slipids force field is available for saturated and unsaturated glycerophospholipids (PC, PS, PE, PG), sphingolipids (sphingomyelin), and cholesterol.
The OPLS-AA (Optimized Parameters for Liquid Simulations All Atom) [71] force field was developed with emphasis on reproducing the experimental thermodynamical and partitioning properties of short chain hydrocarbons alkanes, and now includes parameters for several classes of biomolecules. It was recently extended to better model long chain hydrocarbons [72], and lead to a parameter set called L-OPLS that has expanded its repertoire of lipids to include PC, PE, unsaturated lipids and cholesterol [73], as well as plasmalogens and PE headgroups [74]. Yet, so far, the availability of lipids in the L-OPLS force field has not been as diverse as that of the CHARMM36 and the AMBER-compatible force fields.

United-atoms force fields
Compared to the AA force fields for lipids, the united-atom model claimed to provide a better balance between accuracy and computational cost in membrane modelling [48]. Among currently available UA lipid force fields, GROMOS (GROningen MOlecular Simulation package) [75][76][77] has the most popular ones and offers the largest variety of lipids. In the GROMOS force fields, nonpolar CH, CH 2 , and CH 3 groups of hydrocarbons are considered as a single particle. The protein force field is parameterized against thermodynamic properties of aliphatic chains to enthalpies and free energies of solvation in polar and apolar environments [78]. There are two main classes of GROMOS force field: those with the original GROMOS non-bonded parameters [77,79] and the ones with Berger modifications [80,81]. In the Berger modifications, in addition to the modification of non-bonded interaction parameters, a Ryckaert-Bellemans potential (cos n (ψ)) is used to describe torsions of the hydrocarbon chains [77,82]. The Berger lipid parameters are recommended to maximize the lipid bilayer sampling due to its fast diffusion and good simulation efficiency [83]. Although Berger lipid is one of the most popular lipid force fields the combination of Berger/GROMOS force fields overestimates the protein lipid interactions and consequently results in drastic changes of lipid properties upon protein insertion [84].

Highly Mobile Membrane-Mimetic Model
HMMM [49,86] uses an AA force field, namely CHARMM36 [30], but represents the lipid bilayer with short-tailed (ST) surfactant-like lipids and a liquid phase for the membrane core, thus increasing lateral lipid diffusion.
In 2011, Arcario et al., used a biphasic solvent model to simulate the membrane insertion of a coagulation protein, namely the GLA domain of human protein C [86]. MD simulations with a full tail (FT) atomistic lipid bilayer did not lead to spontaneous binding of the GLA domain, and one had to resort to steered MD [36] for the protein to anchor at the bilayer interface [87]. With the biphasic solvent model though, the protein did bind the bilayer spontaneously within the early nanoseconds of simulation at a similar depth of anchoring and orientation similar to that obtained with the enhanced sampling simulation. In this model, the lipid bilayer was replaced by a single layer of 1,1-dichloroethane (DCLE).
In 2012, the biphasic solvent was enhanced by adding two extra layers of surfactant-like ST lipids on each side of the organic solvent phase (Figure 4) [49]. In this updated HMMM membrane representation, the bilayer core is represented by a DCLE layer, and the surfactant-like molecule is divalerylphosphatidylserine (DVPS), a saturated ST lipid with five acyl tail carbons. A mixture of DVPS with DCLE forms a bilayer-like structure within 20 ns in a MD simulation, and the lateral diffusion of this membrane model is five times higher than a FT model composed of 1,2-dioleoyl-sn-glycero-3-phospho-L-serine. The performance of the HMMM representation to model the spontaneous insertion of a PMP was assessed by 10 simulations (replicas) of a GLA domain that all converge toward the conformation observed in steered MD [87].
The permeability of the HMMM membrane was validated using the degree of hydration and ions penetration and showed good agreement with AA FT lipid bilayers [49]. Using potential of mean force (PMF) calculations, Pogorelov et al [88]. assessed the free energy of transfer of sidechain analogs of ten selected amino acids (I, A, Y, W, E, N, S, C, D and K) to a HMMM membrane. They reported good agreement with existing experimental, atomistic and coarse-grained free energy profiles although they did highlight some differences. First, D was slightly more stable in the DCLE layer than it would be between FT lipid tails. Aromatic residues (F, Y and W) diverge from the reference in the tail and DCLE region and especially so for TRP; their stability in the membrane core is overestimated by the HMMM PMFs. Most noticeably, HMMM fails to capture the minimum in the interfacial region with positive PMFs for ASN and SER, while the CYS analog has its minimum in the membrane core. The deviations from reference PMFs in the core region are likely due to two factors: the polar nature of DCLE leading to dipole-dipole interactions with polar groups and its fluidity allowing sidechain tumbling in the bilayer core.
HMMM was also used to characterize the membrane-binding site of the catalytic domain of cytochrome P450 3A4 (CYP3A4) [89], an enzyme wellknown for its involvement in xenobiotics and endogenous compound metabolism [90,91]. The simulations revealed reorientations of side chains upon membrane binding, which promote the opening of access tunnels leading to the active site of CYP3A4 Naturally simulations with HMMM cannot account for effects due to the nature of lipid tails, differences between e.g. phospholipids and sphingolipids, or the presence of sterols. Also, the HMMM membrane model is not meant to reproduce mechanical properties (compressibility or bending) of lipid bilayers. The HMMM builder [92,93] in the CHARMM-GUI [94] can be used to generate simulation input files with the HMMM model.

All-atoms/united-atoms MD simulations of lipid transfer proteins
The yeast Osh4 proteins are members of a family of lipid transfer proteins (LTPs) with oxysterol-binding protein (OSBP)-related domain (ORDs) as their lipid transfer domains. Osh4 transfer cholesterol from the endoplasmic reticulum (ER) to the plasma membrane (PM) in yeast cells. The atomistic details of membrane-binding mechanism of Osh4 to charged and uncharged lipid membrane models have been characterized computationally [26,96]. AA MD simulations using CHARMM36 force field on an Anton supercomputer [97] showed that Osh4 binds strongly to anionic membranes in a single binding conformation. The binding event involved contact and interaction from six regions of the protein to the membrane. On the contrary, the Osh4 binds weakly and transiently to uncharged membranes with zwitterionic lipids and consequently does not form a stable binding conformation [26]. The studies also revealed that the phenylalanine loop is one of the most important binding regions of Osh4 stabilizing the protein-membrane interaction [26,96]. Interestingly, an AA MD simulation Figure 5. Schematic atomistic representation of the α-tocopherol transfer protein interacting with a DOPC/DOPE phospholipid bilayer (grey). The mobile gate (purple) at the entrance of the cavity interacts with the PIP 2 headgroup (red) and the lipids. The neighbouring motif is coloured in blue, and the α-tocopherol molecule bound to the protein in yellow. The head of PIP 2 facilitates anchoring of α-TTP to the membrane. Rendered with 3DProteinImaging [12]. PDB ID: 3W67 [95]. on the oxysterol-binding protein-related protein, Osh6p, also showed that a corresponding loop in Osh6p has a similar critical role for anchoring the protein to anionic membrane. The Osh6p is a yeast LTP that transfers PS from the ER to the PM via PS/phosphatidylinositol 4-phosphate (PI4P) exchange cycles [98]. The Osh6p is a yeast LTP that transfers PS from the ER to the PM via PS/phosphatidylinositol 4-phosphate (PI4P) exchange cycles. The study used the CHARMM36 force field and the GROMACS MD simulation engine and showed that Osh6p reduces its avidity for anionic membranes once it captures PS or PI4P in the lipid-binding pocket. The authors proposed an electrostatic switching mechanism that controls the lipid transfer activity of Osh6p between the ER and the PM.
Cholesteryl ester transfer protein (CETP) is another important transfer protein responsible for transferring cholesteryl esters (CE), triglycerides, and phospholipids (PLs) from high-density lipoproteins (HDL) to lowdensity lipoproteins and very low-density lipoproteins. A multiscale AA and CG MD simulation study of a CETP-lipoprotein complex revealed that CETP binds to the surface of HDL-like lipid droplets through its charged amino acids and tryptophan residues [99]. The binding induces the formation of a small hydrophobic patch on the surface of the droplet and opens a route from the core of the CE lipid droplet to the binding pocket in CETP. This route is caused by a conformational change in CETP alternating between opened and closed states [99]. Another AA MD simulation revealed a different CETP-HDL interaction mechanism with a detailed picture of the interactions between the protein and the components of the HDL [100]. The study reported an upright penetration of CETP into the HDL particle surface, the migration of CE from the HDL core to the CETP opening, and transfer of CE from the opening to the binding pocket of CETP [100]. Both above-mentioned CETP AA simulations used GROMOS53A6 UA force field [76] for the protein, and Berger parameters [80] for lipids. AA MD simulations with the CHARMM force field have also been used to investigate the CE transfer pathway through CETP. The study revealed that the hydrophobic tunnel inside CETP is sufficient to allow a CE molecule to completely transfer through the entire tunnel. The predicted rate and transfer time through the tunnel were comparable with those obtained through physiological measurements [101].
The α-tocopherol transfer protein (α-TTP) is a liver cytosolic transport protein that facilitates the transport of α-tocopherol to liver-secreted plasma lipoproteins and plays an important role in the efficient recycling of plasma vitamin E [102] (see Figure 5). A combination of AA and CG MD study using the AMBER FF99SB with LIPID11 FF and MARTINI CG FF investigated the interaction and the binding of α-TTP to phosphatidylinositol phosphate lipids (PIPs), when the protein was embedded in a lipid bilayer mimicking the plasma membrane composition: DOPC and DOPE lipids plus either a PI(4,5)P 2 or a PI(3,4)P 2 molecule [103]. The simulations revealed that enrichment of membranes with PIP 2 molecules facilitates the anchoring of α-TTP to the membrane surface [103]. The results showed that protein binding occurs by direct interaction of the negatively charged PIP 2 headgroup with the positively charged residues present at the protein near the opening of the ligand-binding cavity. Detailed AA simulations also revealed additional contacts between the protein and membrane. An example of such contacts was the formation of H-bonds between protein and the polar heads of DOPC lipids. These H-bonds were formed after protein anchoring to PIP 2 and could not be observed in CG simulations [103].
Phosphatidylinositol-transfer proteins (PITPs) are LTPs that can catalyze the inter-membrane transfer of PI and PC in vitro [104]. AA MD simulations using OPLS force field [105] have also been employed to study the binding of the mammalian StART-like PI/PC transfer protein PITPα to membrane bilayers [106]. The study revealed that association of PITPα with the membrane is spontaneous and captured the partial extraction of PC from bilayer into the PITPα hydrophobic pocket. Steered MD and umbrella sampling simulations were also employed to study phospholipid loading and unloading, respectively, by PITPα. This work also revealed that the presence of PITPα on the bilayer significantly reduces the free energy of desorption of PI and PC, implying the remarkable role of PITPα in facilitating lipid loading and unloading processes [106].
Fatty acid-binding proteins (FABPs) are another class of LTPs. They can transfer free fatty acids and other detergent-like compounds between cellular compartments. An UA MD simulation using the GROMOS96 force field with the G53A6 parameter set was used to investigate the binding and orientation of the fatty acid-binding protein ReP1-NCXSQ to anionic membrane in low and high ionic strength using Na + and NaCl, respectively, for neutralizing the systems [25]. The results revealed the electrostatic nature of the binding of the protein to the membrane. The authors performed several simulations with different initial configurations (orientations and distances) of protein with respect to the membrane. The study showed that in anionic membranes the protein electric macrodipole alignment follows the electric field generated by the negatively charged lipid interface. They also observed that in the presence of high ionic strength the approach of the protein to the membrane was slower than in low ionic strength, indicating that the effect of the membrane electric field on the orientation was shielded by the presence of the soluble ions. They concluded that these observations show that a combination of dipole-electric field interaction and local interactions influence the binding and orientation of the ReP1-NCXSQ protein in the interface [25].

Coarse grained models
CG models simplify a molecular system by decreasing the number of particles being treated explicitly during the simulations, thus drastically reducing computation time compared to all atoms models. Also, a significant advantage over simpler models, such as implicit models (Cf section 2.3) is the treatment of lipids as individual molecules, allowing for a higher level of resolution in the membrane itself. This is beneficial for simulations of lipid mixtures and realistic models of the complexity and diversity of cell membranes. There exist many CG models for simulations of lipid systems [107][108][109][110][111][112][113]. Not all include a compatible protein model though. We focus here on three CG models that have proven useful to simulate peripheral membrane binding.

MARTINI
The history of the MARTINI force field starts with a model that was meant for the simulations of lipids [114,115]. This model was parameterized to reproduce thermodynamics properties such as oil/water partition coefficients. This top-down approach is more similar in its spirit to the implicit membrane models than to the atomistic force fields, usually parameterized to reproduce structural data. The lipid model was later extended to proteins [116] and has since then been continuously expanding with new capability to become what is currently the most used CG force field for simulations of biomolecules. MARTINI uses on average one bead to represents 4 heavy atoms and their associated hydrogens (4:1 mapping). Smaller beads are also available (3:1. and 2:1 mapping) for particular cases. The beads differ by their polarity or hydrophilicity. There are eighteen types of beads in MARTINI 2.2 [117], and they are sorted in four different groups: Q (charged), P (polar), N (intermediate) and C (apolar). MARTINI 2.2P is a variant with polarized side chains. The latest version of the force field, MARTINI 3, counts 29 bead types and 7 groups, adding groups for halocompounds (X), divalent ions (D) and a separate group for water (W). Interactions between beads are represented by a Lennard-Jones potential and parameterized based on thermodynamics data. Non-bonded interactions are short-ranged with a cut-off at 11 Å. Local geometry is maintained through bonded terms parameterized on atomistic simulation data. In addition, elastic network models should be used to maintain the packing of large biomolecules such as protein or DNA [118]. MARTINI-Dry is a version with an implicit solvation model [119].
With ca. 4 times less particles than an atomistic simulation and shorterrange non-bonded potentials, there are fewer interactions to compute leading to the high computational efficiency of MARTINI. Time-steps for simulation can be of 20-30 fs instead of 1-4 fs for an atomistic simulation.
The coarse graining implies a number of limitations one should be aware of, including the non-directionality of H-bonds, missing conformational entropy (but reduced enthalpy as a compensation), and the restricted flexibility of biomolecules. Timescale of simulated events, and therefore kinetics information, should be interpreted carefully as the atomistic frictions are absent. Driving forces should also be interpreted with care, and temperature-dependence cannot be correctly reproduced. Electrostatic screening is implicit as the dielectric constant is the same whatever the polarity of the media is (ε r = 15), though there is a workaround with the polarizable water model; two charges are added to the water bead [120].
Although the top-down thermodynamics-based parameterization of MARTINI reminds of the parameterization of some implicit membrane models, the level of resolution of interfacial protein-lipid interactions is higher. Comparing MARTINI 2.2P CG simulations with MD using AA CHARMM36, we showed that the CG simulations successfully reproduce the IBS, depth of anchorage and orientation of seven different PMPs [121]. In combination with AA simulations, MARTINI is a powerful membrane model to predict IBS, shed light on binding mechanism and map protein- lipid interactions, as illustrated for example by the study from Charlier et al. on interfacial binding of the HIV-1 matrix protein [122], or the study by Koivuniemi et al. of a cholesteryl ester transfer protein [99]. MARTINI has also been used for simulations of the GRP1 PH domain which, combined to umbrella sampling PMFs, were used to evaluate the free energy of binding of the PH domain to phosphatidylinositol phosphates-containing bilayers [123].

Residue-Based Coarse Grained model
The Residue-Based Coarse Grained model (RBCG) [124] is a protein CG model which can be used with the MARTINI CG lipids [114,115]. RBCG uses the same lipid mapping but for proteins, the RBCG graining is coarser than MARTINI models with only two beads per amino acids: one for the backbone and one for the sidechain. The protein RBCG model counts 20 types of beads (19 side chains beads and one backbone bead) sorted in one of the 4 classes used for the MARTINI models [114,115]. RBCG is available for simulations with the NAMD simulation package [29] and a plugin in VMD [125] can be used to generate a RBCG system. While the integration timestep can be chosen between 30 and 50 fs for simulations of lipid bilayers, a 25 fs time step has been used for protein-containing systems [124]. To maintain the tertiary structure, a harmonic potential with a force constant of 5 kcal.mol −1 .Å 2 is added between the backbone beads that are not bonded otherwise. This was parametrized against a 20 ns atomistic simulation of the BAR peripheral domain [126]. For simulations of BAR domains, the authors found that a dielectric constant ε r = 1 resulted in the best agreement between all-atom and RBCG simulations, even though higher values of dielectric constant are used in other studies [114,124,127].

Shape-Based Coarse Grained model
The Shape-Based Coarse Grained (SBCG) model is a coarser model than RBCG and aims to simulate the movements of proteins assemblies from large-scale interactions. While its first application was to viral capsids [128] and bacterial flagellum [129], it has also proven useful to study membrane curvature induction by the BAR domains [126,130].
The graining level can be adjusted and various mapping have been used so far: from 150 atoms per beads [126,128,130] and up to 500 atoms per beads [129]. To faithfully model the shape of the protein, the Topology Representing Network is used to position the beads [131][132][133]. The atombeads mapping is then based on a Voronoi diagram, and the charge and mass of the beads are equal to the total charge and mass of the Voronoi cell. The topology of the system during simulations is maintained by a harmonic distance restraint between beads. Obviously, this type of coarse graining prevents conformational changes and protein refolding [128].
Conversely to the MARTINI or RBCG models, the lipid graining is not done for individual lipids. Rather, a SBCG lipid represents several lipids and consists of two beads (one for the heads and one for the tails) connected by a harmonic bond. For DOPC, one SBCG lipid represents on average 2.2 lipid molecules. The solvent is treated implicitly with a Langevin term to account for water viscosity, but there are beads for Na + and Cl − counterions. While a dielectric constant of 80 was used for simulations of viral capsid systems [128] and flagellum [129], ε r was set to 1 for simulations of a peripheral BAR domain [126,130] for the same reason as those invoked for RBCG.
An integration time step of 100 fs can be used [126,130] and SBCG topology systems can be generated using VMD [125].

Coarse grained modelling of sensing and induction of membrane curvature by BAR domains
Thanks to their computational efficiency and the fact that they account for individual lipids, CG force fields have been the method of choice to model membrane remodelling events. BIN/amphiphysin/Rvs (BAR) domains are long, mostly helical dimeric PMPs that can vary in length, curvature, and membrane affinity depending on the BAR classification [137]. N-BAR has the highest positive curvature and can form tubules that are 20 to 60 nm wide [138,139]. F-BAR has the lowest positive curvature and forms wider tubules (60 to 100 nm) [138], and I-BAR has a negative curvature generating membrane protrusions [140] (Figure 6). A given BAR domain protein can either act on the membrane shape or sense the curvature of the membrane depending mainly on two parameters: local protein concentration and membrane curvature [137,141,142].
In 2008, Arkhipov et al. studied the membrane sculpting by an N-BAR domain using methods covering four different levels of resolution [126]: AA simulations with a single BAR domain protein, RBCG simulations with 6 BAR domains to explore the local and global curvature of the membrane, longer SBCG simulations to model the global membrane curvature with 6 BAR domains, and a custom continuum elastic membrane model. The authors evaluated the effect of two different arrangements of 6 N-BAR domains (aligned or staggered [139]) over an initially planar membrane. The use of RBCG reduced the number of particles to be simulated down to ~200,000, and tõ 4 000 particles with SBCG instead of several million atoms with an atomistic model, thus allowing extremely long simulations (up to 5 µs in SBCG). While the staggered arrangement in RBCG led to a global membrane curvature (curvature radius ~400 Å) comparable to experimentally observed values [143], the aligned arrangement led to the highest local membrane curvature (~150 Å) under each BAR-domains. Interestingly, the SBCG simulations also showed the formation of a global curvature for the aligned system after 1 µs but with a lower curvature radius (up to 1000 Å).
In 2013, the same team reported a spectacular investigation of the membrane tubulation process by the F-BAR domain [144] using more than 30 SBCG simulations of 3 µs each to find the F-BAR lattice yielding the highest membrane curvature. Lattices with lower densities produced smaller membrane curvature and denser lattices achieve higher curvature. However, at very high density (over 10 dimers per 1000 nm 2 ), the F-BARs dimers hindered each other access to the membrane yielding a lower curvature.
Using the lattice that achieved the highest curvature radius as a starting point, they increased the number of F-BAR domains to 68 and they performed two 350 µs SBCG simulations with a large patch of lipids ( 380x17 nm). Each system is composed of 21,800 beads, in comparison, the equivalent atomistic model would contain more than 15 million atoms. In both simulations, they could observe that the curvature of the membrane increased from the sides towards the centre to finally form a tubular structure with a radius of 60-90 nm.
For very large systems like the ones described above, the coarseness level of the SBCG model allows simulations at time-scales that cannot be reached with 'finer' CG force fields, even with the most recent HPC technology or GPU scaling [145][146][147][148][149]. In 2019, Iqbal Mahmood et al. reported simulations of the F-BAR domain with the MARTINI model reaching 2 µs for a 14 F-BAR-membrane system and 10 µs for a system with 6 F-BAR plus membrane [150] to study their capacity to curve the membrane or sense its curvature. The advantage of using a finer-grained CG was the need to take into account the flexibility of the F-BAR domains, which the SBCG does not account for. Since the SBCG method does not allow intrinsic conformational change within the F-BAR domain [144], this recent study emphasized the flexibility of the F-BAR domain to match a variety of membrane curvature. They could show that F-BAR dimers assemble to bend the membrane collectively, but most interestingly, their simulation revealed the F-BAR curvature sensing property. It appears that F-BAR prefers a curvature that is consistent with the experimental curvature of striated tubes (53 ± 18 nm) and pearling vesicles (35 ± 5 nm) [151].

Implicit models
The first membrane implicit models were developed early [43,[152][153][154] to model the stability of transmembrane segments or the complexity of the water-lipid interface when computational resources did not allow atomistic simulations of lipid bilayers. Some implicit membrane models were typically building on implicit solvent models [155][156][157].
Implicit models typically approximate membranes by a hydrophobic slab surrounded by a polar environment while the protein remains described by an atomistic model. Protein-lipid interactions are modelled by a mean-field approach. Because of their computational efficiency implicit membrane models are particularly useful in cases where extensive sampling is needed, such as simulating landing of proteins on membrane surfaces, IBS prediction, free energy calculations, or whenever one needs multiple simulation of a system to probe the effect of a particular parameter in the simulation.
Implicit membrane models have been extensively used for modelling the folding and/or insertion of peptides in membranes [158][159][160] or the structure and stability of transmembrane segments [161]. However, they have been used relatively less for simulations of peripheral proteins. Yet, and despite their simplicity, they have proven useful to predict interfacialbinding sites and identify hot-spots by characterizing the energetic contributions of individual amino acids to the overall free energy of transfer from water to membrane, at a fraction of the cost of atomistic or CG simulations.
Here, we distinguish between two types of implicit methods; those using continuum dielectric theory, such as the Poisson-Boltzman (PB) theorybased models and those using an empirical approach based on residuespecific transfer-free energies from water to apolar solvents, such as the Implicit Membrane Model 1 (IMM1) and present examples of their utilisation for the modelling of peripheral membrane proteins. We refer the reader to the excellent review articles by Grossfield [162] and Feig [163] for detailed descriptions of the underlying theories.

Models based on the Poisson-Boltzmann and Generalized Born theories
For a solute consisting of a set of spherical particles carrying charges (e.g. a protein) embedded in a low dielectric environment the resolution of the PB equation provides a rigorous framework to calculate the electrostatic contribution to the free energy of solvation (see e.g. [164,165].). Methods based on these theories have naturally been applied to evaluate the electrostatic-free energy of solvation of proteins or peptides in membranes [166]. The solvation-free energy is actually addressed by solving this nonlinear partial differential equation for the different states. However, the full PB equation being a non-linear differential equation, its resolution has to be done numerically and is computationally expensive. Even the linearized PB equation, which is valid for weak electrostatic potentials is still fairly computationally intensive, at least compared to other implicit models. There exist efficient tools numerically solving the PB equation for biomolecular systems, such as APBS [167] or DELPHI [168,169], among others. Yet, because of the computational costs and the challenges linked to rigorous calculations of the derivatives, the PB theory has been applied mainly to one or a few structures and is not tractable for use in molecular simulations. Yet it can be useful when applied as postprocessing of selected trajectory frames. Note that non-polar interactions to account for van der Waals solute-solvent interactions and cavity creation (solventsolvent) can be added using a term depending on the solvent-accessible surface of the protein. A particularly interesting application relevant for this review is the evaluation of the electrostatic-free energy of interaction between lipid bilayers and peripheral proteins [20,38,170,171]. This can be calculated as the difference between the electrostatic-free energy of the protein-membrane complex and the sum of the electrostatic-free energies of the protein and membrane infinitely far apart from each other. In this case the membrane lipids and protein are the solute -and treated with an atomistic model -while water and ions are treated by the continuum dielectric. Mulgrew-Nesbitt et al. have dissected the contributions of positively charged amino acids to the interaction of proteins with membrane surfaces, and evaluated the role of electrostatics for a large number of such proteins [7], and found qualitatively good agreement with experimental data. Still, it is important to remember that this type of calculation has limitations pertaining mostly to the static character of the protein and lipid bilayers and the rigidity of the protein and bilayer. The membrane surface potential is also badly modelled by these approaches.
An attempt to simplify the complexity of the PB-based computations resulted in the development of popular Generalized-Born (GB) models. The GB models are based on the same principles, i.e. a description of the solvent or membrane as a continuum model, but use specific simplifying assumptions for the solvation-free energy to allow a closed-form approximated solution to be obtained. The GB formulation allows the use of analytical derivatives and is tractable for MD simulations, but the choice of method for the computation of the Born radii is decisive for the efficiency of the method (Cf [162,163]. and references therein). A rising interest in modelling of membrane proteins resulted in a number of adaptation and modification of GB methods to represent multiple dielectric environments and successfully applied to membranes [172][173][174].
GBSW, Generalized Born with a switching function [175], has been applied to describe the interactions between diacylglycerol derivatives and the C1b domain of protein kinase C δ (PKCδ), which are dependent of the interactions of PKCδ C1 with the membrane. The DAG analogues were docked in the C1 domain and the complexes were subjected to molecular simulations in the presence of a membrane modeled with GBSW to describe the ligands binding mode [176].
GBSW-GCS includes a Gouy Chapman Stern term to describe anionic lipids [177]. It was successfully used to predict the binding orientation of the DNA-repair protein RecA to anionic membranes, calculate binding free energies and correctly predict mutations weakening the affinity of RecA for anionic membranes [178]. The orientations obtained with the GBSW-CGS membrane model were subsequently used as starting structures in atomistic simulations [178,179].

Implicit Membrane Model 1
IMM1 [180] and its extension IMM1-GC [181] to charged membranes are implemented in CHARMM [182] and compatible with the CHARMM19 force field. IMM1 and IMM1-GC are models for simulations of proteins in zwitterionic and anionic membranes, respectively. IMM1 is based on the EEF1 (effective energy function) model for water-soluble proteins, which uses a linear distance-dependent dielectric constant [155]. The ionic sidechains are neutralized and a Gaussian solvent exclusion term is added to the energy function of the CHARMM19 force field [183].
IMM1-GC extends IMM1 using the Gouy-Chapman theory [184] for the diffuse electrical double layer and thus allows for interactions between a charged bilayer and the amino acids of the protein to be accounted for. The model has been extended to include a transmembrane voltage potential [185], a dipole potential for symmetric bilayers [186], the description of pores for transmembrane proteins with internal aqueous pores [159,187]. The IMM1 model can also model the effect of lateral pressure and curvature stress in flat [188] as well as in spherical and tubular membranes [189]. Mori et al. developed an implicit micelle model called IMIC [190] based on IMM1 where a superellipsoid function was used to model the micelle shape. They could show that the model faithfully reproduced structure and dynamics of transmembrane proteins obtained from AA MD simulations.
Simulations using IMM1 as membrane model have been used to predict the membrane-binding site of a human membrane-bound neutrophil protease called Proteinase 3 (PR3) [191], and showed that the hydrophobic insertion of aromatic amino acids contributes significantly to the free energy of transfer of Proteinase 3 to zwitterionic bilayers [40]. This was unlike its homologue the human Neutrophil Elastase (NE) that, according to simulations, inserted fewer hydrophobic amino acids. The modelling results concurred with subsequent Surface Plasmon Resonance experiments showing that the affinity of NE for POPC bilayers was lower than that of PR3 and driven mostly by electrostatics interactions. IMM1-GC was also used to identify the amino acids that are key to the protein-membrane interaction; those were verified experimentally using site-directed mutagenesis and protein expression in cell lines [192].
Implicit models such as IMM1 are also very useful to generate starting structures for atomistic simulations of peripheral binders. Simulations with IMM1-GC were used to predict the orientation of Bacillus Thurigiensis phosphatidylinositol-specific phospholipase C which was used as starting structure for atomistic simulations [3]. We also used IMM1 to predict the orientation and depth of anchorage of the C-terminus helices of the N-terminal acetyltransferase Naa60. We could identify the amino acids contributing the most to the free energy. The predicted orientation was used to initiate atomistic simulations while the predicted hot-spots for membrane binding were used to design mutants that were tested in vitro [37]. More recently, Nepal et al.reported an innovative study of the ESCRT III subunit Snf7 where they could compare the energetics of the binding of monomers and oligomers of the subunit Snf7 to flat and negatively curved membranes. By doing this they could predict which curvature yields strongest binding [193].
Implicit models like IMM1 are extremely computationally efficient and provide information about the thermodynamics of protein-membrane interactions, which can be helpful to identify interactions hotspots and design protein variants that can be further tested experimentally. That being said, and because implicit models do not explicitly describe interactions between lipids and amino acids but rather rely on a model where longrange electrostatics and the hydrophobic effect drive membrane binding, they are not the method of choice for proteins with fine-tuned lipid specificities or lipid-binding cavities.

Conclusion and perspectives
As stated in the introduction section, there exists ample structural data of PMPs but their membrane-bound state remains an elusive molecular assembly for structural biology techniques. It results in a lack of experimental data on the membrane-bound PMPs and leaves unanswered questions about the exact location of the IBS on the PMP structures, the amino acids key to the protein-lipid interactions (hot-spots), protein structural changes upon membrane binding, or structural changes to the membrane to name a few.
As demonstrated by the many examples referred to in section 2 of this review, MD simulations have proven extremely useful to address those questions and more. Indeed, molecular simulations have shed light on the dynamics of many PMPs and the thermodynamics of the protein-membrane interactions providing unvaluable mechanistic insights, something MD simulations are particularly good at, irrespective of the type of protein.
A drawback with MD simulations though is the associated computational cost, scaling as a function of N2 where N is the number of particles simulated.
Molecular simulations of PMPs thus call for models representing membranes at different levels of resolution and many studies combine several models: implicit and atomistic, CG and atomistic, different CG levels, etc. There is no golden rule for which model(s) can be combined with which other(s) as long as the scales covered by the different models are relevant for the time and size scales of the system under investigation. In general, multiscale strategies should take full advantage of the wide range of models available.
In addition, different types of simulations and analyses may be needed. Simulations at different time-scales are used to address the variety of questions asked about PMPs -from HMMM or implicit simulations to predict the IBS and obtain a good model of the membrane-bound state, to atomistic equilibrium simulations of a few 100 ns for predicting lipid-protein interactions in the membrane-bound state, potentially followed by FEP calculations to characterize hot-spot residues. Sometimes microsecond-long simulations are needed for the association of a large PMP with a complex bilayer, and the subsequent membrane remodelling. Moreover, the sizes of the proteins or protein assemblies may vary from a few hundred amino acids to many thousands. Likewise, the dimensions and complexity of the lipid bilayer needed to have a realistic membrane model vary tremendously from one PMP to another.
In this review, we have described membrane models from those with the highest level of details -all atoms force fields with full description of the lipids and other membrane components -to the simplest ones, the implicit membrane models. We have illustrated what the different models are most adequate for using examples of the literature. It is important to note that the differences between models are not only relevant for their computational efficiency, but also for the type of information they can provide depending on which reference data they are parameterized against.
The emerging complexity of biological membranes is pointing at a more complex picture for PMP-membrane recognition mechanisms than the old model based on a balance between unspecific long-range electrostatics and hydrophobic insertion. There are large variations in the mechanisms employed by PMPs to achieve membrane binding and lipid specificity, and we are just starting to understand the subtle interplay between PMPs and membrane lipids. While the value of the low-resolution models is clear and undisputed given nowadays computational capabilities, high-resolution models are and will remain crucial to uncover the details of PMP-membrane interactions. Emerging methods that can reduce the computational costs of MD simulations that represent proteins and lipids at -at least -the atomistic level of details thus hold promises for future larger scale MD simulations of PMPs [194,195].