The Kramers turnover in terms of a macro-state projection on phase space

We analyze how Langevin dynamics is affected by the friction coefficient using an invariant subspace projection of the associated Koopman operator. This provides the friction-dependent metastable macro-states of the dynamical system as well as the transition rates in the entire phase space. We used the algorithm ISOKANN for a wide range of friction coefficient values and reproduced results consistent with the Kramers turnover.


Introduction
The dynamics of high-dimensional chemical systems can be modeled as onedimensional Langevin dynamics governed by the stochastic differential equation where q t and p t denote the position and the momentum of the system at time t on a one-dimensional relevant coordinate, V : R → R is the potential of mean force acting on the system along the relevant coordinate, and m is the effective mass of the system, which represents how much inertia the system has along the relevant coordinate.The system is coupled to a thermostat at temperature T , with Boltzmann constant k B , through the friction coefficient γ and the stochastic force η t defined as Gaussian white noise with ⟨η t ⟩ = 0 and ⟨η t , η t ′ ⟩ = δ(t − t ′ ), where δ(t − t ′ ) is the Dirac delta function.
where the energy function V (q) presents minima and maxima with energy barriers that exceed the thermal energy k B T .
In this context, a fundamental problem is the calculation of transition rates between potential minima, more precise, between macro-states of the system.Indeed, transitions between the macro-states represent the most interesting biochemical processes in many applications, e.g. the folding of an amino acid chain or the binding/unbinding process between a receptor and a ligand.However, they are rare events, i.e. they occur on very large time scales compared to reference time scales such as the oscillation times of hydrogen atoms.Consequently, they are difficult to simulate and analyze, e.g. by means of Molecular Dynamics (MD) simulations.
Over the last century and a half, various theories and methods have been developed to solve this problem and find analytical solutions for calculating rates.The first approximate formula of the problem dates back to Arrhenius, who derived in 1884 [1] the proportionality equation where k denotes the escape rate, β = 1 /k B T is the inverse of the thermal energy, and E b is the height of the energy barrier, also known as activation energy of the reaction.Later, further theories were developed that apply to different contexts of chemistry and physics.Particularly noteworthy is the work of Kramers, who in 1940 [2] studied transition rates for one-dimensional systems driven by the Langevin equation and derived three formulas that apply to low, moderate and high friction regimes.The three formulas well reproduce the so called Kramers turnover, a curve describing the transition rate as a function of γ: the transition rate is linear with the coefficient γ at low friction, then, having reached a plateau, the rate decays inversely to γ.However, Kramers' theory remains incomplete in some aspects that were only later resolved.For example, Langer derived a formula for multidimensional systems that operates in high friction regime [3], Chandler derived a formula that takes into account non-Markovian effects [4], Mel'nikov and Meshkov have found an expression that improves the prediction in the transition from low to moderate friction [5], and, Pollak, Grabert and Hänggi found a single expression that covers the entire friction range using a normal mode approach [6].These and other methods, which we do not mention for the sake of brevity, fall into the category of model-based methods, i.e based on the physical model of the system under investigation.
In this paper, we study the dependence of Langevin dynamics on the friction coefficient γ using its representation in terms of the Koopman operator [7,8], which allows to transform the nonlinear problem defined in eq. 1 into a linear problem.The price of this is that the finite-dimensional dynamics in phase space is transformed into an infinite-dimensional problem in the space of observable functions [9,10].For this reason, we seek invariant subspaces of the Koopman operator with finite dimensions.We use the ISOKANN algorithm [11], a data-driven method that identifies membership functions that constitute a basis of an invariant subspace of the Koopman operator preserving the Markovianity of the projected process.The peculiarity of ISOKANN is that it does not require the identification and the discretization of reaction coordinates, instead, membership functions can be estimated on states of the full space by means of machine learning techniques such as neural networks, overcoming the problem of the curse of dimensionality.
Membership functions are a generalization of ordinary crisp sets and characterize the metastable macro-states of the system preserving the time scales of the micro sys-tem when projected onto the macro-states [12][13][14].Using ISOKANN, we estimated the phase space membership functions and calculated the rates, which represent transitions on the phase space.In this way, we reproduced a rate curve as a function of friction, which is analogous to the Kramers turnover.However, our results show that in low and moderate friction regimes, the rates in position space are an approximation due to the loss of Markovianity of the dynamics defined in eq. 1.To obtain a correct representation of dynamics, even in low and moderate friction, it is necessary to take momenta into account.Thus, with this work we intend to open up a new perspective in reaction rate theory.This is possible through the use of increasingly advanced machine learning techniques, such as ISOKANN, which allows for the estimation of rates as functions of the entire phase space, preventing errors induced by discretization or dimensionality reduction.

Theory
We briefly introduce the operator theory that is needed to project Langevin dynamics onto macro-states [15].

Transfer operator approach
The dynamics of a stochastic process solution of the Langevin equation defined in eq. 1 is equivalently described by the dynamics of the time-dependent probability density ρ t (x) solution of the partial differential equation where the operator Q * defines the Fokker-Planck equation, or forward Kolmogorov equation, and x = (q, p) ∈ Γ ⊂ R 2 represents the state of the system in the phase space.The solution of eq. 3 is formally written as where P τ denotes the propagator of probability densities with stationary density defined by the Boltzmann distribution where Z is a normalization constant.
Besides considering the evolution of probability densities, it is useful to study the evolution of observable functions f (x).To this end, we introduce the infinitesimal gen-erator Q, adjoint of the operator Q * that defines the backward Kolmogorov equation Analogously to eq. 5, we can write a formal solution of eq. 8 as where we introduced the Koopman operator K τ which propagates the expectation value of observable functions.

Rates from membership functions
Consider the τ -dependent eigenvalues λ τ,i and the associated eigenfunctions Ψ i of the Koopman operator Q such that If the dynamics is ergodic and not periodic, then the first eigenfunction is constant Ψ 1 = 1 and it is associated to the non-degenerate eigenvalue λ τ,1 = 1.In reversible dynamics, the subsequent n c dominant eigenfunctions Ψ = {Ψ 2 , ..., Ψ nc }, associated to sorted and negative eigenvalues λ τ,2 > • • • > λ τ,nc , exhibit positive and negative values, which allows for the identification of metastable macro-states.In the non-reversible case, real-valued functions which span an invariant subspace of the Koopman operator can be applied instead of eigenfunctions.Each point in state space is represented by a vector which comprises of the values of these finitely many (n c ) functions.These points can be mapped into a (n c − 1)-simplex whose vertices represent the metastable states whereas the edges represent the transitions.The algorithm PCCA+ [12,13], by means of a linear transformation, transforms the simplex into a standard simplex, i.e. a simplex whose vertices are unit vectors.Accordingly, the set of dominant eigenfunctions is transformed into a set of membership functions χ = (χ 1 , χ 2 , . . ., χ nc ) ⊤ , with χ i : Γ → [0, 1], ∀i = 1, 2, . . ., n c , such that i χ i = 1.Membership functions characterise the membership of a state x in the macrostates of the system and by exploiting the linearity of the Koopman operator the exit rate from a macrostate is estimated as where a 1 and a 2 are obtained solving the linear regression problem min a1,a2 For a complete discussion about the χ-exit rates and the derivation of eq. 13, we refer to [16].

ISOKANN
The calculation of rates according to eqs. 13 and 14 requires the membership function χ and the propagated membership function χ t = K τ χ(x).We use ISOKANN [11,17], an iterative algorithm which modifies the Von-Mises-Algorithm [18] as iteration scheme where the initial function f 0 is an arbitrary function and ∥ • ∥ is the supreme norm.
As k → ∞, eq. 15 converges to the first eigenfunction of the Koopman operator: In fact, by applying K τ iteratively to a function f , one obtains the same result as applying the operator with lag time τ tending to infinity, i.e. a constant function.
Consider now a two-metastable system, as the model studied by Kramers, then the Koopman operator has two dominant eigenfunctions Ψ 1 and Ψ 2 , and the membership functions are written as with b 1 and b 2 appropriate coefficients.We introduce a linear transformation S to prevent the convergence of the Koopman operator to Ψ 1 = 1, and retrieve information regarding the eigenfunction Ψ 2 such that Ψ 1 and Ψ 2 span an invariant subspace of the Koopman operator.For this purpose, we choose as S the shift-scale function that guarantees that f k : Γ → [0, 1].The algorithm defined in eq.15 is rewritten as and converges to one of the two membership function: In general, we do not have an analytical representation of the Koopman operator or do not discretize the entire state space to retrieve its matrix representation.However, we can calculate the action of the Koopman operator on observable functions applied to specific states in space Γ.Exploiting the ergodic property, we approximate the expectation in eq.11 as a time average: where x τ,n are the final states of N trajectories, solutions of eq. 1, starting at x 0 = x.Thus, eq.19 is rewritten as Regarding the choice of the initial function f 0 , a wide range of options is available.The function should be an interpolating function that can be trained at each iteration until it converges to one of the membership function.For higher dimensional systems, the use of neural networks is recommended, as was used in ref. [11].However, for lowdimensional systems, other interpolation techniques may be used, e.g.spline functions or radial basis functions.

Results
As an illustrative example, we considered the Langevin dynamics of a fictitious particle of mass m = 1 amu which moves in a one-dimensional potential energy function The function is characterized by two wells with minima at q A = −1 nm and q C = 1 nm, and a height barrier of 10 kJ mol −1 at q B = 0 nm as illustrated in fig.1-(a).
For our numerical experiments, we used standard thermodynamic parameters: the temperature of the system was T = 300 K and the molar Boltzmann constant was k B = 8.314 × 10 −3 kJ K −1 mol −1 .

Classic Kramers turnover
In order to reproduce the classic Kramers turnover, we selected 25 friction coefficient values γ between 0.1 ps −1 and 30.0 ps −1 and we solved the Langevin eq. 1 using the Brünger, Brooks and Karplus (BBK) integrator scheme [19] with an integrator timestep ∆t = 0.005 ps.For each value of γ, we ran 500 simulations starting at the bottom of the left well of the potential with an initial momentum randomly drawn from the Boltzmann distribution.After the particle reached the bottom of the right well, the simulations were stopped and we calculated the mean time, i.e. the Mean First Passage Time (MFPT) ⟨τ f p ⟩ [20], from which we obtained the transition rate as The results of this numerical experiment are reported in fig.2-(a) as black squares.If the friction is very low (γ ≈ 0.1 ps −1 ), the dynamics of the system (eq. 1) is almost deterministic, and the system, unless it has enough initial momentum, is trapped in the well with an extremely low probability of escape.Correspondingly, the MFPT is very large and the value of the rate tends to zero.However, increasing the friction by a small amount (γ ≈ 1.5 ps −1 ) the system gets enough thermal energy through the random force and increases the probability to escape from the well.In fact, for low values of γ, the stochastic force √ 2k B T γm η t , which is weighted by the square root of the friction coefficient, dominates the friction force −γp t , which is linear with the friction coefficient.Thus, we observe a rapid and linear increase in rates up to a maximum of k A→C ≈ 0.02 ps −1 .Beyond the threshold of γ ≈ 1.5 ps −1 , the system enters what is called the moderate friction regime.Here, the friction force dominates the Langevin equation, and the probability of escaping the well, despite the high thermal energy, decays as k A→C ∝ 1 + 1 /γ 2 .For higher values of the friction coefficient (γ > 20 ps −1 ), the friction term is so strong that the average acceleration of the system tends to zero.The dynamics is overdamped and the transition rate decays as k A→C ∝ 1 /γ.
The three friction regimes, here qualitatively described, were formalized by Kramers in 1940 [2].He assumed a two-metastable system governed by the Langevin dynamics with thermal energy k B T ≪ E + b = V (q B ) − V (q A ), so as to ensure metastability.In addition, he required that the left well and the top of the barrier of the potential V (q) are approximated by harmonic potentials with angular frequencies , and Under these conditions, Kramers derived a transition rate formula for the low friction regime (γ < ω B ) for the moderate friction regime (γ > ω B ) and the high friction regime (γ ≫ ω B ) Note that Kramers defines the three regimes by comparing the coefficient of friction with the angular frequencies of the harmonic potentials that approximate the potential.In fact, the transition probability also depends on the curvature near the pit and barrier.The prediction of Kramers' formulas, reported in fig.2-(a), is excellent, it is only around the threshold separating the low and moderate friction regimes that the model becomes inaccurate.For more details about Kramers theory, we recommend Refs.[21,22].

Kramers turnover of membership functions in phase space
In the second numerical experiment, we estimated the transition rates applying ISOKANN to the same setting of the first experiment.We generated 1000 random initial points x 0 = (q 0 , p 0 ) from a uniform distribution over the phase space defined by the q-range [−2.0, 2.0] nm and the p-range [−10.0,10.0] amu nm ps −1 , and for each state we simulated N = 100 trajectories of length τ = 7 ps, corresponding to 1400 timesteps using a timestep integrator ∆t = 0.005 ps.The ISOKANN algorithm has been applied for 20 iterations using multiquadratic radial basis functions (RBF) [23], which are computationally undemanding and only require a few parameters to optimize during training, resulting in faster convergence.We considered two cases: • the membership functions χ A (q) and χ C (q), and the transition rate k χ A →χ C between the macro-states of the position space; • the membership functions χA (q, p) and χC (q, p) defined on the two-dimensional phase space, and the transition rate kχ A →χ C between macro-states of the phase space.Note that from now on, each quantity marked with the symbol • denotes a quantity measured over the entire phase space.
The two rates, as functions of the friction coefficient γ, are reported in fig.2-(b), respectively as blue upside down triangles and red circles.Both curves show a turnover similar to the rate k A→C reported in fig.2-(a): rates have an ascending profile for very low range values, then, having reached the maximum (k χ A →χ C ≈ 0.4 ps −1 and kχ A →χ C ≈ 0.2 ps −1 ), descend slowly.However, while the values of the rate kχ A →χ C in phase space are overlapping with those of the Kramers rate k A→C (although they are different physical quantities), the rate k χ A →χ C defined in position space turns out to be higher in the low friction region but converges to the values of kχ A →χ C in the high friction regime.To understand these results, it is useful to take a look at the membership functions obtained from ISOKANN and shown in fig.3, where figures (d,e,f) on the second row and (g,h,i) on the third row are respectively the membership functions in the phase space and the position space, for low, moderate and high friction.In fig.3-(a) (low friction regime), the membership functions of the macro-states only have significant values for those states whose total energy E = p 2 /2m + V (q) is less than the height of the barrier.The points with a total energy exceeding the barrier are depicted in white, indicating that they have an equal probability of belonging to either χ A or χ C , approximately 0.5.This occurs because trajectories originating from this area undergo periodic oscillations in phase space, visiting both wells as depicted in fig.1-(b) by the blue trajectory.Correspondingly, in fig.3-(d), we show the membership function values as a projection of χA (q, p) and χC (q, p) onto the position space.The apparent noise is due to the fact that the membership functions on phase space are not constant along the axis of momenta.Therefore, when friction is low, the membership values projected to position space are not functions in a strict sense and do not describe position-based macro-states.In fig.3-(b) (moderate friction regime), the membership functions draw concentric spirals that terminate in the minima of phase space respectively.This may seem counterintuitive, but observing how a trajectory behaves in the moderate friction regime helps to interpret the membership functions correctly.In fig.1-(b), the red trajectory starts at position q = −1 nm and momentum p = −8 amu nm ps −1 , and reaches the right-hand minimum by following a clockwise trajectory.Similarly, if we start trajectories that are far from the central region of the phase space, we would observe spiral patterns that match with the membership functions.From this figure, we deduce that in the moderate friction regime, the effective barrier, i.e. the transition region, is not at q = 0, but between the two spirals.In particular, in the central box [−1, 1]×[−5, 5] of the phase space, the barrier corresponds to a diagonal line which is not parallel to the momentum axis.The reason is that if q = 0 and p > 0, the system reaches the right well with low probability of recrossing.Conversely, if q = 0 and p < 0, the system reaches the left region.Along the white diagonal the system is in an unstable equilibrium, i.e. the system has the same probability of reaching one of the wells, and ISOKANN assigns equal probability of membership to the two macro-states.In fig.3-(c) (high friction regime), membership functions are independent from momentum space.The two regions q < 0 and q > 0 are assigned to the macro-states regardless of the momentum and the transition region is almost a vertical line.The projections χ A (q) and χ C (q) onto the position space also appear well defined in fig.3-(f).This occurs because as the friction is very high, the momentum is quickly damped and it does not provide enough energy to overcome the barrier as shown by the green trajectory in fig.1-(b).In the high friction regime, only the thermal noise can provide enough energy to jump over the barrier.

Results validation
In order to validate our results, we constructed a reference solution by means of a grid-based technique similar to Ulam's method [24] which allows to discretize the operator K τ defined in eq.11 into a transition probability matrix K(τ ), cf.[25].Given a discretization of the phase space Γ into M disjoint subsets Γ i , with i = 1, . . ., M , and a set of N simulations of length τ started in a random position of the subset Γ i , then the entries of the matrix K(τ ) are written as where 1 Γ j is the indicator function and x τ i,n is the final state of the nth simulation started in Γ i .In practice, one counts how many times a simulation starting in Γ i ends in Γ j and divides by the number of simulations to obtain an estimation of the transition probability.Afterward, an approximation of the infinitesimal generator, sometimes referred to as pseudogenerator, is obtained as where I denotes an identity matrix of the same size as Kτ .Then the membership functions χ(q, p) are calculated applying PCCA+ to Q and the coarse-grained rate matrix between macro-states is calculated as a Galerkin projection of Q onto the membership functions: In eq.33, diag(π) denotes an M × M diagonal matrix, whose diagonal entries are the entries of the Boltzmann distribution π(q, p) (eq.7) evaluated at the centers of subsets Γ i .Assuming a two-metastable system, the rate matrix Q has size 2 × 2: with q χ A →χ C , q χ C →χ A > 0 representing the transition rates between the macro-states.
For the sole case of a bimetastable system, these rates are equivalent to the exit rates defined in eq. 13.
Here, we discretized the q-range [−2.0, 2.0] nm in 80 intervals of the same length ∆q = 0.05 nm, and the p-range [−10.0,10.0] nm in 70 intervals of the same length ∆p = 0.29 nm.The transition rates estimated by PCCA+ are reported in fig.2-(b) as black squares, while the membership functions are reported in fig.3-(a,b,c).For each subset, we ran 500 simulations of length 7 ps, with an integrator timestep of 0.005 ps for a total of 1400 timesteps.There is excellent agreement between ISOKANN and the method based on the discretization of the phase space: both methods recreate the Kramers turnover and show the same patterns for the membership functions.

Discussion and conclusion
In this article, we studied the effect of the friction coefficient of Langevin dynamics on metastable macro-states of the phase space and calculated the transition rate.
For this purpose, we used the ISOKANN algorithm [11], which identifies macrostates by means of membership functions that form a basis function of an invariant subspace of the Koopman operator.In this subspace, the Koopman operator produces a linear dynamical system of finite dimensions that preserves the Markovianity of Langevin dynamics and can be used to determine kinetic observables such as transition rates.
We investigated a one-dimensional artificial potential, representing a bimetastable system, and reproduced the Kramers turnover.However, differently from the original Kramers work, the transition rate we estimated represents transitions between macrostates in phase space.Our results show that including both the positions and the momenta in defining the macro-states is necessary.Indeed, neglecting the momentum in the low and moderate friction regime introduces non-Markovian effects that are not properly captured by the position-dependent membership functions.In contrast, in the high friction regime, the velocity is instantaneously damped, and the macro-states can be defined as functions of only the position space.
This approach to estimating transition rates can be extended to highly dimensional problems.The typical strategy requires solving the fundamental equations of motion, projecting the dynamics on a small number of relevant coordinates, and discretizing the low-dimensional model to create a matrix representation of the Koopman operator, as is done with Markov State Models [26][27][28][29] or Square Root Approximation of the infinitesimal generator [30][31][32].The price of these techniques is the introduction of assumptions, such as the Markovianity of the projected dynamics, that can lead to significant errors [33].In contrast, ISOKANN does not require dimensionality reduction or space discretization, and the measured rates can be considered the best representation of the system's dynamics, net of approximations introduced a priori, e.g., when the equations of motion are numerically solved.Thus, the dimensionality of the system poses no limits to ISOKANN on a theoretical level.However, the implementation of ISOKANN for studying high-dimensional systems is more involved.Here, considering the low-dimensionality of the system, we used radial basis functions, but for higher dimensional systems, we suggest more advanced interpolating functions such as feed-forward neural networks, which allow the use of all system coordinates, including momenta.Another aspect to be taken into account is the choice of the mathematical representation of the molecular system.Indeed, neural networks are not invariant with respect to translations and rotations when Cartesian coordinates are used as input data.Thus, Cartesian coordinates must be transformed to a suitable set of input coordinates, for example pairwise distances, internal coordinates or atom-centred symmetry functions [34].
In summary, with this work, we have shown that ISOKANN is a valid tool for the study of dynamical systems that avoids the subspace projection of transfer operators.Here, we have focused on the classic Kramers problem, studying how macro-states are defined in phase space and highlighting the importance of considering momenta in rate calculation.Nevertheless, ISOKANN's flexibility and modern machine learning techniques allow for the study of even more complex systems.

Figure 1 .
Figure 1.(a) Potential energy function (solid line) and harmonic approximation at the bottom of the wells and at the top of the barrier (dashed lines); (b) Phase space with energy levels (black contour lines and yellow dashed line denoting the K B T value) and three trajectories carried out with friction coefficients: γ = 0.1 ps −1 (blue), γ = 2.2 ps −1 (red), γ = 30.0ps −1 (green).

Figure 2 .
Figure 2. (a) Classic Kramers turnover: transition rate k A→C estimated by numerical experiment (black squares), Kramers' formulas for low (blue), moderate (red) and high (green) friction coefficient γ.The dashed vertical lines denote the threshold between friction regimes; (b) Kramers turnover between membership functions: transition rate kχ A →χ C estimated by grid-based method (black squares) and ISOKANN (red circles), transition rate kχ A →χ C estimated by ISOKANN (blue upside down triangles).

Figure 3 .
Figure 3. (a,b,c) Membership functions χ(q, p) for γ = 0.1, 2.0, 30 ps −1 estimated by grid-based model: The blue-white-red colour gradient represents values in the range of 0 to 1.The membership functions are complimentary: χ 1 + χ 2 = 1, then the blue points represent the macro-state χ 1 and the red points represent χ 2 .The white points can be regarded as transitive regions.(d,e,f) Membership functions χ(q, p) for γ = 0.1, 2.0, 30 ps −1 estimated by ISOKANN; (g,h,i) Projection of the membership functions to position space.