Degrees of freedom of atoms in a rigid molecule for local temperature calculation in molecular dynamics simulation

ABSTRACT In molecular dynamics simulation, it is quite common to calculate a precise temperature profile with an atomic or sub-angstrom resolution. While this calculation typically requires the degrees of freedom (DOF) of individual atoms, those in a molecule subject to geometric constraints is undefined in general. Conventionally, the approximation of subtracting from the atom DOF by 1/2 per one distance constraint has been used, but this approximation is valid for spatially homogeneous systems only. In the present study, on the basis of statistical mechanics, we derived more general expressions for the DOF of atoms in fully or partially rigid molecules. The expressions ensure that in an equilibrium state, all constrained atoms have the same temperature as the equilibrium temperature of the whole system and are also applicable to inhomogeneous systems.


Introduction
In molecular dynamics (MD) simulations [1], it is quite common to calculate temperature profiles, i.e. the spatial distribution of local temperature. Such temperature profiles are not only the theoretical predictions of temperature distributions in real materials and phenomena [2,3], but also utilised in various ways. In non-equilibrium MD simulations of heat conduction, the gradient and jump in temperature profiles are used to derive thermodynamic coefficients [4] and thermal transport properties [5,6]. Temperature difference among different atoms or vibrational modes can be examined to check if the system is thermally equilibrated [7,8]. The local temperature fields computed with MD simulations provide reference data in developing more coarse-grained computational fluid dynamic schemes [9,10]. These applications typically require a precise temperature profile with an atomic or sub-angstrom resolution.
On the other hand, MD systems may be subject to geometric constraints in order to allow for a larger simulation timestep and to suppress vibration modes that should not be excited in view of quantum mechanics [11]. In particular, freezing bond lengths and angles involving fast vibrations of hydrogen atoms is often useful [12]. Such constraints are achieved either by imposing holonomic constraints in the equation of motion for a point mass [11,13], or by modelling a molecule or part of a molecule as a rigid body and solving Euler's equation of motion for a rigid body [14]. There is a problem when one calculates temperature profiles in a system subject to such constraints. A single constrained molecule usually spans multiple local volumes used for the local temperature calculation. In this case, the total degrees of freedom (DOF) in a local volume must be obtained by summing up the DOF of individual atoms. However, while the DOF per free atom is known to be 3 (in the 3-dimensional space), the DOF of atoms in a geometrically constrained molecule is undefined in general. For the constraint of the distance between two atoms, a convenient approximation is that each of the two atoms has the DOF reduced by 1/2 [12]. This even-distribution approximation is reasonable when the two atoms are contained in a local volume with similar probabilities or are of the same atom type, and it has been appropriately applied to spatially uniform liquids [12,15,16]. On the contrary, the atom DOF should be determined more carefully in systems where the distribution of the constituent atoms is highly inhomogeneous, such as nanostructures and interfaces of different phases. However, to our knowledge, explicit DOF expressions for constrained atoms in such inhomogeneous systems has not been discussed yet. In the present study, we derive them based on statistical mechanics, together with the validation by MD simulations. The specific expression depends on how the molecular geometry is constrained. Here, we particularly focus on the DOF of atoms in (i) a rigid molecule and (ii) a three-atom molecule with two distance constraints but no angle constraint. These examples cover many of the constrained systems dealt with in current MD research.

Definition
The temperature T loc of a local volume V loc is most commonly calculated via the kinetic temperature definition: where g i , K i = m i v i 2 /2, m i , and v i are the DOF, kinetic energy, mass, and velocity vector of atom i, respectively, k B is the Boltzmann constant, and 〈 … 〉 denotes the ensemble average. Similarly, the temperature of atom i is defined by In the even-distribution approximation, which is conventionally used, g i for constrained atoms is described as where n c,i is the number of distance constraints that involve atom i. Although this approximation is quite convenient, its use is restricted to homogeneous systems.
To derive an atom DOF that is applicable in general cases, we begin with the following definition: where 〈K i 〉 T is the average kinetic energy of atom i evaluated in an equilibrium state at a temperature T. Although Equation (2) and Equation (4) similar in appearance, their physical meanings are completely different: Equation (2) is the definition of atom temperature when the value of g i is known. In contrast, Equation (4) is the definition of g i based on a reasonable requirement that at least in an equilibrium state, the temperature of a constrained atom calculated by Equation (2) is equal to the equilibrium temperature. The atom kinetic energy in Equation (4) can be evaluated in any equilibrium state. Here, for simplicity, we assume a canonical ensemble. For an N atom system, the atom kinetic energy is expressed as where H(r 1 , ..., r N , v 1 , ..., v N ) is the Hamiltonian of the system, i.e. the sum of the kinetic and potential energies, described here as a function of the coordinate vector r i and velocity vector v i of atom i. The function f cns (γ) is necessary to take geometric constraints into account and γ represents the set of coordinates and velocities that are relevant to the constraints. For example, suppose that the distance between atoms i and j is constrained to a constant value d as r ij = |r i -r j | = d and this is the only geometrical constraint in the system. The atom velocities are also constrained since the time derivative of r ij must be zero as dr ij /dt ;ṙ ij = 0. This constraint is expressed as f cns (r ij ,ṙ ij ) = d(r ij −d)d(ṙ ij ), where δ is the Dirac delta function. Since f cns (r ij ,ṙ ij ) depends on r i , r j , v i , and v j , this function implicitly makes v i dependent of r i , r j , and v j . In the general case, f cns is defined in a similar way. Then, with an appropriate transformation of integral variables, γ and other variables irrelevant to K i can be integrated out, and the result is formally written as As we discussed above, v i can depend on other variables such as the velocity and position vectors of other atoms. The integral variable Γ represents the collection of all these variables, and E tot (Γ) is the part of the Hamiltonian that depends on Γ. In the equilibrium state, inserting Equation (4) into Equation (2) gives T i = T, i.e, all constrained atoms have the same temperature as the equilibrium temperature, regardless of the homogeneity of the system. The expressions of Γ and H part (Γ), and therefore the analytic form of g i , differ for each individual case. In the following, we will derive the specific expressions of g i for some important cases.
In the derivation of Equation (6), we assumed a canonical ensemble, but the final expression of g i is independent of statistical ensemble. As will be shown in the examples below, the partial Hamiltonian H part in Equation (6) is typically equal to the kinetic energy of a single constrained molecule. Therefore, as long as the phase variables of a single molecule, such as the translational and angular velocities, are distributed according to the Maxwell-Boltzmann distribution, the same expression as Equation (6) is obtained. This condition is satisfied for common equilibrium ensembles, including microcanonical ensemble, isobaric-isothermal ensemble, etc., since for a given molecule, the surrounding molecules play the role of a thermal bath.

Rigid molecule
Let us first consider a rigid molecule consisting of n atoms, having the mass M = i=1,n m i , centre-of-mass position vector where r i is the position vector of atom i and u i = r i −R. Furthermore, we define the atom contribution to the inertia tensor as.
such that I = Σ i = 1,n I i , where 1 is the unit matrix, and u i T denotes the transpose of u i , so that u i u i T is a matrix whereas u i T u i = u i 2 is a scalar. We note that I is a 3 × 3 matrix in the space fixed frame, but its rank is reduced to 2 in the case of linear molecules. The rank of I i is also 2 regardless of molecule type.
We choose V and ω as the integral variables Γ to evaluate the kinetic energy in Equation (6) for this rigid molecule. Expressing the atom velocity as v i = V + ω × u i and inserting this into K i = m i v i 2 /2 give the atom kinetic energy as.
where ω × u i is the cross product of ω and u i . The first and second terms in the right-hand side of Equation (8) represent the translational and rotational contributions, respectively, and the third term is their cross effect. The summation of K i gives the total kinetic energy of the molecule: which corresponds to the total energy E tot in Equation (6). The cross termK cross i makes no contribution in Equation (9) (8) and (9), the equilibrium average of the translational term is calculated as The rotational term becomes The second equality in Equation (11) is due to the change of T in the body-fixed frame, and the summation indices α and β run through a, b, and c. This transformation is represented by ω = Pω ′ , where P is an orthogonal matrix so that the inertia tensor in the body-fixed frame, is an odd function of V and ω whereas the Boltzmann factor exp[−K/(k B T )] is an even function of them. In summary, the atom kinetic energy in total is given by the sum of Equations (10) and (11) as.
By inserting Equation (12) into Equation (4), the DOF of atom i in a rigid molecule is found to be Let us consider some simple examples. The first one is a diatomic molecule composed of atoms 1 and 2 separated by a fixed distance r 12 . Suppose that in the body-fixed frame, the two atoms are arranged along the c axis as r ′ 1 = (0, 0, m 2 r 12 / M) and r ′ 2 = (0, 0, −m 1 r 12 /M), and thus there is no rotational degree of freedom about the c-axis. In the body-fixed frame, the diagonal elements of the partial and total inertia tensors are calculated to be I ′ i,aa = I ′ i,bb = m i [(M−m i )r 12 /M] 2 and I ′ aa -= I ′ bb = m 1 m 2 r 12 2 /M, respectively. Thus, Equation (13) for a rigid diatomic molecule is simplified as.
The next example is water, the molecule to which the rigid body approximation is most frequently applied. From Equation (13), the DOF expressions of the H and O atoms in a rigid water molecule are finally found to be where m H = 1.008 u and m O = 15.999 u are the atomic masses of hydrogen and oxygen [17], respectively, M = 2m H + m O is the mass of water molecule, and φ is the HOH angle. Different rigid water models may lead to slightly different DOF values via a small difference in the HOH angle φ. Some representative water models are summarised in Ref. [18], for example. The SPC and SPC/E models use φ = 109.47°, in which case the DOFs are calculated to be g H = 1.5947 and g O = 2.8106, whereas the TIP3P and TIP4P models adopt φ = 104.52°, resulting in g H = 1.5925 and g O = 2.8150.

Semi-flexible molecules
In a semi-flexible molecule, only a part of molecule is constrained. As a result, one molecule is composed of free atoms and rigid subunits. If there is no atom simultaneously included in two or more rigid subunits, one can apply Equation (13) to each rigid subunit. For example, in the case of a biphenyl molecule where the two rigid benzene rings are connected by a flexible covalent bond, the atom DOFs can be obtained by applying Equation (13) to each benzene ring independently. Conversely, if multiple rigid subunits share the same atoms, Equation (13) cannot be used, and the expression for the atom DOFs must be derived for each individual case. This type of semi-flexible molecule is called the linked rigid body [19].
To illustrate this situation, Figure 1 shows a three-atom molecule in which the two bond lengths r 21 and r 31 are constrained while the angle φ is allowed to change, where r ij = | r ij | and r ij = r i -r j . The molecule consists of two rigid subunits: one is the bond between atom 2 and 1 and the other is the bond between atoms 3 and 1. The two units share atom 1 and their motions are not independent; for example, if the motion of one unit is completely frozen, the translational motion of the other unit is also frozen. Let the molecular motion of this molecule be described by the translation of r 1 and the rotations of the two rigid subunits around r 1 . Then, the kinetic energy of each atom can be expressed as follows: where ω i is the angular velocity of atom i around atom 1 and The average kinetic energy of atom i is evaluated by where K = Σ i = 1,2,3 K i is the total kinetic energy of the molecule. After a lengthy calculation, one can finally find the DOFs for atoms 1-3 as where M = m 1 + m 2 + m 3 . The detail derivation of Equation (18) is described in the Appendix. While the total DOF of the molecule, g 1 + g 2 + g 3 = 7, is constant, the atom DOFs change with time since the angle φ is time dependent. If necessary, g i can be further averaged over φ as is the Boltzmann factor with respect to φ and U(φ) is the potential energy. As one can expect from this simple example, it is rather complicated or even impossible to derive the analytical form of g i for linked rigid molecules. In such a case, it may be reasonable to calculate 〈K i 〉 T in Equation (4) numerically by carrying out MD simulations of appropriate reference systems.

Systems and procedures
The expressions of atom DOF derived in Section 2 were examined by conducting MD simulations using the large-scale atomic/molecular massively parallel simulator (LAMMPS) [20]. The expressions for fully rigid molecules, Equations (13)- (15), were validated for a silica-water system as shown in Figure 2, where a bulk liquid composed of 3600 water molecules was sandwiched by two rectangular blocks of silica crystals each consisted of 6 × 6 × 4 unit cells of the α-quartz crystal with the lattice constants a = b = 4.916 Å, c = 5.4054 Å, α = β = 90°, and γ = 120° [21]. The (001) face of silica surface was in contact with the bulk liquid of water, and all Si atoms on the surface were doubly hydroxylated. The water molecules were described by the rigid SPC/E model [22], and the DOFs of constituent atoms were calculated with Equation (15). The silica crystals were modelled by the CHARMM-compatible force field of Emami et al. [23] However, the O-H bond length of the surface silanol was additionally constrained to 0.945 Å, and the DOF values of the O and H atoms were calculated with Equation (14). The Rattle algorithm [13] was employed to impose the geometric constraints on water molecules and hydroxyl groups. All non-bonded interactions including those between the water and silica were described by the Lennard-Jones (LJ) and Coulomb interactions. The cut off distance of LJ interaction was set to 12 Å, and the LJ parameters between water and silica atoms were obtained from the Lorentz-Berthelot combining rules. The Coulomb interactions were evaluated using the particleparticle-particle-mesh method [24]. The periodic boundary conditions were imposed in the x and y directions while the boundaries in the z direction were approximately considered non-periodic using the empty volume insertion method [25]. The x and y dimensions, set to L x = 6a = 29.496 Å and L y = 6bsinγ = 25.544 Å, respectively, were kept unchanged throughout the MD simulation.
The velocity-Verlet integrator was used with a timestep of 0.1 fs to integrate the equations of motion. The total simulation time was 8 ns, where the temperature of the whole system was controlled at 298.15 K using the Langevin thermostat with a damping coefficient of 0.2 ps. In the first 4 ns, a pressure of 1 atm was applied in the z direction by exerting constant force on the outmost atom layer of the right silica block while freezing the outmost atom layer of the left silica block. In the latter 4 ns, the outmost layer of the right silica block was also frozen, and the temperature profile along the z direction was computed in the last 2 ns.
In an additional simulation, the atom DOF expressions in Equation (18) for the three-atom linked rigid molecule were applied to a semi-flexible water where the two O-H bonds are constrained but the HOH angle varies with time. The water molecule was described by the flexible simple pointcharge water model (SPC/Fw) [26], and the O-H bond length was additionally constrained to 0.94 Å. A bulk system with a mass density of 1 g/cm 3 was constructed by placing 500 water molecules in a cubic MD box with 24.638 Å on a side, and the three-dimensional periodic boundary conditions were imposed. The MD simulation was conducted for 7 ns employing the velocity-Verlet integrator with a time step of 1 fs, where the system temperature was controlled at 298.15 K using the Langevin thermostat with a damping coefficient of 0.1 ps. The first 2 ns was used for equilibration, and the atom DOFs were calculated from the last 5 ns.
For both the silica-water system and semi-flexible water system, the statistical uncertainty of a physical quantity was estimated by the standard error of mean calculated by dividing the production simulation into five time blocks.

Results and discussion
From the MD simulation of the silica-water system, the local temperature based on Equation (1) was calculated along the z direction with an interval of Δz = 0.2 Å. The resultant temperature profile near the silica-water interface on the left side is shown in Figure 3. With the new expressions of g i , Equations (13)-(15), the DOFs of constrained atoms are evaluated as g H = 2.059 and g O = 2.941 for silanol, and g H = 1.5947 and g O = 2.8106 for water. The Si and O atoms inside the silica crystal are unconstrained, i.e. g Si = g O = 3. The temperature profile based on these atom DOF values are plotted in Figure 3(a) by atom type. As expected, the local temperature in every slab is found to be equal to the setting temperature, T set = 298.15 K. We note that near the boundaries of different regions, the number of atoms in a local volume can be so small that the calculated temperature values deviate significantly from T set with large statistical errors. Since such data are not reliable, temperature data with the standard error of mean (SEM) greater than 3 K (1% of T set ) are excluded from Figure 3 (However, there are still some regions in Figure 3 (a), e.g. at z ∼ 24 Å, where the profile appears to deviate from T set , albeit only slightly). Since the SEM of 3 K is smaller than the markers, the error bars are not explicitly shown in the figure.  In contrast to the new DOF expressions, if the conventional even-distribution approximation Equation (3) is used, the atom DOFs of the constrained atoms are calculated as g H = g O = 2.5 for silanol and g H = g O = 2.0 for water. The temperature profile based on these DOF values are shown in Figure  3(b), where the temperature values of the constrained O and H atoms significantly deviate from T set . The temperature profile in Figure 3(c) is also based on the even-distribution approximation, but it shows the average of all atom types. For z > ∼28 Å, the temperature is close to T set . In this region, the number density distribution in the inset of Figure 3(c) indicates ρ H /ρ O ∼ 2, i.e. the number ratio of H and O atoms in a local slab is close to that in a water molecule, as assumed in the equal-distribution approximation. In this case, the errors in the atom temperature between O and H atoms are cancelled out when their average is taken. For the same reason, in the case of the silanol region, the average atom temperature becomes close to T set when ρ H /ρ O ∼ 1, as is seen at z ∼ 23.5 Å. For other regions, the balance in the number of atoms is broken owing to the inhomogeneous molecular configurations, and as a result, the temperature profile deviates from T set . Thus, the oscillation in the temperature profile in Figure 3(c) is an artifact of an inappropriate definition of atom DOF, clearly indicating that the even-distribution approximation is not suitable for inhomogeneous systems.
Finally, Figure 3(d) displays the profiles of translational and rotational temperatures calculated by considering surface silanol groups and water as rigid bodies. The profiles are flat with the values near the setting temperature of 298.15 K, confirming that the system is in an equilibrium state.
Next, we discuss the result for the semi-flexible water system. The atom DOF expression Equation (18) depends on the instantaneous value of HOH angle φ. However, in the case of water, the large difference in atomic mass between H and O atoms makes the angle dependence ignorable. In Figure 4, the values of g O and g H based on Equation (18) are plotted as a function of HOH angle. For both g O and g H , the difference between the minimum and maximum is only ∼0.2%. From the MD simulation, the atom DOF values calculated by the time average of the analytic expressions in Equation (18) were g O = 2.8823 and g H = 2.0588 with statistical errors of the order of 10 −7 . We also calculated the atom DOFs from Equation (4) together with the numerical average of 〈K i 〉 T as g O = 2.8829 ± 0.0006 and g H = 2.0587 ± 0.0003. The two results are in good agreement with each other. One may be interested in the extent to which the DOF values depend on statistical ensemble. After the canonical simulation with Langevin thermostat, we conducted a 7.0 ns microcanonical simulation. The average values of atom DOFs in Equation (18) were calculated from the last 5 ns of this simulation as g O = 2.8823 and g H = 2.0589 with statistical errors of the order of 10 −7 , where the average temperature was 299.09 ± 0.05 K. Thus, the DOF values are insensitive to the difference in statistical ensemble. As we saw throughout the present study, the DOFs of constrained atoms are non-integer in general. This is because geometric constraints make the coordinates of a single atom no longer independent variables of motion. In addition, the value of atom DOF tends to be larger when the atom has higher atomic mass and higher partial inertia moment. For example, suppose a diatomic rigid molecule moving with a translational velocity V without rotation. Because of the rigid-body constraint, the velocities of two constituent atoms 1 and 2 must also be V. If the kinetic energy of atom i, m i V 2 /2, is equated to g i k B T/2 assuming that these atoms have the same instantaneous temperature T, the atom DOF g i must be proportional to m i . This simple example, if not rigorous, would be helpful to intuitively understand why non-integer and non-uniform DOFs are assigned to constrained atoms.

Conclusion
In the present study, we derived expressions for the DOF of constrained atoms, which is necessary for the calculation of local temperature in MD simulation systems of fully or partially rigid molecules. The expressions are based on statistical mechanics to ensure that in an equilibrium state, all constrained atoms have the same temperature as the whole system. The conventionally used approximation of reducing the atom DOF by 1/2 per one distance constraint can be used if a simulation system is spatially homogeneous. However, recent MD simulations often have to deal with systems that do not satisfy this condition, such as nanostructures and interfaces of different phases, and in such cases, the use of the new atom DOF expressions is essential.

Disclosure statement
No potential conflict of interest was reported by the author(s).

Funding
This work was supported by JST CREST [grant number JPMJCR17I2], Japan. Numerical simulations were performed on the Supercomputer system 'AFI-NITY' at the Advanced Fluid Information Research Center, Institute of Fluid Science, Tohoku University.