Two descriptors avoiding explicit bond consideration on relative surface energies of various terminations in a crystal

ABSTRACT A bulk crystal can be cleaved with various terminations, and finding stable terminations, which may have high Miller indices, is a non-trivial task. We propose two independent and complementary descriptors, based on the atom positions of the crystal only, for surface energies of various terminations in a crystal. One is the ‘atom proximity function of a plane’, p, where the Gaussian of the distance from a plane is summed up for atoms near the surface and then normalized. The other is the ‘surface roughness index’, v, which is based on the density of surface atoms and the bonding environment, namely the solid angle of ‘open space’ around an atom. These descriptors contain parameters, but using predetermined suggested values make these descriptors effectively parameter-free. The validity of the descriptors is demonstrated on sc, fcc, and bcc crystals and oxides Li2O, Na2O, K2O, MgO, CaO, β-Ga2O3, θ-Al2O3, rutile structure TiO2, SnO2, and GeO2, anatase structure TiO2, β-Ga2O3, θ-Al2O3, and monoclinic ZrO2. Very stable terminations typically have both small p and large v compared to others. GRAPHICAL ABSTRACT IMPACT STATEMENT Surface energy descriptors based on the bulk structure only and without explicit consideration of ‘bond lengths’ can significantly accelerate identification of reasonable surface models without actual surface model calculations.


Introduction
Heterogeneous catalysis, including electrocatalysis and photocatalysis, happen on the catalyst surface.Terminations with low surface energies (E surf ) form more easily.The shape of a nanoparticle of a given volume with the minimum E surf can be obtained through the Wulff construction [1][2][3], and applications to nanoparticles on a surface (Winterbottom construction) [4], multiply twinned particles [5], nanoparticles on a corner (summertop construction) [6], and alloy nanoparticles [7] have been developed.On the other hand, there are efforts to form nanoparticles with highindex facets that typically have higher surface energy.Tian et al. obtained tetrahexahedral Pt nanocrystals with mostly high-index {730} facets, though very close {210}, {310}, and {520} facets were also found [8].Subsequently, many other high-faceted nanoparticles have been synthesized and studied in detail, as summarized in Ref [9], and references within, particularly because of their attractiveness in catalysis.
The number of terminations that a crystal can take is infinite.Even when limited to surfaces cleaved from bulk, the number of terminations scale as O(N 3 ) when N is the maximum absolute number among h, k, and l in the Miller index (hkl) of the termination.However, the vast majority of hypothetical surfaces cannot be obtained.An unstable termination may stabilize by spontaneously forming macroscopic facets of more stable terminations even though the total surface area increases [10,11].Therefore, the list of relevant termination could be quite limited, but high-index and/or long-distance periodicity surfaces may enter the list.A procedure to quickly evaluate the E surf of terminations relative to each other, without conducting any energy calculations of explicit surface models, will be very helpful.We note that experimental determination of the ratio of E surf between surfaces, let alone absolute values, is very difficult and reports are limited [12][13][14][15][16].
The E surf is intuitively higher if the number of broken bonds per unit area is larger.The unsaturated coordination index, σ, which is simply the density of missing bonds at the surface, has predictability of low E surf terminations in a crystal [17].A robust method to identify broken bonds for any cleaved termination by defining bonds as vectors and evaluating their spatial relations against a 'dividing plane' was proposed by Mackenzie et al. [18,19].The number of bonds between two adjacent planes containing atoms, A and B, qualitatively increases when the distance between A and B decreases.The preferred bond length is determined from the balance of attractive and repulsive interactions between atoms.A separation distance larger than the preferred bond length favors atoms to be on top of each other, in the direction normal to the planes, to take advantage of attractive forces.In contrast, a small enough separation does not allow atoms to be on top of each other because of the prohibitive repulsion forces.As a result, atoms become closer to a larger number of atoms on the other plane, thereby increasing the number of bonds.Therefore, some objective function related to the distance between planes of atoms and the density of atoms near the surface could be a reasonable and intuitive descriptor of surface energy.
Unfortunately, the concept of 'bonds' could be very subtle.Bonds in crystals such as rocksalt (Strukturbericht symbol B1), zincblende (B3), and fluorite (C1) are obvious.Here, bonds simply connect nearest neighbor (NN) atoms and all have the same distance by virtue of symmetry.In contrast, bonds in bcc crystals usually connect only NN atoms and all atoms are considered eight-fold coordinated, but the second NN (2NN) interaction becomes relevant when comparing surface energies between different orientations, as discussed afterwards.C atoms in graphite have two types of bonds, σ and π.Consequently, there is ambiguity on whether to include π bonds in the coordination number.The σ and π bond distances in graphite are 1.42 Å and 3.40 Å, respectively, and the π bond length is longer than the third NN distance in a graphene plane.However, the π bond length is the NN distance outside the graphene planes, and this distance is an important quantity that defines a lattice parameter of graphene.In β-Ga 2 O 3 , there are four-fold and six-fold coordinated Ga and threefold and four-fold coordinated O. Bonds connecting atoms with different coordination number should, in principle, have different nature.Removing the second bond can require much more energy than removing the first bond [17].
Quantitative verification of computational predictions with experimental investigations is not practical.Experimental evaluation of the absolute value of E surf is very difficult; specific assumptions or theoretical models are implied and the results are strongly influenced by, for instance, the grain size accuracy, scale effect, and atmosphere.In particular, clean surfaces are difficult to attain, and adsorption of atoms and molecules on the surface changes the E surf .There is no reasonable experimental method for high E surf measurements as of now [20].Even if there is a method, hypothetical terminations of many orientations that can be generated computationally are simply not experimentally accessible.Decomposition into macroscopic facets of different orientations could reduce the E surf of a given orientation, where one experimental observation is the decomposition of the chalcopyrite (110) surface to (112) and (11 � 2) facets in the thin-film solar cell photoabsorber Cu(In,Ga)Se 2 [21,22].We previously discussed how to find stable and metastable surfaces that are likely to be accessible by identifying and removing terminations unstable with regard to facet formation [10,11].However, this procedure requires E surf calculations of many orientations and, if desired, evaluation of significant reconstructions that cannot be found with spontaneous relaxation from a cleaved surface by, for example, using a genetic algorithm such as in the USPEX code [23][24][25][26].
This paper describes two E surf descriptors, and one is limited to slabs cleaved from a bulk crystal.We chose to avoid dealing with bonds altogether and work with information on the position of atoms in a crystal.Parameters are required in the formalism, but suggested values, which are not derived through regression in specific systems, are derived and verified in this paper.The descriptors are effectively parameter free, eliminating the need to fit parameters for a particular system with no prior experimental nor computational E surf information.The descriptors can be calculated very quickly, provides some insight on the relative order of E surf between different terminations of a crystal, and have predictive power regarding the lowest E surf termination in various crystals with diverse symmetry from cubic to monoclinic lattices.

Computational methods
Slab-and-vacuum models under three-dimensional periodic boundary conditions (hereafter simply 'slab models') were used to analyze surfaces, where slabs infinitely extending parallel to the ab-plane alternate with vacuum layers along the c-axis.The E surf, was defined as where E slab and E bulk are the energy of the slab without defects and the energy of the constituents of the slab when in a perfect bulk, respectively, and A is the basal area of the slab model.The coefficient of two accounts for both sides of the slab.Values of E surf and the models used to derive them are provided in our previous works [11,17].First-principles calculations were conducted using the projector augmented-wave method [27] as implemented in the VASP code [28,29].The Perdew-Burke-Ernzerhof functional tuned for solids (PBEsol) [30] within the generalized gradient approximation (GGA) was used.The Hubbard U based on Dudarev's formulation [31] was additionally considered on 3d and 4d states of Ti and Zr, respectively, where the effective U value of U-J was set at 3 eV as in our previous studies [11,17].This 3 eV value was used by Stevanović et al. to obtain fitted elemental-phase reference energies (FERE) [32].Spin polarization was considered in slab calculations with an initial ferromagnetic spin ordering.Internal coordinates and lattice parameters were relaxed in bulk calculations.All lattice parameters were fixed in slab model calculations, and atom positions, or internal coordinates of atoms, were either fixed to those of cleaved bulk ('cleaved' positions and 'fixed' calculations) or relaxed during first-principles calculations ('relaxed' positions and 'relaxed' calculations).Only terminations that are nonpolar after cleaving bulk (type 1 or 2 in Tasker's definition [33] and nonpolar type A or B in the definition by Hinuma et al. [34]) were considered unless noted otherwise.

Atom proximity function of a plane, p
The atom proximity function of a plane, p is defined for surfaces simply cleaved from bulk without further relaxation nor reconstruction.Positions of points are denoted in fractional coordinates as (x, y, z).Points in a plane parallel to the ab-plane (basal plane) share the same z.The plane P used to define p, which is the reference plane, is the basal plane in the middle of two adjacent planes containing atoms.The effect of the distance between atoms and P is considered using a decaying positive function u(Δz), where Δz is the difference in the fractional coordinate z between P and an atom.Atoms farther away from P should have a smaller u, and u should approach 0 when Δz increases because the effect of deep subsurface atoms should not affect the E surf .The function p is ideally the same value for the same crystal and termination regardless of the lattice parameter, or in other words, normalized with some measure of the atom size.
The general form of p is and a Gaussian is recommended as the function u as The effective atom size, ρ, is defined as the cube root of the volume per atom.The basis vectors of the unit cell is retaken such that the plane P is spanned by basis vectors a and b.A is the basal area of the retaken unit cell, or |a × b|, t is the thickness of the unit cell in the direction perpendicular to P, or (|a × b|•c)/A, Z is the fractional coordinate z of P, the summation is over all atoms with x and y coordinates both between 0 inclusive and 1 exclusive and the fractional coordinate z (z i ) less than Z and larger than some cutoff, for example,Z À 3λρ=t, and λ (>0) is a parameter that needs to be supplied.The value tΔz is the length between an atom and P, and is normalized in the exponential by a dimensionless parameter λ times the distance related to the average atom size, ρ.The value of u, and also p, changes continuously when λ is continuously changed.The normalization by A is necessary because the number of atoms in the surface is proportional to the surface area.In other words, the descriptor of a given termination must be independent on the surface area of the supercell.Multiplication by ρ 2 , a quantity determined by the bulk crystal, is needed to make the descriptor dimensionless and independent on the lattice parameters for the same crystal structure and termination.Figure 1 shows an example of how p is obtained for the sc (001) surface.Note that the value of p of a given termination does not depend on the element at each site, thus p for a certain termination is the same in the rocksalt structure.Similarly, the p for the same termination in bcc, AlFe 3 (D0 3 ), and Heusler (L2 1 structures) are the same.The reference plane P is taken at z = 0.5 because two adjacent basal planes containing atoms are at z = 0 and z = 1.Defining the lattice parameter as a, ρ = a, A = a 2 , and t = a, hence A Gaussian function of u ¼ exp À t zÀ 0: shown in Figure 1(b).The choice of λ is 2.5, which is a very bad choice because of slow convergence.The value of p is obtained by simply adding the values of u for all atoms on one side (left in this figure) of the reference plane.In other words, u at the red dots, which is where atoms exist, is added up and then normalized.The parameter λ is typically chosen such that p converges very quickly.When a reasonable choice of λ = 0.3 is used, only the first term is relevant because the first term is ~0.06 and the second term is ~10 −11 .
Here, we consider a Lorentzian of the form as an alternate form of u, which is possible because this u is also a decaying function.Figure 2 illustrates the forms of Gaussian and Lorentzian functions for various λ.The Lorentzian function has a very long tail compared to the Gaussian function, meaning that the effect of deeper atoms is much stronger than when the Gaussian function is used.The authors recommend use of the Gaussian function because of the fast convergence and the following physical justification.A two-body potential of a general form between two 'objects' at r0 ¼ x 0 ; y 0 ; z 0 ð Þ and r00 ¼ x 00 ; y 00 ; z 00 ð Þ is denoted as U r0 À r00 ð Þ.The potential at r0 from an atom at r00 can be expressed by If the second 'object' is uniformly distributed with equation primes-dimensional density σ over the plane z′=Z′ in Cartesian coordinates, the potential becomes When the form of Therefore, Equation (2) represents a Gaussian potential-based interaction between an atom and a hypothetical plane.When a bulk crystal is cleaved to form a nonpolar surface at a plane equidistant between adjacent planes containing atoms, pairs of atoms exist at the same distance on both sides of the cleaving plane, thus the atom proximity function p from both sides of the plane plotted for the derivation of p.This system is normalized such that t=ρ=A = 1.The red dots are shown corresponding to atom positions in (a).
are identical.The sum of p from both sides of the cleaving plane increases when the cleaving plane is moved from the middle.The bond distance between actual atoms is determined from the balance between repulsive and attractive terms in the interaction between atoms as well as geometric constraints, but the cleaving plane is a hypothetical plane that does not contain actual atoms.Therefore, the interaction between atoms and the hypothetical plane does not necessarily require both repulsive and attractive terms.In fact, the idea of 'repulsive' and 'attractive' interactions has no meaning in p because only the ratios of p between different surface terminations matter.
Results with λ = 0.2, 0.3, and 0.4 are shown in this study to assess the sensitivity of the parameter λ.We note that the coefficient of determination, R 2 , is not the best performance indicator of the descriptor.Values of E surf and p are not expected to have a linear correlation by virtue of how p is designed, and the predicting power of low E surf terminations is important rather than the performance over all terminations.The value of R 2 is simply shown as a guide of correlation unless there is a point far away from the cluster of other points because such an outlier point strongly affects the linear fit.

Tests in sc, bcc, and fcc crystals
The atom proximity function p was calculated for the (001), (110), and (111) cleaved terminations of sc, bcc, and fcc crystals.The terminations are shown in Figure 3; the blue and purple planes indicate adjacent planes containing atoms.Atoms with distance less than 3λρ/t from P were included in the summation of Equation ( 1).The values of p for various λ are given in Figure 4.The value of p is very close to 0 for λ smaller than ~0.1 because the decaying function u decays so quickly that its value is very close to 0 for all atoms in consideration.On the other hand, p is almost the same value for all surfaces at λ larger than ~0.5, thus an appropriate value of λ should be somewhere in between.
The energy necessary to cleave a surface by breaking bonds, which is fixed E surf , can be approximated by the number of broken (missing) bonds per unit area [17].Table 1 summarizes ρ, A, t, and number of broken bonds per atom (Δbonds) for sc, bcc, and fcc crystals and the (001), (110), and (111) surfaces.The number of broken bonds per unit area, Δbonds/A, was smallest in the (001), (110), and (111) surfaces in sc, bcc, and fcc crystals, respectively.The order of Δbonds/A in Table 1 and p in Figure 4 among surfaces of the same crystal was the same except for bcc (001) and ( 111).The ratio of the 2NN distance to the NN distance is 1.15 in bcc whereas this value is 1.41 in sc and fcc, and considering breaking of sufficiently strong 2NN bonds [35] results in agreement in the relative orders of p in Figure 4(b) and the broken bond count-based E surf .The descriptor p therefore reflects information on breaking of surface bonds without any prior explicit input on the network of bonds.

Application to rocksalt structure MgO and CaO
The space group of the rocksalt structure is Fm � 3m.Slab models with the minimum choice of A less than four times that of the smallest A, which is for the (111) orientation, were analyzed.Six orientations and six terminations were considered.The geometries of slab models and values of E surf with fixed and relaxed atom positions are given in Tables S1 and S2 of Ref [17], and the terminations are shown in Figure S1 of Ref [17].The trends for fixed and relaxed E surf are similar.The (100) termination has the lowest E surf , followed by relatively high index (310) and (210) terminations.The (100) termination has the smallest p; thus, this termination is correctly estimated as the low E surf termination.The (h10) termination may be regarded as steps on a (100) terrace surface.The miscut angle is defined as the angle between the vicinal surface, which contains step edges, and the terrace surface [36].As h increases, the miscut angle, the step density, and E surf all decreases [17].On the other hand, the distance between planes of atoms, d ¼ 1= ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi , decreases as h increases, where the NN bond distance is set to unity.The contribution to p from a surface atom is exponentially larger, and p tends to become larger, when d is smaller.Therefore, larger h terminations have larger p but smaller E surf , suggesting that p is not a suitable descriptor of E surf in step-and-terrace type surfaces with small miscut angles.

Application to anti-fluorite structure Li 2 O, Na 2 O, and K 2 O
The space group of the anti-fluorite structure is Fm � 3m.Slab models with the minimum choice of A less than four times that of the smallest A, which is for the (111) orientation, were analyzed.
Six orientations and six terminations were considered.The geometries of slab models and values of E surf with fixed and relaxed atom positions are given in Tables S3-S5 of Ref [17], and the terminations are shown in Figure S2 of Ref [17].
Figure 6 plots p when λ = 0.3 against E surf for Li 2 O, Na 2 O, and K 2 O. Results for λ = 0.2 and 0.4 are provided in Figures S3 and S4.Labels are provided for the terminations with the three lowest E surf , which are (111), (331), and (110).The (331) termination is (111) steps on a (110) terrace surface (Figure 7).The σ of (111) is smaller than (110), consistent with the lower E surf of the former over the latter.The (111) orientation has alternating layers of cations, anions, cations, and vacancies at equal intervals of 1 � 4 ffi ffi ffi 3 p � 0:14, while the (110) orientation has stoichiometric layers at equal intervals of 1 � 2 ffi ffi ffi 2 p � 0:35.Here, the lattice parameter of the conventional cell is scaled to unity.The distance of outermost surface atoms from the reference plane is ~0.14 in (111) and ~0.18 in (110), thus the p of (110) becomes smaller than (111) for a small λ such as 0.2.The analysis in this section shows that smoother surfaces tend to have small p especially when a small value of λ is chosen.

Application to rutile structure TiO 2 , SnO 2 , and GeO 2
The space group of the rutile structure is P4 2 /mmm.Slab models with the minimum choice of A less than four times that of the smallest A, which is for the (001) orientation, were analyzed.14 orientations and 16 terminations were considered.The geometries of slab models and values of E surf with fixed and relaxed atom positions are given in Tables S6-S8 of Ref [17], and the terminations are shown in Figure S3 of Ref [17].
Figures 8, S7, and S10 are plots of p when λ = 0.3 against E surf for rutile structure TiO 2 , SnO 2 , and GeO 2 , respectively, and results for λ = 0.2 and 0.4 are provided in Figures S5, S6, S8, S9, S10, and S11.Labels are provided for the terminations with the five lowest E surf and five smallest p.
The (111) termination has a very large p of roughly 0.6 in Figures 8, S7, and S10, owing to existence of O atoms with very small Δz in Equations ( 2) and (3), while all other terminations have p between 0.1 and 0.4.Similar trends are also found with λ other than 0.3 (Figures S5, S6, S8-S11).Existence of outlier points exert too much influence on the R 2 , while removing the (111) termination does not have a just cause.There are two low E surf terminations, (100) and (110) (Figure 9), and the latter (110) stabilizes more with relaxation.The p of the (110) termination is among the five smallest p but that of (100) is not in all three crystals, which is also the case when λ = 0.4.In contrast, both (100) and (110) are not among the five smallest p terminations when λ = 0.2, thus λ = 0.2 is clearly an unsuitable choice.

Application to anatase structure TiO 2
The space group of the anatase structure is I4 1 /amd.Slab models with the minimum choice of A less than four times that of the smallest A, which is for the (001) orientation, were analyzed.13 orientations and 16 terminations were considered.The geometries of slab models and values of E surf with fixed and relaxed atom positions are given in Tables S9 of Ref [17], and the terminations are shown in Figure S4 of Ref [17].
Figure 10 plots p when λ = 0.3 against E surf , and results for λ = 0.2 and 0.4 are provided in Figures S13 and S14.Labels are provided for the terminations with the five lowest E surf and five smallest p.Three out of five lowest fixed E surf terminations, (101)A, (001), and (100), are among the five lowest p terminations for all of λ = 0.2, 0.3, and 0.4, suggesting that p has good predictability of low fixed Table 1.Summary of the cube root of volume per atom ρ, area of basal plane per atom A, interval between basal planes containing atoms t, and number of broken bonds per atom (Δbonds) for sc, bcc, and fcc crystals and the (001), (110), and (111) surfaces.The unit of distance is the lattice parameter a of the crystallographic conventional cell.The notation 4 + 1 + 1 in Δbonds means that four, one, and one bonds are broken in atoms in the first, second, and third layers from the surface, respectively, and 5 + 1 represents five and one bonds broken in the first and second layers from the surface, respectively.E surf terminations despite the low R 2 value of at most 0.14.Within these three terminations, (101)A and (100) are also among the five lowest relaxed E surf terminations while (001) is not.The (301)A termination may be regarded as a step model with the (100) terrace surface, thereby resulting in a large p value, but its relaxed E surf is slightly smaller than the (100) termination.

Application to β-Ga 2 O 3 and θ-Al 2 O 3
Compounds β-Ga 2 O 3 and θ-Al 2 O 3 share the same crystal structure with space group C2/m.Slab models with the minimum choice of A less than four times that of the smallest A, which is for the (100) orientation, were considered.There are 34 orientations and 67 terminations.The geometries of slab models are given in Tables S1 and S2 of Ref [11], and the terminations are illustrated in Figures S1-S9 of Ref [11].The fixed E surf for β-Ga 2 O 3 and θ-Al 2 O 3 are given in Tables S10 and S11 of Ref [17], respectively, and relaxed E surf in column 'Fit' in Table S3 and S4 of Ref [11], respectively.Figures 11 and S17 are plots of p when λ = 0.3 against E surf for β-Ga 2 O 3 and θ-Al 2 O 3 , respectively, and results for λ = 0.2 and 0.4 are provided in Figures S15, S16, S18, and S19.Labels are provided for the terminations with the five lowest E surf and five smallest p.The lowest E surf termination, (100)A, is among the five lowest p terminations in all studied conditions, both fixed and relaxed.The three lowest relaxed E surf terminations, (100)A, (20 � 1)A, and (11 � 2)A (E surf increases in this order) are among the five smallest p terminations when λ = 0.3 (Figures 11(b) and S17(b)), while this does not hold for (11 � 2)A when λ = 0.2 and 0.4 (Figures S16 and S19).The above results suggest the use of λ = 0.3 in Ga 2 O 3 and Al 2 O 3 .Interestingly, the proposed descriptor p performed better in relaxed E surf in terms of low E surf termination prediction even though the descriptor is calculated using bulk atom positions.The descriptor σ gives the same values for (100)A and (100)B, suggesting that E surf are the same [17].In contrast, the values of p are different and therefore can correctly estimate that (100)A has a lower E surf than (100)B.

Application to m-ZrO 2
The space group of m-ZrO 2 is P2 1 /c.Slab models with the minimum choice of A less than three times that of the smallest A, which is for the (001) orientation, were analyzed.A total of 28 orientations and 68 terminations were obtained.The geometries of slab models for E surf calculations are given in Tables S12  and S13 of Ref [17], and the terminations are shown in Figures S5-S9 of Ref [17].
Figure 12 plots p when λ = 0.3 against E surf , and results for λ = 0.2 and 0.4 are provided in Figures S20  and S21.Labels are provided for the terminations with the five lowest E surf and five smallest p.The R 2 for ZrO 2 is very small; the largest value among these figures is 0.04.However, the lowest E surf termination, (11 � 1)A, is the termination with the smallest p when λ = 0.4 for both fixed and relaxed E surf (Figures S20(b) and S21(b), respectively), third lowest smallest p when λ = 0.3 for both fixed and relaxed E surf (Figure 12(a,b) respectively), and seventh and fifth lowest smallest p when λ = 0.2 for fixed and relaxed E surf , respectively (Figures S20(a) and S21(a), respectively).Therefore, the low E surf termination prediction ability is very limited for ZnO 2 , but still can capture the lowest E surf termination.

Surface roughness index v
Another descriptor of the surface, independent from p, is based on the density of surface atoms and the bonding environment of the surface atoms.
Surface atoms can be qualitatively identified by the procedure in Hinuma et al. [37].Checking the coordination number (CN) of atoms is one solution to surface atom identification, but has two significant drawbacks.Firstly, the CN of each atom if it is in bulk, or reference CN must be known in advance, otherwise reduction of the CN cannot be detected.This means that CN counting is valid for surfaces whose relation to a surface cleaved from bulk is clearly known.Consequently, surface atoms of an amorphous solid cannot be identified in general because the reference CN is not well-defined.Counting the decrease in the total number of bonds by cleaving is relatively simple, and this approach is used to define the unsaturated coordination index, σ, in Ref      [17].Secondly, the environment around a surface atom is typically not clear from the amount of reduced CN.Atoms at a step edge, terrace surface, and step foot have very different bonding environments and therefore expected to behave differently.The Voronoi cell of an atom may be used to discuss the bonding environment but lacks transferability between different crystals.Hinuma et al. [37] used the solid angle of 'open space' around an atom, S, to identify surface atoms.Atoms within a range d around an atom are projected to the unit sphere around the atom, which is analogous to how stars are projected to the celestial sphere, regardless of the distance to the star, in astronomy.The 'open space' is then obtained as a spherical angle.The value of S is 3π, 2π, and π sr for an atom at a prototypical step edge, terrace surface, and step foot, respectively, and atoms with small S, for instance less than 0.75 π sr, are considered to be in bulk.
The surface roughness index, v, is defined as The symbol v was inspired by the root symbol (√) chosen in ISO1302:2002 as a basic graphical symbol for surface texture and was chosen to avoid symbols s (first letter in 'surface') and r (often implies some kind of distance).Here, S i (d) is the solid angle of 'open space' around atom i using cutoff d, S cut is the solid angle cutoff, and the sum is taken over atoms i near a surface.The max function means that atoms with S smaller than S cut , which is the case for atoms far from the surface, do not contribute to the value of v, allowing controlled ambiguity in what atoms to include in the summation.This descriptor has two parameters, d and S cut .The value of v is designed to continuously change when S cut is changed.This index depicts 'roughness' as the number of surface atoms per unit area while incorporating the quality of surface atoms.Severely undercoordinated atoms have large S, and surfaces with a low density of severely undercoordinated surface atoms could be 'rougher' than surfaces with a high density of slightly undercoordinated surface atoms.As a result, the computed 'roughness' may not necessarily be consistent with visual perception.The derivation of v is worked out on two surfaces of rutile structure TiO 2 .Figure 9 shows values of S calculated for the (100) and (110) surfaces; S is not given for atoms with less than 0.75π sr.The value of  d is 2ρ (~4.416Å).For the (100) surface with A = 13.92Å 2 (Figure 9(a)), there is one surface O atom with S = 2.65π sr and one surface Ti atom with S = 1.35π sr.When S cut = π sr, v = (1.65 + 0.35)/13.92= 0.144π sr/ Å 2 .When S cut = 2π sr (an unwise choice of S cut ), the surface O atoms are no longer considered as a surface atom and v = 0.65/13.92= 0.047π sr/Å 2 .For the (110) surface with A = 19.69Å 2 (Figure 9(b)), there are two surface Ti atoms with S = 1.78π and 1.41π sr and three surface O atoms with S = 2.65π, 1.64π, and 1.64π sr.When S cut =π sr, v = (1.65 + 0.64 + 0.64 + 0.78 + 0.41)/ 19.69 = 0.208π sr/Å 2 .Values of S and A are also given in Figure 7.
The choice of d = 2ρ was used in this study; too small d cannot sufficiently capture nearby atoms, while too large d reduces S in outermost surface atoms with S > 2π sr.Results for S cut = 0.75π sr and S cut =π sr are shown in this study to evaluate the sensitivity of v against the choice of S cut .The value of v needs to be based on atom positions prior to relaxation when v is regarded as a descriptor that can be calculated quickly.However, v can be obtained on relaxed surfaces, unlike p, thus the relation between v from the relaxed surface (denoted as relaxed v in contrast to v without any modifier) and relaxed E surf is also discussed.The relaxed v does not serve as a useful descriptor because surface calculations are necessary before calculation of v, but is discussed in this work as a scalar quantity with scientific merit that reflects the characteristics of the surface.

Tests in sc, bcc, and fcc crystals
The surface roughness index v was calculated for the (001), (110), and (111) surfaces of sc, bcc, and fcc crystals.The cutoff distance, d, used to obtain the 'open space' S was set to d = 2ρ.Figure 13 shows the values of v when S cut was varied between 0.75π and 1.75π sr.0.75π sr was suggested in Ref [37], as the solid angle cutoff to determine surface atoms.The atom on the (001) surface of a sc lattice is a prototypical surface atom and its 'open space' is S = 2π sr (solid angle of a hemisphere).Therefore, 2π sr is clearly a too large value of S cut .The lines in Figure 13 bend at S of surface atoms.The stable surfaces anticipated for sc, bcc, and fcc crystals, which are the (001), (110), and (111) for sc, bcc, and fcc, respectively (see discussion in Section 3.1.1),have the largest values of v, suggesting that more stable surfaces tend to have larger v.

Application to rocksalt structure MgO and CaO
Figures 14 and S24 are plots of v when S cut = 0.75π sr against E surf for MgO and CaO, respectively, results for S cut = π sr against E surf are provided in Figures S22 and  S25, respectively, and relaxed v against relaxed E surf in Figures S23 and S26, respectively.Labels are provided for the terminations with the three lowest E surf .Results for six types of reconstructed nonpolar (111) surfaces are shown in addition to six cleaved nonpolar terminations because, unlike p, v can be calculated for reconstructed surfaces.These 12 terminations are illustrated in Fig. S1 of Ref [17].
Calculations for MgO and CaO share the same trends for all calculation conditions.Among cleaved terminations, the lowest, second lowest, and third lowest E surf terminations, which are (100), (310), and (210), respectively, are the lowest, second lowest, and third lowest v terminations, respectively.The anion terminated and cation terminated octopolar (111) terminations [38,39] take the same v that is close to the (210) termination.The design of v can extract   information on surface atoms much closer to the bulk from the outermost atoms compared to p, thus v can extract the nature of step-and-terrace type high index surfaces better than p.The R 2 value is not extremely high but not bad, between 0.51 and 0.76 within the 12 plots, but what matters most is the predictability of low E surf terminations, not a high R 2 over a set of terminations that is chosen with some ambiguity.

Application to anti-fluorite structure Li 2 O, Na 2 O, and K 2 O
Figures 15, S29, and S32 are plots of v when S cut = 0.75π sr against E surf for Li 2 O, Na 2 O, and K 2 O, respectively, results for S cut = π sr against E surf are provided in Figures S27, S30, and S33, respectively, and relaxed v against relaxed E surf in Figures S28, S31, and S34, respectively.Labels are provided for the terminations with the three lowest E surf .The lowest E surf termination, (111), does not have the largest v, while the trend of larger v in lower E surf terminations is found in other terminations.The values of S for surface atoms are shown in Figure 7. Surface atoms on a flat terrace have S of 2π sr, and step edge atoms and step foot atoms have S larger and smaller than 2π sr, respectively.The anions and cations in the (111) orientation are not on the same plane.Cations on the outermost plane have S = 2π sr, while anions not on the outermost plane have S < 2π sr.The outermost layer of the (110) termination has both cations and anions, and S = 2π sr for all of these atoms.This difference results in a smaller v in (111) compared to (110).
The number of missing bonds is also shown in Figure 7. Atoms with missing bonds have S larger than π sr, but the number of missing bonds and the value of S does not have a strong correlation.Therefore, v provides a different view of the surface from σ determined from the density of missing bonds at the surface.

Application to rutile structure TiO 2 , SnO 2 , and GeO 2
Figures 16, S37, and S40 are plots of v when S cut = 0.75π sr against E surf for rutile structure TiO 2 , SnO 2 , and GeO 2 , respectively, results for S cut = π sr against E surf are provided in Figures S35, S38, and S41, respectively, and relaxed v against relaxed E surf in Figures S36, S39, and S42, respectively.Labels are provided for the terminations with the five lowest E surf and five largest v.
Use of S cut = 0.75π sr resulted in inclusion of at least three among the five lowest fixed and relaxed E surf terminations within the five largest Black, red, and green labels represent terminations in both five lowest E surf and five largest v, in five lowest E surf but not in five largest v, and not in five lowest E surf but in five largest v, respectively.v terminations (Figures 16, S37, and S40).The (110) termination is always included in the five largest v terminations, but the (100) termination is not.The value of v showed good performance in predicting low E surf terminations even though the R 2 is bad at 0.14 to 0.36.Using S cut = π sr results in comparable or worse prediction performance (Figures S35, S38, and S41).In both (110) and (100) terminations, the outermost layer consists of two-fold coordinated anions (Figure 9).The second layer in the (110) termination is relatively dense and contains both cations and anions, while that of the (100) termination contains cations and is sparse.As a result, the v of the (100) is small for its E surf compared to (110).
The v of the (100) termination increases more compared to other terminations after atom position relaxation (Figures S36, S39, and S42), and (100) even became the fourth largest v termination in GeO 2 (Figure S42). Figure 17 plots v when S cut = 0.75π sr against E surf for anatase structure TiO 2 , results for S cut =π sr against surf are provided in Figure S43, and relaxed v against relaxed E surf in Figure S44, respectively.Labels are provided for the terminations with the five lowest E surf and five largest v.

Application to anatase structure TiO 2
Three among the five lowest fixed and relaxed E surf terminations are included in the five largest v terminations for both S cut = 0.75π and π sr (Figures 17 and S43), which is remarkable considering the very low R 2 value of 0.19 and 0.02 for fixed and relaxed E surf when S cut = 0.75π sr.These three terminations are the same for both S cut choices.However, the lowest E surf termination, (101)A, is not included.

Application to β-Ga 2 O 3 and θ-Al 2 O 3
Figures 18 and S47 are plots of v when S cut = 0.75π sr against E surf for Ga 2 O 3 and Al 2 O 3 , respectively, results for S cut = π sr against E surf are provided in Figures S45  and S48, respectively, and relaxed v against relaxed E surf in Figures S46 and S49, respectively.Labels are provided for the terminations with the five lowest E surf and five largest v.
For both S cut = 0.75π sr and S cut = π sr, the five lowest v terminations contains three among the five lowest fixed E surf terminations, namely (100)A, (100)B, and (60 � 1)B, in both Ga 2 O 3 and Al 2 O 3 .Three and two among the five lowest relaxed E surf terminations are included in the five largest v terminations in Ga 2 O 3 and Al 2 O 3 , respectively.The R 2 value is slightly better when S cut = 0.75π sr compared to S cut = π sr.The relaxed v of the lowest relaxed E surf termination, (100)A has a large relaxed v in Ga 2 O 3 (fourth and second largest for S cut = 0.75π sr and S cut = π sr, respectively, as shown in Figure S46), but not so large in Al 2 O 3 .Figure 19 plots v when S cut = 0.75π sr against E surf for anatase structure TiO 2 , results for S cut = π sr against E surf are provided in Figure S50, and relaxed v against relaxed E surf in Figure S51, respectively.Labels are provided for the terminations with the five lowest E surf and five largest v.

Application to monoclinic ZrO 2
The lowest E surf (11 � 1)A surface has the largest or second v when S cut = 0.75π sr (Figure 19) or S cut = π sr (Figure S50), and v is not reliable when identifying other stable surfaces.The R 2 was better when S cut = 0.75π sr.The relaxed v of the (11 � 1)A surface is the third largest relaxed v when S cut = 0.75π sr (Figure S51).

Complementary nature of p, v, and σ
The choice of λ = 0.3 for p, and S cut = 0.75π sr and d = 2ρ, for v was relatively reasonable over various systems, hence these parameters are considered hereon.The capability to use the same parameters over many systems while not significantly compromising predictability makes the descriptors effectively parameter-free and transferable.
The oversimplified assumptions underlying the descriptors are designed to reflect part of the physics and chemistry that are expected in reality.The assumptions are reasonable in some systems but, unfortunately, not so in some.
The complementary nature of p, v, and σ in Li 2 O terminations in Figure 7 is discussed here because these terminations have diverse characteristics and the descriptors extract different features of the terminations.The (111) orientation in Figure 7(a) consists of dense layers of atoms at even intervals but with atoms in every fourth layer removed.The (311) orientation in Figure 7(a) has sparse layers with more narrow intervals, also with atoms in every fourth layer removed.In contrast, the (110) orientation (Figure 7(a)) has very dense layers at even intervals.All Li-O bonds are identical.
The density of severed bonds, and also the density of atoms with two bonds removed, increases in the order of (111), (331), and (110) terminations.This is also the order of E surf , thus bond breaking does indeed play a very important role in determining E surf in this strongly ionic compound, as expected.
The spacing between atom layers at the reference plane is a critical factor that governs p; a larger spacing tends to decrease p because the contribution to p from each atom decays quickly as the distance from the reference layer increases.As a result, the order of p is the same as that of the size of the largest interval between layers of atoms.
The value of v is affected by the surface roughness.The (111) termination has a low density of atoms on the outermost layer of atoms compared to the (110) termination.The spacing between the outermost and second layers of (111) results in a low solid angle S of 'surface atoms' in the second layer.As a result, the v of (111) becomes smaller than that of (110).The (331) surface can be regarded as a stepson-(110)-terraces model, and the large density of surface atoms per (331) vicinal plane area increases v.However, the 'surface atoms' deeper from the outermost layer, such as the step foot atoms, have small S, and the order of v between (110) and (331) changes between S cut = 0.75π sr and π sr.These analyses illustrate that the differences in the order of predicted E surf between the three descriptors are unavoidable in some systems for a very good reason.The ranks of p, v, and σ for the five most stable terminations of rutile structure TiO 2 , SnO 2 , GeO 2 , anatase structure TiO 2 , β-Ga 2 O 3 , θ-Ga 2 O 3 , and m-ZrO 2 are shown in Tables 2-8, respectively.The rank of p and σ of a termination is m when the p or σ of the termination is the m-th smallest value of all considered terminations, respectively, while the rank of v of a termination is n when the v of the termination is the n-th largest value of all considered terminations.Terminations with lower ranks are expected to be more stable, thus the value of the rank is shown when the rank is between 1 and 5 only.
The ranks of both p and v of termination with lowest E surf , both when atom positions are fixed to cleaved bulk or relaxed ('fixed' and 'relaxed' in Tables 2-8, respectively), are both 3 or less except for is the (100) termination of GeO 2 that has the lowest fixed E surf but the ranks of both p and v are higher than 5 and the (101)A termination of anatase TiO 2 with smallest fixed and relaxed E surf where the rank of p is 3 but that of v is larger than 5.The low p and v rank terminations other than the lowest E surf termination tend to be different, representing the complementary nature of these descriptors that capture properties of the surface from very different viewpoints.
Calculating terminations with a low rank in at least one of these descriptors is one strategy to systematically search for low E surf terminations.In particular, terminations with all of low p, v, and σ ranks are likely to have very low E surf ; note that this holds for both fixed and relaxed E surf even though p, v, and σ are obtained without relaxing atoms at the surface.
One approach to improve the performance of E surf descriptors is to sophisticate them by adding higherorder corrections.The number of parameters may be increased and refined by intensive learning, but there is the risk of fine-tuning the descriptor to specific classes of systems and compromising flexibility.Using black-box descriptors is another approach.Total energies from neural network potentials can be considered as a descriptor, which is generally more accurate than classical potentials such as Lennard-Jones and Tersoff potentials, of energies from first principles calculations.The classical potentials are still relevant because the form of the potential and the parameters assumes and strongly represents a certain character of the system.We used an oldschool approach where the form of the descriptors reflects a certain aspect of the physics and chemistry and the parameters have solid meanings.We designed a total of three descriptors, one in a previous report [17] and two in this study, such that the use of three independent probes instead of one can capture more of the stable surfaces.

Summary
We propose the atom proximity function p with the form which is a combination of Equations ( 2) and (3) and λ = 0.3, as an effectively parameter-free descriptor of E surf of various terminations in a given crystal.In addition, the surface roughness index, v, with the form which is Equation (10) with d = 2ρ and S cut = 0.75π sr, is also proposed as another effectively parameter-free descriptor.These parameter values were not obtained by careful regression of a set of limited systems; instead, discrete, rounded numbers were used instead for comparison.These descriptors are robust because these do not require input on bond formation that could have some ambiguity in some cases, and are fast because calculations to relax atoms at the surface are not necessary.Although not perfect, these complementary descriptors that extracts different features of the terminations can be used, together with the unsaturated coordination index σ [17], to estimate low fixed and relaxed E surf terminations.

Figure 1 .
Figure 1.(a) Illustration of sc (001).Lengths ρ and t are the same as the lattice parameter.(b) A Gaussian function of

Figure 5
Figure 5 plots p when λ = 0.3 against E surf for MgO and CaO, and results for λ = 0.2 and 0.4 are provided in Figures S1 and S2 in the Supplemental Material.Labels are provided for the terminations with the three lowest E surf .The trends for fixed and relaxed E surf are similar.The (100) termination has the lowest E surf , followed by relatively high index (310) and (210) terminations.The (100) termination has the smallest p; thus, this termination is correctly estimated as the low E surf termination.The (h10) termination may be regarded as steps on a (100) terrace surface.The miscut angle is defined as the angle between the vicinal surface, which contains step edges, and the terrace surface[36].As h increases, the miscut angle, the step density, and E surf all decreases[17].On the other hand, the distance between planes of atoms, d ¼ 1=

Figure 4 .
Figure 4.The atom proximity function p as a function of parameter λ for (001), (110), and (111) surfaces of (a) sc, (b) bcc, and (c) fcc crystals.Lines are drawn to guide the eye.

Figure 5 .
Figure 5. Relation between p and (a) fixed and (b) relaxed E surf, for MgO and CaO.The parameter λ is 0.3.The three lowest E surf terminations are labeled.

Figure 6 .
Figure 6.Relation between p and (a) fixed and (b) relaxed E surf, for Li 2 O, Na 2 O, and K 2 O.The parameter λ is 0.3.The three lowest E surf terminations are labeled.

Figure 7 .
Figure 7. (a) (111), (b) (331), and (c) (110) terminations of Li 2 O. Blue and red balls represent Li and O, respectively.Numbers in boxes are values of S when d is 2ρ (~4.014Å), and the unit is π sr.Blue and red crosses indicate the number of missing bonds in unsaturated coordination atoms.(111) is the most stable and (110) is the least stable.The is the basal area of the slab model, A, surface roughness index when S cut = 0.75 π sr, and the unsaturated coordination index[17], σ, are also given.

Figure 8 .
Figure 8. Relation between p and (a) fixed and (b) relaxed E surf, for rutile structure TiO 2 .The parameter λ is 0.3.Black, red, and green labels represent terminations in both five lowest E surf and five smallest p, in five lowest E surf but not in five smallest p, and not in five lowest E surf but in five smallest p, respectively.

Figure 9 .
Figure 9. Values of S in atoms at cleaved TiO 2 (a) (100) and (b) (110) surfaces.Blue and red balls represent Ti and O, respectively.The unit is π sr, and S is not shown for atoms with less than 0.75 π sr.The value of d is 2ρ (~4.416Å).

Figure 10 .
Figure 10.Relation between p and (a) fixed and (b) relaxed E surf, for anatase structure TiO 2 .The parameter λ is 0.3.Black, red, and green labels represent terminations in both five lowest E surf and five smallest p, in five lowest E surf but not in five smallest p, and not in five lowest E surf but in five smallest p, respectively.

Figure 12 .
Figure 12.Relation between p and (a) fixed and (b) relaxed E surf, for ZrO 2 .The parameter λ is 0.3.Black, red, and green labels represent terminations in both five lowest E surf and five smallest p, in five lowest E surf but not in five smallest p, and not in five lowest E surf but in five smallest p, respectively.

Figure 13 .
Figure 13.Values of v when S cut is changed for (001), (110), and (111) surfaces of (a) sc, (b) bcc, and (c) fcc crystals.The value of d is 2ρ.

Figure 14 .
Figure 14.Relation between v and (a) fixed and (b) relaxed E surf for MgO.Parameters are S cut = 0.75π sr and d = 2ρ.The three lowest E surf terminations are labeled.

Figure 15 .
Figure 15.Relation between v and (a) Fixed and (b) Relaxed E surf for Li 2 O. Parameters are S cut = 0.75π sr and d = 2ρ.The three lowest E surf terminations are labeled.

Figure 16 .
Figure 16. between v and (a) fixed and (b) relaxed E surf for rutile structure TiO 2 .Parameters are S cut = 0.75π sr and d = 2ρ.Black, red, and green labels represent terminations in both five lowest E surf and five largest v, in five lowest E surf but not in five largest v, and not in five lowest E surf but in five largest v, respectively.

Figure 17 .
Figure 17.Relation between v and (a) Fixed and (b) Relaxed E surf for anatase structure TiO 2 .Parameters are S cut = 0.75π sr and d = 2ρ.Black, red, and green labels represent terminations in both five lowest E surf and five largest v, in five lowest E surf but not in five largest v, and not in five lowest E surf but in five largest v, respectively.

Figure 18 .
Figure 18.Relation between v and (a) fixed and (b) relaxed E surf for Ga 2 O 3 .Parameters are S cut = 0.75π sr and d = 2ρ.Black, red, and green labels represent terminations in both five lowest E surf and five largest v, in five lowest E surf but not in five largest v, and not in five lowest E surf but in five largest v, respectively.

Figure 19 .
Figure 19.Relation between v and (a) Fixed and (b) Relaxed E surf for ZrO 2 .Parameters are S cut = 0.75π sr and d = 2ρ.Black, red, and green labels represent terminations in both five lowest E surf and five largest v, in five lowest E surf but not in five largest v, and not in five lowest E surf but in five largest v, respectively.
3. Black, red, and green labels represent terminations in both five lowest E surf and five smallest p, in five lowest E surf but not in five smallest p, and not in five lowest E surf but in five smallest p, respectively.
Figure 11. between p and (a) fixed and (b) relaxed E surf, for Ga 2 O 3 .The parameter λ is 0.3.Black, red, and green labels represent terminations in both five lowest E surf and five smallest p, in five lowest E surf but not in five smallest p, and not in five lowest E surf but in five smallest p, respectively.
3. Black, red, and green labels represent terminations in both five lowest E surf and five smallest p, in five lowest E surf but not in five smallest p, and not in five lowest E surf but in five smallest p, respectively.

Table 2 .
Ranks of p, v, and σ for low E surf terminations in rutile structure TiO 2 .Parameters are λ = 0.3 for p and S cut = 0.75π sr and d = 2ρ for v, respectively.

Table 6 .
Ranks of p, v, and σ for low E surf terminations in Ga 2 O 3 .Parameters are λ = 0.3 for p and S cut = 0.75π sr and d = 2ρ for v, respectively.

Table 3 .
Ranks of p, v, and σ for low E surf terminations in SnO 2 .Parameters are λ = 0.3 for p and S cut = 0.75π sr and d = 2ρ for v, respectively.

Table 4 .
Ranks of p, v, and σ for low E surf terminations in GeO 2 .Parameters are λ = 0.3 for p and S cut = 0.75π sr and d = 2ρ for v, respectively.

Table 5 .
Ranks of p, v, and σ for low E surf terminations in anatase structure TiO 2 .Parameters are λ = 0.3 for p and S cut = 0.75π sr and d = 2ρ for v, respectively.

Table 8 .
Ranks of p, v, and σ for low E surf terminations in ZrO 2 .Parameters are λ = 0.3 for p and S cut = 0.75π sr and d = 2ρ for v, respectively.

Table 7 .
Ranks of p, v, and σ for low E surf terminations in Al 2 O 3 .Parameters are λ = 0.3 for p and S cut = 0.75π sr and d = 2ρ for v, respectively.