Insights into the primary radiation damage of silicon by a machine learning interatomic potential

ABSTRACT We develop a silicon Gaussian approximation machine learning potential suitable for radiation effects, and use it for the first ab initio simulation of primary damage and evolution of collision cascades. The model reliability is confirmed by good reproduction of experimentally measured threshold displacement energies and sputtering yields. We find that clustering and recrystallization of radiation-induced defects, propagation pattern of cascades, and coordination defects in the heat spike phase show striking differences to the widely used analytical potentials. The results reveal that small defect clusters are predominant and show new defect structures such as a vacancy surrounded by three interstitials. GRAPHICAL ABSTRACT Impact statement Quantum-mechanical level of accuracy in simulation of primary damage was achieved by a silicon machine learning potential. The results show quantitative and qualitative differences from the damage predicted by any previous models.

its applicability to very small impact energies and relevant basic properties [3][4][5][6]. Classical molecular dynamics (MD) simulations have been the dominant approach in radiation damage analysis in past decades, with a key role in enlightening the fundamental physics of primary damage processes. Nevertheless, the reliability of the results from MD simulations has always been in the shadow of the accuracy of the selected interatomic potential and simplifications introduced by its functional form. Machine learning (ML) potentials, as a novel framework in the development of interatomic potentials, offer an alternative approach for the representation of the potential energy surface by learning its topology from large configuration data sets obtained from DFT calculations. This is done by means of mathematical descriptors which encode the information of the local atomic environments, to be retrieved and interpolated for newly encountered configurations during MD implementations [7,8]. This brings MD simulations of systems containing thousands of atoms with DFT accuracy into reality [9][10][11][12][13][14][15][16][17][18][19][20][21][22][23][24][25][26][27]. Recently, an ML interatomic potential was developed for Si based on the Gaussian approximation potential (GAP) framework [28]. This potential describes properties of liquid and amorphous phases as well as point defects, which are very important in simulation of radiation damage, with DFT accuracy [28]. This level of accuracy far exceeds that of any classical semi-empirical potentials, and in principle opens up a new avenue for MD simulations of radiation damage with quantum mechanics precision level. However, this potential as published does not have any realistic descriptions of short-range ( 1.6Å) interactions, making it unsuitable for radiation effects simulations.
In this Letter, we augment the GAP with a realistic repulsive potential to enable large-scale DFT-accurate simulations of dynamic evolution of particle-induced collision cascades. We show that the modified GAP accurately reproduces experimental sputtering yields and threshold displacement energies (TDE). Good reproduction of TDEs is very important for many advanced applications, such as a detection possibility of the dark matter weakly interacting massive particles [29]. In nanoelectronics, where the interaction of the low-energy implanted dopants with the defect clusters becomes an obstacle (as a source of diffusion and activation anomalies) to reach desired component design, a precise model of the evolution and form of the surviving defects would lead to a better understanding of the dopant-defect interactions and more realistic treatment of this phenomenon [30][31][32][33].
Short-range interactions in our cascade simulations are described by the all-electron DFT repulsive potential, DMol [34], which is smoothly joined to the original GAP potential, while making sure that it has no effect on the original structures in the training data set. The DFT-DMol calculations were exclusively optimized for the high-energy repulsive part and their excellent agreement with experiments was recently confirmed [35]. The total energy of a system containing N atoms is where V pair is the repulsive pair potential given by a cubic spline fitted to the DMol data. The second term which Table 1. Threshold displacement energies in eV calculated at 30 K based on the method introduced in Ref. [38]. The kinetic energy resolution was 1.0 eV. DFT calculations in Ref. [5] have been conducted with both GGA and LDA exchange-correlation functionals. Results from the GGA functional are presented. 12.5 ± 0.5 11.5 ± 0.5 17.5 ± 0.5 12.5 ± 1.5 12.9 ± 0.6 c a Ref. [5] b Ref. [39] c Ref. [40] takes care of the many-body interactions is a sum over kernel functions that measure the resemblance between the unknown atomic environment i and M representative, trained environments s, encoded by the descriptor h. The α is the vector of learning coefficients obtained by solving a regularized least squares problem [28,36,37]. In accordance with this, we calculated TDE values with the GAP in low-index directions according to the method described in Ref. [38] at 30 K. This initial thermalization sets atoms into the random motion, creating more realistic collision probability between the lattice atoms and the atom in motion. Table 1 provides a comparison of TDE values by GAP and the classical Stillinger-Weber (SW) [41] and Tersoff III (T3) [42] potentials alongside those calculated by DFT [5] and obtained in experiments. The GAP-predicted TDE values in both the 100 and 111 directions are in perfect agreement with DFT and experiments. While SW noticeably overestimates the TDE in all directions, T3 shows a considerable discrepancy only in the 100 direction. The global minimum calculated by GAP is in excellent agreement with DFT and experiments, whereas the predictions of SW and T3 clearly disagree with experiments. This validates our augmentation of the repulsive potential and the way that GAP deals with short-range interactions and defect generation in collision cascades.
We further tested the GAP with single Ar ion implantation simulations. In sputtering simulations, not just a good description of the crystalline, liquid and amorphous phases are important, but also a realistic modeling of the surface binding energy. We simulated 0.1, 0.2, 0.5 and 1.0 keV Ar implantation in Si (see supplemental material for simulation details). Figure 1 shows the acquired sputtering yields with GAP in comparison with SW, T3 and experimental yields [43][44][45][46][47][48]. It also contains the results obtained by Primary-GAP, which is the original potential without our augmented repulsive potential. The clear disagreement of Primary GAP with experiments emphasizes the importance of repulsive response of the potential and its implementation. Although the predictions of all To guide the eye, the same function used in Ref. [43] has been fitted to the simulated yields. potentials are lower than experimental values, they reproduce the energy dependence of the sputtering yield correctly. However, GAP gives the closest agreement with the experiment.
We studied the defect production and clustering of radiation-induced defects by carrying out cascade simulations for 0.1, 0.2, 0.4, 1.0 and 2.0 keV primary knock-on atom (PKA) energies at 300 K (see supplemental material for simulation details). The number of final defects is one of the most important measures in quantifying the extent of irradiation effects in matter. Regarding Si, a question was opened up in Ref. [49] about the actual number of energy-dependent surviving defects upon irradiation. It was shown that the difference in final number of defects between SW and T3, the two most common interatomic potentials in damage analysis of Si, is a factor of 2. As discussed in Ref. [49], this stems from the different functional form of these potentials, which leads to different recrystallization and crystal recovery of the complex configurations generated in the heat spike phase. Figure 2 presents the number of interstitial and coordination defects generated in 2 keV PKA simulations up to 6 ps, after which the system has cooled down to 300 K. The results using GAP reveals that the number of surviving defects is almost the same as in SW, while T3 produces approximately two times more defects. This resolves the abovementioned long-standing debate and supports the prediction of SW regarding the number of radiation-induced defects.
Although the final number of defects is almost the same in GAP and SW, the number of coordination defects and recrystallization of the molten region in the heat spike phase in GAP are clearly different. For instance, in 2.0 keV PKA simulations, GAP shows around 1.5 times more coordination defects than SW and T3 in the heat spike. Moreover, recrystallization is more efficient in GAP as it shows recovery from higher number of coordination defects to the same level of coordination defects in SW during almost the same time span (from maximum heat spike to the equilibrium state). This is related to the different nature of classical and ML potentials. Instead of a fixed, simple functional form, GAP which has been extensively trained to the DFT-calculated liquid and amorphous configurations, treats the molten region in the heat spike and subsequent amorphous regions based on the constructed potential energy surface and a realistic description of these phases can be expected. To endorse this, we refer to Ref. [28] where the structure of liquid and amorphous Si was extensively examined by GAP and compared with analytical potentials. The results showed that GAP has the best agreement with DFT and experiments in characterizing both phases.
Clustering of radiation-induced defects and their size distribution are crucial factors in specifying microstructural evolution of material under irradiation. The size and separation of clusters determine their life time and the strength of their elastic interactions. Hence, the size distribution of initially generated clusters is a critical item for cluster dynamics or rate theory models [50]. Figure 3 presents cluster analysis of defects generated in 2 keV PKA simulations, averaged over the 20 simulations. A defect cluster is defined as a set of neighboring Wigner-Seitz defects; neighbors being defects located within a range up to the r c = 10.8Å [49]. Although isolated defects and small clusters are the dominant forms of surviving defects in all potentials, GAP predicts on average 3 times more isolated defects, and the fraction of small-sized clusters containing 2-10 defects is a factor of 3 and 5 higher compared to the SW and T3 potentials, respectively. On the contrary, the fraction of large clusters containing a major portion of total defects is noticeably lower in GAP. This trend is also true for lower PKA energies, where T3 favors large clustering of defects, followed by SW which with a steady decline produces intermediate sizes as well. The different cluster distributions will substantially affect the long-term evolution of the damage. Unlike defect clusters, isolated defects are highly mobile and can easily recombine, leaving no damage behind [51][52][53][54].
The different clustering behavior can be attributed to another different feature of GAP in cascade simulations; the pattern in which cascades evolve. In the classical potentials, cascades are more compact with collisions confined within pockets, producing dense and localized clustering of the damage. On the contrary, cascades in the GAP spread across a broader spatial range, leaving more isolated defects and covering a larger volume. In other words, more atoms are involved in the distribution of the injected energy by the PKA in the GAP, while the dissipation of energy is concentrated locally in the classical potentials to make larger clusters of displaced atoms. This is shown in Figure 4, where the time evolution of representative 2 keV PKA cascades up to the heat spike phase are illustrated for each of the three potentials. The initial cascade splits into two sub-cascades in the GAP and SW. The different form of cascade propagation between GAP, SW and T3 is clearly visible. Figure 4 also shows the different clustering of the remaining defects in the equilibrium state of the system at 300 K after quenching of cascades and recovery of the displaced atoms as discussed above.
Small defect clusters can play a key role in capturing impurities and self-interstitials in silicon [55]. GAP shows a cluster of four defects in which a vacancy is surrounded by three interstitials, shown in Figure 2. We found this defect structure three times in our simulations with GAP and just once in SW simulations. Among the three occurrences in GAP, one was found in the 0.4 keV PKA simulations and the other two in the 2 keV simulations. For SW we found this cluster in 1 keV PKA simulations. We conjecture that the creation of this defect structure is energy dependent, such that the probability increases with the increase of PKA energy. In one case, the interstitials are symmetrically located around the vacancy. To examine the stability of this defect, we cooled down all cases to 0 K. In one case in the GAP, the central vacancy recombined with one of the surrounding interstitials, but in the other three cases the defect remained as a whole. The stability of the defect can be understood based on coordination analysis of the atoms around the vacancy. In cases where the defect remains stable, the atoms surrounding the vacant position, including the interstitial atoms, all have fourfold coordination (which is the equilibrium coordination number in Si). However, in the case where the interstitial recombined with the vacancy, two of the interstitials had a coordination number of 3. We also checked the stability of the cluster with DFT. We tested 64-and 216-atom boxes with the PBE GGA [56] exchange correlation functional. The cluster remains stable upon relaxing down to the residual forces of 0.01 eV/Å on atoms.
In order to quantitatively examine the clustering behavior, we compare the sizes of the clusters generated in our simulations with those measured by Howe et al. [57,58] in implantation experiments. They used transmission electron microscopy (TEM) to analyse features of collision cascades in silicon by directly observing and characterizing damaged regions created during implantation of various ions (P, As, Sb, Bi) in a wide range of implantation energies. The experiments were conducted at 40-50 K, under low fluences to avoid cascade overlapping. We performed binary collision approximation calculations using srim [59] for all of these implanted ions, calculating the distance between initial point of sub-cascades derived by recoils with energies higher than 500 eV. For heavy ions like 80 keV Bi, with an average cascade-created damage size of 50Å, the average distance between sub-cascades is 15Å, which is directly signaling overlap of sub-cascades created by Si recoils in the substrate. On the contrary, in the case of P ions, where damaged regions by cascades have an average diameter of 18Å, the sub-cascades are initiated 58Å apart from each other. This is in line with the conclusion from experiments that P ion implantation is very inefficient in creating sub-cascades as they have lowest observable damaged regions among the studied ions. Hence, we compare our results with P implantation, where no cascade overlap takes place. Our simulations are representative of the cascades initiated by 0.1, 0.2, 0.4, 1.0 and 2.0 keV Si recoils (our studied PKA energies) in P implantation.
In Figure 5, we present statistics of the average diameter of defect clusters derived from 0.2, 0.4, 1.0 and 2.0 keV PKA simulations.Clusters with average diameter above 10Å are called visible [60]. Considering the recoil-energy distribution of 60 keV P in Si (obtained from SRIM calculations) as conducted in the experiments, we calculated the weighted-average diameter of the visible clusters,d, generated from the above-mentioned PKA events for each potential. 0.1 keV PKAs do not create clusters in the visible region. The weight corresponding to a certain PKA energy, E PKA , was calculated based on the ratio of the number of the produced recoils with E PKA . The weighted-average diameter for each potential then isd [61]. E l and E u are lower and upper limits of the interval that given E PKA resides in, respectively. n(E) is the normalization factor, the number of recoils per ion per energy, d(E PKA ) is the average diameter of the clusters in the given E PKA , and N is the total number of recoils produced by all E PKA s. Figure 5 includes these values alongside the measured average from Ref. [57] which amounts to 18 ± 1.4Å for 60 keV P implantation. Since defects in TEM are imaged through their strain fields [62,63], the experimentally measured cluster sizes are larger than in direct visualization of MD results. We analyzed the effective atomic strain of the defected cells in our simulations, being approximately equal to the second nearest neighbor distance in silicon lattice (3.84Å). Hence we add 3.84Å to our calculated averages from the simulations to allow a fair comparison. We address the presented statistics from two view points. First, by considering the overall weighted average of each potential. GAP with 20.9 ± 0.8Å gives the closest value to the experimental value of 18.0 ± 1.4Å, followed by SW with 23.29 ± 0.8Å and T3 with 34.0 ± 1.3Å, where the uncertainties in our calculations are the weighted standard errors. Second, by taking into account the size distribution of visible clusters generated by the different potentials. Although all of the potentials create clusters larger than the experimental average, the size of the largest cluster in GAP, produced in the highest PKA energy (2 keV), is smaller than in SW and T3. The dominance of small defect clusters agrees with diffuse X-ray scattering experiments which indicated that small defect clusters dominate damage by 4.5 keV He and 20 keV Ga ions in Si at 100 K [51]. The importance of the high ratio of small clusters is also highlighted by them being able to explain the electrical type inversion in Si-based detectors [64].
In conclusion, we have reported the first largescale simulation of irradiation-induced damage and ion implantation with DFT accuracy. We use a machine learning interatomic potential for silicon, GAP, trained over a massive data set of DFT calculations to achieve this goal. Reproducing experimental threshold displacement energies and sputtering yields of Ar implantation, assure us of GAP's performance and reliability in cascade simulations. Clustering of the defects is different in the GAP compared to the classical potentials. Different clustering can be ascribed to the different propagation pattern of the cascades in GAP, as it shows spread-out form of the cascade evolution compared to the compact propagation in the classical potentials. We also encountered a new defect structure, a vacancy surrounded by three interstitials. Moreover, GAP reveals a large fraction of coordination defects in the heat spike phase and more efficient recrystallization of the defects in the postcascade phase. Overall, the key difference between the classical potentials and GAP is that the number of isolated and small defect clusters in GAP are considerably higher than that of large clusters.

Acknowledgments
Grants of computer capacity from CSC -IT Center for Science, Finland, as well as the Finnish Grid and Cloud Infrastructure (persistent identifier urn:nbn:fi:research-infras-2016072533) are gratefully acknowledged. A.H. gratefully acknowledges excellent discussions with F. Granberg and G. C. Sosso. F.D. gratefully acknowledges the financial support of the Academy of Finland (Grant No. 313867).

Disclosure statement
No potential conflict of interest was reported by the author(s).

Funding
Grants of computer capacity from CSC -IT Center for Science, Finland, as well as the Finnish Grid and Cloud Infrastructure (persistent identifier urn:nbn:fi:research-infras-2016072533) are gratefully acknowledged. F.D. gratefully acknowledges the financial support of the Academy of Finland (Grant No. 313867).