Predicting material properties by integrating high-throughput experiments, high-throughput ab-initio calculations, and machine learning

ABSTRACT High-throughput experiments (HTEs) are powerful tools for obtaining large amounts of materials data. However, HTEs often require expensive equipment. Although high-throughput ab-initio calculation (HTC) has the potential to make materials big data easier to collect, in many cases HTC does not reproduce the actual materials data obtained by HTEs. Here we propose using a combination of simple HTEs, HTC, and machine learning to predict material properties. We demonstrate that our method enables accurate and rapid prediction of the Kerr rotation mapping of an Fe x Co y Ni 1-x-y composition spread alloy. Our method has the potential to quickly predict the properties of many materials without a difficult and expensive HTE and thereby accelerate materials development.


Introduction
Recent progress in high-throughput experiments (HTEs) has enabled rapid mapping of composition-structure-property relationships across a large compositional phase space [1][2][3][4] and has helped accelerate materials development. For example, Takeuchi et al. discovered new ferromagnetic materials using only two processes, combinatorial sputtering and scanning SQUID (superconducting quantum interference device) microscopy [5]. However, HTEs often require expensive equipment.
High-throughput ab-initio calculation (HTC) has also helped accelerate materials development [6]. Nishijima et al., for example, performed HTC on the basis of density functional theory and identified better materials for lithium-ion battery cathodes [7]. HTC has the potential to perform composition-property mapping faster than HTE. However, in many cases, HTC does not reproduce the actual composition-property map because the conditions underlying the experimental data (HTE) differ from those assumed in the calculations (HTC). Yoo et al., for example, presented an experimental mapping of Kerr rotation θ K for an Fe x Co y Ni 1-x-y composition spread alloy [8]. It was obtained from a combinatorial surface magneto-optic Kerr effect (SMOKE) experiment, which requires large and expensive equipment. The Kerr rotation was roughly proportional to saturation magnetization M, $|\theta_K| = K_s M$, where K s is a coefficient between 0.2 and 2.0 deg/T [9]. Since saturation magnetization M is almost proportional to magnetic moment m [10], m is a critical descriptor of θ K. Figure 2(a-c) shows the predicted mapping of the magnetic moment for the Fe x Co y Ni 1-x-y composition spread alloy for each structural phase (bcc, fcc, and hcp) obtained by ab-initio calculation. We used the Korringa-Kohn-Rostoker (KKR) Green function method, which enables the disordered phases to be calculated by the coherent potential approximation (CPA) [11]. We used the Akai-KKR package for the KKR-CPA ab-initio calculation [12]. It is clear that the experimental mapping (Figure 1) cannot be predicted by ab-initio calculation alone because of the difference in structural phases. The ab-initio calculation works only for a single structural phase, while actual materials used in HTEs often include a mixture of various structural phases.
Reducing the structural difference between HTE and HTC would enable predicting various properties more rapidly and accurately on the basis of HTC and thereby lead to further acceleration in materials development. We propose combining a simple HTE with HTC and machine learning (ML) for property prediction. ML methods enable rapid analysis of materials big data and have proven to be effective in the development of various materials including potential magnets [13], ferroelectrics [14], superconductors [15] and thermoelectrics [16,17]. We demonstrate that an experimental mapping of Kerr rotation θ K on a composition spread Fe x Co y Ni 1-x-y alloy can be predicted with our proposed method instead of with a difficult and expensive HTE (e.g., a combinatorial SMOKE experiment).

Proposed method
Our proposed property mapping method comprises four steps. As an example application, we describe the steps for predicting the experimental mapping shown in Figure 1 for Kerr rotation θ K for the Fe x Co y Ni 1-x-y composition spread alloy synthesized by Yoo et al. [8]. An Fe x Co y Ni 1-x-y composition spread thin film (100 nm) was fabricated on a sapphire (0001) substrate by combinatorial ion-beam sputtering deposition, followed by postannealing in vacuum at 600°C and 10 −8 Torr for 3 hours. The details are explained elsewhere [8].
First, a combinatorial X-ray diffraction (XRD) experiment, a common and simple type of HTE, was performed for the composition spread sample to obtain comprehensive XRD curves [8]. Figure 3(a) shows the many XRD data points for the composition spread Fe x Co y Ni 1-x-y alloy. The inset shows the XRD curve for Fe 78.5 Co 9.3 Ni 12.2 . The XRD data were obtained using a scanning microbeam X-ray diffractometer with a spatial resolution of 50-300 μm. The details are explained elsewhere [8].
Next, the structure rate (the fraction of each structural phase) was extracted from each XRD curve, which is not straightforward. In the Fe-rich region, the structure was bcc. Similarly, the structures were fcc and hcp in the Ni- and Co-rich regions, respectively. However, an Fe 78.5 Co 9.3 Ni 12.2 composition, for example, may have a mixture of bcc, fcc, and hcp structural phases. Moreover, it may include ordered phases such as the B2 and L1 0 phases. In such cases, the XRD data can be curve-fitted and decomposed into single structural XRD curves to obtain the structure rate information. However, with combinatorial XRD, curve-fitting the many XRD curves one by one takes a long time.
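As a rough illustration of the one-by-one fitting described above, the following Python sketch decomposes a single measured curve into two reference patterns by linear least squares. It assumes that the single-phase reference patterns are already known, which is an assumption made only for this example. The function names, the Gaussian peak shapes, the peak positions, and all numbers are hypothetical placeholders, not data from [8].

```python
import math

def gaussian_pattern(angles, centre, width=0.3):
    """Hypothetical single-phase XRD pattern: one Gaussian peak."""
    return [math.exp(-0.5 * ((t - centre) / width) ** 2) for t in angles]

def dot(u, v):
    """Inner product of two equally long lists."""
    return sum(a * b for a, b in zip(u, v))

def fit_two_phases(measured, ref_a, ref_b):
    """Least-squares weights (x_a, x_b) with measured ~ x_a*ref_a + x_b*ref_b.

    Solves the 2x2 normal equations by Cramer's rule. If one weight comes
    out negative, it is set to zero and the other phase is refitted alone.
    """
    aa, ab, bb = dot(ref_a, ref_a), dot(ref_a, ref_b), dot(ref_b, ref_b)
    ay, by = dot(ref_a, measured), dot(ref_b, measured)
    det = aa * bb - ab * ab
    x_a = (ay * bb - ab * by) / det
    x_b = (aa * by - ab * ay) / det
    if x_a < 0:
        x_a, x_b = 0.0, max(by / bb, 0.0)
    elif x_b < 0:
        x_a, x_b = max(ay / aa, 0.0), 0.0
    return x_a, x_b

# Placeholder 2-theta grid (degrees) and made-up peak positions.
angles = [40.0 + 0.1 * i for i in range(101)]
ref_a = gaussian_pattern(angles, centre=43.0)
ref_b = gaussian_pattern(angles, centre=46.0)

# Synthetic "measured" curve: 70 % phase A and 30 % phase B.
measured = [0.7 * a + 0.3 * b for a, b in zip(ref_a, ref_b)]

x_a, x_b = fit_two_phases(measured, ref_a, ref_b)
rate_a = x_a / (x_a + x_b)  # structure rate of phase A
rate_b = x_b / (x_a + x_b)  # structure rate of phase B
```

Repeating such a fit for every point of a composition spread, with more phases and realistic peak profiles, is what makes the manual route slow.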
Therefore, we used unsupervised ML to decompose the XRD data into single structural XRD curves. Kusne et al. demonstrated that non-negative matrix factorization (NMF) enables combinatorial XRD curves to be quickly decomposed into single structural XRD curves, so the structure rate can be obtained immediately [13,18]. We performed NMF in the R programming language using the 'NMF' package with the 'Brunet' method [19]. Figure 3(b) shows a structure rate mapping. Pie charts showing structure rate R structure (R bcc , R fcc , R hcp , R B2 , and R L10 ) are plotted as a function of composition. The inset in Figure 3(b) shows an example pie chart for Fe 78.5 Co 9.3 Ni 12.2 . It shows a large fraction of the disordered bcc phase and small fractions of the B2 and L1 0 ordered phases. These results agree with previous results [20].
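The idea behind the NMF step can be sketched in a few lines of plain Python. This is not the R 'NMF' package used above; it is a minimal stand-in that applies the Kullback-Leibler multiplicative updates of Lee and Seung, which the 'Brunet' algorithm is generally described as building on. The synthetic two-phase data, the peak positions, the rank, and the helper names (`nmf_kl`, `structure_rates`) are all assumptions made for illustration.

```python
import math
import random

def matmul(A, B):
    """Plain-Python matrix product of two lists of rows."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

def kl_divergence(V, WH, eps=1e-12):
    """Generalized Kullback-Leibler divergence D(V || WH)."""
    total = 0.0
    for v_row, wh_row in zip(V, WH):
        for v, wh in zip(v_row, wh_row):
            wh = max(wh, eps)
            total += (v * math.log(v / wh) if v > 0 else 0.0) - v + wh
    return total

def ratio(V, WH, eps):
    """Element-wise V / (WH + eps)."""
    return [[v / (wh + eps) for v, wh in zip(vr, wr)] for vr, wr in zip(V, WH)]

def nmf_kl(V, rank, n_iter=300, seed=0, eps=1e-12):
    """Factor V (angles x samples) into W (angles x rank) and H (rank x samples)."""
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    W = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    history = [kl_divergence(V, matmul(W, H))]
    for _ in range(n_iter):
        # Multiplicative update of H with W held fixed.
        Q = ratio(V, matmul(W, H), eps)
        w_sum = [sum(W[i][a] for i in range(n)) for a in range(rank)]
        H = [[H[a][j] * sum(W[i][a] * Q[i][j] for i in range(n)) / (w_sum[a] + eps)
              for j in range(m)] for a in range(rank)]
        # Multiplicative update of W with the new H held fixed.
        Q = ratio(V, matmul(W, H), eps)
        h_sum = [sum(H[a]) for a in range(rank)]
        W = [[W[i][a] * sum(H[a][j] * Q[i][j] for j in range(m)) / (h_sum[a] + eps)
              for a in range(rank)] for i in range(n)]
        history.append(kl_divergence(V, matmul(W, H)))
    return W, H, history

def structure_rates(W, H):
    """Per-sample component weights, scaled by basis area and normalized to 1."""
    rank = len(H)
    area = [sum(row[a] for row in W) for a in range(rank)]
    rates = []
    for j in range(len(H[0])):
        weights = [H[a][j] * area[a] for a in range(rank)]
        total = sum(weights)
        rates.append([w / total for w in weights])
    return rates

# Synthetic data: two made-up single-phase patterns on a flat background.
angles = [40.0 + 0.2 * i for i in range(51)]
phase_a = [0.05 + math.exp(-0.5 * ((t - 43.0) / 0.3) ** 2) for t in angles]
phase_b = [0.05 + math.exp(-0.5 * ((t - 46.0) / 0.3) ** 2) for t in angles]
fractions = [1.0, 0.8, 0.6, 0.4, 0.2, 0.0]  # placeholder fraction of phase A
V = [[f * a + (1.0 - f) * b for f in fractions] for a, b in zip(phase_a, phase_b)]

W, H, history = nmf_kl(V, rank=2)
rates = structure_rates(W, H)  # one [rate_1, rate_2] pair per sample
```

The order of the recovered components is arbitrary, so in practice each basis pattern in `W` still has to be assigned to a structural phase (for example, by its peak positions) before the rates are read as R bcc, R fcc, and so on.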
HTC was then performed to obtain magnetic moment m for each composition and structural phase. In addition to the magnetic moment mapping for the disordered phases (m bcc , m fcc , and m hcp ) shown in Figure 2(a-c), the m B2 and m L10 for FeCo-B2 and FeNi-L1 0 ordered phases were calculated.
Next, the magnetic moments m structure were summed over the structural phases, each weighted by its structure rate R structure , to reduce the structural difference between the experimental data (HTE, Kerr rotation θ K ) and the calculation data (HTC, magnetic moment m). The value of the weighted sum should be proportional to Kerr rotation θ K :

$$\theta_K \propto \sum_{\mathrm{structure}} R_{\mathrm{structure}} \cdot m_{\mathrm{structure}} \qquad (1)$$

Figure 1. Mapping of Kerr rotation θ K of Fe x Co y Ni 1-x-y composition spread alloy from SMOKE experiment. Figure produced using data from Yoo et al. [8]. Kerr rotation is proportional to magnetic moment and/or saturation magnetization.
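For a single composition, equation (1) reduces to a short weighted sum. The Python sketch below shows the arithmetic. The structure rates and magnetic moments in it are placeholder values chosen for illustration, not results from the XRD analysis or the KKR-CPA calculation, and the function name `weighted_moment` is hypothetical.

```python
def weighted_moment(rates, moments, tol=1e-6):
    """Right-hand side of equation (1): sum over phases of R_structure * m_structure."""
    if abs(sum(rates.values()) - 1.0) > tol:
        raise ValueError("structure rates should sum to 1")
    return sum(rate * moments[phase] for phase, rate in rates.items())

# Placeholder structure rates for one composition (fractions summing to 1).
rates = {"bcc": 0.80, "fcc": 0.00, "hcp": 0.00, "B2": 0.15, "L10": 0.05}

# Placeholder magnetic moments for each phase at that composition.
moments = {"bcc": 2.2, "fcc": 1.0, "hcp": 1.6, "B2": 2.3, "L10": 1.6}

theta_proxy = weighted_moment(rates, moments)
# 0.80*2.2 + 0.15*2.3 + 0.05*1.6 = 2.185
```

Evaluating this sum at every composition of the spread gives a map that is expected to be proportional to θ K, not equal to it, because equation (1) fixes no proportionality constant.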
By following these steps, we can predict the mapping of Kerr rotation θ K on the basis of a simple HTE (combinatorial XRD), HTC (KKR-CPA), and ML (NMF). Figure 4 shows a predicted mapping of the magnetic moment. Unlike the magnetic moment mappings of pure bcc, fcc, and hcp (Figure 2(a-c)), it agrees with the experimental mapping of Kerr rotation θ K shown in Figure 1.

Results and discussion
Our method still has room for improvement. One approach is to reduce the other types of differences between HTE and HTC. One example is the material shape difference. The material shape for the HTE (Figure 1) was thin film, while that for the HTC (Figures 2 and 4) was assumed to be bulk to reduce calculation cost. Therefore, a more accurate prediction could be obtained by assuming thin film for the HTC. Another approach is to increase the sophistication of equation (1). We simply calculated the sum of the magnetic moments m structure weighted by the structure rates R structure over the structural phases, so the prediction was somewhat rough. By refining the equation on the basis of physics, we could obtain a more accurate prediction. If there is a sufficient amount of data for ML, we could formulate the equation by using supervised ML. Moreover, to improve convenience, we could use a calculation-based phase diagram method (e.g., CALPHAD [21]) instead of XRD and NMF, which would enable the material properties to be roughly predicted even without having a combinatorial XRD machine.
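As one very simple instance of fitting the relation from data, the proportionality constant implied by equation (1) could be estimated by least squares from compositions where both the weighted sum and a measured Kerr rotation are available. The Python sketch below assumes a straight line through the origin. This model choice, the function name `fit_proportionality`, and all numbers are illustrative assumptions and were not part of the present study.

```python
def fit_proportionality(predicted, measured):
    """Least-squares slope k for measured ~ k * predicted (line through the origin)."""
    numerator = sum(x * y for x, y in zip(predicted, measured))
    denominator = sum(x * x for x in predicted)
    return numerator / denominator

# Placeholder weighted sums from equation (1) at three compositions.
predicted_moment = [1.0, 1.5, 2.0]

# Placeholder measured Kerr rotations (deg) at the same compositions.
measured_kerr = [0.10, 0.15, 0.20]

k = fit_proportionality(predicted_moment, measured_kerr)

# Kerr rotation predicted for a new composition with weighted sum 1.8.
theta_new = k * 1.8
```

A more sophisticated supervised model could, for example, learn a separate coefficient for each structural phase, at the cost of requiring more measured data.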

Summary
We presented a property prediction method for composition spread samples that is based on a simple HTE (combinatorial XRD), HTC (KKR-CPA), and ML (NMF). For a composition spread Fe x Co y Ni 1-x-y alloy sample, we demonstrated that our method enables the mapping of Kerr rotation θ K (magnetic moment m) to be predicted without a difficult and expensive HTE (e.g., a SMOKE experiment). This method has the potential to predict not only Kerr rotation but also other material properties and thereby shorten the development time for various materials.

Data availability
The data and the code that support the results within this paper and other findings of this study are available from the corresponding author upon reasonable request.

Disclosure statement
No potential conflict of interest was reported by the authors.