Dissimilarity measure of local structure in inorganic crystals using Wasserstein distance to search for novel phosphors

ABSTRACT To efficiently search for novel phosphors, we propose a dissimilarity measure of local structure using the Wasserstein distance. This simple and versatile method provides the quantitative dissimilarity of a local structure around a center ion. To calculate the Wasserstein distance, the local structures in crystals are numerically represented as a bag of interatomic distances. The Wasserstein distance is calculated for various ideal structures and local structures in known phosphors. The variation of the Wasserstein distance corresponds to the structural variation of the local structures, and the Wasserstein distance can quantitatively explain the dissimilarity of the local structures. The correlation between the Wasserstein distance and the full width at half maximum suggests that candidates for novel narrow-band phosphors can be identified by crystal structures that include local structures with small Wasserstein distances to local structures of known narrow-band phosphors. The quantitative dissimilarity using the Wasserstein distance is useful in the search of novel phosphors and expected to be applied in materials searches in other fields in which local structures play an important role.


Introduction
Phosphors are used for general lighting and display backlight. Transition metal ions and rare earth ions are used as an emitting ion for phosphors. The emitting color of the phosphors depends on the luminescence properties such as the peak wavelength and the emission spectral width. To realize a novel phosphor, the luminescence properties must be controlled. The emissions of Ce 3+ and Eu 2+ based on a 4 f-5d transition can be controlled by changing the local structure around the ions. An 8 K TV requires next-generation green and red Eu 2+ phosphors with a narrow-band emission to satisfy BT.2020 standard. Eu 2+ phosphors with a narrow-band emission are reported for CaF 2 [1], BaSi 2 O 2 N 2 [2], CaAl 2 S 4 [3], β-Sialon [4][5][6], etc. The full width at half maximum (FWHM) of the emission spectrum also depends on the local structure in the phosphor. However, the ideal local structure with the narrowest FWHM is unknown theoretically. Since Eu 2+ in CaF 2 , which is a simple crystal with a cubic local structure, and UCr 4 C 4 -type phosphors with a cubic-like local structure such as SrLiAl 3 N 4 :Eu 2+ [7], SrMg 3 SiN 4 : Eu 2+ [8], CaLiAl 3 N 4 :Eu 2+ [9] and RbLi (Li 3 SiO 4 ) 2 :Eu 2+ [10] show a narrow-band emission, a crystal structure with a cubic local structure has been garnering attention as a novel narrow-band phosphor candidate.
Currently, an efficient approach by material informatics has been used to discover candidate materials from a large amount of data. This method has been applied to search for novel phosphors and interpret relationship between the luminescence properties and material attributes [11][12][13]. Kim et al. extracted K 2 Zn 6 O 7 , which has a cubic-like local structure, and 37 other kinds of crystal structures by screening the crystal structures in the Inorganic Crystal Structure Database (ICSD) under various conditions [13]. In addition, they developed a Sr 2 MgAl 5 N 7 :Eu 2+ narrowband phosphor as an element substitute from K 2 Zn 6 O 7 . Zimmermann et al. reported a method to determine the coordination number of a local structure [14], and expressed the local structure as a vector by defining the local structure order parameters [14][15][16]. Zimmerman's method can classify local structures.
Another method to efficiently screen for a new phosphor is to search for a similar local structure to that in a known practical phosphor. Kim et al.'s method is qualitative because the similarities of the extracted cubic-like structure to the cubic one are not listed. Therefore, quantification of the similarity between the basis local structure and obtained local structure is required. Quantification can directly search for a host crystal, which includes the objective local structure. On the other hand, Zimmermann et al.'s procedure is complex because a new coordinate system must be set to the local structure and many parameters can be introduced. Consequently, a simpler method to obtain the similarity is required. Moreover, a cubic structure is not the only that shows a narrow-band emission.
Therefore, a simple and versatile method to obtain quantitative similarity for a local structure around the center ion is necessary to develop a more efficient phosphor search. In this work, we define the local structure in an inorganic crystal as a geometrical one consisting of the center ion and the surrounding ligands. The local structure is represented as a distribution of the interatomic distances, which consist of the center-ligand distances and ligand-ligand distances. Since the local structure is represented as a distribution, this approach can be applied to any local structures from cube with high symmetry to highly distorted local structures with C 1 symmetry. By expressing the local structure as a distribution, the difference in distribution can be obtained by the method of data science. There are some indicators which showing the difference in distribution. In this study, we propose a dissimilarity measure using the Wasserstein distance. The Wasserstein distance is typically used for image and audio processing as well as generative adversarial networks [17][18][19]. The Wasserstein distance, which is based on the transport problem, is the distance between the distributions, and the calculation does not require any parameters. In the field of data science, a (dis)similarity between data corresponds to a distance between data. Therefore, we can obtain and quantify the dissimilarity between local structures by the Wasserstein distance. Transport problem is calculation of the minimum cost for transporting luggage from one point to another point, and thus the obtained Wasserstein distance in this study is the minimum deformation for changing a local structure that is a polyhedron to another local structure. Hence, a small value of the Wasserstein distance indicates that the local structures are similar, while a large value implies that they are dissimilar To demonstrate the efficiency in searching for novel phosphors, we performed a calculation of the Wasserstein distance to obtain the quantitative dissimilarity between various local structures. Specifically, the Wasserstein distances were calculated in three cases. First, we calculated the difference between characteristic ideal structures such as cubic and confirmed that the variation of the Wasserstein distance corresponds to the structural variation of the local structures. Second, we calculated the difference between the ideal structure and the local structure in known phosphors, demonstrating that the quantitative dissimilarity is effective for actual structures. Finally, we calculated 170 local structures in a known Eu 2+ phosphors [3,10,[20][21][22][23][24], the cubiclike Sr1 site in SrLiAl 3 N 4 , and the 9-coordinated structures in β-Sialon, which are narrow-band phosphors. Then, we discussed the relations of the Wasserstein distance and FWHM of the phosphors. The results confirm that the quantitative dissimilarity by the Wasserstein distance is useful to search for novel phosphors. This method should also be applicable to materials searches in other fields where the local structures in crystals play an important role.

Local structure representation
In this work, the local structure in the crystal is defined as a geometrical structure consisting of the center ion and the ligands. The numeric representation of the local structure is required to be invariant to a translation, rotation, and exchange of the order of the ligands. Accordingly, we propose that the local structure is represented as a bag consisting of the center-ligand and ligand-ligand distances. The bag is expedient and an invariant to a translation, rotation, and exchange of the order of the ligands. However, it lacks the correspondence between atoms. A bag of interatomic distances of an N-coordinated ion is therefore represented by a bag of N center-ligand distances and N(N-1)/2 ligand-ligand distances, namely, where d CL i and d LL i;j are distances between the center ion and i-th ligand, and between i-th and j-th ligands, respectively. The distances can be normalized by the average center-ligand distance, d CL , to cancel the difference in ionic radii of the center ions, as The bag representation of the normalized distances is used in this study. Since the coordination number must be determined to extract the local structure from the actual crystal structure, we employed the CrystalNN method [15]. Here, the bonds are assumed to be equivalent but other cases can be assumed. For example, the ratio of the ligand-ligand distances in the bag increases when the coordination number increases because the number of ligand-ligand distances is where N is the coordination number. It can be corrected by assigning different weights to the center-ligand and the ligand-ligand distances.

Dissimilarity measure of local structures
The distance bag can be expressed as the distribution of the interatomic distances, which is given as where e D � � � � is the number of elements in the bag e D, and δ : ð Þ is the Dirac delta function. In this study, we propose that the dissimilarity of local structures can be defined as the Wasserstein distance, which is the distance between the distance distributions. The Wasserstein distance between one-dimensional distributions of f and g is expressed as where c x; y ð Þ ¼ x À y j j is the cost matrix of the transport distances, and Γ f ; g ð Þ is a set of all couplings of f and g that have marginals f and g on the first and second factors, respectively. The Wasserstein distance is calculated using SciPy packages [25]. If two local structures are congruent or geometrically similar in terms of geometry, W is 0. Figure 1 schematically shows the calculation of the Wasserstein distance between a cubic and a square antiprism that distorts the helix angle 45 degrees from cubic keeping its height as an example. When a cubic structure is expressed as the bag in Equation (1), the bag consists of eight center-ligand distances, . The distributions of cubic and square antiprism in Equation (2) are indicated in Figure 1 as a histogram, where the arrows denote the optimal transport and the numbers next to the arrows are the transport cost. W is calculated as the sum of the products of the transport distance and the transport cost. In this case, W is 0.094. Figure 2 shows the variations of W corresponding to the structural variation of the helix angle θ from a cubic to a square antiprism. The red line indicates W between cubic and the structure with θ, while the green line indicates W between square antiprism with θ = 45° and the structure with the θ. W is calculated in units of 5 degrees for helix angle θ. The variations of W are monotonic. Since W's to cubic and square antiprism cross at 0.047. The cubic is changed by one-dimensional distortion to a flat and slender rectangular parallelepiped. The variations of W which correspond to the structural variation of c/a, where a and c are respectively width and height of the rectangular parallelepiped, from a cubic to a rectangular parallelepiped are shown in Figure 3. The variations of W are also monotonic. From these results, the variation of the W corresponds to the structural variation of the local structures, and W can quantitatively explain the similarity between local structures.

W between ideal structures
In this approach, W can be calculated as the local structures with different coordination numbers because the local structures are expressed as the distance distribution. To calculate W of the local structures with the different coordination numbers, seven kinds of ideal structures were selected: tetrahedron, trigonal bipyramid, octahedron, pentagonal bipyramid, cubic, square antiprism, and cuboctahedron. The center-ligand distances are the same in each ideal structure. Table 1 shows W of the seven kinds of ideal structures. The W between pentagonal bipyramid and square antiprism is the smallest at 0.067, but pentagonal bipyramid and square antiprism are dissimilar because the value is larger than 0.047 in Figure 2. Local structures with different coordination numbers indicate dissimilarity. Therefore, this quantitative dissimilarity can be calculated for any local structure, and a local structure with a different coordination number is dissimilar spontaneously.

W between the ideal structure and the actual local structure in known phosphors
The 170 local structures were extracted from crystal structures registered in ICSD, except for solid solutions, partial occupancy. The central ion of the local structure was an alkaline metal, alkaline earth metal, or rare-earth ion. Figure 4 shows W to cubic and square antiprism of eight local structures, which are selected from the 170 structures of known phosphors. The x and y axes indicate W to cubic and W to square antiprism, respectively. Cubic is located at (0,0.094) on the y axis, while square antiprism is located at (0.094,0) on the x axis. Black circles in a diagram indicate W of square antiprisms with helix angle θ to cubic and square antiprism. When the local structure has a distortion different from θ, the coordinate recedes from the points of the square antiprism structure to the upper right. In this paper, to distinguish the local structure in the crystal structure, we use a notation with a central ion after the chemical formula. For example, the Ca site of CaF 2 is expressed as CaF 2 [Ca]. When there are multiple sites of the same ion  [Ba1] at the top right is dissimilar to both cubic and square antiprism. The dissimilarity using Wasserstein distance is accurately quantified in the actual local structures. Although multiple sites can be substituted in Eu 2+ phosphors, the discussion between the FWHM of the emission spectrum and the local structure is difficult due to effects such as inhomogeneous broadening. In Figure 5( Figure 6(a) shows the distribution of W to the 9-coordinated structure in β-Sialon. Since there is no local structure with W of 0-0.07, the local structures that selected in this study are dissimilar to 9-coordinated structure in β-Sialon. In order to confirm the dissimilarity, the three local structures with the smallest W to the 9-coordinated structure in β-Sialon are shown in Figure 6( O 8 [Sr1] is 9 but the structure is a clearly dissimilar to 9-coordinated structure in β-Sialon. From these results, the 9-coordinated structure in β-Sialon are exceptional in Eu 2+ phosphors. Therefore, if a crystal has a local structure with a very small W to the 9-coordinated structures in β-Sialon, the crystal is a promising candidate for a novel green phosphor with a narrow-band emission.

Conclusions
We proposed a (dis)similarity measure of the local structure using the Wasserstein distance to efficiently search for novel phosphors, and calculated the Wasserstein distance to obtain the quantitative dissimilarity of the local structures in crystals. Herein the local structures in crystals are numerically represented as a bag of the interatomic distances. The representation is invariant to the translation, rotation, and exchange of the order of ligands. This method is simple and versatile. It provides the quantitative dissimilarity for a local structure around the center ion.
The Wasserstein distance was calculated for various ideal and local structures in known phosphors. The variation of the Wasserstein distance corresponds to the structural variation of the local structures, and the Wasserstein distance can quantitatively explain the dissimilarity of local structures. Since W between the SrLiAl 3 N 4 [Sr1] and the SrLiAl 3 N 4 [Sr2] is 0.016, which is the minimum in this work, the two Sr sites are very similar. Therefore, the narrow-band emission of SrLiAl 3 N 4 :Eu 2+ is due to the restraint in inhomogeneous broadening by the similarity of the two Sr sites.
A bulk crystal, which is not a phosphor but includes a small W local structure to SrLiAl 3 [Rb]. On the other hand, there is no structure similar to the 9-coordinated structure in β-Sialon because W of the other local structures exceed 0.07. However, if a crystal has a local structure with a very small W to the 9-coordinated structure in β-Sialon, it will be a promising candidate for the novel green phosphor with a narrow-band emission.