Practical applications and limitations of basalt discrimination diagrams

ABSTRACT Determining the tectonic setting of unknown volcanic rocks continues to be one of the key challenges in geoscience. While discrimination diagrams have been successfully employed due to their ease of use, recent validation with big data has raised questions about their performance. In this study, the discrimination boundaries of the Th/Yb versus (vs.) Nb/Yb and TiO2/Yb vs. Nb/Yb diagrams, which are the most widely used types of discrimination diagrams, were redefined based on a large amount of compiled data and a support vector machine, a machine learning method. The effectiveness of the discrimination diagrams was verified, and the limitations and conditions for using them were clarified. The results show that when using the Th/Yb vs. Nb/Yb diagram, only basalts with Th/Yb ratios higher than the discrimination boundary can be identified as volcanic arc in origin. In contrast, a significant overlap occurs across boundaries in other cases when using these diagrams, particularly for enriched samples with Nb/Yb ratios higher than five. Therefore, when using these diagrams to determine the tectonic setting of unknown samples, their limitations must be considered when interpreting the results.


Introduction
Discriminating igneous rocks with unknown tectonic settings has long been an important topic in geoscience. The demand for differentiating oceanic rocks is particularly high because understanding the tectonic setting of rocks subducted/obducted and then exposed on land provides important clues to the mechanism and history of plate tectonics (Davies, 1992; Hamilton, 1998; Kato & Nakamura, 2003; Stern, 2005).
One criticism of conventional discrimination diagrams is that they are based on boundaries drawn by the human eye using insufficient data (Snow, 2006; Vermeesch, 2006a, 2006b). Recent developments in trace-element datasets of igneous rocks provide an important opportunity to define theoretically rational boundaries for conventional discrimination diagrams and to discuss their applicability. This is expected to result in more efficient utilization of the diagrams. This study re-examines long-familiar discrimination diagrams using big data and discusses their validity and limitations.

The Th/Yb versus Nb/Yb and TiO2/Yb versus Nb/Yb discrimination diagrams
In this study, the Th/Yb versus (vs.) Nb/Yb and TiO2/Yb vs. Nb/Yb discrimination diagrams defined by Pearce (2008) were used. The reasons for this are as follows.
(1) Although these diagrams cannot simultaneously discriminate between various settings, they can discriminate between oceanic and volcanic arc basalts (Th/Yb vs. Nb/Yb diagram) and between mid-ocean ridge and oceanic island basalts (TiO2/Yb vs. Nb/Yb diagram), which is useful in practical applications.
(2) These diagrams use the elemental ratios (rather than elemental concentrations) of immobile elements that are resistant to weathering, alteration, and metamorphism, and can be applied to older rocks affected by the addition and/or depletion of highly mobile elements. Notably, chemical composition data are subject to the constant sum constraint, in which the sum of all element concentrations is fixed at 1, 100 (%), or 1,000,000 (ppm) (Aitchison, 1986). Therefore, if the concentration of one element changes significantly due to alteration, the concentrations of the other elements (even if immobile) will also change. Only the ratios of immobile elements can avoid this effect because they are unaffected by the relative variations caused by increases or decreases in mobile elements.
(3) Because these discrimination diagrams have been well studied, the implications of the results are relatively clear and can be interpreted using petrological rationale (Pearce, 2008; Pearce et al., 2021).
However, these discrimination diagrams did not provide valid discrimination results in recent big data validations (Li et al., 2015); therefore, discrimination boundaries and applicability must be re-examined.
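The constant-sum effect described in reason (2) can be illustrated with a small numerical sketch. The composition below is entirely hypothetical (arbitrary units, with a single lumped "mobile" component and a "rest" component), and the helper name `close` is an assumption introduced here for illustration. It only shows that re-closing a composition after loss of a mobile component shifts the apparent concentration of an immobile element while leaving the ratio of two immobile elements unchanged.

```python
def close(composition, total=100.0):
    """Rescale all parts so that they sum to `total` (constant-sum constraint)."""
    s = sum(composition.values())
    return {name: value * total / s for name, value in composition.items()}

# Hypothetical fresh rock; the numbers are illustrative, not measured data.
fresh = close({"mobile": 10.0, "Th": 1.0, "Nb": 4.0, "Yb": 2.0, "rest": 83.0})

# Alteration removes half of the mobile component; nothing else is gained or lost.
altered_parts = dict(fresh)
altered_parts["mobile"] *= 0.5
altered = close(altered_parts)  # the analysis still reports a total of 100

# The reported Th concentration rises although no Th was added ...
print("Th:", fresh["Th"], "->", altered["Th"])
# ... whereas the Th/Yb ratio is the same before and after alteration.
print("Th/Yb:", fresh["Th"] / fresh["Yb"], "->", altered["Th"] / altered["Yb"])
```

The same reasoning applies to Nb/Yb and TiO2/Yb, provided that the elements in the ratio are themselves immobile.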
The MOR samples were limited to those collected from spreading centers. Moreover, samples from the Galápagos and Iceland, plume-ridge interaction zones, and those from the Mediterranean Sea, a continental collision zone, were excluded.
The total number of samples in this dataset was 6263, broken down by tectonic setting: 467 in CA, 350 in IOA, 604 in IA, 476 in BAB, 901 in OI, 628 in OP, and 2837 in MOR (Supplementary Table S1).
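As a quick consistency check, the per-setting counts quoted above can be tallied and grouped into the two classes compared on the Th/Yb vs. Nb/Yb diagram (oceanic: MOR, OI, OP; volcanic arc: CA, IA, IOA, BAB). This is only an arithmetic sketch; the variable names are assumptions made here.

```python
# Sample counts per tectonic setting, as quoted in the text.
counts = {"CA": 467, "IOA": 350, "IA": 604, "BAB": 476,
          "OI": 901, "OP": 628, "MOR": 2837}

arc_settings = ("CA", "IOA", "IA", "BAB")
oceanic_settings = ("OI", "OP", "MOR")

total = sum(counts.values())
n_arc = sum(counts[s] for s in arc_settings)
n_oceanic = sum(counts[s] for s in oceanic_settings)

print(total, n_arc, n_oceanic)  # 6263 1897 4366
```

The counts sum to the stated total of 6263, and the two classes are unequal in size (1897 arc vs. 4366 oceanic samples), with MOR alone contributing roughly 45% of the dataset.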

Method for determining discrimination boundaries
The boundaries were defined using a support vector machine (SVM). An SVM is a supervised machine learning model that can be applied to problems such as classification and regression (Boser et al., 1992). When a two-dimensional data group represented by two features is classified into two classes using a linear model, the straight line that best separates the two groups is determined by calculating the distance from each candidate line to the data points closest to it (called the support vectors) and selecting the line that lies farthest from the support vectors of both classes (Cortes & Vapnik, 1995). Compared with other machine learning methods, SVM has the advantage of generating highly accurate models with minimal data and easy parameter adjustment, although it lacks result transparency (Auria & Moro, 2008). This study used the Scikit-learn (Pedregosa et al., 2011) implementation of SVM.
When using chemical composition data for geological samples, such as those used in this study, the presence of errors and anomalous data cannot be completely excluded, nor can the two classes be expected to be completely separated at the boundary. Therefore, the soft margin method, which allows boundaries to be drawn even when the data are not completely separated, was used in this study.
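The soft-margin idea can be made concrete by writing down the quantity that a linear soft-margin SVM minimizes, 0.5·||w||² + C·Σ max(0, 1 − yᵢ(w·xᵢ + b)), following Cortes & Vapnik (1995). The sketch below only evaluates this objective for a given candidate line; it is not the Scikit-learn solver used in the study, and the points, labels, and value of C are hypothetical.

```python
def soft_margin_objective(w, b, points, labels, C=1.0):
    """Evaluate 0.5*||w||^2 + C * sum(hinge losses) for the line w.x + b = 0."""
    regularizer = 0.5 * sum(wi * wi for wi in w)
    hinge_total = 0.0
    for x, y in zip(points, labels):
        score = sum(wi * xi for wi, xi in zip(w, x)) + b
        # Zero when the point is on the correct side with margin >= 1,
        # positive (a finite penalty) when it is inside the margin or misclassified.
        hinge_total += max(0.0, 1.0 - y * score)
    return regularizer + C * hinge_total

# Hypothetical points in a two-feature (log-ratio) space; labels are +1 / -1.
points = [(0.0, 1.0), (1.0, 2.0), (2.0, 3.0),
          (0.0, -1.0), (1.0, 0.0), (2.0, 1.0)]
labels = [+1, +1, +1, -1, -1, -1]

w, b = (-1.0, 1.0), 0.0  # candidate boundary: the line y = x
clean = soft_margin_objective(w, b, points, labels)

# One anomalous -1 sample placed among the +1 samples: no straight line can
# now separate the classes perfectly, but the objective stays finite.
noisy = soft_margin_objective(w, b, points + [(1.0, 2.5)], labels + [-1])

print(clean, noisy)
```

With a hard margin the second data set would have no admissible line at all; with a soft margin the anomalous sample simply adds a penalty, and the parameter C controls how strongly such violations are weighted against the width of the margin.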
When analyzing the data using SVM, the data were expressed as a logarithm with a base of 10. In general, it is known that an SVM is sensitive to the scale of the input features; thus, standardization of the input data is needed. However, a comparison of the SVM results with and without data standardization reveals no difference in boundary line positions or discrimination performances between them (Figure 1). Notably, the discrimination diagrams examined in this study originally used the log ratios of elements as variables, and the data distribution when using log ratios was not significantly different between standardized and non-standardized data. In other words, the range of the variables did not differ significantly, and the position of the zeros did not differ significantly from the data average (Figure 1). This explains why SVM can be used effectively even when the data are not standardized. Therefore, in this study, we decided not to standardize the data when applying SVM. This enabled us to plot the analyzed data directly on the discrimination diagrams, similar to conventional discrimination diagrams.
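Because the features are log10 ratios, a straight boundary in feature space corresponds to a power-law curve on the conventional log-log diagram, and standardization (subtracting a mean and dividing by a standard deviation) is an affine change of variables that can be undone algebraically. The sketch below illustrates both conversions. All coefficients, means, and standard deviations are hypothetical placeholders, not values from the study, and the function names are assumptions introduced here.

```python
import math

def unstandardize(w_std, b_std, mean, std):
    """Map a line fitted on z = (u - mean) / std back to raw log10 features u."""
    w_raw = [w / s for w, s in zip(w_std, std)]
    b_raw = b_std - sum(w * m / s for w, m, s in zip(w_std, mean, std))
    return w_raw, b_raw

def line_to_power_law(w, b):
    """Rewrite w1*log10(x) + w2*log10(y) + b = 0 as y = a * x**k."""
    w1, w2 = w
    if w2 == 0:
        raise ValueError("a vertical boundary cannot be written as y = a * x**k")
    return 10.0 ** (-b / w2), -w1 / w2

# Hypothetical line in standardized space and hypothetical feature statistics.
w_std, b_std = [-0.8, 1.2], 0.1
mean, std = [0.3, -0.4], [0.5, 0.6]

w_raw, b_raw = unstandardize(w_std, b_std, mean, std)
a, k = line_to_power_law(w_raw, b_raw)

# A sample lying exactly on the power-law curve has a decision value of zero
# whether it is evaluated in raw or in standardized coordinates.
x = 2.0
y = a * x ** k
u = [math.log10(x), math.log10(y)]
z = [(ui - m) / s for ui, m, s in zip(u, mean, std)]
raw_value = sum(w * ui for w, ui in zip(w_raw, u)) + b_raw
std_value = sum(w * zi for w, zi in zip(w_std, z)) + b_std
print(a, k, raw_value, std_value)
```

This shows only that a given line can be expressed in either coordinate system. Whether fitting on standardized or unstandardized data yields the same line is an empirical matter, which the comparison in Figure 1 addresses.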

Comparison of discrimination boundaries with Pearce (2008) and Pearce et al. (2021)
Figure 2 depicts the determined boundaries together with the samples used to define them. The boundaries were drawn between oceanic (MOR, OI, OP) and volcanic arc (CA, IA, IOA, BAB) samples for the Th/Yb vs. Nb/Yb diagram (Figure 2a), and between MOR and OI samples for the TiO2/Yb vs. Nb/Yb diagram (Figure 2b). Considering the Th/Yb vs. Nb/Yb diagram (Figure 2a), the boundary determined in this study was generally the same as that determined by Pearce (2008), although the divergence tended to be slightly higher on the low-Nb/Yb side. Compared to the boundary determined by Pearce et al. (2021), Th/Yb tended to be slightly lower on the low-Nb/Yb side; however, the divergence decreased on the high-Nb/Yb side.
For the TiO2/Yb vs. Nb/Yb diagram (Figure 2b), the discrepancy between the boundary defined in this study and those of Pearce (2008) and Pearce et al. (2021) was greater on the low-Nb/Yb side, but smaller on the high-Nb/Yb side. In addition, the discrepancies tended to be larger for Pearce et al. (2021) than for Pearce (2008), particularly on the low-Nb/Yb side.

Th/Yb vs. Nb/Yb diagram
Almost all oceanic (MOR, OP, and OI) basalts plotted below the boundary, and none plotted on the high-Th/Yb side of the boundary, except for scattered erroneous data (Figure 3a). Therefore, samples with Th/Yb ratios higher than the boundary can be immediately identified as volcanic arc basalts. This provides useful insight into the genesis of ophiolites and greenstone belts (Pearce, 2008).
Samples with Th/Yb ratios lower than the boundary cannot be immediately described as oceanic basalts because some volcanic arc basalts have Th/Yb ratios lower than the boundary (Figure 3b).Therefore, although a sample set without high Th/Yb ratios would most likely consist of oceanic basalts, the possibility of a volcanic arc cannot be ruled out.
Considering each tectonic setting separately, for the IA and IOA basalts, only a few samples plotted in the area below the boundary (Figure 4a, b). In contrast, for the CA basalts, a portion of the samples plotted below the boundary (Figure 4c). However, most of the CA samples that plotted below the boundary had Nb/Yb ratios higher than five, and when limited to samples with Nb/Yb ratios lower than five, fewer samples plotted below the boundary (Figure 4c).
More than half of the BAB samples plot below the boundary, indicating that oceanic and BAB basalts could not be discriminated in this plot (Figure 4d). One possible reason for this is that the formation mechanism of the BAB basalts is the same as that of the MOR basalts, that is, decompression melting in a tensile field, resulting in similar chemical compositions. However, as this discrimination diagram is based on a combination of only three elements (Th, Nb, and Yb), these basalts could possibly be discriminated using more elements.
Thus, the rocks plotted above the boundary in the Th/Yb vs. Nb/Yb diagram can be regarded as originating from volcanic arcs. Moreover, for depleted samples with Nb/Yb ratios lower than five, rocks plotted below the boundary can be regarded as oceanic and/or BAB basalts because BAB basalts are plotted both below and above the boundary line. This suggests that some BAB basalts with Th/Yb values below the boundary are less affected by the volcanic arc component. In contrast, rocks with Nb/Yb ratios higher than five cannot be clearly classified into oceanic and volcanic arc basalts using this diagram because the BAB, CA, and oceanic basalts are plotted below the boundary.
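The interpretation rules summarized above can be written as a small decision function. This is a sketch under stated assumptions: the boundary is passed in as a callable, and `placeholder_boundary` is a hypothetical power law, not the SVM boundary determined in this study. The handling of samples that fall exactly on the boundary or exactly at Nb/Yb = 5 is also an assumption, since the text does not specify it.

```python
def interpret_th_nb(nb_yb, th_yb, boundary, enriched_threshold=5.0):
    """Interpret one basalt sample on the Th/Yb vs. Nb/Yb diagram.

    `boundary` maps an Nb/Yb value to the Th/Yb value on the boundary line.
    """
    if th_yb > boundary(nb_yb):
        # Above the boundary: oceanic basalts are essentially absent here.
        return "volcanic arc"
    if nb_yb < enriched_threshold:
        # Depleted sample below the boundary.
        return "oceanic and/or BAB"
    # Enriched sample below the boundary: BAB, CA, and oceanic basalts overlap.
    return "ambiguous (oceanic or volcanic arc)"

def placeholder_boundary(nb_yb):
    # Hypothetical coefficients for illustration only.
    return 0.2 * nb_yb ** 0.9

print(interpret_th_nb(1.0, 1.0, placeholder_boundary))
print(interpret_th_nb(1.0, 0.05, placeholder_boundary))
print(interpret_th_nb(10.0, 0.5, placeholder_boundary))
```

Note that only the first outcome is a positive identification; the other two leave more than one setting open and call for additional geological or geochemical evidence.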

TiO2/Yb vs. Nb/Yb diagram
This plot is applicable only to oceanic basalts (Pearce, 2008). Thus, the oceanic basalts were examined individually. As shown in Figure 5a, almost all the OI basalts plot above the boundary. Almost all MOR samples with Nb/Yb ratios below five were plotted on the lower side, whereas many of those with Nb/Yb ratios above five were plotted on the upper side (Figure 5b). This suggests that the enriched MOR basalts are a mixture of normal MOR basalts and OI basalts, or that the melting process of the enriched MOR basalts is similar to that of OI basalts (i.e., melted deep in the mantle). This result indicates that distinguishing MOR samples from OI samples in this diagram is difficult, particularly for enriched samples (Nb/Yb > 5), although the samples plotted above the boundary are likely OI basalt.
Looking at the OP basalts, the samples were plotted both above and below the boundary (Figure 5c). This suggests that OP basalts may be formed mainly by plume-ridge interactions, or that plume formation may be different from that of OI basalts. From the above results, we can conclude that it is difficult to discriminate between the MOR, OI, and OP samples using this diagram. However, samples with high TiO2/Yb ratios were affected by the OI component.
These results indicate that the tectonic settings of unknown samples cannot be uniquely determined using this discrimination diagram. However, if Nb/Yb is less than five, samples plotting below the boundary are either MORB or OPB, whereas those plotting above the boundary are OIB or OPB. In contrast, if Nb/Yb is greater than five, all oceanic basalts plot on the upper side of the boundary and cannot be distinguished. However, if the possibility of OP can be ruled out by geological evidence, we can conclude that, among samples with Nb/Yb less than five, those plotted above and below the boundary are OI and MOR basalts, respectively.
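The case distinctions for this diagram can likewise be sketched as a lookup on three conditions: whether the sample is enriched, on which side of the boundary it plots, and whether OP has been excluded on independent geological grounds. As before, `placeholder_boundary` is a hypothetical curve and not the boundary of this study, and treating Nb/Yb = 5 exactly as enriched is an assumption made here.

```python
def interpret_ti_nb(nb_yb, ti_yb, boundary, op_ruled_out=False,
                    enriched_threshold=5.0):
    """Interpret one oceanic basalt on the TiO2/Yb vs. Nb/Yb diagram.

    `boundary` maps an Nb/Yb value to the TiO2/Yb value on the boundary line.
    """
    if nb_yb >= enriched_threshold:
        # Enriched samples: MOR, OI, and OP basalts overlap.
        return "not distinguishable"
    above = ti_yb > boundary(nb_yb)
    if op_ruled_out:
        return "OIB" if above else "MORB"
    return "OIB or OPB" if above else "MORB or OPB"

def placeholder_boundary(nb_yb):
    # Hypothetical coefficients for illustration only.
    return 0.6 * nb_yb ** 0.3

print(interpret_ti_nb(2.0, 1.5, placeholder_boundary))
print(interpret_ti_nb(2.0, 0.4, placeholder_boundary))
print(interpret_ti_nb(2.0, 1.5, placeholder_boundary, op_ruled_out=True))
print(interpret_ti_nb(8.0, 1.5, placeholder_boundary))
```

The function is intended for oceanic basalts only, in line with the stated scope of the diagram.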
As mentioned previously, this plot was originally only available for oceanic basalts; however, when the arc basalts were plotted, almost all BAB and IOA samples were plotted below the boundary (Figure 5d, e). In contrast, many of the enriched IA and CA samples with Nb/Yb values higher than five plot above the boundary (Figure 5f, g), whereas those with Nb/Yb values lower than five plot below the boundary, as do the BAB and IOA basalts (Figure 5d, e). This suggests that some plume-related rocks are present within the IA and CA basalts.

Conclusions
Redefinition of the boundary lines and examination of the practical feasibility of the Th/Yb vs. Nb/Yb and TiO2/Yb vs. Nb/Yb discrimination diagrams yielded the following results:
(1) Based on the Th/Yb vs. Nb/Yb diagram, basalts with Th/Yb ratios higher than the discrimination boundary can be regarded as originating from volcanic arcs. Additionally, for depleted samples with Nb/Yb ratios lower than five, the rocks plotted below the boundary can be regarded as oceanic and/or BAB basalts. However, rocks with Nb/Yb ratios higher than five cannot be clearly classified as oceanic or volcanic arc basalts using this diagram.
(2) For the TiO2/Yb vs. Nb/Yb diagram, the OI basalts were plotted only above the discrimination boundary; however, the MOR basalts with Nb/Yb ratios greater than five were also plotted there, and the OP basalts were plotted in both regions. Thus, the tectonic setting of unknown samples cannot be uniquely determined using this discrimination diagram. Only in cases where the possibility of OP can be ruled out, for example, by geological evidence, can samples with Nb/Yb ratios less than five plotted above and below the boundary be determined as OI and MOR basalts, respectively.
(3) Plotting the volcanic arc basalts on the TiO2/Yb vs. Nb/Yb diagram shows that most of the enriched IA and CA basalts with Nb/Yb ratios higher than five are plotted on the OI-basalt side. This suggests that some plume-related rocks could have been included among those classified as IA or CA basalts.
(4) In both diagrams, the discrimination performance for enriched samples with Nb/Yb ratios greater than five was significantly poor, whereas it was relatively good for depleted samples with Nb/Yb ratios less than five.
Consequently, the findings of this study suggest that extreme caution should be exercised when using these diagrams to determine the tectonic settings of unknown samples. In addition, these diagrams cannot completely account for the effects of crustal assimilation or even fractional crystallization and mineral accumulation. Therefore, the diagrams should be used as first-order indicators of the tectonic settings, and the results must be interpreted with an awareness of their limitations and other clues, such as geological and mineralogical observations.

Figure 1. Comparison between the results of SVM using unstandardized data (a, b) and standardized data (c, d). Note that there is essentially no difference between the results using unstandardized and standardized data for both Th/Yb versus (vs.) Nb/Yb diagrams (a, c) and TiO