Grain growth prediction based on data assimilation by implementing 4DVar on multi-phase-field model

We propose a method to predict grain growth based on data assimilation by using a four-dimensional variational method (4DVar). When implemented on a multi-phase-field model, the proposed method allows us to calculate the predicted grain structures and uncertainties in them that depend on the quality and quantity of the observational data. We confirm through numerical tests involving synthetic data that the proposed method correctly reproduces the true phase-field assumed in advance. Furthermore, it successfully quantifies uncertainties in the predicted grain structures, where such uncertainty quantifications provide valuable information to optimize the experimental design.


Introduction
Controlling the temporal evolution of grain structures in metals is a crucial issue in obtaining materials with desirable mechanical properties [1][2][3]. To control temporal evolution, predicting the dynamics of grain growth is necessary by considering the minimization of interfacial energy among grains with different crystallographic orientations. Phase-field (PF) models [4][5][6][7][8][9][10] have often been used to simulate such dynamics in materials science. Since PF models allow us to simulate the complex morphological dynamics of grain structures by modeling interactions among grains, simulations using these models help us understand and predict their dynamics with the goal of the efficient development of materials. However, PF models need many phenomenological parameters that are not directly observable in experiments. Thus, establishing a methodology to estimate these parameters from limited observational/experimental data is necessary to pre-CONTACT Hiromichi Nagao nagaoh@eri.u-tokyo.ac.jp dict the dynamics of grain structures. We emphasize here that an evaluation of the uncertainties in the estimated parameters also provides valuable information, since the uncertainties affect the results of the simulated grain structures and, hence, the mechanical properties of metals. Data assimilation (DA), which integrates simulation models and observational/experimental data on the basis of Bayesian statistics, is a computational technique to estimate unobserved states and/or parameters involved in simulation models [11]. It has been applied to many fields of science [12][13][14][15][16][17][18][19]. In general, DA can be classified into online and offline DAs. Online DA sequentially estimates states referring to continuously entered observational data, where a sequential Bayesian filter, such as a particle filter [20,21] or the ensemble Kalman filter [22,23], improves the simulation in accordance with the data. A naïve implementation of online DA is often impractical because of the massive amount of computations needed, proportional to e O(N) , where N is the dimensionality of the variables. Therefore, online DA does not seem appropriate for most PF models, where N easily increases when a fine grid spacing is adopted for the accurate description of the boundaries of different grains.
On the contrary, an offline DA searches the optimum initial states that best match the observed time series. The representative method for offline DA is a four-dimensional variational method (4DVar) [24,25], which is the main subject of this paper. The number of computations needed to obtain optimum initial states does not depend on N such that 4DVar is applicable to even massive simulation models, such as PF models. Although the conventional 4DVar has a serious disadvantage in that it only provides estimates of initial states but cannot evaluate their uncertainties, Ito et al. [19] solved this problem by establishing a new 4DVar that enables us to obtain not only the optimum, but also the associated uncertainty using a second-order adjoint (SOA) method [26][27][28]. This new 4DVar is appropriate for DA based on PF models because it can quantify several uncertainties of interest within a practical number of computations. The authors showed the validity of the new 4DVar by applying it to the most fundamental PF model, the Kobayashi PF model [4], which describes the growth of a single grain driven by a chemical reaction between it and its parent metal. However, the validation is inadequate from the perspective of application to more complex PF models because the dynamics of grain growth are mainly driven by interfacial energy, and the chemical reaction is a secondary factor.
In this paper, we investigate whether the new 4DVar works well in the case of a PF model that describes interface-driven dynamics. Moreover, we propose a DA procedure that allows us to investigate the extent to which uncertainties in the estimated parameters influence the prediction of the temporal evolution of grain structures. The proposed DA procedure based on the new 4DVar is expected to provide valuable information concerning the optimization of experimental design, such as regarding the number of experiments or the observation intervals. The remainder of this paper is organized as follows: Section 2 explains the experimental setup used in this paper and the PF model to describe interface-driven grain growth. Section 3 describes the procedure to predict the temporal evolution of the grain structure. Section 4 validates the proposed procedure through numerical tests and Section 5 contains a discussion of the results. Section 6 states the conclusions and directions for future work in the area.

Observational data
When the grain structure of a metal is evolved in an experiment, a sample of the metal is placed in isother- mal conditions and its grain structure is observed (see Figure 1). In this situation, we can only observe snapshots of the grain structure rougher than those observed before placing the sample into the furnace, since the grain structures in the furnace would be contaminated by noise. We assumed that we had placed a sample in the furnace at t = t 1 , and observed its grain structures at t = t 1 , t 2 , . . . , t K . Note that the grain structure at t = t 1 in the furnace was observed with noise, whereas the original, true grain structure was already known as it could be observed accurately before placing the sample in the furnace. For the experimental setup, the problem to be considered in this paper was estimating the grain structures at t = t P (t P ≥ t K ) by using the observed snapshots at t = t 1 , t 2 , . . . , t K . Solving the predictive problem tells us how the grains are distributed at a future time, and hence is very useful for knowing how long we should keep the metal in the furnace to obtain the desired grain structure.

Multi-phase-field model for grain growth
We describe the dynamics of grain growth driven by the minimization of interfacial energy among grains by means of PF modeling [4][5][6]. The modeling characterizes the grain structure by a set of time-dependent field variables {φ i (x, t)}, termed PF variables. φ i (x, t) describes the probability of the existence of grain i at point x and time t, where the integer i specifies the type of grain to be distinguished. φ i (x, t) takes the value one when point x at time t is occupied only by grain i and zero when grain i does not exist. The case where 0 < φ i (x, t) < 1 means that x belongs to the grain boundary between grain i and other grains. Since {φ i (x, t)} are existence probabilities, they must obey the normalization condition n i=1 φ i = 1, where n is the maximum number of different grains. This paper assumes that the PF variables obey the multi-phase-field (MPF) model proposed by Steinbach et al. [5]: where denotes a Laplacian operator, and and γ are the parameters to be determined. The parameter characterizes the thickness of the grain boundary, where the thickness is proportional to . Considering that several snapshots of the grain structure can be obtained, is a measurable parameter in this model. Thus, does not need to be estimated, and we use it as a unit of spacing. On the other hand, the parameter γ is an unobservable parameter in this model because we cannot estimate γ only by observing the snapshots. Thus, establishing a methodology to estimate γ from limited amounts of observable data is needed, and has been an important task in the field of material science.
In order to compute the time evolution of {φ i (x, t)}, we discretize Equation (1) in rectangular twodimensional (2D) space. Let N x and N y be the numbers of grid points along the x-and y-directions, respectively, N tot be the total number of grid points, i.e. N tot = N x N y , and h be the grid spacing. A periodic boundary condition is imposed on the boundary of the computational domain. Letting φ t i,m be the PF variable of grain i at the mth grid point and time t, Equation (1) can be rewritten as where the Einstein summation convention has been used for repeated subscripts not appearing on the lefthand side, and mm denotes a discrete Laplacian in the regular 2D grid.
We find that the grain structure evolved to reduce the curvature of the grain boundaries, and larger grains grew to absorb smaller grains. For the sake of measuring the grain structure quantitatively, we introduce the average grain size S defined as where we define the areal size S i of grain i as Figure 2(e) shows the time evolution of average grain size S , which grows monotonically in accordance with the growth of the grain structure, so that S is useful for quantitatively evaluating growth.

Methodology
We propose a novel methodology for evaluating the grain structure at t = t P by using observational data based on Bayesian statistics. Since the methodology is based on the DA technique reported in [19], we explain how to apply the technique to the MPF model prior to explaining the methodology. The DA technique allows us to estimate simultaneously the grain structure and the parameter γ at an arbitrary time t O , earlier than the time of the first observation t 1 . Furthermore, the technique, which is based on 4DVar, also allows us to obtain the uncertainty in the parameter γ . The 4DVar is composed of three procedures: (i) defining a 'state-space model' (SSM) and an 'observation model', (ii) constructing a probability density function (PDF) called a 'posterior PDF' based on Bayesian statistics, and (iii) optimization of the posterior PDF using an 'adjoint model'. These procedures are explained in Sections 3.1-3.3. Section 3.4 describes the technique for quantifying the uncertainty in the parameter γ . Finally, we explain our new methodology in Section 3.5.

The state-space model and the observation model
When implementing 4DVar, we must define an SSM, which is the time evolution of a large vector called the 'state vector' that contains variables involved in a given simulation model and, sometimes, the model parameters. The state vector used in this study is given by a time-dependent vector θ t ∈ R N with elements where we use the brief notation (i, m) = m+(i −1)N tot and N is the total number of elements in the state vector, i.e. N = nN tot + 1. Equation (2) written in terms of θ is given by where we abbreviate the superscripts that describe time t for the sake of simplicity, except when the time must be explicitly specified. Moreover, we can write the time evolution of γ as since the parameter γ is not a time-dependent variable. This technique embeds the parameters involved in the simulation model into a large vector together with the time-dependent variables of the model, and allows us to estimate the parameters and the initial states of the variables simultaneously. The set of Equations (8) and (9) is the SSM considered in this study. Since the SSM indicates that the state vector θ evolves autonomously, we are allowed to use a simple notation for the righthand side, i.e.
where F : R N → R N is a function of θ . Note that the trajectory of θ t is determined only by controlling the initial state θ t O = , so that our purpose here is to search for the optimumˆ for that best explains the time series of observational data.
In the construction of the methodology for DA, we need an observation model that describes the relation between observational data and theoretical values comparable with them. Supposing that we can detect the spatial distributions of PF variables from snapshots and that the data include noise, we write the relation between PF variables and observational data D t as where ω t is the observation noise that occurs when keeping the metal in an isothermal environment. We assume that the observation noise is independent and identically distributed, and follows a normal distribution with mean zero and variance σ 2 .

Bayes' theorem
In order to search for the optimumˆ , the 4DVar measures and maximizes the posterior PDF, which is the conditional PDF of with a given D. Bayes' theorem states that the posterior PDF is given by where p( ) and p D| are called the prior PDF and the likelihood function, respectively.
The prior PDF describes prior information provided by experience and intuition. This paper assumes that prior information describes a constraint on given by where L I and U I are the lower and upper bounds of I , respectively. These bounds are given by experience, and some of them can be determined by conditions for the MPF model to be physically valid. Since PF variables indicate the existence probabilities of grains, we set the lower and upper bounds of these variables to L I = 0 and U I = 1 (I = 1, . . . , N − 1), respectively. Furthermore, we set the lower bound L N to zero because γ is a positive parameter. On the other hand, we cannot determine the upper bound U N on the basis of conditions for the MPF model. In this paper, we set it as a large constant γ max . In computations to obtain estimates, it has been confirmed that the selection of γ max does not influence the estimation results. Assuming that the prior PDF p I for I is uniform within the range given by Equation (13), we describe the joint prior PDF p as where The likelihood function quantifies the difference between observational data and the corresponding theoretical values. The relation defined by Equation (11) allows us to describe the likelihood function as where N D denotes the total number of data items, i.e. N D = nN tot , and the standard deviation σ is interpreted as a hyper-parameter to be estimated.
Due to the convenience of numerical computation, we search for the optimumˆ by minimizing cost func-tion J: The constraint in Equation (17) arises from the negative logarithm of the prior PDF p( ), which appears when calculating − log p( | D). The optimum standard deviationσ of σ is determined by optimizing J with respect to σ . By solving ∂J/∂σ = 0, we obtain whereθ is the θ calculated by Equation (10) usinĝ . In practical computations, searching forˆ andσ simultaneously is complex as σ might make J singular. To avoid this difficulty, we decompose the optimization problem into two steps: (i) optimization of another cost function given by and (ii) the calculation ofσ via Equation (18)

Optimization utilizing an adjoint model
When searching forˆ by optimizing the cost function J with the constraints, we should use gradient-based optimization, such as the L-BFGS-B method [29], because the number of dimensions of θ is large. However, the gradient ∂J /∂ needed in gradient-based optimization cannot be calculated by directly differentiating J with respect to , since J does not include explicitly as seen in Equation (19). In 4DVar, we utilize an adjoint method to calculate the objective gradient. This method involves constructing the following three equations with respect to a time-dependent vector λ t ∈ R N : where • denotes the transpose of •, and t f is an arbitrary time later than the time of the last observation t K . J is a time-dependent function satisfying where δ( • ) stands for the Dirac delta function. The details of the derivation of these equations can be found in [24][25][26]. The set of Equations (20)- (22), which is called the adjoint model, states that we can obtain the objective gradient ∂J /∂ as the initial condition of λ (Equation 22) by solving Equation (20) backward in time, with the final condition given by Equation (21). By utilizing the gradient obtained by solving the adjoint model in the iteration of gradient-based optimization, we can search forˆ quickly even when the dimensionality of is large. The number of computations needed to solve the adjoint model is proportional to the same order of the computations needed to solve the simulation model (Equation 10), which is denoted by C. Thus, searching for the optimumˆ can be dealt with by a number of computations proportional to O(C). In Appendix A.1, we present a tangible description of the adjoint model where F is not used.

Uncertainty quantification utilizing a second-order adjoint method
The conventional 4DVar described in Section 3.1 allows us to estimate the optimumˆ quickly even when the dimensionality of is large. However, it does not provide us information about the uncertainty inˆ . To extract this information, another procedure must be implemented on the conventional 4DVar. Ito et al. [19] recently developed a methodology to solve this problem by combining Laplace's method and the SOA method [26][27][28] that efficiently evaluates the second derivative of the cost function. The methodology allows us to extract an uncertainty of interest more efficiently than conventional approaches, and hence is appropriate for our purposes, i.e. evaluating the uncertainty in the parameter γ .
This section briefly explains the essence of the methodology. Having obtained the optimumˆ by optimization using the conventional 4DVar, we assess the posterior PDF p | D in the neighborhood of =ˆ . Given that gradient ∂J/∂ | =ˆ = 0, an approximation of p | D can be obtained by Laplace's method as a multivariate normal distribution: where |•| denotes the determinant of •, and H is a Hessian matrix of J given by or, alternatively, for the definition of J . H is a Hessian matrix of J , i.e.
Let l (1 ≤ l ≤ N) be an index of the component of interest in . Integrating Equation (24) over all variables except l , we obtain a marginal PDF with respect to l as   (29) can be solved by an iterative technique utilizing the Krylov subspace, such as the conjugate gradient method or the conjugate residual method. The iterative method utilizing the Krylov subspace needs to compute many Hessianvector products. The SOA method can compute these Hessian-vector products avoiding the computations of all elements of the matrix. The SOA method considers the time evolutions of two time-dependent vectors ξ t ∈ R N and ζ t ∈ R N , given by and whereλ corresponds to λ obtained by solving Equation (20) with θ =θ , and theθ -derivatives represent the θ -derivatives into which θ =θ is substituted. A combination of Equations (30) (30) and (31). The number of computations used to solve these models is proportional to O(C), so that obtaining an uncertainty in by solving Equation (29) with an iterative method is proportional to O(C). This paper aims to obtain the uncertainty δγ in the estimated parameterγ , and hence we associate l with the index of that corresponds to γ , i.e. l = N. The procedure used to obtain the uncertainty δγ can be summarized as follows.
Step Step 3. Extract r N from the solution r of Equation (29) and compute the uncertainty by δγ =σ √ r N , whereσ can be computed by Equation (18).
Appendix A.2 presents the derivations of the TL and the SOA models for the MPF model.

Prediction of grain structure
The 4DVar-based methodology presented in Sections 3.1-3.4 enables us to determine the optimum estimates that maximize the posterior PDF together with their uncertainties. This section presents how the predictive problem is solved through the obtained optimum estimates and the uncertainties.
Let φ t = θ t 1 , . . . , θ t N−1 be a vector describing the grain structure at time t. In the prediction methodology, we first evaluate the optimumγ for γ and its uncertainty δγ by the 4DVar-based methodology, where the initial time t O is set to t 1 . Since we know the initial grain structure accurately in advance of placing the sample in the heating furnace, we utilize the grain structure as the 'true' structureφ t 1 , i.e. we fix φ t 1 toφ t 1 in the optimization of the 4DVar-based methodology. The simplest way to obtain the predicted grain structure φ t P is to solve the MPF model until t = t P with the optimum parameterγ , starting fromφ t 1 . However, it is not adequate from the perspective of the evaluation of φ t P because the error in the parameter influences the forecast of φ t P . Thus, we calculate the change in the forecast of φ t P when using a value of γ =γ slightly deviating fromγ , by associating the change with the time evolution p φ t |γ , D , which is a conditional PDF of φ t for a given γ =γ and observational data D.
Although the time evolution of the conditional PDF, in principle, can be computed by using a sampling method, such as a particle filter, it is unrealistic as the number of dimensions of φ t is large. Thus, we employ a strategy whereby we track only the evolution of the maximizer of p φ t |γ , D , which is the solution that maximizes the PDF, instead of directly calculating p φ t |γ , D . This strategy can help save a large number of computations, as the complexity of obtaining the time evolution of the maximizer is O(C), as described below.
The strategy requires the construction of the time evolution equation of the maximizer from that of p φ t |γ , D . To evolve p φ t |γ , D , we first need to define the initial PDF p φ t 1 |γ , D . It is linked to the initial joint PDF p θ t 1 | D and described as where the initial joint PDF is equivalent to p( | D) shown in Equation (24). However, employing the righthand side of Equation (32) as the initial PDF makes the computation difficult as we must completely calculate the time evolution of p(θ t | D). Thus, we consider approximating p θ t 1 | D by a multivariate normal distribution q(θ t 1 ) with mean vectoř Since q(θ t 1 ) satisfactorily approximates p(θ t 1 | D) whenγ γ , we consider the time evolution of q(θ t ), rather than that of p(θ t | D), by an integral form using a small time step t as where the kernel function w θ t |θ t− t provides a mapping from θ t− t to θ t . Since the mapping is given by the discretized Equation (10), ( 3 4 ) the kernel function becomes (35) The combination of Equations (33) and (35) provides a transformation from q θ t− t to q θ t , meaning that q θ t for any t can be calculated in principle by recursively using Equations (33) and (35), starting at the initial PDF. However, the direct computation of q θ t is difficult due to the large number of dimensions of θ t and the nonlinearity of F. Thus, we extract the time evolution equation of the maximizer from that of q θ t by linearizing Equation (34). Letθ t be the maximizer of q θ t , andθ be the state vector obtained by solving Equation (34) with given γ =γ and initial stateφ t 1 . A Taylor expansion of Equation (34) in the neighborhood of θ =θ is given by where both matricesǍ

The use of Equation (36) instead of Equation (34) provides an approximation of the kernel function as
(37) The kernel function that describes a linear mapping ensures that when q θ t− t is a multivariate normal distribution, q θ t is another multivariate normal distribution. Since q θ t 1 is a multivariate normal distribution, the time evolution equation of the maximizerθ t is obtained by substituting Equation (37) by Equation (33) and then calculating its expectation, where the initial condition is given byθ In the next step, we associateθ t with the maximizer of p φ t |γ , D . Letφ t be the maximizer of p φ t |γ , D .
Considering that p θ t | D is approximated by q θ t , p φ t |γ , D is described as The linear mapping ensures that q θ t is a multivariate normal distribution  = 1, . . . , N − 1). Note that the calculation of the expectation does not require the covariance matrix U.
In summary, the procedure to obtain the maximizer of p φ t P |γ , D consists of the following three steps.
The use of this procedure together with the method in Section 3.4 enables us to evaluate the uncertainty in prediction from parameter uncertainties. This paper evaluates p φ t P |γ , D by givingγ and δγ , i.e. it calculates three vectorsφ t P ,φ t P + , andφ t P − , which are the maximizers of p φ t P |γ , D , p φ t P |γ + 3δγ , D , and p φ t P |γ − 3δγ , D , respectively.

Setup for estimation tests
We apply the proposed methodology to the MPF model to assess its performance by using estimation tests involving a time series of synthetic data. The synthetic data are created by the following procedures: (i) we prepare a time series of φ with a given initial state φ 0 = φ 0 and parameter γ =γ (we call this time series the 'true trajectory'); (ii) we extract φ t at time t = t k (k = 1, . . . , K) from the calculated time series; and (iii) we then add noise that follows a normal distribution with mean zero and variance σ 2 . Our methodology is validated if it can reproduceγ and accurately estimateφ t P .
We call this type of estimation test a 'twin experiment'.
Throughout the experiments reported in this paper, we use the grain structure shown in Figure 2(a) asφ 0 and γ = 1/τ . When performing the twin experiments, we need to calculate the temporal evolutions of the adjoint model, the TL model, and the SOA model, by discretizing them. For simplicity, we discretize them in time with the Euler scheme, and use the same time step and space resolution as the ones used in Section 2.2. To ensure that each element in satisfies the constraint (Equation 13), J is optimized by using the L-BFGS-B technique [29]. The upper bound γ max of the constraint on γ described in Section 3.2 is set to γ max = 2/τ . When we quantify the uncertainty in γ , we use the conjugate residual method as the Krylov subspace method.

Results and discussion
The proposed method is validated through two types of twin experiments: (I) estimation of parameter γ conditional on fixing φ 0 =φ 0 , and (II) evaluation of φ t P . Test I determines whether the 4DVar-based methodology works well by investigating how the estimations depend on observational data. Test II investigates and quantifies how the prediction of the grain structure depends on the estimations.

Test I: Parameter estimation
Test I investigates the influences of three parameters related to the observational data: (i) the interval of observation T; (ii) the standard deviation of observation noise σ ; and (iii) the length of observation time T max . In this test, we assume that the observations are obtained at equal intervals T, i.e. the number of observations K is given by K = T max / T, and the observation times are given by t k = k T (k = 1, . . . , K). The observational parameters used in Test I are shown in Table 1. Figure 3 shows the results of Test I(i). Figure 3(a) indicates that parameter estimation is successful as the estimated parameter is in the rangeγ − δγ <γ < γ +δγ . Figure 3(b) indicates that δγ is a monotonically increasing function of T, and is proportional to √ T for T < τ . Table 1. Values of the interval of observation T , the standard deviation of observation noise σ , and the length of observation time T max used in Test I. Tests I(i), I(ii), and I(iii) investigated how the estimation depended on T , σ , and T max .     Figure 4 shows the results of Test I(ii). Figure 4(a) indicates that the 4DVar-based method is also successful in parameter estimation, and Figure 3(b) shows that δγ is proportional to σ . The power laws observed in Figures 3(b) and 4(b) become fine in accordance with increasing number of observations (i.e. 1/ T → ∞) or decreasing roughness of data (i.e. σ → 0). Figure 5 shows the results of Test I(iii). Figure 5(a) indicates that parameter estimation is satisfactory. Figure 5(b) shows that δγ is a monotonically decreasing function of T max that does not obey any power laws in the range, and its functional form appears to saturate to a constant when T max → ∞.
These features of the uncertainty δγ can be explained by the following theoretical consideration of the Hessian matrix. Having obtained the optimumˆ by optimization using the conventional 4DVar described in Section 3.3, the Hessian matrix at the optimumˆ can be described by a formal solution given by whereσ is obtained by Equation (18). The timedependent matrix R ∈ R N×N is given by where ⊗ denotes a direct product. Letting the number of observations be infinity with a fixed length of observation time, i.e. T → 0 and K → ∞ such that TK = T max = constant, the limitation of the Hessian matrix becomes where a matrix Q ∈ R N×N is given by (44) As Q does not depend on T orσ , we immediately find, from the functional form of Equation (43), that the uncertainty is proportional to σ √ T on the assumption thatσ converges with σ when K → ∞. This fact qualitatively explains the power laws associated with T and σ shown in Figures 3(b) and 4(b), respectively. Furthermore, as Q includes all information pertaining to the simulation model, the power laws are naturally independent of the details of the model. The T maxdependency of the uncertainty can be described by Q −1 . Although it is difficult to evaluate Q −1 analytically, its long-term behavior can be discussed. As seen in Equation (44), the diagonal elements in term (a) are always positive, whereas the off-diagonal elements in term (a) and all elements in term (b) can be either positive or negative. When T max is significantly large, the diagonal elements of Q grow to positive values larger than the off-diagonal elements, and this growth continues until φ t is uniform in the computational domain. That is, Q −1 approaches a constant diagonal matrix when T max → ∞, and this causes the saturation of the uncertainty. For this reason, we consider that the uncertainty shown in Figure 5(b) approaches a constant value when T max → ∞.

Test II: Prediction of grain structure.
In Test II, we use three types of synthetic datasets parameterized by interval T. The synthetic dataset is assumed to be composed of two snapshots of grain structures (i.e. K = 2) at times t 1 = 25.6τ and t 2 = t 1 + T, where the list of T used in this test is summarized in Table 2. The standard deviation σ for generating the synthetic data is set to 0.1. Using the synthetic dataset, we estimate the parameter γ together with its uncertainty, and then apply our prediction method. In the three tests, we confirmed that the uncertainties were significantly small compared with the estimated parameters. Figure 6 shows the results of Test II. To quantify the differences between Tests II(i)-II(iii), we investigate  the temporal evolution of the normalized variation of the average grain size given by where˜ S indicates the average grain size calculated by using the PF variables that obeyed the true trajectory. Let d m , d + , and d − be the normalized variations using the PF variablesφ,φ + , andφ − , respectively. Figures 6(a)-6(c) indicate the trajectories of the normalized variations in Tests II(i)-II(iii). The estimation of the grain structure is successful because the trajectory of d m is almost zero. Furthermore, we find that the trajectories of d + and d − enclose that of d m , meaning that our methodology is successful in providing the upper/lower realizations of the grain structure. The deviations in d + and d − from d m tend to grow over time, and their growth rates are smaller in accordance with the magnitude of T. These results imply that using a larger T provides better estimation of grain structure because γ characterizes the speed of the grain boundary, meaning that using snapshots of structures that are significantly different is advantageous for estimating γ . Furthermore, d + and d − fluctuate in their trajectories, indicating that the normalized variation is sensitive to that in the average grain size. In particular, the large reductions of the deviations appear at t ∼ 200τ in Figures 6(a)-6(c). These phenomena can be explained by considering the dynamical properties of the MPF model. Figures 6(d) and 6(e) show the predicted grain structures ρ + and ρ − , respectively, at t = 200τ in Test II(i), which are ρ calculated byφ + andφ − , respectively. They show a small grain in ρ − that is not observed in ρ + . In general, once small grains have disappeared, the grain boundaries around multiple points rapidly vary to their equilibrium forms (i.e. triplet points with a 120-degree structure). Once the 120-degree structure has been formed, the growth/decay of grains slows down. This implies that the grain structures ρ + and ρ − , once the small grains have disappeared, tend to be synchronized with each other. The large reduction in the deviation at t ∼ 200τ is due to this synchronization.

Conclusions
This paper proposed a 4DVar-based DA procedure applicable to the MPF model that describes purely interface-driven grain growth. We adopted the 4DVar technique proposed by Ito et al. [19] that allows us to obtain the optimum estimates together with their uncertainties within realistic computational time and resources, even in the case of massive simulation models like the MPF model. The procedure properly estimated parameters of the MPF model together with their uncertainties. Moreover, we devised a method to determine how a perturbation of the parameter influences the prediction of the temporal evolution of grain structure. The method allows us to quantify the uncertainty propagation of the parameter by combining the 4DVar technique proposed by Ito et al. We tested the method through numerical tests and quantified how the observational conditions influenced the prediction of the temporal evolution of grain structure. These results are expected to provide valuable information related to the mechanical properties of metals.
Further investigation into the uncertainty in the prediction can improve the design of the observational setup. For example, we can optimize a distribution of sensors and/or observation frequencies that minimizes prediction uncertainties. Such an investigation can allow us to attain a cost-effective development of materials within limited observational time and resources. Furthermore, the proposed prediction method is applicable not only to the interface-driven MPF model, but also to other PF models that involve more complex physics, such as various types of chemical reactions, and thermal and elastic conditions. Through such applications, we can predict the future states of the key factors related to mechanical properties, such as the spatial distributions of chemical compositions and the residual stress in the metal, as well as the grain structure. The 4DVar technique can also estimate past states by shifting the time of origin t O to before the first observational time t 1 . This enables us to uncover the histories of the mechanical properties and then pursue the causality that explains the given state of the metal. We believe that such applications will contribute to the efficient development of materials in the future.

Disclosure statement
No potential conflict of interest was reported by the authors.

A.1. Derivation of the adjoint model
Substituting the right-hand sides shown in Equations (8) and (9) for F as shown in Equation (20), the adjoint model can be given as (A3)

A.2. Derivation of the tangent linear model and the second-order adjoint model
The (A5)