Resonance Scattering Treatment with the Windowed Multipole Formalism

A new method for directly sampling the neutron resonance upscattering effect is presented. Alternatives have relied on inefficient rejection sampling techniques or large tabular storage of relative velocities. None of these approaches, which require pointwise energy data, are particularly well suited to the windowed multipole cross section representation. The new method called multipole analytic resonance scattering (MARS) overcomes these limitations by inverse transform sampling from the target relative velocity distribution where the cross section is expressed in the multipole formalism. The closed form relative speed distribution contains a novel special function we deem the incomplete Faddeeva function: $$ w(z, x) = \frac{i}{\pi} \int_{-\infty}^x \frac{e^{-t^2} dt}{z-t}. $$ We present the first results on its efficient numerical evaluation.


Introduction
Early continuous energy Monte Carlo neutron transport programs sampled scattering from nuclei in thermal motion assuming that the scattering cross section is effectively constant within the scattering kernel [17].However, as [32,30] detail , the resulting scattering kernel implied by the constant cross section approximation may be far from the actual doubledifferential cross section near a scattering resonance.As shown in [24,29], the resulting error tends to cause a worst-case 11% underestimation of the Doppler feedback coefficient in a PWR, with even larger discrepancies in HTGR problems.
As exhibited by the PRAGMA project [8,9], the conventional methods for treating this effect leave something to be desired on graphics processing unit (GPU) architectures, which constitute the majority of computational power on today's leading supercomputers.Due to the unique architecture of the GPU, algorithmic modifications to standard Monte Carlo algorithms for neutron tracking can tangibly accelerate computation [9].In the same direction, we herein present a GPU-friendly method for handling resonance upscatter when the windowed multipole (WMP) [23] formalism is employed to represent cross sections.In particular, the heuristic for fast GPU code is to avoid rejection sampling and accesses to distantly spaced places in memory, which the new method achieves.To give context to the new method, we first recall some of the conventional methods for modeling resonance upscattering.
The Doppler Broadening Rejection Correction (DBRC) [4] was one of the early proposed techniques to treat the effect of strong variations of the interaction cross section within the energetic vicinity of a scattering neutron, whereas S(α, β) tables had been used prior [11].The method has since been implemented in numerous continuous energy Monte Carlo neutron transport programs [4,39,35,18], and has shown to successfully model the resonance upscatter effect.However, DBRC suffers from rejection probabilities as high as 99.995% [31] for neutron energies in the vicinity of a resonance.
The weight correction method (WCM) [24] also successfully models the effect of resonances on the double-differential cross section of nuclei in thermal motion.This method adjusts the weight of particles to provide numerically correct results even when the constant cross section double-differential free gas distribution is employed.WCM carries the same benefit of our newly proposed method of not requiring an additional rejection loop or tables; however, adjustments to the particle weights introduce substantial variance to the overall Monte Carlo simulation, thus degrading estimates on quantities of interest [8].
Another technique known as target motion sampling (TMS) [36] can be used to model the resonance upscatter effect.However, this method is only applicable to Monte Carlo neutron transport programs employing the delta tracking technique.Unfortunately, performance of delta tracking appears to be lackluster on GPUs [33].
The relative velocity sampling (RVS) method [38,31] was created to ameliorate the high rejection rates characteristic to the rejection algorithms used to model resonance upscatter.These schemes, in essence, sample probability distributions proportional to f (x)g(x), where f (x) is a distribution and g(x) ∈ [0, 1].One then samples from f (x) and accepts the sample with probability g(x).The RVS method moves the direct sampling from the thermal motion term to the cross section term, thus worsening the average case rejection rate but massively improving the worst-case rejection rate.
The relative speed tabulation (RST) method [8] was developed to address the rejection sampling performance impact on GPUs.RST is the first resonance upscatter modeling technique not requiring a rejection loop or bivariate scattering distribution tables.The key observation underlying RST is that the target velocity distribution can be factorized into a marginal relative speed distribution encapsulating the information about the resonances and a simple distribution of the target polar angle conditioned on the target relative speed.Consequently, the univariate relative speed cumulative distribution can be tabulated in select areas of the pointwise cross section, and the conditional polar angle distribution is then directly sampled without requiring any additional data or rejection step.
Despite its simplicity and efficacy on GPUs, the RST method comes with some clear disadvantages.Gigabytes of additional memory are used in storing the relative speed cumulative distributions (CDFs) at each energy point and temperature on a pointwise cross section representation, if all nuclides have the resonance upscatter effect treated.The resonance influence on the double differential cross section is thus restricted for practical reasons to a select few nuclides in the problem to avoid extreme memory usage.Moreover, the method introduces some discretization error in temperature, although this was shown to be a reasonable approximation.
If one could avoid pre-tabulated relative speed distributions, this could substantially reduce the memory needs and avoid costly divergent memory accesses.Reducing serialized memory accesses on GPUs typically leads to significant speedups.We propose a new method which does just that, and achieves this using the windowed multipole (WMP) cross section representation [23].Our new method introduces a novel special function we have deemed the incomplete Faddeeva function which encodes the behavior of the temperature-dependent influence of resonances on the double differential scattering distribution.It relies on the numerical inversion of the analytic representation of the relative speed distribution under the WMP formalism, and polar angle sampling in the same manner as the RST method, but without any precomputed tables.
This contrasts the WMP-based target velocity sampling technique presented in [27], which is similar in nature to the newly presented method in this work.However, [27]'s method relies on the separation of the target velocity distribution into a zero kelvin cross section component and a Maxwell-Boltzmann component.This method thus requires a numerical inversion step of the integrated scattering cross section function wrapped in a rejection loop, similar to [31] but instead using a functional representation of the integrated cross section rather than tabular.
The algorithm presented in [6] shows how the relative speed can be sampled in the windowed multipole framework similarly to our work.However, [6] relies on fitting a sum of Gaussians to replace the poles in Eq. 4, thus representing the zero kelvin scattering cross section in the form: While it remains to be seen that Gaussians can be used to approximate all poles appearing in a windowed multipole library, this proposed approximation introduces a considerable number of degrees of freedom to the problem, with eighteen unknowns for each pole.The optimization problem thus encountered is highly nonlinear and nonconvex leading to fitting difficulties.Additionally, scattering kernels in [6]'s formalism cannot be straightforwardly differentiated with respect to windowed multipole parameters which would be needed to perform sensitivity analysis.Our new method, multipole analytic resonance scattering (MARS), only requires the same windowed multipole data as would be used in a calculation without any treatment of the resonance upscatter effect, avoiding the need for additional tables or fitting steps.We demonstrate the new method's negligible performance overhead compared to other methods for sampling the resonance upscatter effect on CPU architectures, with future work exploring its optimized implementation on GPUs.
where the center of mass angular distribution comes from the nuclear data file.The resulting physics matches the complicated expressions of [32].
Similarly considering the distribution of collision target velocities, this joint distribution is: where M is the Maxwell-Boltzmann distribution of target speeds: and the variable with units of inverse velocity β = A 2kT parameterizes the target velocities, and the distribution f (µ) is a uniform distribution between -1 and 1.The other variables are C, the distribution's normalizing constant; v r , the relative speed of the neutron with respect to the target; σ, the zero kelvin scattering cross section; A, the target mass; k, the Boltzmann constant; T , the absolute temperature; and V , the target speed.
Classically, the approximation that σ(v r ) is constant has been employed.However, it was shown in [30,24] that this approximation is incorrect in the vicinity of resonances, where σ(v r ) varies over a few orders of magnitude, preferentially causing scattering with targets of relative velocity more closely matching the scattering resonance peaks.The various aforementioned techniques are all just methods for sampling from the distribution of Eq. 2 with arbitrary forms of the function σ(v r ).
More specific knowledge about the form of σ(v r ) can be employed.It has been shown extensively [28,13] at this point that the cross section is accurately represented as a sum of poles in addition to a low order Laurent expansion (N ≈ 7), vis: In fact, for the purposes of sampling the resonance upscattering effect, we claim and later numerically demonstrate that the narrow range of attainable v r leads the zero-kelvin cross section to be accurately represented as a single pole and a linear term, over a sufficiently narrow range of energies: The σ 0 and σ 1 terms are calculated through a linearization process to avoid some complexities with higher order polynomial fitting.An algorithm for finding the best values of σ 0 and σ 1 is presented in section 3.4.
With this approximation, we use the technique developed in [8]: rather than attempting to sample the target speed (V ) and direction cosine (µ), one instead samples first the relative velocity v r , and then samples µ from the distribution of µ conditioned on v r .The distribution in this form for projectile speed v is thus [8]: Next, as previously shown for Doppler-broadening of the Windowed Multipole format [23] and the original full multipole format [21], we note the term e −β 2 (v+vr) to be negligible and use the following: At this point, we write the multipole cross section in terms of the relative velocity rather than in terms of energy.The formula to use is: If we define the auxiliary variables x = β(v r − v) and y = βv, and insert Eq. 8 in the marginal target collision rate distribution in terms of relative speed, Eq. 7, some algebra reveals that: where z = βp j − y, which represents a dimensionless measure of the energy gap between the center of the Maxwell-Boltzmann distribution and the location of the resonance.At this point, we can consider the CDF for the random variable x (dimensionless target velocity) conditioned on y (dimensionless projectile velocity).Integrating Eq. 9 yields: where C is the normalizing constant.After distributing the integral and interchanging the operator with integration, we obtain: where the normalizing constant for the distribution is: which we point out is nothing more than the Doppler-broadened scattering cross section at temperature T under the single pole approximation.
The new special function we deem the "incomplete Faddeeva function" is defined as: And it can easily be seen that w(z, ∞) = w(z) as per the definition of the Faddeeva function for [z] > 0: Indeed, for this application, [z] > 0 and we maintain this assumption going forward.A specialized root finder has been developed to quickly invert this CDF and consequently sample the relative velocity.This thus constitutes a method to sample target velocities without rejection sampling or extensive tables.The novelty in our approach lies entirely in the treatment of the zero kelvin cross section and analytical representation of the relative speed cumulative distribution.After sampling from the relative speed distribution, we must sample the target polar angle distribution conditioned on the relative speed as done in [8].For completeness, we conclude with the CDF of the target speed: for which [8] provides a straightforward sampling technique.The remaining work is purely numerical, particularly in requiring an efficient, reasonably accurate algorithm for the incomplete Faddeeva function w(z, x).

The Incomplete Faddeeva Function
The forthcoming discussion explores the properties of the incomplete Faddeeva function, with a particular focus on properties which can be leveraged to obtain efficient numerical approximations to it.We advise the reader that absorbing this section in depth is not necessary to grasp the claims and algorithms made herein.The incomplete Faddeeva function, as defined by Eq. 13 is w : C × R → C.This section attempts to build some intuition as to how this function behaves as z and x individually vary.In resonance upscatter treatment, z parametrizes the location, height, and width of the resonance with respect to the Maxwell-Boltzmann distributed velocities.Values of [z] = 0 correspond to scattering resonances exactly situated at the mean component of relative speed along the neutron's line of flight predicted by the Maxwell-Boltzmann distribution.Values of [z] > 0 correspond to resonances at higher energies than the particle's energy, and therefore induce preferential scattering with relative velocities higher than the incident velocity, and vice-versa for [z] < 0. x parametrizes the dimensionless target relative speed.Small values of [z] imply tall, narrow resonances, with a limit of [z] = 0 being a singularity representing a resonance of infinite cross section.Large values of [z] model wider, weaker resonances.
Fig. 1 plots the incomplete Faddeeva function as a function of z for a few values of x.This illustrates the rapidly varying behavior of this function for small values of [z] when [z] ≈ x, where a sharp peak follows the value of x near the real line.On top of that, it shows the approach to the familiarly shaped w(z) as x → ∞.The figure shows that w(z, x) is an increasing (in the sense of increasing in both real and imaginary part) function in x for many values of z, but not all.This behavior is explained by the following asymptotic analysis.
Humliĉek [19] provides the following asymptotic formula for the Faddeeva function without derivation:   Considering the definition of the Faddeeva function once more in Eq. 14, This approximation results from supposing that if z is large, and that only values of t near zero contribute to the integral, we can approximate w(z) as: which immediately yields the asymptotic estimate of Eq. 16.Proceeding with the same approximation in the context of the incomplete Faddeeva function, we thus obtain: Suggesting a close connection between the incomplete Faddeeva function's behavior in x and the error function.In fact, this hunch is confirmed by the following identity which connects the incomplete Faddeeva to the standard Faddeeva function: Proof of this relation is provided in Appendix A.
An even more accurate asymptotic estimate can be obtained for large [z], approximating the pole term as This matches the value, slope, and curvature with respect to t of the pole about t = 0. Substituting this back to Eq. 13 and adjusting the expression such that [w(z, x)] is strictly increasing (as suggested by Eq. 18), and matching the asymptotic value of w(z) as suggested by Eq. 19, results in Eq. 21 is sufficiently accurate to be used in practical computations, as shown by Fig. 3.
In the context of resonance scattering, Eq.21 shows that the relative speed distribution gets shifted forward by a nondimensionalized factor of 1/ [z] with its influence scaling by [r j w(z)] times the resonances residue as suggested by Eq. 11.
Another useful property of the incomplete Faddeeva function is a simple connection between its derivative in the complex plane and its value.This is similar in nature to the derivative of the Faddeeva function [1]: The relation we have obtained generalizes this as:

Efficacy of Shifted Error Function Approximation
True value 1 2 (1 + erf(x))w(z) which clearly maintains consistency with Eq. 22 as x → ∞.This fairly simple connection between the derivative of w(z, x) and its value can be utilized for efficient sensitivity analysis of the scattering kernel with respect to windowed multipole parameters.
The forthcoming discussion presents some further concepts in the direction of efficient numerical evaluation of the incomplete Faddeeva function.Going forward, we denote the second integral appearing in Eq. 19 as: where R m = x − z.While it may seem that a half-range Gauss-Hermite quadrature may work well to efficiently approximate Eq. 24, this is not the case.Firstly, complex exponentials would have to be calculated at each quadrature point.Secondly, as the real part of z grows, the integrand oscillates more.In practice, values of [z] > 10 are frequently encountered, and low degree quadratures would not capture the oscillation.Additionally, given that the value of [z] is small, the denominator becomes nearly singular when x ≈ [z].In fact, when [z] = 0, I(z, m) becomes a discontinuous function in x when interpreted as a principal value integral.
Eq. 24 is equivalent to the incomplete Goodwin-Staton integral, referred to in [12].However, to our knowledge, only asymptotic analysis has been performed on this type of integral before, without any development of numerical routines.Recent work in the field of finance [3] presents results for computing what the authors define as the extended incomplete Goodwin-Staton integral, for which the ν = 1 case is of interest in the present discussion.Unfortunately, the authors' numerical method works for all cases except ν = 1, suggesting Eq. 24 to be of a fundamentally different nature.
In our experience, the difficulty with large [z] cannot be ameliorated by a stationary phase technique [5], as these tend to accentuate the pole behavior and remain of similar difficulty for half range Gauss-Hermite quadrature.
In the windowed multipole method, [z] is near zero, as shown in Fig. 4. Therefore, we can expect to frequently encounter nearly singular integrands in Eq. 24.An efficient numerical technique which explicitly treats this behavior can be devised by first noticing that I(z, m) satisfies this differential equation in the complex plane: This can be used to connect the value I(z 0 , x) at a point z 0 to another point z 1 .The integrating factor technique shows that: If the distance between z 0 and z 1 is small, the exponential in the numerator of Eq. 26 can be Taylor expanded about z 0 to yield an efficient numerical scheme.
We have found that the imaginary part of I( [z], x) can be calculated with a closed-form, discontinuous-in x formula.Because the nearly discontinuous behavior of I(z, x) in x is the main source of difficulty here, Eq. 26 can be used to resolve this behavior accurately after calculating the value of I( [z], x).The real part of I( [z], x) is continuous in x but not available in formula in terms of elementary functions; however, it is readily amenable to numerical approximation.In conclusion, the real and imaginary part of I( [z], x) contrast each other: the former is readily numerically approximated by series or similar methods, whereas the latter is discontinuous and therefore not amenable to series or rational function approximation, and fortunately has an exact formula.
The first goal at hand is to calculate [I(z, m)] for real values of z.This can be written as: The linearity of integration can be distributed over both trigonometric terms.Each of the resulting integrals can be found as respective sine and cosine transform integrals, which are listed in [14].Combining results from the sine and cosine transform tables yields: Next, we can find an approximation for the real part of I(z, m): This expression can neither be written terms of elementary nor special functions, to our knowledge.In order to manipulate it to obtain a numerical expression, consider the auxiliary function: which again, of course, cannot be represented in terms of elementary or special functions.This pinpoints the difficulty in calculating the real part of I(z, m) because: We thus seek a straightforwardly differentiable approximation to R(z, m).The intuition behind our forthcoming numerical approximation to R(m, z) comes from the fact that for m 1, t is negligible in the denominator compared to m over the range where the e −t 2 weighting is large, hence where F (z) is the Dawson F function [1].By standard asymptotic analysis, matching the z derivative at z = 0 and the z 1 asymptote for the Dawson F function results in the improved estimate: which is accurate to within about 5% across the full range of m values.However, this level of accuracy is not appropriate for engineering calculations.
To build more intuition for R(m, z), its close relation to the Dawson F function is evinced by considering the Maclaurin series in z for both: which further highlights the utility of maintaining consistency of R(m, z) with its asymptotic sister F (z).The series are the same, but with the addition of exponential integral multiplying factors in R(m, z).
In order to find such a numerical relation which maintains this consistency in the asymptotic case, Eq. 30 can be transformed via interchange of differentiation and integration.Consider the generalized function: Differentiating reveals that: The fundamental theorem of calculus then applies: Using the fact that R(m, z, ∞) = 0 and doing a change of variables, Eq. 30 becomes: which confirms our hunch about the close relation of F (z) to R(m, z); it is an infinite superposition of stretched and scaled Dawson F functions.
We have found success in inserting an approximation for F (z) to Eq. 39.Well-known approximations to F (z) based on rational expressions and other elementary functions [26,25] do not result in numerically useful expressions.However, noting that: it can be seen that approximations to F (z) in the form a power series can be computationally efficient in light of the recursion relation for exponential integrals: Careful attention must be paid to the floating point properties of this relation [16].The magnification of computational errors grows arbitrarily large, and we later present a specialized numerical algorithm guaranteeing floating point stability.
In order to thus obtain a simple, efficient approximation to R(m, z), we employ the Chebyshev expansion valid for z ∈ [−5, 5] presented in [20].The results of our method could be improved by using a finer piecewise division for the Chebyshev expansion of F (z) as in [10], but we have used the present approach for simplicity of implementation.Thus, if F (z)'s truncated Chebyshev expansion is converted to the power series basis: we obtain approximations of the form The magnification of error in the forward recurrence relation for exponential integrals from [16] is: Notably, the error magnification of the reverse recurrence relation is the reciprocal of this quantity.Moreover, |ρ n | is a function increasing from 1, reaching a maximum, and monotonically descending below 1 [16].As a consequence, a critical index n * exists such that iterating outward from it results in a numerically stable recursion algorithm.In terms of evaluating Eq. 43, this means splitting the polynomial in z into parts above the index n * and those below.After computing e m 2 E n * (m 2 ), Horner's method is used in the reverse recurring relation down to the term of order z, and forward recursion is employed to evaluate the polynomial of degree leading from 2n * + 1 up to 2n + 1.
A simple result we have obtained is that the smallest value of n * such that: is well approximated by: Appendix B shows how this can be obtained.Fig. 5 illustrates the accuracy of Eq. 46.
Using this information, algorithm 1 explicitly states the procedure to calculate R(m, z) and its derivative with respect to z.While the use of a power basis polynomial is sub-optimal, numerically speaking, the main source of numerical error in this scenario originates from the recursive exponential integral formula.The specification of the algorithm assumes that an accurate method for computing E n (x)e x has been provided, which is well documented in many other works.We have employed a C++ adaptation of the continued fraction approximation employed by the Cephes library [7], which is documented in [1].

The Jump Integral
After computing the value of I( [z], m), the differential equation of Eq. 25 which I follows can be used to calculate I(z, m).We calculate I(z, m) in this manner due to the nearly discontinuous behavior of I(z, m); it has a jump discontinuity about m = 0 if z ∈ R. Because the imaginary part of z is small in windowed multipole libraries, the resulting behavior is nearly discontinuous and hence is not captured efficiently by general approximation techniques; finely resolved tables or high polynomial orders would be required.Our approach Figure 5: Approximate solution to finding the first value of n such that thus resolves the discontinuous component exactly with the piecewise function Eq. 28.The nontrivial part of Eq. 26 is the transcendental integral: We have deemed this term the jump integral because it allows jumping from values of I(z, m) on the real line to values above the real line in the complex plane.While it seems that our issue of approximating the transcendental integral w(z, x) has seemingly not been heretofore ameliorated due to the appearance of yet another transcendental integral Eq. 47, a change of variables puts it into a form suitable for numerical approximation: where again, m = [z] − x.Because [z] is small as shown by Fig. 4, the argument to the exponential term is similarly small.Where this integral is well-defined (m = 0), the exponential term can be expanded in its Maclaurin series and integrated term by term: It is verified that the coefficients a n satisfy the two-term recurrence: Next, the term-by-term integrals appear in the form: Where B x (•, •) is the incomplete beta function, defined as: A recursion formula derived as a special case of formulas in [1] efficiently calculates these incomplete beta function values of higher n in sequence: In combination with the fact that: this yields an efficient numerical scheme for evaluating an integral of the truncated Maclaurin series of the exponential of Eq. 48.Algorithm 2 details the combination of all of these facts for an efficient approximation to J(m, z).This approximation works very well for problems with | [z]| ≤ 5, which easily covers the range of scattering events where resonances appreciably affect the double differential at temperature.Outside of that range, the integral becomes increasingly oscillatory, so an asymptotic approximation is employed for |z| > 5.This approximation is documented in Appendix D. Lastly, Algorithm 3 gives the overall algorithm to compute w(z, x) efficiently.It relies on access to some implementation of calculating w(z), e.g. the permissively licensed [22] which implements a variety of approximations to achieve high accuracy, or one of the various rational approximations [34,2,19] when higher error is permitted.Regardless of the chosen w(z) implementation, our algorithm maintains asymptotic consistency such that lim x→∞ w(z, x) = w(z).This work leverages a recent approximation tailored for WMP [15].

The Pole Sampling Approximation
A key approximation of our technique that enables its computational efficiency is viewing the multipole cross section in the relative speed distribution as a mixture distribution.The theoretical justification is that if poles are present and sufficiently close to the incident neutron energy (|z| < 20 specifically), the relative speed PDF is well-approximated by ignoring the polynomial contribution: This expression is not employed to actually sample the scattering distribution.Rather, it is best viewed as a mixture distribution in which each pole contributes a probability proportional to: which defines a discrete distribution.In order to avoid the need for auxiliary storage and the calculation of a normalizing constant to this distribution, we recommend finding the Input : z ∈ C, x ∈ R Output: w(z, x) from Eq. 13 m ← x − [z]; // R(m, z) integral and its derivative rmz, drmzdz ← call(Alg.1); maximum of −|r j w(z j )|/ log(ξ j ) where ξ j are uniform random numbers differing for each pole.The j corresponding to the maximum of this expression follows the desired discrete distribution.We also note that the quantity r j w(z j ) is exactly the Doppler broadened contribution to the integrated scattering cross section, so this sampling procedure incurs no additional Faddeeva function evaluation overhead if this is done in tandem with a WMP cross section lookup operation.
Finally, we emphasize that the pole sampling approximation is not precisely consistent with the original multipole cross section representation.Instead, it uses the fact that polynomial contributions to the cross section negligibly affect the scattering kernel, while poles do so substantially.

Finding the values of σ 0 and σ 1
While it may seem that the polynomial contribution to Eq. 5 could come as the first terms from the polynomials defined within the windows, as [6] used, we have found in practice that this choice is inconsistent with the approximation of the pole sampling technique and leads to negative cross section values.
To remedy this issue, we use the heuristic that the relative error of the cross section's local approximation is minimized by matching polynomial values in the vicinity of the dip of the resonance.The location of the scattering resonance trough is calculated as: where At this point, the window index of √ E trough is calculated 1 .This may be a different window from the incident neutron energy's window.The windowed multipole cross section of Eq. 4 is then evaluated at √ E trough but excluding the sampled pole p j , i.e. 1 The term under the square root in Eq. 57 can sometimes be negative in the vicinity of nonphysical poles which are artifacts of the fitting process, and dealing with imaginary quantities in this case is undesirable.Therefore, our calculation uses a linearization of the non-pole cross section at the incident energy instead in that case.
From there, σ 1 ≈ ∂ σs(E) ∂ √ E is calculated at the same point, this time only including contributions from the polynomial expansion but not from any poles.This linearization technique has been found to improve the accuracy of our method when considering nuclides with tightly spaced resonances such as 235 U.For complete clarity, the resulting expression is: Finally, a linearization of the cross section in √ E space has been obtained.For use with the root finder, the nondimensional variable β( ) is preferable, so σ 0 is appropriately shifted and σ 1 appropriately scaled.

Inverting the relative speed CDF
To sample from the relative speed PDF, we employ the CDF inversion technique.A naive attempt at this would be a few bisection root finding steps followed by a handful of Newtonlike iterations.In practice, we've found that five bisection iterations followed by three Halley-Newton iterations resolves the root to within acceptable tolerance; however, a far more efficient root finder has been developed which takes a maximum of four iterations total, only requiring more work for unusual edge cases.
The bootstrapping step, as we call it, is essential to an efficient implementation of MARS.The bootstrapping step cheaply obtains an initial guess to the solution of the CDF inversion problem, from which a small number of Newton-like iterations improve the solution.
The key to doing so lies in finding a cheap approximation to the inverse of the CDF with general pole parameters.In order to do so, we first move from the root finding space of x ∈ (−∞, ∞) to the nonlinearly mapped variable x = 1  2 (1 + erfx).The intuition behind using this modified space is that as the resonances become weak and the incident neutron energy becomes high, it can be shown that the CDF is simply equal to x which ranges between zero and one.Resonances and low energy free gas effects simply act as perturbations to this linear function, which enables a good starting point for approximating the root location.
The next step in improving the CDF model in the mapped space is to observe that the contribution of collision probability from the resonance largely does not depend on its imaginary part.Increasing the imaginary part of the resonance broadens it and decreases its width.Therefore, the magnitude of the jump in w(z, x) when x is near [z] quantifies the probability that the neutron experiences a collision near the peak of the resonance.The jump in w(z, x) for small [z] is: and therefore the probability contribution due to the resonance is approximately: where C is the normalizing constant given by Eq. 12.Because this is an approximation, the probability of Eq. 66 may not be bounded between zero and one, so we threshold it to that range.In practice, the estimate provided here is accurate.We have found that this probability tends to be added into the CDF about ].This estimate could obviously be tuned for greater accuracy.
One final tool we employ to bootstrap the root finding process pertains to the values of the CDF about x = 0.In this case, numerous instances of functions occuring in its expression such as erf(x) and e −x 2 take on easily calculated values.On top of that, the incomplete Faddeeva function has a closed form expression when x = 0: Because e −z 2 is already computed and cached for the CDF inversion, the calculation of a complex exponential integral is the only difficulty.This is much easier and computationally cheaper to do than the more involved w(z, x) evaluation, so any off-the-shelf approximation of E 1 (−z 2 ) can be employed here.
Addtionally, the derivatives of the CDF with respect to x about x = 0 are also easily obtainable, which we use to further improve our rootfinding guess.So far, we have only incorporated information from the first derivative which has proven sufficient.This leaves us with the following pieces of information from which the root estimate is extracted: the probability due to the resonance, its width, the value and slope of the CDF about x = 0 i.e. x = 1/2, and the known endpoint values of the CDF at 0 and 1.We therefore construct a function which is piecewise quadratic on the left and right of x = 1/2.This quadratic interval ranges to either the endpoints x = 0 or x = 1, or the resonance's upper or lower range of probability gain, estimated here as x∈ ].Note that this interval has to be mapped to an interval in x space.Because the interval of the resonance is small, the Jacobian of the transformation x → x which is proportional to e − [z] 2 (a quantity already computed) can be used to calculate the range in x space.
With this knowledge, the CDF can be approximated somewhat accurately in x space.Despite the apparent complexity of what was just described, the inversion of the previous paragraph's function can be done using simple branching logic and, at worst, the solution of a quadratic equation.Because translating the inverse of the above function into code can take nontrivial effort, C++ code to achieve this has been provisioned in Appendix E. Figure 6 shows two examples of how this can be a quite satisfactory approximation of the CDF in x space when resonances are influencing the scattering distribution.

Calculation of w(z, x)
In order to test the accuracy of Alg. 3, we haved computed reference values of w(z, x) using scipy's [37] adaptive quadrature routine, scipy.integrate.quad, to evaluate the integral formulation Eq. 13.In approximation of the jump integral Eq. 47, only the first five terms in the series are retained.Where functions such as log(x) or e x appear, C++ standard library implementations have been employed.The implementation of w(z) from [22] has been employed.This results in the error profiles exhibited by Fig. 7, where we have plotted the real part of w approx (z, x) − w(z, x) /w(z).Because only the real part is of interest in resonance upscatter calculations, results on the imaginary component's error are omitted.

Single Energy Testing
We first present in Fig. 8 the relative speed distribution of 238 U for two different energies and a few temperatures as calculated both by numerical integration and the MARS analytic CDF.The energies correspond to being in the trough and near the peak of a scattering resonance.These plots clearly show the influence of the resonances on the double differential cross section; a nuclide with constant cross section has a relative speed distribution which is very nearly an error function at epithermal energies.The relative speed distribution near resonances has a jumping effect which is governed by w(z, x).They bear a resemblance to w(z, x) behavior depicted by Fig. 2.
If the relative speed distribution is correct, the resultant double-differential scattering distribution is also correct.Fig. 9 shows this is the case for our method when compared to the RVS method of [31].These results were obtained from our modified version of OpenMC, available at github.com/gridley/openmc/tree/mars.It also shows that the pole sampling technique successfully works for 235 U and its tightly spaced resonances.

Pin Cell Reactivity Feedback
The 2.4% enriched PWR pin cell example from OpenMC's suite of example problems was used to calculate Doppler reactivity feedback effects with four different models.The first and second used pointwise cross sections that were interpolated between 300, 600, 900, 1200, and 2500 kelvin.The model was run at temperatures ranging from 300 to 1800    kelvin in increments of 20 kelvin.Of the two using pointwise cross sections, one used the historical constant cross section free gas scattering approximation, and the second used the RVS method.The second two cases both used windowed multipole cross sections, one using RVS and the second MARS.The ENDFB-VII.1 nuclear dataset was employed.Figure 10 shows how these cases compare.Two hundred cycles with ten inactive were employed, using 200,000 particles per cycle.k eff was thus converged to 20 pcm for each case.
It can be seen that the pointwise cross section representation incurs some interpolation error between 1200 and 1800 kelvin.The MARS method matches the RVS results where multipole cross sections were employed.We can thus conclude that the new method works correctly across the range of energies where resonances influence the double-differential cross section at temperature for nuclides of both strong, distantly spaced resonances ( 238 U) and closely spaced weak resonances ( 235 U).

Influence on Tracking Rate
Finally, in order to determine the computational efficiency of the new method, tracking rate comparisons were carried out on the same PWR pin cell example problem.The computational performance of both inactive and active cycles was assessed.For the active cycles, a 100x100 Cartesian mesh tallied flux, fission rates, and neutron production rates using tracklength estimators.In addition, a spatially homogenized energy spectrum tally consisting of 500 equal lethargy bins was applied.
An Intel Xeon W-2133 with six physical cores carried out the calculations, and obtained the results depicted in scenario, future work will explore its performance on vector computer architectures where we expect it to outperform.The computational expense incurred by tallying tends to render the performance impact of our new method particularly negligible.Collision estimators could be used on the mesh tally to improve the tracking rate, but we arbitrarily opted for track length estimators.Due to subtle hardware-related effects such as cache utilization or branch prediction, the tracking rates of the three resonance upscatter handling methods have different relative performances when comparing active and inactive cycles.Future work will explore detailed performance results on a variety of architectures.

Conclusion
The multipole formalism carries a variety of advantages compared to pointwise cross sections.Aside from its potential gains in computational efficiency on modern compute architectures, it enables accurate Doppler broadening without a library size tradeoff [23], elegant sensitivity quantification, and narrows the gap between R matrix theory and the cross section representation [13].This work develops yet another advantage to the windowed multipole formalism: closed-form resonance upscatter treatment.
We have demonstrated that the new method matches the results obtained by other resonance upscatter techniques.To achieve this, we derived an expression for the target relative speed distribution, and identified a novel special function which universally arises in this application.Novel numerical techniques that balance efficiency and accuracy were derived, implemented, and tested.The overall scheme was shown to achieve the same tracking rate as other resonance upscatter modeling methods.
The new method called multipole analytic resonance scattering (MARS) overcomes the storage requirements of relative speed tabulation [9], and avoids rejection sampling as employed by other common approaches.Without a need to access intermediate storage, the accesses to global memory can be reduced on GPU architectures.On top of that, the work discrepancy between threads incurred by rejection sampling on GPUs is similarly overcome.Future work will explore the implementation and optimization of this method on GPUs.
[38] Jonathan A. Walsh, Benoit Forget, and  A Derivation of Eq. 19 The forthcoming discussion has not been made mathematically rigorous for sake of brevity and the context of a nuclear engineering journal.We begin by defining the auxiliary complex function F(z): The complex line integral theorem can then be applied when [z] > 0: Computing dF dz and inserting then reveals: where the linearity of integration has been employed, and the interchange of differentiation and integration has also been used.The innermost integrals can now be computed exactly, carrying the z through to the integral defining the incomplete Faddeeva function.This results in: Recalling that the Faddeeva function can be defined as we can identify w(z) as the trailing term of Eq. 71.The term w(0, x) must be interpreted in a principal value sense, which results in a contribution in the form of a Heaviside function.
The following expression then results: Figure 11: Modified contour used to cancel exponential integral in Eq. 73.The contribution from the top line is zero.
This result could perhaps be used for numerical calculations of w(z, x).However, it suffers the shortcoming that the exponential integral term goes to infinity for x = 0, which is cancelled out by the integral term.However, w(z, x) is well-defined at x = 0, and the addition of branching logic to numerical routines to handle this case would be cumbersome.The expression can be made more amenable to numerical approximation with some further simplification.
The integral term goes from 0 to z, and the integrand encloses no poles of the following path whenever x = 0.As such, a change of integration path is employed: the contour of Fig. 11 results in a convenient cancellation of terms.The top leg of the contour is zero from the e t 2 term, resulting in: A little bit of algebra shows that: The sign function term ends up cancelling out the Heaviside term upon substitution of Eq. 74 back to Eq. 73.The second term in Eq. 74 easily can be transformed via the change of variables t = −it to the integral in Eq. 19, thus yielding Eq. 19.
B Derivation of Eq. 46 The goal is to find the smallest n such that The variation of the left hand side function of Eq. 76 as n increases, for a constant value of x, has been shown to be first increasing from one, reaching a maximum, and monotonically decreasing from that point, never becoming negative [16].Firstly, the exponential integrals are replaced with the equivalent upper incomplete gamma function: so the equation becomes: Using the asymptotic formula for the upper incomplete gamma function Γ(s, x) → x s−1 e −x gives this approximation to solve: Which can be solved approximately by first inserting Stirling's formula: Taking the nth root results in ex ≈ (2πn) the second term on the right can be approximated by expanding the exponential, for large n: ex This can be solved exactly in terms of the Lambert W function: The asymptotic property of the Lambert W function that W (z) ≈ log z − log log z is then used to obtain Eq. 46.
Using this, one can show that f (2n) (0, m) = a n .This allows for easy evaluation of the asymptotic series of Eq. 85. Numerical experimentation has shown this to be an excellent approximation with a maximum error around 10 −6 when |z| = 5 and |m| = 1, retaining only five terms.The error rapidly falls from there as |z| → ∞ and |m| → ∞.
D Asymptotic Approximation for J(z, x) of Eq. 47 Compared to the asymptotic approximation for the R(m, z) integral, a clean expression for simple computer code is not available to our knowledge.Obtaining an asymptotic expression thus relies on access to a computer algebra system.Finding this starts by applying a simple change of variables to Eq. 47 to find: Where it becomes clear that numerical difficulty from expanding the exponential term originates from the e 2i [z]t modulation.This is the term to isolate to obtain the correct asymptotic behavior as the integrand becomes increasingly oscillatory.The standard repeated integration by parts procedure can then be applied.This is pure tedium, so we simply report the C++ code which evaluates five terms below.

Figure 1 :
Figure 1: w(z, x) for a few values of x.The height represents the magnitude, and coloring is done by phase.

Figure 2 :
Figure 2: [w(z, x)] for a few values of z.The legend is the imaginary number added to the real part specified in each figure's caption.The plotted quantity is normalized by [w(z)] so all lines tend to unity as x grows.

Figure 3 :
Figure 3: Accuracy of the w(z, x) approximation for | [z]| > 5.An unshifted error function is shown for comparison, representing a more naive asymptotic approximation to w(z, x).

Figure 4 :
Figure 4: Distribution of imaginary part of the poles in the windowed multipole method.These were collected from every nuclide in OpenMC's regression testing dataset, based on ENDFVII.1.

Figure 6 :
Figure 6: The bootstrapping CDF provides a fairly accurate, easily invertible approximation to the true relative speed CDF to kickstart the root finding process.The pairs of lines, moving from top to bottom, represent 35.25, 36.25, 38.25, and 66.25 eV incident neutron energies.

Figure 8 :
Figure 8: The MARS analytic CDF matches numerically integrated relative speed cumulative distributions.Some error can be observed for the 1500K case at 35.25 eV; scattering in the resonance dip is fortunately an extremely rare event.

Figure 9 :
Figure 9: Scattering at 1200K matches results from RVS method well at two different energies.These energies interact with resonances for both nuclides.

Figure 10 :
Figure 10: MARS matches the k eigenvalue of the RVS method on a 2.4% enriched fresh PWR pin cell problem.Line width represents estimated standard deviation of the mean.
c o n s t s t d : : complex<double> pp1 = 6 .0 + m2 * ( 6 .0 + 3 .0 * m2) + z r * (m * ( −3.0 − 3 .0 * m2) + z r * (m2 * ( 2 .0 + 2 .0 * m2) + z r * ( −2.0 * m2 * m + 4 .0 * m2 * m2 * z r ) ) ) + jump * = ( 0 .5 − yjumplo ) / ( yjumphi − yjumplo ) ; yjumphi = 0 .5 ; } e l s e i f ( yjumplo > 0 .5 ) { yjumplo = 0 .0 ; yjumphi = 0 .0 ; jump = 0 .0 ; } i f ( jump > a p p r x 0 c d f ) jump = a p p r x 0 c d f ; c o n s t double d = yjumphi − yjumplo ; c o n s t double s o u t = ( a p p r x 0 c d f − jump ) / ( 0 .5 − d ) ; c o n s t double s i n v = jump > 0 .0 ?d / jump : 0 .0 ; i f ( x i >= s o u t * yjumplo + jump ) { c o n s t auto r = s o u t * yjumplo + jump ; c o n s t auto a = ( r − dcdx * ( yjumphi − 0 .5 ) − a p p r x 0 c d f ) / s t d : : pow ( yjumphi − 0 .5 , 2 ) ; r e t u r n 0 .5 * (−dcdx + s t d : : s q r t ( s t d : : pow ( dcdx , 2 ) − 4 .0 * ( a p p r x 0 c d f − x i ) * a ) ) / a + 0 .5 ; } e l s e i f ( x i > s o u t * yjumplo ) { r e t u r n ( x i − s o u t * yjumplo ) * s i n v + yjumplo ; } e l s e { i f ( s o u t > 0 .0 ) r e t u r n x i / s o u t ; e l s e r e t u r n 0 .5 * yjumplo ; } } e l s e { // x i > a p p r x 0 c d f i f ( yjumplo < 0 .5 && yjumphi > 0 .5 ) { jump * = ( yjumphi − 0 .5 ) / ( yjumphi − yjumplo ) ; yjumplo = 0 .5 ; } e l s e i f ( yjumphi < 0 .5 ) { yjumplo = 1 .0 ; yjumphi = 1 .0 ; jump = 0 .0 ; } // C l i p i n n a p r o p r i a t e l y l a r g e jumps i f ( jump > 1 .0 − a p p r x 0 c d f ) jump = 1 .0 − a p p r x 0 c d f ; c o n s t auto d = yjumphi − yjumplo ; c o n s t auto s o u t = ( 1 .0 − jump − a p p r x 0 c d f ) / ( 0 .5 − d ) ; c o n s t auto s i n v = jump > 0 .0 ?d / jump : 0 .0 ; c o n s t auto t h r e s h 1 = s o u t * ( yjumplo − 0 .5 ) + jump + a p p r x 0 c d f ; c o n s t auto t h r e s h 2 = s o u t * ( yjumplo − 0 .5 ) + a p p r x 0 c d f ; i f ( x i >= t h r e s h 1 ) r e t u r n ( x i − t h r e s h 1 ) / s o u t + yjumphi ; e l s e i f ( x i > s o u t * ( yjumplo − 0 .5 ) + a p p r x 0 c d f ) r e t u r n ( x i − t h r e s h 2 ) * s i n v + yjumplo ; e l s e { c o n s t auto a = ( t h r e s h 2 − dcdx * ( yjumplo − 0 .5 ) − a p p r x 0 c d f ) / s t d : : pow ( yjumplo − 0 .5 , 2 ) ; r e t u r n 0 .5 * (−dcdx + s t d : : s q r t ( s t d : : pow ( dcdx , 2 ) − 4 .0 * ( a p p r x 0 c d f − x i ) * a ) ) / a + 0 .5 ; } } UNREACHABLE( ) ; }

Table 1 .
This clearly demonstrates the computational efficiency of MARS compared to the RVS and DBRC methods.While it does not outperform RVS in this

Table 1 :
Tracking rate in thousand particles per second obtained by the constant cross section treatment, and three resonance upscatter models.MARS is comparable in speed to widely accepted techniques.