Detecting and Localizing Differences in Functional Time Series Dynamics: A Case Study in Molecular Biophysics

ABSTRACT Motivated by the problem of inferring the molecular dynamics of DNA in solution, and linking them with its base-pair composition, we consider the problem of comparing the dynamics of functional time series (FTS), and of localizing any inferred differences in frequency and along curvelength. The approach we take is one of Fourier analysis, where the complete second-order structure of the FTS is encoded by its spectral density operator, indexed by frequency and curvelength. The comparison is broken down to a hierarchy of stages: at a global level, we compare the spectral density operators of the two FTS, across frequencies and curvelength, based on a Hilbert–Schmidt criterion; then, we localize any differences to specific frequencies; and, finally, we further localize any differences along the length of the random curves, that is, in physical space. A hierarchical multiple testing approach guarantees control of the averaged false discovery rate over the selected frequencies. In this sense, we are able to attribute any differences to distinct dynamic (frequency) and spatial (curvelength) contributions. Our approach is presented and illustrated by means of a case study in molecular biophysics: how can one use molecular dynamics simulations of short strands of DNA to infer their temporal dynamics at the scaling limit, and probe whether these depend on the sequence encoded in these strands? Supplementary materials for this article are available online.


Functional Data Analysis
Functional data analysis (FDA; Ramsay and Silverman 2005;Ferraty and Vieu 2006;Horváth and Kokoszka 2012;Wang, Chiou, and Mueller 2016) deals with inferential situations where each data point that is best modeled as the realization of a stochastic process, understood as a random function or a random surface, such as weather data, neuroimages, electricity consumption curves, or phonetics, to name a few (e.g., Ramsay and Silverman 2002;Antoniadis, Paparoditis, and Sapatinas 2006;Aston and Kirch 2012a;Hadjipantelis et al. 2015). In a typical setting, one is interested in drawing inferences on the law of a random function X ∈ L 2 ([0, 1], R) based on a sample X 1 , . . . , X T iid ∼ X. The main characteristics of interest are the mean function, and the covariance surface/operator. The mean function describes the average shape of the random object of interest, whereas the covariance operator encodes the secondorder fluctuations of the random function around its mean. The covariance operator and its eigendecomposition are at the basis of the Karhunen-Loève expansion (Grenander 1981), which provides insight into the fluctuations of the random function X, and is the basis for optimal linear finite-dimensional approximations of X (through functional principal component analysis, or natural form of dependency in the data collection, such as when data are collected sequentially in time. Examples of such data include (but are not limited to) daily electricity consumption curves (Antoniadis, Paparoditis, and Sapatinas 2006), functional MRI data (Aston and Kirch 2012a), or molecular dynamics trajectories of DNA minicircles (see Section 2). In such cases, the data can be modeled as a stationary time series of functions, or stationary functional time series.
Inference for functional time series is classically carried out under functional autoregressive models, or more general linear models (see, e.g., Bosq 2000;Mas 2002;Ferraty and Romain 2011;Hörmann and Kidziński 2012;Aue et al. 2014;Aue, Norinho, and Hörmann 2015). Nevertheless, there is no a priori reason to expect that the behavior of a functional time series should be described by a linear process of any particular form or order: potentially nonlinear and/or non-Gaussian situations must be considered.
The problem of inference for functional time series beyond linear assumptions has only recently started to be addressed (Hörmann and Kokoszka 2010). Work has been done on assessing whether a dataset is independent against an ergodic alternative (Horváth, Hušková, and Rice 2013), or whether two functional time series are independent (Horváth and Rice 2015a). Horváth, Kokoszka, and Rice (2014) developed test for the stationarity of functional time series, and Kokoszka and Young (2016) considered tests for stationarity around a deterministic trend. The problem of two-sample testing for equality of the mean function of two functional time series, without resorting to linearity assumptions, has been considered by Horváth, Kokoszka, and Reeder (2013), Fremdt et al. (2014), and Horváth and Rice (2015b). Zhang et al. (2011) and Aston and Kirch (2012a,b) considered the change point detection of the mean function. Concerning inference on the second-order structure, Horváth, Kokoszka, and Reeder (2013) and Horváth, Rice, and Whipple (2014) proposed a consistent estimator for the longrun covariance operator, Kokoszka and Reimherr (2013) established the asymptotic normality of the sample covariance operator and its eigenfunctions, while Zhang and Shao (2015) considered the comparison of the covariance operators of two functional time series, a problem that has attracted considerable attention in the case of two collections of iid functional data (Panaretos, Kraus, and Maddocks 2010;Boente, Rodriguez, and Sued 2011;Kraus and Panaretos 2012;Horváth and Kokoszka 2012;Fremdt et al. 2013;Paparoditis and Sapatinas 2014;Pigoli et al. 2014;Boente, Rodriguez, and Sued 2014). The covariance operator, however, fails to capture any of the dynamics of a functional time series; and the long-run covariance operator captures only crude aspects of the time dynamics (being the sum of the autocovariance operators over lags). To capture the complete second-order dynamics of a functional series, Panaretos and Tavakoli (2013a,b) introduced a frequency domain framework (see also Hörmann, Kidziński, and Hallin 2015;Hörmann, Kidziński, and Kokoszka 2015), by means of the spectral density operator, the Fourier transform of the complete collection of auto-covariance operators; arguably, this would be the object upon which inferences related to dynamics ought to be based.
The methodological contribution of our article is the development of inferential tools for the comparison of the complete second-order dynamics (encoded by all the lag-t autocovariance operators) of two stationary functional time series, by means of the frequency domain approach, as introduced in Panaretos and Tavakoli (2013a,b). Though this is arguably a core methodological problem in its own right, our study is motivated by the applied challenge of inferring and comparing the coarse-grained dynamics of DNA in solution, based on molecular dynamics simulation. The details of this problem, and its connection with functional time series are surveyed in the next section. Our methodological contributions will then be developed in parallel with a case study on inferring base-pair-dependent dynamical properties of DNA.

Molecular Biophysics
Molecular biophysics investigates, models, and characterizes the physical structures and dynamics that occur within a living organism at the molecular level (Glaser 2012, chap. 1). This study encompasses a bewildering variety of macromolecular structures, processes, and their interactions, the most iconic of which is arguably the biopolymer DNA. It is by now well understood how the structure of DNA encodes the entirety of genetic information of an organism in the sequence of its base-pairs, and how its double helix geometry with complementary steps built of these bases allow it to be transcribed or passed on from parent to offspring. The structural composition and geometrical arrangement is only part of the story, though: when fulfilling its biological purpose, the DNA polymer undergoes important mechanical maneuvres including twisting, bending, and looping (Garcia et al. 2007;Prévost, Takahashi, and Lavery 2009). For this reason, biophysicists are particularly interested in understanding the mechanical properties and dynamics of DNA (Mastroianni et al. 2009), and the relationship between the base-pair sequence composition, and the mechanical properties and dynamics of DNA (Peters and Maher 2010). Linking sequence to mechanical properties and dynamics would not only shed light into fundamental biological processes (such as transcription, regulation, and packing), but also holds promise in the use of DNA as a material for nanoengineering (Seeman 2005;Rothemund 2006).
The mechanics of DNA can be studied in a wide range of different scales, ranging from the fine-grained (atom-byatom, for instance) to the coarse-grained (at the order of persistence length, i.e., about 160 base-pairs (Walter, Gonzalez, and Maddocks 2010;Gonzalez, Petkevičiūtė, and Maddocks 2013); or even at the order of thousands of base-pairs (Sambriski, Schwartz, and De Pablo 2009). These mechanics are fundamentally stochastic: the intrinsic conformational characteristics of the molecule are subjected to thermal fluctuations due to their surrounding environment. And though elaborate stochastic models exist that offer detailed descriptions of the atom-byatom (or atomic level ensemble-by-ensemble) behavior of the molecule, the determination of their scaling limits at the more coarse-grained level is typically mathematically intractable. In this limit, one would consider the conformational mechanics of an entire strand of DNA, seen as a curve or function, rather than of its individual constituent atoms. It is therefore natural to ask whether by observation of DNA strands, one could statistically infer information on their coarse-grained limit, seen as a random curve, and indeed link it to their base-pair sequence composition. The natural setting for this would be the setting of functional data analysis, where one is precisely interested in probing the law of a random curve on the basis of several of its realizations.
To obtain sample DNA curves, one may perform cyclization experiments. Cyclization experiments involve short DNA strands whose ends meet, resulting in a loop-like configuration. The resulting closed curves are called DNA minicircles, and are excellent specimens for the study of DNA mechanics, as they yield a naturally stressed state in which intrinsic properties are amplified relative to extrinsic thermal fluctuations (Shore, Langowski, and Baldwin 1981;Shore and Baldwin 1983;Kahn and Crothers 1992). To link conformational variability to basepair sequence, one may probe minicircles with slightly differing base-pair sequences, for example, CAP minicircles and TATA minicircles (Amzallag et al. 2006; see Section 2), whose base-pair composition coincides in all but a few basis steps. For example, based on three-dimensional reconstructions of a sample of CAP & TATA minicircles, obtained by cryo-electron microscopy, it appears that the differences between the two minicircles are not in their mean conformation (Amzallag et al. 2006), but in the way they vary around their mean conformation, as was established by means of a functional data analysis of their covariance structure (Panaretos, Kraus, and Maddocks 2010).
Microscopy-based studies, however, only allow static insight into the mechanics of DNA minicircles, and do not yield information on their dynamical properties, since they are based on "still images" of the DNA minicircles embedded in vitrified ice. The ideal kind of data needed for inferring the coarse-grained dynamics of DNA, and their link to sequence, would be in the form of a movie of DNA minicircles oscillating in solution. Though empirical acquisition of such data is as of yet infeasible, in silico surrogates can be created via Molecular Dynamics (MD) simulations (Leach 2001;Dryden et al. 2002;Lankas, Lavery, and Maddocks 2006;Freddolino et al. 2006;Sanbonmatsu and Tung 2007;Pérez, Luque, and Orozco 2011;Mitchell, Laughton, and Harris 2011;Curuksu, Kannan, and Zacharias 2014;Pasi et al. 2014). MD simulations are used to obtain the trajectory of a DNA minicircle moving in solution, and are obtained by numerically solving fine-grained atomic level models on multi-body interactions between all the atoms of water and DNA. These simulations are extraordinary in their computational and mathematical complexity. However, as was mentioned earlier, it is not the trajectories of each individual DNA atom that is of interest, but their joint behavior in the coarse grained limit, which can be thought as a function.
The motivation for this article is that of inferring dynamical properties of DNA through MD simulations in their scaling limit, and to investigate their dependence on base-pair sequence. The data we will be working with are MD trajectories of CAP & TATA minicircles oscillating in solution. We shall model their scaling-limits as functional time series (FTS). An FTS is a sequence {X t : t ∈ Z}, where t denotes the time index, and each X t is a random function, say X t ∈ L 2 ([0, 1], R), representing the conformation of a minicircle at time t, seen as a continuous curve. The dynamics of the minicircles can be viewed through the lens of the second-order structure of the time series, which is contained in the collection of its lag-t autocovariance operators: these contain the covariation of the random function t time points apart. Understanding and comparing the dynamics of CAP and TATA minicircles can therefore be translated into the problem of estimation and inference for the second-order structure of functional time series.
In particular, we wish to be able to detect and further localize any differences either at the level of frequencies, and along the length of the curves. Our methodological work is thus presented in parallel with the DNA motivation, in the form of a case study. The article is organized as follows: in Section 2, we present the molecular dynamics simulation data on the TATA and CAP minicircles, including the necessary preprocessing steps, and the estimation of their functional dynamics through the spectral density operator. Section 3 considers the problem of detecting differences between the spectra of two time series, such CAP and TATA, by first comparing them at fixed frequencies-using a test for comparing the spectrum that we introduce in this article (Theorem 1)-and then adjusting for multiplicities to localize the differences in the frequencies, while controlling the overall significance level at which we pronounce detections. The detection of the differences between the CAP and TATA is further investigated in Section 4, where we consider the problem of first selecting frequencies at which the spectrum of CAP and TATA are different, and then detecting and localizing their differences on the minicircles, within each selected frequency. A supplementary file collects necessary technical details, proofs of our main results, and simulation studies.

Description of the Data
The dataset (produced by the group of Prof. John Maddocks, Laboratory for Computation and Visualization in Mathematics and Mechanics, Institut de Mathématiques, EPFL, Lausanne, Switzerland, http://lcvmwww.epfl.ch/.) of our case study consists of the (time) trajectories of two DNA minicircles moving freely in solution. These trajectories are constructed by means of molecular dynamics (MD) simulations (Leach 2001;Freddolino et al. 2006;Lankas, Lavery, and Maddocks 2006;Sanbonmatsu and Tung 2007;Mitchell, Laughton, and Harris 2011;Mitchell and Harris 2013). Such MD simulations simulate the trajectory of a strand of DNA in water by numerical integration of a model taking into account multi-body interactions between all the atoms of the minicircle and the aqueous solution (actually, this is an oversimplification, but the precise description of MD simulations is not the goal of this article).
The two DNA minicircles are called CAP and TATA: they are built of 158 base-pairs (BP), and differ only in 14 BP (see Table 1). The MD simulation employed an integration step of 2 femtoseconds (2 · 10 −15 sec) for the numerical integration algorithm, and the data were recorded every picosecond (10 −12 sec). Time-wise, the data consist of 50,000 snapshots, where at each snapshot one retains the three-dimensional coordinates of the 158 BP centers of the minicircle (see Section A in the supplementary material for a description of the MD simulation protocol). These are the result of a massive computation, simulating a particle system consisting of 200,000 particles, and taking approximately 6000 CPU hours on a Cray XT5 at the Swiss National Supercomputing Centre-an ambitious molecular dynamics simulation (see also Freddolino et al. 2006;Sanbonmatsu and Tung 2007;Mitchell, Laughton, and Harris 2011, for more ambitious MD simulations).
In principle, the resulting time series is not guaranteed to be stationary (indeed, at least 300-500 nanoseconds seem to be typically needed for convergence to equilibrium, see, e.g., Dans Table . The sequences of base-pairs for the CAP and TATA minicircles; the differences between the two sequences are in gray.

GATGAATTCACGGATCCGGTTTTTTGCCCGTTTTTTGCCGTTTTTTGCCCGTTTTTTGCCGTTTTTT GCCCGTTTTTTCCGGATCCGTACAGGAATTCTAGACCTAGGGTGCCTAATGAGTGAGCTAACTCACA TTAATTGCGTTGCGCCATGGAATC
TATA GATGAATTCACGGATCCGGTTTTTTGCCCGTTTTTTGCCGTTTTTTGCCCGTTTTTTGCCGTTTTTT GCCCGTTTTTTCCGGATCCGTACAGGAATTCTAGACCTAGGGTGCCTAATGAGTGCCCTTTTATAGC TTAAACGCGTTGCGCCATGGAATC et al. 2012, 2014Lavery et al. 2014), but one expects its time increments to be stationary, at least locally in time. We therefore focus on the last 10,001 snapshots, and discard the first 39,999 snapshots to avoid burn-in period effects.
For each minicircle, the data we consider are 3D time series {M t ( j) : t = 1, 2, . . . , 10,001; j = 1, . . . , 158} ⊂ R 3 , where t denotes the time index (in picoseconds), and j is the BP index. In other words, M t ( j) ∈ R 3 gives the coordinates of BP j at time t. In the sequel, we shall view the set {1, . . . , 158} as the quotient group Z/158Z, so that M t ( j + k) has a meaning for all j, k ∈ Z.
The data are shown for various timepoints t in Figure 1. Since the data can be rotated or translated without altering the information they convey on the intrinsic minicircle dynamics, our analysis should hinge on features that are invariant to the action of the group of rigid motions. We therefore choose to work with the curvature of the DNA minicircles, an invariant with respect to this group (indeed an object typically studied in minicircle experiments). Further to solving the problem of registration of the data, focusing on the curvature also reduces the dimensionality of the data, transforming the object of study from a time series of curves in R 3 to a time series of real-valued functions.

Preprocessing Steps
The estimation of curvature based on discrete noisy observations is often carried out using a plug-in approach. For instance, Sangalli et al. (2009) used free-knot regression splines to estimate the curve γ (t ), and then computed the curvature estimate The performance of such techniques heavily depends on the value of a smoothness parameter, the choice of which can be rather subjective (Sangalli et al. 2009, sec. 4). Our own experience with plug-in estimation of the curvature was that it yielded estimates with too many degrees of freedom, even with roughness penalization of the estimated curve, and this symptom we attribute to the presence of the renormalization factor in (2.1). We therefore opted to start out with a discrete version of curvature, motivated by subjectspecific knowledge on the larger scale behavior of DNA minicircles. Specifically, for each minicircle trajectory, we computed the curvature trajectory {c t ( j) : t = 1, 2, . . . , 10,001; j = 1, . . . , 158} ⊂ R + , where c t ( j) is the curvature (inverse radius) of the circle passing through the three points M t ( j − 5), M t ( j), M t ( j + 5). Recall that for three points p 1 , p 2 , p 3 ∈ R 3 , this curvature is given by curvature The reason we selected triples separated by five base-pairs (corresponding to indices j − 5, j, j + 5), instead of consecutive ones (i.e., indices j − 1, j, j + 1), is that the DNA double helix performs a complete rotation in 11-12 BP, on average; taking the curvature of the directly adjacent BP would represent a highly local curvature measure, whereas the curvature computed on further spaced BP is more in line with the coarser scale at which we wish to understand their dynamics. A welcome sideeffect is that the resulting estimates are more stable (less sensitive to small perturbation of the BP centers). From a statistical point of view, this procedure results in a smoothed version of the curvature, discarding very local bends of the DNA, but retaining larger scale bending.
Since the curvature is constrained to be positive, the curvature functions c j (t ) do not lie in a linear space. Given that functional data analyses typically hinge on linear space methods, we converted the curvature trajectories into elements of a linear space by applying the transformation (2. 2) The constant δ prevents erratic fluctuations of d j (t ) when c j (t ) approaches zero. If δ is too small, the functions d j (·) will present very large spikes, and if δ is too large, d j (·) will be essentially constant. Based on an exploratory analysis, we set δ = 10 −3 , which struck a balance between the two extremes (this choice of δ is of course specific to our dataset). Figure S11 of the supplementary material illustrates the role of δ.
Since the δ-linearized curvatures are discretely sampled versions of smooth curves, we transformed each function j for each fixed t. This was done using a basis expansion (Ramsay and Silverman 2005) with 80 periodic cubic B-splines (King, Nguyen, and Ionides 2016), respecting the nature of the data as closed loops. Our choice of 80 basis functions came from the combination of considerations on the postulated degrees of freedom of the curvature (which should be fewer than the number of base-pairs), computational considerations, and graphical goodness-of-fit assessment. It is of course specific to our dataset. We also conducted the analysis presented in the rest of the article with 40 and 60 basis functions, and the results were similar to those obtained with 80 basis functions. We note that exploratory plots revealed no need for further penalization in the smoothing of the functions d t . Figure S12 of the supplementary material illustrates the smoothing process. Exploratory analysis of the functional time series {Y t : t = 1, . . . , 10,001} revealed that the series exhibited a nonstationary behavior coupled with a long memory behavior, as is natural with the movement of a rigid body. By contrast, the time differenced curves, X t := Y t+1 − Y t , exhibited a weakly dependent stationary behavior, and so we focused on the series {X t : t = 1, . . . , 10,000} as our object of study. The model implicitly assumed is, therefore, where {Y t } is the δ-linearized curvature of the DNA minicircle, and {X t } is the stationary innovation process governing the time-increments of {Y t }. Applying all these steps to the CAP minicircles, respectively, TATA minicircles, we get two functional time series, X 1 t , respectively, X 2 t (see Figure 2).

Estimation of the Dynamics
Let us start by introducing some useful notation.
, the usual inner product will be denoted by The (second-order) dynamics of a stationary FTS {X t : t ∈ Z} are encoded in the collection of lag-t autocovariance operators where μ = EX t . Inferences on these dynamics could therefore be carried out by means of corresponding inferences on the individual autocovariance operators. However, such an approach encounters significant roadblocks, since the theory of the estimation of the lag-t autocovariance operators without structural assumptions (such as that of linearity, made in Mas & Pumo (in Ferraty and Romain (2011)) is largely unexplored (with the exception of Zhang and Shao (2015) who recently developed tests only for comparing the lag-0 autocovariance operator without structural assumptions). We choose to take a different approach, and estimate the complete dynamics through a frequency domain approach, following the paradigm recently introduced by Panaretos and Tavakoli (2013a).
In the frequency domain approach, the objects of interest are no longer the lag-t autocovariance operators, but their Fourier transform, π]} is called the spectral density operator, and is well defined under suitable summability conditions on the autocovariance operators (see Panaretos and Tavakoli 2013a, Proposition 2.1). At each frequency ω, the operator F ω is a linear oper- Various properties of the spectral density operator are given in Panaretos and Tavakoli (2013a) and Tavakoli (2014); we remark that f −ω = f ω , and we therefore restrict our interest to the range ω ∈ [0, π]. Intuitively, the spectral density operator is the generalization of the spectral density matrix encountered in multivariate time series (Brillinger 2001;Priestley 2001) to the functional setting, and yields an analysis of variance decomposition of the variance of the FTS, given by the inversion formula R t = π −π exp(iαt )F α dα, t ∈ Z. The reason we decide to carry out inference on the dynamics of an FTS via a frequency domain approach-and not through the autocovariance operators-is that the spectral density operator is inextricably linked with the so-called Cramér-Karhunen-Loève decomposition of the functional time series (and the associated harmonic principal component analysis), which elucidates its complete dynamical properties, separating the temporal, functional, and stochastic components (Panaretos and Tavakoli 2013b;Tavakoli 2014;Hörmann, Kidziński, and Hallin 2015). The Cramér-Karhunen-Loève decomposition also elucidates why spectral density operators are equally important across all frequencies (under no prior assumption concerning the dynamics of the time series). Furthermore, working in the frequency domain has a crucial whitening effect: the sample spectral density operator is asymptotically independent at distinct frequencies. In contrast to these properties, it is not clear what relative importance one should give to each lag-t autocovariance operators in the assessment of dynamics of an FTS, and the sample lag-t autocovariance operators are asymptotically correlated (Mas & Pumo in Ferraty and Romain 2011), which would make a multiplicity correction approach (see Sections 3 and 4) more involved, and potentially considerably less powerful.
Notice also that for f , g ∈ L 2 [0, 1], f , F α f is the power spectrum of the one-dimensional time series X t , f , while f , F α g is the cross spectrum of X t , f and X t , g . In this sense, the spectral density operator can be used to probe any continuous linear functional of the dynamics of the process (a fact that will be exploited in Section 4).
The estimation of the spectral density operator is carried out by first computing the discrete Fourier transforms of the FTS, and then smoothing its empirical covariance (called the periodogram operator), with a kernel function W (·) of bandwidth B T (see Panaretos and Tavakoli 2013a for details). The first step can be done using the Fast Fourier Transform, whose calculation is most efficient when the length of the series is highly composite.
As usual, the bandwidth parameter B T needs to satisfy the conditions B T → 0 and TB T → ∞ as T → ∞, for the asymptotic results to hold. The choice of B T governs also the bias/variance trade-off for the estimation of the spectral density operator, similarly to nonparametric regression. Inferential procedures can then be constructed using Theorem 1 and Panaretos and Tavakoli (2013a, Theorem 3.7), which gives a central limit theorem for the estimated spectral density operator at distinct frequencies. For finite samples, it is crucial to notice that the central limit theorem effect emerges because the spectral density estimator at a given frequency ω is obtained by weighted averaging of (2m + 1) approximately independent summands, where m = TB T /2π . Following Brillinger (2001, p. 252), the equivalent number of independent pieces of information used to estimate the spectral density estimator at frequency ω can be defined as where κ 2 = R W 2 (x)dx. The order of n(ω, m, κ ) plays therefore a role similar to the number of iid summands when applying the classical central limit theorem, and should be taken into account before making any inferential statements based on asymptotics.
In the present setup, we choose B T = 0.158, a choice that corresponds to the heuristic that B T ∼ O(T −1/5 ) asymptotically, to minimize the mean square error (Brillinger 2001, p. 251). We choose W (x) to be the Epanechnikov kernel (e.g., Wand and Jones 1995), W (x) = 3 4 (1 − x 2 ) if |x| < 1, and zero otherwise. These choices imply m = 252 and n(ω, m, κ ) = 420. We denote by F a,(T ) ω the estimated spectral density operators, or sample spectral density operators, of X a t , for a = 1, 2. The graphical representation of the sample spectral density operators is not straightforward: for each frequency ω ∈ [0, π], one has an operator on L 2 ([0, 1], C): the corresponding trace norm is shown in Figure 3 for CAP and TATA. The (modulus of the) sample spectral density kernels f a,(T ) ω , a = 1, 2, associated with the sample spectral density operators, are depicted in Figure S13 of the supplementary material. We notice that most of the variance is distributed along the high-frequency end of the spectral density operator, meaning that the series X t consists mainly of high-frequency oscillations, and that the low-frequency oscillations contained in the linearized curvature series Y t mostly cancel out when taking its time differences to form X t (this is not surprising considering the transfer function induced by a time differencing). Figure S13 of the supplementary material illustrates that most of the nuclear norm of the spectral density operator is near the diagonal of the sample spectral density kernels, and then falls off sharply as one moves away from the diagonal. The interpretation of this is that the series X t has strong local interactions, in the sense that X t (τ ) and X 0 (σ ) are interacting strongly for |τ − σ | small, say |τ − σ | < ε , and much more weakly for |τ − σ | > ε . This reflects the fact that the series X t is locally smooth, but globally quite rough, as can be seen in Figure 2.

Comparison of the Spectral Density Operators
Though the traces of the sample spectral density operators of the CAP and TATA series appear different, and though small differences between their sample spectral density kernels are visible, it is not a priori clear whether these differences are indeed statistically significant. The next section sets out to address this issue, by introducing inferential tools for comparing two spectral density operators on a grid of frequencies, and localizing any detected differences at the level of frequency.

Comparing and Localizing Spectral Differences by Frequency
Comparing the second-order dynamics of the functional time series X 1 t and X 2 t can now be formalized as testing the equality of their spectral density operators. More precisely, if H ω :"F 1 ω = F 2 ω , " for ω ∈ [0, π], we wish to test ω∈ [0,π] H ω against "H ω fails for some ω ∈ [0, π]." We will take a multiple testing approach to this problem by first constructing a test for each H ω , marginally, and then enforcing a multiplicity correction.

Comparing the Spectral Density Operator at a Fixed Frequency
To test H ω for a fixed ω ∈ [0, π], we construct a test inspired by the class of tests introduced by Panaretos, Kraus, and Maddocks (2010) for testing the equality of covariance operators in iid collections of Gaussian random functions. The key idea is that, in light of the central limit theorem of Panaretos and Tavakoli (2013a, Theorem 3.7), the estimated (or sample) spectral density operator F (T ) ω can be roughly seen as the empirical covariance operator of a sample of TB T /2π approximately iid replications of the discrete Fourier transform of {X t } at frequency ω. Under this heuristic, one can construct a test statistic in the spirit of Panaretos, Kraus, and Maddocks (2010), by considering the Hilbert-Schmidt norm of the difference between the sample operators, F 1,(T ) ω − F 2,(T ) ω , restricted on a subspace of Hilbert-Schmidt space of dimension K. The choice of this subspace is such that it retains the bulk of the Hilbert-Schmidt norm of the difference, subject to its dimension being K. When H ω is valid, this is achieved by projecting onto the (random) subspace generated by the tensor products of the first be the eigenvalue/eigenfunction pairs of the pooled spectral density operator (F 1,(T ) ω + F 2,(T ) ω )/2, and denoting by ) the rescaled difference between the two sample spectral density operators at ω, we consider the statistic˜ where κ 2 = R W 2 (x)dx is the L 2 -norm of the weight function W (x). Each component in the sum comprising the test statistic measures the squared norm of the component of the difference D (T ) ω that lies in the subspace spanned byφ i ⊗φ j , renormalized according to its asymptotic variance. The indicator in the denominator is a correction term for the frequencies ω ∈ {0, π}, at which the sample spectral density operator has an increased variance.
The heuristics leading to the construction of the test statistic can indeed be made rigorous, and the following Theorem establishes that the asymptotic distribution of our test statistic under the null hypothesis is chi-squared. Its proof is given in Section C of the supplementary material. Theorem 1. Let K 1 , . . . , K J be fixed nonnegative integers, ω 1 , . . . , ω J ∈ [0, π] be a fixed number of distinct frequencies.
Assume that Conditions B.1 of the supplementary material hold, and that B T → 0 and TB T → ∞ as T → ∞. Furthermore, assume that for each ω j , the first K j eigenvalues of the spectral density operator F ω j are all distinct, and nonnegative. Then, under the null hypothesis H 0 = ∩ J j=1 H ω j , the test statis- (3. 2) The truncation level K-which is assumed to be fixed in the asymptotic framework of Theorem 1-represents a regularization parameter whose choice governs a bias/variance trade-off reflected in Type I and Type II error probabilities. Small values of K will guarantee the preservation of the nominal level of the test under the null, but are likely to incur "bias-related" Type II error under the alternative, when differences in the two operators are to be found in dimensions higher than K. A more aggressive choice of a large K can result to instabilities due to the ill-posedness of the problem, resulting in Type I and Type II errors alike. In principle, the choice of truncation level K should be dependent on the corresponding frequency ω j at which the test is carried out. This is because a spectral density operator F ω may exhibit a different rate of spectral decay as ω varies. At a fixed sample size, the optimal value of K j (T ) depends in a complicated manner on the eigenvalues, and the gaps between the eigenvalues, of the spectral density operator at the corresponding frequency. Though a theoretical investigation along this avenue would be of interest (using results from, e.g., Fremdt et al. 2014), it is beyond the scope of our article. In practice, a frequency-dependent choice may be made by choosing K = K(ω), which optimizes a model selection-type criterion. We use a fit/penalty trade-off criterion that is inspired by the pseudo-AIC (Akaike information criterion) criterion of Yao, Müller, and Wang (2005), and its two-sample generalization introduced by Panaretos, Kraus, and Maddocks (2010): where GOF(K, ω) is a goodness-of-fit criterion, and PEN a (K, ω), a = 1, 2, penalize for overfitting of the spectral densities F a,(T ) ω , a = 1, 2. We propose taking where (μ a,ω j ,φ a,ω j ) denotes the jth eigenvalue/eigenvector pair of F a,(T ) ω , a = 1, 2; j = 1, 2, . . ., and is the projection of the sample spectral density operator onto the first K eigenspaces of the pooled sample spectral density operator (F 1,(T ) ω + F 2,(T ) ω )/2. The constant n(ω, m, κ ) is defined in (2.6), and depends only on ω, m = TB T /2π , and κ. The intuition behind this criterion is that it corresponds to the AIC criterion of Panaretos, Kraus, and Maddocks (2010, sec. 3.3) had we observed n(ω, m, κ ) iid complex curves from a random function with covariance F a ω , for a = 1, 2. Even though these curves are not observed in our context, the choice of n(ω, m, κ ) reflects the number of independent pieces of information used to construct our estimate F a,(T ) ω .
We also propose a variant of the AIC criterion, by using the following penalty in lieu of (3.5), The difference between AIC and AIC * is that the second criterion takes into account the difficulty of estimating the eigenstructure of the pooled spectral density operator, in addition to penalizing for the relative roughness of the pooled spectral density operator in comparison to F 1,(T ) ω and F 2,(T ) ω , respectively (see Bosq 2000, Lemma 4.3). We also note that both criteria are invariant to scaling of the sample spectral density operator.
Numerical simulations (see Section D of the supplementary material), conducted with the automatic choice of K = K(ω) using either AIC and AIC * , suggest that our testing procedure respects the level of the test in a variety of settings, unless for very low sample size when K(ω) is chosen by AIC (see Table S1 of the supplementary material). It should be noted that our procedure is slightly conservative in general, due to the multiple testing approach taken (Benjamini and Hochberg 1995). The numerical simulations suggest that K(ω) should be selected by means of AIC in settings where the eigenvalues of the spectral density operator decay steeply, and that AIC * should be used if the eigenvalues of the spectral density operator decay slowly (we note that the power using either AIC criterion is not necessarily higher than that obtained using a prechosen value of truncation level K; however it is not clear how to choose K a priori). Since our DNA minicircle data correspond to the second case, we shall use AIC * to choose K(ω) in the remainder of the article.

Localization of Differences on the Frequencies
Recall that H ω denotes the null hypothesis F 1 ω = F 2 ω . To test the global null hypothesis H G := ω∈ [0,π] H ω , we will first obtain marginal p-values for each of the null hypotheses H ω , ω ∈ , where := {ω 1 , . . . , ω J } ⊂ [0, π] is a grid of frequencies, and then adjust the p-values to account for multiplicity effects. The p-values will be based on the asymptotic distribution of test statistic˜ (T ) K (ω), given by Theorem 1. The results of applying the automatic truncation level rule (3.7) to our DNA minicircle dataset are shown in Figure 4. We notice that the selected values of K(ω) vary between 21 and 25. The corresponding (approximate) p-values are , π}, and ν(ω j ) = K(ω j ) 2 otherwise. The choice of the grid of frequencies at which the p-values are computed should be guided by a priori knowledge of the nature of the alternative hypothesisprovided there is any such knowledge-see Section 3.3. In our case, we chose a grid of 81 frequencies, which is shown in Figure 4. Adjusting the p-values for multiplicities can be done to control the false discovery rate (FDR; see Benjamini and Hochberg 1995), the expected value of the proportion of false rejections among all rejections. In terms of the FDR, since the p-values p i and p j are dependent for |ω i − ω j | < 0.15, but approximately independent for |ω i − ω j | > 0.15, we are in the context of dependence in finite blocks, and the original Benjamini-Hochberg (BH) algorithm for controlling the FDR is valid (Storey, Taylor, and Siegmund 2004). We show the adjusted p-values using the BH procedure in Figure 5. We notice that the two spectral density operators appear to be highly significantly different at all frequencies. We also conducted numerical simulations to assess the performance and validity of our procedure in finite sample; these suggest that the BH procedure controls the Type I error for small sample sizes (see Section D of the supplementary material).
Since these adjusted p-values are remarkably small (they range from 10 −30 to 10 −140 ), and our simulations in Section D of the supplementary material demonstrate the validity of the test procedure, we perform a "sanity check" to make sure that these p-values are not the result of any remaining transient (nonstationary) behavior: we consider what p-values would be produced when comparing the dynamics of two time-separated stretches of the same FTS. Using the procedure described in this section, we compare the spectral density operators of the two FTSs {X CAP t : t = 1, . . . , 1000} and {X CAP t : t = 9001, . . . , 10,000}, which are assumed to be approximately independent under weak dependence. The null hypothesis should be valid, of course, provided stationarity is not violated. The adjusted p-values, shown in Figure S14 of the supplementary material, suggest that the spectral density operator of CAP is indeed stable in time (across all frequencies), in accordance with the hypothesis of stationarity made earlier, and supporting the soundness of the adjusted p-values presented in Figure 5. The same conclusions hold for TATA ( Figure S14 of the supplementary material).

Figure .
Adjusted p-values (using the BH procedure) for testing the equality of the spectral density operator of CAP and TATA, with the truncation level K (ω) automatically chosen at each frequency ω with the AIC * criterion (.). The small ticks on the horizontal axis represent the grid of frequencies for which the test is computed.

Choice of the Discretization Grid
The choice of the grid (and therefore J) is related to finer knowledge of potential departures in the direction of the alternative against which we wish to test the global null H G = ∩ ω∈ [0,π] H ω : Power for global differences between the two spectral density operators: If we believe that the true difference between the two spectral density operators is going to be on a large interval of [0, π], J should be small, so that the power of the test is not lost because of multiple comparisons. If ⊂ [0, π] is chosen such that |ω i − ω j | ≥ 2B T , for all ω i = ω j ∈ , then the p-values are approximately independent, and one could control the familywise error rate via Hochberg's procedure (Dudoit, Shaffer, and Boldrick 2003), if this is what is desired. Power for narrow banded differences between the two spectral density operators: If we believe that the true difference is in a very narrow band of the spectral density operator, for example, H ω is false only for |ω − ω | < ε, with ω ∈ [0, π] and ε > 0 small, then should be chosen to be a dense grid over [0, π]. The largest gap between any two consecutive frequencies in will indicate approximately smallest bandsize ε for which the test would be able to detect departures from the global null H G . Frequencies near {0, π}: Although we expect˜ (T ) K (ω j ) to follow, for large T , approximately a χ 2 K 2 distribution for any ω j ∈ {0, π}, the approximation might not hold for frequencies ω j very close to {0, π}. This happens because the asymptotic distribution of˜ (T ) K (ω) is χ 2 K(K+1)/2 for ω ∈ {0, π}, but χ 2 K 2 for ω ∈ (0, π ), and because˜ (T ) K (ω) is continuous in ω. Therefore, for ω j close to {0, π}, the approximate distribution of˜ (T ) K (ω j ) is a mixture of χ 2 K(K+1)/2 and χ 2 K 2 random variables, with unknown mixture proportion. We therefore recommend that all the frequencies ω ∈ , with ω ∈ {0, π}, be at least at distance B T of the frequencies {0, π}.
Understanding how the power varies with the number of frequencies J at which the spectral density operators are compared is important, since there may be situations in which no prior knowledge on the spectral density operators is available. As is turns out, there is a heuristic upper limit to the gridsize because 1. The test used is continuous in ω for constant K, that is, Computationally, it is more efficient to compute the sample spectral operators, and hence˜ (T ) K (ω), on the Fourier frequencies * = {2π s/T : s = 0, . . . , T/2}, or on a subset of them. If the multiple correction is done using the false discovery rate (FDR), numerical simulations (Section D.2) suggest that conditionally on the observed functional time series, the FDRadjusted p-values (the q-values) are approximately stable as the grid becomes as dense as * (see Figure S9 of the supplementary material). Concerning the power of the test, it seems that though a very sparse grid may yield more power in some situations, in other situations, slightly denser grids yield considerably more power. However, choosing the densest grid * results generally in a loss of power, due to multiplicity corrections and the continuity of the test in the frequencies (see Figure S10 of the supplementary material).

Localizing Differences in Frequency and Along Curvelength
We now wish to qualify the difference between CAP and TATA dynamics at a finer level: we wish to first detect the specific frequencies at which CAP and TATA curves differ (the significant frequencies), and then localize the region on the minicircles (curves), within each significant frequency, where these differences occur. This serial framework is quite naturally amenable to recent selective multiple testing methodology proposed by Benjamini and Bogomolov (2014). Note that if one wishes to search for dynamical differences between the two series at a frequency ω 0 and attributable to the covariation between two regions [τ 1 , τ 2 ] and [τ 3 , τ 4 ] along the curves, it suffices to consider hypotheses comparing linear contrasts F 1 ω 0 g, h and F 2 ω 0 g, h , for g, h ∈ L 2 [0, 1] two contrast functions concentrated on [τ 1 , τ 2 ] and [τ 3 , τ 4 ], respectively.
The contrasts we choose to employ are the periodic B-spline basis functions used to represent the curves (King, Nguyen, and Ionides 2016). Consequently, we base our procedure on the differences in the (i, j)th basis coefficient between the spectral density operator of CAP and TATA, at a given frequency ω. Let us denote by f a ω , respectively, f a,(T ) ω , the 80 × 80 coefficient matrices with respect to the periodic B-spline basis (King et al. 2010) of the true spectral density operator, respectively, the sample spectral density operator, at frequency ω, for the time series X a t , a = 1, 2. We shall call f a ω the projected spectral density operator, and f a,(T ) ω the projected sample spectral density operator, and denote by f ω (i, j) the (i, j)th entry of the matrix f ω . The local null hypotheses we wish to test for are of the form . . . , 80; ω ∈ [0, π].
By symmetry of the projected spectral density operator, we restrict ourselves to the indices 1 ≤ i ≤ j ≤ 80. We point out that this approach is different from a classical multivariate approach, as discussed in Remark 2. For each frequency ω and each 1 ≤ i ≤ j ≤ 80, assuming f ω (i, i)f ω ( j, j) = 0, we can use the projected sample spectral density operator to construct a p-value p(ω; i, j) for the null hypothesis H ω (i, j), as described in Section E of the supplementary material. The p-values are only computed on a subgrid = {ω 1 , . . . , ω L } ⊂ [0, π], which is chosen such that |ω i − ω j | ≥ 2B T , so that the p-values across different ω j 's are approximately independent (see the discussion in Section 3.3).
We choose to select significant frequencies and localize the differences between CAP and TATA in a way that controls the expected average of the false discovery proportion over the significant frequencies (Benjamini and Bogomolov 2014). To make this statement precise, let p l = {p(ω l ; i, j) : 1 ≤ i ≤ j ≤ 80} be the set of p-values at frequency ω l , and P = {p 1 , . . . , p L } be the set of all p-values over the grid . Let S(P) be the selection procedure for the significant frequencies, based on all the p-values P, that is, S(P) ⊂ , and |S(p)| denote the number of significant frequencies. Let FDP(ω) = V (ω)/R(ω) be the false discovery proportion at frequency ω, where V (ω) denotes the (unknown) number of wrong rejections within frequency ω, and R(ω) denotes the total number of rejections at frequency ω. The error criterion we will seek to control is (4.1) Notice that if the selection procedure S is carried out without relying on the data, (4.1) simplifies to l∈S FDR(ω l )/|S|, the Figure . The plots show the regions on the minicircles, for each frequency, where the spectral density operator of CAP and TATA are overall significantly different at a 1% level (with respect to the error criterion (.)). Each plot represents the regions of differences (in black) between the spectral density kernels of the two minicircles. The two gray vertical and horizontal bands correspond to the region where the base-pair sequences of the two DNA minicircles are different.
To select the significant frequencies and select the null hypotheses to reject within each significant frequency while controlling the expected average FDP (4.1) at the level α, we use the following procedure (see Theorem 1 and sec. 5, Benjamini and Bogomolov 2014): 1. Adjust, within each frequency ω l , the p-values p l for the control of the FDR, and denote the result by q l , also called q-values (see Remark 1). 2. Select the significant frequencies S by applying the BH procedure to the set of minimum q-values {min q 1 , min q 2 , . . . , min q L }. 3. Within each significant frequency ω l , l ∈ S, reject the null hypotheses whose corresponding q-value are smaller than |S|α/L. In addition to controlling the error criterion (4.1), this procedure has also the additional property that it controls the FDR at the level of the frequencies. Notice however that the frequencies selected as significantly different by this method need not be exactly the same as the ones selected by the method of Section 3.
The result of applying this procedure to our minicircle data with α = 0.05, and on the grid of frequencies = {0, 0.38, 0.77, 1.15, 1.57, 1.95, 2.34, 2.72, 3.14} is shown in Figure 6 in the form of zero-one plots, which exhibit graphically the regions where the spectral density operator of CAP and TATA differ significantly at a 1% level. We notice first that all the tested frequencies are significant, which is not surprising since the frequency tests (Section 3) suggested that the null hypothesis H ω for each fixed frequency was confidently rejected. We also see that the rejected hypotheses are mostly situated on the diagonal of the spectral density operator, that is, the rejected nulls are mostly of the form H ω (i, j) with |i − j| small. This signifies that the differences in the dynamics of CAP and TATA curves are primarily due to local interactions (between X t (τ ) and X 0 (σ ) for |τ − σ | small) differ. This is not surprising since we have already seen (in Section 2.2) that most of the covariation of the minicircles stems from their local interactions. An interesting observation is that the detected differences between the spectral density operators of the two minicircles do not exclusively reside in the region where their BP sequence is different (see Table 1), but extend to other regions of the minicircles.
Remark 1 (p -Value adjustment within each frequency). Since the p-values p l = {p(ω l ; i, j) : 1 ≤ i ≤ j ≤ 80} are correlated with a nontrivial correlation structure (see, e.g., (E.2) of the supplementary material), we cannot use the BH procedure to control the FDR, nor more recent procedures (which require, e.g., dependence in finite blocks, see Storey, Taylor, and Siegmund 2004;Schwartzman, Dougherty, and Taylor 2008). We therefore use the conservative version of FDR, which works under arbitrary dependence structure of the p-values (Benjamini and Yekutieli 2001, Theorem 1.3) to obtain the q-values q l . Nevertheless, numerical simulation that we carried out to assess the validity of our procedure suggested that the BH procedure seems to control the FDR within each frequency. Further work along this line would be of interest.
Remark 2 (Differences with multivariate analysis). Although the idea of comparing at the level of basis coefficients seems like a multivariate approach, it differs from it in that the choice of the basis functions will influence the qualitative conclusions that can be drawn from the analysis. Our choice of a periodic Bspline basis allows one to distinguish differences between CAP and TATA that are very localized on the minicircles. Another choice could be that of a wavelet basis, which would allow one to detect differences between CAP and TATA across multiple scales. The choice of the basis is therefore intimately related to the directions (in function space) in which the test is most powerful.

Concluding Remarks
We have introduced a method for comparing the dynamics of two functional time series (FTS) at a hierarchy of levels, through a frequency domain approach. Our method was illustrated through a case study in molecular biophysics, and specifically aimed at detecting sequence-dependent effects on the molecular dynamics of DNA at persistence length.
Our procedure is based on a test for comparing the spectral density operators of two FTSs at fixed frequencies. As a first step, this test can be used in combination with multiple testing procedures to detect differences between the spectral density operators of FTSs, and enables localizing at which frequencies the differences occur, while controlling an overall error measure. As a second step, one can compare the spectral density operators of two FTSs jointly in frequencies and along the curvelength, by first localizing differences in the frequencies, and then identifying their differences along the curvelength, within each frequency, while controlling the average false discovery rate over the selected frequencies. We conducted numerical simulations to assess the strength of our method in finite samples, and its robustness to "adversarial" setups.
Our case study indicates that the dynamics of the two DNA minicircles we studied (CAP and TATA) seem to be globally significantly different, across every fluctuation frequency, at least at the given MD simulation. A finer investigation of their differences-along the curvelength-show signs of an interesting phenomenon: the dynamics of CAP and TATA, though being mainly local, seem to be not only limited to the region where their base-pairs are different, but to extend to other regions of the DNA minicircles. This suggests that a local change of base-pairs might induce a global change of dynamics through a "propagation effect" along the DNA minicircle (see also Kim et al. 2013). Though the effect appears to affect the entire DNA minicircle in our case study, it is not clear whether the effect's intensity is fading with distance, and if it would have affected only part of dynamics had the DNA minicircle been longer. Another important point to mention is that though the difference between CAP and TATA were strongly statistically significant (the largest adjusted p-value for testing the equality of their spectral density operators was smaller than 10 −20 ), a bare eye examination of the data does yield any hints on differences in their dynamics.
Our method relies on assumptions on the stationarity and weak dependence of the underlying FTSs. To validate our findings, we performed a "sanity check" by searching for differences in the spectral density operators of each FTSs, estimated using the first 1000 and last 1000 timepoints of each FTS. No difference was detected, suggesting that the assumption of stationarity is not violated. Furthermore, exploratory analysis of the FTSs revealed that the weak dependence assumption is acceptable.
To our knowledge, this work is the first attempt of comparing the entire second-order dynamics of FTSs, and localizing their differences across frequencies, and within frequencies along the curvelength. This also appears to be the first time that a functional data analysis has been employed to study the coarse-grained molecular dynamics of DNA, and indeed that significant dynamical sequence-dependent differences have been statistically quantified in a functional setup. Moreover, the method we propose does not hinge on any linearity or Gaussian assumption on the underlying FTSs, but on stationarity and moment-type weak dependence assumptions. This is particularly fitting in the case of DNA where the scaling limit models are far from clear, and potentially non-Gaussian; and indeed, given the rigid body nature of DNA, existence of moments of all orders naturally leads to moment-type mixing. A drawback of our method, due to our model-free and frequency domain approach, is that interpretation is not straightforward (though the Cramér-Karhunen-Loève decomposition, Panaretos and Tavakoli 2013b, can be used to this aim). Potential extensions of our work could be in the direction of tests for stationarity (similar to our "sanity check"), development of bootstrap version of our tests to take into account the local dependency (in frequencies) of the sample spectral density operator, or incorporation of the theory of excursion sets of Gaussian processes for the localization of the difference along curvelength.
A note of caution is that, even though the differences detected between the spectral density operators are statistically significant, we do not claim over-arching conclusions on the nature and effect size of these differences: this would require further and finer analyses (e.g., Freddolino et al. 2006;Sanbonmatsu and Tung 2007) and of course may also require more than one set of MD trajectories. Still, our preliminary results can hopefully serve as a starting point for further work on the DNA context, and as a case study illustrating the scope and potential of functional time series inference in biophysical modeling.

Supplementary Materials
The supplement contains the molecular dynamics simulation protocol (Section A), the technical assumptions of our main results (Section B), the proof of Theorem 1 (Section C), and numerical simulations (Section D).
R Package The R code implementing the methods of this article is implemented in the R package ftsspec available at https://cran.r-project.org/web/packages/ftsspec/index.html. The data generated and simulations studies are available at https://www.repository. cam.ac.uk/handle/1810/253695*.