Bayesian Model Search for Nonstationary Periodic Time Series

Abstract We propose a novel Bayesian methodology for analyzing nonstationary time series that exhibit oscillatory behavior. We approximate the time series using a piecewise oscillatory model with unknown periodicities, where our goal is to estimate the change-points while simultaneously identifying the potentially changing periodicities in the data. Our proposed methodology is based on a trans-dimensional Markov chain Monte Carlo algorithm that simultaneously updates the change-points and the periodicities relevant to any segment between them. We show that the proposed methodology successfully identifies time changing oscillatory behavior in two applications which are relevant to e-Health and sleep research, namely the occurrence of ultradian oscillations in human skin temperature during the time of night rest, and the detection of instances of sleep apnea in plethysmographic respiratory traces. Supplementary materials for this article are available online.


Introduction
Identifying the periodicities present in a cyclical phenomenon allows us to gain insight into the sources of variability that drive the phenomenon.For example, respiratory traces obtained from a plethysmograph used on rodents in experimental sleep apnea research exhibit many abrupt changes in their periodic components as the rat naturally changes their breathing pattern in the course of its sleep-wake activities (Han et al. 2002, Nakamura et al. 2003).Similarly, human temperature, as measured by a wearable sensing device over several days at relatively high temporal resolution, may be subject to a different periodic behaviour during the night when the individual transitions between ultradian sleep stages (Carskadon et al. 2005, Komarzynski et al. 2018).
While the theory and methods for analyzing the periodicities of time series data whenever the time series is stationary is relatively well-developed, the task of modelling time series that show regime shifts in periodicity, amplitude and phase remains challenging, as the timing of changes and the relevant periodicities are usually unknown.
There have been many developments for modeling stationary oscillatory time series data.Rife & Boorstyn (1976) and Stoica et al. (1989) addressed the problem of estimating the frequencies, phases and amplitudes of sinusoidal signals under the assumption of a known number of sinusoids, where inference is based on maximum likelihood frequency estimators.These models, however, require very long time series and a large separation in the frequencies that drive the process, which will not always be the case in practice (Djuric 1996, Andrieu & Doucet 1999).Quinn (1989), Yau & Bresler (1993) and Zhang & Wong (1993) tackled the problem of model selection on the number of sinusoidal signals by employing the Akaike information criterion (Akaike 1974) and the minimum description length principle (Rissanen 1978).Djuric (1996) showed that these procedures tend to estimate a wrong number of components when the sample size is small and the signal-to-noise ratio is low.
Bayesian approaches to modelling stationary oscillatory signals were explored for the first time by Bretthorst (1988Bretthorst ( , 1990) ) with applications to nuclear magnetic resonance spectroscopy.Dou & Hodgson (1995, 1996) presented a Bayesian approach that uses a Gibbs sampler to identify multiple frequencies that drive the signal.Their method required the number of frequencies to be fixed in advance, and model selection was achieved by choosing the most probable model based on the estimation of the parameters for all possible models.Bayesian model selection for stationary oscillatory signals based on posterior model probabilities were also investigated by Djuric (1996).Andrieu & Doucet (1999) introduced a more efficient reversible-jump MCMC method (Green 1995) that jointly tackles model selection and parameter estimation for an unknown number of stationary sinusoidal signals and avoids the computationally expensive numerical optimization of Dou & Hodgson (1995, 1996) by sampling the frequencies one-at-time via Metropolis-Hastings (M-H) steps.To the best of our knowledge, currently there is no extension of this methodology to analyze nonstationary oscillatory signals.
A formal statistical modeling framework for a specific class of nonstationary time series data, called locally stationary time series, was developed by Dahlhaus et al. (1997).Extending this framework to a Bayesian setting, Rosen et al. (2009) proposed an approach to model the log of the time-varying spectral density using a mixture of smoothing splines.Rosen et al. (2012) improved on this by splitting the time series into an unknown but finite number of segments of variable lengths, thereby avoiding the need to pre-select partitions, and to estimate the timevarying spectral density using a fixed number of smoothing splines.For a given partition of the time series, the likelihood function is approximated via a product of local Whittle likelihoods (Whittle 1957).The methodology was developed using a Bayesian framework and is based on the assumption that, conditional on the position and number of partitions, the time series are piecewise stationary, and the underlying spectral density for each partition is smooth over frequencies.However, exploratory analyses of the time series in both of our case studies revealed spectral densities with very sharp peaks, often at several nearby frequencies, thus invalidating the assumption that the spectral density is smooth over frequencies.In addition, the frequency location of these sharp peaks changed over time.
In this article we propose a novel Bayesian methodology for modelling oscillatory data that show regime shifts in periodicity, amplitude and phase.In contrast to previous work our approach does not require pre-specifying the number of regimes or the order of the model within a regime.
We assume that, conditional on the position and number of change-points, the time series can be approximated by a piecewise changing sinusoidal regression model.The timing and number of changes are unknown, along with the number and values of relevant periodicities in each segment.
We develop a reversible jump MCMC technique that jointly explores the parameter space of the change-points and sub-models for all segments.
The paper is organized as follows.Section 2 and 3 present the model, the prior specifications and the general structure of our Bayesian approach.Sections 4 and 5 provide a detailed explanation of our sampling scheme and a simulation study to demonstrate the performance of our approach.In Section 6 we illustrate the use of our methodology in two data-rich scenarios related to sleep, circadian rhythm and e-Health research, namely the identification of the spectral properties of experimental breathing traces arising in sleep apnea research, and the analysis of human temperature data measured over several days by a wearable sensor.We conclude and discuss our current findings in Section 7.

The Model
Consider a time series realization y 1 , . . ., y n whose periodic behaviour may change at k unknown time-points s (k) = (s 1 , . . ., s k ) where k is also unknown.Assume that in each sub-interval I j = [s j−1 , s j ) there are m j relevant frequencies ω j = (ω j, 1 , . . ., ω j, m j ) , for j = 1, 2, . . ., k + 1. Setting s 0 = 1 and s k+1 = n, we can write the following sinusoidal model (Andrieu & Doucet 1999) denotes the indicator function, and µ j and α j may, if needed, account for a linear trend within each segment.For simplicity we assume independent zero-mean Gaussian errors with regime-specific variances noting that, in principle, the methodology can in principle be adjusted to the non-Gaussian case.
The dimension of the model is given by the number of change-points k and the number of frequency components in each regime denoted by where The vector of basis functions x t ω j is defined as x t ω j = 1, t, cos(2πω j,1 t), sin(2πω j,1 t), . . ., cos(2πω j,m j t), sin(2πω j,m j t) , where θ j = ( β j , ω j , σ 2 j ) is the vector of parameters and n j the number of observations of the j th segment.

Bayesian Inference
Given some pre-fixed maximal numbers of change-points, k max , and frequencies per regime, m max , inference is achieved by assuming that the true model is unknown but comes from a finite class of models where each model M k , with k change-points, is parameterized by the vector and Ω m j = (0, 0.5) m j denote, respectively, the sample space for the locations of change-points and the frequencies of the j th segment.The overall parameter space can be written as a finite union of subspaces and Bayesian inference on k, m (k) , s (k) and θ (k) is based on the following factorization of the joint posterior distribution where we use π ( • ) as generic notation for probability density or mass function, whichever appropriate.Sampling from it poses a multiple model selection problem, namely of the number of change-points and number of frequencies in each regime, which can be addressed by constructing a reversible-jump MCMC algorithm Green (1995).The algorithm in its basic structure iterates between the following two moves: 1. Segment model move: Given a partition of the data at k locations s (k) , inference on the parameters m (k) and θ (k) is based on the conditional posterior A reversible-jump MCMC algorithm is performed in parallel on each of the k + 1 segments, where at each iteration the number of sinusoids m j , the linear coefficients β j , the frequencies ω j and the residual variances σ 2 j are sampled independently in each segment, for j = 1, . . ., k+1.Notice that at this stage the algorithm will explore subspaces of variable dimensionality regarding the number of frequencies per segment, while the change-point model remains fixed.Our prior specifications assume independent Poisson distributions for the number of breakpoints k and frequencies in each segment m j , conditioned on k ≤ k max and 1 ≤ m j ≤ m max , respectively.Given k, a prior distribution for the positions of the change-points s (k) can be chosen as in Green (1995) Conditional on k and m (k) , we choose a uniform prior for the frequencies ω j, l ∼ Uniform(0, 0.5), l = 1, . . ., m j , and j = 1, . . ., k + 1. Analogous to a Bayesian regression (Bishop 2006), a zero-mean isotropic Gaussian prior is assumed for the coefficients of the j th segment, β j ∼ N 2m j +2 ( 0, σ 2 β I ), j = 1, . . ., k + 1, where σ 2 β is a pre-specified large value, and the prior on the residual variance σ 2 j of the j th partition is Inverse-Gamma ν 0 2 , γ 0 2 , where η 0 and ν 0 are fixed at small values.

Sampling Scheme for Nonstationary Periodic Processes
Here we provide the sampling scheme associated with the nonstationary periodic processes that we wish to model.An outline of the overall procedure is as follows.Start with an initial configuration of number of change-points k, along with their locations s (k) ; this yields a partition of the data y = (y 1 , . . ., y k+1 ).Initialize the number of frequencies in each regime m (k) and their values ω (k) , along with the coefficients β (k) and residual variances σ 2 (k) .At each iteration of the algorithm a segment model and a change-point model move are estimated.A random choice with probabilities (6) based on the current number of parameters will determine whether to attempt a birth, death or a within-model move.In particular, let z denote the current number of parameters, i.e. change-points k in the change-point model or frequencies m j in the j th segment model; then, the dimension may increase by one (birth step) with probability b z , decrease by one (death step) with probability d z or remain unchanged (within step) with probability for some constant c ∈ [0, 1 2 ], and π (z) is the prior probability of the model including z. Reversibility of the Markov chain is guaranteed for move types that involve a change in dimensionality as b z π (z) = d z+1 π (z + 1).Here we chose c = 0.4 but other values are legitimate as long as c is not larger than 0.5, to assure that the sum of the probabilities does not exceed 1 for some values of z.Naturally, b k max = b m max = 0 and d k 0 = d m 1 = 1.The pseudocode of the overall algorithm that describes an iteration of the sampler is given in Algorithm 1.We next describe in more detail the specific procedures needed to update the moves.

Updating a Segment Model
Given the number of change-points k and their locations s (k) , a segment model move is performed independently and in parallel on each of the k + 1 partitions.Hence, throughout this subsection the subscript relating to the j th segment may be dropped and a segment of interest is denoted by y = (y a , . . ., y b ) , which contains n observations.Assume that the current number of frequencies is set at m; then, an independent random choice is made between attempting a birth, death or within-model step, with probabilities given in ( 6).An outline of these moves is as follows (further details are provided in the Appendix).
Within-Model Move: Conditioned on the number of frequencies m, we sample the vector of frequencies ω following Andrieu & Doucet (1999), i.e. by sampling the frequencies one-at-time using a mixture of M-H steps, with target distribution In particular, the proposal distribution is a combination of a Normal random walk centred around the current frequency and a sample from values of the Discrete Fourier transform of y.The corresponding vector of linear parameters β is then updated in a M-H step, from the target posterior conditional distribution where the proposed values are drawn from normal approximations to their posterior conditional distribution.Finally, the residual variance σ 2 is then updated in a Gibbs step from Between-Model Moves: For this type of move, the number of frequencies is either proposed to increase by one (birth) or decrease by one (death).If a birth move is proposed, we have that m p = m c + 1, where current and proposed values are denoted by the superscripts c and p, respectively.The proposed vector of frequencies is constructed by proposing an additional frequency to include in the current vector.Conditional on the frequencies, the corresponding vector of linear coefficients and the residual variance are sampled as in the within-model move.
If a death move is proposed, we have that m p = m c − 1.Hence, one of the current frequencies is randomly chosen to be removed.The proposed corresponding vector of linear coefficients is drawn, along with the residual variance.For both moves, the updates are jointly accepted or rejected in a M-H step.

Updating the Change-Point Model
This part of the algorithm identifies the number and locations of change-points.Suppose the number of change-points is currently set to some value k, then according to the probabilities given in ( 6) a random decision is made between adding, removing or moving a change-point.
The rules for updating these types of moves are described below and more details are given in the Appendix.
Within-Model Move: An existing change-point is proposed to be relocated with probability 1 k , obtaining say s c j .The update for the selected change-point is proposed from a mixture of a Normal random walk centred on the current change-point s c j and a sample from a uniform distribution on the interval [s c j−1 + ψ s , s c j+1 − ψ s ].Here, we introduced ψ s as a fixed minimum time between change-points avoiding change-points being too close to each other.Rosen et al. (2012) used a similar scheme, but on a discrete-scale.The number of frequencies and their values are kept fixed, and, conditional on the relocation, the linear coefficients for the segments affected by the relocation are sampled.These updates are jointly accepted or rejected in a M-H step and the residual variances are updated in a Gibbs step.
Between-Model Moves: For this type of move, the number of change-points may either increase (birth) or decrease (death) by one.If a birth move is proposed, we have that k p = k c + 1.

The new proposed change-point is drawn uniformly on
The latter involves splitting an existing segment.The number of frequencies and their values in the proposed segments are selected from the current states.Two residual variances for the new proposed segments are then constructed from the current single residual variance.Finally, two new vectors of linear parameters are sampled.If a death move is proposed, we have that k p = k c − 1.Hence, a candidate change-point to be removed is selected from the vector of existing change-points, with probability 1 k c .The latter involves merging two existing partitions.The number of frequencies and their values in the proposed segments are selected from the current states.A single residual variance is constructed from the current variances relative to the segments affected by the relocation.Finally, a new vector of linear coefficient is drawn.For both type of moves, these updates are jointly accepted or rejected.

Simulation Study
In this section, we carry out a simulation study to explore the performance of our method, which will be referred to as AutoNOM (Automatic Nonstationary Oscillatory Modelling).We simulate a time series consisting of n = 900 data points from model (1) with k = 2 change-points located at positions s (2) = (300, 650), and fixed number of frequencies per regime m (2) = (3, 1, 2) ( See Supplementary Material, Table 5 for further details of the parametrization).Figure 1 (top panel) shows a realization from this model.The prior means λ ω and λ s , say, on the number of frequencies and change-points, respectively, were set to 2, to reflect a fair degree of prior information on their numbers.We will discuss later our finding that AutoNOM was relatively insensitive to these prior specifications.The maximum number of change-points k max is set to 15, and the maximum number of frequencies per regime m max is set to 10. Furthermore, we fixed ψ s = 20 and φ ω = 0.25 (Appendix A.1.2) for the uniform distribution for sampling the frequencies.The full estimation algorithm is run for 20,000 updates, 5,000 of which are discarded as burn-in period.The estimation took 390 seconds with a (serial) program written in Julia 0.52 on a Intel R Core TM i7-4790S Processor 16 GB RAM.The results, summarized in Table 1 clearly show that a model with two change-points has the highest estimated posterior probability (left panel) and that AutoNOM correctly identifies the right number of significant frequencies in each regime (right panel)..00.00.00.00

Sensitivity Analysis
To investigate the influence of the prior means λ ω and λ s we simulate 10 realizations from the same model and run our estimation algorithm for combinations of values for λ ω and λ s , ranging from 0.1 to 10.0.Table 2 shows the average posterior probability of choosing the correct model, We also investigated the average mean squared error between the true underlying signal and the estimated signal (see Table 6, Supplementary Material).Both analyses suggest that, for this example, the choice of the prior means λ ω and λ s has hardly noticeable impact on the results.However, our experience is that small values for these hyper-parameters are preferable as they prevent the algorithm from overfitting.

Detecting Spectral Peaks
We simulate a time series from the same simulation model as above with the only difference that the residual variances were set equal to one for all segments and thus are smaller than above.The performance of AutoNOM is compared with two existing methods, namely the Bayesian adaptive spectral estimation for nonstationary time series proposed by Rosen et al. (2012), referred to as AdaptSPEC, and the frequentist piecewise vector autoregressive method of Davis et al. (2006), referred to as AutoPARM.Specifically, we explore the performances of these methodologies in  ification of the number of spline basis functions used for the smoothing, where increasing the number of basis function yields a better performance for AdaptSPEC.The example also shows that smoothing by splines may lead to peaks in the periodogram to be over-smoothed and neighbouring close peaks to be merged.AutoPARM seems to also suffer from the latter problem.
When we increase the residual variance to the high levels set originally, AdaptSPEC fails to detect any change-points for both J = 7 and J = 15, with posterior probability π (k = 0 | y) of 69% and 93%, respectively, while AutoPARM finds 7 change-points and thus severely overestimates their number.Our conclusion from this comparison is that although AdaptSPEC and AutoPARM may be well suited for time series processes with smooth time-varying spectra with few or no peaks, both methods are severely challenged in detecting changes in spectra that exhibit pronounced peakedness, possibly at nearby frequencies, as can be expected to occur in reality for the type of time series that we wish to analyze.

Case Studies
The development of our methodology was motivated by the following two case studies where dense physiological signals were observed which exhibit unknown periodicities whose role may change over time in a more or less abrupt manner and where their detection is of relevance to the health and well-being of the subject.

Analysis of Human Skin Temperature
The development of information and communication technologies, in particular widespread internet access and availability of mobile phones and tablets, allows considering new developments in the health care system.To address the issue of personalized medical treatment according to the circadian timing system of the patient, referred to as chronotherapy (Levi & Schibler 2007), a novel and validated non-invasive mobile e-Health platform pioneered by the French project PiCADo (Komarzynski et al. 2018) is used to record and teletransmit skin surface temperature as well as physical activity data (Huang et al. 2018) from an upper chest e-sensor.Figure 3 (a) shows an example of 4 days of 5-minutes temperature recording for a healthy individual.The circadian rhythms in core and skin surface temperature are usually 8-12 hours out of phase, with respective maxima occurring near 16:00 at day time, and near 2:00 at night (Krauchi & Wirz-Justice 1994).The early night drop in core body temperature, which is critical for triggering the onset of sleep (Van Someren 2006), results from the vasodilatation of the skin vessels and associated rise in skin surface temperature (Kräuchi 2002).Under the assumption of stationarity Komarzynski et al. (2018) analyzed the skin temperature time series identifying both strong 12 hours (circahemidian) and 24 hours (circadian) rhythms.Here, we applied our methodology to the skin-temperature time series shown in Figure 3 (a) for 300,000 iterations, discarding the first 100,000 updates as burn-in.The maximum number of change-points k max was set to 10, whereas the maximum number of frequencies per regime m max was set to 5. The estimated number of change-points had a mode at 7, with π ( k = 7 | y) = 0.97 and their estimated posterior distributions are shown in Figure 3 (c).Inspecting them alongside the physical activity data we can see that the change points mainly correspond to the start and end points of the prolonged rest periods at nights showing that skin temperature alternates between day activity and night rest including sleep.Figure 4 shows the estimated posterior distribution of the frequencies for the sleep segments (2, 4, 6, 8) along with the square root of the estimated power of the corresponding frequencies, where the power of each is frequency ω j,l is summarized by the sum of squares of the corresponding linear coefficients, i.e.I (ω j,l ) = β (1) 2 j,l + β (2) 2 j,l (Shumway & Stoffer 2005).
Figure 3 (b) shows the piecewise fitted signal, along with a 95 % credible interval obtained from the 2.5 and 97.5 empirical percentiles of the posterior sample using Equation ( 17).Cycles of approximately 3 hours appear in segments 2, 4 and 6; cycles that range approximately 1-1.5 hour appear in segments 2, 4, 8 and cycles of around 2 hours appear in segments 4 and 6 while some longer periods identified in segments 4, 6, 8 indicate the presence of a trend.
Stages of sleep are characterized by ultradian oscillations between rapid eye movements (REM) and non-rapid eye movements (non-REM).The biological functionality that regulates the alternations between these two types of sleep is not yet much understood (Altevogt et al. 2006).However, several physiological changes that occur over night differ between REM and non-REM phases, such as heart rate, brain activity, muscle tone and body temperature (Berlad et al. 1993, Pace-Schott & Hobson 2002).The body cycles between REM and non-REM sleep stages with an average length that ranges approximately between 70 to 120 minutes, and there are usually four to six of these sleep cycles each night (Carskadon et al. 2005, Shneerson 2009).
Our analysis was able to use skin temperature data alone to detect periods of sleep throughout the day and identify oscillatory behaviour during the night, whose frequencies are compatible with ultradian oscillations between REM and different non-REM sleep stages.Sqrt ( Power ) q q q q 14.8 2.9 2. Sqrt ( Power ) q q q q q 14 4.9 2.2 1. Sqrt ( Power ) q q q 5.4 2.9 1.7 (d) Segment 8 Figure 4: Spectral properties for segments corresponding to night rest.Estimated posterior distribution of the frequencies along with square root of the estimated power of the corresponding frequencies.The results are conditional on the modal number of frequencies per segment.

Characterizing Instances of Sleep Apnea in Rodents
Sleep apnea is the temporary (≥ 2 breaths) interruption of breathing during sleep.Moderate or severe (≥ 15 events per hour) sleep apnea, occurs in about 50 % of men and 25 % of women over the age of 40 (Heinzer et al. 2015), with 91% of people with sleep apnea being undiagnosed (Tan et al. 2016).Sleep apnea is linked to many diseases.Patients with sleep apnea are at increased risk of: cardiovascular events, e.g., stroke and heart attack (Lanfranchi et al. 1999); cancer (Nieto et al. 2012); liver disease, e.g., non-alcoholic fatty liver disease (Sundaram et al. 2016); diabetes (Harsch et al. 2004); metabolic syndrome (Parish et al. 2007); and cognitive decline, e.g., early onset mild cognitive impairment (MCI) and Alzheimer disease (AD) (Osorio et al. 2016), and increased risk of dementia in the elderly (Lal et al. 2012).The motivation of this research is to provide a statistical methodology that can be applied to analyze large breathing data sets resulting from in vivo plethysmograph studies in rats to characterize the occurrence of sleep apnea under different experimental conditions.If this could be attained, a concrete aid to the understanding of the pathological implications of this status could be provided to clinicians and experimental biologists.
An unrestrained whole-body plethysmograph is used to produce a breathing trace from freely behaving rats for periods of up to 3 hours.Plethysmographs were made using an 2L air-tight box connected to a pressure transducer, with an air pump and outlet valve producing a flow rate of 2L/min.Airflow pressure signals were amplified using Neurolog system (Digitimer) connected to a 1401 interface and acquired on a computer using Spike2 software (CED).
Apneas are subclassified as post-sigh apneas, if the preceding breath was at least 25% above the average amplitude of prior breaths, or spontaneous apneas, if there was no manifestation of a previous sigh (Davis & ODonnell 2013).Airflow traces from the plethysmograph are shown in Figure 5 (left panels) and consist of three time series, which will be referred to as  Our procedure allows us to set an upper bound, φ ω , (Appendix A.1.2) for the uniform interval where the new frequencies are sampled.As the periodogram ordinates for these data were approximately zero for all frequencies larger than 0.01, we decided accordingly to set φ ω = 0.01.
The locations of the changes (vertical lines) are displayed in Figure 5 (left panels).The posterior power of the frequencies, for each time series, is shown in Figure 5 (right panels).These results are conditional on the modal number of change-points and the modal number of frequencies per segment.
For each data set, we summarise in Table 4 the spectral properties of each partition by displaying the periodicities corresponding to the first two largest values of the estimated power.When the rat is sniffing, (a), the air flow trace oscillates with a dominant period of approximately 0.2 seconds, namely 5 cycles per second.Normal breathing, (a) and (b), is characterised by lower frequencies and lower magnitude than sniffing, by oscillating with a dominant period of around 0.5 seconds, namely around 2 cycles every second.Apneas, (b) and (c), appear to be characterised by higher frequencies than normal breathing but with a lower power, with dominant periods of around 0.25 and 0.35 seconds.Notice that in the first partition of (c), the highest value of the power corresponds to the frequency responsible for a sigh before apnea.
Table 4: Spectral properties of respiratory traces of a rat.Periodicities (in seconds) corresponding to the first two largest values of the estimated power, for each time series (a), (

Conclusions and Outlook
We have developed a novel Bayesian methodology for analyzing nonstationary time series that exhibit oscillatory behaviour.Our approach is based on the assumption that, conditional on the position and number of change-points, the time series are approximately piecewise stationary.
The timing and number of changes are unknown, along with the number and values of relevant periodicities in each regime.Bayesian inference is performed via a reversible jump MCMC technique that jointly explores the change-points and sub-models for all segments.
We illustrated the utility of our methodology in two case studies.First, we analyzed human skin temperature time series data obtained from a wearable device, which exhibited unknown periodicities that changed over time in an abrupt manner.Our proposed methodology identified interesting oscillations whose frequencies are consistent with ultradian oscillations between REM and non-REM sleep stages.Second, we characterized the occurrence of sleep apnea in large breathing data sets resulting from in vivo plethysmograph studies on rodents.Our spectral investigation was able to distinguish very sharp peaks, corresponding to different nearby frequencies, that are responsible for the different actions of the rodent.
Our approach proposed here is limited to a single-subject analysis and does not take into account the within-subject dependence across multiple measurements or the variability between subjects.Future works could involve extending the methodology to be able to analyze larger studies involving multiple subjects, taking account of the within-subject dependence and the inclusion of covariates, such as sex or age.
values, respectively.According to Equation (10) we carry out with probability δ ω a M-H step with proposal distribution q 1 ( ω p l | ω c l ), where ñ = n/2 and I h is the value of the squared modulus of the Discrete Fourier transform of the observations y evaluated at frequency h/n In this way frequencies are proposed from regions in parameter space with high posterior density, yielding a Markov chain which converges quickly to its invariant distribution.The proposal distribution q 1 ( ω p l | ω c l ) is independent of the current state ω c l .The acceptance probability for this move is , where ω p = (ω c 1 , . . ., ω c l−1 , ω p l , ω c l+1 , . . ., ω c m ) .On the other hand, with probability 1 -δ ω , we perform a random walk M-H step with proposal distribution q 2 ( ω p l | ω c l ), whose density is a univariate Normal distribution with mean ω c l and variance σ 2 ω , i.e. ω p l | ω c l ∼ N( ω c l , σ 2 ω ).This perturbation around the current value ω c l allows a local exploration of the conditional posterior distribution.The acceptance probability for this move is Setting δ ω to a relative low value integrates a fairly high acceptance rate with a quick exploration of the parameter space.For our experiments, we followed Andrieu & Doucet (1999) and set σ 2 ω = (1/5n) 2 and δ ω = 0.2.
Sampling β: Given values of ω and σ 2 , the vector of linear parameters β can be sampled via a M-H step from the target posterior conditional distribution π ( β | ω, σ 2 , m, y) (see Equation ( 8)).The proposed vector of coefficients β p can be drawn from normal approximations to their posterior conditional distribution (Gilks et al. 1995, Rosen et al. 2012), e.g where and .
The proposal for β p is independent of the current values β c , and the acceptance probability for this move is where q ( β c ) and q ( β p ) denote the Normal proposal densities N 2m+2 ( β c , Σ c ) and N 2m+2 ( β p , Σ p ), respectively.

A.1.2 Between-Model Moves
The number of frequencies on a segment is proposed to either increase or decrease by one.Let ) and assume the Markov chain is currently at ( m c , θ c ).We propose a move to ( m p , θ p ) by drawing from a proposal density q ( m p , θ p | m c , θ c ) and accepting this update with probability The proposed state ( m p , θ p ) is drawn by first drawing m p , followed by ω p , β p and σ 2p .In fact, we can rewrite the proposal density as Birth move: If a birth move is proposed, we have that m p = m c +1.The proposed frequency vector ω p is constructed as namely by keeping the current vector of frequencies and proposing an additional frequency ω p m p .Alternatively to Andrieu & Doucet (1999), we choose to sample ω p m p uniformly on the interval (0, φ ω ), where 0 < φ ω < 0.5.The value of φ ω can be chosen to reflect prior information about the significant frequencies that drive the variation in the data, for example by choosing φ ω in the low frequencies range ( 0 < φ ω < 0.1).Additionally, for computational and/or modelling reasons, we would like not to sample frequencies that are too close to each other.Hence, we choose to draw a candidate value ω p m p uniformly from the union of intervals of the form [ω c l + ψ ω , ω c l+1 − ψ ω ], for l = 0, . . ., m c and denoting ω c 0 = 0 and ω c m c +1 = φ ω .Here, ψ ω is a fixed minimum distance between frequencies, which is chosen larger than 1 n ; in fact, when the separation of two frequencies is less than the Nyquist step (Priestley 1981) n , the two frequencies are indistinguishable (Dou & Hodgson 1995).Moreover, we sort the proposed vector of frequencies ω p to ensure identifiability and perform practical estimation, as suggested in Andrieu & Doucet (1999).For proposed ω p and given σ 2c , the proposed vector of linear coefficients β p is sampled following the same procedure of Section A.1.1,namely we draw β p from a normal approximation to their posterior conditional distribution π ( β p | ω p , σ 2c , m p , y ).Finally, the residual variance σ 2p is sampled directly from its posterior conditional distribution π (σ 2p | ω p , β p , m p , y) (see Equation ( 9)).The proposed state (m p , θ p ) is accepted or reject in a Metropolis-Hastings step with probability , where the likelihood function is given in Equation ( 4), π( m ) is the density of the Poisson distribution truncated at m max , b m c and d m p are defined in Equation ( 6), q (ω p m p ) is the density of the uniform proposal for sampling the additional frequency, q (β c ) and q (β p ) are the multivariate Normal proposal densities N 2m c +2 ( β c , Σ c ) and N 2m p +2 ( β p , Σ p ), respectively; q ( σ 2 p ) and q ( σ 2 c ) are the Inverse-Gamma proposal densities defined in Equation ( 9).
Death move: If a death move is proposed, then m p = m c − 1.A vector of frequencies ω p is constructed by randomly selecting with probability 1 m c one of the current frequencies as the candidate frequency for removal.Given ω p and σ 2c , a vector of linear coefficients β p is sampled from a normal approximation to its posterior conditional distribution.Finally, conditioned on ω p and β p , the residual variance σ 2p is drawn its posterior Inverse-Gamma distribution.It is straightforward to see that the acceptance probability for the death step has the same form as the birth step, with the proper change of labelling of the variables, and the ratio terms inverted.
( k p , ξ p (k p ) ) by drawing from a proposal density of the form Birth move: If a birth move is proposed, we have that Hence, the new proposed location sj is sampled from a Uniform { f (s (k c ) , ψ s )}, where the proposal density is given by As the proposed location sj will lie within an existing interval (s c j , s c j+1 ) with probability one, we can define the proposed change-points location vector as s p (k p ) = (s c 1 , . . ., s c j , sj , s c j+1 , . . ., s c k c ) .
The number of frequencies m p j , m p j+1 corresponding to the two newly proposed segments [s c j , sj ) and [ sj , s c j+1 ) are set equal to the current number of frequencies on the whole segment (s c j , s c j+1 ).Therefore, we can construct the proposed vector of the number of frequencies m p where the residual variances σ 2p j , σ 2p j+1 for the split partition are constructed following Green (1995), namely as a perturbation of the current variance σ 2c j .Specifically, we draw u ∼ Uniform (0, 1) and let σ 2p j , σ 2p j+1 be deterministic transformations of σ 2c j , i.e where the vectors β p j , β p j+1 are drawn from normal approximations of their posterior conditional distribution, as in Section 4.1.The proposed move to the state (k p , ξ p (k p ) ) is accepted with probability where the likelihood function is provided in Equation (3), q (s p (k p ) ) is the uniform density defined in Equation (15); q (β c h ), q (β p h ) are the multivariate Normal proposal densities, and the Jacobian J σ 2 is The numerator of the proposal ratio is better understood by looking at the details of the death step, which are given below.
Death Move: If a death step is proposed, then k p = k c − 1.A candidate change-point s c j to be removed is sampled uniformly from the vector of existing change-points; that is, we propose to remove s c j with probability 1 k c .Then, the proposed vector of change-points locations s p (k) is defined as s p (k) = (s c 1 , . . ., s c j−1 , s c j+1 , . . ., s c k c ) .
The number m p j and the vector of relevant frequencies ω p j of the newly merged segment [s c j−1 , s c j+1 ) are selected by drawing an index at random from { j, j + 1}, obtaining say j * , and setting the proposed parameters equal to the current ones relative to the selected index.That is, we set m p j = m c The residual variance σ 2p j of the newly merged segment is obtained by inverting the transformation of Equation ( 16).Specifically, we construct σ 2p j = σ 2c j σ 2c j+1 and set the proposed vector of residual variances σ 2p (k p ) as σ 2p (k p ) = ( σ 2c 1 , . . ., σ 2c j−1 , σ 2p j , σ 2c j+2 , . . ., σ 2c k c +1 ) .
The proposed vector of linear coefficients β p (k p ) is β p (k p ) = ( β c 1 , . . ., β c j−1 , β p j , β c j+2 , . . ., β c k c +1 ) , where the vector of coefficients β p j is drawn from Normal approximation to its posterior conditional distribution.The acceptance probability for the death step has the same form of the birth step, with the proper change of labelling of the variables, and the ratio terms inverted.

2.
Change-point move: This step performs a reversible-jump MCMC algorithm for changepoint model search where the number k and locations of change-points s (k) are sampled, along with the linear coefficients, number of frequencies and their values as well as the residual variances for any segments affected by the move.

Figure 1 (
Figure 1 (middle panel) shows the estimated posterior distribution for the location of the change-points, conditioned on three segments.The posterior means of the change-point locations are Ê (s 1 | k = 2, y) = 298.7 and Ê (s 2 | k = 2, y) = 650.1.Figure 1 (bottom panel) shows that the estimated posterior distributions are an excellent match to the true frequencies.

Figure 1 :
Figure 1: Simulation study.(Top) Simulated time series.(Middle) Estimated posterior distribution for the location of the change-points, conditioned on k = 2.The dotted vertical lines represent true location of change-points.(Bottom) Estimated posterior distribution of the frequencies for each different segment, conditioned on k = 2, m 1 = 3, m 2 = 1 and m 3 = 2.The dotted vertical lines represent true values of the frequencies.

Figure 2 :
Figure 2: Simulation study with unitary residual variances.Estimated frequency peaks for AutoNOM (AN), Adapt-SPEC (AS J = 7, 15) and AutoPARM (AP); 95% credible intervals (horizontal lines) are also reported for Bayesian methods.Dotted vertical lines are true locations of the frequency peaks.

Figure 3 :
Figure 3: Analysis of skin temperature of a healthy subject.Panel (a) are the time series of skin temperature corresponding physical activity.Panel (b) is the estimated signal (solid line) along with its 95% credible interval; vertical lines are the estimated locations of the change-points.Panel(c) is the estimated posterior density histogram of the locations of the changes, conditioned on k = 7 change-points.Rectangles on the time axis of each plot correspond to periods from 20.00 to 8.00.The variation in skin temperature finds analogies with the rest-activity pattern that alternates between day activity and night rest.
(a), (b) and (c).They correspond to different actions for this rat: (a) is characterised by an alternation of sniffing and normal breathing; (b) presents the trace of a spontaneous apnea, followed by normal breathing; (c) shows normal breathing followed by a sigh, and a post-sigh apnea.Each time series contains 20,000 observations where the signal was sampled at 2000 Hz, namely at a rate of 2000 observations per second.

Figure 5 :
Figure 5: Plots of the respiratory traces of a rat (left panels) and corresponding estimated posterior power (right panels).Panel (a) is characterised by an alternation of sniffing and normal breathing.Panel (b) is a plot of the trace of a spontaneous apnea, followed by normal breathing.Panel (c) shows normal breathing followed by a sigh, and a post-sigh apnea.Dotted vertical lines correspond to the estimated locations of the change-points.

Table 1 :
Simulation study.(left panel) posterior probabilities for number of change-points; (right panel) posterior probabilities for number of frequencies in each regime, conditioned on k = 2.

Table 3 :
Simulation study with unitary residual variances.Estimated change-points locations (left panel) and frequency peaks (right panel) for AutoNOM (AN), AdaptSPEC (AS J = 7, 15) and AutoPARM (AP); standard errors are also reported for Bayesian methods.