Sequential testing of a Wiener process with costly observations

ABSTRACT We consider the sequential testing of two simple hypotheses for the drift of a Brownian motion when each observation of the underlying process is associated with a positive cost. In this setting where continuous monitoring of the underlying process is not feasible, the question is not only whether to stop or to continue at a given observation time but also, if continuing, how to distribute the next observation time. Adopting a Bayesian methodology, we show that the value function can be characterized as the unique fixed point of an associated operator and that it can be constructed using an iterative scheme. Moreover, the optimal sequential distribution of observation times can be described in terms of the fixed point.


Introduction
In the hypothesis testing problem for a Wiener process, one seeks to determine the value of the drift of a Wiener process. Solving the problem amounts to determining a decision rule that minimizes the total expected cost, which in a Bayesian formulation of the problem is typically defined as the sum of the cost of a faulty decision and the cost of lengthy observations. Early papers in the area, including Bather (1962), Chernoff (1961, 1965), and Breakwell and Chernoff (1964), study hypothesis testing problems with normal prior distributions of the drift for various loss functions, corresponding to different costs of a faulty decision. In the absence of closed-form solutions of such problems, the main focus in these references is on determining asymptotic properties of the optimal decision rule. Utilizing the connection between optimal stopping problems and free-boundary problems, Shiryaev (1969, 1978) provides an explicit solution of the hypothesis testing problem when the drift can take only two different values. Notable recent contributions include the extension to the finite horizon hypothesis testing problem (Gapeev and Peskir, 2004), the characterization of the solution to the original Chernoff problem in terms of an associated integral equation (Zhitlukhin and Muravlev, 2013), a study of the case with three hypotheses (Zhitlukhin and Shiryaev, 2011), and a study of the case with general prior distributions (Ekström and Vaicenavicius, 2015). Along a related line of research, various authors have extended the problem to include more general underlying processes. For example, a study of testing two hypotheses on the intensity of a Poisson process was provided in Peskir and Shiryaev (2000), hypothesis testing on the intensity and the jump distribution of a compound Poisson process was investigated in Dayanik and Sezer (2006), and results on the testing of two hypotheses for some Lévy processes can be found in Muliere (2013, 2016).
Furthermore, techniques similar to those employed in the statistical literature have been used to study financial problems involving simultaneous learning about the drift and financial optimization. For example, Lakner (1995) studies a classical problem of utility maximization but with incomplete information about the drift of the underlying asset, Décamps et al. (2005) investigate a timing problem for investing in a real option under incomplete information, and Ekström and Vaicenavicius (2016) consider a liquidation problem for general prior distributions of an unknown drift.
In the current article we study a version of the classical sequential hypothesis testing problem for the drift of a Wiener process where, additionally, each observation is associated with a positive cost. With this assumption, continuous observation of the underlying process is impossible, and a strategy thus consists of a decision whether to stop or not, together with a rule specifying how long to wait for the next observation if continuation is preferred. Imposing a positive cost for each observation gives a discrete structure to the sequential hypothesis testing problem, and we hence analyze it using a certain operator closely associated with the discrete structure of the setup. Our main result states that the value function of the problem can be characterized as the unique fixed point of this operator and that the value function can be determined by an iterative procedure involving the operator. In the iterative construction of the value function, each element in the sequence has a natural interpretation as the value function of a problem with only finitely many observation rights. Moreover, we show that the optimal strategy can be described in terms of the value function. As expected, the optimal strategy consists of a decision rule whether to stop or not at a given observation time, together with a rule that specifies when to make the next observation. The distribution of the next observation time is described by a function of the current value of the a posteriori probability process. A numerical study suggests that in the iterative procedure, the sequence of optimal strategies is convergent, but we have not been able to verify this analytically.
The formulation of the problem with fixed observation costs has direct applications in experimental design, where the cost of setting up an experiment is proportional to the number of trials (with coefficient $c$ in the notation below), and the cost of analyzing an experiment ($d$ in the notation below) is independent of the number of trials performed. However, while formulated for the hypothesis testing problem, the general methodology of the current article should be applicable in other optimal stopping problems where each observation is costly. To the best of our knowledge, no such optimal stopping problem has been studied in the literature.
The current article is organized as follows. In Section 2, we formulate the sequential hypothesis testing problem for a Wiener process with costly observations. In Section 3, we introduce a closely associated operator and study its properties. In particular, we show that the value function is characterized as its unique fixed point. Finally, in Section 4, we show that an optimal decision rule can be described in terms of the value function.

Problem formulation
Let $X_t = \mu t + \sigma W_t$ be a stochastic process, where $W$ is a standard Brownian motion, $\sigma \neq 0$ is a known constant, and the drift $\mu$ is an unknown constant. Consider a situation in which one wants to determine $\mu$ from observations of $X$ as accurately as possible and at the same time as quickly as possible. In a Bayesian setting, the uncertainty about the drift is captured by modeling $\mu$ as a random variable with a given prior distribution, and the Bayes risk is defined as the sum of the risk of a large error in the estimate for the drift and the cost of time. In a classical version of the sequential testing problem, the unknown drift can only take values in the set $\{\mu_1, \mu_2\}$, where $\mu_1 \neq \mu_2$ are two given constants, and the Bayes risk associated with a strategy $(\tau, d)$ is specified as
$$R(\tau, d) := a\,\mathbb{P}(d = \mu_1,\ \mu = \mu_2) + b\,\mathbb{P}(d = \mu_2,\ \mu = \mu_1) + c\,\mathbb{E}[\tau].$$
Here $\tau$ is an $\mathbb{F}^X$-stopping time, where $\mathbb{F}^X = \{\mathcal{F}^X_t,\ t \ge 0\}$ is the filtration generated by the process $X$, $d$ is an $\mathcal{F}^X_\tau$-measurable random variable with values in $\{\mu_1, \mu_2\}$, $a > 0$ and $b > 0$ are the costs for the two possible kinds of faulty decisions, and $c > 0$ is the observation cost per unit of time.
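Introducing the a posteriori probability process
$$\Pi_t := \mathbb{P}(\mu = \mu_2 \mid \mathcal{F}^X_t)$$
and following standard lines of argument gives that the minimal Bayes risk is given by
$$U(\pi) := \inf_{\tau} \mathbb{E}_\pi\bigl[g(\Pi_\tau) + c\tau\bigr], \tag{2.2}$$
where $g(\pi) := a\pi \wedge b(1-\pi)$ and $\mathbb{E}_\pi$ denotes expectation when $\Pi_0 = \pi$. It is well known that the a posteriori probability process satisfies
$$d\Pi_t = \omega\,\Pi_t(1-\Pi_t)\,d\hat{W}_t,$$
where $\omega = (\mu_2 - \mu_1)/\sigma$ denotes the signal-to-noise ratio and the innovation process
$$\hat{W}_t := \frac{1}{\sigma}\Bigl(X_t - \mu_1 t - (\mu_2 - \mu_1)\int_0^t \Pi_s\,ds\Bigr)$$
is a standard Brownian motion. Moreover, $\Pi$ is a (time-homogeneous) strong Markov process with respect to its natural filtration, which coincides with $\{\mathcal{F}^X_t,\ t \ge 0\}$. It is well known that the function $U$ defined in (2.2) can be determined as the solution of an associated free-boundary problem; see, for example, Shiryaev (1969, 1978).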
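As a small illustration of the a posteriori probability, the following Python sketch evaluates the posterior in closed form from a single observed value of $X_t$. It relies on the standard likelihood-ratio representation for Gaussian observations and assumes the convention that the posterior is the conditional probability of $\mu = \mu_2$, which is consistent with the form of $g$; the function name `posterior` and its argument names are hypothetical and chosen only for this example.

```python
import math


def posterior(x_t, t, pi0, mu1, mu2, sigma):
    """Posterior probability of {mu = mu2} given X_t = x_t.

    pi0 is the prior probability of {mu = mu2} and must lie in (0, 1).
    """
    # Log-likelihood ratio of drift mu2 against drift mu1 given X_t = x_t.
    log_lr = (mu2 - mu1) / sigma**2 * (x_t - 0.5 * (mu1 + mu2) * t)
    # Bayes' rule on the log-odds scale.
    log_odds = math.log(pi0 / (1.0 - pi0)) + log_lr
    return 1.0 / (1.0 + math.exp(-log_odds))


if __name__ == "__main__":
    # Illustrative parameter values only.
    print(posterior(x_t=2.0, t=1.0, pi0=0.5, mu1=-1.0, mu2=1.0, sigma=1.0))
```

Because $X_t$ is a sufficient statistic for $\mu$, the posterior at an observation time depends on the path of $X$ only through the pair $(t, X_t)$, which is what makes discrete observations tractable in this setting.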
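We consider a similar hypothesis testing problem but with the added constraint that each observation is associated with a fixed cost. To formulate the problem, let $\hat\tau = \{\tau_k\}_{k=0}^\infty$ be an increasing sequence of random times with $\tau_0 = 0$, and let $\mathbb{F}^{\hat\tau} = \{\mathcal{F}^{\hat\tau}_t,\ t \ge 0\}$ denote the filtration generated by the observations of $X$ at the times in $\hat\tau$, that is,
$$\mathcal{F}^{\hat\tau}_t := \sigma\bigl(X_{\tau_1}, \dots, X_{\tau_k}, \tau_1, \dots, \tau_k;\ k = \sup\{j : \tau_j \le t\}\bigr).$$
We only consider sequences $\hat\tau = \{\tau_k\}_{k=0}^\infty$ such that each $\tau_k$ is a predictable $\mathbb{F}^{\hat\tau}$-stopping time. Note that, due to the discrete structure, $\tau_k$ is a predictable $\mathbb{F}^{\hat\tau}$-stopping time precisely if $\tau_k$ is measurable with respect to $\sigma(X_{\tau_1}, \dots, X_{\tau_{k-1}}, \tau_1, \dots, \tau_{k-1})$. A pair $(\hat\tau, \tau)$, where $\hat\tau = \{\tau_k\}_{k=0}^\infty$ is as described above and $\tau$ is an $\mathbb{F}^{\hat\tau}$-stopping time with $\tau(\omega) \in \{\tau_0(\omega), \tau_1(\omega), \tau_2(\omega), \dots\}$ a.s., is called an admissible strategy, and the set of admissible strategies is denoted $\mathcal{T}$.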
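Define the value function of the sequential hypothesis testing problem with costly observations to be
$$V(\pi) := \inf_{(\hat\tau, \tau) \in \mathcal{T}} \mathbb{E}_\pi\Bigl[g(\Pi_\tau) + c\tau + d\sum_{k=1}^\infty \mathbf{1}_{\{\tau_k \le \tau\}}\Bigr]. \tag{2.3}$$
Here the constant $d > 0$ represents the cost of each observation.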
Remark 2.1. Note that $U \le V \le g$ is immediate from the definition. Also note that an implicit consequence of the definition of $\mathcal{T}$ is that stopping is only allowed at observation times. This is without loss of generality, since stopping between observation times would necessarily be suboptimal as no more information is obtained in such intervals.
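The two inequalities in Remark 2.1 can be spelled out as follows. This is only a sketch: it assumes that the total cost of an admissible strategy is the terminal cost plus the running cost plus $d$ times the number of observations made after time $0$ and up to the stopping time, and the symbol $N_\tau$ for that number is introduced here purely for illustration.

```latex
% Assumption: N_\tau := \#\{k \ge 1 : \tau_k \le \tau\} counts the paid observations.
\begin{align*}
V(\pi) &\le \mathbb{E}_\pi\bigl[g(\Pi_0)\bigr] = g(\pi)
  && \text{(stop at } \tau = \tau_0 = 0 \text{; no observation is paid for)},\\
\mathbb{E}_\pi\bigl[g(\Pi_\tau) + c\tau + d N_\tau\bigr]
  &\ge \mathbb{E}_\pi\bigl[g(\Pi_\tau) + c\tau\bigr] \ge U(\pi)
  && \text{(} d N_\tau \ge 0 \text{, and } \tau \text{ is an } \mathbb{F}^X\text{-stopping time)}.
\end{align*}
```

Taking the infimum over all admissible strategies in the second line gives $U \le V$.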

Analysis of the value function
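In this section, we introduce an operator that is closely associated with the sequential hypothesis testing problem (2.3), and we study its properties. Let
$$\mathbb{F} := \{f : [0,1] \to [0,\infty) : f \text{ is concave and } U \le f \le g\},$$
where $U$ is the value function of the classical hypothesis testing problem defined in (2.2) above. Consider the operator $\mathcal{J}$ defined by
$$(\mathcal{J} f)(\pi) := \min\Bigl\{g(\pi),\ d + \inf_{t \ge 0}\bigl(ct + \mathbb{E}_\pi[f(\Pi_t)]\bigr)\Bigr\}$$
for any given function $f \in \mathbb{F}$.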
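In view of Lemma 3.1, we define the function $t(\cdot\,; f) : [0,1] \to [0,\infty)$ by
$$t(\pi; f) := \inf\Bigl\{t \ge 0 : ct + \mathbb{E}_\pi[f(\Pi_t)] = \inf_{s \ge 0}\bigl(cs + \mathbb{E}_\pi[f(\Pi_s)]\bigr)\Bigr\} \tag{3.1}$$
for $\pi \in [0,1]$. In other words, $t(\pi; f)$ is the first time at which the function $s \mapsto cs + \mathbb{E}_\pi[f(\Pi_s)]$ attains its minimum.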
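The two ingredients above, the expected value of a function of the a posteriori probability at a future time and the first minimizer of the cost of waiting, can be approximated numerically. The Python sketch below is one possible approach and not the method of the article: it assumes that, conditionally on each hypothesis, the log-odds of the posterior at time $t$ are Gaussian with mean shifted by $\pm\omega^2 t/2$ and variance $\omega^2 t$, uses Gauss-Hermite quadrature, and replaces the infimum over $t \ge 0$ by a minimum over a user-supplied grid. The names `expected_f` and `first_minimiser` are hypothetical.

```python
import numpy as np

# Gauss-Hermite nodes and weights for the weight exp(-z^2 / 2),
# normalised so that the weights integrate against the N(0, 1) density.
Z, W = np.polynomial.hermite_e.hermegauss(64)
W = W / np.sqrt(2.0 * np.pi)


def expected_f(f, pi, t, omega):
    """Approximate E_pi[f(Pi_t)] for pi in (0, 1) and t >= 0.

    f must accept a NumPy array of probabilities and return an array.
    """
    ell = np.log(pi / (1.0 - pi))      # prior log-odds
    s = omega * np.sqrt(t)             # std. dev. of the log-odds increment
    # Posterior under mu = mu2 (log-odds drift +s^2/2) and under mu = mu1 (-s^2/2).
    up = 1.0 / (1.0 + np.exp(-(ell + s * Z + 0.5 * s**2)))
    down = 1.0 / (1.0 + np.exp(-(ell + s * Z - 0.5 * s**2)))
    return pi * np.dot(W, f(up)) + (1.0 - pi) * np.dot(W, f(down))


def first_minimiser(f, pi, omega, c, t_grid):
    """Grid approximation of t(pi; f) and of min_t (c t + E_pi[f(Pi_t)])."""
    costs = np.array([c * t + expected_f(f, pi, t, omega) for t in t_grid])
    k = int(np.argmin(costs))          # argmin returns the first minimising index
    return t_grid[k], costs[k]


if __name__ == "__main__":
    g = lambda p: np.minimum(p, 1.0 - p)   # g with a = b = 1 (illustrative)
    print(first_minimiser(g, 0.5, omega=1.0, c=0.01, t_grid=np.linspace(0.0, 20.0, 401)))
```

Since `np.argmin` returns the first index at which the minimum is attained, the grid search mirrors the "first time" convention in (3.1), up to the resolution and range of `t_grid`.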
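Proof. For $\pi \in [0,1]$ and $f_1, f_2 \in \mathbb{F}$ with $f_1 \le f_2$, we have that
$$\mathcal{J} f_1(\pi) = \min\Bigl\{g(\pi),\ d + \inf_{t \ge 0}\bigl(ct + \mathbb{E}_\pi[f_1(\Pi_t)]\bigr)\Bigr\} \le \min\Bigl\{g(\pi),\ d + \inf_{t \ge 0}\bigl(ct + \mathbb{E}_\pi[f_2(\Pi_t)]\bigr)\Bigr\} = \mathcal{J} f_2(\pi),$$
which proves (a). For (b), note that by definition, $\mathcal{J} f(\pi) \le g(\pi)$. Moreover, for a fixed $t$, the function $\pi \mapsto d + ct + \mathbb{E}_\pi[f(\Pi_t)]$ is concave (for results on preservation of convexity for martingale diffusions, see, for example, Hobson (1998) or Janson and Tysk (2003)), so $\mathcal{J} f$ is also concave since it is the pointwise minimum of concave functions. It remains to check that $U \le \mathcal{J} f$. For this, note that $U \le f$, so
$$\mathcal{J} U \le \mathcal{J} f \tag{3.2}$$
by (a). Moreover, by standard results in optimal stopping theory, we know that the process $ct + U(\Pi_t)$ is a submartingale, so $U(\pi) \le ct + \mathbb{E}_\pi[U(\Pi_t)]$ for any $t \ge 0$. Therefore,
$$\mathcal{J} U(\pi) = \min\Bigl\{g(\pi),\ d + \inf_{t \ge 0}\bigl(ct + \mathbb{E}_\pi[U(\Pi_t)]\bigr)\Bigr\} \ge \min\{g(\pi),\ d + U(\pi)\} \ge U(\pi),$$
which together with (3.2) gives (b).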
Define the sequence $\{f_n\}_{n=0}^\infty$ recursively by $f_0 = g$ and $f_{n+1} = \mathcal{J} f_n$, $n \ge 0$.
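By Proposition 3.2, the sequence $\{f_n\}$ is decreasing in $n$ and thus its limit $f_\infty := \lim_{n \to \infty} f_n$ exists. Since the pointwise limit of a sequence of concave functions is concave, and since $U \le f_n \le g$ for all $n$, we have that $f_\infty \in \mathbb{F}$.
Lemma 3.3. The function $f_\infty$ is the largest fixed point in $\mathbb{F}$ of the operator $\mathcal{J}$.
Proof. Since $f_n \ge f_\infty$ for all $n$, Proposition 3.2 (a) gives $f_{n+1} = \mathcal{J} f_n \ge \mathcal{J} f_\infty$, and letting $n \to \infty$ yields
$$f_\infty \ge \mathcal{J} f_\infty. \tag{3.3}$$
For the opposite inequality, fix $\pi \in [0,1]$ and let $t_\infty = t(\pi; f_\infty)$, where $t(\pi; f_\infty)$ is defined as in (3.1). Then
$$f_{n+1}(\pi) = \mathcal{J} f_n(\pi) \le \min\bigl\{g(\pi),\ d + ct_\infty + \mathbb{E}_\pi[f_n(\Pi_{t_\infty})]\bigr\},$$
so letting $n \to \infty$ yields
$$f_\infty(\pi) \le \min\bigl\{g(\pi),\ d + ct_\infty + \mathbb{E}_\pi[f_\infty(\Pi_{t_\infty})]\bigr\} = \mathcal{J} f_\infty(\pi)$$
by monotone convergence. Together with (3.3), this shows that $f_\infty$ is a fixed point. Finally, assume that $h \in \mathbb{F}$ is another fixed point of $\mathcal{J}$. Then $f_0 = g \ge h$, and using (a) in Proposition 3.2, an easy induction argument shows that $f_n \ge h$ for all $n$. Consequently, $f_\infty \ge h$, which finishes the proof.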
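The iterative scheme starting from $g$ lends itself to a simple numerical experiment. The following Python sketch is a rough discretization under stated assumptions, not the procedure used for the figures in the article: functions on $[0,1]$ are represented by their values on a uniform grid and extended by linear interpolation, expectations are computed by Gauss-Hermite quadrature using the conditional Gaussian law of the posterior log-odds, and the infimum over waiting times is replaced by a minimum over a finite grid. The names `apply_J` and `value_iteration`, and all default parameter values, are hypothetical.

```python
import numpy as np

# Gauss-Hermite nodes/weights for exp(-z^2 / 2), normalised to the N(0, 1) density.
Z, W = np.polynomial.hermite_e.hermegauss(32)
W = W / np.sqrt(2.0 * np.pi)


def apply_J(f_vals, grid, t_grid, g_vals, omega, c, d):
    """One application of the discretised operator to grid values f_vals."""
    out = np.zeros_like(f_vals)        # endpoints stay 0 because g(0) = g(1) = 0
    inner = grid[1:-1]
    ell = np.log(inner / (1.0 - inner))[:, None]   # prior log-odds, shape (m, 1)
    best = np.full(inner.shape, np.inf)
    for t in t_grid:
        s = omega * np.sqrt(t)
        # Posterior at time t under mu = mu2 (up) and under mu = mu1 (down).
        up = 1.0 / (1.0 + np.exp(-(ell + s * Z + 0.5 * s**2)))
        down = 1.0 / (1.0 + np.exp(-(ell + s * Z - 0.5 * s**2)))
        e_f = (inner * (np.interp(up, grid, f_vals) @ W)
               + (1.0 - inner) * (np.interp(down, grid, f_vals) @ W))
        best = np.minimum(best, c * t + e_f)       # running minimum over waiting times
    out[1:-1] = np.minimum(g_vals[1:-1], d + best)
    return out


def value_iteration(a=1.0, b=1.0, c=0.05, d=0.01, omega=1.0, n_iter=20):
    """Return the grid and the list of iterates f_0 = g, f_1, ..., f_{n_iter}."""
    grid = np.linspace(0.0, 1.0, 101)
    t_grid = np.linspace(0.0, 10.0, 51)            # truncated range of waiting times
    g_vals = np.minimum(a * grid, b * (1.0 - grid))
    f = g_vals.copy()
    iterates = [f]
    for _ in range(n_iter):
        f = apply_J(f, grid, t_grid, g_vals, omega, c, d)
        iterates.append(f)
    return grid, iterates


if __name__ == "__main__":
    grid, iterates = value_iteration()
    print(np.max(np.abs(iterates[-1] - iterates[-2])))   # size of the last update
```

The discretized operator inherits monotonicity from linear interpolation and the positive quadrature weights, so the computed iterates should decrease just as the exact sequence does; the truncation of the waiting-time grid and the interpolation error are approximations that would need to be checked in any serious use.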
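Define the function $V_n : [0,1] \to [0,\infty)$ by
$$V_n(\pi) := \inf_{(\hat\tau, \tau) \in \mathcal{T},\ \tau \le \tau_n} \mathbb{E}_\pi\Bigl[g(\Pi_\tau) + c\tau + d\sum_{k=1}^\infty \mathbf{1}_{\{\tau_k \le \tau\}}\Bigr], \tag{3.4}$$
and note that $V_n$ then is the value function of a version of our hypothesis testing problem where the underlying process may be observed at most $n$ times.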
Theorem 3.1. We have $V_n = f_n$, $n \ge 0$.
Theorem 3.2. The value function $V$ satisfies $V = f_\infty$. Consequently, $V$ is the largest fixed point in $\mathbb{F}$ of the operator $\mathcal{J}$.
Proof. In view of Lemma 3.3 and Theorem 3.1, it suffices to prove that $\lim_{n \to \infty} V_n(\pi) = V(\pi)$.
Remark 3.1. For a graphical illustration of the convergence of the sequence $\{V_n\}_{n=0}^\infty$, see Figure 1. We point out that while it is well known that the value function $U$ from the classical sequential hypothesis testing problem with continuous observations satisfies the smooth-fit condition at the boundary points of the continuation region, there is no reason to expect smooth fit for the value functions $V_n$ or $V$. In fact, Figure 1 suggests that smooth fit fails in the case of discrete observation costs.
Remark 3.2. It follows from Theorem 3.2 that the value function $V$ is decreasing in the signal-to-noise ratio $\omega = (\mu_2 - \mu_1)/\sigma$. Indeed, for given signal-to-noise ratios $\omega$ and $\tilde\omega$ satisfying $\omega \le \tilde\omega$, denote by $V$, $\tilde V$, $V_n$, and $\tilde V_n$ the corresponding value functions.
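Clearly, $V_0 = g = \tilde V_0$. Moreover, if $V_n \ge \tilde V_n$ for some $n \ge 0$, then by general monotonicity results with respect to the diffusion coefficient (see Hobson (1998) and Janson and Tysk (2003)) one has
$$V_{n+1} = \mathcal{J} V_n \ge \mathcal{J} \tilde V_n \ge \tilde{\mathcal{J}} \tilde V_n = \tilde V_{n+1},$$
where $\mathcal{J}$ and $\tilde{\mathcal{J}}$ are the corresponding operators. By induction, it follows that $V_n \ge \tilde V_n$ for all $n \ge 0$, so
$$V = \lim_{n \to \infty} V_n \ge \lim_{n \to \infty} \tilde V_n = \tilde V.$$
One can show that the operator $\mathcal{J}$ fails to be a contraction on $\mathbb{F}$ (equipped with the sup-norm), so we cannot use the Banach fixed-point theorem to establish uniqueness of fixed points or to deduce convergence rates for the convergence $V_n \to V$. Instead, we end this section by showing that $V$ is the unique fixed point using a more direct method.
Theorem 3.3. The value function $V$ is the unique fixed point in $\mathbb{F}$ of the operator $\mathcal{J}$.
Proof. Define a second sequence $\{\tilde f_n\}_{n=0}^\infty$ in $\mathbb{F}$ recursively by $\tilde f_0 = U$ and $\tilde f_{n+1} = \mathcal{J} \tilde f_n$, $n \ge 0$.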
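By Proposition 3.2 (b), $\tilde f_1 \ge U = \tilde f_0$, so an induction argument using Proposition 3.2 (a) shows that $\tilde f_{n+1} \ge \tilde f_n$ for all $n \ge 0$. Also, let $\tilde V_n$ denote the value of the problem in which the underlying process may be observed at most $n$ times and in which, if no stopping has occurred before, one receives the function $U$ at the $n$th observation time. Using similar arguments as in the proofs of Theorems 3.1-3.2 above, we find that $\tilde f_n = \tilde V_n$ and $\lim_{n \to \infty} \tilde V_n(\pi) = V(\pi)$. Now let $h \in \mathbb{F}$ be a fixed point of $\mathcal{J}$. Then $h \ge U = \tilde f_0$, and induction using Proposition 3.2 (a) gives $h \ge \tilde f_n$ for all $n \ge 0$, so $h \ge V$. Since $V$ is the largest fixed point by Theorem 3.2, it follows that $h = V$.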
Remark 3.3. It follows from the analysis above that even though $\mathcal{J}$ is not a contraction, the sequence $\{f_n\}_{n=0}^\infty$ defined by $f_0 = f$ and $f_{n+1} = \mathcal{J} f_n$, $n \ge 0$, converges to $V$ for any starting point $f \in \mathbb{F}$.

The optimal strategy
In Section 3, we characterized the value function $V$ as the unique fixed point of the operator $\mathcal{J}$ (Theorem 3.3). Moreover, this fixed point can be determined using an iterative procedure; see Theorem 3.2. Given the value function $V$, there is a natural way to define a corresponding strategy. In the current section, we show that this strategy is indeed optimal.
By definition, $V \le \hat V$, where $\hat V$ denotes the expected total cost of this strategy.
Remark 4.1. Given $n \ge 0$, consider the strategy defined recursively by $\tau^n_0 = 0$ and
$$\tau^n_{k+1} = \tau^n_k + t(\Pi_{\tau^n_k}; V_{n-k-1}) \quad \text{for } k = 0, \dots, n^* - 1,$$
where $n^* = \min\{k : V_{n-k}(\Pi_{\tau^n_k}) = g(\Pi_{\tau^n_k})\}$, and $\tau^n_k = \infty$ for $k \ge n^* + 1$, and let $\tau^n = \tau^n_{n^*}$. Employing similar methods as the ones used in the proof of Theorem 4.1 shows that the strategy $(\hat\tau^n, \tau^n)$, where $\hat\tau^n = \{\tau^n_k\}_{k=0}^\infty$, is optimal for the problem $V_n$ defined in (3.4). Since $V_n \to V$ by Theorem 3.2, it seems reasonable to expect that $t(\cdot\,; V_n)$ would converge to $t(\cdot\,; V)$ and thus that the optimal strategy $(\hat\tau^n, \tau^n)$ would tend to $(\hat\tau^*, \tau^*)$. While numerical evidence supports this (compare Figure 2), we have not been able to confirm it analytically.