A second-order dynamical system with Hessian-driven damping and penalty term associated to variational inequalities

Abstract We consider the minimization of a convex objective function subject to the set of minima of another convex function, under the assumption that both functions are twice continuously differentiable. We approach this optimization problem from a continuous perspective by means of a second-order dynamical system with Hessian-driven damping and a penalty term corresponding to the constrained function. By constructing appropriate energy functionals, we prove weak convergence of the trajectories generated by this differential equation to a minimizer of the optimization problem as well as convergence for the objective function values along the trajectories. The performed investigations rely on Lyapunov analysis in combination with the continuous version of the Opial Lemma. In case the objective function is strongly convex, we can even show strong convergence of the trajectories.


Introduction
The Newton-like dynamical system

$$\begin{cases} \ddot{x}(t) + \gamma \dot{x}(t) + \lambda \nabla^2 \Phi(x(t))(\dot{x}(t)) + \nabla \Phi(x(t)) = 0 \\ x(0) = u_0, \quad \dot{x}(0) = v_0, \end{cases} \tag{1}$$

has been investigated by Alvarez, Attouch, Bolte and Redont in [5] in connection with the optimization problem

$$\inf_{x \in H} \Phi(x). \tag{2}$$

Here, H is a real Hilbert space endowed with inner product ⟨·, ·⟩ and associated norm ‖·‖ = √⟨·, ·⟩, u_0, v_0 ∈ H are the initial data, λ, γ > 0, while ∇Φ and ∇²Φ denote the gradient and Hessian of the function Φ : H → ℝ, respectively. We speak here about a second order system in time (through the presence of the acceleration term ẍ(t), which is associated to inertial effects) and in space (through ∇²Φ(x(t))). One can also notice the presence of the geometric damping that acts on the velocity through the Hessian of the function Φ.
As underlined in [5], the dynamical system (1) can be seen as a mixture of the continuous Newton method

$$\nabla^2 \Phi(x(t))(\dot{x}(t)) + \nabla \Phi(x(t)) = 0, \tag{3}$$

investigated by Alvarez and Pérez in [6], with the heavy ball with friction system

$$\ddot{x}(t) + \gamma \dot{x}(t) + \nabla \Phi(x(t)) = 0, \tag{4}$$

studied for the first time in Polyak [35] and Antipin [7]. Due to this remarkable fact, the dynamical system (1) possesses most of the advantages of the systems (3) and (4). We refer the reader to [5, 6, 9, 18-20] for more insights on Newton-type dynamics and their motivations coming from mechanics and control theory. The aim of this paper is to associate a second order Newton-type dynamical system to the optimization problem

$$\inf_{x \in \operatorname{argmin} \Psi} \Phi(x), \tag{5}$$

where Φ, Ψ : H → ℝ are convex and twice differentiable functions, and to investigate its asymptotic properties.
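Let us notice that, due to the first order optimality conditions, solving (5) can be formulated as a variational inequality of the form

$$\text{find } x \in \operatorname{argmin} \Psi \text{ such that } \langle \nabla \Phi(x), y - x \rangle \geq 0 \quad \forall y \in \operatorname{argmin} \Psi, \tag{6}$$

where argmin Ψ denotes the set of minimizers of Ψ over H. Attouch and Czarnecki have assigned in [12] to (5) the nonautonomous first order dynamical system

$$\dot{x}(t) + \nabla \Phi(x(t)) + \beta(t) \nabla \Psi(x(t)) = 0, \tag{7}$$

where β : [0, +∞) → (0, +∞) is a function of time assumed to tend to +∞ as t → +∞, which penalizes the constraint function. Several convergence results of the trajectories generated by (7) to the solution set of (5) have been reported in [12] under the key assumption

$$\forall p \in \operatorname{ran} N_{\operatorname{argmin} \Psi}: \quad \int_0^{+\infty} \beta(t) \left[ \Psi^*\left( \frac{p}{\beta(t)} \right) - \sigma_{\operatorname{argmin} \Psi}\left( \frac{p}{\beta(t)} \right) \right] dt < +\infty, \tag{8}$$

where Ψ* : H → ℝ ∪ {+∞} is the Fenchel-Legendre transformation of Ψ:

$$\Psi^*(p) = \sup_{x \in H} \{ \langle p, x \rangle - \Psi(x) \} \quad \forall p \in H;$$

σ_{argmin Ψ} : H → ℝ ∪ {+∞} is the support function of the set argmin Ψ:

$$\sigma_{\operatorname{argmin} \Psi}(p) = \sup_{x \in \operatorname{argmin} \Psi} \langle p, x \rangle \quad \forall p \in H;$$

and N_{argmin Ψ} is the normal cone to the set argmin Ψ, defined by

$$N_{\operatorname{argmin} \Psi}(x) = \{ p \in H : \langle p, y - x \rangle \leq 0 \ \forall y \in \operatorname{argmin} \Psi \}$$

for x ∈ argmin Ψ and N_{argmin Ψ}(x) = ∅ otherwise. We present a situation where the above condition (8) is fulfilled. According to [12], if we take

$$\Psi(x) = \frac{1}{2} \inf_{y \in C} \| x - y \|^2$$

for a nonempty, convex and closed set C ⊆ H, then the condition (8) is fulfilled if and only if

$$\int_0^{+\infty} \frac{1}{\beta(t)} \, dt < +\infty.$$

The paper of Attouch and Czarnecki [12] was the starting point of a considerable number of research articles devoted to this subject, including those addressing generalizations to variational inequalities formulated with maximal monotone operators (see [10, 12, 14, 15, 17, 22-24, 27-29, 33, 34]). We refer also to the above-listed references for more general formulations of the key assumption (8) and for further examples for which these conditions are satisfied.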
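The equivalence in the distance-function example can be checked by a short conjugate computation. The sketch below assumes that the key assumption (8) has the integral form used by Attouch and Czarnecki, namely that t ↦ β(t)[Ψ*(p/β(t)) − σ_C(p/β(t))] is integrable on [0, +∞) for every p in the range of N_C, with C = argmin Ψ; the symbols δ_C (indicator function of C) and □ (infimal convolution) are introduced here for illustration only.

```latex
\begin{align*}
\Psi &= \tfrac{1}{2}\|\cdot\|^2 \,\square\, \delta_C
  && \text{(half squared distance to } C \text{ as an infimal convolution)} \\
\Psi^*(p) &= \tfrac{1}{2}\|p\|^2 + \sigma_C(p)
  && \text{(the conjugate of an infimal convolution is the sum of the conjugates)} \\
\beta(t)\left[\Psi^*\!\left(\tfrac{p}{\beta(t)}\right) - \sigma_C\!\left(\tfrac{p}{\beta(t)}\right)\right]
  &= \beta(t) \cdot \frac{\|p\|^2}{2\beta(t)^2} = \frac{\|p\|^2}{2\beta(t)}
\end{align*}
```

For any p ≠ 0 the last expression is integrable on [0, +∞) exactly when 1/β is. For instance, β(t) = (1 + t)² would satisfy the condition, while β(t) = 1 + t would not.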
In [24] we approached the optimization problem (5) through the second order nonautonomous dynamical system

$$\ddot{x}(t) + \gamma \dot{x}(t) + \nabla \Phi(x(t)) + \beta(t) \nabla \Psi(x(t)) = 0, \tag{9}$$

where γ > 0 and β : [0, +∞) → (0, +∞) is a function of time. Under the assumption that β tends to +∞ as t → +∞, we proved weak convergence of the generated trajectories to a minimizer of (5) as well as convergence for the objective function values along the trajectories. We refer to [13] for another variant of this system, where the objective function is penalized instead of the constraint function.
The aim of this paper is to combine Newton-like dynamics with systems of the form (9) in order to approach the optimization problem (5) from a continuous perspective. To this end, we investigate the asymptotic behavior of the dynamical system

$$\ddot{x}(t) + \gamma \dot{x}(t) + \lambda \nabla^2 \Phi(x(t))(\dot{x}(t)) + \lambda \beta(t) \nabla^2 \Psi(x(t))(\dot{x}(t)) + \nabla \Phi(x(t)) + (\beta(t) + \lambda \dot{\beta}(t)) \nabla \Psi(x(t)) = 0. \tag{10}$$

Condition (8) will again be crucial in the analysis performed. By using Lyapunov analysis in combination with the continuous version of the Opial Lemma, we prove weak convergence of the trajectories to a minimizer of the optimization problem (5) as well as convergence for the objective function values along the trajectories. In case the objective function is strongly convex, we can even show strong convergence of the trajectories.
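To get a feeling for the behavior of such a system, one can integrate it numerically on a toy problem. The following is a minimal sketch, not the authors' method. It assumes that the penalty term in the system reads (β(t) + λβ̇(t))∇Ψ(x(t)), which is what the chain rule gives when the Hessian terms are viewed as λ d/dt[∇Φ(x(t)) + β(t)∇Ψ(x(t))]. The functions Φ(x) = ½‖x − a‖² and Ψ(x) = ½x₂² on ℝ², the choice β(t) = (1 + t)², the parameters γ = λ = 1, and the classical Runge-Kutta integrator with its step size are all illustrative assumptions. With them, argmin Ψ is the horizontal axis and the minimizer of Φ over it is (a₁, 0).

```python
# Toy simulation in R^2 of the second order system with Hessian-driven damping
# and penalty term. Every concrete choice below is an illustrative assumption.

A = (2.0, 1.0)          # Phi(x) = 0.5 * ||x - A||^2
GAMMA, LAM = 1.0, 1.0   # viscous damping gamma and Hessian damping weight lambda


def beta(t):
    return (1.0 + t) ** 2          # penalty parameter; 1/beta is integrable


def beta_dot(t):
    return 2.0 * (1.0 + t)         # derivative of beta


def grad_phi(x):
    return (x[0] - A[0], x[1] - A[1])


def hess_phi_times(v):
    return v                       # the Hessian of Phi is the identity


def grad_psi(x):
    return (0.0, x[1])             # Psi(x) = 0.5 * x[1]^2, argmin Psi = {x[1] = 0}


def hess_psi_times(v):
    return (0.0, v[1])


def rhs(t, s):
    """First order form: s = (x1, x2, v1, v2) with v the velocity."""
    x, v = s[:2], s[2:]
    gp, gs = grad_phi(x), grad_psi(x)
    hp, hs = hess_phi_times(v), hess_psi_times(v)
    b, bd = beta(t), beta_dot(t)
    acc = tuple(
        -GAMMA * v[i] - LAM * hp[i] - LAM * b * hs[i]
        - gp[i] - (b + LAM * bd) * gs[i]
        for i in range(2)
    )
    return (v[0], v[1], acc[0], acc[1])


def rk4_step(t, s, h):
    """One step of the classical fourth order Runge-Kutta scheme."""
    k1 = rhs(t, s)
    k2 = rhs(t + h / 2, tuple(s[i] + h / 2 * k1[i] for i in range(4)))
    k3 = rhs(t + h / 2, tuple(s[i] + h / 2 * k2[i] for i in range(4)))
    k4 = rhs(t + h, tuple(s[i] + h * k3[i] for i in range(4)))
    return tuple(
        s[i] + h / 6 * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i]) for i in range(4)
    )


def simulate(x0, v0, t_end=20.0, h=0.002):
    """Integrate from t = 0 to t_end and return the final position."""
    s = (x0[0], x0[1], v0[0], v0[1])
    for k in range(int(round(t_end / h))):
        s = rk4_step(k * h, s, h)
    return s[:2]


x_final = simulate((0.0, 3.0), (0.0, 0.0))
print(x_final)  # expected to lie close to the constrained minimizer (2, 0)
```

The step size is kept small because the damping coefficient λβ(t) grows with t and makes the system increasingly stiff; an implicit integrator would be the more robust choice over long horizons.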

Preliminaries
In this section we will introduce preliminary notions and results that will be useful throughout the paper.
The following statement can be interpreted as the continuous counterpart of the convergence result of quasi-Fejér monotone sequences. For its proof we refer the reader to [1, Lemma 5.1].
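We will focus our investigations on the following second order dynamical system

$$\begin{cases} \ddot{x}(t) + \gamma \dot{x}(t) + \lambda \nabla^2 \Phi(x(t))(\dot{x}(t)) + \lambda \beta(t) \nabla^2 \Psi(x(t))(\dot{x}(t)) + \nabla \Phi(x(t)) + (\beta(t) + \lambda \dot{\beta}(t)) \nabla \Psi(x(t)) = 0 \\ x(0) = u_0, \quad \dot{x}(0) = v_0, \end{cases} \tag{11}$$

where γ, λ > 0, u_0, v_0 ∈ H, provided that the following assumptions are satisfied:

We look at strong global solutions x : [0, +∞) → H of the dynamical system (11), that is, x and ẋ are locally absolutely continuous (in other words, absolutely continuous on each interval [0, T], 0 < T < +∞) and (11) holds for almost every t ∈ [0, +∞). In view of the Lipschitz continuity of ∇Ψ, ∇²Ψ and ∇Φ, ∇²Φ, assumed in (H_Ψ) and (H_Φ), respectively, the existence and uniqueness of strong global solutions of (11) is a consequence of the Cauchy-Lipschitz-Picard Theorem (see for example [5, 17, 25, 31]).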
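Remark 2 (a) In case Ψ = 0, the dynamical system (11) becomes

$$\begin{cases} \ddot{x}(t) + \gamma \dot{x}(t) + \lambda \nabla^2 \Phi(x(t))(\dot{x}(t)) + \nabla \Phi(x(t)) = 0 \\ x(0) = u_0, \quad \dot{x}(0) = v_0, \end{cases}$$

the convergence of which has been investigated in [5] in connection with the minimization of the function Φ over H.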
(b) The time discretization of second order dynamical systems leads to iterative algorithms involving inertial terms, which basically means that every new iterate is constructed in terms of the previous two iterates (see for example [3, 4]). In view of this observation, it makes sense to investigate a time discretized version of (11) and to study the convergence properties of the generated iterates in relation to the solving of the optimization problem (5). We leave this topic for future research.
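The two-iterate structure mentioned in part (b) can be seen on the simpler heavy ball with friction system ẍ(t) + γẋ(t) + ∇Φ(x(t)) = 0. The sketch below is an illustration under stated assumptions and is not a discretization of (11), which the text leaves open. It replaces ẍ by a centered second difference and ẋ by a backward difference with step size h, which gives x_{k+1} = x_k + (1 − γh)(x_k − x_{k−1}) − h²∇Φ(x_k). The function name `heavy_ball`, the quadratic test function and all parameter values are hypothetical.

```python
def heavy_ball(grad, x0, gamma=1.0, h=0.1, iters=500):
    """Inertial iteration from a finite-difference discretization of
    x'' + gamma * x' + grad(x) = 0 with step size h:

        x_{k+1} = x_k + (1 - gamma*h) * (x_k - x_{k-1}) - h**2 * grad(x_k)
    """
    x_prev = list(x0)   # taking x_{-1} = x_0 corresponds to zero initial velocity
    x = list(x0)
    for _ in range(iters):
        g = grad(x)
        x_next = [
            x[i] + (1.0 - gamma * h) * (x[i] - x_prev[i]) - h * h * g[i]
            for i in range(len(x))
        ]
        # The new iterate depends on the previous two iterates.
        x_prev, x = x, x_next
    return x


TARGET = [2.0, -1.0]    # minimizer of the test function Phi(x) = 0.5 * ||x - TARGET||^2


def grad_phi(x):
    return [x[i] - TARGET[i] for i in range(len(x))]


x_star = heavy_ball(grad_phi, [0.0, 0.0])
print(x_star)  # expected to be very close to TARGET
```

The extrapolation coefficient 1 − γh and the gradient step h² are both tied to the same time step here; inertial algorithms in the literature usually treat them as independent parameters.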

Convergence of the trajectories and of the objective function values
This section is devoted to the asymptotic analysis of the trajectory generated by the dynamical system (11). We show weak convergence of the trajectory x(·) to an optimal solution of (5) as well as convergence for the objective function values along the trajectory as t → +∞, under the following assumption: For δ > 0, we consider the following energy functional that will play an important role in the analysis below: For its derivative we have for almost every t ∈ [0, +∞) Finally we obtain for almost every t ∈ [0, +∞) Further, for z ∈ S and we consider the functional By using (14) we easily derive for almost every t ∈ [0, +∞) The following lemma will play an essential role in the asymptotic analysis of the trajectories.
Lemma 4 Assume that (H_Ψ), (H_Φ), (H_β) and (H) hold and let x : [0, +∞) → H be the trajectory generated by the dynamical system (11). Then for every z ∈ S the following statements are true:

Proof. Take an arbitrary z ∈ S. Relying on the convexity of the functions Φ and Ψ, the fact that z ∈ argmin Ψ (hence Ψ(z) = 0) and the non-negativity of β and Ψ we obtain for every t ∈ [0, +∞) From here and (17) we derive for almost every t ∈ [0, +∞). By using the growth condition on β we get where due to (H_β) and (15). Furthermore, for almost every t ∈ [0, +∞). Since z is an optimal solution of (5), the first order optimality condition delivers 0 From here and by using the Young-Fenchel inequality we obtain for every t ∈ [0, +∞) Thus, from (23) and (26) we obtain for almost every t ∈ [0, +∞) where the last inequality follows from (15) and the fact that θ ∈ (0, 1). By integrating the last inequality from 0 to T (T > 0) and by taking into account (H), (16), (13) and the fact that Φ and Ψ are bounded from below, we obtain that there exists M > 0 such that Combining this with (18) and the fact that Φ and Ψ are bounded from below, one can easily see that there exists M′ > 0 such that d/dT A direct application of the Gronwall Lemma implies that x is bounded. (30) Further, this yields via (29) that From (30), (31), (16) and (13) we conclude that Moreover, from (20), (26) and (22) we obtain for almost every t ∈ [0, +∞) (i) Consider the function F : [0, +∞) → ℝ defined by Making again use of (see (33)) and (32), we easily derive that F is bounded from below. Moreover, from (26) it follows that for almost every t ∈ [0, +∞) Notice that according to (H), the function on the right-hand side of this inequality is L¹-integrable on [0, +∞), hence a direct application of Lemma 1 yields that lim_{t→+∞} F(t) exists and is a real number. Thus Since Ψ ≥ 0, we obtain for every t ∈ [0, +∞) and from here, similarly to (26), Thus, for almost every t ∈ [0, +∞) it holds Following the same technique as in the proof of (34), we obtain that Finally, from (34), (35) and (21) we obtain (i).

Remark 5
The assumption lim_{t→+∞} β(t) = +∞ has not been used in the above proof. However, it will play an important role in the arguments used below.
For the asymptotic analysis of the trajectories generated by the dynamical system (11), the continuous version of the Opial Lemma that we state as follows will be crucial.
Lemma 6 Let S_∞ be a nonempty subset of the real Hilbert space H and x : [0, +∞) → H a given function. Assume that (i) lim_{t→+∞} ‖x(t) − z‖ exists for every z ∈ S_∞; (ii) every weak limit point of x belongs to S_∞.
Then there exists x_∞ ∈ S_∞ such that x(t) converges weakly to x_∞ as t → +∞.
We state now the main theorem of the paper.
Proof. Fix an arbitrary z ∈ S and consider the energy functional defined in (13) for A simple computation (see (14)) shows that Taking into account the growth condition on β, Lemma 4(i) and the fact that E_{δ_1} is bounded from below, we obtain from Lemma 1 that Similarly, consider We have From (36) and (37) we get This implies by the definition of the energy functional that From here we deduce lim The statement (39) follows now from the last relation and (38).
Let us assume that lim t→+∞ Φ(x(t)) + β(t)Ψ(x(t)) > Φ(z).Then there exist η > 0 and t 0 ≥ 0 such that for every t ≥ t 0 we have Hence, for every Integrating the last inequality and taking into account Lemma 4(i) and (iii) we obtain a contradiction.
(iii)-(iv) The statements have been proved in Lemma 4.
(v) From (14), Lemma 4(i) and Lemma 1, we derive that Combining this with (13) and (38), we obtain that The statement follows now from (iv).
(vi) This will be a consequence of the Opial Lemma. Let us check the first statement in Lemma 6. From (28), (H), (32) and Lemma 1 we obtain Further, by using (16), (45), (44), (30) and (v) we conclude that Since z ∈ S was arbitrarily chosen, the first statement of the Opial Lemma is true.
We prove now that the second condition in Lemma 6 is fulfilled, too. Let (t_n)_{n∈ℕ} be a sequence of positive numbers such that lim_{n→+∞} t_n = +∞ and x(t_n) converges weakly to x_∞ as n → +∞. By using the weak lower semicontinuity of Ψ and (ii) we obtain 0 ≤ Ψ(x_∞) ≤ lim inf

Finally, we consider the situation when the objective function of (5) is strongly convex. In this case, the trajectory generated by (11) converges strongly to the unique optimal solution of (5).
Proof. Let µ > 0 be such that Φ is µ-strongly convex. In this case the optimization problem (5) has a unique optimal solution, which we denote by z.
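As background for this step, the standard inequalities that characterize µ-strong convexity of a differentiable function Φ are recalled below. They are textbook facts; that an estimate of this type is what sharpens the convexity inequality in the argument is an assumption made here for orientation.

```latex
\begin{align*}
\Phi(y) &\geq \Phi(x) + \langle \nabla \Phi(x), y - x \rangle + \frac{\mu}{2}\|y - x\|^2
  && \forall x, y \in H, \\
\langle \nabla \Phi(x) - \nabla \Phi(y), x - y \rangle &\geq \mu \|x - y\|^2
  && \forall x, y \in H.
\end{align*}
```

Compared with plain convexity, the extra quadratic term provides control of ‖x(t) − z‖² along the trajectory. If this quantity is integrable on [0, +∞) and lim_{t→+∞} ‖x(t) − z‖ exists, then the limit can only be 0.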
We replace (22) Taking into account (H) and that E is bounded from below (see (32)), by integration of the above inequality we obtain that there exists a constant C > 0 such that According to (46), lim_{t→+∞} ‖x(t) − z‖ exists, thus ‖x(t) − z‖ converges to 0 as t → +∞ and the proof is complete.