Stochastic PDEs via convex minimization

We prove the applicability of the Weighted Energy-Dissipation (WED) variational principle [50] to nonlinear parabolic stochastic partial differential equations in abstract form. The WED principle consists in the minimization of a parameter-dependent convex functional on entire trajectories. Its unique minimizers correspond to elliptic-in-time regularizations of the stochastic differential problem. As the regularization parameter tends to zero, solutions of the limiting problem are recovered. In particular, this provides a direct approach via convex optimization to the approximation of nonlinear stochastic partial differential equations.


Introduction
This paper is concerned with stochastic quasilinear partial differential equations of the form

du − div(Dφ(·, ∇u)) dt + Dψ(·, u) dt ∋ f dt + B dW    (1)

complemented with suitable boundary and initial conditions. Here, the real-valued function u is defined on Ω × [0, T] × O, where (Ω, F, P) is a probability space, O ⊂ R^d is a smooth bounded domain, and T > 0 is a reference time. The functions φ(t, ·) : R^d → R and ψ(t, ·) : R → R are assumed to be convex, the gradients Dφ and Dψ are taken with respect to the second variable only, and the time-dependent sources f and B are given. In particular, B(·) ∈ L^2(U; L^2(O)) (Hilbert-Schmidt operators) is stochastically integrable with respect to W, a cylindrical Wiener process on a separable Hilbert space U.
Under different choices for the nonlinearities φ and ψ, equation (1) may arise in connection with various classical models, including the Allen-Cahn and the p-Laplace equation. For notational simplicity, assume equation (1) to be complemented with homogeneous Dirichlet boundary conditions and with the initial condition u(0) = u_0, where u_0 is a suitable initial datum. Letting φ(t, ·) and ψ(t, ·) be of p-growth, equation (1) can be weakly formulated in the dual of the space W_0^{1,p}(O), according to the classical theory by Pardoux [53] and Krylov-Rozovskiȋ [35]. It is well known that the solution u is an Itô process, in the sense that it can be represented in the general form

u(t) = u^d(t) + ∫_0^t u^s(s) dW(s), t ∈ [0, T],    (2)

where the process u^d is differentiable in time and u^s is L^2(U; L^2(O))-valued and stochastically integrable with respect to W. This decomposition into the deterministic part u^d and the stochastic part u^s is unique. With this notation, u is a solution to the original problem (1) if and only if u satisfies the constraint (2) and the equations

∂_t u^d − div(Dφ(·, ∇u)) + Dψ(·, u) ∋ f, u^s = B, u^d(0) = u_0.
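As a purely illustrative aside, the splitting of an Itô process into a time-differentiable part and a stochastic integral can be simulated in a scalar toy case. The sketch below is not taken from the paper: the drift, the integrand u_s(t) = t, the one-dimensional Brownian motion replacing the cylindrical Wiener process, and all variable names are assumptions made only for this example. It checks by Monte Carlo that the mean of the simulated process follows the differentiable part and that the stochastic part satisfies the Itô isometry up to sampling error.

```python
import numpy as np

# Toy scalar Ito process u = u_d + (u_s . W); every choice below is hypothetical.
rng = np.random.default_rng(0)           # fixed seed for reproducibility
T, n_steps, n_paths = 1.0, 200, 20000
h = T / n_steps
t = np.linspace(0.0, T, n_steps + 1)

u0 = 1.0
drift = -np.ones(n_steps)                # assumed time derivative of u_d
u_s = t[:-1]                             # assumed integrand, evaluated at left endpoints

# Brownian increments, one row per sample path
dW = rng.normal(0.0, np.sqrt(h), size=(n_paths, n_steps))

# Differentiable part: u_d(t) = u0 + integral of the drift
u_d = u0 + np.concatenate(([0.0], np.cumsum(drift * h)))

# Stochastic part: left-point (Ito) sums approximating the integral of u_s dW
stoch = np.concatenate(
    (np.zeros((n_paths, 1)), np.cumsum(u_s * dW, axis=1)), axis=1
)

u = u_d + stoch                          # full process, shape (n_paths, n_steps + 1)

second_moment = float(np.mean(stoch[:, -1] ** 2))   # E |(u_s . W)(T)|^2
isometry_rhs = float(np.sum(u_s ** 2) * h)          # discrete integral of |u_s|^2
mean_terminal = float(np.mean(u[:, -1]))            # should be close to u_d(T)
```

In this toy setting the stochastic integral has zero mean, so averaging over paths recovers the differentiable part, while the second moment of the stochastic part matches the integral of the squared integrand.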
The aim of this paper is to tackle the weak solvability of equation (1) via the Weighted Energy-Dissipation (WED) variational approach. This hinges upon the minimization of the parameter-dependent functional I_ε on entire trajectories, the so-called WED functional, given by |∂_t u^d(r)|^2 + φ(r, ∇u(r)) + ψ(r, u(r)) − f(r) u(r) dx dr where B_ε is a suitable approximation of the process B. The convex WED functional I_ε has to be minimized under two linear constraints, namely the decomposition (2) and the initial condition u(0) = u_0. This results in a convex minimization problem. Our main result, Theorem 2.1, states that, under suitable assumptions on the data, for all ε > 0 the minimizer u_ε of I_ε exists and is unique. As ε → 0 we have that u_ε → u, where u is the unique solution of the stochastic differential problem (1).
This provides a new variational approximation to the stochastic differential problem (1), making it accessible to a direct optimization approach, and paving the way to the application of the far-reaching tools of the calculus of variations [15,18,19].
The role of the exponential weight in I_ε is revealed by computing the corresponding Euler-Lagrange equation. In the current setting this formally reads

−ε ∂_t(∂_t u_ε^d)^d + ∂_t u_ε^d − div(Dφ(·, ∇u_ε)) + Dψ(·, u_ε) = f, u_ε^s = B_ε, ε ∂_t u_ε^d(T) = 0, u_ε^d(0) = u_0,

where we have also included the initial condition, for completeness. In particular, the minimizers u_ε solve an elliptic-in-time regularization of the stochastic differential problem (1), complemented by an extra Neumann boundary condition at T. Note that for all ε > 0 the problem is not causal and that causality is restored in the limit ε → 0.
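The effect of the elliptic-in-time regularization can be seen in a deterministic scalar analogue. The following sketch is only an illustration under strong simplifying assumptions that are not in the paper: no noise, φ = 0, ψ(u) = a u^2/2, f = 0, so that the limiting problem is u' + a u = 0 with solution u_0 e^{−a t}. It solves the regularized two-point problem −ε u'' + u' + a u = 0, u(0) = u_0, u'(T) = 0 by central finite differences (a discretization chosen here for simplicity; the function names are hypothetical) and records the sup-norm distance to the limit for decreasing ε.

```python
import numpy as np

def solve_wed_euler_lagrange(eps, a=1.0, u0=1.0, T=1.0, n=400):
    """Finite-difference solve of -eps*u'' + u' + a*u = 0, u(0) = u0, u'(T) = 0."""
    h = T / n
    t = np.linspace(0.0, T, n + 1)
    M = np.zeros((n + 1, n + 1))
    rhs = np.zeros(n + 1)
    # Initial condition u(0) = u0
    M[0, 0] = 1.0
    rhs[0] = u0
    # Interior nodes: central differences for u'' and u'
    for i in range(1, n):
        M[i, i - 1] = -eps / h**2 - 1.0 / (2.0 * h)
        M[i, i] = 2.0 * eps / h**2 + a
        M[i, i + 1] = -eps / h**2 + 1.0 / (2.0 * h)
    # Extra Neumann condition at the final time, one-sided difference
    M[n, n - 1] = -1.0
    M[n, n] = 1.0
    return t, np.linalg.solve(M, rhs)

def sup_error(eps, a=1.0, u0=1.0, T=1.0):
    """Sup-norm distance between the regularized solution and u0*exp(-a*t)."""
    t, u = solve_wed_euler_lagrange(eps, a, u0, T)
    return float(np.max(np.abs(u - u0 * np.exp(-a * t))))

# Distance to the causal limit for a decreasing sequence of parameters
errors = [sup_error(eps) for eps in (0.1, 0.05, 0.025)]
```

In this toy problem the largest deviation sits in a thin layer near t = T, where the artificial Neumann condition acts, and it shrinks as ε decreases, in line with the recovery of causality in the limit.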
Elliptic-regularization techniques for nonlinear PDEs are quite classical. Introduced by Lions in [40], they have been used by Kohn & Nirenberg [33], Oleinȋk [51], and again Lions [41,42] in order to investigate regularity. An account of linear results can be found in the book by Lions & Magenes [43], whereas an early result on solvability in a nonlinear setting is due to Barbu [7].
The variational formulation of elliptic regularization via WED functionals can be traced back to Ilmanen [31], who used it in the context of Brakke mean-curvature flow of varifolds, and to Hirano [28] in connection with periodic solutions of gradient flows. WED functionals are already mentioned in the classical textbook by Evans [23, Problem 3, p. 487].
In the context of stochastic PDEs, the application of tools from calculus of variations in order to characterize variational solutions is much less developed, and has been employed so far mainly in connection with the Brezis-Ekeland principle. In this direction, we mention the pioneering works by Barbu & Röckner [9,10] dealing with SPDEs with additive and linear multiplicative noise, and by Krylov [34]. More recently, Boroushaki & Ghoussoub [12] generalized these results also to the case of multiplicative noise, by characterizing solutions as minima of self-dual functionals.
This paper contributes the first application of the WED principle in the stochastic setting. Compared with the deterministic situation, the theory is here much more involved.
The first main difficulty arises in proving existence of minimizers for I_ε. This requires the characterization of the subdifferential of I_ε in terms of the Euler-Lagrange problem. In the stochastic setting, this ε-regularized problem consists of a forward-backward system of SPDEs. The identification of the Euler-Lagrange equation is far more involved than in the deterministic framework. In the deterministic case, it is well known that the space C_c^k(0, T) of compactly supported C^k test functions is dense in L^2(0, T) for all k ∈ N: this allows one to identify the Euler-Lagrange equation rather directly, at least in a weak sense. By contrast, in the stochastic case the space L^2(Ω; C^k(0, T)) is not dense in L^2(Ω; L^2(0, T)), due to the presence of nonzero martingales in L^2(Ω; L^2(0, T)). As a consequence, the usual deterministic techniques do not apply here, and the Euler-Lagrange equation has to be characterized using different tools, on both the analytical and the probabilistic side. On the one hand, we need to introduce suitable function spaces of processes in Banach spaces (Itô processes); on the other hand, we rely on the abstract variational theory for backward SPDEs and on martingale representation theorems in infinite-dimensional spaces.
The second main difficulty concerns proving the well-posedness of the Euler-Lagrange problem. As we have pointed out above, the second-order Euler-Lagrange equation is noncausal and corresponds to a system of a forward and a backward first-order stochastic equation. The discussion of this forward-backward system calls for a further approximation on the nonlinearity. Identifications of nonlinear limits are performed via lower semicontinuity arguments, which in turn rely on specific Itô's formulas, both at the approximate and at the limit level.
In the paper, we actually consider a general class of abstract equations, including (1). Indeed, we frame the problem in the abstract variational setting of a Gelfand triple (V, H, V * ) and focus on where A is a time-dependent subdifferential-type operator from V to V * , V being a separable reflexive Banach space and H a separable Hilbert space. We collect all relevant notation, list assumptions, and state Theorem 2.1, our main result, in Section 2. The proof of Theorem 2.1 is then split into Section 3 (Euler-Lagrange problem), Section 4 (convergence as ε → 0), and Section 5 (existence of minimizers).

Main result
In the following, we directly focus on the abstract Cauchy problem The latter arises as variational formulation of an initial and boundary value problem for equation (1) by choosing the convex map Φ(t, ·) as Note that we have neglected the deterministic forcing f in (1) for the sake of notational simplicity. Indeed, this could be included in the analysis with no specific difficulty.
In this section we introduce the necessary notation and assumptions to make the meaning of problem (3) precise and we state our main result, Theorem 2.1. This is then proved in Sections 3-5.
Let (Ω, F, P) be a probability space endowed with a complete and right-continuous filtration (F_t)_{t∈[0,T]}, where T > 0 is a fixed final time. Let also W be a cylindrical Wiener process on a separable Hilbert space U. We will assume that (F_t)_{t∈[0,T]} is the natural augmented filtration associated to W. The progressive σ-algebra on Ω × [0, T] will be denoted by P. For any Banach space E, the norm in E will be denoted by ‖·‖_E. For any r, s ∈ [1, +∞) we denote by L^r_P(Ω; L^s(0, T; E)) the usual space of Bochner-integrable functions which are strongly P-measurable from Ω × [0, T] to E. When r > 1 and s = +∞, we explicitly define where for any f ∈ L^1(Ω) we use the standard notation E f := ∫_Ω f dP for the expected value. Recall that by [22, Thm. 8.20.3] we have the identification Moreover, for any r ≥ 1, the symbol L^r(Ω; C^0([0, T]; E)) denotes the space of r-integrable continuous adapted processes (hence also progressively measurable) with values in E. For any pair of separable Hilbert spaces E_1 and E_2, we will use the symbols L(E_1, E_2) and L^2(E_1, E_2) for the spaces of linear continuous and Hilbert-Schmidt operators from E_1 to E_2, respectively.
Let us now fix a useful notation for suitable spaces of Itô processes. For every separable reflexive Banach space E_1 and any Hilbert spaces E, E_2, with E_1, E_2 ↪ E continuously, and for any s, r ∈ [1, +∞), we use the notation where we have used the classical symbol ·W to denote stochastic integration with respect to W. Equivalently, we have the representation

z^d ∈ L^s_P(Ω; W^{1,s}(0, T; E_1)), z^s ∈ L^r_P(Ω; L^2(0, T; L^2(U, E_2))).

The latter specifies that the two components z^d and z^s are uniquely determined by the process z, so that the sum appearing above is actually a direct sum, and the projections

Π_d : I^{s,r}(E_1, E_2) → L^s_P(Ω; W^{1,s}(0, T; E_1)), z ↦ z^d,
Π_s : I^{s,r}(E_1, E_2) → L^r_P(Ω; L^2(0, T; L^2(U, E_2))), z ↦ z^s,

are well-defined, linear, and continuous. Let us also point out that the space I^{s,r}(E_1, E_2) is a Banach space, and even a Hilbert space if s = r = 2 and E_1 is a Hilbert space. A natural norm on I^{s,r}(E_1, E_2) is given by Throughout the paper, we assume the following setting.
H0: H and V_0 are separable Hilbert spaces and V is a separable reflexive Banach space, with V_0 ↪ V ↪ H continuously and densely. In particular, we suppose that there is In the sequel, we will identify H with its dual H^* in the canonical way, so that we have the continuous and dense inclusions V_0 ↪ V ↪ H ↪ V^* ↪ V_0^*. The scalar product in H and the duality pairing between V^* and V (and between V_0^* and V_0) will be denoted by the symbols (·, ·) and ⟨·, ·⟩, respectively.
We assume the following hypotheses.
Let us point out that the progressive measurability of Φ required in H1 implies that A is P ⊗ B(V )/B(V * )-Effros-measurable, in the sense of [27,54].
Before moving on, let us comment on the choice of the space V_0. The introduction of V_0 is needed since at some point we have to rely on Itô's formula for the square of the V^*-norm. However, this cannot be done in general if V is a Banach space: indeed, in such a case the duality mapping of V^* is nonlinear and possibly not twice Fréchet-differentiable, hence the required Itô formula is not trivial and not known in general, even in the extended framework of stochastic integration in UMD Banach spaces (see [14,65,66]). The space V_0 is introduced to bypass this problem by exploiting its Hilbert structure, which allows one to write an Itô formula in V_0^*. Clearly, if V is itself a Hilbert space, the optimal choice is V_0 = V. In general, if V is only a Banach space, roughly speaking one should ideally choose the space V_0 as large as possible. For example, if V = W^{s,ℓ}(O) for a certain domain O ⊂ R^d with Lipschitz boundary, with ℓ ∈ (2, +∞) and s > 0, one could choose with the choice s′ = s + d/2 − d/ℓ being optimal in this sense. The existence of a regularizing sequence of operators (T_n)_n can be easily exhibited, in the case of Sobolev spaces, by means of convolution with a sequence of mollifiers, for example.
The classical variational theory on SPDEs (see [35,53]) ensures that under the assumptions H0-H2 the Cauchy problem (3) admits a unique solution (u, ξ), with and Let us reformulate this solution concept in a different fashion. We introduce the space ) . Note that U can be written in compact form as With this notation, the process u solves the problem (4)- (6) if and only if In such a case, (4)-(6) are satisfied with the choice ξ := −∂ t u d .
As mentioned, the WED approach consists in minimizing an ε-dependent functional over entire trajectories and passing to the limit in the parameter ε. This procedure results in an elliptic regularization in time, hence delivering regular approximations. In particular, the differential problem (3) is reformulated as a linearly constrained convex minimization. In the abstract setting of (4)-(6), letting ε > 0 we introduce the WED functional We qualify the ε-dependent data (u_{0,ε}, B_ε) above by requiring that the two sequences, as ε ↘ 0, are given in such a way that The existence of sequences fulfilling (8)-(10) follows directly from H2 and the density of V_0 ↪ H, by standard regularization techniques.
Minimizers of I_ε will be proved to belong to the space Again, note that a more compact notation for U_reg reads Let us point out in particular that U_reg ↪ V ↪ U with continuous inclusions.
The Euler-Lagrange equation for functional I ε corresponds to the ε-regularized problem Note that the second-order problem (11) can be seen as a system of two equations of first order in time, one forward and one backward, by using the classical substitution v ε := ∂ t u d ε . Indeed, with this notation (11) is equivalent to Note that the variables of the forward-backward system (12) are three, namely u ε , v ε , and G ε . Indeed, while the forward equation has a unique variable (u ε ), the concept of solution for the backward stochastic equation requires the two variables v ε and G ε due to the need of representation theorems for martingales. In particular, we have that G ε = v s ε is uniquely determined by the backward stochastic equation.
The main result of the paper reads as follows.

ii) (Euler-Lagrange equation) The minimizer also satisfies u_ε ∈ U_reg and it is the unique solution to the problem (11). Namely, there exists a unique triplet for every t ∈ [0, T], P-almost surely. In particular, one has that where (u, ξ) is the unique solution to the problem (7) in the sense of (4)-(6). Furthermore, if V ↪ H compactly and p < 4, it also holds that

The proof of Theorem 2.1 is given in Sections 3-5. In particular, Part ii of the theorem is proved in Section 3, where we focus on the well-posedness of the forward-backward regularized problem (11). Then, the convergence Part iii of Theorem 2.1 is proved in Section 4. Finally, the existence of minimizers is checked in Section 5.
This counterintuitive structuring of the proof of Theorem 2.1 is motivated by the fact that the existence of minimizers of I ε follows from proving that the corresponding Euler-Lagrange problem has a unique solution. One hence has to check the well-posedness of problem (11) first.

The forward-backward regularized problem
This section is devoted to the proof that the ε-regularized problem (11) is well-posed in the sense of Theorem 2.1.ii. Throughout the section, ε > 0 is fixed.
First of all, let A_H be the random and time-dependent unbounded operator on H defined as It is not difficult to show that, for every (ω, t) ∈ Ω × [0, T], the unbounded operator A_H(ω, t, ·) is maximal monotone on H. Indeed, the monotonicity is an immediate consequence of the monotonicity of A. As for the maximality, note that the operator , is maximal monotone and coercive by assumption on A, hence it is surjective, which yields the maximality of A_H(ω, t, ·). Furthermore,

3.1. The approximation. Since A_H is maximal monotone on H in its last component, for any λ > 0 its resolvent and its Yosida approximation are well defined, respectively, as J_λ := (I + λ A_H)^{−1} and A_λ := (I − J_λ)/λ. It is well known that J_λ and A_λ are 1- and 1/λ-Lipschitz-continuous in their third component, respectively, uniformly in Ω × [0, T]. Moreover, the Effros-measurability of A_H implies that J_λ and A_λ are P ⊗ B(H)/B(H)-measurable (see for example [45, Prop. 3.12]).
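The resolvent and the Yosida approximation can be made concrete in a scalar toy case. The sketch below is an illustration only and assumes H = R and the monotone map A(y) = |y|^{p−2} y with p = 4, with no dependence on (ω, t); none of these choices comes from the paper, and the function names are hypothetical. The resolvent J_λ(x) is the unique solution y of y + λ A(y) = x, computed here by bisection, and the Yosida approximation is A_λ(x) = (x − J_λ(x))/λ.

```python
def A(y, p=4.0):
    """Scalar monotone map y -> |y|^(p-2) * y (assumed toy nonlinearity)."""
    return abs(y) ** (p - 2.0) * y

def resolvent(x, lam, p=4.0, iters=200):
    """J_lam(x): the unique y with y + lam*A(y) = x, found by bisection.

    The map y -> y + lam*A(y) is increasing, and the root lies in
    [-|x|, |x|], so bisection on that interval converges.
    """
    lo, hi = -abs(x), abs(x)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mid + lam * A(mid, p) < x:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def yosida(x, lam, p=4.0):
    """Yosida approximation A_lam(x) = (x - J_lam(x)) / lam."""
    return (x - resolvent(x, lam, p)) / lam

# Gap between A_lam(2) and A(2) for decreasing lam
gaps = [abs(yosida(2.0, lam) - A(2.0)) for lam in (1e-1, 1e-2, 1e-3, 1e-4)]
```

In this example one can check numerically the identity A_λ(x) = A(J_λ(x)), the 1-Lipschitz bound for J_λ, the 1/λ-Lipschitz bound for A_λ, and the convergence A_λ(x) → A(x) as λ decreases to 0.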
For any λ > 0, we consider the approximated problem We say that a triplet (u ελ , v ελ , G ελ ) is a solution to the approximated problem (13) if for every t ∈ [0, T ], P-almost surely.

3.2. Existence of solutions to the approximated problem. We prove here that the approximated problem (13) admits a solution (u_ελ, v_ελ, G_ελ). To this end, we characterize the unique solution (u_ελ, v_ελ, G_ελ) as the unique minimizer of a suitable approximated WED functional.
Let us first introduce some preliminary notation. Note that we have the representation Moreover, it will be useful to introduce the notation The natural candidate as WED functional related to the approximated problem (13) is clearly given by In this spirit, we introduce the functional We now show that the approximated problem (13) is equivalent to the minimization of I ελ . In this direction, we aim now at characterizing the subdifferential of I ελ . This will follow after some intermediate steps.
First of all, we characterize the subdifferential of the sum I 1 ε + S ε .
Proof. First of all, it is clear that I^1_ε is proper, convex, and lower semicontinuous on I^{2,2}(H, H). Moreover, we have that I^1_ε is actually Gâteaux-differentiable. Indeed, for every z, h ∈ I^{2,2}(H, H) and δ ≠ 0 we have where the second term on the right-hand side converges to 0 as δ → 0 since h ∈ I^{2,2}(H, H). Hence, I^1_ε is Gâteaux-differentiable, its Gâteaux differential coincides with its subdifferential, and it is given by Secondly, S_ε is proper, convex, and lower semicontinuous on I^{2,2}(H, H). Moreover, its subdifferential is given by . This implies that, for every z ∈ D(∂(I^1_ε + S_ε)) and w ∈ ∂(I^1_ε + S_ε), we have w = ∂I^1_ε(z) + w̃ for a certain w̃ ∈ I^{2,2}_0(H, H)^⊥, as required.

Now, we characterize the subdifferential of I^2_ελ.

We are now able to characterize the subdifferential of the functional I_ελ. if and only if there exists w̃ ∈ I^{2,2}_0(H, H)^⊥ such that, for every h ∈ I^{2,2}(H, H), In particular, for every z ∈ I^{2,2}(H, H) with z^d(0) = u_{0,ε} and w ∈ ∂I_ελ(z) it holds that . The claim then follows directly from Lemma 3.1 and Lemma 3.2.
We have now all the tools in order to show existence of solutions to the approximated problem (13) via minimization of the regularized functional I ελ . Namely, we have the following result. Moreover, the triplet (z ελ , ∂ t z d ελ , (∂ t z d ελ ) s ) is a solution of the approximated problem (13).
Proof. We note first that the functional I^1_ε + S_ε is strictly convex and coercive on I^{2,2}(H, H), hence so is the functional I_ελ by monotonicity of A_λ. Since I^{2,2}(H, H) is reflexive, this ensures the existence and uniqueness of a global minimizer z_ελ ∈ I^{2,2}(H, H) for I_ελ. Clearly we have that z_ελ ∈ D(I_ελ), so that z^d_ελ(0) = u_{0,ε}. Moreover, by definition of minimizer we have that 0 ∈ ∂I_ελ(z_ελ).
Hence, by Itô's formula we have, in differential (formal) form, that Integrating on [0, T ] and taking expectations we infer that Noting that the first term on the right-hand side appears in (14) as well, by substitution we infer then that for every h ∈ I 2,2 (H, H) such that h(0) = 0. Now, note that for any such h, we have that which yields in turn that Using this equality for the last term of (15) we obtain that for every h ∈ I 2,2 (H, H) with h d (0) = 0. Now, for any arbitrary C ∈ L 2 P (Ω; L 2 (0, T ; L 2 (U, H))), the process h C := C · W ∈ I 2,2 (H, H) satisfies h C (0) = 0 and is hence a possible test in (16). Since ∂ t h d C = 0, by arbitrariness of C we infer that z s ελ = B ε .
We are then left with the variational equality for all h ∈ I 2,2 (H, H) with h d (0) = 0. For any arbitrary K ∈ L 2 P (Ω; L 2 (0, T ; H)), note that the process satisfies h K (0) = 0, hence it is a possible test in equation (18). Since h s K = 0, we deduce that for every K ∈ L 2 P (Ω; L 2 (0, T ; H)). Let us stress that the first component of the scalar product appearing in this equality is not progressively measurable, hence one cannot simply deduce that it vanishes by arbitrariness of K. Nonetheless, note that by definition of conditional expectation and by the adaptedness of K, we have Since (F t ) t∈[0,T ] is the filtration generated by W and T 0 e −s/ε A λ (s, z ελ (s)) ds ∈ L 2 (Ω, F T ; H) , the process t → E T 0 e −s/ε A λ (s, z ελ (s)) ds F t is an H-valued continuous square-integrable martingale, and in particular is progressively measurable. We deduce then that the variational equality reads equivalently T ; H)) . At this point, since the process appearing on the left term of the scalar product belongs to the space L 2 P (Ω; L 2 (0, T ; H)), by arbitrariness of K we have that almost everywhere in Ω × [0, T ]. We deduce that there is a dP ⊗ dt-version of ∂ t z d ελ (which will be denoted with the same symbol for brevity of notation) such that Furthermore, by the classical martingale representation theorem in Hilbert spaces (see e.g. [25,Prop. 4.1] and [29]), there exists a process C ελ ∈ L 2 P (Ω; L 2 (0, T ; L 2 (U, H))) such that for every t ∈ [0, T ], from which it follows that It follows in particular that and It is then clear now from (17), (19), (20), and (21), and by uniqueness of the system (13), that (z ελ , ∂ t z d ελ , C ελ ) is a solution to the approximated problem (13).

3.3. Uniform estimates. We now want to pass to the limit as λ ↘ 0 in (13). To this end, let us show some uniform estimates in λ, still with ε > 0 fixed.
Itô's formula for the square of the H-norm yields Note now that d(εv_ελ, u_ελ) = εv_ελ du_ελ + u_ελ ε dv_ελ + ε d[G_ελ, B], which yields, taking (13) into account, Recalling that εv_ε(T) = 0, we deduce then that Now, noting that by definition of resolvent I − J_λ = λA_λ, recalling also that A_λ(·) ∈ A(J_λ(·)) and the coercivity condition for A, we have Hence, by comparing (22) and (23) we obtain Next, denoting by R_0 : V_0 → V_0^* the duality mapping, Itô's formula for the square of the V_0^*-norm of v_ελ yields, by (13), for every t ∈ [0, T], P-almost surely. We would like to write Itô's formula for the q-power of the V_0^*-norm of v_ελ. Clearly, if p = 2 then also q = 2 and nothing has to be done. If p > 2 then we have q ∈ (1, 2) and this can be achieved by writing Itô's formula for the real function |·|^{q/2}. However, since q ∈ (1, 2) the function |·|^{q/2} is not of class C^2, and this cannot be done straightaway. We then need to rely on a suitable approximation of the function |·|^{q/2}. Let us introduce to this end the approximations Clearly, we have that γ_δ ∈ C^∞([0, +∞)), with Since γ_δ is of class C^2, we can use the classical finite-dimensional Itô formula (see e.g. [17]) and infer that Now, letting δ ↘ 0 it follows by the Dominated Convergence Theorem that Multiplying by (ε/2)^{1−q/2}, taking expectations, and using the Young inequality yields where c_0 denotes the norm of the continuous inclusion V_0 ↪ V. Since by rearranging the terms and using the boundedness of A we deduce that for every t ∈ [0, T], P-almost surely.
Now, since 0 < q/2 < 1, its conjugate exponent −q/(2 − q) is negative: the reverse Young's inequality implies then that Taking this information into account we deduce from (25) that yielding, by the Gronwall lemma, Now, by multiplying the inequality (26) by e − T q 2 2(2−q) cA CAc q 0 and summing it with inequality (24), the last term on the right-hand side of (26) can be incorporated into the corresponding term on the left-hand side of (24): rearranging the terms, we obtain At this point, note the second and fourth terms on the right-hand side above can be handled using the averaged Young inequality: indeed, we infer that, for every σ > 0, p L p (Ω;L p (0,T ;L 2 (U,V0))) .
Choosing and fixing σ sufficiently small, independent of λ and ε, for example , rearranging the terms we deduce that there exists a positive constant M = M (c A , C A , c 0 , q, T ), independent of both λ and ε, such that At this point, note that by the assumptions (9)-(10) on (u 0,ε ) ε and (B ε ) ε , we have that the right-hand side is uniformly bounded in ε and λ.
Then, we deduce that, by updating the value of the constant M (which here and in the following may change from line to line), In particular, since (v_ελ) is uniformly bounded in L^q_P(Ω; L^q(0, T; V_0^*)) by (28) and (B_ε) is uniformly bounded in L^2_P(Ω; L^2(0, T; L^2(U, H))) by (9), it follows from the definition of u_ελ itself in (13) The boundedness of the operator A yields also Furthermore, following a classical argument employed in backward SPDEs, we can refine the estimate on (v_ελ). Indeed, let us recall the already obtained Itô's formula for v_ελ in V^*, which reads Instead of taking expectations at t fixed, we can now take supremum in time and then expectations. The first term on the right-hand side can be easily bounded using the Hölder inequality and the estimates (28) and (30) as The second term on the right-hand side can be bounded, thanks to the Burkholder-Davis-Gundy and Young inequalities, as for every σ > 0 (independent of λ and ε). Hence, choosing σ sufficiently small (for example σ := q/2), rearranging the terms, and using the Hölder inequality yields Now, note that the right-hand side is uniformly bounded in λ and ε thanks to the inequality (25) and the already proved estimate (27). Consequently, we deduce that Moreover, from inequality (25), since the function r ↦ |r|^{q−2}, r > 0, is decreasing, using again the reverse Young inequality and the estimate (27) we deduce that

3.4. Passage to the limit as λ ↘ 0. We pass now to the limit as λ ↘ 0, keeping ε > 0 fixed, and deduce existence of solutions for the regularized problem (12).
The estimates (27)-(32) imply that there exist (u_ε, û_ε, v_ε, ξ_ε, G_ε) such that, as λ ↘ 0, Note that by the definition of Yosida approximation and estimate (27) we have which implies that û_ε = u_ε. Moreover, by letting λ ↘ 0 in the forward equation in (13), we get yielding, a posteriori, also that u_ε ∈ L^2(Ω; C^0([0, T]; H)). Similarly, letting λ ↘ 0 in the backward equation in (13) we obtain, by the weak convergences above, which yields a posteriori that v_ε ∈ L^q(Ω; C^0([0, T]; V_0^*)). Furthermore, by comparison in the equation (13) it follows in particular that It only remains to show that ξ_ε ∈ A(·, u_ε) almost everywhere. To this end, we recall that by comparison of (22) and (23) we have By the weak lower semicontinuity of the norms and the regularities of the data B_ε and u_{0,ε} in condition (8) we infer then that limsup_{λ↘0} E ∫_0^T (A_λ(u_ελ(s)), u_ελ(s)) ds where we have used the notation for a given fixed complete orthonormal system (e_k)_{k∈N} of U.
We claim now that the right-hand side of inequality (33) coincides with In order to show this, we replicate in the limit λ = 0 the Itô formulas obtained for λ > 0 in (22) and (23). Unfortunately, since the limit process u_ε is not an Itô-type process with values in V_0, this cannot be done straightaway. We need to rely on a further approximation procedure.
Hence, we infer that

Lemma 4.1. Let X be a Polish space and (Z_n)_n be a sequence of X-valued random variables. Then (Z_n)_n converges in probability if and only if for any pair of subsequences (Z_{n_k})_k and (Z_{n_j})_j, there exists a joint sub-subsequence (Z_{n_{k_i}}, Z_{n_{j_i}})_i converging in law to a probability measure ν on X × X such that ν({(z_1, z_2) ∈ X × X : z_1 = z_2}) = 1.
This completes the proof of Theorem 2.1.iii.

Equivalence between regularized equation and minimization problem
This section is devoted to checking that I_ε admits a unique minimizer in V, and that this coincides with the unique solution to the ε-regularized problem. This proves Theorem 2.1.i. Throughout this section, ε > 0 is kept fixed.
A natural idea would be to identify the subdifferential of I_ε in terms of ∂(I^1_ε + S_ε) and ∂I^2_ε. However, let us point out that the domain of I^2_ε, i.e. the space L^p_P(Ω; L^p(0, T; V)), may have empty interior in the topology of I^{2,2}(H, H). For this reason, the analogue of [8, Thm. 2.10] is not applicable in this case, and we need to rely again on a further approximation, obtained by replacing A with its Yosida approximation A_λ, for λ > 0.
We adopt the following strategy instead. First of all, we show that the unique solution u_ε to problem (11) is a minimizer for I_ε. This ensures in particular that I_ε admits at least one minimizer. Secondly, we note that I_ε admits at most one minimizer. Finally, we conclude that minimizing I_ε is equivalent to solving (11).
Proposition 5.1. The unique solution u_ε to (11) is a minimizer for I_ε.
Proof. From Section 3 we know that u ε can be constructed as limit in suitable topologies of a sequence (u ελ ) λ>0 , where u ελ is the unique first solution component of (13). By Proposition 3.4 we also know that such u ελ is the unique global minimizer of I ελ for all λ > 0, so that I ελ (u ελ ) ≤ I ελ (z) ∀ z ∈ I 2,2 (H, H) .