An Algorithm for Finding Feedback in a Problem with Constraints for One Class of Nonlinear Control Systems

We consider the construction of feedback by a Kalman-type algorithm for a continuous nonlinear control system on a finite time interval with control constraints, where the right-hand side of the dynamics equations is linear in the control and linearizable in a neighborhood of the zero equilibrium position. By analogy with the SDRE approach, the solution of an auxiliary optimal control problem with a quadratic functional is used for this purpose. In the literature, this approach is applied to find suboptimal syntheses in optimal control problems with a quadratic functional for formally linear systems, in which all coefficient matrices in the differential equations and the criterion may depend on the state variables; on a finite time interval it then becomes necessary to solve complicated matrix differential Riccati equations with state-dependent coefficient matrices. Owing to the nonlinearity of the system, this substantially increases the amount of computation required to obtain the coefficients of the gain matrix in the feedback and to obtain a synthesis of a given accuracy, in comparison with the Kalman algorithm for linear-quadratic problems. The proposed synthesis algorithm is based on the extension principle proposed by V.F. Krotov and developed by V.I. Gurman; it not only broadens the scope of the SDRE approach to nonlinear control problems with control constraints in the form of closed inequalities, but also yields a more efficient computational algorithm for finding the matrix of feedback gains in control problems on a finite interval. This article establishes the correctness of applying the extension principle by introducing analogs of Lagrange multipliers that depend on the state and time, and also derives a formula for the suboptimal value of the quality criterion. The theoretical results are illustrated by computing suboptimal feedbacks in control problems for three-sector economic systems.


INTRODUCTION
The problems of synthesizing control laws remain relevant due to the need to build feedback laws in various applications. This occurs against the background of the increasing complexity of mathematical models, caused by the need to account for nonlinearities, perturbations, the growing dimensionality of the state and control vectors, etc. In this connection, there is a continuous search for new approaches to the construction of synthesizing control laws in nonlinear problems, along with the development of existing methods for constructing feedback laws in dynamic systems. The presence of constraints, of course, complicates the search for such control algorithms. Since the 1990s, the so-called SDRE (State-Dependent Riccati Equation) approach has been actively developed in the literature (see, e.g., [1-6]) for the approximate solution of synthesis problems in the nonlinear case for optimal control problems in the classical formulation without constraints on the control. In this method, for the approximate solution of nonlinear optimal control problems on both infinite and finite intervals, the right-hand sides of the ordinary differential equations of the dynamics are first reduced to a form that is formally linear in the state and control, where the coefficients of all matrices may be state-dependent. The feedback is then constructed by solving the corresponding linear-quadratic optimal control problems, in which the coefficients of the weight matrices in the optimality criterion may also depend on the state variables. Finally, the matrix of the controller gain coefficients is found by solving Riccati-type matrix equations: algebraic for stabilization problems on the half-axis, and differential for control problems on a finite time interval.
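The SDRE recipe described above (factor the nonlinear dynamics into a state-dependent linear form, then solve a Riccati equation at the current state) can be sketched in a few lines. The system, weights, and factorization below are illustrative stand-ins, not the article's model; note that the SDC factorization `A_of_x` is not unique, which is exactly the ambiguity discussed in the next paragraph.

```python
import numpy as np
from scipy.linalg import solve_continuous_are

def A_of_x(x):
    # One possible state-dependent coefficient (SDC) factorization of a
    # simple nonlinear drift; other factorizations are equally valid.
    return np.array([[0.0, 1.0],
                     [-1.0 - x[0]**2, -0.5]])

def B_of_x(x):
    # Control matrix (here state-independent for simplicity).
    return np.array([[0.0], [1.0]])

Q = np.diag([1.0, 1.0])   # state weight
R = np.array([[1.0]])     # control weight

def sdre_gain(x):
    """Solve the state-dependent algebraic Riccati equation at the
    current state and return the feedback gain K(x) = R^{-1} B' P."""
    A, B = A_of_x(x), B_of_x(x)
    P = solve_continuous_are(A, B, Q, R)
    return np.linalg.solve(R, B.T @ P)

x = np.array([0.5, -0.2])
u = -sdre_gain(x) @ x     # pointwise feedback u = -K(x) x
```

The key cost of this scheme on a finite interval, as the article stresses, is that the Riccati solve has to be repeated along the trajectory because the coefficients depend on the state.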
As numerous experiments have shown, this heuristic approach generates many possible suboptimal solutions, owing to the ambiguity of representing a nonlinear system in a linear structure and to the errors arising in the numerical solution of the matrix Riccati equations, whose coefficients also depend on the state. Nevertheless, given the complexity of constructing feedback laws in nonlinear systems and their importance in applications, the SDRE approach has been widely used in the literature for the approximate solution of nonlinear optimal control problems without constraints on the control.
Here, for one class of nonlinear controllable systems on a finite time interval, we show that the SDRE approach to constructing feedback in problems with control constraints can be justified by means of the extension principle proposed by V.F. Krotov [7] and developed by V.I. Gurman [8]. At the same time, an algorithm is proposed that is more efficient in terms of computational volume, since it does not require repeated integration of the Riccati matrix differential equations with state-dependent coefficients. We note that the first application of the Krotov extension principle within the SDRE approach was illustrated in [9] for the problem of constructing a stabilizing controller without control constraints on an infinite time interval.
At the end of this paper, we present the results of a computational experiment illustrating the proposed feedback algorithm on the example of the optimal control problem for a three-sector nonlinear model of an economic object with control constraints in the form of closed inequalities.

THEORETICAL RESULTS
Let the nonlinear controlled system have the form (1), where y(t) ∈ R^n and u(t) ∈ R^m are the state and control vector functions, the n × n matrix A is constant, the coefficients of the n × m matrix B(y) and the components of the vector h(y) are bounded and continuously differentiable, and the control is a piecewise continuous vector function; t_0 and T are the given initial and final moments of time, and the control u(t) at each moment of time satisfies the constraints (2). The form (1) arises in nonlinear control problems when the initial system of dynamics equations has a nonzero equilibrium position x_s at some constant control v_s; the inhomogeneity h(y) results from the transformation y = x − x_s and the separation of the part linear in y, so that h(0) = 0.
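The displayed formulas (1) and (2) did not survive extraction. Based on the surrounding description (constant A, state-dependent B(y), inhomogeneity h(y) with h(0) = 0, constraints in the form of closed inequalities), a plausible reconstruction is:

```latex
\dot y(t) = A\,y(t) + B(y)\,u(t) + h(y), \qquad y(t_0) = y_0,
\quad t \in [t_0, T], \tag{1}
```

```latex
u(t) \in U \subset \mathbb{R}^m, \qquad
U \ \text{a closed set defined by inequalities}. \tag{2}
```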
We will search for the control based on the SDRE approach using the optimality criterion (3), where the matrix Q(y) is assumed to be positive semidefinite for all admissible y, and the matrices F and R are constant and positive definite. The problem is to find the control u(t) in the form of feedback by fitting the matrix Q(y) so that the minimum of criterion (3) along the trajectories of system (1) is achieved under the constraints (2). This approach is motivated, on the one hand, by the hope that under such a control the trajectories of the closed-loop system at least remain bounded and, on the other hand, by the expectation that it is realistic in some sense.
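The displayed criterion (3) is also missing. Given the terminal weight F, the state weight Q(y), and the control weight R named in the text, a standard quadratic form consistent with these quantities would be:

```latex
J(u) \;=\; \tfrac{1}{2}\, y^{\mathrm T}(T)\, F\, y(T)
\;+\; \tfrac{1}{2}\int_{t_0}^{T}
\bigl( y^{\mathrm T} Q(y)\, y + u^{\mathrm T} R\, u \bigr)\, dt
\;\to\; \min. \tag{3}
```

The factor 1/2 is a common normalization and is an assumption here; the article's exact formula may differ by such constants.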
We note that the weight matrix Q(y) in the criterion will be represented in the form (4), where B_s = B(0), Q_1 is some positive definite constant matrix chosen so that the matrix Q(y) is positive semidefinite for all y, the matrix K is a solution of a Riccati matrix algebraic equation with constant coefficients, and the nondegenerate matrix W will be defined below. Thus, we will search for the optimal control in the form of a feedback u(y, t) in problem (1)-(3). For the solution we use the extension principle [7, 8], by which the original optimal control problem with constraints is reduced to a problem without constraints in such a way that the solution of the new problem is at the same time a solution of the original one [9-13]. For this purpose, we replace problem (1)-(3) by a constraint-free problem using Lagrange multipliers: the non-negative functions λ_1 and λ_2 correspond to the control constraints, the multiplier μ(y, t) is introduced to take the differential relations in (1) into account and is sought in a form specified below, and the function λ_3(t) corresponds to the introduced equality relation. Thus, we replace the functional (3) according to the scheme of [7, 8] by (5) and introduce the functions (6)-(8). Rewriting (5) in the form (9), we introduce the set of all admissible controls satisfying the constraints together with the corresponding trajectories y(t) of system (1) defined on the interval [t_0, T], i.e., the set of all admissible pairs {y(t), u(t)}, which we denote by (10). Given (7), L(y, u) takes the form (11), in which the original problem with constraints is reduced to a constraint-free problem. Now we turn to the continuously differentiable function v(y, t) in (8) and calculate its total derivative with respect to time.
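The displayed formulas (5)-(8) of the extension scheme were lost; for orientation, a hedged reconstruction in the generic Krotov form (ignoring, for brevity, the Lagrange terms that the article adds for the control constraints) reads:

```latex
L(y,u) \;=\; G\bigl(y(T)\bigr)
\;+\; \int_{t_0}^{T} M\bigl(y(t),u(t),t\bigr)\,dt,
```

```latex
\begin{aligned}
G(y_T) &= F_0(y_T) + v(y_T,T) - v(y_0,t_0),\\
M(y,u,t) &= f^{0}(y,u) - v_t(y,t)
  - v_y^{\mathrm T}(y,t)\bigl(A y + B(y) u + h(y)\bigr),
\end{aligned}
```

where F_0 and f^0 denote the terminal and integral parts of the criterion. Since v(y(T),T) − v(y_0,t_0) equals the integral of dv/dt along any admissible trajectory, L(y,u) coincides with the original functional, and minimizing M pointwise in u and G in y(T), as done below, yields sufficient optimality conditions.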
Below, all relations are understood to hold for almost all t ∈ [t_0, T]. The following is valid. Lemma. Let the following conditions be satisfied: (1) the pair {y(t), u(t)} is admissible; (2) there exist multipliers such that on the optimal pair, which gives the minimum value to the functional (11), the conditions (12) are fulfilled. Then the relations (13) are true, where the pairs involved are the admissible and optimal pairs, respectively. Proof. For the function v(y, t) from (8) we have (14). Integrating (14) and summing the obtained expression with the functional (5), and then using (6) and (7), we arrive at the corresponding representations. Let us show that the inequality holds over the set of admissible solutions. Indeed, by condition (1), from (5) we obtain (15). From the chain of relations in (15) we see that along the optimal pair, given (12), equality holds, and thus the statement of the lemma is proved.
To determine the pair minimizing the functional (11), it is necessary to find the control u(t) and to define the multipliers so that, for each fixed t, the integrand M(y, u, t) in (11) attains its smallest value over the admissible controls. Given that at t = T the function G(y_0, y(T)) takes its minimum value and the pair satisfies (1) under constraint (2), the necessary minimum conditions for M(y, u, t) with respect to the control yield expression (16) for the control, and minimization of the terminal part (7) of the functional (11) gives the terminal condition. Now we define the unknown matrices K and W(t) and the vector function q as solutions on the interval [t_0, T] of the equations (17)-(22), from which it follows that the extremum control, with the notation in (21), takes the form (23). Then we define the multipliers so that, on the one hand, the conditions (24) hold and, on the other hand, λ_1, λ_2, ϕ are chosen so that the representations (25) and (26) are valid. Let us now evaluate the function M(y, u, t) on the control (23). Substituting the control (23) into expression (6) and grouping similar terms, we obtain the resulting function. Then, with the positive definite matrix K from (17) and the function q(t, y) from (22), we have (27), where at the point t = T the function in (7) also takes its minimum value, which determines the minimal value of (27) as a whole.
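The displayed formula (23) for the extremum control is missing. A form consistent with the quantities named around it (the quadratic weight R, the matrix K from (17), and the function q from (22), for which the text later states q(y, t) = W^{-1}(t)y and the terminal condition q|_{t=T} = (F − K)y(T)) would be:

```latex
u^{*}(y,t) \;=\; -\,R^{-1} B^{\mathrm T}(y)\bigl(K\,y + q(y,t)\bigr),
\qquad q(y,t) = W^{-1}(t)\,y, \qquad
q\big|_{t=T} = (F-K)\,y(T).
```

This is a hedged reconstruction, not the article's verbatim formula; in the constrained problem this expression is valid where it satisfies (2), while on the boundary of U the control saturates, as the numerical experiments below illustrate.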
Thus, we have the required representation. Now it is not difficult to show that the control (23) is optimal, i.e., along it the criterion (3) takes its minimum value. Suppose there exist functions satisfying the theorem, and consider an arbitrary admissible pair; then, according to (24), the corresponding inequality holds, and thus the theorem is proved. Remark 1. The choice of Q_1 in the SDRE approach such that the matrix Q(y) is positive definite for all values of the state vector was demonstrated in [14].
The following is then true.
Then, transforming the minimal value of the functional (32) using (33), and taking into account the representation q(y, t) = W^{-1}(t)y(t) and the condition q|_{t=T} = (F − K)y(T), we obtain that formula (29) is valid.

Remark 2. We note that formula (29) includes the well-known representation of the initial condition of the Bellman function in the optimal stabilization problem for the stationary linear-quadratic problem on the semiaxis without control constraints; it also naturally reflects the increase of the minimum value due to the control constraints, and the influence of the inhomogeneity h(y) is taken into account as well. Let us describe the algorithm for solving the optimal control problem (1)-(3).
(1) Find a positive definite matrix Q_1 such that Q(y) is also positive definite for all y.
(2) Solve the systems of algebraic and differential equations (17) and (18) to determine the matrices K and W(t) on the interval [t_0, T].
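The computational pattern behind these steps — one constant-coefficient algebraic Riccati solve plus a single backward sweep of a matrix ODE, instead of repeatedly re-integrating a state-dependent Riccati equation — can be sketched as follows. The matrices here are illustrative stand-ins, and since the article's equations (17), (18) for K and W(t) are not reproduced in this extract, the backward sweep uses the standard finite-horizon differential Riccati equation as a surrogate:

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.linalg import solve_continuous_are

# Illustrative constant-coefficient data (not the article's model).
A = np.array([[0.0, 1.0], [-2.0, -1.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])
F = np.eye(2)          # terminal weight
t0, T = 0.0, 5.0

# Step 1: constant matrix from an algebraic Riccati equation.
K = solve_continuous_are(A, B, Q, R)

# Step 2: a single backward integration of a matrix ODE on [t0, T]:
# P' = -(A'P + PA - P B R^{-1} B' P + Q),  P(T) = F.
def riccati_rhs(t, p):
    P = p.reshape(2, 2)
    dP = -(A.T @ P + P @ A - P @ B @ np.linalg.solve(R, B.T) @ P + Q)
    return dP.ravel()

sol = solve_ivp(riccati_rhs, [T, t0], F.ravel(), dense_output=True,
                rtol=1e-8, atol=1e-10)

def gain(t):
    # Time-varying feedback gain R^{-1} B' P(t).
    P = sol.sol(t).reshape(2, 2)
    return np.linalg.solve(R, B.T @ P)
```

Because the coefficients here are constant, both steps are performed once; in the plain SDRE scheme the Riccati data would change with the state, forcing repeated solves — which is precisely the cost the proposed algorithm avoids.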

Numerical Experiments
We consider the optimal control problem for an economic model of a control object consisting of three sectors: i = 0 (material sector), i = 1 (fund-generating sector), and i = 2 (capital sector). The mathematical model includes [15, 16]: (a) three differential equations (34) describing the dynamics of the fund-creating sectors; (b) three Cobb-Douglas-type specific output functions; (c) three balance relations. Here the state of the economic system (the capitalization) is described by the vector (k_0, k_1, k_2), and (s_0, s_1, s_2, θ_0, θ_1, θ_2) is the vector of controls: (s_0, s_1, s_2) are the shares of the sectors in the distribution of investment resources, and (θ_0, θ_1, θ_2) are the shares of the sectors in the distribution of labor resources; x_i is the specific production output in the corresponding sector, and β_i is the direct material cost in the i-th sector, i = 0, 1, 2. The initial state of the system is (k_0(0), k_1(0), k_2(0)). We solve the problem of transferring the nonlinear system from the initial state to the desired state on the time interval [0, T]. The equilibrium state of the system, defined in [12] by (39), is used as the desired final state. The stationary capitalization values (i = 0, 1, 2) in (39) depend on the controls (s_0, s_1, s_2, θ_0, θ_1, θ_2) at which the stationary values are attained [12]. Let us now reduce the mathematical model of the control object (34) to the form (1) and write it as the system (40), using the notation y_1 = k_1 − k_1^s, y_2 = k_2 − k_2^s, y_3 = k_0 − k_0^s, u_1 = s_1 − s_1^s, etc., where the superscript s denotes the stationary values. The initial conditions are y(t_0) = (−700, −300, 300)', and the matrices R, Q_1, K are specified numerically. The results of the system state calculations are shown in Fig. 1a. Figure 1b shows that the optimal controls do not leave the region U defined by the constraints, with lower bounds −0.3477, −0.1024, and −0.07945 for u_1, u_2, and u_3, respectively.
Here, all the control components u_1(t), u_2(t), and u_3(t) lie on the boundary of the domain U on the intervals [0, t_1], [0, t_2], and [0, t_3], respectively, after which they enter the interior of U. The control switchings occur at t_1 = 1.439 for u_1(t), at t_2 = 0.4 for u_2(t), and at t_3 = 9.785 for u_3(t). At the final time T = 20 the state values are y_1(T) = −4.8692 × 10⁻⁶, y_2(T) = −1.7315 × 10⁻⁶, y_3(T) = 0.229 × 10⁻³, and the optimal control values are u_1(T) = 2.2873 × 10⁻⁸, u_2(T) = 6.5404 × 10⁻⁹, u_3(T) = −3.4357 × 10⁻⁷. Figure 2 shows the changes in the resources satisfying the balance relations (36), (37). The investment shares (s_0(t), s_1(t), s_2(t)) and the labor resource shares (θ_0(t), θ_1(t), θ_2(t)) tend to a stationary state as t approaches T = 20.
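The qualitative behavior reported here — controls saturating on the boundary of the box U early on and then entering its interior as the state approaches equilibrium — is easy to reproduce with a toy closed-loop simulation. Everything below (the gain, the dynamics, and the upper bounds) is illustrative; only the lower bounds echo the values quoted above:

```python
import numpy as np

# Box constraints on the control; lower bounds echo the article's
# values, upper bounds are made up for the sketch.
lo = np.array([-0.3477, -0.1024, -0.07945])
hi = np.array([0.5, 0.5, 0.5])

Kf = 0.4 * np.eye(3)      # illustrative feedback gain
A = -0.1 * np.eye(3)      # illustrative stable drift

def closed_loop(y0, dt=0.01, T=20.0):
    """Simulate y' = A y + u with u = clip(-Kf y, lo, hi)
    using explicit Euler (adequate for this toy system)."""
    y = np.array(y0, dtype=float)
    for _ in range(int(T / dt)):
        u = np.clip(-Kf @ y, lo, hi)   # saturate on the box U
        y = y + dt * (A @ y + u)
    return y

yT = closed_loop([-7.0, -3.0, 3.0])
```

For this initial state the feedback demand exceeds the bounds at first, so each control component rides a face of U before dropping into the interior, mirroring the switching behavior described for u_1, u_2, u_3 above.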
CONCLUSIONS
For a particular class of nonlinear controllable systems on a finite interval, we have shown the possibility of applying the SDRE approach to constructing feedback in the constrained control problem by means of the extension principle proposed by Krotov and developed by Gurman; an efficient algorithm is proposed that does not require repeated integration of the Riccati differential equation with state-dependent coefficients.
Numerical experiments illustrating the proposed algorithm for optimal synthesis with control constraints in the form of closed inequalities are also presented, based on the example of a three-sector nonlinear economic system.