Optimal control problem with free final time

In this tutorial, we study an optimal control problem with a free final time tf. The following packages are required:

using LinearAlgebra: norm
using NLPModelsIpopt
using NonlinearSolve
using OptimalControl
using OrdinaryDiffEq
using Plots
using Printf

Definition of the problem

We consider the following setup:

t0 = 0       # Initial time
x0 = [0, 0]  # Initial state
xf = [1, 0]  # Desired final state

@def ocp begin

    tf ∈ R, variable           # Final time is free (a decision variable)
    t ∈ [t0, tf], time         # Time interval depends on tf
    x ∈ R², state              # State vector x = [x₁, x₂]
    u ∈ R, control             # Scalar control

    # Control and time constraints
    -1 ≤ u(t) ≤ 1
    0.05 ≤ tf ≤ Inf            # Lower bound helps numerical convergence

    # Boundary conditions
    x(t0) == x0
    x(tf) == xf

    # Dynamics
    ẋ(t) == [x₂(t), u(t)]

    # Objective: minimize final time
    tf → min
end

The free final time nature of the problem is explicitly stated by:

tf ∈ R, variable

and the time definition

t ∈ [t0, tf], time

indicates that the time horizon itself depends on the optimization variable tf.

Direct resolution

To improve convergence of the direct solver, we constrain tf as follows:

0.05 ≤ tf ≤ Inf

We then solve the optimal control problem with a direct method, using an automatic handling of the free final time:

sol = solve(ocp; grid_size=100)

▫ This is OptimalControl version v1.1.6 running with: direct, adnlp, ipopt.

▫ The optimal control problem is solved with CTDirect version v0.17.4.

   ┌─ The NLP is modelled with ADNLPModels and solved with NLPModelsIpopt.
   │
   ├─ Number of time steps⋅: 100
   └─ Discretisation scheme: midpoint

▫ This is Ipopt version 3.14.19, running with linear solver MUMPS 5.8.1.

Number of nonzeros in equality constraint Jacobian...:      904
Number of nonzeros in inequality constraint Jacobian.:        0
Number of nonzeros in Lagrangian Hessian.............:      201

Total number of variables............................:      303
                     variables with only lower bounds:        1
                variables with lower and upper bounds:      100
                     variables with only upper bounds:        0
Total number of equality constraints.................:      204
Total number of inequality constraints...............:        0
        inequality constraints with only lower bounds:        0
   inequality constraints with lower and upper bounds:        0
        inequality constraints with only upper bounds:        0

iter    objective    inf_pr   inf_du lg(mu)  ||d||  lg(rg) alpha_du alpha_pr  ls
   0  1.0000000e-01 9.00e-01 0.00e+00   0.0 0.00e+00    -  0.00e+00 0.00e+00   0
   1  3.5801080e-01 8.77e-01 1.22e+02   1.0 1.05e+01    -  3.05e-01 2.58e-02f  3
   2  8.8421252e-01 8.25e-01 5.41e+02   0.4 9.70e+00    -  8.29e-02 5.87e-02h  2
   3  1.4678934e+00 6.93e-01 1.76e+02   0.6 3.64e+00    -  1.00e+00 1.60e-01h  1
   4  3.2773408e+00 7.30e-03 1.40e+02   0.5 1.81e+00    -  9.77e-01 1.00e+00f  1
   5  2.8033485e+00 5.21e-04 1.71e+01  -1.0 4.74e-01    -  9.79e-01 1.00e+00h  1
   6  2.6891402e+00 4.60e-05 9.92e-02  -2.2 1.14e-01   0.0 1.00e+00 1.00e+00h  1
   7  2.2992781e+00 1.01e-03 1.33e-01  -2.9 3.90e-01    -  1.00e+00 1.00e+00f  1
   8  2.0121466e+00 1.61e-03 2.23e-02  -3.3 5.60e-01    -  9.90e-01 1.00e+00h  1
   9  2.0190855e+00 1.44e-04 2.22e-03  -3.7 6.95e-01    -  9.85e-01 9.08e-01h  1
iter    objective    inf_pr   inf_du lg(mu)  ||d||  lg(rg) alpha_du alpha_pr  ls
  10  2.0056026e+00 8.34e-05 6.84e-04  -5.3 8.80e-01    -  7.56e-01 7.58e-01h  1
  11  2.0005424e+00 4.50e-05 1.99e-05  -5.3 8.89e-01    -  9.86e-01 1.00e+00h  1
  12  2.0000009e+00 2.13e-07 3.71e-07 -11.0 3.65e-02    -  9.82e-01 1.00e+00h  1
  13  2.0000000e+00 4.91e-12 5.80e-13 -11.0 5.49e-04    -  1.00e+00 1.00e+00h  1

Number of Iterations....: 13

                                   (scaled)                 (unscaled)
Objective...............:   1.9999999910286173e+00    1.9999999910286173e+00
Dual infeasibility......:   5.8015213681809998e-13    5.8015213681809998e-13
Constraint violation....:   4.9146242631081805e-12    4.9146242631081805e-12
Variable bound violation:   9.4892587121364613e-09    9.4892587121364613e-09
Complementarity.........:   1.6476668104042548e-11    1.6476668104042548e-11
Overall NLP error.......:   1.6476668104042548e-11    1.6476668104042548e-11


Number of objective function evaluations             = 19
Number of objective gradient evaluations             = 14
Number of equality constraint evaluations            = 19
Number of inequality constraint evaluations          = 0
Number of equality constraint Jacobian evaluations   = 14
Number of inequality constraint Jacobian evaluations = 0
Number of Lagrangian Hessian evaluations             = 13
Total seconds in IPOPT                               = 3.257

EXIT: Optimal Solution Found.

The solution can be visualized as:

plt = plot(sol; label="direct", size=(800, 600))

Verification of the results

Using Pontryagin's Maximum Principle, we know that the optimal trajectory consists of two arcs: one with $u=1$ and one with $u=-1$. Without further calculations, the optimal trajectory is:

\[x_1(t) = \begin{cases} \tfrac{1}{2}t^2, & t \in [0,1),\\ -\tfrac{1}{2}t^2 + 2t - 1, & t \in [1,2], \end{cases} \quad x_2(t) = \begin{cases} t, & t \in [0,1),\\ 2-t, & t \in [1,2]. \end{cases}\]

The optimal control is:

\[u(t) = \begin{cases} 1, & t \in [0,1),\\ -1, & t \in [1,2]. \end{cases}\]

And the costate, the switching time and the final time are:

\[p_1(t)=1, \quad p_2(t)=1-t, \quad p^0=-1, \quad t_1 = 1, \quad t_f = 2.\]

We can now compare the direct numerical solution with this theoretical result:

tf = variable(sol)
u = control(sol)
p = costate(sol)
x = state(sol)
H(t) = p(t)[1]*x(t)[2] + p(t)[2]*u(t)

@printf("tf = %.5f", tf); println(lpad("(expected 2)", 33))
@printf("H(t0) = %.5f", H(t0)); println(lpad("(expected 1)", 30))
@printf("H(tf) = %.5f", H(tf)); println(lpad("(expected 1)", 30))
@printf("x(t0) = [%.5f, %.5f]", x(t0)[1], x(t0)[2]); println(lpad("(expected [0, 0])", 23))
@printf("x(tf) = [%.5f, %.5f]", x(tf)[1], x(tf)[2]); println(lpad("(expected [1, 0])", 23))
@printf("p(t0) = [%.5f, %.5f]", p(t0)[1], p(t0)[2]); println(lpad("(expected [1, 1])", 24))
@printf("p(tf) = [%.5f, %.5f]", p(tf)[1], p(tf)[2]); println(lpad("(expected [1, -1])", 24))

tf = 2.00000                     (expected 2)
H(t0) = 0.99568                  (expected 1)
H(tf) = 1.00432                  (expected 1)
x(t0) = [0.00000, -0.00000]      (expected [0, 0])
x(tf) = [1.00000, 0.00000]      (expected [1, 0])
p(t0) = [1.00000, 0.99568]       (expected [1, 1])
p(tf) = [1.00000, -1.00432]      (expected [1, -1])

The numerical results match the theoretical solution almost exactly.

Indirect method

We now solve the same problem using an indirect method (shooting approach). We begin by defining the pseudo-Hamiltonian and the switching function.

# Pseudo-Hamiltonian
H(x, p, u) = p[1]*x[2] + p[2]*u

# Hamiltonian lift used to compute the switching function
F1(x) = [0, 1]
H1 = Lift(F1)

Define the flows corresponding to the two control laws $u=+1$ and $u=-1$:

const u_pos = 1
const u_neg = -1

# Each flow depends on tf since it is part of the optimization variables
f_pos = Flow(ocp, (x, p, tf) -> u_pos)
f_neg = Flow(ocp, (x, p, tf) -> u_neg)

The shooting function enforces the final and switching conditions:

function shoot!(s, p0, t1, tf)
    x_t0 = x0
    p_t0 = p0

    # Forward integration under u = +1, then u = -1
    x_t1, p_t1 = f_pos(t0, x_t0, p_t0, t1)
    x_tf, p_tf = f_neg(t1, x_t1, p_t1, tf)

    u_tf = -1  # Final control

    # Shooting conditions:
    s[1:2] = x_tf - xf                # Terminal state must match xf
    s[3]   = H(x_tf, p_tf, u_tf) - 1  # Transversality condition
    s[4]   = H1(x_t1, p_t1)           # Switching condition φ(t₁)=0
end

To help the nonlinear solver converge, we build a good initial guess from the direct solution:

t = time_grid(sol)
x = state(sol)
p = costate(sol)
φ(t) = H1(x(t), p(t))  # Switching function

p0 = p(t0)
t1 = t[argmin(abs.(φ.(t)))]
tf = t[end]

println("p0 = ", p0)
println("t1 = ", t1)
println("tf = ", tf)

# Evaluate the norm of the shooting function at initial guess
s = zeros(4)
shoot!(s, p0, t1, tf)
println("\n‖s‖ (initial guess) = ", norm(s), "\n")

p0 = [0.9999999955091624, 0.9956848379593445]
t1 = 0.9999999955143086
tf = 1.9999999910286173

‖s‖ (initial guess) = 0.006102541639648192

We can now solve the system using an indirect shooting method:

# Aggregated nonlinear system
shoot!(s, ξ, λ) = shoot!(s, ξ[1:2], ξ[3], ξ[4])

# Define the problem and initial guess
ξ_guess = [p0..., t1, tf]
prob = NonlinearProblem(shoot!, ξ_guess)

# Solve the nonlinear system
indirect_sol = solve(prob; show_trace=Val(true), abstol=1e-8, reltol=1e-8)


Algorithm: NewtonRaphson(
    descent = NewtonDescent(),
    autodiff = AutoForwardDiff(),
    vjp_autodiff = AutoReverseDiff(
        compile = false
    ),
    jvp_autodiff = AutoForwardDiff(),
    concrete_jac = Val{false}()
)

----    	-------------       	-----------
Iter    	f(u) inf-norm       	Step 2-norm
----    	-------------       	-----------
0       	4.31515306e-03      	0.00000000e+00
1       	7.34009524e-16      	4.31516204e-03
Final   	7.34009524e-16
----------------------

retcode: Success
u: 4-element Vector{Float64}:
 1.0
 1.0
 1.0000000000000004
 2.0000000000000004

Finally, we compare the indirect solution to the direct one:

p0 = indirect_sol.u[1:2]
t1 = indirect_sol.u[3]
tf = indirect_sol.u[4]

# Reconstruct the full trajectory
f = f_pos * (t1, f_neg)
flow_sol = f((t0, tf), x0, p0; saveat=range(t0, tf, 200))

# Plot comparison
plot!(plt, flow_sol; label="indirect", color=2)