Indirect simple shooting

In this tutorial we present the indirect simple shooting method on a simple example.

Let us start by importing the necessary packages.

using OptimalControl    # to define the optimal control problem and its flow
using OrdinaryDiffEq    # to get the Flow function from OptimalControl
using NonlinearSolve    # interface to NLE solvers
using MINPACK           # NLE solver: use to solve the shooting equation
using Plots             # to plot the solution

Optimal control problem

Let us consider the following optimal control problem:

\[\left\{ \begin{array}{l} \min \displaystyle \frac{1}{2} \int_{t_0}^{t_f} u^2(t) \, \mathrm{d} t\\[1.0em] \dot{x}(t) = \displaystyle -x(t)+\alpha x^2(t)+u(t), \quad u(t) \in \R, \quad t \in [t_0, t_f] \text{ a.e.}, \\[0.5em] x(t_0) = x_0, \quad x(t_f) = x_f, \end{array} \right.%\]

with $t_0 = 0$, $t_f = 1$, $x_0 = -1$, $x_f = 0$, $\alpha=1.5$ and $\forall\, t \in [t_0, t_f]$, $x(t) \in \R$.

t0 = 0
tf = 1
x0 = -1
xf = 0
α  = 1.5
ocp = @def begin

    t ∈ [t0, tf], time
    x ∈ R, state
    u ∈ R, control

    x(t0) == x0
    x(tf) == xf

    ẋ(t) == -x(t) + α * x(t)^2 + u(t)

    ∫( 0.5u(t)^2 ) → min

end;

Boundary value problem

The pseudo-Hamiltonian of this problem is

\[ H(x,p,u) = p \, (-x+\alpha x^2+u) + p^0 u^2 /2,\]

where $p^0 = -1$ since we are in the normal case. From the Pontryagin Maximum Principle, the maximising control is given by

\[u(x, p) = p\]

since $\partial^2_{uu} H = p^0 = - 1 < 0$. Plugging this control in feedback form into the pseudo-Hamiltonian, and considering the limit conditions, we obtain the following two-points boundary value problem (BVP).

\[ \left\{ \begin{array}{l} \dot{x}(t) = \phantom{-} \nabla_p H[t] = -x(t) + \alpha x^2(t) + u(x(t), p(t)) = -x(t) + \alpha x^2(t) + p(t), \\[0.5em] \dot{p}(t) = - \nabla_x H[t] = (1 - 2 \alpha x(t))\, p(t), \\[0.5em] x(t_0) = x_0, \quad x(t_f) = x_f, \end{array} \right.\]

where $[t]~= (x(t),p(t),u(x(t), p(t)))$.

Our goal

Our goal is to solve this (BVP). Solving (BVP) consists in solving the Pontryagin Maximum Principle which provides necessary conditions of optimality.

Shooting function

To achive our goal, let us first introduce the pseudo-Hamiltonian vector field

\[ \vec{H}(z,u) = \left( \nabla_p H(z,u), -\nabla_x H(z,u) \right), \quad z = (x,p),\]

and then denote by $\varphi_{t_0, x_0, p_0}(\cdot)$ the solution of the following Cauchy problem

\[\dot{z}(t) = \vec{H}(z(t), u(z(t))), \quad z(t_0) = (x_0, p_0).\]

Our goal becomes to solve

\[\pi( \varphi_{t_0, x_0, p_0}(t_f) ) = x_f,\]

where $\pi(x, p) = x$. To compute $\varphi$ with OptimalControl.jl package, we define the flow of the associated Hamiltonian vector field by:

u(x, p) = p
φ = Flow(ocp, u)

We define also the projection function on the state space.

π((x, p)) = x

Nota bene

Actually, $\varphi_{t_0, x_0, p_0}(\cdot)$ is also solution of

\[ \dot{z}(t) = \vec{\mathbf{H}}(z(t)), \quad z(t_0) = (x_0, p_0),\]

where $\mathbf{H}(z) = H(z, u(z))$ and $\vec{\mathbf{H}} = (\nabla_p \mathbf{H}, -\nabla_x \mathbf{H})$. This is what is actually computed by Flow.

Now, to solve the (BVP) we introduce the shooting function:

\[ \begin{array}{rlll} S \colon & \R & \longrightarrow & \R \\ & p_0 & \longmapsto & S(p_0) = \pi( \varphi_{t_0, x_0, p_0}(t_f) ) - x_f. \end{array}\]

S(p0) = π( φ(t0, x0, p0, tf) ) - xf    # shooting function

Resolution of the shooting equation

At the end, solving (BVP) is equivalent to solve $S(p_0) = 0$. This is what we call the indirect simple shooting method. We define an initial guess.

ξ = [ 0.1 ]    # initial guess

NonlinearSolve.jl

We first use the NonlinearSolve.jl package to solve the shooting equation. Let us define the problem.

nle! = (s, ξ, λ) -> s[1] = S(ξ[1])    # auxiliary function
prob = NonlinearProblem(nle!, ξ)      # NLE problem with initial guess

Let us do some benchmarking.

using BenchmarkTools
@benchmark solve(prob; show_trace=Val(false))

BenchmarkTools.Trial: 574 samples with 1 evaluation per sample.
 Range (min … max):  6.790 ms … 152.417 ms  ┊ GC (min … max): 0.00% … 94.85%
 Time  (median):     8.046 ms               ┊ GC (median):    0.00%
 Time  (mean ± σ):   8.708 ms ±   8.304 ms  ┊ GC (mean ± σ):  8.22% ±  8.37%

   ▃                     █                                     
  ▆██▆▄▄▅▄▃▄▃▃▂▁▂▃▃▃▂▂▂▁▄██▄▂▂▂▁▁▁▁▁▁▁▁▂▁▁▁▁▂▁▁▁▁▄▄▇▅▄▃▂▃▃▃▄▃ ▃
  6.79 ms         Histogram: frequency by time        9.99 ms <

 Memory estimate: 6.19 MiB, allocs estimate: 161324.

For small nonlinear systems, it could be faster to use the SimpleNewtonRaphson() descent algorithm.

@benchmark solve(prob, SimpleNewtonRaphson(); show_trace=Val(false))

BenchmarkTools.Trial: 800 samples with 1 evaluation per sample.
 Range (min … max):  4.615 ms … 152.229 ms  ┊ GC (min … max): 0.00% … 96.43%
 Time  (median):     6.102 ms               ┊ GC (median):    0.00%
 Time  (mean ± σ):   6.247 ms ±   7.212 ms  ┊ GC (mean ± σ):  9.36% ±  8.27%

  ▃▃                 ██                                        
  ██▆▄▃▄▄▂▂▃▂▂▃▃▂▃▁▂▅███▃▃▂▁▃▃▂▁▁▁▁▁▁▁▁▁▁▁▁▁▂▁▁▁▁▁▂▁▁▁▂▁▂▁▂▂▂ ▃
  4.61 ms         Histogram: frequency by time        9.15 ms <

 Memory estimate: 4.57 MiB, allocs estimate: 99900.

Now, let us solve the problem and retrieve the initial costate solution.

indirect_sol = solve(prob; show_trace=Val(true))      # resolution of S(p0) = 0
p0_sol = indirect_sol.u[1]                            # costate solution
println("\ncostate:    p0 = ", p0_sol)
println("shoot: |S(p0)| = ", abs(S(p0_sol)), "\n")


Algorithm: NewtonRaphson(
   descent = NewtonDescent()
)

----     -------------        -----------
Iter     f(u) inf-norm        Step 2-norm
----     -------------        -----------
0        6.71224852e-02       0.00000000e+00
1        1.58897952e-03       2.86166386e-02
2        8.72173219e-07       6.46258281e-04
3        2.63049524e-13       3.55113667e-07
Final    2.63049524e-13
----------------------

costate:    p0 = 0.07202997482156899
shoot: |S(p0)| = 2.6304952369248135e-13

MINPACK.jl

Instead of the NonlinearSolve.jl package we can use the MINPACK.jl package to solve the shooting equation. To compute the Jacobian of the shooting function we use the DifferentiationInterface.jl package with ForwardDiff.jl backend.

using DifferentiationInterface
import ForwardDiff
backend = AutoForwardDiff()

Let us define the problem to solve.

nle!  = ( s, ξ) -> s[1] = S(ξ[1])                                 # auxiliary function
jnle! = (js, ξ) -> jacobian!(nle!, similar(ξ), js, backend, ξ)    # Jacobian of nle

We are now in position to solve the problem with the hybrj solver from MINPACK.jl through the fsolve function, providing the Jacobian. Let us do some benchmarking.

@benchmark fsolve(nle!, jnle!, ξ; show_trace=false)    # initial guess given to the solver

BenchmarkTools.Trial: 771 samples with 1 evaluation per sample.
 Range (min … max):  4.989 ms … 167.159 ms  ┊ GC (min … max): 0.00% … 92.99%
 Time  (median):     5.276 ms               ┊ GC (median):    0.00%
 Time  (mean ± σ):   6.485 ms ±   7.670 ms  ┊ GC (mean ± σ):  7.81% ±  6.68%

   █▁                                                          
  ▇██▄▃▂▂▂▂▁▂▁▁▁▂▄▅▂▃▄█▄▅▅▂▂▂▁▁▂▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▂▁▂▁▁▁▁▂▁▂▂▁▂ ▂
  4.99 ms         Histogram: frequency by time        10.4 ms <

 Memory estimate: 4.18 MiB, allocs estimate: 120014.

We can also use the preparation step of DifferentiationInterface.jl.

extras = prepare_jacobian(nle!, similar(ξ), backend, ξ)
jnle_prepared!(js, ξ) = jacobian!(nle!, similar(ξ), js, backend, ξ, extras)
@benchmark fsolve(nle!, jnle_prepared!, ξ; show_trace=false)

BenchmarkTools.Trial: 774 samples with 1 evaluation per sample.
 Range (min … max):  4.986 ms … 155.931 ms  ┊ GC (min … max): 0.00% … 95.91%
 Time  (median):     5.321 ms               ┊ GC (median):    0.00%
 Time  (mean ± σ):   6.456 ms ±   7.444 ms  ┊ GC (mean ± σ):  7.83% ±  6.72%

   █▃                                                          
  ▆██▇▄▃▃▂▂▂▁▂▂▁▂▁▁▁▂▅▄▃▁▃▆█▅▄▅▃▂▂▂▁▁▁▂▂▂▂▁▂▁▂▂▁▁▁▁▁▁▂▁▁▁▁▁▁▂ ▃
  4.99 ms         Histogram: frequency by time        9.36 ms <

 Memory estimate: 4.18 MiB, allocs estimate: 120009.

Now, let us solve the problem and retrieve the initial costate solution.

indirect_sol = fsolve(nle!, jnle!, ξ; show_trace=true)    # resolution of S(p0) = 0
p0_sol = indirect_sol.x[1]                                # costate solution
println("\ncostate:    p0 = ", p0_sol)
println("shoot: |S(p0)| = ", abs(S(p0_sol)), "\n")

Iter     f(x) inf-norm    Step 2-norm      Step time
------   --------------   --------------   --------------
     1     6.712249e-02     0.000000e+00         0.000789
     2     1.588980e-03     8.189120e-04         0.001745
     3     3.722710e-05     4.379408e-07         0.000727
     4     2.043880e-08     2.294999e-10         0.000719
     5     2.629559e-13     6.925509e-17         0.000699
     6     1.363970e-16     1.146337e-26         0.000710

costate:    p0 = 0.07202997482167604
shoot: |S(p0)| = 1.3639701760190786e-16

Plot of the solution

The solution can be plot calling first the flow.

sol = φ((t0, tf), x0, p0_sol)
plot(sol)

In the indirect shooting method, the research of the optimal control is replaced by the computation of its associated extremal. This computation is equivalent to finding the initial covector solution to the shooting function. Let us plot the extremal in the phase space and the shooting function with the solution.

pretty_plot — Function

using Plots.PlotMeasures

exp(p0; saveat=[]) = φ((t0, tf), x0, p0, saveat=saveat)

function pretty_plot(S, p0; Np0=20, kwargs...)

    times = range(t0, tf, length=2)
    p0_min = -0.5
    p0_max = 2
    p0_sol = p0

    # plot of the flow in phase space
    plt_flow = plot()
    p0s = range(p0_min, p0_max, length=Np0)
    for i ∈ eachindex(p0s)
        sol = exp(p0s[i])
        x = [state(sol)(t)   for t ∈ time_grid(sol)]
        p = [costate(sol)(t) for t ∈ time_grid(sol)]
        label = i==1 ? "extremals" : false
        plot!(plt_flow, x, p, color=:blue, label=label)
    end

    # plot of wavefronts in phase space
    p0s = range(p0_min, p0_max, length=200)
    xs  = zeros(length(p0s), length(times))
    ps  = zeros(length(p0s), length(times))
    for i ∈ eachindex(p0s)
        sol = exp(p0s[i], saveat=times)
        xs[i, :] .= state(sol).(times)
        ps[i, :] .= costate(sol).(times)
    end
    for j ∈ eachindex(times)
        label = j==1 ? "flow at times" : false
        plot!(plt_flow, xs[:, j], ps[:, j], color=:green, linewidth=2, label=label)
    end

    # 
    plot!(plt_flow, xlims=(-1.1, 1), ylims=(p0_min, p0_max))
    plot!(plt_flow, [0, 0], [p0_min, p0_max], color=:black, xlabel="x", ylabel="p", label="x=xf")
    
    # solution
    sol = exp(p0_sol)
    x = [state(sol)(t)   for t ∈ time_grid(sol)]
    p = [costate(sol)(t) for t ∈ time_grid(sol)]
    plot!(plt_flow, x, p, color=:red, linewidth=2, label="extremal solution")
    plot!(plt_flow, [x[end]], [p[end]], seriestype=:scatter, color=:green, label=false)

    # plot of the shooting function 
    p0s = range(p0_min, p0_max, length=200)
    plt_shoot = plot(xlims=(p0_min, p0_max), ylims=(-2, 4), xlabel="p₀", ylabel="y")
    plot!(plt_shoot, p0s, S, linewidth=2, label="S(p₀)", color=:green)
    plot!(plt_shoot, [p0_min, p0_max], [0, 0], color=:black, label="y=0")
    plot!(plt_shoot, [p0_sol, p0_sol], [-2, 0], color=:black, label="p₀ solution", linestyle=:dash)
    plot!(plt_shoot, [p0_sol], [0], seriestype=:scatter, color=:green, label=false)

    # final plot
    plot(plt_flow, plt_shoot; layout=(1,2), leftmargin=15px, bottommargin=15px, kwargs...)

end

pretty_plot(S, p0_sol; size=(800, 450))