Discretisation methods

In this tutorial, we will explore the different discretisation methods available in the OptimalControl.jl package. These methods are used to convert a continuous-time optimal control problem (OCP) into a discrete-time nonlinear programming (NLP) problem, which can then be solved using numerical optimization techniques.

Let us import the necessary packages and define the optimal control problem (Goddard problem) we will use as an example throughout this tutorial.

using BenchmarkTools  # for benchmarking
using DataFrames      # to store the results
using OptimalControl  # to define the optimal control problem and more
using NLPModels       # to retrieve information about the NLP problem
using NLPModelsIpopt  # to solve the problem via a direct method
using Plots           # to plot the solution

t0 = 0      # initial time
r0 = 1      # initial altitude
v0 = 0      # initial speed
m0 = 1      # initial mass
vmax = 0.1  # maximal authorized speed
mf = 0.6    # final mass to target

# Goddard problem function
function goddard_problem()
    ocp = @def begin # definition of the optimal control problem

        tf ∈ R, variable
        t ∈ [t0, tf], time
        x = (r, v, m) ∈ R³, state
        u ∈ R, control

        x(t0) == [r0, v0, m0]
        m(tf) == mf
        0 ≤ u(t) ≤ 1
        r(t) ≥ r0
        0 ≤ v(t) ≤ vmax

        ẋ(t) == F0(x(t)) + u(t) * F1(x(t))

        r(tf) → max

    end
    return ocp
end

ocp = goddard_problem()

# Dynamics
Cd = 310
Tmax = 3.5
β = 500
b = 2

F0(x) = begin
    # Uncontrolled dynamics: gravity and drag
    r, v, m = x
    D = Cd * v^2 * exp(-β*(r - 1)) # Aerodynamic drag
    return [ v, -D/m - 1/r^2, 0 ]   # [dr/dt, dv/dt, dm/dt]
end

F1(x) = begin
    # Control dynamics: thrust contribution
    r, v, m = x
    return [ 0, Tmax/m, -b*Tmax ]   # [dr/dt, dv/dt, dm/dt] due to thrust
end
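As a quick sanity check of the vector fields, we can evaluate them at the initial state: at zero speed the drag vanishes, leaving only gravity in F0. The constants and definitions are duplicated from above so that the block runs standalone.

```julia
# Standalone sanity check: constants and dynamics duplicated from the tutorial
Cd, Tmax, β, b = 310, 3.5, 500, 2

F0(x) = begin
    r, v, m = x
    D = Cd * v^2 * exp(-β*(r - 1))  # aerodynamic drag
    return [v, -D/m - 1/r^2, 0]
end

F1(x) = begin
    r, v, m = x
    return [0, Tmax/m, -b*Tmax]
end

x_init = [1.0, 0.0, 1.0]  # (r0, v0, m0)
println(F0(x_init))  # [0.0, -1.0, 0.0] : no drag at v = 0, gravity -1/r² = -1
println(F1(x_init))  # [0.0, 3.5, -7.0] : thrust Tmax/m, mass flow -b*Tmax
```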

Discretisation schemes

When calling solve, the option scheme=... can be used to specify the discretisation scheme. In addition to the default implicit :midpoint method, other available options include the implicit :trapeze method (also known as Crank–Nicolson), and Gauss–Legendre collocation schemes with 2 and 3 stages: :gauss_legendre_2 and :gauss_legendre_3, of order 4 and 6 respectively. Note that higher-order methods typically result in larger NLP problems for the same number of time steps, and their accuracy also depends on the smoothness of the problem.
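To fix ideas, for dynamics ẋ(t) = f(x(t), u(t)) and step length h = tₖ₊₁ − tₖ, the two one-stage schemes impose the following conditions at each step (the exact treatment of the control in the implementation may differ slightly):

```latex
\begin{aligned}
\text{implicit midpoint:} \quad & x_{k+1} = x_k + h\, f\!\left(\frac{x_k + x_{k+1}}{2},\, u_k\right),\\[4pt]
\text{trapeze (Crank--Nicolson):} \quad & x_{k+1} = x_k + \frac{h}{2}\left(\, f(x_k, u_k) + f(x_{k+1}, u_{k+1}) \,\right).
\end{aligned}
```

Both are of order 2. More generally, an s-stage Gauss–Legendre collocation scheme is of order 2s, consistent with the orders 4 and 6 quoted above for the 2- and 3-stage variants.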

Let us first solve the problem with the default :midpoint method and display the solution.

sol = solve(ocp; scheme=:midpoint, display=false)
@assert successful(sol) "Solution failed"
plot(sol; size=(800, 800))

Let us now compare different discretisation schemes to evaluate their accuracy and performance. See the Strategy options section for the default values of parameters such as tol.

# Solve the problem with different discretisation schemes
solutions = []
data = DataFrame(
    Scheme=Symbol[],
    Time=Float64[],
    Objective=Float64[],
    Iterations=Int[]
)
schemes = [
    :euler,
    :euler_implicit,
    :trapeze,
    :midpoint,
    :gauss_legendre_2,
    :gauss_legendre_3
]
for scheme in schemes
    bt = @btimed solve($ocp; scheme=$scheme, tol=1e-8, display=false)
    local sol = bt.value
    #@assert successful(sol) "Solution failed for scheme=$scheme"
    push!(solutions, (scheme, sol))
    push!(data, (
        Scheme=scheme,
        Time=bt.time,
        Objective=objective(sol),
        Iterations=iterations(sol)
    ))
end
println(data)
6×4 DataFrame
 Row │ Scheme            Time      Objective  Iterations
     │ Symbol            Float64   Float64    Int64
─────┼───────────────────────────────────────────────────
   1 │ euler             0.239747    1.01242          30
   2 │ euler_implicit    0.239095    1.01273          29
   3 │ trapeze           0.279099    1.01258          30
   4 │ midpoint          0.341014    1.01258          31
   5 │ gauss_legendre_2  0.901779    1.01258          21
   6 │ gauss_legendre_3  2.03535     1.01258          21
# Plot the results
x_style = (legend=:none,)
p_style = (legend=:none,)
u_style = (legend=:topright,)
styles = (state_style=x_style, costate_style=p_style, control_style=u_style)

plt = plot()
for (i, (scheme, sol)) in enumerate(solutions)
    plt = plot!(sol; label=string(scheme), color=i, styles...)
end
plot(plt; size=(800, 800))

Large problems and AD backend

For some large problems, you may notice that the solver takes a long time before the iterations actually begin. This is due to the computation of sparse derivatives — specifically, the Jacobian of the constraints and the Hessian of the Lagrangian — which can be quite costly. One possible alternative is to set the option backend=:manual in the modeler, which uses simpler sparsity patterns. The resulting matrices are faster to compute but are also less sparse, so this represents a trade-off between automatic differentiation (AD) preparation time and the efficiency of the optimization itself.

disc = OptimalControl.Collocation(grid_size=1000, scheme=:gauss_legendre_3)
mod = OptimalControl.ADNLP(backend=:manual)
sol = OptimalControl.Ipopt()
solve(ocp; discretizer=disc, modeler=mod, solver=sol)
▫ This is OptimalControl 2.0.1, solving with: collocation / adnlp / ipopt (cpu)

  📦 Configuration:
   ├─ Discretizer: collocation (grid_size = 1000, scheme = gauss_legendre_3)
   ├─ Modeler: adnlp (backend = manual)
   └─ Solver: ipopt

▫ This is Ipopt version 3.14.19, running with linear solver MUMPS 5.8.2.

Number of nonzeros in equality constraint Jacobian...:   174028
Number of nonzeros in inequality constraint Jacobian.:        0
Number of nonzeros in Lagrangian Hessian.............:   104022

Total number of variables............................:    13004
                     variables with only lower bounds:     1001
                variables with lower and upper bounds:     2001
                     variables with only upper bounds:        0
Total number of equality constraints.................:    12004
Total number of inequality constraints...............:        0
        inequality constraints with only lower bounds:        0
   inequality constraints with lower and upper bounds:        0
        inequality constraints with only upper bounds:        0

iter    objective    inf_pr   inf_du lg(mu)  ||d||  lg(rg) alpha_du alpha_pr  ls
   0  1.0100000e+00 2.22e+00 2.00e+00   0.0 0.00e+00    -  0.00e+00 0.00e+00   0
   1  1.0359714e+00 2.59e+00 2.26e+04  -1.0 6.29e+00    -  1.26e-01 9.81e-01f  1
   2  1.0421591e+00 2.58e+00 2.25e+04   0.5 1.61e+02    -  1.00e+00 5.62e-03f  1
   3  1.0581309e+00 2.37e+00 1.98e+05   1.6 2.29e+01    -  1.00e+00 7.89e-02f  1
   4  1.0297827e+00 1.37e+00 5.30e+05   2.3 1.01e+01    -  9.95e-01 6.06e-01h  1
   5  1.0168610e+00 7.22e-01 1.25e+06   2.3 4.82e+00    -  5.95e-01 9.84e-01h  1
   6  1.0101725e+00 8.73e-02 2.61e+07   2.3 2.06e+00    -  1.00e+00 8.67e-01h  1
   7  1.0088127e+00 5.28e-03 7.50e+05   1.6 6.07e-01    -  1.00e+00 9.97e-01h  1
   8  1.0088898e+00 7.84e-05 1.14e+03   0.2 9.19e-02    -  1.00e+00 1.00e+00f  1
   9  1.0088804e+00 2.32e-06 1.10e+01  -1.5 2.05e-02    -  1.00e+00 1.00e+00f  1
iter    objective    inf_pr   inf_du lg(mu)  ||d||  lg(rg) alpha_du alpha_pr  ls
  10  1.0088776e+00 2.71e-07 9.58e-02  -3.6 7.53e-03    -  1.00e+00 1.00e+00h  1
  11  1.0089150e+00 1.08e-05 1.50e+01  -5.5 1.88e-02    -  9.99e-01 1.00e+00h  1
  12  1.0109999e+00 4.39e-02 1.70e-03  -6.1 8.30e-01    -  1.00e+00 1.00e+00f  1
  13  1.0121802e+00 3.92e-02 1.06e+01  -6.6 9.51e-01    -  1.00e+00 7.90e-01h  1
  14  1.0124916e+00 9.60e-03 1.58e-03  -7.0 7.90e-01    -  1.00e+00 1.00e+00h  1
  15  1.0125501e+00 3.93e-03 3.38e-01  -7.5 6.78e-01    -  1.00e+00 9.48e-01h  1
  16  1.0125685e+00 2.07e-03 4.53e-06  -8.0 6.06e-01    -  1.00e+00 1.00e+00h  1
  17  1.0125743e+00 9.40e-04 1.74e-02  -8.6 5.46e-01    -  1.00e+00 9.77e-01h  1
  18  1.0125756e+00 4.41e-04 2.86e-02  -9.2 6.91e-01    -  1.00e+00 8.46e-01h  1
  19  1.0125759e+00 1.75e-04 1.25e-07  -9.3 6.41e-01    -  1.00e+00 1.00e+00h  1
iter    objective    inf_pr   inf_du lg(mu)  ||d||  lg(rg) alpha_du alpha_pr  ls
  20  1.0125762e+00 1.07e-04 3.45e-08 -10.0 3.60e-01    -  1.00e+00 1.00e+00h  1
  21  1.0125763e+00 9.21e-05 1.70e-08 -11.0 6.59e-01    -  1.00e+00 1.00e+00h  1
  22  1.0125763e+00 2.24e-05 2.34e-09 -11.0 3.10e-01    -  1.00e+00 1.00e+00h  1

Number of Iterations....: 22

                                   (scaled)                 (unscaled)
Objective...............:  -1.0125763121311362e+00    1.0125763121311362e+00
Dual infeasibility......:   2.3436139230607403e-09    2.3436139230607403e-09
Constraint violation....:   5.4074100752643517e-10    2.2351828121658368e-05
Variable bound violation:   1.9198332901509185e-31    1.9198332901509185e-31
Complementarity.........:   1.7112642826483086e-11    1.7112642826483086e-11
Overall NLP error.......:   2.3436139230607403e-09    2.2351828121658368e-05


Number of objective function evaluations             = 23
Number of objective gradient evaluations             = 23
Number of equality constraint evaluations            = 23
Number of inequality constraint evaluations          = 0
Number of equality constraint Jacobian evaluations   = 23
Number of inequality constraint Jacobian evaluations = 0
Number of Lagrangian Hessian evaluations             = 22
Total seconds in IPOPT                               = 14.305

EXIT: Optimal Solution Found.

This explicit approach with strategy instances is less high-level than the descriptive method used earlier. It provides more control over the solving process and is detailed in the explicit mode documentation.

Let us now compare the performance of the two backends. We will use the @btimed macro from the BenchmarkTools package, together with the lower-level approach and explicit strategy instances, to measure separately the time spent preparing the NLP problem and the time spent in the solver. Separating these measurements shows whether the bottleneck lies in the discretisation/modeling phase (which depends on the AD backend) or in the solver phase (which depends on the sparsity of the matrices).

# DataFrame to store the results
data = DataFrame(
    Backend=Symbol[],
    NNZO=Int[],
    NNZJ=Int[],
    NNZH=Int[],
    PrepTime=Float64[],
    ExecTime=Float64[],
    Objective=Float64[],
    Iterations=Int[]
)

# The different AD backends to test
backends = [:optimized, :manual]

# Loop over the backends
for adnlp_backend in backends

    # Create strategy instances
    discretizer = OptimalControl.Collocation(grid_size=1000, scheme=:gauss_legendre_3)
    modeler = OptimalControl.ADNLP(backend=adnlp_backend)
    solver = OptimalControl.Ipopt(print_level=0)

    # Initial guess
    init = build_initial_guess(ocp, nothing)

    # Discretize and build NLP model
    # This lower-level approach with discretize/nlp_model/solve is detailed in the [tutorial on direct transcription](@ref tutorial-nlp)
    bt = @btimed begin
        docp = discretize($ocp, $discretizer)
        nlp_model(docp, $init, $modeler)
    end
    prepa_time = bt.time
    nlp = bt.value

    # Get the number of non-zero elements
    nnzo = get_nnzo(nlp) # Gradient of the Objective
    nnzj = get_nnzj(nlp) # Jacobian of the constraints
    nnzh = get_nnzh(nlp) # Hessian of the Lagrangian

    # Solve the problem
    bt = @btimed solve($nlp, $solver; display=false)
    exec_time = bt.time
    nlp_sol = bt.value

    # Store the results in the DataFrame
    push!(data,
        (
            Backend=adnlp_backend,
            NNZO=nnzo,
            NNZJ=nnzj,
            NNZH=nnzh,
            PrepTime=prepa_time,
            ExecTime=exec_time,
            Objective=nlp_sol.objective,
            Iterations=nlp_sol.iter
        )
    )
end
println(data)
2×8 DataFrame
 Row │ Backend    NNZO   NNZJ    NNZH    PrepTime  ExecTime  Objective  Iterations
     │ Symbol     Int64  Int64   Int64   Float64   Float64   Float64    Int64
─────┼─────────────────────────────────────────────────────────────────────────────
   1 │ optimized  13004   87004   95001  5.57505    33.7565    1.01258          68
   2 │ manual     13004  174028  104022  0.347923   15.6358    1.01258          22

Explicit time grid

The option time_grid=... allows you to provide the full time grid vector t0, t1, ..., tf, which is especially useful if a non-uniform grid is desired. In the case of a free initial and/or final time, you should provide a normalized grid ranging from 0 to 1. Note that time_grid overrides grid_size if both options are specified.

sol = solve(ocp; time_grid=[0, 0.1, 0.5, 0.9, 1], display=false)
println(time_grid(sol))
[0.0, 0.022086440107996572, 0.11043220053998286, 0.19877796097196915, 0.22086440107996572]
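Note that the grid returned by time_grid(sol) is the normalized grid rescaled to [t0, tf], here with the computed final time tf ≈ 0.221. Non-uniform grids are useful when the solution has sharp features, such as the control switchings of the Goddard problem. As a sketch (the quadratic stretching below is just one possible choice), a normalized grid refined near the final time can be built with plain Julia and then passed via time_grid:

```julia
N = 100                          # number of time steps
τ = range(0, 1; length = N + 1)  # uniform normalized grid on [0, 1]
grid = 1 .- (1 .- τ) .^ 2        # quadratic stretching: spacing shrinks towards t = 1

# grid starts at 0, ends at 1 and is increasing, as required for a free final time:
println((grid[1], grid[end], issorted(grid)))  # (0.0, 1.0, true)

# sol = solve(ocp; time_grid = grid, display = false)
```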