The surface of revolution of minimum area

We consider the well-known surface of revolution of minimum area problem which dates back to Euler^[1] ^[2]. We already know that the solution is a catenoid and we propose to retrieve this numerically and compute conjugate points, in relation with second order conditions of local optimality. This is a problem from calculus of variations but we consider its optimal control version. We minimise the cost integral

\[ \int_0^{1} x(t)\,\sqrt{1+u^2(t)}\,\mathrm{d} t\]

under the dynamical constraint

\[ \dot{x}(t) = u(t), \quad u(t)\in\R,\]

and the limit conditions

\[ x(0) = 1, \quad x(1) = 2.5.\]

To define this problem with OptimalControl.jl we have write:

using OptimalControl

t0 = 0
tf = 1
x0 = 1
xf = 2.5

ocp = @def begin

    t ∈ [ t0, tf ], time
    x ∈ R, state
    u ∈ R, control

    x(t0) == x0
    x(tf) == xf

    ẋ(t) ==  u(t)

    ∫( x(t) * √(1 + u(t)^2) ) → min

end

The maximization of the pseudo-Hamiltonian provides the control with respect to the state and the costate:

\[ u(x,p) = \mathrm{sign}(x) \frac{p}{\sqrt{x^2-p^2}}.\]

From this control law, we could define the Hamiltonian $\mathbf{H}(x,p)=H(x, p, u(x,p))$ and its associated Hamiltonian flow. The OptimalControl.jl package does this for us simply passing to the function Flow, the optimal control problem with the control law.

using OrdinaryDiffEq

u(x, p) = sign(x) * p / √(x^2-p^2)

ocp_flow = Flow(ocp, u; reltol=1e-10, abstol=1e-10)

Let us plot some extremals, solutions of this flow. The initial condition $x_0$ is fixed while we compute some extremals for different values of initial covector $p_0$. We compute some specific initial covectors for a nice plot.

Computation of the initial covectors

using Plots

N = 200
tf_ = 2
tspan = range(t0, tf_, N)   # time interval
plt_x = plot()              # plot of the state x(t)
plt_p = plot()              # plot of the costate p(t)
plt_u = plot()              # plot of the control u(t)
plt_phase = plot()          # plot (x, p)

# callback: termination
# Without this, the numerical integration stop before tf for p₀ = 0.99
condition(z,t,integrator) = z[1] - (xmax+1)
affect!(integrator) = terminate!(integrator)
cbt  = ContinuousCallback(condition,affect!)

for p0 ∈ p0s # plot for each p₀ in p0s

    flow_p0 = ocp_flow((t0, tf_), x0, p0; saveat=tspan, callback=cbt)

    T = tspan
    X = state(flow_p0).(T)
    P = costate(flow_p0).(T)

    plot!(plt_x,     T, X;          color=:blue)
    plot!(plt_p,     T, P;          color=:blue)
    plot!(plt_u,     T, u.(X, P);   color=:blue)
    plot!(plt_phase, X, P;          color=:blue)

end

# Plots
plot!(plt_x; xlabel="t", ylabel="x(t,p₀)", legend=false, ylims=(0, xmax))
plot!(plt_p; xlabel="t", ylabel="p(t,p₀)", legend=false)
plot!(plt_u; xlabel="t", ylabel="u(t,p₀)", legend=false, ylims=(-2.5, 5))
plot!(plt_phase; xlabel="x(t,p₀)", ylabel="p(t,p₀)", legend=false, xlims=(0, 2), ylims=(-1, 2))

plot(plt_x, plt_p, plt_u, plt_phase; layout=(2, 2), size=(800, 600))

Here, the shooting equation given by

\[ S({p₀}) = \pi(z(t_f,x_0,{p₀})) - x_f = 0,\]

with $\pi(x, p) = x$, has two solutions: $p₀ = -0.9851$ and $p₀ = 0.5126$.

π((x, p)) = x

# Shooting function
S(p0) = (π ∘ ocp_flow)(t0, x0, p0, tf) - xf

# Solve the shooting equation: first extremal
sol1_p0 = Roots.find_zero(S, (-0.99, -0.97))

# Solve the shooting equation: second extremal
sol2_p0 = Roots.find_zero(S, (0.5, 0.6))

println("sol1_p0 = ",  sol1_p0, ", S(sol1_p0) = ", S(sol1_p0))
println("sol2_p0 =  ", sol2_p0, ", S(sol2_p0) = ", S(sol2_p0))

sol1_p0 = -0.9851008114500316, S(sol1_p0) = 2.7489122089718876e-13
sol2_p0 =  0.5126502880902792, S(sol2_p0) = 4.440892098500626e-16

Let us plot the two solutions. One can notice that they intersect as shown by the top-left subplot.

p0s = (sol1_p0, sol2_p0)     # the different p₀

N = 100
tspan = range(t0, tf, N)     # time interval
plt2_x = plot()              # plot of the state x(t)
plt2_p = plot()              # plot of the costate p(t)
plt2_u = plot()              # plot of the control u(t)
plt2_phase = plot()          # plot (x, p)

labels = ["p₀=-0.9851" , "p₀=0.51265"]

for (p0, label) ∈ zip(p0s, labels) # plot for each p₀ in p0s

    flow_p0 = ocp_flow((t0, tf), x0, p0; saveat=tspan)

    T = tspan
    X = state(flow_p0).(T)
    P = costate(flow_p0).(T)

    plot!(plt2_x,     T, X;         label=label)
    plot!(plt2_p,     T, P;         label=label)
    plot!(plt2_u,     T, u.(X, P);  label=label)
    plot!(plt2_phase, X, P;         label=label)

end

plot!(plt2_x, [tf], [xf]; xlabel="t", ylabel="x(t,p₀)", seriestype=:scatter,label="")
plot!(plt2_p; xlabel="t", ylabel="p(t,p₀)", legend=false, ylims=(-1.5, 5))
plot!(plt2_u; xlabel="t", ylabel="u(t,p₀)", legend=false, ylims=(-6, 5))
plot!(plt2_phase; xlabel="x(t,p₀)", ylabel="p(t,p₀)", legend=false, xlims=(0, 2.5), ylims=(-1, 5))

plot(plt2_x, plt2_p, plt2_u, plt2_phase; layout=(2, 2), size=(800, 600))

Now, we can compute the conjugate points along the two extremals. We have to compute the flow $\delta z(t, p₀)$ of the Jacobi equation with the initial condition $\delta z(0) = (0, 1)$. This is given solving

\[ \delta z(t, p₀) = \dfrac{\partial}{\partial p₀}z(t, p₀).\]

Note that to compute the conjugate points, we only need the first component:

\[ \delta z(t, p₀)_1.\]

using ForwardDiff

function jacobi_flow(t, p0)
    x(t, p0) = (π ∘ ocp_flow)(t0, x0, p0, t)
    return ForwardDiff.derivative(p0 -> x(t, p0), p0)
end

The first conjugate time is then the first time $\tau$ such that

\[ \delta x(\tau, p₀)= \delta z(\tau, p₀)_1 = 0,\]

with $p₀$ fixed. On the following figure, one can see that the first extremal has a conjugate time smaller than $t_f=1$ while for the second extremal, there is no conjugate time. Thus, the first extremal cannot be optimal.

using Plots.PlotMeasures

N = 100

# Jacobi field for the first extremal
tspan = range(t0, tf, N) # time interval

δx = jacobi_flow.(tspan, sol1_p0)

plt_conj1 = plot()
plot!(plt_conj1, tspan, δx)  # as n=1, det(δx) = δx
plot!(plt_conj1; xlabel="t", ylabel="δx(t,p₀)", legend=false, ylims=(-10, 10), size=(400, 300))

# Jacobi field for the second extremal
tspan = range(t0, 1.5, N) # time interval

δx = jacobi_flow.(tspan, sol2_p0)

plt_conj2= plot()
plot!(plt_conj2, tspan, δx)  # as n=1 the det(δx) = δx
plot!(plt_conj2; xlabel="t", ylabel="δx(t,p₀)", legend=false, ylims=(-10, 10), size=(400, 300))

#
plt_conj = plot(plt_conj1, plt_conj2;
    layout=(1, 2), size=(800, 300), leftmargin=25px, bottommargin=15px)

We compute the first conjugate point along the first extremal and add it to the plot.

tau0 = Roots.find_zero(tau -> jacobi_flow(tau, sol1_p0), (0.4, 0.6))

println("For p0 = ", sol1_p0, " tau_0 = ", tau0)

plot!(plt_conj[1], [tau0], [jacobi_flow(tau0, sol1_p0)]; seriestype=:scatter)

To conclude on this example, we compute the conjugate locus by using a path following algorithm. Define $F(\tau,p₀) = \delta x(\tau,p₀)$ and suppose that the partial derivative $\partial_\tau F(\tau,p₀)$ is invertible, then, by the implicit function theorem the conjugate time is a function of $p₀$. So, since here $p₀\in\R$, we can compute them by solving the initial value problem for $p₀ \in [\alpha, \beta]$:

\[ \dot{\tau}(p₀) = -\dfrac{\partial F}{\partial \tau}(\tau(p₀),p₀)^{-1}\, \dfrac{\partial F}{\partial p₀}(\tau(p₀),p₀), \quad \tau(\alpha) = \tau_0.\]

For the numerical experiment, we set $\alpha = -0.9995$, $\beta = -0.5$.

function conjugate_times_rhs_path(tau, p0)
    dF = ForwardDiff.gradient(y -> jacobi_flow(y...), [tau, p0])
    return -dF[2]/dF[1]
end

function conjugate_times(p0span, tau0)
    ode = OrdinaryDiffEq.ODEProblem((tau, par, p0) -> conjugate_times_rhs_path(tau, p0), tau0, p0span)
    sol = OrdinaryDiffEq.solve(ode; reltol=1e-8, abstol=1e-8)
    return sol.u, sol.t # taus, p0s
end

Now we have defined the algorithm, let us compute the conjugate locus and plot it.

# conjugate locus
p0 = sol1_p0
taus1, p0s1 = conjugate_times((p0, -0.5), tau0)
taus2, p0s2 = conjugate_times((p0, -0.999), tau0)
taus = append!(taus2[end:-1:1],taus1)
p0s = append!(p0s2[end:-1:1],p0s1)

# plot tau(p0)
plt_conj_times = plot(p0s, taus; xlabel="p₀", ylabel="τ", color=:blue, xlims = (-1, -0.5))

# get conjugate points
X = []
for (tau, p0) ∈ zip(taus, p0s)
    # compute x(tau, p0)
    x = (π ∘ ocp_flow)(t0, x0, p0, tau)
    push!(X, x)
end

# plot conjugate points on plt_x
plot!(plt_x, taus, X; linewidth=3, color=:red, legend=false, xlims=(0, 2), ylims=(0, xmax))

#
plot(plt_x, plt_conj_times;
    layout=(1,2), legend=false, size=(800,300), leftmargin=25px, bottommargin=15px)

1H. Schättler & U. Ledzewicz, Geometric optimal control: theory, methods and examples, vol~38 of Interdisciplinary applied mathematics, Springer Science & Business Media, New York (2012), xiv+640.
2D. Liberzon, Calculus ov Variations and Optimal Control Theory, Princeton University Press (2012).