The surface of revolution of minimum area
We consider the well-known surface of revolution of minimum area problem which dates back to Euler[1] [2]. We already know that the solution is a catenoid and we propose to retrieve this numerically and compute conjugate points, in relation with second order conditions of local optimality. This is a problem from calculus of variations but we consider its optimal control version. We minimise the cost integral
\[ \int_0^{1} x(t)\,\sqrt{1+u^2(t)}\,\mathrm{d} t\]
under the dynamical constraint
\[ \dot{x}(t) = u(t), \quad u(t)\in\R,\]
and the limit conditions
\[ x(0) = 1, \quad x(1) = 2.5.\]
To define this problem with OptimalControl.jl
we have write:
using OptimalControl
t0 = 0
tf = 1
x0 = 1
xf = 2.5
ocp = @def begin
t ∈ [ t0, tf ], time
x ∈ R, state
u ∈ R, control
x(t0) == x0
x(tf) == xf
ẋ(t) == u(t)
∫( x(t) * √(1 + u(t)^2) ) → min
end
The maximization of the pseudo-Hamiltonian provides the control with respect to the state and the costate:
\[ u(x,p) = \mathrm{sign}(x) \frac{p}{\sqrt{x^2-p^2}}.\]
From this control law, we could define the Hamiltonian $\mathbf{H}(x,p)=H(x, p, u(x,p))$ and its associated Hamiltonian flow. The OptimalControl.jl
package does this for us simply passing to the function Flow
, the optimal control problem with the control law.
using OrdinaryDiffEq
u(x, p) = sign(x) * p / √(x^2-p^2)
ocp_flow = Flow(ocp, u; reltol=1e-10, abstol=1e-10)
Let us plot some extremals, solutions of this flow. The initial condition $x_0$ is fixed while we compute some extremals for different values of initial covector $p_0$. We compute some specific initial covectors for a nice plot.
using Plots
N = 200
tf_ = 2
tspan = range(t0, tf_, N) # time interval
plt_x = plot() # plot of the state x(t)
plt_p = plot() # plot of the costate p(t)
plt_u = plot() # plot of the control u(t)
plt_phase = plot() # plot (x, p)
# callback: termination
# Without this, the numerical integration stop before tf for p₀ = 0.99
condition(z,t,integrator) = z[1] - (xmax+1)
affect!(integrator) = terminate!(integrator)
cbt = ContinuousCallback(condition,affect!)
for p0 ∈ p0s # plot for each p₀ in p0s
flow_p0 = ocp_flow((t0, tf_), x0, p0; saveat=tspan, callback=cbt)
T = tspan
X = state(flow_p0).(T)
P = costate(flow_p0).(T)
plot!(plt_x, T, X; color=:blue)
plot!(plt_p, T, P; color=:blue)
plot!(plt_u, T, u.(X, P); color=:blue)
plot!(plt_phase, X, P; color=:blue)
end
# Plots
plot!(plt_x; xlabel="t", ylabel="x(t,p₀)", legend=false, ylims=(0, xmax))
plot!(plt_p; xlabel="t", ylabel="p(t,p₀)", legend=false)
plot!(plt_u; xlabel="t", ylabel="u(t,p₀)", legend=false, ylims=(-2.5, 5))
plot!(plt_phase; xlabel="x(t,p₀)", ylabel="p(t,p₀)", legend=false, xlims=(0, 2), ylims=(-1, 2))
plot(plt_x, plt_p, plt_u, plt_phase; layout=(2, 2), size=(800, 600))
Here, the shooting equation given by
\[ S({p₀}) = \pi(z(t_f,x_0,{p₀})) - x_f = 0,\]
with $\pi(x, p) = x$, has two solutions: $p₀ = -0.9851$ and $p₀ = 0.5126$.
π((x, p)) = x
# Shooting function
S(p0) = (π ∘ ocp_flow)(t0, x0, p0, tf) - xf
# Solve the shooting equation: first extremal
sol1_p0 = Roots.find_zero(S, (-0.99, -0.97))
# Solve the shooting equation: second extremal
sol2_p0 = Roots.find_zero(S, (0.5, 0.6))
println("sol1_p0 = ", sol1_p0, ", S(sol1_p0) = ", S(sol1_p0))
println("sol2_p0 = ", sol2_p0, ", S(sol2_p0) = ", S(sol2_p0))
sol1_p0 = -0.9851008114498419, S(sol1_p0) = -2.757793993168889e-13
sol2_p0 = 0.5126502880904485, S(sol2_p0) = 0.0
Let us plot the two solutions. One can notice that they intersect as shown by the top-left subplot.
p0s = (sol1_p0, sol2_p0) # the different p₀
N = 100
tspan = range(t0, tf, N) # time interval
plt2_x = plot() # plot of the state x(t)
plt2_p = plot() # plot of the costate p(t)
plt2_u = plot() # plot of the control u(t)
plt2_phase = plot() # plot (x, p)
labels = ["p₀=-0.9851" , "p₀=0.51265"]
for (p0, label) ∈ zip(p0s, labels) # plot for each p₀ in p0s
flow_p0 = ocp_flow((t0, tf), x0, p0; saveat=tspan)
T = tspan
X = state(flow_p0).(T)
P = costate(flow_p0).(T)
plot!(plt2_x, T, X; label=label)
plot!(plt2_p, T, P; label=label)
plot!(plt2_u, T, u.(X, P); label=label)
plot!(plt2_phase, X, P; label=label)
end
plot!(plt2_x, [tf], [xf]; xlabel="t", ylabel="x(t,p₀)", seriestype=:scatter,label="")
plot!(plt2_p; xlabel="t", ylabel="p(t,p₀)", legend=false, ylims=(-1.5, 5))
plot!(plt2_u; xlabel="t", ylabel="u(t,p₀)", legend=false, ylims=(-6, 5))
plot!(plt2_phase; xlabel="x(t,p₀)", ylabel="p(t,p₀)", legend=false, xlims=(0, 2.5), ylims=(-1, 5))
plot(plt2_x, plt2_p, plt2_u, plt2_phase; layout=(2, 2), size=(800, 600))
Now, we can compute the conjugate points along the two extremals. We have to compute the flow $\delta z(t, p₀)$ of the Jacobi equation with the initial condition $\delta z(0) = (0, 1)$. This is given solving
\[ \delta z(t, p₀) = \dfrac{\partial}{\partial p₀}z(t, p₀).\]
Note that to compute the conjugate points, we only need the first component:
\[ \delta z(t, p₀)_1.\]
using ForwardDiff
function jacobi_flow(t, p0)
x(t, p0) = (π ∘ ocp_flow)(t0, x0, p0, t)
return ForwardDiff.derivative(p0 -> x(t, p0), p0)
end
The first conjugate time is then the first time $\tau$ such that
\[ \delta x(\tau, p₀)= \delta z(\tau, p₀)_1 = 0,\]
with $p₀$ fixed. On the following figure, one can see that the first extremal has a conjugate time smaller than $t_f=1$ while for the second extremal, there is no conjugate time. Thus, the first extremal cannot be optimal.
using Plots.PlotMeasures
N = 100
# Jacobi field for the first extremal
tspan = range(t0, tf, N) # time interval
δx = jacobi_flow.(tspan, sol1_p0)
plt_conj1 = plot()
plot!(plt_conj1, tspan, δx) # as n=1, det(δx) = δx
plot!(plt_conj1; xlabel="t", ylabel="δx(t,p₀)", legend=false, ylims=(-10, 10), size=(400, 300))
# Jacobi field for the second extremal
tspan = range(t0, 1.5, N) # time interval
δx = jacobi_flow.(tspan, sol2_p0)
plt_conj2= plot()
plot!(plt_conj2, tspan, δx) # as n=1 the det(δx) = δx
plot!(plt_conj2; xlabel="t", ylabel="δx(t,p₀)", legend=false, ylims=(-10, 10), size=(400, 300))
#
plt_conj = plot(plt_conj1, plt_conj2;
layout=(1, 2), size=(800, 300), leftmargin=25px, bottommargin=15px)
We compute the first conjugate point along the first extremal and add it to the plot.
tau0 = Roots.find_zero(tau -> jacobi_flow(tau, sol1_p0), (0.4, 0.6))
println("For p0 = ", sol1_p0, " tau_0 = ", tau0)
plot!(plt_conj[1], [tau0], [jacobi_flow(tau0, sol1_p0)]; seriestype=:scatter)
To conclude on this example, we compute the conjugate locus by using a path following algorithm. Define $F(\tau,p₀) = \delta x(\tau,p₀)$ and suppose that the partial derivative $\partial_\tau F(\tau,p₀)$ is invertible, then, by the implicit function theorem the conjugate time is a function of $p₀$. So, since here $p₀\in\R$, we can compute them by solving the initial value problem for $p₀ \in [\alpha, \beta]$:
\[ \dot{\tau}(p₀) = -\dfrac{\partial F}{\partial \tau}(\tau(p₀),p₀)^{-1}\, \dfrac{\partial F}{\partial p₀}(\tau(p₀),p₀), \quad \tau(\alpha) = \tau_0.\]
For the numerical experiment, we set $\alpha = -0.9995$, $\beta = -0.5$.
function conjugate_times_rhs_path(tau, p0)
dF = ForwardDiff.gradient(y -> jacobi_flow(y...), [tau, p0])
return -dF[2]/dF[1]
end
function conjugate_times(p0span, tau0)
ode = OrdinaryDiffEq.ODEProblem((tau, par, p0) -> conjugate_times_rhs_path(tau, p0), tau0, p0span)
sol = OrdinaryDiffEq.solve(ode; reltol=1e-8, abstol=1e-8)
return sol.u, sol.t # taus, p0s
end
Now we have defined the algorithm, let us compute the conjugate locus and plot it.
# conjugate locus
p0 = sol1_p0
taus1, p0s1 = conjugate_times((p0, -0.5), tau0)
taus2, p0s2 = conjugate_times((p0, -0.999), tau0)
taus = append!(taus2[end:-1:1],taus1)
p0s = append!(p0s2[end:-1:1],p0s1)
# plot tau(p0)
plt_conj_times = plot(p0s, taus; xlabel="p₀", ylabel="τ", color=:blue, xlims = (-1, -0.5))
# get conjugate points
X = []
for (tau, p0) ∈ zip(taus, p0s)
# compute x(tau, p0)
x = (π ∘ ocp_flow)(t0, x0, p0, tau)
push!(X, x)
end
# plot conjugate points on plt_x
plot!(plt_x, taus, X; linewidth=3, color=:red, legend=false, xlims=(0, 2), ylims=(0, xmax))
#
plot(plt_x, plt_conj_times;
layout=(1,2), legend=false, size=(800,300), leftmargin=25px, bottommargin=15px)