Testing the benchmark with ADNLPModels 0.8.3 #17

Closed · frapac opened this issue Jul 19, 2024 · 11 comments

frapac (Collaborator) commented Jul 19, 2024

ADNLPModels 0.8.3 uses a different coloring algorithm that exploits the symmetry of the Hessian. It may improve performance in OptimalControl.jl.

xref control-toolbox/CTDirect.jl#111 (comment)

0Yassine0 (Collaborator) commented Jul 19, 2024

@ocots @frapac .
I'm working on that, at least to see its impact on OptimalControl.
However, I had some difficulty with the package updates, including the new 0.9.4 version of OC: the MadNLP package doesn't allow CTDirect to update from v0.9.3 to the latest version, which only affects OptimalControl during compilation.

I'm impressed with the results so far. When testing the Goddard rocket problem, the total time decreased from ~870 ms to ~80 ms!!
I might have made some mistakes collecting data in this notebook (without even using MKL yet), but it already shows promising results.

frapac (Collaborator, Author) commented Jul 19, 2024

@0Yassine0 Don't hesitate to remove MadNLP from your environment, as we are not using it currently.

Can you test the new release of OptimalControl with ADNLPModels 0.8.3 on moonlander? The current solution time with OC is

Total seconds in IPOPT (w/o function evaluations)    =      1.389
Total seconds in NLP function evaluations            =    207.693

It would be nice if the new release significantly improved the time spent in NLP function evaluations for this instance.

0Yassine0 (Collaborator) commented Jul 19, 2024

Yes, that's exactly what I did. I only use it when comparing with Ipopt and KNITRO.
I pushed the new results with MKL and the new releases:

Total seconds in IPOPT (w/o function evaluations)    =      1.399
Total seconds in NLP function evaluations            =      2.223

The result speaks for itself.

frapac (Collaborator, Author) commented Jul 19, 2024

Indeed. Good job @0Yassine0 !
Closing this issue now.

frapac closed this as completed Jul 19, 2024

ocots (Member) commented Jul 19, 2024

@jbcaillau @PierreMartinon @joseph-gergaud Incredible!

0Yassine0 (Collaborator)

@ibtissammim it may help you.

amontoison

What should be compared between v0.8.2 and v0.8.3 is the number of colors.
You can get it with adnlp.backend.hessian_backend.ncolors.
Fewer colors mean fewer directional derivatives and less AD work.
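For example, a minimal sketch of how to read that field (the toy objective below is purely illustrative, and the exact field path, taken from the comment above, may differ between ADNLPModels versions):

```julia
using ADNLPModels

# Toy unconstrained problem, purely for illustration (not from this thread).
f(x) = sum(100 * (x[i+1] - x[i]^2)^2 + (1 - x[i])^2 for i = 1:length(x)-1)
adnlp = ADNLPModel(f, zeros(10))

# Number of colors used for the sparse Hessian, i.e. the number of
# directional derivatives needed per Hessian evaluation; compare this
# between ADNLPModels v0.8.2 and v0.8.3. The field path is assumed from
# the comment above; internals may change between versions.
@show adnlp.backend.hessian_backend.ncolors
```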

jbcaillau (Member)

😱 @0Yassine0 @frapac Impressive! Guys, I would like further comments on this (closed) issue. @amontoison?

amontoison commented Jul 20, 2024

The additional comment I can give concerns how variables are connected to each other in the Lagrangian of your problem. It's problem-dependent, but your problems could share quite a common pattern in the objective/constraints, depending on the subset of optimal control problems and your discretization scheme.

An example of how it impacts AD is:

f(x) = x[1] * sum(x[i] for i = 1:n)   # objective: x[1] couples with every variable
c(x) = x                              # constraints

The Hessian of the Lagrangian will have a dense row (the first one), and thus we can't combine any columns of the Hessian with a column coloring (they would all overlap on the first row of a potential compressed column). Therefore, we need to perform n directional derivatives to obtain the coefficients H[1, i].

The difference now is that we exploit the symmetry: the first row is also the first column of the Hessian, so we only need one directional derivative to compute them all.
The concept is simple, but the implementation is harder. In spirit, you don't want to implement a variant of METIS in Julia to compute a fill-in-reducing permutation when you develop AD packages, optimization solvers, or even linear solvers.
Implementing a symmetric coloring (star coloring) is analogous.
It's quite a different topic, related to graphs. I'm glad that Guillaume worked on graph theory. 🙂

The second issue is: for a computed coefficient H[i, j], was it obtained as the i-th component of column j (in the compressed column of one color) or, by symmetry, as the j-th component of column i (in another compressed column)? You want to correctly build the sparse Hessian after you have computed all nonzeros with AD (the decompression step), and that's not easy to implement efficiently.
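To make the color counts concrete, here is a rough sketch using SparseMatrixColorings.jl (mentioned further down in this thread) on the Hessian sparsity pattern of the example above; the API calls (ColoringProblem, GreedyColoringAlgorithm, coloring, ncolors) are assumed from recent releases of that package, so treat this as illustrative rather than exact:

```julia
using SparseArrays, SparseMatrixColorings

# Hessian sparsity pattern of f(x) = x[1] * sum(x): dense first row and
# first column (including the (1, 1) entry), zeros elsewhere.
n = 10
S = spzeros(Bool, n, n)
S[1, :] .= true
S[:, 1] .= true

# Column coloring without symmetry: every column has a nonzero in row 1,
# so no two columns can share a color -> n colors, i.e. n directional derivatives.
res_col = coloring(S, ColoringProblem(; structure=:nonsymmetric, partition=:column),
                   GreedyColoringAlgorithm())
@show ncolors(res_col)   # expected: n

# Symmetric (star) coloring: column 1 gets its own color and all other
# columns share a second one -> 2 directional derivatives.
res_sym = coloring(S, ColoringProblem(; structure=:symmetric, partition=:column),
                   GreedyColoringAlgorithm())
@show ncolors(res_sym)   # expected: 2
```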

jbcaillau (Member)

Thanks @amontoison. Got it that symmetry significantly reduces the number of directional derivatives to compute. Question: the sparsity structure is always known in advance and fixed¹; to what extent is it helpful (= computationally relevant) / better to pass this structure to the AD / optimiser, vs. using graph coloring as indicated above?

Footnotes

  1. Up to a few changes depending on whether the final time is fixed or not, etc.


amontoison commented Jul 20, 2024

@jbcaillau

If you have the sparsity structure, you can avoid the sparsity detection, which can be quite expensive even if we use SparseConnectivityTracer.jl instead of Symbolics.jl.

The idea is to provide J directly here and H here.

Note that you still need to do the coloring afterward. The coloring is used to determine the seeds (combinations of columns of the identity matrix) that we should use for the directional derivatives to get all nonzeros of the sparse Jacobian / Hessian.

@tmigot started a tutorial to explain how to do that last year:
JuliaSmoothOptimizers/ADNLPModels.jl#209

Quick summary of what we do in ADNLPModels.jl (see the sketch after this list):

  • Determine the sparsity pattern of the Jacobian / Hessian of any Julia function (SparseConnectivityTracer.jl).
  • Compute a column coloring for Jacobians or a symmetric coloring for Hessians to determine the seeds that we need for the directional derivatives and the number of colors (== number of directional derivatives). We rely on SparseMatrixColorings.jl for that.
  • Preallocate workspace for the directional derivatives and perform forward / reverse passes to compute compressed columns (ForwardDiff.jl / ReverseDiff.jl).
  • Recover the Jacobian / Hessian in COO / CSC format from the compressed columns (decompression step).
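For reference, a rough sketch of the first two steps on a small function, assuming the documented entry points of SparseConnectivityTracer.jl (TracerSparsityDetector with the ADTypes sparsity-detection interface) and SparseMatrixColorings.jl; this is not the actual ADNLPModels.jl code:

```julia
using ADTypes, SparseConnectivityTracer, SparseMatrixColorings

# Step 1: detect the sparsity pattern of the Hessian of a Julia function.
f(x) = x[1] * sum(x)
x0 = rand(10)
H_pattern = ADTypes.hessian_sparsity(f, x0, TracerSparsityDetector())

# Step 2: symmetric (star) coloring to build the seeds for the directional
# derivatives; ncolors is the number of Hessian-vector products needed per
# sparse Hessian evaluation.
result = coloring(H_pattern, ColoringProblem(; structure=:symmetric, partition=:column),
                  GreedyColoringAlgorithm())
@show ncolors(result)
@show column_colors(result)   # which seed (color) each column belongs to
```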
