diff --git a/docs/make.jl b/docs/make.jl index 556d0e48..c66491e2 100644 --- a/docs/make.jl +++ b/docs/make.jl @@ -20,7 +20,7 @@ makedocs( "Support multiple precision" => "generic.md", "Sparse Jacobian and Hessian" => "sparse.md", "Performance tips" => "performance.md", - "Provide sparsity pattern for sparse derivatives" => "sparsity_pattern.md", + "Providing sparsity pattern for sparse derivatives" => "sparsity_pattern.md", "Reference" => "reference.md", ], ) diff --git a/docs/src/sparse.md b/docs/src/sparse.md index 1a8f52cb..73a60e61 100644 --- a/docs/src/sparse.md +++ b/docs/src/sparse.md @@ -1,6 +1,6 @@ -# Sparse Hessian and Jacobian computations +# [Sparse Hessian and Jacobian computations](@id sparse) -It is to be noted that by default the Jacobian and Hessian are sparse. +By default, the Jacobian and Hessian are treated as sparse. ```@example ex1 using ADNLPModels, NLPModels @@ -8,7 +8,7 @@ using ADNLPModels, NLPModels f(x) = (x[1] - 1)^2 T = Float64 x0 = T[-1.2; 1.0] -lvar, uvar = zeros(T, 2), ones(T, 2) # must be of same type than `x0` +lvar, uvar = zeros(T, 2), ones(T, 2) lcon, ucon = -T[0.5], T[0.5] c!(cx, x) = begin cx[1] = x[2] @@ -18,7 +18,7 @@ nlp = ADNLPModel!(f, x0, lvar, uvar, c!, lcon, ucon, backend = :optimized) ``` ```@example ex1 -(get_nnzj(nlp), get_nnzh(nlp)) # number of nonzeros elements in the Jacobian and Hessian +(get_nnzj(nlp), get_nnzh(nlp)) # Number of nonzero elements in the Jacobian and Hessian ``` ```@example ex1 @@ -31,24 +31,66 @@ x = rand(T, 2) H = hess(nlp, x) ``` -The available backends for sparse derivatives (`SparseADJacobian`, `SparseADHessian` and `SparseReverseADHessian`) have keyword arguments `detector` and `coloring_algorithm` to specify the sparsity pattern detector and the coloring algorithm, respectively. +The backends available for sparse derivatives (`SparseADJacobian`, `SparseADHessian`, and `SparseReverseADHessian`) allow for customization through keyword arguments such as `detector` and `coloring_algorithm`. +These arguments specify the sparsity pattern detector and the coloring algorithm, respectively. - A **`detector`** must be of type `ADTypes.AbstractSparsityDetector`. -The default detector is `TracerSparsityDetector()` from the package `SparseConnectivityTracer.jl`. -Prior to version 0.8.0, the default detector was `SymbolicSparsityDetector()` from `Symbolics.jl`. + The default detector is `TracerSparsityDetector()` from the package `SparseConnectivityTracer.jl`. + Prior to version 0.8.0, the default was `SymbolicSparsityDetector()` from `Symbolics.jl`. - A **`coloring_algorithm`** must be of type `SparseMatrixColorings.GreedyColoringAlgorithm`. -The default algorithm is `GreedyColoringAlgorithm{:direct}()` for `SparseADJacobian` and `SparseADHessian`, while it is `GreedyColoringAlgorithm{:substitution}()` for `SparseReverseADHessian`. -These algorithms are available in the package `SparseMatrixColorings.jl`. + The default algorithm is `GreedyColoringAlgorithm{:direct}()` for `SparseADJacobian` and `SparseADHessian`, while it is `GreedyColoringAlgorithm{:substitution}()` for `SparseReverseADHessian`. + These algorithms are provided by the package `SparseMatrixColorings.jl`. The `GreedyColoringAlgorithm{:direct}()` performs column coloring for Jacobians and star coloring for Hessians. -In contrast, `GreedyColoringAlgorithm{:substitution}()` applies acyclic coloring for Hessians. 
-The `:substitution` coloring mode usually finds fewer colors than the `:direct` mode and thus fewer directional derivatives are needed to recover all non-zeros of the sparse Hessian.
-However, it requires storing the compressed sparse Hessian, while `:direct` coloring only stores one column of the compressed Hessian.
+In contrast, `GreedyColoringAlgorithm{:substitution}()` applies acyclic coloring for Hessians.
+The `:substitution` mode generally requires fewer colors than `:direct`, so fewer directional derivatives are needed to reconstruct the sparse Hessian.
+However, it requires storing the entire compressed Hessian, while `:direct` coloring only needs storage for a single column of it.
 
-The `:direct` coloring mode is numerically more stable and may be preferable for highly ill-conditioned Hessian as it doesn't require solving triangular systems to compute the non-zeros from the compressed Hessian.
+The `:direct` coloring mode is numerically more stable and may be preferable for highly ill-conditioned Hessians, as it avoids solving triangular systems to compute nonzero entries from the compressed Hessian.
+
+## Extracting sparsity patterns
+
+`ADNLPModels.jl` provides the function [`get_sparsity_pattern`](@ref) to retrieve the sparsity pattern of the Jacobian or the Hessian of a model.
+
+```@example ex3
+using SparseArrays, ADNLPModels, NLPModels
+
+nvar = 10
+ncon = 5
+
+f(x) = sum((x[i] - i)^2 for i = 1:nvar) + x[nvar] * sum(x[j] for j = 1:nvar-1)
+
+function c!(cx, x)
+  cx[1] = x[1] + x[2]
+  cx[2] = x[1] + x[2] + x[3]
+  cx[3] = x[2] + x[3] + x[4]
+  cx[4] = x[3] + x[4] + x[5]
+  cx[5] = x[4] + x[5]
+  return cx
+end
+
+T = Float64
+x0 = -ones(T, nvar)
+lvar = zeros(T, nvar)
+uvar = 2 * ones(T, nvar)
+lcon = -0.5 * ones(T, ncon)
+ucon = 0.5 * ones(T, ncon)
+
+nlp = ADNLPModel!(f, x0, lvar, uvar, c!, lcon, ucon)
+```
+```@example ex3
+J = get_sparsity_pattern(nlp, :jacobian)
+```
+```@example ex3
+H = get_sparsity_pattern(nlp, :hessian)
+```
+
+## Using known sparsity patterns
+
+If the sparsity pattern of the Jacobian or the Hessian is already known, you can provide it directly.
+This is typically the case when the pattern follows from the structure of the application, or when it was computed previously and saved for reuse.
+Note that the coloring phase requires both the lower and upper triangular parts of the Hessian pattern.
 
-If the sparsity pattern of the Jacobian of the constraint or the Hessian of the Lagrangian is available, you can directly provide them.
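+A pattern does not have to be written out as a dense literal: it can also be built programmatically with the tools from `SparseArrays`.
+The sketch below is only an illustration (a plain `julia` block, not executed in the documentation); it reconstructs the arrowhead Hessian pattern and the banded Jacobian pattern used in the full example that follows.
+
+```julia
+using SparseArrays
+
+# Arrowhead Hessian pattern: diagonal plus a dense last row and last column.
+H = spdiagm(0 => trues(10))
+H[1:9, 10] .= true
+H[10, 1:9] .= true
+
+# Tridiagonal Jacobian pattern for the five constraints, padded with
+# zero columns for the variables x[6], ..., x[10] that do not appear.
+J = [spdiagm(-1 => trues(4), 0 => trues(5), 1 => trues(4)) spzeros(Bool, 5, 5)]
+```
+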
 ```@example ex2
 using SparseArrays, ADNLPModels, NLPModels
 
@@ -57,17 +99,17 @@ ncon = 5
 
 f(x) = sum((x[i] - i)^2 for i = 1:nvar) + x[nvar] * sum(x[j] for j = 1:nvar-1)
 
-H = SparseMatrixCSC{Bool, Int64}(
-[ 1 0 0 0 0 0 0 0 0 1 ;
-  0 1 0 0 0 0 0 0 0 1 ;
-  0 0 1 0 0 0 0 0 0 1 ;
-  0 0 0 1 0 0 0 0 0 1 ;
-  0 0 0 0 1 0 0 0 0 1 ;
-  0 0 0 0 0 1 0 0 0 1 ;
-  0 0 0 0 0 0 1 0 0 1 ;
-  0 0 0 0 0 0 0 1 0 1 ;
-  0 0 0 0 0 0 0 0 1 1 ;
-  1 1 1 1 1 1 1 1 1 1 ]
+H = SparseMatrixCSC{Bool, Int}(
+  [ 1 0 0 0 0 0 0 0 0 1 ;
+    0 1 0 0 0 0 0 0 0 1 ;
+    0 0 1 0 0 0 0 0 0 1 ;
+    0 0 0 1 0 0 0 0 0 1 ;
+    0 0 0 0 1 0 0 0 0 1 ;
+    0 0 0 0 0 1 0 0 0 1 ;
+    0 0 0 0 0 0 1 0 0 1 ;
+    0 0 0 0 0 0 0 1 0 1 ;
+    0 0 0 0 0 0 0 0 1 1 ;
+    1 1 1 1 1 1 1 1 1 1 ]
 )
 
 function c!(cx, x)
@@ -79,12 +121,12 @@ function c!(cx, x)
   return cx
 end
 
-J = SparseMatrixCSC{Bool, Int64}(
-[ 1 1 0 0 0 ;
-  1 1 1 0 0 ;
-  0 1 1 1 0 ;
-  0 0 1 1 1 ;
-  0 0 0 1 1 ]
+J = SparseMatrixCSC{Bool, Int}(
+  [ 1 1 0 0 0 0 0 0 0 0 ;
+    1 1 1 0 0 0 0 0 0 0 ;
+    0 1 1 1 0 0 0 0 0 0 ;
+    0 0 1 1 1 0 0 0 0 0 ;
+    0 0 0 1 1 0 0 0 0 0 ]
 )
 
 T = Float64
@@ -100,6 +142,10 @@ H_backend = ADNLPModels.SparseADHessian(nvar, f, ncon, c!, H)
 nlp = ADNLPModel!(f, x0, lvar, uvar, c!, lcon, ucon, jacobian_backend=J_backend, hessian_backend=H_backend)
 ```
 
+The section ["Providing sparsity pattern for sparse derivatives"](@ref sparsity-pattern) illustrates this feature with a more advanced application.
+
+## Acknowledgements
+
 The package [`SparseConnectivityTracer.jl`](https://github.com/adrhill/SparseConnectivityTracer.jl) is used to compute the sparsity pattern of Jacobians and Hessians.
 The evaluation of the number of directional derivatives and the seeds required to compute compressed Jacobians and Hessians is performed using [`SparseMatrixColorings.jl`](https://github.com/gdalle/SparseMatrixColorings.jl).
 As of release v0.8.1, it has replaced [`ColPack.jl`](https://github.com/exanauts/ColPack.jl).
diff --git a/docs/src/sparsity_pattern.md b/docs/src/sparsity_pattern.md
index 3027dd3d..200bd348 100644
--- a/docs/src/sparsity_pattern.md
+++ b/docs/src/sparsity_pattern.md
@@ -1,4 +1,4 @@
-# Improve sparse derivatives
+# [Improve sparse derivatives](@id sparsity-pattern)
 
 In this tutorial, we show a feature of ADNLPModels.jl to potentially improve the computation of sparse Jacobian and Hessian.
diff --git a/src/sparsity_pattern.jl b/src/sparsity_pattern.jl
index bc3cf3ef..f55c007e 100644
--- a/src/sparsity_pattern.jl
+++ b/src/sparsity_pattern.jl
@@ -1,3 +1,5 @@
+export get_sparsity_pattern
+
 """
     compute_jacobian_sparsity(c, x0; detector)
     compute_jacobian_sparsity(c!, cx, x0; detector)
@@ -51,3 +53,78 @@ function compute_hessian_sparsity(
   S = ADTypes.hessian_sparsity(lagrangian, x0, detector)
   return S
 end
+
+"""
+    S = get_sparsity_pattern(model::ADModel, derivative::Symbol)
+
+Retrieve the sparsity pattern of a Jacobian or Hessian from an `ADModel`.
+For the Hessian, only the lower triangular part of its sparsity pattern is returned.
+The user can reconstruct the upper triangular part by exploiting symmetry.
+
+To retrieve the sparsity pattern, the model must use a sparse backend.
+Supported backends include `SparseADJacobian`, `SparseADHessian`, and `SparseReverseADHessian`.
+
+#### Input arguments
+
+* `model`: An automatic differentiation model (either `AbstractADNLPModel` or `AbstractADNLSModel`).
+* `derivative`: The type of derivative for which the sparsity pattern is needed. The supported values are `:jacobian`, `:hessian`, `:jacobian_residual`, and `:hessian_residual`.
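+  The values `:jacobian_residual` and `:hessian_residual` are only available for an `AbstractADNLSModel`, where they refer to the residual function.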
+ +#### Output argument + +* `S`: A sparse matrix of type `SparseMatrixCSC{Bool,Int}` indicating the sparsity pattern of the requested derivative. +""" +function get_sparsity_pattern(model::ADModel, derivative::Symbol) + get_sparsity_pattern(model, Val(derivative)) +end + +function get_sparsity_pattern(model::ADModel, ::Val{:jacobian}) + backend = model.adbackend.jacobian_backend + validate_sparse_backend(backend, SparseADJacobian, "Jacobian") + m = model.meta.ncon + n = model.meta.nvar + colptr = backend.colptr + rowval = backend.rowval + nnzJ = length(rowval) + nzval = ones(Bool, nnzJ) + SparseMatrixCSC(m, n, colptr, rowval, nzval) +end + +function get_sparsity_pattern(model::ADModel, ::Val{:hessian}) + backend = model.adbackend.hessian_backend + validate_sparse_backend(backend, Union{SparseADHessian, SparseReverseADHessian}, "Hessian") + n = model.meta.nvar + colptr = backend.colptr + rowval = backend.rowval + nnzH = length(rowval) + nzval = ones(Bool, nnzH) + SparseMatrixCSC(n, n, colptr, rowval, nzval) +end + +function get_sparsity_pattern(model::AbstractADNLSModel, ::Val{:jacobian_residual}) + backend = model.adbackend.jacobian_residual_backend + validate_sparse_backend(backend, SparseADJacobian, "Jacobian of the residual") + m = model.nls_meta.nequ + n = model.meta.nvar + colptr = backend.colptr + rowval = backend.rowval + nnzJ = length(rowval) + nzval = ones(Bool, nnzJ) + SparseMatrixCSC(m, n, colptr, rowval, nzval) +end + +function get_sparsity_pattern(model::AbstractADNLSModel, ::Val{:hessian_residual}) + backend = model.adbackend.hessian_residual_backend + validate_sparse_backend(backend, Union{SparseADHessian, SparseReverseADHessian}, "Hessian of the residual") + n = model.meta.nvar + colptr = backend.colptr + rowval = backend.rowval + nnzH = length(rowval) + nzval = ones(Bool, nnzH) + SparseMatrixCSC(n, n, colptr, rowval, nzval) +end + +function validate_sparse_backend(backend::B, expected_type, derivative_name::String) where {B <: ADBackend} + if !(backend isa expected_type) + error("The current backend $B doesn't compute a sparse $derivative_name.") + end +end diff --git a/test/sparse_hessian.jl b/test/sparse_hessian.jl index dd2ed1de..9ccd56a2 100644 --- a/test/sparse_hessian.jl +++ b/test/sparse_hessian.jl @@ -63,6 +63,14 @@ dt = (Float32, Float64) H = sparse(rows, cols, vals, nvar, nvar) @test H == [x[2] 0; x[1]+x[2] x[1]] + y[2] * [-20 0; 0 0] + if (backend == ADNLPModels.SparseADHessian) || (backend == ADNLPModels.SparseReverseADHessian) + H_sp = get_sparsity_pattern(nlp, :hessian) + @test H_sp == SparseMatrixCSC{Bool, Int}( + [ 1 0 ; + 1 1 ] + ) + end + nlp = ADNLPModel!( x -> x[1] * x[2]^2 + x[1]^2 * x[2], x0, diff --git a/test/sparse_hessian_nls.jl b/test/sparse_hessian_nls.jl index be0c8fe4..ac552a5c 100644 --- a/test/sparse_hessian_nls.jl +++ b/test/sparse_hessian_nls.jl @@ -50,6 +50,14 @@ dt = (Float32, Float64) H = Symmetric(sparse(rows, cols, vals, nvar, nvar), :L) @test H == [-20*v[2] 0; 0 0] + if (backend == ADNLPModels.SparseADHessian) || (backend == ADNLPModels.SparseReverseADHessian) + H_sp = get_sparsity_pattern(nls, :hessian_residual) + @test H_sp == SparseMatrixCSC{Bool, Int}( + [ 1 0 ; + 0 0 ] + ) + end + nls = ADNLPModels.ADNLSModel!(F!, x0, 3, matrix_free = true; kw...) 
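+    # Sketch of an extra check (an assumption, not part of the original changeset):
+    # a matrix-free model keeps an `EmptyADbackend`, so `get_sparsity_pattern`
+    # should throw the `ErrorException` raised by `validate_sparse_backend`.
+    @test_throws ErrorException get_sparsity_pattern(nls, :hessian_residual)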
@test nls.adbackend.hessian_backend isa ADNLPModels.EmptyADbackend @test nls.adbackend.hessian_residual_backend isa ADNLPModels.EmptyADbackend diff --git a/test/sparse_jacobian.jl b/test/sparse_jacobian.jl index a318d8cc..1b089406 100644 --- a/test/sparse_jacobian.jl +++ b/test/sparse_jacobian.jl @@ -51,6 +51,15 @@ dt = (Float32, Float64) 0 1 ] + if backend == ADNLPModels.SparseADJacobian + J_sp = get_sparsity_pattern(nlp, :jacobian) + @test J_sp == SparseMatrixCSC{Bool, Int}( + [ 1 0 ; + 1 1 ; + 0 1 ] + ) + end + nlp = ADNLPModel!(x -> sum(x), x0, c!, zeros(T, ncon), zeros(T, ncon), matrix_free = true; kw...) @test nlp.adbackend.jacobian_backend isa ADNLPModels.EmptyADbackend end diff --git a/test/sparse_jacobian_nls.jl b/test/sparse_jacobian_nls.jl index 14ed4b01..9035d6b4 100644 --- a/test/sparse_jacobian_nls.jl +++ b/test/sparse_jacobian_nls.jl @@ -43,6 +43,15 @@ dt = (Float32, Float64) 0 1 ] + if backend == ADNLPModels.SparseADJacobian + J_sp = get_sparsity_pattern(nls, :jacobian_residual) + @test J_sp == SparseMatrixCSC{Bool, Int}( + [ 1 0 ; + 1 1 ; + 0 1 ] + ) + end + nls = ADNLPModels.ADNLSModel!(F!, x0, 3, matrix_free = true; kw...) @test nls.adbackend.jacobian_backend isa ADNLPModels.EmptyADbackend @test nls.adbackend.jacobian_residual_backend isa ADNLPModels.EmptyADbackend
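+  # Sketch of an extra check (an assumption, not part of the original changeset):
+  # with `matrix_free = true` there is no sparse Jacobian backend for the residual,
+  # so `get_sparsity_pattern` should raise the error from `validate_sparse_backend`.
+  @test_throws ErrorException get_sparsity_pattern(nls, :jacobian_residual)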