Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Symmetric coloring for efficient sparse hessian computation #258

Closed
amontoison opened this issue Jun 26, 2024 · 2 comments · Fixed by #271
Closed

Symmetric coloring for efficient sparse hessian computation #258

amontoison opened this issue Jun 26, 2024 · 2 comments · Fixed by #271
Assignees
Labels
enhancement New feature or request epic

Comments

@amontoison
Copy link
Member

amontoison commented Jun 26, 2024

In ADNLPModels.jl, we currently use column coloring to determine the seeds required to compute compressed Jacobians and Hessians.

Column coloring doesn’t exploit the symmetry of the Hessian and could be replaced with symmetric coloring. With this new coloring, I expect to use fewer seeds to determine all nonzeros of the Hessians. Since Ipopt and KNITRO only require the Hessian of the Lagrangian, this feature is valuable to add to ADNLPModels.jl for solving optimization problems modeled by this package.

The drawback of symmetric coloring is that it's harder to know if, for a given coefficient H[i,j], we need to use the color associated with column j or i. Depending on this, H[i,j] will be stored in row i or j of the vector that represents the "compressed columns" associated with the color c[j] or c[i], where c is the vector that assigns a color to each column.

From what I understand, and thanks to Guillaume Dalle, we can use the property associated with star coloring, which is the method used in symmetric coloring, to perform this decompression step (see gdalle/SparseMatrixColorings.jl#22).

I have opened PR #257 to perform more efficient column decompression in sparse Jacobians and Hessians by preallocating the index of the nnz of the sparse matrix computed by a seed. This will make it easier to use symmetric coloring/decompression in the Hessian later.

I propose the following code for symmetric coloring, but I would like some comments/suggestions before we use it.

@PierreMartinon @ocots @jbcaillau @gergaud @tmigot @gdalle

using SparseMatrixColorings

# c = symmetric_coloring(H, algo)
#
# Use algorithm `algo` to construct a symmetrically structurally orthogonal
# partition of the columns (or rows) of the symmetric matrix `H`.
# The result is a coloring vector `c` of length size(H, 1) == size(H, 2)
# such that for every non-zero coefficient H[i,j], at least one of
# the following conditions holds:
#
#  • column j is the only column of its color c[j] with a non-zero coefficient in row i;
#  • column i is the only column of its color c[i] with a non-zero coefficient in row j.
#
# algo = GreedyColoringAlgorithm()
# c = symmetric_coloring(H, algo)
# ncolors = maximum(colors)

function nnz_colors(trilH, colors, ncolors)
  # We want to determine the coefficients in `trilH` that will be computed by a given color.
  # Because we exploit the symmetry, we also need to store the row index for a given coefficient
  # in the "compressed column".
  dcolors = Dict(i => Tuple{Int,Int}[] for i=1:ncolors)

  for j in 1:n
    for k in trilH.colptr[j]:trilH.colptr[j+1]-1
      i = trilH.rowval[k]

      # Should we use c[i] or c[j] for this coefficient H[i,j]?
      ci = colors[i]
      cj = colors[j]

      if i == j
        # H[i,j] is a coefficient of the diagonal
        push!(dcolors[ci], (j,k))
      else # i > j
        if is_only_color_in_row(H, i, j, n, colors, ci)
          # column i is the only column of its color c[i] with a non-zero coefficient in row j
          push!(dcolors[ci], (j,k))
        else
          # column j is the only column of its color c[j] with a non-zero coefficient in row i
          # it is ensured by the star coloring used in `symmetric_coloring`.
          push!(dcolors[cj], (i,k))
        end
      end
    end
  end

  return dcolors
end

function is_only_color_in_row(H, i, j, n, colors, ci)
  column = 1
  flag = true
  while flag && column  n
    # We want to check that all columns (except the i-th one) with color
    # ci don’t have a non-zero coefficient in row j
    if (column != i) && (colors[column] == ci)
      k = H.colptr[column]
      while (k  H.colptr[column+1]-1) && flag
        row = H.rowval[k]
        if row == j
          # We found a coefficient at row j in a column of color ci
          flag = false
        else # row != j
          k += 1 
        end
      end
    end
    column += 1
  end
  return flag
end

n = 10
H = sprand(n, n, 0.0)
H = H + I
for p in [1, n-1, n]
  H[p,:] .= 1.0
  H[:,p] .= 1.0
end
# julia> H
# 10×10 SparseMatrixCSC{Float64, Int64} with 58 stored entries:
# 1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0
# 1.0  1.0   ⋅    ⋅    ⋅    ⋅    ⋅    ⋅   1.0  1.0
# 1.0   ⋅   1.0   ⋅    ⋅    ⋅    ⋅    ⋅   1.0  1.0
# 1.0   ⋅    ⋅   1.0   ⋅    ⋅    ⋅    ⋅   1.0  1.0
# 1.0   ⋅    ⋅    ⋅   1.0   ⋅    ⋅    ⋅   1.0  1.0
# 1.0   ⋅    ⋅    ⋅    ⋅   1.0   ⋅    ⋅   1.0  1.0
# 1.0   ⋅    ⋅    ⋅    ⋅    ⋅   1.0   ⋅   1.0  1.0
# 1.0   ⋅    ⋅    ⋅    ⋅    ⋅    ⋅   1.0  1.0  1.0
# 1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0
# 1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0

trilH = tril(H)
# julia> trilH = tril(H)
# 10×10 SparseMatrixCSC{Float64, Int64} with 34 stored entries:
#  1.0   ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅ 
#  1.0  1.0   ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅ 
#  1.0   ⋅   1.0   ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅ 
#  1.0   ⋅    ⋅   1.0   ⋅    ⋅    ⋅    ⋅    ⋅    ⋅ 
#  1.0   ⋅    ⋅    ⋅   1.0   ⋅    ⋅    ⋅    ⋅    ⋅ 
#  1.0   ⋅    ⋅    ⋅    ⋅   1.0   ⋅    ⋅    ⋅    ⋅ 
#  1.0   ⋅    ⋅    ⋅    ⋅    ⋅   1.0   ⋅    ⋅    ⋅ 
#  1.0   ⋅    ⋅    ⋅    ⋅    ⋅    ⋅   1.0   ⋅    ⋅ 
#  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0   ⋅ 
#  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0

colors = symmetric_coloring(H, algo)
# julia> colors
# 10-element Vector{Int64}:
#  1
#  2
#  2
#  2
#  2
#  2
#  2
#  2
#  3
#  4

ncolors = maximum(colors)
# julia> ncolors
# 4

dcolors = nnz_colors(trilH, colors, ncolors)
# Dict{Int64, Vector{Tuple{Int64, Int64}}} with 4 entries:
#   4 => [(1, 10), (2, 13), (3, 16), (4, 19), (5, 22), (6, 25), (7, 28), (8, 31), (9, 33), (10, 34)]
#   2 => [(2, 11), (3, 14), (4, 17), (5, 20), (6, 23), (7, 26), (8, 29)]
#   3 => [(1, 9), (2, 12), (3, 15), (4, 18), (5, 21), (6, 24), (7, 27), (8, 30), (9, 32)]
#   1 => [(1, 1), (2, 2), (3, 3), (4, 4), (5, 5), (6, 6), (7, 7), (8, 8)]

n = 10
H2 = sprand(n, n, 0.0)
for i = 1:n
  H2[i,n-i+1] = 1.0
end
# julia> H2
# 10×10 SparseMatrixCSC{Float64, Int64} with 10 stored entries:
#  ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅   1.0
#  ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅   1.0   ⋅ 
#  ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅   1.0   ⋅    ⋅ 
#  ⋅    ⋅    ⋅    ⋅    ⋅    ⋅   1.0   ⋅    ⋅    ⋅ 
#  ⋅    ⋅    ⋅    ⋅    ⋅   1.0   ⋅    ⋅    ⋅    ⋅ 
#  ⋅    ⋅    ⋅    ⋅   1.0   ⋅    ⋅    ⋅    ⋅    ⋅ 
#  ⋅    ⋅    ⋅   1.0   ⋅    ⋅    ⋅    ⋅    ⋅    ⋅ 
#  ⋅    ⋅   1.0   ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅ 
#  ⋅   1.0   ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅ 
# 1.0   ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅ 

trilH2 = tril(H2)
# julia> trilH2 = tril(H2)
# 10×10 SparseMatrixCSC{Float64, Int64} with 5 stored entries:
#   ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅ 
#   ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅ 
#   ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅ 
#   ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅ 
#   ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅ 
#   ⋅    ⋅    ⋅    ⋅   1.0   ⋅    ⋅    ⋅    ⋅    ⋅ 
#   ⋅    ⋅    ⋅   1.0   ⋅    ⋅    ⋅    ⋅    ⋅    ⋅ 
#   ⋅    ⋅   1.0   ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅ 
#   ⋅   1.0   ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅ 
#  1.0   ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅ 

colors2 = symmetric_coloring(H2, algo)
# julia> colors2 = symmetric_coloring(H2, algo)
# 10-element Vector{Int64}:
#  1
#  1
#  1
#  1
#  1
#  2
#  2
#  2
#  2
#  2

ncolors2 = maximum(colors2)
# julia> ncolors2
# 2

dcolors2 = nnz_colors(trilH2, colors2, ncolors2)
# Dict{Int64, Vector{Tuple{Int64, Int64}}} with 2 entries:
#   2 => []
#   1 => [(10, 1), (9, 2), (8, 3), (7, 4), (6, 5)]

# We need to be careful in our AD tool to not use all colors,
# dcolors2[2] = [] and we don't need the color 2 here!

n = 10
H3 = sprand(n, n, 0.0)
for i = 1:n
  H3[i,n-i+1] = 1.0
end
for p in [1, n]
  H3[p,:] .= 1.0
  H3[:,p] .= 1.0
end
# julia> H3
# 10×10 SparseMatrixCSC{Float64, Int64} with 44 stored entries:
# 1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0
# 1.0   ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅   1.0  1.0
# 1.0   ⋅    ⋅    ⋅    ⋅    ⋅    ⋅   1.0   ⋅   1.0
# 1.0   ⋅    ⋅    ⋅    ⋅    ⋅   1.0   ⋅    ⋅   1.0
# 1.0   ⋅    ⋅    ⋅    ⋅   1.0   ⋅    ⋅    ⋅   1.0
# 1.0   ⋅    ⋅    ⋅   1.0   ⋅    ⋅    ⋅    ⋅   1.0
# 1.0   ⋅    ⋅   1.0   ⋅    ⋅    ⋅    ⋅    ⋅   1.0
# 1.0   ⋅   1.0   ⋅    ⋅    ⋅    ⋅    ⋅    ⋅   1.0
# 1.0  1.0   ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅   1.0
# 1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0

trilH3 = tril(H3)
# julia> trilH3
# 10×10 SparseMatrixCSC{Float64, Int64} with 23 stored entries:
#  1.0   ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅ 
#  1.0   ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅ 
#  1.0   ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅ 
#  1.0   ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅ 
#  1.0   ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅ 
#  1.0   ⋅    ⋅    ⋅   1.0   ⋅    ⋅    ⋅    ⋅    ⋅ 
#  1.0   ⋅    ⋅   1.0   ⋅    ⋅    ⋅    ⋅    ⋅    ⋅ 
#  1.0   ⋅   1.0   ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅ 
#  1.0  1.0   ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅ 
#  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0

colors3 = symmetric_coloring(H3, algo)
# julia> colors3 = symmetric_coloring(H3, algo)
# 10-element Vector{Int64}:
#  1
#  2
#  2
#  2
#  2
#  3
#  3
#  3
#  3
#  4

ncolors3 = maximum(colors3)
# julia> ncolors3 = maximum(colors3)
# 4

dcolors3 = nnz_colors(trilH3, colors3, ncolors3)
# julia> dcolors3 = nnz_colors(trilH3, colors3, ncolors3)
# Dict{Int64, Vector{Tuple{Int64, Int64}}} with 4 entries:
#   4 => [(1, 10), (2, 12), (3, 14), (4, 16), (5, 18), (6, 19), (7, 20), (8, 21), (9, 22), (10, 23)]
#   2 => [(8, 13), (7, 15), (6, 17)]
#   3 => [(2, 11)]
#   1 => [(1, 1), (2, 2), (3, 3), (4, 4), (5, 5), (6, 6), (7, 7), (8, 8), (9, 9)]
@amontoison amontoison added enhancement New feature or request epic labels Jun 26, 2024
@amontoison amontoison self-assigned this Jun 26, 2024
@gdalle
Copy link
Collaborator

gdalle commented Jun 26, 2024

Indeed, the decompression step after a symmetric/star coloring is a bit more tricky, because H[i, j] can be read from the group of either column i or column j. The "right" algorithms are given in the following paper: https://pubsonline.informs.org/doi/abs/10.1287/ijoc.1080.0286. In my opinion, they belong in the library SparseMatrixColorings.jl.

At the moment, only DirectRecover1 is implemented, but ideally we want DirectRecover2 because star coloring creates a byproduct that helps with decompression (the "set of 2-colored stars").
The only obstacle for that is to change the API of ADTypes.jl so that ADTypes.symmetric_coloring is allowed to return something else in addition to the vector of colors.

image
image

@jbcaillau
Copy link

I have opened PR #257 to perform more efficient column decompression in sparse Jacobians and Hessians by preallocating the index of the nnz of the sparse matrix computed by a seed. This will make it easier to use symmetric coloring/decompression in the Hessian later.

thanks @amontoison for the update. can't be of much help on this. have you considered exchanging on this with @abuttari or JY L'Excellent?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request epic
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants