Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rbd: cleanup inconsistent state in reserveSnap() after a failure #4946

Merged
merged 1 commit into from
Nov 11, 2024
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
23 changes: 23 additions & 0 deletions internal/rbd/rbd_journal.go
Original file line number Diff line number Diff line change
Expand Up @@ -388,6 +388,18 @@ func (ri *rbdImage) repairImageID(ctx context.Context, j *journal.Connection, fo
func reserveSnap(ctx context.Context, rbdSnap *rbdSnapshot, rbdVol *rbdVolume, cr *util.Credentials) error {
var err error

// restore original values in case of an error
origReservedID := rbdSnap.ReservedID
origRbdSnapName := rbdSnap.RbdSnapName
origVolID := rbdSnap.VolID
defer func() {
if err != nil {
rbdSnap.ReservedID = origReservedID
rbdSnap.RbdSnapName = origRbdSnapName
rbdSnap.VolID = origVolID
}
}()
Comment on lines +391 to +401
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nixpanic any specific reason to preserve and restore these values? i assume if the reservation is done properly these values will be set and not overridden to any values isnt it?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there are these two calls that overwrite the attributes in any case (success of error):

rbdSnap.ReservedID, rbdSnap.RbdSnapName, err = j.ReserveName(...

and

rbdSnap.VolID, err = util.GenerateVolID(...

So, if something goes wrong, it looks nicer to reset the attributes that may have been modified by a failing call.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Setting those attributes only at the end of the function is tricky too, undoSnapReservation() expects them to be set in order to clean things up in the journal.

Copy link
Collaborator

@Madhu-1 Madhu-1 Nov 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nixpanic the ReserveID and RbdSnapName names are not set earlier its only set by calling the j.ReserveName function, IMO we dont need to preserve anything and also calling the defer incase of error should be enough. same goes for volID as well.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Madhu-1 if they are not expected to be set, shall I add a check and error message in case they are not an empty string?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

currently we dont have any case where they are set before calling this function. there are callers doing undo reservation and that will fail or not required

  • Call Reserve to do reservation
  • Undo if there is any error

we updated the current code to do undo internally if something fails.

Doesnt hurt to have this code but this will result in extra error/logs when we try to call the Undo function multiple times.

Let me know what do you prefer

  1. Merge the current changes
  2. Add a check to ensure the values are not set before calling reserveSnap internal function
  3. Dont revert back and allow override.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current PR as it is shows what a function should do/restore. There are no callers that will call undoSnapReservation() in case reserveSnap() fails, so there should not be additional logs for that.

When a function returns an error, there should be no need for the caller to undo whatever internals the function may, or may not, have done. Everything internal to the function should stay there. For the more complex cases, a special undo function can be returned, similar to what Manager.getGroupUUID() does.

Merging the current PR as it is should be fine IMHO, @Madhu-1, but I am also fine with improving it some way you recommend.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, it would not. In case there is an error returned by reserveSnap(), the defer where undoSnapReservation() is called will never be reached.


journalPoolID, imagePoolID, err := util.GetPoolIDs(ctx, rbdSnap.Monitors, rbdSnap.JournalPool, rbdSnap.Pool, cr)
if err != nil {
return err
Expand All @@ -405,6 +417,17 @@ func reserveSnap(ctx context.Context, rbdSnap *rbdSnapshot, rbdVol *rbdVolume, c
ctx, rbdSnap.JournalPool, journalPoolID, rbdSnap.Pool, imagePoolID,
rbdSnap.RequestName, rbdSnap.NamePrefix, rbdVol.RbdImageName, kmsID, rbdSnap.ReservedID, rbdVol.Owner,
"", encryptionType)
defer func() {
// only undo the reservation when an error occurred
if err == nil {
return
}

undoErr := undoSnapReservation(ctx, rbdSnap, cr)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is not sufficient in case reserveSnap() had an internal error. That is the reason why this change was suggested in the 1st place. A failing function should cleanup its own internal partially-completed state on a failure, reserveSnap() did not do that yet.

if undoErr != nil {
log.WarningLog(ctx, "failed undoing reservation of snapshot %q: %v", rbdSnap, undoErr)
}
}()
if err != nil {
return err
}
Expand Down