-
Notifications
You must be signed in to change notification settings - Fork 49
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
multipath-tools 0.10.2 #109
Open
mwilck
wants to merge
32
commits into
opensvc:stable-0.10.y
Choose a base branch
from
openSUSE:stable-0.10.y
base: stable-0.10.y
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
The @v1 releases are deprecated. https://github.blog/changelog/2024-02-13-deprecation-notice-v1-and-v2-of-the-artifact-actions/ Signed-off-by: Martin Wilck <[email protected]>
Signed-off-by: Martin Wilck <[email protected]>
We can't use "container:" any more because upload-artifact@v4 doesn't work on Debian Jessie. Signed-off-by: Martin Wilck <[email protected]>
Running "make" in a container and "make clean" outside doesn't work (access right issues). So just use separate jobs. Signed-off-by: Martin Wilck <[email protected]>
Avoid "text file busy" error on GitHub. dmevents-test: Text file busy Makefile:74: recipe for target 'dmevents.out' failed Signed-off-by: Martin Wilck <[email protected]>
Signed-off-by: Martin Wilck <[email protected]>
The ABI should never change on a stable branch. This workflow asserts that. Signed-off-by: Martin Wilck <[email protected]>
dm_get_maps() traverses the entire list of dm maps. We shouldn't give up just because probing a single map failed. Fixes: bf3a4ad ("libmultipath: simplify dm_get_maps()") Signed-off-by: Martin Wilck <[email protected]>
We import DM_COLDPLUG_SUSPENDED in all code flows below mpath_coldplug_end. Clarify this in the code. Signed-off-by: Martin Wilck <[email protected]> Reviewed-by: Benjamin Marzinski <[email protected]>
Since b22c273 ("11-dm-mpath.rules: Don't force activation while device is suspended"), we've handled the case where a device is suspended while an uevent is processed (e.g. because multipathd is reloading the map again at the same time). But we were missing the case where The device had never been initialized before. If .MPATH_DEVICE_READY_OLD was empty, we'd jump to scan_import without setting MPATH_DEVICE_READY to 0. This can cause a device not to be fully activated at boot time, because in follow-up uevents we'd assume that the device had already been set up. Treat the case in which an uevent is processed for a previously not fully set-up, suspended device like other situations where we set MPATH_DEVICE_READY to 0. Fixes: b22c273 ("11-dm-mpath.rules: Don't force activation while device is suspended") Reviewed-by: Benjamin Marzinski <[email protected]>
Our code is always setting MPATH_UNCHANGED and DM_ACTIVATION in pairs. While DM_ACTIVATION is a global DM property, MPATH_UNCHANGED is owned by us. Just set MPATH_UNCHANGED, and adapt DM_ACTIVATION when necessary just in one place. Signed-off-by: Martin Wilck <[email protected]> Reviewed-by: Benjamin Marzinski <[email protected]>
We set .DM_NOSCAN above where we set DM_UDEV_DISABLE_OTHER_RULES_FLAG, too. If the latter isn't set, .DM_NOSCAN can't be set. Skip the redundant test. Signed-off-by: Martin Wilck <[email protected]> Reviewed-by: Benjamin Marzinski <[email protected]>
When multipath reloads a device or fails or restores a path, the udev rules disable LVM scanning, but since .DM_NOSCAN isn't set, blkid is still run on the device. When multipath devices that are set to queue_if_no_path lose all their paths at close to the same time, udev workers can hang trying to run blkid. The blkid results shouldn't change when multipathd is adding, removing, failing or reinstating paths, aside from avoiding hanging udev processes, we're skipping unnecessary work. Hence, set .DM_NOSCAN if MPATH_UNCHANGED is set, to avoid blkid from being called in 13-dm.rules. Suggested-by: Benjamin Marzinski <[email protected]> Signed-off-by: Martin Wilck <[email protected]> Reviewed-by: Benjamin Marzinski <[email protected]>
If pp->dev_loss is DEV_LOSS_TMO_UNSET and min_dev_loss is 0 (which is the case if no_path_retry is NO_PATH_RETRY_FAIL or NO_PATH_RETRY_UNDEF), we will set pp->dev_loss to 0, which is wrong. Fix it. Fixes: 058b5f5 ("libmultipath: fix dev_loss_tmo even if not set in configuration") Signed-off-by: Martin Wilck <[email protected]> Reviewed-by: Benjamin Marzinski <[email protected]>
If reload_and_sync_map() removes the multipath device, deferred_failback_tick() needs to decrement the counter so that it doesn't skip the following device. Reviewed-by: Martin Wilck <[email protected]> Signed-off-by: Benjamin Marzinski <[email protected]>
Reported by coverity: "i--" may cause an underflow, which will again cause an overflow when the loop continues. Use a signed int for loops like this to make coverity happy. Signed-off-by: Martin Wilck <[email protected]>
Avoid the following error with clang 19: msort.c:268:27: error: cast from '__compar_fn_t' (aka 'int (*)(const void *, const void *)') to '__compar_d_fn_t' (aka 'int (*)(const void *, const void *, void *)') converts to incompatible function type [-Werror,-Wcast-function-type-mismatch] 268 | return msort_r (b, n, s, (__compar_d_fn_t)cmp, NULL); | ^~~~~~~~~~~~~~~~~~~~ Signed-off-by: Martin Wilck <[email protected]>
Signed-off-by: Martin Wilck <[email protected]>
WATCHDOG_USEC may be set to 0, which means that the watchdog is disabled in systemd. Fixes: 9366cfb ("multipathd: Implement systemd watchdog integration") Signed-off-by: Martin Wilck <[email protected]> Reviewed-by: Benjamin Marzinski <[email protected]>
On systems with LVM volumes, "multipath -ll" will spit out lots of messages like libmp_mapinfo: map vg-lv0 has multiple targets which is irritating. Reduce the log level of these messages to 3, as they are harmless most of the time. This is a backport of e8949c2 ("libmultipath: reduce log level of libmp_mapinfo() messages") from the master branch. We can't apply exactly the same fix because the stable branch is missing 8c772d3 ("libmultipath: check DM UUID earlier in libmp_mapinfo__"). Signed-off-by: Martin Wilck <[email protected]>
_find_controllers() needs to free the udev device if it doesn't get added to a path. Otherwise it can leak memory whenever check_foreign() is called, causing multipathd's memory usage to continually grow. Fixes: 7b47762 ("libmultipath: nvme: fix path detection for kernel 4.16") Signed-off-by: Benjamin Marzinski <[email protected]> Reviewed-by: Martin Wilck <[email protected]> (cherry picked from commit 5a3d334)
In function io_err_stat_handle_pathfail(), path->io_err_dis_reinstate_time is set to 0 to enqueue path to io error check as soon as possible. But multipathd can not do it within marginal_path_err_recheck_gap_time seconds after power-on, because curr_time is less than marginal_path_err_recheck_gap_time. To handle the early marginal path, we can enqueue path when io_err_dis_reinstate_time is 0. Signed-off-by: chenrenhui <[email protected]> Reviewed-by: Benjamin Marzinski <[email protected]> Reviewed-by: Martin Wilck <[email protected]> > (cherry picked from commit a1e3cf2)
pp->pgindex is set in disassemble_map() when a map is parsed. There are various possiblities for this index to become invalid. pp->pgindex is only used in enable_group() and followover_should_fallback(), and both callers take no action if it is 0, which is the right thing to do if we don't know the path's pathgroup. Make sure pp->pgindex is reset to 0 in various places: - when it's orphaned, - before (re)grouping paths, - when we detect a bad mpp assignment in update_pathvec_from_dm(). - when a pathgroup is deleted in update_pathvec_from_dm(). In this case, pgindex needs to be invalidated for all paths in all pathgroups after the one that was deleted. The hunk in group_paths is mostly redundant with the hunk in free_pgvec(), but because we're looping over pg->paths in the former and over pg->pgp in the latter, I think it's better too play safe. Fixes: 99db1bd ("[multipathd] re-enable disabled PG when at least one path is up") Fixes: opensvc#105 Signed-off-by: Martin Wilck <[email protected]> Reviewed-by: Benjamin Marzinski <[email protected]> (cherry picked from commit cd912cf) (cherry picked from commit 714c20b)
The previous algorithm didn't detect the case case where cpgp contained a path that was not contained in pgp. Fix this. Cherry-picked from d4b35f6 Fixes: 90773ba ("libmultipath: resolve hash collisions in pgcmp()") Signed-off-by: Martin Wilck <[email protected]>
If map creation succeeds, previously not multipathed devices are now multipathed. udev may not have noticed this yet, thus trigger path uevents to make it aware of the situation. Likewise, if creating a map fails, the paths in question were likely considered multipath members by udev, too. They will now be marked as failed, so trigger an event in this situation as well. Fixes: opensvc#103 Suggested-by: Benjamin Marzinski <[email protected]> Signed-off-by: Martin Wilck <[email protected]> Reviewed-by: Benjamin Marzinski <[email protected]> (cherry picked from commit 98b3a7b)
... if a map has been flushed. In this case, we know that the the paths haven't been multipathed by coalesce_paths() because of the current configuration (failure to create the map can't be the reason if the map exists in coalesce_maps()). Make sure udev sees the paths which have been released from the map as non-multipath. Note that this is the only case where maps are flushed where it is correct to trigger paths uevents. In other cases, e.g. after a "remove map" CLI command, the configuration is unchanged and if we triggered an uevent, the map would be re-created by multipathd when the uevent arrived. Signed-off-by: Martin Wilck <[email protected]> Reviewed-by: Benjamin Marzinski <[email protected]> (cherry picked from commit ad3ea47)
If a multipath device already has need_reload set when a path is adopted, it won't call set_path_max_sectors_kb() because of short-circuit evaluation. This isn't what's intended. Fixes: e5e20c7 ("libmultipath: set max_sectors_kb in adopt_paths()") Signed-off-by: Benjamin Marzinski <[email protected]> Reviewed-by: Martin Wilck <[email protected]> (cherry picked from commit f5c0c4b)
This change doesn't actually fix anything. The code was already safe. Signed-off-by: Benjamin Marzinski <[email protected]> Reviewed-by: Martin Wilck <[email protected]> (cherry picked from commit 85ec51e)
multipath wouldn't autodetect the GROUP_BY_PRIO path grouping policy or allow the GROUP_BY_TPG policy if there was a path that didn't have its prioritizer selected (for instance because multipathd was reconfigured while it was offline). To avoid this, make verify_alua_prio() assume an alua multipath device if all the paths with a prioritizer selected (there must be at least one) use an alua-based prioritizer. Signed-off-by: Benjamin Marzinski <[email protected]> Reviewed-by: Martin Wilck <[email protected]> (cherry picked from commit b47a577)
This comment has been minimized.
This comment has been minimized.
Hm the "stable abi" test worked in my repo. I will double check. |
Signed-off-by: Martin Wilck <[email protected]>
bmarzins
approved these changes
Jan 31, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
multipath-tools 0.10.2, 2025/02
This release contains backported bug fixes from the stable-0.11.y branch.
Bug fixes
if an invalid path device was removed from a map.
Fixes #105.
inconsistent or wrong kernel state (e.g. missing or falsely mapped path
device). Wrongly mapped paths will be unmapped and released to the system.
Fixes another issue reported in
#105.
group_by_tpg
might be disabled if one or morepaths were offline during initial configuration.
max_sectors_kb
might not be set on a path device thatwas being added to a multipath map. This problem was introduced in 0.9.9.
multipath-tools 0.10.1, 2025/01
This is the first bug fix release on the
stable-0.10.y
branch. It containsbug fixes from 0.11.0, and some CI-related fixes.
Bug fixes
of device mapper devices, in particular devices with multiple DM targets.
The problem was introduced in 0.10.0.
Fixes #102.
activated during boot if a cold plug uevent is processed for a previously
not configured multipath map while this map was suspended. This problem existed
since 0.9.8.
no_path_retry fail
and no settingfor
dev_loss_tmo
might get thedev_loss_tmo
set to 0, causing thedevice to be deleted immediately in the event of a transport disruption.
This bug was introduced in 0.9.6.
(
failback
value > 0 inmultipath.conf
), some maps might fail back laterthan configured. The problem existed since 0.9.6.
WATCHDOG_USEC
environment variable had the value "0", which means that thewatchdog is simply disabled. This (minor) problem existed since 0.4.9.
0.7.8.
the io error check for a recently failed path to be delayed. This bug
existed since 0.7.4.