zed: protect against wait4()/fork() races to the launched process tree
As soon as wait4() returns, the reaped PID is free for reuse: a
concurrent fork() can immediately return the same PID, win the race to
lock _launched_processes_lock, and try to add the new (duplicate) PID
to _launched_processes, which trips an assertion
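
The interleaving described above can be replayed deterministically. This is only a sketch: the fixed-size array is a hypothetical stand-in for _launched_processes (not the real AVL tree), the PID passed in is made up, and no real fork() or wait4() is involved; a `false` return marks the point where the real insert would assert.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical stand-in for _launched_processes: a tiny PID set. */
#define	MAX_TRACKED	8
static pid_t tracked[MAX_TRACKED];
static size_t ntracked;

/* Returns false on a duplicate, where the real tree insert would assert. */
static bool
tracked_add(pid_t pid)
{
	for (size_t i = 0; i < ntracked; i++)
		if (tracked[i] == pid)
			return (false);
	assert(ntracked < MAX_TRACKED);
	tracked[ntracked++] = pid;
	return (true);
}

/* Returns true if the PID was present and has been removed. */
static bool
tracked_remove(pid_t pid)
{
	for (size_t i = 0; i < ntracked; i++) {
		if (tracked[i] == pid) {
			tracked[i] = tracked[--ntracked];
			return (true);
		}
	}
	return (false);
}

/*
 * Unlucky schedule: the kernel has released the PID (the wait has
 * returned), but the spawner gets the lock before the reaper has
 * removed the old entry.
 */
static bool
replay_race(pid_t pid)
{
	ntracked = 0;
	(void) tracked_add(pid);		/* original launch */
	bool spawner_ok = tracked_add(pid);	/* reused PID, added too early */
	(void) tracked_remove(pid);		/* reaper pops the old entry late */
	return (spawner_ok);
}

/* Same events, but the reaper pops the entry before the spawner adds. */
static bool
replay_fixed(pid_t pid)
{
	ntracked = 0;
	(void) tracked_add(pid);
	(void) tracked_remove(pid);
	return (tracked_add(pid));
}
```

Here `replay_race()` reports the duplicate and `replay_fixed()` does not; the only difference is the order of the reaper's remove and the spawner's add.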

By locking before wait4(), we ensure that, given the same unfortunate
scheduling, _launched_processes_lock cannot be locked by the spawner
before we pop the process in the reaper, and only afterward will the
new PID be added
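
A minimal sketch of the lock-before-wait ordering, under stated assumptions: `tracked`, `tracked_lock`, `spawn_tracked()` and `reap_one()` are hypothetical names, a small array replaces the AVL tree, and waitpid() is swapped in for wait4() because resource usage is not needed here. It is not the zed implementation, only an illustration of why holding the lock across the non-blocking wait keeps the reaper's remove ahead of any spawner's add.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-ins for _launched_processes and its lock. */
static pthread_mutex_t tracked_lock = PTHREAD_MUTEX_INITIALIZER;
static pid_t tracked[8];
static size_t ntracked;

/* Spawner side: fork a child that exits at once, then record its PID. */
static pid_t
spawn_tracked(void)
{
	pid_t pid = fork();

	if (pid == 0)
		_exit(0);
	if (pid > 0) {
		(void) pthread_mutex_lock(&tracked_lock);
		assert(ntracked < 8);
		for (size_t i = 0; i < ntracked; i++)
			assert(tracked[i] != pid);	/* duplicate is the bug */
		tracked[ntracked++] = pid;
		(void) pthread_mutex_unlock(&tracked_lock);
	}
	return (pid);
}

/*
 * Reaper side: take the lock first, then poll without blocking.
 * Returns the reaped PID, 0 if children exist but none has exited,
 * or -1 on error (for example ECHILD when there are no children).
 */
static pid_t
reap_one(void)
{
	(void) pthread_mutex_lock(&tracked_lock);
	pid_t pid = waitpid(-1, NULL, WNOHANG);
	if (pid > 0) {
		/* The PID became reusable inside the locked region. */
		for (size_t i = 0; i < ntracked; i++) {
			if (tracked[i] == pid) {
				tracked[i] = tracked[--ntracked];
				break;
			}
		}
	}
	(void) pthread_mutex_unlock(&tracked_lock);
	return (pid);
}
```

Because the wait happens with the lock held, a spawner whose fork() returns the recycled PID has to block in `pthread_mutex_lock()` until the old entry is gone.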

When children are running, the reaper now idles in pause() rather than
in wait4(); the lock is held only for the duration of the single
non-blocking wait4() call, in both the no-children and running-children
cases; the impact of this is one to two syscalls (depending on
_launched_processes_lock state) per loop
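
The idle decision can be sketched as a small classifier over the result of a non-blocking wait. The enum labels and helper names below are hypothetical, and waitpid() stands in for wait4(); the branch order follows the patched loop (a zero return or ECHILD leads to pause(), EINTR is retried silently, anything else is logged).

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical labels for what the reaper loop does next. */
enum next_step { STEP_HANDLE_EXIT, STEP_IDLE, STEP_RETRY, STEP_WARN };

/* Mirrors the branch structure of the patched loop for a WNOHANG result. */
static enum next_step
classify_poll(pid_t pid, int err)
{
	if (pid > 0)
		return (STEP_HANDLE_EXIT);	/* a child exited: look it up */
	if (pid == 0 || err == ECHILD)
		return (STEP_IDLE);		/* nothing to reap: pause() */
	if (err == EINTR)
		return (STEP_RETRY);		/* interrupted: loop again */
	return (STEP_WARN);			/* unexpected: log a warning */
}

/* One non-blocking poll of any child, classified as above. */
static enum next_step
poll_children(void)
{
	pid_t pid = waitpid(-1, NULL, WNOHANG);

	return (classify_poll(pid, errno));
}

/* Demo helper: fork a child that blocks until it is killed. */
static pid_t
spawn_sleeper(void)
{
	pid_t pid = fork();

	if (pid == 0) {
		for (;;)
			(void) pause();
	}
	return (pid);
}

/* Demo helper: kill the child and collect it with a blocking wait. */
static pid_t
stop_child(pid_t pid)
{
	assert(pid > 0);	/* never signal a whole process group */
	(void) kill(pid, SIGKILL);
	return (waitpid(pid, NULL, 0));
}
```

Both the no-children case (ECHILD) and the running-children case (a zero return) land on `STEP_IDLE`, which is where the patched loop drops the lock and calls pause().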

Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Don Brady <[email protected]>
Signed-off-by: Ahelenia Ziemiańska <[email protected]>
Closes #11924
Closes #11928
nabijaczleweli authored and behlendorf committed Apr 22, 2021
1 parent 79d9f66 commit 8fd6535
Showing 1 changed file with 5 additions and 4 deletions.
9 changes: 5 additions & 4 deletions cmd/zed/zed_exec.c
@@ -205,10 +205,12 @@ _reap_children(void *arg)
 	(void) sigaction(SIGCHLD, &sa, NULL);

 	for (_reap_children_stop = B_FALSE; !_reap_children_stop; ) {
-		pid = wait4(0, &status, 0, &usage);
+		(void) pthread_mutex_lock(&_launched_processes_lock);
+		pid = wait4(0, &status, WNOHANG, &usage);

-		if (pid == (pid_t)-1) {
-			if (errno == ECHILD)
+		if (pid == 0 || pid == (pid_t)-1) {
+			(void) pthread_mutex_unlock(&_launched_processes_lock);
+			if (pid == 0 || errno == ECHILD)
 				pause();
 			else if (errno != EINTR)
 				zed_log_msg(LOG_WARNING,
@@ -217,7 +219,6 @@ _reap_children(void *arg)
 		} else {
 			memset(&node, 0, sizeof (node));
 			node.pid = pid;
-			(void) pthread_mutex_lock(&_launched_processes_lock);
 			pnode = avl_find(&_launched_processes, &node, NULL);
 			if (pnode) {
 				memcpy(&node, pnode, sizeof (node));
