Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Daemon exit status available in the API #15

Open
wants to merge 2 commits into
base: master
Choose a base branch
from
Open

Daemon exit status available in the API #15

wants to merge 2 commits into from

Conversation

simon-batardiere
Copy link

No description provided.

@damjan-we damjan-we requested a review from lkristan-plume June 5, 2020 08:36
Copy link

@zhichengqiu zhichengqiu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@damjan-we damjan-we requested a review from mitja-plume June 8, 2020 07:45
Copy link
Contributor

@mitja-plume mitja-plume left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How is the field supposed to be accessed? Directly via the daemon_t structure?

If so, I suggest adding an accessor function. Also, the status must have a clear value representing that the process is still running (ie. the current status is not available because the process is still running). This being 0 is OK with me, but it needs to be documented in the header of the accessor function.

@simon-batardiere
Copy link
Author

simon-batardiere commented Jun 12, 2020

The previous commit add an API to get the exit status.
I don't find an easy way to know if the process is running or not. Because dn_enabled could be set to true in case of restart and doesn't really reflect if the process is running or not.

Copy link
Contributor

@lkristan-plume lkristan-plume left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is going in wrong direction. The idea of the daemon module is to take as much of the burden off of the caller as possible. The module itself should (and does) log all the details about the exit status.

Judging by the code in https://github.com/plume-design/opensync/pull/16/files#diff-053599008b20818e4dfcc63a39ad1221R73, the implementation of the 'atexit' callback merely tries to log something that was already logged.

As it is, the implementation of pwm_DHCP_option82_at_exit() should probably be:

    (void)daemon;
    LOGE("DHCP option82 process exited unexpectedly");
    return true;

... which means this PR (#15) is not needed at all.

AFAICS, the daemon module does not call the callback if the caller requests the daemon to stop. Therefore it makes sense to log an error whenever the callback is called. After all, even if the process exited gracefully, the caller expected it to be running, therefore this is an undesired situation.

But the main purpose of this callback is not logging, rather it is to take a suitable action (e.g. stop advertising a service that depends on this deamon).

@lkristan-plume
Copy link
Contributor

I see some room for improvement here:

... here:

        if (WEXITSTATUS(wstatus) != 0)
        {
            ls = LOG_SEVERITY_ERR;
        }
        else if (self->dn_enabled)
        {
            ls = LOG_SEVERITY_WARN;
        }

... and here:

        LOG(INFO, "Daemon %s failed to start (or exited unexpectedly).", self->dn_exec);

The level is intentionally set to INFO. That way if the atexit callback can log a better error, the log backtrace will only be dumped once.

Simon Batardiere added 2 commits June 15, 2020 19:03
Exit status of a daemon could be useful for managers to know if the daemon
executed sucessfully or not, and can be used then in some logic.
This commit makes the exit status of the daemon availble in the API.

Signed-off-by: Simon Batardiere <[email protected]>
Create a getter in the daemon API to hide the daemon structure
details outside the API.

Signed-off-by: Simon Batardiere <[email protected]>
@simon-batardiere
Copy link
Author

I think this is going in wrong direction. The idea of the daemon module is to take as much of the burden off of the caller as possible. The module itself should (and does) log all the details about the exit status.

Judging by the code in https://github.com/plume-design/opensync/pull/16/files#diff-053599008b20818e4dfcc63a39ad1221R73, the implementation of the 'atexit' callback merely tries to log something that was already logged.

As it is, the implementation of pwm_DHCP_option82_at_exit() should probably be:

    (void)daemon;
    LOGE("DHCP option82 process exited unexpectedly");
    return true;

... which means this PR (#15) is not needed at all.

AFAICS, the daemon module does not call the callback if the caller requests the daemon to stop. Therefore it makes sense to log an error whenever the callback is called. After all, even if the process exited gracefully, the caller expected it to be running, therefore this is an undesired situation.

But the main purpose of this callback is not logging, rather it is to take a suitable action (e.g. stop advertising a service that depends on this deamon).

Hi Lars, I see two usages of the daemon API:

  • Run a daemon
  • Run a process which takes a very long time to finish (for example, a curl command to download a big file).

I agree with you in the context of the daemon, and your comment on the PR for PWM is right. But in the case of a long command, the daemon API is the only valid option. If we use the other APIs, like cmd_log, we will block the event loop for a while, which is always a bad thing. So here we also need to use the daemon API. And the caller may be interested by the exit status of the long command (like the curl). This is why this PR is still useful.

@lkristan-plume
Copy link
Contributor

Hi Simon,

Regarding the two usages:

  • Your example of downloading a file will be invalidated in near future with the introduction of an API specifically for that purpose.
  • We are also considering to introduce a library for async process execution (which the daemon library could then use internally).

Considering the above, and without an immediate need for this change, I am reluctant to give it my approval.

Regardless, here are some additional comments:

1 - @ https://github.com/plume-design/opensync/pull/15/files#diff-6a0c177f1dce309d321ec6382d55b041R417

    self->dn_exit_status = WEXITSTATUS(wstatus);

Return value of WEXITSTATUS may be undefined here (e.g. if the process was signaled).

A possible solution would be to use:

    if (WIFSIGNALED(wstatus))
    {
        self->dn_exit_status = -(WTERMSIG(wstatus));

...

    else if (WIFEXITED(wstatus))
    {
        self->dn_exit_status = WEXITSTATUS(wstatus);

...

    else
    {
        self->dn_exit_status = wstatus;

2 - daemon_get_exit_status() could theoretically be called not only from within the callback, but at any time, asynchronously.
We could say this is not really supported, but it would still be nice to initialize dn_exit_status in daemon_init() (say to INT_MIN), and daemon_get_exit_status() could actually check that value and return false in that case.

Alternatively, one could use -1 for 'still running', and '-2' instead of -(WTERMSIG(wstatus)) above. Not super clean, but should suffice for typical use.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants