Shell application idle timeouts active without configuration #3928

Open
mrobbert opened this issue Nov 4, 2024 · 3 comments

mrobbert commented Nov 4, 2024

We recently upgraded a couple of our OnDemand hosts from 3.1.7 to 3.1.9 and have found that the shell application is disconnecting after 1 minute of inactivity. I found the documentation for the Ping Ponging feature that was added recently, but it says this feature should be disabled by default so this change in functionality is unexpected.
None of our hosts had the /etc/ood/config/apps/shell/env file to configure this feature. We have added it to the hosts running 3.1.9 in order to mitigate problems with this change, but I have found that when I set OOD_SHELL_PING_PONG=false it changes the idle timeout from 1 minute to 5 minutes.
I have also confirmed that no idle timeout is in effect on the host we reverted to 3.1.7 in a quick attempt to work around the httpd bug.

@osc-bot osc-bot added this to the Backlog milestone Nov 4, 2024
@johrstrom
Contributor

but it says this feature should be disabled by default so this change in functionality is unexpected.

I understand it was unexpected to change this in the middle of the 3.1.x series. From my perspective, it's better to have conservative defaults, especially with regard to security issues, so that's why it was turned off. Indeed, 3.1.0 was the first release to have ping pongs at all, and had I really considered the security impact at that time, I would have disabled it by default then too, just to have secure defaults.

when I set OOD_SHELL_PING_PONG=false it changes the idle timeout from 1 minute to 5 minutes.

Once you turn ping ponging on (by setting the environment variable to anything), we start ping ponging, which keeps the connection alive past Apache's connection timeout of 60 seconds (1 minute). So it is Apache that times you out in 2.x and below, or in 3.1.9 with ping pong left off.

The 5 minutes you then see is OOD_SHELL_INACTIVE_TIMEOUT_MS, which defaults to 5 minutes (again, a conservative default because it's a security issue).

I have also confirmed that no idle timeout is in effect on the host we reverted to 3.1.7 in a quick attempt to work around the httpd bug.

Yeah, that's the security issue that this patched. 3.1.0 enabled ping pongs, but without any restrictions. You'd end up with SSH sessions that potentially last forever, and certainly much longer than any authentication timeout (perhaps even after the account has been disabled!).
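
For readers following along, the 3.1.9 behavior described in this comment can be sketched roughly as below. This is an illustrative Python model, not OnDemand's implementation: the helper name `effective_idle_timeout_ms` is hypothetical, and treating an empty value as "unset" is an assumption of the sketch. The two variable names and the 60-second and 5-minute figures are taken from the comment above.

```python
from typing import Mapping

APACHE_IDLE_TIMEOUT_MS = 60_000        # Apache closes idle connections after 60 s
DEFAULT_INACTIVE_TIMEOUT_MS = 300_000  # default OOD_SHELL_INACTIVE_TIMEOUT_MS (5 min)


def effective_idle_timeout_ms(env: Mapping[str, str]) -> int:
    """Hypothetical helper: idle time before a 3.1.9 shell session is dropped."""
    # Any non-empty value turns ping pong on, even the string "false".
    # (Treating an empty value as "unset" is an assumption of this sketch.)
    ping_pong_on = bool(env.get("OOD_SHELL_PING_PONG"))
    if not ping_pong_on:
        # No ping pongs: Apache's own idle timeout is what disconnects you.
        return APACHE_IDLE_TIMEOUT_MS
    # Ping pongs keep Apache's connection open, so the shell app's
    # inactivity limit is the one that applies.
    return int(env.get("OOD_SHELL_INACTIVE_TIMEOUT_MS", DEFAULT_INACTIVE_TIMEOUT_MS))


print(effective_idle_timeout_ms({}))
print(effective_idle_timeout_ms({"OOD_SHELL_PING_PONG": "false"}))
```

The second call mirrors the report in this issue: setting the variable to `false` still counts as setting it, which is why the idle timeout moved from 1 minute to 5 minutes.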

@mrobbert
Author

mrobbert commented Nov 4, 2024

Jeff,
Thanks for your quick response. I now understand that this change was an expected part of this update, so we will be sure to enable Ping Pong with appropriate timeout values.
I think that the disconnect comes from me not understanding that the default timeout without Ping Pong was so low, or even there at all. I don't recall having this problem before PingPong was in the OOD code. I read the documentation to mean that Ping Pong was added in order to give us a way to disconnect users where there was previously no timeout at all, so it was unexpected to get disconnected with that disabled.
Is it possible to change the documentation for this feature to indicate that it is a way to extend the timeout rather than to add one where it didn't previously exist?

@johrstrom
Contributor

I think that the disconnect comes from me not understanding that the default timeout without Ping Pong was so low, or even there at all. I don't recall having this problem before PingPong was in the OOD code.

You must have been lucky. As long as you generate activity, you're OK. Think of tailing a file that never ends: there's always network activity, so Apache keeps the connection open. The problem comes when that tail suddenly stops, or you Ctrl+C out of it and step away from the keyboard.

Ping Pong was added in order to give us a way to disconnect users

Kind of the opposite: it was added to keep users' shell sessions active. The security patch is what gave us a way to disconnect them.

Indeed, a quick Discourse search seems to confirm what I suspected: for years folks have complained about shell sessions disconnecting easily.

https://discourse.openondemand.org/search?q=shell%20timeout%20order%3Alatest

Here's the history:

  • before 3.1.0: Apache closes any connection that has had no activity for 60 seconds. This led to shells being disconnected quite frequently, or to users backgrounding infinite loops that print output just to generate activity and keep the session alive. There was no mechanism to disconnect other than Apache's inactivity timeout.
  • 3.1.0-3.1.7: ping pongs are enabled, so the connection potentially lasts forever. Again, there was no mechanism to disconnect other than Apache's inactivity timeout (which the unrestrained ping pong effectively disabled).
  • 3.1.9: ping pongs and other timeout settings are configurable, with actual limits on how long an SSH session can last.
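
The history above can be pictured with a toy model of when an idle session gets closed. This is a hedged sketch, not code from OnDemand or Apache: `disconnect_time` and the `ERA_IDLE_LIMIT_S` table are hypothetical names, the 3.1.9 entry assumes ping pong is on with the default 5-minute inactivity timeout, and ping pongs are modeled simply as "no effective idle limit" for 3.1.0-3.1.7.

```python
from typing import Optional, Sequence

# Idle limit (seconds) that effectively applies in each era described above.
# None means nothing ever closes an idle session.
ERA_IDLE_LIMIT_S = {
    "before 3.1.0": 60,                   # Apache's inactivity timeout
    "3.1.0-3.1.7": None,                  # unrestrained ping pongs
    "3.1.9 (ping pong on, default)": 300, # default inactivity timeout
}


def disconnect_time(activity_s: Sequence[float],
                    idle_limit_s: Optional[float]) -> Optional[float]:
    """Return the time (s) at which the session is closed, or None if never.

    activity_s holds the times of user-visible activity, with the session
    opening at t=0. The session closes once a gap exceeds idle_limit_s.
    """
    if idle_limit_s is None:
        return None
    last = 0.0
    for t in sorted(activity_s):
        if t - last > idle_limit_s:
            break  # the session was already closed before this event
        last = t
    return last + idle_limit_s


# A tail that prints every 10 s, then stops at t=100 s.
tail_output = list(range(10, 101, 10))
for era, limit in ERA_IDLE_LIMIT_S.items():
    print(era, "->", disconnect_time(tail_output, limit))
```

While the tail keeps printing, every era keeps the session open. The eras only differ in what happens after the output stops.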
