Avoid security issues of subprocess shell #6498
hostname_cmd = ["hostname -I"]
result = subprocess.check_output(hostname_cmd, shell=True)
import shlex
hostname_cmd = shlex.split("hostname -I")
We may not need shlex.split when the command has no placeholder for injection. The same goes for some other fixes.
For hostname -I, using socket.gethostname and socket.gethostbyname_ex can be safer.
@tohtana, thanks for the feedback. I will remove the shlex.split() here. However, I am not getting the same output if I use socket.gethostbyname_ex(socket.gethostname()).
Actually, I think the shlex.split() is still needed to pass args as a list, since we are removing shell=True. Alternatively, I could manually construct the list. However, I think I will keep the shlex.split() to future-proof for arg changes.
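To make the list-versus-string point concrete, here is a minimal sketch (not the PR's actual code) of how shlex.split produces the argument list that subprocess expects once shell=True is removed. The helper name build_cmd is hypothetical, and the subprocess call runs the current Python interpreter rather than hostname -I so that it works on any platform.

```python
import shlex
import subprocess
import sys
from typing import List


def build_cmd(command: str) -> List[str]:
    """Split a command string into an argv-style list (hypothetical helper)."""
    return shlex.split(command)


# "hostname -I" becomes a two-element list: the program name, then its flag.
hostname_cmd = build_cmd("hostname -I")

# Without shell=True, a one-element list such as ["hostname -I"] would be
# looked up as a single program whose name contains a space.

# Run a command in list form. The interpreter is used here only because it
# is guaranteed to exist wherever this sketch runs.
py_cmd = [sys.executable, "-c", "print('ok')"]
result = subprocess.check_output(py_cmd).decode("utf-8").strip()
```

Manually writing ["hostname", "-I"] gives the same list; shlex.split mainly helps if the command string later gains quoted arguments.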
How about this code? This is a general proposal to make the code robust against malicious modifications of system commands. But I don't think this is crucial because it won't be a typical attack that can harm our users. We can just keep hostname -I if this doesn't work.
>>> import socket
>>> hostname = socket.gethostname()
>>> ip_addresses = socket.gethostbyname_ex(hostname)[2]
>>> ip_addresses
['172.17.0.2']
I see that as well, it looks like we would need to modify the command to be:
>>> socket.gethostbyname_ex(socket.gethostname()+".local")[2][0]
Though it is slower to run on my system with that change.
I wonder if hosts or DNS in your system (Windows?) has [HOSTNAME].local, but it doesn't work on my env.
>>> socket.gethostbyname_ex(socket.gethostname()+".local")[2][0]
This works and gives correct/expected results on my Linux lambda box.
@tohtana, do you mean this does not work for you?
It doesn't work on my WSL.
$ python -c 'import socket; socket.gethostbyname_ex(socket.gethostname()+".local")[2][0]'
Traceback (most recent call last):
File "<string>", line 1, in <module>
socket.gaierror: [Errno -5] No address associated with hostname
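Since the same lookup succeeds on one machine and raises socket.gaierror on another, any socket-based replacement would likely need to tolerate resolution failures. The following is a rough sketch under that assumption, not code from the PR: the function name resolve_ips and its resolver parameter are hypothetical, the latter added only so the failure path can be exercised without depending on local DNS or /etc/hosts.

```python
import socket
from typing import Callable, List


def resolve_ips(
    hostname: str,
    resolver: Callable = socket.gethostbyname_ex,
) -> List[str]:
    """Return the IPv4 addresses reported for hostname, or [] if lookup fails."""
    try:
        # gethostbyname_ex returns (canonical_name, alias_list, ip_address_list).
        return list(resolver(hostname)[2])
    except (socket.gaierror, socket.herror):
        # Name resolution failed, as in the WSL traceback above.
        return []


def local_ip_candidates() -> List[str]:
    """Try the bare hostname first, then the ".local" variant from the thread."""
    name = socket.gethostname()
    return resolve_ips(name) or resolve_ips(name + ".local")
```

A caller could fall back to the hostname -I command when local_ip_candidates() returns an empty list, which keeps the current behaviour on systems where neither name resolves.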
Interesting, I'd say let's leave it as hostname -I for now, and we can make another PR to update it where we can more strenuously test Windows and other OSs?
I agree to address this in another PR. This one is urgent and focuses on security.
But I'm not sure it is a good idea to get the IP from the first entry of hostname -I. It is not simple to control it, even for the administrator. It is easier to tell users to configure /etc/hosts properly.
After a quick look at the usage, it can probably also be a hostname, not an IP.
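For illustration, taking "the first entry from hostname -I" amounts to something like the sketch below. The function names and the sample byte string are made up for this example and are not taken from the PR. The command prints addresses separated by whitespace, and the order is not something the caller controls, which is the concern raised above.

```python
from typing import List, Optional


def parse_hostname_output(raw: bytes) -> List[str]:
    """Split the whitespace-separated output of `hostname -I` into addresses."""
    return raw.decode("utf-8").split()


def pick_first(addresses: List[str]) -> Optional[str]:
    """Return the first address, or None when the command printed nothing."""
    return addresses[0] if addresses else None


# Hypothetical output from a host with two configured addresses.
sample = b"172.17.0.2 10.0.0.5 \n"
addresses = parse_hostname_output(sample)
first = pick_first(addresses)
```

If the interfaces were enumerated in a different order, pick_first would return a different address with no change to the code.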
@loadams, do you have any idea how to repro these py UT failures?
Yes, @tjruwase - I believe this should just be taking an environment with torch installed and cloning your branch and running Though these look to be caused by the changes in setup.py not being able to find git to get the commit hash to build the wheel with name 0.15.1+commithash.
Unfortunately, I am unable to repro this failure following those steps.
Yes, I think that is the problem. This works with
…epSpeed into olruwase/safe_py_subprocess
Seems to be fixed by prepending with
This does run in a docker container and you can run using the same container if needed to test as well, but looks like it is largely resolved as well.
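As a rough illustration of the setup.py failure discussed above (git not being found when reading the commit hash), a shell-free lookup with a fallback might look like the sketch below. This is an assumption-laden example and not the project's setup.py: the function name get_git_hash, the "unknown" default, and the injectable which parameter are all hypothetical.

```python
import shutil
import subprocess
from typing import Callable, Optional


def get_git_hash(
    default: str = "unknown",
    which: Callable[[str], Optional[str]] = shutil.which,
) -> str:
    """Return the short commit hash, or default if git or the repo is missing."""
    git = which("git")
    if git is None:
        # git is not on PATH, so there is nothing to run.
        return default
    try:
        out = subprocess.check_output(
            [git, "rev-parse", "--short", "HEAD"],  # list form, no shell=True
            stderr=subprocess.DEVNULL,
        )
    except (subprocess.CalledProcessError, OSError):
        # Not inside a git checkout, or the executable could not be run.
        return default
    return out.decode("utf-8").strip() or default
```

Resolving the executable with shutil.which before calling it makes the "git not found" case explicit instead of surfacing as an exception during the build.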
We are hitting a few issues in the ROCm environment (Ubuntu ROCm PyTorch container) due to this PR.
Building extension module transformer_inference...
[rank4]: Traceback (most recent call last):
Please let me know if we need to create a new issue to track it.
@jagadish-amd, yes, please create a new ticket.
Avoid security issues of shell=True in subprocess