
Disallow running unsafe functions / allow only running safe functions #4

Open · 5 tasks
oubiwann opened this issue Aug 13, 2015 · 9 comments

@oubiwann
Member

oubiwann commented Aug 13, 2015

Proposed tasks:

  • In actual source code, create a list of all allowed Erlang and LFE mod:func combinations
  • During startup, read these into memory: as a list of strings (will this be needed?), as a list of module atoms, and as a list of function atoms
  • From the list of all mod:func occurrences in a POSTed payload:
    • extract the module and attempt to convert to atom with erlang:list_to_existing_atom -- if badarg, this isn't allowed
    • do the same for the function

The consuming REST functions can then return an appropriate HTTP error and JSON error payload upon error conditions.
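The proposed check could be sketched roughly as below. The module name and the allowlist contents are illustrative, not actual try-lfe code; the key trick is that `erlang:list_to_existing_atom` only succeeds for atoms already present in the VM, so unknown module or function names fail fast with `badarg`:

```erlang
%% Hypothetical sketch of the proposed allowlist check. The module name
%% and the ?ALLOWED list are assumptions for illustration only.
-module(tryfe_allow).
-export([check_call/2]).

%% In the real app this would be the full list of allowed mod:func pairs,
%% read into memory at startup.
-define(ALLOWED, [{lists, map}, {lists, foldl}, {erlang, '+'}]).

%% Accept module and function names as strings from the POSTed payload.
check_call(ModStr, FunStr) ->
    try
        %% list_to_existing_atom raises badarg if the atom does not
        %% already exist in the VM -- such a call can't be allowed.
        Mod = list_to_existing_atom(ModStr),
        Fun = list_to_existing_atom(FunStr),
        case lists:member({Mod, Fun}, ?ALLOWED) of
            true  -> {ok, {Mod, Fun}};
            false -> {error, not_allowed}
        end
    catch
        error:badarg -> {error, not_allowed}
    end.
```

A consuming REST handler could then map `{error, not_allowed}` to its HTTP error and JSON payload.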

@yurrriq

yurrriq commented Aug 14, 2015

Try Erlang uses this whitelist.

Edit: Pasting the misplaced (and misguided) question here.

How tricky would it be to redefine the LFE code that parses funcalls to check a white/blacklist? i.e.

- User types code and code is sent over the wire to interpreter as string
- Interpreter parses the string
- Eval parsed string, checking each funcall against the list
- If allowed, pass to actual funcall handler
- Otherwise, return 🚫
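The walk-the-parsed-forms step could look something like this. It assumes (as LFE's reader does) that a remote call like `(os:cmd ...)` is read as the single atom `os:cmd` at the head of a list; the module and function names here are illustrative, not try-lfe's actual API:

```erlang
%% Hypothetical sketch: walk already-parsed LFE forms and reject any
%% mod:fun call that is not on the allowlist.
-module(tryfe_walk).
-export([check_forms/2]).

check_forms(Forms, Allowed) ->
    walk(Forms, Allowed).

walk([H | T], Allowed) ->
    case walk(H, Allowed) of
        ok    -> walk(T, Allowed);
        Error -> Error
    end;
walk(Atom, Allowed) when is_atom(Atom) ->
    %% The LFE reader renders a remote call target as the atom 'mod:fun'.
    case string:tokens(atom_to_list(Atom), ":") of
        [M, F] ->
            Call = {list_to_atom(M), list_to_atom(F)},
            case lists:member(Call, Allowed) of
                true  -> ok;
                false -> {error, {not_allowed, Atom}}
            end;
        _ -> ok
    end;
walk(_Other, _Allowed) ->
    ok.
```

On `ok` the forms would be handed to the real eval; on error, the 🚫 goes back over the wire.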

@oubiwann
Member Author

This is in answer to @yurrriq's question in issue #9:

I think that [doing checking of functions at run time, parsing expressions from users] might end up being a pain ... more painful to do in code than some basic systems tweaking.

Here's my current thinking:

  • get the app built, with YAWS running as an unprivileged user
  • run those in Docker
  • run the Docker behind YAWS

If someone executes malicious code, it's done in Docker and the only damage they can do is what is permitted by the UNIX user running YAWS inside Docker. We might even be able to run the LFE shell manager as a different unprivileged user than the YAWS app. This would provide even more isolation. Worst case scenario would be that the Docker instance needs to be recreated (something which might be a good idea to do several times a day, anyway).

Lastly, we can create dummy LFE modules in the Dockerized appmod's ebin, such as:

  • file
  • os
  • rpc
  • inet
  • inet_res
  • code
  • possibly others ...

Those would be empty save for a no-op function which would override any of the system modules having the same name, simply returning undef. We should be able to have this only affect the LFE shell processes which the appmod starts up.
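A minimal sketch of one such dummy module (the exported functions here are just examples; a real stub would need to cover every exported function of the module it shadows). One caveat worth noting: kernel's ebin is a "sticky" directory in the code server, so actually loading a replacement `os` requires `code:unstick_dir/1` (or starting the VM with `-nostick`):

```erlang
%% Hypothetical stub: an `os` module that shadows the real stdlib one
%% when our appmod's ebin dir has precedence in the code path.
%% Every function simply returns undef.
-module(os).
-export([cmd/1, getenv/1, putenv/2, type/0]).

cmd(_)       -> undef.
getenv(_)    -> undef.
putenv(_, _) -> undef.
type()       -> undef.
```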

I think this would take care of pretty much most of the problems a public-facing REPL might encounter ...

@yurrriq

yurrriq commented Aug 20, 2015

Lastly, we can create dummy LFE modules in the Dockerized appmod's ebin, such as...

Sold. I love how the "let it fail" philosophy is reaching even deeper. 😄

@yurrriq

yurrriq commented Aug 20, 2015

Are you familiar with sharing resources, e.g. the modified ebin, between Docker containers? It's always been a bit tricky for me ...

Maybe I'm misunderstanding further, though. Are you talking about one dedicated shell with many concurrent sessions? Or spinning up a separate jailed LFE shell Docker container per user? ... or something else?

@oubiwann
Member Author

So, ticket #8 is for implementing a shell manager ... that is essentially going to be a supervisor process that dynamically creates children. Each child process will be an LFE shell. Each child will be created every time someone "logs in" to try.lfe.io. Logging in will essentially be filling in a user name field. The feedback to the user for this action will be a prompt like alice|lfe>, but the real use it will be put to is to generate an md5 (along with the referring host). This md5 will be used as a name to register the child process, so that if a user logs in from the same machine 100 times in a row, they'll still just have one LFE shell process running for them. There will be a timeout built in, so that inactivity will result in the shell manager killing the process.
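The shell manager described above could be sketched as a `simple_one_for_one` supervisor whose children are registered under a name derived from `md5(User ++ Host)`. All module and function names here (`tryfe_shell_mgr`, `tryfe_shell`) are assumptions, not the actual try-lfe code:

```erlang
%% Hypothetical sketch of the shell manager: one dynamically-supervised
%% LFE shell per md5(user ++ referring host), so 100 logins from the
%% same place still yield one shell process. Inactivity timeouts would
%% live in the (assumed) tryfe_shell worker itself.
-module(tryfe_shell_mgr).
-behaviour(supervisor).
-export([start_link/0, get_shell/2, shell_name/2, init/1]).

start_link() ->
    supervisor:start_link({local, ?MODULE}, ?MODULE, []).

init([]) ->
    {ok, {{simple_one_for_one, 5, 60},
          [{shell, {tryfe_shell, start_link, []},
            temporary, 2000, worker, [tryfe_shell]}]}}.

%% Return the existing shell for this user/host pair, or start one.
get_shell(User, Host) ->
    Name = shell_name(User, Host),
    case whereis(Name) of
        undefined -> supervisor:start_child(?MODULE, [Name]);
        Pid       -> {ok, Pid}
    end.

%% Derive a stable registered name from the md5 of user + host.
shell_name(User, Host) ->
    Hash = erlang:md5(iolist_to_binary([User, Host])),
    Hex = lists:flatten([io_lib:format("~2.16.0b", [B]) || <<B>> <= Hash]),
    list_to_atom("lfe_shell_" ++ Hex).
```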

This will be running inside the Docker instance as part of the appmod. Ideally, I'd like to have this process run separately from YAWS, with its own heartbeat, etc., thus allowing us to separate the HTTP OS process user and the LFE shell manager OS process. But that might be overkill for this project.

The end result will be just one Docker instance (for now) running behind the publicly accessible YAWS running on port 80. The appmod, though, could potentially have 100s of LFE shell processes running. If we get DDOSed, this would obviously cause problems. If people just leave us alone, this will be fine. We should probably wait on any more extensive engineering to see if there's a need first ... (this being a fair amount already)

@yurrriq

yurrriq commented Aug 20, 2015

Thanks! I think my mental model matches yours now.

@rvirding
Member

Ok, I am going to be very negative here.

First off, as has been pointed out, you should use a whitelist, not a blacklist. Second, this doesn't really help you anyway. To be really safe you would have to disallow all access to any library function which calls funs at any level, or allows you to do a metacall at any level. Any form of metacall. The problem is that once you hit compiled system code you have lost control. You would also have to disallow lambdas and apply, etc. You also have the problem of tuple module calls. ...

Recently someone referenced an old mail which showed how easy it was to get past the restrictions.

@oubiwann
Member Author

@rvirding Agreed.

However, if the user's LFE shell has an os module that is empty, no matter how much trickery they do with macros and anonymous functions, if the code has been replaced, they won't be able to get to it.

Now, the uncertainty I have here is this: if we have overridden the os and code modules, does the user have any mechanism whereby the ebin dir we set up as the primary one (the one with the most precedence) can be changed?

If so, then we should just abandon all white/black listing.

If not, then we should be good with overriding stdlib modules with our own empty ones. That will be minimal effort for a minimal amount of protection (in this case the protection is of the real users who would suffer if the Docker container were messed with by hackers).

Note that this would be the ebin directory for the YAWS appmod, not YAWS itself. For each YAWS web app, you can specify an ebin directory to use. I'll need to experiment to see where this dir comes in the order of precedence (I'm guessing first).

Regardless, this will all be happening in a Docker container, so even if hacked, the host system should be okay. Users would just have to wait until the next container restart.

@oubiwann oubiwann added this to the 0.3.0 milestone Feb 19, 2021
@oubiwann oubiwann changed the title Create blacklist Disallow running unsafe functions / allow only running safe functions Feb 19, 2021
@oubiwann
Member Author

My current thinking on this is that we essentially provide a pre-eval stage:

  • any non-safe, non-approved mod:fun present anywhere in the POSTed payload will be rejected outright, returning an HTTP error and an appropriate REST payload

Then:

  • anything that's allowed will make it past this point, and can be read with LFE's lfe_io:read_line and then eval'ed
  • referrer IP might be the easiest way to isolate "users" (everyone behind the same firewall at an office would therefore share the same session); any checks on safety for allowed m:f might be done here, with bad actors having their "REPL" process killed and their session ended and destroyed ... and maybe an exponential cooldown added for that IP
  • anything malicious that does make it through would be acting in a Docker container managed by, e.g., AWS, so I'm actually not concerned, since nothing of any value or permanence would be on the system ... not sure if there are checks we can do that would allow us to detect and kill ...
  • for general cleanup, we can time out good-actor sessions every N seconds
  • we can also have a global session timeout that just restarts the entire supervision tree every N+M seconds, or somesuch
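The pre-eval stage above could be sketched as a screen over the raw POSTed payload, run before anything is read or eval'ed. The module name, the regex, and the `{error, 403, ...}` shape are assumptions about the REST layer, not actual try-lfe code:

```erlang
%% Hypothetical sketch of the pre-eval screen: scan the raw payload for
%% mod:fun tokens and reject the whole request if any is off the
%% allowlist.
-module(tryfe_preeval).
-export([screen/2]).

screen(Payload, Allowed) ->
    %% Crude pattern for mod:fun occurrences in the source text.
    {ok, Re} = re:compile("([a-z][0-9a-zA-Z_]*):([a-z][0-9a-zA-Z_]*)"),
    case re:run(Payload, Re, [global, {capture, all_but_first, list}]) of
        nomatch ->
            {ok, Payload};
        {match, Pairs} ->
            Bad = [{M, F} || [M, F] <- Pairs,
                             not lists:member({list_to_atom(M),
                                               list_to_atom(F)}, Allowed)],
            case Bad of
                [] -> {ok, Payload};
                _  -> {error, 403, #{error => <<"disallowed call">>,
                                     calls => Bad}}
            end
    end.
```

A `{ok, Payload}` result would proceed to the read/eval stage; the error tuple would be turned into the HTTP error and REST payload mentioned above.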
