Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ES5 strict mode #120

Open
liz425 opened this issue Nov 28, 2016 · 22 comments
Open

ES5 strict mode #120

liz425 opened this issue Nov 28, 2016 · 22 comments

Comments

@liz425
Copy link

liz425 commented Nov 28, 2016

I happened to meet in a situation that I could only run instrumented JS file in ES5 strict mode or ES6.
Since ES6 is currently unsupported, I use the former configuration.
However, assigning arguments is not allowed in strict mode but quite common in the instrumented JS file.
Is it possible to support ES5 strict mode? If yes, what changes are needed against Jalangi2 source code?

@ksen007
Copy link
Contributor

ksen007 commented Dec 1, 2016

We cannot support ES5 strict mode because Jalangi 2 uses arguments.callee which cannot be accessed in the strict mode.

@msridhar
Copy link
Contributor

msridhar commented Dec 1, 2016

@ksen007 would a solution be to strip any use of "use strict" from an instrumented file?

@gogo9th
Copy link

gogo9th commented Nov 22, 2020

Stripping "use strict" is still problematic in my case, because I am using Jalangi to instrument code in a webpage's <script type='module'> tag, which is forced to be a strict mode. If I change the tag's type to "text/javascript", the page suffers error because another tag cannot use exported functions from the non-module-typed <script> tag. .

@msridhar
Copy link
Contributor

@gogo9th yes that is a tricky scenario. The only thing I can think of would be to make Jalangi's use of arguments.callee optional. It's not too obvious to me how to do this. You could try to pass the function explicitly as a separate argument in instrumented code, but that could break code that uses the arguments array. So maybe we'd have to maintain a separate stack inside Jalangi? I'm open to other ideas.

@gogo9th
Copy link

gogo9th commented Nov 23, 2020

Thanks for your ideas. I think making arguments.callee usable is not sufficient, because even if we forcibly modify <script type='module'> to <script type='text/javascript'> to make arguments.callee works, there is another problem: the functions exported from the original <script type='text/javascript'> (which was previously module-typed) cannot be accessed from another <script type='text/javascript'> (which was previously module-typed). If a script's type is module, it has complex import & export rules: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Modules, and exported files cannot be accessed the same way as NodeJS, because a webpage's Javascript is sandboxed by the browser and thus cannot directly access its machine's file system. So I think Jalangi needs a special way of handling import/export if this.window is defined (i.e., if it is a browser).

@gogo9th
Copy link

gogo9th commented Nov 23, 2020

I think J$ should have a hierarchical scope management object based on import & export declarations for each sid.

@msridhar
Copy link
Contributor

Hi @gogo9th, to clarify, I was proposing to (optionally?) remove usage of arguments.callee so that strict mode could be enabled. The main use of arguments.callee is to provide the function value as an argument to the functionEnter callback, which in my experience is rather important. As a partial alternative, Jalangi could maintain its own call stack, and pass in the appropriate function value to functionEnter based on that. But, that still won't work for callbacks from native functions. A simpler solution would be to just say that the function value argument in functionEnter will be null if the function is in a strict mode script, and leave it to the analysis to deal with it.

I think J$ should have a hierarchical scope management object based on import & export declarations for each sid.

What problem is this solving? If it's not towards fixing handling of strict mode, I'd prefer to leave this to the analysis.

@esbena
Copy link
Contributor

esbena commented Nov 23, 2020

Before you go too deep down this rabbit hole.. I know that @algobardo and @christofferqa also fought with Jalangi and modern EcmaScript features several years ago - perhaps they can drop a quick line about their endeavours?

@gogo9th
Copy link

gogo9th commented Dec 19, 2020

As a partial alternative, Jalangi could maintain its own call stack, and pass in the appropriate function value to functionEnter based on that. But, that still won't work for callbacks from native functions.

I wonder what's an example of this. If we meet a native function, we will not track inside it anyway, so there will be no additional problem we need to worry about, isn't it...?

@msridhar
Copy link
Contributor

I wonder what's an example of this. If we meet a native function, we will not track inside it anyway, so there will be no additional problem we need to worry about, isn't it...?

The problem is a native function calling back into a source-code function, e.g., anytime Function.prototype.call is used to invoke a source-code function.

@msridhar
Copy link
Contributor

I came across a possible solution to this problem. In JS you can access the value of a function via its name. What if in Jalangi instrumentation, we introduced a generated name for any anonymous function, and then just always used the function name instead of arguments.callee? @gogo9th do you see any problems with this idea?

@gogo9th
Copy link

gogo9th commented Apr 5, 2021

I am very sorry for my late reply, I've found several more bugs/problems in Jalangi2 and have been fixing them. It took me about 100 hours, and I think I fixed all of them. I will organize them and tell you in the other thread we have been discussing.

For your suggestion above, are you suggesting that whenever an anonymous function appears in the source code, we instrument the code such that we store the function's value in a generated name variable (in the global scope), and we replace arguments.callee in each anonymous function with this generated name variable? If so, I think this will work.

@gogo9th
Copy link

gogo9th commented Apr 5, 2021

One similar issue I was fixing was about with() statements.

with() statement error: esnstrument.js sets 'this = base' for M(method) and F(function) calls and passes it to callFun()'s Function.prototype.apply.call(targetFn, this, arguments). But the base (this) could be some object in the with() bracket. If a function/method is called within a with() block and if the function/method internally uses 'this', then its result will be computed based on a wrong 'this' variable value. To solve this, we create a stack of with() bracket's objects, push into the stack whenever we enter a new with() block, pop them when we exit, and each callFun() should check the stacked values. There is a single stack for each scope (either global or function scope).

Also, read & write callback should check if the variable belongs to some object in the with() bracket.

Is using stacks for with() the correct was of solving the problem? Would there be any issue or corner cases?

@gogo9th
Copy link

gogo9th commented Apr 5, 2021

I wonder what's an example of this. If we meet a native function, we will not track inside it anyway, so there will be no additional problem we need to worry about, isn't it...?

The problem is a native function calling back into a source-code function, e.g., anytime Function.prototype.call is used to invoke a source-code function.

I'm sorry, but I was not clear why this would be a problem.. You're right that we cannot install a callback on a native function itself. Yet, we can simply skip the native function and install the callback in the source-code function. For example, if we have code like

function SomeParentFunction () {
Function.prototype.call(this, MyFunction, args)```,
}

, then we can instrument the contents of SomeParentFunction to push into our new global calleeStack the value of the 2nd argument of Function.prototype.call (i.e., MyFunction), and instrument the contents of MyFunction such that when when it begins it first regards the lastly pushed value in the calleeStack as the value for arguments.callee. Would there be an issue in this approach?

Meanwhile, I have a different concern about this calleeStack approach in general (even current Jalangi's returnStack). When we have an asynchronous function call by Promise, then Jalangi wouldn't have a chance to push into its calleeStack or returnStack the value of the callee or caller function.

@msridhar
Copy link
Contributor

msridhar commented Apr 5, 2021

@gogo9th thanks so much for all these great comments. Just an FYI that I am busy today and possibly tomorrow but I will give feedback here as soon as I can

@msridhar
Copy link
Contributor

msridhar commented Apr 6, 2021

@gogo9th:

I am very sorry for my late reply, I've found several more bugs/problems in Jalangi2 and have been fixing them. It took me about 100 hours, and I think I fixed all of them. I will organize them and tell you in the other thread we have been discussing.

Sorry for all the bugs. Upstreaming your fixes would be great and very welcome!

For your suggestion above, are you suggesting that whenever an anonymous function appears in the source code, we instrument the code such that we store the function's value in a generated name variable (in the global scope), and we replace arguments.callee in each anonymous function with this generated name variable? If so, I think this will work.

I was actually suggesting adding a name for the function itself, rather than generating a variable. Do you think adding a name to anonymous functions is risky?

One similar issue I was fixing was about with() statements.

with() statement error: esnstrument.js sets 'this = base' for M(method) and F(function) calls and passes it to callFun()'s Function.prototype.apply.call(targetFn, this, arguments). But the base (this) could be some object in the with() bracket. If a function/method is called within a with() block and if the function/method internally uses 'this', then its result will be computed based on a wrong 'this' variable value. To solve this, we create a stack of with() bracket's objects, push into the stack whenever we enter a new with() block, pop them when we exit, and each callFun() should check the stacked values. There is a single stack for each scope (either global or function scope).

Also, read & write callback should check if the variable belongs to some object in the with() bracket.

Is using stacks for with() the correct was of solving the problem? Would there be any issue or corner cases?

At a high level, this sounds reasonable to me, but I don't really know about all the corner cases that could crop up here. I am not really sure if the decision to skip support of with was due to fundamental issues, or because it didn't seem worth the effort. One question:

There is a single stack for each scope (either global or function scope).

Why is this needed?

, then we can instrument the contents of SomeParentFunction to push into our new global calleeStack the value of the 2nd argument of Function.prototype.call (i.e., MyFunction), and instrument the contents of MyFunction such that when when it begins it first regards the lastly pushed value in the calleeStack as the value for arguments.callee. Would there be an issue in this approach?

I am not sure. But isn't the approach of introducing names for anonymous functions a better solution?

Meanwhile, I have a different concern about this calleeStack approach in general (even current Jalangi's returnStack). When we have an asynchronous function call by Promise, then Jalangi wouldn't have a chance to push into its calleeStack or returnStack the value of the callee or caller function.

Do you have an example here?

@gogo9th
Copy link

gogo9th commented Apr 7, 2021

There is a single stack for each scope (either global or function scope).

Why is this needed?

We need this, because multiple with() statements can be nested. In each with() block, we need to identify the with() objects belonging to their own with() scope.

I am not sure. But isn't the approach of introducing names for anonymous functions a better solution?
Okay, I will implement this.

@gogo9th
Copy link

gogo9th commented Apr 7, 2021

Jalangi has another issue with the strict mode. When we have a sample function:

function hello() {}

Jalangi instruments this as follows:

function hello() {
    jalangiLabel0:
        while (true) {
            try {
                JALANGI_$.Fe(9, arguments.callee, this, arguments);
                arguments = JALANGI_$.N(17, 'arguments', arguments, 4);
            } catch (JALANGI_$e) {
                JALANGI_$.Ex(49, JALANGI_$e);
            } finally { 
                if (JALANGI_$.Fr(57))
                    continue jalangiLabel0;
                else
                    return JALANGI_$.Ra();
            } 
        }
}   

Here, the line arguments = JALANGI_$.N(17, 'arguments', arguments, 4); is problematic in a strict mode, because we are not supposed to overwrite the arguments variable in the strict mode:

SyntaxError: Unexpected eval or arguments in strict mode.

I am actually not sure if we really need to above line in the instrumented code. Can we just remove this line by commenting out the esnstrument.js file's following section:

  function syncDefuns(node, scope, isScript) {
    var ret = [], ident;
//    if (!isScript) {
//      if (!Config.INSTR_TRY_CATCH_ARGUMENTS || Config.INSTR_TRY_CATCH_ARGUMENTS(node)) {
//        if (!Config.INSTR_INIT || Config.INSTR_INIT(node)) {
//          ident = createIdentifierAst("arguments");
//          ret = origin_array_concat.call(ret, createCallInitAsStatement(node,
//            createLiteralAst("arguments"),
//            ident,
//            true,
//            ident, false, true));
//        }
//      }
//    }

If a callback function has to change the contents of arguments, it can do it from withing the function-enter callback:

JALANGI_$.Fe(9, arguments.callee, this, arguments);

@msridhar
Copy link
Contributor

msridhar commented Apr 7, 2021

Tentatively I agree we can remove that callback that overwrites arguments. We should update the tests to pass if needed, and also document the change for the next release (and maybe do a bigger version bump since it's a change to the Jalangi API).

@gogo9th
Copy link

gogo9th commented Apr 10, 2021

Thanks for the feedback. I am lastly working on making <script type='module'> and eval() work. For eval(), some modules (e.g., fs, esotope) use native functions, but if the target program redefines them, Jalangi can fall into an infinite loop. Therefore, Jalangi should make sure that all modules to be used while processing eval() should use unredefined original native functions. This can be done by parsing and transforming the source code of modules to use unmodified original native functions by using Babel.

I have one question. Why doe Jalangi wrap each script or function code with a while loop? What could go wrong if they are just wrapped by a try-catch block without a while loop?

@msridhar
Copy link
Contributor

I have one question. Why doe Jalangi wrap each script or function code with a while loop? What could go wrong if they are just wrapped by a try-catch block without a while loop?

This is to enable backtracking functionality. See slides 40--43 here:

https://manu.sridharan.net/files/JalangiTutorial.pdf

I think it'd be reasonable for you to disable that while loop instrumentation on a branch, if your analyses don't use it. Maybe we should add a flag to optionally disable it.

Pjsrcool added a commit to Pjsrcool/jalangi2 that referenced this issue Aug 15, 2022
@msridhar
Copy link
Contributor

For anyone still tracking this issue, I attempted to implement partial support for use strict here:

#175

It will fail at runtime for some unusual cases involving getters / setters. And it's a real hack right now and needs cleanup. But if anyone has feedback let me know.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants