CS2 Discussion: Question: New Compiler Starting Point #25
Comments
The problems with the current CoffeeScript compiler are:
The problems with Redux are:
The nice things with the current CoffeeScript compiler are:
The nice things with Redux are:
Note: I’ve probably forgotten lots of things here. Here’s what I think about time and effort: It will be much faster to update the original CoffeeScript compiler to accept more ES2015+ syntax and output more ES2015+ syntax, than to get Redux up-to-date – initially. Let’s imagine that two teams started working at the same time on the ES2015+ goals of the CoffeeScript language, one on the original compiler and one on Redux. At first, the team working on the original compiler would make much more progress. But then, after some unknown amount of time, I think that the Redux team would be able to implement features and bug fixes faster – and having more fun while doing so. It’s a bit like those school book math questions where you have to choose between two phone payment plans: One with a small monthly fee but a large fee per phone call, and one with a large monthly fee but a small fee per call. If that makes any sense :) What would I choose, then? I actually have no idea. One second I lean towards the original compiler, the next I want to go with Redux. Then I’m like “whatever, I’ll focus on Elm and PureScript instead,” followed by wanting CoffeeScript again. Either way, I can’t see myself doing major contributions to CoffeeScript in the foreseeable future, but I’ll probably stick around trying to help. |
Same. What do we think would be the healthiest for the broader community? Which option would the current CoffeeScript contributors and maintainers like to see? |
Or maybe the question we should be asking is, do we have the resources to finish Redux? How many developers are really interested in actually committing hours of work to this project? How many hours can they promise? Redux seems like the better long-term solution, but if we don’t have enough developer-hours to even get Redux up to parity with CoffeeScript 1.10, then we will burn through whatever hours we have available to us advancing Redux a little bit but still leaving it in the unfinished state it’s in now. It’s risky to start with Redux unless we know with some certainty that we won’t run out of resources before Redux becomes a viable option. |
That's a very good point @GeoffreyBooth |
Thanks @GeoffreyBooth for giving this overview, and @lydell for sharing his experience. I'm not sure whether the current tooling provides a solid foundation for the future of CS6. |
(Caveat: I was once a fan of CoffeeScript but have since decided it's better to go with JavaScript + Babel. I created decaffeinate to migrate a large codebase away from CoffeeScript, so keep that in mind when considering my suggestions.) My experience with both the official CS parser/generator and Redux left me with similar impressions to what others have said on this thread. The AST generated by Redux is much nicer and the project is structured in a more sane way. It does not have the compatibility needed to support major CoffeeScript codebases, however. Of the roughly 3000 files in the codebase I work in, 20% or so did not parse correctly using CSR. I originally built decaffeinate on CSR, but there were tons of issues related to compatibility so I switched to the official parser. However, I did so by creating decaffeinate-parser which maps the AST from CS to one that much more closely resembles CSR's AST. That way I didn't have to rewrite very much code in decaffeinate. Another issue that both parsers have is bad location information. Since decaffeinate relies on it quite heavily (it edits the original source code based on node and token locations), I found tons of bugs in CS's location information for AST nodes and tokens. I worked around a lot of these by creating my own lexer, coffee-lex, which is much more faithful to what is in the original source code than CS's lexer and has accurate location information. Even with that, I eventually had to fork CS to start addressing these bugs rather than trying to work around them. That process is ongoing. Ironically, some of these tools that I created to help hasten the migration away from CoffeeScript may also be useful in a project such as you all are discussing. I have no problem with them being used in that way if you find them useful, just keep in mind that my goals probably differ from yours. If I were in a position of wanting to continue to develop CoffeeScript into the future, I would probably do this:
Best of luck! 😉 |
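To illustrate why location information matters so much for a tool that edits the original source in place, here is a minimal sketch. It is not decaffeinate's code: `applyPatches` and the patch objects are hypothetical names, and the offsets are made-up data standing in for what a parser might report for the `@name` node.

```javascript
// Hypothetical helper (not decaffeinate's API): apply text edits to the
// original source using [start, end) character offsets.
function applyPatches(source, patches) {
  // Work from the end of the file backwards so earlier offsets stay valid.
  const ordered = [...patches].sort((a, b) => b.start - a.start);
  let result = source;
  for (const { start, end, text } of ordered) {
    result = result.slice(0, start) + text + result.slice(end);
  }
  return result;
}

const source = 'alert @name';

// Assumed offsets for the `@name` node, as an accurate parser would report them.
const accurate = [{ start: 6, end: 11, text: 'this.name' }];
// The same edit when the reported start position is off by one.
const offByOne = [{ start: 7, end: 11, text: 'this.name' }];

console.log(applyPatches(source, accurate)); // alert this.name
console.log(applyPatches(source, offByOne)); // alert @this.name
```

A single off-by-one in a node's start position is enough to produce invalid output, which is consistent with the need described above for a lexer with accurate locations.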
I think it is important to remember some of the goals of Redux. It attempts to correct a lot of unintended, undefined, and ambiguous behavior of the original compiler. The implementation demonstrates a view of what should happen but not necessarily the best or most accepted view. It also explicitly breaks CS programs that function under the standard compiler. While I have a bias towards Redux from an implementation and overall correctness standpoint I think there would have to be additions made that support the behavior of the standard compiler (maybe a strict mode that operates on Redux semantics). Given that caveat I think it would be easier to work from the Redux codebase. It's much less tangled and has some nice AST features and optimization hooks that make for easy tooling entry points, not to mention being a bit cleaner as others have mentioned. The option of simply starting over and taking both implementations as advice should also be considered. I don't think either of them truly represent the right solution for everyone IMHO. |
I'd second this, and in particular you should investigate not using a parser generator. Writing a parser by hand is not that terrible. |
From a user point of view the best option is a series of non-breaking changes to harden the lexer/rewriter/parser followed by graceful (and also tool-supported?) but breaking changes. I sincerely doubt just going with Redux is a good idea. Not because of its quality or how well it was maintained, simply because it does break too much too soon. I think refactoring the lexer and explicitly allowing ambiguities while outlining them in code for future reference is a better way to go. The ultimate goal should be to get to a state of clean code like Redux in a graceful way. However, I do not see a complete rewrite as a viable option. The risk of not being adopted by a properly sized portion of the community is too high, seeing how Redux basically (politically) ran into a dead end. |
@eventualbuddha thank you for your detailed comments. Any reason why you don’t submit your location bugfixes back to the main repo? Presumably the main repo would benefit from those fixes. I’ve opened a new issue to gauge how much support we have for this effort. It sounds like starting from Redux is more work but with a bigger reward, but not something we should attempt unless we know we have enough people with enough time to achieve it. Hopefully that issue answers that question. I also think most users don’t care all that much about CoffeeScript’s output. They want to use modules and classes and |
@GeoffreyBooth I wrote both of the CoffeeScript bug fixes for decaffeinate (full changelog is here: https://github.com/decaffeinate/coffeescript/commits/decaffeinate-fork-1.10.0 ). The first fix is already an open PR ( jashkenas/coffeescript#4291 ) and the second one I have been meaning to submit, and just got around to it ( jashkenas/coffeescript#4296 ). Certainly the intention is to be friendly and send fixes upstream. :-) We work off the 1.10.0 tag instead of master (since the AST format is a little different), although in both cases it was a clean cherry pick. It's also a little weird because in both cases, I think the bug didn't cause any correctness issues in CoffeeScript itself, just in the intermediate AST. But probably cleaning that up is still useful for future maintainability. |
@eventualbuddha quite a statement. Can you link to an example? |
@eventualbuddha also is this codebase you tested with Redux publicly available? I’d love to have a good serious app to use to test with. |
Here's an idea: I might be missing something, but once import/export is done in the original compiler, there won't really be many changes to CoffeeScript's syntax needed for a while, will there? I'm thinking we'll mostly change what the language compiles to – for example, compiling classes into something that plays better with ES2015 classes. If so, it might make sense to go with the original jashkenas/coffeescript code. We could start by improving the output of the parser, so that we get a real/better CS AST. We could start replacing the current compiler (nodes.coffee) with something better – perhaps we could pull in parts of the Redux compiler as a starting point. It would be cool if a new compiler could only compile classes to start with, and fall back to the old compiler for all other nodes. Then we could leave replacing the lexer+parser as a longer term goal. As long as the new parser outputs the same AST, the same compiler can be used. This way we could replace parts of the current compiler incrementally. |
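A rough sketch of the "new compiler for classes, old compiler for everything else" idea might look like the following. Everything here is an assumption for illustration: `legacyCompile` is a stub standing in for the existing compiler, and the node shapes (`type`, `name`, `parent`, `source`) are invented rather than taken from the real CoffeeScript AST.

```javascript
// Stub standing in for the existing compiler (hypothetical; the real one
// would emit JavaScript for the node instead of tagging its source).
function legacyCompile(node) {
  return `/* legacy */ ${node.source}`;
}

// New-style compilers keyed by node type. Only classes are handled at first;
// more entries can be added one node type at a time.
const newCompilers = new Map([
  ['Class', (node) => {
    const heritage = node.parent ? ` extends ${node.parent}` : '';
    return `class ${node.name}${heritage} {}`;
  }],
]);

// Dispatch: use the new compiler when one exists, otherwise fall back.
function compileNode(node) {
  const compile = newCompilers.get(node.type) || legacyCompile;
  return compile(node);
}

console.log(compileNode({ type: 'Class', name: 'Snake', parent: 'Animal' }));
// class Snake extends Animal {}
console.log(compileNode({ type: 'Assign', source: 'x = 1' }));
// /* legacy */ x = 1
```

The appeal of this shape is that each node type can migrate independently, and removing the fallback is only possible once the map covers every node type.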
Yep! TDOP makes it much easier than you would think. Pratt is a genius, but nobody listened to him, until Crockford began popularising his algorithm. It really isn't that difficult with TDOP. I've never seen any other approach that wasn't really intimidating, but Pratt parsing only takes an evening to figure out, and then you can write parsers by hand whenever you like. It's an excellent investment. |
I linked to the Python example, because I found it the easiest to follow [by far]. For completeness, here's a copy of the original paper we remastered on GitHub, and Crockford's article. |
@lydell do you have the time to create a new branch that shows a starting point of how replacing @carlsmith or others: if you want to create a new compiler from scratch, do you mind creating a repo that demonstrates the beginnings of how such a new compiler would work? At least implement significant whitespace and variable assignment, for starters. If you can get that working in a matter of days, and it seems like implementing all the rest of CoffeeScript’s current syntax would be within reach within weeks, then the new repo could be a contender as our new-compiler starting point. And if anyone has the time to play with Redux a little bit and get past where they left off with it—struggling to upgrade its dependencies—that would be very useful too. It still feels to me like the most promising starting point, but not if it’s stuck on an unsurmountable obstacle. If Redux can get updated, and maybe we run CoffeeScript’s current tests against it to know what feature support it lacks, we can see how suitable it would be as a starting point. If none of these experiments bear fruit, the backup plan is to keep the original compiler and gradually work to improve it. We know that that’s an option, though not as satisfying as replacing its hairiest parts with cleaner and more-extensible code. |
I'm busy doing a website, but that'll be up soon, and have other projects I'm committed to, but will try and find time for doing a parser. Can probably reuse code from the one I have, strip it down to its core, and tidy it up, so we'd have a simple parser that people can understand, that we can all chip away at from there. Once you've got the recursive descent logic, with precedence and support for goofy (right to left) operators, everything else is lots of fairly local, well defined, little problems that just need working through one by one. The parsing algorithm is the tricky part. |
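For readers who have not seen top-down operator precedence (Pratt) parsing, here is a minimal sketch of the core loop for arithmetic expressions, including one right-to-left operator. It is not taken from any of the parsers mentioned in this thread; the binding powers, the `OPERATORS` table, and the nested-array AST are assumptions chosen for brevity, and the tokenizer silently skips characters it does not recognise.

```javascript
// Binding powers: higher binds tighter. `rightAssoc` marks right-to-left operators.
const OPERATORS = {
  '+': { power: 10 },
  '-': { power: 10 },
  '*': { power: 20 },
  '/': { power: 20 },
  '**': { power: 30, rightAssoc: true },
};

// Numbers, the two-character `**` operator, then single-character tokens.
function tokenize(text) {
  return text.match(/\d+(?:\.\d+)?|\*\*|[-+*\/()]/g) || [];
}

function parse(text) {
  const tokens = tokenize(text);
  let position = 0;
  const peek = () => tokens[position];
  const next = () => tokens[position++];

  // Prefix position: a number or a parenthesised expression.
  function parsePrefix() {
    const token = next();
    if (token === '(') {
      const inner = parseExpression(0);
      if (next() !== ')') throw new SyntaxError('Expected ")"');
      return inner;
    }
    if (/^\d/.test(String(token))) return Number(token);
    throw new SyntaxError(`Unexpected token: ${token}`);
  }

  // Core loop: keep absorbing operators that bind tighter than minPower.
  function parseExpression(minPower) {
    let left = parsePrefix();
    for (;;) {
      const symbol = peek();
      const operator = OPERATORS[symbol];
      if (!operator || operator.power <= minPower) break;
      next();
      // Right-associative operators let an equal-power operator nest on the right.
      const rightPower = operator.rightAssoc ? operator.power - 1 : operator.power;
      left = [symbol, left, parseExpression(rightPower)];
    }
    return left;
  }

  const ast = parseExpression(0);
  if (position < tokens.length) throw new SyntaxError(`Unexpected token: ${peek()}`);
  return ast;
}

console.log(JSON.stringify(parse('1 + 2 * 3')));   // ["+",1,["*",2,3]]
console.log(JSON.stringify(parse('1 - 2 - 3')));   // ["-",["-",1,2],3]
console.log(JSON.stringify(parse('2 ** 3 ** 2'))); // ["**",2,["**",3,2]]
```

The only difference between left-to-right and right-to-left operators is the binding power passed to the recursive call, which is why the rest of a language's syntax tends to reduce to small, local additions once this loop is in place.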
Just to be clear, I think what you are doing with the existing infrastructure is the best bet, but I can personally probably be more helpful working on the longer shot. |
I mentioned this in the chat, and I think it's worth bringing up here, but I wonder if we could run the original CS test suite against Redux and just see what fails? That should tell us where the Redux code needs to be updated. In addition perhaps if we are putting together a custom compiler we could use the test suites from both projects to validate progress. Just a thought. |
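As a sketch of what "run one test suite against another compiler and see what fails" could look like, the snippet below feeds the same cases to two compile functions and reports the ones only the candidate rejects. The two compilers here are stubs invented for illustration (the candidate pretends not to support `super`); a real harness would call the actual compilers and would also execute the compiled tests rather than only checking that compilation succeeds.

```javascript
// Collect the cases for which a compile function throws.
function findFailures(compile, cases) {
  const failures = [];
  for (const { name, source } of cases) {
    try {
      compile(source);
    } catch (error) {
      failures.push({ name, message: error.message });
    }
  }
  return failures;
}

// Stub compilers (assumptions for illustration only).
const referenceCompile = (source) => `// compiled\n${source}`;
const candidateCompile = (source) => {
  if (source.includes('super')) throw new Error('unsupported: super');
  return `// compiled\n${source}`;
};

const cases = [
  { name: 'assignment', source: 'x = 1' },
  { name: 'super call', source: 'super 5' },
];

// Cases the reference compiler accepts but the candidate does not.
const referenceFailures = new Set(findFailures(referenceCompile, cases).map((f) => f.name));
const gaps = findFailures(candidateCompile, cases).filter((f) => !referenceFailures.has(f.name));

console.log(gaps.map((f) => `${f.name}: ${f.message}`));
```

The resulting list would serve as a to-do list of feature gaps, and the same loop could be pointed at both projects' suites to track progress on a new compiler.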
I think that one of the most valuable outputs of the CoffeeScriptRedux project was the wiki pages which documented information for implementors: |
@michaelficarra thanks for commenting! Do you mind sharing with us your opinion for how much effort it would take for:
|
Has anyone looked into Decaf as an option? |
So . . . I think we should investigate Decaf 😄 It’s just one file (plus a CLI that calls it): the 1506 sloc

    export function compile(source, opts, parse = coffeeParse) {
      const doubleSemicolon = /\;+/g;
      opts = opts || {tabWidth: 2, quote: 'double'};
      const _compile = compose(
        // hack because of double semicolon
        removeDoubleEscapes,
        compiledSource => Object.assign({}, compiledSource, {code: compiledSource.code.replace(doubleSemicolon, ';')}),
        jsAst => recast.print(jsAst, opts),
        insertSuperCalls,
        insertBreakStatements,
        insertVariableDeclarations,
        csAst => transpile(csAst, {options: opts}),
        parse);
      return _compile(source).code;
    }

So yeah, those 1500 lines are pretty terse. The arguments in Anyway the gist of what it’s doing, as far as I can tell, is using CoffeeScript’s own parser to generate a syntax tree, which it then does some transformations on before converting it into a JavaScript syntax tree, which has lots of ESNext syntax. It’s basically one long function that winds its way around For things it can’t parse, it falls back to the CoffeeScript compiler’s output. This is Decaf’s great advantage over Decaffeinate: it can process all CoffeeScript files, always generating valid JavaScript, though some of the JavaScript might be the ES5 output by CoffeeScript rather than the ESNext output by the various libraries imported into Decaf. (Yeah, I neglected to mention: Decaf is another tool whose purpose is to convert CoffeeScript files into ESNext, just like Decaffeinate. Hence it’s written in ESNext, of course.) Decaf has a very impressive list of CoffeeScript features/node types that it can parse; the most notable omission is comments, which is ironic since that’s the one node type we wouldn’t want to output. One node type in particular it does support, however, is classes:

    class Animal
      constructor: (@name) ->
      move: (meters) ->
        alert @name + " moved #{meters}m."
    class Snake extends Animal
      move: ->
        alert "Slithering..."
        super 5

becomes

    class Animal {
      constructor(name) {
        this.name = name;
      }
      move(meters) {
        return alert(this.name + (" moved " + (meters) + "m."));
      }
    }
    class Snake extends Animal {
      move() {
        alert("Slithering...");
        return super.move(5);
      }
    }

Anyway it would be sloppy if CoffeeScript 2.0 was simply a Decaf wrapper over the current CoffeeScript compiler. We should untangle the logic of what |
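The `compile` function quoted above lists its stages with `parse` last, which suggests `compose` applies them right to left. Decaf's own `compose` helper is not shown in this thread, so the following is only a sketch under that assumption, with hypothetical stand-in stages (`parseStage`, `transpileStage`, `printStage`) in place of the real parser, transpiler, and printer.

```javascript
// A minimal right-to-left compose: compose(f, g, h)(x) === f(g(h(x))).
// Assumption: this mirrors the ordering implied by Decaf's argument list.
const compose = (...fns) => (input) =>
  fns.reduceRight((value, fn) => fn(value), input);

// Hypothetical stand-in stages, each recording that it ran.
const parseStage = (source) => ({ source, stages: ['parse'] });
const transpileStage = (ast) => ({ ...ast, stages: [...ast.stages, 'transpile'] });
const printStage = (ast) => ({ code: `/* printed */ ${ast.source}`, stages: [...ast.stages, 'print'] });

// Listed last, run first, just like `parse` in the quoted function.
const compilePipeline = compose(printStage, transpileStage, parseStage);

const result = compilePipeline('x = 1');
console.log(result.stages.join(' -> ')); // parse -> transpile -> print
console.log(result.code);                // /* printed */ x = 1
```

Reading the quoted argument list bottom to top therefore gives the order in which the source flows through the pipeline.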
@lydell @rattrayalex @JimPanic @carlsmith others: Looking at the list of features that we want to update to output ESNext, I wonder if a new compiler is the best approach? The list is surprisingly short, and many of the items don’t strike me as huge tasks (like compiling default parameters to default parameters, or Don’t get me wrong: I love the idea of a new compiler, especially one that passes an AST to Babel to generate JavaScript; but I wonder if it’s the best way to get to 2.0.0. We could release 2.0.0 first, then refactor the compiler afterward if people have the motivation. A better compiler will certainly make it easier to implement whatever ES2016 and ES2017 and so on throws our way going forward. We should make a decision soon so that @greghuc can implement template literals and hopefully someone volunteers to take on classes 😄 From what I’ve seen on this thread so far, it seems like we’re drifting by default toward extending the current compiler. |
I agree that the best path forward at this point is to stick with the I think other efforts are worth thinking about for possible long-term adoption but would be a distraction to invest in now. |
I really am not sure tbh. I looked a bit deeper into the current code base last week and I think in the long run, we'd benefit a lot from a different approach: lexing, parsing, and transforming the AST, possibly even in multiple phases. The structure right now is quite coupled. I'd still like to see a gradual transition towards a new approach, though, whatever that might be in the future. Mostly because (I suppose) nobody knows all the quirks and undefined implementation-defined behavior, but also to give users a chance to catch up on these very probable, small behavioral changes. Otherwise the upgrade path seems too steep tbh.
|
I second what @rattrayalex said. |
Re new compiler, I agree with the pragmatic approach mentioned by @rattrayalex. However, I think there is benefit in a more decoupled, better structured compiler if it's easier to understand and so easier to change. This might reduce the barrier to entry for people making commits to CoffeeScript. |
Closing as we’ve decided to build atop the 1.x compiler. |
Migrated to jashkenas/coffeescript#4923 |
Here are the viable options for which compiler to use as our starting point for outputting ESNext:
The current compiler obviously has the most compatibility of any CoffeeScript compiler, and soon it will support modules and hopefully also classes, the two biggest ES2015 features. But it’s a bear to work with, with a brain-bending codebase, and strings are brittle. Redux is easier to follow, but it doesn’t have full support for all current CoffeeScript, much less ES2015 (though that PR has added support for many ES2015 features). The Decaf/AST approach with fallback is perhaps the easiest, but then we have two code-generation pipelines in our codebase.
The choice boils down to a technical one: should we stick with the current compiler’s string-generation approach, despite its flaws? Or is Redux close enough to production-ready that it’s easier to fix whatever incompatibilities it may have, and move forward from there? Or is the piecemeal, “implement what we can with fallback for everything else” pattern offered by the Decaf approach the path of least resistance?