Release KDL 2.0.0 #434

zkat · 2024-12-15T03:52:37Z

This is it, folks! I think we're ready to finalize 2.0, after something like 3 years of work.

I'm very happy with the language we've defined here, and how nice it feels to use. I want to express my thanks to everyone who has contributed ideas and discussion and work to its specification, all the various implementers who have helped validate it, and all the users of 1.0 who helped give real, experience-based feedback on the previous version of the language, helping us figure out what could be improved.

I'm gonna give folks a few days to review things and make a decision, then I'm hoping to have consensus from at least a few of the major contributors and implementers before publishing this.

Thanks again, everyone. This wouldn't have been possible without you, and I think we've ended up with a real, shining gem.

NOTE: If you're "just" a community member, your opinion is still welcome, although contributors/implementers will likely be prioritized at this point.

zkat · 2024-12-15T10:08:11Z

heh. The draft.8 tests killed kdl-rs multiline parsing and I realized the grammar wasn't actually allowing escapes on the closing multiline quoted string line.

so there's tests for that too now.

zkat · 2024-12-15T23:41:14Z

@tabatkins @tjol I regret to inform you that the Really Complicated whitespace-only multiline string test was actually wrong (I'm pretty sure).

The prefixes didn't match, and prefixes need to match exactly before any normalization or empty line collapse, unless the line is completely empty.

That is, this should fail, I believe, since line 2 isn't completely empty, but its prefix doesn't match line 3:

"""\n
\t\s\n
\s\s"""

tjol · 2024-12-15T23:49:18Z

@zkat I thought we'd established that the rule that empty lines can contain “any” whitespace takes priority over the prefix requirement.

#429 (comment)

zkat · 2024-12-15T23:59:00Z

Ohhh. Uggghhhhhhhhhhhh ok I'll roll this back
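The two readings discussed above can be compared with a minimal sketch. It assumes that `\t`, `\s` and `\n` in the example are notation for literal tab, space and newline characters, and that only space and tab count as whitespace (the spec recognizes more). The function name `dedent_multiline` and the `strict_prefix` flag are hypothetical, not taken from any KDL implementation.

```python
def dedent_multiline(body_lines, closing_prefix, strict_prefix=False):
    """Dedent the body of a multi-line string.

    body_lines: the lines between the opening and closing quotes.
    closing_prefix: the whitespace in front of the closing quotes.
    strict_prefix: if True, only completely empty lines are exempt from
    the prefix check; if False, any whitespace-only line is exempt.
    """
    ws = " \t"  # simplifying assumption: only space and tab are whitespace
    if closing_prefix.strip(ws):
        raise ValueError("closing line may only hold whitespace before the quotes")
    out = []
    for line in body_lines:
        whitespace_only = line.strip(ws) == ""
        if line == "" or (whitespace_only and not strict_prefix):
            # Exempt line: it contributes an empty line to the result.
            out.append("")
        elif line.startswith(closing_prefix):
            out.append(line[len(closing_prefix):])
        else:
            raise ValueError(f"line {line!r} does not start with {closing_prefix!r}")
    return "\n".join(out)
```

With the whitespace-only rule taking priority, `dedent_multiline(["\t "], "  ")` returns an empty string; with `strict_prefix=True` the same input is rejected because tab-space does not start with two spaces.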

eilvelia · 2024-12-16T00:49:54Z

I think the interactions of multiline strings with escapes are pretty weird currently.

When processing a Multi-line String, implementations MUST dedent the string after resolving all whitespace escapes, but before resolving other backslash escapes.

\" should also be interpreted before dedenting (I assume \""" doesn't close the string) since otherwise it wouldn't be possible to find the correct enclosing """ and deduce the prefix to dedent; if you resolve any escapes, you want to interpret \\ as well. However, resolving them cannot just be done in two separate phases, or \\t will be transformed as \\t -> \t -> <tab>. I think instead it should be noted that, to find the whitespace prefix, at least \" and \\ should be analyzed, and then escapes are resolved from the raw form (i.e. during dedenting the escapes are lexically analyzed but not resolved/saved). It would be nice to add a test with \""" inside a multiline string (I can't find one).

while the following example is allowed
  """
  foo \
bar
  baz
  \   """

IMO having non-literal whitespace in the prefix-defining line also feels pretty weird. Dedented strings are mostly for readability in the source code, transforming the indentation goes against that. Swift's multiline strings are quite similar, and I think can be an inspiration for kdl's strings. Particularly, the final line (or any ws prefix) there can consist of literal whitespace only, the newline escape is not allowed as the last \n, lines must either be completely empty or include the prefix.

The "any whitespace in non-content lines" rule forms an exception to the prefix being identical (or at least missing) in all lines, which also isn't good, I think, and is more difficult to implement (checking the whole line for any whitespace/content instead of only comparing the first characters to the intended prefix). I initially missed this rule when I was updating the ocaml implementation.

zkat · 2024-12-16T01:13:37Z

@eilvelia \\t would transform to t. Regular escapes are processed before any dedenting happens, and that includes white space escapes. \" can never be an issue because it’s always interpreted as simply ". \ doesn’t do its “slurping” behavior unless it’s immediately followed by whitespace.

As far as the Swift rules you mention: it does seem like those would simplify the rules for multi line strings. They would be more limiting but these corner cases would be way less confusing to think about by just… banning them altogether (and simplifying parsing)

I’m curious what others think.

eilvelia · 2024-12-16T01:50:08Z

@eilvelia \\t would transform to t. Regular escapes are processed before any dedenting happens, and that includes white space escapes. \" can never be an issue because it’s always interpreted as simply ". \ doesn’t do its “slurping” behavior unless it’s immediately followed by whitespace

I think you misunderstood me. \\t would be an issue if one naively transforms \\ to \ and \t to tab in separate phases. You can't transform only the whitespace escape since then foo\\ bar would (unexpectedly) activate the whitespace escape. The issue with \" is that this should be allowed:

"""
\"""
"""

(That would be the most intuitive, also what Swift does and what the grammar and spec currently suggest, as I think.)
If you completely disregard all escapes other than the whitespace one before dedenting (as the quote suggests), the second """ would be parsed as the string end (and then fail with non-ws final line).
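The `\\t` pitfall can be shown with a small sketch that contrasts a single left-to-right pass with a naive two-phase replacement. The escape table is a deliberately small subset chosen for illustration, and both function names are hypothetical.

```python
# Illustrative subset of escapes; not the full table from the spec.
ESCAPES = {"n": "\n", "t": "\t", "s": " ", "\\": "\\", '"': '"'}


def resolve_single_pass(raw):
    """Resolve escapes left to right, consuming each escape exactly once."""
    out = []
    i = 0
    while i < len(raw):
        if raw[i] == "\\" and i + 1 < len(raw):
            nxt = raw[i + 1]
            if nxt not in ESCAPES:
                raise ValueError(f"unsupported escape \\{nxt} in this sketch")
            out.append(ESCAPES[nxt])
            i += 2  # skip the backslash and the escaped character together
        else:
            out.append(raw[i])
            i += 1
    return "".join(out)


def resolve_two_phase_naive(raw):
    """Buggy on purpose: phase 1 output is re-read as escapes by phase 2."""
    phase1 = raw.replace("\\\\", "\\")  # \\ -> \
    return phase1.replace("\\t", "\t")  # \t -> <tab>
```

For the three-character input backslash, backslash, `t`, the single pass yields a backslash followed by `t`, while the naive version yields a tab character.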

eilvelia · 2024-12-16T01:52:39Z

Regular escapes are processed before any dedenting happens

Well, per this line, the escapes (other than ws) are resolved after dedenting:

When processing a Multi-line String, implementations MUST dedent the string after resolving all whitespace escapes, but before resolving other backslash escapes.

zkat · 2024-12-16T02:06:54Z

@eilvelia what that is intended to mean is that:

"""
\s\s\s\sfoo
    """

is an invalid string (that is, \s isn't resolved by dedenting time, so it doesn't actually "count" as whitespace)

Additionally:

"""
\"""
"""

Is certainly allowed, and certainly is passing my tests right now. The closing """ parsing is done before any of the dedenting stuff even applies. That's just standard delimited string stuff.
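A minimal sketch of that "standard delimited string" scan, under the assumption that a backslash always protects the character right after it for the purpose of finding the delimiter. The escapes are only skipped here, not resolved. `find_closing` is a hypothetical helper name.

```python
def find_closing(src, start):
    """Return the index of the closing triple quote, scanning from start.

    start should point just past the opening triple quote. Escapes are
    skipped lexically so that an escaped quote cannot end the string.
    """
    i = start
    while i < len(src):
        if src[i] == "\\":
            i += 2  # skip the backslash and whatever it escapes
            continue
        if src.startswith('"""', i):
            return i
        i += 1
    raise ValueError("unterminated multi-line string")
```

For the example above (opening quotes, a line holding backslash plus three quotes, then the closing quotes), a plain `str.find` stops at the escaped quotes on the middle line, while `find_closing` continues to the real closing line.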

eilvelia · 2024-12-16T02:32:05Z

Yes, well, that's what should happen, I think the spec is somewhat vague there. That line in particular seemingly suggests that all escapes except ws are unprocessed in case dedent has not been done yet. IIRC the spec only says how escapes are resolved, not how they are lexically analyzed.

edit: To add a small clarification, scanning and resolving can often be combined into a single step. In kdl, this is possible for single-line strings (and trivially raw multi-line strings) but not for quoted multi-line. Although an implementation that doesn't scan non-ws escapes beforehand should even pass most (all, I think) current tests.

zkat · 2024-12-16T02:36:56Z

@eilvelia do you have any thoughts on what kind of rewording would be helpful here?

bgotink · 2024-12-16T12:43:49Z

SPEC.md

+When processing a Multi-line String, implementations MUST dedent the string
+_after_ resolving all whitespace escapes, but _before_ resolving other backslash
+escapes. Furthermore, a whitespace escape that attempts to escape the final
+line's newline and/or whitespace prefix is invalid since the multi-line string


I'm confused by this phrasing. It seems to say that escaped newline and/or whitespace are not allowed as the multi-line string has to be valid after the escaped whitespace is removed, but the text below and two of the new tests actually allow escaped whitespace in the final line.
The examples below seem to imply that an escaped newline and/or whitespace are only not allowed if the multiline string would become invalid, but that's not how I read this sentence.

Note that would also allow escaped newlines in some cases, e.g.

node """
  lorem ipsum
  \
  """

which would be equivalent to

node """
  lorem ipsum
  """

which would imo be worth adding as test.

@bgotink this test sort of tests that but having a more explicit one would be good.

I believe the bit that says that a trailing \ is invalid actually intended to refer to a case like this:

"""
  lorem ipsum\
  """

which would be invalid because:

"""
  lorem ipsum"""

This should probably be reworded, assuming these are the semantics we want to keep.
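The difference between the two cases can be sketched by removing whitespace escapes from the raw text before looking at the final line. This assumes that only space, tab, CR and LF count as whitespace, and that any other backslash pair (including `\\`) is left untouched at this stage. `strip_ws_escapes` is a hypothetical name.

```python
import re

# A backslash followed either by a run of whitespace (a whitespace escape)
# or by one other character (any other escape, captured in group 1).
ESCAPE = re.compile(r'\\(?:[ \t\r\n]+|(.))', re.DOTALL)


def strip_ws_escapes(raw):
    """Remove whitespace escapes; keep every other escape in raw form."""
    def repl(match):
        if match.group(1) is not None:
            return match.group(0)  # not a whitespace escape: keep as written
        return ""  # whitespace escape: drop the backslash and the whitespace
    return ESCAPE.sub(repl, raw)
```

With a lone `\` on its own line before the closing quotes, the result still ends in a newline plus indentation, so the closing line stays whitespace-only. With `lorem ipsum\` directly before the closing line, the newline is consumed and the content ends up on the same line as the closing quotes, which is the invalid case. Because `\\` is matched as one unit, `foo\\ bar` does not trigger the whitespace escape.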

I'm not terribly inclined to change multiline strings any further, though, tbh.

I'm not terribly inclined to change multiline strings any further,

Oh no, I'm definitely not asking to change multiline strings! I was just confused about the text vs tests.

oh no that's fine. I was also mostly responding to the ongoing convo with @eilvelia who brought up a lot of good points, though I'm leaning towards "clarify, don't change"

zkat · 2024-12-19T17:32:43Z

I'm pretty happy with where this is at right now. I still need to update kdl-rs with these last couple of changes to make sure everything looks good, but barring any other issues getting brought up, I intend to merge this PR and tag the official 2.0.0 sometime on Saturday.

/cc @larsgw and @tabatkins who I don't think have responded here yet, wanna make sure y'all see this.

larsgw · 2024-12-19T17:35:47Z

I felt like I could not give meaningful feedback without attempting to implement it, but I have unfortunately been very busy with my studies the past few weeks. I don't know if I will before Saturday, but that's fine with me.

zkat · 2024-12-19T18:51:05Z

hmmmmmmmmmm

I was thinking about #"""# and how we made it an error now and... I feel weird about it. It's literally the only string that can't be represented by raw strings, at all.

The only way to represent this is to do either:

"\""
//or
#"""
"
"""#

Which feels really strange? I know it's a relatively minor thing, but it leaves a bad taste in my mouth for something to be, actually, unrepresentable in one of our string syntaxes.

tjol · 2024-12-19T19:10:29Z

That's not the only string you can't represent with raw strings. Another example is "\r\n" (you can't do this in a single-line raw string because of the newline and you can't do this in a multi-line raw string because of newline normalization.) You also can't represent "\u{feff}" as a raw string (or other strings containing disallowed literal code points). Granted, compared to "\"", the other examples I can think of are a lot ... weirder.

zkat · 2024-12-19T19:15:48Z

oh that's a good point. And it does make me feel better. 🤔🤔🤔

eilvelia · 2024-12-20T16:21:11Z

Isn't it a little weird that newlines are allowed after /- (the only line-space inside a node)?

zkat · 2024-12-20T16:29:41Z

That’s intentional!

zkat · 2024-12-20T16:41:18Z

That is: I thought it would be good for /- to mean “slurp up any and all whitespace until the item being commented out”, as opposed to giving it special rules within nodes. It’s more for simplicity (a single /- definition) than thinking this is a thing that’ll be done all the time. We used to have much more strict locations for /- and it turned out to actually complicate grammars and parsers more than it was worth

eilvelia · 2024-12-20T16:54:49Z

Well, currently slashdash is a node/children/arg/prop "modifier" (as I described in #401 (comment)) and can only be inside a node (including at the beginning of it), changing its behaviour would be as simple as

multi-line-comment := '/*' commented-block
commented-block := '*/' | (multi-line-comment | '*' | '/' | [^*/]+) commented-block
-slashdash := '/-' line-space*
+slashdash := '/-' node-space*

// Whitespace

zkat · 2024-12-20T17:19:17Z

I definitely want the following to be legal:

/-
my-node 1 2 3

so I’m not terribly inclined to change this at this stage

zkat · 2024-12-21T16:20:37Z

Final heads up: I'm gonna be wrapping up some stuff with kdl-rs this morning and releasing it, and then I'm gonna merge this and release/announce both at the same time. If you'd like to tag your own implementation today and have me announce it, please lmk!

Looks like we're all set here :)

eilvelia · 2024-12-21T16:48:08Z

Could you take a look at two small PRs I've sent (#441 and #442)?

tjol · 2024-12-21T18:13:13Z

@zkat ckdl 0.2.1 is out now with opt-in support for KDL 2.0.0 as it stands. I'm just changing the defaults to KDL 2.0 now. Should be done in no time at all. Planning to call that version ckdl-1.0!

bgotink · 2024-12-21T19:32:35Z

I've tagged version 0.2.0 of npm package @bgotink/kdl with KDL v2 support 🎉 (release)

tjol · 2024-12-21T19:49:27Z

No huge surprises in the process of changing the defaults, ergo:

ckdl-1.0 is released. This version supports both KDL v1 and v2; hybrid mode is the default for reading, and KDL v2 is the default for writing.

Release / Python package

Feel free to mention in the main announcement.

* Add version marker to the grammar
* Add version marker to the Changelog
* Update SPEC.md
Co-authored-by: eilvelia <[email protected]>
* add a mandatory newline after the version marker
* add mandatory space between version number
---------
Co-authored-by: eilvelia <[email protected]>

zkat · 2024-12-22T02:33:25Z

LFGGGGGGG

zkat · 2024-12-22T04:59:54Z

Thank you everyone!! Great job!! We did it!!!

zkat added the enhancement label Dec 15, 2024

zkat requested review from borland, shieldo, hkolbeck, tabatkins, danini-the-panini, alightgoesout, IceDragon200 and larsgw December 15, 2024 03:52

Release KDL 2.0.0

5d6f755

zkat force-pushed the release-2.0.0 branch from 373115f to 5d6f755 Compare December 15, 2024 04:01

fix grammar for multiline quoted strings to allow escaped whitespace on closing line

65a0628

zkat force-pushed the release-2.0.0 branch from e8276f7 to 65a0628 Compare December 15, 2024 10:07

zkat and others added 3 commits December 15, 2024 02:24

Add unicode-space to raw string

83f4c37

Remove nonexistent equals-sign from the grammar (#435)

ecb34f2

fix multiline string tests

0c5604b

eilvelia and others added 2 commits December 15, 2024 17:41

grammar: fix disallowed-keyword-identifiers and string-character (#436)

c3bb12c

Back out "fix multiline string tests"

a3a6742

This backs out commit 0c5604b.

bgotink reviewed Dec 16, 2024

zkat mentioned this pull request Dec 19, 2024

update to latest 2.0 spec kdl-org/kdl-rs#103

Merged

This was referenced Dec 20, 2024

KDL 2.0.0 compliance kachick/micro-kdl#48

Open

KDL 2.0.0 compliance TheLostLambda/knus#16

Open

eilvelia added 2 commits December 21, 2024 10:45

Always escape \ inside single quotes in the grammar text (#441)

8bd1595

to match the other uses of it and the metalanguage description below

Add tests for mandatory whitespace between arguments or properties (#442)

da5cbf5

eilvelia mentioned this pull request Dec 21, 2024

Fix a changelog line erroneously truncated in #444 #445

Merged

eilvelia and others added 4 commits December 21, 2024 14:18

Fix a changelog line erroneously truncated in #444 (#445)

998a2ec

fix: move vertical tab to the line-breaking whitespace to match Unicode (#446)

d1022ae

add vertical tab change test

da4e7d4

final tweaks before release

8ad94ec

zkat merged commit 6ceecd8 into main Dec 22, 2024
1 check passed

zkat deleted the release-2.0.0 branch December 22, 2024 02:33
