diff --git a/.gitignore b/.gitignore
new file mode 100644
index 000000000..fe4a57d81
--- /dev/null
+++ b/.gitignore
@@ -0,0 +1,2 @@
+*~
+compiled/
diff --git a/shrubbery/0000-shrubbery.md b/shrubbery/0000-shrubbery.md
new file mode 100644
index 000000000..41d4520fe
--- /dev/null
+++ b/shrubbery/0000-shrubbery.md
@@ -0,0 +1,1279 @@
+- Feature Name: Shrubbery notation
+- Start Date: 2019-10-01
+- RFC PR: [racket/racket2-rfcs#122](https://github.com/racket/racket2-rfcs/pull/122)
+
+# Summary
+[summary]: #summary
+
+Shrubbery notation is similar to S-expression notation, but instead of
+generating fully formed trees, it is intended to partially group input
+for further enforestation by another parser (e.g., as in Honu). The
+notation is line- and indentation-sensitive, and the parsed form of a
+shrubbery imposes grouping to ensure that further parsing is
+consistent with the shrubbery's lines and indentation.
+
+# Motivation
+[motivation]: #motivation
+
+S-expression notation imposes a grouping at the lexeme level that is
+all but guaranteed to be respected by further parsing via macro
+expansion. One consequence of this lexeme-based grouping is that
+programs can be pretty-printed and textually traversed in standard
+ways.
+
+A traditional use of S-expression notation, however, insists that
+*all* grouping is reflected in the S-expression. Reifying all grouping
+at the lexeme level is so onerous that many practical deployments of
+S-expressions include deviations from the rule, such as keyword-based
+arguments or implicit grouping by position (as in various Clojure
+forms).
+
+Another disadvantage of S-expressions is that many of the parentheses
+are redundant after the expression is pretty-printed, because
+indentation provides the same grouping information in a more
+human-readable way. That observation suggests instead relying on line
+breaks and indentation to impart grouping information, as in Python.
+
+Shrubbery notation explores a point in the design space where the
+notation is
+
+ - line- and indentation-sensitive, and
+ - intended to constrain grouping but not reflect every detail of grouping.
+
+Deferring complete grouping to another parser relieves a burden on
+reader-level notation. At the same time, line- and
+indentation-sensitive rules constrain parsing to ensure that line
+breaks and indentation in the source are not misleading.
+
+# Guide-level explanation
+[guide-level-explanation]: #guide-level-explanation
+
+Here are some example shrubberies. Each line either uses old
+indentation to continue a nesting level that was started on a previous
+line, starts with new indentation and follows a line that ends with
+`:`, or starts with new indentation and a `|` on the same line. A `:`
+or `|` can also appear in the middle of a line, but that's roughly a
+shorthand for starting a new indented line after the `:` or before the
+`|`. The complete rules involve more terminology, but that's enough to
+get a sense of the examples.
+
+```
+def identity(x): x
+
+def fib(n):
+ cond
+ | n == 0: 0
+ | n == 1: 1
+ | else: fib(n-1) + fib(n-2)
+
+def print_sexp(v):
+ match v
+ | empty: display("()")
+ | cons(a, d):
+ if is_list(d)
+ | display("(")
+ print_sexp(a)
+ for (v = in_list(d)):
+ display(" ")
+ print_sexp(v)
+ display(")")
+ | display("(")
+ print_sexp(a)
+ display(". ")
+ print_sexp(d)
+ display(")")
+ | v: print_atom(v)
+```
+
+Forms like `def`, `cond`, and `match` are not specified by
+shrubbery notation, since specifying those forms is up to a language
+that is built on top of shrubbery notation. Still, shrubbery notation
+is meant to accommodate a particular kind of syntax for nested blocks
+(via `:` and indentation) and conditional branches (via `|`).
+
+Identifiers are C-style with alphanumerics and underscores. Operators
+are sequences of symbolic characters in the sense of `char-symbolic?`,
+roughly. No spaces are needed between operators and non-operators, so
+`1+2` and `1 + 2` mean the same thing. Comments are C-style, plus a
+`#//` group-comment form. See [lexeme parsing](#lexeme-parsing) for
+more information.
+
+The following tokens are used for grouping, in addition to line breaks
+and indentation:
+
+```
+( ) [ ] { } ; , : | « » \
+```
+
+Parentheses, square brackets, and curly braces are used to form groups
+in the obvious way. A `;` or `,` acts as a group separator, even
+within a single line. A `:` or `|` treats remaining item on the same
+line like a new indented line, which forms a subgroup. A guillemet
+pair `«` and `»` can be used (probably very rarely) to explicitly
+bracket subgroups formed by `:` and `|` without line breaks. A `\`
+continues a line, effectively shifting all columns on the next line as
+if they appeared immediately after the `\`.
+
+## Grouping by lines
+
+The main grouping rule is that sequences on different lines with the
+same indentation create separate groups, one for each line.
+
+```
+this is the first group
+this is the second group
+```
+
+Comments and lines with only whitespace are ignored. They don't count
+when this document says “the previous line” or “the next line.”
+
+## Grouping by _opener_-_closer_ pairs
+
+An _opener_-_closer_ pair `(` and `)`, `[` and `]`, or `{` and `}`
+forms a nested group that can span lines. Within the _opener_-_closer_
+pair, `,` sparates groups. Groups can be on separate lines at the same
+indentation, but groups on separate lines still must be separated by
+`,`. Parsing retains whether a subgroup is formed by `()`, `[]`, or
+`{}`.
+
+```
+group 1
+[group 2 - subgroup I,
+ group 2 - subgroup II,
+ (group 2 - subgroup III - subsubgroup A,
+ group 2 - subgroup III - subsubgroup B,
+ {group 2 - subgroup III - subsubgroup C, subsubsubgroup α,
+ group 2 - subgroup III - subsubgroup C, subsubsubgroup β})]
+(group 3 - subgroup I, group 3 - subgroup II,
+ group 3 - subgroup III)
+```
+
+The following three forms are not allowed, because they are missing a
+`,` between two groups:
+
+```
+// Not allowed
+(1
+ 2)
+[1
+ 2]
+{1
+ 2}
+```
+
+A `,` is disallowed if it would create an empty group, except that a
+trailing `,` is allowed.
+
+```
+// Not allowed
+(, 1)
+(1,, 2)
+
+// Allowed, but not standard
+(1, 2,)
+```
+
+A trailing `,` is only standard style when the _closer_ that follows is
+on its own line.
+
+```
+list(
+ red,
+ green,
+ blue,
+ orange,
+)
+```
+
+## Blocking with `:` and indentation
+
+A sequence of groups has a particular indentation that is determined
+by the first group in the sequence. Subsequent groups in a sequence
+must start with the same indentation as the first group.
+
+```
+group 1
+group 2
+// error, because the group is indented incorrectly:
+ group 3
+```
+
+When a line ends with `:` and the next line is more indented, then
+it starts a new sequence of groups that form a _block_:
+
+```
+group:
+ subgroup 1
+ subgroup 2
+```
+
+There is no constraint on how much indentation a nested group sequence
+must use, as long as the indentation is more than the enclosing group.
+Also, a new line is not required after `:`, but then it's as if the
+`:` is followed by a newline plus spaces that reach the same column as
+the `:`. All four of the following groups are the same, each with one
+block that has two nested groups:
+
+
+```
+hello:
+ world
+ universe
+
+hello:
+ world
+ universe
+
+hello: world
+ universe
+
+hello: world
+ universe
+```
+
+Within an _opener_-_closer_ pair, a nested group sequence can start at
+any indentation; it doesn't have to be indented to the right of the
+_opener_.
+
+```
+function(
+ argument,
+ more
+)
+```
+
+A block that is started with `:` normally cannot be empty (unless
+explicit-grouping `«` and `»` are used as described in a later
+section), so the following is ill-formed:
+
+```
+bad_empty: // empty block disallowed
+```
+
+However, `:` can be used at the start of a group so that the group
+contains only a block. When `:` starts a group that is in the
+top-level sequence or within an _opener_-_closer_ pair, the block
+created by `:` is allowed to be empty (because that provides a way to
+express an empty in a context where it likely to be intentional
+instead of confusing). For example, the first of the following three
+top-level groups has just a block that contains one group with the
+single element `untagged`, the second top-level group has just a
+block with zero groups, and the third has a group with one parenthesized
+sequence of groups where the middle one has an empty block:
+
+```
+: untagged
+
+:
+
+(1, :, 2)
+```
+
+## Continuing with indentation and an operator
+
+When a newly indented line starts with an operator and when the
+preceding line does _not_ end with `:`, then the indented line does
+not form a block, and it may instead continue the previous line. The
+operator-starting line continues only if the previous line was not a
+continuing line; however, additional continuing lines can start with
+an operator (not necessarily the same one) at the same indentation as
+the original continuing line. The following two groups are the same:
+
+```
+f(1) + 2
+ + 3 + 4
+ - 5 - 6
+
+f(1) + 2 + 3 + 4 - 5 - 6
+```
+
+A block is always at the end of its immediately containing group. One
+consequence is that an operator-starting line cannot continue a group
+that already has a block:
+
+```
+hello: world
+ + 3 // bad indentation
+```
+
+Along those lines, there is no ambiguity when an indented line appears
+after `:` and starts with an operator. In that case, the indented line
+is part of the block, since it cannot continue the group that contains
+the block. For example, the following two groups are the same, each
+with a block that has a `+ 3` group:
+
+```
+hello: + 3
+
+hello:
+ + 3
+```
+
+## Blocking with `|`
+
+A `|` is implicitly shifted half a column right (so, implicitly
+nested), and it is implicitly followed by a `:` that conceptually
+occupies same column as the `|`. That is, like `:`, a `|` always
+creates a nested block. Furthermore, `|` starts an enclosing block
+that includes the `|` block plus subsequent `|` blocks that are at the
+same indentation. A `|` that starts the enclosing block can appear at
+the start of a line with new indentation. The following four groups
+are the same:
+
+```
+hello
+| world
+| universe
+
+hello
+ | world
+ | universe
+
+hello | world
+ | universe
+
+hello |
+ world
+ |
+ universe
+```
+
+Each of the four groups has two elements: `hello` and a block. The
+block has two groups, each of which is a more nested block. The first
+nested block has `world` in a single group, and the second nested
+block as `universe` in a single group.
+
+A `|` cannot be a in a top-level sequence of groups or start a group
+immediately within `()`, `[]`, or `{}`, and it cannot appear just
+after `:`. Like `:`, the content of a block after `|` cannot be empty
+unless explicit-grouping `«` and `»` are used.
+
+If a `|` appears on the same line as an earlier `|` and is not more
+nested inside `()`, `[]`, or `{}`, then the `|` terminates the earlier
+`|`'s block and continues its enclosing block with a new `|` group.
+The intent and consequence of this rule is that multiple `|`s can be
+used on a single line as an alternative to starting each `|` on its
+own line, making the following groups the same as the above groups:
+
+```
+hello | world | universe
+
+hello
+| world | universe
+```
+
+The implicit shifting of `|` by half a column is consistent with its
+visual representation, and it avoids the possibility of a group
+sequence that contains a mixture of `|`-started groups and other kinds
+of groups. Standard indentation uses no additional space of
+indentation before `|` relative to its enclosing block's group.
+
+
+## Separating groups with `;` and `,`
+
+A `;` separates two groups on the same line. A `;` is allowed in any
+context—except between groups immediately within, `()`, `[]`, or `{}`,
+where a `,` separates groups. The following three blocks
+are the same:
+
+```
+hello:
+ world
+ universe
+
+hello:
+ world; universe
+
+hello: world; universe
+```
+
+The `;` and `,` separators interact differently with blocks formed by
+`:` and `|`. A `,` closes subgroups and blocks as necessary to reach
+an enclosing `()`, `[]`, or `{}`, while a `;` separate groups within a
+nested group sequence. If `;` would create an empty group, it is
+ignored.
+
+For example, the following two groups are the same, and they have one
+parenthesized term that has a single block, and the block has two
+groups:
+
+```
+(hello: world; universe)
+
+(hello: world
+ universe)
+```
+
+The following two groups are also the same, where the group has one
+parenthesized term, but that term contains two groups, where the first
+group is a block that contains a single group:
+
+
+```
+(hello: world, universe)
+
+(hello: world,
+ universe)
+```
+
+## Line- and column-insensitivity with `«` and `»`
+
+A block can be delimited explicitly with `«` and `»` to disable the
+use of line and column information for parsing between `«` and `»`. A
+`«` can be used immediately after `:` or immediately after `|`, in
+which case a `»` indicates the end of the block that starts after the
+`:` or `|`. Within the block, an explicit `;` must be used to separate
+groups.
+
+A sequence of groups, either at the top level or within a block, can
+be written without line and column sensitivity as `;` followed
+immediately by `«`, in which case a `»` indicates the end of the
+sequence, and groups within the sequence are separated by `;`. When
+parsing, the groups within the sequence are spliced into the enclosing
+context. The combination of `;` and `«` is intended for entering line-
+and column-insensitive mode for a single group or for representing a
+sequence of groups that is not within a block.
+
+Whitespace and block comments are allowed between a `:`, `|`, or `;`
+and its `«`, but in a line-sensitive context, the `«` must be on the
+same line as its `:`, `|`, or `;`.
+
+The following five groups are the same:
+
+```
+hello:
+ if x
+ | world
+ planet
+ | universe
+
+hello: if x | world; planet | universe
+
+hello:«
+ if x
+ |« world;
+ planet »
+ |« universe »»
+
+hello:« if x |« world; planet » |« universe »»
+
+;«hello
+ :
+ «
+ if
+ x
+ |
+ «
+ world
+ ;
+ planet
+ »
+ |
+ «
+ universe
+ »
+ »
+ »
+```
+
+Using `«` and `»` can “armor” a shrubbery for transport from one
+context to another where its line breaks or indentation might get
+mangled. For example, an editor might offer an operation to armor a
+range of text in perparation for moving or copying the text, and then
+it can be properly indentend in its destination before unmarmoring.
+Along similar lines, when writing code as data to be read back later,
+it's easy for a printer to insert explicit `«` and `»`.
+
+In rare cases, a programmer might write `«` and `»` directly. Although
+many shrubbery forms can be written with `:`, `|`, and `;` on a single
+line, as illustrated above, not all forms can be collapsed to a single
+line without extra delimiters. For example, these six groups are all
+different:
+
+```
+outside:
+ inside: fruit
+ rind
+
+// not the same, because `rind` is within `inside:`
+outside: inside: fruit; rind
+
+if true
+| if false
+ | x
+ | y
+| z
+
+// not the same, because there's one block with five `|` alternatives
+if | true | if false | x | y | z
+
+hello:
+ if x
+ | world
+ | universe
+ the end
+
+// not the same, because `the end` is in the second `|`:
+hello: if x | world | universe; the end
+```
+
+Using `«` and `»` can help in those cases:
+
+```
+outside:
+ inside: fruit
+ rind
+
+outside: inside:« fruit »; rind
+
+if true
+| if false
+ | x
+ | y
+| z
+
+if | true |« if false | x | y » | z
+
+hello:
+ if x
+ | world
+ | universe
+ the end
+
+hello: if x | world |« universe »; the end
+```
+
+Even so, delimiting blocks with `«` and `»` is expected to be rare in
+practice, both because programmers are likely to break things across
+lines and because a language that uses shrubbery notation is likely to
+allow `()` in places where grouping might be needed. For example,
+assuming that `if` is an expression form and `()` can wrap an
+expression, a nested conditional is probably better written like this:
+
+```
+if | true | (if false | x | y) | z
+```
+
+Using `()` in this way does not produce an equivalent shrubbery to `if
+| true |« if false | x | y »| z`, but it might represent an equivalent
+expression in the language using shrubbery notation.
+
+To stay consistent with blocks expressed through line breaks and
+indentation, a block with `«` and `»` must still appear at the end of
+its enclosing group.
+
+```
+// not allowed, because a block must end a group
+inside:« fruit » more
+```
+
+
+## Continuing a line with `\`
+
+As a last resort, `\` can be used at the end of a line (optionally
+followed by whitespace and coments on the line) to continue the next
+line as it if were one line continuing with the next line. The itself
+`\` does not appear in the parsed form. A that is not at the end of a
+line (followed by whitespace and coments) is treated the same as
+whitespace.
+
+Lines contianing only whitespace and (non-term) comments do not count
+as “the next line” even for `\` continuations, so any number of
+whitespace and comment lines can appear between `\` and the line that
+it continues.
+
+
+```
+this is \
+ the first group
+this \ is \ the \ second \ group
+
+this is a group \
+ with (a,
+ nested,
+ list)
+
+this is a group \
+ with (a,
+ \
+ nested,
+ \
+ list)
+
+this is a group \
+ with (a,
+ \
+ /* this a comment on `nested`: */
+ nested,
+ \
+ list)
+
+```
+
+## Group comments
+
+A `#//` comments out a group or `|` alternative. To comment out a
+group, `#//` must appear either on its own line before a group or at
+the start of a group. To comment out an alternative, `#//` must appear
+on its own line before the alternative or just before a `|` that does
+*not* start a new line.
+
+The interaction between `#//` and indentation depends on how it is
+used:
+
+ * When `#//` appears completely on its own line (possibly with
+ whitespace and non-group comments), then its indentation does not
+ matter. It comments out the next group or alternative—which might
+ be a single-line group, block, or `|` block.
+
+ * When `#//` appears at the start of a group with more tokens
+ afterward on the same line, it determines that group's indentation,
+ and it must obey any constraints on the group's indentation. When
+ `#//` appears immediately after an opener but with nothing else
+ afterward on the same line, it determines indentation for the
+ groups immediately within the opener, and it comments out the first
+ group.
+
+ * When `#//` appears just before a `|` on the same line, then unlike
+ the case for groups, it does not affect the the column of the `|`
+ as used to align alternatives on later lines. (That's because the
+ half-column alignment of `|` does not fit with the column alignment
+ of `#`.) Along those lines and to avoid an indentation mismatch, a
+ `#//` is not allowed to start a line for commenting out a `|`
+ alternative on the same line.
+
+A `#//` is not allowed without a group or alternative afterward to
+comment out. Multiple `#//`s do not nest (i.e., two `#//`s in a row is
+always an error).
+
+The following three groups all parse the same:
+
+```
+{
+ hello:
+ val x: f(1, 2 + 3)
+ match x
+ | 1: 'one'
+ | 2: 'two'
+}
+
+{
+ hello:
+ val x:
+ #//
+ g(-1)
+ f(
+ #//
+ 0,
+ 1,
+ 2 + 3,
+ #//
+ 4 + 5)
+ #//
+ not included in the code
+ match x
+ #//
+ | 0: no
+ | 1: 'one'
+ #//
+ | 1.5: no
+ | 2: 'two'
+ #//
+ | 3: no,
+ #//
+ goodbye:
+ the enclosing group of the block is commented out
+}
+
+{
+ hello:
+ val x:
+ #// g(-1)
+ f(#// 0, 1, 2 + 3, #// 4 + 5)
+ #// not included in the code
+ match x #// | 0: no | 1: 'one' #// | 1.5: no
+ | 2: 'two' #// | 3: no,
+ #// goodbye:
+ the enclosing group of the block is commented out
+}
+```
+
+## More examples
+
+Here are more example shrubberies. These shrubberies are not
+necessarily consistent with each other in the sense of sketching a
+single language that uses shrubbery notation; they show different
+potential ways of using the notation.
+
+
+```
+define pi: 3.14
+
+define
+| fib(0): 0
+| fib(1): 1
+| fib(n): fib(n-1) + fib(n-2)
+
+define fib(n):
+ match n
+ | 0: 0
+ | 1: 1
+ | n: fib(n-1) + fib(n-2)
+
+define fib(n):
+ match n | 0: 0
+ | 1: 1
+ | n: (fib(n-1)
+ + fib(n-2))
+
+define fib(n):
+ match n
+ | 0:
+ 0
+ | 1:
+ 1
+ | n:
+ fib(n-1) + fib(n-2)
+
+define make_adder(n):
+ lambda (m):
+ printf("adding to ~a\n", m)
+
+define fourth(n: integer):
+ define m: n*n
+ define v: m*m
+ printf("~a^4 = ~a\n", n, v)
+ v
+
+struct posn(x, y):
+ property prop_equal_and_hash:
+ let (hc = lambda (a: posn, hc):
+ hc(a.x) + hc(a.y),
+ eql = lambda (a: posn, b: posn, eql):
+ eql(a.x, b.x) && eql(a.y, b.y)):
+ values(eql, hc, hc)
+
+define go():
+ define helper(n):
+ list(n, n)
+ define more(m):
+ if m == 0 | "done"
+ | more(m - 1)
+ helper(more(9))
+
+define curried:
+ lambda (x):
+ lambda (y):
+ lambda (z):
+ list(x, y, z)
+
+let (x = 1,
+ y = 2):
+ printf("About to add")
+ x+y
+
+define show_zip(l, l2):
+ for (x = in_list(l),
+ x2 = in_list(l2)):
+ print(x)
+ print_string(" ")
+ print(x2)
+ newline()
+
+define show_combos(l, l2):
+ for (x = in_list(l)):
+ then (x2 = in_list(l2)):
+ print(x)
+ print_string(" ")
+ print(x2)
+ newline()
+```
+
+## Parsed representation
+
+The parse of a shrubbery can be represented by an S-expression:
+
+ * Each group is represented as a list that starts `'group`, and
+ the rest of the list are the elements of the group.
+
+ * Atom elements are represented as “themselves” within a group,
+ including identifers a symbols, except that an operator is
+ represented as a 2-list that is `'op` followed by the operator name
+ as a symbol.
+
+ * A group sequence is represented as a list of `'group` lists.
+
+ * An element created by `()` is represented by `'parens` consed
+ onto a group-sequence list.
+
+ * An element created by `[]` is represented by `'brackets` consed
+ onto a group-sequence list.
+
+ * An element created by `{}` is represented by `'braces` consed
+ onto a group-sequence list.
+
+ * A block is represented as either `'block` or `'alts` consed onto a
+ group-sequence list. The representation uses `'alts` if the content
+ of the block is a squence of groups started with `|`, and it's
+ `'block` otherwise.
+
+ * A block created to follow `|` appears immediately in an `'alts`
+ list.
+
+Note that a block can only appear immediately in a `'group` or `'alts`
+list. Note also that there is no possibility of confusion between
+symbol atoms in the input and `'group`, `'block`, etc., at the start
+of a list in an S-expression representation, because symbol atoms will
+always appear as non-initial items in a `'group` list.
+
+Here are some example shrubberies with their S-expression parsed
+representations:
+
+```
+define pi: 3.14
+
+(group define pi (block (group 3.14)))
+```
+
+```
+define fourth(n: integer):
+ define m: n*n
+ define v: m*m
+ printf("~a^4 = ~a\n", n, v)
+ v
+
+(group define
+ fourth
+ (parens (group n (block (group integer))))
+ (block
+ (group define m (block (group n (op *) n)))
+ (group define v (block (group m (op *) m)))
+ (group printf
+ (parens (group "\"~a^4 = ~a\\n\"") (group n) (group v)))
+ (group v)))
+```
+
+```
+if x = y
+| same
+| different
+
+(group if x (op =) y (alts (block (group same))
+ (block (group different))))
+```
+
+```
+define fib(n):
+ match n
+ | 0: 0
+ | 1: 1
+ | n: fib(n-1) + fib(n-2)
+
+(group define
+ fib
+ (parens (group n))
+ (block
+ (group match
+ n
+ (alts
+ (block (group 0 (block (group 0))))
+ (block (group 1 (block (group 1))))
+ (block
+ (group n
+ (block
+ (group fib
+ (parens (group n (op -) 1))
+ (op +)
+ fib
+ (parens (group n (op -) 2))))))))))))
+```
+
+## Lexeme Parsing
+[lexeme-parsing]: #lexeme-parsing
+
+The tokens used for grouping and indentation are distinct lexemes:
+
+```
+( ) [ ] { } ; , : | « » \
+```
+
+Other lexemes are described by the grammar in the table below, where
+an asterisk in the left column indicates the productions that
+correspond to lexemes. Only simple forms of numbers are supported
+directly (decimal integers, decimal floating point, and hexadecimal
+integers, in all cases allowing `_`s between digits), but a `#{`...`}`
+escape provides access to the full Racket S-expression number grammar.
+Boolean literals are Racket-style, instead of reserving identifiers.
+Special floating-point values similarly use a `#` notation.
+
+Identifiers are formed from Unicode alphanumeric characters plus `_`,
+where the initial character must not be a numeric character. An
+identifier prefixed with `~` forms a keyword, analogous to prefixing an
+identifier with `#:` in Racket.
+
+Operators are formed from Unicode symbolic and punctuation characters
+other than the ones listed above as distinct lexemes (plus a few more,
+like `"` and `'`), but `|` or `:` is also
+allowed in an operator name as long as it is not by itself. A
+multi-character operator cannot end in `+`, `-`, or `.` to avoid
+ambiguity in cases like `1+-2` (which is `1` plus `-2`, not `1` and
+`2` combined with a `+-` operator), unless the operator contains
+only `+`, `-`, or `.` (so `++`, `--`, and `...` are allowed).
+Also, multi-character operator cannot end with `/` or contain `//` or
+`/*`, because that can create ambiguities with comments.
+
+Implicit in the grammar is the usual convention of choosing the
+largest possible match at the start of a stream. Not reflected in the
+grammar is a set of delimiter requirements: numbers, `#true`, and
+`#false` must be followed by a delimiter. For example, `1x` is a
+lexical error, because the `x` after `1` is not a delimiter.
+Non-alphanumeric characters other than `_` and `.` are delimiters.
+Finally, the treatment of `+` and `-` as a number prefix versus an
+operator is subject to a special rule: they are parsed as operators
+when immediately preceded by an alphanumeric character, `_`, `)`, `]`, or `}`
+with no whitespace in between. For
+example, `1+2` is `1` plus `2`, but `1 +2` is `1` followed by the
+number `+2`.
+
+When a `#{`...`}` escape describes an identifier S-expression, it is
+an identifier in the same sense as a shrubbery-notation identifier.
+the same holds for numbers, booleans, strings, byte strings, and
+keywords. A `#{`...`}` escape must _not_ describe a pair, because
+pairs are used to represent a parsed shrubbery, and allowing pairs
+would create ambiguous or ill-formed representations.
+
+A `@` starts an at-expression form similar to the notaton supported by
+`#lang at-exp` (which oriented toward S-expressions and
+readtable-based). The next subsection explains in more detail, but the
+tabel below sketches the shape of `@` forms.
+
+| | nonterminal | | production | adjustment |
+|---|-----------------|-----|-----------------------------------------------------------|--------------------------------|
+| * | _identifier_ | is | _alpha_ _alphanum_ * | |
+| | | | | |
+| | _alpha_ | is | **an alphabetic Unicode character or** `_` | |
+| | | | | |
+| | _alphanum_ | is | _alpha_ | |
+| | | or | **a numeric Unicode character** | |
+| | | | | |
+| * | _keyword_ | is | `~` _identifier_ | |
+| | | | | |
+| * | _operator_ | is | _opchar_ * _tailopchar_ | **not** `❘` **or** `:` ... |
+| | | or | `.` + | ... **or containing** `//` ... |
+| | | or | `+` + | ... **or containing** `/*` |
+| | | or | `-` + | |
+| | | | | |
+| | _opchar_ | is | **a symbolic Unicode character not in** _special_ | |
+| | | or | **a punctuation Unicode character not in** _special_ | |
+| | | or | **one of** `:`, `❘` | |
+| | | | | |
+| | _tailopchar_ | is | **anything in** _opchar_ **except** `+`, `-`, `.`, `/` | |
+| | | | | |
+| | _special_ | is | **one of** `(`, `)`, `[`, `]`, `{`, `}`, `«`, `»` | |
+| | | or | **one of** `"`, `;`, `,`, `~`, `#`, `\`, `_`, `@` | |
+| | | | | |
+| * | _number_ | is | _integer_ | |
+| | | or | _float_ | |
+| | | or | _hexinteger_ | |
+| | | | | |
+| | _integer_ | is | _sign_ ? _nonneg_ | |
+| | | | | |
+| | _sign_ | is | **one of** `+` **or** `-` | |
+| | | | | |
+| | _nonneg_ | is | _decimal_ _usdecimal_ + | |
+| | | | | |
+| | _decimal_ | is | `0` through `9` | |
+| | | | | |
+| | _usdecimal_ | is | _decimal_ | |
+| | | or | `_` _decimal_ | |
+| | | | | |
+| | _float_ | is | _sign_ ? _nonneg_ ? `.` _nonneg_? _exp_ ? | |
+| | | or | _sign_ ? _nonneg_ _exp_ | |
+| | | or | `#inf` | |
+| | | or | `#neginf` | |
+| | | or | `#nan` | |
+| | | | | |
+| | _exp_ | is | `e` _sign_ ? _nonneg_ | |
+| | | or | `E` _sign_ ? _nonneg_ | |
+| | | | | |
+| | _hexinteger_ | is | `0x` _hex_ _ushex_ * | |
+| | | | | |
+| | _hex_ | is | **one of** `0` **through** `9` | |
+| | | or | **one of** `a` **through** `f` | |
+| | | or | **one of** `A` **through** `F` | |
+| | | | | |
+| | _ushex_ | is | _hex_ | |
+| | | or | `_` _hex_ | |
+| | | | | |
+| * | _boolean_ | is | `#true` | |
+| | | or | `#false` | |
+| | | | | |
+| * | _string_ | is | `"` _strelem_ * `"` | |
+| | | | | |
+| | _strelem_ | is | **element in Racket string** | `\U` ≤ 6 digits |
+| | | | | |
+| * | _bytestring_ | is | `#"` _bytestrelem_ * `"` | |
+| | | | | |
+| | _bytestrelem_ | is | **element in Racket byte string** | |
+| | | | | |
+| * | _sexpression_ | is | `#{` _racket_ `}` | |
+| | | | | |
+| | _racket_ | is | **any non-pair Racket S-expression** | |
+| | | | | |
+| * | _comment_ | is | `//` _nonnlchar_ | |
+| | | or | `/*` _anychar_ `*/` | nesting allowed |
+| | | | | |
+| * | _termcomment_ | is | `#//` | |
+| | | | | |
+| | _nonnlchar_ | | **any character other than newline** | |
+| | | | | |
+| * | _atexpression_ | is | `@` _command_ ? _arguments_ ? _body_ ? | no space between these parts |
+| | | | | |
+| | _command_ | is | _identifier_ | |
+| | | or | _keyword_ | |
+| | | or | _operator_ | |
+| | | or | _number_ | |
+| | | or | _boolean_ | |
+| | | or | _string_ | |
+| | | or | _bytestring_ | |
+| | | or | _racket_ | |
+| | | or | `(` _group_ * `)` | usual comma-separated groups |
+| | | or | `«` _group_ `»` | one spliceable group, no block |
+| | | | | |
+| | _arguments_ | is | `[` _group_ * `]` | usual comma-separated groups |
+| | | | | |
+| | _body_ | is | `{` _text_ `}` | possible escapes in _text_ |
+| | | or | _atopen_ _text_ _atclose_ | _atcloser_ matching _atopen_ |
+| | | | | |
+| | _atopen_ | is | `❘` _asciisym_ * `{` | |
+| | | | | |
+| | _atclose_ | is | `}` _asciisym_ * `❘` | reverses and flips paren-like |
+
+# At-notation using `@`
+
+An `@` form of the shape
+
+```
+ @«command ...»[arg, ...]{ body }...
+```
+
+is parsed into the same representation as
+
+```
+ command ...(arg, ..., [parsed_body, ...], ...)
+```
+
+That is, the command part is left at the front and spliced into its
+enclosing group, while the argument and body parts are wrapped with
+parentheses to make them like arguments. Each body text is parsed into
+a list of string literals and escapes, and multiple body texts can
+be provided in multiple `{`...`}`s.
+
+The command part usually does not have `«»`, and it is instead
+usually written as an identifier, operator, or parenthesized term. The
+argument and body parts, when present, always use `[]` and `{}`,
+respectively. Any of the three kinds parts can be omitted, but when
+multiple parts are present, they must have no space between them or
+the leading `@`. When the argument and body parts are both
+omitted, the command part is simply spliced into its context.
+
+The conversion to a call-like form, keeping each body in a separate
+list, and allowing multiple body arguments are the three main ways
+that shrubbery `@` notation differs from `#lang at-exp` notation. The
+other differences are the use of `«`...`»` instead of `|`...`|` for
+delimiting a command, and the use of `@//` instead of `@;` for
+comments. The details are otherwise meant to be the same, and the rest
+of this section is mostly a recap.
+
+A body part is treated as literal text, except where `@` is used in a
+body to escape. An unescaped `}` closes a body, except that an
+unescaped `{` must be balanced by an unescaped `}`, with both treated
+as part of the body text. Instead of `{`, a body-starting opener can
+be `|` plus `{` with any number of ASCII punctuation and symbol
+characters (other than `{`) in between; the corresponding closer is
+then the same sequence in reverse, except that some characters are
+flpped: `{` to `}`, `(` to `)`, `)` to `(`, `[` to `]`, `]` to `[`,
+`<` to `>`, and `>` to `<`. With an `|`...`{` opener, an escape is
+formed by using the opener followed by `@`, while opener–closer pairs
+balance within the body text. When multiple body parts are provided,
+each can use a different opener and closer. The parsed form of the
+body breaks up the body text into lines and `"\n"` as separate string
+literals in the parsed list form, with each escape also being its own
+element in the list form. Parsed body text also has leading and
+trailing whitespace adjusted the same as with `#lang at-exp`.
+
+After the `@` of an escape in body text, the escape has the same
+form as an at-notaton form that starts with `@` as a shubbery. That
+is, `@` forms are essentially the same whether starting in shrubbery
+mode or body-text mode.
+
+In body text, there are two additional comment forms that are not
+supported in shrubbery mode. A `@//{` starts a block comment that ends
+with `}`, and the comment form is not part of the body text. The `@//`
+comment form must be prefixed with an opener when its enclosing body
+is started with an opener that isn't just `{`, and the `{` after `@//`
+can more generally be an `|`...`{` opener with the corresponding
+closer. Opener–closer pairs must be balanced in the commented block,
+the same as in body text. A `@//` comment form (prefixed with an
+opener as needed to form an escape) that is not followed by `{` or an
+`|`...`{` opener comments out the rest of the line, including a
+comment-terminating newline.
+
+# Reference-level explanation
+[reference-level-explanation]: #reference-level-explanation
+
+See [parse.rkt](parse.rkt). Note that if you run with no arguments,
+the that program will read from stdin. Supply one or more files to
+read from those files instead of stdin. Supply `--recover` to continue
+parsing after indentation or _closer_ errors.
+
+See [demo.shrb](demo.shrb), [interp.shrb](interp.shrb), and
+[weird.shrb](weird.shrb) for more examples.
+
+# Drawbacks
+[drawbacks]: #drawbacks
+
+Shrubbery notation may not be a good choice where precise and complete
+grouping is needed, both because its grouping is coarse-grained and
+the grouping rules generate lots of extra `group` layers.
+
+Shrubbery notation does not resolve the question of how infix
+expressions parse. There is no precedence at the shrubbery level, for
+example, other than the way that a `:` has higher precedence (in a
+sense) than `|`.
+
+# Rationale and alternatives
+[rationale-and-alternatives]: #rationale-and-alternatives
+
+The lexeme-level syntax is chosen to be familiar to programmers
+generally. The sequence `1+2` is one plus two, not a strangely spelled
+identifier. Tokens like `(`, `,`, `{` and `;` are used in familiar
+ways. Shrubbery notation provides enough grouping structure that code
+navigation and transformation should be useful and straightforward in an
+editor.
+
+Parentheses in shrubbery notation do not disable indentation, unlike
+some indentation-sensitive notations. That choice supports a language
+in shrubbery notation where parentheses can be added around any
+expression — even if the expression is written with indentation
+(although the expression may need to be shifted right to preserve
+relative indentation, depending on how parentheses are added).
+
+The inclusion of `|` in shrubbery notation reflects the fact that
+conditional forms (such a `if`, `cond`, and `match`) are important and
+common. A distinct, pleasant, and uniform pattern for conditionals
+deserves direct support in the notation.
+
+Requiring a preceding `:` or preceding/following `|` for
+block-creating indentation is mostly a kind of consistency check to
+enable better and earlier errors when indentation goes wrong. It also
+allows indentation that starts with an operator to continue a group;
+it's possible for bad indentation to inadvertently cause an operator
+to be treated as continuing a group, but hopefully that will be rare.
+Always requiring a preceding `:` before an indented `|` line would be
+consistent, but it adds extras `:`s where `|` already provides one
+consistency check. Allowing an optional `:` before `|` would work, but
+programmers may then choose differently on omitting or including the
+`:`, leading to subtly divergent conventions.
+
+Explicit block grouping via `«` and `»` is expected to be rare. The
+grouping characters were intentionally chosen from the Latin-1
+extension of ASCII to avoid reserving additional ASCII characters.
+
+Making whitespace and comment lines ignored in all contexts means that
+they can be freely added without intefering with grouping. The `\`
+continuation operator is somewhat unusual in that it skips blank and
+comment lines to continue, as opposed to requiring `\` on every
+continuing line; that, too, allows extra blank and comment lines to be
+added, even amid continuing lines.
+
+The interaction of indentation and `\` differs slightly from Python,
+which does not count the space for `\` itself or any leading
+whitespace on a continuing line toward indentation. Counting the
+leading whitespace on a continuing line has the advantage that it can
+reach an arbitrary amount of identation within a constrained textual
+width. Counting the `\` itself is consistent with ignoring `\` when it
+appears within a line, so grouping stays the same whether there's a
+newline or the continue line immediately after `\`. The whitespace
+role of `\` also means that spaces can be turned into `\` to “harden”
+code for transfer via media (such as email) that might mangle
+consecutive spaces.
+
+Using `~` for keywords has a precedent in OCaml. Using `~` for
+keywords uses up a character that might otherwise be used for
+operators, but keywords seem useful enough to be worth this cost. The
+notion of keywords as distinct from identifiers has been liberating
+for Racket syntax (particularly since keywords can be kept disintinct
+from expressions more generally), and we expect similar benefits for
+having keywords in shrubbery notation.
+
+The `#{....}` escape to S-expressions bridges between shrubbery
+notation and Racket identifiers. For example, `#{exact-integer?}` is
+an identifier with `-` and `?` as part of the identifier. Shrubbery
+notation could be adapted to support Lisp-style identifiers by
+requiring more space around operators, but the rule for continuing a
+group between `(` and `)` or `[` and `]` currently depends on
+distinguishing operators from non-operators.
+
+For `@`, the choice of treating `@f[arg]{text}` as `f(arg, ["text"])`
+instead of `f(arg, "text")` reflects experience with S-expression `@`
+notation. Although it seems convenient that, say `@bold{x}` is treated
+as `(bold "x")`, the consequence is that a function like `bold` might
+be implemented at first to take a single argument; later, a use like
+`@bold{Hello @name}` breaks, because two arguments are provided.
+Making explicit the list that's inherent in body parsing should help
+reduce such mistakes (or bad design choices) for functions that are
+meant to be used with `@` notation.
+
+# Prior art
+[prior-art]: #prior-art
+
+Indentation-sensitive parsing and the use of `:` is obviously informed
+by Python.
+
+Sampling notation's rules relating indentation, lines, `;`, and `:`
+are originally based on the [`#lang
+something`](https://github.com/tonyg/racket-something) reader, which
+also targets an underlying expander that further groups tokens.
+Shrubbery notation evolved away from using `{}` for blocks, however,
+because `:` was nearly always preferred in experiements with the
+notation. For the very rare case that explicit gropuing is needed for
+a block, `«` and `»` can be used. Freeing `{}` from use for blocks,
+meanwhile, allows its use for set and map notations.
+
+Shrubbery notation is also based on
+[Lexprs](https://github.com/jeapostrophe/racket2-rfcs/blob/lexpr/lexpr/0004-lexpr.md),
+particularly its use of `|`. Lexprs uses mandatory `:` and `|` tokens
+as a prefix for indentation, and it absorbs an additional line after
+an indented section to allow further chaining of the group. Although
+`«»` can be used to form multiple subgroups within a shrubbery group,
+the notation discourages that style in favor of further nesting (or,
+in the case of `if`, in favor of `|` notation like other
+conditionals).
+
+Shrubbery notation is in some sense a follow-up to [sapling
+notation](https://github.com/mflatt/racket2-rfcs/blob/sapling/sapling/0005-sapling.md).
+The primary difference is that shrubbery notation is
+indentation-sensitive, while sapling notation is
+indentation-insensitive. Indentation sensitivity and block conventions
+in shrubbery notation avoid some delimiters and blank lines that are
+needed in sapling notation.
+
+More generally, shrubbery notation takes inspiration from
+S-expressions and alternative S-expression notations. The idea that,
+even in an S-expression-like setting, some parsing can be deferred a
+later parser has many precedents, including Clojure's choice of where
+to put parentheses and notations that use something like `$` to escape
+to infix mode.
+
+# Unresolved questions
+[unresolved-questions]: #unresolved-questions
+
+
+# Future possibilities
+[future-possibilities]: #future-possibilities
+
+Like other notation designs, this one leaves open exactly the way that
+the notation would be used to express a new programming language. The
+examples are meant to be suggestive and have influenced many of the
+notational choices, though.
diff --git a/shrubbery/demo.shrb b/shrubbery/demo.shrb
new file mode 100644
index 000000000..5f2911d11
--- /dev/null
+++ b/shrubbery/demo.shrb
@@ -0,0 +1,296 @@
+let (x = 1,
+ y = 2):
+ x+y
+
+let (x = 1, y = 2): x+y
+
+define pi: 3.14
+
+define fib(n):
+ log_error("fib called")
+ cond | n == 0: 0
+ | n == 1: 1
+ | else: fib(n-1) + fib(n-2)
+
+define
+ | fib(0): 0
+ | fib(1): 1
+ | fib(n): fib(n-1) + fib(n-2)
+
+define
+ | fib(0): 0
+ | fib(1): 1
+ | fib(n): fib(n-1)
+ + fib(n-2)
+
+define fib:
+ lambda (n):
+ cond
+ | n == 0: 0
+ | n == 1: 1
+ | else: fib(n-1) + fib(n-2)
+
+// Ok to add `:` before `|`. This parses the
+// same as the prevous example, but this is not the standard
+// style (which is to omit unnecessary colons).
+define fib:
+ lambda (n):
+ cond
+ | n == 0: 0
+ | n == 1: 1
+ | else: fib(n-1) + fib(n-2)
+
+// Adding parentheses is ok, at least with the obvious handling
+// of parentheses by the use of sapling notation, but the
+// parentheses are apparent in the sampling parse.
+(define fib:
+ (lambda (n):
+ (cond
+ | (n == 0): 0
+ | (n == 1): 1
+ | else: (fib(n-1) + fib(n-2)))))
+
+// For maximal noise, you could add parentheses and trailing colons.
+// But we won't.
+
+// START: all of the next `fib` definitions are exactly the same
+
+define fib(n):
+ match n
+ | 0: 0
+ | 1: 1
+ | n: fib(n-1) + fib(n-2)
+
+define fib(n):
+ match n
+ | 0: 0
+ | 1: 1
+ | n: fib(n-1) + fib(n-2)
+
+define fib(n):
+ match n | 0: 0
+ | 1: 1
+ | n: fib(n-1) + fib(n-2)
+
+define fib(n):
+ match n
+ | 0:
+ 0
+ | 1:
+ 1
+ | n:
+ fib(n-1) + fib(n-2)
+
+define fib(n):
+ match n « | 0: « 0 »
+ | 1: « 1 »
+ | n: « fib(n-1) + fib(n-2) » »
+
+define fib(n):
+ match n « | « 0: 0 » | « 1: 1 » | n: fib(n-1) + fib(n-2) »
+
+define fib(n): match n « | « 0: 0 » | « 1: 1 » | n: fib(n-1) + fib(n-2) »
+
+define fib(n): « match n « | « 0: 0 » | « 1: 1 » | n: fib(n-1) + fib(n-2) » »
+
+define fib(n): « match n « | 0: «0» | 1: «1» | n: « fib(n-1) + fib(n-2) » » »
+
+define fib(n): « match n « | « 0: «0» » | « 1: «1» » | « n: « fib(n-1) + fib(n-2) » » » »
+
+define fib(n): « match n | « 0: «0» » | « 1: «1» » | « n: « fib(n-1) + fib(n-2) » » »
+
+// END equivalent `fib` definitions
+
+define make_adder(n):
+ lambda (m):
+ printf("adding to ~a\n", m)
+ m+n
+
+define analyze(n):
+ if n == 0
+ | printf("zero\n")
+ | printf("other\n")
+ printf("done\n")
+
+define analyze(n):
+ if n == 0
+ | printf("zero\n")
+ printf("done saying zero\n")
+ | printf("other\n")
+ printf("done saying other\n")
+
+struct posn(x, y)
+
+struct color_posn(col):
+ extends posn
+ mutable
+
+struct posn(x mutable,
+ y = 7):
+ methods equality:
+ define equal(a, b):
+ is_posn(b) => (a.x == b.x && a.y == b.y)
+ define hash(a):
+ 17
+ define secondary_hash(a):
+ 19
+
+struct posn(x, y):
+ property prop_equal_and_hash:
+ let (hc = lambda (a: posn, hc):
+ hc(a.x) + hc(a.y),
+ eql = lambda (a: posn, b: posn, eql):
+ eql(a.x, b.x) && eql(a.y, b.y)):
+ values(eql, hc, hc)
+
+struct posn(x, y):
+ property prop_equal_and_hash:
+ let (hc = lambda (a: posn, hc):
+ hc(a.x) + hc(a.y)):
+ (lambda (a: posn, b: posn, eql):
+ eql(a.x, b.x) && eql(a.y, b.y),
+ hc,
+ hc)
+
+// Another possibile approach to syntax for `struct`:
+struct posn:
+ fields:
+ x mutable
+ y = 7
+ methods equality:
+ define equal(a, b):
+ is_posn(b) => (a.x == b.x && a.y == b.y)
+ define hash(a):
+ 17
+ define secondary_hash(a):
+ 19
+ property prop_quality: "high"
+
+define fourth(n :: Integer):
+ define m: n*n
+ define v: m*m
+ printf("~a^4 = ~a\n", n, v)
+ v
+
+define exp(n :: Integer, 'base': base = 2.718281828459045):
+ if (n == 1)
+ | base
+ | base * exp(n-1, 'base': base)
+
+define positive_p(n): if n > 0 | true | false
+
+define go():
+ define helper(n):
+ list(n, n)
+ define more(m):
+ if m == 0 | "done"
+ | more(m - 1)
+ helper(more(9))
+
+define approx(x):
+ match x
+ | something(v):
+ printf("got it\n")
+ v
+ | nothing: 0
+
+// With two `:`s on one line, there's no way to
+// add to the first `:`
+define approx_thunk(x):
+ match x
+ | something(v): lambda (): v
+ | nothing: lambda (): 0
+
+// Enough indentation for `v` means that it continues the
+// implicit second `:`, so the `lambda` body has `v`:
+define approx_thunk(x):
+ match x
+ | something(v): lambda ():
+ v
+ | nothing: lambda (): 0
+
+define approx_thunk(x):
+ match x
+ | something(v): lambda
+ | (): v
+ | (n): v+n
+ | nothing: lambda
+ | (): 0
+ | (n): n
+
+define curried:
+ lambda (x):
+ lambda (y):
+ lambda (z):
+ list(x, y, z)
+
+define curried: lambda (x):
+ lambda (y):
+ lambda (z):
+ list(x, y, z)
+
+define dictionary: dict:
+ foo: 17
+ bar: string
+ baz: true
+
+define colors:
+ list(
+ red,
+ green,
+ blue,
+ orange,
+ )
+
+define f(x_something,
+ y_something_else,
+ z_also_long_name):
+ 5
+
+define sum(l):
+ let loop(l = l):
+ if is_null(l)
+ | 0
+ | first(l) + loop(rest(l))
+
+define show_all(l):
+ for (x = in_list(l)):
+ print(x)
+ newline()
+
+define show_zip(l, l2):
+ for (x = in_list(l),
+ x2 = in_list(l2)):
+ print(x)
+ print_string(" ")
+ print(x2)
+ newline()
+
+define show_combos_not_same(l, l2):
+ for (x = in_list(l)):
+ then (x2 = in_list(l2)):
+ when !is_equal(x, x2):
+ print(x)
+ print_string(" ")
+ print(x2)
+ newline()
+
+define map(f, l):
+ for list (x = in_list(l)):
+ f(x)
+
+define partition(l, pred):
+ for fold (yes = empty,
+ no = empty,
+ result (reverse(yes), reverse(no))):
+ with (x = in_list(l)):
+ if pred(x)
+ | (cons(x, yes), no)
+ | (yes, cons(x, no))
+
+local:
+ with:
+ define x: 1
+ define y: 2
+ in:
+ x+y
diff --git a/shrubbery/indentation.rkt b/shrubbery/indentation.rkt
new file mode 100644
index 000000000..3fd6e047a
--- /dev/null
+++ b/shrubbery/indentation.rkt
@@ -0,0 +1,516 @@
+#lang racket/base
+(require racket/class
+ racket/list
+ "lex.rkt"
+ "private/edit-help.rkt")
+
+;; Conventions:
+;; pos = arbitary position
+;; s, e = range positions
+;; start = start of a line
+;; delta = amount to add to a line due to `\` on preceding lines
+;; col, candidate, limit = virtual column, includes any relevant delta
+;; tab = indentation relative to start, does not include delta
+
+(provide shrubbery-indentation
+ shrubbery-range-indentation
+ shrubbery-paren-matches)
+
+(define NORMAL-INDENT 2)
+(define BAR-INDENT 0)
+
+(define (shrubbery-indentation t pos
+ #:multi? [multi? #f]
+ #:always? [always? multi?])
+ (define start (line-start t pos))
+ (define current-tab (get-current-tab t start))
+ (cond
+ ;; If `always?` is #f, we got here by the Return key;
+ ;; don't indent when just inserting new lines
+ [(and (not always?) (= (line-start t (sub1 pos)) (sub1 pos)))
+ current-tab]
+ [else
+ ;; tabbing only makes sense if the target line is not a continuation
+ ;; or if it continues an empty line
+ (define delta (or (line-delta t start #:unless-empty? #t) 0))
+ (cond
+ [(eqv? delta 0)
+ (define (like-enclosing #:as-bar? [as-bar? #f]
+ #:as-operator? [as-operator? #f]
+ #:also-zero? [also-zero? #f])
+ (indent-like-enclosing-group t start current-tab
+ #:multi? multi?
+ #:as-bar? as-bar?
+ #:as-operator? as-operator?
+ #:also-zero? also-zero?))
+ (case (send t classify-position (+ start current-tab))
+ [(parenthesis)
+ (indent-like-parenthesis t start current-tab)]
+ [(bar-operator)
+ (like-enclosing #:as-bar? #t)]
+ [(comment)
+ (define group-comment? (is-group-comment? t (+ start current-tab)))
+ (like-enclosing #:as-bar? (and group-comment?
+ (bar-after-group-comment? t (+ start current-tab) start))
+ #:also-zero? group-comment?)]
+ [(operator)
+ (like-enclosing #:as-operator? #t)]
+ [(string)
+ ;; check for being inside a string, which might be in `@` content
+ (define-values (s e) (send t get-token-range (+ start current-tab)))
+ (cond
+ [(< s (+ start current-tab))
+ current-tab]
+ [else
+ (like-enclosing)])]
+ [else
+ (like-enclosing)])]
+ [else
+ ;; don't change indentation for a continuation line
+ current-tab])]))
+
+(define (shrubbery-range-indentation t s e)
+ (define s-line (send t position-paragraph s))
+ (define e-line (let ([line (send t position-paragraph e)])
+ (if (and (s . < . e)
+ (= e (send t paragraph-start-position line)))
+ (sub1 line)
+ line)))
+ (cond
+ [(= s-line e-line)
+ ;; use single-line mode
+ (define pos (send t paragraph-start-position s-line))
+ (define amt (shrubbery-indentation t pos #:always? #t))
+ (define current-amt (get-current-tab t pos))
+ (list (list (max 0 (- current-amt amt))
+ (make-string (max 0 (- amt current-amt)) #\space)))]
+ [else
+ ;; compute indentation for the first non-empty line; as long as that
+ ;; involves inserting space or deleting no more space than is
+ ;; available in all lines, shift all the lines the same
+ (define lines (get-non-empty-lines t s-line e-line))
+ (cond
+ [(null? lines) '()]
+ [else
+ (define (line-position line) (send t paragraph-start-position line))
+ (define pos (line-position (car lines)))
+ (define amt-or-multi-amt (shrubbery-indentation t pos #:multi? #t))
+ (define amts (if (list? amt-or-multi-amt)
+ amt-or-multi-amt
+ (list amt-or-multi-amt)))
+ (define current-amt (get-current-tab t pos))
+ (or
+ ;; try each possible shift:
+ (for/or ([amt (in-list amts)])
+ (cond
+ [(current-amt . < . amt)
+ ;; insert in all lines
+ (define ins-str (make-string (- amt current-amt) #\space))
+ (for/list ([line (in-range s-line (add1 e-line))])
+ (list 0 (if (memv line lines) ins-str "")))]
+ [(current-amt . > . amt)
+ (define delta (- current-amt amt))
+ (and (for/and ([line (in-list lines)])
+ (delta . <= . (get-current-tab t (send t paragraph-start-position line))))
+ (for/list ([line (in-range s-line (add1 e-line))])
+ (list (if (memv line lines) delta 0) "")))]
+ [else #f]))
+ ;; no change
+ '())])]))
+
+(define (indent-like-enclosing-group t start current-tab
+ #:as-bar? [as-bar? #f]
+ #:as-operator? [as-operator? #f]
+ #:multi? [multi? #f]
+ #:also-zero? [also-zero? #f])
+ ;; candidates are sorted right (larger tab) to left (smaller tab)
+ (define (add-zero l) (if also-zero? (add-zero-to-end l) l))
+ (define candidates (remove-dups (indentation-candidates t (sub1 start)
+ #:as-bar? as-bar?
+ #:as-operator? as-operator?)))
+ (define delta (line-delta t start))
+ (define tabs (add-zero
+ (for/list ([col (in-list candidates)]
+ #:when (col . >= . delta))
+ (- col delta))))
+ ;; if the current state matches a candidate tab, we'll
+ ;; use the next one (to the left)
+ (define next-tabs (memv current-tab tabs))
+ (cond
+ [multi?
+ (if next-tabs
+ (append next-tabs
+ (take tabs (- (length tabs) (length next-tabs))))
+ tabs)]
+ [(and next-tabs (pair? (cdr next-tabs)))
+ (cadr next-tabs)]
+ [(null? tabs) 0]
+ [else
+ ;; default to rightmost:
+ (car tabs)]))
+
+(define (indent-like-parenthesis t start current-tab)
+ (define-values (s e) (send t get-token-range (+ start current-tab)))
+ (define o-s (send t backward-match e 0))
+ (cond
+ [o-s ; => s is closer
+ (define o-start (line-start t o-s))
+ (define o-delta (line-delta t o-start))
+ (define col (col-of o-s o-start o-delta))
+ ;; line up with indentation position of opener's group
+ (define use-col
+ (or (and (positive? o-s)
+ (let ()
+ (define paren (send t get-text (sub1 e) e))
+ (define next-s (if (equal? paren "}")
+ (skip-redundant-block-operators t (sub1 o-s) o-start)
+ (sub1 o-s)))
+ (get-block-column t next-s col o-start
+ #:for-outdent? #t)))
+ col))
+ (define s-delta (line-delta t start))
+ (if (use-col . > . s-delta)
+ (- use-col s-delta)
+ 0)]
+ [else
+ ;; didn't find match, so treat like other tokens
+ (indent-like-enclosing-group t start current-tab)]))
+
+;; Gets list of candiates with further-right candidates first starting
+;; search with the token that contains `pos` (inclusive on the left
+;; edge of the token, starts before the line that we want to indent).
+;; The search works going backwards to find enclosing groups.
+(define (indentation-candidates t pos
+ #:as-bar? [as-bar? #f]
+ #:as-operator? [as-operator? #f])
+ (let loop ([pos pos]
+ ;; possible target column, depending on what's we
+ ;; find by looking further back; for example, this
+ ;; candidate will be discarded if we find another
+ ;; identifier just before it
+ [candidate #f]
+ ;; filter out possibilities to the right (exclusive)
+ ;; of the limit
+ [limit #f]
+ ;; saw a bar in this group?
+ [bar-after? as-bar?]
+ ;; can also indent by an extra step?
+ [plus-one-more? as-operator?])
+ ;; helper
+ (define (maybe-list col [plus-one-more? #f])
+ (cond
+ [(not col) null]
+ [(or (not limit) (col . <= . limit))
+ (if plus-one-more?
+ (list (+ col NORMAL-INDENT) col)
+ (list col))]
+ [else null]))
+ ;; helper: loops where the candidate also works as a refined limit
+ (define (loop* pos new-candidate plus-one-more?)
+ (loop pos new-candidate (min new-candidate (or limit new-candidate)) #f plus-one-more?))
+ (define (keep s)
+ (define start (line-start t s))
+ (define delta (line-delta t start))
+ (define new-candidate (col-of (if as-bar? (+ s BAR-INDENT) s) start delta))
+ (loop* (sub1 s)
+ ;; don't forget the old candidate if the new candidate would
+ ;; be too deeply indented
+ (if (or (not candidate) (new-candidate . <= . limit))
+ new-candidate
+ candidate)
+ (and plus-one-more?
+ (or (eqv? new-candidate 0)
+ (= start (line-start t (sub1 s)))))))
+ (cond
+ [(eqv? limit -1) null]
+ [(negative? pos) (maybe-list candidate plus-one-more?)]
+ [else
+ (define-values (s e) (send t get-token-range pos))
+ (define category (send t classify-position s))
+ (case category
+ [(white-space comment continue-operator)
+ ;; we don't do anything special with continue-operator here,
+ ;; because we avoid looking at line numbers, anyway, and `line-delta`
+ ;; is responsible for computing continuation columns
+ (loop (sub1 s) candidate limit bar-after? plus-one-more?)]
+ [(block-operator)
+ ;; a block creates an indentation candidate that's
+ ;; to the right of the enclosing group's indentation
+ (define start (line-start t pos))
+ (define delta (line-delta t start))
+ (define block-col (if (zero? s)
+ 0
+ (get-block-column t (sub1 s) (col-of s start delta) start)))
+ ;; redundant operators are not valid indentation points:
+ (define next-s (skip-redundant-block-operators t (sub1 s) start))
+ ;; indentation under the block operator is valid
+ (define next-candidate (col-of (add1 next-s) start delta))
+ ;; a `|` cannot appear just after a `:`, so look before that block
+ (define adj-block-col (if as-bar? (sub1 block-col) block-col))
+ ;; look further outside this block, and don't consider anything
+ ;; that would appear to be nested in the block:
+ (define outer-candidates (loop next-s next-candidate (min* adj-block-col limit) #f #f))
+ (append (cond
+ [(and bar-after? (not as-bar?))
+ null]
+ [candidate
+ ;; we already have something after `:`, so
+ ;; use its indentation
+ (maybe-list candidate plus-one-more?)]
+ [else
+ ;; we haven't found anything in the block, so
+ ;; indent as the first thing in the block
+ (maybe-list (+ block-col (if as-bar? BAR-INDENT NORMAL-INDENT)))])
+ outer-candidates)]
+ [else
+ (cond
+ [(and candidate
+ (eqv? candidate (if as-bar? BAR-INDENT 0))
+ (or (not limit) (limit . >= . (if as-bar? BAR-INDENT 0))))
+ ;; already found minimal column, so stop here
+ (maybe-list candidate plus-one-more?)]
+ [else
+ (case category
+ [(parenthesis)
+ (define paren (send t get-text (sub1 e) e))
+ (cond
+ [(opener? paren)
+ ;; we're inside parentheses, brackets, etc.
+ (cond
+ [(and bar-after? (not as-bar?))
+ ;; opener followed by a bar: no more candidates
+ null]
+ [candidate (maybe-list candidate plus-one-more?)]
+ [(zero? s) (maybe-list NORMAL-INDENT)]
+ [else
+ (define start (line-start t pos))
+ (define delta (line-delta t start))
+ ;; first position within parens/braces/brackets; if
+ ;; indentation for the bracket's group is before the bracket,
+ ;; then "outdent" that far
+ (define col (col-of s start delta))
+ (define next-s (if (equal? paren "{")
+ ;; outdent past redundant operators:
+ (skip-redundant-block-operators t (sub1 s) start)
+ (sub1 s)))
+ (define block-col (get-block-column t next-s col start
+ #:for-outdent? #t))
+ (if (and block-col (block-col . < . col))
+ (maybe-list (+ block-col NORMAL-INDENT))
+ (maybe-list (+ col NORMAL-INDENT)))])]
+ [else
+ ;; found parenthesized while walking backward
+ (define r (send t backward-match e 0))
+ (cond
+ [(not r)
+ ;; matching open not found
+ (cond
+ [(zero? s) (maybe-list candidate)]
+ [else (keep s)])]
+ [(zero? r) null]
+ [else (keep r)])])]
+ [(bar-operator)
+ (define start (line-start t pos))
+ ;; look back to see whether there's another bar on the
+ ;; same line:
+ (define-values (another-bar-start limit-pos)
+ (find-bar-same-line t s start as-bar?))
+ (cond
+ [(or (and another-bar-start as-bar?)
+ (next-is-block-bracket? t e))
+ ;; don't treat the current bar as a source
+ ;; of indentation:
+ (loop (sub1 s) #f (min* s limit) #t #f)]
+ [as-bar?
+ (define delta (line-delta t start))
+ (define col (col-of s start delta))
+ (cond
+ [(not limit-pos) (maybe-list col)]
+ [else
+ ;; within the current group but outside the found bar,
+ ;; a new bar can only line up with outer candidates
+ ;; beyond the found bar
+ (define b-start (line-start t limit-pos))
+ (define b-delta (line-delta t b-start))
+ (define b-col (col-of limit-pos b-start b-delta))
+ (append (maybe-list col)
+ ;; outer candidates:
+ (loop (sub1 limit-pos)
+ b-col
+ (min* (min col (sub1 b-col)) limit)
+ #t
+ #f))])]
+ [else
+ ;; line up within bar or outside [another] bar
+ (define bar-column (col-of s start (line-delta t start)))
+ (append (maybe-list (or candidate (+ bar-column NORMAL-INDENT)) (and candidate plus-one-more?))
+ (if (not limit-pos)
+ null
+ ;; outer candidates
+ (loop (sub1 (or another-bar-start s)) #f (min* bar-column limit) #t #f)))])]
+ [(separator)
+ (cond
+ [(equal? ";" (send t get-text (sub1 e) e))
+ (loop (sub1 s) candidate limit bar-after? #f)]
+ [candidate (maybe-list candidate plus-one-more?)]
+ [else
+ (define i-pos (get-inside-start t pos))
+ (define start (line-start t i-pos))
+ (define delta (line-delta t start))
+ (maybe-list (col-of i-pos start delta) plus-one-more?)])]
+ [else
+ (keep s)])])])])))
+
+;; Returns:
+;; * the position of a bar before `orig-pos` with the same line
+;; start as reflected by `at-start` (but taking continuing lines
+;; into account), or #f if there's not one
+;; * the position where the bar's enclosing block starts, but only
+;; reliably if `as-bar?` and the first result is #f, or #f (all modes)
+;; to mean that not outer group is possible
+(define (find-bar-same-line t orig-pos at-start as-bar?)
+ (let loop ([pos (sub1 orig-pos)] [at-start at-start] [limit-pos (sub1 orig-pos)])
+ (cond
+ [(negative? pos) (values #f limit-pos)]
+ [else
+ (define-values (s e) (send t get-token-range pos))
+ (define category (send t classify-position s))
+ (case category
+ [(white-space comment)
+ (loop (sub1 s) at-start limit-pos)]
+ [(continue-operator)
+ (loop (sub1 s) (line-start t s) limit-pos)]
+ [else
+ (define pos-start (line-start t pos))
+ (cond
+ [(pos-start . < . at-start) (values #f limit-pos)]
+ [else
+ (case category
+ [(parenthesis)
+ (cond
+ [(opener? (send t get-text (sub1 e) e))
+ (values #f #f)]
+ [else
+ ;; Found parenthesized while walking backward
+ (define r (send t backward-match e 0))
+ (cond
+ [(not r) (loop (sub1 s) at-start s)]
+ [else (loop (sub1 r) (line-start t r) r)])])]
+ [(bar-operator)
+ (cond
+ [(not as-bar?)
+ ;; keep looking back
+ (define-values (another-bar-pos limit-pos) (loop (sub1 s) at-start 0))
+ (values (or another-bar-pos s) limit-pos)]
+ [else
+ ;; don't need to find limit-pos
+ (values s 0)])]
+ [else (loop (sub1 s) at-start s)])])])])))
+
+;; Skips back to an unmatched opener and returns the
+;; position of the first thing after it (not counting
+;; whitespace or comments)
+(define (get-inside-start t pos)
+ (let loop ([pos pos] [last-pos #f])
+ (cond
+ [(pos . <= . 0) 0]
+ [else
+ (define-values (s e) (send t get-token-range pos))
+ (define category (send t classify-position s))
+ (case category
+ [(white-space comment continue-operator)
+ (loop (sub1 s) last-pos)]
+ [(parenthesis)
+ (cond
+ [(opener? (send t get-text (sub1 e) e))
+ (or last-pos e)]
+ [else
+ ;; Found parenthesized while walking backward
+ (define r (send t backward-match e 0))
+ (loop (sub1 (or r s)) (or r s))])]
+ [else (loop (sub1 s) s)])])))
+
+;; block operator just at `pos`+1
+(define (skip-redundant-block-operators t pos at-start)
+ ;; no block operators are redundant, anymore
+ pos)
+
+(define (is-group-comment? t pos)
+ (define-values (s e) (send t get-token-range pos))
+ (and (= e (+ s 3))
+ (equal? "#//" (send t get-text s e))))
+
+(define (bar-after-group-comment? t pos at-start)
+ (let loop ([pos pos])
+ (cond
+ [(= pos (send t last-position)) #f]
+ [else
+ (define-values (s e) (send t get-token-range pos))
+ (define category (send t classify-position s))
+ (case category
+ [(white-space comment) (loop e)]
+ [(bar-operator) (> (line-start t pos) at-start)]
+ [else #f])])))
+
+(define (next-is-block-bracket? t pos)
+ (let loop ([pos pos])
+ (cond
+ [(= pos (send t last-position)) #f]
+ [else
+ (define-values (s e) (send t get-token-range pos))
+ (define category (send t classify-position s))
+ (case category
+ [(white-space comment) (loop e)]
+ [(parenthesis) (equal? (send t get-text s e) "«")]
+ [else #f])])))
+
+(define (get-non-empty-lines t s-line e-line)
+ (let loop ([line s-line])
+ (cond
+ [(line . > . e-line) '()]
+ [(only-whitespace-between? t
+ (send t paragraph-start-position line)
+ (send t paragraph-end-position line))
+ (loop (add1 line))]
+ [else (cons line (loop (add1 line)))])))
+
+;; determine current indentation starting with `start`
+(define (get-current-tab t start)
+ (define e (send t last-position))
+ (let loop ([pos start])
+ (cond
+ [(= pos e) 0]
+ [else
+ (define str (send t get-text pos (add1 pos)))
+ (cond
+ [(whitespace? str)
+ (+ 1 (loop (add1 pos)))]
+ [else 0])])))
+
+(define (whitespace? str)
+ (and (= (string-length str) 1)
+ (char-whitespace? (string-ref str 0))
+ (not (equal? str "\n"))))
+
+(define (remove-dups l)
+ (cond
+ [(null? l) null]
+ [(null? (cdr l)) l]
+ [(eqv? (car l) (cadr l)) (remove-dups (cdr l))]
+ [else (cons (car l) (remove-dups (cdr l)))]))
+
+(define (min* a b)
+ (min a (or b a)))
+
+(define (add-zero-to-end l)
+ (let loop ([l l])
+ (cond
+ [(null? l) (list 0)]
+ [(eqv? (car l) 0) l]
+ [else (cons (car l) (loop (cdr l)))])))
+
+(define shrubbery-paren-matches
+ '((|(| |)|)
+ (|[| |]|)
+ (|{| |}|)
+ (« »)))
diff --git a/shrubbery/interp.shrb b/shrubbery/interp.shrb
new file mode 100644
index 000000000..b99032457
--- /dev/null
+++ b/shrubbery/interp.shrb
@@ -0,0 +1,119 @@
+/* An interpreter from Utah CS 3520, originally witten in Plait,
+ just to see how thins work out. The curly-brace S-expressions
+ may not make sense. Given the usefulness of S-expressions
+ in the class, it's not clear that departing from S-expressions
+ would be a good idea there, but it seemed like an interesting
+ realistic program to try */
+
+type Value
+ | numV(n :: Number)
+ | closV(arg :: Symbol,
+ body :: Exp,
+ env :: Env)
+
+type Exp
+ | numE(n :: Number)
+ | idE(s :: Symbol)
+ | plusE(l :: Exp,
+ r :: Exp)
+ | multE(l :: Exp,
+ r :: Exp)
+ | letE(n :: Symbol,
+ rhs :: Exp,
+ body :: Exp)
+ | lamE(n :: Symbol,
+ body :: Exp)
+ | appE(fun :: Exp,
+ arg :: Exp)
+
+type Binding
+ | bind(name :: Symbol,
+ val :: Value)
+
+type_alias Env :: Listof(Binding)
+
+define mt_env :: empty
+
+define extend_env :: cons
+
+modulet test:
+ print_only_errors(#true)
+
+// interp ----------------------------------------
+define interp(a :: Exp, env :: Env) :: Value:
+ typecase Exp a
+ | numE(n): numV(n)
+ | idE(s): lookup(s, env)
+ | plusE(l, r): num_plus(interp(l, env), interp(r, env))
+ | multE(l, r): num_times(interp(l, env), interp(r, env))
+ | letE(n, rhs, body): interp(body,
+ extend_env(bind(n, interp(rhs, env)),
+ env))
+ | lamE(n, body): closV(n, body, env)
+ | appE(fun, arg): typecase Value interp(fun, env)
+ | closV(n, body, c_env):
+ interp(body,
+ extend_env(bind(n, interp(arg, env)),
+ c_env))
+ | error('interp, "not a function")
+modulet test:
+ test:
+ interp(parse(`2), mt_env)
+ numV(2)
+
+ test { interp(parse(`2), mt_env) ; numV(2) }
+ test { interp(parse(`3), mt_env) ; numV(3) }
+
+ test_exn:
+ interp(parse(`x), mt_env)
+ "free variable"
+
+ test:
+ interp(parse(`x),
+ extend_env(bind('x, numV(9)), mt_env))
+ numV(9)
+
+ test:
+ interp(parse(`{+ 2 1}), mt_env)
+ numV(3)
+ test:
+ interp(parse(`{* 2 1}), mt_env)
+ numV(2)
+
+ test:
+ interp(parse(`{+ {* 2 3} {+ 5 8}}), mt_env)
+ numV(1)
+
+ test:
+ interp(parse(`{lambda {x} {+ x x}}),
+ mt_env)
+ closV('x, plusE(idE('x), idE('x)), mt_env)
+
+ test:
+ interp(parse(`{let {[x 5]}
+ {+ x x}}),
+ mt_env)
+ numV(1)
+
+// num+ and num* ----------------------------------------
+define num_op(op :: Number Number :: Number, l :: Value, r :: Value) :: Value:
+ cond
+ | is_numV(l) && is_numV(r):
+ numV(op(numV_n(l), numV_n(r)))
+ | else:
+ error('interp, "not a number")
+
+define num_plus(l :: Value, r :: Value) :: Value:
+ num_op(op +, l, r)
+
+define num_mult(l :: Value, r :: Value) :: Value:
+ num_op(op *, l, r)
+
+modulet test:
+ test:
+ num_plus(numV(1), numV(2))
+ numV(3)
+
+ test:
+ num_mult(numV(2), numV(3))
+ numV(6)
diff --git a/shrubbery/lex.rkt b/shrubbery/lex.rkt
new file mode 100644
index 000000000..71d5f6fa0
--- /dev/null
+++ b/shrubbery/lex.rkt
@@ -0,0 +1,760 @@
+#lang racket/base
+
+(require parser-tools/lex
+ racket/contract
+ (prefix-in : parser-tools/lex-sre)
+ "private/property.rkt")
+
+(provide lex/status
+ lex-all
+
+ token-name
+ ;; 'identifier
+ ;; 'literal
+ ;; 'comment
+ ;; 'whitespace
+ ;;
+ ;; 'operator
+ ;; 'block-operator
+ ;; 'continue-operator
+ ;; 'bar-operator
+ ;;
+ ;; 'opener
+ ;; 'closer
+ ;; 'comma-operator
+ ;; 'semicolon-operator
+ ;;
+ ;; 's-exp
+ ;;
+ ;; 'EOF
+ ;;
+ ;; 'fail
+
+ token?
+ token-value
+ token-e
+ token-line
+ token-column
+ token-srcloc
+ token-rename
+
+ syntax->token
+ stx-for-original-property
+
+ current-lexer-source)
+
+(define-lex-abbrevs
+
+ ;; For case insensitivity
+ [e (char-set "eE")]
+
+ [digit (:/ "0" "9")]
+ [digit_ (:or digit (:: digit "_"))]
+ [digit16 (:/ "af" "AF" "09")]
+ [digit16_ (:or digit16 (:: digit16 "_"))]
+ [digit8 (:/ "0" "7")]
+
+ [langchar (:or (:/ "az" "AZ" "09") "+" "-" "_")]
+
+ ;; does not constrain to avoid surrogates:
+ [unicode (:or (:: "u" (:** 1 4 digit16))
+ (:: "U" (:** 1 6 digit16)))]
+
+ [str (:: "\"" (:* string-element ) "\"")]
+
+ [string-element (:or (:~ "\"" "\\")
+ (:: "\\" unicode)
+ string-escape)]
+
+ [byte-str (:: "#\"" (:* byte-string-element) "\"")]
+ [byte-string-element (:or (:- (:/ "\x00" "\xFF") "\"" "\\")
+ string-escape)]
+ [string-escape (:or "\\\""
+ "\\\\"
+ "\\a"
+ "\\b"
+ "\\t"
+ "\\n"
+ "\\v"
+ "\\f"
+ "\\r"
+ "\\e"
+ "\\'"
+ (:: "\\" (:** 1 3 digit8))
+ (:: "\\x" (:** 1 2 digit16))
+ (:: "\\" #\newline))]
+
+ [bad-str (:: (:? "#") "\""
+ (:* (:~ "\"" "\\")
+ (:: "\\" any-char))
+ (:? "\\" "\""))]
+
+ [boolean (:or "#true" "#false")]
+
+ [special-number (:: "#"
+ (:or "#inf"
+ "#neginf"
+ "#nan"))]
+
+ [bad-hash (:- (:or (:: "#" (:* non-delims))
+ "#/")
+ boolean
+ special-number)]
+
+ [exponent-marker e]
+ [sign (char-set "+-")]
+
+ [script (:: "#!" (:or #\space #\/) (:* (:~ #\newline) (:: #\\ #\newline)))]
+
+ [identifier (:: (:or alphabetic "_")
+ (:* (:or alphabetic numeric "_")))]
+ [opchar (:or (:- symbolic (:or "~"))
+ (:- punctuation (:or "," ";" "#" "\\" "_" "@" "\""
+ "(" ")" "[" "]" "{" "}" "«" "»")))]
+ [operator (:- (:or opchar
+ (:: (:* opchar) (:- opchar "+" "-" "." "/"))
+ (:+ ".")
+ (:+ "+")
+ (:+ "-"))
+ "|" ":"
+ (:: (:* any-char) (:or "//" "/*") (:* any-char)))]
+
+ [keyword (:: "~" identifier)]
+ [bad-keyword (:: "~")]
+
+ ;; disallows a number that starts +, -, or "."
+ [number/continuing (:or decimal-number/continuing
+ hex-number)]
+ [number (:: (:? sign)
+ (:or decimal-number
+ hex-number))]
+
+ [uinteger (:: (:* digit_) digit)]
+ [uinteger16 (:: (:* digit16_) digit16)]
+
+ [decimal-number/continuing (:or (:: uinteger number-exponent)
+ (:: uinteger "." (:? uinteger) number-exponent))]
+ [decimal-number (:or decimal-number/continuing
+ (:: "." uinteger number-exponent))]
+ [number-exponent (:or "" (:: exponent-marker (:? sign) uinteger))]
+ [hex-number (:: "0x" uinteger16)]
+
+ [bad-number/continuing (:- (:: digit (:+ non-number-delims))
+ identifier
+ number/continuing)]
+ [bad-number (:- (:: (:? sign) digit (:+ non-number-delims))
+ identifier
+ number)]
+ [bad-comment "*/"]
+
+ [non-number-delims (:or non-delims ".")]
+ [non-delims (:or alphabetic numeric "_")])
+
+(define (ret name lexeme #:raw [raw #f] type paren start-pos end-pos status)
+ (values (make-token name lexeme start-pos end-pos raw)
+ type paren (position-offset start-pos) (position-offset end-pos) status))
+
+(define stx-for-original-property (read-syntax #f (open-input-string "original")))
+(define current-lexer-source (make-parameter "input"))
+
+(define (make-token name e start-pos end-pos [raw #f])
+ (define offset (position-offset start-pos))
+ (define loc (vector (current-lexer-source)
+ (position-line start-pos)
+ (position-col start-pos)
+ offset
+ (- (position-offset end-pos)
+ offset)))
+ (token name (let loop ([e e] [raw raw])
+ (let ([e (if (pair? e)
+ (let p-loop ([e e])
+ (cond
+ [(null? (cdr e)) (list (loop (car e) raw))]
+ [else (cons (loop (car e) #f)
+ (p-loop (cdr e)))]))
+ e)]
+ [raw (if (pair? e) #f raw)])
+ (define stx (datum->syntax #f
+ e
+ loc
+ stx-for-original-property))
+ (if (eq? name 'comment)
+ stx
+ (syntax-raw-property stx (or raw (if (string? e) e '()))))))))
+
+(define (read-line-comment name lexeme input-port start-pos
+ #:status [status 'initial]
+ #:consume-newline? [consume-newline? #f])
+ (let ([comment (apply string (append (string->list lexeme) (read-line/skip-over-specials input-port
+ consume-newline?)))])
+ (define-values (end-line end-col end-offset) (port-next-location input-port))
+ (values (make-token name comment start-pos (position end-offset end-line end-col))
+ 'comment #f
+ (position-offset start-pos)
+ end-offset
+ status)))
+
+(define get-next-comment
+ (lexer
+ ["/*" (values 1 end-pos lexeme)]
+ ["*/" (values -1 end-pos lexeme)]
+ [(:or "/" "*" (:* (:~ "*" "/")))
+ (let-values ([(delta end-pos rest-lexeme) (get-next-comment input-port)])
+ (values delta end-pos (string-append lexeme rest-lexeme)))]
+ [(eof) (values 'eof end-pos "")]
+ [(special)
+ (get-next-comment input-port)]
+ [(special-comment)
+ (get-next-comment input-port)]))
+
+(define (read-nested-comment num-opens start-pos lexeme input)
+ (define-values (diff end next-lexeme) (get-next-comment input))
+ (cond
+ [(eq? 'eof diff) (ret 'fail eof 'error #f start-pos end 'initial)]
+ [else
+ (define all-lexeme (string-append lexeme next-lexeme))
+ (define next-num-opens (+ diff num-opens))
+ (cond
+ [(= 0 next-num-opens) (ret 'comment all-lexeme 'comment #f start-pos end 'initial)]
+ [else (read-nested-comment next-num-opens start-pos all-lexeme input)])]))
+
+(define (get-offset i)
+ (let-values (((x y offset) (port-next-location i)))
+ offset))
+
+(define (read-line/skip-over-specials i consume-newline?)
+ (let loop ()
+ (define next (peek-char-or-special i))
+ (cond
+ [(eq? next #\newline)
+ (cond
+ [consume-newline?
+ (read-char-or-special i)
+ (list #\newline)]
+ [else null])]
+ [(eof-object? next)
+ null]
+ [else
+ (read-char-or-special i)
+ (if (char? next)
+ (cons next (loop))
+ (loop))])))
+
+(struct s-exp-mode (depth) #:prefab)
+(struct in-at (mode opener shrubbery-status openers) #:prefab)
+(struct in-escaped (shrubbery-status at-status) #:prefab)
+
+(define (lex/status in pos status racket-lexer/status)
+ (let-values ([(tok type paren start end status)
+ (let loop ([status status])
+ (cond
+ [(s-exp-mode? status)
+ ;; within `#{}`
+ (unless racket-lexer/status
+ (error "shouldn't be in S-expression mode without a Racket lexer"))
+ (define depth (s-exp-mode-depth status))
+ (cond
+ [(and (zero? depth)
+ (eqv? #\} (peek-char in)))
+ ;; go out of S-expression mode by using shrubbery lexer again
+ (shrubbery-lexer/status in)]
+ [else
+ (define-values (tok type paren start end s-exp-status)
+ (racket-lexer/status in))
+ (values tok type paren start end (case s-exp-status
+ [(open)
+ (s-exp-mode (add1 depth))]
+ [(close)
+ (s-exp-mode (sub1 depth))]
+ [else status]))])]
+ [(in-at? status)
+ ;; within an `@` sequence
+ (at-lexer in status (lambda (status) (loop status)))]
+ [(in-escaped? status)
+ (define-values (t type paren start end sub-status)
+ (loop (in-escaped-shrubbery-status status)))
+ (values t type paren start end (struct-copy in-escaped status
+ [shrubbery-status sub-status]))]
+ [(eq? status 'continuing)
+ ;; normal mode, after a form
+ (shrubbery-lexer-continuing/status in)]
+ [else
+ ;; normal mode, at start or after an operator or whitespace
+ (shrubbery-lexer/status in)]))])
+ (cond
+ [(and (eq? (token-name tok) 'at-content)
+ (eqv? 0 (string-length (token-e tok))))
+ ;; a syntax coloring lexer must not return a token that
+ ;; consumes no characters, so just drop it by recurring
+ (lex/status in pos status racket-lexer/status)]
+ [else
+ (define backup (cond
+ ;; If we have "@/{" and we add a "/" after the existing one,
+ ;; we'll need to back up more:
+ [(eq? (token-name tok) 'at-opener) 1]
+ [(and (in-at? status) (eq? (token-name tok) 'operator)) 2]
+ [else 0]))
+ (values tok type paren start end backup status)])))
+
+(define-syntax-rule (make-lexer/status number bad-number)
+ (lexer
+ [(:+ whitespace)
+ (ret 'whitespace lexeme 'white-space #f start-pos end-pos 'initial)]
+ [str (ret 'literal (parse-string lexeme) #:raw lexeme 'string #f start-pos end-pos 'datum)]
+ [byte-str (ret 'literal (parse-byte-string lexeme) #:raw lexeme 'string #f start-pos end-pos 'datum)]
+ [bad-number
+ (ret 'fail lexeme 'error #f start-pos end-pos 'continuing)]
+ [number
+ (ret 'literal (parse-number lexeme) #:raw lexeme 'constant #f start-pos end-pos 'continuing)]
+ [special-number
+ (let ([num (case lexeme
+ [("#inf") +inf.0]
+ [("#neginf") -inf.0]
+ [("#nan") +nan.0])])
+ (ret 'literal num #:raw lexeme 'constant #f start-pos end-pos 'continuing))]
+ [boolean
+ (ret 'literal (equal? lexeme "#true") #:raw lexeme 'constant #f start-pos end-pos 'continuing)]
+ ["//" (read-line-comment 'comment lexeme input-port start-pos)]
+ ["/*" (read-nested-comment 1 start-pos lexeme input-port)]
+ ["#//"
+ (ret 'group-comment lexeme 'comment #f start-pos end-pos 'initial)]
+ [(:: (:or "#lang " "#!")
+ (:or langchar
+ (:: langchar (:* (:or langchar "/")) langchar)))
+ (ret 'comment lexeme 'other #f start-pos end-pos 'initial)]
+ [(:: (:or "#lang " "#!") (:* (:& any-char (complement whitespace))))
+ (ret 'fail lexeme 'error #f start-pos end-pos 'initial)]
+ [script
+ (ret 'comment lexeme 'comment #f start-pos end-pos 'initial)]
+ [(:or "(" "[" "{" "«")
+ (ret 'opener lexeme 'parenthesis (string->symbol lexeme) start-pos end-pos 'initial)]
+ [(:or ")" "]" "}" "»")
+ (ret 'closer lexeme 'parenthesis (string->symbol lexeme) start-pos end-pos 'continuing)]
+ ["#{"
+ (ret 's-exp lexeme 'parenthesis '|{| start-pos end-pos (s-exp-mode 0))]
+ [":"
+ (ret 'block-operator lexeme 'block-operator #f start-pos end-pos 'initial)]
+ ["|"
+ (ret 'bar-operator lexeme 'bar-operator #f start-pos end-pos 'initial)]
+ ["\\"
+ (ret 'continue-operator lexeme 'continue-operator #f start-pos end-pos 'initial)]
+ [","
+ (ret 'comma-operator lexeme 'separator #f start-pos end-pos 'initial)]
+ [";"
+ (ret 'semicolon-operator lexeme 'separator #f start-pos end-pos 'initial)]
+ [identifier
+ (ret 'identifier (string->symbol lexeme) #:raw lexeme 'symbol #f start-pos end-pos 'continuing)]
+ [operator
+ (ret 'operator (list 'op (string->symbol lexeme)) #:raw lexeme 'operator #f start-pos end-pos 'initial)]
+ [keyword
+ (let ([kw (string->keyword (substring lexeme 1))])
+ (ret 'identifier kw #:raw lexeme 'keyword #f start-pos end-pos 'continuing))]
+ ["@//"
+ (let ([opener (peek-at-opener input-port)])
+ (if opener
+ (ret 'at-comment lexeme 'at-comment (string->symbol lexeme) start-pos end-pos (in-at 'open opener 'initial '()))
+ (read-line-comment 'at-comment lexeme input-port start-pos)))]
+ ["@"
+ (let ([opener (peek-at-opener input-port)])
+ (define mode (if opener 'open 'initial))
+ (ret 'at lexeme 'at (string->symbol lexeme) start-pos end-pos (in-at mode opener 'initial '())))]
+ [(special)
+ (cond
+ [(or (number? lexeme) (boolean? lexeme))
+ (ret 'literal lexeme 'constant #f start-pos end-pos 'continuing)]
+ [(string? lexeme)
+ (ret 'literal lexeme 'string #f start-pos end-pos 'continuing)]
+ [(keyword? lexeme)
+ (ret 'literal lexeme 'hash-colon-keyword #f start-pos end-pos 'continuing)]
+ [else
+ (ret 'literal lexeme 'no-color #f start-pos end-pos 'continuing)])]
+ [(special-comment)
+ (ret 'comment "" 'comment #f start-pos end-pos 'initial)]
+ [(eof) (ret-eof start-pos end-pos)]
+ [(:or bad-str bad-keyword bad-hash bad-comment)
+ (ret 'fail lexeme 'error #f start-pos end-pos 'bad)]
+ [any-char (extend-error lexeme start-pos end-pos input-port)]))
+
+(define (ret-eof start-pos end-pos)
+ (values (make-token 'EOF eof start-pos end-pos) 'eof #f #f #f #f))
+
+(define shrubbery-lexer/status (make-lexer/status number bad-number))
+(define shrubbery-lexer-continuing/status (make-lexer/status number/continuing bad-number/continuing))
+
+;; after reading `@`, we enter an at-exp state machine for whether
+;; we're in the initial part, within `[]`, or within `{}`; we have to
+;; perform some parsing here to balance openers and closers; we leave
+;; wehite trimming to the parser layer
+(define (at-lexer in status recur)
+ (define in-mode (in-at-mode status))
+ (define (get-expected opener/ch ch/closer)
+ (define (get-all-expected s)
+ (for ([ch (in-string s)])
+ (unless (eqv? ch (read-char in))
+ (error "inconsistent input" ch))))
+ (define start-pos (next-location-as-pos in))
+ (define eof?
+ (cond
+ [(string? opener/ch)
+ (get-all-expected opener/ch)
+ (unless (eqv? ch/closer (read-char in))
+ (error "inconsistent opener input" ch/closer))
+ #f]
+ [else
+ (define ch (read-char in))
+ (cond
+ [(eof-object? ch) #t]
+ [else
+ (unless (eqv? opener/ch ch)
+ (error "inconsistent closer input" opener/ch))
+ (get-all-expected ch/closer)
+ #f])]))
+ (define end-pos (next-location-as-pos in))
+ (values start-pos end-pos eof?))
+ (case in-mode
+ ;; 'initial mode is right after `@` without immediate `{`, and we
+ ;; may transition from 'initial mode to 'brackets mode at `[`
+ [(initial brackets)
+ ;; recur to parse in shrubbery mode:
+ (define-values (t type paren start end sub-status)
+ (recur (in-at-shrubbery-status status)))
+ ;; to keep the term and possibly exit 'initial or 'brackets mode:
+ (define (ok status)
+ (values t type paren start end (cond
+ [(and (not (s-exp-mode? sub-status))
+ (null? (in-at-openers status)))
+ ;; either `{`, `[`, or back to shrubbery mode
+ (define opener (peek-at-opener in))
+ (cond
+ [opener
+ (in-at 'open opener sub-status '())]
+ [(and (not (eq? in-mode 'brackets))
+ (eqv? #\[ (peek-char in)))
+ (in-at 'brackets #f sub-status '())]
+ [(in-escaped? sub-status)
+ (in-escaped-at-status sub-status)]
+ [else sub-status])]
+ [else
+ ;; continue in-at mode
+ status])))
+ ;; converts a token to an error token:
+ (define (error status)
+ (values (struct-copy token t [name 'fail]) 'error #f start end status))
+ ;; update the shrubbery-level status, then keep the term or error,
+ ;; trackig nesting depth through the status as we continue:
+ (let ([status (struct-copy in-at status
+ [shrubbery-status sub-status])])
+ (case (token-name t)
+ [(opener) (ok (struct-copy in-at status
+ [openers (cons (token-e t)
+ (in-at-openers status))]))]
+ [(closer)
+ (cond
+ [(and (pair? (in-at-openers status))
+ (closer-for? (token-e t) (car (in-at-openers status))))
+ (ok (struct-copy in-at status
+ [openers (cdr (in-at-openers status))]))]
+ [else
+ (error status)])]
+ [else (ok status)]))]
+ ;; 'open mode is right `@` when the next character is `{`
+ [(open)
+ (define opener (in-at-opener status))
+ (define-values (start-pos end-pos eof?) (get-expected opener #\{))
+ (ret 'at-opener (string-append opener "{") 'parenthesis '|{| start-pos end-pos
+ (struct-copy in-at status [mode 'inside] [openers 0]))]
+ ;; 'inside mode means in `{}` and not currently escaped, and we
+ ;; transition to 'escape mode on a `@`, and we transition to 'close mode
+ ;; on a `}` that is not balancing a `{` within `{}`
+ [(inside)
+ (define opener (in-at-opener status))
+ (define start-pos (next-location-as-pos in))
+ (define o (open-output-string))
+ (let loop ([depth (in-at-openers status)])
+ (define ch (peek-char in))
+ (cond
+ [(eqv? ch #\newline)
+ ;; convert a newline into a separate string input
+ (define s (get-output-string o))
+ (cond
+ [(= 0 (string-length s))
+ (read-char in)
+ (define end-pos (next-location-as-pos in))
+ (ret 'at-content "\n" 'string #f start-pos end-pos
+ (struct-copy in-at status [mode 'inside] [openers depth]))]
+ [else
+ (define end-pos (next-location-as-pos in))
+ (ret 'at-content s 'string #f start-pos end-pos
+ (struct-copy in-at status [mode 'inside] [openers depth]))])]
+ [(or (eof-object? ch)
+ (peek-at-closer in #:opener opener))
+ (cond
+ [(zero? depth)
+ ;; `lex/status` will handle the case that the content is empty
+ (define end-pos (next-location-as-pos in))
+ (ret 'at-content (get-output-string o) 'string #f start-pos end-pos
+ (struct-copy in-at status [mode 'close]))]
+ [else
+ (if (equal? opener "")
+ (write-char (read-char in) o)
+ (write-string (read-string (add1 (bytes-length opener)) in) o))
+ (loop (sub1 depth))])]
+ [(peek-at-prefixed #\@ in #:opener opener)
+ ;; `lex/status` will handle the case that the content is empty
+ (define end-pos (next-location-as-pos in))
+ (ret 'at-content (get-output-string o) 'string #f start-pos end-pos
+ (struct-copy in-at status [mode 'escape] [openers depth]))]
+ [(peek-at-opener in #:opener opener)
+ (if (equal? opener "")
+ (write-char (read-char in) o)
+ (write-string (read-string (add1 (bytes-length opener)) in) o))
+ (loop (add1 depth))]
+ [else
+ (write-char (read-char in) o)
+ (loop depth)]))]
+ ;; 'escape mode means in `{}`, not currently escaped, and expect `@` next
+ [(escape)
+ (define opener (in-at-opener status))
+ (define-values (start-pos end-pos eof?) (get-expected opener #\@))
+ (cond
+ [(read-at-comment in)
+ => (lambda (slashes)
+ (cond
+ [(peek-at-opener in)
+ => (lambda (opener)
+ ;; block comment
+ (define end-pos (next-location-as-pos in))
+ (ret 'at-comment (string-append opener "@" slashes) 'at-comment #f start-pos end-pos
+ (in-at 'open opener (in-escaped 'initial (struct-copy in-at status [mode 'inside])) '())))]
+ [else
+ ;; line comment
+ (read-line-comment 'comment (string-append opener "@" slashes) in start-pos
+ #:status (struct-copy in-at status [mode 'inside])
+ #:consume-newline? #t)]))]
+ [else
+ (define next-opener (peek-at-opener in))
+ (define mode (if next-opener 'open 'initial))
+ (ret 'at (string-append opener "@") 'at #f start-pos end-pos
+ (in-at mode next-opener (in-escaped 'initial (struct-copy in-at status [mode 'inside])) '()))])]
+ ;; 'close mode handles the final `}` of a `{}`
+ [(close)
+ (define closer (at-opener->closer (in-at-opener status)))
+ (define-values (start-pos end-pos eof?) (get-expected #\} closer))
+ (cond
+ [eof? (ret-eof start-pos end-pos)]
+ [else
+ (define sub-status (in-at-shrubbery-status status))
+ (ret 'at-closer (string-append "}" closer) 'parenthesis '|}| start-pos end-pos
+ (if (in-escaped? sub-status)
+ (in-escaped-at-status sub-status)
+ sub-status))])]
+ [else (error "unknown at-exp state")]))
+
+(define (peek-at-opener in #:opener [opener #f])
+ (cond
+ [opener
+ ;; look for another instance of the current opener
+ (peek-at-prefixed #\{ in #:opener opener)]
+ [else
+ ;; look for a fresh opener
+ (define ch (peek-char in))
+ (cond
+ [(eqv? ch #\{) ""]
+ [(eqv? ch #\|)
+ (let loop ([chars '(#\|)] [offset 1])
+ (define ch (peek-char in offset))
+ (cond
+ [(eqv? ch #\{) (list->string (reverse chars))]
+ [(and ((char->integer ch) . < . 128)
+ (or (char-symbolic? ch)
+ (char-punctuation? ch)))
+ (loop (cons ch chars) (add1 offset))]
+ [else #f]))]
+ [else #f])]))
+
+(define (peek-at-prefixed ch in #:opener opener)
+ (let loop ([offset 0])
+ (cond
+ [(= offset (string-length opener))
+ (if (eqv? ch (peek-char in offset))
+ opener
+ #f)]
+ [(eqv? (peek-char in offset) (string-ref opener offset))
+ (loop (add1 offset))]
+ [else #f])))
+
+(define (peek-at-closer in #:opener [opener #f])
+ (define ch (peek-char in))
+ (cond
+ [(eqv? ch #\})
+ (let loop ([offset 0])
+ (cond
+ [(= offset (string-length opener)) opener]
+ [(eqv? (peek-char in (add1 offset))
+ (flip-at-bracket (string-ref opener (- (string-length opener) offset 1))))
+ (loop (add1 offset))]
+ [else #f]))]
+ [else #f]))
+
+(define (read-at-comment in)
+ (and (eqv? (peek-char in) #\/)
+ (eqv? (peek-char in 1) #\/)
+ (begin
+ (read-char in)
+ (read-char in)
+ "//")))
+
+(define (flip-at-bracket ch)
+ (case ch
+ [(#\<) #\>]
+ [(#\>) #\<]
+ [(#\[) #\]]
+ [(#\]) #\[]
+ [(#\() #\)]
+ [(#\)) #\(]
+ [else ch]))
+
+(define (at-opener->closer opener)
+ (cond
+ [(eqv? 0 (string-length opener)) ""]
+ [else
+ (list->string (reverse (for/list ([ch (in-string opener)])
+ (flip-at-bracket ch))))]))
+
+(define (next-location-as-pos in)
+ (define-values (line col pos) (port-next-location in))
+ (position pos line col))
+
+(define (extend-error lexeme start end in)
+ (define next (peek-char-or-special in))
+ (if (or (memq next
+ `(special
+ #\" #\, #\' #\` #\( #\) #\[ #\] #\{ #\} #\;
+ ,eof))
+ (char-whitespace? next))
+ (ret 'fail lexeme 'error #f start end 'bad)
+ (let-values (((rest end-pos) (get-chunk in)))
+ (ret 'fail (string-append lexeme rest) 'error #f start end-pos 'bad))))
+
+(define get-chunk
+ (lexer
+ [(:+ whitespace) (values lexeme end-pos)]))
+
+(define (parse-number s)
+ (if (and ((string-length s) . > . 2)
+ (eqv? #\x (string-ref s 1)))
+ (string->number (regexp-replace* #rx"_" (substring s 2) "") 16)
+ (string->number (regexp-replace* #rx"_" s ""))))
+
+(define (parse-string s)
+ (read (open-input-string s)))
+
+(define (parse-byte-string s)
+ (read (open-input-string s)))
+
+(define (parse-char s)
+ (define str
+ (read (open-input-string (string-append "\""
+ (substring s 1 (sub1 (string-length s)))
+ "\""))))
+ (string-ref str 0))
+
+(struct token (name value))
+(struct located-token token (srcloc))
+
+(define (token-e t)
+ (syntax-e (token-value t)))
+
+(define (token-line t)
+ (if (located-token? t)
+ (srcloc-line (located-token-srcloc t))
+ (syntax-line (token-value t))))
+
+(define (token-column t)
+ (let ([c (if (located-token? t)
+ (srcloc-column (located-token-srcloc t))
+ (syntax-column (token-value t)))])
+ (if (and c (eq? (token-name t) 'bar-operator))
+ (+ c 0.5)
+ c)))
+
+(define (token-srcloc t)
+ (cond
+ [(located-token? t)
+ (located-token-srcloc t)]
+ [else
+ (define s (token-value t))
+ (srcloc (syntax-source s)
+ (syntax-line s)
+ (syntax-column s)
+ (syntax-position s)
+ (syntax-span s))]))
+
+(define (token-rename t name)
+ (struct-copy token t [name name]))
+
+(define (syntax->token name s [srcloc #f])
+ (if srcloc
+ (located-token name s srcloc)
+ (token name s)))
+
+;; Runs `lex/status` in a loop, but switches to `finish-s-exp`
+;; for an S-expression escape:
+(define (lex-all in fail
+ #:keep-type? [keep-type? #f]
+ #:source [source (object-name in)])
+ (parameterize ([current-lexer-source source])
+ (let loop ([status 'initial])
+ (define-values (tok type paren start-pos end-pos backup new-status)
+ (lex/status in (file-position in) status #f))
+ (define (wrap r)
+ (if keep-type?
+ (vector r type paren)
+ r))
+ (case (token-name tok)
+ [(EOF) '()]
+ [(fail) (fail tok "read error")]
+ [(s-exp)
+ (cons (wrap (finish-s-exp tok in fail)) (loop 'continuing))]
+ [else
+ (cons (wrap tok) (loop new-status))]))))
+
+(define (finish-s-exp open-tok in fail)
+ (define v (read-syntax (current-lexer-source) in))
+ (when (eof-object? v)
+ (fail open-tok "expected S-expression after `#{`"))
+ (define end-pos
+ (let loop ()
+ (define-values (line col pos) (port-next-location in))
+ (define c (read-char in))
+ (cond
+ [(eof-object? c)
+ (fail v "expected `}` after S-expression")]
+ [(eqv? c #\})
+ (add1 pos)]
+ [(char-whitespace? c)
+ (loop)]
+ [else
+ (define bad (datum->syntax #f c (list (current-lexer-source)
+ line
+ col
+ pos
+ 1)))
+ (fail bad "expected only whitespace or `}` after S-expression")])))
+ (define result
+ (syntax->token (if (identifier? v) 'identifier 'literal)
+ v
+ (let ([loc (token-srcloc open-tok)])
+ (struct-copy srcloc loc
+ [span (- end-pos (srcloc-position loc))]))))
+ (when (pair? (syntax-e v))
+ (fail result "S-expression in `#{` and `}` must not be a pair"))
+ result)
+
+(define (closer-for? cl op)
+ (equal? cl (case op
+ [("(") ")"]
+ [("[") "]"]
+ [("{") "}"]
+ [("«") "»"]
+ [else #f])))
diff --git a/shrubbery/main.rkt b/shrubbery/main.rkt
new file mode 100644
index 000000000..61ee569ad
--- /dev/null
+++ b/shrubbery/main.rkt
@@ -0,0 +1,42 @@
+#lang racket/base
+
+(module reader racket/base
+ (require syntax/strip-context
+ "parse.rkt")
+
+ (provide (rename-out [shrubbery-read read]
+ [shrubbery-read-syntax read-syntax])
+ get-info
+ get-info-proc)
+
+ (define (shrubbery-read in)
+ (syntax->datum
+ (shrubbery-read-syntax #f in)))
+
+ (define (shrubbery-read-syntax src in)
+ (strip-context
+ #`(module anything racket/base
+ '#,(parse-all in))))
+
+ (define (get-info in mod line col pos)
+ (lambda (key default)
+ (get-info-proc key default (lambda (key default) default))))
+
+ (define (get-info-proc key default make-default)
+ (case key
+ [(color-lexer)
+ (dynamic-require 'shrubbery/syntax-color
+ 'shrubbery-lexer)]
+ [(drracket:indentation)
+ (dynamic-require 'shrubbery/indentation
+ 'shrubbery-indentation)]
+ [(drracket:range-indentation)
+ (dynamic-require 'shrubbery/indentation
+ 'shrubbery-range-indentation)]
+ [(drracket:paren-matches)
+ (dynamic-require 'shrubbery/indentation
+ 'shrubbery-paren-matches)]
+ [(drracket:grouping-position)
+ (dynamic-require 'shrubbery/navigation
+ 'shrubbery-grouping-position)]
+ [else (make-default key default)])))
diff --git a/shrubbery/navigation.rkt b/shrubbery/navigation.rkt
new file mode 100644
index 000000000..0233bcbca
--- /dev/null
+++ b/shrubbery/navigation.rkt
@@ -0,0 +1,299 @@
+#lang racket/base
+(require racket/class
+ racket/list
+ "lex.rkt"
+ "private/edit-help.rkt")
+
+;; Conventions are the same as in "indentation.rkt"
+
+(provide shrubbery-grouping-position)
+
+(define (shrubbery-grouping-position t pos limit-pos direction)
+ (case direction
+ [(backward)
+ (define-values (s e) (skip-whitespace #:and-separators? #t t (sub1 pos) -1))
+ (define category (send t classify-position s))
+ (case category
+ [(parenthesis)
+ (or (send t backward-match e 0)
+ e)]
+ [(bar-operator block-operator) e]
+ [else s])]
+ [(forward)
+ (define-values (s e) (skip-whitespace #:and-separators? #t t pos 1))
+ (define category (send t classify-position s))
+ (case category
+ [(parenthesis)
+ (or (send t forward-match s (send t last-position))
+ s)]
+ [(block-operator bar-operator)
+ (define-values (next-s next-e) (skip-whitespace t e 1))
+ (define start (line-start t next-s))
+ (define delta (line-delta t start))
+ (skip-to-shallower t e (col-of next-s start delta)
+ #:bar-stop-line (and (eq? category 'bar-operator)
+ start))]
+ [else e])]
+ [(down)
+ (define-values (s e) (skip-whitespace #:and-separators? #t t pos 1))
+ (define category (send t classify-position s))
+ (case category
+ [(parenthesis)
+ (and (opener? (send t get-text s e))
+ e)]
+ [(block-operator bar-operator)
+ e]
+ [else #f])]
+ [(up)
+ (define start (line-start t pos))
+ (case (send t classify-position pos)
+ [(bar-operator)
+ ;; immediately before a `|` is a special case
+ (start-of-alts t pos)]
+ [else
+ (define-values (s e) (skip-whitespace #:and-separators? #t t (sub1 pos) -1
+ #:stay-on-line start))
+ (define category (send t classify-position s))
+ (case category
+ [(parenthesis)
+ (if (opener? (send t get-text s e))
+ s
+ (start-of-group #:or-out? #t t e start))]
+ [(block-operator bar-operator)
+ (start-of-group #:or-out? #t t e start)]
+ [else
+ (start-of-group #:or-out? (eq? s pos) t s start)])])]
+ [else #f]))
+
+(define (skip-whitespace t pos dir
+ #:and-separators? [and-separators? #f]
+ #:stay-on-line [stay-on-line #f])
+ (define end-pos (send t last-position))
+ (cond
+ [(= pos -1) (send t get-token-range 0)]
+ [(pos . > . end-pos) (send t get-token-range (sub1 end-pos))]
+ [else
+ (let loop ([pos pos] [stay-on-line stay-on-line])
+ (define-values (s e) (send t get-token-range pos))
+ (define category (send t classify-position s))
+ (define (continue #:ok-to-change-line? [ok-to-change-line? #f])
+ (define start (and stay-on-line (line-start t s)))
+ (cond
+ [(and stay-on-line
+ (not ok-to-change-line?)
+ (not (eqv? start stay-on-line))
+ (not (eq? category 'white-space)))
+ (if (dir . < . 0)
+ (send t get-token-range e)
+ (send t get-token-range (sub1 s)))]
+ [else
+ (if (dir . < . 0)
+ (if (zero? s)
+ (values s e)
+ (loop (- s 1) start))
+ (if (eqv? e end-pos)
+ (values s e)
+ (loop e start)))]))
+ (case category
+ [(white-space comment)
+ (continue)]
+ [(continue-operator)
+ (continue #:ok-to-change-line? #t)]
+ [(separator)
+ (if and-separators?
+ (continue)
+ (values s e))]
+ [else (values s e)]))]))
+
+;; find the end of a block as a column less than `col`, but
+;; if `bar-stop-line` is given, also stop at a bar on that line
+(define (skip-to-shallower t pos col
+ #:bar-stop-line [bar-stop-line #f])
+ (define end-pos (send t last-position))
+ (let loop ([pos pos] [last-e #f])
+ (cond
+ [(= pos end-pos) end-pos]
+ [else
+ (define-values (s e) (send t get-token-range pos))
+ (define category (send t classify-position s))
+ (case category
+ [(white-space comment continue-operator)
+ (loop e last-e)]
+ [(separator) s]
+ [else
+ (define start (line-start t s))
+ (define delta (line-delta t start))
+ (define new-col (col-of s start delta))
+ (cond
+ [(new-col . < . col) (or last-e s)]
+ [(and (eq? category 'bar-operator)
+ (eqv? start bar-stop-line))
+ (or last-e s)]
+ [else
+ (case category
+ [(parenthesis)
+ (define o-e (send t forward-match s end-pos))
+ (if o-e
+ (loop o-e o-e)
+ s)]
+ [else (loop e e)])])])])))
+
+;; return the start of the group containing `pos`, but if `pos`
+;; is already the start of the group and `or-out?1, return the start of the
+;; enclosing group
+(define (start-of-group t orig-pos at-start
+ #:or-out? or-out?)
+ (define (finish last-pos)
+ (or last-pos
+ (if or-out?
+ (start-of-enclosing-block t orig-pos)
+ orig-pos)))
+ (let loop ([pos orig-pos] [last-pos #f] [at-start at-start])
+ (cond
+ [(= pos 0) (or last-pos 0)]
+ [else
+ (define-values (s e) (send t get-token-range (sub1 pos)))
+ (define category (send t classify-position s))
+ (case category
+ [(white-space comment)
+ (define start (line-start t s))
+ (if (eqv? start at-start)
+ (loop s last-pos at-start)
+ (finish last-pos))]
+ [(continue-operator)
+ (loop s last-pos (line-start t s))]
+ [(block-operator)
+ (or last-pos (if or-out? s orig-pos))]
+ [(bar-operator)
+ (or last-pos (if or-out? s orig-pos))]
+ [(parenthesis)
+ (define o-s (send t backward-match e 0))
+ (cond
+ [o-s ; => `s` is a closer
+ (define start (line-start t e))
+ (if (eqv? start at-start)
+ (loop o-s o-s (line-start t o-s))
+ (finish last-pos))]
+ [else ; `s` is an opener
+ (or last-pos (if or-out? s orig-pos))])]
+ [(separator)
+ (finish last-pos)]
+ [else
+ (define start (line-start t s))
+ (if (eqv? start at-start)
+ (loop s s at-start)
+ (finish last-pos))])])))
+
+;; return the start of the block or parens containing the group containg `pos`
+(define (start-of-enclosing-block t pos)
+ (define start (line-start t pos))
+ (define col (col-of pos start (line-delta t start)))
+ (cond
+ [(zero? col)
+ ;; use column 0 as a proxy for being in a top-level block; we
+ ;; don't really want to go up (to the start of the buffer)
+ ;; from there
+ pos]
+ [else
+ (let loop ([pos pos] [last-pos #f])
+ (cond
+ [(= pos 0) 0]
+ [else
+ (define-values (s e) (send t get-token-range (sub1 pos)))
+ (define category (send t classify-position s))
+ (case category
+ [(white-space comment continue-operator) (loop s last-pos)]
+ [(parenthesis)
+ (define o-s (send t backward-match e 0))
+ (cond
+ [o-s ; => `s` is a closer
+ (loop o-s o-s)]
+ [else ; `s` is an opener
+ s])]
+ [(bar-operator)
+ ;; needs to be outdented relative to initial `pos`
+ (define s-start (line-start t pos))
+ (define s-col (col-of pos s-start (line-delta t s-start)))
+ (cond
+ [(s-col . < . col) last-pos]
+ [else (loop s last-pos)])]
+ [(block-operator)
+ (define s-col (get-block-column t (sub1 s) #f (line-start t s)))
+ (cond
+ [(s-col . < . col) (or last-pos s)]
+ [else (loop s last-pos)])]
+ [else
+ (loop s s)])]))]))
+
+;; give a `pos` that's right before a bar, return the start of the
+;; first `|` in the block, or return the start of the enclsing block
+;; if `pos` is at the first bar
+(define (start-of-alts t pos)
+ (define at-start (line-start t pos))
+ (define col (col-of pos at-start (line-delta t at-start)))
+ (define (outdented? s)
+ (define start (line-start t s))
+ (define s-col (col-of s start (line-delta t s)))
+ (s-col . <= . col))
+ (define (select last-bar last-block last-pos s)
+ (or last-bar (and last-pos last-block) last-pos s))
+ (let loop ([pos pos] ; checking before
+ [last-bar #f] ; candidate bar (to move just before) most recently found (right line or column)
+ [col col] ; column of most recently found `last-bar`
+ [last-block #f] ; candidate `:`, only counts if `last-pos` is set for outdented
+ [last-pos #f] ; candiate start of an outdented block
+ [bar-start at-start] ; line for current block (to detect same-line candidates)
+ [at-start at-start]) ; line for candidate block start (to detect when moved too early)
+ (cond
+ [(= pos 0) 0]
+ [else
+ (define-values (s e) (send t get-token-range (sub1 pos)))
+ (define category (send t classify-position s))
+ (case category
+ [(white-space comment)
+ (loop s last-bar col last-block last-pos
+ bar-start at-start)]
+ [(continue-operator)
+ (define s-start (line-start t s))
+ (loop s last-bar col last-block last-pos
+ (if (= at-start bar-start) s-start bar-start) s-start)]
+ [else
+ (define s-start (line-start t pos))
+ (cond
+ [(and (s-start . < . at-start)
+ (select last-bar last-block last-pos #f))
+ => (lambda (pos) pos)]
+ [else
+ (case category
+ [(parenthesis)
+ (define o-s (send t backward-match e 0))
+ (cond
+ [o-s ; => `s` is a closer
+ (loop o-s last-bar col last-block (if (outdented? o-s) o-s last-pos)
+ bar-start (line-start t o-s))]
+ [else ; `s` is an opener
+ (select last-bar last-block last-pos s)])]
+ [(bar-operator)
+ (define s-col (col-of s s-start (line-delta t s-start)))
+ (cond
+ [(eqv? bar-start s-start)
+ ;; same line, so earlier bar the in same block
+ (loop s s s-col #f #f bar-start 0)]
+ [else
+ (cond
+ [(s-col . > . col)
+ ;; more indented: ignore
+ (loop s last-bar col last-block last-pos bar-start at-start)]
+ [(s-col . < . col)
+ ;; less indented
+ (select last-bar last-block last-pos s)]
+ [else
+ ;; same indented:
+ (loop s s col #f #f s-start 0)])])]
+ [(block-operator)
+ (cond
+ [(outdented? s) (select last-bar last-block last-pos s)]
+ [else (loop s last-bar col s #f bar-start s-start)])]
+ [else
+ (loop s last-bar col last-block (if (outdented? s) s last-pos) bar-start s-start)])])])])))
+
diff --git a/shrubbery/parse.rkt b/shrubbery/parse.rkt
new file mode 100644
index 000000000..cfc4857b9
--- /dev/null
+++ b/shrubbery/parse.rkt
@@ -0,0 +1,1270 @@
+#lang racket/base
+(require racket/pretty
+ "lex.rkt"
+ "srcloc.rkt"
+ (submod "print.rkt" for-parse)
+ "private/property.rkt"
+ "private/at-space.rkt")
+
+(provide parse-all)
+
+;; Parsing state at the group level:
+(struct state (line ; current group's line and last consumed token's line
+ column ; group's column; below ends group, above starts indented
+ operator-column ; column for operator that continues group on a new line
+ paren-immed? ; immediately in `()` or `[]`?
+ bar-closes? ; does `|` always end a group?
+ bar-closes-line ; `|` (also) ends a group on this line
+ block-mode ; 'inside, #f, `:` or `|` token, or 'end
+ delta ; column delta created by `\`, applied to `line` continuation
+ raw ; reversed whitespace (and comments) to be remembered
+ at-mode)) ; look for `@` continuation after term: #f, 'initial, or 'no-initial
+
+(define (make-state #:line line
+ #:column column
+ #:operator-column [operator-column #f]
+ #:paren-immed? [paren-immed? #f]
+ #:bar-closes? [bar-closes? #f]
+ #:bar-closes-line [bar-closes-line #f]
+ #:block-mode [block-mode #f]
+ #:delta delta
+ #:raw [raw null])
+ (state line
+ column
+ operator-column
+ paren-immed?
+ bar-closes?
+ bar-closes-line
+ block-mode
+ delta
+ raw
+ #f))
+
+;; Parsing state for group sequences: top level, in opener-closer, or after `:`
+(struct group-state (closer ; expected closer: a string, EOF, or column (maybe 'any as column)
+ paren-immed? ; immediately in `()` or `[]`?
+ column ; not #f => required indentation check checking
+ check-column? ; #f => allow any sufficiently large (based on closer) indentation
+ bar-closes? ; does `|` always end the sequence of groups?
+ bar-closes-line ; `|` (also) ends a sequence of groups on this line
+ block-mode ; 'inside, #f, `:` or `|` token, or 'end
+ comma-time? ; allow and expect a comma next
+ sequence-mode ; 'any, 'one, or 'none
+ last-line ; most recently consumed line
+ delta ; column delta created by `\`, applies to `last-line` continuation
+ commenting ; pending group-level `#//` token; exclusive with `tail-commenting`
+ tail-commenting ; pending group-level `#//` at end (so far)
+ raw)) ; reversed whitespace (and comments) to be remembered
+
+(define (make-group-state #:closer closer
+ #:paren-immed? [paren-immed? #f]
+ #:column column
+ #:check-column? [check-column? #t]
+ #:bar-closes? [bar-closes? #f]
+ #:bar-closes-line [bar-closes-line #f]
+ #:block-mode [block-mode #f]
+ #:sequence-mode [sequence-mode 'any]
+ #:last-line last-line
+ #:delta delta
+ #:commenting [commenting #f]
+ #:raw [raw null])
+ (group-state closer
+ paren-immed?
+ column
+ check-column?
+ bar-closes?
+ bar-closes-line
+ block-mode
+ #f
+ sequence-mode
+ last-line
+ delta
+ #f
+ commenting
+ raw))
+
+(define (closer-column? c) (or (eq? c 'any) (number? c)))
+
+(define closer-expected? pair?)
+(define (closer-expected closer) (if (pair? closer) (car closer) closer))
+(define (closer-expected-opener closer) (and (pair? closer) (cdr closer)))
+(define (make-closer-expected str tok) (cons str tok))
+
+(define group-tag (syntax-raw-property (datum->syntax #f 'group) '()))
+(define top-tag (syntax-raw-property (datum->syntax #f 'top) '()))
+(define parens-tag (syntax-raw-property (datum->syntax #f 'parens) '()))
+
+;; ----------------------------------------
+
+;; In the parsed representation of a shrubbery, source locations are
+;; not associated with sequences tagged `group` or `top`. For terms
+;; tagged with `parens`, `braces`, `block`, `alts`, and `op`, the same
+;; source location is associated with the tag and the parentheses that
+;; group the tag with its members, and that location spans the content
+;; and any delimiters used to form it (such as parentheses or a `:`
+;; that starts a block).
+
+;; Raw text is preserved as syntax properties so that the original
+;; program text can be reconstructed. Raw text is represented as a
+;; `cons`-based tree of strings to be rendered through an inorder
+;; traversal. The relevant syntax properties are 'raw, 'raw-prefix,
+;; and 'raw-tail. The 'raw property on an atom is the original text
+;; for the atom; whitespace and comments that precede the atom are
+;; normally attached as 'raw-prefix. For a compound term, the
+;; contained terms must be traversed to gather all the raw text; in
+;; addition to 'raw-prefix and 'raw attached to the compound-term tag,
+;; a 'raw-tail property holds raw text to be written after all of the
+;; compound term's subterms. For example, the closing `)` for a
+;; `parens` term will be attached to 'raw-tail on the 'parens tag. The
+;; `group` and `top` tags may have non-empty 'raw-prefix and 'raw-tail
+;; syntax properties, even though those tags don't have source
+;; locations.
+
+;; Parse all groups in a stream
+(define (parse-top-groups l)
+ (define-values (gs rest-l end-line end-delta end-t tail-commenting tail-raw)
+ (parse-groups l (make-group-state #:closer eof
+ #:column #f
+ #:check-column? #f
+ #:last-line -1
+ #:delta 0)))
+ (when tail-commenting (fail-no-comment-group tail-commenting))
+ (unless (null? rest-l)
+ (error "had leftover items" rest-l))
+ (define top (lists->syntax (cons top-tag (bars-insert-alts gs))))
+ (add-raw-tail top tail-raw))
+
+;; Parse a sequence of groups (top level, in opener-closer, or after `:`)
+;; consuming the closer in the case of opener-closer context.
+;; Returns: the list of groups
+;; remaining tokens after a closer
+;; line of last consumed (possibly closer)
+;; delta of last consumed
+;; last token consumed (if a closer)
+;; pending group-comment token from end
+;; pending raw whitespace/comments
+(define (parse-groups l sg)
+ (define (check-no-commenting #:tail-ok? [tail-ok? #f])
+ (define commenting (or (group-state-commenting sg)
+ (and (not tail-ok?) (group-state-tail-commenting sg))))
+ (when commenting
+ (fail-no-comment-group commenting)))
+ (define (done)
+ (check-no-commenting #:tail-ok? #t)
+ (values null l (group-state-last-line sg) (group-state-delta sg) #f
+ (group-state-tail-commenting sg)
+ (group-state-raw sg)))
+ (define (check-column t column)
+ (when (group-state-check-column? sg)
+ (unless (eqv? column (group-state-column sg))
+ (fail t "wrong indentation"))))
+ (define closer (group-state-closer sg))
+ (cond
+ [(null? l)
+ ;; Out of tokens
+ (when (string? (closer-expected closer))
+ (fail (closer-expected-opener closer) (format "expected ~s" (closer-expected closer))))
+ (done)]
+ [else
+ (define t (car l))
+ (define column (column+ (token-column t) (group-state-delta sg)))
+ (cond
+ [(eq? (token-name t) 'group-comment)
+ ;; column doesn't matter
+ (define line (token-line t))
+ (define-values (rest-l last-line delta raw)
+ (next-of (cdr l) line (group-state-delta sg) (cons t (group-state-raw sg))))
+ (cond
+ [(and line
+ (line . > . (group-state-last-line sg))
+ (next-line? rest-l last-line))
+ ;; structure comment is on its own line, so it comments out next group
+ (check-no-commenting)
+ (parse-groups rest-l (struct-copy group-state sg
+ [last-line last-line]
+ [delta delta]
+ [tail-commenting t]
+ [raw raw]))]
+ [else
+ (fail t "misplaced group comment")
+ (parse-groups rest-l (struct-copy group-state sg
+ [last-line last-line]
+ [delta delta]
+ [raw raw]))])]
+ [(and (closer-column? closer)
+ column
+ (column . < . closer))
+ ;; Next token is less indented than this group sequence
+ (done)]
+ [else
+ ;; Dispatch on token
+ (case (token-name t)
+ [(closer)
+ (cond
+ [(closer-column? closer)
+ (done)]
+ [else
+ (unless (equal? (closer-expected closer) (token-e t))
+ (fail t (format "expected ~s; closing at ~a" (closer-expected closer) (token-value t))))
+ (define raw (cons t (group-state-raw sg)))
+ (if (eof-object? (closer-expected closer))
+ ;; continue after extra closer:
+ (parse-groups (cdr l) (struct-copy group-state sg
+ [last-line (token-line t)]
+ [raw raw]))
+ ;; stop at closer
+ (begin
+ (check-no-commenting)
+ (values null (cdr l) (token-line t) (group-state-delta sg) t #f raw)))])]
+ [(whitespace comment continue-operator)
+ (define-values (next-l last-line delta raw)
+ (next-of l (group-state-last-line sg) (group-state-delta sg) (group-state-raw sg)))
+ (parse-groups next-l (struct-copy group-state sg
+ [last-line last-line]
+ [delta delta]
+ [raw raw]))]
+ [(comma-operator)
+ (cond
+ [(closer-column? (group-state-closer sg))
+ (done)]
+ [else
+ (unless (and (group-state-paren-immed? sg)
+ (group-state-comma-time? sg))
+ (fail t (format "misplaced comma~a"
+ (if (group-state-paren-immed? sg)
+ ""
+ " (not immdiately within parentheses or brackets)"))))
+ (define-values (rest-l last-line delta raw)
+ (next-of (cdr l) (token-line t) (group-state-delta sg) (cons t (group-state-raw sg))))
+ ;; In top level or immediately in opener-closer:
+ (parse-groups rest-l (struct-copy group-state sg
+ [check-column? (next-line? rest-l last-line)]
+ [last-line last-line]
+ [comma-time? #f]
+ [delta delta]
+ [raw raw]))])]
+ [(semicolon-operator)
+ (when (group-state-paren-immed? sg)
+ (fail t (format "misplaced semicolon (~a)"
+ "immdiately within parentheses or brackets")))
+ (cond
+ [(eq? (group-state-block-mode sg) 'inside) (done)]
+ [else
+ (check-column t column)
+ (define-values (rest-l last-line delta raw)
+ (next-of (cdr l) (token-line t) (group-state-delta sg) (cons t (group-state-raw sg))))
+ (parse-groups rest-l (struct-copy group-state sg
+ [check-column? (next-line? rest-l last-line)]
+ [column (or (group-state-column sg) column)]
+ [last-line last-line]
+ [delta delta]
+ [commenting (group-state-tail-commenting sg)]
+ [tail-commenting #f]
+ [block-mode (next-block-mode (group-state-block-mode sg))]
+ [raw raw]))])]
+ [else
+ (when (group-state-comma-time? sg)
+ (fail t (format "missing comma before new group (~a)"
+ "within parentheses or braces")))
+ (case (token-name t)
+ [(bar-operator)
+ (define line (token-line t))
+ (cond
+ [(or (group-state-bar-closes? sg)
+ (and line
+ (eqv? line (group-state-bar-closes-line sg))))
+ (done)]
+ [(eq? (group-state-block-mode sg) 'inside)
+ ;; Bar at the start of a group
+ (define same-line? (or (not (group-state-last-line sg))
+ (= line (group-state-last-line sg))))
+ (when (group-state-check-column? sg)
+ (unless (or same-line?
+ (= column (group-state-column sg)))
+ (fail t "wrong indentation")))
+ (define pre-raw (group-state-raw sg))
+ (define commenting (or (group-state-commenting sg)
+ (group-state-tail-commenting sg)))
+ (define-values (g rest-l group-end-line group-end-delta block-tail-commenting block-tail-raw)
+ (parse-block t (cdr l)
+ #:line line
+ #:closer (or (column-next column) 'any)
+ #:bar-closes? #t
+ #:bar-closes-line line
+ #:delta (group-state-delta sg)
+ #:raw (if commenting
+ null
+ pre-raw)))
+ (define-values (gs rest-rest-l end-line end-delta end-t tail-commenting tail-raw)
+ (parse-groups rest-l (struct-copy group-state sg
+ [column (if same-line?
+ (group-state-column sg)
+ column)]
+ [check-column? (next-line? rest-l group-end-line)]
+ [last-line group-end-line]
+ [delta group-end-delta]
+ [comma-time? (group-state-paren-immed? sg)]
+ [commenting #f]
+ [tail-commenting block-tail-commenting]
+ [raw (if commenting
+ (append block-tail-raw
+ (cons (syntax-to-raw (datum->syntax #f g))
+ pre-raw))
+ block-tail-raw)])))
+ (values (if commenting
+ gs
+ (cons (list group-tag
+ (add-span-srcloc
+ t end-t
+ (cons 'bar g)))
+ gs))
+ rest-rest-l
+ end-line
+ end-delta
+ end-t
+ tail-commenting
+ tail-raw)]
+ [else
+ (if (and (token? (group-state-block-mode sg))
+ (eq? 'block-operator (token-name (group-state-block-mode sg))))
+ (fail (group-state-block-mode sg) "unnecessary `:` before `|`")
+ (fail t "misplaced `|`"))])]
+ [else
+ ;; Parse one group, then recur to continue the sequence:
+ (check-column t column)
+ (define line (token-line t))
+ (define pre-raw (group-state-raw sg))
+ (define commenting (or (group-state-commenting sg)
+ (group-state-tail-commenting sg)))
+ (when (and (pair? l)
+ (eq? 'none (group-state-sequence-mode sg))
+ (not commenting))
+ (fail t "second group not allowed within `@«` and `»`"))
+ (define-values (g rest-l group-end-line group-delta group-tail-commenting group-tail-raw)
+ (parse-group l (make-state #:paren-immed? (group-state-paren-immed? sg)
+ #:line line
+ #:column column
+ #:bar-closes? (group-state-bar-closes? sg)
+ #:bar-closes-line (group-state-bar-closes-line sg)
+ #:block-mode (group-state-block-mode sg)
+ #:delta (group-state-delta sg)
+ #:raw null)))
+ (define-values (gs rest-rest-l end-line end-delta end-t tail-commenting tail-raw)
+ (parse-groups rest-l (struct-copy group-state sg
+ [column (or (group-state-column sg) column)]
+ [check-column? (next-line? rest-l group-end-line)]
+ [last-line group-end-line]
+ [comma-time? (group-state-paren-immed? sg)]
+ [delta group-delta]
+ [commenting #f]
+ [tail-commenting group-tail-commenting]
+ [raw (if commenting
+ (append group-tail-raw
+ (cons (syntax-to-raw (datum->syntax #f g))
+ pre-raw))
+ group-tail-raw)]
+ [sequence-mode (if (and (not commenting)
+ (eq? 'one (group-state-sequence-mode sg)))
+ 'none
+ (group-state-sequence-mode sg))])))
+ (values (if commenting
+ gs
+ (cons (cons (add-pre-raw group-tag
+ pre-raw)
+ g)
+ gs))
+ rest-rest-l
+ end-line
+ end-delta
+ end-t
+ tail-commenting
+ tail-raw)])])])]))
+
+;; Parse one group.
+;; Returns: the list of items in the group
+;; remaining tokens after group
+;; line of last consumed
+;; delta at last consumed
+;; pending group-comment token from end
+(define (parse-group l s)
+ (define (done)
+ (values null l (state-line s) (state-delta s) #f (state-raw s)))
+ (cond
+ [(null? l) (done)]
+ [else
+ (define t (car l))
+ (define line (token-line t))
+ (define (check-block-mode)
+ (when (eq? (state-block-mode s) 'end)
+ (fail t "no terms allowed after `»` within a group")))
+ (define (check-nested-block-mode t)
+ (when (eq? (state-block-mode s) 'no)
+ (fail t "blocks not allowed immediately within `@«` and `»`")))
+ ;; Consume a token
+ (define (keep delta
+ #:operator-column [operator-column (state-operator-column s)]
+ #:at-mode [at-mode (state-at-mode s)])
+ (check-block-mode)
+ (define-values (at-adjust new-at-mode at-l at-line at-delta)
+ (continue-at at-mode #f (cdr l) line delta))
+ (define-values (g rest-l end-line end-delta tail-commenting tail-raw)
+ (parse-group at-l (struct-copy state s
+ [line at-line]
+ [delta at-delta]
+ [raw null]
+ [block-mode (next-block-mode (state-block-mode s))]
+ [operator-column operator-column]
+ [at-mode new-at-mode])))
+ (define elem (record-raw (token-value t) #f (state-raw s)))
+ (values (at-adjust (cons elem g)) rest-l end-line end-delta tail-commenting tail-raw))
+ ;; Dispatch
+ (cond
+ [(and line (line . > . (state-line s)))
+ ;; new line
+ (case (token-name t)
+ [(whitespace comment)
+ (parse-group (cdr l) (struct-copy state s
+ [line line]
+ [delta 0]
+ [raw (cons t (state-raw s))]))]
+ [(closer at-content at-closer)
+ (done)]
+ [else
+ ;; consume any group comments that are on their own line:
+ (define-values (group-commenting use-t use-l last-line delta raw)
+ (get-own-line-group-comment t l (state-line s) (state-delta s) (state-raw s)))
+ (define column (token-column use-t))
+ (cond
+ [(column . > . (state-column s))
+ ;; More indented forms a nested block when there's
+ ;; a preceding `:` (doesn't get here) or starting with `|`;
+ ;; more indented continues a group when starting with an
+ ;; operator
+ (cond
+ [(eq? 'bar-operator (token-name use-t))
+ (when (and (state-operator-column s)
+ (<= column (state-operator-column s)))
+ (fail use-t "wrong indentation"))
+ (check-nested-block-mode use-t)
+ (parse-block #f use-l
+ #:block-mode 'inside
+ #:line (token-line use-t)
+ #:closer (or (token-column use-t) 'any)
+ #:bar-closes? #f
+ #:bar-closes-line #f
+ #:delta delta
+ #:raw raw
+ #:group-commenting group-commenting)]
+ [(and (eq? 'operator (token-name use-t))
+ (or (not (state-operator-column s))
+ (= column (state-operator-column s))))
+ (when group-commenting (fail group-commenting "misplaced group comment"))
+ (keep 0 #:operator-column column)]
+ [(and (eq? 'opener (token-name use-t))
+ (equal? "«" (token-e t)))
+ (when group-commenting (fail group-commenting "misplaced group comment"))
+ ;; will error as out-of-place
+ (parse-group l (struct-copy state s
+ [line line]))]
+ [else
+ (fail use-t "wrong indentation (or missing `:` on previous line)")])]
+ [else
+ ;; using `done` here means that we leave any group comments in place;
+ ;; not consuming inspected tokens is normally worrisome, but we'll parse
+ ;; them at most one more time
+ (done)])])]
+ [else
+ ;; Not a new line
+ (case (token-name t)
+ [(closer comma-operator semicolon-operator at-content at-closer)
+ (done)]
+ [(identifier number literal operator)
+ (keep (state-delta s))]
+ [(block-operator)
+ (check-block-mode)
+ (check-nested-block-mode t)
+ (parse-block t (cdr l)
+ #:line (token-line t)
+ #:closer (or (column-half-next (or (state-operator-column s)
+ (state-column s)))
+ 'any)
+ #:delta (state-delta s)
+ #:raw (state-raw s)
+ #:bar-closes? (and (state-bar-closes? s)
+ (not (state-bar-closes-line s)))
+ #:bar-closes-line (state-bar-closes-line s))]
+ [(bar-operator)
+ (define line (token-line t))
+ (cond
+ [(or (state-bar-closes? s)
+ (and line
+ (eqv? line (state-bar-closes-line s))))
+ (done)]
+ [else
+ (check-block-mode)
+ (check-nested-block-mode t)
+ (parse-block #f l
+ #:block-mode 'inside
+ #:line line
+ #:closer (or (token-column t) 'any)
+ #:bar-closes? #f
+ #:bar-closes-line #f
+ #:delta (state-delta s)
+ #:raw (state-raw s))])]
+ [(opener)
+ (check-block-mode)
+ (define-values (closer tag)
+ (case (token-e t)
+ [("(") (values ")" 'parens)]
+ [("{") (values "}" 'braces)]
+ [("[") (values "]" 'brackets)]
+ [("«") (if (state-at-mode s)
+ (values "»" 'at)
+ (fail t "misplaced `«`"))]
+ [else (error "unknown opener" t)]))
+ (define pre-raw (state-raw s))
+ (define-values (group-commenting next-l last-line delta raw)
+ (next-of/commenting (cdr l) line (state-delta s) null))
+ (define sub-column
+ (if (pair? next-l)
+ (column+ (token-column (car next-l)) (state-delta s))
+ (column-next (column+ (token-column t) (state-delta s)))))
+ (define-values (gs rest-l close-line close-delta end-t never-tail-commenting group-tail-raw)
+ (parse-groups next-l (make-group-state #:closer (make-closer-expected closer t)
+ #:paren-immed? #t
+ #:block-mode (if (eq? tag 'at) 'no #f)
+ #:column sub-column
+ #:last-line last-line
+ #:delta delta
+ #:commenting group-commenting
+ #:raw raw
+ #:sequence-mode (if (eq? tag 'at) 'one 'any))))
+ (define-values (at-adjust new-at-mode at-l at-line at-delta)
+ (continue-at (state-at-mode s) (equal? closer "]") rest-l close-line close-delta))
+ (define-values (g rest-rest-l end-line end-delta tail-commenting tail-raw)
+ (parse-group at-l (struct-copy state s
+ [line at-line]
+ [delta at-delta]
+ [block-mode (next-block-mode (state-block-mode s))]
+ [raw null]
+ [at-mode new-at-mode])))
+ (define new-g (at-adjust
+ (cons (add-raw-to-prefix
+ t pre-raw #:tail group-tail-raw
+ (add-span-srcloc
+ t end-t
+ (cons tag gs)))
+ g)))
+ (define-values (result-g result-tail-raw)
+ (if (eq? tag 'at)
+ (splice-at t new-g tail-raw)
+ (values new-g tail-raw)))
+ (values result-g
+ rest-rest-l
+ end-line
+ end-delta
+ tail-commenting
+ result-tail-raw)]
+ [(whitespace comment continue-operator)
+ (define-values (next-l line delta raw)
+ (next-of l (state-line s) (state-delta s) (state-raw s)))
+ (parse-group next-l (struct-copy state s
+ [line line]
+ [delta delta]
+ [raw raw]))]
+ [(group-comment)
+ (fail t "misplaced group comment")
+ (parse-group (cdr l) s)]
+ [(at)
+ (check-block-mode)
+ (cond
+ [(null? (cdr l))
+ (fail t "missing term after `@`")
+ (parse-group null s)]
+ [else
+ (define next-t (cadr l))
+ (case (token-name next-t)
+ [(opener)
+ (case (token-e next-t)
+ [("(" "«")
+ (parse-group (cdr l) (struct-copy state s
+ [raw (cons t (state-raw s))]
+ [at-mode 'initial]))]
+ [("[")
+ (keep (state-delta s) #:at-mode 'no-initial)]
+ [else (error "unexpected" (token-name next-t))])]
+ [(identifier number literal operator opener)
+ (parse-group (cdr l) (struct-copy state s
+ [raw (cons t (state-raw s))]
+ [at-mode 'initial]))]
+ [(at-opener)
+ (keep (state-delta s) #:at-mode 'no-initial)]
+ [else
+ (fail next-t "invalid after `@`")])])]
+ [(at-comment)
+ (fail t "comments using `@//` are allowed only within an `@` body")]
+ [else
+ (error "unexpected" (token-value t))])])]))
+
+(define (parse-block t l
+ #:line line
+ #:closer closer
+ #:bar-closes? [bar-closes? #f]
+ #:bar-closes-line [bar-closes-line #f]
+ #:delta in-delta
+ #:raw in-raw
+ #:group-commenting [in-group-commenting #f]
+ #:block-mode [block-mode t])
+ (define-values (opener-t opener-l opener-line opener-delta opener-raw)
+ (next-of/opener l line in-delta null))
+ (define-values (group-commenting next-l last-line delta raw)
+ (if in-group-commenting
+ (values in-group-commenting opener-l opener-line opener-delta opener-raw)
+ (next-of/commenting opener-l opener-line opener-delta opener-raw)))
+ (cond
+ [(pair? next-l)
+ (define next-t (car next-l))
+ (define-values (indent-gs rest-l end-line end-delta end-t tail-commenting tail-raw)
+ (parse-groups next-l
+ (make-group-state #:closer (if opener-t
+ (make-closer-expected "»" opener-t)
+ closer)
+ #:column (column+ (token-column next-t) delta)
+ #:last-line last-line
+ #:bar-closes? (and (not opener-t) bar-closes?)
+ #:bar-closes-line (and (not opener-t) bar-closes-line)
+ #:block-mode block-mode
+ #:delta delta
+ #:commenting group-commenting
+ #:raw raw)))
+ (define used-closer? (or opener-t
+ (closer-expected? closer)))
+ (define-values (null-g post-l post-line post-delta post-tail-commenting post-tail-raw)
+ (if used-closer?
+ ;; in 'end mode, so errors or returns a null group:
+ (parse-group rest-l (make-state #:line end-line
+ #:column +inf.0
+ #:bar-closes? bar-closes?
+ #:bar-closes-line bar-closes-line
+ #:block-mode 'end
+ #:delta end-delta
+ #:raw null))
+ (values '() rest-l end-line end-delta tail-commenting (if used-closer?
+ null
+ tail-raw))))
+ (unless (null? null-g) (error "internal error, parsed more"))
+ (values (list (add-raw-to-prefix
+ t in-raw #:tail (and used-closer? tail-raw)
+ (add-span-srcloc
+ t end-t #:alt next-t
+ (tag-as-block indent-gs))))
+ post-l
+ post-line
+ post-delta
+ post-tail-commenting
+ post-tail-raw)]
+ [else
+ (when opener-t (fail opener-t (format "expected `»`")))
+ (values (list (add-raw-to-prefix
+ t in-raw
+ (add-span-srcloc
+ t #f
+ (tag-as-block null))))
+ next-l
+ line
+ delta
+ group-commenting
+ raw)]))
+
+(define (tag-as-block gs)
+ (cond
+ [(and (pair? gs)
+ ;; really only need to check the first one:
+ (for/and ([g (in-list gs)])
+ (and (pair? g)
+ (tag? 'group (car g))
+ (pair? (cdr g))
+ (null? (cddr g))
+ (let ([b (cadr g)])
+ (and (pair? b)
+ (tag? 'bar (car b))
+ ;; the rest should always be true:
+ (pair? (cdr b))
+ (null? (cddr b))
+ (pair? (cadr b))
+ (tag? 'block (caadr b)))))))
+ (cons 'alts (for/list ([g (in-list gs)])
+ (move-pre-raw
+ (car g)
+ (let ([b (cadr g)])
+ (cadr b)))))]
+ [else (cons 'block gs)]))
+
+(define (bars-insert-alts gs)
+ (cond
+ [(and (pair? gs)
+ ;; same check is in `tag-as-block
+ (let ([g (car gs)])
+ (and (pair? g)
+ (tag? 'group (car g))
+ (pair? (cdr g))
+ (null? (cddr g))
+ (let ([b (cadr g)])
+ (and (pair? b)
+ (tag? 'bar (car b))
+ (car b))))))
+ => (lambda (bar-tag)
+ (list (list group-tag
+ (add-span-srcloc
+ bar-tag #f
+ (tag-as-block gs)))))]
+ [else gs]))
+
+(define (tag? sym e)
+ (or (eq? sym e)
+ (and (syntax? e)
+ (eq? sym (syntax-e e)))))
+
+;; Look for `{` (as 'at-opener) next or a `[` that might be followed
+;; by a `{`, and prepare to convert by rearranging info a splice
+;; followed by parentheses
+(define (continue-at at-mode after-bracket? l line delta)
+ (define (at-call rator parens g)
+ (if (eq? at-mode 'no-initial)
+ (cons (move-pre-raw rator
+ (add-raw-to-prefix #f (syntax-to-raw rator) parens))
+ g)
+ (list* rator parens g)))
+ (cond
+ [(not at-mode)
+ (values (lambda (g) g) #f l line delta)]
+ [(and (not after-bracket?)
+ (pair? l)
+ (eq? 'opener (token-name (car l)))
+ (equal? "[" (token-e (car l))))
+ (values (lambda (g)
+ (define a (cadr g))
+ (define tag (car a))
+ (cond
+ [(tag? 'brackets tag)
+ (at-call (car g)
+ (cons (datum->syntax tag 'parens tag tag)
+ (cdr a))
+ (cddr g))]
+ [(eq? at-mode 'no-initial)
+ (add-raw-to-prefix* #f (syntax-to-raw (car g))
+ (cdr g))]
+ [else g]))
+ at-mode l line delta)]
+ [(and (pair? l)
+ (eq? 'at-opener (token-name (car l))))
+ ;; process a `{`...`}` body, handling escapes and then trimming whitespace
+ (define init-t (car l))
+ (let loop ([l (cdr l)] [content '()])
+ (case (if (null? l) 'at-closer (token-name (car l)))
+ [(at-closer)
+ (when (null? l)
+ (fail init-t "missing closer for `@` content"))
+ (define-values (prefix-syntaxes new-content post-syntaxes)
+ (adjust-content-space content group-tag))
+ (define c
+ (list group-tag (add-tail-raw-to-prefix
+ (list (car l))
+ post-syntaxes
+ (cons (datum->syntax (token-value init-t)
+ 'brackets
+ (token-value init-t)
+ (token-value init-t))
+ (add-raw-to-prefix* #f (map syntax-to-raw prefix-syntaxes)
+ new-content)))))
+ (values (lambda (g) (cond
+ [(not after-bracket?)
+ (at-call (car g)
+ (list parens-tag c)
+ (cdr g))]
+ [else
+ (define bracket (caar g))
+ (define new-g (cons (cons parens-tag
+ (append
+ (cdar g)
+ (list
+ (move-post-raw-to-prefix bracket c))))
+ (cdr g)))
+ (move-pre-raw bracket
+ (add-raw-to-prefix* #f (syntax-to-raw bracket)
+ new-g))]))
+ 'initial (if (null? l) null (cdr l)) line delta)]
+ [(at-content)
+ (loop (cdr l)
+ ;; mark as 'content instead of 'group for now, so we
+ ;; can split and trim whitespace after finding all of it
+ (cons (list 'content (token-value (car l)))
+ content))]
+ [(at at-comment)
+ (define t (car l))
+ (define comment? (eq? (token-name t) 'at-comment))
+ ;; `parse-group` work will be delimited by 'at-content or 'at-closer
+ (define-values (g rest-l group-end-line group-delta group-tail-commenting group-tail-raw)
+ (parse-group (if comment?
+ (cons (token-rename t 'at) (cdr l))
+ l)
+ (make-state #:line (token-line t)
+ #:column (token-column t)
+ #:delta 0
+ #:raw null)))
+ (loop rest-l (cons (if comment?
+ (list 'comment (cons (token-raw t) (syntax-to-raw g)))
+ (cons group-tag g))
+ content))]
+ [(comment)
+ (loop (cdr l) (cons (list 'comment (token-e (car l))) content))]
+ [else (error "unexpected in at" (token-name (car l)))]))]
+ [else
+ (values (lambda (g) g) #f l line delta)]))
+
+(define (splice-at t g tail-raw)
+ (define gs (car g))
+ (define at (car gs))
+ (unless (tag? 'at at) (error "expected at"))
+ (when (null? (cdr gs)) (fail t "empty group within within `@«` and `»`"))
+ (unless (null? (cddr gs)) (error "extra groups in at"))
+ (define rest (cdr g))
+ (values
+ (move-pre-raw* at
+ (add-raw-to-prefix* #f (syntax-to-raw at)
+ (append (cdadr gs)
+ (if (null? rest)
+ rest
+ (move-post-raw-to-prefix at rest)))))
+ (if (null? rest)
+ (append tail-raw (list (or (syntax-raw-tail-property at) '())))
+ tail-raw)))
+
+;; Like `datum->syntax`, but propagates the source location of
+;; a start of a list (if any) to the list itself. That starting
+;; item is expected to be a tag that has the span of the whole
+;; term already as its location
+(define (lists->syntax l)
+ (cond
+ [(pair? l)
+ (define a (car l))
+ (define new-l (for/list ([e (in-list l)])
+ (lists->syntax e)))
+ (if (syntax? a)
+ (datum->syntax* #f new-l a stx-for-original-property)
+ (datum->syntax* #f new-l))]
+ [else l]))
+
+;; Consume whitespace and comments, including continuing backslashes,
+;; where lookahead is needed
+;; Arguments/returns:
+;; list of input tokens (after whitespace, comments, and continues)
+;; most recently consumed non-whitesace line (to determine whether
+;; next is on same line); on input; the line can be #f, which means
+;; "treat as same line"; the result is never #f
+;; current line delta (created by continues)
+;; accumulated reversed raw whitespace
+(define (next-of l last-line delta raw)
+ (cond
+ [(null? l) (values null (or last-line 0) delta raw)]
+ [else
+ (define t (car l))
+ (case (token-name t)
+ [(whitespace comment)
+ (next-of (cdr l) last-line delta (cons (car l) raw))]
+ [(continue-operator)
+ (define line (token-line t))
+ ;; a continue operator not followed only by whitespace and
+ ;; comments is just treated as whitespace
+ (define-values (next-l next-raw)
+ (let loop ([l (cdr l)] [raw (cons t raw)])
+ (cond
+ [(null? l) (values null raw)]
+ [else (case (token-name (car l))
+ [(whitespace comment) (loop (cdr l) (cons (car l) raw))]
+ [else (values l raw)])])))
+ (cond
+ [(and (pair? next-l)
+ (eqv? line (token-line (car next-l))))
+ ;; like whitespace:
+ (next-of next-l last-line delta next-raw)]
+ [else
+ (next-of next-l
+ ;; whitespace-only lines don't count, so next continues
+ ;; on the same line by definition:
+ #f
+ (column+ (token-column t)
+ (+ (if (or (not last-line) (eqv? line last-line))
+ delta
+ 0)
+ 1))
+ next-raw)])]
+ [else
+ (define line (token-line t))
+ (values l
+ (or last-line line)
+ (if (or (not last-line) (eqv? line last-line))
+ delta
+ 0)
+ raw)])]))
+
+(define (next-of/commenting l last-line delta raw)
+ (define-values (rest-l rest-last-line rest-delta rest-raw)
+ (next-of l last-line delta raw))
+ (cond
+ [(pair? rest-l)
+ (define-values (group-commenting use-t use-l last-line delta raw)
+ (get-own-line-group-comment (car rest-l) rest-l rest-last-line rest-delta rest-raw))
+ (values group-commenting use-l last-line delta raw)]
+ [else
+ (values #f rest-l rest-last-line rest-delta rest-raw)]))
+
+(define (next-of/opener l last-line delta raw)
+ (define-values (rest-l rest-last-line rest-delta rest-raw)
+ (next-of l last-line delta raw))
+ (cond
+ [(pair? rest-l)
+ (define t (car rest-l))
+ (cond
+ [(and (eq? (token-name t) 'opener)
+ (equal? (token-e t) "«"))
+ (values t (cdr rest-l) (token-line t) rest-delta (cons t rest-raw))]
+ [else
+ (values #f rest-l rest-last-line rest-delta rest-raw)])]
+ [else
+ (values #f rest-l rest-last-line rest-delta rest-raw)]))
+
+;; t is at the start of l on input and output
+(define (get-own-line-group-comment t l line delta raw)
+ (let loop ([commenting #f] [t t] [l (cdr l)] [line line] [delta delta] [raw raw])
+ (case (token-name t)
+ [(group-comment)
+ (cond
+ [((token-line t) . > . line)
+ (define-values (next-l last-line next-delta next-raw)
+ (next-of l line delta (cons t raw)))
+ (cond
+ [(null? next-l) (fail-no-comment-group t)]
+ [(next-line? next-l last-line)
+ (when commenting (fail-no-comment-group commenting))
+ (loop t (car next-l) (cdr next-l) last-line next-delta next-raw)]
+ [else
+ ;; `t` is still in `next-raw`, but this is going to lead to an error
+ (values commenting t (cons t next-l) last-line next-delta next-raw)])]
+ [else
+ (fail t "misplaced group comment")])]
+ [else
+ (values commenting t (cons t l) line delta raw)])))
+
+(define (next-line? l last-line)
+ (and (pair? l)
+ (let ([line (token-line (car l))])
+ (and line
+ (line . > . last-line)))))
+
+(define (fail-no-comment-group t)
+ (fail t "no group for term comment"))
+
+;; Report an error on failure, but then keep parsing anyway
+;; if in recover mode
+(define current-recover-mode (make-parameter #f))
+(define (fail tok msg)
+ (define loc (if (syntax? tok)
+ (srcloc (syntax-source tok)
+ (syntax-line tok)
+ (syntax-column tok)
+ (syntax-position tok)
+ (syntax-span tok))
+ (token-srcloc tok)))
+ (cond
+ [(current-recover-mode)
+ (log-error "~a: ~a" (srcloc->string loc) msg)]
+ [else
+ (raise
+ (exn:fail:read
+ (if (error-print-source-location)
+ (format "~a: ~a" (srcloc->string loc) msg)
+ msg)
+ (current-continuation-marks)
+ (list loc)))]))
+
+(define (column-next c)
+ (and c
+ (if (integer? c)
+ (add1 c)
+ (add1 (inexact->exact (floor c))))))
+
+(define (column-half-next c)
+ (if (integer? c)
+ (+ c 0.5)
+ (column-next c)))
+
+(define (column+ c n)
+ (and c (+ c n)))
+
+(define (next-block-mode mode)
+ (if (eq? mode 'no) 'no #f))
+
+;; ----------------------------------------
+
+(define (add-span-srcloc start-t end-t l #:alt [alt-start-t #f])
+ (define (add-srcloc l loc)
+ (cons (let ([stx (datum->syntax* #f (car l) loc stx-for-original-property)])
+ (if (syntax? start-t)
+ (let* ([stx (syntax-property-copy stx start-t syntax-raw-property)]
+ [stx (syntax-property-copy stx start-t syntax-raw-prefix-property)])
+ stx)
+ stx))
+ (cdr l)))
+ (define last-t/e (or end-t
+ ;; when `end-t` is false, we go looking for the
+ ;; end in `l`; this search walks down the end
+ ;; of `l`, and it may recur into the last element,
+ ;; but it should go only a couple of levels that way,
+ ;; since non-`group` tags have spanning locations
+ ;; gather from their content
+ (let loop ([e l])
+ (cond
+ [(syntax? e) e]
+ [(not (pair? e)) #f]
+ [(null? (cdr e))
+ (define a (car e))
+ (if (and (pair? a)
+ (syntax? (car a)))
+ ;; found a tag like `block`
+ (car a)
+ (loop a))]
+ [else (loop (cdr e))]))))
+ (define s-loc (cond
+ [(not start-t)
+ (token-srcloc alt-start-t)]
+ [(syntax? start-t)
+ (syntax-srcloc start-t)]
+ [else
+ (token-srcloc start-t)]))
+ (define e-loc (and last-t/e
+ (if (syntax? last-t/e)
+ (syntax-srcloc last-t/e)
+ (token-srcloc last-t/e))))
+ (define s-position (srcloc-position s-loc))
+ (add-srcloc l (srcloc (srcloc-source s-loc)
+ (srcloc-line s-loc)
+ (srcloc-column s-loc)
+ s-position
+ (if e-loc
+ (let ([s s-position]
+ [e (srcloc-position e-loc)]
+ [sp (srcloc-span e-loc)])
+ (and s e sp
+ (+ (max (- e s) 0) sp)))
+ (srcloc-span s-loc)))))
+
+(define (token-raw t)
+ (define value (token-value t))
+ (or (syntax-raw-property value)
+ (syntax-e value)))
+
+(define (record-raw stx t pre-raw)
+ (define stx+raw (if t
+ (syntax-raw-property stx (raw-cons (token-raw t)
+ (or (syntax-raw-property stx) '())))
+ (if (syntax-raw-property stx)
+ stx
+ (syntax-raw-property stx '()))))
+ (if (null? pre-raw)
+ stx+raw
+ (syntax-raw-prefix-property stx+raw (raw-cons (raw-tokens->raw pre-raw)
+ (or (syntax-raw-prefix-property stx) '())))))
+
+(define (add-raw-tail top raw)
+ (if (null? raw)
+ top
+ (datum->syntax* top
+ (cons (syntax-raw-tail-property (car (syntax-e top)) (raw-tokens->raw raw))
+ (cdr (syntax-e top)))
+ top
+ top)))
+
+(define (add-pre-raw stx pre-raw)
+ (if (null? pre-raw)
+ stx
+ (syntax-raw-prefix-property stx (raw-tokens->raw pre-raw))))
+
+(define (move-pre-raw from-stx to)
+ (define pre-raw (and (syntax? from-stx)
+ (syntax-raw-prefix-property from-stx)))
+ (cond
+ [pre-raw
+ (define a (car to))
+ (cons (syntax-raw-prefix-property a (raw-cons pre-raw
+ (or (syntax-raw-prefix-property a) '())))
+ (cdr to))]
+ [else to]))
+
+(define (move-pre-raw* from-stx to)
+ (cond
+ [(syntax? (car to)) (move-pre-raw from-stx to)]
+ [else (cons (move-pre-raw* from-stx (car to))
+ (cdr to))]))
+
+(define (move-post-raw from-stx to)
+ (define post-raw (and (syntax? from-stx)
+ (syntax-raw-tail-property from-stx)))
+ (cond
+ [post-raw
+ (define a (datum->syntax* #f (car to)))
+ (cons (syntax-raw-tail-property a (raw-cons (or (syntax-raw-tail-property a) '())
+ post-raw))
+ (cdr to))]
+ [else to]))
+
+(define (move-post-raw-to-prefix from-stx to)
+ (define post-raw (and (syntax? from-stx)
+ (syntax-raw-tail-property from-stx)))
+ (cond
+ [post-raw
+ (define a (datum->syntax* #f (car to)))
+ (cons (syntax-raw-prefix-property a (raw-cons post-raw
+ (or (syntax-raw-prefix-property a) '())))
+ (cdr to))]
+ [else to]))
+
+(define (raw-tokens->raw pre-raw)
+ (for/list ([raw-t (in-list (reverse pre-raw))])
+ (if (token? raw-t)
+ (token-raw raw-t)
+ raw-t)))
+
+(define (add-raw-to-prefix t pre-raw l #:tail [post-raw #f])
+ (cons (let ([stx (record-raw (car l) t pre-raw)])
+ (if post-raw
+ (syntax-raw-tail-property stx (raw-cons (or (syntax-raw-tail-property stx) '())
+ (raw-tokens->raw post-raw)))
+ stx))
+ (cdr l)))
+
+(define (add-raw-to-prefix* t pre-raw l)
+ (cond
+ [(syntax? (car l)) (add-raw-to-prefix t pre-raw l)]
+ [else (cons (add-raw-to-prefix* t pre-raw (car l))
+ (cdr l))]))
+
+(define (add-tail-raw-to-prefix post-raw post-stxes l)
+ (cons (syntax-raw-tail-property (datum->syntax* #f (car l))
+ (raw-cons (map syntax-to-raw post-stxes)
+ (raw-tokens->raw post-raw)))
+ (cdr l)))
+
+(define (raw-cons a b)
+ (cond
+ [(null? a) b]
+ [(null? b) a]
+ [else (cons a b)]))
+
+(define (syntax-property-copy dest src prop)
+ (define v (prop src))
+ (if v
+ (prop dest v)
+ dest))
+
+;; like `datum->syntax`, but ensures that a sequence of pairs is not
+;; too long before there's a syntax pair
+(define (datum->syntax* ctx v [src #f] [props #f])
+ (datum->syntax
+ ctx
+ (let loop ([v v] [depth 0])
+ (cond
+ [(pair? v)
+ (cond
+ [(depth . >= . 32)
+ (datum->syntax ctx
+ (cons (loop (car v) 0) (loop (cdr v) 0))
+ src)]
+ [else
+ (cons (loop (car v) 0) (loop (cdr v) (add1 depth)))])]
+ [else v]))
+ src
+ props))
+
+;; ----------------------------------------
+
+;; check that line-counting is consistent (always on or always off),
+;; and when it's off, make sure there are no newlines except maybe at
+;; the beginning and/or end
+(define (check-line-counting l)
+ (unless (null? l)
+ (define (fail-inconsistent t)
+ (fail t "port did not consistently report lines and columns"))
+ (cond
+ [(and (token-line (car l))
+ (token-column (car l)))
+ (for ([t (in-list l)])
+ (unless (and (token-line t)
+ (token-column t))
+ (fail-inconsistent t)))]
+ [else
+ (let loop ([l l] [saw-non-ws? #f] [newline-t #f])
+ (unless (null? l)
+ (define t (car l))
+ (when (or (token-line t)
+ (token-column t))
+ (fail-inconsistent t))
+ (case (token-name t)
+ [(whitespace comment continue-operator)
+ (loop (cdr l)
+ saw-non-ws?
+ (and saw-non-ws?
+ (or newline-t
+ (and (regexp-match? #rx"[\r\n]" (syntax-e (token-value t)))
+ t))))]
+ [else
+ (when newline-t
+ (fail newline-t "port does not count lines, but input includes a newline"))
+ (loop (cdr l) #t #f)])))])))
+
+;; ----------------------------------------
+
+(define (parse-all in #:source [source (object-name in)])
+ (define l (lex-all in fail #:source source))
+ (check-line-counting l)
+ (if (null? l)
+ eof
+ (parse-top-groups l)))
+
+(module+ main
+ (require racket/cmdline
+ "print.rkt")
+
+ (define show-raw? #f)
+ (define one-line? #f)
+
+ (define (parse-all* in)
+ (unless one-line?
+ (port-count-lines! in))
+ (define e (parse-all in))
+ (unless (eof-object? e)
+ (cond
+ [show-raw?
+ (for ([s (syntax->list e)])
+ (printf "#|\n~a\n|#\n" (shrubbery-syntax->string s))
+ (pretty-write (syntax->datum s)))]
+ [else
+ (pretty-write
+ (syntax->datum e))])))
+
+ (command-line
+ #:once-each
+ [("--recover") "Continue parsing after an error"
+ (current-recover-mode #t)]
+ [("--raw") "Show raw strings when printing"
+ (set! show-raw? #t)]
+ [("--one-line") "Disable line counting to assume a single line"
+ (set! one-line? #t)]
+ #:args file
+ (if (null? file)
+ (parse-all* (current-input-port))
+ (for-each (lambda (file)
+ (call-with-input-file*
+ file
+ parse-all*))
+ file))))
diff --git a/shrubbery/print.rkt b/shrubbery/print.rkt
new file mode 100644
index 000000000..f7d1a500f
--- /dev/null
+++ b/shrubbery/print.rkt
@@ -0,0 +1,97 @@
+#lang racket/base
+(require "private/property.rkt")
+
+;; Printing syntax object using raw-text properties
+
+(provide shrubbery-syntax->string)
+
+(module+ for-parse
+ (provide syntax-to-raw))
+
+(define (shrubbery-syntax->string s #:max-length [max-length #f])
+ (cond
+ [(all-raw-available? s)
+ (define o (open-output-string))
+ (define (full?)
+ (and max-length
+ ((file-position o) . > . max-length)))
+ (let loop ([l (syntax-to-raw s)])
+ (cond
+ [(pair? l)
+ (unless (full?)
+ (loop (car l))
+ (unless (full?)
+ (loop (cdr l))))]
+ [(null? l) (void)]
+ [(string? l) (display l o)]
+ [else (void)]))
+ (define orig-str (get-output-string o))
+ (define starting-col (extract-starting-column s))
+ ;; strip `string-col` spaces from the start of lines after the first one:
+ (define str (regexp-replace* (string-append "\n" (make-string starting-col #\space))
+ orig-str
+ "\n"))
+ (if (and max-length
+ ((string-length str) . > . max-length))
+ (string-append (substring str 0 (max 0 (- max-length 3)))
+ "...")
+ str)]
+ [else
+ (define v (if (syntax? s) (syntax->datum s) s))
+ (if max-length
+ (parameterize ([error-print-width max-length])
+ (format "~.s" v))
+ (format "~s" v))]))
+
+(define (syntax-to-raw g)
+ (let loop ([g g] [tail null] [use-prefix? #f])
+ (cond
+ [(null? g) tail]
+ [(pair? g)
+ (define a-stx (car g))
+ (define post (and (syntax? a-stx)
+ (syntax-raw-tail-property a-stx)))
+ (define a (loop a-stx null use-prefix?))
+ (define d (loop (cdr g)
+ (if post
+ (if (null? tail)
+ post
+ (cons tail post))
+ tail)
+ #t))
+ (if (null? a) d (cons a d))]
+ [(syntax? g)
+ (define pre (and use-prefix?
+ (syntax-raw-prefix-property g)))
+ (define r (syntax-raw-property g))
+ (define raw (if (and pre r)
+ (cons pre r)
+ (or pre r null)))
+ (define d (loop (syntax-e g) tail use-prefix?))
+ (if (null? raw) d (cons raw d))]
+ [else tail])))
+
+(define (all-raw-available? s)
+ (cond
+ [(syntax? s)
+ (or (syntax-raw-property s)
+ (let ([e (syntax-e s)])
+ (or (and (pair? e)
+ (all-raw-available? e))
+ (null? e)))
+ #;
+ (and (log-error "?? ~s" s)
+ #f))]
+ [(pair? s) (and (all-raw-available? (car s))
+ (all-raw-available? (cdr s)))]
+ [else #t]))
+
+(define (extract-starting-column s)
+ (cond
+ [(syntax? s)
+ (or (syntax-column s)
+ (let ([e (syntax-e s)])
+ (and (pair? e)
+ (extract-starting-column (car e))))
+ 0)]
+ [else 0]))
diff --git a/shrubbery/private/at-space.rkt b/shrubbery/private/at-space.rkt
new file mode 100644
index 000000000..70588a457
--- /dev/null
+++ b/shrubbery/private/at-space.rkt
@@ -0,0 +1,176 @@
+#lang racket/base
+(require "property.rkt")
+
+(provide adjust-content-space)
+
+(define (adjust-content-space rev group-tag)
+ (let*-values ([(rev) (remove-trailing-spaces rev)]
+ [(rev end2) (discard-immediate-space rev)]
+ [(rev end1) (discard-immediate-newline rev #f)])
+ (let*-values ([(content) (reverse rev)]
+ [(content start1) (discard-immediate-space content)]
+ [(content start2) (discard-immediate-newline content #f)])
+ (let* ([min-col (get-min-column content)]
+ [content (if (eqv? min-col 0)
+ content
+ (trim-shared-column content min-col (pair? start2)))])
+ (let-values ([(content comments) (convert-content content group-tag)])
+ (values (append start1 start2)
+ content
+ (append (reverse comments) end1 end2)))))))
+
+(define rx:whitespace #px"^\\s*$")
+
+(define (is-newline? c)
+ (and (eq? 'content (car c))
+ (equal? "\n" (syntax-e (cadr c)))))
+
+;; discard space immediately before closer or after opener
+(define (discard-immediate-space lst)
+ (cond
+ [(and (pair? lst)
+ (eq? 'content (caar lst))
+ (pair? (cdr lst))
+ (let ([prev (cadr lst)])
+ (and (eq? 'content (car prev))
+ (equal? "\n" (syntax-e (cadr prev)))))
+ (regexp-match? rx:whitespace (syntax-e (cadar lst))))
+ (values (cdr lst) (cdar lst))]
+ [else
+ (values lst null)]))
+
+;; discard newline just before closer,
+;; unless a newline is all there will be
+(define (discard-immediate-newline lst check-next-ws?)
+ (cond
+ [(and (pair? lst)
+ (is-newline? (car lst))
+ (not (null? (cdr lst)))
+ (not (and check-next-ws?
+ (null? (cddr lst))
+ (let ([prev (cadr lst)])
+ (and (eq? 'content (car prev))
+ (regexp-match? rx:whitespace (cadr prev)))))))
+ (values (cdr lst) (cdar lst))]
+ [else
+ (values lst null)]))
+
+(define (remove-trailing-spaces rev)
+ (define (trailing-space-count s)
+ (let loop ([i (string-length s)])
+ (cond
+ [(eqv? i 0) 0]
+ [(char-whitespace? (string-ref s (sub1 i))) (add1 (loop (sub1 i)))]
+ [else 0])))
+ (let loop ([rev rev])
+ (cond
+ [(null? rev) rev]
+ [(is-newline? (car rev))
+ (define next (cdr rev))
+ (cons (car rev)
+ (cond
+ [(null? next) null]
+ [(is-newline? (car next)) (loop next)]
+ [(eq? 'content (caar next))
+ (define n (trailing-space-count (syntax-e (cadar next))))
+ (cond
+ [(eqv? n 0) (loop next)]
+ [else
+ (define a (cadar next))
+ (define s (syntax-e a))
+ (cons (list 'content
+ (datum->syntax a
+ (substring s 0 (- (string-length s) n))
+ a
+ a))
+ (loop (cdr next)))])]
+ [else (loop next)]))]
+ [else (cons (car rev) (loop (cdr rev)))])))
+
+(define (leading-space-count s)
+ (let loop ([i 0])
+ (cond
+ [(eqv? i (string-length s)) 0]
+ [(char-whitespace? (string-ref s i)) (add1 (loop (add1 i)))]
+ [else 0])))
+
+(define (get-min-column lst)
+ (let loop ([lst lst] [min-col #f] [saw-nl? #t])
+ (cond
+ [(null? lst) min-col]
+ [(is-newline? (car lst))
+ (loop (cdr lst) min-col #t)]
+ [(and saw-nl?
+ (eq? 'content (caar lst)))
+ (define stx (cadar lst))
+ (define s (syntax-e stx))
+ (define n (+ (leading-space-count s) (or (syntax-column stx) 0)))
+ (loop (cdr lst) (min n (or min-col n)) #f)]
+ [else
+ (loop (cdr lst) min-col #f)])))
+
+(define (trim-shared-column lst min-col start-nl?)
+ (let loop ([lst lst] [saw-nl? start-nl?])
+ (cond
+ [(null? lst) null]
+ [(is-newline? (car lst))
+ (cons (car lst)
+ (loop (cdr lst) #t))]
+ [(and saw-nl?
+ (eq? 'content (caar lst)))
+ (define a (cadar lst))
+ (define s (syntax-e a))
+ (define col (or (syntax-column a) 0))
+ (define n (leading-space-count s))
+ (define add-back-n (max 0 (- n min-col)))
+ (define trimmed-s (if (eqv? n 0)
+ s
+ (substring s n)))
+ (define trimmed
+ (cons (list 'content
+ (datum->syntax a
+ trimmed-s
+ a
+ a))
+ (loop (cdr lst) #f)))
+ (if (eqv? add-back-n 0)
+ trimmed
+ (cons (list 'content
+ (syntax-raw-property (datum->syntax a
+ (make-string add-back-n #\space)
+ a)
+ '()))
+ trimmed))]
+ [else (cons (car lst) (loop (cdr lst) #f))])))
+
+(define (convert-content lst group-tag)
+ (let loop ([lst lst] [accum '()] [comments '()])
+ (cond
+ [(null? lst) (values (reverse accum) (reverse comments))]
+ [else
+ (define c (car lst))
+ (cond
+ [(eq? 'comment (car c))
+ (loop (cdr lst) accum (cons (cadr c) comments))]
+ [(eq? 'content (car c))
+ (loop (cdr lst)
+ (cons (add-comments (cons group-tag (cdr c))
+ comments)
+ accum)
+ null)]
+ [else
+ (loop (cdr lst)
+ (cons (add-comments c comments)
+ accum)
+ null)])])))
+
+(define (add-comments c comments)
+ (cond
+ [(null? comments) c]
+ [else
+ (define stx (car c))
+ (cons (syntax-raw-prefix-property stx
+ (cons (reverse comments)
+ (or (syntax-raw-prefix-property stx) '())))
+ (cdr c))]))
+
diff --git a/shrubbery/private/edit-help.rkt b/shrubbery/private/edit-help.rkt
new file mode 100644
index 000000000..c2fba84dc
--- /dev/null
+++ b/shrubbery/private/edit-help.rkt
@@ -0,0 +1,103 @@
+#lang racket/base
+(require racket/class)
+
+(provide opener?
+ line-start
+ line-delta
+ col-of
+ only-whitespace-between?
+ get-block-column)
+
+(define (opener? s)
+ (member s '("(" "{" "[" "«")))
+
+(define (line-start t pos)
+ (send t paragraph-start-position (send t position-paragraph pos #t)))
+
+(define (line-delta t start #:unless-empty? [unless-empty? #f])
+ (let loop ([pos start])
+ (cond
+ [(eqv? pos 0) 0]
+ [else
+ (case (send t classify-position (sub1 pos))
+ [(white-space comment)
+ (define-values (s e) (send t get-token-range (sub1 pos)))
+ (loop s)]
+ [(continue-operator)
+ ;; since we've only skipped comments and whitespace, this
+ ;; continue operator applies
+ (define-values (s e) (send t get-token-range (sub1 pos)))
+ (define c-start (line-start t s))
+ (define more-delta (line-delta t c-start #:unless-empty? unless-empty?))
+ (and more-delta
+ (if unless-empty?
+ (not (only-whitespace-between? t c-start s #:or-ws-like? #t))
+ #t)
+ (+ (- e c-start) more-delta))]
+ [else 0])])))
+
+;; skip back over whitespace and comments to find `\`
+(define (col-of pos start delta)
+ (+ (- pos start) delta))
+
+(define (only-whitespace-between? t s-pos e-pos
+ #:or-ws-like? [or-ws-like? #f])
+ (let loop ([pos s-pos])
+ (and (case (send t classify-position pos)
+ [(white-space) #t]
+ [(comment continue-operator) or-ws-like?]
+ [else #f])
+ (let ()
+ (define-values (s e) (send t get-token-range pos))
+ (or (e . >= . e-pos)
+ (loop e))))))
+
+;; find the current indentation of the block that starts at or before `pos`,
+;; a long as the block continues (not nested) on the line at `at-start`;
+;; if `for-outdent?` is true, don't treat leading operators as the start
+(define (get-block-column t pos candidate at-start
+ #:for-outdent? [for-outdent? #t])
+ (let loop ([pos pos] [candidate candidate] [at-start at-start])
+ (define pos-start (line-start t pos))
+ (cond
+ [(pos-start . < . at-start)
+ candidate]
+ [else
+ (define-values (s e) (send t get-token-range pos))
+ (define category (send t classify-position s))
+ (case category
+ [(white-space comment continue-operator)
+ (if (zero? s)
+ (or candidate 0)
+ (loop (sub1 s) candidate at-start))]
+ [(parenthesis)
+ (cond
+ [(opener? (send t get-text (sub1 e) e))
+ candidate]
+ [else
+ ;; Found parenthesized while walking backward
+ (define r (send t backward-match e 0))
+ (cond
+ [(not r)
+ (define start (line-start t pos))
+ (define delta (line-delta t start))
+ (loop (sub1 s) (col-of s start delta) at-start)]
+ [(zero? r) (list 0)]
+ [else
+ (define start (line-start t r))
+ (define delta (line-delta t start))
+ (loop (sub1 r) (col-of r start delta) start)])])]
+ [(block-operator bar-operator separator) candidate]
+ [else
+ (cond
+ [(and #f ;; not sure why special-casing operators seemed like a good idea
+ for-outdent?
+ (eq? category 'operator))
+ (if (zero? s)
+ (or candidate 0)
+ (loop (sub1 s) candidate at-start))]
+ [(zero? s) 0]
+ [else
+ (define start (line-start t pos))
+ (define delta (line-delta t start))
+ (loop (sub1 s) (col-of s start delta) start)])])])))
diff --git a/shrubbery/private/property.rkt b/shrubbery/private/property.rkt
new file mode 100644
index 000000000..fb190570d
--- /dev/null
+++ b/shrubbery/private/property.rkt
@@ -0,0 +1,20 @@
+#lang racket/base
+
+(provide syntax-raw-property
+ syntax-raw-prefix-property
+ syntax-raw-tail-property)
+
+(define syntax-raw-property
+ (case-lambda
+ [(stx) (syntax-property stx 'raw)]
+ [(stx val) (syntax-property stx 'raw val #t)]))
+
+(define syntax-raw-prefix-property
+ (case-lambda
+ [(stx) (syntax-property stx 'raw-prefix)]
+ [(stx val) (syntax-property stx 'raw-prefix val #t)]))
+
+(define syntax-raw-tail-property
+ (case-lambda
+ [(stx) (syntax-property stx 'raw-tail)]
+ [(stx val) (syntax-property stx 'raw-tail val #t)]))
diff --git a/shrubbery/srcloc.rkt b/shrubbery/srcloc.rkt
new file mode 100644
index 000000000..1e66f06f0
--- /dev/null
+++ b/shrubbery/srcloc.rkt
@@ -0,0 +1,22 @@
+#lang racket/base
+(require (for-syntax racket/base))
+
+(provide syntax-srcloc)
+
+(define-syntax (get stx)
+ (syntax-case stx ()
+ [(_ syntax-srcloc)
+ (cond
+ [(identifier-binding #'syntax-srcloc)
+ #'(begin)]
+ [(file-exists? (collection-file-path "syntax-srcloc.rkt" "racket"))
+ #`(require #,(datum->syntax #'syntax-srcloc 'racket/syntax-srcloc))]
+ [else
+ #'(define (syntax-srcloc stx)
+ (srcloc (syntax-source stx)
+ (syntax-line stx)
+ (syntax-column stx)
+ (syntax-position stx)
+ (syntax-span stx)))])]))
+
+(get syntax-srcloc)
diff --git a/shrubbery/syntax-color.rkt b/shrubbery/syntax-color.rkt
new file mode 100644
index 000000000..db2516314
--- /dev/null
+++ b/shrubbery/syntax-color.rkt
@@ -0,0 +1,8 @@
+#lang racket/base
+(require "lex.rkt"
+ syntax-color/racket-lexer)
+
+(provide shrubbery-lexer)
+
+(define (shrubbery-lexer in pos status)
+ (lex/status in pos status racket-lexer/status))
diff --git a/shrubbery/tests/indent.rkt b/shrubbery/tests/indent.rkt
new file mode 100644
index 000000000..58f705970
--- /dev/null
+++ b/shrubbery/tests/indent.rkt
@@ -0,0 +1,379 @@
+#lang at-exp racket/base
+(require racket/class
+ "../lex.rkt"
+ "../indentation.rkt"
+ (for-syntax racket/base))
+
+(define failed? #f)
+
+(define TOKEN-SLOT 0)
+(define TYPE-SLOT 1)
+
+;; fix 1-based indexing...
+(define (srcloc-0position loc)
+ (sub1 (srcloc-position loc)))
+
+(define like-text%
+ (class object%
+ (init-field content)
+
+ (super-new)
+
+ (define-values (position-paragraphs paragraph-starts)
+ (let loop ([pos 0] [para 0] [pos-para #hasheqv()] [para-pos #hasheqv((0 . 0))])
+ (cond
+ [(= pos (string-length content))
+ (values (hash-set pos-para pos para) para-pos)]
+ [(char=? #\newline (string-ref content pos))
+ (loop (add1 pos) (add1 para)
+ (hash-set pos-para pos para)
+ (hash-set para-pos (add1 para) (add1 pos)))]
+ [else
+ (loop (add1 pos) para (hash-set pos-para pos para) para-pos)])))
+
+ (define tokens
+ (lex-all (let ([p (open-input-string content)])
+ (port-count-lines! p)
+ p)
+ (lambda args
+ (error "unexpected lex failure: ~s" args))
+ #:keep-type? #t))
+
+ ;; position -> token
+ (define mapping
+ (let loop ([tokens tokens] [pos 0] [mapping #hasheqv()])
+ (cond
+ [(null? tokens) mapping]
+ [else
+ (define t+type (car tokens))
+ (define t (vector-ref t+type TOKEN-SLOT))
+ (define loc (token-srcloc t))
+ (unless (= pos (srcloc-0position loc))
+ (error 'test "token discontinuity ~s vs. ~v @ ~s" pos (token-e t) (srcloc-0position loc)))
+ (loop (cdr tokens)
+ (+ pos (srcloc-span loc))
+ (for/fold ([mapping mapping]) ([i (in-range (srcloc-span loc))])
+ (hash-set mapping (+ pos i) t+type)))])))
+
+ (define/public (get-text s e)
+ (substring content s e))
+
+ (define/public (classify-position pos)
+ (define t+type (or (hash-ref mapping pos #f)
+ (error 'classify-position "lookup failed: ~e" pos)))
+ (vector-ref t+type TYPE-SLOT))
+
+ (define/public (get-token-range pos)
+ (define t+type (or (hash-ref mapping pos #f)
+ (error 'get-token-range "lookup failed: ~e" pos)))
+ (define t (vector-ref t+type TOKEN-SLOT))
+ (define loc (token-srcloc t))
+ (values (srcloc-0position loc)
+ (+ (srcloc-0position loc) (srcloc-span loc))))
+
+ (define/public (last-position)
+ (string-length content))
+
+ (define/public (position-paragraph pos [eol? #f])
+ (or (hash-ref position-paragraphs pos #f)
+ (error 'position-paragraph "lookup failed: ~e" pos)))
+
+ (define/public (paragraph-start-position para)
+ (or (hash-ref paragraph-starts para #f)
+ (error 'paragraph-start-position "lookup failed: ~e" para)))
+
+ (define/public (backward-match pos cutoff)
+ (let loop ([pos (sub1 pos)] [depth -1] [need-close? #t])
+ (cond
+ [(pos . < . 0) #f]
+ [else
+ (define-values (s e) (get-token-range pos))
+ (define category (classify-position pos))
+ (case category
+ [(parenthesis)
+ (case (get-text s e)
+ [("{" "(" "[" "«") (and (not need-close?)
+ (if (= depth 0)
+ s
+ (loop (sub1 s) (sub1 depth) #f)))]
+ [("}" ")" "]" "»") (loop (sub1 s) (add1 depth) #f)]
+ [else (error "unexpected parenthesis-class text")])]
+ [(whitespace comment)
+ (loop (sub1 s) depth need-close?)]
+ [else (if need-close?
+ s
+ (loop (sub1 s) depth #f))])])))))
+
+
+(define (check e line)
+ (printf "Checking ~s\n" line)
+ ;; replacing each "^" by two spaces pushes the token too
+ ;; far to the right, which makes the `#:multi?` result
+ ;; keep its original sorting right-to-left:
+ (define clean-e (regexp-replace* #rx"\\!" (regexp-replace* #rx"\\^" e " ") "x"))
+ (define m (regexp-match-positions #rx"[ ]*(?:\\^|[!])" e))
+ (define start-pos (caar m))
+ (define expected (reverse
+ (let loop ([pos start-pos])
+ (cond
+ [(= pos (string-length e)) '()]
+ [(memv (string-ref e pos) '(#\^ #\!))
+ (cons (- pos start-pos) (loop (add1 pos)))]
+ [else (loop (add1 pos))]))))
+ (define t (new like-text% [content clean-e]))
+ (define raw-candidates (shrubbery-indentation t start-pos #:multi? #t))
+ (define candidates (if (list? raw-candidates) raw-candidates (list raw-candidates)))
+ (unless (equal? candidates expected)
+ (set! failed? #t)
+ (eprintf (string-append
+ "failed at line ~a\n"
+ " expected: ~a\n"
+ " received: ~a\n")
+ line
+ expected
+ candidates)))
+
+(define-syntax-rule (check-all e ...)
+ (begin
+ (check e (at e))
+ ...))
+
+(define-syntax (at stx)
+ (syntax-case stx ()
+ [(_ e) #`#,(syntax-line #'e)]))
+
+(define e string-append)
+
+(check-all
+
+ ;; In these tests, the last line of each is the one to be indented,
+ ;; and the "^"s (which will be replaced by two spaces before applying
+ ;; the indenter) or "!"s (replaced by an "x") are the expected
+ ;; indentation candidates. The content after the last "^" or "!", if
+ ;; any, can affect the indenters results (e.g., the answer tends to
+ ;; be different if it starts "|"). Whitespace after a final "^" or
+ ;; "!" will not matter.
+
+ @e{a
+ ^x}
+
+ @e{a:
+ ^ ^x}
+
+ @e{apple:
+ ^ ^x}
+
+ @e{apple: banana:
+ ^ ^ ^x}
+
+ @e{apple: :
+ ^ ^ ^x}
+
+ @e{(a:
+ ^ ^x}
+
+ @e{(:
+ ^ ^x}
+
+ @e{[a:
+ ^ ^x}
+
+ @e{[:
+ ^ ^x}
+
+ @e|{{a:
+ ^ ^x}|
+
+ @e|{{:
+ ^ ^x}|
+
+ @e{define pi:
+ ^ ^x}
+
+ @e{define pi: 3.14
+ ^ ^x}
+
+ @e{define fib(n):
+ cond | n == 0: 0
+ ^ ^|}
+
+ @e{define fib(n):
+ log_error("fib called")
+ cond | n == 0: 0
+ ^ ^|}
+
+ @e{define
+ | fib(0): 0
+ | fib(1): 1
+ | fib(n): fib(n-1) // in parens => `+` continues this line
+ ^ ^ ^ ^+ fib(n-2))}
+
+ @e{define
+ | fib(0): 0
+ | fib(1): 1
+ | fib(n): (fib(n-1) // in parens => `+` continues this line
+ ^ ^+ fib(n-2))}
+
+ @e{define fib:
+ lambda (n):
+ cond
+ | n == 0: 0
+ ^ ^| n == 1: 1}
+
+ @e|{define fib(n):
+ match n { | 0 { 0 }
+ ^| 1 { 1 }}|
+
+ @e{1 + x: « 3
+ ^4 »}
+
+ @e{1 + x: « 3
+ ^ ^+ 4 »}
+
+ @e{1 + x: 3
+ ^ ^ ^+ 4}
+
+ @e{x: 1 + x: 3
+ ^ ^ ^ ^+ 4}
+
+ @e{x: 1 + x | 3
+ ^ ^ ^ ^+ 4}
+
+ @e{define analyze(n):
+ if n == 0
+ | printf("zero\n")
+ ^ ^ ^printf}
+
+ @e{define go():
+ define more(m):
+ if m == 0 | "done"
+ ^|}
+
+ @e{define go():
+ define more(m):
+ if m == 0 | "done"
+ ^ ^ ^ ^x}
+
+
+ @e{define approx_thunk(x):
+ match x
+ | something(v): lambda
+ | (): v
+ ^ ^ ^| (n): v+n}
+
+ @e{define colors:
+ list(
+ ^red,}
+
+ @e{define colors:
+ list(
+ red,
+ ^green,}
+
+ @e{define colors:
+ list(
+ red
+ ^)}
+
+ @e{define colors:
+ list(red,
+ ^green,}
+
+ @e{this is a \
+ !ong linear group}
+
+ @e{this is a \
+ very long linear group \
+ !hat spans multiple lines}
+
+ @e{this is | one: a \
+ long result
+ ^ ^| two}
+
+ @e{this is | one: a \
+ long result
+ | two \
+ !lso long}
+
+ @e{this is a group \ with (a,
+ ^nested}
+
+ @e{this is a group \
+ with (a,
+ ^nested}
+
+ @e{this is a group \
+ with (a,
+ \
+ ^nested}
+
+ @e{hello | a | c\
+ :
+ ^ ^ ^d}
+
+ @e{nonsense:
+ hello | there 4.5
+ | more f(8)
+ next (ok |
+ ^|}
+
+ @e{nonsense:
+ hello | there 4.5
+ | more f(8)
+ next (ok |
+ | stuff: (begin:
+ more
+ ^ ^things}
+
+ @e{x + w | z : :
+ ^ ^ ^ ^y}
+
+ @e{x something | a
+ y:
+ w:
+ ^ ^ ^ ^q}
+
+ @e{z: | x
+ ^ ^y}
+
+ @e{z:
+ | x
+ ^ ^y}
+
+ @e|{|z {
+ | x
+ ^y|}|
+
+ @e{define fib(n):
+ match n
+ | 0
+ ^ ^ ^: 0}
+
+ @e{+ `(
+ ^x}
+
+ @e{+ `(
+ x
+ ^)}
+
+ @e{+ x`(
+ ^x}
+
+ @e{+ x`(
+ x
+ ^)}
+
+ @e{(match v
+ | cons(bv, av):
+ define values(is_a_match, ar):
+ ¿a_pred(av)
+ if is_a_match
+ ^ ^| define}
+
+ @e{a |« b»
+ ^ c}
+
+ )
+
+(when failed?
+ (error "failed"))
diff --git a/shrubbery/tests/parse.rkt b/shrubbery/tests/parse.rkt
new file mode 100644
index 000000000..2b2fc517a
--- /dev/null
+++ b/shrubbery/tests/parse.rkt
@@ -0,0 +1,1683 @@
+#lang racket/base
+(require "../parse.rkt"
+ "../print.rkt"
+ "../write.rkt"
+ racket/pretty)
+
+(define input1
+#< (a.x == b.x && a.y == b.y)
+ define hash(a):
+ 17
+ define secondary_hash(a):
+ 19
+
+struct posn(x, y):
+ property prop_equal_and_hash:
+ let (hc = lambda (a: posn, hc):
+ hc(a.x) + hc(a.y),
+ eql = lambda (a: posn, b: posn, eql):
+ eql(a.x, b.x) && eql(a.y, b.y)):
+ values(eql, hc, hc)
+
+struct posn(x, y):
+ property prop_equal_and_hash:
+ let (hc = lambda (a: posn, hc):
+ hc(a.x) + hc(a.y)):
+ (lambda (a: posn, b: posn, eql):
+ eql(a.x, b.x) && eql(a.y, b.y),
+ hc,
+ hc)
+
+// Another possibile approach to syntax for `struct`:
+struct posn:
+ fields:
+ x mutable
+ y = 7
+ methods equality:
+ define equal(a, b):
+ is_posn(b) => (a.x == b.x && a.y == b.y)
+ define hash(a):
+ 17
+ define secondary_hash(a):
+ 19
+ property prop_quality: "high"
+
+define fourth(n :: Integer):
+ define m: n*n
+ define v: m*m
+ printf("~a^4 = ~a\n", n, v)
+ v
+
+define exp(n :: Integer, ~base: base = 2.718281828459045):
+ if (n == 1)
+ | base
+ | base * exp(n-1, ~base: base)
+
+define positive_p(n): if n > 0 | true | false
+
+define go():
+ define helper(n):
+ list(n, n)
+ define more(m):
+ if m == 0 | "done"
+ | more(m - 1)
+ helper(more(9))
+
+define approx(x):
+ match x
+ | something(v):
+ printf("got it\n")
+ v
+ | nothing: 0
+
+// With two `:`s on one line, there's no way to
+// add to the first `:`
+define approx_thunk(x):
+ match x
+ | something(v): lambda (): v
+ | nothing: lambda (): 0
+
+// Enough indentation for `v` means that it continues the
+// implicit second `:`, so the `lambda` body has `v`:
+define approx_thunk(x):
+ match x
+ | something(v): lambda ():
+ v
+ | nothing: lambda (): 0
+
+define approx_thunk(x):
+ match x
+ | something(v): lambda
+ | (): v
+ | (n): v+n
+ | nothing: lambda
+ | (): 0
+ | (n): n
+
+define curried:
+ lambda (x):
+ lambda (y):
+ lambda (z):
+ list(x, y, z)
+
+define curried: lambda (x):
+ lambda (y):
+ lambda (z):
+ list(x, y, z)
+
+define dictionary: dict:
+ foo: 17
+ bar: string
+ baz: true
+
+define colors:
+ list(
+ red,
+ green,
+ blue,
+ orange,
+ )
+
+define f(x_something,
+ y_something_else,
+ z_also_long_name):
+ 5
+
+define sum(l):
+ let loop(l = l):
+ if is_null(l)
+ | 0
+ | first(l) + loop(rest(l))
+
+define show_all(l):
+ for (x = in_list(l)):
+ print(x)
+ newline()
+
+define show_zip(l, l2):
+ for (x = in_list(l),
+ x2 = in_list(l2)):
+ print(x)
+ print_string(" ")
+ print(x2)
+ newline()
+
+define show_combos_not_same(l, l2):
+ for (x = in_list(l)):
+ then (x2 = in_list(l2)):
+ when !is_equal(x, x2):
+ print(x)
+ print_string(" ")
+ print(x2)
+ newline()
+
+define map(f, l):
+ for list (x = in_list(l)):
+ f(x)
+
+define partition(l, pred):
+ for fold (yes = empty,
+ no = empty,
+ result (reverse(yes), reverse(no))):
+ with (x = in_list(l)):
+ if pred(x)
+ | (cons(x, yes), no)
+ | (yes, cons(x, no))
+
+local:
+ with:
+ define x: 1
+ define y: 2
+ in:
+ x+y
+
+if t | if f | a | b | y
+if t |« if f | a | b » | y
+if t |« tag: if f | a | b » | y
+if t |« tag: «if f | a | b » » | y
+
+x: y: a; b ; c
+x: y:« a; b »; c
+
+if t | x | y; z
+if t | x |« y »; z
+
+x:«
+ #//
+ 2
+ 3 »
+
+branch |« x»;
+INPUT
+)
+
+(define expected1
+ '(top
+ (group
+ let
+ (parens (group x (op =) 1) (group y (op =) 2))
+ (block (group x (op +) y)))
+ (group
+ let
+ (parens (group x (op =) 1) (group y (op =) 2))
+ (block (group x (op +) y)))
+ (group define pi (block (group 3.14)))
+ (group
+ define
+ fib
+ (parens (group n))
+ (block
+ (group log_error (parens (group "fib called")))
+ (group
+ cond
+ (alts
+ (block (group n (op ==) 0 (block (group 0))))
+ (block (group n (op ==) 1 (block (group 1))))
+ (block
+ (group
+ else
+ (block
+ (group
+ fib
+ (parens (group n (op -) 1))
+ (op +)
+ fib
+ (parens (group n (op -) 2))))))))))
+ (group
+ define
+ (alts
+ (block (group fib (parens (group 0)) (block (group 0))))
+ (block (group fib (parens (group 1)) (block (group 1))))
+ (block
+ (group
+ fib
+ (parens (group n))
+ (block
+ (group
+ fib
+ (parens (group n (op -) 1))
+ (op +)
+ fib
+ (parens (group n (op -) 2))))))))
+ (group
+ define
+ (alts
+ (block (group fib (parens (group 0)) (block (group 0))))
+ (block (group fib (parens (group 1)) (block (group 1))))
+ (block
+ (group
+ fib
+ (parens (group n))
+ (block
+ (group
+ fib
+ (parens (group n (op -) 1))
+ (op +)
+ fib
+ (parens (group n (op -) 2))))))))
+ (group
+ define
+ fib
+ (block
+ (group
+ lambda
+ (parens (group n))
+ (block
+ (group
+ cond
+ (alts
+ (block (group n (op ==) 0 (block (group 0))))
+ (block (group n (op ==) 1 (block (group 1))))
+ (block
+ (group
+ else
+ (block
+ (group
+ fib
+ (parens (group n (op -) 1))
+ (op +)
+ fib
+ (parens (group n (op -) 2))))))))))))
+ (group
+ define
+ fib
+ (block
+ (group
+ lambda
+ (parens (group n))
+ (block
+ (group
+ cond
+ (alts
+ (block (group n (op ==) 0 (block (group 0))))
+ (block (group n (op ==) 1 (block (group 1))))
+ (block
+ (group
+ else
+ (block
+ (group
+ fib
+ (parens (group n (op -) 1))
+ (op +)
+ fib
+ (parens (group n (op -) 2))))))))))))
+ (group
+ (parens
+ (group
+ define
+ fib
+ (block
+ (group
+ (parens
+ (group
+ lambda
+ (parens (group n))
+ (block
+ (group
+ (parens
+ (group
+ cond
+ (alts
+ (block (group (parens (group n (op ==) 0)) (block (group 0))))
+ (block (group (parens (group n (op ==) 1)) (block (group 1))))
+ (block
+ (group
+ else
+ (block
+ (group
+ (parens
+ (group
+ fib
+ (parens (group n (op -) 1))
+ (op +)
+ fib
+ (parens (group n (op -) 2))))))))))))))))))))
+ (group
+ define
+ fib
+ (parens (group n))
+ (block
+ (group
+ match
+ n
+ (alts
+ (block (group 0 (block (group 0))))
+ (block (group 1 (block (group 1))))
+ (block
+ (group
+ n
+ (block
+ (group
+ fib
+ (parens (group n (op -) 1))
+ (op +)
+ fib
+ (parens (group n (op -) 2))))))))))
+ (group
+ define
+ fib
+ (parens (group n))
+ (block
+ (group
+ match
+ n
+ (alts
+ (block (group 0 (block (group 0))))
+ (block (group 1 (block (group 1))))
+ (block
+ (group
+ n
+ (block
+ (group
+ fib
+ (parens (group n (op -) 1))
+ (op +)
+ fib
+ (parens (group n (op -) 2))))))))))
+ (group
+ define
+ fib
+ (parens (group n))
+ (block
+ (group
+ match
+ n
+ (alts
+ (block (group 0 (block (group 0))))
+ (block (group 1 (block (group 1))))
+ (block
+ (group
+ n
+ (block
+ (group
+ fib
+ (parens (group n (op -) 1))
+ (op +)
+ fib
+ (parens (group n (op -) 2))))))))))
+ (group
+ define
+ fib
+ (parens (group n))
+ (block
+ (group
+ match
+ n
+ (alts
+ (block (group 0 (block (group 0))))
+ (block (group 1 (block (group 1))))
+ (block
+ (group
+ n
+ (block
+ (group
+ fib
+ (parens (group n (op -) 1))
+ (op +)
+ fib
+ (parens (group n (op -) 2))))))))))
+ (group
+ define
+ fib
+ (parens (group n))
+ (block
+ (group
+ match
+ n
+ (alts
+ (block (group 0 (block (group 0))))
+ (block (group 1 (block (group 1))))
+ (block
+ (group
+ n
+ (block
+ (group
+ fib
+ (parens (group n (op -) 1))
+ (op +)
+ fib
+ (parens (group n (op -) 2))))))))))
+ (group
+ define
+ fib
+ (parens (group n))
+ (block
+ (group
+ match
+ n
+ (alts
+ (block (group 0 (block (group 0))))
+ (block (group 1 (block (group 1))))
+ (block
+ (group
+ n
+ (block
+ (group
+ fib
+ (parens (group n (op -) 1))
+ (op +)
+ fib
+ (parens (group n (op -) 2))))))))))
+ (group
+ define
+ fib
+ (parens (group n))
+ (block
+ (group
+ match
+ n
+ (alts
+ (block (group 0 (block (group 0))))
+ (block (group 1 (block (group 1))))
+ (block
+ (group
+ n
+ (block
+ (group
+ fib
+ (parens (group n (op -) 1))
+ (op +)
+ fib
+ (parens (group n (op -) 2))))))))))
+ (group
+ define
+ fib
+ (parens (group n))
+ (block
+ (group
+ match
+ n
+ (alts
+ (block (group 0 (block (group 0))))
+ (block (group 1 (block (group 1))))
+ (block
+ (group
+ n
+ (block
+ (group
+ fib
+ (parens (group n (op -) 1))
+ (op +)
+ fib
+ (parens (group n (op -) 2))))))))))
+ (group
+ define
+ fib
+ (parens (group n))
+ (block
+ (group
+ match
+ n
+ (alts
+ (block (group 0 (block (group 0))))
+ (block (group 1 (block (group 1))))
+ (block
+ (group
+ n
+ (block
+ (group
+ fib
+ (parens (group n (op -) 1))
+ (op +)
+ fib
+ (parens (group n (op -) 2))))))))))
+ (group
+ define
+ fib
+ (parens (group n))
+ (block
+ (group
+ match
+ n
+ (alts
+ (block (group 0 (block (group 0))))
+ (block (group 1 (block (group 1))))
+ (block
+ (group
+ n
+ (block
+ (group
+ fib
+ (parens (group n (op -) 1))
+ (op +)
+ fib
+ (parens (group n (op -) 2))))))))))
+ (group
+ define
+ make_adder
+ (parens (group n))
+ (block
+ (group
+ lambda
+ (parens (group m))
+ (block
+ (group printf (parens (group "adding to ~a\n") (group m)))
+ (group m (op +) n)))))
+ (group
+ define
+ analyze
+ (parens (group n))
+ (block
+ (group
+ if
+ n
+ (op ==)
+ 0
+ (alts
+ (block (group printf (parens (group "zero\n"))))
+ (block (group printf (parens (group "other\n"))))))
+ (group printf (parens (group "done\n")))))
+ (group
+ define
+ analyze
+ (parens (group n))
+ (block
+ (group
+ if
+ n
+ (op ==)
+ 0
+ (alts
+ (block
+ (group printf (parens (group "zero\n")))
+ (group printf (parens (group "done saying zero\n"))))
+ (block
+ (group printf (parens (group "other\n")))
+ (group printf (parens (group "done saying other\n"))))))))
+ (group struct posn (parens (group x) (group y)))
+ (group
+ struct
+ color_posn
+ (parens (group col))
+ (block (group extends posn) (group mutable)))
+ (group
+ struct
+ posn
+ (parens (group x mutable) (group y (op =) 7))
+ (block
+ (group
+ methods
+ equality
+ (block
+ (group
+ define
+ equal
+ (parens (group a) (group b))
+ (block
+ (group
+ is_posn
+ (parens (group b))
+ (op =>)
+ (parens
+ (group
+ a
+ (op |.|)
+ x
+ (op ==)
+ b
+ (op |.|)
+ x
+ (op &&)
+ a
+ (op |.|)
+ y
+ (op ==)
+ b
+ (op |.|)
+ y)))))
+ (group define hash (parens (group a)) (block (group 17)))
+ (group define secondary_hash (parens (group a)) (block (group 19)))))))
+ (group
+ struct
+ posn
+ (parens (group x) (group y))
+ (block
+ (group
+ property
+ prop_equal_and_hash
+ (block
+ (group
+ let
+ (parens
+ (group
+ hc
+ (op =)
+ lambda
+ (parens (group a (block (group posn))) (group hc))
+ (block
+ (group
+ hc
+ (parens (group a (op |.|) x))
+ (op +)
+ hc
+ (parens (group a (op |.|) y)))))
+ (group
+ eql
+ (op =)
+ lambda
+ (parens
+ (group a (block (group posn)))
+ (group b (block (group posn)))
+ (group eql))
+ (block
+ (group
+ eql
+ (parens (group a (op |.|) x) (group b (op |.|) x))
+ (op &&)
+ eql
+ (parens (group a (op |.|) y) (group b (op |.|) y))))))
+ (block (group values (parens (group eql) (group hc) (group hc)))))))))
+ (group
+ struct
+ posn
+ (parens (group x) (group y))
+ (block
+ (group
+ property
+ prop_equal_and_hash
+ (block
+ (group
+ let
+ (parens
+ (group
+ hc
+ (op =)
+ lambda
+ (parens (group a (block (group posn))) (group hc))
+ (block
+ (group
+ hc
+ (parens (group a (op |.|) x))
+ (op +)
+ hc
+ (parens (group a (op |.|) y))))))
+ (block
+ (group
+ (parens
+ (group
+ lambda
+ (parens
+ (group a (block (group posn)))
+ (group b (block (group posn)))
+ (group eql))
+ (block
+ (group
+ eql
+ (parens (group a (op |.|) x) (group b (op |.|) x))
+ (op &&)
+ eql
+ (parens (group a (op |.|) y) (group b (op |.|) y)))))
+ (group hc)
+ (group hc)))))))))
+ (group
+ struct
+ posn
+ (block
+ (group fields (block (group x mutable) (group y (op =) 7)))
+ (group
+ methods
+ equality
+ (block
+ (group
+ define
+ equal
+ (parens (group a) (group b))
+ (block
+ (group
+ is_posn
+ (parens (group b))
+ (op =>)
+ (parens
+ (group
+ a
+ (op |.|)
+ x
+ (op ==)
+ b
+ (op |.|)
+ x
+ (op &&)
+ a
+ (op |.|)
+ y
+ (op ==)
+ b
+ (op |.|)
+ y)))))
+ (group define hash (parens (group a)) (block (group 17)))
+ (group define secondary_hash (parens (group a)) (block (group 19)))))
+ (group property prop_quality (block (group "high")))))
+ (group
+ define
+ fourth
+ (parens (group n (op ::) Integer))
+ (block
+ (group define m (block (group n (op *) n)))
+ (group define v (block (group m (op *) m)))
+ (group printf (parens (group "~a^4 = ~a\n") (group n) (group v)))
+ (group v)))
+ (group
+ define
+ exp
+ (parens
+ (group n (op ::) Integer)
+ (group #:base (block (group base (op =) 2.718281828459045))))
+ (block
+ (group
+ if
+ (parens (group n (op ==) 1))
+ (alts
+ (block (group base))
+ (block
+ (group
+ base
+ (op *)
+ exp
+ (parens (group n (op -) 1) (group #:base (block (group base))))))))))
+ (group
+ define
+ positive_p
+ (parens (group n))
+ (block
+ (group if n (op >) 0 (alts (block (group true)) (block (group false))))))
+ (group
+ define
+ go
+ (parens)
+ (block
+ (group
+ define
+ helper
+ (parens (group n))
+ (block (group list (parens (group n) (group n)))))
+ (group
+ define
+ more
+ (parens (group m))
+ (block
+ (group
+ if
+ m
+ (op ==)
+ 0
+ (alts
+ (block (group "done"))
+ (block (group more (parens (group m (op -) 1))))))))
+ (group helper (parens (group more (parens (group 9)))))))
+ (group
+ define
+ approx
+ (parens (group x))
+ (block
+ (group
+ match
+ x
+ (alts
+ (block
+ (group
+ something
+ (parens (group v))
+ (block (group printf (parens (group "got it\n"))) (group v))))
+ (block (group nothing (block (group 0))))))))
+ (group
+ define
+ approx_thunk
+ (parens (group x))
+ (block
+ (group
+ match
+ x
+ (alts
+ (block
+ (group
+ something
+ (parens (group v))
+ (block (group lambda (parens) (block (group v))))))
+ (block
+ (group nothing (block (group lambda (parens) (block (group 0))))))))))
+ (group
+ define
+ approx_thunk
+ (parens (group x))
+ (block
+ (group
+ match
+ x
+ (alts
+ (block
+ (group
+ something
+ (parens (group v))
+ (block (group lambda (parens) (block (group v))))))
+ (block
+ (group nothing (block (group lambda (parens) (block (group 0))))))))))
+ (group
+ define
+ approx_thunk
+ (parens (group x))
+ (block
+ (group
+ match
+ x
+ (alts
+ (block
+ (group
+ something
+ (parens (group v))
+ (block
+ (group
+ lambda
+ (alts
+ (block (group (parens) (block (group v))))
+ (block (group (parens (group n)) (block (group v (op +) n)))))))))
+ (block
+ (group
+ nothing
+ (block
+ (group
+ lambda
+ (alts
+ (block (group (parens) (block (group 0))))
+ (block (group (parens (group n)) (block (group n)))))))))))))
+ (group
+ define
+ curried
+ (block
+ (group
+ lambda
+ (parens (group x))
+ (block
+ (group
+ lambda
+ (parens (group y))
+ (block
+ (group
+ lambda
+ (parens (group z))
+ (block (group list (parens (group x) (group y) (group z)))))))))))
+ (group
+ define
+ curried
+ (block
+ (group
+ lambda
+ (parens (group x))
+ (block
+ (group
+ lambda
+ (parens (group y))
+ (block
+ (group
+ lambda
+ (parens (group z))
+ (block (group list (parens (group x) (group y) (group z)))))))))))
+ (group
+ define
+ dictionary
+ (block
+ (group
+ dict
+ (block
+ (group foo (block (group 17)))
+ (group bar (block (group string)))
+ (group baz (block (group true)))))))
+ (group
+ define
+ colors
+ (block
+ (group
+ list
+ (parens (group red) (group green) (group blue) (group orange)))))
+ (group
+ define
+ f
+ (parens
+ (group x_something)
+ (group y_something_else)
+ (group z_also_long_name))
+ (block (group 5)))
+ (group
+ define
+ sum
+ (parens (group l))
+ (block
+ (group
+ let
+ loop
+ (parens (group l (op =) l))
+ (block
+ (group
+ if
+ is_null
+ (parens (group l))
+ (alts
+ (block (group 0))
+ (block
+ (group
+ first
+ (parens (group l))
+ (op +)
+ loop
+ (parens (group rest (parens (group l))))))))))))
+ (group
+ define
+ show_all
+ (parens (group l))
+ (block
+ (group
+ for
+ (parens (group x (op =) in_list (parens (group l))))
+ (block (group print (parens (group x))) (group newline (parens))))))
+ (group
+ define
+ show_zip
+ (parens (group l) (group l2))
+ (block
+ (group
+ for
+ (parens
+ (group x (op =) in_list (parens (group l)))
+ (group x2 (op =) in_list (parens (group l2))))
+ (block
+ (group print (parens (group x)))
+ (group print_string (parens (group " ")))
+ (group print (parens (group x2)))
+ (group newline (parens))))))
+ (group
+ define
+ show_combos_not_same
+ (parens (group l) (group l2))
+ (block
+ (group
+ for
+ (parens (group x (op =) in_list (parens (group l))))
+ (block
+ (group
+ then
+ (parens (group x2 (op =) in_list (parens (group l2))))
+ (block
+ (group
+ when
+ (op !)
+ is_equal
+ (parens (group x) (group x2))
+ (block
+ (group print (parens (group x)))
+ (group print_string (parens (group " ")))
+ (group print (parens (group x2)))
+ (group newline (parens))))))))))
+ (group
+ define
+ map
+ (parens (group f) (group l))
+ (block
+ (group
+ for
+ list
+ (parens (group x (op =) in_list (parens (group l))))
+ (block (group f (parens (group x)))))))
+ (group
+ define
+ partition
+ (parens (group l) (group pred))
+ (block
+ (group
+ for
+ fold
+ (parens
+ (group yes (op =) empty)
+ (group no (op =) empty)
+ (group
+ result
+ (parens
+ (group reverse (parens (group yes)))
+ (group reverse (parens (group no))))))
+ (block
+ (group
+ with
+ (parens (group x (op =) in_list (parens (group l))))
+ (block
+ (group
+ if
+ pred
+ (parens (group x))
+ (alts
+ (block
+ (group
+ (parens (group cons (parens (group x) (group yes))) (group no))))
+ (block
+ (group
+ (parens
+ (group yes)
+ (group cons (parens (group x) (group no))))))))))))))
+ (group
+ local
+ (block
+ (group
+ with
+ (block
+ (group define x (block (group 1)))
+ (group define y (block (group 2)))))
+ (group in (block (group x (op +) y)))))
+ (group if t (alts (block (group if f)) (block (group a)) (block (group b)) (block (group y))))
+ (group if t (alts (block (group if f (alts (block (group a)) (block (group b))))) (block (group y))))
+ (group if t (alts (block (group tag (block (group if f (alts (block (group a)) (block (group b))))))) (block (group y))))
+ (group if t (alts (block (group tag (block (group if f (alts (block (group a)) (block (group b))))))) (block (group y))))
+ (group x (block (group y (block (group a) (group b) (group c)))))
+ (group x (block (group y (block (group a) (group b))) (group c)))
+ (group if t (alts (block (group x)) (block (group y) (group z))))
+ (group if t (alts (block (group x)) (block (group y))))
+ (group z)
+ (group x (block (group 3)))
+ (group branch (alts (block (group x))))))
+
+
+(define input2
+#< body
+ more
+
+x something | a
+ y:
+ w:
+ q
+ | c
+ z
+ & b
+
+x | indentize
+ y
+ z
+ | indentize:
+ y
+
+define fib(n):
+ match n
+ | 0
+ : 0
+ | 1
+ : 1
+ | n
+ : fib(n-1)
+ + fib(n-2)
+ more
+
+begin:
+ dictionary = [
+ "foo" : 17,
+ "bar" : "string",
+ "baz" : #true
+ ]
+
+cond | null: 0 | list(x): 1
+ | cons(a, d): f(a) + f(d)
+INPUT
+ )
+
+(define expected2
+ '(top
+ (group somthing else (block (group 8)))
+ (group this is a very long linear group that spans multiple lines)
+ (group
+ this
+ is
+ (alts
+ (block (group one (block (group a long result))))
+ (block (group two also long))
+ (block (group three (block (group result nested here))))))
+ (group this is the first group)
+ (group this is the second group)
+ (group this is a group with (parens (group a) (group nested) (group list)))
+ (group this is a group with (parens (group a) (group nested) (group list)))
+ (group this is a group with (parens (group a) (group nested) (group list)))
+ (group this is a group with (parens (group a) (group nested) (group list)))
+ (group hello (alts (block (group a)) (block (group c (block (group d))))))
+ (group this (block (group is more) (group foo)))
+ (group
+ foo
+ (alts (block (group more)) (block (group again (block (group sub))))))
+ (group something (op +))
+ (group more stuff)
+ (group something (block (group more stuff)))
+ (group
+ define
+ (alts
+ (block (group fib (parens (group 0)) (op =) 0))
+ (block (group fib (parens (group 1)) (op =) 1))
+ (block
+ (group fib (parens (group n)) (op =))
+ (group
+ fib
+ (parens (group n (op -) 1))
+ (op +)
+ fib
+ (parens (group n (op -) 2))))))
+ (group
+ define
+ (alts
+ (block (group fib (parens (group 0)) (op =) 0))
+ (block (group fib (parens (group 1)) (op =) 1))
+ (block
+ (group
+ fib
+ (parens (group n))
+ (op =)
+ fib
+ (parens (group n (op -) 1))
+ (op +)
+ fib
+ (parens (group n (op -) 2))))
+ (block (group more))))
+ (group
+ nonsense
+ (block
+ (group
+ hello
+ (alts
+ (block (group there 4.5))
+ (block
+ (group more f (parens (group 8)))
+ (group
+ next
+ (parens
+ (group
+ ok
+ (alts
+ (block)
+ (block
+ (group
+ stuff
+ (block
+ (group
+ (parens
+ (group begin (block (group more) (group things)))
+ (group separately)))
+ (group again))))))))
+ (group something))))))
+ (group q (block (group x) (group y)))
+ (group z)
+ (group q (block (group x) (group y)))
+ (group z)
+ (group q (block (group w (block (group x) (group y)))))
+ (group z)
+ (group x (op +) w (alts (block (group z (block (group (block (group y))))))))
+ (group
+ let
+ (alts
+ (block (group x (op =) 8))
+ (block
+ (group
+ y
+ (op =)
+ 9
+ (block (group 10) (group let (alts (block (group x (op =) 8)))))))
+ (block
+ (group
+ y
+ (op =)
+ 9
+ (block
+ (group 10)
+ (group show (parens (group x (op +) y)))
+ (group x (op -) y))))))
+ (group
+ letrec_syntax_and_values
+ (block
+ (group
+ (parens
+ (group m (op =) (parens (group syntax_rules (op ....))))
+ (group n (op =) (parens (group syntax_rules (op ....)))))
+ (block
+ (group (parens (group x (op =) 10) (group y (op =) 12)))
+ (group (op =>) body)
+ (group more)))))
+ (group
+ x
+ something
+ (alts
+ (block (group a) (group y (block (group w (block (group q))))))
+ (block (group c) (group z) (group (op &) b))))
+ (group
+ x
+ (alts
+ (block (group indentize) (group y) (group z))
+ (block (group indentize (block (group y))))))
+ (group
+ define
+ fib
+ (parens (group n))
+ (block
+ (group
+ match
+ n
+ (alts
+ (block (group 0) (group (block (group 0))))
+ (block (group 1) (group (block (group 1))))
+ (block
+ (group n)
+ (group
+ (block
+ (group fib (parens (group n (op -) 1)))
+ (group (op +) fib (parens (group n (op -) 2)))
+ (group more))))))))
+ (group
+ begin
+ (block
+ (group
+ dictionary
+ (op =)
+ (brackets
+ (group "foo" (block (group 17)))
+ (group "bar" (block (group "string")))
+ (group "baz" (block (group #t)))))))
+ (group
+ cond
+ (alts
+ (block (group null (block (group 0))))
+ (block (group list (parens (group x)) (block (group 1))))
+ (block
+ (group
+ cons
+ (parens (group a) (group d))
+ (block (group f (parens (group a)) (op +) f (parens (group d))))))))))
+
+(define input3
+#<)|
+@x|(<[{hello}|(<[@6 there}]>)|
+
+@x{ 2
+ 1
+ 4
+ 5 }
+
+@{4 @//{ this is a comment! }
+ 5 @// line comment
+ 6}
+
+The end
+INPUT
+ )
+
+(define expected3
+ '(top
+ (group (parens (group a) (group b)))
+ (group (parens (group a) (group b)) (parens (group 0)))
+ (group (parens (group 7)))
+ (group (parens (group (brackets (group "9")))))
+ (group (parens (group 7) (group (brackets (group "8, 10 ") (group "more")))))
+ (group
+ 5
+ (parens
+ (group 3)
+ (group
+ (brackets
+ (group "yohoo ")
+ (group
+ 9
+ (parens
+ (group a)
+ (group b)
+ (group c)
+ (group (brackets (group "this is plain text") (group "\n") (group "inside braces")))))
+ (group "\n")
+ (group "0")))))
+ (group bracketed (parens (group "apple")))
+ (group (braces (group bracketed)) (parens (group "apple")))
+ (group 1 2 3 (block (group 5)))
+ (group 1 2 3 (parens (group (brackets (group "data")))) (block (group 5)))
+ (group x (parens (group (brackets (group "hello}there")))))
+ (group x (parens (group (brackets (group "hello}there")))))
+ (group x (parens (group (brackets (group "hello}") (group 6) (group " there")))))
+ (group
+ x
+ (parens
+ (group
+ (brackets (group " 2") (group "\n") (group "1") (group "\n") (group " ") (group "4") (group "\n") (group " ") (group "5 ")))))
+ (group
+ (parens
+ (group (brackets (group "4 ") (group "\n") (group "5 ") (group " 6")))))
+ (group The end)))
+
+(define (check input expected)
+ (let ([in (open-input-string input)])
+ (define (out name parsed write)
+ (define path (build-path (find-system-path 'temp-dir) name))
+ (printf "~a\n" path)
+ (call-with-output-file*
+ path
+ #:exists 'truncate
+ (lambda (o) (write parsed o))))
+ (port-count-lines! in)
+ (define parsed-stx (parse-all in))
+ (define parsed (syntax->datum parsed-stx))
+ (unless (equal? expected parsed)
+ (out "expected" expected pretty-write)
+ (out "parsed" parsed pretty-write)
+ (error "parse failed"))
+ (define printed (shrubbery-syntax->string parsed-stx))
+ (unless (equal? input printed)
+ (out "expected" input display)
+ (out "printed" printed display)
+ (error "print failed"))
+ (define (check-reparse count-lines?)
+ (define reparsed (let ([o (open-output-bytes)])
+ (write-shrubbery parsed o)
+ (or (with-handlers ([exn:fail? (lambda (exn) (eprintf "~a\n" (exn-message exn)) #f)])
+ (define in (open-input-bytes (get-output-bytes o)))
+ (when count-lines?
+ (port-count-lines! in))
+ (syntax->datum (parse-all in)))
+ (begin
+ (out "wrote" (get-output-bytes o) displayln)
+ (error "parse of wrote failed")))))
+ (unless (equal? parsed reparsed)
+ (out "expected" parsed pretty-print)
+ (out "reparsed" reparsed pretty-print)
+ (error "print failed")))
+ (check-reparse #t)
+ (check-reparse #f)))
+
+(define (check-fail input rx)
+ (let ([in (open-input-string input)])
+ (port-count-lines! in)
+ (unless (with-handlers ([exn:fail? (lambda (exn) (regexp-match? rx (exn-message exn)))])
+ (parse-all in)
+ #f)
+ (error "failed to fail: ~s" input))))
+
+(define (lines s . ss)
+ (apply string-append s (for/list ([s (in-list ss)]) (string-append "\n" s))))
+
+(check input1 expected1)
+(check input2 expected2)
+(check input3 expected3)
+
+(check-fail "if t | «tag: if f | a | b» more | y" #rx"no terms allowed after `»`")
+(check-fail "x: y:« a; b » more; c" #rx"no terms allowed after `»`")
+
+(check-fail (lines "x"
+ " y")
+ #rx"wrong indentation")
+(check-fail (lines "1"
+ " + 2"
+ " + 3")
+ #rx"wrong indentation")
+(check-fail (lines "1: 2"
+ " + 3")
+ #rx"wrong indentation")
+(check-fail (lines "x | y | c"
+ " + 3")
+ #rx"wrong indentation")
+
+(check-fail "(«| a | c»)" #rx"misplaced `|`")
+(check-fail "z «|« « | y » » »" #rx"misplaced `|`")
+(check-fail "«|« w « | y » » »" #rx"misplaced `|`")
diff --git a/shrubbery/weird.shrb b/shrubbery/weird.shrb
new file mode 100644
index 000000000..129c9ecb8
--- /dev/null
+++ b/shrubbery/weird.shrb
@@ -0,0 +1,151 @@
+// A set of examples to see what happens with verious forms,
+// where many of them seem nonsensical
+
+somthing else: 8
+
+this is a \
+ very long linear group \
+ that spans multiple lines
+
+this is | one: a \
+ long result
+ | two \
+ also long
+ | three:
+ result \
+ nested \
+ here
+
+this is \
+ the first group
+this \ is \ the \ second \ group
+
+this is a group \ with (a,
+ nested,
+ list)
+
+this is a group \
+ with (a,
+ nested,
+ list)
+
+this is a group \
+ with (a,
+ \
+ nested,
+ \
+ list)
+
+this is a group \
+ with (a,
+ \
+ /* this a comment on `nested`: */
+ nested,
+ \
+ list)
+
+hello | a | c\
+ :
+ d
+
+this: \
+ is more
+ foo
+
+foo
+ | more \
+ | again:
+ sub
+
+something +
+more stuff
+
+something:
+ more stuff
+
+define
+ | fib(0) = 0
+ | fib(1) = 1
+ | fib(n) =
+ fib(n-1) + fib(n-2)
+
+define
+ | fib(0) = 0
+ | fib(1) = 1
+ | fib(n) = fib(n-1) + fib(n-2)
+ | more
+
+nonsense:
+ hello | there 4.5
+ | more f(8)
+ next (ok |
+ | stuff: (begin:
+ more
+ things
+ ,
+ separately)
+ again)
+ something
+
+q:
+ x; y
+z
+
+q:
+ x; y;
+z
+
+q:
+ w:
+ x; y;
+z
+
+x + w | z : :
+ y
+
+let | x = 8
+ | y = 9: 10
+ let | x = 8
+ | y = 9: 10
+ show(x + y)
+ x - y
+
+letrec_syntax_and_values:
+ (m = (syntax_rules ....),
+ n = (syntax_rules ....)):
+ (x = 10,
+ y = 12)
+ => body
+ more
+
+x something | a
+ y:
+ w:
+ q
+ | c
+ z
+ & b
+
+x | indentize
+ y
+ z
+ | indentize:
+ y
+
+define fib(n):
+ match n
+ | 0
+ : 0
+ | 1
+ : 1
+ | n
+ : fib(n-1)
+ + fib(n-2)
+ more
+
+begin:
+ dictionary = [
+ "foo" : 17,
+ "bar" : "string",
+ "baz" : #true
+ ]
diff --git a/shrubbery/write.rkt b/shrubbery/write.rkt
new file mode 100644
index 000000000..e8e36478b
--- /dev/null
+++ b/shrubbery/write.rkt
@@ -0,0 +1,92 @@
+#lang racket/base
+(require racket/keyword
+ racket/symbol)
+
+;; Writing a shubbery represented as an S-expression
+
+(provide write-shrubbery)
+
+(define rx:identifier #px"^(?:\\p{L}|_)(?:\\p{L}|\\p{N}|_)*$")
+
+(define (write-shrubbery v [op (current-output-port)])
+ (let loop ([v v] [sole? #t])
+ (cond
+ [(list? v)
+ (cond
+ [(null? v)
+ (error 'write-shubbery "unexpected ~s" v)]
+ [(eq? 'op (car v))
+ (cond
+ [(and (not sole?)
+ (memq (cadr v) '(... ¿)) )
+ (display "¿(? " op)
+ (display (cadr v) op)
+ (display ")" op)]
+ [else
+ (display (cadr v) op)])]
+ [(eq? 'alts (car v))
+ (display "" op)
+ (for/fold ([first? #t]) ([v (in-list (cdr v))])
+ (unless first? (display " " op))
+ (display "|« " op)
+ (unless (and (pair? v) (eq? (car v) 'block))
+ (error 'write-shubbery "unexpected ~s" v))
+ (for/fold ([first? #t]) ([v (in-list (cdr v))])
+ (unless first? (display "; " op))
+ (loop v #f)
+ #f)
+ (display " »" op)
+ #f)]
+ [(eq? 'top (car v))
+ (for/fold ([first? #t]) ([v (in-list (cdr v))])
+ (unless first? (display "; " op))
+ (loop v #t)
+ #f)
+ (void)]
+ [else
+ (define-values (open sep close)
+ (case (car v)
+ [(group) (values "" " " "")]
+ [(block) (values ":« " "; " " »")]
+ [(parens) (values "(" ", " ")")]
+ [(brackets) (values "[" ", " "]")]
+ [(braces) (values "{" ", " "}")]
+ [else (error 'write-shubbery "unexpected ~s" (car v))]))
+ (display open op)
+ (for/fold ([first? #t]) ([v (in-list (cdr v))])
+ (unless (or first?
+ (and (pair? v) (eq? (car v) 'block)))
+ (display sep op))
+ (loop v #f)
+ #f)
+ (display close op)])]
+ [(symbol? v)
+ (define s (symbol->immutable-string v))
+ (cond
+ [(regexp-match? rx:identifier s)
+ (display s op)]
+ [else
+ (display "#{")
+ (write v op)
+ (display "}")])]
+ [(keyword? v)
+ (define s (keyword->immutable-string v))
+ (cond
+ [(regexp-match? rx:identifier s)
+ (display "~" op)
+ (display s op)]
+ [else
+ (display "#{")
+ (write v op)
+ (display "}")])]
+ [(or (string? v)
+ (bytes? v)
+ (exact-integer? v)
+ (flonum? v))
+ (write v op)]
+ [(boolean? v)
+ (display (if v "#true" "#false") op)]
+ [else
+ (display "#{")
+ (write v op)
+ (display "}")])))