
DSL ~ LARK Documentation

wzeng0 edited this page May 2, 2022 · 23 revisions

Lark is the parsing tool used by the domain-specific language (DSL) team to enable algorithmic, grammar-based conversion of DSL files, which are largely written in plaintext, to WDL++.

There are five parts to creating a Lark file:

  1. Writing the grammar
  2. Creating the parser
  3. Shaping the tree
  4. Evaluating the tree
  5. Optimizing

Writing the Grammar

Lark accepts its grammars in a format based on EBNF (Extended Backus-Naur Form). An example rule in this format:

item: _NL "ITEM" id "IN" location property* action*
    | _NL "ITEM" id property* action*

Creating the Parser

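To create a parser, all we need to do is pass the grammar to Lark and tell it which rule to accept as the starting point, for example a rule named value. The grammar's supporting definitions, such as the newline terminal and the ignored comments, look like this: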

_NL: ( /(\r?\n[\t ]*)/ | COMMENT) +

%import .util.COMMENT
%ignore COMMENT

Shaping the Tree

  • The arrows (->) are aliases: names for a specific alternative of a rule. In this case, we name the true/false/null matches so that the information is not lost when the literal tokens are filtered out of the tree. We also alias SIGNED_NUMBER to mark it for later processing.
  • The question mark in ?value tells Lark to inline the rule when it has only one child, so the tree shows that child directly instead of wrapping it in a value branch.
  • We turned the ESCAPED_STRING terminal into a rule. This way it will appear in the tree as a branch. This is equivalent to aliasing (like we did for the number), but now string can also be used elsewhere in the grammar (namely, in the pair rule).

Evaluating the Tree


Example calculator built using LARK:

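from lark import Lark, Transformer, v_args

my_grammar_calc = """
    // Entry point: a bare expression or an assignment (NAME = expression)
    ?start: value
        | NAME "=" value -> assign_var
    ?value: add
    ?add: sub
        | add "+" sub -> add
    ?sub: mult
        | sub "-" mult -> sub
    ?mult: div
        | mult "*" div -> mul
    ?div: neg
        | div "/" neg -> div
    ?neg: parenth
        | "-" neg -> neg
    ?parenth: NUMBER -> number
        | NAME -> var
        | "(" value ")"
    %import common.CNAME -> NAME
    %import common.NUMBER
    %import common.WS_INLINE
    %ignore WS_INLINE
"""

@v_args(inline=True)  # pass each rule's children as positional arguments
class my_calc_tree(Transformer):
    # Arithmetic callbacks, named after the aliases used in the grammar
    from operator import add, sub, mul, truediv as div, neg
    number = float

    def __init__(self):
        super().__init__()
        self.vars = {}

    def assign_var(self, name, value):
        # Remember the variable and return its value as the result
        self.vars[name] = value
        return value

    def var(self, name):
        # Look up a previously assigned variable
        return self.vars[name]

# With parser='lalr', the transformer is applied while parsing
parse_my_calc = Lark(my_grammar_calc, parser='lalr', transformer=my_calc_tree())
calculator = parse_my_calc.parse

def test():
    print(calculator("a = 3+4"))  # 7.0
    print(calculator("a = 3/4"))  # 0.75
    print(calculator("a = 3*4"))  # 12.0
    print(calculator("a = 3-4"))  # -1.0

if __name__ == "__main__":
    test()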
