DSL ~ LARK Documentation

Lark is the parsing tool the domain-specific language (DSL) team uses for algorithmic, grammar-based conversion of DSL files, which are largely written in plain text, to WDL++.

There are five parts to creating a Lark file:

  1. Writing the grammar
  2. Creating the parser
  3. Shaping the tree
  4. Evaluating the tree
  5. Optimizing

Writing the Grammar

Lark accepts its grammars in a format called EBNF. An example of an EBNF rule:

item: _NL "ITEM" id "IN" location property* action*
    | _NL "ITEM" id property* action*

Creating the Parser

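To create a parser, all we need to do is instantiate Lark with the grammar and tell it which rule to start from (here, value). The grammar also has to define the newline terminal and ignore comments: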

_NL: ( /(\r?\n[\t ]*)/ | COMMENT) +

%import .util.COMMENT
%ignore COMMENT

Shaping the Tree

Shaping the tree lets you keep information that Lark would otherwise filter out of the parse tree, such as the anonymous string tokens "true", "false", and "null", and mark terminals such as DEC_NUMBER for later processing. An example and a more detailed description follow.

    ?value: dict
          | array
          | noun verb "like" noun -> comparative
          | DEC_NUMBER      -> number
          | "true"             -> true
          | "false"            -> false
          | "null"             -> null

  • The arrows (->) are aliases. An alias is a name for a specific alternative of a rule. In this case, we name the true/false/null matches so that we won’t lose the information. We also alias DEC_NUMBER to mark it for later processing.
  • The question mark in front of value (?value) tells Lark to inline the rule whenever it has only one child, so that child takes the place of the value branch in the tree.
  • By turning the string terminal into a rule, we make it appear as a branch of the tree, so that it can be used in other parts of the grammar (mainly in the pair rule).

Evaluating the Tree


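Evaluating the tree means walking it and computing a result. Lark's Transformer class does this by calling, for each branch, the method named after that branch's rule or alias.

Example calculator built using Lark: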

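from lark import Lark, Transformer, v_args

try:
    input = raw_input  # Python 2 compatibility
except NameError:
    pass

my_grammar_calc = """
    ?start: sum
        | NAME "=" sum      -> assign_var
    ?sum: product
        | sum "+" product   -> add
        | sum "-" product   -> sub
    ?product: atom
        | product "*" atom  -> mul
        | product "/" atom  -> div
    ?atom: NUMBER           -> number
        | "-" atom          -> neg
        | NAME              -> var
        | "(" sum ")"

    %import common.CNAME -> NAME
    %import common.NUMBER
    %import common.WS_INLINE
    %ignore WS_INLINE
"""

@v_args(inline=True)  # pass each branch's children as separate arguments
class my_calc_tree(Transformer):
    # Each alias in the grammar is handled by the callable of the same name.
    from operator import add, sub, mul, truediv as div, neg
    number = float

    def __init__(self):
        self.vars = {}

    def assign_var(self, name, value):
        # Store the value and return it so the assignment also has a result.
        self.vars[name] = value
        return value

    def var(self, name):
        try:
            return self.vars[name]
        except KeyError:
            raise Exception("Variable not found: %s" % name)

# With the LALR parser, the transformer is applied while parsing.
parse_my_calc = Lark(my_grammar_calc, parser='lalr', transformer=my_calc_tree())
calculator = parse_my_calc.parse

def test():
    print(calculator("a = 3+4"))  # 7.0
    print(calculator("a = 3/4"))  # 0.75
    print(calculator("a = 3*4"))  # 12.0
    print(calculator("a = 3-4"))  # -1.0

if __name__ == '__main__':
    test()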

