
DSL ~ LARK Documentation

wzeng0 edited this page May 2, 2022 · 23 revisions

Lark is the parsing tool used by the domain-specific language (DSL) team to enable algorithmic, grammar-based conversion of DSL files, which are largely written in plaintext, to WDL++.

There are 5 parts to creating a Lark file:

  1. Writing the grammar
  2. Creating the parser
  3. Shaping the tree
  4. Evaluating the tree
  5. Optimizing

Writing the Grammar

Lark accepts its grammars in a format called EBNF. Examples of EBNF can be found here: https://tomassetti.me/ebnf/

item: _NL "ITEM" id "IN" location property* action*
    | _NL "ITEM" id property* action*

Creating the Parser

To create a parser, all we need to do is instantiate Lark with the grammar and tell it which rule to start from (for example, value).

_NL: ( /(\r?\n[\t ]*)/ | COMMENT) +

%import .util.COMMENT
%ignore COMMENT

Shaping the Tree

Shaping the tree lets you keep grammar inputs that would otherwise be filtered out of the parse tree or left unlabeled, such as "true", "false", "null", and DEC_NUMBER. An example and a more detailed description of the features involved are included below.

    ?value: dict
          | array
          | noun verb "like" noun -> comparative
          | DEC_NUMBER      -> number
          | "true"             -> true
          | "false"            -> false
          | "null"             -> null

  • The little arrows represent aliases. An alias is a name for a specific part of the rule. In this case, we name the true/false/null matches so that we won’t lose the information. We also alias DEC_NUMBER to mark it for later processing.
  • The question mark at the start of ?value tells the tree builder to inline the value branch whenever it has only one member, so that the member appears directly in its place.
  • By turning a string terminal into a rule, we make it appear as a branch of the tree, so that it can be used in other places in the grammar (mainly in the pair rule).

Evaluating the Tree

Optimizing

Example calculator built using LARK:

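from lark import Lark, Transformer, v_args

my_grammar_calc = """
    %import common.CNAME -> NAME
    %import common.NUMBER
    %import common.WS_INLINE
    %ignore WS_INLINE

    // Either a bare expression or an assignment such as "a = 3+4"
    ?start: value
        | NAME "=" value -> assign_var
    ?value: add
    ?add: sub
        | add "+" sub -> add
    ?sub: mult
        | sub "-" mult -> sub
    ?mult: div
        | mult "*" div -> mul
    ?div: neg
        | div "/" neg -> div
    ?neg: parenth
        | "-" parenth -> neg
    // Atoms: a number literal, a variable, or a parenthesized expression
    ?parenth: NUMBER -> number
        | NAME -> var
        | "(" value ")"
"""

# inline=True passes each rule's children as positional arguments
@v_args(inline=True)
class my_calc_tree(Transformer):
    # Each imported operator handles the alias of the same name
    from operator import add, sub, mul, truediv as div, neg
    number = float

    def __init__(self):
        self.vars = {}

    def assign_var(self, name, val):
        # Store the value and return it so the assignment evaluates to it
        self.vars[name] = val
        return val

    def var(self, name):
        # Look up a previously assigned variable
        return self.vars[name]

# Passing the transformer to an LALR parser evaluates the tree while parsing
parse_my_calc = Lark(my_grammar_calc, parser='lalr', transformer=my_calc_tree())
calculator = parse_my_calc.parse

def test():
    print(calculator("a = 3+4"))  # 7.0
    print(calculator("a = 3/4"))  # 0.75
    print(calculator("a = 3*4"))  # 12.0
    print(calculator("a = 3-4"))  # -1.0

test()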
