Introduce SanLang language - Lexer, Parser and Interpreter #4002

IvanIvanoff · 2023-11-28T15:42:08Z

Changes

Note: For syntax and examples check the san_lang_test.exs file.

The san_lang_parser.erl and san_lang_lexer.erl files are autogenerated and they should not be reviewed or included in the repository. The .xrl and .yrl files are used to generate them.

Overview

SanLang is a small interpreted language that can execute one-liners like flat_map(map_keys(@projects), fn slug -> @projects[slug]["github_organizations"] end).

To improve the templating engine capabilities for Queries 2.0 we introduce SanLang -- an interpreted language inspired by the Elixir syntax.

We want to provide the ability for small code snippets (one-liners in most cases) that extract and manipulate some data.

For example, if a user wants to add a text widget with "About the author" information, they don't have to hardcode the email/twitter/telegram/etc. links, but can use code like Email: {{@owner["email"]}}, Twitter: {{@owner.twitter_handle}}

Identifiers starting with @ are provided as environment bindings by the backend, and the user has access to them without doing anything else.
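
A minimal sketch of how such placeholders could be rendered, assuming a hypothetical TemplateSketch.render/2 helper that is not part of this PR. The regex only locates the {{ ... }} spans; the expression inside is handed to an evaluator function, which stands in for the SanLang interpreter and is stubbed with a lookup table here. The owner data is made up.

```elixir
defmodule TemplateSketch do
  # Hypothetical helper: replaces every {{ expression }} in the template with
  # the stringified result of eval_fun.(expression).
  def render(template, eval_fun) do
    Regex.replace(~r/\{\{(.+?)\}\}/, template, fn _whole, expression ->
      expression |> String.trim() |> eval_fun.() |> to_string()
    end)
  end
end

# Made-up results standing in for what the interpreter would return.
results = %{
  ~s(@owner["email"]) => "author@example.com",
  "@owner.twitter_handle" => "example_author"
}

template = ~s(Email: {{@owner["email"]}}, Twitter: {{@owner.twitter_handle}})
rendered = TemplateSketch.render(template, &Map.fetch!(results, &1))
IO.puts(rendered)
```

Keeping the evaluator as a plain function argument means the template code never has to understand the expression syntax itself.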

Why a language?

We want to allow users to write code that is executed on the backend. To allow this, we need to be very careful about what is executed and how.

Parsing with String.split/Regex.scan/etc. won't suffice, or it will become much more complicated and hard to maintain and debug.
Allowing users to write Elixir code would force us to analyze all the code for unsafe operations (System.cmd, HTTP calls, etc.), and it can be hard to verify that the code is indeed safe.

Using a separate language like python/lua/etc. will require us to add this language compiler/interpreter as a dependency and support inter-language compatibility.

Executing Elixir in a safe environment (container/jail/etc.) will also induce complexity.

Considering all these drawbacks, developing a new small language does not sound so terrible.

Technologies used

The SanLang language has three main components: lexer, parser, and interpreter.

The lexer and parser define how the input is tokenized and parsed -- validating the syntax and building an abstract syntax tree (AST).
The lexer and parser are written declaratively in leex and yecc, the Erlang equivalents of the lex and yacc tools; yecc generates an LALR(1) parser.
The lexer and parser together are ~120 lines of code, which includes support for: named functions, env vars, local vars, lambda functions, chained access operator, arithmetic operations.
The interpreter is written in Elixir: it translates the AST to Elixir code and executes it.
The interpreter produces Elixir values as result, which makes it trivial to use the result in the backend without any transformations.
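
To make the AST-to-value step concrete, here is a minimal tree-walking sketch. It is an illustration only: the module name InterpreterSketch and the tuple shapes of the AST nodes are assumptions, not the node format produced by the PR's yecc grammar, and it evaluates the tree directly rather than generating Elixir code.

```elixir
defmodule InterpreterSketch do
  # The tuple shapes below are hypothetical; the PR's real AST may differ.
  def eval({:literal, value}, _env), do: value
  def eval({:env_var, name}, env), do: Map.fetch!(env, name)
  def eval({:access, target, key}, env), do: Map.get(eval(target, env), eval(key, env))
  def eval({:add, left, right}, env), do: eval(left, env) + eval(right, env)
  def eval({:mul, left, right}, env), do: eval(left, env) * eval(right, env)
  def eval({:call, "map_keys", [arg]}, env), do: Map.keys(eval(arg, env))
end

# AST for 1 + 2*3 + 10, with the multiplication nested deepest.
arithmetic =
  {:add, {:add, {:literal, 1}, {:mul, {:literal, 2}, {:literal, 3}}}, {:literal, 10}}

# AST for @owner["email"], evaluated against a made-up environment.
access = {:access, {:env_var, "owner"}, {:literal, "email"}}
env = %{"owner" => %{"email" => "author@example.com"}}

IO.inspect(InterpreterSketch.eval(arithmetic, %{}))
IO.inspect(InterpreterSketch.eval(access, env))
```

Because every clause returns an ordinary Elixir term, the caller can use the result directly, which matches the point above about not needing any transformation.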

Language overview

The following are valid SanLang expressions:

Literals evaluate to themselves: 1, "string", 3.14;
Special boolean literals true and false;
Basic arithmetic with proper precedence: 1 + 2*3 + 10 evaluates to 17;
Named functions with literal arguments: pow(10,18), div(6,4) (for integer division);
Access to environment variables that are provided by the execution environment: @projects
Access operator, map function and lambda functions for working with these environment variables. See below for more examples.
Access operator that can be chained: @projects["santiment"]["main_contract_address"]["decimals"]
Comparison operators: 1 == 1, 1 != 2, 1 > 2, 1 < 2, 1 >= 2, 1 <= 2;
Boolean operators and and or: true and false, true or false;
Proper precedence of boolean/comparison/arithmetic operators: 5 + 6 < 10, pow(2, 10) - 1 < 1024 and pow(2,10) + 1 > 1024.
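
Since the syntax is inspired by Elixir, the grouping implied by these precedence rules can be spelled out in plain Elixir with explicit parentheses. This is only a sketch: it assumes that SanLang's pow and div behave like Elixir's Integer.pow/2 and div/2, which the description does not state.

```elixir
# 1 + 2*3 + 10: multiplication binds tighter than addition.
arithmetic = 1 + (2 * 3) + 10

# 5 + 6 < 10: arithmetic is evaluated before the comparison.
comparison = (5 + 6) < 10

# Comparisons are evaluated before the boolean `and`.
combined =
  ((Integer.pow(2, 10) - 1) < 1024) and ((Integer.pow(2, 10) + 1) > 1024)

# Integer division discards the remainder.
integer_division = div(6, 4)

IO.inspect({arithmetic, comparison, combined, integer_division})
```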

Examples:

Get the list of all slugs from the @projects map:
map_keys(@projects)
Get the token decimals for santiment:
@projects["santiment"]["main_contract_address"]["decimals"]
Get all github organizations of all projects in a list:
flat_map(map_keys(@projects), fn slug -> @projects[slug]["github_organizations"] end)
Get the email address of the owner of the dashboard:
@owner["email"]
Keep only the elements of @data that are greater than 1 and less than 10:
filter(@data, fn x -> x > 1 and x < 10 end)
See san_lang_test.exs for more examples.
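
For comparison, the examples above can be written as plain Elixir over made-up sample data. This is a sketch under the assumption that map_keys, flat_map and filter mirror Map.keys/1, Enum.flat_map/2 and Enum.filter/2; the projects and data values below are invented for illustration and do not come from the backend.

```elixir
# Made-up stand-ins for the @projects and @data bindings.
projects = %{
  "santiment" => %{
    "main_contract_address" => %{"decimals" => 18},
    "github_organizations" => ["santiment"]
  },
  "example_project" => %{
    "main_contract_address" => %{"decimals" => 8},
    "github_organizations" => ["example-org", "example-labs"]
  }
}

data = [0, 1, 2, 5, 10, 12]

# map_keys(@projects)
slugs = Map.keys(projects)

# @projects["santiment"]["main_contract_address"]["decimals"]
decimals = projects["santiment"]["main_contract_address"]["decimals"]

# flat_map(map_keys(@projects), fn slug -> @projects[slug]["github_organizations"] end)
organizations =
  Enum.flat_map(Map.keys(projects), fn slug -> projects[slug]["github_organizations"] end)

# filter(@data, fn x -> x > 1 and x < 10 end)
filtered = Enum.filter(data, fn x -> x > 1 and x < 10 end)

IO.inspect({slugs, decimals, organizations, filtered})
```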

Ticket

Checklist:

I have performed a self-review of my own code
I have made corresponding changes to the documentation
I have tried to find a clearer solution before commenting hard-to-understand parts of code
I have added tests that prove my fix is effective or that my feature works

tzanko-matev · 2023-12-01T15:10:59Z

Don't forget to add an Academy article about Sanlang

IvanIvanoff force-pushed the san-lang branch from ed188d9 to a2a2692 Compare November 30, 2023 09:48

IvanIvanoff changed the title ~~San lang~~ Implement lexer, parser and interpreter for SanLang Nov 30, 2023

IvanIvanoff changed the title ~~Implement lexer, parser and interpreter for SanLang~~ Introduce SanLang language - Lexer, Parser and Interpreter Dec 1, 2023

IvanIvanoff force-pushed the san-lang branch from 9154f95 to 8662863 Compare December 1, 2023 14:19

IvanIvanoff added 12 commits December 5, 2023 16:07

Update absinthe in order to relax the nimble_parsec version requirement

faac517

Start implementing SanLang

1a1c656

Support lambda functions in function calls arguments for SanLang

3c546e0

Add SanLang Parse/Interpreter/Kernel

cce6b78

Implement pow named function call evaluation

3591641

Backup

1849fa7

Implement map in SanLang

e17f418

Add flat_map, access operator by identifier, tests

69b5aae

Add the ./src dir to Dockerfile

346a1b5

Add the autogenerated SanLang .erl files to gitignore

ad0736a

Rename some san lang lexer rules

6ca1a64

Rework SanLang to use the precedence yacc feature

8d5f121

IvanIvanoff force-pushed the san-lang branch from 94f9a60 to 8d5f121 Compare December 5, 2023 15:30

backup

e3e0185

IvanIvanoff requested a review from tspenov December 6, 2023 14:26

Fix SanLang precedence handling. Add booleans and comparisons

b5de55c

IvanIvanoff force-pushed the san-lang branch from 8525c43 to b5de55c Compare December 6, 2023 14:27

IvanIvanoff marked this pull request as ready for review December 6, 2023 14:33

IvanIvanoff merged commit 91a1c97 into master Dec 7, 2023

delete-merged-branch bot deleted the san-lang branch December 7, 2023 09:27
