diff --git a/docs/user_guide.rst b/docs/user_guide.rst
index 041c508a..d68784c9 100644
--- a/docs/user_guide.rst
+++ b/docs/user_guide.rst
@@ -284,3 +284,101 @@
 BNFC adds the grammar name as a file extension. So if the grammar file is
 named ``Calc.cf``, the lexer will be associated to the file extension ``.calc``.
 To associate other file extensions to a generated lexer, you need to modify (or subclass) the lexer.
+
+Python Backend
+===============
+
+The BNF Converter's Python backend generates a Python frontend that uses
+Antlr4 to parse input into an AST (abstract syntax tree).
+
+The Python package antlr4, the jar for Antlr4, and Python 3.10 or higher are needed.
+
+Example usage: ::
+
+  bnfc --python -m Calc.cf
+
+
+.. list-table:: The result is a set of files:
+   :widths: 25 25
+   :header-rows: 1
+
+   * - Filename
+     - Description
+   * - bnfcPyGenCalc/CalcLexer.g4
+     - Provides the grammar for the lexer.
+   * - bnfcPyGenCalc/CalcParser.g4
+     - Provides the grammar for the parser.
+   * - bnfcPyGenCalc/Absyn.py
+     - Provides the classes for the abstract syntax.
+   * - bnfcPyGenCalc/PrettyPrinter.py
+     - Provides printing for both the AST and the linearized tree.
+   * - genTest.py
+     - A ready-made test file that uses the generated frontend to convert input into an AST.
+   * - skele.py
+     - Provides skeleton code to deconstruct an AST, using structural pattern matching.
+   * - Makefile
+     - The makefile, which uses an Antlr jar file to produce the lexer and parser for Python.
+
+Make sure the jar for Antlr is accessible from the generated makefile and
+run the makefile. For example, on Linux, one can export the following
+variable from ``.profile``:
+
+``export ANTLR="$HOME/Downloads/antlr/antlr-4.13.2-complete.jar"``
+
+Subsequently run ``make``. The generated lexer and parser are placed inside the
+folder used above.
+
+Testing the frontend
+....................
+
+It's possible to pipe input, like::
+
+  echo "(1 + 2) * 3" | python3 genTest.py
+
+or::
+
+  python3 genTest.py < file.txt
+
+and it's possible to just use an argument::
+
+  python3 genTest.py file.txt
+
+
+Caveats
+.......
+
+Maximum elements for hand-made lists:
+  If one defines custom rules for lists, such as::
+
+    (:) [C] ::= 'a' C 'b' [C] 'c'
+
+  the Python backend cannot simplify the rule into an iterative parse,
+  meaning that at most about 1000 elements can be parsed before a maximum
+  recursion depth error is raised. Using the terminator or separator pragmas
+  should work fine.
+
+Skeleton code for using lists as entrypoints:
+  Matchers for list categories, such as [Exp], are not generated in the
+  skeleton code, as they may confuse users if the grammar uses several
+  different list categories: a user may then try to pattern match lists
+  without checking what type the elements have. Users are instead encouraged
+  to use non-list entrypoints.
+
+Several entrypoints:
+  The test file genTest.py uses only the first entrypoint by default.
+
+Using multiple separators:
+  Using multiple separators for the same category, such as below, generates
+  Python functions with overlapping names, causing runtime errors::
+
+    separator Exp1 "," ;
+    separator Exp1 ";" ;
+
+Results from the parameterized tests:
+  One error among the regression tests is reported: the Java BNFC example
+  grammar contains mutually left-recursive rules.
+
+Escaped characters in haskell-hcr:
+  Attempting to parse ParCore.hcr from the BNFC example grammar
+  haskell-hcr yields errors for escaped characters.
+
diff --git a/document/BNF_Converter_Python_Mode.html b/document/BNF_Converter_Python_Mode.html
new file mode 100644
index 00000000..ce939979
--- /dev/null
+++ b/document/BNF_Converter_Python_Mode.html
@@ -0,0 +1,230 @@
+
+
+
+BNF Converter Python Mode
+
+
+
+

BNF Converter

+

Python Mode

+
+

By Björn Werner

+ +

2024

+

+ The BNF Converter's Python Backend generates a Python frontend that uses + Antlr4 to parse input into an AST (abstract syntax tree). +

+

+ BNFC on Github:
+ https://github.com/BNFC/bnfc +

+

+ Antlr on Github:
+ https://github.com/antlr/antlr4 +

+

+ Requirements are: the jar file for ANTLRv4, the Python package antlr4, and + Python 3.10 or higher. +

+

Usage

+
+ bnfc --python -m NAME.cf
+
+

+There should now exist the following files: +

+ + + + + + + + + + + + + + + + + + + + + + +
Filename:Description:
bnfcPyGenNAME/NAMELexer.g4Provides the grammar for the lexer.
bnfcPyGenNAME/NAMEParser.g4Provides the grammar for the parser.
bnfcPyGenNAME/Absyn.pyProvides the classes for the abstract syntax.
bnfcPyGenNAME/PrettyPrinter.pyProvides printing for both the AST and the linearized tree.
genTest.pyA ready-made test file that uses the generated frontend to convert input into an AST.
skele.pyProvides skeleton code to deconstruct an AST, using structural pattern matching.
+

+Make sure the jar for Antlr is accessible from the generated makefile and run the makefile. The generated lexer and parser are placed inside the folder used above. +

+

+ For example, on Linux, export the following variable from .profile: +

+
+export ANTLR="$HOME/Downloads/antlr/antlr-4.13.2-complete.jar"
+
+

+ After that it should be possible to run the makefile: +

+
+ make
+
+

Testing the frontend

+

+ The following example uses a frontend that is generated from a C-like grammar. +

+

+ $ python3 genTest.py < hello.c +

+

+ Parse Successful!
+
+ [Abstract Syntax]
+ (PDefs [(DFun Type_int "main" [] [(SExp (EApp "printString" [(EString "Hello world")])), (SReturn (EInt 0))])])
+
+ [Linearized Tree]
+ int main ()
+ {
+  printString ("Hello world");
+  return 0;
+ }
+

+
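The input handling shown above can be sketched as a small pure helper (`read_source` is a hypothetical name, not part of the generated files; it mirrors the argument/stdin logic that genTest.py uses):

```python
def read_source(argv: list[str], stdin_text: str) -> str:
    """Mirror genTest.py's input handling: prefer a source file given
    as the first argument, otherwise fall back to piped standard input."""
    if len(argv) > 1:
        # e.g. invoked as: python3 genTest.py hello.c
        with open(argv[1]) as f:
            return f.read()
    # e.g. invoked as: python3 genTest.py < hello.c
    return stdin_text
```

Keeping this logic in a pure function makes it easy to test without touching real stdin.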

The Abstract Syntax Tree

+

+ The AST is built from instances of Python classes that use the dataclass decorator, such as: +

+

+@dataclass
+class EAdd:
+ exp_1: Exp
+ exp_2: Exp
+ _ann_type: _AnnType = field(default_factory=_AnnType) +

+

+ The "_ann_type" field is a placeholder that can be used to store useful information, + for example type information in order to create a type-annotated AST. +

+
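As a standalone illustration, here is a minimal sketch of such dataclasses (the names EAdd, EInt and _AnnType are modelled after the generated Absyn.py; this hand-written version is an assumption, not the generated code), showing _ann_type carrying an inferred type:

```python
from dataclasses import dataclass, field
from typing import Any

@dataclass
class _AnnType:
    # placeholder slot for e.g. inferred type information
    contents: Any = None

@dataclass
class EInt:
    integer_: int
    _ann_type: _AnnType = field(default_factory=_AnnType)

@dataclass
class EAdd:
    exp_1: Any
    exp_2: Any
    _ann_type: _AnnType = field(default_factory=_AnnType)

# annotate a node after building the AST
node = EAdd(EInt(1), EInt(2))
node._ann_type.contents = "int"
```

Note that `default_factory` gives every node its own fresh `_AnnType` instance, so annotating one node never leaks into another.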

Using the skeleton file

+

+ The skeleton file serves as a template, for example for creating an interpreter. + Two kinds of matchers are generated: the first with all the value + categories together, and a second kind where each matcher handles a single + value category, as in the example below: +

+

+def matcherExp(exp_: Exp):
+ match exp_:
+  case EAdd(exp_1, exp_2, _ann_type):
+   # Exp "+" Exp1
+   raise Exception('EAdd not implemented')
+  case ESub(exp_1, exp_2, _ann_type):
+   ... +

+

+ This can be modified to return the sum of the evaluated + arguments, for example: +

+

+ def matcherExp(exp_: Exp):
+  match exp_:
+   case EAdd(exp_1, exp_2, _ann_type):
+    # Exp "+" Exp1
+    return matcherExp(exp_1) + matcherExp(exp_2)
+   case ESub(exp_1, exp_2, _ann_type):
+    ... +

+

+ The function can now be imported and used in the generated test file + (similarly to how the pretty printer is imported and used): +

+

+ from skele import matcherExp
+ ...
+ print(matcherExp(ast)) +

+ +

Known issues

+

+ Maximum elements for hand-made list rules: +

+

+ If one defines custom rules for lists, such as: +

+

+ (:) [C] ::= 'a' C 'b' [C] 'c' +

+

+ the Python backend cannot simplify the rule into an iterative parse, + meaning that at most about 1000 elements can be parsed before a maximum + recursion depth error is raised. Using the terminator or separator pragmas should work fine. +

+
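The depth limit can be reproduced without ANTLR at all: any recursive walk over a deeply right-nested structure hits Python's default recursion limit (about 1000 frames). The `Cons` type below is a hypothetical stand-in for the nested nodes such a hand-made list rule produces:

```python
import sys
from dataclasses import dataclass

@dataclass
class Cons:
    head: int
    tail: object  # another Cons, or None at the end

def length(xs) -> int:
    # recursive traversal, analogous to a recursive-descent parse
    return 0 if xs is None else 1 + length(xs.tail)

def deep_list(n: int):
    # built iteratively, so construction itself never recurses
    xs = None
    for i in range(n):
        xs = Cons(i, xs)
    return xs

sys.setrecursionlimit(1000)  # CPython's usual default
```

A list of a few hundred elements traverses fine; tens of thousands raise `RecursionError`, which is the failure mode described above.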

+ Skeleton code for using lists as entrypoints: +

+

+ Matchers for list categories, such as [Exp], are not generated in the + skeleton code, as they may confuse users if the grammar uses several different + list categories: a user may then try to pattern match lists without + checking what type the elements have. Users are instead encouraged to use + non-list entrypoints. +

+

+ The improper way to iterate over lists, as the value category is unknown: +

+

+  case list():
+   for ele in ast:
+    ... +

+

+ The proper way to deconstruct lists, where we know the value category: +

+

+  case RuleName(listexp_):
+   for exp in listexp_:
+    ... +

+

Several entrypoints:

+

+ The test file genTest.py uses only the first entrypoint by default. +

+

+ Using multiple separators: +

+

+ Using multiple separators for the same category, such as below, generates + Python functions with overlapping names, causing runtime errors. +

+

+ separator Exp1 "," ;
+ separator Exp1 ";" ; +

+
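The collision can be illustrated in plain Python (the name `listexp1` is hypothetical, modelled after the kind of per-rule function the backend generates): when two definitions share a name, the later one silently replaces the earlier, so one separator variant is lost and calls using it misbehave at runtime.

```python
# hypothetical generated helpers for the two separator pragmas above
def listexp1(tokens: str):          # from: separator Exp1 "," ;
    return tokens.split(",")

def listexp1(tokens: str):          # from: separator Exp1 ";" ;
    return tokens.split(";")        # this definition replaces the first

# only the ";" variant survives, so comma-separated input is mis-parsed
```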

+Results from the parameterized tests: +

+

+ One error among the regression tests is reported: the Java BNFC example grammar contains mutually left-recursive rules. +

+

+ Example for grammar haskell-hcr: +

+

+ Attempting to parse ParCore.hcr from the haskell-hcr example BNFC grammar yields an error for escaped characters. +

diff --git a/source/BNFC.cabal b/source/BNFC.cabal index 7300a8d2..872f28fd 100644 --- a/source/BNFC.cabal +++ b/source/BNFC.cabal @@ -32,9 +32,8 @@ Description: -- Support range when build with cabal tested-with: - GHC == 9.10.1 - GHC == 9.8.2 - GHC == 9.6.5 + GHC == 9.8.1 + GHC == 9.6.3 GHC == 9.4.8 GHC == 9.2.8 GHC == 9.0.2 @@ -44,6 +43,7 @@ tested-with: GHC == 8.4.4 GHC == 8.2.2 GHC == 8.0.2 + GHC == 7.10.3 extra-doc-files: README.md @@ -81,9 +81,6 @@ executable bnfc other-modules: -- Generated by cabal Paths_BNFC - autogen-modules: - -- Generated by cabal - Paths_BNFC default-extensions: -- Keep in alphabetical order. LambdaCase @@ -157,14 +154,6 @@ library -- BNFC.Lex -- -- Generated by happy -- BNFC.Par - -- 2023-11-03 We cannot add BNFC.{Lex,Par} as then the Lex.x and Par.y files - -- are not bundled by cabal dist. - -- Just make sure that there is no src/BNFC/{Lex,Par}.hs before running cabal sdist, - -- otherwise we will end up with both Lex.hs and Lex.x (resp. Par.{hs,y}) - -- which will cause alex/happy to not be run, leading to build failures. 
- autogen-modules: - -- Generated by cabal - Paths_BNFC other-modules: -- Generated by cabal Paths_BNFC @@ -266,6 +255,17 @@ library BNFC.Backend.Java.RegToAntlrLexer BNFC.Backend.Java.Utils + -- Python backend + BNFC.Backend.Python + BNFC.Backend.Python.CFtoPyAbs + BNFC.Backend.Python.CFtoPyPrettyPrinter + BNFC.Backend.Python.RegToFlex + BNFC.Backend.Python.PyHelpers + BNFC.Backend.Python.CFtoPySkele + BNFC.Backend.Python.CFtoAntlr4Lexer + BNFC.Backend.Python.CFtoAntlr4Parser + BNFC.Backend.Python.Antlr4Utils + -- XML backend BNFC.Backend.XML diff --git a/source/main/Main.hs b/source/main/Main.hs index 754bf268..6377611f 100644 --- a/source/main/Main.hs +++ b/source/main/Main.hs @@ -26,6 +26,7 @@ import BNFC.Backend.Latex import BNFC.Backend.OCaml import BNFC.Backend.Pygments import BNFC.Backend.TreeSitter +import BNFC.Backend.Python import BNFC.CF (CF) import BNFC.GetCF import BNFC.Options hiding (make, Backend) @@ -83,3 +84,5 @@ maketarget = \case TargetPygments -> makePygments TargetCheck -> error "impossible" TargetTreeSitter -> makeTreeSitter + TargetPython -> makePython + \ No newline at end of file diff --git a/source/src/BNFC/Backend/Python.hs b/source/src/BNFC/Backend/Python.hs new file mode 100644 index 00000000..3e0240dd --- /dev/null +++ b/source/src/BNFC/Backend/Python.hs @@ -0,0 +1,160 @@ +{- + BNF Converter: Python main file + Copyright (C) 2004 Author: Bjorn Werner +-} + +{-# LANGUAGE NoImplicitPrelude #-} +{-# LANGUAGE OverloadedStrings #-} + +module BNFC.Backend.Python (makePython) where + +import Prelude hiding ((<>)) +import System.FilePath (()) +import BNFC.CF (CF, firstEntry) +import BNFC.Options (SharedOptions, optMake, lang) +import BNFC.Backend.Base (MkFiles, mkfile) +import BNFC.Backend.Python.CFtoPyAbs (cf2PyAbs) +import BNFC.Backend.Python.CFtoPyPrettyPrinter (cf2PyPretty) +import BNFC.Backend.Python.CFtoPySkele (cf2PySkele) +import BNFC.Backend.Python.PyHelpers +import BNFC.PrettyPrint +import Data.Char (toLower, isLetter) +import 
qualified BNFC.Backend.Common.Makefile as Makefile + +import BNFC.Backend.Python.CFtoAntlr4Lexer (cf2AntlrLex) +import BNFC.Backend.Python.CFtoAntlr4Parser (cf2AntlrParse) + + +-- | Entrypoint for BNFC to use the Python backend. +makePython :: SharedOptions -> CF -> MkFiles () +makePython opts cf = do + let pkgName = "bnfcPyGen" ++ filteredName + let abstractClasses = cf2PyAbs cf + let prettyPrinter = cf2PyPretty pkgName cf + let skeletonCode = cf2PySkele pkgName cf + mkPyFile (pkgName ++ "/Absyn.py") abstractClasses + mkPyFile (pkgName ++ "/PrettyPrinter.py") prettyPrinter + mkPyFile "skele.py" skeletonCode + mkPyFile "genTest.py" (pyTest pkgName filteredName cf) + Makefile.mkMakefile (optMake opts) $ makefile pkgName filteredName (optMake opts) + + let (d, kwenv) = cf2AntlrLex filteredName cf + mkAntlrFile (pkgName ++ "/" ++ filteredName ++ "Lexer.g4") d + --cf2AntlrParse :: String -> String -> CF -> KeywordEnv -> String + let p = cf2AntlrParse filteredName (pkgName ++ ".Absyn") cf kwenv + mkAntlrFile (pkgName ++ "/" ++ filteredName ++ "Parser.g4") p + where + name :: String + name = lang opts + filteredName = filter isLetter name + mkPyFile x = mkfile x comment + mkAntlrFile x = mkfile x ("//" ++) -- "//" for comments + +-- | A makefile with distclean and clean specifically for the testsuite. No +-- "all" is needed as bnfc has already generated the necessary Python files. 
+makefile :: String -> String -> Maybe String -> String -> Doc +makefile pkgName filteredName optMakefileName basename = vcat + [ + Makefile.mkRule "all" [] + [ "java -jar $(ANTLR) -Dlanguage=Python3 " ++ pkgName ++ "/" ++ filteredName ++ "Lexer.g4" + , "java -jar $(ANTLR) -Dlanguage=Python3 " ++ pkgName ++ "/" ++ filteredName ++ "Parser.g4" ] + , Makefile.mkRule "clean" [] + [ "rm -f parser.out parsetab.py" ] + , Makefile.mkRule "distclean" [ "vclean" ] [] + , Makefile.mkRule "vclean" [] + [ "rm -f " ++ unwords + [ + pkgName ++ "/Absyn.py", + pkgName ++ "/PrettyPrinter.py", + pkgName ++ "/Absyn.py.bak", + pkgName ++ "/PrettyPrinter.py.bak", + pkgName ++ "/" ++ filteredName ++ "*", + "skele.py", + "genTest.py", + "skele.py.bak", + "genTest.py.bak" + ], + "rm -f " ++ pkgName ++ "/__pycache__/*.pyc", + "rm -fd " ++ pkgName ++ "/__pycache__", + "rmdir " ++ pkgName, + "rm -f __pycache__/*.pyc", + "rm -fd __pycache__", + "rm -f " ++ makefileName, + "rm -f " ++ makefileName ++ ".bak" + ] + ] + where + makefileName = case optMakefileName of + Just s -> s + Nothing -> "None" -- No makefile will be created. + + +-- | Put string into a comment. +comment :: String -> String +comment x = "# " ++ x + + +-- Produces the content for the testing file, genTest.py. +pyTest :: String -> String -> CF -> String +pyTest pkgName filteredName cf = unlines + [ "import sys" + , "from " ++ pkgName ++ ".PrettyPrinter import printAST, lin, renderC" + , "from antlr4 import *" + , "from " ++ pkgName ++ "." ++ lexerName ++ " import " ++ lexerName + , "from " ++ pkgName ++ "." ++ parserName ++ " import " ++ parserName + , "from antlr4.error.ErrorListener import ErrorListener" + , "" + , "# Suggested input options:" + , "# python3 genTest.py < sourcefile" + , "# python3 genTest.py sourcefile inputfile (i.e. for interpreters)." 
+ , "inputFile = None" + , "if len(sys.argv) > 1:" + , " f = open(sys.argv[1], 'r')" + , " inp = f.read()" + , " f.close()" + , " if len(sys.argv) > 2:" + , " inputFile = sys.argv[2]" + , "else:" + , " inp = ''" + , " for line in sys.stdin:" + , " inp += line" + , "" + , "class ThrowingErrorListener(ErrorListener):" + , " def syntaxError(self, recognizer, offendingSymbol, line, column, msg, e):" + , " raise Exception(f'Syntax error at line {line}, column {column}: {msg}')" + , "" + , "try:" + , " lexer = " ++ lexerName ++ "(InputStream(inp))" + , " lexer.removeErrorListeners()" + , " lexer.addErrorListener(ThrowingErrorListener())" + , "" + , " stream = CommonTokenStream(lexer)" + , " parser = " ++ parserName ++ "(stream)" + , " parser.removeErrorListeners()" + , " parser.addErrorListener(ThrowingErrorListener())" + , "" + , " tree = parser.start_" ++ defaultEntrypoint ++ "()" + , " ast = tree.result" + , " error = False" + , "except Exception as e:" + , " print(e)" + , " error = True" + , "" + , "if not error and ast:" + , " print('Parse Successful!\\n')" + , " print('[Abstract Syntax]')" + , " print(printAST(ast))" + , " print('\\n[Linearized Tree]')" + , " linTree = lin(ast)" + , " print(renderC(linTree))" + , " print()" + , "else:" + , " print('Parse failed')" + , " quit(1)" + ] + where + lexerName = filteredName ++ "Lexer" + parserName = filteredName ++ "Parser" + defaultEntrypoint = (translateToList . show . firstEntry) cf + + diff --git a/source/src/BNFC/Backend/Python/Antlr4Utils.hs b/source/src/BNFC/Backend/Python/Antlr4Utils.hs new file mode 100644 index 00000000..33fea7ef --- /dev/null +++ b/source/src/BNFC/Backend/Python/Antlr4Utils.hs @@ -0,0 +1,46 @@ +{- + Description : Copied from the Java backend and modified for use with Python. 
+ Modified by : Björn Werner +-} + +module BNFC.Backend.Python.Antlr4Utils (getRuleName, getLabelName, startSymbol, + comment) + where + +import BNFC.CF +import BNFC.Utils (mkName, NameStyle(..)) +import BNFC.Backend.Python.PyHelpers (pythonReserved) + + +-- | Make an Antlr grammar file line comment +comment :: String -> String +comment = ("// " ++) + + +-- Python keywords plus Antlr4 reserved keywords +pythonAntlrReserved :: [String] +pythonAntlrReserved = pythonReserved ++ + [ "catch" + , "grammar" + , "throws" + ] + + +-- | Appends an underscore if there is a clash with a Python or ANTLR keyword. +-- E.g. "Grammar" clashes with ANTLR keyword "grammar" since +-- we sometimes need the upper and sometimes the lower case version +-- of "Grammar" in the generated parser. +getRuleName :: String -> String +getRuleName z + -- | firstLowerCase z `elem` ("grammar" : pythonReserved) = z ++ "_" + | z `elem` pythonAntlrReserved = z ++ "_" + | otherwise = z + + +getLabelName :: Fun -> String +getLabelName = mkName ["Rule"] CamelCase + + +-- | Make a new entrypoint NT for an existing NT. +startSymbol :: String -> String +startSymbol = ("Start_" ++) diff --git a/source/src/BNFC/Backend/Python/CFtoAntlr4Lexer.hs b/source/src/BNFC/Backend/Python/CFtoAntlr4Lexer.hs new file mode 100644 index 00000000..e1d48098 --- /dev/null +++ b/source/src/BNFC/Backend/Python/CFtoAntlr4Lexer.hs @@ -0,0 +1,187 @@ +{-# LANGUAGE NoImplicitPrelude #-} +{-# LANGUAGE OverloadedStrings #-} + +{- + BNF Converter: Python Antlr4 Lexer generator + Copyright (C) 2015 Author: Gabriele Paganelli + + Description : This module generates the Antlr4 input file. 
+ Based on CFtoJLex15.hs + + Author : Gabriele Paganelli (gapag@distruzione.org) + Created : 15 Oct, 2015 + + Edited for Python by + : Björn Werner + Modified : 30 Dec, 2024 + +-} + +module BNFC.Backend.Python.CFtoAntlr4Lexer ( cf2AntlrLex ) where + +import Prelude hiding ((<>)) + +import Text.PrettyPrint +import BNFC.CF +import BNFC.Backend.Java.RegToAntlrLexer +import BNFC.Backend.Common.NamedVariables +import BNFC.Backend.Python.Antlr4Utils (getRuleName) + + +-- | Creates a lexer grammar. +-- Since antlr token identifiers must start with an uppercase symbol, +-- I prepend "Surrogate_id_SYMB_" to the identifier. +-- This introduces risks of clashes if somebody uses the same identifier for +-- user defined tokens. This is not handled. +-- returns the environment because the parser uses it. +cf2AntlrLex :: String -> CF -> (Doc, KeywordEnv) +cf2AntlrLex lang cf = (,env) $ vcat + [ prelude lang + , cMacros + -- unnamed symbols (those in quotes, not in token definitions) + , lexSymbols env + , restOfLexerGrammar cf + ] + where + env = zip (cfgSymbols cf ++ reservedWords cf) $ + map (("Surrogate_id_SYMB_" ++) . show) [0 :: Int ..] + + +-- | File prelude +prelude :: String -> Doc +prelude lang = vcat + [ "// Lexer definition for use with Antlr4" + , "lexer grammar" <+> text lang <> "Lexer;" + ] + + +--For now all categories are included. +--Optimally only the ones that are used should be generated. 
+cMacros :: Doc +cMacros = vcat + [ "// Predefined regular expressions in BNFC" + , frg "LETTER : CAPITAL | SMALL" + , frg "CAPITAL : [A-Z\\u00C0-\\u00D6\\u00D8-\\u00DE]" + , frg "SMALL : [a-z\\u00DF-\\u00F6\\u00F8-\\u00FF]" + , frg "DIGIT : [0-9]" + ] + where frg a = "fragment" <+> a <+> ";" + + +escapeChars :: String -> String +escapeChars = concatMap escapeCharInSingleQuotes + + +-- | +-- >>> lexSymbols [("foo","bar")] +-- bar : 'foo' ; +-- >>> lexSymbols [("\\","bar")] +-- bar : '\\' ; +-- >>> lexSymbols [("/","bar")] +-- bar : '/' ; +-- >>> lexSymbols [("~","bar")] +-- bar : '~' ; +lexSymbols :: KeywordEnv -> Doc +lexSymbols ss = vcat $ map transSym ss + where + transSym (s,r) = text r <> " : '" <> text (escapeChars s) <> "' ;" + + +-- | Writes rules for user defined tokens, and, if used, the predefined +-- BNFC tokens. +restOfLexerGrammar :: CF -> Doc +restOfLexerGrammar cf = vcat + [ lexComments (comments cf) + , "" + , userDefTokens + , ifString strdec + , ifChar chardec + , ifC catDouble + [ "// Double predefined token type" + , "DOUBLE : DIGIT+ '.' DIGIT+ ('e' '-'? DIGIT+)?;" + ] + , ifC catInteger + [ "//Integer predefined token type" + , "INTEGER : DIGIT+;" + ] + , ifC catIdent + [ "// Identifier token type" + , "fragment" + , "IDENTIFIER_FIRST : LETTER | '_';" + , "IDENT : IDENTIFIER_FIRST (IDENTIFIER_FIRST | DIGIT)*;" + ] + , "// Whitespace" + , "WS : (' ' | '\\r' | '\\t' | '\\n' | '\\f')+ -> skip;" + , "// Escapable sequences" + , "fragment" + , "Escapable : ('\"' | '\\\\' | 'n' | 't' | 'r' | 'f');" + , "ErrorToken : . 
;" + , ifString stringmodes + , ifChar charmodes + ] + where + ifC cat s = if isUsedCat cf (TokenCat cat) then vcat s else "" + ifString = ifC catString + ifChar = ifC catChar + strdec = [ "// String token type" + , "STRING : '\"' -> more, mode(STRINGMODE);" + ] + chardec = ["CHAR : '\\'' -> more, mode(CHARMODE);"] + userDefTokens = vcat + [ text (getRuleName name) <> " : " <> text (printRegJLex exp) <> ";" + | (name, exp) <- tokenPragmas cf ] + stringmodes = [ "mode STRESCAPE;" + , "STRESCAPED : Escapable -> more, popMode ;" + , "mode STRINGMODE;" + , "STRINGESC : '\\\\' -> more , pushMode(STRESCAPE);" + , "STRINGEND : '\"' -> type(STRING), mode(DEFAULT_MODE);" + , "STRINGTEXT : ~[\"\\\\] -> more;" + ] + charmodes = [ "mode CHARMODE;" + , "CHARANY : ~['\\\\] -> more, mode(CHAREND);" + , "CHARESC : '\\\\' -> more, pushMode(CHAREND),pushMode(ESCAPE);" + , "mode ESCAPE;" + , "ESCAPED : (Escapable | '\\'') -> more, popMode ;" + , "mode CHAREND;" + , "CHARENDC : '\\'' -> type(CHAR), mode(DEFAULT_MODE);" + ] + + +-- | Stores multi and single line comment rules. +lexComments :: ([(String, String)], [String]) -> Doc +lexComments ([],[]) = "" +lexComments (m,s) = vcat + (prod "COMMENT_antlr_builtin" lexSingleComment s ++ + prod "MULTICOMMENT_antlr_builtin" lexMultiComment m ) + where + prod bg lc ty = [bg, ": ("] ++ punctuate "|" (map lc ty) ++ skiplex + skiplex = [") -> skip;"] + + +-- | Create lexer rule for single-line comments. +-- +-- >>> lexSingleComment "--" +-- '--' ~[\r\n]* (('\r'? '\n')|EOF) +-- +-- >>> lexSingleComment "\"" +-- '"' ~[\r\n]* (('\r'? '\n')|EOF) +lexSingleComment :: String -> Doc +lexSingleComment c = + "'" <>text (escapeChars c) <> "' ~[\\r\\n]* (('\\r'? '\\n')|EOF)" + + +-- | Create lexer rule for multi-lines comments. +-- +-- There might be a possible bug here if a language includes 2 multi-line +-- comments. They could possibly start a comment with one character and end it +-- with another. However this seems rare. 
+-- +-- >>> lexMultiComment ("{-", "-}") +-- '{-' (.)*? '-}' +-- +-- >>> lexMultiComment ("\"'", "'\"") +-- '"\'' (.)*? '\'"' +lexMultiComment :: (String, String) -> Doc +lexMultiComment (b,e) = "'" <> text (escapeChars b) + <> "' (.)*? '"<> text (escapeChars e) + <> "'" diff --git a/source/src/BNFC/Backend/Python/CFtoAntlr4Parser.hs b/source/src/BNFC/Backend/Python/CFtoAntlr4Parser.hs new file mode 100644 index 00000000..14357e1c --- /dev/null +++ b/source/src/BNFC/Backend/Python/CFtoAntlr4Parser.hs @@ -0,0 +1,342 @@ +{- + BNF Converter: Antlr4 Python Generator + Copyright (C) 2004 Author: Markus Forsberg, Michael Pellauer, + Bjorn Bringert + + Description : This module generates the ANTLR .g4 input file for the + Python backend. It follows the same basic structure + of CFtoHappy. + + Author : Gabriele Paganelli (gapag@distruzione.org) + Created : 15 Oct, 2015 + + Edited for Python 2024 by + : Björn Werner +-} + +{-# LANGUAGE LambdaCase #-} + +module BNFC.Backend.Python.CFtoAntlr4Parser ( cf2AntlrParse ) where + +import Data.Foldable ( toList ) +import Data.List ( intercalate ) +import Data.Maybe +import BNFC.CF +import BNFC.Utils ( (+++), (+.+), applyWhen ) +import BNFC.Backend.Python.Antlr4Utils +import BNFC.Backend.Common.NamedVariables +import BNFC.Backend.Python.PyHelpers +import Data.Either (lefts, rights, isLeft) + + +-- | A definition of a non-terminal by all its rhss, +-- together with parse actions. +data PDef = PDef + { _pdNT :: Maybe String + -- ^ If given, the name of the lhss. Usually computed from 'pdCat'. + , _pdCat :: Cat + -- ^ The category to parse. + , _pdAlts :: [(Pattern, Action, Maybe Fun)] + -- ^ The possible rhss with actions. If 'null', skip this 'PDef'. + -- Where 'Nothing', skip ANTLR rule label. + } +type Rules = [PDef] +type Pattern = String +type Action = String +type MetaVar = (String, Cat) + + +-- | Creates the ANTLR parser grammar for this CF. 
+--The environment comes from CFtoAntlr4Lexer +cf2AntlrParse :: String -> String -> CF -> KeywordEnv -> String +cf2AntlrParse lang packageAbsyn cf env = unlines $ concat + [ [ header + , tokens + , importAbs + , "" + -- Generate start rules [#272] + -- _X returns [ dX result ] : x=X EOF { $result = $x.result; } + , prRules packageAbsyn $ map entrypoint $ toList $ allEntryPoints cf + -- Generate regular rules + , prRules packageAbsyn $ rulesForAntlr4 packageAbsyn cf env + ] + ] + where + header :: String + header = unlines + [ "// Parser definition for use with ANTLRv4" + , "parser grammar" +++ lang ++ "Parser;" + ] + tokens :: String + tokens = unlines + [ "options {" + , " tokenVocab = " ++ lang ++ "Lexer;" + , "}" + ] + importAbs :: String + importAbs = unlines + [ "@parser::header {import " ++ packageAbsyn + , "}" + ] + + +-- | Generate start rule to help ANTLR. +-- +-- @start_X returns [ X result ] : x=X EOF { $result = $x.result; } # Start_X@ +-- +entrypoint :: Cat -> PDef +entrypoint cat = + PDef (Just nt) cat [(pat, act, fun)] + where + nt = firstLowerCase $ startSymbol $ identCat cat + pat = "x=" ++ catToNT cat +++ "EOF" + act = "$result = $x.result;" + -- No ANTLR Rule label, ("Start_" ++ identCat cat) conflicts with lhs. + fun = Nothing + + +-- | The following functions are a (relatively) straightforward translation +-- of the ones in CFtoHappy.hs +rulesForAntlr4 :: String -> CF -> KeywordEnv -> Rules +rulesForAntlr4 packageAbsyn cf env = map mkOne getrules + where + getrules = ruleGroups cf + mkOne (cat,rules) = constructRule packageAbsyn cf env rules cat + + +-- | Aids the pattern constructor for lists +data ListType = Term | Sep | None + deriving Eq + + +-- | For every non-terminal, we construct a set of rules. A rule is a +-- sequence of terminals and non-terminals, and an action to be performed. 
+-- Complete sets of separator or terminator rules are treated separately, as +-- the default recursive parsing may reach the maximum recursion depth +-- in Python. Cases of multiple sets of rules are not considered. +constructRule :: String -> CF -> KeywordEnv -> [Rule] -> NonTerminal -> PDef +constructRule packageAbsyn cf env rules nt + | termOrSep /= None = PDef Nothing nt $ + (if oneNilFun then + [ (" /* empty */ " + , "$result=[]" + , Nothing + ) + ] + else + [] + ) ++ + [ ( generateListPatterns packageAbsyn env + (rhsRule (head consFuns)) termOrSep oneNilFun + , "# actions embedded in pattern" + , Nothing + ) + ] + | otherwise = PDef Nothing nt $ + [ ( p + , generateAction packageAbsyn nt (funRule r) m b + , Nothing -- labels not needed for BNFC-generated AST parser + ) + | (index, r0) <- zip [1..] rules + , let b = isConsFun (funRule r0) && elem (valCat r0) (cfgReversibleCats cf) + , let r = applyWhen b revSepListRule r0 + , let (p,m0) = generatePatterns index env r + , let m = applyWhen b reverse m0 + ] + where + -- Figures out if the rules are well formed list rules (using the + -- separator or terminator pragmas). + nilFuns = filter isNilFun rules + oneFuns = filter isOneFun rules + consFuns = filter isConsFun rules + + noNilFuns = length nilFuns == 0 + noOneFuns = length oneFuns == 0 + + oneNilFun = length nilFuns == 1 + oneOneFun = length oneFuns == 1 + oneConsFun = length consFuns == 1 + + onlyMiddle :: [Either Cat String] -> Bool + onlyMiddle ecs = all isLeft [head ecs, last ecs] + + noStrings :: [Either Cat String] -> Bool + noStrings ecs = length (rights ecs) == 0 + + -- Terminator: + -- [] + -- (:) C ... [C] + isTerminator = oneNilFun && noOneFuns && oneConsFun && + (noStrings . rhsRule . head) nilFuns && + (onlyMiddle . rhsRule . head) consFuns + + -- Terminator nonempty: + -- (:[]) C ... + -- (:) C ... [C] + isTerminatorNonempty = noNilFuns && oneOneFun && oneConsFun && + (isLeft . head . rhsRule . head) oneFuns && + (onlyMiddle . rhsRule . 
head) consFuns && + (rights . rhsRule . head) oneFuns == ((rights . rhsRule . head) consFuns) + + -- Separator: + -- [] + -- (:[]) C + -- (:) C ... [C] + isSeparator = oneNilFun && oneOneFun && oneConsFun && + (noStrings . rhsRule . head) nilFuns && + (noStrings . rhsRule . head) oneFuns && + (onlyMiddle . rhsRule . head) consFuns + + -- Separator nonempty: + -- (:[]) C + -- (:) C ... [C] + isSeparatorNonempty = noNilFuns && oneOneFun && oneConsFun && + (noStrings . rhsRule . head) oneFuns && + (onlyMiddle . rhsRule . head) consFuns + + termOrSep + | isTerminator || isTerminatorNonempty = Term + | isSeparator || isSeparatorNonempty = Sep + | otherwise = None + + +-- | Generates a string containing the semantic action. +generateAction :: IsFun f => String -> NonTerminal -> f -> [MetaVar] + -> Bool -- ^ Whether the list should be reversed or not. + -- Only used if this is a list rule. + -> Action +generateAction packageAbsyn _ f ms rev + | isNilFun f = "$result =" ++ c ++ ";" + | isOneFun f = "$result =" ++ c ++ ";; $result.append(" ++ p_1 ++ ")" + | isConsFun f = "$result =" ++ p_2 ++ ";; $result." ++ add p_1 + | isCoercion f = "$result = " ++ p_1 ++ ";" + | isDefinedRule f = + "$result = " ++ packageAbsyn ++ "." ++ sanitize (funName f) + ++ "(" ++ intercalate "," (map (resultvalue packageAbsyn) ms) ++ ");" + | otherwise = "$result = " ++ c + ++ "(" ++ intercalate "," (map (resultvalue packageAbsyn) ms) ++ ");" + where + sanitize = getRuleName + c = if isNilFun f || isOneFun f || isConsFun f + then "[]" + else packageAbsyn ++ "." ++ sanitize (funName f) + p_1 = resultvalue packageAbsyn $ ms!!0 + p_2 = resultvalue packageAbsyn $ ms!!1 + add p = (if rev then "append(" else "insert(0, ") ++ p ++ ")" + + +-- | Gives the abstract syntax constructor for a cat. 
+resultvalue :: String -> MetaVar -> String +resultvalue packageAbsyn (n,c) = case c of + TokenCat "Double" -> concat [ packageAbsyn ++ ".Double(", txt, ")" ] + TokenCat "Integer" -> concat [ packageAbsyn ++ ".Integer(" , txt, ")" ] + TokenCat "Char" -> packageAbsyn ++ ".Char(" ++ txt ++ ")" + TokenCat "String" -> packageAbsyn ++ ".String(" ++ txt ++ ")" + TokenCat "Ident" -> concat [ packageAbsyn, ".Ident(", txt, ")" ] + TokenCat s -> packageAbsyn ++ "." ++ unkw s ++ "(" ++ txt ++ ")" + _ -> concat [ "$", n, ".result" ] + where + txt = '$':n +.+ "text" + + + -- | Generate patterns and a set of metavariables indicating +-- where in the pattern the non-terminal +-- >>> generatePatterns 2 [] $ npRule "myfun" (Cat "A") [] Parsable +-- (" /* empty */ ",[]) +-- >>> generatePatterns 3 [("def", "_SYMB_1")] $ npRule "myfun" (Cat "A") [Right "def", Left (Cat "B")] Parsable +-- ("_SYMB_1 p_3_2=b",[("p_3_2",B)]) +generatePatterns :: Int -> KeywordEnv -> Rule -> (Pattern,[MetaVar]) +generatePatterns ind env r = + case rhsRule r of + [] -> (" /* empty */ ", []) + its -> ( unwords $ mapMaybe (uncurry mkIt) nits + , [ (var i, cat) | (i, Left cat) <- nits ] + ) + where + nits = zip [1 :: Int ..] its + var i = "p_" ++ show ind ++"_"++ show i + mkIt i = \case + Left c -> Just $ var i ++ "=" ++ catToNT c + Right s -> lookup s env + + +-- | Nonempty patterns with embedded semantic actions, to match: +-- Separator: +-- C (... C)* +-- Terminator: +-- (C ...)+ +-- A terminators for example is turned into the pattern: +-- {init list action} ( p_1_2=C {append action} )+ +-- Not that for separators with empty, consFun empty is a possible derivation, +-- meaning a separator can end with delims: +-- C (... C)* (...)? 
+generateListPatterns :: String -> KeywordEnv -> [Either Cat String] ->
+  ListType -> Bool -> Pattern
+generateListPatterns packageAbsyn env ecss termOrSep oneNilFun =
+  case termOrSep of
+    Sep  -> p1 ++ " {" ++ a1 ++ "} (" ++ delims ++ " " ++ p2 ++
+      " {" ++ a2 ++ "})*" ++ if oneNilFun then "(" ++ delims ++ ")?" else ""
+    Term -> "{$result=[];} (" ++ p2 ++ " " ++ delims ++ " {" ++ a2 ++ "})+"
+    None -> error "Neither Term nor Sep"
+  where
+    c  = head (lefts ecss)
+    p1 = "p_1_1=" ++ catToNT c
+    p2 = "p_1_2=" ++ catToNT c
+
+    a1 = "$result = [" ++ resultvalue packageAbsyn ("p_1_1", c) ++ "]"
+    a2 = "$result.append(" ++ resultvalue packageAbsyn ("p_1_2", c) ++ ")"
+
+    delims = unwords (mapMaybe (\s -> lookup s env) (rights ecss))
+
+
+-- | Converts a cat to a string; an underscore is appended for keywords.
+catToNT :: Cat -> String
+catToNT = \case
+  TokenCat "Ident"   -> "IDENT"
+  TokenCat "Integer" -> "INTEGER"
+  TokenCat "Char"    -> "CHAR"
+  TokenCat "Double"  -> "DOUBLE"
+  TokenCat "String"  -> "STRING"
+  c | isTokenCat c -> getRuleName $ identCat c
+    | otherwise    -> getRuleName $ firstLowerCase $ identCat c
+
+
+-- | Puts together the patterns and actions and returns a string containing
+-- all the rules.
+prRules :: String -> Rules -> String
+prRules packabs = concatMap $ \case
+
+  -- No rules: skip.
+  PDef _mlhs _nt [] -> ""
+
+  -- At least one rule: print!
+  PDef mlhs nt (rhs : rhss) -> unlines $ concat
+
+    -- The definition header: lhs and type.
+    [ [ unwords [ fromMaybe nt' mlhs
+                , "returns" , "[" , packabs +.+ normcat , "result" , "]"
+                ]
+      ]
+    -- The first rhs.
+    , alternative "  :" rhs
+    -- The other rhss.
+    , concatMap (alternative "  |") rhss
+    -- The definition footer.
+    , [ "  ;" ]
+    ]
+    where
+      alternative sep (p, a, label) = concat
+        [ [ concat [ sep , p ] ]
+        , [ concat [ "    {" , a , "}" ] ]
+        , [ concat [ "    #" , antlrRuleLabel l ] | Just l <- [label] ]
+        ]
+      catid   = identCat nt
+      normcat = identCat (normCat nt)
+      nt'     = getRuleName $ firstLowerCase catid
+      antlrRuleLabel :: Fun -> String
+      antlrRuleLabel fnc
+        | isNilFun fnc   = catid ++ "_Empty"
+        | isOneFun fnc   = catid ++ "_AppendLast"
+        | isConsFun fnc  = catid ++ "_PrependFirst"
+        | isCoercion fnc = "Coercion_" ++ catid
+        | otherwise      = getLabelName fnc
diff --git a/source/src/BNFC/Backend/Python/CFtoPyAbs.hs b/source/src/BNFC/Backend/Python/CFtoPyAbs.hs
new file mode 100644
index 00000000..6840453a
--- /dev/null
+++ b/source/src/BNFC/Backend/Python/CFtoPyAbs.hs
@@ -0,0 +1,151 @@
+
+{-
+  BNF Converter: Python abstract syntax and parsing definitions generator
+  Copyright (C) 2024  Author: Bjorn Werner
+  Based on CFtoCAbs.hs, Copyright (C) 2004  Michael Pellauer
+-}
+
+module BNFC.Backend.Python.CFtoPyAbs (cf2PyAbs) where
+import Data.List (nub)
+import Data.Char (isLower)
+import Data.Either (lefts)
+import BNFC.CF
+import BNFC.Backend.Python.PyHelpers
+import BNFC.Backend.Common.NamedVariables
+import Text.PrettyPrint (Doc, render)
+
+
+-- | Produces the content for Absyn.py
+cf2PyAbs :: CF -> String
+cf2PyAbs cf = unlines
+  [ "from typing import List as _List"
+  , "# Value categories (no coercions):"
+  , unlines valueCatsClasses
+  , ""
+  , placeholderVariableClass
+  , ""
+  , "# Rules:"
+  , "from dataclasses import dataclass, field"
+  , "\n" ++ unlines dataClasses
+  , ""
+  , createDefineFunctions cf
+  ]
+  where
+    rules = cfgRules cf
+
+    -- To create Absyn.py
+    dataClasses :: [String]
+    dataClasses = map makePythonClass
+      [ r | r <- rules, not (isDefinedRule r)
+      , not (isNilCons r)
+      , not (isCoercion r)
+      ]
+
+    rulesNoListConstructors =
+      [ r | r <- cfgRules cf, not (isNilCons r), not (isCoercion r) ]
+
+    -- Note: Custom tokens are set to inherit "str".
+    valueCatNames = nub $
+      map (unkw . show . normCat . valCat) rulesNoListConstructors ++
+      map ((++ "(str)") . unkw) (tokenNames cf) ++
+      [ "String(str)"
+      , "Char(str)"
+      , "Ident(str)"
+      , "Integer(int)"
+      , "Double(float)"
+      ]
+    valueCatsClasses = map createValueCatClass valueCatNames
+
+
+-- | Converts the production of a define, called an expression, to a
+-- production for the parsing definition.
+expToDef :: CF -> Exp -> String
+expToDef cf (App "(:)" _ [e, App "[]" _ _]) = "[" ++ expToDef cf e ++ "]"
+expToDef cf (App "(:)" _ [e, recList]) = "[" ++ expToDef cf e ++ ", " ++
+  tail (expToDef cf recList)  -- drop the recursive result's opening bracket
+expToDef _ (App "[]" _ _) = "[]"
+expToDef cf (App fName _ exps)
+  | isLower (head fName) =
+      unkw fName ++ "(" ++ addCommas (map (expToDef cf) exps) ++ ")"
+  | otherwise =
+      unkw fName ++ "(" ++ addCommas (map (expToDef cf) exps) ++ ")"
+expToDef _ (Var s) = unkw s
+expToDef _ (LitInt i) = "Integer(" ++ show i ++ ")"
+expToDef _ (LitDouble d) = "Double(" ++ show d ++ ")"
+expToDef _ (LitChar s) = "Char(\"" ++ show s ++ "\")"
+expToDef _ (LitString s) = "String('" ++ s ++ "')"
+
+
+-- A placeholder variable to store additional information, for, say, type
+-- annotations.
+placeholderVariableClass :: String
+placeholderVariableClass = unlines
+  [ "# Placeholder to add additional information to a node in the AST," ++
+    " like type information."
+  , "class _AnnType:"
+  , "  def __init__(self):"
+  , "    self.__v = None"
+  , ""
+  , "  def s(self, val):"
+  , "    if not self.__v == None:"
+  , "      if self.__v != val:"
+  , "        raise Exception('already has type: ' + str(self.__v)" ++
+    " + ' and tried to set to ' + str(val))"
+  , "    self.__v = val"
+  , ""
+  , "  def g(self):"
+  , "    return self.__v"
+  , ""
+  , "  def __str__(self):"
+  , "    return str(self.__v.__class__)"
+  , ""
+  , "  def __repr__(self):"
+  , "    return str(self.__v.__class__)"
+  ]
+
+
+-- | The value categories become abstract classes, for type hinting.
+createValueCatClass :: String -> String +createValueCatClass s = "class " ++ s ++ ":\n\tpass\n" + + +-- | Make a Python class from a rule's name and production. +makePythonClass :: Rul RFun -> String +makePythonClass rule = + "@dataclass\n" ++ + "class " ++ className ++ ":\n" ++ + if length cats == 0 then "\tpass\n" else classBody + where + className = unkw (funName rule) + sentForm = rhsRule rule + cats = lefts sentForm + nvCats = numVars sentForm :: [Either (Cat, Doc) String] + + enumeratedVarsWithType = [render d ++ ": " ++ + strCatToPyTyping (show (normCat c)) | (c, d) <- lefts nvCats] + + classBody = unlines $ map ("\t" ++) (enumeratedVarsWithType ++ + ["_ann_type: _AnnType = field(default_factory=_AnnType)"]) + + +-- | Creates the corresponding type hinting for some member variable. +strCatToPyTyping :: String -> String +strCatToPyTyping s = if strIsList s + then "_List['" ++ (unkw . tail . init) s ++ "']" + else unkw s + + +-- | Creates functions for the defines. +createDefineFunctions :: CF -> String +createDefineFunctions cf = unlines + (map (createDefineFunction cf) (definitions cf)) + + +createDefineFunction :: CF -> Define -> String +createDefineFunction cf d = unlines + [ "def " ++ (unkw . wpThing . defName) d ++ "(" ++ addCommas args ++ "):" + , " return " ++ expToDef cf (defBody d) + ] + where + args = map (unkw . 
fst) (defArgs d) + diff --git a/source/src/BNFC/Backend/Python/CFtoPyPrettyPrinter.hs b/source/src/BNFC/Backend/Python/CFtoPyPrettyPrinter.hs new file mode 100644 index 00000000..8464bff6 --- /dev/null +++ b/source/src/BNFC/Backend/Python/CFtoPyPrettyPrinter.hs @@ -0,0 +1,436 @@ + +{- + BNF Converter: Python pretty-printer generator + Copyright (C) 2024 Author: Bjorn Werner + Based on CFtoCPrinter.hs, Copyright (C) 2004 Michael Pellauer +-} + +module BNFC.Backend.Python.CFtoPyPrettyPrinter ( cf2PyPretty ) where +import Data.List ( intercalate, nub, findIndices ) +import BNFC.CF +import BNFC.Backend.Python.PyHelpers +import BNFC.Backend.Common.NamedVariables +import Text.PrettyPrint (Doc, render) +import Data.Either (rights, lefts, isLeft) +import BNFC.Backend.Common.StrUtils +import qualified Data.List.NonEmpty as List1 + + +-- | Used to create PrettyPrinter.py, that contains the functionality +-- to print the AST and the linearized tree. +cf2PyPretty :: String -> CF -> String +cf2PyPretty pkgName cf = unlines + [ "from " ++ pkgName ++ ".Absyn import *" + , "import itertools" + , "" + , makePrintAST cf + , "" + , makeListDecons cf + , "" + , makeRenderC + , "" + , makeCoercCompare cf + , "" + , makeCompareFunc + , "" + , makeLinFunc cf + ] + + +-- | Creates the print AST function. +makePrintAST :: CF -> String +makePrintAST cf = concat + [ "def printAST(ast: object) -> list:\n" + , " match ast:\n" + , concat + [ ifUsedThen catInteger + [ " case Integer():" + , " return str(ast)" + ] + , ifUsedThen catDouble + [ " case Double():" + , " if ast.is_integer():" + , " return str(int(ast))" + , " else:" + , " return str(ast)" + ] + , ifUsedThen catString + [ " case String():" + , " return str(ast)" + ] + , ifUsedThen catChar + [ " case Char():" + , " return str(ast)" + ] + , ifUsedThen catIdent + [ " case Ident():" + , " return '\"' + str(ast) + '\"'" + ] + ] + , if length (tokenNames cf) > 0 + then unlines + [ " case (" ++ intercalate " | " + (map ((++ "()") . 
unkw) (tokenNames cf)) ++ "):" + , " return '\"' + str(ast) + '\"'" + ] + else "" + , " case list():\n" + , " return '[' + ', '.join([printAST(a) for a in ast]) + ']'\n" + , "\n" + , " if len(vars(ast)) > 0:\n" + , " return '(' + ast.__class__.__name__ + ' ' + ' '.join(" ++ + "[printAST(vars(ast)[k]) for k in vars(ast) if k != '_ann_type']) + ')'\n" + , " else:\n" + , " return ast.__class__.__name__\n" + ] + where + ifUsedThen :: TokenCat -> [String] -> String + ifUsedThen cat ss + | isUsedCat cf (TokenCat cat) = unlines ss + | otherwise = "" + + +-- | Creates deconstructors for all list categories. +makeListDecons :: CF -> String +makeListDecons cf = unlines $ map (makeListDecon cf) listCats + where + rules = cfgRules cf + valCats = nub $ map valCat rules + listCats = [c | c <- valCats, isList c] + + +-- | Creates a deconstructor for some list category. +makeListDecon :: CF -> Cat -> String +makeListDecon cf c = concat + [ "def list" ++ name ++ "Decon(xs):\n" + , oneRuleStr + , nilRuleStr + , consRuleStr + , "\n" + ] + where + name = show $ catOfList c + listRulesForCat = [ r | r <- cfgRules cf, isParsable r, valCat r == c] + + nilRule = case [r | r <- listRulesForCat, isNilFun r] of + [] -> Nothing + rs -> Just (head rs) + oneRule = case [r | r <- listRulesForCat, isOneFun r] of + [] -> Nothing + rs -> Just (head rs) + consRule = case [r | r <- listRulesForCat, isConsFun r] of + [] -> Nothing + rs -> Just (head rs) + + noOneFun = case oneRule of + Nothing -> True + _ -> False + + -- Builds the production recursively + sentFormToArgs :: Int -> [Either Cat String] -> String + sentFormToArgs _ [] = "[]" + sentFormToArgs v (Right strOp:ecss) = + "['" ++ escapeChars strOp ++ "'] + " ++ + sentFormToArgs v ecss + sentFormToArgs v (Left _:ecss) + | v == 0 = "c(xs[0], '" ++ name ++ "') + " ++ sentFormToArgs (v+1) ecss + | v == 1 = error "Python backend error - should use iterative approach for cons" --"list" ++ name ++ "Decon(xs[1:]) + " ++ + sentFormToArgs (v+1) ecss + | 
otherwise = error "A list production can have at most one C and one [C]."
+
+    nilRuleStr = case nilRule of
+      Nothing -> ""
+      Just r -> unlines
+        [ "  if len(xs) == 0:"
+        , "    return " ++ sentFormToArgs 0 (rhsRule r)
+        ]
+
+    oneRuleStr = case oneRule of
+      Nothing -> ""
+      Just r -> unlines
+        [ "  if len(xs) == 1:"
+        , "    return " ++ sentFormToArgs 0 (rhsRule r)
+        ]
+
+    -- Adds each element with delims iteratively
+    consRuleStr = case consRule of
+      Nothing -> ""
+      Just r -> unlines
+        [ "  " ++ start
+        , "  for x in xs[:" ++ endIndice ++ "][::-1]:"
+        , "    tot += " ++ add endlims ++ "[]"
+        , "    tot = " ++ add delims ++ "tot"
+        , "    tot = c(x, '" ++ name ++ "') + tot"
+        , "    tot = " ++ add prelims ++ "tot"
+        , "  return tot"
+        ]
+        where
+          ecss    = rhsRule r
+          indices = findIndices isLeft ecss
+          i1      = head indices
+          i2      = last indices
+          prelims = rights $ take i1 ecss
+          endlims = rights $ drop i2 ecss
+          delims  = rights $ drop i1 $ take i2 ecss
+
+          start
+            | not noOneFun = "tot = list" ++ name ++ "Decon(xs[-1:])"
+            | otherwise    = "tot = list" ++ name ++ "Decon([])"
+
+          add :: [String] -> String
+          add ss = concatMap (\s -> "['" ++ escapeChars s ++ "'] + ") ss
+
+          endIndice
+            | not noOneFun = "-1"
+            | otherwise    = ""
+
+
+-- | Creates the renderC function, which renders a list of strings into a
+-- single string, inserting whitespace to format the output in a C-like
+-- manner.
+makeRenderC :: String
+makeRenderC = unlines
+  [ "def renderC(ss: list):"
+  , "  def br(i):"
+  , "    return '\\n' + ' ' * iLevel"
+  , ""
+  , "  def ident(i):"
+  , "    return ' ' * iLevel"
+  , ""
+  , "    return tot[:i]"
+  , ""
+  , "  def oneEmptyLine(tot):"
+  , "    tot = tot.rstrip(' ')"
+  , "    if len(tot) > 0 and tot[-1] != '\\n':"
+  , "      tot += '\\n'"
+  , "    tot += ident(iLevel)"
+  , "    return tot"
+  , ""
+  , "  tot = ''"
+  , "  iLevel = 0"
+  , "  for i in range(len(ss)):"
+  , "    s = ss[i]"
+  , "    match s:"
+  , "      case '{':"
+  , "        tot = oneEmptyLine(tot)"
+  , "        iLevel += 1"
+  , "        tot += '{' + br(iLevel)"
+  , "      case ('(' | '['):"
+  , "        tot += s"
+  , "      case (')' | ']'):"
+  , "        tot = tot.rstrip()"
+  , "        tot += s + ' '"
+  , "      case '}':"
+  , "        iLevel -= 1"
+  , "        tot = oneEmptyLine(tot)"
+  , "        tot += s + br(iLevel)"
+  , "      case ',':"
+  , "        tot = tot.rstrip()"
+  , "        tot += s + ' '"
+  , "      case ';':"
+  , "        tot = tot.rstrip()"
+  , "        tot += s + br(iLevel)"
+  , "      case '':"
+  , "        tot += ''"
+  , "      case ' ':"
+  , "        tot += s"
+  , "      case _:"
+  , "        if s[-1] == ' ':"  -- To not extend separators of spaces.
+  , "          tot = tot.rstrip()"
+  , "          tot += s"
+  , "        else:"
+  , "          tot += s + ' '"
+  , ""
+  , "  return tot"
+  ]
+
+
+-- Provides a mapping from a rule to its value category.
+makeCoercCompare :: CF -> String
+makeCoercCompare cf = concat
+  [ "cdict = {\n"
+  , unlines (map (\(fs, cs) -> "  " ++ unkw fs ++ " : '" ++ cs ++ "',") scs)
+  , "}"
+  ]
+  where
+    scs :: [(String, String)]
+    scs = [ (funName r, (show . wpThing . valRCat) r) | r <- cfgRules cf,
+      not (isCoercion r), not (isNilCons r), not (isDefinedRule r) ]
+
+
+-- | Creates a function that attempts to figure out whether
+-- parentheses are required, for example:
+--   1 + (2 * 3)
+-- The precedence for the addition is low, say Exp, but the multiplication
+-- has a higher precedence, say Exp1, so parentheses are needed.
+makeCompareFunc :: String
+makeCompareFunc = unlines
+  [ "def c(ast, cat: str) -> list:"
+  , "  cl = ast.__class__"
+  , "  if cl in cdict:"
+  , "    clCat = cdict[cl]"
+  , "    clCatAlphas = ''.join(filter(str.isalpha, clCat))"
+  , "    catAlphas = ''.join(filter(str.isalpha, cat))"
+  , "    clCatNums = ''.join(filter(str.isnumeric, clCat))"
+  , "    catNums = ''.join(filter(str.isnumeric, cat))"
+  , "    clCatNum = 0"
+  , "    catNum = 0"
+  , "    if clCatAlphas == catAlphas:"
+  , "      if len(clCatNums) > 0:"
+  , "        clCatNum = int(clCatNums)"
+  , "      if len(catNums) > 0:"
+  , "        catNum = int(catNums)"
+  , "      if clCatNum < catNum:"
+  , "        return ['('] + lin(ast) + [')']"
+  , "  return lin(ast)"
+  ]
+
+
+-- | Returns the AST as a list of strings, which can be sent to the
+-- renderC function.
+makeLinFunc :: CF -> String
+makeLinFunc cf = unlines
+  [ "def lin(ast: object) -> list:"
+  , "  match ast:"
+  , concat
+    [ ifUsedThen catInteger
+      [ "    case Integer():"
+      , "      return [str(ast)]"
+      ]
+    , ifUsedThen catDouble
+      [ "    case Double():"
+      , "      if ast.is_integer():"
+      , "        return [str(int(ast))]"
+      , "      else:"
+      , "        return [str(ast)]"
+      ]
+    , ifUsedThen catString
+      [ "    case String():"
+      , "      return [ast]"
+      ]
+    , ifUsedThen catIdent
+      [ "    case Ident():"
+      , "      return [ast]"
+      ]
+    , ifUsedThen catChar
+      [ "    case Char():"
+      , "      return [ast]"
+      ]
+    ]
+  , "    # Token cases:"
+  , unlines tokenCases
+  , "    # Rule cases:"
+  , unlines ruleCases
+  , -- Deals with cases where the entrypoint is, say, [Stm] or
+    -- [Exp], with pattern matching on the first object in the list.
+    "    case list():"
+  , "      if len(ast) == 0:"
+  , "        return []"
+  , "      else:"
+  , "        match ast[0]:"
+  , unlines listEntrypointCases
+  , "          case _:"
+  , "            raise Exception(ast[0].__class__.__name__, " ++
+    "'unmatched ast[0]')"
+  , "    case _:"
+  , "      raise Exception(str(ast.__class__) + ' unmatched')"
+  ]
+  where
+    -- To include standard literals, if needed.
+    ifUsedThen :: TokenCat -> [String] -> String
+    ifUsedThen cat ss
+      | isUsedCat cf (TokenCat cat) = unlines ss
+      | otherwise = ""
+
+    -- Figures out the delimiters for the separators and terminators,
+    -- to further process a deconstructed object that contains list(s).
+    rules = [ r | r <- cfgRules cf
+            , not (isCoercion r)
+            , not (isDefinedRule r)
+            , not (isNilCons r)
+            ]
+
+    tokenCases = map makeTokenCase (tokenNames cf)
+    ruleCases  = map makeRuleCase rules
+
+    catEntrypointsForLists =
+      [ catOfList c | c <- (List1.toList . allEntryPoints) cf, isList c ]
+
+    -- The Haskell backend defaults to the production for the lowest
+    -- precedence for lists that are defined, like ``separator Exp1 ","``.
+    lowestPrecListCats = [ c | c <- catEntrypointsForLists,
+      precCat c == minimum (map precCat
+        [ c2 | c2 <- catEntrypointsForLists, normCat c == normCat c2 ])
+      ]
+
+    listEntrypointCases =
+      map (makeListEntrypointCase cf) lowestPrecListCats
+
+
+-- | Creates cases that check which class an individual node belongs to,
+-- i.e. the rule names or the token categories.
+makeListEntrypointCase :: CF -> Cat -> String
+makeListEntrypointCase cf c = concat
+  [ "          case " ++ intercalate "|" constructors ++ ":\n"
+  , "            return list" ++ show c ++ "Decon(ast)"
+  ]
+  where
+    constructors = if isTokenCat c
+      then [ unkw (show c) ++ "()" ]
+      else map ((++ "()") . unkw . funName)
+        [ r | r <- rulesForNormalizedCat cf (normCat c),
+          not (isCoercion r),
+          not (isDefinedRule r)
+        ]
+
+
+-- | Creates a case for a user-defined literal, which inherits str.
+makeTokenCase :: String -> String
+makeTokenCase tokenName = concat
+  [ "    case " ++ unkw tokenName ++ "():\n"
+  , "      return [ast]"
+  ]
+
+
+-- | Creates a case for some rule, with the additional information of what
+-- separator- and terminator-delimiters there are.
+makeRuleCase :: Rul RFun -> String
+makeRuleCase rule = concat
+  [ "    case " ++ unkw fName ++ "(" ++ varNamesCommad ++ "):\n"
+  , "      # " ++ showEcss sentForm ++ "\n"
+  , "      return " ++ if length args > 0 then intercalate " + " args
+                       else "[]"
+  ]
+  where
+    fName    = wpThing (funRule rule)
+    sentForm = rhsRule rule
+
+    nvCats = numVars sentForm :: [Either (Cat, Doc) String]
+    enumeratedVarNames = [ render d | (_, d) <- lefts nvCats ]
+
+    varNamesCommad = if length enumeratedVarNames > 0
+      then addCommas (enumeratedVarNames ++ ["_ann_type"])
+      else ""
+
+    args = ecssAndVarsToList
+      sentForm
+      enumeratedVarNames
+
+
+-- | Creates a list of a production with both terminals and non-terminals.
+ecssAndVarsToList :: [Either Cat String] -> [String] -> [String]
+ecssAndVarsToList [] _ = []
+ecssAndVarsToList (Left c:ecss) (s:ss)
+  | isList c  = ["list" ++ name ++ "Decon(" ++ s ++ ")"] ++
+    ecssAndVarsToList ecss ss
+  | otherwise = ["c(" ++ s ++ ", '" ++ show c ++ "')"] ++
+    ecssAndVarsToList ecss ss
+  where
+    name = show $ catOfList c
+ecssAndVarsToList (Right strOp:ecss) ss =
+  ["['" ++ escapeChars strOp ++ "']"] ++ ecssAndVarsToList ecss ss
+ecssAndVarsToList ((Left _):_) [] = error "Missing variable name"
+
diff --git a/source/src/BNFC/Backend/Python/CFtoPySkele.hs b/source/src/BNFC/Backend/Python/CFtoPySkele.hs
new file mode 100644
index 00000000..26904764
--- /dev/null
+++ b/source/src/BNFC/Backend/Python/CFtoPySkele.hs
@@ -0,0 +1,109 @@
+
+{-
+  BNF Converter: Python skeleton-code generator
+  Copyright (C) 2024  Author: Bjorn Werner
+-}
+
+module BNFC.Backend.Python.CFtoPySkele where
+import BNFC.CF
+import BNFC.Backend.Python.PyHelpers
+import Data.Char (toLower)
+import BNFC.Backend.Common.NamedVariables
+import Text.PrettyPrint (Doc, render)
+import Data.Either (lefts)
+import Data.List (intercalate)
+
+-- | Entrypoint.
+cf2PySkele :: String -> CF -> String +cf2PySkele pkgName cf = unlines + ["from " ++ pkgName ++ ".Absyn import *" + , "" + , "" + , makeSkele cf + ] + + +-- Creates first a matcher with all value categories, and underneath one +-- matcher for each value category. +makeSkele :: CF -> String +makeSkele cf = unlines + [ "# Categories combined into one matcher" + , "def skeleMatcher(ast: object):" + , ind 1 "match ast:" + , intercalate "\n" skeleLiteralCases + , intercalate "\n" skeleTokenCases + , intercalate "\n" skeleRuleCases + , ind 2 "case _:" + , ind 3 "raise Exception(str(ast.__class__) + ' unmatched')" + , "" + , "" + , "# Categories split into their own matchers" + , unlines matchersOnCats + ] + where + rules = + [ r | r <- cfgRules cf + , not (isCoercion r) + , not (isDefinedRule r) + , not (isNilCons r) + ] + + presentLiterals = ifC catInteger ++ + ifC catDouble ++ + ifC catString ++ + ifC catIdent ++ + ifC catChar + + skeleLiteralCases = map makeSkeleTokenCase presentLiterals + skeleTokenCases = map makeSkeleTokenCase (tokenNames cf) + skeleRuleCases = map makeSkeleRuleCase rules + + parserCats = filter (not . isList) (allParserCatsNorm cf) :: [Cat] + rulesfornormalizedcat = map (rulesForNormalizedCat cf) parserCats + parserCatsWithRules = zip parserCats rulesfornormalizedcat + + matchersOnCats = map makeMatcherOnCat parserCatsWithRules + + ifC :: TokenCat -> [String] + ifC cat = if isUsedCat cf (TokenCat cat) then [cat] else [] + + +-- Creates a matcher for some value category. 
+makeMatcherOnCat :: (Cat, [Rul RFun]) -> String +makeMatcherOnCat (c, rules) = unlines + [ "def matcher" ++ show c ++ "(" ++ varName ++ ": " ++ show c ++ "):" + , ind 1 "match " ++ varName ++ ":" + , intercalate "\n" cases + , ind 2 "case _:" + , ind 3 "raise Exception(str(" ++ varName ++ ".__class__) + ' unmatched')" + , "" + ] + where + varName = map toLower (show c) ++ "_" + cases = map makeSkeleRuleCase (filter + (\r -> not (isCoercion r) && not (isDefinedRule r)) + rules) + + +-- | Creates a case for some rule. +makeSkeleRuleCase :: Rul RFun -> String +makeSkeleRuleCase rule = intercalate "\n" + [ ind 2 "case " ++ name ++ "(" ++ varNamesCommad ++ "):" + , ind 3 "# " ++ (showEcss sentForm) + , ind 3 "raise Exception('" ++ name ++ " not implemented')" + ] + where + name = unkw (funName rule) + sentForm = rhsRule rule + nvCats = numVars sentForm :: [Either (Cat, Doc) String] + enumeratedVarNames = [render d | (_, d) <- lefts nvCats] + varNamesCommad = addCommas (enumeratedVarNames ++ ["_ann_type"]) + + +-- | Creates a case for a user-defined token. 
+makeSkeleTokenCase :: String -> String
+makeSkeleTokenCase tokenName = intercalate "\n"
+  [ ind 2 "case " ++ unkw tokenName ++ "():"
+  , ind 3 "raise Exception('" ++ unkw tokenName ++ " not implemented')"
+  ]
+
diff --git a/source/src/BNFC/Backend/Python/PyHelpers.hs b/source/src/BNFC/Backend/Python/PyHelpers.hs
new file mode 100644
index 00000000..a1258e85
--- /dev/null
+++ b/source/src/BNFC/Backend/Python/PyHelpers.hs
@@ -0,0 +1,135 @@
+
+{-
+  BNF Converter: Python backend helper functions
+  Copyright (C) 2024  Author: Bjorn Werner
+-}
+
+module BNFC.Backend.Python.PyHelpers where
+import Data.List ( intercalate )
+import Data.Char
+import BNFC.CF
+
+
+-- Indents by four spaces per level.
+ind :: Int -> String -> String
+ind 0 s = s
+ind n s = ind (n-1) ("    " ++ s)
+
+
+addCommas :: [String] -> String
+addCommas ss = intercalate ", " ss
+
+
+addCitationSigns :: String -> String
+addCitationSigns ss = "'" ++ ss ++ "'"
+
+
+filterOut :: Eq a => [a] -> [a] -> [a]
+filterOut xs ys = filter (\x -> not (elem x ys)) xs
+
+
+-- Converts every character to its Unicode code point, each prefixed with an
+-- underscore.
+toOrd :: String -> String
+toOrd s = concat (map (("_" ++) . show . ord) s)
+
+
+-- | Converts a string of underscores and code points, such as "_43_43",
+-- back into "++".
+toChr :: String -> String
+toChr "" = ""
+toChr xs = map chr nrs
+  where
+    nrsStr = tail $ split '_' xs :: [String]
+    nrs    = map read nrsStr :: [Int]
+
+
+split :: Char -> String -> [String]
+split c s = split' c s ""
+
+
+split' :: Char -> String -> String -> [String]
+split' _ [] ps = [ps]
+split' c (s:ss) ps
+  | c == s    = [ps] ++ split' c ss ""
+  | otherwise = split' c ss (ps ++ [s])
+
+
+-- Converts [Cat] into ListCat, which is mainly used in the parser.
+translateToList :: String -> String +translateToList s + | strIsList s = "List" ++ (tail $ init s) + | otherwise = s + + +strIsList :: String -> Bool +strIsList s = head s == '[' && last s == ']' + + +firstRight :: [Either a b] -> Maybe b +firstRight [] = Nothing +firstRight (Left _:es) = firstRight es +firstRight (Right r:_) = Just r + + +-- Retrieves the first character from strings such as "[Stm]" or "Stm". +firstAlpha :: String -> Char +firstAlpha s + | strIsList s = head $ tail s + | otherwise = head s + + +-- | Converts a production into a string, for comments. +showEcss :: [Either Cat String] -> String +showEcss [] = "" +showEcss (Left c:ecss) = show c ++ " " ++ (showEcss ecss) +showEcss (Right strOp:ecss) = "\"" ++ strOp ++ "\" " ++ (showEcss ecss) + + +-- | Adds an underscore if the string overlaps with a keyword. +unkw :: String -> String +unkw s = if s `elem` pythonReserved then s ++ "_" else s + + +-- | Python keyword list plus soft keywords +pythonReserved :: [String] +pythonReserved = + [ "and" + , "as" + , "assert" + , "async" + , "await" + , "break" + , "case" + , "class" + , "continue" + , "def" + , "del" + , "elif" + , "else" + , "except" + , "False" + , "finally" + , "for" + , "from" + , "global" + , "if" + , "import" + , "in" + , "is" + , "lambda" + , "match" + , "None" + , "nonlocal" + , "not" + , "or" + , "pass" + , "raise" + , "return" + , "True" + , "try" + , "type" + , "while" + , "with" + , "yield" + , "_" + ] diff --git a/source/src/BNFC/Backend/Python/RegToFlex.hs b/source/src/BNFC/Backend/Python/RegToFlex.hs new file mode 100644 index 00000000..37e357b4 --- /dev/null +++ b/source/src/BNFC/Backend/Python/RegToFlex.hs @@ -0,0 +1,97 @@ +{-# LANGUAGE LambdaCase #-} + +{- + Due to the almost full similarity, the name RegToFlex remains from the + C backend (2024). 
+-}
+
+module BNFC.Backend.Python.RegToFlex (printRegFlex, escapeChar) where
+
+-- modified from the pretty-printer generated by the BNF converter
+
+import Data.Char (ord, showLitChar)
+import qualified Data.List as List
+import BNFC.Abs (Reg(..), Identifier(Identifier))
+import BNFC.Backend.Common (flexEps)
+
+
+-- the top-level printing method
+printRegFlex :: Reg -> String
+printRegFlex = render . prt 0
+
+
+-- you may want to change render and parenth
+render :: [String] -> String
+render = rend (0 :: Int) where
+  rend i ss = case ss of
+    "["     :ts -> cons "[" $ rend i ts
+    "("     :ts -> cons "(" $ rend i ts
+    t : "," :ts -> cons t $ space "," $ rend i ts
+    t : ")" :ts -> cons t $ cons ")" $ rend i ts
+    t : "]" :ts -> cons t $ cons "]" $ rend i ts
+    t       :ts -> space t $ rend i ts
+    _           -> ""
+  cons s t  = s ++ t
+  space t s = if null s then t else t ++ s
+
+
+parenth :: [String] -> [String]
+parenth ss = ["("] ++ ss ++ [")"]
+
+
+-- the printer class does the job
+class Print a where
+  prt :: Int -> a -> [String]
+
+
+prPrec :: Int -> Int -> [String] -> [String]
+prPrec i j = if j < i then parenth else id
+
+
+instance Print Identifier where
+  prt _ (Identifier (_, i)) = [i]
+
+
+instance Print Reg where
+  prt i e = case e of
+    RSeq reg0 reg -> prPrec i 2 (concat [prt 2 reg0 , prt 3 reg])
+    RAlt reg0 reg -> prPrec i 1 (concat [prt 1 reg0 , ["|"] , prt 2 reg])
+
+    -- Flex does not support set difference. See link for valid patterns.
+    -- https://westes.github.io/flex/manual/Patterns.html#Patterns
+    -- RMinus reg0 reg -> prPrec i 1 (concat [prt 2 reg0 , ["#"] , prt 2 reg])
+    RMinus reg0 REps -> prt i reg0  -- REps is identity for set difference
+    RMinus RAny (RChar c) -> [ concat [ "[^", escapeChar c, "]" ] ]
+    RMinus RAny (RAlts str) -> [ concat [ "[^", concatMap escapeChar str, "]" ] ]
+    -- FIXME: unicode inside brackets [...] is not accepted by flex
+    -- FIXME: maybe we could add cases for char - RDigit, RLetter etc.
+ RMinus _ _ -> error "Flex does not support general set difference" + + RStar reg -> concat [ prt 3 reg , ["*"] ] + RPlus reg -> concat [ prt 3 reg , ["+"] ] + ROpt reg -> concat [ prt 3 reg , ["?"] ] + REps -> [ flexEps ] + RChar c -> [ escapeChar c ] + -- Unicode characters cannot be inside [...] so we use | instead. + RAlts str -> prPrec i 1 $ List.intersperse "|" $ map escapeChar str + -- RAlts str -> concat [["["], prt 0 $ concatMap escapeChar str, ["]"]] + RSeqs str -> prPrec i 2 $ map escapeChar str + RDigit -> [ "\\d" ] + RLetter -> [ "[A-Za-z]" ] -- add underscore ? + RUpper -> [ "[A-Z]" ] + RLower -> [ "[a-z]" ] + RAny -> [ "." ] + + +-- | Handle special characters in regular expressions. +escapeChar :: Char -> String +escapeChar c + | c `elem` reserved = '\\':[c] + | let x = ord c, x >= 256 = [c] + -- keep unicode characters -- "\x" ++ showHex x "" + | otherwise = showLitChar c "" + where + reserved :: String + reserved = " '$+-*=<>[](){}!?.,;:^~|&%#/\\$_@\"" + + diff --git a/source/src/BNFC/Options.hs b/source/src/BNFC/Options.hs index ac5fdbf6..74a1c757 100644 --- a/source/src/BNFC/Options.hs +++ b/source/src/BNFC/Options.hs @@ -64,6 +64,7 @@ data Target = TargetC | TargetCpp | TargetCppNoStl | TargetHaskell | TargetHaskellGadt | TargetLatex | TargetJava | TargetOCaml | TargetPygments | TargetTreeSitter + | TargetPython | TargetCheck deriving (Eq, Bounded, Enum, Ord) @@ -83,6 +84,7 @@ instance Show Target where show TargetPygments = "Pygments" show TargetTreeSitter = "Tree-sitter" show TargetCheck = "Check LBNF file" + show TargetPython = "Python" -- | Which version of Alex is targeted? data AlexVersion = Alex3 @@ -261,6 +263,7 @@ printTargetOption = ("--" ++) . 
\case
   TargetOCaml      -> "ocaml"
   TargetPygments   -> "pygments"
   TargetTreeSitter -> "tree-sitter"
+  TargetPython     -> "python"
   TargetCheck      -> "check"

 printAlexOption :: AlexVersion -> String
@@ -314,6 +317,8 @@ targetOptions =
       "Output a Python lexer for Pygments"
   , Option "" ["tree-sitter"] (NoArg (\o -> o {target = TargetTreeSitter}))
       "Output grammar.js file for use with tree-sitter"
+  , Option "" ["python"] (NoArg (\ o -> o{target = TargetPython }))
+      "Output Python code for use with Antlr4"
   , Option "" ["check"] (NoArg (\ o -> o{target = TargetCheck }))
       "No output. Just check input LBNF file"
   ]
@@ -530,6 +535,7 @@ instance Maintained Target where
     TargetOCaml      -> True
     TargetPygments   -> True
     TargetTreeSitter -> True
+    TargetPython     -> True
     TargetCheck      -> True

 instance Maintained AlexVersion where
@@ -661,4 +667,5 @@ translateOldOptions = mapM $ \ o -> do
       , ("--ghc"                  , "--generic")
       , ("--deriveGeneric"        , "--generic")
      , ("--deriveDataTypeable"   , "--generic")
+      , ("-python"                , "--python")
      ]
diff --git a/testing/src/ParameterizedTests.hs b/testing/src/ParameterizedTests.hs
index ce0c945c..c5876296 100644
--- a/testing/src/ParameterizedTests.hs
+++ b/testing/src/ParameterizedTests.hs
@@ -421,6 +421,10 @@ parameters = concat
     , javaParams { tpName = "Java (with jflex and line numbers)"
                  , tpBnfcOptions = ["--java", "--jflex", "-l"] }
     ]
+  -- Python
+  , [ pythonParams { tpName = "Python"
+                   , tpBnfcOptions = ["--python"] }
+    ]
   ]
   where
   base = baseParameters
@@ -444,6 +448,14 @@ parameters = concat
     , tpBnfcOptions = ["--ocaml"]
     , tpRunTestProg = haskellRunTestProg
     }
+  pythonParams = base
+    { tpBuild = do
+        tpMake
+    ,
+      tpRunTestProg = \ _lang args -> do
+        pyFile_ <- findFile "genTest.py"
+        cmd "python3" $ pyFile_ : args
+    }
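
Note for reviewers: to make the generated artifacts easier to picture, here is a rough, hand-written sketch of what ``Absyn.py`` (from ``cf2PyAbs``/``makePythonClass``) and a ``skele.py``-style matcher (from ``makeSkeleRuleCase``) look like for a hypothetical grammar rule ``EAdd. Exp ::= Exp "+" Exp``. The class, field, and function names below are illustrative, not the generator's exact output:

```python
from dataclasses import dataclass, field

# Simplified stand-in for the generated _AnnType placeholder class.
class _AnnType:
    def __init__(self):
        self._v = None

# Value category: an empty class used only for type hinting.
class Exp:
    pass

# Shape of the dataclasses the backend emits, one per labelled rule,
# each carrying a trailing _ann_type placeholder field.
@dataclass
class EInt:
    integer_: int
    _ann_type: _AnnType = field(default_factory=_AnnType)

@dataclass
class EAdd:
    exp_1: 'Exp'
    exp_2: 'Exp'
    _ann_type: _AnnType = field(default_factory=_AnnType)

# Shape of a skeleton matcher: structural pattern matching on rule classes
# (requires Python 3.10+, as the user guide states).
def evaluate(exp):
    match exp:
        case EAdd(exp_1, exp_2, _ann_type):
            return evaluate(exp_1) + evaluate(exp_2)
        case EInt(integer_, _ann_type):
            return integer_
        case _:
            raise Exception(str(exp.__class__) + ' unmatched')

print(evaluate(EAdd(EInt(1), EAdd(EInt(2), EInt(3)))))  # prints 6
```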
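
The parenthesization decision that ``makeCompareFunc`` bakes into the generated ``c()`` helper can also be illustrated standalone. This hypothetical port compares two category names such as ``'Exp'`` and ``'Exp1'``: when the alphabetic parts match, a missing numeric suffix counts as precedence 0, and a lower-precedence node in a higher-precedence hole needs parentheses:

```python
def needs_parens(node_cat: str, hole_cat: str) -> bool:
    # Split a category name like 'Exp1' into its alphabetic part and
    # its numeric precedence suffix, defaulting the suffix to 0.
    alphas = lambda s: ''.join(filter(str.isalpha, s))
    prec = lambda s: int(''.join(filter(str.isnumeric, s)) or '0')
    return alphas(node_cat) == alphas(hole_cat) and prec(node_cat) < prec(hole_cat)

print(needs_parens('Exp', 'Exp1'))   # True: an addition in a multiplication slot
print(needs_parens('Exp2', 'Exp1'))  # False: already higher precedence
```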
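
Finally, the ``toOrd``/``toChr`` pair in ``PyHelpers`` encodes each character of an operator string as an underscore-prefixed code point and decodes it again. A hypothetical Python transliteration (function names invented here) shows the round trip:

```python
def to_ord(s: str) -> str:
    # Each character becomes '_' followed by its Unicode code point.
    return ''.join('_' + str(ord(c)) for c in s)

def to_chr(s: str) -> str:
    # Drop the leading empty field produced by the first underscore,
    # then map each code point back to its character.
    return '' if not s else ''.join(chr(int(p)) for p in s.split('_')[1:])

print(to_ord('++'))      # _43_43
print(to_chr('_43_43'))  # ++
```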