diff --git a/.gitignore b/.gitignore
index 4acafde18..621bae8f2 100644
--- a/.gitignore
+++ b/.gitignore
@@ -408,3 +408,4 @@ dmypy.json
 
 # Custom rules (everything added below won't be overriden by 'Generate .gitignore File' if you use 'Update' option)
 
+.vscode/
\ No newline at end of file
diff --git a/doc/report.md b/doc/report.md
new file mode 100644
index 000000000..526e60a6e
--- /dev/null
+++ b/doc/report.md
@@ -0,0 +1,239 @@
+# Cool Compiler
+
+## Authors ✒️
+
+- **Carmen Irene Cabrera Rodríguez** - [cicr99](https://github.com/cicr99)
+- **David Guaty Domínguez** - [Gu4ty](https://github.com/Gu4ty)
+- **Enrique Martínez González** - [kikeXD](https://github.com/kikeXD)
+
+## Requirements 📋
+
+To run this project you need to have installed:
+
+- Python 3.7 or later
+- The dependencies listed in the [requirements.txt](../requirements.txt) file
+- Spim, to run MIPS32 programs
+
+If you wish, you can install all the required dependencies by running the following command in your terminal, from the `/src` directory:
+
+```bash
+make install
+```
+
+## Usage
+
+To compile and run a COOL file, run the following command in the terminal from the `/src` directory:
+
+```bash
+make main .cl
+```
+
+If you do not provide a file, the `code.cl` file in that directory is used by default. The command above is equivalent to:
+
+```bash
+./coolc.sh .cl
+spim -file .mips
+```
+
+## Compiler architecture
+
+This project builds on the material and projects developed in third year, adding the missing functionality and modifying and improving the existing code.
+
+### Phases
+
+The compilation process is divided into the phases listed below, each of which is explained in more detail in the following sections:
+
+1. Lexer
+2. Parsing
+3. Type collection
+4. Type building
+5. Type checking / inference
+6. Translation from COOL to CIL
+7. Translation from CIL to MIPS
+
+### Lexer
+
+Lexical analysis uses the `lex.py` module of Python's PLY package, which splits the input text (COOL code) into a collection of _tokens_ given a set of regular-expression rules.
+
+To obtain _string_ tokens and multi-line comments, two exclusive states were defined in the lexer in addition to _INITIAL_, the state the lexer uses by default:
+
+```python
+states = (
+    ("string", "exclusive"),
+    ("comment", "exclusive"),
+)
+```
+
+This made it possible to handle invalid characters in the first case and nested comments in the second.
+
+Auxiliary computations were also needed to obtain the column of each token, since the lexer only provides the line number and the index within the input.
+
+### Parsing
+
+The _parsing_ phase uses a modified version of the earlier LR(1) parser implementation; the modification stores the whole token instead of only its lexeme, since the token also keeps the position _(row, column)_.
+
+The grammar used is S-attributed. Its implementation can be found in [grammar.py](https://github.com/codersUP/cool-compiler-2021/blob/master/src/compiler/cmp/grammar.py)
+
+### Type collection
+
+This phase is carried out by the _Type Collector_ class, which follows these steps:
+
+- Definition of the _built-in types_, that is, the types inherent to the Cool language: _Int_, _String_, _Bool_, _IO_, _Object_, including the definition of their methods. _SELF_TYPE_ and _AUTO_TYPE_ are also added as types.
+- A pass over the declarations made in the program, collecting the types they create.
+- Checking the parent assigned to each type. Since classes can be defined in any order, the check that each class has a valid parent must be done after the types have been collected. This makes it possible to catch errors such as a type trying to inherit from one that does not exist. Classes with no explicit parent are assigned _Object_ as their parent.
+- Checking for cyclic inheritance. If a cycle is detected in the type hierarchy, the error is reported and the class where the problem was found is assigned Object as its parent, so that the analysis can continue.
+- Once the previous points have been checked, the list of class declaration nodes stored in the Program node is reordered. The reordering guarantees that for every type A that inherits from type B (B being another class defined in the program), the position of B in the list is lower than that of A. This way, when a class declaration node is visited, all the classes it descends from have already been visited.
+
+### Type building
+
+Type building is carried out by the _Type Builder_ class. It visits the _features_ of the class declarations, namely functions and attributes, so that each type contains the attributes and methods that characterize it.
+
+It also checks that the **Main** type exists with its corresponding **main** method, as COOL requires.
+
+This class also uses the _Inferencer Manager_ class, which later makes type inference possible. Every attribute, method parameter, or method return type declared as _AUTO_TYPE_ is assigned an _id_ that is handled by that manager. The id is stored in the node in question so that its information can be looked up in the manager when needed.
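The cycle check and the reordering performed by the _Type Collector_ (see the type collection steps above) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in `type_collector.py`: the names `break_cycles` and `order_classes`, the `BUILT_INS` set, and the plain `dict` mapping each class name to its parent name are all hypothetical, and parents are assumed to have been validated already.

```python
from typing import Dict, List, Tuple

# Hypothetical stand-in for the built-in types the collector defines first.
BUILT_INS = {"Object", "IO", "Int", "String", "Bool"}


def break_cycles(parent_of: Dict[str, str]) -> Tuple[Dict[str, str], List[str]]:
    """Reparent to Object every class that reaches itself through its parents."""
    fixed = dict(parent_of)
    offenders = []
    for start in parent_of:
        seen = {start}
        current = fixed[start]
        # Built-ins are not keys of `fixed`, so they end the walk.
        while current in fixed and current not in seen:
            seen.add(current)
            current = fixed[current]
        if current == start:  # walked back to the starting class: a cycle
            fixed[start] = "Object"
            offenders.append(start)
    return fixed, offenders


def order_classes(parent_of: Dict[str, str]) -> List[str]:
    """Order user classes so that every parent precedes its children.

    Assumes `parent_of` is already cycle-free and every parent exists.
    """
    ordered: List[str] = []
    placed = set(BUILT_INS)

    def place(name: str) -> None:
        if name in placed:
            return
        place(parent_of[name])  # make sure the parent is placed first
        placed.add(name)
        ordered.append(name)

    for name in parent_of:
        place(name)
    return ordered


fixed, offenders = break_cycles({"A": "B", "B": "A", "C": "Object"})
print(offenders, order_classes(fixed))
```

In this toy input `A` and `B` inherit from each other; the sketch reparents the first class on which the cycle is detected and then produces an order in which visiting a class never precedes visiting its ancestors.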
+
+### Type checking and inference
+
+First, the _Type Checker_ class is used to validate that the defined types are used correctly. It takes the _Inferencer Manager_ instance used in the _Type Builder_ and keeps assigning _ids_ to other elements of the code that can also be declared as _AUTO_TYPE_, such as the variables defined in a _Let_ expression. The variables defined in the _Scope_ store the assigned _id_; if none was assigned, the id is _None_.
+
+The _Scope_ instance created in the _Type Checker_, together with the _Inferencer Manager_ instance, is passed to the _Type Inferencer_ to perform type inference.
+
+The _Inferencer Manager_ class keeps the lists _conforms_to_, _conformed_by_ and _infered_type_. The _id_ assigned to a variable is the position in those lists where the information about that variable is stored.
+
+Let a variable with _id = i_ be declared as _AUTO_TYPE_, and let _A_ be the static type to be inferred:
+
+- `conforms_to[i]` holds a list of the types _A_ must conform to; note that this list contains at least the type _Object_. For _A_ to conform to all of these types, they must all lie on the path from _A_ to Object in the type hierarchy tree. Otherwise, _AUTO_TYPE_ was used incorrectly for this variable. Let _B_ be the type in the list that is farthest from _Object_.
+- `conformed_by[i]` stores a list of the types that must conform to _A_. Hence the lowest common ancestor (_LCA_) of those types must conform to _A_. Note that it always exists, since in the worst case it is _Object_, the root of the type tree. Let _C_ be the _LCA_ of the stored types. Note that if the list is empty (which can happen), _C_ is _None_.
+- Since _C_ conforms to _A_ and _A_ conforms to _B_, _C_ must conform to _B_. Otherwise, an incorrect use of _AUTO_TYPE_ is reported for that variable. Every type on the path between _B_ and _C_ is valid for inferring _A_, since it satisfies all the constraints imposed by the program. In our case _C_, the most restricted type, is chosen for the inference. If _C_ is _None_, _B_ is taken as the inferred type.
+- `infered_type[i]` holds the inferred type once the procedure above has been carried out; until then its value is _None_.
+
+The _Inferencer Manager_ class also provides methods to update the lists given an _id_, and to perform the inference from the stored types.
+
+The _Type Inferencer_, in turn, runs a fixed-point algorithm to carry out the inference:
+
+1. It traverses the AST (Abstract Syntax Tree), updating the lists mentioned above. When a node is visited, specifically an _ExpressionNode_, it receives as a parameter the set of types the expression must conform to, and it returns the computed static type and the set of types that conform to it. This is what allows the lists stored in the _manager_ to be updated.
+2. It infers every type it can with the information gathered.
+3. - If it inferred at least one new type, it goes back to step 1, since that type may influence the inference of others.
+   - If it inferred none, there is no more information to infer, so a final traversal assigns type _Object_ to every AUTO_TYPE that could not be inferred.
+
+> A type is considered inferable if it has not been inferred before and either its _conforms_to_ list contains a type other than Object or its _conformed_by_ list contains at least one type.
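The inference rule described above for a single _AUTO_TYPE_ variable can be sketched as follows. This is a minimal illustration and not the _Inferencer Manager_ itself: the toy hierarchy `PARENT`, the helper names, and the use of `TypeError` to signal an invalid use of _AUTO_TYPE_ are assumptions made for the example.

```python
from typing import Dict, List, Optional

# Hypothetical class hierarchy: each type maps to its parent (Object is the root).
PARENT: Dict[str, Optional[str]] = {
    "Object": None, "Int": "Object",
    "A": "Object", "B": "A", "C": "B", "D": "A",
}


def ancestors(t: str) -> List[str]:
    """Path from t up to Object, both included."""
    path = []
    current: Optional[str] = t
    while current is not None:
        path.append(current)
        current = PARENT[current]
    return path


def conforms(t: str, u: str) -> bool:
    return u in ancestors(t)


def lca(types: List[str]) -> str:
    """Lowest common ancestor of a non-empty list of types."""
    common = ancestors(types[0])
    for t in types[1:]:
        ups = set(ancestors(t))
        common = [x for x in common if x in ups]
    return common[0]  # Object is always common, so the list is never empty


def infer(conforms_to: List[str], conformed_by: List[str]) -> str:
    # B in the text: the type farthest from Object among the upper bounds.
    upper = max(conforms_to, key=lambda t: len(ancestors(t)))
    if not all(conforms(upper, t) for t in conforms_to):
        raise TypeError("AUTO_TYPE: upper bounds are not on a single path")
    if not conformed_by:  # C is None in the text, so B is used
        return upper
    lower = lca(conformed_by)  # C in the text
    if not conforms(lower, upper):
        raise TypeError("AUTO_TYPE: lower bound does not conform to upper bound")
    return lower


print(infer(["Object"], ["C", "D"]))  # LCA of C and D in the toy hierarchy
print(infer(["Object", "A"], []))     # no lower bounds: falls back to B
```

The fixed-point loop of the _Type Inferencer_ would call something like `infer` repeatedly, once per variable that has gathered enough constraints, until a full pass infers nothing new.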
+
+Finally, the _AST_ is traversed once more with the _Type Checker_ to detect again any semantic errors in the code, now with the _AUTO_TYPES_ replaced by their inferred types.
+
+### Translation from COOL to CIL
+
+A _visitor_ traverses the whole _ast_ produced in the previous stages; it receives the context, which was also built earlier, so that it has the information about the types found in the code. The main goal of this traversal is to generate another _ast_ made of CIL structures, which makes the later generation of MIPS code easier. It also generates checks that make it possible to raise errors at run time.
+
+First of all, every default COOL type is generated. For each type, a node is created containing its attributes and functions, which later allows them to be generated in MIPS. After this step, the traversal of the COOL _ast_ proper begins.
+
+This traversal generates the 3 main structures of CIL code:
+
+- the **types**, which hold a summary of the _features_ of each type declared in the code,
+- the **data**, the section containing all the "macros" used during execution,
+- the **code**, where all the functions generated by the traversal are placed.
+
+This traversal of the ast also sets up the structure needed to detect certain errors at run time. The errors checked include division by 0, a dynamic function call on a _void_ variable, and out-of-range indices in _strings_, among others. Adding these checks to the CIL ast makes the later traversal of this *ast* much simpler.
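As an illustration of how a run-time check such as the division-by-zero test could be embedded in the generated instructions, here is a small sketch. It does not use the project's `cil_ast` node classes: instructions are encoded as plain tuples, and the opcode names (in particular `ERROR`), the label scheme, and the tiny `run` interpreter are hypothetical, introduced only to make the example executable.

```python
from itertools import count
from typing import Dict, List, Tuple

Instruction = Tuple  # hypothetical flat encoding: (opcode, operands...)
_fresh = count()


def emit_checked_div(dest: str, left: str, right: str) -> List[Instruction]:
    """Emit a division guarded by a zero check on the divisor."""
    n = next(_fresh)
    cond, error, end = f"is_zero_{n}", f"div_error_{n}", f"div_end_{n}"
    return [
        ("EQUAL", cond, right, 0),    # cond = (right == 0)
        ("GOTOIF", cond, error),      # jump to the error handler if so
        ("DIV", dest, left, right),
        ("GOTO", end),
        ("LABEL", error),
        ("ERROR", "Division by zero"),
        ("LABEL", end),
    ]


def run(program: List[Instruction], env: Dict[str, int]) -> Dict[str, int]:
    """Toy interpreter for the tuple encoding above."""
    labels = {ins[1]: pc for pc, ins in enumerate(program) if ins[0] == "LABEL"}
    pc = 0
    while pc < len(program):
        op, *args = program[pc]
        if op == "EQUAL":
            target, var, literal = args
            env[target] = int(env[var] == literal)
        elif op == "GOTOIF" and env[args[0]]:
            pc = labels[args[1]]
            continue
        elif op == "DIV":
            target, a, b = args
            env[target] = env[a] // env[b]
        elif op == "GOTO":
            pc = labels[args[0]]
            continue
        elif op == "ERROR":
            raise RuntimeError(args[0])
        pc += 1
    return env


print(run(emit_checked_div("q", "x", "y"), {"x": 7, "y": 2})["q"])
```

Because the check is already part of the instruction list, a later pass only has to translate each instruction in turn and does not need to know that a division requires a guard.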
+
+For _case_, it is checked that the main expression is not _void_ and that it conforms to some branch. The algorithm that decides which branch execution continues through starts by taking the types of all the branches of the _case_; call this set $A$. For each element of $A$, the number of types within the set that conform to $a_i, i \in [1, |A|]$ is counted, which yields the pairs $\langle a_i, a_{i_c} \rangle$, where $a_{i_c}$ is defined as $|\{a_j \leq a_i, \forall j, j\in[1, |A|]\}|$. Taking the elements $a_i$ in increasing order of $a_{i_c}$ means taking first the nodes that are lowest in the type tree along each of its branches. Sorting the branches of the _case_ by increasing $a_{i_c}$ gives a list. This list is then traversed, generating for each element the subset $B_i$, where $b_{i_i} \in B_i$ if $b_{i_i} \leq a_i$. It is checked whether the type of the main expression of the _case_ appears in this subset. If it does, the case is resolved by taking the branch with type $a_i$.
+
+### Translation from CIL to MIPS
+
+To generate MIPS code, a _visitor_ was defined over the CIL _ast_ produced in the previous stage. This _visitor_ produces a new _ast_ that represents the _.DATA_ and _.TEXT_ sections and the instructions of the MIPS code. Another _visitor_, this time defined over the nodes of the MIPS _ast_, produces the MIPS code that is run by the SPIM emulator.
+
+**Object representation in memory**
+
+The main challenge at this stage is deciding how to represent type instances in memory. Objects are laid out in memory as follows:
+
+| Address x | Address x + 4 | Address x + 8 | ... | Address x + a * 4 |
+| --------- | ------------- | ------------- | --- | ----------------- |
+| Type      | Attribute $0$ | Attribute $1$ | ... | Attribute $a - 1$ |
+
+So an object is a contiguous memory region of $1 + a$ _words_ ($4 * (1 + a)$ bytes), where $a$ is the number of attributes the object has. The type and each attribute take $1$ _word_ each.
+
+The _Type_ field is a number between $0$ and $n-1$, where $n$ is the total number of types defined in the COOL program being compiled. An attribute may hold a specific value, or that value may be interpreted as the memory address of another object.
+
+To know the number of types and assign each one a value between $0$ and $n-1$, the _visitor_ over the CIL _ast_ first goes through all the types defined by the CIL code, assigning them distinct values in order as they are discovered. For each type, the names of its attributes and methods are also stored in the order in which they were defined in the type.
+
+To get or modify a specific attribute of an instance given the attribute's name, its index is looked up among the attributes stored for the type in question. If the index is $i$, its value is at memory address $(x+4) + (i * 4)$.
+
+**Initialization**
+
+When a new instance is created through the CIL instruction _ALLOCATE_, the type of the object being created is known. This information is used to initialize the instance with default values according to its type. COOL's primitive types are initialized in a specific way. For the other types, the CIL code from the previous stage generates an _init_ function for each type that takes care of this task; it is called in the CIL code and translated to MIPS afterwards.
+
+**Dynamic function call**
+
+For each type, its methods are stored in a list called _dispatch_. A _dispatch_ list of $m$ methods has the following structure:
+
+| Address x | Address x + 4 | Address x + 8 | ... | Address x + (m-1) * 4 |
+| --------- | ------------- | ------------- | --- | --------------------- |
+| Method 0  | Method 1      | Method 2      | ... | Method m-1            |
+
+There are $n$ such lists, one per type. Each cell is one word and holds the address of the first instruction of the corresponding method or, equivalently, the address of the label generated for the method.
+
+The methods in the list appear in the same order in which they were defined in the type.
+
+Given a specific _dispatch_ list, the location of the method sought is determined by a process analogous to the one explained above for attributes in object instances. If the index of the method within the type is $i$, the address of the method is stored at address $x + 4 * i$.
+
+What remains is to know which of the _dispatch_ lists to search for the method, given a type.
+
+For that there is another list, called _virtual_. Its purpose is to store, for each type, the address of its _dispatch_ list. The _virtual_ list has the following form:
+
+| Address $x$    | Address $x + 4$ | Address $x + 8$ | ... | Address $x + (n-1) * 4$ |
+| -------------- | --------------- | --------------- | --- | ----------------------- |
+| _dispatch_ $0$ | _dispatch_ $1$  | _dispatch_ $2$  | ... | _dispatch_ $n - 1$      |
+
+Recall that $n$ is the number of types.
+
+Given an instance in memory, its type can be read from the first of its contiguous addresses. Then a process analogous to the attribute and method lookups is followed. The index of the instance's type is obtained and used to decide which _dispatch_ list to search for the method to invoke. If the type index is $i$, the _dispatch_ list is looked up at position $x + 4*i$.
+
+### Project structure
+
+The *pipeline* followed by the compilation process can be seen in the [main.py](https://github.com/codersUP/cool-compiler-2021/blob/master/src/main.py) file. It uses the functionality implemented in the `compiler` package, which has the following structure:
+
+```bash
+├── cmp
+│   ├── ast.py
+│   ├── automata.py
+│   ├── cil_ast.py
+│   ├── grammar.py
+│   ├── __init__.py
+│   ├── mips_ast.py
+│   ├── pycompiler.py
+│   ├── semantic.py
+│   └── utils.py
+├── __init__.py
+├── lexer
+│   ├── __init__.py
+│   └── lex.py
+├── parser
+│   ├── __init__.py
+│   ├── parser.py
+│   └── utils.py
+└── visitors
+    ├── cil2mips
+    │   ├── cil2mips.py
+    │   ├── __init__.py
+    │   ├── mips_lib.asm
+    │   ├── mips_printer.py
+    │   └── utils.py
+    ├── cool2cil
+    │   ├── cil_formatter.py
+    │   ├── cool2cil.py
+    │   └── __init__.py
+    ├── __init__.py
+    ├── semantics_check
+    │   ├── formatter.py
+    │   ├── __init__.py
+    │   ├── type_builder.py
+    │   ├── type_checker.py
+    │   ├── type_collector.py
+    │   └── type_inferencer.py
+    ├── utils.py
+    └── visitor.py
+```
+
+For the most part, the modules in the `cmp` package were taken from the projects and material covered in third year. The `lexer` and `parser` packages define the logic for tokenizing and then parsing the input text, respectively. The `visitors` package contains the functionality for traversing the *ast* structures, which in this case are: the *visitors* that perform the semantic check, the *visitor* that translates COOL to CIL, and finally the *visitor* that translates CIL to MIPS.
+
+## License
+
+This project is licensed under the MIT License - see the [LICENSE.md](LICENSE.md) file for details.
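For reference, the address arithmetic described in the report's "Translation from CIL to MIPS" section (object layout, _dispatch_ lists and the _virtual_ list) can be simulated in a few lines. This is a sketch under stated assumptions, not code from `cil2mips.py`: memory is modeled as a Python `dict` from byte address to value, and every address and method label below is made up for the example.

```python
from typing import Dict, Union

WORD = 4  # bytes per MIPS word
Memory = Dict[int, Union[int, str]]


def attribute_address(obj: int, i: int) -> int:
    """Address of attribute i: the type tag occupies the first word."""
    return (obj + WORD) + i * WORD


def resolve_method(memory: Memory, virtual: int, obj: int, method_index: int):
    """Follow instance -> type index -> dispatch list -> method label."""
    type_index = memory[obj]                        # first word of the instance
    dispatch = memory[virtual + WORD * type_index]  # entry of the virtual list
    return memory[dispatch + WORD * method_index]   # cell of the dispatch list


memory: Memory = {
    # virtual list at address 100: one dispatch-list address per type
    100: 200, 104: 300,
    # dispatch list of type 0 and of type 1 (labels are hypothetical)
    200: "Object_abort", 204: "Object_copy",
    300: "Main_abort", 304: "Main_main",
    # an instance of type 1 at address 1000 with a single attribute
    1000: 1, 1004: 6,
}

print(resolve_method(memory, 100, 1000, 1))
print(memory[attribute_address(1000, 0)])
```

The same three loads (type tag, _virtual_ entry, _dispatch_ cell) are what a generated dynamic call would perform before jumping to the method's label.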
diff --git a/doc/report.pdf b/doc/report.pdf
new file mode 100644
index 000000000..dde73e2c9
Binary files /dev/null and b/doc/report.pdf differ
diff --git a/doc/team.yml b/doc/team.yml
index c16162532..40f244493 100644
--- a/doc/team.yml
+++ b/doc/team.yml
@@ -1,10 +1,10 @@
 members:
-  - name: Nombre Apellido1 Apellido2
-    github: github_id
-    group: CXXX
-  - name: Nombre Apellido1 Apellido2
-    github: github_id
-    group: CXXX
-  - name: Nombre Apellido1 Apellido2
-    github: github_id
-    group: CXXX
+  - name: Carmen Irene Cabrera Rodríguez
+    github: cicr99
+    group: C412
+  - name: David Guaty Domínguez
+    github: Gu4ty
+    group: C412
+  - name: Enrique Martínez González
+    github: kikeXD
+    group: C412
diff --git a/requirements.txt b/requirements.txt
index 9eb0cad1a..cba16ee2f 100644
--- a/requirements.txt
+++ b/requirements.txt
@@ -1,2 +1,3 @@
 pytest
 pytest-ordering
+ply
diff --git a/src/code.cl b/src/code.cl
new file mode 100644
index 000000000..d94ae9278
--- /dev/null
+++ b/src/code.cl
@@ -0,0 +1,34 @@
+class Main inherits IO {
+    number: Int <- 6;
+
+    main () : Object {
+        testing_fibonacci(number)
+    };
+
+    testing_fibonacci(n: Int) : IO {{
+        out_string("Iterative Fibonacci : ");
+        out_int(iterative_fibonacci(n));
+        out_string("\\n");
+
+        out_string("Recursive Fibonacci : ");
+        out_int(recursive_fibonacci(n));
+        out_string("\\n");
+    }};
+
+    recursive_fibonacci (n: AUTO_TYPE) : AUTO_TYPE {
+        if n <= 2 then 1 else recursive_fibonacci(n - 1) + recursive_fibonacci(n - 2) fi
+    };
+
+    iterative_fibonacci(n: AUTO_TYPE) : AUTO_TYPE {
+        let i: Int <- 2, n1: Int <- 1, n2: Int <- 1, temp: Int in {
+            while i < n loop
+                let temp: Int <- n2 in {
+                    n2 <- n2 + n1;
+                    n1 <- temp;
+                    i <- i + 1;
+                }
+            pool;
+            n2;
+        }
+    };
+};
\ No newline at end of file
diff --git a/src/compiler/__init__.py b/src/compiler/__init__.py
new file mode 100644
index 000000000..e69de29bb
diff --git a/src/compiler/cmp/__init__.py b/src/compiler/cmp/__init__.py
new file mode 100644
index 000000000..e69de29bb
diff --git a/src/compiler/cmp/ast.py b/src/compiler/cmp/ast.py
new file mode 100644
index 000000000..14f22b5a1
--- /dev/null
+++ b/src/compiler/cmp/ast.py
@@ -0,0 +1,252 @@
+from .semantic import Type
+from .utils import Token, emptyToken
+from typing import List, Optional, Tuple, Union
+
+
+class Node:
+    def __init__(self, token: Token):
+        self.token = token
+
+
+class DeclarationNode(Node):
+    pass
+
+
+class ExpressionNode(Node):
+    def __init__(self, token: Token, computed_type: Optional[Type] = None):
+        super().__init__(token)
+        self.computed_type = computed_type
+
+
+class FuncDeclarationNode(DeclarationNode):
+    def __init__(
+        self,
+        token: Token,
+        params: List[Tuple[Token, Token]],
+        return_type: Token,
+        body: ExpressionNode,
+    ):
+        self.id = token.lex
+        # `param` is (nameToken, typeToken)
+        self.params = params
+        self.type = return_type.lex
+        self.typeToken = return_type
+        self.body = body
+        self.token = token
+
+
+class AttrDeclarationNode(DeclarationNode):
+    def __init__(
+        self,
+        idx: Token,
+        typex: Token,
+        expr: Optional[ExpressionNode] = None,
+        token: Token = emptyToken,
+    ):
+        self.id = idx.lex
+        self.idToken = idx
+        self.type = typex.lex
+        self.typeToken = typex
+        self.expr = expr
+        self.token = token
+
+
+class ClassDeclarationNode(DeclarationNode):
+    def __init__(
+        self,
+        idx: Token,
+        features: List[Union[FuncDeclarationNode, AttrDeclarationNode]],
+        token: Token,
+        parent: Optional[Token] = None,
+    ):
+        self.id = idx.lex
+        self.tokenId = idx
+        self.token = token
+        self.parent = parent
+        self.features = features
+
+
+class ProgramNode(Node):
+    def __init__(self, declarations: List[ClassDeclarationNode]):
+        super().__init__(emptyToken)
+        self.declarations = declarations
+
+
+class AssignNode(ExpressionNode):
+    def __init__(self, idx: Token, expr: ExpressionNode, token: Token):
+        super().__init__(token)
+        self.id = idx.lex
+        self.idToken = idx
+        self.expr = expr
+
+
+class CallNode(ExpressionNode):
+    def __init__(
+        self,
+        obj: ExpressionNode,
+        idx: Token,
+        args: List[ExpressionNode],
+        cast_type: Token = emptyToken,
+    ):
+        super().__init__(idx)
+        self.obj = obj
+        self.id = idx.lex
+        self.args = args
+        self.type = cast_type.lex
+        self.typeToken = cast_type
+
+
+class CaseBranchNode(Node):
+    def __init__(self, token: Token, idx: Token, typex: Token, expr: ExpressionNode):
+        self.token = token
+        self.id = idx.lex
+        self.idToken = idx
+        self.typex = typex.lex
+        self.typexToken = typex
+        self.expression = expr
+
+
+class CaseNode(ExpressionNode):
+    def __init__(
+        self, expr: ExpressionNode, branch_list: List[CaseBranchNode], token: Token
+    ):
+        super().__init__(token)
+        self.expr = expr
+        self.branch_list = branch_list
+
+
+class BlockNode(ExpressionNode):
+    def __init__(self, expr_list: List[ExpressionNode], token: Token):
+        super().__init__(token)
+        self.expr_list = expr_list
+
+
+class LoopNode(ExpressionNode):
+    def __init__(self, cond: ExpressionNode, body: ExpressionNode, token: Token):
+        super().__init__(token)
+        self.condition = cond
+        self.body = body
+
+
+class ConditionalNode(ExpressionNode):
+    def __init__(
+        self,
+        cond: ExpressionNode,
+        then_body: ExpressionNode,
+        else_body: ExpressionNode,
+        token: Token,
+    ):
+        super().__init__(token)
+        self.condition = cond
+        self.then_body = then_body
+        self.else_body = else_body
+
+
+class LetVarNode(Node):
+    def __init__(
+        self,
+        idx: Token,
+        typex: Token,
+        expr: Optional[ExpressionNode] = None,
+        token: Token = emptyToken,
+    ):
+        self.token = token
+        self.id = idx.lex
+        self.idToken = idx
+        self.typex = typex.lex
+        self.typexToken = typex
+        self.expression = expr
+
+
+class LetNode(ExpressionNode):
+    def __init__(self, id_list: List[LetVarNode], body: ExpressionNode, token: Token):
+        super().__init__(token)
+        self.id_list = id_list
+        self.body = body
+
+
+class AtomicNode(ExpressionNode):
+    def __init__(self, token: Token):
+        super().__init__(token)
+        self.lex = token.lex
+
+
+class UnaryNode(ExpressionNode):
+    def __init__(self, expr: ExpressionNode, symbol: Token):
+        super().__init__(symbol)
+        self.expr = expr
+
+
+class BinaryNode(ExpressionNode):
+    def __init__(self, left: ExpressionNode, right: ExpressionNode, symbol: Token):
+        super().__init__(symbol)
+        self.left = left
+        self.right = right
+
+
+class ArithmeticNode(BinaryNode):
+    pass
+
+
+class ComparisonNode(BinaryNode):
+    pass
+
+
+class ConstantNumNode(AtomicNode):
+    pass
+
+
+class ConstantStringNode(AtomicNode):
+    pass
+
+
+class ConstantBoolNode(AtomicNode):
+    pass
+
+
+class VariableNode(AtomicNode):
+    pass
+
+
+class InstantiateNode(AtomicNode):
+    pass
+
+
+class PlusNode(ArithmeticNode):
+    pass
+
+
+class MinusNode(ArithmeticNode):
+    pass
+
+
+class StarNode(ArithmeticNode):
+    pass
+
+
+class DivNode(ArithmeticNode):
+    pass
+
+
+class LeqNode(ComparisonNode):
+    pass
+
+
+class LessNode(ComparisonNode):
+    pass
+
+
+class EqualNode(BinaryNode):
+    pass
+
+
+class VoidNode(UnaryNode):
+    pass
+
+
+class NotNode(UnaryNode):
+    pass
+
+
+class NegNode(UnaryNode):
+    pass
diff --git a/src/compiler/cmp/automata.py b/src/compiler/cmp/automata.py
new file mode 100644
index 000000000..da7311f90
--- /dev/null
+++ b/src/compiler/cmp/automata.py
@@ -0,0 +1,165 @@
+class State:
+    def __init__(self, state, final=False, formatter=lambda x: str(x), shape="circle"):
+        self.state = state
+        self.final = final
+        self.transitions = {}
+        self.epsilon_transitions = set()
+        self.tag = None
+        self.formatter = formatter
+        self.shape = shape
+
+    # The method name is set this way from compatibility issues.
+    def set_formatter(self, value, attr="formatter", visited=None):
+        if visited is None:
+            visited = set()
+        elif self in visited:
+            return
+
+        visited.add(self)
+        self.__setattr__(attr, value)
+        for destinations in self.transitions.values():
+            for node in destinations:
+                node.set_formatter(value, attr, visited)
+        for node in self.epsilon_transitions:
+            node.set_formatter(value, attr, visited)
+        return self
+
+    def has_transition(self, symbol):
+        return symbol in self.transitions
+
+    def add_transition(self, symbol, state):
+        try:
+            self.transitions[symbol].append(state)
+        except:
+            self.transitions[symbol] = [state]
+        return self
+
+    def add_epsilon_transition(self, state):
+        self.epsilon_transitions.add(state)
+        return self
+
+    def recognize(self, string):
+        states = self.epsilon_closure
+        for symbol in string:
+            states = self.move_by_state(symbol, *states)
+            states = self.epsilon_closure_by_state(*states)
+        return any(s.final for s in states)
+
+    def to_deterministic(self, formatter=lambda x: str(x)):
+        closure = self.epsilon_closure
+        start = State(tuple(closure), any(s.final for s in closure), formatter)
+
+        closures = [closure]
+        states = [start]
+        pending = [start]
+
+        while pending:
+            state = pending.pop()
+            symbols = {symbol for s in state.state for symbol in s.transitions}
+
+            for symbol in symbols:
+                move = self.move_by_state(symbol, *state.state)
+                closure = self.epsilon_closure_by_state(*move)
+
+                if closure not in closures:
+                    new_state = State(
+                        tuple(closure), any(s.final for s in closure), formatter
+                    )
+                    closures.append(closure)
+                    states.append(new_state)
+                    pending.append(new_state)
+                else:
+                    index = closures.index(closure)
+                    new_state = states[index]
+
+                state.add_transition(symbol, new_state)
+
+        return start
+
+    @staticmethod
+    def from_nfa(nfa, get_states=False):
+        states = []
+        for n in range(nfa.states):
+            state = State(n, n in nfa.finals)
+            states.append(state)
+
+        for (origin, symbol), destinations in nfa.map.items():
+            origin = states[origin]
+            origin[symbol] = [states[d] for d in destinations]
+
+        if get_states:
+            return states[nfa.start], states
+        return states[nfa.start]
+
+    @staticmethod
+    def move_by_state(symbol, *states):
+        return {
+            s for state in states if state.has_transition(symbol) for s in state[symbol]
+        }
+
+    @staticmethod
+    def epsilon_closure_by_state(*states):
+        closure = {state for state in states}
+
+        l = 0
+        while l != len(closure):
+            l = len(closure)
+            tmp = [s for s in closure]
+            for s in tmp:
+                for epsilon_state in s.epsilon_transitions:
+                    closure.add(epsilon_state)
+        return closure
+
+    @property
+    def epsilon_closure(self):
+        return self.epsilon_closure_by_state(self)
+
+    @property
+    def name(self):
+        return self.formatter(self.state)
+
+    def get(self, symbol):
+        target = self.transitions[symbol]
+        assert len(target) == 1
+        return target[0]
+
+    def __getitem__(self, symbol):
+        if symbol == "":
+            return self.epsilon_transitions
+        try:
+            return self.transitions[symbol]
+        except KeyError:
+            return None
+
+    def __setitem__(self, symbol, value):
+        if symbol == "":
+            self.epsilon_transitions = value
+        else:
+            self.transitions[symbol] = value
+
+    def __repr__(self):
+        return str(self)
+
+    def __str__(self):
+        return str(self.state)
+
+    def __hash__(self):
+        return hash(self.state)
+
+    def __iter__(self):
+        yield from self._visit()
+
+    def _visit(self, visited=None):
+        if visited is None:
+            visited = set()
+        elif self in visited:
+            return
+
+        visited.add(self)
+        yield self
+
+        for destinations in self.transitions.values():
+            for node in destinations:
+                yield from node._visit(visited)
+        for node in self.epsilon_transitions:
+            yield from node._visit(visited)
diff --git a/src/compiler/cmp/cil_ast.py b/src/compiler/cmp/cil_ast.py
new file mode 100644
index 000000000..9db2514cb
--- /dev/null
+++ b/src/compiler/cmp/cil_ast.py
@@ -0,0 +1,256 @@
+class Node:
+    pass
+
+
+class ProgramNode(Node):
+    def __init__(self, dottypes, dotdata, dotcode):
+        self.dottypes = dottypes
+        self.dotdata = dotdata
+        self.dotcode = dotcode
+
+
+class TypeNode(Node):
+    def __init__(self, name):
+        self.name = name
+        self.attributes = []
+        self.methods = []
+
+
+class DataNode(Node):
+    def __init__(self, vname, value):
+        self.name = vname
+        self.value = value
+
+
+class FunctionNode(Node):
+    def __init__(self, fname, params, localvars, instructions):
+        self.name = fname
+        self.params = params
+        self.localvars = localvars
+        self.instructions = instructions
+        self.ids = dict()
+        self.labels_count = 0
+
+
+class ParamNode(Node):
+    def __init__(self, name):
+        self.name = name
+
+
+class LocalNode(Node):
+    def __init__(self, name):
+        self.name = name
+
+
+class InstructionNode(Node):
+    pass
+
+
+class AssignNode(InstructionNode):
+    def __init__(self, dest, source):
+        self.dest = dest
+        self.source = source
+
+
+class ArithmeticNode(InstructionNode):
+    def __init__(self, dest, left, right):
+        self.dest = dest
+        self.left = left
+        self.right = right
+
+
+class PlusNode(ArithmeticNode):
+    pass
+
+
+class MinusNode(ArithmeticNode):
+    pass
+
+
+class StarNode(ArithmeticNode):
+    pass
+
+
+class DivNode(ArithmeticNode):
+    pass
+
+
+class LeqNode(ArithmeticNode):
+    pass
+
+
+class LessNode(ArithmeticNode):
+    pass
+
+
+class EqualNode(ArithmeticNode):
+    pass
+
+
+class EqualStrNode(EqualNode):
+    pass
+
+
+class GetAttribNode(InstructionNode):
+    def __init__(self, dest, obj, attr, computed_type):
+        self.dest = dest
+        self.obj = obj
+        self.attr = attr
+        self.computed_type = computed_type
+
+
+class SetAttribNode(InstructionNode):
+    def __init__(self, obj, attr, value, computed_type):
+        self.obj = obj
+        self.attr = attr
+        self.value = value
+        self.computed_type = computed_type
+
+
+class GetIndexNode(InstructionNode):
+    pass
+
+
+class SetIndexNode(InstructionNode):
+    pass
+
+
+class AllocateNode(InstructionNode):
+    def __init__(self, itype, dest):
+        self.type = itype
+        self.dest = dest
+
+
+class ArrayNode(InstructionNode):
+    pass
+
+
+class TypeOfNode(InstructionNode):
+    def __init__(self, obj, dest):
+        self.obj = obj
+        self.dest = dest
+
+
+class LabelNode(InstructionNode):
+    def __init__(self, label):
+        self.label = label
+
+
+class GotoNode(InstructionNode):
+    def __init__(self, label):
+        self.label = label
+
+
+class GotoIfNode(InstructionNode):
+    def __init__(self, condition, label):
+        self.condition = condition
+        self.label = label
+
+
+class StaticCallNode(InstructionNode):
+    def __init__(self, function, dest):
+        self.function = function
+        self.dest = dest
+
+
+class DynamicCallNode(InstructionNode):
+    def __init__(self, xtype, method, dest, computed_type):
+        self.type = xtype
+        self.method = method
+        self.dest = dest
+        self.computed_type = computed_type
+
+
+class ArgNode(InstructionNode):
+    def __init__(self, name):
+        self.name = name
+
+
+class ReturnNode(InstructionNode):
+    def __init__(self, value=None):
+        self.value = value
+
+
+class LoadNode(InstructionNode):
+    def __init__(self, dest, msg):
+        self.dest = dest
+        self.msg = msg
+
+
+class ExitNode(InstructionNode):
+    pass
+
+
+class TypeNameNode(InstructionNode):
+    def __init__(self, dest, source):
+        self.dest = dest
+        self.source = source
+
+
+class NameNode(InstructionNode):
+    def __init__(self, dest, name):
+        self.dest = dest
+        self.name = name
+
+
+class CopyNode(InstructionNode):
+    def __init__(self, dest, source):
+        self.dest = dest
+        self.source = source
+
+
+class LengthNode(InstructionNode):
+    def __init__(self, dest, source):
+        self.dest = dest
+        self.source = source
+
+
+class ConcatNode(InstructionNode):
+    def __init__(self, dest, prefix, suffix, length):
+        self.dest = dest
+        self.prefix = prefix
+        self.suffix = suffix
+        self.length = length
+
+
+class SubstringNode(InstructionNode):
+    def __init__(self, dest, str_value, index, length):
+        self.dest = dest
+        self.str_value = str_value
+        self.index = index
+        self.length = length
+
+
+class ReadStrNode(InstructionNode):
+    def __init__(self, dest):
+        self.dest = dest
+
+
+class ReadIntNode(InstructionNode):
def __init__(self, dest): + self.dest = dest + + +class PrintStrNode(InstructionNode): + def __init__(self, value): + self.value = value + + +class PrintIntNode(InstructionNode): + def __init__(self, value): + self.value = value + + +class ComplementNode(InstructionNode): + def __init__(self, dest, obj): + self.dest = dest + self.obj = obj + + +class VoidNode(InstructionNode): + pass + + +class ErrorNode(InstructionNode): + def __init__(self, data_node): + self.data_node = data_node diff --git a/src/compiler/cmp/grammar.py b/src/compiler/cmp/grammar.py new file mode 100644 index 000000000..ce13a58ba --- /dev/null +++ b/src/compiler/cmp/grammar.py @@ -0,0 +1,182 @@ +from .pycompiler import Grammar +from .ast import * +from .utils import selfToken + +# grammar +G = Grammar() + + +# non-terminals +program = G.NonTerminal("", startSymbol=True) +class_list, def_class = G.NonTerminals(" ") +feature_list, def_attr, def_func = G.NonTerminals( + " " +) +param_list, param, expr_list = G.NonTerminals(" ") +expr, comp, arith, term, factor, atom = G.NonTerminals( + " " +) +s_comp, s_arith, s_term, s_factor = G.NonTerminals( + " " +) +func_call, arg_list, args = G.NonTerminals(" ") +case_def, block_def, loop_def, cond_def, let_def, assign_def = G.NonTerminals( + " " +) +branch_list, branch = G.NonTerminals(" ") +iden_list, iden = G.NonTerminals(" ") + + +# terminals +classx, inherits = G.Terminals("class inherits") +let, inx = G.Terminals("let in") +case, of, esac = G.Terminals("case of esac") +whilex, loop, pool = G.Terminals("while loop pool") +ifx, then, elsex, fi = G.Terminals("if then else fi") +isvoid, notx = G.Terminals("isvoid not") +semi, colon, comma, dot, opar, cpar, ocur, ccur, larrow, rarrow, at = G.Terminals( + "; : , . 
( ) { } <- => @" +) +equal, plus, minus, star, div, less, leq, neg = G.Terminals("= + - * / < <= ~") +typeid, objectid, num, stringx, boolx, new = G.Terminals( + "typeid objectid int string bool new" +) + + +# productions + +program %= class_list, lambda h, s: ProgramNode(s[1]) + +class_list %= def_class, lambda h, s: [s[1]] +class_list %= def_class + class_list, lambda h, s: [s[1]] + s[2] + +def_class %= ( + classx + typeid + ocur + feature_list + ccur + semi, + lambda h, s: ClassDeclarationNode(s[2], s[4], s[1]), +) +def_class %= ( + classx + typeid + inherits + typeid + ocur + feature_list + ccur + semi, + lambda h, s: ClassDeclarationNode(s[2], s[6], s[1], s[4]), +) + +feature_list %= G.Epsilon, lambda h, s: [] +feature_list %= def_attr + feature_list, lambda h, s: [s[1]] + s[2] +feature_list %= def_func + feature_list, lambda h, s: [s[1]] + s[2] + +def_attr %= objectid + colon + typeid + semi, lambda h, s: AttrDeclarationNode( + s[1], s[3] +) +def_attr %= ( + objectid + colon + typeid + larrow + expr + semi, + lambda h, s: AttrDeclarationNode(s[1], s[3], s[5], s[4]), +) + +def_func %= ( + objectid + opar + cpar + colon + typeid + ocur + expr + ccur + semi, + lambda h, s: FuncDeclarationNode(s[1], [], s[5], s[7]), +) +def_func %= ( + objectid + opar + param_list + cpar + colon + typeid + ocur + expr + ccur + semi, + lambda h, s: FuncDeclarationNode(s[1], s[3], s[6], s[8]), +) + +param_list %= param, lambda h, s: [s[1]] +param_list %= param + comma + param_list, lambda h, s: [s[1]] + s[3] +param %= objectid + colon + typeid, lambda h, s: (s[1], s[3]) + +expr %= comp, lambda h, s: s[1] +expr %= s_comp, lambda h, s: s[1] + +comp %= arith, lambda h, s: s[1] +comp %= arith + leq + arith, lambda h, s: LeqNode(s[1], s[3], s[2]) +comp %= arith + less + arith, lambda h, s: LessNode(s[1], s[3], s[2]) +comp %= arith + equal + arith, lambda h, s: EqualNode(s[1], s[3], s[2]) + +arith %= term, lambda h, s: s[1] +arith %= arith + plus + term, lambda h, s: PlusNode(s[1], s[3], 
s[2]) +arith %= arith + minus + term, lambda h, s: MinusNode(s[1], s[3], s[2]) + +term %= factor, lambda h, s: s[1] +term %= term + star + factor, lambda h, s: StarNode(s[1], s[3], s[2]) +term %= term + div + factor, lambda h, s: DivNode(s[1], s[3], s[2]) + +factor %= atom, lambda h, s: s[1] +factor %= opar + expr + cpar, lambda h, s: s[2] +# factor %= isvoid + factor, lambda h, s: VoidNode(s[2], s[1]) +# factor %= neg + factor, lambda h, s: NegNode(s[2], s[1]) +factor %= func_call, lambda h, s: s[1] +factor %= case_def, lambda h, s: s[1] +factor %= block_def, lambda h, s: s[1] +factor %= loop_def, lambda h, s: s[1] +factor %= cond_def, lambda h, s: s[1] + +atom %= num, lambda h, s: ConstantNumNode(s[1]) +atom %= stringx, lambda h, s: ConstantStringNode(s[1]) +atom %= boolx, lambda h, s: ConstantBoolNode(s[1]) +atom %= objectid, lambda h, s: VariableNode(s[1]) +atom %= new + typeid, lambda h, s: InstantiateNode(s[2]) + +func_call %= objectid + opar + arg_list + cpar, lambda h, s: CallNode( + VariableNode(selfToken), s[1], s[3] +) +func_call %= factor + dot + objectid + opar + arg_list + cpar, lambda h, s: CallNode( + s[1], s[3], s[5] +) +func_call %= ( + factor + at + typeid + dot + objectid + opar + arg_list + cpar, + lambda h, s: CallNode(s[1], s[5], s[7], s[3]), +) + +arg_list %= G.Epsilon, lambda h, s: [] +arg_list %= args, lambda h, s: s[1] +args %= expr, lambda h, s: [s[1]] +args %= expr + comma + args, lambda h, s: [s[1]] + s[3] + +case_def %= case + expr + of + branch_list + esac, lambda h, s: CaseNode( + s[2], s[4], s[1] +) +branch_list %= branch, lambda h, s: [s[1]] +branch_list %= branch + branch_list, lambda h, s: [s[1]] + s[2] +branch %= objectid + colon + typeid + rarrow + expr + semi, lambda h, s: CaseBranchNode( + s[4], s[1], s[3], s[5] +) + +block_def %= ocur + expr_list + ccur, lambda h, s: BlockNode(s[2], s[1]) +expr_list %= expr + semi, lambda h, s: [s[1]] +expr_list %= expr + semi + expr_list, lambda h, s: [s[1]] + s[3] + +loop_def %= whilex + 
expr + loop + expr + pool, lambda h, s: LoopNode(s[2], s[4], s[1]) + +cond_def %= ifx + expr + then + expr + elsex + expr + fi, lambda h, s: ConditionalNode( + s[2], s[4], s[6], s[1] +) + +s_comp %= s_arith, lambda h, s: s[1] +s_comp %= arith + leq + s_arith, lambda h, s: LeqNode(s[1], s[3], s[2]) +s_comp %= arith + less + s_arith, lambda h, s: LessNode(s[1], s[3], s[2]) +s_comp %= arith + equal + s_arith, lambda h, s: EqualNode(s[1], s[3], s[2]) + +s_arith %= s_term, lambda h, s: s[1] +s_arith %= arith + plus + s_term, lambda h, s: PlusNode(s[1], s[3], s[2]) +s_arith %= arith + minus + s_term, lambda h, s: MinusNode(s[1], s[3], s[2]) + +s_term %= s_factor, lambda h, s: s[1] +s_term %= term + star + s_factor, lambda h, s: StarNode(s[1], s[3], s[2]) +s_term %= term + div + s_factor, lambda h, s: DivNode(s[1], s[3], s[2]) + +s_factor %= notx + expr, lambda h, s: NotNode(s[2], s[1]) +s_factor %= let_def, lambda h, s: s[1] +s_factor %= assign_def, lambda h, s: s[1] +s_factor %= isvoid + s_factor, lambda h, s: VoidNode(s[2], s[1]) +s_factor %= neg + s_factor, lambda h, s: NegNode(s[2], s[1]) +s_factor %= factor, lambda h, s: s[1] + +let_def %= let + iden_list + inx + expr, lambda h, s: LetNode(s[2], s[4], s[1]) +iden_list %= iden, lambda h, s: [s[1]] +iden_list %= iden + comma + iden_list, lambda h, s: [s[1]] + s[3] +iden %= objectid + colon + typeid, lambda h, s: LetVarNode(s[1], s[3]) +iden %= objectid + colon + typeid + larrow + expr, lambda h, s: LetVarNode( + s[1], s[3], s[5], s[4] +) + +assign_def %= objectid + larrow + expr, lambda h, s: AssignNode(s[1], s[3], s[2]) diff --git a/src/compiler/cmp/mips_ast.py b/src/compiler/cmp/mips_ast.py new file mode 100644 index 000000000..a6e72f036 --- /dev/null +++ b/src/compiler/cmp/mips_ast.py @@ -0,0 +1,235 @@ +# ***********************Registers*********************** +class Register: + def __init__(self, name): + self.name = name + + +FP = Register("fp") +SP = Register("sp") +RA = Register("ra") +V0 = Register("v0") +RA = 
Register("ra") +A0 = Register("a0") +A1 = Register("a1") +A2 = Register("a2") +A3 = Register("a3") +ZERO = Register("zero") +T0 = Register("t0") +T1 = Register("t1") +T2 = Register("t2") + +# ***********************Registers*********************** + +# ***********************Utils*********************** + +MAIN_FUNCTION_NAME = "main" +VIRTUAL_TABLE = "virtual_table" +TYPE_LIST = "type_list" + + +def push_to_stack(register: Register): + update_sp = AddInmediateNode(SP, SP, -4) + offset = RegisterRelativeLocation(SP, 0) + store_word = StoreWordNode(register, offset) + return [update_sp, store_word] + + +def pop_from_stack(register: Register): + load_word = LoadWordNode(register, RegisterRelativeLocation(SP, 0)) + update_sp = AddInmediateNode(SP, SP, 4) + return [load_word, update_sp] + + +def exit_program(): + instructions = [] + instructions.append(LoadInmediateNode(V0, 10)) + instructions.append(SyscallNode()) + return instructions + + +# ***********************Utils*********************** + + +# ***********************AST*********************** + + +class Node: + pass + + +class ProgramNode(Node): + def __init__(self, types, data, text): + self.types = types + self.data = data + self.text = text + + +class StringConst(Node): + def __init__(self, label, string): + self.label = label + self.string = string + + +class FunctionNode(Node): + def __init__(self, label, params, localvars): + self.label = label + self.params = params + self.localvars = localvars + self.instructions = [] + + +class TypeNode(Node): + def __init__(self, data_label, type_label, attributes, methods, pos): + self.data_label = data_label + self.type_label = type_label + self.attributes = attributes + self.methods = methods + self.pos = pos + + +class InstructionNode(Node): + pass + + +class LabelNode(InstructionNode): + def __init__(self, label): + self.label = label + + +class MoveNode(InstructionNode): + def __init__(self, reg1, reg2): + self.reg1 = reg1 + self.reg2 = reg2 + + +class 
LoadInmediateNode(InstructionNode): + def __init__(self, reg, value): + self.reg = reg + self.value = value + + +class LoadWordNode(InstructionNode): + def __init__(self, reg, addr): + self.reg = reg + self.addr = addr + + +class SyscallNode(InstructionNode): + pass + + +class LoadAddressNode(InstructionNode): + def __init__(self, reg, label): + self.reg = reg + self.label = label + + +class StoreWordNode(InstructionNode): + def __init__(self, reg, addr): + self.reg = reg + self.addr = addr + + +class JumpAndLinkNode(InstructionNode): + def __init__(self, label): + self.label = label + + +class JumpRegisterAndLinkNode(InstructionNode): + def __init__(self, reg): + self.reg = reg + + +class JumpRegister(InstructionNode): + def __init__(self, reg): + self.reg = reg + + +class AddInmediateNode(InstructionNode): + def __init__(self, dest, src, value): + self.dest = dest + self.src = src + self.constant_number = value + + +class AddInmediateUnsignedNode(InstructionNode): + def __init__(self, dest, src, value): + self.dest = dest + self.src = src + self.value = value + + +class AddUnsignedNode(InstructionNode): + def __init__(self, dest, sum1, sum2): + self.dest = dest + self.sum1 = sum1 + self.sum2 = sum2 + + +class ShiftLeftLogicalNode(InstructionNode): + def __init__(self, dest, src, bits): + self.dest = dest + self.src = src + self.bits = bits + + +class BranchOnNotEqualNode(InstructionNode): + def __init__(self, reg1, reg2, label): + self.reg1 = reg1 + self.reg2 = reg2 + self.label = label + + +class JumpNode(InstructionNode): + def __init__(self, label): + self.label = label + + +class AddNode(InstructionNode): + def __init__(self, reg1, reg2, reg3): + self.reg1 = reg1 + self.reg2 = reg2 + self.reg3 = reg3 + + +class SubNode(InstructionNode): + def __init__(self, reg1, reg2, reg3): + self.reg1 = reg1 + self.reg2 = reg2 + self.reg3 = reg3 + + +class MultiplyNode(InstructionNode): + def __init__(self, reg1, reg2, reg3): + self.reg1 = reg1 + self.reg2 = reg2 + 
self.reg3 = reg3 + + +class DivideNode(InstructionNode): + def __init__(self, reg1, reg2): + self.reg1 = reg1 + self.reg2 = reg2 + + +class ComplementNode(InstructionNode): + def __init__(self, reg1, reg2): + self.reg1 = reg1 + self.reg2 = reg2 + + +class MoveFromLowNode(InstructionNode): + def __init__(self, reg): + self.reg = reg + + +class RegisterRelativeLocation: + def __init__(self, register, offset): + self.register = register + self.offset = offset + + +class LabelRelativeLocation: + def __init__(self, label, offset): + self.label = label + self.offset = offset diff --git a/src/compiler/cmp/pycompiler.py b/src/compiler/cmp/pycompiler.py new file mode 100644 index 000000000..9d1100913 --- /dev/null +++ b/src/compiler/cmp/pycompiler.py @@ -0,0 +1,513 @@ +import json + + +class Symbol(object): + def __init__(self, name, grammar): + self.Name = name + self.Grammar = grammar + + def __str__(self): + return self.Name + + def __repr__(self): + return repr(self.Name) + + def __add__(self, other): + if isinstance(other, Symbol): + return Sentence(self, other) + + raise TypeError(other) + + def __or__(self, other): + + if isinstance(other, (Sentence)): + return SentenceList(Sentence(self), other) + + raise TypeError(other) + + @property + def IsEpsilon(self): + return False + + def __len__(self): + return 1 + + +class NonTerminal(Symbol): + def __init__(self, name, grammar): + super().__init__(name, grammar) + self.productions = [] + + def __imod__(self, other): + + if isinstance(other, (Sentence)): + p = Production(self, other) + self.Grammar.Add_Production(p) + return self + + if isinstance(other, tuple): + assert len(other) > 1 + + if len(other) == 2: + other += (None,) * len(other[0]) + + assert ( + len(other) == len(other[0]) + 2 + ), "Exactly one rule must be defined for each symbol of the production" + # assert len(other) == 2, "Must be a tuple of 2 elements (sentence, attribute)" + + if isinstance(other[0], Symbol) or isinstance(other[0], 
Sentence): + p = AttributeProduction(self, other[0], other[1:]) + else: + raise Exception("") + + self.Grammar.Add_Production(p) + return self + + if isinstance(other, Symbol): + p = Production(self, Sentence(other)) + self.Grammar.Add_Production(p) + return self + + if isinstance(other, SentenceList): + + for s in other: + p = Production(self, s) + self.Grammar.Add_Production(p) + + return self + + raise TypeError(other) + + @property + def IsTerminal(self): + return False + + @property + def IsNonTerminal(self): + return True + + @property + def IsEpsilon(self): + return False + + +class Terminal(Symbol): + def __init__(self, name, grammar): + super().__init__(name, grammar) + + @property + def IsTerminal(self): + return True + + @property + def IsNonTerminal(self): + return False + + @property + def IsEpsilon(self): + return False + + +class EOF(Terminal): + def __init__(self, Grammar): + super().__init__("$", Grammar) + + +class Sentence(object): + def __init__(self, *args): + self._symbols = tuple(x for x in args if not x.IsEpsilon) + self.hash = hash(self._symbols) + + def __len__(self): + return len(self._symbols) + + def __add__(self, other): + if isinstance(other, Symbol): + return Sentence(*(self._symbols + (other,))) + + if isinstance(other, Sentence): + return Sentence(*(self._symbols + other._symbols)) + + raise TypeError(other) + + def __or__(self, other): + if isinstance(other, Sentence): + return SentenceList(self, other) + + if isinstance(other, Symbol): + return SentenceList(self, Sentence(other)) + + raise TypeError(other) + + def __repr__(self): + return str(self) + + def __str__(self): + return ("%s " * len(self._symbols) % tuple(self._symbols)).strip() + + def __iter__(self): + return iter(self._symbols) + + def __getitem__(self, index): + return self._symbols[index] + + def __eq__(self, other): + return self._symbols == other._symbols + + def __hash__(self): + return self.hash + + @property + def IsEpsilon(self): + return False + + +class 
SentenceList(object): + def __init__(self, *args): + self._sentences = list(args) + + def Add(self, symbol): + if not symbol and (symbol is None or not symbol.IsEpsilon): + raise ValueError(symbol) + + self._sentences.append(symbol) + + def __iter__(self): + return iter(self._sentences) + + def __or__(self, other): + if isinstance(other, Sentence): + self.Add(other) + return self + + if isinstance(other, Symbol): + return self | Sentence(other) + + +class Epsilon(Terminal, Sentence): + def __init__(self, grammar): + super().__init__("epsilon", grammar) + + def __str__(self): + return "e" + + def __repr__(self): + return "epsilon" + + def __iter__(self): + yield from () + + def __len__(self): + return 0 + + def __add__(self, other): + return other + + def __eq__(self, other): + return isinstance(other, (Epsilon,)) + + def __hash__(self): + return hash("") + + @property + def IsEpsilon(self): + return True + + +class Production(object): + def __init__(self, nonTerminal, sentence): + + self.Left = nonTerminal + self.Right = sentence + + def __str__(self): + + return "%s := %s" % (self.Left, self.Right) + + def __repr__(self): + return "%s -> %s" % (self.Left, self.Right) + + def __iter__(self): + yield self.Left + yield self.Right + + def __eq__(self, other): + return ( + isinstance(other, Production) + and self.Left == other.Left + and self.Right == other.Right + ) + + def __hash__(self): + return hash((self.Left, self.Right)) + + @property + def IsEpsilon(self): + return self.Right.IsEpsilon + + +class AttributeProduction(Production): + def __init__(self, nonTerminal, sentence, attributes): + if not isinstance(sentence, Sentence) and isinstance(sentence, Symbol): + sentence = Sentence(sentence) + super(AttributeProduction, self).__init__(nonTerminal, sentence) + + self.attributes = attributes + + def __str__(self): + return "%s := %s" % (self.Left, self.Right) + + def __repr__(self): + return "%s -> %s" % (self.Left, self.Right) + + def __iter__(self): + yield 
self.Left + yield self.Right + + @property + def IsEpsilon(self): + return self.Right.IsEpsilon + + # "sintetizar" in English (synthesize?); name pending agreement + def syntetice(self): + pass + + +class Grammar: + def __init__(self): + + self.Productions = [] + self.nonTerminals = [] + self.terminals = [] + self.startSymbol = None + # production type + self.pType = None + self.Epsilon = Epsilon(self) + self.EOF = EOF(self) + + self.symbDict = {"$": self.EOF} + + def NonTerminal(self, name, startSymbol=False): + + name = name.strip() + if not name: + raise Exception("Empty name") + + term = NonTerminal(name, self) + + if startSymbol: + + if self.startSymbol is None: + self.startSymbol = term + else: + raise Exception("Cannot define more than one start symbol.") + + self.nonTerminals.append(term) + self.symbDict[name] = term + return term + + def NonTerminals(self, names): + + ans = tuple((self.NonTerminal(x) for x in names.strip().split())) + + return ans + + def Add_Production(self, production): + + if len(self.Productions) == 0: + self.pType = type(production) + + assert type(production) == self.pType, "The productions must all be of a single type." 
+ + production.Left.productions.append(production) + self.Productions.append(production) + + def Terminal(self, name): + + name = name.strip() + if not name: + raise Exception("Empty name") + + term = Terminal(name, self) + self.terminals.append(term) + self.symbDict[name] = term + return term + + def Terminals(self, names): + + ans = tuple((self.Terminal(x) for x in names.strip().split())) + + return ans + + def __str__(self): + + mul = "%s, " + + ans = "Non-Terminals:\n\t" + + nonterminals = mul * (len(self.nonTerminals) - 1) + "%s\n" + + ans += nonterminals % tuple(self.nonTerminals) + + ans += "Terminals:\n\t" + + terminals = mul * (len(self.terminals) - 1) + "%s\n" + + ans += terminals % tuple(self.terminals) + + ans += "Productions:\n\t" + + ans += str(self.Productions) + + return ans + + def __getitem__(self, name): + try: + return self.symbDict[name] + except KeyError: + return None + + @property + def to_json(self): + + productions = [] + + for p in self.Productions: + head = p.Left.Name + + body = [] + + for s in p.Right: + body.append(s.Name) + + productions.append({"Head": head, "Body": body}) + + d = { + "NonTerminals": [symb.Name for symb in self.nonTerminals], + "Terminals": [symb.Name for symb in self.terminals], + "Productions": productions, + } + + # [{'Head':p.Left.Name, "Body": [s.Name for s in p.Right]} for p in self.Productions] + return json.dumps(d) + + @staticmethod + def from_json(data): + data = json.loads(data) + + G = Grammar() + dic = {"epsilon": G.Epsilon} + + for term in data["Terminals"]: + dic[term] = G.Terminal(term) + + for noTerm in data["NonTerminals"]: + dic[noTerm] = G.NonTerminal(noTerm) + + for p in data["Productions"]: + head = p["Head"] + dic[head] %= Sentence(*[dic[term] for term in p["Body"]]) + + return G + + def copy(self): + G = Grammar() + G.Productions = self.Productions.copy() + G.nonTerminals = self.nonTerminals.copy() + G.terminals = self.terminals.copy() + G.pType = self.pType + G.startSymbol = self.startSymbol 
+ G.Epsilon = self.Epsilon + G.EOF = self.EOF + G.symbDict = self.symbDict.copy() + + return G + + @property + def IsAugmentedGrammar(self): + augmented = 0 + for left, right in self.Productions: + if self.startSymbol == left: + augmented += 1 + if augmented <= 1: + return True + else: + return False + + def AugmentedGrammar(self, force=False): + if not self.IsAugmentedGrammar or force: + + G = self.copy() + # S, self.startSymbol, SS = self.startSymbol, None, self.NonTerminal('S\'', True) + S = G.startSymbol + G.startSymbol = None + SS = G.NonTerminal("S'", True) + if G.pType is AttributeProduction: + SS %= S + G.Epsilon, lambda x: x + else: + SS %= S + G.Epsilon + + return G + else: + return self.copy() + + # endchange + + +class Item: + def __init__(self, production, pos, lookaheads=[]): + self.production = production + self.pos = pos + self.lookaheads = frozenset(look for look in lookaheads) + + def __str__(self): + s = str(self.production.Left) + " -> " + if len(self.production.Right) > 0: + for i, c in enumerate(self.production.Right): + if i == self.pos: + s += "." + s += str(self.production.Right[i]) + if self.pos == len(self.production.Right): + s += "." + else: + s += "." 
+ s += ", " + str(self.lookaheads)[10:-1] + return s + + def __repr__(self): + return str(self) + + def __eq__(self, other): + return ( + (self.pos == other.pos) + and (self.production == other.production) + and (set(self.lookaheads) == set(other.lookaheads)) + ) + + def __hash__(self): + return hash((self.production, self.pos, self.lookaheads)) + + @property + def IsReduceItem(self): + return len(self.production.Right) == self.pos + + @property + def NextSymbol(self): + if self.pos < len(self.production.Right): + return self.production.Right[self.pos] + else: + return None + + def NextItem(self): + if self.pos < len(self.production.Right): + return Item(self.production, self.pos + 1, self.lookaheads) + else: + return None + + def Preview(self, skip=1): + unseen = self.production.Right[self.pos + skip :] + return [unseen + (lookahead,) for lookahead in self.lookaheads] + + def Center(self): + return Item(self.production, self.pos) diff --git a/src/compiler/cmp/semantic.py b/src/compiler/cmp/semantic.py new file mode 100644 index 000000000..fa62a3023 --- /dev/null +++ b/src/compiler/cmp/semantic.py @@ -0,0 +1,490 @@ +import itertools as itt +from collections import OrderedDict +from typing import Dict, List, Optional, Tuple + + +class SemanticError(Exception): + @property + def text(self): + return self.args[0] + + +class Attribute: + def __init__(self, name, typex, idx=None): + self.name = name + self.type = typex + self.idx = idx + + def __str__(self): + return f"[attrib] {self.name} : {self.type.name};" + + def __repr__(self): + return str(self) + + +class Method: + def __init__( + self, name, param_names, param_types, return_type, param_idx, ridx=None + ): + self.name = name + self.param_names = param_names + self.param_types = param_types + self.param_idx = param_idx + self.return_type = return_type + self.ridx = ridx + + def __str__(self): + params = ", ".join( + f"{n}:{t.name}" for n, t in zip(self.param_names, self.param_types) + ) + return f"[method] 
{self.name}({params}): {self.return_type.name};" + + def __eq__(self, other): + return ( + other.name == self.name + and other.return_type == self.return_type + and other.param_types == self.param_types + ) + + +class Type: + def __init__(self, name: str, pos: Tuple[int, int] = (0, 0)): + self.name: str = name + self.attributes: List[Attribute] = [] + self.methods: List[Method] = [] + self.parent: Optional[Type] = None + self.pos: Tuple[int, int] = pos + + def set_parent(self, parent): + if self.parent is not None: + raise SemanticError(f"Parent type is already set for {self.name}.") + self.parent = parent + + def get_attribute(self, name: str): + try: + return next(attr for attr in self.attributes if attr.name == name) + except StopIteration: + if self.parent is None: + raise SemanticError( + f'Attribute "{name}" is not defined in {self.name}.' + ) + try: + return self.parent.get_attribute(name) + except SemanticError: + raise SemanticError( + f'Attribute "{name}" is not defined in {self.name}.' + ) + + def define_attribute(self, name: str, typex, idx=None) -> Attribute: + try: + self.get_attribute(name) + except SemanticError: + attribute = Attribute(name, typex, idx) + self.attributes.append(attribute) + return attribute + else: + raise SemanticError( + f'Attribute "{name}" is already defined in {self.name}.' 
+ ) + + def get_method(self, name: str) -> Method: + try: + return next(method for method in self.methods if method.name == name) + except StopIteration: + if self.parent is None: + raise SemanticError(f'Method "{name}" is not defined in {self.name}.') + try: + return self.parent.get_method(name) + except SemanticError: + raise SemanticError(f'Method "{name}" is not defined in {self.name}.') + + def define_method( + self, + name: str, + param_names: list, + param_types: list, + return_type, + param_idx: list, + ridx=None, + ): + if name in (method.name for method in self.methods): + raise SemanticError(f'Method "{name}" is multiply defined.') + + method = Method(name, param_names, param_types, return_type, param_idx, ridx) + self.methods.append(method) + return method + + def all_attributes(self, clean=True): + plain = ( + OrderedDict() if self.parent is None else self.parent.all_attributes(False) + ) + for attr in self.attributes: + plain[attr.name] = (attr, self) + return plain.values() if clean else plain + + def all_methods(self, clean=True): + plain = OrderedDict() if self.parent is None else self.parent.all_methods(False) + for method in self.methods: + plain[method.name] = (method, self) + return plain.values() if clean else plain + + def update_attr(self, attr_name, attr_type): + for i, item in enumerate(self.attributes): + if item.name == attr_name: + self.attributes[i] = Attribute(attr_name, attr_type) + break + + def update_method_rtype(self, method_name, rtype): + for i, item in enumerate(self.methods): + if item.name == method_name: + self.methods[i].return_type = rtype + break + + def update_method_param(self, method_name, param_type, param_idx): + for i, item in enumerate(self.methods): + if item.name == method_name: + self.methods[i].param_types[param_idx] = param_type + break + + def conforms_to(self, other): + return ( + other.bypass() + or self.name == other.name + or self.parent is not None + and self.parent.conforms_to(other) + ) + + def 
bypass(self): + return False + + def can_be_inherited(self): + return True + + def __str__(self): + output = f"type {self.name}" + parent = "" if self.parent is None else f" : {self.parent.name}" + output += parent + output += " {" + output += "\n\t" if self.attributes or self.methods else "" + output += "\n\t".join(str(x) for x in self.attributes) + output += "\n\t" if self.attributes else "" + output += "\n\t".join(str(x) for x in self.methods) + output += "\n" if self.methods else "" + output += "}\n" + return output + + def __repr__(self): + return str(self) + + def __eq__(self, other): + return self.conforms_to(other) and other.conforms_to(self) + + +class ErrorType(Type): + def __init__(self): + Type.__init__(self, "") + + def conforms_to(self, other): + return True + + def bypass(self): + return True + + def __eq__(self, other): + return isinstance(other, Type) + + +class ObjectType(Type): + def __init__(self): + Type.__init__(self, "Object") + + def __eq__(self, other): + return other.name == self.name or isinstance(other, ObjectType) + + +class IOType(Type): + def __init__(self): + Type.__init__(self, "IO") + + def __eq__(self, other): + return other.name == self.name or isinstance(other, IOType) + + +class StringType(Type): + def __init__(self): + Type.__init__(self, "String") + + def __eq__(self, other): + return other.name == self.name or isinstance(other, StringType) + + def can_be_inherited(self): + return False + + +class BoolType(Type): + def __init__(self): + Type.__init__(self, "Bool") + + def __eq__(self, other): + return other.name == self.name or isinstance(other, BoolType) + + def can_be_inherited(self): + return False + + +class IntType(Type): + def __init__(self): + Type.__init__(self, "Int") + + def __eq__(self, other): + return other.name == self.name or isinstance(other, IntType) + + def can_be_inherited(self): + return False + + +class SelfType(Type): + def __init__(self, fixed=None): + Type.__init__(self, "SELF_TYPE") + self.fixed_type 
= fixed + + def can_be_inherited(self): + return False + + +class AutoType(Type): + def __init__(self): + Type.__init__(self, "AUTO_TYPE") + + def can_be_inherited(self): + return False + + def conforms_to(self, other): + return True + + def bypass(self): + return True + + +class Context: + def __init__(self): + self.types: Dict[str, Type] = {} + + def create_type(self, name: str, pos: Tuple[int, int] = (0, 0)): + if name in self.types: + raise SemanticError("Classes may not be redefined") + typex = self.types[name] = Type(name, pos) + return typex + + def get_type(self, name: str): + try: + return self.types[name] + except KeyError: + raise TypeError(f'Type "{name}" is not defined.') + + def __str__(self): + return ( + "{\n\t" + + "\n\t".join(y for x in self.types.values() for y in str(x).split("\n")) + + "\n}" + ) + + def __repr__(self): + return str(self) + + +class VariableInfo: + def __init__(self, name, vtype, idx=None): + self.name = name + self.type = vtype + self.idx = idx + + +class Scope: + def __init__(self, parent=None): + self.locals = [] + self.parent = parent + self.children = [] + self.index = 0 if parent is None else len(parent) + + def __len__(self): + return len(self.locals) + + def create_child(self): + child = Scope(self) + self.children.append(child) + return child + + def define_variable(self, vname, vtype, idx=None): + info = VariableInfo(vname, vtype, idx) + self.locals.append(info) + return info + + def find_variable(self, vname, index=None): + locals = self.locals if index is None else itt.islice(self.locals, index) + try: + return next(x for x in locals if x.name == vname) + except StopIteration: + return ( + self.parent.find_variable(vname, self.index) + if self.parent is not None + else None + ) + + def is_defined(self, vname): + return self.find_variable(vname) is not None + + def is_local(self, vname): + return any(True for x in self.locals if x.name == vname) + + def update_variable(self, vname, vtype, index=None): + locals = 
self.locals if index is None else itt.islice(self.locals, index) + for i, item in enumerate(locals): + if item.name == vname: + self.locals[i] = VariableInfo(vname, vtype, item.idx) + return True + return ( + self.parent.update_variable(vname, vtype, self.index) + if self.parent is not None + else False + ) + + +class InferencerManager: + def __init__(self): + # given a type represented by int idx, types[idx] = (A, B), where A and B are sets + # if x in A then idx.conforms_to(x) + # if x in B then x.conforms_to(idx) + self.conforms_to = [] + self.conformed_by = [] + self.infered_type = [] + self.count = 0 + + def assign_id(self, obj_type): + idx = self.count + self.conforms_to.append([obj_type]) + self.conformed_by.append([]) + self.infered_type.append(None) + self.count += 1 + + return idx + + def upd_conforms_to(self, idx, other): + for item in other: + self.auto_to_type(idx, item) + + def upd_conformed_by(self, idx, other): + for item in other: + self.type_to_auto(idx, item) + + def auto_to_type(self, idx, typex): + if isinstance(typex, SelfType): + typex = typex.fixed_type + try: + assert not isinstance(typex, ErrorType) + assert not any(item.name == typex.name for item in self.conforms_to[idx]) + + self.conforms_to[idx].append(typex) + except AssertionError: + pass + + def type_to_auto(self, idx, typex): + if isinstance(typex, SelfType): + typex = typex.fixed_type + try: + assert not isinstance(typex, ErrorType) + assert not any(item.name == typex.name for item in self.conformed_by[idx]) + + self.conformed_by[idx].append(typex) + except AssertionError: + pass + + def infer(self, idx): + try: + assert self.infered_type[idx] is None + assert len(self.conforms_to[idx]) > 1 or len(self.conformed_by[idx]) > 0 + + try: + start = self.get_min(self.conforms_to[idx]) + self.infered_type[idx] = start + + if len(self.conformed_by[idx]) > 0: + final = LCA(self.conformed_by[idx]) + assert final.conforms_to(start) + self.infered_type[idx] = final + + except AssertionError: 
+ self.infered_type[idx] = ErrorType() + return True + except AssertionError: + return False + + def infer_all(self): + change = False + for i in range(self.count): + change |= self.infer(i) + + return change + + def infer_object(self, obj_type): + for i in range(self.count): + if self.infered_type[i] is None: + self.infered_type[i] = obj_type + + def get_min(self, types): + path = [] + + def find(typex): + for i, item in enumerate(path): + if item.name == typex.name: + return i + return len(path) + + for item in types: + current = [] + while item is not None: + idx = find(item) + if idx == len(path): + current.append(item) + item = item.parent + continue + + assert idx == len(path) - 1 or len(current) == 0 + break + current.reverse() + path.extend(current) + + return path[-1] + + +def LCA(types): + if len(types) == 0: + return None + + # check ErrorType: + if any(isinstance(item, ErrorType) for item in types): + return ErrorType() + + # check AUTO_TYPE + if any(isinstance(item, AutoType) for item in types): + return AutoType() + + # check SELF_TYPE: + if all(isinstance(item, SelfType) for item in types): + return types[0] + + for i, item in enumerate(types): + if isinstance(item, SelfType): + types[i] = item.fixed_type + + current = types[0] + while current: + for item in types: + if not item.conforms_to(current): + break + else: + return current + current = current.parent + + # This part of the code is supposed to be unreachable + return None diff --git a/src/compiler/cmp/utils.py b/src/compiler/cmp/utils.py new file mode 100644 index 000000000..a2139016b --- /dev/null +++ b/src/compiler/cmp/utils.py @@ -0,0 +1,205 @@ +from .pycompiler import Production, Sentence, Symbol, EOF, Epsilon + + +class ContainerSet: + def __init__(self, *values, contains_epsilon=False): + self.set = set(values) + self.contains_epsilon = contains_epsilon + + def add(self, value): + n = len(self.set) + self.set.add(value) + return n != len(self.set) + + def extend(self, values): + change 
= False + for value in values: + change |= self.add(value) + return change + + def set_epsilon(self, value=True): + last = self.contains_epsilon + self.contains_epsilon = value + return last != self.contains_epsilon + + def update(self, other): + n = len(self.set) + self.set.update(other.set) + return n != len(self.set) + + def epsilon_update(self, other): + return self.set_epsilon(self.contains_epsilon | other.contains_epsilon) + + def hard_update(self, other): + return self.update(other) | self.epsilon_update(other) + + def find_match(self, match): + for item in self.set: + if item == match: + return item + return None + + def __len__(self): + return len(self.set) + int(self.contains_epsilon) + + def __str__(self): + return "%s-%s" % (str(self.set), self.contains_epsilon) + + def __repr__(self): + return str(self) + + def __iter__(self): + return iter(self.set) + + def __nonzero__(self): + return len(self) > 0 + + def __eq__(self, other): + if isinstance(other, set): + return self.set == other + return ( + isinstance(other, ContainerSet) + and self.set == other.set + and self.contains_epsilon == other.contains_epsilon + ) + + +def inspect(item, grammar_name="G", mapper=None): + try: + return mapper[item] + except (TypeError, KeyError): + if isinstance(item, dict): + items = ",\n ".join( + f"{inspect(key, grammar_name, mapper)}: {inspect(value, grammar_name, mapper)}" + for key, value in item.items() + ) + return f"{{\n {items} \n}}" + elif isinstance(item, ContainerSet): + args = ( + f'{ ", ".join(inspect(x, grammar_name, mapper) for x in item.set) } ,' + if item.set + else "" + ) + return f"ContainerSet({args} contains_epsilon={item.contains_epsilon})" + elif isinstance(item, EOF): + return f"{grammar_name}.EOF" + elif isinstance(item, Epsilon): + return f"{grammar_name}.Epsilon" + elif isinstance(item, Symbol): + return f"G['{item.Name}']" + elif isinstance(item, Sentence): + items = ", ".join(inspect(s, grammar_name, mapper) for s in item._symbols) + return 
f"Sentence({items})" + elif isinstance(item, Production): + left = inspect(item.Left, grammar_name, mapper) + right = inspect(item.Right, grammar_name, mapper) + return f"Production({left}, {right})" + elif isinstance(item, tuple) or isinstance(item, list): + ctor = ("(", ")") if isinstance(item, tuple) else ("[", "]") + return f'{ctor[0]} {("%s, " * len(item)) % tuple(inspect(x, grammar_name, mapper) for x in item)}{ctor[1]}' + else: + raise ValueError(f"Invalid: {item}") + + +class Token: + """ + Basic token class. + + Parameters + ---------- + lex : str + Token's lexeme. + token_type : Enum + Token's type. + pos : (int, int) + Token's starting position (row, column) + """ + + def __init__(self, lex, token_type, pos): + self.lex = lex + self.token_type = token_type + self.pos = pos + + def __str__(self): + return f"{self.token_type}: {self.lex}" + + def __repr__(self): + return str(self) + + @property + def is_valid(self): + return True + + +class UnknownToken(Token): + def __init__(self, lex, pos): + Token.__init__(self, lex, None, pos) + + def transform_to(self, token_type): + return Token(self.lex, token_type, self.pos) + + @property + def is_valid(self): + return False + + +class DisjointSet: + def __init__(self, *items): + self.nodes = {x: DisjointNode(x) for x in items} + + def merge(self, items): + items = (self.nodes[x] for x in items) + try: + head, *others = items + for other in others: + head.merge(other) + except ValueError: + pass + + @property + def representatives(self): + return {n.representative for n in self.nodes.values()} + + @property + def groups(self): + return [ + [n for n in self.nodes.values() if n.representative == r] + for r in self.representatives + ] + + def __len__(self): + return len(self.representatives) + + def __getitem__(self, item): + return self.nodes[item] + + def __str__(self): + return str(self.groups) + + def __repr__(self): + return str(self) + + +class DisjointNode: + def __init__(self, value): + self.value = value + 
self.parent = self + + @property + def representative(self): + if self.parent != self: + self.parent = self.parent.representative + return self.parent + + def merge(self, other): + other.representative.parent = self.representative + + def __str__(self): + return str(self.value) + + def __repr__(self): + return str(self) + + +emptyToken = Token("", "", (0, 0)) +selfToken = Token("self", "", (0, 0)) diff --git a/src/compiler/lexer/__init__.py b/src/compiler/lexer/__init__.py new file mode 100644 index 000000000..e69de29bb diff --git a/src/compiler/lexer/lex.py b/src/compiler/lexer/lex.py new file mode 100644 index 000000000..1c2fec965 --- /dev/null +++ b/src/compiler/lexer/lex.py @@ -0,0 +1,367 @@ +from ..cmp.grammar import * +from ..cmp.utils import Token + +import ply.lex as lex + + +class CoolLexer(object): + def __init__(self): + self.count = 0 + self.build() + + states = ( + ("string", "exclusive"), + ("comment", "exclusive"), + ) + + reserved = { + "class": "CLASS", + "inherits": "INHERITS", + "let": "LET", + "in": "IN", + "case": "CASE", + "of": "OF", + "esac": "ESAC", + "while": "WHILE", + "loop": "LOOP", + "pool": "POOL", + "if": "IF", + "then": "THEN", + "else": "ELSE", + "fi": "FI", + "isvoid": "ISVOID", + "not": "NOT", + "new": "NEW", + "true": "TRUE", + "false": "FALSE", + } + + tokens = [ + "SEMICOLON", + "COLON", + "COMMA", + "DOT", + "OPAR", + "CPAR", + "OCUR", + "CCUR", + "LARROW", + "RARROW", + "AT", + "EQUAL", + "PLUS", + "MINUS", + "STAR", + "DIV", + "LESS", + "LEQ", + "NEG", + "TYPEIDENTIFIER", + "OBJECTIDENTIFIER", + "NUMBER", + "STRING", + "ERROR", + ] + list(reserved.values()) + + token_type = { + "CLASS": classx, + "INHERITS": inherits, + "LET": let, + "IN": inx, + "CASE": case, + "OF": of, + "ESAC": esac, + "WHILE": whilex, + "LOOP": loop, + "POOL": pool, + "IF": ifx, + "THEN": then, + "ELSE": elsex, + "FI": fi, + "ISVOID": isvoid, + "NOT": notx, + "NEW": new, + "TRUE": boolx, + "FALSE": boolx, + "SEMICOLON": semi, + "COLON": colon, + 
"COMMA": comma, + "DOT": dot, + "OPAR": opar, + "CPAR": cpar, + "OCUR": ocur, + "CCUR": ccur, + "LARROW": larrow, + "RARROW": rarrow, + "AT": at, + "EQUAL": equal, + "PLUS": plus, + "MINUS": minus, + "STAR": star, + "DIV": div, + "LESS": less, + "LEQ": leq, + "NEG": neg, + "TYPEIDENTIFIER": typeid, + "OBJECTIDENTIFIER": objectid, + "NUMBER": num, + "STRING": stringx, + } + + def t_begin_STARTSTRING(self, t): + r'"' + self.string = "" + self.lexer.begin("string") + + def t_string_ENDSTRING(self, t): + r'"' + self.lexer.begin("INITIAL") + t.value = self.string + t.type = "STRING" + self.add_column(t) + return t + + def t_string_NULL(self, t): + r"\0" + self.lexer.begin("INITIAL") + self.add_column(t) + t.type = "ERROR" + t.value = f"({t.lineno}, {t.col}) - LexicographicError: String contains null character" + return t + + def t_string_newline1(self, t): + r"\\n" + self.string += "\n" + + def t_string_escaped_newline(self, t): + r"\\\n" + self.string += "\n" + t.lexer.lineno += 1 + self.count = t.lexpos + len(t.value) + + def t_string_invalid_newline2(self, t): + r"\n" + self.lexer.begin("INITIAL") + self.add_column(t) + self.count = t.lexpos + len(t.value) + t.type = "ERROR" + t.value = ( + f"({t.lineno}, {t.col}) - LexicographicError: Unterminated string constant" + ) + t.lexer.lineno += 1 + return t + + def t_string_special_character(self, t): + r"\\[btf]" + self.string += t.value + + def t_string_escaped_character(self, t): + r"\\." + self.string += t.value[1] + + def t_string_character(self, t): + r"." 
+ self.string += t.value + + def t_string_eof(self, t): + self.add_column(t) + t.type = "ERROR" + t.value = f"({t.lineno}, {t.col}) - LexicographicError: EOF in string constant" + t.lexer.begin("INITIAL") + return t + + def t_begin_STARTCOMMENT(self, t): + r"\(\*" + self.comment_level = 1 + self.lexer.begin("comment") + + def t_comment_STARTCOMMENT(self, t): + r"\(\*" + self.comment_level += 1 + + def t_comment_ENDCOMMENT(self, t): + r"\*\)" + self.comment_level -= 1 + if self.comment_level == 0: + self.lexer.begin("INITIAL") + + def t_comment_character(self, t): + r"." + + def t_comment_eof(self, t): + self.add_column(t) + t.type = "ERROR" + t.value = f"({t.lexer.lineno}, {t.col}) - LexicographicError: EOF in comment" + self.lexer.begin("INITIAL") + return t + + def t_NUMBER(self, t): + r"\d+" + t.value = int(t.value) + self.add_column(t) + return t + + # Rule to track line numbers + def t_ANY_newline(self, t): + r"\n+" + t.lexer.lineno += len(t.value) + self.count = t.lexpos + len(t.value) + + t_ignore = " \t\f\r\v" + + def t_COMMENTLINE(self, t): + r"--.*" + + def t_TYPEIDENTIFIER(self, t): + r"[A-Z][0-9A-Za-z_]*" + val = t.value.lower() + if val not in ["true", "false"]: + t.type = self.reserved.get(val, "TYPEIDENTIFIER") + self.add_column(t) + return t + + def t_OBJECTIDENTIFIER(self, t): + r"[a-z][0-9A-Za-z_]*" + val = t.value.lower() + t.type = self.reserved.get(val, "OBJECTIDENTIFIER") + self.add_column(t) + return t + + def t_SEMICOLON(self, t): + r";" + self.add_column(t) + return t + + def t_COLON(self, t): + r":" + self.add_column(t) + return t + + def t_COMMA(self, t): + r"," + self.add_column(t) + return t + + def t_DOT(self, t): + r"\." 
+ self.add_column(t) + return t + + def t_OPAR(self, t): + r"\(" + self.add_column(t) + return t + + def t_CPAR(self, t): + r"\)" + self.add_column(t) + return t + + def t_OCUR(self, t): + r"{" + self.add_column(t) + return t + + def t_CCUR(self, t): + r"}" + self.add_column(t) + return t + + def t_LARROW(self, t): + r"<-" + self.add_column(t) + return t + + def t_RARROW(self, t): + r"=>" + self.add_column(t) + return t + + def t_AT(self, t): + r"@" + self.add_column(t) + return t + + def t_EQUAL(self, t): + r"=" + self.add_column(t) + return t + + def t_PLUS(self, t): + r"\+" + self.add_column(t) + return t + + def t_MINUS(self, t): + r"-" + self.add_column(t) + return t + + def t_STAR(self, t): + r"\*" + self.add_column(t) + return t + + def t_DIV(self, t): + r"/" + self.add_column(t) + return t + + def t_LEQ(self, t): + r"<=" + self.add_column(t) + return t + + def t_LESS(self, t): + r"<" + self.add_column(t) + return t + + def t_NEG(self, t): + r"~" + self.add_column(t) + return t + + def t_eof(self, t): + t.lexer.eof = (t.lexer.lineno, t.lexpos - self.count + 1) + return None + + def t_error(self, t): + self.add_column(t) + t.type = "ERROR" + error_msg = t.value[0] + t.value = f'({t.lineno}, {t.col}) - LexicographicError: ERROR "{error_msg}"' + t.lexer.skip(1) + return t + + # Build the lexer + def build(self, **kwargs): + self.lexer = lex.lex( + module=self, errorlog=lex.NullLogger(), debug=False, **kwargs + ) + self.lexer.eof = (1, 1) + self.comment_level = 0 + self.string = "" + + def add_column(self, t): + t.col = t.lexpos - self.count + 1 + + def tokenize(self, data): + self.lexer.input(data) + token_list = [] + errors = [] + while True: + tok = self.lexer.token() + if not tok: + break + + if tok.type == "ERROR": + errors.append(tok.value) + else: + token_list.append( + Token(tok.value, self.token_type[tok.type], (tok.lineno, tok.col)) + ) + if not token_list: + errors.append("(0, 0) - SyntacticError: Unexpected token EOF") + token_list.append(Token("$", G.EOF, 
self.lexer.eof)) + return token_list, errors diff --git a/src/compiler/parser/__init__.py b/src/compiler/parser/__init__.py new file mode 100644 index 000000000..e69de29bb diff --git a/src/compiler/parser/parser.py b/src/compiler/parser/parser.py new file mode 100644 index 000000000..3008eb6c6 --- /dev/null +++ b/src/compiler/parser/parser.py @@ -0,0 +1,200 @@ +from typing import List +from ..cmp.automata import State +from ..cmp.pycompiler import EOF, Item +from ..cmp.utils import Token, ContainerSet +from .utils import upd_table, compute_firsts, expand, compress + + +class ShiftReduceParser: + SHIFT = "SHIFT" + REDUCE = "REDUCE" + OK = "OK" + + def __init__(self, G, verbose=False): + self.G = G + self.verbose = verbose + self.action = {} + self.goto = {} + self._build_parsing_table() + + def _build_parsing_table(self): + raise NotImplementedError() + + def __call__(self, w: List[Token], get_shift_reduce=False): + stack = [0] + cursor = 0 + output = [] + operations = [] + + while True: + state = stack[-1] + lookahead = w[cursor].token_type + if self.verbose: + print(stack, w[cursor:]) + + try: + if state not in self.action or lookahead not in self.action[state]: + error = f"{w[cursor].pos} - SyntacticError: ERROR at or near {w[cursor].lex}" + return None, error + except: + print(state) + print(self.action) + print(lookahead) + error = f"{w[cursor].pos} - SyntacticError: ERROR at or near {w[cursor].lex}" + return None, error + + action, tag = list(self.action[state][lookahead])[0] + if action is self.SHIFT: + operations.append(self.SHIFT) + stack.append(tag) + cursor += 1 + elif action is self.REDUCE: + operations.append(self.REDUCE) + if len(tag.Right): + stack = stack[: -len(tag.Right)] + stack.append(list(self.goto[stack[-1]][tag.Left])[0]) + output.append(tag) + elif action is ShiftReduceParser.OK: + return (output if not get_shift_reduce else (output, operations)), None + else: + raise ValueError + + +class LR1Parser(ShiftReduceParser): + def 
_build_parsing_table(self): + self.ok = True + G = self.Augmented = self.G.AugmentedGrammar(True) + + automaton = self.automaton = build_LR1_automaton(G) + for i, node in enumerate(automaton): + if self.verbose: + print(i, "\t", "\n\t ".join(str(x) for x in node.state), "\n") + node.idx = i + node.tag = f"I{i}" + + for node in automaton: + idx = node.idx + for item in node.state: + if item.IsReduceItem: + prod = item.production + if prod.Left == G.startSymbol: + self.ok &= upd_table( + self.action, idx, G.EOF, (ShiftReduceParser.OK, "") + ) + else: + for lookahead in item.lookaheads: + self.ok &= upd_table( + self.action, + idx, + lookahead, + (ShiftReduceParser.REDUCE, prod), + ) + else: + next_symbol = item.NextSymbol + if next_symbol.IsTerminal: + self.ok &= upd_table( + self.action, + idx, + next_symbol, + (ShiftReduceParser.SHIFT, node[next_symbol.Name][0].idx), + ) + else: + self.ok &= upd_table( + self.goto, idx, next_symbol, node[next_symbol.Name][0].idx + ) + + +def build_LR1_automaton(G): + assert len(G.startSymbol.productions) == 1, "Grammar must be augmented" + + firsts = compute_firsts(G) + firsts[G.EOF] = ContainerSet(G.EOF) + + start_production = G.startSymbol.productions[0] + start_item = Item(start_production, 0, lookaheads=(G.EOF,)) + start = frozenset([start_item]) + + closure = closure_lr1(start, firsts) + automaton = State(frozenset(closure), True) + + pending = [start] + visited = {start: automaton} + + while pending: + current = pending.pop() + current_state = visited[current] + + for symbol in G.terminals + G.nonTerminals: + items = current_state.state + kernel = goto_lr1(items, symbol, just_kernel=True) + if not kernel: + continue + try: + next_state = visited[kernel] + except KeyError: + closure = goto_lr1(items, symbol, firsts) + next_state = visited[kernel] = State(frozenset(closure), True) + pending.append(kernel) + + current_state.add_transition(symbol.Name, next_state) + + automaton.set_formatter(lambda x: "") + return automaton + + 
+def closure_lr1(items, firsts): + closure = ContainerSet(*items) + + changed = True + while changed: + changed = False + + new_items = ContainerSet() + for item in closure: + new_items.extend(expand(item, firsts)) + + changed = closure.update(new_items) + + return compress(closure) + + +def goto_lr1(items, symbol, firsts=None, just_kernel=False): + assert ( + just_kernel or firsts is not None + ), "`firsts` must be provided if `just_kernel=False`" + items = frozenset(item.NextItem() for item in items if item.NextSymbol == symbol) + return items if just_kernel else closure_lr1(items, firsts) + + +def evaluate_reverse_parse(right_parse, operations, tokens): + if not right_parse or not operations or not tokens: + return + + right_parse = iter(right_parse) + tokens = iter(tokens) + stack = [] + for operation in operations: + if operation == ShiftReduceParser.SHIFT: + token = next(tokens) + stack.append(token) + elif operation == ShiftReduceParser.REDUCE: + production = next(right_parse) + _, body = production + attributes = production.attributes + assert all( + rule is None for rule in attributes[1:] + ), "There must be only synthesized attributes." 
+ rule = attributes[0] + + if len(body): + synteticed = [None] + stack[-len(body) :] + value = rule(None, synteticed) + stack[-len(body) :] = [value] + else: + stack.append(rule(None, None)) + else: + raise Exception("Invalid action!!!") + + assert len(stack) == 1 + assert isinstance(next(tokens).token_type, EOF) + return stack[0] diff --git a/src/compiler/parser/utils.py b/src/compiler/parser/utils.py new file mode 100644 index 000000000..8ff715dcd --- /dev/null +++ b/src/compiler/parser/utils.py @@ -0,0 +1,95 @@ +from ..cmp.pycompiler import Item +from ..cmp.utils import ContainerSet + + +def upd_table(table, head, symbol, production): + if not head in table: + table[head] = {} + if not symbol in table[head]: + table[head][symbol] = [] + if production not in table[head][symbol]: + table[head][symbol].append(production) + return len(table[head][symbol]) <= 1 + + +def compute_firsts(G): + firsts = {} + change = True + + for terminal in G.terminals: + firsts[terminal] = ContainerSet(terminal) + + for nonterminal in G.nonTerminals: + firsts[nonterminal] = ContainerSet() + + while change: + change = False + + for production in G.Productions: + X = production.Left + alpha = production.Right + + first_X = firsts[X] + + try: + first_alpha = firsts[alpha] + except KeyError: + first_alpha = firsts[alpha] = ContainerSet() + + local_first = compute_local_first(firsts, alpha) + + change |= first_alpha.hard_update(local_first) + change |= first_X.hard_update(local_first) + + return firsts + + +def expand(item, firsts): + next_symbol = item.NextSymbol + if next_symbol is None or not next_symbol.IsNonTerminal: + return [] + + lookaheads = ContainerSet() + for preview in item.Preview(): + lookaheads.hard_update(compute_local_first(firsts, preview)) + + assert not lookaheads.contains_epsilon + return [Item(prod, 0, lookaheads) for prod in next_symbol.productions] + + +def compress(items): + centers = {} + + for item in items: + center = item.Center() + try: + lookaheads = 
centers[center] + except KeyError: + centers[center] = lookaheads = set() + lookaheads.update(item.lookaheads) + + return { + Item(x.production, x.pos, set(lookahead)) for x, lookahead in centers.items() + } + + +def compute_local_first(firsts, alpha): + first_alpha = ContainerSet() + + try: + alpha_is_epsilon = alpha.IsEpsilon + except AttributeError: + alpha_is_epsilon = False + + if alpha_is_epsilon: + first_alpha.set_epsilon() + + else: + for symbol in alpha: + first_alpha.update(firsts[symbol]) + if not firsts[symbol].contains_epsilon: + break + else: + first_alpha.set_epsilon() + + return first_alpha diff --git a/src/compiler/visitors/__init__.py b/src/compiler/visitors/__init__.py new file mode 100644 index 000000000..e69de29bb diff --git a/src/compiler/visitors/cil2mips/__init__.py b/src/compiler/visitors/cil2mips/__init__.py new file mode 100644 index 000000000..e69de29bb diff --git a/src/compiler/visitors/cil2mips/cil2mips.py b/src/compiler/visitors/cil2mips/cil2mips.py new file mode 100644 index 000000000..ccf642204 --- /dev/null +++ b/src/compiler/visitors/cil2mips/cil2mips.py @@ -0,0 +1,774 @@ +from ...cmp import mips_ast as mips +from ...cmp.cil_ast import ( + AllocateNode, + ArgNode, + ArithmeticNode, + AssignNode, + ComplementNode, + ConcatNode, + CopyNode, + DataNode, + DivNode, + DynamicCallNode, + EqualNode, + EqualStrNode, + ErrorNode, + ExitNode, + FunctionNode, + GetAttribNode, + GotoIfNode, + GotoNode, + LabelNode, + LengthNode, + LeqNode, + LessNode, + LoadNode, + MinusNode, + NameNode, + PlusNode, + PrintIntNode, + PrintStrNode, + ProgramNode, + ReadIntNode, + ReadStrNode, + ReturnNode, + SetAttribNode, + StarNode, + StaticCallNode, + SubstringNode, + TypeNameNode, + TypeNode, + TypeOfNode, + VoidNode, +) +from ...visitors import visitor +from .utils import flatten +from typing import Optional + + +class FunctionCollectorVisitor: + def __init__(self): + self.function_count = 0 + self.functions = {} + + def generate_function_name(self): + 
self.function_count += 1 + return f"F_{self.function_count}" + + @visitor.on("node") + def collect(self, node): + pass + + @visitor.when(ProgramNode) + def collect(self, node: ProgramNode): + for func in node.dotcode: + self.collect(func) + + @visitor.when(FunctionNode) + def collect(self, node: FunctionNode): + if node.name == "entry": + self.functions[node.name] = "main" + else: + self.functions[node.name] = self.generate_function_name() + + +class BaseCILToMIPSVisitor: + def __init__(self): + self.data = {} + self.text = {} + self.types = {} + self.current_function: Optional[mips.FunctionNode] = None + self.pushed_args = 0 + self.label_count = 0 + self.type_count = 0 + self.function_labels = {} + self.function_collector = FunctionCollectorVisitor() + self.function_label_count = 0 + + def make_data_label(self): + self.label_count += 1 + return f"data_{self.label_count}" + + def make_function_label(self): + self.function_label_count += 1 + return f"Label_{self.function_label_count}" + + def make_type_label(self): + self.type_count += 1 + return f"type_{self.type_count}" + + def make_callee_init_instructions(self, function_node: mips.FunctionNode): + push_ra = mips.push_to_stack(mips.RA) + push_fp = mips.push_to_stack(mips.FP) + set_fp = mips.AddInmediateNode(mips.FP, mips.SP, 4) + local_vars_frame_size = len(function_node.localvars) * 4 + set_sp = mips.AddInmediateNode(mips.SP, mips.SP, -local_vars_frame_size) + return list(flatten([push_ra, push_fp, set_fp, set_sp])) + + def make_callee_final_instructions(self, function_node: mips.FunctionNode): + local_vars_frame_size = len(function_node.localvars) * 4 + set_sp = mips.AddInmediateNode(mips.SP, mips.SP, local_vars_frame_size) + pop_FP = mips.pop_from_stack(mips.FP) + pop_RA = mips.pop_from_stack(mips.RA) + final = None + if function_node.label == mips.MAIN_FUNCTION_NAME: + final = mips.exit_program() + else: + final = mips.JumpRegister(mips.RA) + + return list(flatten([set_sp, pop_FP, pop_RA, final])) + + def 
register_function(self, name, function: mips.FunctionNode): + self.text[name] = function + self.current_function = function + self.function_labels = {} + + def get_param_var_index(self, name): + index = self.current_function.params.index(name) + offset = (len(self.current_function.params) - index) * 4 + return mips.RegisterRelativeLocation(mips.FP, offset) + + def get_local_var_index(self, name): + index = self.current_function.localvars.index(name) + offset = (index + 2) * -4 + return mips.RegisterRelativeLocation(mips.FP, offset) + + def get_var_location(self, name): + try: + return self.get_param_var_index(name) + except ValueError: + return self.get_local_var_index(name) + + def get_type_size(self, type_name): + return (len(self.types[type_name].attributes) + 1) * 4 + + +class CILToMIPSVisitor(BaseCILToMIPSVisitor): + @visitor.on("node") + def visit(self, node): + pass + + @visitor.when(ProgramNode) + def visit(self, node: ProgramNode) -> mips.ProgramNode: + + self.function_collector.collect(node) + self.data["null_str"] = mips.StringConst("null_str", "") + + for dd in node.dotdata: + self.visit(dd) + + for dt in node.dottypes: + self.visit(dt) + + for dc in node.dotcode: + self.visit(dc) + + return mips.ProgramNode( + [t for t in self.types.values()], + [d for d in self.data.values()], + [f for f in self.text.values()], + ) + + @visitor.when(TypeNode) + def visit(self, node: TypeNode): + data_label = self.make_data_label() + self.data[data_label] = mips.StringConst(data_label, node.name) + + type_label = self.make_type_label() + methods = { + type_function: self.function_collector.functions[implemented_function] + for type_function, implemented_function in node.methods + } + + self.types[node.name] = mips.TypeNode( + data_label, type_label, node.attributes, methods, self.type_count - 1 + ) + + @visitor.when(DataNode) + def visit(self, node: DataNode): + data_label = self.make_data_label() + self.data[node.name] = mips.StringConst(data_label, node.value) + + 
@visitor.when(FunctionNode) + def visit(self, node: FunctionNode): + # Init function + params = [p.name for p in node.params] + local_vars = [lv.name for lv in node.localvars] + function_name = self.function_collector.functions[node.name] + function_node = mips.FunctionNode(function_name, params, local_vars) + self.register_function(function_name, function_node) + for inst in node.instructions: + if isinstance(inst, LabelNode): + new_label = self.make_function_label() + self.function_labels[inst.label] = new_label + + # Conventions of Init instructions of the callee function + init_callee = self.make_callee_init_instructions(function_node) + + # Body instructions + self.current_function = function_node + body = [self.visit(instruction) for instruction in node.instructions] + + # Conventions of Final callee instructions + final_callee = self.make_callee_final_instructions(function_node) + + total_instructions = list(flatten(init_callee + body + final_callee)) + function_node.instructions = total_instructions + self.current_function = None + + @visitor.when(AssignNode) + def visit(self, node: AssignNode): + instructions = [] + + if isinstance(node.source, VoidNode): + instructions.append( + mips.StoreWordNode(mips.ZERO, self.get_var_location(node.dest)) + ) + return instructions + + if node.source.isnumeric(): + instructions.append(mips.LoadInmediateNode(mips.A0, int(node.source))) + else: + instructions.append( + mips.LoadWordNode(mips.A0, self.get_var_location(node.source)) + ) + + instructions.append( + mips.StoreWordNode(mips.A0, self.get_var_location(node.dest)) + ) + + return instructions + + @visitor.when(PlusNode) + def visit(self, node: PlusNode): + instructions = [mips.AddNode(mips.T2, mips.T0, mips.T1)] + return self.numeric_operation(node, instructions) + + @visitor.when(MinusNode) + def visit(self, node: MinusNode): + instructions = [mips.SubNode(mips.T2, mips.T0, mips.T1)] + return self.numeric_operation(node, instructions) + + @visitor.when(StarNode) + 
def visit(self, node: StarNode): + instructions = [mips.MultiplyNode(mips.T2, mips.T0, mips.T1)] + return self.numeric_operation(node, instructions) + + @visitor.when(DivNode) + def visit(self, node: DivNode): + instructions = [ + mips.DivideNode(mips.T0, mips.T1), + mips.MoveFromLowNode(mips.T2), + ] + return self.numeric_operation(node, instructions) + + @visitor.when(LeqNode) + def visit(self, node: LeqNode): + return self.boolean_operation(node, "less_equal") + + @visitor.when(LessNode) + def visit(self, node: LessNode): + return self.boolean_operation(node, "less") + + @visitor.when(EqualNode) + def visit(self, node: EqualNode): + instructions = [] + + if isinstance(node.left, int): + instructions.append(mips.LoadInmediateNode(mips.A0, node.left)) + elif isinstance(node.left, VoidNode): + instructions.append(mips.LoadInmediateNode(mips.A0, 0)) + else: + instructions.append( + mips.LoadWordNode(mips.A0, self.get_var_location(node.left)) + ) + + if isinstance(node.right, int): + instructions.append(mips.LoadInmediateNode(mips.A1, node.right)) + elif isinstance(node.right, VoidNode): + instructions.append(mips.LoadInmediateNode(mips.A1, 0)) + else: + instructions.append( + mips.LoadWordNode(mips.A1, self.get_var_location(node.right)) + ) + + instructions.append(mips.JumpAndLinkNode("equals")) + + instructions.append( + mips.StoreWordNode(mips.V0, self.get_var_location(node.dest)) + ) + + return instructions + + @visitor.when(EqualStrNode) + def visit(self, node: EqualStrNode): + instructions = [] + + instructions.append( + mips.LoadWordNode(mips.A0, self.get_var_location(node.left)) + ) + instructions.append( + mips.LoadWordNode(mips.A1, self.get_var_location(node.right)) + ) + instructions.append(mips.JumpAndLinkNode("equal_str")) + instructions.append( + mips.StoreWordNode(mips.V0, self.get_var_location(node.dest)) + ) + return instructions + + @visitor.when(ComplementNode) + def visit(self, node: ComplementNode): + instructions = [] + + if isinstance(node.obj, 
int): + instructions.append(mips.LoadInmediateNode(mips.T0, node.obj)) + else: + instructions.append( + mips.LoadWordNode(mips.T0, self.get_var_location(node.obj)) + ) + + instructions.append(mips.ComplementNode(mips.T1, mips.T0)) + instructions.append(mips.AddInmediateNode(mips.T1, mips.T1, 1)) + instructions.append( + mips.StoreWordNode(mips.T1, self.get_var_location(node.dest)) + ) + + return instructions + + @visitor.when(AllocateNode) + def visit(self, node: AllocateNode): + instructions = [] + + tp = 0 + if node.type.isnumeric(): + tp = node.type + else: + tp = self.types[node.type].pos + + instructions.append( + mips.LoadInmediateNode(mips.A0, self.get_type_size(node.type)) + ) + instructions.append(mips.LoadInmediateNode(mips.V0, 9)) + instructions.append(mips.SyscallNode()) + instructions.append(mips.LoadInmediateNode(mips.A0, tp)) + instructions.append( + mips.StoreWordNode(mips.A0, mips.RegisterRelativeLocation(mips.V0, 0)) + ) + if self.types["Int"].pos == tp: + instructions.append(mips.LoadInmediateNode(mips.A0, 0)) + instructions.append( + mips.StoreWordNode(mips.A0, mips.RegisterRelativeLocation(mips.V0, 4)) + ) + if self.types["String"].pos == tp: + instructions.append(mips.LoadAddressNode(mips.A0, "null_str")) + instructions.append( + mips.StoreWordNode(mips.A0, mips.RegisterRelativeLocation(mips.V0, 4)) + ) + + instructions.append(mips.MoveNode(mips.A1, mips.V0)) + + int_type_index = self.types["Int"].pos + instructions.append(mips.LoadInmediateNode(mips.A0, 8)) + instructions.append(mips.LoadInmediateNode(mips.V0, 9)) + instructions.append(mips.SyscallNode()) + instructions.append(mips.LoadInmediateNode(mips.A0, int_type_index)) + instructions.append( + mips.StoreWordNode(mips.A0, mips.RegisterRelativeLocation(mips.V0, 0)) + ) + instructions.append(mips.LoadInmediateNode(mips.A0, 0)) + instructions.append( + mips.StoreWordNode(mips.A0, mips.RegisterRelativeLocation(mips.V0, 4)) + ) + instructions.append( + mips.StoreWordNode(mips.V0, 
mips.RegisterRelativeLocation(mips.A1, 8)) + ) + + instructions.append(mips.MoveNode(mips.V0, mips.A1)) + + instructions.append( + mips.StoreWordNode(mips.V0, self.get_var_location(node.dest)) + ) + + return instructions + + @visitor.when(TypeOfNode) + def visit(self, node: TypeOfNode): + instructions = [] + instructions.append(mips.LoadWordNode(mips.A0, self.get_var_location(node.obj))) + + instructions.append( + mips.LoadWordNode(mips.A1, mips.RegisterRelativeLocation(mips.A0, 0)) + ) + instructions.append( + mips.StoreWordNode(mips.A1, self.get_var_location(node.dest)) + ) + + return instructions + + @visitor.when(NameNode) + def visit(self, node: NameNode): + instructions = [] + + instructions.append(mips.LoadAddressNode(mips.A0, mips.TYPE_LIST)) + + tp_number = self.types[node.name].pos + instructions.append( + mips.AddInmediateUnsignedNode(mips.A0, mips.A0, tp_number * 4) + ) + instructions.append( + mips.LoadWordNode(mips.A0, mips.RegisterRelativeLocation(mips.A0, 0)) + ) + + instructions.append( + mips.StoreWordNode(mips.A0, self.get_var_location(node.dest)) + ) + + return instructions + + @visitor.when(LabelNode) + def visit(self, node: LabelNode): + return [mips.LabelNode(self.function_labels[node.label])] + + @visitor.when(StaticCallNode) + def visit(self, node: StaticCallNode): + instructions = [] + function_to_call = self.function_collector.functions[node.function] + instructions.append(mips.JumpAndLinkNode(function_to_call)) + + instructions.append( + mips.StoreWordNode(mips.V0, self.get_var_location(node.dest)) + ) + if self.pushed_args > 0: + instructions.append( + mips.AddInmediateNode(mips.SP, mips.SP, self.pushed_args * 4) + ) + self.pushed_args = 0 + return instructions + + @visitor.when(DynamicCallNode) + def visit(self, node: DynamicCallNode): + instructions = [] + caller_type = self.types[node.computed_type] + index = [m for m, m_label in caller_type.methods.items()].index(node.method) + + instructions.append( + mips.LoadWordNode(mips.A0, 
self.get_var_location(node.type)) + ) + + instructions.append(mips.LoadAddressNode(mips.A1, mips.VIRTUAL_TABLE)) + instructions.append(mips.ShiftLeftLogicalNode(mips.A2, mips.A0, 2)) + instructions.append(mips.AddUnsignedNode(mips.A1, mips.A1, mips.A2)) + instructions.append( + mips.LoadWordNode(mips.A1, mips.RegisterRelativeLocation(mips.A1, 0)) + ) + instructions.append(mips.AddInmediateUnsignedNode(mips.A1, mips.A1, index * 4)) + instructions.append( + mips.LoadWordNode(mips.A1, mips.RegisterRelativeLocation(mips.A1, 0)) + ) + instructions.append(mips.JumpRegisterAndLinkNode(mips.A1)) + + instructions.append( + mips.StoreWordNode(mips.V0, self.get_var_location(node.dest)) + ) + + if self.pushed_args > 0: + instructions.append( + mips.AddInmediateNode(mips.SP, mips.SP, self.pushed_args * 4) + ) + self.pushed_args = 0 + + return instructions + + @visitor.when(ArgNode) + def visit(self, node: ArgNode): + self.pushed_args += 1 + instructions = [] + if isinstance(node.name, int): + instructions.append(mips.LoadInmediateNode(mips.A0, node.name)) + instructions.extend(mips.push_to_stack(mips.A0)) + else: + instructions.append( + mips.LoadWordNode(mips.A0, self.get_var_location(node.name)) + ) + instructions.extend(mips.push_to_stack(mips.A0)) + + return instructions + + @visitor.when(ReturnNode) + def visit(self, node: ReturnNode): + instructions = [] + if node.value is None: + instructions.append(mips.LoadInmediateNode(mips.V0, 0)) + elif isinstance(node.value, int): + instructions.append(mips.LoadInmediateNode(mips.V0, node.value)) + elif isinstance(node.value, VoidNode): + instructions.append(mips.LoadInmediateNode(mips.V0, 0)) + else: + instructions.append( + mips.LoadWordNode(mips.V0, self.get_var_location(node.value)) + ) + + return instructions + + @visitor.when(LoadNode) + def visit(self, node: LoadNode): + instructions = [] + + location = mips.LabelRelativeLocation(self.data[node.msg.name].label, 0) + instructions.append(mips.LoadAddressNode(mips.A0, 
location)) + instructions.append( + mips.StoreWordNode(mips.A0, self.get_var_location(node.dest)) + ) + + return instructions + + @visitor.when(LengthNode) + def visit(self, node: LengthNode): + instructions = [] + instructions.append( + mips.LoadWordNode(mips.A0, self.get_var_location(node.source)) + ) + instructions.append(mips.JumpAndLinkNode("len")) + instructions.append( + mips.StoreWordNode(mips.V0, self.get_var_location(node.dest)) + ) + return instructions + + @visitor.when(ConcatNode) + def visit(self, node: ConcatNode): + instructions = [] + instructions.append( + mips.LoadWordNode(mips.A0, self.get_var_location(node.prefix)) + ) + instructions.append( + mips.LoadWordNode(mips.A1, self.get_var_location(node.suffix)) + ) + instructions.append( + mips.LoadWordNode(mips.A2, self.get_var_location(node.length)) + ) + instructions.append(mips.JumpAndLinkNode("concat")) + instructions.append( + mips.StoreWordNode(mips.V0, self.get_var_location(node.dest)) + ) + return instructions + + @visitor.when(SubstringNode) + def visit(self, node: SubstringNode): + instructions = [ + mips.LoadWordNode(mips.A0, self.get_var_location(node.str_value)) + ] + instructions.extend( + self.jump_and_link_node_instructions( + node.index, node.length, node.dest, "substr", mips.A1, mips.A2, mips.V0 + ) + ) + return instructions + + @visitor.when(ReadStrNode) + def visit(self, node: ReadStrNode): + instructions = [] + instructions.append(mips.JumpAndLinkNode("read_str")) + instructions.append( + mips.StoreWordNode(mips.V0, self.get_var_location(node.dest)) + ) + return instructions + + @visitor.when(PrintIntNode) + def visit(self, node: PrintIntNode): + return self.print_instructions(node, 1) + + @visitor.when(PrintStrNode) + def visit(self, node: PrintStrNode): + return self.print_instructions(node, 4) + + @visitor.when(ErrorNode) + def visit(self, node: ErrorNode): + instructions = [] + + mips_label = self.data[node.data_node.name].label + + 
instructions.append(mips.LoadInmediateNode(mips.V0, 4)) + instructions.append(mips.LoadAddressNode(mips.A0, mips_label)) + instructions.append(mips.SyscallNode()) + instructions.append(mips.LoadInmediateNode(mips.V0, 10)) + instructions.append(mips.SyscallNode()) + + return instructions + + @visitor.when(TypeNameNode) + def visit(self, node: TypeNameNode): + instructions = [] + instructions.append( + mips.LoadWordNode(mips.A0, self.get_var_location(node.source)) + ) + instructions.append( + mips.LoadWordNode(mips.A0, mips.RegisterRelativeLocation(mips.A0, 0)) + ) + instructions.append(mips.ShiftLeftLogicalNode(mips.A0, mips.A0, 2)) + instructions.append(mips.LoadAddressNode(mips.A1, mips.TYPE_LIST)) + instructions.append(mips.AddUnsignedNode(mips.A0, mips.A0, mips.A1)) + instructions.append( + mips.LoadWordNode(mips.A0, mips.RegisterRelativeLocation(mips.A0, 0)) + ) + instructions.append( + mips.StoreWordNode(mips.A0, self.get_var_location(node.dest)) + ) + return instructions + + @visitor.when(ExitNode) + def visit(self, node: ExitNode): + instructions = [] + instructions.append(mips.LoadInmediateNode(mips.V0, 10)) + instructions.append(mips.SyscallNode()) + return instructions + + @visitor.when(GetAttribNode) + def visit(self, node: GetAttribNode): + instructions = [] + + dest = node.dest if isinstance(node.dest, str) else node.dest.name + obj = node.obj if isinstance(node.obj, str) else node.obj.name + comp_type = ( + node.computed_type + if isinstance(node.computed_type, str) + else node.computed_type.name + ) + + instructions.append(mips.LoadWordNode(mips.A0, self.get_var_location(obj))) + + tp = self.types[comp_type] + offset = (tp.attributes.index(node.attr) + 1) * 4 + + instructions.append( + mips.LoadWordNode(mips.A1, mips.RegisterRelativeLocation(mips.A0, offset)) + ) + instructions.append(mips.StoreWordNode(mips.A1, self.get_var_location(dest))) + return instructions + + @visitor.when(SetAttribNode) + def visit(self, node: SetAttribNode): + instructions 
= [] + + obj = node.obj if isinstance(node.obj, str) else node.obj.name + comp_type = ( + node.computed_type + if isinstance(node.computed_type, str) + else node.computed_type.name + ) + + tp = self.types[comp_type] + offset = (tp.attributes.index(node.attr) + 1) * 4 + + instructions.append(mips.LoadWordNode(mips.A0, self.get_var_location(obj))) + + if isinstance(node.value, int): + instructions.append(mips.LoadInmediateNode(mips.A1, node.value)) + elif isinstance(node.value, VoidNode): + instructions.append(mips.LoadInmediateNode(mips.A1, 0)) + else: + instructions.append( + mips.LoadWordNode(mips.A1, self.get_var_location(node.value)) + ) + + instructions.append( + mips.StoreWordNode(mips.A1, mips.RegisterRelativeLocation(mips.A0, offset)) + ) + return instructions + + @visitor.when(CopyNode) + def visit(self, node: CopyNode): + instructions = [] + # reg1 T0 reg2 A3 + instructions.extend(mips.push_to_stack(mips.T0)) + instructions.append( + mips.LoadWordNode(mips.T0, self.get_var_location(node.source)) + ) + instructions.append( + mips.LoadWordNode(mips.A0, mips.RegisterRelativeLocation(mips.T0, 4)) + ) + instructions.append(mips.ShiftLeftLogicalNode(mips.A0, mips.A0, 2)) + instructions.append(mips.JumpAndLinkNode("malloc")) + instructions.append(mips.MoveNode(mips.A2, mips.A0)) + instructions.append(mips.MoveNode(mips.A0, mips.T0)) + instructions.append(mips.MoveNode(mips.A1, mips.V0)) + instructions.append(mips.JumpAndLinkNode("copy")) + instructions.extend(mips.pop_from_stack(mips.T0)) + instructions.append( + mips.StoreWordNode(mips.V0, self.get_var_location(node.dest)) + ) + return instructions + + @visitor.when(GotoIfNode) + def visit(self, node: GotoIfNode): + instructions = [] + + local_label = self.function_labels[node.label] + + instructions.append( + mips.LoadWordNode(mips.A0, self.get_var_location(node.condition)) + ) + instructions.append(mips.BranchOnNotEqualNode(mips.A0, mips.ZERO, local_label)) + return instructions + + @visitor.when(GotoNode) + 
def visit(self, node: GotoNode): + local_label = self.function_labels[node.label] + return [mips.JumpNode(local_label)] + + @visitor.when(ReadIntNode) + def visit(self, node: ReadIntNode): + instructions = [] + instructions.append(mips.LoadInmediateNode(mips.V0, 5)) + instructions.append(mips.SyscallNode()) + instructions.append( + mips.StoreWordNode(mips.V0, self.get_var_location(node.dest)) + ) + return instructions + + def add_node_instruction(self, node, register: mips.Register): + if isinstance(node, int): + return mips.LoadInmediateNode(register, node) + else: + return mips.LoadWordNode(register, self.get_var_location(node)) + + def numeric_operation(self, node, instructions): + return self.generate_instructions( + node.left, node.right, node.dest, instructions, mips.T0, mips.T1, mips.T2 + ) + + def boolean_operation(self, node, op: str): + return self.jump_and_link_node_instructions( + node.left, node.right, node.dest, op, mips.A0, mips.A1, mips.V0 + ) + + def jump_and_link_node_instructions(self, node0, node1, dest, op, reg0, reg1, reg2): + instructions = [mips.JumpAndLinkNode(op)] + return self.generate_instructions( + node0, node1, dest, instructions, reg0, reg1, reg2 + ) + + def generate_instructions( + self, + node0, + node1, + dest, + specific_instructions, + reg0: mips.Register, + reg1: mips.Register, + reg2: mips.Register, + ): + instructions = [] + instructions.append(self.add_node_instruction(node0, reg0)) + instructions.append(self.add_node_instruction(node1, reg1)) + instructions.extend(specific_instructions) + instructions.append(mips.StoreWordNode(reg2, self.get_var_location(dest))) + return instructions + + def print_instructions(self, node, n): + instructions = [] + instructions.append(mips.LoadInmediateNode(mips.V0, n)) + instructions.append( + mips.LoadWordNode(mips.A0, self.get_var_location(node.value)) + ) + instructions.append(mips.SyscallNode()) + return instructions diff --git a/src/compiler/visitors/cil2mips/mips_lib.asm 
b/src/compiler/visitors/cil2mips/mips_lib.asm new file mode 100644 index 000000000..608534319 --- /dev/null +++ b/src/compiler/visitors/cil2mips/mips_lib.asm @@ -0,0 +1,1152 @@ + + +header_size = 12 #in bytes +header_size_slot = 0 +header_next_slot = 4 +header_reachable_slot = 8 +alloc_size = 2048 +total_alloc_size = 2060 #alloc_size + header_size +neg_header_size = -12 #-header_size +free_list = 0 +used_list = header_size +state_size = 4 +stack_base = -4 +init_alloc_size = 28 #(header_size*2) + state_size +object_mark = -1 +meta_data_object_size = 4 #in words +object_expanded = -2 +reachable = 1 +new_line = 10 +str_size_treshold = 1024 +int_type = 0 +string_type = 0 +type_number = 0 + + + +##################################################################################################### +# Initialize memory manager # +# Args: # +# # +# Return: # +# # +# Summary: # +# The initial blocks for Free-List and Used-List are created. # +# The $gp is set to use as reference when initial blocks or values related to memory manager # +# state are needed. # +# A block of size alloc_size is created an added to Free-List # +##################################################################################################### +mem_manager_init: + + addiu $sp $sp -16 + sw $v0 0($sp) + sw $a0 4($sp) + sw $a1 8($sp) + sw $ra 12($sp) + + + li $v0 9 + li $a0 init_alloc_size + syscall #Creating free-list start point + move $gp $v0 + addiu $gp $gp state_size + + sw $zero header_size_slot($gp) #The free-list start with a block without space, just header, that will always be there. + sw $zero header_next_slot($gp) + sw $zero header_reachable_slot($gp) + + move $a0 $gp + li $a1 alloc_size + jal extend_heap + + addiu $a0 $a0 header_size + sw $zero header_size_slot($a0) #The used-list start with a block without space, just header, that will always be there. 
+ sw $zero header_next_slot($a0) + sw $zero header_reachable_slot($a0) + + + + lw $v0 0($sp) + lw $a0 4($sp) + lw $a1 8($sp) + lw $ra 12($sp) + addiu $sp $sp 16 + + sw $sp stack_base($gp) + + jr $ra + + +##################################################################################################### +# Free a block previously allocated # +# Args: # +# $a0 Address of the block to free # +# Return: # +# # +# Summary: # +# Remove the block from the used-list and add it to the free-list # +##################################################################################################### +free_block: + addiu $sp $sp -28 + sw $t0 0($sp) + sw $t1 4($sp) + sw $t2 8($sp) + sw $a0 12($sp) + sw $ra 16($sp) + sw $t3 20($sp) + sw $t4 24($sp) + + move $t0 $a0 + + addiu $t1 $gp free_list # Store in $t1 the initial block of the free-list + + addiu $t3 $gp used_list # Store in $t3 the initial block of the used-list + +free_block_loop_used_list: # Iterate through the used-list until the block is found + lw $t4 header_next_slot($t3) + beq $t4 $t0 free_block_loop_free_list + move $t3 $t4 + j free_block_loop_used_list + + +free_block_loop_free_list: # Iterate through the free-list to find the predecessor of the block in the free-list + lw $t2 header_next_slot($t1) + beq $t2 $zero free_block_founded_prev + bge $t2 $t0 free_block_founded_prev + move $t1 $t2 + j free_block_loop_free_list + +free_block_founded_prev: + # Remove the block from the used-list + lw $t4 header_next_slot($t0) + sw $t4 header_next_slot($t3) + + # Add the block to the free-list + sw $t2 header_next_slot($t0) + sw $t0 header_next_slot($t1) + +free_block_end: + + # Try to merge the new block with its successor, then its predecessor with the result + move $a0 $t0 + jal expand_block + move $a0 $t1 + jal expand_block + + lw $t0 0($sp) + lw $t1 4($sp) + lw $t2 8($sp) + lw $a0 12($sp) + lw $ra 16($sp) + lw $t3 20($sp) + lw $t4 24($sp) + addiu $sp $sp 28 + + jr $ra + + 
+##################################################################################################### +# Merge two contiguous blocks of the free-list # +# Args: # +# $a0 First of the two blocks to merge # +# Return: # +# # +# Summary: # +# Check if a block can be merged with its successor in the free list # +##################################################################################################### +expand_block: + addiu $sp $sp -16 + sw $t0 0($sp) + sw $t1 4($sp) + sw $t2 8($sp) + sw $t3 12($sp) + + + addiu $t0 $gp free_list # $t0 = the initial block of the free-list + + beq $t0 $a0 expand_block_end # The initial block can't be expanded; it always has size 0 + + move $t0 $a0 + + # Check if the block and its successor in the free list are contiguous in memory + lw $t1 header_next_slot($t0) + lw $t2 header_size_slot($t0) + move $t3 $t2 + addiu $t2 $t2 header_size + addu $t2 $t2 $t0 + beq $t2 $t1 expand_block_expand + j expand_block_end + +expand_block_expand: # Increment the size of the first block and update its next field + lw $t2 header_size_slot($t1) + addi $t2 $t2 header_size + add $t2 $t2 $t3 + sw $t2 header_size_slot($t0) + lw $t1 header_next_slot($t1) + sw $t1 header_next_slot($t0) + +expand_block_end: + lw $t0 0($sp) + lw $t1 4($sp) + lw $t2 8($sp) + lw $t3 12($sp) + addiu $sp $sp 16 + + jr $ra + + +##################################################################################################### +# Allocate more memory for the process and add it to the free-list # +# Args: # +# $a0 Last block of the free-list # +# $a1 Amount of memory to allocate # +# Return: # +# # +# Summary: # +# More memory is allocated and added to the free-list as a block.
 # +##################################################################################################### +extend_heap: + addiu $sp $sp -12 + sw $a0 0($sp) + sw $a1 4($sp) + sw $t0 8($sp) + + # Request header_size extra bytes so the new block has room for its header + li $v0 9 + addiu $a0 $a1 header_size + syscall + + # Set values of the block_header + move $t0 $a1 + sw $t0 header_size_slot($v0) + sw $zero header_next_slot($v0) + sw $zero header_reachable_slot($v0) + + # Add block to the end of the free-list + lw $t0, 0($sp) + sw $v0 header_next_slot($t0) + + move $a0 $t0 + lw $a1 4($sp) + lw $t0 8($sp) + addiu $sp $sp 12 + + jr $ra + + + +##################################################################################################### +# Split a block into two blocks, one of the requested size and the other with the rest. # +# Args: # +# $a0 Address of the block to split # +# $a1 Size requested for one block # +# Return: # +# # +# Summary: # +# The block is split into two blocks if its size allows it.
 # +##################################################################################################### +split_block: + addiu $sp $sp -16 + sw $t0 0($sp) + sw $t1 4($sp) + sw $a0 8($sp) + sw $a1 12($sp) + + # Check if the block is large enough to hold the requested size + lw $t0 header_size_slot($a0) + bgt $a1 $t0 split_block_error_small + + # Check whether the space left after the split can hold another block; if not, do not split + sub $t0 $t0 $a1 + li $t1 header_size + ble $t0 $t1 split_block_same_size + + # Compute the address of the second block + addu $t0 $a0 $a1 + addiu $t0 $t0 header_size + + # Update headers of the two blocks + lw $t1 header_next_slot($a0) + sw $t1 header_next_slot($t0) + sw $t0 header_next_slot($a0) + + lw $t1 header_size_slot($a0) # update sizes + sub $t1 $t1 $a1 + + addi $t1 $t1 neg_header_size + sw $t1 header_size_slot($t0) + sw $a1 header_size_slot($a0) + move $v0 $a0 + j split_block_end + +split_block_same_size: + move $v0 $a0 + j split_block_end + +split_block_error_small: + j split_block_end + +split_block_end: + lw $t0 0($sp) + lw $t1 4($sp) + lw $a0 8($sp) + lw $a1 12($sp) + addiu $sp $sp 16 + + jr $ra + + +##################################################################################################### +# Best Fit strategy is used to select the block # +# Args: # +# $a0 size to alloc # +# Return: # +# $v0 address of allocated block # +# Summary: # +# The current block is stored in $t0; its size is checked to know if it is a # +# valid block (a block is valid if its size is greater than or equal to the required size). # +# If the block is valid we compare it with the current best block and keep the smaller block.
 # +# If there is no block with the required size, a new block of size # +# max(total_alloc_size, size requested) is requested with sbrk and split if necessary # +##################################################################################################### +malloc: + move $v0 $zero + addiu $sp $sp -28 + sw $t1 0($sp) + sw $t0 4($sp) + sw $a0 8($sp) + sw $a1 12($sp) + sw $ra 16($sp) + sw $t2 20($sp) + sw $t3 24($sp) + + addiu $t0 $gp free_list + j malloc_loop + +malloc_end: + + move $a0 $v0 + lw $a1 8($sp) # a1 = requested block size + jal split_block + + lw $t1 header_next_slot($v0) + sw $t1 header_next_slot($t3) + + addiu $t1 $gp used_list + lw $a0 header_next_slot($t1) + + sw $a0 header_next_slot($v0) + sw $v0 header_next_slot($t1) + + addiu $v0 $v0 header_size + + lw $t3 24($sp) + lw $t2 20($sp) + lw $ra 16($sp) + lw $a1 12($sp) + lw $a0 8($sp) + lw $t0 4($sp) + lw $t1 0($sp) + addiu $sp $sp 28 + + jr $ra +####################################################################### +# t0 = current block address # +####################################################################### +malloc_loop: + move $t2 $t0 # save previous block in $t2 (this is useful when we need to alloc the new block) + lw $t0 header_next_slot($t0) # t0 = next block address + beq $t0 $zero malloc_search_end # if t0 == 0 we reached the end of the free-list + j malloc_check_valid_block + +####################################################################### +# $v0 = currently selected block address # +####################################################################### +malloc_search_end: + beq $v0 $zero malloc_alloc_new_block # if v0 == 0 a valid block was not found + j malloc_end + +####################################################################### +# t2 = last block of free list # +# a0 = requested block size # +####################################################################### +malloc_alloc_new_block: + li $t1 alloc_size # t1 = standard alloc size + move
$t3 $t2 + move $a1 $a0 # a1 = requested block size + move $a0 $t2 # a0 = last block of free list + bge $a1 $t1 malloc_big_block # if the requested size is at least the standard alloc size go to malloc_big_block + li $a1 alloc_size # a1 = standard alloc size + jal extend_heap + + j malloc_end + +###################################################################### +# a1 = requested block size # +###################################################################### +malloc_big_block: + #addiu $a1 $a1 header_size # Add header size to alloc size + jal extend_heap + j malloc_end + + + +######################################################################## +# t0 = current block address # +######################################################################## +malloc_check_valid_block: + lw $t1 header_size_slot($t0) # t1 = size of the current block + bge $t1 $a0 malloc_valid_block # the current block has the required size + j malloc_loop + +######################################################################## +# t0 = current block address # +# t1 = current block size # +# v0 = currently selected block address (0 if none has been selected) # +# v1 = currently selected block size # +######################################################################## +malloc_valid_block: + beq $v0 $zero malloc_first_valid_block # this is the first valid block + bge $t1 $v1 malloc_loop # the selected block is not larger than the current block + move $v0 $t0 # selected block address = current block address + move $v1 $t1 # selected block size = current block size + move $t3 $t2 + j malloc_loop + + +######################################################################## +# t0 = current block address # +# t1 = current block size # +# v0 = currently selected block address (0 if none has been selected) # +# v1 = currently selected block size # +######################################################################## +malloc_first_valid_block: + move $v0 $t0 # selected block address = current block address + move
$v1 $t1 # selected block size = current block size + move $t3 $t2 + j malloc_loop + + +#TODO Look for objects in registers +##################################################################################################### +# Remove from used-list the blocks that are not reachable; the root objects are in the stack and # +# registers # +# Args: # +# # +# Return: # +# # +# Summary: # +# First the objects in stack and registers are marked as reachable; after that the objects # +# that are reachable from them are marked as reachable too using a DFS algorithm. When all # +# reachable objects are marked the used-list is scanned and all the objects that are not # +# marked as reachable are released. # +##################################################################################################### + +gc_collect: + addiu $sp $sp -24 + sw $t0 0($sp) + sw $t1 4($sp) + sw $t2 8($sp) + sw $t3 12($sp) + sw $a0 16($sp) + sw $ra 20($sp) + + li $t3 reachable # $t3 = reachable value + addiu $t0 $sp 20 # $t0 = the start of the stack, not counting this function's frame + lw $t1 stack_base($gp) # $t1 = the end of the stack + + li $t2 1 +# Go through the stack searching for objects +gc_collect_loop: + addiu $t0 $t0 4 + beq $t0 $t1 gc_collect_dfs # If the end of the stack was reached finish this loop + + lw $a0 0($t0) + jal check_if_is_object + + bne $v0 $t2 gc_collect_loop + + addiu $a0 $a0 neg_header_size + sw $t3 header_reachable_slot($a0) + + j gc_collect_loop + +gc_collect_dfs: + addiu $t1 $gp used_list + +# Go through the used-list and try to expand any reachable block +gc_collect_outer_loop: + lw $t1 header_next_slot($t1) + beq $t1 $zero gc_collect_free + lw $t2 header_reachable_slot($t1) + beq $t2 reachable gc_collect_expand + j gc_collect_outer_loop + +gc_collect_expand: + addiu $a0 $t1 header_size # expand an object, not a block + jal gc_collect_recursive_expand + j gc_collect_outer_loop + +gc_collect_free: + addiu $t0 $gp used_list + lw $t0 header_next_slot($t0) + +# Go
through the used-list and free any unreachable object and set the reachable and expanded field to their default values +gc_collect_free_loop: + beq $t0 $zero gc_collect_end + lw $t1 header_reachable_slot($t0) + bne $t1 reachable gc_collect_free_loop_free + sw $zero header_reachable_slot($t0) + move $a0 $t0 + jal check_if_is_object + beq $v0 $zero gc_collect_free_loop + li $t1 object_mark + addiu $t2 $t0 header_size + lw $t3 4($t2) + sll $t3 $t3 2 + addu $t2 $t2 $t3 + sw $t1 -4($t2) + lw $t0 header_next_slot($t0) + j gc_collect_free_loop + +gc_collect_free_loop_free: + move $a0 $t0 + lw $t0 header_next_slot($t0) + jal free_block + j gc_collect_free_loop + + +gc_collect_end: + lw $t0 0($sp) + lw $t1 4($sp) + lw $t2 8($sp) + lw $t3 12($sp) + lw $a0 16($sp) + lw $ra 20($sp) + addiu $sp $sp 24 + + jr $ra + + + + +##################################################################################################### +# Mark the objects that are reachable from the attrs of one object in a recursive way. # +# Args: # +# $a0: Object to expand # +# Return: # +# # +# Summary: # +# The actual object is marked as reachable and expanded to avoid infinite cycles, and this # +# routine is called recursively to expand the objects in the attrs of the actual object. 
 # +##################################################################################################### +gc_collect_recursive_expand: + addiu $sp $sp -16 + sw $a0 0($sp) + sw $t0 4($sp) + sw $t1 8($sp) + sw $ra 12($sp) + + jal check_if_is_object # If it is not an object it cannot be expanded + beq $v0 $zero gc_collect_recursive_expand_end + + lw $t0 4($a0) + sll $t0 $t0 2 + addiu $t0 $t0 -4 + addu $t0 $a0 $t0 + lw $t1 0($t0) # Check if the object was already expanded to avoid infinite cycles + beq $t1 object_expanded gc_collect_recursive_expand_end + + # Mark the block that contains the object as reachable + li $t1 reachable + addiu $a0 $a0 neg_header_size + sw $t1 header_reachable_slot($a0) + addiu $a0 $a0 header_size + + # Mark the object as expanded + li $t1 object_expanded + sw $t1 0($t0) + + lw $t0 0($a0) # $t0 = type of the object + + # int and string types are special cases + la $t1 int_type + lw $t1 0($t1) + beq $t0 $t1 gc_collect_recursive_expand_end + + la $t1 string_type + lw $t1 0($t1) + beq $t0 $t1 gc_collect_recursive_expand_string_object + + lw $t0 4($a0) + li $t1 meta_data_object_size + sub $t0 $t0 $t1 + + addiu $t1 $a0 12 + +# call this routine on every attr of the object +gc_collect_recursive_expand_attr_loop: + beq $t0 $zero gc_collect_recursive_expand_end + lw $a0 0($t1) + jal gc_collect_recursive_expand + addiu $t1 $t1 4 + sub $t0 $t0 1 + j gc_collect_recursive_expand_attr_loop + +# the value field of a string object is not an object but a +# reference to the block where the string is saved, so that block +# needs to be marked as reachable +gc_collect_recursive_expand_string_object: + lw $t0 8($a0) + addiu $t0 $t0 neg_header_size + li $t1 reachable + sw $t1 header_reachable_slot($t0) + + +gc_collect_recursive_expand_end: + lw $a0 0($sp) + lw $t0 4($sp) + lw $t1 8($sp) + lw $ra 12($sp) + addiu $sp $sp 16 + + jr $ra + + + + + + + + +# $a0 address from +# $a1 address to +# $a2 size +copy: + addiu $sp $sp -16 + sw $a0 0($sp) + sw $a1 4($sp) + sw $a2
8($sp) + sw $t0 12($sp) + +copy_loop: + beq $a2 $zero copy_end + lw $t0 0($a0) + sw $t0 0($a1) + addiu $a0 $a0 4 + addiu $a1 $a1 4 + addi $a2 $a2 -4 + j copy_loop + +copy_end: + lw $a0 0($sp) + lw $a1 4($sp) + lw $a2 8($sp) + lw $t0 12($sp) + addiu $sp $sp 16 + + jr $ra + + +##################################################################################################### +# Check if a value is a reference to an object # +# Args: # +# $a0: Value to check # +# Return: # +# $v0: 1 if is a reference to an object else 0 # +# Summary: # +# Check if a value is a valid heap address and if it is check if in that address there are # +# values that match with the object schema # +##################################################################################################### +check_if_is_object: + addiu $sp $sp -20 + sw $t0 0($sp) + sw $t1 4($sp) + sw $t2 8($sp) + sw $t3 12($sp) + sw $a0 16($sp) + + move $t0 $a0 + + li $v0 9 + move $a0 $zero + syscall + + addiu $t1 $v0 -4 # Last word of heap + + # Check that the first word is a type object + blt $t0 $gp check_if_is_object_not_object + bgt $t0 $t1 check_if_is_object_not_object + lw $t2 0($t0) + blt $t2 $zero check_if_is_object_not_object + la $t3 type_number + lw $t3 0($t3) + bge $t2 $t3 check_if_is_object_not_object + + addiu $t0 $t0 4 + blt $t0 $gp check_if_is_object_not_object + bgt $t0 $t1 check_if_is_object_not_object + lw $t2 0($t0) #Store size in $t2 + + addiu $t0 $t0 8 + + + li $t3 meta_data_object_size + sub $t2 $t2 $t3 + sll $t2 $t2 2 + addu $t0 $t0 $t2 + + # Check if the last word of the object is an object mark + blt $t0 $gp check_if_is_object_not_object + bgt $t0 $t1 check_if_is_object_not_object + lw $t2 0($t0) + beq $t2 object_mark check_if_is_object_is_object + beq $t2 object_expanded check_if_is_object_is_object + +check_if_is_object_not_object: + li $v0 0 + j check_if_is_object_end + + +check_if_is_object_is_object: + li $v0 1 + + +check_if_is_object_end: + lw $t0 0($sp) + lw $t1 4($sp) + lw $t2 
8($sp) + lw $t3 12($sp) + lw $a0 16($sp) + addiu $sp $sp 20 + + jr $ra + + +equals: + beq $a0 $a1 equals_equal + li $v0 0 + j equals_end + +equals_equal: + li $v0 1 + +equals_end: + jr $ra + + + +less_equal: + ble $a0 $a1 less_equal_true + li $v0 0 + j less_equal_end + +less_equal_true: + li $v0 1 + +less_equal_end: + jr $ra + + +less: + blt $a0 $a1 less_true + li $v0 0 + j less_end + +less_true: + li $v0 1 + +less_end: + jr $ra + + +len: + addiu $sp $sp -8 + sw $t0 0($sp) + sw $t1 4($sp) + + move $t0 $a0 + move $v0 $zero + +len_loop: + lb $t1 0($t0) + beq $t1 $zero len_end + addi $v0 $v0 1 + addiu $t0 $t0 1 + j len_loop + +len_end: + lw $t0 0($sp) + lw $t1 4($sp) + addiu $sp $sp 8 + + jr $ra + + +use_block: + addiu $sp $sp -12 + sw $t0 0($sp) + sw $t1 4($sp) + sw $t2 8($sp) + + addiu $t0 $gp free_list + +use_block_loop: + move $t1 $t0 + lw $t0 header_next_slot($t0) + beq $t0 $zero use_block_end + beq $t0 $a0 use_block_founded + j use_block_loop + +use_block_founded: + lw $t2 header_next_slot($t0) + sw $t2 header_next_slot($t1) + + addiu $t1 $gp used_list + lw $t2 header_next_slot($t1) + sw $t0 header_next_slot($t1) + sw $t2 header_next_slot($t0) + +use_block_end: + lw $t0 0($sp) + lw $t1 4($sp) + lw $t2 8($sp) + addiu $sp $sp 12 + + jr $ra + + + + +read_str: + addiu $sp $sp -36 + sw $t0 0($sp) + sw $t1 4($sp) + sw $t2 8($sp) + sw $t3 12($sp) + sw $t4 16($sp) + sw $t5 20($sp) + sw $a0 24($sp) + sw $a1 28($sp) + sw $ra 32($sp) + + addiu $t0 $gp free_list + move $t1 $zero + move $t2 $t0 + +read_str_larger_block_loop: + lw $t0 header_next_slot($t0) + beq $t0 $zero read_str_reading + lw $t3 header_size_slot($t0) + bge $t1 $t3 read_str_larger_block_loop + move $t1 $t3 + move $t2 $t0 + j read_str_larger_block_loop + +read_str_reading: + beq $t1 $zero read_str_new_block + move $a1 $t1 + li $v0 8 + addiu $a0 $t2 header_size + syscall + move $t0 $a0 + move $t1 $zero + +read_str_look_nl: + lb $t2 0($t0) + beq $t2 new_line read_str_nl_founded + beq $t2 $zero 
read_str_zero_founded # read_str_no_nl + addi $t1 $t1 1 + addi $t0 $t0 1 + j read_str_look_nl + +read_str_zero_founded: + blt $t1 $t3 read_str_nl_founded + j read_str_no_nl + +read_str_nl_founded: + sb $zero 0($t0) + addi $t1 $t1 1 + li $t2 4 + div $t1 $t2 + mfhi $t3 + beq $t3 $zero read_str_nl_founded_alligned + sub $t2 $t2 $t3 + add $t1 $t1 $t2 +read_str_nl_founded_alligned: + move $a1 $t1 + addiu $a0 $a0 neg_header_size + jal split_block + jal use_block + + addiu $v0 $a0 header_size + j read_str_end + + +read_str_no_nl: + addi $t1 $t1 1 + blt $t1 str_size_treshold read_str_dup + addi $t1 $t1 alloc_size + j read_str_extend_heap +read_str_dup: + sll $t1 $t1 1 +read_str_extend_heap: + move $a1 $t1 + move $t0 $a0 + addiu $a0 $gp free_list + +read_str_last_block_loop: + lw $t1 header_next_slot($a0) + beq $t1 $zero read_str_last_block_founded + lw $a0 header_next_slot($a0) + j read_str_last_block_loop + +read_str_last_block_founded: + jal extend_heap + 
jal expand_block + lw $t2 header_next_slot($a0) + beq $t2 $zero read_str_new_block_expanded + lw $t1 header_size_slot($t2) + j read_str_reading + +read_str_new_block_expanded: + move $t2 $a0 + lw $t1 header_size_slot($a0) + j read_str_reading + + + +concat: + addiu $sp $sp -24 + sw $t0 0($sp) + sw $t1 4($sp) + sw $t2 8($sp) + sw $a0 12($sp) + sw $a1 16($sp) + sw $ra 20($sp) + + move $t0 $a0 + move $t1 $a1 + + + addiu $a0 $a2 1 + li $t2 4 + div $a0 $t2 + mfhi $a0 + bne $a0 $zero concat_allign_size + addiu $a0 $a2 1 + +concat_size_alligned: + jal malloc + move $t2 $v0 + j concat_copy_first_loop + +concat_allign_size: + sub $t2 $t2 $a0 + add $a0 $a2 $t2 + addiu $a0 $a0 1 + j concat_size_alligned + +concat_copy_first_loop: + lb $a0 0($t0) + beq $a0 $zero concat_copy_second_loop + sb $a0 0($t2) + addiu $t0 $t0 1 + addiu $t2 $t2 1 + j concat_copy_first_loop + +concat_copy_second_loop: + lb $a0 0($t1) + beq $a0 $zero concat_end + sb $a0 0($t2) + addiu $t1 $t1 1 + addiu $t2 $t2 1 + j concat_copy_second_loop + +concat_end: + sb $zero 0($t2) + lw $t0 0($sp) + lw $t1 4($sp) + lw $t2 8($sp) + lw $a0 12($sp) + lw $a1 16($sp) + lw $ra 20($sp) + addiu $sp $sp 24 + + jr $ra + + +substr: + addiu $sp $sp -24 + sw $t0 0($sp) + sw $t1 4($sp) + sw $t2 8($sp) + sw $t3 12($sp) + sw $a0 16($sp) + sw $ra 20($sp) + + move $t0 $a0 + li $t1 4 + addiu $t3 $a2 1 + div $t3 $t1 + + mfhi $t2 + bne $t2 $zero substr_allign_size + move $t1 $t3 + j substr_new_block + +substr_allign_size: + sub $t1 $t1 $t2 + add $t1 $t1 $t3 + +substr_new_block: + move $a0 $t1 + jal malloc + move $t3 $v0 + move $t1 $zero + addu $t0 $t0 $a1 + +substr_copy_loop: + beq $t1 $a2 substr_end + lb $t2 0($t0) + sb $t2 0($t3) + addiu $t0 $t0 1 + addiu $t3 $t3 1 + addiu $t1 $t1 1 + j substr_copy_loop + +substr_end: + sb $zero 0($t3) + lw $t0 0($sp) + lw $t1 4($sp) + lw $t2 8($sp) + lw $t3 12($sp) + lw $a0 16($sp) + lw $ra 20($sp) + addiu $sp $sp 24 + + jr $ra + + +equal_str: + addiu $sp $sp -16 + sw $t0 0($sp) + sw $t1 4($sp) + sw 
$t2 8($sp) + sw $t3 12($sp) + + move $t0 $a0 + move $t1 $a1 + +equal_str_loop: + lb $t2 0($t0) + lb $t3 0($t1) + bne $t2 $t3 equal_str_not_equal + beq $t2 $zero equal_str_equal + + addiu $t0 $t0 1 + addiu $t1 $t1 1 + j equal_str_loop + +equal_str_not_equal: + move $v0 $zero + j equal_str_end + +equal_str_equal: + li $v0 1 + +equal_str_end: + lw $t0 0($sp) + lw $t1 4($sp) + lw $t2 8($sp) + lw $t3 12($sp) + addiu $sp $sp 16 + + jr $ra + + + + + + + + + + + + + + + + + + + + + + + + + + + diff --git a/src/compiler/visitors/cil2mips/mips_printer.py b/src/compiler/visitors/cil2mips/mips_printer.py new file mode 100644 index 000000000..53f594e54 --- /dev/null +++ b/src/compiler/visitors/cil2mips/mips_printer.py @@ -0,0 +1,150 @@ +from compiler.cmp.mips_ast import * +from compiler.visitors import visitor + + +class MIPSPrintVisitor: + @visitor.on("node") + def visit(self, node): + pass + + @visitor.when(Register) + def visit(self, node): + return f"${node.name}" + + @visitor.when(int) + def visit(self, node): + return str(node) + + @visitor.when(str) + def visit(self, node): + return node + + @visitor.when(ProgramNode) + def visit(self, node): + data_section_header = "\t.data" + static_strings = "\n".join([self.visit(d) for d in node.data]) + + names_table = f"{TYPE_LIST}:\n\t .word " + ", ".join( + [f"{tp.data_label}" for tp in node.types] + ) + virtual_table = f"{VIRTUAL_TABLE}:\n\t .word " + ", ".join( + [f"{tp.type_label}_dispatch" for tp in node.types] + ) + + types = "\n\n".join([self.visit(tp) for tp in node.types]) + + code = "\n".join([self.visit(func) for func in node.text]) + return f"{data_section_header}\n{static_strings}\n\n{names_table}\n\n{virtual_table}\n\n{types}\n\n.text\n\t.globl main\n{code}" + + @visitor.when(StringConst) + def visit(self, node): + return f'{node.label}: .asciiz "{node.string}"' + + @visitor.when(TypeNode) + def visit(self, node: TypeNode): + methods = ", ".join([f"{node.methods[m]}" for m in node.methods]) + dispatch_table = 
f"{node.type_label}_dispatch:\n\t .word {methods}" + return f"{dispatch_table}" + + @visitor.when(SyscallNode) + def visit(self, node): + return "syscall" + + @visitor.when(LabelRelativeLocation) + def visit(self, node): + return f"{node.label} + {node.offset}" + + @visitor.when(RegisterRelativeLocation) + def visit(self, node): + return f"{node.offset}({self.visit(node.register)})" + + @visitor.when(FunctionNode) + def visit(self, node): + instr = [self.visit(instruction) for instruction in node.instructions] + instr2 = [inst for inst in instr if isinstance(inst, str)] + instructions = "\n\t".join(instr2) + return f"{node.label}:\n\t{instructions}" + + @visitor.when(AddInmediateNode) + def visit(self, node): + return f"addi {self.visit(node.dest)}, {self.visit(node.src)}, {self.visit(node.constant_number)}" + + @visitor.when(StoreWordNode) + def visit(self, node): + return f"sw {self.visit(node.reg)}, {self.visit(node.addr)}" + + @visitor.when(LoadInmediateNode) + def visit(self, node): + return f"li {self.visit(node.reg)}, {self.visit(node.value)}" + + @visitor.when(JumpAndLinkNode) + def visit(self, node): + return f"jal {node.label}" + + @visitor.when(JumpRegister) + def visit(self, node): + return f"jr {self.visit(node.reg)}" + + @visitor.when(JumpRegisterAndLinkNode) + def visit(self, node): + return f"jalr {self.visit(node.reg)}" + + @visitor.when(LoadWordNode) + def visit(self, node): + return f"lw {self.visit(node.reg)}, {self.visit(node.addr)}" + + @visitor.when(LoadAddressNode) + def visit(self, node): + return f"la {self.visit(node.reg)}, {self.visit(node.label)}" + + @visitor.when(MoveNode) + def visit(self, node): + return f"move {self.visit(node.reg1)} {self.visit(node.reg2)}" + + @visitor.when(ShiftLeftLogicalNode) + def visit(self, node): + return f"sll {self.visit(node.dest)} {self.visit(node.src)} {node.bits}" + + @visitor.when(AddInmediateUnsignedNode) + def visit(self, node): + return f"addiu {self.visit(node.dest)} {self.visit(node.src)} 
{self.visit(node.value)}" + + @visitor.when(AddUnsignedNode) + def visit(self, node): + return f"addu {self.visit(node.dest)} {self.visit(node.sum1)} {self.visit(node.sum2)}" + + @visitor.when(LabelNode) + def visit(self, node): + return f"{node.label}:" + + @visitor.when(BranchOnNotEqualNode) + def visit(self, node): + return f"bne {self.visit(node.reg1)} {self.visit(node.reg2)} {node.label}" + + @visitor.when(JumpNode) + def visit(self, node): + return f"j {node.label}" + + @visitor.when(AddNode) + def visit(self, node): + return f"add {self.visit(node.reg1)} {self.visit(node.reg2)} {self.visit(node.reg3)}" + + @visitor.when(SubNode) + def visit(self, node): + return f"sub {self.visit(node.reg1)} {self.visit(node.reg2)} {self.visit(node.reg3)}" + + @visitor.when(MultiplyNode) + def visit(self, node): + return f"mul {self.visit(node.reg1)} {self.visit(node.reg2)} {self.visit(node.reg3)}" + + @visitor.when(DivideNode) + def visit(self, node): + return f"div {self.visit(node.reg1)} {self.visit(node.reg2)}" + + @visitor.when(ComplementNode) + def visit(self, node): + return f"not {self.visit(node.reg1)} {self.visit(node.reg2)}" + + @visitor.when(MoveFromLowNode) + def visit(self, node): + return f"mflo {self.visit(node.reg)}" diff --git a/src/compiler/visitors/cil2mips/utils.py b/src/compiler/visitors/cil2mips/utils.py new file mode 100644 index 000000000..6785766ad --- /dev/null +++ b/src/compiler/visitors/cil2mips/utils.py @@ -0,0 +1,6 @@ +def flatten(iterable): + for item in iterable: + try: + yield from flatten(item) + except TypeError: + yield item diff --git a/src/compiler/visitors/cool2cil/__init__.py b/src/compiler/visitors/cool2cil/__init__.py new file mode 100644 index 000000000..e69de29bb diff --git a/src/compiler/visitors/cool2cil/cil_formatter.py b/src/compiler/visitors/cool2cil/cil_formatter.py new file mode 100644 index 000000000..a9e2ce4dc --- /dev/null +++ b/src/compiler/visitors/cool2cil/cil_formatter.py @@ -0,0 +1,187 @@ +from ...cmp.cil_ast import 
* +import compiler.visitors.visitor as visitor + + +class PrintCILVisitor(object): + @visitor.on("node") + def visit(self, node): + pass + + @visitor.when(ProgramNode) + def visit(self, node): + dottypes = "\n".join(self.visit(t) for t in node.dottypes) + dotdata = "\n".join(self.visit(t) for t in node.dotdata) + dotcode = "\n".join(self.visit(t) for t in node.dotcode) + + return f".TYPES\n{dottypes}\n\n.DATA\n{dotdata}\n\n.CODE\n{dotcode}" + + @visitor.when(TypeNode) + def visit(self, node): + attributes = "\n\t".join(f"attribute {x}" for x in node.attributes) + methods = "\n\t".join(f"method {x}: {y}" for x, y in node.methods) + + return f"type {node.name} {{\n\t{attributes}\n\n\t{methods}\n}}" + + @visitor.when(CopyNode) + def visit(self, node): + return f"{node.dest} = COPY {node.source}" + + @visitor.when(DataNode) + def visit(self, node): + return f"{node.name} = {node.value}" + + @visitor.when(FunctionNode) + def visit(self, node): + params = "\n\t".join(self.visit(x) for x in node.params) + localvars = "\n\t".join(self.visit(x) for x in node.localvars) + instructions = "\n\t".join(self.visit(x) for x in node.instructions) + + return f"function {node.name} {{\n\t{params}\n\n\t{localvars}\n\n\t{instructions}\n}}" + + @visitor.when(ParamNode) + def visit(self, node): + return f"PARAM {node.name}" + + @visitor.when(LoadNode) + def visit(self, node): + return f"{node.dest} = Load {node.msg}" + + @visitor.when(LocalNode) + def visit(self, node): + return f"LOCAL {node.name}" + + @visitor.when(AssignNode) + def visit(self, node): + return f"{node.dest} = {node.source}" + + @visitor.when(PlusNode) + def visit(self, node): + return f"{node.dest} = {node.left} + {node.right}" + + @visitor.when(MinusNode) + def visit(self, node): + return f"{node.dest} = {node.left} - {node.right}" + + @visitor.when(StarNode) + def visit(self, node): + return f"{node.dest} = {node.left} * {node.right}" + + @visitor.when(DivNode) + def visit(self, node): + return f"{node.dest} = 
{node.left} / {node.right}" + + @visitor.when(LeqNode) + def visit(self, node): + return f"{node.dest} = {node.left} <= {node.right}" + + @visitor.when(LessNode) + def visit(self, node): + return f"{node.dest} = {node.left} < {node.right}" + + @visitor.when(EqualNode) + def visit(self, node): + return f"{node.dest} = {node.left} == {node.right}" + + @visitor.when(ComplementNode) + def visit(self, node): + return f"{node.dest} = COMPL {node.obj}" + + @visitor.when(VoidNode) + def visit(self, node): + return "VOID" + + @visitor.when(GetAttribNode) + def visit(self, node): + return f"{node.dest} = GETATTR {node.obj} {node.attr}" + + @visitor.when(SetAttribNode) + def visit(self, node): + return f"SETATTR {node.obj} {node.attr} {node.value}" + + @visitor.when(AllocateNode) + def visit(self, node): + return f"{node.dest} = ALLOCATE {node.type}" + + @visitor.when(TypeOfNode) + def visit(self, node): + return f"{node.dest} = TYPEOF {node.obj}" + + @visitor.when(LabelNode) + def visit(self, node): + return f"LABEL {node.label}" + + @visitor.when(GotoNode) + def visit(self, node): + return f"GOTO {node.label}" + + @visitor.when(GotoIfNode) + def visit(self, node): + return f"IF {node.condition} GOTO {node.label}" + + @visitor.when(StaticCallNode) + def visit(self, node): + return f"{node.dest} = CALL {node.function}" + + @visitor.when(DynamicCallNode) + def visit(self, node): + return f"{node.dest} = VCALL {node.type} {node.method}" + + @visitor.when(ArgNode) + def visit(self, node): + return f"ARG {node.name}" + + @visitor.when(ReturnNode) + def visit(self, node): + return f'RETURN {node.value if node.value is not None else ""}' + + @visitor.when(LengthNode) + def visit(self, node): + return f"{node.dest} = LENGTH {node.source}" + + @visitor.when(ConcatNode) + def visit(self, node): + return f"{node.dest} = CONCAT {node.prefix} {node.suffix}" + + @visitor.when(SubstringNode) + def visit(self, node): + return f"{node.dest} = SUBSTRING {node.index} {node.length}" + + 
@visitor.when(ExitNode) + def visit(self, node): + return "EXIT" + + @visitor.when(LoadNode) + def visit(self, node): + return f"{node.dest} = Load {node.msg}" + + @visitor.when(NameNode) + def visit(self, node): + return f"{node.dest} = NAME {node.name}" + + @visitor.when(TypeNameNode) + def visit(self, node): + return f"{node.dest} = TYPENAME {node.source}" + + @visitor.when(ReadStrNode) + def visit(self, node): + return f"{node.dest} = READSTR" + + @visitor.when(ReadIntNode) + def visit(self, node): + return f"{node.dest} = READINT" + + @visitor.when(PrintStrNode) + def visit(self, node): + return f"PRINT {node.value}" + + @visitor.when(PrintIntNode) + def visit(self, node): + return f"PRINT {node.value}" + + @visitor.when(ErrorNode) + def visit(self, node): + return f"ERROR {node.data_node}" + + @visitor.when(EqualStrNode) + def visit(self, node): + return f"{node.dest} = {node.left} == {node.right}" diff --git a/src/compiler/visitors/cool2cil/cool2cil.py b/src/compiler/visitors/cool2cil/cool2cil.py new file mode 100644 index 000000000..9d2f5fd27 --- /dev/null +++ b/src/compiler/visitors/cool2cil/cool2cil.py @@ -0,0 +1,1075 @@ +from ...cmp import cil_ast as cil +from ...cmp.ast import ( + AssignNode, + AttrDeclarationNode, + BlockNode, + CallNode, + CaseBranchNode, + CaseNode, + ClassDeclarationNode, + ConditionalNode, + ConstantBoolNode, + ConstantNumNode, + ConstantStringNode, + DivNode, + EqualNode, + FuncDeclarationNode, + InstantiateNode, + LeqNode, + LessNode, + LetNode, + LetVarNode, + LoopNode, + MinusNode, + NegNode, + NotNode, + PlusNode, + ProgramNode, + StarNode, + VariableNode, + VoidNode, +) +from ...cmp.semantic import ( + Context, + Method, + Scope, + SemanticError, + SelfType, + Type, + VariableInfo, +) +from typing import List, Optional +import compiler.visitors.visitor as visitor + + +class BaseCOOLToCILVisitor: + def __init__(self, context: Context): + self.dottypes: List[cil.TypeNode] = [] + self.dotdata: List[cil.DataNode] = [] + 
self.dotcode: List[cil.FunctionNode] = [] + self.current_type: Optional[Type] = None + self.current_method: Optional[Method] = None + self.current_function: Optional[cil.FunctionNode] = None + self.context = context + self.vself = VariableInfo("self", None) + self.value_types = ["String", "Int", "Bool"] + + @property + def params(self): + return self.current_function.params + + @property + def localvars(self): + return self.current_function.localvars + + @property + def ids(self): + return self.current_function.ids + + @property + def instructions(self): + return self.current_function.instructions + + def register_local(self, vinfo, id=False): + new_vinfo = VariableInfo("", None) + if ( + len(self.current_function.name) >= 8 + and self.current_function.name[:8] == "function" + ): + new_vinfo.name = f"local_{self.current_function.name[9:]}_{vinfo.name}_{len(self.localvars)}" + else: + new_vinfo.name = f"local_{self.current_function.name[5:]}_{vinfo.name}_{len(self.localvars)}" + + local_node = cil.LocalNode(new_vinfo.name) + if id: + self.ids[vinfo.name] = new_vinfo.name + self.localvars.append(local_node) + return new_vinfo.name + + def define_internal_local(self): + vinfo = VariableInfo("internal", None) + return self.register_local(vinfo) + + def register_instruction(self, instruction): + self.instructions.append(instruction) + return instruction + + def to_function_name(self, method_name, type_name): + return f"function_{method_name}_at_{type_name}" + + def register_function(self, function_name): + function_node = cil.FunctionNode(function_name, [], [], []) + self.dotcode.append(function_node) + return function_node + + def register_param(self, vinfo): + param_node = cil.ParamNode(vinfo.name) + self.params.append(param_node) + return vinfo.name + + def register_type(self, name): + type_node = cil.TypeNode(name) + self.dottypes.append(type_node) + return type_node + + def register_data(self, value): + vname = f"data_{len(self.dotdata)}" + data_node = 
cil.DataNode(vname, value) + self.dotdata.append(data_node) + return data_node + + def register_label(self, label): + lname = f"{label}_{self.current_function.labels_count}" + self.current_function.labels_count += 1 + return cil.LabelNode(lname) + + def register_runtime_error(self, condition, msg): + error_node = self.register_label("error_label") + continue_node = self.register_label("continue_label") + self.register_instruction(cil.GotoIfNode(condition, error_node.label)) + self.register_instruction(cil.GotoNode(continue_node.label)) + self.register_instruction(error_node) + data_node = self.register_data(msg) + self.register_instruction(cil.ErrorNode(data_node)) + + self.register_instruction(continue_node) + + def init_name(self, type_name, attr=False): + if attr: + return f"init_attr_at_{type_name}" + return f"init_at_{type_name}" + + def buildHierarchy(self, t: str): + if t == "Object": + return None + return { + x.name + for x in self.context.types.values() + if x.name != "AUTO_TYPE" and x.conforms_to(self.context.get_type(t)) + } + + def register_built_in(self): + # Object + type_node = self.register_type("Object") + + # init Object + self.current_function = self.register_function(self.init_name("Object")) + instance = self.define_internal_local() + self.register_instruction(cil.AllocateNode("Object", instance)) + self.register_instruction(cil.ReturnNode(instance)) + + # abort Object + self.current_function = self.register_function( + self.to_function_name("abort", "Object") + ) + self.register_param(self.vself) + vname = self.define_internal_local() + data_node = [ + dn for dn in self.dotdata if dn.value == "Abort called from class " + ][0] + self.register_instruction(cil.LoadNode(vname, data_node)) + self.register_instruction(cil.PrintStrNode(vname)) + self.register_instruction(cil.TypeNameNode(vname, self.vself.name)) + self.register_instruction(cil.PrintStrNode(vname)) + data_node = self.register_data("\n") + self.register_instruction(cil.LoadNode(vname, 
data_node)) + self.register_instruction(cil.PrintStrNode(vname)) + self.register_instruction(cil.ExitNode()) + + # type_name Object + self.current_function = self.register_function( + self.to_function_name("type_name", "Object") + ) + self.register_param(self.vself) + result = self.define_internal_local() + self.register_instruction(cil.TypeNameNode(result, self.vself.name)) + instance = self.define_internal_local() + self.register_instruction(cil.ArgNode(result)) + self.register_instruction( + cil.StaticCallNode(self.init_name("String"), instance) + ) + self.register_instruction(cil.ReturnNode(instance)) + + # copy Object + self.current_function = self.register_function( + self.to_function_name("copy", "Object") + ) + self.register_param(self.vself) + result = self.define_internal_local() + self.register_instruction(cil.CopyNode(result, self.vself.name)) + self.register_instruction(cil.ReturnNode(result)) + + # Object + type_node.methods = [ + (name, self.to_function_name(name, "Object")) + for name in ["abort", "type_name", "copy"] + ] + type_node.methods += [("init", self.init_name("Object"))] + obj_methods = ["abort", "type_name", "copy"] + + ############################################## + + # IO + type_node = self.register_type("IO") + + # init IO + self.current_function = self.register_function(self.init_name("IO")) + instance = self.define_internal_local() + self.register_instruction(cil.AllocateNode("IO", instance)) + self.register_instruction(cil.ReturnNode(instance)) + + # out_string IO + self.current_function = self.register_function( + self.to_function_name("out_string", "IO") + ) + self.register_param(self.vself) + self.register_param(VariableInfo("x", None)) + vname = self.define_internal_local() + self.register_instruction(cil.GetAttribNode(vname, "x", "value", "String")) + self.register_instruction(cil.PrintStrNode(vname)) + self.register_instruction(cil.ReturnNode(self.vself.name)) + + # out_int IO + self.current_function = self.register_function( 
+ self.to_function_name("out_int", "IO") + ) + self.register_param(self.vself) + self.register_param(VariableInfo("x", None)) + vname = self.define_internal_local() + self.register_instruction(cil.GetAttribNode(vname, "x", "value", "Int")) + self.register_instruction(cil.PrintIntNode(vname)) + self.register_instruction(cil.ReturnNode(self.vself.name)) + + # in_string IO + self.current_function = self.register_function( + self.to_function_name("in_string", "IO") + ) + self.register_param(self.vself) + result = self.define_internal_local() + self.register_instruction(cil.ReadStrNode(result)) + instance = self.define_internal_local() + self.register_instruction(cil.ArgNode(result)) + self.register_instruction( + cil.StaticCallNode(self.init_name("String"), instance) + ) + self.register_instruction(cil.ReturnNode(instance)) + + # in_int IO + self.current_function = self.register_function( + self.to_function_name("in_int", "IO") + ) + self.register_param(self.vself) + result = self.define_internal_local() + self.register_instruction(cil.ReadIntNode(result)) + instance = self.define_internal_local() + self.register_instruction(cil.ArgNode(result)) + self.register_instruction(cil.StaticCallNode(self.init_name("Int"), instance)) + self.register_instruction(cil.ReturnNode(instance)) + + # IO + type_node.methods = [ + (method, self.to_function_name(method, "Object")) for method in obj_methods + ] + type_node.methods += [ + (name, self.to_function_name(name, "IO")) + for name in ["out_string", "out_int", "in_string", "in_int"] + ] + type_node.methods += [("init", self.init_name("IO"))] + + ############################################## + + # String + type_node = self.register_type("String") + type_node.attributes = ["value", "length"] + + # init String + self.current_function = self.register_function(self.init_name("String")) + self.register_param(VariableInfo("val", None)) + instance = self.define_internal_local() + self.register_instruction(cil.AllocateNode("String", 
instance)) + self.register_instruction(cil.SetAttribNode(instance, "value", "val", "String")) + result = self.define_internal_local() + self.register_instruction(cil.LengthNode(result, "val")) + attr = self.define_internal_local() + self.register_instruction(cil.ArgNode(result)) + self.register_instruction(cil.StaticCallNode(self.init_name("Int"), attr)) + self.register_instruction(cil.SetAttribNode(instance, "length", attr, "String")) + self.register_instruction(cil.ReturnNode(instance)) + + # length String + self.current_function = self.register_function( + self.to_function_name("length", "String") + ) + self.register_param(self.vself) + result = self.define_internal_local() + self.register_instruction( + cil.GetAttribNode(result, self.vself.name, "length", "String") + ) + self.register_instruction(cil.ReturnNode(result)) + + # concat String + self.current_function = self.register_function( + self.to_function_name("concat", "String") + ) + self.register_param(self.vself) + self.register_param(VariableInfo("s", None)) + str_1 = self.define_internal_local() + str_2 = self.define_internal_local() + length_1 = self.define_internal_local() + length_2 = self.define_internal_local() + self.register_instruction( + cil.GetAttribNode(str_1, self.vself.name, "value", "String") + ) + self.register_instruction(cil.GetAttribNode(str_2, "s", "value", "String")) + self.register_instruction( + cil.GetAttribNode(length_1, self.vself.name, "length", "String") + ) + self.register_instruction(cil.GetAttribNode(length_2, "s", "length", "String")) + self.register_instruction(cil.GetAttribNode(length_1, length_1, "value", "Int")) + self.register_instruction(cil.GetAttribNode(length_2, length_2, "value", "Int")) + self.register_instruction(cil.PlusNode(length_1, length_1, length_2)) + + result = self.define_internal_local() + self.register_instruction(cil.ConcatNode(result, str_1, str_2, length_1)) + instance = self.define_internal_local() + self.register_instruction(cil.ArgNode(result)) 
+ self.register_instruction( + cil.StaticCallNode(self.init_name("String"), instance) + ) + self.register_instruction(cil.ReturnNode(instance)) + + # substr String + self.current_function = self.register_function( + self.to_function_name("substr", "String") + ) + self.register_param(self.vself) + self.register_param(VariableInfo("i", None)) + self.register_param(VariableInfo("l", None)) + result = self.define_internal_local() + index_value = self.define_internal_local() + length_value = self.define_internal_local() + length_attr = self.define_internal_local() + length_substr = self.define_internal_local() + less_value = self.define_internal_local() + str_value = self.define_internal_local() + self.register_instruction( + cil.GetAttribNode(str_value, self.vself.name, "value", "String") + ) + self.register_instruction(cil.GetAttribNode(index_value, "i", "value", "Int")) + self.register_instruction(cil.GetAttribNode(length_value, "l", "value", "Int")) + # Check for out-of-range error + self.register_instruction( + cil.GetAttribNode(length_attr, self.vself.name, "length", "String") + ) + self.register_instruction( + cil.PlusNode(length_substr, length_value, index_value) + ) + self.register_instruction(cil.LessNode(less_value, length_attr, length_substr)) + self.register_runtime_error(less_value, "Substring out of range") + self.register_instruction( + cil.SubstringNode(result, str_value, index_value, length_value) + ) + instance = self.define_internal_local() + self.register_instruction(cil.ArgNode(result)) + self.register_instruction( + cil.StaticCallNode(self.init_name("String"), instance) + ) + self.register_instruction(cil.ReturnNode(instance)) + + # String + type_node.methods = [ + (method, self.to_function_name(method, "Object")) for method in obj_methods + ] + type_node.methods += [ + (name, self.to_function_name(name, "String")) + for name in ["length", "concat", "substr"] + ] + type_node.methods += [("init", self.init_name("String"))] + + 
############################################## + + # Int + type_node = self.register_type("Int") + type_node.attributes = ["value"] + + # init Int + self.current_function = self.register_function(self.init_name("Int")) + self.register_param(VariableInfo("val", None)) + instance = self.define_internal_local() + self.register_instruction(cil.AllocateNode("Int", instance)) + self.register_instruction(cil.SetAttribNode(instance, "value", "val", "Int")) + self.register_instruction(cil.ReturnNode(instance)) + + # Int + type_node.methods = [ + (method, self.to_function_name(method, "Object")) for method in obj_methods + ] + type_node.methods += [("init", self.init_name("Int"))] + + # Bool + type_node = self.register_type("Bool") + type_node.attributes = ["value"] + + # init Bool + self.current_function = self.register_function(self.init_name("Bool")) + self.register_param(VariableInfo("val", None)) + instance = self.define_internal_local() + self.register_instruction(cil.AllocateNode("Bool", instance)) + self.register_instruction(cil.SetAttribNode(instance, "value", "val", "Bool")) + self.register_instruction(cil.ReturnNode(instance)) + + # Bool + type_node.methods = [ + (method, self.to_function_name(method, "Object")) for method in obj_methods + ] + type_node.methods += [("init", self.init_name("Bool"))] + + +class COOLToCILVisitor(BaseCOOLToCILVisitor): + @visitor.on("node") + def visit(self, node): + pass + + @visitor.when(ProgramNode) + def visit(self, node: ProgramNode, scope: Scope): + self.current_function = self.register_function("entry") + result = self.define_internal_local() + instance = self.register_local(VariableInfo("instance", None)) + self.register_instruction(cil.StaticCallNode(self.init_name("Main"), instance)) + self.register_instruction(cil.ArgNode(instance)) + main_method_name = self.to_function_name("main", "Main") + self.register_instruction(cil.StaticCallNode(main_method_name, result)) + self.register_instruction(cil.ReturnNode(0)) + + 
self.register_data("Abort called from class ") + self.register_built_in() + self.current_function = None + + for declaration, child_scope in zip(node.declarations, scope.children): + self.visit(declaration, child_scope) + + return cil.ProgramNode(self.dottypes, self.dotdata, self.dotcode) + + @visitor.when(ClassDeclarationNode) + def visit(self, node: ClassDeclarationNode, scope: Scope): + self.current_type: Type = self.context.get_type(node.id) + + # Handle all the .TYPE section + type_node = self.register_type(self.current_type.name) + + type_node.attributes.extend( + [attr.name for attr, _ in self.current_type.all_attributes()] + ) + type_node.methods.extend( + [ + (method.name, self.to_function_name(method.name, typex.name)) + for method, typex in self.current_type.all_methods() + ] + ) + for feature, child_scope in zip(node.features, scope.children): + if isinstance(feature, FuncDeclarationNode): + self.visit(feature, child_scope) + + # init + self.current_function = self.register_function(self.init_name(node.id)) + # allocate + instance = self.register_local(VariableInfo("instance", None)) + self.register_instruction(cil.AllocateNode(node.id, instance)) + + func = self.current_function + vtemp = self.define_internal_local() + + # init_attr + self.current_function = self.register_function( + self.init_name(node.id, attr=True) + ) + self.register_param(self.vself) + parent_type = self.context.get_type(node.id).parent + if parent_type.name != "Object" and parent_type.name != "IO": + vtemp2 = self.define_internal_local() + self.register_instruction(cil.ArgNode(self.vself.name)) + self.register_instruction( + cil.StaticCallNode(self.init_name(parent_type.name, attr=True), vtemp2) + ) + + for feature, child_scope in zip(node.features, scope.children): + if isinstance(feature, AttrDeclarationNode): + self.visit(feature, child_scope) + + self.current_function = func + self.register_instruction(cil.ArgNode(instance)) + self.register_instruction( + 
cil.StaticCallNode(self.init_name(node.id, attr=True), vtemp) + ) + + self.register_instruction(cil.ReturnNode(instance)) + self.current_function = None + self.current_type = None + + @visitor.when(AttrDeclarationNode) + def visit(self, node: AttrDeclarationNode, scope: Scope): + if node.expr: + value = self.visit(node.expr, scope) + self.register_instruction( + cil.SetAttribNode(self.vself.name, node.id, value, self.current_type) + ) + return value + + elif node.type in self.value_types: + value = self.define_internal_local() + self.register_instruction(cil.AllocateNode(node.type, value)) + self.register_instruction( + cil.SetAttribNode(self.vself.name, node.id, value, self.current_type) + ) + return value + + @visitor.when(FuncDeclarationNode) + def visit(self, node: FuncDeclarationNode, scope: Scope): + self.current_method = self.current_type.get_method(node.id) + + # Handle PARAMS + self.current_function = self.register_function( + self.to_function_name(self.current_method.name, self.current_type.name) + ) + + self.register_param(self.vself) + for param_name, _ in node.params: + self.register_param(VariableInfo(param_name.lex, None)) + + # Handle RETURN + value = self.visit(node.body, scope) + if value is None: + self.register_instruction(cil.ReturnNode("")) + elif self.current_function.name == "entry": + self.register_instruction(cil.ReturnNode(0)) + else: + self.register_instruction(cil.ReturnNode(value)) + + self.current_method = None + + @visitor.when(AssignNode) + def visit(self, node: AssignNode, scope: Scope): + value = self.visit(node.expr, scope) + + try: + self.current_type.get_attribute(node.id) + self.register_instruction( + cil.SetAttribNode( + self.vself.name, node.id, value, self.current_type.name + ) + ) + except SemanticError: + vname = None + param_names = [pn.name for pn in self.current_function.params] + if node.id in param_names: + vname = node.id + else: + vname = self.ids[node.id] + + self.register_instruction(cil.AssignNode(vname, 
value)) + + return value + + @visitor.when(CallNode) + def visit(self, node: CallNode, scope: Scope): + args = [] + for arg in node.args: + vname = self.register_local(VariableInfo(f"{node.id}_arg", None), id=True) + ret = self.visit(arg, scope) + self.register_instruction(cil.AssignNode(vname, ret)) + args.append(cil.ArgNode(vname)) + result = self.register_local( + VariableInfo(f"return_value_of_{node.id}", None), id=True + ) + + # static call node + if ( + isinstance(node.obj, VariableNode) + and node.obj.lex == self.vself.name + and not node.type + ): + self.register_instruction(cil.ArgNode(self.vself.name)) + for arg in args: + self.register_instruction(arg) + + type_of_node = self.register_local( + VariableInfo(f"{self.vself.name}_type", None) + ) + self.register_instruction(cil.TypeOfNode(self.vself.name, type_of_node)) + self.register_instruction( + cil.DynamicCallNode( + type_of_node, node.id, result, self.current_type.name + ) + ) + return result + + vobj = self.define_internal_local() + ret = self.visit(node.obj, scope) + self.register_instruction(cil.AssignNode(vobj, ret)) + + # Check if node.obj is void + void = cil.VoidNode() + equal_result = self.define_internal_local() + self.register_instruction(cil.EqualNode(equal_result, vobj, void)) + + self.register_runtime_error( + equal_result, + f"{node.token.pos} - RuntimeError: Dispatch on void\n", + ) + + # self + self.register_instruction(cil.ArgNode(vobj)) + for arg in args: + self.register_instruction(arg) + + if node.type: + # Call of type @.id(,...,) + self.register_instruction( + cil.StaticCallNode(self.to_function_name(node.id, node.type), result) + ) + else: + # Call of type .(,...,) + type_of_node = self.register_local( + VariableInfo(f"{node.id}_type", None), id=True + ) + self.register_instruction(cil.TypeOfNode(vobj, type_of_node)) + computed_type = node.obj.computed_type + if isinstance(computed_type, SelfType): + computed_type = computed_type.fixed_type + self.register_instruction( + 
cil.DynamicCallNode(type_of_node, node.id, result, computed_type.name) + ) + + return result + + @visitor.when(ConditionalNode) + def visit(self, node: ConditionalNode, scope: Scope): + vret = self.register_local(VariableInfo("if_then_else_value", None)) + vcondition = self.define_internal_local() + + then_label_node = self.register_label("then_label") + else_label_node = self.register_label("else_label") + continue_label_node = self.register_label("continue_label") + + # If condition GOTO then_label + ret = self.visit(node.condition, scope) + self.register_instruction(cil.GetAttribNode(vcondition, ret, "value", "Bool")) + self.register_instruction(cil.GotoIfNode(vcondition, then_label_node.label)) + # GOTO else_label + self.register_instruction(cil.GotoNode(else_label_node.label)) + # Label then_label + self.register_instruction(then_label_node) + retif = self.visit(node.then_body, scope) + self.register_instruction(cil.AssignNode(vret, retif)) + self.register_instruction(cil.GotoNode(continue_label_node.label)) + # Label else_label + self.register_instruction(else_label_node) + retelse = self.visit(node.else_body, scope) + self.register_instruction(cil.AssignNode(vret, retelse)) + + self.register_instruction(continue_label_node) + return vret + + @visitor.when(LoopNode) + def visit(self, node: LoopNode, scope: Scope): + vcondition = self.define_internal_local() + while_label_node = self.register_label("while_label") + loop_label_node = self.register_label("loop_label") + pool_label_node = self.register_label("pool_label") + # Label while + self.register_instruction(while_label_node) + # If condition GOTO loop + ret = self.visit(node.condition, scope) + self.register_instruction(cil.GetAttribNode(vcondition, ret, "value", "Bool")) + self.register_instruction(cil.GotoIfNode(vcondition, loop_label_node.label)) + # GOTO pool + self.register_instruction(cil.GotoNode(pool_label_node.label)) + # Label loop + self.register_instruction(loop_label_node) + ret = 
self.visit(node.body, scope) + # GOTO while + self.register_instruction(cil.GotoNode(while_label_node.label)) + # Label pool + self.register_instruction(pool_label_node) + + # The result of a while loop is void + return cil.VoidNode() + + @visitor.when(BlockNode) + def visit(self, node: BlockNode, scope: Scope): + ret_value = None + for expr in node.expr_list: + ret_value = self.visit(expr, scope) + + return ret_value + + @visitor.when(LetNode) + def visit(self, node: LetNode, scope: Scope): + vret = self.register_local(VariableInfo("let_in_value", None)) + + for let_var_node in node.id_list: + self.visit(let_var_node, scope) + ret = self.visit(node.body, scope) + self.register_instruction(cil.AssignNode(vret, ret)) + return vret + + @visitor.when(LetVarNode) + def visit(self, node: LetVarNode, scope: Scope): + try: + vname = self.ids[node.id] + except KeyError: + vname = self.register_local(VariableInfo(node.id, node.typex), id=True) + + if node.expression: + ret_value = self.visit(node.expression, scope) + self.register_instruction(cil.AssignNode(vname, ret_value)) + elif node.typex in self.value_types: + self.register_instruction(cil.AllocateNode(node.typex, vname)) + + @visitor.when(CaseNode) + def visit(self, node: CaseNode, scope: Scope): + ret = self.register_local(VariableInfo("case_expr_value", None)) + ret_type = self.register_local(VariableInfo("typeName_value", None)) + vcond = self.register_local(VariableInfo("equal_value", None)) + value = self.register_local(VariableInfo("case_value", None)) + + ret_val = self.visit(node.expr, scope) + + self.register_instruction(cil.AssignNode(ret, ret_val)) + self.register_instruction(cil.TypeNameNode(ret_type, ret_val)) + + # Check if node.expr is void and raise proper error if vexpr value is void + void = cil.VoidNode() + equal_result = self.define_internal_local() + self.register_instruction(cil.EqualNode(equal_result, ret, void)) + + self.register_runtime_error( + equal_result, + f"{node.token.pos} - 
RuntimeError: Case on void\n", + ) + + # sorting the branches + order = [] + for b in node.branch_list: + count = 0 + t1 = self.context.get_type(b.typex) + for other in node.branch_list: + t2 = self.context.get_type(other.typex) + count += t2.conforms_to(t1) + order.append((count, b)) + order.sort(key=lambda x: x[0]) + + labels = [] + old = {} + for idx, (_, b) in enumerate(order): + labels.append(self.register_label(f"{idx}_label")) + h = self.buildHierarchy(b.typex) + if not h: + self.register_instruction(cil.GotoNode(labels[-1].label)) + break + for s in old: + h -= s + for t in h: + vbranch_type_name = self.register_local( + VariableInfo("branch_type_name", None) + ) + self.register_instruction(cil.NameNode(vbranch_type_name, t)) + self.register_instruction( + cil.EqualNode(vcond, ret_type, vbranch_type_name) + ) + self.register_instruction(cil.GotoIfNode(vcond, labels[-1].label)) + + # Raise runtime error if no Goto was executed + data_node = self.register_data( + f"({node.token.pos[0] + 1 + len(node.branch_list)},{node.token.pos[1] - 5}) - RuntimeError: Execution of a case statement without a matching branch\n" + ) + self.register_instruction(cil.ErrorNode(data_node)) + + end_label = self.register_label("end_label") + for idx, l in enumerate(labels): + self.register_instruction(l) + vid = self.register_local(VariableInfo(order[idx][1].id, None), id=True) + self.register_instruction(cil.AssignNode(vid, ret)) + ret_2 = self.visit(order[idx][1], scope) + self.register_instruction(cil.AssignNode(value, ret_2)) + self.register_instruction(cil.GotoNode(end_label.label)) + + self.register_instruction(end_label) + return value + + @visitor.when(CaseBranchNode) + def visit(self, node: CaseBranchNode, scope: Scope): + return self.visit(node.expression, scope) + + @visitor.when(NotNode) + def visit(self, node: NotNode, scope: Scope): + vname = self.define_internal_local() + value = self.define_internal_local() + instance = self.define_internal_local() + + ret = 
self.visit(node.expr, scope) + self.register_instruction(cil.GetAttribNode(value, ret, "value", "Bool")) + self.register_instruction(cil.MinusNode(vname, 1, value)) + + self.register_instruction(cil.ArgNode(vname)) + self.register_instruction(cil.StaticCallNode(self.init_name("Bool"), instance)) + return instance + + @visitor.when(LeqNode) + def visit(self, node: LeqNode, scope: Scope): + vname = self.define_internal_local() + left_value = self.define_internal_local() + right_value = self.define_internal_local() + instance = self.define_internal_local() + + left = self.visit(node.left, scope) + right = self.visit(node.right, scope) + self.register_instruction(cil.GetAttribNode(left_value, left, "value", "Int")) + self.register_instruction( + cil.GetAttribNode(right_value, right, "value", "Int") + ) + self.register_instruction(cil.LeqNode(vname, left_value, right_value)) + + self.register_instruction(cil.ArgNode(vname)) + self.register_instruction(cil.StaticCallNode(self.init_name("Bool"), instance)) + return instance + + @visitor.when(LessNode) + def visit(self, node: LessNode, scope: Scope): + vname = self.define_internal_local() + left_value = self.define_internal_local() + right_value = self.define_internal_local() + instance = self.define_internal_local() + + left = self.visit(node.left, scope) + right = self.visit(node.right, scope) + self.register_instruction(cil.GetAttribNode(left_value, left, "value", "Int")) + self.register_instruction( + cil.GetAttribNode(right_value, right, "value", "Int") + ) + self.register_instruction(cil.LessNode(vname, left_value, right_value)) + + self.register_instruction(cil.ArgNode(vname)) + self.register_instruction(cil.StaticCallNode(self.init_name("Bool"), instance)) + return instance + + @visitor.when(EqualNode) + def visit(self, node: EqualNode, scope: Scope): + vname = self.define_internal_local() + type_left = self.define_internal_local() + type_int = self.define_internal_local() + type_bool = 
self.define_internal_local() + type_string = self.define_internal_local() + equal_result = self.define_internal_local() + left_value = self.define_internal_local() + right_value = self.define_internal_local() + instance = self.define_internal_local() + + left = self.visit(node.left, scope) + right = self.visit(node.right, scope) + + self.register_instruction(cil.TypeNameNode(type_left, left)) + self.register_instruction(cil.NameNode(type_int, "Int")) + self.register_instruction(cil.NameNode(type_bool, "Bool")) + self.register_instruction(cil.NameNode(type_string, "String")) + + int_node = self.register_label("int_label") + string_node = self.register_label("string_label") + reference_node = self.register_label("reference_label") + continue_node = self.register_label("continue_label") + self.register_instruction(cil.EqualNode(equal_result, type_left, type_int)) + self.register_instruction(cil.GotoIfNode(equal_result, int_node.label)) + self.register_instruction(cil.EqualNode(equal_result, type_left, type_bool)) + self.register_instruction(cil.GotoIfNode(equal_result, int_node.label)) + self.register_instruction(cil.EqualNode(equal_result, type_left, type_string)) + self.register_instruction(cil.GotoIfNode(equal_result, string_node.label)) + self.register_instruction(cil.GotoNode(reference_node.label)) + + self.register_instruction(int_node) + self.register_instruction(cil.GetAttribNode(left_value, left, "value", "Int")) + self.register_instruction(cil.GetAttribNode(right_value, right, "value", "Int")) + self.register_instruction(cil.EqualNode(vname, left_value, right_value)) + self.register_instruction(cil.GotoNode(continue_node.label)) + + self.register_instruction(string_node) + self.register_instruction( + cil.GetAttribNode(left_value, left, "value", "String") + ) + self.register_instruction( + cil.GetAttribNode(right_value, right, "value", "String") + ) + self.register_instruction(cil.EqualStrNode(vname, left_value, right_value)) + 
self.register_instruction(cil.GotoNode(continue_node.label)) + + self.register_instruction(reference_node) + self.register_instruction(cil.EqualNode(vname, left, right)) + + self.register_instruction(continue_node) + self.register_instruction(cil.ArgNode(vname)) + self.register_instruction(cil.StaticCallNode(self.init_name("Bool"), instance)) + return instance + + @visitor.when(PlusNode) + def visit(self, node: PlusNode, scope: Scope): + vname = self.define_internal_local() + vleft = self.define_internal_local() + vright = self.define_internal_local() + left = self.visit(node.left, scope) + self.register_instruction(cil.GetAttribNode(vleft, left, "value", "Int")) + right = self.visit(node.right, scope) + self.register_instruction(cil.GetAttribNode(vright, right, "value", "Int")) + self.register_instruction(cil.PlusNode(vname, vleft, vright)) + instance = self.define_internal_local() + self.register_instruction(cil.ArgNode(vname)) + self.register_instruction(cil.StaticCallNode(self.init_name("Int"), instance)) + return instance + + @visitor.when(MinusNode) + def visit(self, node: MinusNode, scope: Scope): + vname = self.define_internal_local() + vleft = self.define_internal_local() + vright = self.define_internal_local() + left = self.visit(node.left, scope) + self.register_instruction(cil.GetAttribNode(vleft, left, "value", "Int")) + right = self.visit(node.right, scope) + self.register_instruction(cil.GetAttribNode(vright, right, "value", "Int")) + self.register_instruction(cil.MinusNode(vname, vleft, vright)) + instance = self.define_internal_local() + self.register_instruction(cil.ArgNode(vname)) + self.register_instruction(cil.StaticCallNode(self.init_name("Int"), instance)) + return instance + + @visitor.when(StarNode) + def visit(self, node: StarNode, scope: Scope): + vname = self.define_internal_local() + vleft = self.define_internal_local() + vright = self.define_internal_local() + left = self.visit(node.left, scope) + 
self.register_instruction(cil.GetAttribNode(vleft, left, "value", "Int")) + right = self.visit(node.right, scope) + self.register_instruction(cil.GetAttribNode(vright, right, "value", "Int")) + self.register_instruction(cil.StarNode(vname, vleft, vright)) + instance = self.define_internal_local() + self.register_instruction(cil.ArgNode(vname)) + self.register_instruction(cil.StaticCallNode(self.init_name("Int"), instance)) + return instance + + @visitor.when(DivNode) + def visit(self, node: DivNode, scope: Scope): + vname = self.define_internal_local() + vleft = self.define_internal_local() + vright = self.define_internal_local() + left = self.visit(node.left, scope) + self.register_instruction(cil.GetAttribNode(vleft, left, "value", "Int")) + right = self.visit(node.right, scope) + self.register_instruction(cil.GetAttribNode(vright, right, "value", "Int")) + + # Check division by 0 + equal_result = self.define_internal_local() + self.register_instruction(cil.EqualNode(equal_result, vright, 0)) + self.register_runtime_error( + equal_result, + f"{node.token.pos} - RuntimeError: Division by zero\n", + ) + + self.register_instruction(cil.DivNode(vname, vleft, vright)) + instance = self.define_internal_local() + self.register_instruction(cil.ArgNode(vname)) + self.register_instruction(cil.StaticCallNode(self.init_name("Int"), instance)) + return instance + + @visitor.when(VoidNode) + def visit(self, node: VoidNode, scope: Scope): + void = cil.VoidNode() + value = self.define_internal_local() + ret = self.visit(node.expr, scope) + self.register_instruction(cil.AssignNode(value, ret)) + result = self.define_internal_local() + self.register_instruction(cil.EqualNode(result, value, void)) + self.register_instruction(cil.ArgNode(result)) + self.register_instruction(cil.StaticCallNode(self.init_name("Bool"), result)) + return result + + @visitor.when(NegNode) + def visit(self, node: NegNode, scope: Scope): + vname = self.define_internal_local() + value = 
self.define_internal_local() + instance = self.define_internal_local() + ret = self.visit(node.expr, scope) + self.register_instruction(cil.GetAttribNode(value, ret, "value", "Int")) + self.register_instruction(cil.ComplementNode(vname, value)) + self.register_instruction(cil.ArgNode(vname)) + self.register_instruction(cil.StaticCallNode(self.init_name("Int"), instance)) + return instance + + @visitor.when(InstantiateNode) + def visit(self, node: InstantiateNode, scope: Scope): + instance = self.define_internal_local() + + if node.computed_type.name == SelfType().name: + vtype = self.define_internal_local() + self.register_instruction(cil.TypeOfNode(self.vself.name, vtype)) + self.register_instruction(cil.AllocateNode(vtype, instance)) + elif node.computed_type.name == "Int" or node.computed_type.name == "Bool": + self.register_instruction(cil.ArgNode(0)) + elif node.computed_type.name == "String": + data_node = [dn for dn in self.dotdata if dn.value == ""][0] + vmsg = self.register_local(VariableInfo("msg", None)) + self.register_instruction(cil.LoadNode(vmsg, data_node)) + self.register_instruction(cil.ArgNode(vmsg)) + + self.register_instruction( + cil.StaticCallNode(self.init_name(node.computed_type.name), instance) + ) + return instance + + @visitor.when(VariableNode) + def visit(self, node: VariableNode, scope: Scope): + + try: + self.current_type.get_attribute(node.lex) + attr = self.register_local(VariableInfo(node.lex, None), id=True) + self.register_instruction( + cil.GetAttribNode( + attr, self.vself.name, node.lex, self.current_type.name + ) + ) + return attr + except SemanticError: + param_names = [pn.name for pn in self.current_function.params] + if node.lex in param_names: + return node.lex + else: + return self.ids[node.lex] + + @visitor.when(ConstantNumNode) + def visit(self, node: ConstantNumNode, scope: Scope): + instance = self.define_internal_local() + self.register_instruction(cil.ArgNode(int(node.lex))) + 
self.register_instruction(cil.StaticCallNode(self.init_name("Int"), instance)) + return instance + + @visitor.when(ConstantStringNode) + def visit(self, node: ConstantStringNode, scope: Scope): + try: + data_node = [dn for dn in self.dotdata if dn.value == node.lex][0] + except IndexError: + data_node = self.register_data(node.lex) + + vmsg = self.register_local(VariableInfo("msg", None)) + ret = self.define_internal_local() + self.register_instruction(cil.LoadNode(vmsg, data_node)) + self.register_instruction(cil.ArgNode(vmsg)) + self.register_instruction(cil.StaticCallNode(self.init_name("String"), ret)) + return ret + + @visitor.when(ConstantBoolNode) + def visit(self, node: ConstantBoolNode, scope: Scope): + if node.lex == "true": + v = 1 + else: + v = 0 + ret = self.define_internal_local() + self.register_instruction(cil.ArgNode(v)) + self.register_instruction(cil.StaticCallNode(self.init_name("Bool"), ret)) + return ret diff --git a/src/compiler/visitors/semantics_check/__init__.py b/src/compiler/visitors/semantics_check/__init__.py new file mode 100644 index 000000000..e69de29bb diff --git a/src/compiler/visitors/semantics_check/formatter.py b/src/compiler/visitors/semantics_check/formatter.py new file mode 100644 index 000000000..6a8f33575 --- /dev/null +++ b/src/compiler/visitors/semantics_check/formatter.py @@ -0,0 +1,144 @@ +from ...cmp.ast import ( + AssignNode, + AtomicNode, + AttrDeclarationNode, + BinaryNode, + BlockNode, + CallNode, + CaseNode, + ClassDeclarationNode, + ConditionalNode, + FuncDeclarationNode, + InstantiateNode, + LetNode, + LoopNode, + ProgramNode, + UnaryNode, +) +import compiler.visitors.visitor as visitor + + +class FormatVisitor(object): + @visitor.on("node") + def visit(self, node, tabs=0): + pass + + @visitor.when(ProgramNode) + def visit(self, node, tabs=0): + ans = "\t" * tabs + f"\\__ProgramNode [ ... 
]" + statements = "\n".join( + self.visit(child, tabs + 1) for child in node.declarations + ) + return f"{ans}\n{statements}" + + @visitor.when(ClassDeclarationNode) + def visit(self, node, tabs=0): + parent = "" if node.parent is None else f"inherits {node.parent}" + ans = ( + "\t" * tabs + + f"\\__ClassDeclarationNode: class {node.id} {parent} {{ ... }}" + ) + features = "\n".join(self.visit(child, tabs + 1) for child in node.features) + return f"{ans}\n{features}" + + @visitor.when(FuncDeclarationNode) + def visit(self, node, tabs=0): + paramStr = [(id.lex, typex.lex) for (id, typex) in node.params] + params = ", ".join(" : ".join(param) for param in paramStr) + ans = ( + "\t" * tabs + + f"\\__FuncDeclarationNode: {node.id}({params}) : {node.type} {{ }}" + ) + body = self.visit(node.body, tabs + 1) + return f"{ans}\n{body}" + + @visitor.when(AttrDeclarationNode) + def visit(self, node, tabs=0): + ans = "\t" * tabs + f"\\__AttrDeclarationNode: {node.id} : {node.type}" + if node.expr is not None: + expr = self.visit(node.expr, tabs + 1) + ans = f"{ans} <- \n{expr}" + return f"{ans}" + + @visitor.when(AssignNode) + def visit(self, node, tabs=0): + ans = "\t" * tabs + f"\\__AssignNode: {node.id} <- " + expr = self.visit(node.expr, tabs + 1) + return f"{ans}\n{expr}" + + @visitor.when(CallNode) + def visit(self, node, tabs=0): + obj = self.visit(node.obj, tabs + 1) + cast = "" if node.type is None else f"@{node.type}" + ans = "\t" * tabs + f"\\__CallNode: {cast}.{node.id}(, ..., )" + args = "\n".join(self.visit(arg, tabs + 1) for arg in node.args) + return f"{ans}\n{obj}\n{args}" + + @visitor.when(CaseNode) + def visit(self, node, tabs=0): + ans = "\t" * tabs + f"\\__CaseNode: case of [ , ..., ]" + expr = self.visit(node.expr, tabs + 1) + branches = [] + for branch in node.branch_list: + branches.append( + "\t" * (tabs + 1) + + f"{branch[0]} : {branch[1]} => \n{self.visit(branch[2], tabs + 2)}" + ) + branches = "\n".join(branches) + return 
f"{ans}\n{expr}\n{branches}" + + @visitor.when(BlockNode) + def visit(self, node, tabs=0): + ans = "\t" * tabs + f"\\__BlockNode: {{ ; ... ; ; }}" + exprs = "\n".join(self.visit(expr, tabs + 1) for expr in node.expr_list) + return f"{ans}\n{exprs}" + + @visitor.when(LoopNode) + def visit(self, node, tabs=0): + ans = "\t" * tabs + f"\\__LoopNode: while loop pool" + cond = self.visit(node.condition, tabs + 1) + body = self.visit(node.body, tabs + 1) + return f"{ans}\n{cond}\n{body}" + + @visitor.when(ConditionalNode) + def visit(self, node, tabs=0): + ans = "\t" * tabs + f"\\__ConditionalNode: if then else fi" + cond = self.visit(node.condition, tabs + 1) + then_body = self.visit(node.then_body, tabs + 1) + else_body = self.visit(node.else_body, tabs + 1) + return f"{ans}\n{cond}\n{then_body}\n{else_body}" + + @visitor.when(LetNode) + def visit(self, node: LetNode, tabs=0): + ans = "\t" * tabs + f"\\__LetNode: let in " + expr = self.visit(node.body, tabs + 1) + iden_list = [] + for item in node.id_list: + iden = "\t" * (tabs + 1) + f"{item.id} : {item.typex}" + if item.expression is not None: + iden = f"{iden} <- \n{self.visit(item.expression, tabs + 2)}" + iden_list.append(iden) + iden_list = "\n".join(iden_list) + return f"{ans}\n{iden_list}\n{expr}" + + @visitor.when(BinaryNode) + def visit(self, node, tabs=0): + ans = "\t" * tabs + f"\\__ {node.__class__.__name__} " + left = self.visit(node.left, tabs + 1) + right = self.visit(node.right, tabs + 1) + return f"{ans}\n{left}\n{right}" + + @visitor.when(AtomicNode) + def visit(self, node, tabs=0): + return ( + "\t" * tabs + f"\\__ {node.__class__.__name__}: {node.lex, node.token.pos}" + ) + + @visitor.when(UnaryNode) + def visit(self, node, tabs=0): + expr = self.visit(node.expr, tabs + 1) + return "\t" * tabs + f"\\__ {node.__class__.__name__} \n{expr}" + + @visitor.when(InstantiateNode) + def visit(self, node, tabs=0): + return "\t" * tabs + f"\\__ InstantiateNode: new {node.lex}" diff --git 
a/src/compiler/visitors/semantics_check/type_builder.py b/src/compiler/visitors/semantics_check/type_builder.py new file mode 100644 index 000000000..1d3c0580f --- /dev/null +++ b/src/compiler/visitors/semantics_check/type_builder.py @@ -0,0 +1,168 @@ +from ...cmp.ast import ( + ProgramNode, + ClassDeclarationNode, + AttrDeclarationNode, + FuncDeclarationNode, +) +from ...cmp.semantic import ( + Context, + SemanticError, + ErrorType, + InferencerManager, + AutoType, + SelfType, + Type, +) +from ..utils import ( + ALREADY_DEFINED, + MAIN_CLASS_ERROR, + MAIN_PROGRAM_ERROR, + SELF_ERROR, +) +from typing import List, Optional, Tuple +import compiler.visitors.visitor as visitor + + +class TypeBuilder: + def __init__(self, context): + self.context: Context = context + self.current_type: Optional[Type] = None + self.errors: List[Tuple[Exception, Tuple[int, int]]] = [] + self.manager = InferencerManager() + + self.obj_type = self.context.get_type("Object") + + @visitor.on("node") + def visit(self, node): + pass + + @visitor.when(ProgramNode) + def visit(self, node: ProgramNode): + for dec in node.declarations: + self.visit(dec) + + self.check_main_class() + + @visitor.when(ClassDeclarationNode) + def visit(self, node: ClassDeclarationNode): + self.current_type = self.context.get_type(node.id) + for feat in node.features: + self.visit(feat) + + @visitor.when(FuncDeclarationNode) + def visit(self, node: FuncDeclarationNode): + ## Building param-names and param-types of the method + param_names = [] + param_types = [] + node.index = [] + for namex, typex in node.params: + n, t = namex.lex, typex.lex + node.index.append(None) + + # Checking param name can't be self + if n == "self": + self.errors.append((SemanticError(SELF_ERROR), namex.pos)) + + # Generate valid parameter name + if n in param_names: + self.errors.append( + ( + SemanticError(ALREADY_DEFINED % (n)), + namex.pos, + ) + ) + while True: + if n in param_names: + n = f"1{n}" + else: + param_names.append(n) + break + 
+ try: + t = self.context.get_type(t) + if isinstance(t, SelfType): + t = SelfType(self.current_type) + elif isinstance(t, AutoType): + node.index[-1] = self.manager.assign_id(self.obj_type) + except TypeError as ex: + self.errors.append((ex, typex.pos)) + t = ErrorType() + param_types.append(t) + + # Checking return type + try: + rtype = self.context.get_type(node.type) + if isinstance(rtype, SelfType): + rtype = SelfType(self.current_type) + except TypeError as ex: + self.errors.append((ex, node.typeToken.pos)) + rtype = ErrorType() + + node.idx = ( + self.manager.assign_id(self.obj_type) + if isinstance(rtype, AutoType) + else None + ) + + # Defining the method in the current type. There can not be another method with the same name. + try: + self.current_type.define_method( + node.id, param_names, param_types, rtype, node.index, node.idx + ) + except SemanticError as ex: + self.errors.append((ex, node.token.pos)) + + @visitor.when(AttrDeclarationNode) + def visit(self, node: AttrDeclarationNode): + # Checking attribute type + try: + attr_type = self.context.get_type(node.type) + if isinstance(attr_type, SelfType): + attr_type = SelfType(self.current_type) + except TypeError as ex: + self.errors.append((ex, node.typeToken.pos)) + attr_type = ErrorType() + + node.idx = ( + self.manager.assign_id(self.obj_type) + if isinstance(attr_type, AutoType) + else None + ) + + # Checking attribute can't be named self + if node.id == "self": + self.errors.append( + ( + SemanticError(SELF_ERROR), + node.idToken.pos, + ) + ) + + # Checking attribute name. 
No other attribute can have the same name + flag = False + try: + self.current_type.define_attribute(node.id, attr_type, node.idx) + flag = True + except SemanticError as ex: + self.errors.append( + ( + SemanticError(ALREADY_DEFINED % (node.id)), + node.idToken.pos, + ) + ) + + while not flag: + node.id = f"1{node.id}" + try: + self.current_type.define_attribute(node.id, attr_type, node.idx) + flag = True + except SemanticError: + pass + + def check_main_class(self): + try: + typex = self.context.get_type("Main") + if not any(method.name == "main" for method in typex.methods): + self.errors.append((SemanticError(MAIN_CLASS_ERROR), (0, 0))) + except TypeError: + self.errors.append((SemanticError(MAIN_PROGRAM_ERROR), (0, 0))) diff --git a/src/compiler/visitors/semantics_check/type_checker.py b/src/compiler/visitors/semantics_check/type_checker.py new file mode 100644 index 000000000..0009c1b22 --- /dev/null +++ b/src/compiler/visitors/semantics_check/type_checker.py @@ -0,0 +1,577 @@ +from ...cmp.ast import ( + BinaryNode, + ProgramNode, + ClassDeclarationNode, + AttrDeclarationNode, + FuncDeclarationNode, + AssignNode, + CallNode, + CaseNode, + BlockNode, + LoopNode, + ConditionalNode, + LetNode, + ArithmeticNode, + ComparisonNode, + EqualNode, + VoidNode, + NotNode, + NegNode, + ConstantNumNode, + ConstantStringNode, + ConstantBoolNode, + VariableNode, + InstantiateNode, +) +from ...cmp.semantic import ( + Context, + InferencerManager, + Method, + Scope, + SemanticError, + ErrorType, + SelfType, + AutoType, + LCA, + Type, +) +from ..utils import * +from typing import List, Optional, Tuple + +import compiler.visitors.visitor as visitor + + +class TypeChecker: + def __init__(self, context, manager): + self.context: Context = context + self.current_type: Optional[Type] = None + self.current_method: Optional[Method] = None + self.errors: List[Tuple[Exception, Tuple[int, int]]] = [] + self.manager: InferencerManager = manager + + # built-in types + self.obj_type = 
self.context.get_type("Object") + self.int_type = self.context.get_type("Int") + self.bool_type = self.context.get_type("Bool") + self.string_type = self.context.get_type("String") + + @visitor.on("node") + def visit(self, node, scope=None): + pass + + @visitor.when(ProgramNode) + def visit(self, node: ProgramNode, scope: Optional[Scope] = None) -> Scope: + scope = Scope() + for declaration in node.declarations: + self.visit(declaration, scope.create_child()) + return scope + + @visitor.when(ClassDeclarationNode) + def visit(self, node: ClassDeclarationNode, scope: Scope): + self.current_type = self.context.get_type(node.id) + scope.define_variable("self", SelfType(self.current_type)) + attributes = self.current_type.all_attributes() + for values in attributes: + attr, _ = values + scope.define_variable(attr.name, attr.type, attr.idx) + + for feature in node.features: + self.visit(feature, scope.create_child()) + + @visitor.when(AttrDeclarationNode) + def visit(self, node: AttrDeclarationNode, scope: Scope): + var = scope.find_variable(node.id) + attr_type = var.type + + if node.expr is not None: + computed_type = self.visit(node.expr, scope) + if not self.check_conformance(computed_type, attr_type): + self.errors.append( + ( + TypeError(INCOMPATIBLE_TYPES % (computed_type.name, node.type)), + node.token.pos, + ) + ) + + @visitor.when(FuncDeclarationNode) + def visit(self, node: FuncDeclarationNode, scope: Scope): + self.current_method = self.current_type.get_method(node.id) + + # checking overwriting + try: + method = self.current_type.parent.get_method(node.id) + if not len(self.current_method.param_types) == len(method.param_types): + self.errors.append( + ( + SemanticError( + WRONG_SIGNATURE % (node.id, self.current_type.name) + ), + node.token.pos, + ) + ) + else: + for i, t in enumerate(self.current_method.param_types): + if not method.param_types[i] == t: + self.errors.append( + ( + SemanticError( + WRONG_SIGNATURE % (node.id, self.current_type.name) + ), + 
node.token.pos, + ) + ) + break + else: + if not self.current_method.return_type == method.return_type: + self.errors.append( + ( + SemanticError( + WRONG_SIGNATURE % (node.id, self.current_type.name) + ), + node.typeToken.pos, + ) + ) + except SemanticError: + pass + + # defining variables in new scope + for i, var in enumerate(self.current_method.param_names): + if scope.is_local(var): + self.errors.append( + ( + SemanticError( + LOCAL_ALREADY_DEFINED % (var, self.current_method.name) + ), + node.token.pos, + ) + ) + else: + scope.define_variable( + var, + self.current_method.param_types[i], + self.current_method.param_idx[i], + ) + + computed_type = self.visit(node.body, scope) + + # checking return type + rtype = self.current_method.return_type + if not self.check_conformance(computed_type, rtype): + self.errors.append( + ( + TypeError( + INCOMPATIBLE_TYPES + % (computed_type.name, self.current_method.return_type.name) + ), + node.typeToken.pos, + ) + ) + + @visitor.when(AssignNode) + def visit(self, node: AssignNode, scope: Scope) -> Type: + if node.id == "self": + self.errors.append((SemanticError(SELF_IS_READONLY), node.idToken.pos)) + + # checking variable is defined + var = scope.find_variable(node.id) + if var is None: + self.errors.append( + ( + NameError(VARIABLE_NOT_DEFINED % (node.id, self.current_type.name)), + node.idToken.pos, + ) + ) + var = scope.define_variable(node.id, ErrorType()) + + computed_type = self.visit(node.expr, scope.create_child()) + + if not self.check_conformance(computed_type, var.type): + self.errors.append( + ( + TypeError(INCOMPATIBLE_TYPES % (computed_type.name, var.type.name)), + node.token.pos, + ) + ) + node.computed_type = computed_type + return computed_type + + @visitor.when(CallNode) + def visit(self, node: CallNode, scope: Scope): + # Evaluate object + obj_type = self.visit(node.obj, scope) + + # Check object type conforms to cast type + cast_type = obj_type + if not node.type == "": + try: + cast_type = 
self.context.get_type(node.type) + if isinstance(cast_type, AutoType): + raise SemanticError(AUTOTYPE_ERROR) + if isinstance(cast_type, SelfType): + cast_type = SelfType(self.current_type) + except (SemanticError, TypeError) as ex: + cast_type = ErrorType() + self.errors.append((ex, node.typeToken.pos)) + if not self.check_conformance(obj_type, cast_type): + self.errors.append( + ( + TypeError(INCOMPATIBLE_TYPES % (obj_type.name, cast_type.name)), + node.typeToken.pos, + ) + ) + + # if the obj that is calling the function is autotype, let it pass + if isinstance(cast_type, AutoType): + node.computed_type = cast_type + return cast_type + + if isinstance(cast_type, SelfType): + cast_type = self.current_type + + # Check this function is defined for cast_type + try: + method = cast_type.get_method(node.id) + # Check equal number of parameters + if not len(node.args) == len(method.param_types): + self.errors.append( + ( + SemanticError( + INVALID_OPERATION % (method.name, cast_type.name) + ), + node.token.pos, + ) + ) + node.computed_type = ErrorType() + return node.computed_type + + # Check conformance to parameter types + for i, arg in enumerate(node.args): + computed_type = self.visit(arg, scope) + if not self.check_conformance(computed_type, method.param_types[i]): + self.errors.append( + ( + TypeError( + INCOMPATIBLE_TYPES + % (computed_type.name, method.param_types[i].name) + ), + node.token.pos, + ) + ) + + # check self_type + rtype = method.return_type + if isinstance(rtype, SelfType): + rtype = obj_type + node.computed_type = rtype + return rtype + + except SemanticError: + self.errors.append( + ( + AttributeError(METHOD_NOT_DEFINED % (node.id)), + node.token.pos, + ) + ) + node.computed_type = ErrorType() + return node.computed_type + + @visitor.when(CaseNode) + def visit(self, node: CaseNode, scope: Scope): + # check expression + self.visit(node.expr, scope) + + nscope = scope.create_child() + + # check branches + types = [] + node.branch_idx = [] + decTypes 
= set() + size = 0 + for branch in node.branch_list: + node.branch_idx.append(None) + + # check id is not self + if branch.id == "self": + self.errors.append( + (SemanticError(SELF_IS_READONLY), branch.idToken.pos) + ) + + # check no branch repeats type + decTypes.add(branch.typex) + if size == len(decTypes): + self.errors.append( + ( + SemanticError(CASE_BRANCH_ERROR % (branch.typex)), + branch.typexToken.pos, + ) + ) + size = len(decTypes) + + try: + var_type = self.context.get_type(branch.typex) + if isinstance(var_type, SelfType): + var_type = SelfType(self.current_type) + except TypeError as ex: + self.errors.append((ex, branch.typexToken.pos)) + var_type = ErrorType() + + # check type is autotype and assign an id in the manager + if isinstance(var_type, AutoType): + node.branch_idx[-1] = self.manager.assign_id(self.obj_type) + + new_scope = nscope.create_child() + new_scope.define_variable(branch.id, var_type, node.branch_idx[-1]) + + computed_type = self.visit(branch.expression, new_scope) + types.append(computed_type) + + node.computed_type = LCA(types) + return node.computed_type + + @visitor.when(BlockNode) + def visit(self, node: BlockNode, scope: Scope): + nscope = scope.create_child() + + # Check expressions + computed_type = None + for expr in node.expr_list: + computed_type = self.visit(expr, nscope) + + # return the type of the last expression of the list + node.computed_type = computed_type + return computed_type + + @visitor.when(LoopNode) + def visit(self, node: LoopNode, scope: Scope): + nscope = scope.create_child() + + # checking condition: it must conform to bool + cond_type = self.visit(node.condition, nscope) + if not cond_type.conforms_to(self.bool_type): + self.errors.append( + ( + TypeError( + INCOMPATIBLE_TYPES % (cond_type.name, self.bool_type.name) + ), + node.token.pos, + ) + ) + + # checking body + self.visit(node.body, nscope) + + node.computed_type = self.obj_type + return node.computed_type + + @visitor.when(ConditionalNode) + def
visit(self, node: ConditionalNode, scope: Scope): + + # check condition conforms to bool + cond_type = self.visit(node.condition, scope) + if not cond_type.conforms_to(self.bool_type): + self.errors.append( + ( + TypeError( + INCOMPATIBLE_TYPES % (cond_type.name, self.bool_type.name) + ), + node.token.pos, + ) + ) + + then_type = self.visit(node.then_body, scope.create_child()) + else_type = self.visit(node.else_body, scope.create_child()) + + node.computed_type = LCA([then_type, else_type]) + return node.computed_type + + @visitor.when(LetNode) + def visit(self, node: LetNode, scope: Scope): + nscope = scope.create_child() + + node.idx_list = [None] * len(node.id_list) + for i, item in enumerate(node.id_list): + # create a new_scope for every variable defined + new_scope = nscope.create_child() + + # check id in let can not be self + if item.id == "self": + self.errors.append((SemanticError(SELF_IS_READONLY), item.idToken.pos)) + item.id = f"1{item.id}" + node.id_list[i] = (item.id, item.typex, item.expression) + + try: + typex = self.context.get_type(item.typex) + if isinstance(typex, SelfType): + typex = SelfType(self.current_type) + except TypeError as ex: + self.errors.append((ex, item.typexToken.pos)) + typex = ErrorType() + + if isinstance(typex, AutoType): + node.idx_list[i] = self.manager.assign_id(self.obj_type) + + if item.expression is not None: + expr_type = self.visit(item.expression, new_scope) + if not self.check_conformance(expr_type, typex): + self.errors.append( + ( + TypeError( + INCOMPATIBLE_TYPES % (expr_type.name, typex.name) + ), + item.token.pos, + ) + ) + + new_scope.define_variable(item.id, typex, node.idx_list[i]) + nscope = new_scope + + node.computed_type = self.visit(node.body, nscope) + return node.computed_type + + @visitor.when(ArithmeticNode) + def visit(self, node: ArithmeticNode, scope: Scope): + self.check_expr(node, scope) + node.computed_type = self.int_type + return node.computed_type + + @visitor.when(ComparisonNode) + def 
visit(self, node: ComparisonNode, scope: Scope): + self.check_expr(node, scope) + node.computed_type = self.bool_type + return node.computed_type + + @visitor.when(EqualNode) + def visit(self, node: EqualNode, scope: Scope): + left = self.visit(node.left, scope) + right = self.visit(node.right, scope) + + types = [self.int_type, self.bool_type, self.string_type] + + def check_equal(typex, other): + for t in types: + if typex.conforms_to(t): + if not other.conforms_to(t): + self.errors.append( + ( + TypeError(INCOMPATIBLE_TYPES % (other.name, t.name)), + node.token.pos, + ) + ) + return True + return False + + ok = check_equal(left, right) + if not ok: + check_equal(right, left) + + node.computed_type = self.bool_type + return node.computed_type + + @visitor.when(VoidNode) + def visit(self, node: VoidNode, scope: Scope): + self.visit(node.expr, scope) + + node.computed_type = self.bool_type + return node.computed_type + + @visitor.when(NotNode) + def visit(self, node: NotNode, scope: Scope): + typex = self.visit(node.expr, scope) + if not typex.conforms_to(self.bool_type): + self.errors.append( + ( + TypeError(INCOMPATIBLE_TYPES % (typex.name, self.bool_type.name)), + node.token.pos, + ) + ) + + node.computed_type = self.bool_type + return node.computed_type + + @visitor.when(NegNode) + def visit(self, node: NegNode, scope: Scope): + typex = self.visit(node.expr, scope) + if not typex.conforms_to(self.int_type): + self.errors.append( + ( + TypeError(INCOMPATIBLE_TYPES % (typex.name, self.int_type.name)), + node.token.pos, + ) + ) + + node.computed_type = self.int_type + return node.computed_type + + @visitor.when(ConstantNumNode) + def visit(self, node, scope): + node.computed_type = self.int_type + return node.computed_type + + @visitor.when(ConstantBoolNode) + def visit(self, node, scope): + node.computed_type = self.bool_type + return node.computed_type + + @visitor.when(ConstantStringNode) + def visit(self, node, scope): + node.computed_type = self.string_type + 
return node.computed_type + + @visitor.when(VariableNode) + def visit(self, node: VariableNode, scope: Scope): + var = scope.find_variable(node.lex) + if var is None: + self.errors.append( + ( + NameError( + VARIABLE_NOT_DEFINED % (node.lex, self.current_type.name) + ), + node.token.pos, + ) + ) + var = scope.define_variable(node.lex, ErrorType()) + + node.computed_type = var.type + return var.type + + @visitor.when(InstantiateNode) + def visit(self, node: InstantiateNode, scope: Scope): + try: + typex = self.context.get_type(node.lex) + if isinstance(typex, AutoType): + raise SemanticError(AUTOTYPE_ERROR) + if isinstance(typex, SelfType): + typex = SelfType(self.current_type) + except (SemanticError, TypeError) as ex: + self.errors.append((ex, node.token.pos)) + typex = ErrorType() + + node.computed_type = typex + return typex + + def check_expr(self, node: BinaryNode, scope: Scope): + # checking left expr + left = self.visit(node.left, scope) + if not left.conforms_to(self.int_type): + self.errors.append( + ( + TypeError(INCOMPATIBLE_TYPES % (left.name, self.int_type.name)), + node.token.pos, + ) + ) + + # checking right expr + right = self.visit(node.right, scope) + if not right.conforms_to(self.int_type): + self.errors.append( + ( + TypeError(INCOMPATIBLE_TYPES % (right.name, self.int_type.name)), + node.token.pos, + ) + ) + + def check_conformance(self, computed_type, attr_type): + return computed_type.conforms_to(attr_type) or ( + isinstance(computed_type, SelfType) + and self.current_type.conforms_to(attr_type) + ) diff --git a/src/compiler/visitors/semantics_check/type_collector.py b/src/compiler/visitors/semantics_check/type_collector.py new file mode 100644 index 000000000..6db89e746 --- /dev/null +++ b/src/compiler/visitors/semantics_check/type_collector.py @@ -0,0 +1,181 @@ +from ...cmp.utils import Token +from ...cmp.semantic import ( + SemanticError, + Type, + Context, + ObjectType, + IOType, + StringType, + IntType, + BoolType, + SelfType, + AutoType, +)
+from ...cmp.ast import ProgramNode, ClassDeclarationNode +from ..utils import AUTOTYPE_ERROR +from typing import Dict, List, Optional, Tuple +import compiler.visitors.visitor as visitor + +built_in_types = [] + + +class TypeCollector(object): + def __init__(self): + self.context: Optional[Context] = None + self.errors: List[Tuple[Exception, Tuple[int, int]]] = [] + self.parent: Dict[str, Optional[Token]] = {} + + @visitor.on("node") + def visit(self, node): + pass + + @visitor.when(ProgramNode) + def visit(self, node: ProgramNode): + self.context = Context() + self.define_built_in_types() + + # Adding built-in types to context + for typex in built_in_types: + self.context.types[typex.name] = typex + + for declaration in node.declarations: + self.visit(declaration) + + self.check_parents() + self.check_cyclic_inheritance() + + # Order class declarations according to their depth in the inheritance tree + node.declarations = self.order_types(node) + + @visitor.when(ClassDeclarationNode) + def visit(self, node: ClassDeclarationNode): + # flag is set to True if the class is successfully added to the context + flag = False + try: + if node.id == "AUTO_TYPE": + raise SemanticError(AUTOTYPE_ERROR) + self.context.create_type(node.id, node.tokenId.pos) + flag = True + self.parent[node.id] = node.parent + except SemanticError as ex: + self.errors.append((ex, node.token.pos)) + + # changing class id so it can be added to context + while not flag: + node.id = f"1{node.id}" + try: + self.context.create_type(node.id) + flag = True + self.parent[node.id] = node.parent + except SemanticError: + pass + + def define_built_in_types(self): + objectx = ObjectType() + iox = IOType() + intx = IntType() + stringx = StringType() + boolx = BoolType() + self_type = SelfType() + autotype = AutoType() + + # Object Methods + objectx.define_method("abort", [], [], objectx, []) + objectx.define_method("type_name", [], [], stringx, []) + objectx.define_method("copy", [], [], self_type, []) + + # IO
Methods + iox.define_method("out_string", ["x"], [stringx], self_type, [None]) + iox.define_method("out_int", ["x"], [intx], self_type, [None]) + iox.define_method("in_string", [], [], stringx, []) + iox.define_method("in_int", [], [], intx, []) + + # String Methods + stringx.define_method("length", [], [], intx, []) + stringx.define_method("concat", ["s"], [stringx], stringx, [None]) + stringx.define_method("substr", ["i", "l"], [intx, intx], stringx, [None]) + + # Setting Object as parent + iox.set_parent(objectx) + stringx.set_parent(objectx) + intx.set_parent(objectx) + boolx.set_parent(objectx) + + built_in_types.extend([objectx, iox, stringx, intx, boolx, self_type, autotype]) + + def check_parents(self): + for item in self.parent.keys(): + item_type = self.context.get_type(item) + if self.parent[item] is None: + # setting Object as parent + item_type.set_parent(built_in_types[0]) + else: + try: + typex = self.context.get_type(self.parent[item].lex) + if not typex.can_be_inherited(): + self.errors.append( + ( + SemanticError( + f"Class {item} can not inherit class {typex.name}" + ), + self.parent[item].pos, + ) + ) + typex = built_in_types[0] + item_type.set_parent(typex) + except TypeError as ex: + self.errors.append((ex, self.parent[item].pos)) + item_type.set_parent(built_in_types[0]) + + def check_cyclic_inheritance(self): + flag = [] + + def find(item: Type) -> int: + for i, typex in enumerate(flag): + if typex.name == item.name: + return i + return len(flag) + + def check_path(idx: int, item: Type): + while True: + flag.append(item) + parent = item.parent + if parent is None: + break + pos = find(parent) + if pos < len(flag): + if pos >= idx: + self.errors.append( + (SemanticError("Cyclic heritage."), item.pos) + ) + item.parent = built_in_types[0] + break + item = parent + + for item in self.context.types.values(): + idx = find(item) + if idx == len(flag): + check_path(idx, item) + + def order_types(self, node: ProgramNode): + sorted_declarations: 
List[ClassDeclarationNode] = [] + flag = [False] * len(node.declarations) + + change = True + while change: + change = False + + current = [] + for i, dec in enumerate(node.declarations): + if not flag[i]: + typex = self.context.get_type(dec.id) + if typex.parent.name in [ + item.id for item in sorted_declarations + ] or any(typex.parent.name == bit.name for bit in built_in_types): + current.append(dec) + flag[i] = True + change = True + + sorted_declarations.extend(current) + + return sorted_declarations diff --git a/src/compiler/visitors/semantics_check/type_inferencer.py b/src/compiler/visitors/semantics_check/type_inferencer.py new file mode 100644 index 000000000..b67c343b7 --- /dev/null +++ b/src/compiler/visitors/semantics_check/type_inferencer.py @@ -0,0 +1,505 @@ +from ...cmp.ast import ( + ArithmeticNode, + AssignNode, + AttrDeclarationNode, + BlockNode, + CallNode, + CaseNode, + ClassDeclarationNode, + ComparisonNode, + ConditionalNode, + ConstantBoolNode, + ConstantNumNode, + ConstantStringNode, + EqualNode, + FuncDeclarationNode, + InstantiateNode, + LetNode, + LoopNode, + NegNode, + NotNode, + ProgramNode, + VariableNode, + VoidNode, +) +from ...cmp.semantic import ( + AutoType, + Context, + ErrorType, + InferencerManager, + LCA, + Method, + Scope, + SelfType, + SemanticError, + Type, +) +from ..utils import AUTOTYPE_ERROR +from typing import List, Tuple +import compiler.visitors.visitor as visitor + + +class TypeInferencer: + def __init__(self, context, manager): + self.context: Context = context + self.errors: List[Tuple[Exception, Tuple[int, int]]] = [] + self.manager: InferencerManager = manager + + self.current_type: Type = None + self.current_method: Method = None + self.scope_children_id = 0 + + # built-in types + self.obj_type = self.context.get_type("Object") + self.int_type = self.context.get_type("Int") + self.bool_type = self.context.get_type("Bool") + self.string_type = self.context.get_type("String") + + @visitor.on("node") + def 
visit(self, node, scope, types=None): + pass + + @visitor.when(ProgramNode) + def visit(self, node: ProgramNode, scope: Scope, types=None): + if types is None: + types = [] + + change = self.manager.count > 0 + while change: + change = False + + for declaration, child_scope in zip(node.declarations, scope.children): + self.scope_children_id = 0 + self.visit(declaration, child_scope, types) + + change = self.manager.infer_all() + + self.manager.infer_object(self.obj_type) + for declaration, child_scope in zip(node.declarations, scope.children): + self.scope_children_id = 0 + self.visit(declaration, child_scope, types) + + @visitor.when(ClassDeclarationNode) + def visit(self, node, scope, types): + self.current_type = self.context.get_type(node.id) + + for feature, child_scope in zip(node.features, scope.children): + self.scope_children_id = 0 + self.visit(feature, child_scope, types) + + @visitor.when(AttrDeclarationNode) + def visit(self, node: AttrDeclarationNode, scope, types): + var_attr = scope.find_variable(node.id) + attr_type = var_attr.type + idx = var_attr.idx + + if isinstance(attr_type, AutoType): + inf_type = self.manager.infered_type[idx] + if inf_type is not None: + if isinstance(inf_type, ErrorType): + self.errors.append( + (SemanticError(AUTOTYPE_ERROR), node.typeToken.pos) + ) + else: + node.type = inf_type.name + self.current_type.update_attr(node.id, inf_type) + scope.update_variable(node.id, inf_type) + attr_type = inf_type + + conforms_to_types = [] + if isinstance(attr_type, AutoType): + conforms_to_types.extend(self.manager.conforms_to[idx]) + else: + conforms_to_types.append(attr_type) + + if node.expr is not None: + _, computed_types = self.visit(node.expr, scope, conforms_to_types) + if isinstance(attr_type, AutoType): + self.manager.upd_conformed_by(node.idx, computed_types) + + @visitor.when(FuncDeclarationNode) + def visit(self, node: FuncDeclarationNode, scope: Scope, types): + self.current_method = 
self.current_type.get_method(node.id) + + method_params = [] + for i, t in enumerate(self.current_method.param_types): + idx = self.current_method.param_idx[i] + name = self.current_method.param_names[i] + if isinstance(t, AutoType): + inf_type = self.manager.infered_type[idx] + if inf_type is not None: + if isinstance(inf_type, ErrorType): + self.errors.append( + (SemanticError(AUTOTYPE_ERROR), node.params[i][1].pos) + ) + else: + node.params[i] = (node.params[i][0], inf_type.name) + self.current_type.update_method_param(name, inf_type, i) + scope.update_variable(name, inf_type) + t = inf_type + method_params.append(t) + + rtype = self.current_method.return_type + idx = self.current_method.ridx + if isinstance(rtype, AutoType): + inf_type = self.manager.infered_type[idx] + if inf_type is not None: + if isinstance(inf_type, ErrorType): + self.errors.append( + (SemanticError(AUTOTYPE_ERROR), node.typeToken.pos) + ) + else: + node.type = inf_type.name + self.current_type.update_method_rtype( + self.current_method.name, inf_type + ) + rtype = inf_type + + # checking overwriting + try: + method = self.current_type.parent.get_method(node.id) + + for i, t in enumerate(method_params): + if not isinstance(method.param_types[i], AutoType) and isinstance( + t, AutoType + ): + self.manager.auto_to_type( + self.current_method.param_idx[i], method.param_types[i] + ) + self.manager.type_to_auto( + self.current_method.param_idx[i], method.param_types[i] + ) + + if isinstance(rtype, AutoType) and not isinstance( + method.return_type, AutoType + ): + self.manager.auto_to_type(idx, method.return_type) + self.manager.type_to_auto(idx, method.return_type) + except SemanticError: + pass + + # checking return type in computed types of the expression + conforms_to_types = [] + if isinstance(rtype, AutoType): + conforms_to_types.extend(self.manager.conforms_to[idx]) + else: + conforms_to_types.append(rtype) + _, computed_types = self.visit(node.body, scope, conforms_to_types) + if 
isinstance(rtype, AutoType): + self.manager.upd_conformed_by(self.current_method.ridx, computed_types) + + @visitor.when(AssignNode) + def visit(self, node, scope, types): + var = scope.find_variable(node.id) + # obtaining defined variable + if isinstance(var.type, AutoType): + inf_type = self.manager.infered_type[var.idx] + if inf_type is not None: + scope.update_variable(var.name, inf_type) + var.type = inf_type + + conforms_to_types = [] + if isinstance(var.type, AutoType): + conforms_to_types.extend(self.manager.conforms_to[var.idx]) + else: + conforms_to_types.append(var.type) + conforms_to_types.extend(types) + + scope_index = self.scope_children_id + self.scope_children_id = 0 + typex, computed_types = self.visit( + node.expr, scope.children[scope_index], conforms_to_types + ) + self.scope_children_id = scope_index + 1 + + if isinstance(var.type, AutoType): + self.manager.upd_conformed_by(var.idx, computed_types) + + return typex, computed_types + + @visitor.when(CallNode) + def visit(self, node, scope, types): + # Check cast type + cast_type = None + if not node.type == "": + try: + cast_type = self.context.get_type(node.type) + if isinstance(cast_type, AutoType): + cast_type = None + elif isinstance(cast_type, SelfType): + cast_type = self.current_type + except TypeError: + pass + + # Check object + conforms_to_types = [] if cast_type is None else [cast_type] + obj_type, computed_types = self.visit(node.obj, scope, conforms_to_types) + + if cast_type is None: + cast_type = obj_type + if isinstance(cast_type, SelfType): + cast_type = self.current_type + + # if the obj that is calling the function is autotype, let it pass + if isinstance(cast_type, AutoType): + return cast_type, [] + + # Check this function is defined for cast_type + try: + method = cast_type.get_method(node.id) + if not len(node.args) == len(method.param_types): + return ErrorType(), [] + for i, arg in enumerate(node.args): + arg_type = method.param_types[i] + if isinstance(arg_type, 
AutoType): + inf_type = self.manager.infered_type[method.param_idx[i]] + if inf_type is not None: + arg_type = inf_type + + conforms_to_types = [] + if isinstance(arg_type, AutoType): + conforms_to_types.extend( + self.manager.conforms_to[method.param_idx[i]] + ) + else: + conforms_to_types.append(arg_type) + _, computed_types = self.visit(arg, scope, conforms_to_types) + if isinstance(arg_type, AutoType): + self.manager.upd_conformed_by(method.param_idx[i], computed_types) + + # check return_type + computed_types = [] + rtype = method.return_type + if isinstance(rtype, SelfType): + rtype = obj_type + + if isinstance(rtype, AutoType): + self.manager.upd_conforms_to(method.ridx, types) + computed_types.extend(self.manager.conformed_by[method.ridx]) + else: + computed_types.append(rtype) + + return rtype, computed_types + + except SemanticError: + return ErrorType(), [] + + @visitor.when(CaseNode) + def visit(self, node: CaseNode, scope: Scope, types): + # check expression + self.visit(node.expr, scope, set()) + + scope_index = self.scope_children_id + nscope = scope.children[scope_index] + + # check branches + expr_types = [] + for i, (branch, child_scope) in enumerate( + zip(node.branch_list, nscope.children) + ): + branch_name, branch_type, expr = branch.id, branch.typex, branch.expression + var = child_scope.find_variable(branch_name) + branch_type = var.type + if isinstance(branch_type, AutoType): + inf_type = self.manager.infered_type[node.branch_idx[i]] + if inf_type is not None: + if isinstance(inf_type, ErrorType): + self.errors.append( + (SemanticError(AUTOTYPE_ERROR), branch.token.pos) + ) + else: + node.branch_list[i] = (branch_name, inf_type.name, expr) + child_scope.update_variable(branch_name, inf_type) + + self.scope_children_id = 0 + _, computed_types = self.visit(expr, child_scope, types) + expr_types.extend(computed_types) + + self.scope_children_id = scope_index + 1 + return LCA(expr_types), expr_types + + @visitor.when(BlockNode) + def 
visit(self, node, scope, types): + scope_index = self.scope_children_id + nscope = scope.children[scope_index] + self.scope_children_id = 0 + + # Check expressions but last one + for expr in node.expr_list[:-1]: + self.visit(expr, nscope, []) + + # Check last expression + typex, computed_types = self.visit(node.expr_list[-1], nscope, types) + + # return the type of the last expression of the list + self.scope_children_id = scope_index + 1 + return typex, computed_types + + @visitor.when(LoopNode) + def visit(self, node, scope, types): + scope_index = self.scope_children_id + nscope = scope.children[scope_index] + self.scope_children_id = 0 + + # checking condition: it must conform to bool + self.visit(node.condition, nscope, [self.bool_type]) + + # checking body + self.visit(node.body, nscope, []) + + self.scope_children_id = scope_index + 1 + return self.obj_type, [self.obj_type] + + @visitor.when(ConditionalNode) + def visit(self, node, scope, types): + # check condition conforms to bool + self.visit(node.condition, scope, [self.bool_type]) + + branch_types = [] + + scope_index = self.scope_children_id + self.scope_children_id = 0 + _, then_types = self.visit(node.then_body, scope.children[scope_index], types) + scope_index += 1 + self.scope_children_id = 0 + _, else_types = self.visit(node.else_body, scope.children[scope_index], types) + + branch_types.extend(then_types) + branch_types.extend(else_types) + + self.scope_children_id = scope_index + 1 + return LCA(branch_types), branch_types + + @visitor.when(LetNode) + def visit(self, node: LetNode, scope: Scope, types): + scope_index = self.scope_children_id + nscope = scope.children[scope_index] + self.scope_children_id = 0 + + for i, item in enumerate(node.id_list): + temp_scope_index = self.scope_children_id + new_scope = nscope.children[temp_scope_index] + self.scope_children_id = 0 + + var_name, _, expr = item.id, item.typex, item.expression + var = new_scope.find_variable(var_name) + + if 
isinstance(var.type, AutoType): + inf_type = self.manager.infered_type[node.idx_list[i]] + if inf_type is not None: + if isinstance(inf_type, ErrorType): + self.errors.append( + (SemanticError(AUTOTYPE_ERROR), item.typexToken.pos) + ) + else: + node.id_list[i] = (var_name, inf_type.name, expr) + new_scope.update_variable(var_name, inf_type) + var.type = inf_type + + conforms_to_types = [] + if isinstance(var.type, AutoType): + conforms_to_types.extend(self.manager.conforms_to[node.idx_list[i]]) + else: + conforms_to_types.append(var.type) + + if expr is not None: + _, computed_types = self.visit(expr, new_scope, conforms_to_types) + if isinstance(var.type, AutoType): + self.manager.upd_conformed_by(node.idx_list[i], computed_types) + + nscope = new_scope + + expr_type, computed_types = self.visit(node.body, nscope, types) + self.scope_children_id = scope_index + 1 + return expr_type, computed_types + + @visitor.when(ArithmeticNode) + def visit(self, node, scope, types): + self.check_expr(node, scope) + return self.int_type, [self.int_type] + + @visitor.when(ComparisonNode) + def visit(self, node, scope, types): + self.check_expr(node, scope) + return self.bool_type, [self.bool_type] + + @visitor.when(EqualNode) + def visit(self, node, scope, types): + left, _ = self.visit(node.left, scope, []) + right, _ = self.visit(node.right, scope, []) + + fixed_types = [self.int_type, self.bool_type, self.string_type] + + def check_equal(typex): + if not isinstance(typex, AutoType) and not isinstance(typex, ErrorType): + for t in fixed_types: + if typex.conforms_to(t): + return True + return False + + if check_equal(left): + self.visit(node.right, scope, [left]) + elif check_equal(right): + self.visit(node.left, scope, [right]) + + return self.bool_type, [self.bool_type] + + @visitor.when(VoidNode) + def visit(self, node, scope, types): + self.visit(node.expr, scope, []) + return self.bool_type, [self.bool_type] + + @visitor.when(NotNode) + def visit(self, node, scope, types): 
+ self.visit(node.expr, scope, [self.bool_type]) + return self.bool_type, [self.bool_type] + + @visitor.when(NegNode) + def visit(self, node, scope, types): + self.visit(node.expr, scope, [self.int_type]) + return self.int_type, [self.int_type] + + @visitor.when(ConstantNumNode) + def visit(self, node, scope, types): + return self.int_type, [self.int_type] + + @visitor.when(ConstantBoolNode) + def visit(self, node, scope, types): + return self.bool_type, [self.bool_type] + + @visitor.when(ConstantStringNode) + def visit(self, node, scope, types): + return self.string_type, [self.string_type] + + @visitor.when(VariableNode) + def visit(self, node, scope, types): + var = scope.find_variable(node.lex) + if isinstance(var.type, AutoType): + inf_type = self.manager.infered_type[var.idx] + if inf_type is not None: + scope.update_variable(var.name, inf_type) + var.type = inf_type + + conformed_by = [] + if isinstance(var.type, AutoType): + self.manager.upd_conforms_to(var.idx, types) + conformed_by.extend(self.manager.conformed_by[var.idx]) + else: + conformed_by.append(var.type) + + return var.type, conformed_by + + @visitor.when(InstantiateNode) + def visit(self, node, scope, types): + try: + typex = self.context.get_type(node.lex) + if isinstance(typex, SelfType): + typex = SelfType(self.current_type) + except SemanticError: + typex = ErrorType() + + return typex, [typex] + + def check_expr(self, node, scope): + self.visit(node.left, scope, [self.int_type]) + self.visit(node.right, scope, [self.int_type]) diff --git a/src/compiler/visitors/utils.py b/src/compiler/visitors/utils.py new file mode 100644 index 000000000..a109af42d --- /dev/null +++ b/src/compiler/visitors/utils.py @@ -0,0 +1,14 @@ +WRONG_SIGNATURE = 'Method "%s" already defined in "%s" with a different signature.' +SELF_IS_READONLY = 'Variable "self" is read-only.' +LOCAL_ALREADY_DEFINED = 'Variable "%s" is already defined in method "%s".' +INCOMPATIBLE_TYPES = 'Cannot convert "%s" into "%s".' 
+VARIABLE_NOT_DEFINED = 'Variable "%s" is not defined in "%s".' +INVALID_OPERATION = 'Operation is not defined between "%s" and "%s".' +INVALID_TYPE = "SELF_TYPE is not valid" +AUTOTYPE_ERROR = "Incorrect use of AUTO_TYPE" +METHOD_NOT_DEFINED = 'Dispatch to undefined method "%s".' +CASE_BRANCH_ERROR = 'Duplicate branch "%s" in case statement.' +SELF_ERROR = 'Cannot use "self" as identifier' +ALREADY_DEFINED = '"%s" is already defined' +MAIN_CLASS_ERROR = "Class Main must contain a method main" +MAIN_PROGRAM_ERROR = "Program must contain a class Main" diff --git a/src/compiler/visitors/visitor.py b/src/compiler/visitors/visitor.py new file mode 100644 index 000000000..0bcba712e --- /dev/null +++ b/src/compiler/visitors/visitor.py @@ -0,0 +1,85 @@ +# The MIT License (MIT) +# +# Copyright (c) 2013 Curtis Schlak +# +# Permission is hereby granted, free of charge, to any person obtaining a copy +# of this software and associated documentation files (the "Software"), to deal +# in the Software without restriction, including without limitation the rights +# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +# copies of the Software, and to permit persons to whom the Software is +# furnished to do so, subject to the following conditions: +# +# The above copyright notice and this permission notice shall be included in +# all copies or substantial portions of the Software. +# +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN +# THE SOFTWARE. 
+ +import inspect + +__all__ = ["on", "when"] + + +def on(param_name): + def f(fn): + dispatcher = Dispatcher(param_name, fn) + return dispatcher + + return f + + +def when(param_type): + def f(fn): + frame = inspect.currentframe().f_back + func_name = fn.func_name if "func_name" in dir(fn) else fn.__name__ + dispatcher = frame.f_locals[func_name] + if not isinstance(dispatcher, Dispatcher): + dispatcher = dispatcher.dispatcher + dispatcher.add_target(param_type, fn) + + def ff(*args, **kw): + return dispatcher(*args, **kw) + + ff.dispatcher = dispatcher + return ff + + return f + + +class Dispatcher(object): + def __init__(self, param_name, fn): + frame = inspect.currentframe().f_back.f_back + top_level = frame.f_locals == frame.f_globals + self.param_index = self.__argspec(fn).args.index(param_name) + self.param_name = param_name + self.targets = {} + + def __call__(self, *args, **kw): + typ = args[self.param_index].__class__ + d = self.targets.get(typ) + if d is not None: + return d(*args, **kw) + else: + issub = issubclass + t = self.targets + ks = t.keys() + ans = [t[k](*args, **kw) for k in ks if issub(typ, k)] + if len(ans) == 1: + return ans.pop() + return ans + + def add_target(self, typ, target): + self.targets[typ] = target + + @staticmethod + def __argspec(fn): + # Support for Python 3 type hints requires inspect.getfullargspec + if hasattr(inspect, "getfullargspec"): + return inspect.getfullargspec(fn) + else: + return inspect.getargspec(fn) diff --git a/src/coolc.sh b/src/coolc.sh index 3088de4f9..62c220125 100755 --- a/src/coolc.sh +++ b/src/coolc.sh @@ -1,11 +1,7 @@ -# Incluya aquí las instrucciones necesarias para ejecutar su compilador - INPUT_FILE=$1 -OUTPUT_FILE=${INPUT_FILE:0: -2}mips -# Si su compilador no lo hace ya, aquí puede imprimir la información de contacto -echo "LINEA_CON_NOMBRE_Y_VERSION_DEL_COMPILADOR" # TODO: Recuerde cambiar estas -echo "Copyright (c) 2019: Nombre1, Nombre2, Nombre3" # TODO: líneas a los valores correctos +echo 
"codersUP - COOLCompilerv0.0.1" +echo "Copyright (c) 2022: Carmen Irene Cabrera Rodríguez, David Guaty Domínguez, Enrique Martínez González" -# Llamar al compilador -echo "Compiling $INPUT_FILE into $OUTPUT_FILE" +# Run the compiler +python3 main.py -f "$INPUT_FILE" diff --git a/src/guide.md b/src/guide.md new file mode 100644 index 000000000..d45a19abe --- /dev/null +++ b/src/guide.md @@ -0,0 +1,19 @@ +# Work guidelines + +Here is a guide on how to organize the team work. In the [Documentation](#documentation) section you can find a share resources for everyone to study and learn. + +## Documentation + +- [Python packages](https://dev.to/codemouse92/dead-simple-python-project-structure-and-imports-38c6) to use modules in the project + +## Tasks + +No work is going to be pushed to main directly. Each task we develope should be done in a separate branch and reviewed by the others in order to merge it. + +We'll have 3 sections for the tasks: **TODO, IN PROGRESS, DONE**. Each new task goes into **TODO** section. Once someone picks a task, it should be assigned to it. When that person starts working on the task, it should be moved into **IN PROGRESS** section. And of course once the task is finished (when it is merged into main branch), it should be moved into **DONE** section. 
+ +### TODO + +### IN PROGRESS + +### DONE diff --git a/src/main.py b/src/main.py new file mode 100644 index 000000000..1331ac819 --- /dev/null +++ b/src/main.py @@ -0,0 +1,116 @@ +from compiler.cmp.grammar import G +from compiler.lexer.lex import CoolLexer +from compiler.parser.parser import LR1Parser, evaluate_reverse_parse +from compiler.visitors.cil2mips.cil2mips import CILToMIPSVisitor +from compiler.visitors.cil2mips.mips_printer import MIPSPrintVisitor +from compiler.visitors.cool2cil.cool2cil import COOLToCILVisitor +from compiler.visitors.semantics_check.type_builder import TypeBuilder +from compiler.visitors.semantics_check.type_checker import TypeChecker +from compiler.visitors.semantics_check.type_collector import TypeCollector +from compiler.visitors.semantics_check.type_inferencer import TypeInferencer +from sys import exit +import os + + +def main(args): + try: + with open(args.file, "r") as fd: + code = fd.read() + except: + print(f"(0,0) - CompilerError: file {args.file} not found") + exit(1) + + # Lexer + lexer = CoolLexer() + tokens, errors = lexer.tokenize(code) + for error in errors: + print(error) + if errors: + exit(1) + + # Parser + parser = LR1Parser(G) + parseResult, error = parser(tokens, get_shift_reduce=True) + if error: + print(error) + exit(1) + + parse, operations = parseResult + ast = evaluate_reverse_parse(parse, operations, tokens) + + # Collecting types + collector = TypeCollector() + collector.visit(ast) + context = collector.context + for (e, pos) in collector.errors: + print(f"{pos} - {type(e).__name__}: {str(e)}") + if collector.errors: + exit(1) + + # Building types + builder = TypeBuilder(context) + builder.visit(ast) + manager = builder.manager + for (e, pos) in builder.errors: + print(f"{pos} - {type(e).__name__}: {str(e)}") + if builder.errors: + exit(1) + + # Type checking + checker = TypeChecker(context, manager) + scope = checker.visit(ast) + for (e, pos) in checker.errors: + print(f"{pos} - {type(e).__name__}: 
{str(e)}") + if checker.errors: + exit(1) + + # Inferencing Autotype + inferencer = TypeInferencer(context, manager) + inferencer.visit(ast, scope) + for e in inferencer.errors: + print(f"{pos} - {type(e).__name__}: {str(e)}") + if inferencer.errors: + exit(1) + + # Last check without autotypes + checker = TypeChecker(context, manager) + checker.visit(ast) + for (e, pos) in checker.errors: + print(f"{pos} - {type(e).__name__}: {str(e)}") + if checker.errors: + exit(1) + + # COOL to CIL + cil_visitor = COOLToCILVisitor(context) + cil_ast = cil_visitor.visit(ast, scope) + + # CIL to MIPS + cil_to_mips = CILToMIPSVisitor() + mips_ast = cil_to_mips.visit(cil_ast) + printer = MIPSPrintVisitor() + mips_code = printer.visit(mips_ast) + + # Output MIPS file + out_file = f"{args.file[:-3]}.mips" + lib_path = os.path.abspath( + os.path.join(__file__, "../compiler/visitors/cil2mips/mips_lib.asm") + ) + with open(out_file, "w") as f: + f.write(mips_code) + with open(lib_path) as f2: + f.write("".join(f2.readlines())) + + exit(0) + + +if __name__ == "__main__": + import argparse + + parser = argparse.ArgumentParser(description="CoolCompiler") + parser.add_argument( + "-f", "--file", type=str, default="code.cl", help="File to read cool code from" + ) + + args = parser.parse_args() + + main(args) diff --git a/src/makefile b/src/makefile index 30df993f5..9be6ed365 100644 --- a/src/makefile +++ b/src/makefile @@ -1,12 +1,24 @@ -.PHONY: clean +.DEFAULT_GOAL := info +.PHONY: clean info main test + +CODE := code.cl +FILE_NAME := $(shell echo $(CODE) | cut -d '.' 
-f 1) +ASM := $(FILE_NAME).mips main: - # Compiling the compiler :) + @./coolc.sh $(CODE) + @spim -file $(ASM) clean: - rm -rf build/* - rm -rf ../tests/*/*.mips + @rm -rf build/* + @rm -rf ../tests/*/*.mips test: - pytest ../tests -v --tb=short -m=${TAG} + @pytest ../tests -v --tb=short -m=${TAG} + +install: + @python -m pip install -r ../requirements.txt + @sudo apt-get install spim +info: + @echo "Cool Compiler 2021 - CodersUP" \ No newline at end of file diff --git a/tests/codegen_test.py b/tests/codegen_test.py index 48df768ff..393d7907a 100644 --- a/tests/codegen_test.py +++ b/tests/codegen_test.py @@ -2,8 +2,8 @@ import os from utils import compare_outputs -tests_dir = __file__.rpartition('/')[0] + '/codegen/' -tests = [(file) for file in os.listdir(tests_dir) if file.endswith('.cl')] +tests_dir = __file__.rpartition("/")[0] + "/codegen/" +tests = [(file) for file in os.listdir(tests_dir) if file.endswith(".cl")] # @pytest.mark.lexer # @pytest.mark.parser @@ -13,5 +13,9 @@ @pytest.mark.run(order=4) @pytest.mark.parametrize("cool_file", tests) def test_codegen(compiler_path, cool_file): - compare_outputs(compiler_path, tests_dir + cool_file, tests_dir + cool_file[:-3] + '_input.txt',\ - tests_dir + cool_file[:-3] + '_output.txt') \ No newline at end of file + compare_outputs( + compiler_path, + tests_dir + cool_file, + tests_dir + cool_file[:-3] + "_input.txt", + tests_dir + cool_file[:-3] + "_output.txt", + ) diff --git a/tests/conftest.py b/tests/conftest.py index 1f44eeb72..0969ec84d 100644 --- a/tests/conftest.py +++ b/tests/conftest.py @@ -1,6 +1,7 @@ import pytest import os + @pytest.fixture def compiler_path(): - return os.path.abspath('./coolc.sh') \ No newline at end of file + return os.path.abspath("./coolc.sh") diff --git a/tests/lexer_test.py b/tests/lexer_test.py index 2a27223d3..1e2a36a0d 100644 --- a/tests/lexer_test.py +++ b/tests/lexer_test.py @@ -2,12 +2,15 @@ import os from utils import compare_errors -tests_dir = 
__file__.rpartition('/')[0] + '/lexer/' -tests = [(file) for file in os.listdir(tests_dir) if file.endswith('.cl')] +tests_dir = __file__.rpartition("/")[0] + "/lexer/" +tests = [(file) for file in os.listdir(tests_dir) if file.endswith(".cl")] + @pytest.mark.lexer @pytest.mark.error @pytest.mark.run(order=1) @pytest.mark.parametrize("cool_file", tests) def test_lexer_errors(compiler_path, cool_file): - compare_errors(compiler_path, tests_dir + cool_file, tests_dir + cool_file[:-3] + '_error.txt') \ No newline at end of file + compare_errors( + compiler_path, tests_dir + cool_file, tests_dir + cool_file[:-3] + "_error.txt" + ) diff --git a/tests/parser_test.py b/tests/parser_test.py index 129c0f20a..a1ecc8f9e 100644 --- a/tests/parser_test.py +++ b/tests/parser_test.py @@ -2,12 +2,15 @@ import os from utils import compare_errors -tests_dir = __file__.rpartition('/')[0] + '/parser/' -tests = [(file) for file in os.listdir(tests_dir) if file.endswith('.cl')] +tests_dir = __file__.rpartition("/")[0] + "/parser/" +tests = [(file) for file in os.listdir(tests_dir) if file.endswith(".cl")] + @pytest.mark.parser @pytest.mark.error @pytest.mark.run(order=2) @pytest.mark.parametrize("cool_file", tests) def test_parser_errors(compiler_path, cool_file): - compare_errors(compiler_path, tests_dir + cool_file, tests_dir + cool_file[:-3] + '_error.txt') \ No newline at end of file + compare_errors( + compiler_path, tests_dir + cool_file, tests_dir + cool_file[:-3] + "_error.txt" + ) diff --git a/tests/semantic_test.py b/tests/semantic_test.py index cac9cd78b..c80b3552d 100644 --- a/tests/semantic_test.py +++ b/tests/semantic_test.py @@ -2,13 +2,18 @@ import os from utils import compare_errors, first_error_only_line -tests_dir = __file__.rpartition('/')[0] + '/semantic/' -tests = [(file) for file in os.listdir(tests_dir) if file.endswith('.cl')] +tests_dir = __file__.rpartition("/")[0] + "/semantic/" +tests = [(file) for file in os.listdir(tests_dir) if file.endswith(".cl")] + 
@pytest.mark.semantic @pytest.mark.error @pytest.mark.run(order=3) @pytest.mark.parametrize("cool_file", tests) def test_semantic_errors(compiler_path, cool_file): - compare_errors(compiler_path, tests_dir + cool_file, tests_dir + cool_file[:-3] + '_error.txt', \ - cmp=first_error_only_line) \ No newline at end of file + compare_errors( + compiler_path, + tests_dir + cool_file, + tests_dir + cool_file[:-3] + "_error.txt", + cmp=first_error_only_line, + ) diff --git a/tests/utils/__init__.py b/tests/utils/__init__.py index 90f60fdd8..16281fe0b 100644 --- a/tests/utils/__init__.py +++ b/tests/utils/__init__.py @@ -1 +1 @@ -from .utils import * \ No newline at end of file +from .utils import * diff --git a/tests/utils/utils.py b/tests/utils/utils.py index 961cf7cbc..21712c8f9 100644 --- a/tests/utils/utils.py +++ b/tests/utils/utils.py @@ -2,16 +2,17 @@ import re -COMPILER_TIMEOUT = 'El compilador tarda mucho en responder.' -SPIM_TIMEOUT = 'El spim tarda mucho en responder.' -TEST_MUST_FAIL = 'El test %s debe fallar al compilar' -TEST_MUST_COMPILE = 'El test %s debe compilar' -BAD_ERROR_FORMAT = '''El error no esta en formato: (,) - : - o no se encuentra en la 3ra linea\n\n%s''' -UNEXPECTED_ERROR = 'Se esperaba un %s en (%d, %d). Su error fue un %s en (%d, %d)' -UNEXPECTED_OUTPUT = 'La salida de %s no es la esperada:\n%s\nEsperada:\n%s' +COMPILER_TIMEOUT = "El compilador tarda mucho en responder." +SPIM_TIMEOUT = "El spim tarda mucho en responder." +TEST_MUST_FAIL = "El test %s debe fallar al compilar" +TEST_MUST_COMPILE = "El test %s debe compilar" +BAD_ERROR_FORMAT = """El error no esta en formato: (,) - : + o no se encuentra en la 3ra linea\n\n%s""" +UNEXPECTED_ERROR = "Se esperaba un %s en (%d, %d). 
Su error fue un %s en (%d, %d)" +UNEXPECTED_OUTPUT = "La salida de %s no es la esperada:\n%s\nEsperada:\n%s" + +ERROR_FORMAT = r"^\s*\(\s*(\d+)\s*,\s*(\d+)\s*\)\s*-\s*(\w+)\s*:(.*)$" -ERROR_FORMAT = r'^\s*\(\s*(\d+)\s*,\s*(\d+)\s*\)\s*-\s*(\w+)\s*:(.*)$' def parse_error(error: str): merror = re.fullmatch(ERROR_FORMAT, error) @@ -25,67 +26,108 @@ def first_error(compiler_output: list, errors: list): oline, ocolumn, oerror_type, _ = parse_error(compiler_output[0]) - assert line == oline and column == ocolumn and error_type == oerror_type,\ - UNEXPECTED_ERROR % (error_type, line, column, oerror_type, oline, ocolumn) + assert ( + line == oline and column == ocolumn and error_type == oerror_type + ), UNEXPECTED_ERROR % (error_type, line, column, oerror_type, oline, ocolumn) + def first_error_only_line(compiler_output: list, errors: list): line, column, error_type, _ = parse_error(errors[0]) oline, ocolumn, oerror_type, _ = parse_error(compiler_output[0]) - assert line == oline and error_type == oerror_type,\ - UNEXPECTED_ERROR % (error_type, line, column, oerror_type, oline, ocolumn) + assert line == oline and error_type == oerror_type, UNEXPECTED_ERROR % ( + error_type, + line, + column, + oerror_type, + oline, + ocolumn, + ) def get_file_name(path: str): try: - return path[path.rindex('/') + 1:] + return path[path.rindex("/") + 1 :] except ValueError: return path -def compare_errors(compiler_path: str, cool_file_path: str, error_file_path: str, cmp=first_error, timeout=100): + +def compare_errors( + compiler_path: str, + cool_file_path: str, + error_file_path: str, + cmp=first_error, + timeout=100, +): try: - sp = subprocess.run(['bash', compiler_path, cool_file_path], capture_output=True, timeout=timeout) + sp = subprocess.run( + ["bash", compiler_path, cool_file_path], + capture_output=True, + timeout=timeout, + ) return_code, output = sp.returncode, sp.stdout.decode() except subprocess.TimeoutExpired: assert False, COMPILER_TIMEOUT assert return_code == 1, 
TEST_MUST_FAIL % get_file_name(cool_file_path) - fd = open(error_file_path, 'r') - errors = fd.read().split('\n') + fd = open(error_file_path, "r") + errors = fd.read().split("\n") fd.close() # checking the errors of compiler - compiler_output = output.split('\n') + compiler_output = output.split("\n") cmp(compiler_output[2:], errors) -SPIM_HEADER = r'''^SPIM Version .+ of .+ + +SPIM_HEADER = r"""^SPIM Version .+ of .+ Copyright .+\, James R\. Larus\. All Rights Reserved\. See the file README for a full copyright notice\. -(?:Loaded: .+\n)*''' -def compare_outputs(compiler_path: str, cool_file_path: str, input_file_path: str, output_file_path: str, timeout=100): +(?:Loaded: .+\n)*""" + + +def compare_outputs( + compiler_path: str, + cool_file_path: str, + input_file_path: str, + output_file_path: str, + timeout=100, +): try: - sp = subprocess.run(['bash', compiler_path, cool_file_path], capture_output=True, timeout=timeout) + sp = subprocess.run( + ["bash", compiler_path, cool_file_path], + capture_output=True, + timeout=timeout, + ) assert sp.returncode == 0, TEST_MUST_COMPILE % get_file_name(cool_file_path) except subprocess.TimeoutExpired: assert False, COMPILER_TIMEOUT - spim_file = cool_file_path[:-2] + 'mips' + spim_file = cool_file_path[:-2] + "mips" try: - fd = open(input_file_path, 'rb') - sp = subprocess.run(['spim', '-file', spim_file], input=fd.read(), capture_output=True, timeout=timeout) + fd = open(input_file_path, "rb") + sp = subprocess.run( + ["spim", "-file", spim_file], + input=fd.read(), + capture_output=True, + timeout=timeout, + ) fd.close() mo = re.match(SPIM_HEADER, sp.stdout.decode()) if mo: - output = mo.string[mo.end():] + output = mo.string[mo.end() :] except subprocess.TimeoutExpired: assert False, SPIM_TIMEOUT - fd = open(output_file_path, 'r') + fd = open(output_file_path, "r") eoutput = fd.read() fd.close() - assert output == eoutput, UNEXPECTED_OUTPUT % (spim_file, repr(output), repr(eoutput)) + assert output == eoutput, 
UNEXPECTED_OUTPUT % ( + spim_file, + repr(output), + repr(eoutput), + )