add initial specs documents #243

Merged: 1 commit, Jun 20, 2024
65 changes: 65 additions & 0 deletions specs/Identifier.md
@@ -0,0 +1,65 @@
### ECMA Specification for Vein Language Identifier Expression

Codacy notice on line 1 in specs/Identifier.md: First line in a file should be a top-level heading

Review comment (Contributor): Please add a top-level heading to conform to Markdown best practices.

Suggested change:
- ### ECMA Specification for Vein Language Identifier Expression
+ # ECMA Specification for Vein Language Identifier Expression

#### 1. Introduction

This section defines the behavior of the `IdentifierExpression` in the Vein Language. The `IdentifierExpression` is a fundamental lexical element used to declare and reference identifiers within the language syntax.

#### 2. Identifier Expression Definition

An `IdentifierExpression` in the Vein Language represents a valid identifier used to name variables, functions, types, and other language constructs. The rules for forming an `IdentifierExpression` are given in the sections that follow.

#### 3. Raw Identifier

The basic building block of an `IdentifierExpression` is the `RawIdentifier`. The `RawIdentifier` must adhere to the following criteria:

- **Starting characters**: The identifier must begin with a letter (`A-Z`, `a-z`), an underscore (`_`), or an at symbol (`@`). This allows for the inclusion of conventional identifiers, special identifiers, and identifiers within specific namespaces or contexts.
- **Subsequent characters**: After the initial character, the identifier may consist of any combination of letters, digits (`0-9`), or underscores (`_`).
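
The two criteria above can be sketched as a single pattern. This is only an illustration in Python, not the Vein parser itself; the names `RAW_IDENTIFIER` and `is_raw_identifier` are hypothetical.

```python
import re

# Hypothetical pattern mirroring the two rules above: the first character is a
# letter, an underscore, or an at symbol; the rest are letters, digits, or
# underscores.
RAW_IDENTIFIER = re.compile(r"[A-Za-z_@][A-Za-z0-9_]*")


def is_raw_identifier(text: str) -> bool:
    """Return True when the whole string is a valid RawIdentifier."""
    return RAW_IDENTIFIER.fullmatch(text) is not None
```

Under these rules `_myVariable123` and `@value` are accepted, while `9lives` and `a-b` are rejected.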

#### 4. Tokenization and Whitespace Handling

An `IdentifierExpression` is derived from the `RawIdentifier` and includes tokenization to handle leading and trailing whitespace. This ensures that the identifier is correctly parsed in the context of surrounding whitespace.

- **Token**: The `RawIdentifier` is tokenized to ignore any whitespace characters that may surround it.
- **Naming**: The resulting tokenized identifier is named as "Identifier" for referencing within syntactical structures.
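
A minimal sketch of the tokenization step, assuming that "tokenized" means skipping whitespace on both sides of the raw identifier. The helper `identifier_token` and its return shape are assumptions made for illustration.

```python
import re

# Raw identifier wrapped in optional leading and trailing whitespace.
TOKEN = re.compile(r"\s*([A-Za-z_@][A-Za-z0-9_]*)\s*")


def identifier_token(source: str, start: int = 0):
    """Return (identifier, next_position), or None when no identifier starts here.

    The surrounding whitespace is consumed but is not part of the identifier.
    """
    match = TOKEN.match(source, start)
    if match is None:
        return None
    return match.group(1), match.end()
```

For the input `"  foo  = 1"` the sketch yields the identifier `foo` and leaves the position at the `=` sign.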

Codacy notice on line 23 in specs/Identifier.md: Expected: 80; Actual: 120

#### 5. Validation and Error Marking

An additional validation step is implemented to ensure that certain reserved system types cannot be used as identifiers. This enhances the robustness and predictability of the language by preventing naming conflicts with key system types.

- **Error Marking**: If an identifier matches any system type listed in the `VeinKeywords`, it will be marked as an error with the message "cannot use system type as identifier."
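
The validation step can be sketched as a set lookup. The contents of `VeinKeywords` are not listed in this spec, so the set below contains only `i32`, the one system type confirmed by the example in section 9; `SYSTEM_TYPES` and `validate_identifier` are hypothetical names.

```python
# Assumption: only "i32" is confirmed by this spec as a reserved system type.
# The real VeinKeywords list is not shown here.
SYSTEM_TYPES = frozenset({"i32"})


def validate_identifier(name: str) -> str:
    """Return the name unchanged, or raise if it collides with a system type."""
    if name in SYSTEM_TYPES:
        raise ValueError("cannot use system type as identifier")
    return name
```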

#### 6. Positional Tracking

The `IdentifierExpression` tracks its position within the source code to provide detailed diagnostic information during parsing and compilation.

- **Positioned**: The identifier token retains positional information to assist in error reporting and debugging processes.
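
One common way to turn a character offset into a position for diagnostics is sketched below. The spec does not say how Vein stores positions, so the 1-based line and column pair and the name `position_of` are assumptions.

```python
def position_of(source: str, offset: int) -> tuple[int, int]:
    """Convert a 0-based character offset into a 1-based (line, column) pair."""
    line = source.count("\n", 0, offset) + 1
    # rfind returns -1 on the first line, which makes the column 1-based there too.
    last_newline = source.rfind("\n", 0, offset)
    column = offset - last_newline
    return line, column
```

For the source `"let a = 1;\nlet i32 = 15;"`, the offset of `i32` maps to line 2, column 5, which is the location an error message would point at.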

#### 7. Sample Grammar

The following grammar describes the structure of an `IdentifierExpression` using the rules defined above:

Codacy notice on line 39 in specs/Identifier.md: Expected: 80; Actual: 105

Codacy notice on line 41 in specs/Identifier.md: Fenced code blocks should have a language specified (Markdownlint MD040, fenced-code-language)

Review comment (Contributor): Specify the language for the code block to enhance syntax highlighting, changing the opening fence below to a `vein` fence.

```
IdentifierExpression ::= Token(RawIdentifier) : {
    if VeinKeywords.contains(identifier) {
        raise "cannot use system type as identifier"
    }
}
```

#### 8. Exception Handling

If an invalid identifier is detected (i.e., using a reserved system type), a parsing exception will be thrown with an appropriate error message, indicating the nature of the violation and its position in the source code.

#### 9. Example

```vein
// Valid identifier
let _myVariable123 = 10;

// Invalid identifier (system type)
let i32 = 15; // Error: cannot use system type as identifier
```

#### 10. Conclusion

This specification defines the rules and behavior for parsing and validating `IdentifierExpression` in the Vein Language. By adhering to these guidelines, Vein ensures consistent and predictable identifier handling, thereby aiding developers in writing clear and error-free code.
315 changes: 315 additions & 0 deletions specs/expressions.md
@@ -0,0 +1,315 @@
### ECMA Specification for Vein Language Expressions and Blocks

#### 1. Introduction

This section defines the behaviors and rules for parsing various expressions and blocks in the Vein Language, including block statements, qualified expressions, assignment expressions, literal expressions, and more.

#### 2. Block Statements

Blocks are fundamental structural elements that contain a sequence of statements enclosed within braces (`{}`).

##### Block Statement

The syntax for a block statement is:
```
Block ::= '{' Statements '}'
Statements ::= Statement*
```

Review comment (Contributor) on lines +14 to +17: Please add a language specifier to the fenced code block for proper syntax highlighting and readability, changing the opening fence to an `ebnf` fence.

Markdownlint, line 14: MD031 (blanks-around-fences), fenced code blocks should be surrounded by blank lines; MD040 (fenced-code-language), fenced code blocks should have a language specified.

##### Parser Implementation

The parser rule for `Block` is structured as follows:

```csharp
protected internal virtual Parser<BlockSyntax> Block =>
    from comments in CommentParser.AnyComment.Token().Many()
    from openBrace in Parse.Char('{').Token().Commented(this)
    from statements in Statement.Many()
    from closeBrace in Parse.Char('}').Commented(this)
    select new BlockSyntax
    {
        LeadingComments = comments.ToList(),
        Statements = statements.ToList(),
        InnerComments = closeBrace.LeadingComments.ToList(),
        TrailingComments = closeBrace.TrailingComments.ToList(),
    }
    .SetStart(openBrace.Transform.pos)
    .SetEnd(closeBrace.Transform.pos)
    .As<BlockSyntax>();
```

##### Shortform Block Statement

The syntax for a shortform block is:
```
BlockShortform ::= '|>' Expression ';'
```

##### Parser Implementation

The parser rule for `BlockShortform` is structured as follows:

Codacy notice on line 53 in specs/expressions.md: Expected: 80; Actual: 95

```csharp
protected internal virtual Parser<BlockSyntax> BlockShortform<T>() where T : StatementSyntax =>
    from comments in CommentParser.AnyComment.Token().Many()
    from op in Parse.String("|>").Token()
    from exp in QualifiedExpression.Token().Positioned()
    from end in Parse.Char(';').Token()
    select new BlockSyntax
    {
        LeadingComments = comments.ToList(),
        Statements = new List<StatementSyntax>()
        {
            typeof(T) == typeof(ReturnStatementSyntax) ?
                new ReturnStatementSyntax(exp).SetPos<ReturnStatementSyntax>(exp.Transform) :
                new SingleStatementSyntax(exp)
        }
    }
    .SetStart(exp.Transform.pos)
    .SetEnd(exp.Transform.pos)
    .As<BlockSyntax>();
```
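
The parser rule above wraps the single expression in a one-statement block, choosing a return statement or a plain statement depending on the type parameter. A rough Python sketch of that desugaring follows; the class names and the `as_return` flag are hypothetical stand-ins for the C# syntax types and the `T` parameter, and the expression is kept as raw text rather than parsed.

```python
from dataclasses import dataclass


@dataclass
class ReturnStatement:
    expression: str


@dataclass
class SingleStatement:
    expression: str


@dataclass
class Block:
    statements: list


def shortform_block(source: str, as_return: bool) -> Block:
    """Desugar '|> expr;' into a block that holds exactly one statement."""
    text = source.strip()
    if not (text.startswith("|>") and text.endswith(";")):
        raise ValueError("expected '|>' expression ';'")
    expression = text[2:-1].strip()  # drop the '|>' prefix and the ';' suffix
    if as_return:
        return Block([ReturnStatement(expression)])
    return Block([SingleStatement(expression)])
```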

#### 3. Qualified Expressions

Qualified expressions include various types such as assignment, conditional, and lambda expressions.

##### Qualified Expression

The syntax for a qualified expression is:
```
QualifiedExpression ::= AssignmentExpression
| NonAssignmentExpression
```

##### Parser Implementation

The parser rule for `QualifiedExpression` is structured as follows:

```csharp
protected internal virtual Parser<ExpressionSyntax> _shadow_QualifiedExpression =>
    assignment.Or(non_assignment_expression);

protected internal virtual Parser<ExpressionSyntax> QualifiedExpression =>
    Parse.Ref(() => _shadow_QualifiedExpression);
```

##### Assignment Expression

The syntax for an assignment expression is:
```
AssignmentExpression ::= UnaryExpression AssignmentOperator QualifiedExpression
| UnaryExpression '??=' FailableExpression
```

##### Parser Implementation

The parser rule for `assignment` is structured as follows:

```csharp
protected internal virtual Parser<ExpressionSyntax> assignment =>
    (
        from exp in unary_expression
        from op in assignment_operator
        from exp2 in QualifiedExpression
        select new BinaryExpressionSyntax(exp, exp2, op)
    )
    .Or(
        from exp in unary_expression
        from op in Parse.String("??=").Text().Token()
        from exp2 in failable_expression
        select new BinaryExpressionSyntax(exp, exp2, op)
    );
```
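
Because the right-hand side of the first alternative is itself a `QualifiedExpression`, chained assignments nest to the right. The Python sketch below illustrates that recursion under simplifying assumptions: operands are limited to identifiers and integer literals, the `??=` alternative is left out, and trees are plain tuples. `parse_assignment` is a hypothetical name.

```python
import re

OPERATOR = re.compile(r"\s*(<<=|\^=|\|=|&=|%=|/=|\*=|-=|\+=|=)\s*")
OPERAND = re.compile(r"\s*([A-Za-z_@][A-Za-z0-9_]*|\d+)")


def parse_assignment(source: str, pos: int = 0):
    """Return (tree, next_pos); tree is an operand or a (left, op, right) tuple."""
    operand = OPERAND.match(source, pos)
    if operand is None:
        raise ValueError(f"expected operand at offset {pos}")
    left, pos = operand.group(1), operand.end()
    operator = OPERATOR.match(source, pos)
    if operator is None:
        return left, pos  # no operator follows: a plain operand
    # The right-hand side is parsed by the same rule, so nesting goes right.
    right, pos = parse_assignment(source, operator.end())
    return (left, operator.group(1), right), pos
```

With this sketch, `a = b += 1` produces `("a", "=", ("b", "+=", "1"))`, that is, the inner assignment is evaluated as the value of the outer one.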

##### Assignment Operators

The syntax for assignment operators is:
```
AssignmentOperator ::= '<<=' | '^=' | '|=' | '&=' | '%=' | '/=' | '*=' | '-=' | '+=' | '='
```

##### Parser Implementation

The parser rule for `assignment_operator` is structured as follows:

```csharp
protected internal virtual Parser<string> assignment_operator =>
    Parse.String("<<=")
        .Or(Parse.String("^="))
        .Or(Parse.String("|="))
        .Or(Parse.String("&="))
        .Or(Parse.String("%="))
        .Or(Parse.String("/="))
        .Or(Parse.String("*="))
        .Or(Parse.String("-="))
        .Or(Parse.String("+="))
        .Or(Parse.String("=")).Token().Text();
```

##### Warning
About this whole spec, in general:
I understand perfectly well that no one reads it,
but if you are a reader who has read up to this point, then I am very surprised.
I will say right away: this is a nominal spec, it is unlikely that I will follow it, plus it is written using chatgps, chatjpt, chathpt ☺️

Codacy notice on line 154 in specs/expressions.md: Expected: 0 or 2; Actual: 9

Codacy notice on line 154 in specs/expressions.md: Expected: 80; Actual: 148
###

Codacy notice on line 155 in specs/expressions.md: Expected: atx; Actual: atx_closed

##### Non-Assignment Expression

A non-assignment expression covers a wide variety of expression types, but at the top level it is either a conditional expression or a lambda expression.

##### Parser Implementation

The parser rule for `non_assignment_expression` is structured as follows:

```csharp
protected internal virtual Parser<ExpressionSyntax> non_assignment_expression =>
    conditional_expression.Or(lambda_expression);
```

##### Lambda Expression

The syntax for a lambda expression is:
```
LambdaExpression ::= LFunctionSignature '|>' LFunctionBody
```

##### Parser Implementation

The parser rule for `lambda_expression` is structured as follows:

```csharp
protected internal virtual Parser<ExpressionSyntax> lambda_expression =>
    from dec in lfunction_signature
    from op in Parse.String("|>").Token()
    from body in lfunction_body
    select new AnonymousFunctionExpressionSyntax(dec, body);
```

##### LFunction Signature and Body

The parts defining a lambda function include signatures and bodies:

##### Parser Implementations

Codacy notice on line 193 in specs/expressions.md: Expected: h4; Actual: h5

The parser rules for LFunction signature and body are structured as follows:

```csharp
protected internal virtual Parser<ExpressionSyntax> lfunction_signature =>
    WrappedExpression('(', ')')
        .Or(WrappedExpression('(', ')', explicit_anonymous_function_parameter_list).Select(x => new AnonFunctionSignatureExpression(x)))
        .Or(WrappedExpression('(', ')', implicit_anonymous_function_parameter_list).Select(x => new AnonFunctionSignatureExpression(x)))
        .Or(IdentifierExpression);

protected internal virtual Parser<ExpressionSyntax> lfunction_body =>
    failable_expression.Or(block.Select(x => x.GetOrElse(new BlockSyntax())));
```

##### Binary Expression

The syntax for a binary expression is:
```
BinaryExpression ::= Operand Operator Operand
Operator ::= '+' | '-' | '*' | '/' | '%' | '||' | '&&' | ...

BinaryOperator ::= AdditiveOperator
                 | MultiplicativeOperator
                 | ConditionalOperator
                 ...
```

##### Parser Implementation

The parser rules for `BinaryExpression` are structured to use recursive patterns:

```csharp
private Parser<ExpressionSyntax> BinaryExpression<T>(Parser<T> t, string op) where T : ExpressionSyntax, IPositionAware<ExpressionSyntax> =>
    from c in t.Token()
    from data in
        (from _op in Parse.String(op).Text().Token()
         from a in t.Token()
         select (_op, a)).Many()
    select FlatIfEmptyOrNull(c, data.EmptyIfNull().ToArray());

private Parser<ExpressionSyntax> BinaryExpression<T>(Parser<T> t, params string[] ops) where T : ExpressionSyntax, IPositionAware<ExpressionSyntax> =>
    from c in t.Token()
    from data in
        (from _op in Parse.Regex(ops.Select(x => $"\\{x}").Join("|"), $"operators '{ops.Join(',')}'")
         from a in t.Token()
         select (_op, a)).Many()
    select FlatIfEmptyOrNull(c, data.EmptyIfNull().ToArray());
```
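
Both overloads parse one operand followed by zero or more `(operator, operand)` pairs and hand the result to `FlatIfEmptyOrNull`, whose body is not shown in this spec. The sketch below assumes it returns the lone operand when there are no pairs and otherwise folds the pairs into a left-associative tree; `fold_left` is a hypothetical name and the tuples stand in for syntax nodes.

```python
def fold_left(first, rest):
    """Fold [(op, operand), ...] onto the first operand, left to right.

    An empty list returns the first operand unchanged, so a lone operand
    does not get wrapped in a binary node.
    """
    tree = first
    for op, operand in rest:
        tree = (tree, op, operand)
    return tree
```

Under this assumption `a - b - c` groups as `(("a", "-", "b"), "-", "c")`, which matches the usual left-to-right reading of subtraction.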

##### Range Expression

The syntax for a range expression is:
```
RangeExpression ::= Operand '..' Operand
```

##### Parser Implementation

The parser rule for `range_expression` is structured as follows:

```csharp
protected internal virtual Parser<ExpressionSyntax> range_expression =>
    (
        from s1 in unary_expression
        from op in Parse.String("..").Token()
        from s2 in unary_expression
        select new RangeExpressionSyntax(s1, s2)
    ).Or(unary_expression);
```

##### Conditional Expression

A conditional expression is a null-coalescing expression optionally followed by a `?` clause and a `:` clause, each holding a failable expression.

##### Parser Implementation

The parser rule for `conditional_expression` is structured as follows:

```csharp
protected internal virtual Parser<ExpressionSyntax> conditional_expression =>
    from operand in null_coalescing_expression
    from d in Parse.Char('?')
        .Token()
        .Then(_ => failable_expression
            .Then(x => Parse
                .Char(':')
                .Token()
                .Then(_ => failable_expression
                    .Select(z => (x, z)))))
        .Token()
        .Optional()
    select FlatIfEmptyOrNull(operand, new CoalescingExpressionSyntax(d));
```
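
The shape of this rule, an operand with an optional `? a : b` tail, can be sketched as follows. For brevity the sketch restricts the operand and both branches to single words, which is a simplification of `null_coalescing_expression` and `failable_expression`; `parse_conditional` and the tuple layout are hypothetical.

```python
import re

ATOM = re.compile(r"\s*(\w+)\s*")
TERNARY_TAIL = re.compile(r"\?\s*(\w+)\s*:\s*(\w+)\s*")


def parse_conditional(source: str, pos: int = 0):
    """Return (tree, next_pos) for 'operand' or 'operand ? a : b'."""
    atom = ATOM.match(source, pos)
    if atom is None:
        raise ValueError(f"expected operand at offset {pos}")
    operand, pos = atom.group(1), atom.end()
    tail = TERNARY_TAIL.match(source, pos)
    if tail is None:
        # The tail is optional: without a complete '? a : b' the operand
        # is returned on its own and nothing further is consumed.
        return operand, pos
    return ("?:", operand, tail.group(1), tail.group(2)), tail.end()
```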

##### Inclusion of Other Expressions

The parsing rules combine alternative parsers using `Or` and `XOr`.

##### Parser Implementation

```csharp
protected internal virtual Parser<LiteralExpressionSyntax> LiteralExpression =>
    from expr in
        FloatLiteralExpression.Log("FloatLiteralExpression").Or(
        IntLiteralExpression.Log("IntLiteralExpression")).XOr(
        StringLiteralExpression.Log("StringLiteralExpression")).XOr(
        BinaryLiteralExpression.Log("BinaryLiteralExpression")).XOr(
        BooleanLiteralExpression.Log("BooleanLiteralExpression")).XOr(
        NullLiteralExpression.Log("NullLiteralExpression"))
        .Positioned().Commented(this)
    select expr.Value
        .WithLeadingComments(expr.LeadingComments)
        .WithTrailingComments(expr.TrailingComments);
```
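
The float and integer alternatives are joined with `Or`, while the remaining ones use `XOr`. A plausible reading, assuming the usual parser-combinator semantics in which `Or` always retries the second parser from the start and `XOr` gives up once the first parser has consumed input, is that floats and integers share a leading run of digits and therefore need backtracking. The Python sketch below models that difference; none of these names come from the Vein code base or from the combinator library it uses.

```python
from typing import Callable, NamedTuple, Optional


class Result(NamedTuple):
    ok: bool
    value: Optional[str]
    pos: int  # position reached, even when the parse failed


Parser = Callable[[str, int], Result]


def digits(text: str, pos: int) -> Result:
    end = pos
    while end < len(text) and text[end].isdigit():
        end += 1
    return Result(end > pos, text[pos:end] if end > pos else None, end)


def float_literal(text: str, pos: int) -> Result:
    whole = digits(text, pos)
    if not whole.ok or not text.startswith(".", whole.pos):
        return Result(False, None, whole.pos)  # may fail after consuming digits
    fraction = digits(text, whole.pos + 1)
    if not fraction.ok:
        return Result(False, None, fraction.pos)
    return Result(True, text[pos:fraction.pos], fraction.pos)


def or_(first: Parser, second: Parser) -> Parser:
    """Backtracking choice: on failure, retry the second parser from pos."""
    def parse(text: str, pos: int) -> Result:
        result = first(text, pos)
        return result if result.ok else second(text, pos)
    return parse


def xor_(first: Parser, second: Parser) -> Parser:
    """Committed choice: try the second parser only if the first consumed nothing."""
    def parse(text: str, pos: int) -> Result:
        result = first(text, pos)
        if result.ok or result.pos > pos:
            return result
        return second(text, pos)
    return parse
```

In this model `or_(float_literal, digits)` accepts `42` by falling back to the integer parser, whereas `xor_(float_literal, digits)` rejects it because the float parser already consumed the digits before failing.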

##### Literal Expressions

The syntax covers literals of various types like int, float, boolean, null, etc.

#### Conclusion

This specification outlines the rules and behaviors for parsing blocks and various expressions in the Vein Language, providing a structured approach to handling both simple and complex expressions. The provided parser implementations combine `Or`, `XOr`, position tracking, and comment handling to identify the different constructs of the language, which helps keep the language features consistent and robust.