Add initial draft of language specification #3618
base: master
Cadence Benchstat comparison: This branch was compared with the base branch onflow:master at commit 4a2d406. (Collapsed results for better readability.)
Awesome work @jsproz! 👏
This is a really great start. 👌 I've left some first comments, will continue reviewing and giving feedback over the following days.
- **Allowed Characters**:
  - Letters (a-z, A-Z)
  - Digits (0-9)
  - Special symbols used in operators and syntax, such as `+`, `-`, `*`, `/`, `=`, `{`, `}`, `(`, `)`, `[`, `]`, etc.
We might want to be exhaustive here
I will reconcile this with what I find in the compiler, and include all symbols and characters used in the language.

- **Unicode Support**:
  - While the core syntax is based on ASCII characters, Unicode is fully supported in **string literals** and **comments**.
  - Identifiers may include Unicode characters in the ranges of letters and numbers but should generally follow best practices for naming and clarity.
In general we can probably omit recommendations / best practices in this document
I agree. There are things included in this first draft that I now believe belong somewhere else.

### 2. Identifiers

Identifiers in Cadence are used to name **variables**, **constants**, **functions**, **contracts**, **types**, and other user-defined entities. Identifiers must follow specific rules to ensure consistency and avoid conflicts with reserved keywords.
Contracts are also types, maybe remove. Just like above, we want to be exhaustive/specific, so also remove the remainder of the sentence.
Identifiers in Cadence are used to name **variables**, **constants**, **functions**, **contracts**, **types**, and other user-defined entities. Identifiers must follow specific rules to ensure consistency and avoid conflicts with reserved keywords.
Identifiers in Cadence are used to name **variables**, **constants**, **functions**, and **types**. Identifiers must follow specific rules to ensure consistency and avoid conflicts with reserved keywords.
Updated.
@jsproz Can you please push the update? So far the PR only has one commit. Thanks!

Identifiers in Cadence are used to name **variables**, **constants**, **functions**, **contracts**, **types**, and other user-defined entities. Identifiers must follow specific rules to ensure consistency and avoid conflicts with reserved keywords.

- **Rules for Naming Identifiers**:
It might make sense to give the EBNF notation here
I also had the same thought
I will update the grammar notation now that I have a grammar that will parse most files. I plan to use a modified EBNF-style notation. There will need to be a new section at the top to explain it.
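Until the grammar notation lands, the identifier rule under discussion can be illustrated directly in Cadence. The sketch below assumes the rule as described in the Cadence language reference (a name starts with an ASCII letter or an underscore and continues with letters, digits, or underscores). The variable names are made up for illustration, and the forthcoming grammar remains the authoritative definition.

```cadence
access(all) fun main(): Int {
    // Valid: starts with a letter, continues with letters and digits.
    let total2 = 2

    // Valid: a leading underscore is permitted.
    let _offset = 40

    // Invalid (left as a comment so the script still type-checks):
    // an identifier may not start with a digit.
    // let 2total = 0

    return total2 + _offset
}
```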
- **Special Identifiers**:
  - Identifiers starting with an underscore (`_`) are typically used for private or internal entities. Although allowed, their use should follow common programming conventions.
This is mostly code style and not really part of the language specification
- **Special Identifiers**:
  - Identifiers starting with an underscore (`_`) are typically used for private or internal entities. Although allowed, their use should follow common programming conventions.
Removed
- **access**: Specifies the access control for the struct and its members (e.g., pub, access(contract), access(self)).
- **StructName**: The name of the struct.
- **Fields**: Declared with the let or var keywords. Constant fields (let) are immutable after initialization, while variable fields (var) can be modified after the struct is created.
- **Initializer**: The init function is responsible for initializing all fields in the struct. It ensures that each field is assigned a value before the struct is used.
- **Functions**: Structs can contain functions, which operate on their fields and are defined within the struct body.

- **Example**:
```cadence
pub struct User {
    pub let id: Int
    pub var name: String

    init(id: Int, name: String) {
        self.id = id
        self.name = name
    }

    pub fun updateName(newName: String) {
        self.name = newName
    }
}
```
Note that in Cadence 1.0 the access specifiers `pub`, `pub(set)` and `priv` got removed. See "pub and priv Access Modifiers Got Removed" in https://github.com/onflow/cadence/releases/tag/v1.0.0
I plan to remove many of the examples. They may be better fitted to another document. I will definitely remove the ones that use `pub` and `priv`.
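For reference, if the `User` example is kept anywhere, it could be expressed with Cadence 1.0 access modifiers roughly as follows. This is only a sketch: it assumes `access(all)` as the replacement for `pub`, as described in the linked release notes, and leaves the rest of the quoted example unchanged.

```cadence
access(all) struct User {
    // `access(all)` takes the place of the removed `pub` modifier.
    access(all) let id: Int
    access(all) var name: String

    init(id: Int, name: String) {
        self.id = id
        self.name = name
    }

    // Mutation of `name` stays inside the struct's own function.
    access(all) fun updateName(newName: String) {
        self.name = newName
    }
}
```

Whether every member should be `access(all)` is a design question for the example itself; narrower modifiers such as `access(self)` or `access(contract)` may be a better fit for some fields.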
- **Syntax**: | ||
- The basic syntax for a resource declaration is as follows: | ||
```cadence | ||
access(resource) resource ResourceName { |
`access(resource)` is not a valid access modifier. See the implementation for a valid list: https://github.com/onflow/cadence/blob/master/ast/access.go
let b: Int = 20 | ||
``` | ||
|
||
- **Unique Identifiers**: |
Maybe describe this as "shadowing"
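A small script may help pin down what "shadowing" would mean in this section. The sketch below assumes the behavior described in the Cadence language reference: a name cannot be redeclared in the same scope, but a nested block may declare the same name again, hiding the outer declaration until the block ends. The names are illustrative only.

```cadence
access(all) fun main(): Int {
    let x = 1

    // Redeclaring `x` in the same scope is rejected by the checker,
    // so the line is left as a comment:
    // let x = 2

    if x == 1 {
        // A nested block opens a new scope, so this declaration
        // shadows the outer `x` only inside the block.
        let x = 2
        log(x)
    }

    // The outer `x` is visible again and unchanged.
    return x
}
```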
var y = 10 // This will result in an error | ||
``` | ||
|
||
- **Redeclaration in Sub-scopes**: |
Maybe reference the scoping rules, and/or move the shadowing rules into the scoping rules, given they are very closely related.

- **Rules for Storing Resources and Data**:
  - Resources and data must be stored in well-defined paths under one of the storage categories (`storage`, `private`, or `public`). Resources are moved between accounts using explicit transactions and must be stored in the correct paths to ensure security and proper access control.
  - Only the account owner can directly manipulate their `storage` paths. Public and private paths must use **capabilities** to grant or restrict access to external
Maybe mention here that Cadence uses lexical scoping (as opposed to e.g. dynamic scoping)

This lexical structure ensures a clear and consistent foundation for writing Cadence programs, supporting both readability and maintainability across a variety of use cases within the Flow blockchain.

## III. Syntax and Grammar
If we structure the specification in terms of layers, like all features in terms of syntax, then all features in terms of semantics, etc., it might make sense to explain the purpose of a feature in one section.
For example, when describing the syntax for a struct declaration, we could only focus on the syntax, and defer the explanation and purpose for the feature to another section (semantics?), e.g. "the syntax ... introduces a struct declaration", where "struct declaration" links to the section explaining the feature and semantics for structs.

---

### 4. Literals
Similarly, maybe also provide the EBNF grammar for the production `literal` and for each separate literal kind. That could be later also referred to by other productions like `expression`, etc.
```ebnf
structDeclaration
    : access 'struct' identifier conformances? '{' membersAndNestedDeclarations '}'
    ;

membersAndNestedDeclarations
    : (fieldDeclaration | functionDeclaration | structDeclaration)*
    ;
```
- **access**: The access control for the struct (e.g., pub for public, priv for private).
- **identifier**: The name of the struct.
- **conformances**: Optional interfaces the struct conforms to.
- **membersAndNestedDeclarations**: Defines the fields, functions, and nested structs within the struct.
More of a general proposal/suggestion on the grammar formatting: I think we could make the non-terminals in a production rule link to their definitions, e.g. link `access` in the above production rule to the section where the access control grammar is defined. Something like this.
Unfortunately, this is not easy to get working; it would most probably need some post-processor to generate links, do formatting, etc.
But in the meantime, maybe extract those to separate sections (e.g. `access`, `conformance`, etc.), and remove them from here (can link later), as those can get repetitive.
Next step: refactor this to use the machine readable spec with annotations from here.
work towards: #3599