Added info on symbolic tokens in design docs #2657

Merged
24 commits
0ae01ab
Added info on symbolic tokens in design docs
aswin2108 Mar 5, 2023
6b6aac3
Fixed typos!
aswin2108 Mar 5, 2023
bed7bd3
Merge branch 'carbon-language:trunk' into Add-symbolic-tokens-in-desi…
aswin2108 Mar 13, 2023
73f8d50
Fixed pre-commit issues
aswin2108 Mar 13, 2023
a05a815
Merge branch 'carbon-language:trunk' into Add-symbolic-tokens-in-desi…
aswin2108 Mar 14, 2023
7d23548
Merge branch 'carbon-language:trunk' into Add-symbolic-tokens-in-desi…
aswin2108 Mar 20, 2023
b5e73bf
Merge remote-tracking branch 'upstream/trunk' into Add-symbolic-token…
aswin2108 Apr 1, 2023
8206dab
Added reviewed changes and revamped whitespace section
aswin2108 Apr 1, 2023
3fd7770
Merge branch 'carbon-language:trunk' into Add-symbolic-tokens-in-desi…
aswin2108 Apr 18, 2023
64508f7
Resolved some reviews
aswin2108 Apr 18, 2023
581950e
Merge branch 'carbon-language:trunk' into Add-symbolic-tokens-in-desi…
aswin2108 Apr 29, 2023
f726bd1
Added missing tokens to the table
aswin2108 Apr 29, 2023
e7f2997
Merge branch 'carbon-language:trunk' into Add-symbolic-tokens-in-desi…
aswin2108 May 12, 2023
e0c7c70
Fixed the table
aswin2108 May 12, 2023
39b88f4
Merge branch 'carbon-language:trunk' into Add-symbolic-tokens-in-desi…
aswin2108 May 13, 2023
9273b02
Added missing seperators
aswin2108 May 13, 2023
9326978
Merge branch 'carbon-language:trunk' into Add-symbolic-tokens-in-desi…
aswin2108 May 24, 2023
5eba8b9
Single table row lists both delimiters
aswin2108 May 24, 2023
28b7054
Merge branch 'carbon-language:trunk' into Add-symbolic-tokens-in-desi…
aswin2108 Jun 1, 2023
4d05316
Added TODO message
aswin2108 Jun 1, 2023
bb3f2a9
Fixed punctuation mistakes
aswin2108 Jun 1, 2023
f56561d
Edited the details section
aswin2108 Jun 1, 2023
718b873
Merge branch 'carbon-language:trunk' into Add-symbolic-tokens-in-desi…
aswin2108 Jun 2, 2023
2068acd
Removed unwanted lines
aswin2108 Jun 2, 2023
4 changes: 3 additions & 1 deletion docs/design/lexical_conventions/README.md
@@ -25,10 +25,12 @@ A _lexical element_ is one of the following:
- a maximal sequence of [whitespace](whitespace.md) characters
- a [word](words.md)
- a literal:

    - a [numeric literal](numeric_literals.md)
    - a [string literal](string_literals.md)

- a [comment](comments.md)
- TODO: operators ...
- a [symbolic token](symbolic_tokens.md)

The sequence of lexical elements is formed by repeatedly removing the longest
initial sequence of characters that forms a valid lexical element.
101 changes: 101 additions & 0 deletions docs/design/lexical_conventions/symbolic_tokens.md
@@ -0,0 +1,101 @@
# Symbolic Tokens

<!--
Part of the Carbon Language project, under the Apache License v2.0 with LLVM
Exceptions. See /LICENSE for license information.
SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
-->

<!-- toc -->

## Table of contents

- [Overview](#overview)
- [Details](#details)
- [Symbolic token list](#symbolic-token-list)
- [Alternatives considered](#alternatives-considered)
- [References](#references)

<!-- tocstop -->

## Overview

A _symbolic token_ is one of a fixed set of
[tokens](https://en.wikipedia.org/wiki/Lexical_analysis#Token) that consist of
characters that are not valid in identifiers. That is, they are tokens
consisting of symbols, not letters or numbers. Operators are one use of symbolic
tokens, but they are also used in patterns (`:`), declarations (`->` to indicate
the return type, `,` to separate parameters), statements (`;`, `=`, and so on),
and other places (`,` to separate function call arguments).

Carbon has a fixed set of symbolic tokens, defined by the language
specification. Developers cannot define new symbolic tokens in their own code.

Symbolic tokens are lexed using a "max munch" rule: at each lexing step, the
lexer takes the longest symbolic token defined by the language specification
that starts at the current input position, if any.
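
As a rough illustration of the max munch rule, the following Python sketch
picks the longest symbolic token at a given position. It is not the Carbon
toolchain's lexer: the names `lex_symbolic_token` and `lex_symbols` are
hypothetical, and the token set is copied from the table in this document, so
any token not listed there is not recognized.

```python
from typing import List, Optional

# Assumption: the token set is exactly the table in this document.
SYMBOLIC_TOKENS = [
    "+", "-", "*", "/", "%", "=", "^", "&", "|", "<<", ">>", "==", "!=",
    ">", ">=", "<", "<=", "->", "=>", "[", "]", "(", ")", "{", "}",
    ",", ".", ":", ":!", ";",
]

# Longest spellings first, so the first prefix match is also the longest one.
_BY_LENGTH = sorted(SYMBOLIC_TOKENS, key=len, reverse=True)


def lex_symbolic_token(source: str, pos: int) -> Optional[str]:
    """Return the longest symbolic token starting at `pos`, or None."""
    for token in _BY_LENGTH:
        if source.startswith(token, pos):
            return token
    return None


def lex_symbols(source: str) -> List[str]:
    """Split a string that contains only symbolic tokens and whitespace."""
    tokens = []
    pos = 0
    while pos < len(source):
        if source[pos].isspace():
            pos += 1  # Whitespace separates tokens and is skipped here.
            continue
        token = lex_symbolic_token(source, pos)
        if token is None:
            raise ValueError(f"no symbolic token at offset {pos}")
        tokens.append(token)
        pos += len(token)
    return tokens
```

With this token set, `lex_symbols("===")` yields `==` followed by `=`, because
`==` is the longest listed token that starts at the first character.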

When a symbolic token is used as an operator, the surrounding whitespace must
follow certain rules:

- There can be no whitespace between a unary operator and its operand.
- The whitespace around a binary operator must be consistent: either there is
whitespace on both sides or on neither side.
- If there is whitespace on neither side of a binary operator, the token
before the operator must be an identifier, a literal, or any kind of closing
bracket (for example, `)`, `]`, or `}`), and the token after the operator
must be an identifier, a literal, or any kind of opening bracket (for
example, `(`, `[`, or `{`).

These rules enable us to use a token like `*` as a prefix, infix, and postfix
operator, without creating ambiguity.
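
The binary-operator rules above can be sketched as a small predicate. This is a
simplified Python illustration, not how the Carbon lexer is implemented: the
function name `binary_whitespace_ok` and the token-kind strings
(`"identifier"`, `"literal"`, `"open"`, `"close"`, `"other"`) are assumptions
made for the example.

```python
# Token kinds allowed next to a binary operator that has no surrounding
# whitespace. The kind names are hypothetical labels for this sketch.
ALLOWED_BEFORE = {"identifier", "literal", "close"}
ALLOWED_AFTER = {"identifier", "literal", "open"}


def binary_whitespace_ok(
    prev_kind: str, space_before: bool, space_after: bool, next_kind: str
) -> bool:
    """Check whether an operator token may be read as a binary operator."""
    if space_before != space_after:
        # Whitespace must be on both sides or on neither side.
        return False
    if space_before:
        # Whitespace on both sides: no restriction on the neighbours.
        return True
    # Whitespace on neither side: restrict the neighbouring token kinds.
    return prev_kind in ALLOWED_BEFORE and next_kind in ALLOWED_AFTER
```

Under these assumptions, `a*b` and `a * b` pass the check, while `a* b` does
not, because it has whitespace on only one side.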

## Details

### Symbolic token list

The following is the initial list of symbolic tokens recognized in a Carbon
source file:

| Symbolic Tokens | Explanation |
| --------------- | ------------------------------------------------------------------------------------------------------------ |
| `+` | Addition |
| `-` | Subtraction and negation |
| `*` | Indirection, multiplication, and forming pointer types |
| `/` | Division |
| `%` | Modulus |
| `=` | Assignment |
| `^` | Bitwise complement and bitwise XOR |
| `&` | Address-of and bitwise AND |
| `\|` | Bitwise OR |
| `<<` | Arithmetic and logical left shift |
| `>>` | Arithmetic and logical right shift |
| `==` | Equality |
| `!=` | Inequality |
| `>` | Greater than |
| `>=` | Greater than or equal to |
| `<` | Less than |
| `<=` | Less than or equal to |
| `->` | Return type and indirect member access |
| `=>` | Match syntax |
| `[` and `]` | Subscript and deduced parameter lists |
| `(` and `)` | Function call, function declaration, and tuple literals |
| `{` and `}` | Struct literals, blocks of control flow statements, and the bodies of definitions (classes, functions, etc.) |
| `,` | Separator for tuple elements, struct elements, parameters, and function call arguments |
| `.` | Member access |
| `:` | Name bindings |
| `:!` | Generic binding |
| `;` | Statement separator |

TODO: The assignment operators in
[#2511](https://github.com/carbon-language/carbon-lang/pull/2511) are still to
be added.

## Alternatives considered

- [Proposal: p0601](/proposals/p0601.md#alternatives-considered)

## References

- Proposal
[#601: Symbolic tokens](https://github.com/carbon-language/carbon-lang/pull/601)