Specify initial incomplete program schema
Showing 15 changed files with 589 additions and 0 deletions.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,5 @@ | ||
{ | ||
"label": "ethdebug/format/program", | ||
"position": 5, | ||
"link": null | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,84 @@ | ||
--- | ||
sidebar_position: 2 | ||
--- | ||
|
||
# Key concepts | ||
|
||
## Programs are associated with a contract's compiled bytecode | ||
|
||
This bytecode might either be the call bytecode, executed when a contract | ||
account with this bytecode receives a message on-chain, or the create bytecode, | ||
executed as part of deploying the contract associated with the bytecode. | ||
|
||
Reflecting this relationship, **ethdebug/format/program** records contain | ||
a reference to the concrete contract (i.e., not an `abstract contract` or | ||
`interface`), the environment in which the bytecode will be executed (call or | ||
create), and the compilation that yielded the contract and bytecode. | ||
|
||
## Programs contain instruction listings for debuggers to reference | ||
|
||
Programs contain a list of **ethdebug/format/program/instruction** objects, | ||
where each instruction corresponds to one machine instruction in the | ||
associated bytecode. | ||
|
||
These instructions are listed in order and correspond one-to-one with the | ||
encoded binary machine instructions in the bytecode. Each instruction | ||
specifies the byte offset at which it appears in the bytecode; this offset is | ||
equivalent to the program counter on non-EOF EVMs. | ||
|
||
By indexing these instructions by their offset, **ethdebug/format** | ||
programs allow debuggers to look up high-level information at any point | ||
during machine execution. | ||
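As a minimal sketch of this lookup, assume the instruction records have already been parsed into Python dicts that carry an `offset` key. The helper name `index_by_offset` and the sample data are hypothetical and not part of the format.

```python
from typing import Any

Instruction = dict[str, Any]


def index_by_offset(instructions: list[Instruction]) -> dict[int, Instruction]:
    """Build a lookup table from byte offset to instruction record."""
    return {instruction["offset"]: instruction for instruction in instructions}


# Illustrative data only: two instructions with empty contexts.
instructions = [
    {"offset": 0, "operation": {"mnemonic": "PUSH0"}, "context": {}},
    {"offset": 1, "operation": {"mnemonic": "SLOAD"}, "context": {}},
]

by_offset = index_by_offset(instructions)

# On a non-EOF EVM the program counter equals the byte offset, so a debugger
# paused at pc == 1 can fetch the matching record directly.
current = by_offset[1]
print(current["operation"]["mnemonic"])
```

A dict keyed by offset keeps each lookup constant-time, which matters when a debugger consults the program at every step of a long trace.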
|
||
## Instructions describe high-level context details | ||
|
||
Each instruction object in a program contains crucial information about the | ||
high-level language state at that point in the bytecode execution. | ||
Instructions represent these details using the | ||
**ethdebug/format/program/context** schema, and these details may include: | ||
|
||
- Source code ranges associated with the instruction (i.e., "source mappings") | ||
- Variables known to be in scope following the instruction and where to | ||
find those variables' values in the machine state | ||
- Control flow information such as an instruction being associated with the | ||
process of calling from one function to another | ||
|
||
This information serves as a compile-time guarantee about the high-level | ||
state of the world that exists following each instruction. | ||
|
||
## Contexts inform high-level language semantics during machine tracing | ||
|
||
The context information provided for each instruction serves as a bridge | ||
between low-level EVM execution and high-level language constructs. Debuggers | ||
can use these strong compile-time guarantees to piece together a useful and | ||
consistent model of the high-level language code behind the running machine | ||
binary. | ||
|
||
By following the state of machine execution, a debugger can use context | ||
information to stay apprised of the changing compile-time facts over the | ||
course of the trace. Each successively-encountered context serves as the | ||
source of an observed state transition in the debugger's high-level state | ||
model. This allows the debugger to maintain an ever-changing and coherent | ||
view of the high-level language runtime. | ||
|
||
In essence, the information provided by objects in this schema serves as a | ||
means of reducing over state transitions, yielding a dynamic and accurate | ||
representation of the program's high-level state. This enables debugging | ||
tools to: | ||
|
||
1. Map the current execution point back to the original source code | ||
2. Reconstruct the state of variables at any given point | ||
3. Provide meaningful stack traces that reference function names and source | ||
locations | ||
4. Offer insights into control flow, such as entering or exiting functions, | ||
or iterating through loops | ||
5. Present data structures (like arrays or mappings) in a way that reflects | ||
their high-level representation, rather than their low-level storage | ||
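The idea of reducing over state transitions can be sketched as a fold over the contexts a trace encounters. This is an illustration under stated assumptions, not the format's prescribed algorithm: the names `apply_context` and `replay` are hypothetical, and the sketch assumes each context lists every variable known after its instruction, so the variables view is replaced rather than merged.

```python
from functools import reduce
from typing import Any

Context = dict[str, Any]
Model = dict[str, Any]


def apply_context(model: Model, context: Context) -> Model:
    """Return a new model reflecting the facts in one instruction's context.

    Assumption: the context lists all variables in scope after the
    instruction, so the previous variables view is replaced wholesale.
    """
    variables = {v["identifier"]: v["pointer"] for v in context.get("variables", [])}
    return {**model, "variables": variables}


def replay(trace: list[int], contexts: dict[int, Context], initial: Model) -> Model:
    """Fold the contexts encountered along a trace of program counters."""
    return reduce(lambda model, pc: apply_context(model, contexts[pc]), trace, initial)


# Illustrative contexts keyed by instruction offset.
stored = {"identifier": "storedValue", "pointer": {"location": "storage", "slot": 0}}
contexts = {
    0: {"variables": [stored]},
    1: {"variables": [stored, {"identifier": "localValue", "pointer": {"location": "stack", "slot": 0}}]},
    2: {"variables": [stored, {"identifier": "localValue", "pointer": {"location": "stack", "slot": 1}}]},
}

model = replay([0, 1, 2], contexts, {"variables": {}})
print(model["variables"]["localValue"])
```

Keeping `apply_context` pure (it returns a new model instead of mutating the old one) makes it straightforward to retain earlier models for stepping backwards through a trace.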
|
||
By leveraging these contexts, debugging tools can translate between | ||
machine-level execution and the high-level code that developers write and | ||
understand. This continuous mapping lets developers debug their smart | ||
contracts using familiar concepts and structures, even as they delve into | ||
the intricacies of EVM operation. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,5 @@ | ||
{ | ||
"label": "Program contexts", | ||
"position": 6, | ||
"link": null | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,11 @@ | ||
--- | ||
sidebar_position: 4 | ||
--- | ||
|
||
import SchemaViewer from "@site/src/components/SchemaViewer"; | ||
|
||
# Code contexts | ||
|
||
<SchemaViewer | ||
schema={{ id: "schema:ethdebug/format/program/context/code" }} | ||
/> |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,11 @@ | ||
--- | ||
sidebar_position: 3 | ||
--- | ||
|
||
import SchemaViewer from "@site/src/components/SchemaViewer"; | ||
|
||
# Schema | ||
|
||
<SchemaViewer | ||
schema={{ id: "schema:ethdebug/format/program/context" }} | ||
/> |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,11 @@ | ||
--- | ||
sidebar_position: 5 | ||
--- | ||
|
||
import SchemaViewer from "@site/src/components/SchemaViewer"; | ||
|
||
# Variables contexts | ||
|
||
<SchemaViewer | ||
schema={{ id: "schema:ethdebug/format/program/context/variables" }} | ||
/> |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,11 @@ | ||
--- | ||
sidebar_position: 5 | ||
--- | ||
|
||
import SchemaViewer from "@site/src/components/SchemaViewer"; | ||
|
||
# Instruction schema | ||
|
||
<SchemaViewer | ||
schema={{ id: "schema:ethdebug/format/program/instruction" }} | ||
/> |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,73 @@ | ||
--- | ||
sidebar_position: 1 | ||
--- | ||
|
||
# Overview | ||
|
||
:::tip[Summary] | ||
|
||
**ethdebug/format/program** is a JSON schema for describing compile-time | ||
information about EVM bytecode, organized from the perspective of individual | ||
machine instructions. | ||
|
||
In **ethdebug/format**, a program record (or "program") represents one block of | ||
executable EVM machine code that a compiler generated for a specific contract. | ||
This could be either the contract's runtime call bytecode or the bytecode | ||
to create the contract. | ||
|
||
A program is structured as a sequence of instruction records ("instructions"), | ||
where each corresponds to a single EVM instruction in the machine code. Each | ||
instruction contains information about the high-level language context at that | ||
point in the bytecode. This allows debuggers to map low-level machine state | ||
back to high-level language concepts at any point during execution. | ||
|
||
Key information that programs contain for a particular instruction might | ||
include: | ||
- the source range or source ranges that are "associated" with the | ||
instruction | ||
- the collection of known high-level variables at that point in time, | ||
including their types and where to find the bytes with those variables' | ||
values | ||
- signals to indicate that the instruction is part of some control flow | ||
operation, such as calling one function from another. | ||
|
||
These program records provide debuggers with a powerful reference resource | ||
to be consulted while observing a running EVM. At each step of EVM machine | ||
execution, debuggers can find the matching **ethdebug/format** program | ||
instruction and use its information to maintain a coherent model of the | ||
high-level world, step-by-step. | ||
|
||
::: | ||
|
||
This format defines the primary **ethdebug/format/program** schema as well as | ||
various sub-schemas in the ethdebug/format/program/* namespace. | ||
|
||
JSON values adhering to this schema contain comprehensive information about a | ||
particular EVM bytecode object. This includes contract metadata (e.g., a | ||
reference to the source range where the contract is defined) and, | ||
importantly, an ordered list of **ethdebug/format/program/instruction** | ||
objects. | ||
|
||
Each instruction object contains essential details for translating low-level | ||
machine state at the time of the instruction back into high-level language | ||
concepts. This allows debuggers to provide a meaningful representation of | ||
program state at any point during execution. | ||
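The top-level shape of a program record can be sketched as a quick structural check. This is only an illustration that mirrors the `required` lists and the `environment` enum in the schema YAML included in this commit; it is not a substitute for validating against the JSON Schema itself, and the function name `check_program` is hypothetical.

```python
from typing import Any

# Mirrors the schema's top-level `required` list and `environment` enum.
REQUIRED_KEYS = ("contract", "environment", "instructions")
ENVIRONMENTS = ("call", "create")


def check_program(program: dict[str, Any]) -> list[str]:
    """Return a list of human-readable problems; empty means none found."""
    problems = [f"missing required key: {key}" for key in REQUIRED_KEYS if key not in program]

    if "environment" in program and program["environment"] not in ENVIRONMENTS:
        problems.append(f"unknown environment: {program['environment']!r}")

    contract = program.get("contract")
    if isinstance(contract, dict) and "definition" not in contract:
        problems.append("contract is missing required key: definition")

    if "instructions" in program and not isinstance(program["instructions"], list):
        problems.append("instructions must be an array")

    return problems


program = {
    "contract": {"name": "Incrementer", "definition": {"source": {"id": 0}}},
    "environment": "call",
    "instructions": [],
}
print(check_program(program))
```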
|
||
## Reading this schema | ||
|
||
The **ethdebug/format/program** schema is a root schema that composes other | ||
related schemas in the ethdebug/format/program/* namespace. | ||
|
||
These schemas (like all schemas in this format) are specified as | ||
[JSON Schema](https://json-schema.org), draft 2020-12. | ||
|
||
Please refer to one or more of the following resources in this section, or | ||
see the navigation bar for complete contents: | ||
|
||
- [Key concepts](/spec/program/concepts) | ||
|
||
- [Schema](/spec/program) (**ethdebug/format/program** schema listing) | ||
|
||
- [Instruction schema](/spec/program/instruction) | ||
(**ethdebug/format/program/instruction** schema listing) | ||
|
||
- [Context schema](/spec/program/context) | ||
(**ethdebug/format/program/context** schema listing) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,11 @@ | ||
--- | ||
sidebar_position: 4 | ||
--- | ||
|
||
import SchemaViewer from "@site/src/components/SchemaViewer"; | ||
|
||
# Schema | ||
|
||
<SchemaViewer | ||
schema={{ id: "schema:ethdebug/format/program" }} | ||
/> |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,145 @@ | ||
$schema: "https://json-schema.org/draft/2020-12/schema" | ||
$id: "schema:ethdebug/format/program" | ||
|
||
title: ethdebug/format/program | ||
description: | | ||
  Debugging information about a particular bytecode in a compilation. | ||
type: object | ||
|
||
properties: | ||
  compilation: | ||
    title: Compilation reference by ID | ||
    description: | | ||
      A reference to the compilation as an `{ "id": ... }` object. | ||
    $ref: "schema:ethdebug/format/materials/reference" | ||
|
||
  contract: | ||
    type: object | ||
    properties: | ||
      name: | ||
        type: string | ||
|
||
      definition: | ||
        $ref: "schema:ethdebug/format/materials/source-range" | ||
    required: | ||
      - definition | ||
|
||
  environment: | ||
    title: Bytecode execution environment | ||
    description: | | ||
      Whether this bytecode is for contract creation or runtime calls. | ||
    type: string | ||
    enum: | ||
      - call | ||
      - create | ||
|
||
  context: | ||
    description: | | ||
      The context known to exist prior to the execution of the first | ||
      instruction in the bytecode. | ||
    $ref: "schema:ethdebug/format/program/context" | ||
|
||
  instructions: | ||
    type: array | ||
    description: | | ||
      The full array of instructions for the bytecode. | ||
    items: | ||
      $ref: "schema:ethdebug/format/program/instruction" | ||
    additionalItems: false | ||
|
||
required: | ||
  - contract | ||
  - environment | ||
  - instructions | ||
|
||
examples: | ||
- # Incrementing a storage counter | ||
# | ||
# This example represents the call bytecode for the following pseudo-code: | ||
# ``` | ||
# contract Incrementer; | ||
# | ||
# storage { | ||
# [0] storedValue: uint256; | ||
# }; | ||
# | ||
# code { | ||
#   let localValue = storedValue; | ||
#   storedValue = localValue + 1; | ||
# }; | ||
# ``` | ||
  contract: | ||
    name: "Incrementer" | ||
    definition: | ||
      source: | ||
        id: 0 | ||
  environment: call | ||
  context: | ||
    variables: | ||
      - &stored-value | ||
        identifier: storedValue | ||
        type: | ||
          kind: uint | ||
          bits: 256 | ||
        pointer: | ||
          location: storage | ||
          slot: 0 | ||
  instructions: | ||
    - offset: 0 | ||
      operation: | ||
        mnemonic: PUSH0 | ||
      context: | ||
        variables: | ||
          - *stored-value | ||
    - offset: 1 | ||
      operation: | ||
        mnemonic: SLOAD | ||
      context: | ||
        variables: | ||
          - *stored-value | ||
          - &local-value | ||
            identifier: localValue | ||
            type: | ||
              kind: uint | ||
              bits: 256 | ||
            pointer: | ||
              location: stack | ||
              slot: 0 | ||
    - offset: 2 | ||
      operation: | ||
        mnemonic: PUSH1 | ||
        arguments: ["0x01"] | ||
      context: | ||
        variables: | ||
          - *stored-value | ||
          - <<: *local-value | ||
            pointer: | ||
              location: stack | ||
              slot: 1 | ||
|
||
    - offset: 4 | ||
      operation: | ||
        mnemonic: ADD | ||
      context: | ||
        variables: | ||
          - *stored-value | ||
          - *local-value | ||
    - offset: 5 | ||
      operation: | ||
        mnemonic: PUSH0 | ||
      context: | ||
        variables: | ||
          - *stored-value | ||
          - <<: *local-value | ||
            pointer: | ||
              location: stack | ||
              slot: 1 | ||
|
||
    - offset: 6 | ||
      operation: | ||
        mnemonic: SSTORE | ||
      context: | ||
        variables: | ||
          - *stored-value |
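The offsets in the example above jump from 2 to 4 because `PUSH1` occupies two bytes: one opcode byte plus one byte of immediate data. A minimal sketch of that arithmetic follows, assuming legacy (non-EOF) bytecode, where `PUSHn` is the only instruction family carrying immediate bytes. The helper names `instruction_size` and `offsets` are hypothetical.

```python
def instruction_size(mnemonic: str) -> int:
    """Size in bytes: one opcode byte, plus n immediate bytes for PUSHn."""
    if mnemonic.startswith("PUSH"):
        return 1 + int(mnemonic[len("PUSH"):])
    return 1


def offsets(mnemonics: list[str]) -> list[int]:
    """Compute the byte offset of each instruction in sequence."""
    result = []
    offset = 0
    for mnemonic in mnemonics:
        result.append(offset)
        offset += instruction_size(mnemonic)
    return result


# The instruction sequence from the Incrementer example.
example = ["PUSH0", "SLOAD", "PUSH1", "ADD", "PUSH0", "SSTORE"]
print(offsets(example))  # [0, 1, 2, 4, 5, 6]
```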