The uFork virtual machine is designed to support machine-level actors. All instructions execute within the context of an actor handling a message-event. There is no support for procedure/function call/return. Instead actors are used to implement procedure/function abstractions. There is no support for address arithmetic or load/store of arbitrary memory. Mutation is always local to an actor's private state. Immutable values are passed between actors via message-events. External events (such as "interrupts") are turned into message-events.
The quad-cell is the primary internal data-structure in uFork. It consists of four integers (current compile options are 16, 32, and 64 bits).
t | x | y | z |
---|---|---|---|
type/proc | head/car | tail/cdr | link/next |
The integers in each field carry a type tag in their 2 most-significant bits (MSBs). The 1st MSB is {0=indirect-reference, 1=direct-value}. The 2nd MSB is {0=transparent, 1=opaque}. The resulting type-heirarchy looks like this:
any-value
0 / \ 1
indirect direct (fixnum)
0 / \ 1
(quad) transparent opaque (ocap)
Direct values (fixnums) are stored in 2's-complement representation, where the 2nd MSB is the sign bit of the integer value.
Indirect values (references) desigate quad-cells (with fields {t, x, y, z}).
Opaque values (object-capabilities) cannot be dereferenced except by the virtual-processor (to implement actor primitive operations).
Transparent values designate quad that may be written as well as read.
Quad-cells are used to encode most of the important data-structures in uFork.
Structure | Description |
---|---|
{t:Event_T, x:target, y:msg, z:next} | actor event queue entry |
{t:IP, x:SP, y:EP, z:next} | continuation queue entry |
{t:Instr_T, x:opcode, y:data, z:next} | machine instruction (typical) |
{t:Symbol_T, x:hash, y:string, z:value} | immutable symbolic-name |
{t:Pair_T, x:head, y:tail} | pair-lists of user data (cons) |
{t:Pair_T, x:item, y:rest} | stack entry holding item |
{t:Actor_T, x:beh, y:sp, z:?} | idle actor |
{t:Actor_T, x:beh', y:sp', z:events} | busy actor, intially {z:()} |
{t:Fexpr_T, x:actor, y:?, z:?} | interpreter (cust ast env) |
{t:Fexpr_T, x:self, y:msgs, z:beh} | meta-actor transaction |
{t:Dict_T, x:key, y:value, z:next } | dictionary binding entry |
{t:Free_T, z:next} | cell in the free-list |
The uFork instruction execution engine implements a linked-stack machine,
however the stack is only used for local state in a computation.
The input for each instruction is taken from the stack
and the output is placed back onto the stack.
Instructions all have a t
field containing Instr_T
type marker.
The operation code is carried in the x
field of the instruction.
Many instructions also have an immediate value,
usually carried in the y
field of the instruction.
For the typical case of a instruction with a single continuation,
the "next instruction" is carried in the z
field of the instruction.
Input | Instruction | Output | Description |
---|---|---|---|
v | {x:VM_typeq, y:T, z:K} | bool | TRUE if v has type T, otherwise FALSE |
T | {x:VM_cell, y:1, z:K} | cell | create cell {t:T} |
T X | {x:VM_cell, y:2, z:K} | cell | create cell {t:T, x:X} |
T X Y | {x:VM_cell, y:3, z:K} | cell | create cell {t:T, x:X, y:Y} |
T X Y Z | {x:VM_cell, y:4, z:K} | cell | create cell {t:T, x:X, y:Y, z:Z} |
cell | {x:VM_get, y:T, z:K} | t | get t from cell |
cell | {x:VM_get, y:X, z:K} | x | get x from cell |
cell | {x:VM_get, y:Y, z:K} | y | get y from cell |
cell | {x:VM_get, y:Z, z:K} | z | get z from cell |
cell T | {x:VM_set, y:T, z:K} | cell' | set t to T in cell |
cell X | {x:VM_set, y:X, z:K} | cell' | set x to X in cell |
cell Y | {x:VM_set, y:Y, z:K} | cell' | set y to Y in cell |
cell Z | {x:VM_set, y:Z, z:K} | cell' | set z to Z in cell |
... tail head | {x:VM_pair, y:n, z:K} | pair | create {t:Pair_T, x:head, y:tail} (n times) |
pair | {x:VM_part, y:n, z:K} | ... tail head | split pair into head and tail (n times) |
pair | {x:VM_nth, y:n, z:K} | itemn | extract item n from a pair list |
pair | {x:VM_nth, y:-n, z:K} | tailn | extract tail n from a pair list |
— | {x:VM_push, y:value, z:K} | value | push literal value on stack |
vn ... v1 | {x:VM_depth, z:K} | vn ... v1 n | count items on stack |
vn ... v1 | {x:VM_drop, y:n, z:K} | — | remove n items from stack |
vn ... v1 | {x:VM_pick, y:n, z:K} | vn ... v1 vn | copy item n to top of stack |
vn ... v1 | {x:VM_dup, y:n, z:K} | vn ... v1 vn ... v1 | duplicate top n items on stack |
vn ... v1 | {x:VM_roll, y:n, z:K} | vn-1 ... v1 vn | roll item n to top of stack |
vn ... v1 | {x:VM_roll, y:-n, z:K} | v1 vn ... v2 | roll top of stack to item n |
n | {x:VM_alu, y:NOT, z:K} | ~n | bitwise not n |
n m | {x:VM_alu, y:AND, z:K} | n&m | bitwise n and m |
n m | {x:VM_alu, y:OR, z:K} | n|m | bitwise n or m |
n m | {x:VM_alu, y:XOR, z:K} | n^m | bitwise n exclusive-or m |
n m | {x:VM_alu, y:ADD, z:K} | n+m | sum of n and m |
n m | {x:VM_alu, y:SUB, z:K} | n-m | difference of n and m |
n m | {x:VM_alu, y:MUL, z:K} | n*m | product of n and m |
m | {x:VM_eq, y:n, z:K} | bool | TRUE if n == m, otherwise FALSE |
n m | {x:VM_cmp, y:EQ, z:K} | bool | TRUE if n == m, otherwise FALSE |
n m | {x:VM_cmp, y:GE, z:K} | bool | TRUE if n >= m, otherwise FALSE |
n m | {x:VM_cmp, y:GT, z:K} | bool | TRUE if n > m, otherwise FALSE |
n m | {x:VM_cmp, y:LT, z:K} | bool | TRUE if n < m, otherwise FALSE |
n m | {x:VM_cmp, y:LE, z:K} | bool | TRUE if n <= m, otherwise FALSE |
n m | {x:VM_cmp, y:NE, z:K} | bool | TRUE if n != m, otherwise FALSE |
n c | {x:VM_cmp, y:CLS, z:K} | bool | TRUE if n in c, otherwise FALSE |
bool | {x:VM_if, y:T, z:F} | — | continue F if "falsy", otherwise continue T |
— | {x:VM_msg, y:0, z:K} | msg | copy event message to stack |
— | {x:VM_msg, y:n, z:K} | msgn | copy message item n to stack |
— | {x:VM_msg, y:-n, z:K} | tailn | copy message tail n to stack |
— | {x:VM_my, y:SELF, z:K} | actor | push current actor address on stack |
— | {x:VM_my, y:BEH, z:K} | beh | push current actor behavior on stack |
— | {x:VM_my, y:STATE, z:K} | v1 ... vn | push current actor state on stack |
msg actor | {x:VM_send, y:0, z:K} | — | send msg to actor |
mn ... m1 actor | {x:VM_send, y:n, z:K} | — | send (m1 ... mn) to actor |
beh | {x:VM_new, y:0, z:K} | actor | create new actor with behavior beh |
v1 ... vn beh | {x:VM_new, y:n, z:K} | actor | create new actor with (v1 ... vn . beh) |
beh | {x:VM_beh, y:0, z:K} | — | replace behavior with beh |
v1 ... vn beh | {x:VM_beh, y:n, z:K} | — | replace behavior with (v1 ... vn . beh) |
reason | {x:VM_end, y:ABORT} | — | abort actor transaction with reason |
— | {x:VM_end, y:STOP} | — | stop current continuation (thread) |
— | {x:VM_end, y:COMMIT} | — | commit actor transaction |
— | {x:VM_end, y:RELEASE} | — | commit transaction and free actor |
chars | {x:VM_cvt, y:LST_NUM, z:K} | fixnum | convert chars to fixnum |
chars | {x:VM_cvt, y:LST_SYM, z:K} | symbol | convert chars to symbol |
char | {x:VM_putc, z:K} | — | write char to console |
— | {x:VM_getc, z:K} | char | read char from console |
value | {x:VM_debug, y:tag, z:K} | — | debug_print tag: value to console |
The diagram below shows a typical graph of quad-cells
representing the contents of the e_queue
(event queue)
and the k_queue
(continuation queue).
These two queues, plus the global symbol table,
and the interrupt-handling actors,
form the root-set of objects for garbage-collection.
e_queue: [head,tail]--------------------------+
| V
+-->[Event,to,msg,next]---> ... -->[Event,to,msg,NIL]
| |
| +--> actor message content
V
[Actor,beh,sp,?]
| |
| +--> actor state (initial SP)
|
+--> actor behavior (initial IP)
k_queue: [head,tail]--------------------+
| V
+-->[ip,sp,ep,kp]---> ... -->[ip,sp,ep,NIL]
| | |
| | +-->[Event,to,msg,NIL]
| | | |
| | | +--> ...
| | V
| | [Actor,beh',sp',events]---> ... -->[Event,to,msg,NIL]
| V
| [Pair,car,cdr,?]
| | |
| | +--> ... -->[Pair,car,NIL,?]
| V
| item
V
[Op,EQ,0,k]
|
+--> [Op,IF,t,f]
| |
| +--> ...
V
...
Instructions like VM_nth
and VM_msg
have an immediate index argument (n)
to succinctly designate parts of a pair-list.
- Positive n designates elements of the list, starting at
1
. - Negative n designates list tails, starting at
-1
. - Zero designates the whole list/value (for messages).
0 -1 -2 -3
---->[car,cdr]---->[car,cdr]---->[car,cdr]---->...
+1 | +2 | +3 |
V V V
...or more compactly...
0-->[1,-1]-->[2,-2]-->[3,-3]--> ...
| | |
V V V
If the index is out-of-bounds, the result is ?
(undefined).
The {x:VM_cmp, y:CLS}
instruction
produces TRUE
if a character value
is part of the specified class,
or FALSE
otherwise.
The class is defined as a union
of the named classes:
CTL
Control CharacterDGT
Decimal DigitUPR
Uppercase LetterLWR
Lowercase LetterDLM
DelimiterSYM
Name SymbolHEX
Hexadecimal DigitWSP
Whitespace
The tables below define which characters are included in each class.
ch | dec | hex | CTL | DGT | UPR | LWR | DLM | SYM | HEX | WSP |
---|---|---|---|---|---|---|---|---|---|---|
^@ | 0 | 00 | x | |||||||
^A | 1 | 01 | x | |||||||
^B | 2 | 02 | x | |||||||
^C | 3 | 03 | x | |||||||
^D | 4 | 04 | x | |||||||
^E | 5 | 05 | x | |||||||
^F | 6 | 06 | x | |||||||
^G | 7 | 07 | x | |||||||
^H | 8 | 08 | x | |||||||
^I | 9 | 09 | x | x | ||||||
^J | 10 | 0a | x | x | ||||||
^K | 11 | 0b | x | x | ||||||
^L | 12 | 0c | x | x | ||||||
^M | 13 | 0d | x | x | ||||||
^N | 14 | 0e | x | |||||||
^O | 15 | 0f | x | |||||||
^P | 16 | 10 | x | |||||||
^Q | 17 | 11 | x | |||||||
^R | 18 | 12 | x | |||||||
^S | 19 | 13 | x | |||||||
^T | 20 | 14 | x | |||||||
^U | 21 | 15 | x | |||||||
^V | 22 | 16 | x | |||||||
^W | 23 | 17 | x | |||||||
^X | 24 | 18 | x | |||||||
^Y | 25 | 19 | x | |||||||
^Z | 26 | 1a | x | |||||||
^[ | 27 | 1b | x | |||||||
^\ | 28 | 1c | x | |||||||
^] | 29 | 1d | x | |||||||
^^ | 30 | 1e | x | |||||||
^_ | 31 | 1f | x |
ch | dec | hex | CTL | DGT | UPR | LWR | DLM | SYM | HEX | WSP |
---|---|---|---|---|---|---|---|---|---|---|
32 | 20 | x | ||||||||
! | 33 | 21 | x | |||||||
" | 34 | 22 | x | |||||||
# | 35 | 23 | x | |||||||
$ | 36 | 24 | x | |||||||
% | 37 | 25 | x | |||||||
& | 38 | 26 | x | |||||||
' | 39 | 27 | x | |||||||
( | 40 | 28 | x | |||||||
) | 41 | 29 | x | |||||||
* | 42 | 2a | x | |||||||
+ | 43 | 2b | x | |||||||
, | 44 | 2c | x | |||||||
- | 45 | 2d | x | |||||||
. | 46 | 2e | x | |||||||
/ | 47 | 2f | x | |||||||
0 | 48 | 30 | x | x | ||||||
1 | 49 | 31 | x | x | ||||||
2 | 50 | 32 | x | x | ||||||
3 | 51 | 33 | x | x | ||||||
4 | 52 | 34 | x | x | ||||||
5 | 53 | 35 | x | x | ||||||
6 | 54 | 36 | x | x | ||||||
7 | 55 | 37 | x | x | ||||||
8 | 56 | 38 | x | x | ||||||
9 | 57 | 39 | x | x | ||||||
: | 58 | 3a | x | |||||||
; | 59 | 3b | x | |||||||
< | 60 | 3c | x | |||||||
= | 61 | 3d | x | |||||||
> | 62 | 3e | x | |||||||
? | 63 | 3f | x |
ch | dec | hex | CTL | DGT | UPR | LWR | DLM | SYM | HEX | WSP |
---|---|---|---|---|---|---|---|---|---|---|
@ | 64 | 40 | x | |||||||
A | 65 | 41 | x | x | ||||||
B | 66 | 42 | x | x | ||||||
C | 67 | 43 | x | x | ||||||
D | 68 | 44 | x | x | ||||||
E | 69 | 45 | x | x | ||||||
F | 70 | 46 | x | x | ||||||
G | 71 | 47 | x | |||||||
H | 72 | 48 | x | |||||||
I | 73 | 49 | x | |||||||
J | 74 | 4a | x | |||||||
K | 75 | 4b | x | |||||||
L | 76 | 4c | x | |||||||
M | 77 | 4d | x | |||||||
N | 78 | 4e | x | |||||||
O | 79 | 4f | x | |||||||
P | 80 | 50 | x | |||||||
Q | 81 | 51 | x | |||||||
R | 82 | 52 | x | |||||||
S | 83 | 53 | x | |||||||
T | 84 | 54 | x | |||||||
U | 85 | 55 | x | |||||||
V | 86 | 56 | x | |||||||
W | 87 | 57 | x | |||||||
X | 88 | 58 | x | |||||||
Y | 89 | 59 | x | |||||||
Z | 90 | 5a | x | |||||||
[ | 91 | 5b | x | |||||||
\ | 92 | 5c | x | |||||||
] | 93 | 5d | x | |||||||
^ | 94 | 5e | x | |||||||
_ | 95 | 5f | x |
ch | dec | hex | CTL | DGT | UPR | LWR | DLM | SYM | HEX | WSP |
---|---|---|---|---|---|---|---|---|---|---|
` | 96 | 60 | x | |||||||
a | 97 | 61 | x | x | ||||||
b | 98 | 62 | x | x | ||||||
c | 99 | 63 | x | x | ||||||
d | 100 | 64 | x | x | ||||||
e | 101 | 65 | x | x | ||||||
f | 102 | 66 | x | x | ||||||
g | 103 | 67 | x | |||||||
h | 104 | 68 | x | |||||||
i | 105 | 69 | x | |||||||
j | 106 | 6a | x | |||||||
k | 107 | 6b | x | |||||||
l | 108 | 6c | x | |||||||
m | 109 | 6d | x | |||||||
n | 110 | 6e | x | |||||||
o | 111 | 6f | x | |||||||
p | 112 | 70 | x | |||||||
q | 113 | 71 | x | |||||||
r | 114 | 72 | x | |||||||
s | 115 | 73 | x | |||||||
t | 116 | 74 | x | |||||||
u | 117 | 75 | x | |||||||
v | 118 | 76 | x | |||||||
w | 119 | 77 | x | |||||||
x | 120 | 78 | x | |||||||
y | 121 | 79 | x | |||||||
z | 122 | 7a | x | |||||||
{ | 123 | 7b | x | |||||||
| | 124 | 7c | x | |||||||
} | 125 | 7d | x | |||||||
~ | 126 | 7e | x | |||||||
^? | 127 | 7f | x |
Many instruction streams end with a common suffix. These immutable continuation sequences are available to be shared by many behaviors.
K_CALL: [MSG,+0,k]---+
|
|
RESEND: [MSG,+0,k] | RV_ZERO: [PUSH,+0,k]-----+
| | |
v | |
[SELF,?,k]---+ RV_NIL: [PUSH,NIL,k]----+
| |
| |
RV_SELF: [SELF,?,k] | RV_UNDEF: [PUSH,UNDEF,k]--+
| | |
v | |
CUST_SEND: [MSG,+1,k]<--|--------------------------------+
| |
v |
SEND_0: [SEND,0,k]<--+ RELEASE_0: [SEND,0,k]
| |
v v
COMMIT: [END,+1,?] RELEASE: [END,+2,?]