Skip to content

Latest commit

 

History

History
199 lines (170 loc) · 11 KB

gc.md

File metadata and controls

199 lines (170 loc) · 11 KB

Garbage Collection

Mutable quad-memory (RAM) is subject to machine-level garbage collection. The garbage-collected heap ranges from RAM_BASE up to (not including) RAM_TOP. The floor (currently RAM_BASE) may be moved upward to include additional "reserved" cells. The ceiling (held in the variable RAM_TOP) is extended upward to expand the pool of available memory, up to a limit of RAM_MAX.

GC Private Memory

The garbage collector maintains storage sufficient to hold a memory reference (tagged Raw value) for each quad-memory cell in RAM. That storage is private to the garbage collector, and is not visible to any executing programs.

GC Algorithm

The current garbage-collection algorithm is a fairly traditional mark-and-sweep style. It is a no-motion collector, so the address of an allocated object never changes over its lifetime. It is an incremental (concurrent) collector, so additional allocation and mutation can occur while the garbage collector is running. Actions performed by the garbage collector are interleaved with instruction execution.

GC Memory Evolution

When a GC cycle starts the GC queue is initialized to empty, Reserved RAM is set to TRUE (marked "black"), and the rest of GC memory is set to FALSE (marked "white").

Offset Quad.T Quad.X Quad.Y Quad.Z Description GC queue
0 ram_top = 20 next_free = 17 free_cnt = 1 gc_root = NIL RAM Descriptor gc_first = NIL
1 event_first = 19 event_last = 19 cont_first = NIL cont_last = NIL Double-Deque gc_last = NIL
2 - 14 ... ... ... ... Reserved RAM TRUE
15 alloc_limit event_limit cycle_limit -- Root Sponsor FALSE
16 Pair_T car = 42 cdr = NIL UNDEF Cons Cell FALSE
17 Free_T UNDEF UNDEF NIL Free Cell FALSE
18 Actor_T code = boot_beh data = NIL effect = UNDEF Bootstrap Actor FALSE
19 sponsor = 15 target = 18 message = NIL next = NIL Bootstrap Event FALSE
20 ... ... ... ... Top of RAM FALSE

Reserved RAM is scanned (starting with the Double-Deque), adding any referenced cells to the GC queue (marked "grey").

Offset Quad.T Quad.X Quad.Y Quad.Z Description GC queue
0 ram_top = 20 next_free = 17 free_cnt = 1 gc_root = NIL RAM Descriptor gc_first = 19
1 event_first = 19 event_last = 19 cont_first = NIL cont_last = NIL Double-Deque gc_last = 19
2 - 14 ... ... ... ... Reserved RAM TRUE
15 alloc_limit event_limit cycle_limit -- Root Sponsor FALSE
16 Pair_T car = 42 cdr = NIL UNDEF Cons Cell FALSE
17 Free_T UNDEF UNDEF NIL Free Cell FALSE
18 Actor_T code = boot_beh data = NIL effect = UNDEF Bootstrap Actor FALSE
19 sponsor = 15 target = 18 message = NIL next = NIL Bootstrap Event NIL
20 ... ... ... ... Top of RAM FALSE

The first item in the GC queue is scanned, adding any referenced cells to the GC queue. The first item is removed from the GC queue (marked "black").

Offset Quad.T Quad.X Quad.Y Quad.Z Description GC queue
0 ram_top = 20 next_free = 17 free_cnt = 1 gc_root = NIL RAM Descriptor gc_first = 15
1 event_first = 19 event_last = 19 cont_first = NIL cont_last = NIL Double-Deque gc_last = 18
2 - 14 ... ... ... ... Reserved RAM TRUE
15 alloc_limit event_limit cycle_limit -- Root Sponsor 18
16 Pair_T car = 42 cdr = NIL UNDEF Cons Cell FALSE
17 Free_T UNDEF UNDEF NIL Free Cell FALSE
18 Actor_T code = boot_beh data = NIL effect = UNDEF Bootstrap Actor NIL
19 sponsor = 15 target = 18 message = NIL next = NIL Bootstrap Event TRUE
20 ... ... ... ... Top of RAM FALSE

This process is repeated until the GC queue is empty.

Offset Quad.T Quad.X Quad.Y Quad.Z Description GC queue
0 ram_top = 20 next_free = 17 free_cnt = 1 gc_root = NIL RAM Descriptor gc_first = NIL
1 event_first = 19 event_last = 19 cont_first = NIL cont_last = NIL Double-Deque gc_last = NIL
2 - 14 ... ... ... ... Reserved RAM TRUE
15 alloc_limit event_limit cycle_limit -- Root Sponsor TRUE
16 Pair_T car = 42 cdr = NIL UNDEF Cons Cell FALSE
17 Free_T UNDEF UNDEF NIL Free Cell FALSE
18 Actor_T code = boot_beh data = NIL effect = UNDEF Bootstrap Actor TRUE
19 sponsor = 15 target = 18 message = NIL next = NIL Bootstrap Event TRUE
20 ... ... ... ... Top of RAM FALSE

Cells that are still set to UNDEF (marked "white") are swept into the free-list, from the Top of RAM down to Reserved RAM.

Offset Quad.T Quad.X Quad.Y Quad.Z Description GC queue
0 ram_top = 20 next_free = 16 free_cnt = 2 gc_root = NIL RAM Descriptor gc_first = NIL
1 event_first = 19 event_last = 19 cont_first = NIL cont_last = NIL Double-Deque gc_last = NIL
2 - 14 ... ... ... ... Reserved RAM TRUE
15 alloc_limit event_limit cycle_limit -- Root Sponsor TRUE
16 Free_T UNDEF UNDEF 17 Free Cell FALSE
17 Free_T UNDEF UNDEF NIL Free Cell FALSE
18 Actor_T code = boot_beh data = NIL effect = UNDEF Bootstrap Actor TRUE
19 sponsor = 15 target = 18 message = NIL next = NIL Bootstrap Event TRUE
20 ... ... ... ... Top of RAM FALSE

Concurrent/Incremental GC

The initial implementation of this GC algorithm runs in a stop-the-world fashion, usually triggered by completely exhausting available RAM. This creates a "pause" between instructions that occurs at unpredictable times. It is desirable to perform garbage collection in an incremental fashion, concurrent (or interleaved) with normal mutation resulting from executing instructions.

GC Phase Descriptions

There are three phases to the current GC algorithm:

  1. Preparation
  2. Scanning
  3. Sweeping

In preparation the GC scan-queue is empty and all non-reserved RAM quad-cells are marked "white". The reserved cells are scanned starting with the double-queue structure and the GC root pointer (if in RAM) is marked "grey". This marks "grey" the first and last entries of both the event queue and the continuation queue.

In scanning entries are removed from the GC scan-queue and marked "black". Each field of the designated quad is scanned, adding any "white" RAM reference to the GC scan-queue, thus marking them "grey". This process continues until the GC scan-queue is empty, at which point all cells should be either "white" (unreachable) or "black" (reachable).

In sweeping a linear-address sweep considers all non-reserved RAM starting at the top of memory. Any "white" cells that are not already in the free-list are added to the free list. Any "black" cells are marked "white" in preparation for the next GC pass.

Mutation During Collection

Mutation interacts with garbage-collection via the reserve (allocate) and release (free) primitives. When a cell is released, it is added to the free-list and the cell is marked "white". If the cell was in the GC scan-queue (marked "grey"), which can only occur during scanning, it is removed from the queue.

When a cell is reserved, it is assumed to be reachable until a future GC pass finds otherwise. During preparation the cell is marked "white" because that's the initial condition for all cells. During scanning, if the cell is still "white" (or "black"?) it is added to the GC scan-queue (marked "grey"). During sweeping, if the cell is above the sweep address (the sweep has already passed it) it is marked "white", otherwise it is marked "black" and thus protected from collection.

A Fatal Flaw

The preceeding algorithm for handling mutation has a fatal flaw! During scanning it is possible for an object not-yet-marked as "black" (reachable) to be unlinked from one cell and relinked into another that has already been scanned. If the original parent is then released, the relinked child may never appear in the GC scan-queue and thus may remain "white", causing it to be collected into the free-list during sweeping. Until a more robust mechanism is developed for detecting this kind of migration, the incremental GC cannot be trusted and the stop-the-world GC must be used instead.