On the Fly Garbage Collection an Exercise in Cooperation (1978)

February 28, 2025

In Go 1.5, the algorithm of the garbage collector (GC) changed to a concurrent, tri-color, mark-sweep collector, which was first proposed by Dijkstra et al. in 1978. The algorithm proposed by Dijkstra et al. alternates a marking phase and an appending phase. In the marking phase, the garbage collector colors reachable objects in gray or black. In the appending phase, it collects white objects that are unreachable from things on the stack and global variables. The algorithm prioritizes minimizing the atomic operations of the garbage collector to reduce overhead on the application.

A post on The Go Blog, Go GC: Prioritizing low latency and simplicity, states that the algorithm deliberately diverges from most “enterprise” grade garbage collectors of today. While this may be true, ZGC, one of Java’s garbage collectors, employs the concept of the tri-color marking.

Rich Hudson, a member of Google’s Go team, explained why the algorithm suited for Go’s runtime in Rick HudsonのGo GC: Latency Problem Solved and its video. He emphasized the importance of memory allocation locality in the presentation and another talk. First, the fields are embedded in their structs and allocated sequentially in the heap. Second, as long as an interior pointer refers to a member of a struct, the struct remains alive.

The algorithm abstracts the heap as a directed graph, where nodes represent pieces of heap memory, and edges represent references. The number of the nodes is fixed. The objects referred to by globals and variables on the stack are root nodes. The algorithm collects free nodes in lists, using root nodes as their heads. Note that a root node can also be a target from another node by definition.

In the context of the directed graph, the application is referred to as the mutator, and the garbage collector is called the collector. The nodes unreachable from any root nodes are garbage. The collector appends them to the free list.

The goal of the algorithm is to identify all the garbage nodes and add them to the free list within a finite number of steps. If a garbage node exists at the beginning of an appending phase, it is appended to the free list by the end of the next appending phase.

There are some restrictions on the structure of the directed graph to simplify the proof without loss of generality. Each node has two outgoing edges. There is a special node, NIL, whose two outgoing edges points itself. An edge targeting NIL represents a source node with fewer than two outgoing edges.

The algorithm uses colors to distinguish between live and garbage nodes and alternates between the two phases in a fnite number of steps. Nodes can be colored white, gray, or black. White is lighter than gray, and gray is lighter than black. During the marking phase, no node becomes a lighter color. In the appending phase, white nodes are considered garbage, and the collector appends them to the free list. If the target of a previously redirected edge is white, the mutator makes it gray before redirecting the edge toward another node or itself. Coloring and redirection are distinct atomic operations.

The marking phase continues until there are no gray nodes except for the root nodes. The following pseudocode represents the marking phase, assuming initially no nodes are black.

for root in roots:
  if root.color == WHITE:
    root.color = GRAY

i, k = 0, M # M is the number of the nodes.
	
while k > 0:
  c = nodes[i].color // atomic
  if c == GRAY:
    k = M
	for side in range(2):
      succ := nodes[i].successors[side] // atomic
	  // begin atomic
      if succ.color == WHITE:
	    succ.color = GRAY
	  // end atomic
	nodes[i].color = BLACK // atomic
  else:
    k -= 1
	
  i = (i + 1) % M

At the beginning of the marking phase, the collector colors any white root nodes gray. Afterwards, the collector sequentially processes each node, turning white successors gray and then turning the source node black. The marking phase ends if the collector does not find any gray nodes for M times in a row. Because the root node of the free list is gray at the beginning, the free nodes eventually turn black by the end of the marking phase.

In the appending phase, the collector appends the white nodes to the free list, and colors black nodes white. During this phase, there are no gray nodes except for the root nodes. The following pseudocode shows the appending phase.

for i in range(m):
  color := nodes[i].color // atomic
  if color == WHITE:
    free_list.append(nodes[i]) // atomic
  else: // color == BLACK
    nodes[i].color = WHITE // atomic

Note that Go’s garbage collection is not implemented literally as described above. “Go GC: Prioritizing low latency and simplicity”, mentioned previously, states that Go’s GC is turned off after the sweep phase. Additionally, the Go’s GC introduces stop-the-world pauses to simplify the marking phase’s termination.