Tree Borrows: A Model of LLVM's noalias?

In this blog post, I want to finally write down an observation about Tree Borrows that we had quite some time ago. Concretely, we think that with some slight modifications, Tree Borrows turns into a suitable model of LLVM’s noalias. We originally planned to write a short sentence about us conjecturing this in paper, but this remark was cut less than two hours before submission for space reasons. Instead, all that survived was the understatement that “Tree Borrows [..] was designed with [LLVM’s noalias] in mind.”

The rest of this blog post requires some familiarity with Tree Borrows, so you can should read the paper, at least Sections 2 and 3. In particular, everything in the following recap should already be known.

A Quick Recap of Tree Borrows

At the core of Tree Borrows is a state machine. The state machine often shown in explainers of Tree Borrows is the from Figure 1 in the paper, reproduced here:

Figure 1 from the paper: Default state machine for Tree Borrows. Dashed arrows indicate entrypoints (for mutable, shared, or interior mutable references). (R)ead and (W)rite events are either ↑(foreign) or ↓(local) and drive the state machine. Reaching the lightning means UB.

Remember that in Tree Borrows, there is a tree for each memory locations, and every new reference to a memory location is added in that tree as a new node. Each node then contains the state machine above, starting out in the initial state (depending on what kind of reference it was that created the node). Every time any reference is accessed, all nodes in the tree change according to the state machine. For the accessed node, and all its parent nodes in the tree, the access is considered local, while for all others, it is considered foreign. The state machine then defines the new state for each node, depending on if it was a read or write and if the access was local or foreign. Triggering UB in any such transition is enough for the program to have UB.

Protectors, Uncovered

The above figure showed the “default” state machine for Tree Borrows. Tree Borrows also contains a second, more complicated state machine called the protected state machine. This protected state machine is used for protected references. References are considered protected whenever they occur as an argument to a method, like in the following example

fn example(x : &mut i32, y : &i32) {
    // x and y are implicitly retagged as protected on method entry.
    // z is not protected, but it has a parent that is protected.
    let z = &mut *x;
    *z += *y;
}

So the tree in Tree Borrows contains a bunch of nodes, some of which are protected, and those nodes have a different state machine.

Figure 2 from the paper: Protected state machine for Tree Borrows. This state machine is a lie.

In the version of Tree Borrows presented in the paper, this state machine was shown in that paper’s Figure 2. This state machine is reproduced above. But Figure 2 in the paper is kind of a lie by omission, since it only shows half the protected state machine. This lie is justified because the figure accurately shows the state machine one encounters on “inbounds” accesses.

But as explained around Example 10 of the paper, accesses can be out-of-bounds. This has an interesting effect that the paper only alludes to when it discussed the “used” tracking on page 13.

Concretely, for the out-of-bounds offsets of a reference (like in Example 10), these are not considered “used.” For a protected reference, these unused out-of-bounds accesses even tolerate foreign writes, which seems impossible under the state machine shown in the paper. What is actually happening is that there is a hidden part of the state machine for nodes not considered used. This part of Tree Borrows is rather messy and we used to not show it, but after a recent refactor, we can now show the protected state machine in all its beauty:

The real protected state machine, without omissions and lies.

You can see that there are some more states than we showed in the paper. For extra confusion, we also renamed the states a bit. The state FrozenL and ReservedL conform to the Frozen and Reserved states shown earlier. The ReservedLF state used to be called Reserved (conflicted). If you wonder why we now mark Reserved as initial instead of ReservedL, this is because we now consider the entrypoint for the unused/out-of-bounds offsets, which do not undergo an implicit read.

Also notice how Disabled is back. The Disabled state is reached by a foreign write to an unused reference, and it will block any future local accesses (by triggering UB on these). The four different kinds of Reserved track whether there have been local and foreign reads. If local reads happened, then foreign writes are no longer possible, and vice-versa. If both have happened, we are in state ReservedLF, and no write is possible anymore. Finally, Unique forbids any foreign access.

For shared references, we have Frozen and FrozenL. Attentive readers might notice they are similar to ReservedF and ReservedLF. In fact, these states are bisimilar in the protected state machine. However, when the protector is lost, all the different kinds of Reserved will go back to the more permissive unprotected Reserved state, while the two kinds of Frozen become unprotected Frozen, so we can’t merge them here.

A Suspicious Symmetry

At this point, you should ponder the protected state machine a bit more (but you can ignore the Frozen states). Do you notice the symmetry between Unique and Disabled, and ReservedL and ReservedF? The state machine ensures that if “one side” (local or foreign) performs a write access, then no accesses are allowed from the other side. If both sides only ever read, everything is fine.

LLVM’s noalias, informally.

The noalias flag in LLVM is used to tell LLVM that it can apply stronger optimizations around a reference. It’s the main workhorse for rustc’s optimizations around mutable references, since rustc marks mutable references in functions arguments as noalias. But how does noalias help LLVM optimize? What does noalias even mean? The answer can be found in LLVM’s reference, which states:

[noalias] indicates that memory locations accessed via pointer values based on the argument or return value are not also accessed, during the execution of the function, via pointer values not based on the argument or return value. This guarantee only holds for memory locations that are modified, by any means, during the execution of the function. If there are other accesses not based on the argument or return value, the behavior is undefined.

This sentence is a prime example of “standard legalese” that is very reminiscent of the C standard. To understand it, let’s look at the following C function and pretend that it is LLVM bitcode, where x is noalias:

void foo(int * noalias x, int * y) {
    // we know x and y actually point to the same memory (see main)
    int* x2 = x; //x2 is based on x, but not on y
    int* y2 = y; //y2 is not based on x, as y is not based on x

    // we can allow this:
    int tmp = *x2 + *y2;
    // or alternatively this:
    *x2 = 42;
    // or alternatively this:
    *y2 = 42;
    // but doing any of them in combination is forbidden!

    return;
}

void main() {
    int a = 42;
    foo(&x, &x);
}

The text I quoted talks about pointers “based on” other pointers, which alludes to some notion of pointer provenance. So when we compute x2 in the example above, it is “based on” x. Both x and y are based on the local variable a in main, but x or y are not on each other. (Note that this is my interpretation of what “based on” mean. There is sadly no formal definition.)

When we try to use both, the text clearly allows us to do one the following things (the three scenarios in the code above):

  • If we only ever read from x or y, everything is fine. (Since no restrictions apply when we do not modify.)
  • If we write through x (or a pointer based on it), then we may not access the memory with a “non-based-on” pointer (like y)
  • If we write through y (or any other pointer not based on x), then may not even read that same memory through x (or a pointer based on x).

And as long as we follow one of those patterns, we can eventually return from the function and all the restrictions go away.

Notice the Simiarity

If we replace “non-based-on” with “foreign” and “based on” with “local” in the above, then what we get is precisely what our protected state machine already enforces! And indeed, equating “based on” with “local” makes a lot of sense: A reference is local to some other reference if it was retagged from it, and similarly a pointer is “based on” another pointer when it is obtained from it in a similar way.

This similarity is the crux of this blog post, and it is there for a reason: after all, protectors exist to justify the optimizations LLVM does with noalias. And since we wanted to be as lenient as possible with Tree Borrows while still justifying what LLVM is doing, our protectors are basically a thin layer of abstraction over noalias, encoding essentially the same restrictions. There is however one exception.

When noalias Ends

In Rust, when a protector ends, we revert back to the unprotected state machine. In LLVM, there are no restrictions on how pointers can be aliased, i.e. it only has “raw pointers.” Thus, in our model of LLVM, when noalias ends, instead of falling back to Rust’s state machine for references, we need to fall back to a maximally permissive state machine where UB is never triggered. This is easily accomplished by a “dummy” state machine with only one state, where all transitions self-loop back into that one state, never reaching UB. In fact, since the past summer, Tree Borrows has a state very much like this for handling interior mutable data.

The Proposed Formal Model

So, our formal model for LLVM’s noalias is that there is a tree, just like in Tree Borrows, and that pointers have a “borrow tag” (name subject to change), just like in Tree Borrows, and local and foreign accesses also work similarly. The only difference is that we only create a new node in the “borrow tree” when a function is called where an argument is annotated as noalias. This node’s state machine is precisely the unabridged “protected state machine” shown above. Reads, writes and deallocations traverse the tree like in Tree Borrows, driving the state machines forward in the same way. When the protectornoalias ends, we replace this state machine in the corresponding node with the previously described “dummy” singleton state machine.

Protector End Actions

In Tree Borrows, we had protector end actions, necessary for complicated reasons. (Compare Sections 3.2 and 5 of the paper.) These remain necessary, and work the same way: Unique triggers writes, ReservedL and ReservedLF triggers reads, and the other states trigger nothing.

Like in Tree Borrows, these end actions are “special” in that they don’t affect children of the node which was protected.

Data Races and Weak Memory

We did not consider the interaction of data race UB and aliasing model UB in our paper. This interaction is highly nontrivial, since aliasing optimizations can create new writes, and these new writes can create data races. So the data race model must already be aware of the “potential writes” that the aliasing model could introduce. Modelling this is out of scope for this blog post, as it was out of scope for the Tree Borrows paper.

Similarly, under weak memory, it does not make a lot of sense to speak about “the” protector end, since “time is relative.” Instead, one perhaps needs to assign a vector clock timestamp to the protector ending, and then consider for each event whether the protector end happened-before? I am by no means an expert on weak memory, so this also remains future work. If you have some ideas here, feel free to tell me. Other people are also thinking about this, see e.g. this GitHub issue.

Parting Thoughts

The fact that Tree Borrows “kind of” describes noalias is intended, since noalias also “kind of” defines the optimizations that rustc performs. Nonetheless, we were a bit surprized when we first realized that Tree Borrows basically contains the precise state machine needed to describe noalias.

For further thought, the informal definition of noalias is heavily inspired by C’s definition of restrict. We thus also conjecture that the proposed model here is an operational definition of restrict.