Tree Borrows: A Model of LLVM's noalias?
In this blog post, I want to finally write down an observation about Tree Borrows that we had quite some time ago. Concretely, we think that with some slight modifications, Tree Borrows turns into a suitable model of LLVM’s noalias. I even mentioned this when I posted about submitting the paper, back in November 2024. We had originally planned to write a short sentence about us conjecturing this in the paper, but this remark was cut in the final hours before submission as we frantically tried to fit everything into the page limit. All that survived was the understatement that “Tree Borrows [..] was designed with [LLVM’s noalias] in mind.”
In this blog post, we will construct a formal model of noalias by taking Tree Borrows and deleting things from it, without adding anything. The rest of this blog post requires some familiarity with Tree Borrows, so you can should read the paper, at least Sections 2 and 3. In particular, everything in the following recap should already be known.
A Quick Recap of Tree Borrows
At the core of Tree Borrows is a state machine. The state machine often shown in explainers of Tree Borrows is the one from Figure 1 in the paper, reproduced here:
Remember that in Tree Borrows, there is a tree for each memory locations, and every new reference to a memory location is added in that tree as a new node. Each node then contains the state machine above, starting out in the initial state (depending on what kind of reference it was that created the node). Every time any reference is accessed, all nodes in the tree change according to the state machine. For the accessed node, and all its parent nodes in the tree, the access is considered local, while for all others, it is considered foreign. The state machine then defines the new state for each node, depending on if it was a read or write and if the access was local or foreign. Triggering UB in any such transition is enough for the program to have UB.
Protectors, Uncovered
The above figure showed the “default” state machine for Tree Borrows. Tree Borrows also contains a second, more complicated state machine called the protected state machine. This protected state machine is used for protected references. References are considered protected whenever they occur as an argument to a method, like in the following example
fn example(x : &mut i32, y : &i32) {
// x and y are implicitly retagged as protected on method entry.
// z is not protected, but it has a parent that is protected.
let z = &mut *x;
*z += *y;
}
So the tree in Tree Borrows contains a bunch of nodes, some of which are protected, and those nodes have a different state machine.
The state machine above shown above is Figure 2 from our Tree Borrows paper, and is supposed to show the state machine for protected nodes. But this Figure is a lie (by omission), since it only shows half the protected state machine. This lie is justified because the figure accurately shows the state machine one encounters on “inbounds” accesses.
But as explained around Example 10 of the paper, accesses can be out-of-bounds. This has an interesting effect that the paper only alludes to when it discussed the “used” tracking on page 13. Concretely, for the out-of-bounds offsets of a reference (like in Example 10), these are not considered used. For a protected reference, these unused out-of-bounds accesses even tolerate foreign writes, which seems impossible under the state machine shown so far. What is actually happening is that there is a hidden part of the state machine for such unusued nodes. This part of Tree Borrows used to be rather messy and we used to not show it, but a recent refactor made it much nicer. So we can now finally show the protected state machine in all its beauty:
You can see that there are some more states than we showed in the paper. For extra confusion, the refactor also renamed the states a bit. The state FrozenL and ReservedL conform to the Frozen and Reserved states shown earlier. The ReservedLF state used to be called Reserved (conflicted). If you wonder why we now mark Reserved as initial instead of ReservedL, this is because we now consider the entrypoint for the unused/out-of-bounds offsets, which do not undergo an implicit read.
Also notice how Disabled is back. The Disabled state is reached by a foreign write to an unused reference, and it will block any future local accesses (by triggering UB on these). The four different kinds of Reserved track whether there have been local and foreign reads. If local reads happened, then foreign writes are no longer possible, and vice-versa. If both have happened, we are in state ReservedLF, and no write is possible anymore. Finally, Unique forbids any foreign access.
For shared references, we have Frozen and FrozenL. Attentive readers might notice they are similar to ReservedF and ReservedLF. In fact, these states are bisimilar in the protected state machine. However, when the protector is lost, all the different kinds of Reserved will go back to the more permissive unprotected Reserved state, while the two kinds of Frozen become unprotected Frozen, so we can’t merge them here.
A Suspicious Symmetry
At this point, you should ponder the protected state machine a bit more (but you can ignore the Frozen states). Do you notice the symmetry between Unique and Disabled, and between ReservedL and ReservedF? The state machine ensures that if “one side” (local or foreign) performs a write access, then no accesses are allowed from the other side. If both sides only ever read, everything is fine.
LLVM’s noalias, informally.
The noalias flag in LLVM is used to tell LLVM that it can apply stronger optimizations around a reference. It’s the main workhorse for rustc’s optimizations around mutable references, since rustc marks mutable references in functions arguments as noalias. But how does noalias help LLVM optimize? What does noalias even mean? The answer can be found in LLVM’s reference, which states:
[
noalias] indicates that memory locations accessed via pointer values based on the argument or return value are not also accessed, during the execution of the function, via pointer values not based on the argument or return value. This guarantee only holds for memory locations that are modified, by any means, during the execution of the function. If there are other accesses not based on the argument or return value, the behavior is undefined.
This sentence is a prime example of “standard legalese” that is very reminiscent of the C standard. To understand it, let’s look at the following C function and pretend that it is LLVM bitcode, where x is noalias:
void foo(int * noalias x, int * y) {
// we know x and y actually point to the same memory (see main)
int* x2 = x; //x2 is based on x, but not on y
int* y2 = y; //y2 is not based on x, as y is not based on x
// we can allow this:
int tmp = *x2 + *y2;
// or alternatively this:
*x2 = 42;
// or alternatively this:
*y2 = 42;
// but doing any of them in combination is forbidden!
return; // after returning, the restrictions all end.
}
void main() {
int a = 42;
foo(&x, &x);
}
The text I quoted talks about pointers “based on” other pointers, which alludes to some notion of pointer provenance. So when we compute x2 in the example above, it is “based on” x. Both x and y are based on the local variable a in main, but x or y are not on each other. (Note that this is my interpretation of what “based on” mean. There is sadly no formal definition.)
When we try to use both, the text clearly allows us to do one the following things (the three scenarios in the code above):
- If we only ever read from
xory, everything is fine. (Since no restrictions apply when we do not modify.) - If we write through
x(or a pointer based on it), then we may not access the memory with a “non-based-on” pointer (likey) - If we write through
y(or any other pointer not based onx), then may not even read that same memory throughx(or a pointer based onx).
And as long as we follow one of those patterns, we can eventually return from the function and all the restrictions go away.
Notice the Simiarity
If we replace “non-based-on” with “foreign” and “based on” with “local” in the above, then what we get is precisely what our protected state machine already enforces! And indeed, equating “based on” with “local” makes a lot of sense: A reference is local to some other reference if it was retagged from it, and similarly a pointer is “based on” another pointer when it is obtained from it in a similar way.
This similarity is the crux of this blog post, and it is there for a reason: after all, protectors exist to justify the optimizations LLVM does with noalias. And since we wanted to be as lenient as possible with Tree Borrows while still justifying what LLVM is doing, our protectors are basically a thin layer of abstraction over noalias, encoding essentially the same restrictions. There is however one exception.
When noalias Ends
In Rust, when a protector ends, we revert back to the unprotected state machine. In LLVM, there are no restrictions on how pointers can be aliased, i.e. it only has “raw pointers.” Thus, in our model of LLVM, when noalias ends, instead of falling back to Rust’s state machine for references, we need to fall back to a maximally permissive state machine where UB is never triggered. This is easily accomplished by a “dummy” state machine with only one state, where all transitions self-loop back into that one state, never causing UB.
Such a dummy state machine is not unprecedented: I supervised a Bachelor’s student this year, who added such a state to Tree Borrows to handle interior mutable references, allowing us to do more finely grained tracking of which bytes are covered by an UnsafeCell. You can look at the dummy state machine in Figure 3.2 of Xinglu’s Bachelor’s thesis.
The Proposed Formal Model
So, our formal model for LLVM’s noalias is that there is a tree, just like in Tree Borrows, and that each pointer corresponds to one such node via pointer provenance, just like in Tree Borrows. Formally, a reference is then “based on” another reference if it’s tree node is a child of the other’s tree node, corresponding to local nodes in Tree Borrows. (In fact, we propose using the Tree Borrows terminology here.) The main difference is that we only create a new node in the “borrow tree” when a function is called where an argument is annotated as noalias. This node’s state machine is precisely the unabridged “protected state machine” shown above. Reads, writes and deallocations traverse the tree like in Tree Borrows, driving the state machines forward in the same way. When the protectornoalias ends, we replace this state machine in the corresponding node with the previously described “dummy” singleton state machine.
Protector End Actions
In Tree Borrows, we had protector end actions, necessary for complicated reasons. (Compare Sections 3.2 and 5 of the paper.) These remain necessary, and work the same way: Unique triggers writes, ReservedL(F) and FrozenL triggers reads, and the other states trigger nothing since they have not yet been locally accessed.
Like in Tree Borrows, these end actions are “special” in that they don’t affect children of the node which was protected.
Data Races and Weak Memory
We did not consider the interaction of data race UB and aliasing model UB in our paper. This interaction is highly nontrivial, since aliasing optimizations can create new writes, and these new writes can create data races. So the data race model must already be aware of the “potential writes” that the aliasing model could introduce. Modelling this is out of scope for this blog post, as it was out of scope for the Tree Borrows paper.
Similarly, under weak memory, it does not make a lot of sense to speak about “the” protector end, since “time is relative.” Instead, one perhaps needs to assign a vector clock timestamp to the protector ending, and then consider for each event whether the protector end happened-before? I am by no means an expert on weak memory, so this also remains future work. If you have some ideas here, feel free to tell me. Other people are also thinking about this, see e.g. this GitHub issue.
Other Things not Convered
Of course, an actual model of LLVM would have to answer a lot more questions. For example, how does noalias interact with other flags like dereferenceable? How does noalias interact with other aliasing-based constraints in the IR, like getelementptr inbounds? This blog post is mostly about the cute observation about noalias and that it’s hopefully a starting-off point for actual research.
Parting Thoughts
The fact that Tree Borrows “kind of” describes noalias is intended, since noalias also “kind of” defines the optimizations that rustc performs. Nonetheless, we were a bit surprised when we first realized that Tree Borrows basically contains the precise state machine needed to describe noalias, instead of an overapproximation.
For further thought, the informal definition of noalias is heavily inspired by C’s definition of restrict, and is intended to have the same semantics. We thus also conjecture that the proposed model here is an operational definition of restrict. Re-reading our Tree Borrows paper now, I realize that we actually claimed in there that “we suspect that Tree Borrows could be used as the basis for a formal definition of restrict.” While I could have just as well written this post about C, formally specifying C seems like a thankless (pointless?) endeavour to me. Formal models for LLVM, on the other had, are successfully used in practice and extending some of these models with noalias support seems a lot more exciting.