I fixed a major issue in a piece of code this month. It’s not the first time I’ve attacked the problem, either: I’ve revisited it multiple times over the past year with no apparent progress. Yet the fix took far less time to discover and implement than I’d spent on any of those prior attempts.

The worst part is that I found the fix by doing what I tell my students to do all the time, but hadn’t been doing myself in this case. I’m embarrassed! But, as a beloved math teacher of mine used to say: we’re only limited by our own embarrassment. So rather than silently pushing the fix and moving on, I’d like to reflect on the process. It might be useful.

The Problem

I do development work on Forge, a lightweight formal-methods tool inspired by Alloy and built for education. Students have used Forge to model and reason about all kinds of systems, from page tables to networks to blockchain.

Forge lets me write reusable helper functions to define concepts from the domain I’m modeling, like this one copied from a model of leader-election in Raft that I’ve been working on:

/** The initial startup state for the cluster */
pred init {
    all s: Server | {      // all servers in the cluster...
        s.role = Follower  // start in the Follower role,
        no s.votedFor      // haven't made any votes yet, and
        s.currentTerm = 0  // start in the initial term of office.
    } 
}

Since predicates can be used in other predicates, it’s possible that someone might write a predicate that calls itself, or otherwise create a reference cycle. Here’s an example, where I might be trying to solve one of my early homework problems about family trees.

// An attempt to define ancestry using recursion in Forge
pred ancestor_of[older, younger: Person] {
    older = younger.mother or 
    older = younger.father or 
    ancestor_of[older, younger.mother] or
    ancestor_of[older, younger.father]
}

In a programming language, a function calling another function is perfectly reasonable. Indeed, in a logic-programming language like Prolog, this sort of recursive predicate is standard operating procedure!

But Forge isn’t a programming language; it’s a lightweight specification language. Its power is deliberately limited to ease automation. It has no heap, no call stack, and no recursive constraints. Its engine turns everything into a boolean satisfiability problem. So, while we might imagine unrolling recursion to a particular depth (which would often scale poorly), the usual unbounded recursive intuitions simply don’t fit.

Of course, it’s quite useful to have predicates call other predicates. In my Raft model, I might want a predicate that defines traces in the transition system. It should invoke the init predicate I defined above, but also encode possible transitions. Using Forge’s temporal sublanguage, I can use temporal operators like always and eventually to describe traces:

pred electionSystemTrace {
    init        // Start in an initial state for the system
    always { 
        // In every state, always step forward according to the system definition
        ... ... // list of potential transitions omitted for brevity
    }
}

Forge supports this via substitution. So electionSystemTrace expands to:

all s: Server | { 
    s.role = Follower
    no s.votedFor    
    s.currentTerm = 0 
} 
always { 
    ... ...
}

This is fine if there can never be a cycle in predicate references. But if someone writes a recursive predicate in Forge, and the tool doesn’t catch the reference cycle before trying to build the corresponding solver constraints, Forge could end up substituting over and over again—forever.

Detecting cyclic references is a well-known problem! We could walk the parse tree of the language as soon as it’s parsed, searching for cycles—but I wanted to be able to catch the problem later on. This would be useful in case we eventually want to support bounded recursion (as Alloy does) and also as a backup if the initial walk missed something. Implementing that check is what caused me so much trouble.
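To make that check concrete, here’s a small sketch in Racket (assuming a #lang racket module). The graph representation and names are made up for illustration, not Forge’s actual data structures: once each predicate’s references have been collected from the parse tree, a depth-first search finds any cycle.

;; Hypothetical reference graph: each predicate maps to the
;; predicates it mentions.
(define pred-refs
  (hash 'electionSystemTrace '(init)
        'init                '()
        'ancestor_of         '(ancestor_of)))  ; self-reference!

;; Depth-first search: returns a cyclic path if one exists, else #f.
(define (find-cycle graph)
  (define (visit name path)
    (if (member name path)
        (reverse (cons name path))
        (for/or ([callee (hash-ref graph name '())])
          (visit callee (cons name path)))))
  (for/or ([name (in-hash-keys graph)])
    (visit name '())))

(find-cycle pred-refs)  ; => '(ancestor_of ancestor_of)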

The Problem With The Problem

Why did I struggle with this for so long?

Macros. Macros are a language feature that transforms syntax before it is executed as code, and they’re important in Forge:

  • They are a useful way to expand the abstract syntax of the surface language (like the pred definition above) into valid Racket code that builds constraints, typechecks them, and prepares them to be sent to the solver.
  • They let us capture syntax location information from what the user wrote. If Forge knows, for instance, the line and column where a predicate is defined or called, it can produce much better error messages.

Taken together, this means that down in the belly of the tool, there’s a macro for turning the abstract syntax of pred definitions into code (capturing declaration-site information). And that generated code is itself a macro for invoking the predicate (one that captures call-site information).
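As a rough illustration, here’s a hypothetical sketch of that pattern in Racket (not Forge’s actual source; make-constraint is a stand-in for the real constraint builder). The outer macro captures the declaration line; the macro it generates captures each call line:

;; Stand-in for Forge's real constraint-building machinery.
(define (make-constraint name decl-line call-line thunk)
  (printf "~a declared at line ~a, called at line ~a\n"
          name decl-line call-line)
  (thunk))

(define-syntax (define-pred stx)
  (syntax-case stx ()
    [(_ name body)
     ;; Captured once, when the *declaration* expands:
     (with-syntax ([decl-line (syntax-line stx)])
       #`(define-syntax (name call-stx)
           ;; Captured anew each time a *call* expands:
           #`(make-constraint 'name decl-line
                              #,(syntax-line call-stx)
                              (lambda () body))))]))

(define-pred init (displayln "  ...init's constraints go here"))
(init)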

Expansion Time vs. Run Time

You don’t want to use macros for everything, though. It’s best practice to expand syntax to a runtime procedure call, and put the bulk of the work in the procedure, to avoid code duplication and code size bloat. This is important!

To see why, consider Forge’s if-then-else expression, which looks like C => T else F, where C is a branching condition. Here’s an example from a domain model of crypto protocols:

// Evaluate to the inverse of key `k`. If it's a public key...
k in PublicKey => 
    // produce the left component of the pertinent key pair 
    ((KeyPairs.pairs).k)
else
    // otherwise, produce the right component. 
    (k.(KeyPairs.pairs))

But the same syntactic form can produce constraints rather than values. If I wanted to enforce that some other key k2 was in fact the inverse of k, I’d write this:

// Enforce `k2` is the inverse of `k`. If `k` is a public key...
k in PublicKey => 
    // require k2 to be the value in the left component of the key pair
    k2 = ((KeyPairs.pairs).k)
else 
    // otherwise, k2 must be the value in the right component
    k2 = (k.(KeyPairs.pairs))

These are handled differently in the engine, and so to check for consistent use, Forge needs to infer which case it is working with—even though the syntax is the same across the two cases. We can’t always use enclosing context to resolve this ambiguity, either: users can ask Forge to evaluate arbitrary syntax when viewing an output instance.

Back in the early development days of Forge, we had a macro for if-then-else that expanded into too much code. It didn’t cause a problem for examples like this one, but imagine nesting multiple if-then-else expressions. If I wrote:

C1 => X else (C2 => Y else Z)

it would expand to (a more verbose version of) this runtime code, which has to disambiguate between the value- and constraint-producing cases.

(if (is-constraint? X) 
    (build_ITE_formula C1 X (C2 => Y else Z))
    (build_ITE_expr    C1 X (C2 => Y else Z)))

which then needs to be expanded twice more, for the inner if-then-elses:

(if (is-constraint? X) 
    (build_ITE_formula C1 X (if (is-constraint? Y)
                                (build_ITE_formula C2 Y Z)
                                (build_ITE_expr    C2 Y Z)))
    (build_ITE_expr    C1 X (if (is-constraint? Y)
                                (build_ITE_formula C2 Y Z)
                                (build_ITE_expr    C2 Y Z))))

Notice that the original expression had a single occurrence each of C1, X, and Y. Now there are many more of each. We were accidentally encoding the entire decision tree in the expanded code (even though, at runtime, only a single path down the tree would be taken)!

But suppose we instead define a runtime procedure to handle the disambiguation:

(define (INTERNAL_ITE_CHOOSE C T F)
  (if (is-constraint? T) 
    (build_ITE_formula C T F)
    (build_ITE_expr    C T F)))

and expand

C1 => X else (C2 => Y else Z)

to an invocation of that procedure:

(INTERNAL_ITE_CHOOSE C1 X (C2 => Y else Z))

we see a very different final version:

(INTERNAL_ITE_CHOOSE C1 X (INTERNAL_ITE_CHOOSE C2 Y Z))

The decision tree is now implicit, not explicit. We lose nothing in the change. After all, we only need the macro for recording syntax-location information.
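For completeness, here’s a runnable sketch of the whole pattern in Racket, with hypothetical stand-ins for Forge’s constraint builders (the real ones are more involved). The macro just rewrites the surface syntax; the runtime procedure does the disambiguation:

;; Hypothetical stand-ins for Forge's constraint builders.
(define (is-constraint? v) (and (pair? v) (eq? (car v) 'formula)))
(define (build_ITE_formula C T F) (list 'formula C T F))
(define (build_ITE_expr    C T F) (list 'expr C T F))

(define (INTERNAL_ITE_CHOOSE C T F)
  (if (is-constraint? T)
      (build_ITE_formula C T F)
      (build_ITE_expr    C T F)))

;; The macro stays thin: in the real tool it would also record
;; syntax-location information before delegating.
(define-syntax (ite stx)
  (syntax-case stx (=> else)
    [(_ C => T else F) #'(INTERNAL_ITE_CHOOSE C T F)]))

;; Nested uses now grow linearly, not exponentially:
(define C1 #t) (define C2 #f)
(define X 'x)  (define Y 'y)  (define Z 'z)
(ite C1 => X else (ite C2 => Y else Z))
; => '(expr #t x (expr #f y z))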

We didn’t invent this idea ourselves! “Don’t use a macro where a procedure will do” is one of the standard pieces of advice given to new Racket-macro authors. But splitting the work this way can also complicate the mental model of what’s happening to code, and of whether it happens at macro-expansion time or at runtime. That’s what happened to me with this issue and the macro-generating-macro pred.

Simplify, Simplify, Simplify

I picked up this task several times. Each time, I fixated on trying to get Racket to keep track of a stack at macro-expansion time. I spent ages in the documentation, wrestling with phase levels and parameters and so on, but never managed to get cycle detection to work properly.

If only I had just stepped back and thought about the problem, not the walls of the hole I had dug myself into. I was so fixated on the macros: I knew that they were very complicated, so I probably didn’t understand them completely. And macros were the context where my cycle-detection fix wasn’t working, so that must have been the right place to focus my effort. (It wasn’t.)

This time, I did step back. I decided to deploy the above runtime-procedure trick even where it didn’t seem necessary, eliminating infinite macro expansion as a possibility and deferring cycle checking to those new runtime procedures. At runtime, setting up a tracking stack via Racket’s parameterize was a simple matter. Ten minutes later, my test cases were all throwing the proper errors.
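The runtime check itself really is that small. Here’s a sketch with hypothetical names (not Forge’s actual code): a parameter holds the stack of predicates currently being substituted, and re-entering one that’s already on the stack signals a cycle.

;; The stack of predicates currently being expanded.
(define substitution-stack (make-parameter '()))

(define (with-cycle-check name thunk)
  (when (member name (substitution-stack))
    (error 'forge "cyclic reference detected: ~a"
           (reverse (cons name (substitution-stack)))))
  (parameterize ([substitution-stack (cons name (substitution-stack))])
    (thunk)))

;; Two mutually recursive predicate expansions:
(define (expand-a) (with-cycle-check 'a (lambda () (expand-b))))
(define (expand-b) (with-cycle-check 'b (lambda () (expand-a))))

;; (expand-a)  ; => forge: cyclic reference detected: (a b a)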

Takeaway

This is a good reminder to force myself to alternate between high-level and low-level when I’m debugging; to beware of the sunk cost fallacy and availability bias; to not follow an unproductive path for too long without exploring alternatives; to treat the task like a science problem and make hypotheses, eliminate complexity, and check to see if I’m right. I should work to really follow the recipe I give in class, rather than only saying the words.

When I was applying to PhD programs, I read this advice from Ronald Azuma. There’s a part of it that has stuck with me ever since, about how he felt right after graduating:

I was sorely disappointed that the heavens didn’t part, with trumpet-playing angels descending to announce this monumental occasion. Upon hearing this observation, Dr. Fred Brooks (one of my committee members) commented, “And the sad fact is, you’re no smarter today than you were yesterday.” “That’s true,” I replied, “but the important thing is that I am smarter than I was six and a half years ago.”

Our brains do a lot of work in the background, and often that work happens after much repetition and failure. I’m not smarter than I was yesterday, but maybe I’m smarter than I was a year ago.

Thanks to my Forge co-author Ben Greenman for teaching me the expand-to-a-procedure technique.

"There is hope in honest error: none in the icy perfections of the mere stylist."

- Charles Rennie Mackintosh