# Resolution Proofs

This document contains a two-class sequence on resolution proofs and how they relate to boolean solvers. **This material will be directly useful in your 2nd SAT homework.**

## Context: Proofs vs. Instances

Almost all of our work in 1710 so far has focused on *satisfiability*: given constraints, how can they be satisfied? Our conversations have a character that puts instances first---how are they related, how can they be changed, how can a partial instance be completed, etc. In the field of logic, this is called a *model-theoretic* view.

But there's a second perspective, one that focuses on necessity, deduction, and contradiction---on justifying unsatisfiability with *proof*. Today we'll start exploring the proof-theoretic side of 1710. But we'll do so in a way that's immediately applicable to what we already know. In particular, by the time we're done with this week, you'll understand a basic version of how a modern SAT solver can return a *proof* of unsatisfiability. This proof can be processed to produce cores like those Forge exposes via experimental `solver`

, `core_minimization`

, etc. options.

We'll start simple, from CNF and unit propagation, and move on from there.

## A Chain Rule For CNF

Suppose I know two things:

- it's raining today; and
- if it's raining today, we can't hold class outside.

I might write this more mathematically as the set of known facts: ${r,r⟹¬c}$, where $r$ means "rain" and $c$ means "class outside".

Because the vast majority of the materials you might find on this topic are written using mathematical notation, I'm using $⟹$ for `implies`

and $¬$ for `not`

. If you go reading elsewhere, you might also see $∧$ for `and`

and $∨$ for `or`

.

Given this knowledge base, can we infer anything new? Yes! We know that if it's raining, we can't hold class outside. But we know it's raining, and therefore we can conclude class needs to be indoors. This intuition is embodied formally as a logical *rule of inference* called *modus ponens*:

$BA,A⟹B $

The horizontal bar in this notation divides the inputs to the rule from the outputs. For *any* $A$ and $B$, if we know $A$, and we know $A⟹B$, we can use modus ponens to deduce $B$.

Modus ponens is very closely related to unit propagation. Indeed, that connection is what we're going to leverage to get our solvers to output proofs.

I like to think of rules of inference as little enzymes that operate on formula syntax. Modus ponens recognizes a specific pattern of syntax in our knowledge base and *rewrites* that pattern into something new. We can check rules like this for validity using a truth table:

$A$ | $B$ | $A⟹B$ |
---|---|---|

0 | 0 | 1 |

0 | 1 | 1 |

1 | 0 | 0 |

1 | 1 | 1 |

In any world where both $A$ and $A⟹B$ are true, $B$ must be true.

### Beyond Modus Ponens

Suppose we don't have something as straightforward as ${r,r⟹¬c}$ to work with. Maybe we only have:

- if it's raining today, we can't hold class outside; and
- if Tim is carrying an umbrella, then it's raining today.

That is, we have a pair of implications: ${u⟹r,r⟹¬c}$. We cannot conclude that it's raining from this knowledge base, but we can still conclude something: that *if* Tim is carrying an umbrella, *then* we can't hold class outside. We've learned something new, but it remains contingent: $u⟹¬c$.

This generalization of modus ponens lets us chain together implications to generate new ones:

$A⟹CA⟹B,B⟹C $

Like before, we can check it with a truth table. This time, there are 8 rows because there are 3 inputs to the rule:

$A$ | $B$ | $C$ | $A⟹B$ | $B⟹C$ | $A⟹C$ |
---|---|---|---|---|---|

0 | 0 | 0 | 1 | 1 | 1 |

0 | 0 | 1 | 1 | 1 | 1 |

0 | 1 | 0 | 1 | 0 | 1 |

0 | 1 | 1 | 1 | 1 | 1 |

1 | 0 | 0 | 0 | 1 | 0 |

1 | 0 | 1 | 0 | 1 | 1 |

1 | 1 | 0 | 1 | 0 | 0 |

1 | 1 | 1 | 1 | 1 | 1 |

**Note:** This rule works no matter what form $A$, $B$, and $C$ take, but for our purposes we'll think of them as literals (i.e., variables or their negation).

## Propositional Resolution

The *resolution rule* is a further generalization of what we just discovered. Here's the idea: because we can view an "or" as an implication, we should be able to apply this idea of chaining implications to *clauses*.

First, let's agree on how to phrase clauses of more than 2 elements as implications. Suppose we have a clause $(l_{1}∨l_{2}∨l_{3})$. Recall that:

- a clause is a big "or" of literals;
- a literal is either a variable or its negation; and
- $∨$ is just another way of writing "or".

We might write $(l_{1}∨l_{2}∨l_{3})$ as an implication in a number of ways, e.g.:

- $(l_{1}∨l_{2}∨l_{3})≡(¬l_{1}⟹(l_{2}∨l_{3}))$
- $(l_{1}∨l_{2}∨l_{3})≡(¬l_{2}⟹(l_{1}∨l_{3}))$
- $(l_{1}∨l_{2}∨l_{3})≡(¬l_{3}⟹(l_{1}∨l_{2}))$
- $(l_{1}∨l_{2}∨l_{3})≡((¬l_{1}∧¬l_{2})⟹l_{3})$
- $(l_{1}∨l_{2}∨l_{3})≡((¬l_{1}∧¬l_{3})⟹l_{2})$
- $(l_{1}∨l_{2}∨l_{3})≡((¬l_{2}∧¬l_{3})⟹l_{1})$

So if we have a large clause, there may be more ways of phrasing it as an implication than we'd want to write down. Instead, let's make this new rule something that works on clauses directly.

How would we recognize that two clauses can be combined like the above? Well, if we see something like these two clauses:

- $(l_{1}∨l_{2})$; and
- $(¬l_{1}∨l_{3})$ then, if we wanted to, we could rewrite them as:
- $(¬l_{2}⟹l_{1})$; and
- $(l_{1}⟹l_{3})$ and then apply the rule above to get:
- $(¬l_{2}⟹l_{3})$. We could then rewrite the implication back into a clause:
- $(l_{2}∨l_{3})$.

Notice what just happened. The two opposite literals have cancelled out, leaving us with a new clause containing *everything else* that was in the two original clauses.

$(A∨C)(A∨B),(¬B∨C) $

This is called the *binary propositional resolution rule*. It generalizes to something like this (where I've labeled literals in the two clauses with a superscript to tell them apart):

$(l_{2}∨l_{n}∨l_{1}∨...∨l_{m})(l_{1}∨l_{2}∨...∨l_{n}),(¬l_{1}∨l_{1}∨...∨l_{m}) $

This rule is a very powerful one. In particular, since unit propagation is a basic version of resolution (**think about why!**) we'll be able to use this idea in our SAT solvers to *prove* why an input CNF is unsatisfiable.

### Resolution Proofs

What is a proof? For our purposes today, it's a tree where:

- each leaf is a clause in some input CNF; and
- each internal node is an application of the resolution rule to two other nodes.

Here's an example resolution proof that shows the combination of a specific 4 clauses is contradictory:

This tree is not a paragraph of text, and it isn't even a picture written on a sheet of paper. It is a *data structure*, a computational object, which we can process and manipulate in a program.

### Soundness and Completeness

Resolution is a *sound* proof system. Any correct resolution proof tree shows that its root follows unavoidably from its leaves.

Resolution is also *refutation complete*. For any unsatisfiable CNF, there exists a resolution proof tree whose leaves are taken from that CNF and whose root is the empty clause.

Resolution is **not** complete in general; it can't be used to derive *every* consequence of the input. There may be other consequences of the input CNF that resolution can't produce. For a trivial example of this, notice that we cannot use resolution to produce a tautology like $(l_{1}∨¬l_{1}$ from an empty input, even though a tautology is always true.

### Getting Some Practice

Here's a CNF:

```
(-1, 2, 3)
(1)
(-2, 4)
(-3, 5)
(-4, -2)
(-5, -3)
```

Can you prove that there's a contradiction here?

## Prove, then click!

Let's just start applying the rule and generating everything we can...

Wow, this is a lot of work! Notice two things:

- we're going to end up with the same kind of 4-clause contradiction pattern as in the prior example;
- it would be nice to have a way to guide generation of the proof, rather than just generating
*every clause we can*. An early form of DPLL did just that, but the full algorithm added the branching and backtracking. So, maybe there's a way to use the structure of DPLL to guide proof generation...

Notice that one of the resolution steps you used was, effectively, a unit propagation step. This should help motivate the idea that unit propagation (into a clause were the unit is negated) is a special case of resolution:

$(...)(A),(¬A∨...) $

You might wonder: what about the other aspect of unit propagation---the removal of clauses entirely when they're subsumed by others?

## Think, then click!

This is a fair question! Resolution doesn't account for subsumption because proof is free to disregard clauses it doesn't need to use. So, while subsumption will be used in any solver, it's an optimization.

Don't try to "run resolution on multiple variables at once". To see why not, try resolving the following two clauses:

```
(1, 2, 3)
(-1, -2, 4)
```

It might initially appear that these are a good candidate for resolution. However, notice what happens if we try to resolve on both `1`

and `2`

at the same time. We would get:

```
(3, 4)
```

which is *not* a consequence of the input! In fact, we've mis-applied the resolution rule. The rule says that if we have $(A∨B)$ and we have $(¬A∨C)$ we can deduce $(B∨C)$. These letters can correspond to any formula---but they have to match! If we pick (say) $1∨2$ for $A$, then we need a clause that contains $¬A$, which is $¬(1∨2)$, but that's not possible to see in clause; a clause has to be a big "or" of literals only. So we simply cannot run resolution in this way.

Following the rule correctly, if we only resolve on one variable, we'd get something like this:

```
(2, -2, 3, 4)
```

which is always true, and thus usless in our proof, but at least not unsound. So remember:

- always resolve on
*one variable at a time*; and - if resolution produces a result that's tautologous, you can just ignore it.

## Learning From Conflicts

Let's return to that CNF from before:

```
(-1, 2, 3)
(1)
(-2, 4)
(-3, 5)
(-4, -2)
(-5, -3)
```

Instead of trying to build a *proof*, let's look at what your DPLL implementations might do when given this input. I'm going to try to sketch that here. Your own implementation may be slightly different. (That doesn't necessarily make it wrong!) If you've started the SAT1 assignment, then open up your implementation as you read, and follow along. If not, note down this example.

- Called on:
`[(-1, 2, 3), (1), (-2, 4), (-3, 5), (-4, -2), (-5, -3)]`

- Unit-propagate
`(1)`

into`(-1, 2, 3)`

to get`(2, 3)`

- There's no more unit-propagation to do, so we need to branch. We know the value of
`1`

, so let's branch on`2`

and try`True`

first. - Called on:
`[(2), (2, 3), (1), (-2, 4), (-3, 5), (-4, -2), (-5, -3)]`

- Unit-propagate
`(2)`

into`(-2, 4)`

to get`(4)`

. - Remove
`(2, 3)`

, as it is subsumed by`(2)`

. - Unit-propagate
`(4)`

into`(-4, -2)`

to get`(-2)`

. - Remove
`(-2, 4)`

, as it is subsumed by`(4)`

. - Unit-propagate
`(-2)`

into`(2)`

to get the empty clause.

Upon deriving the empty clause, we've found a contradiction. *Some part* of the assumptions we've made so far (here, only that `2`

is `True`

) contradicts the input CNF.

If we wanted to, we could learn a new clause that applies `or`

to ("disjoins") all the assumptions made to reach this point. But there might be many assumptions in general, so it would be good to do some sort of fast blame analysis: learning a new clause with 5 literals is a lot better than learning a new clause with 20 literals!

Here's the idea: we're going to use the unit-propagation steps we recorded to derive a resolution proof that the input CNF *plus any current assumptions* lead to the empty clause. We'll then reduce that proof into a "conflict clause". This is one of the key ideas behind a modern improvement to DPLL: CDCL, or Conflict Driven Clause Learning. We won't talk about all the tricks that CDCL uses here, nor will you have to implement them. If you're curious for more, consider shopping CSCI 2951-O. For now, it suffices to be aware that **reasoning about why a conflict has been reached can be useful for performance.**

In the above case, what did we actually use to derive the empty clause? Let's work *backwards*. We'll try to produce a linear proof where the leaves are input clauses or assumptions, and the internal nodes are unit-propagation steps (remember that these are just a restricted kind of resolution). We ended with:

- Unit-propagate
`(-2)`

into`(2)`

to get the empty clause.

The `(2)`

was an assumption. The `(-2)`

was derived:

- Unit-propagate
`(4)`

into`(-4, -2)`

to get`(-2)`

.

The `(-4, -2)`

was an input clause. The `(4)`

was derived:

- Unit-propagate
`(2)`

into`(-2, 4)`

to get`(4)`

.

The `(-2, 4)`

was an input clause. The `(2)`

was an assumption.

Now we're done; we have a proof:

Using only those two input clauses, we know that assuming `(2)`

won't be productive, and (because we have a proof) we can explain why. And, crucially, because the proof is a data structure, we can manipulate it if we need to.

## Explaining Unsatisfiable Results

This is promising: we have a *piece* of the overall proof of unsatisfiability that we want. But we can't use it alone as a proof that the *input* is unsatisfiable: the proof currently has an assumption `(2)`

in it. We'd like to convert this proof into one that derives `(-2)`

---the negation of the assumption responsible---from *just* the input.

Let's rewrite the (contingent, because it relies on an assumption the solver tried) proof we generated before. We'll *remove* assumptions from the tree and recompute the result of every resolution step, resulting in a proof of something weaker that isn't contingent on any assumptions. To do this, we'll recursively walk the tree, treating inputs as the base case and resolution steps as the recursive case. In the end, we should get something like this:

Notice that we need to re-run resolution *after processing each node's children* to produce the new result for that node. This suggests some of the structure we'll need:

- If one child is an assiumption, then "promote" the other child and use that value, without re-running resolution. (
**Think: Why is this safe to this? It has to do with the way DPLL makes guesses.**) - Otherwise, recur on children first, then re-run resolution on the new child nodes, then return a new node with the new value.

Break down these operations into small helper functions, and write test cases for each of them. Really! It's easy for something to go wrong somewhere in the pipeline, and if your visibility into behavior is only at the top level of DPLL, you'll find it *much* harder to debug issues.

Remember that you can use PBT on these helpers as well. The assignment doesn't strictly require it, but it can be quite helpful. Use the testing tools that are available to you; they'll help find bugs during development.

#### Takeaway

This should illustrate **the power of being able to treat proofs as just another data structure**. Resolution proofs are just trees. Because they are trees, we can manipulate them programatically. We just transformed a proof of the empty clause via assumptions into a proof of something else, without assumptions.

This is what you'll do for your homework.

### Combining sub-proofs

Suppose you ran DPLL on the false branch (assuming `(-2)`

) next. Since the overall input is unsatisfiable, you'd get back a proof of `(2)`

from the inputs. And, given a proof tree for `(-2)`

and a proof tree for `(2)`

, how could you combine them to show that the overall CNF is unsatisfiable?

## Think, then click!

Just combine them with a resolution step! If you have a tree rooted in `(2)`

and another tree rooted in `(-2)`

, you'd produce a new resolution step node with those trees as its children, deriving the empty clause.

### Property-Based Testing (the unsatisfiable case)

Given one of these resolution proofs of unsatisfiability for an input CNF, you can now apply PBT to your solver's `False`

results, because they are no longer *just* `False`

.

What properties would you want to hold?

## Think, then click!

For the tree to prove that the input is unsatisfiable, you'd need to check:

- the internal nodes of the tree are valid resolution steps;
- the leaves of the tree are taken only from the input clauses; and
- the root of the tree is the empty clause.