Intro#

While reading on nonstandard models of arithmetic, I recalled a fact that had come up once in a class on logic. While a formal language with many connectives is nice to work in, it becomes unwieldy to work on, requiring more inductive cases to be covered in the process of proving whatever fact you’re interested on. Because of this, we chose to work on a very simple language with only the connectives for not (\(\neg\)) and implies (\(\rightarrow\)).

I started wondering what was actually going on here: why is it that when we start composing some sets of functions, they express more than others? To phrase it in another way: how many types of electrical logic gates do we need to make a computer? Since it felt accessible enough I took it up as a small self study project and a chance to work on my Python.

Universal Sets & Sheffer Functions#

Definitions and Examples#

A Boolean function is a function \(f: \{0,1\}^n \to \{0,1\}\) from the Boolean cube \(\mathbb{B}^n := \{0,1\}^n\) to the two element set \(\{0,1\}\). It’s called the Boolean cube since it looks like the n-dimensional version of one if you define a partial order on its elements like so:

\[(x_1,...x_n) \leq (y_1,...,y_n) \mbox{ if } x_i \leq y_i \mbox{ for some }i \]

Fig 1: Hasse diagram of the Boolean cube \(\mathbb{B}^3\)

Fig 2: The closely related Boullion cube

While there are many interpretations we can give to a Boolean function, the way they were introduced to me was with logical connectives. If I have formulas X and Y I can combine them together into a new statement using these connectives. Based on the truth values of the component formulas, I can also figure out the truth value of the new, bigger statement that has been made.

Since we’re working with a finite domain and range, logical connectives are usually defined using a truth table which is just an explicit listing of a function’s values on every input in its domain. Later on though we will come across arithemtical representations of Boolean functions (which will be polynomials over \(\mathbb{R}\) or \(\mathbb{Z}/2\mathbb{Z}\)). Some common logical connectives are:

  • Constants (\(\mathbb{0}\),\(\mathbb{1}\)): These functions return the same value no matter what input.

  • Projections (\(\pi^{n}_{j}\)): These functions give back the value of one input, and ignore the others. In general we’ll call \(\pi^{n}_{j}\) the projection that outputs the \(j\)th of a total of \(n\) variables. When it’s clear from the context what the arity of the projection is, we’ll omit the superscript and write \(\pi_{j}\).

  • AND (\(\wedge\)): This function returns 1 when both of its inputs are 1.

  • OR (\(\vee\)): This function is 1 when at least one of its inputs is 1.

  • IF (\(\rightarrow\)): This function is only false when the first argument is true, but the second is false. Think of it as saying “X implies Y”.

  • IFF (\(\leftrightarrow\)): Returns 1 when both inputs have the same value.

  • XOR (\(\oplus\)): Returns 1 when exactly one input is 1.

Here are the truth tables for these functions.

\( x \) \(y\) \(\mathbb{0}\) \(\mathbb{1}\) \(\pi_{0}(x,y)\) \(\pi_{1}(x,y)\)
1 1 0 1 1 0
1 0 0 1 1 1
0 1 0 1 0 0
0 0 0 1 0 1
\( x \) \(y\) \(x\wedge y\) \(x\vee y\) \(x\rightarrow y\) \(x\leftrightarrow y\) \(x\oplus y\)
1 1 1 1 1 1 0
1 0 0 1 0 0 1
0 1 0 1 1 0 1
0 0 0 0 1 1 0

Note that there are \(2^{2^n}\) boolean functions from \(\mathbb{B}^n\) to \(\{0,1\}\).

Throughout the rest of this, we’re going to use whatever interpretation suits us best based on the context, and will (for now) follow the convention that True is equivalent to 1, and False to 0. We’ll also use the symbol \(\Leftrightarrow\) as a meta-logical symbol to say that two formulas are logically equivalent.

Disjunctive Normal Form#

It turns out that \(\wedge,\vee, \) and \(\neg\) are all that we need to create any Boolean function, and the truth table representation is the key to recognizing this:

  • Each row in the truth table is a possible evaluation of the function.
  • For any evaluation of the function, it corresponds to exactly one of the rows in the truth table. Meaning that no matter in what scenario we find ourselves, its either described by row 1, or row 2, or … etc.
  • Within each row, we are saying \(x_1\) will take this value and \(x_2\) will take this value andand the function overall will take this value.
  • Finally, since we’re working with just two truth values, this simplifies to saying that either \(x_i\) is true or \(x_i\) is false.

So from the above, it kind of looks like we can represent a row our function as a conjunction of bunch of atoms (terms and their negations), and that we can represent the function as a whole by taking a disjunction over the conjunctions for each row. This motivates the following definitions:

  • An atomic term is just a variable.

  • A literal is a term, or the negation of a term.

  • A formula is in disjunctive normal form if it is a disjunction of one or more conjunctions of one or more literals.

  • A formula is in conjunctive normal form if it is a conjunction of one or more disjunctions of one or more literals.

A more formal proof:

Given some Boolean function \(f\), consider row \(i\) of its truth table. Let \(A_{ij} = x_j\) if \(x_j\) is assigned 1 in row \(j\), or \(\neg x_j\) otherwise. For each row, we’ll construct a formula depending on the value of \(f\) on that row. Let \(\overline{x}\) be a tuple with the valuation corresponding to row \(j\). Then define our new formula \(R_i\) to be

\[R_i(\overline{x}) := \bigwedge_{j=1}^n A_{ij}\ \mbox{ if } f(\overline{x}) = 1\]

\[R_i(\overline{x}) := \bigwedge_{j=1}^n A_{ij} \wedge (p \wedge \neg p) \mbox{ if } f(\overline{x}) = 0 \]

where \(p\) is a new variable appearing nowhere in the \(A_{ij}\)’s.

Finally, let

\[\phi(\overline{x}) := \bigvee_{i=i}^m R_i(\overline{x})\]

Let \(\overline{x}\) be a tuple that corresponds to the evaluation of \(f\) on the \(i\)th row of its truth table: then we have \(R_i(\overline{x}) = f(\overline{x})\). Consider the value of \(f(\overline{x})\):

  • If this value is 1, \(R_i\) specifies the conditions on the variables that would happen in row \(i\).
  • If this value is 0, the dummy clause \(p \wedge \neg p\) ensures that \(R_i\) will be false.

Finally, we know at most one of the \(R_i\)s will be true at a time, since in a truth table we consider each evaluation of the function exactly once. Thus \(\phi(\overline{x})\) is a formula in disjunctive normal form that is true exactly when \(f\) is. \(\blacksquare\)

  1. Technically we don’t need the part where we add in a \(p \wedge \neg p\) clause. The proof goes through without this fine if we exclude \(R_i\)’s corresponding to rows where \(f\) is false from the formula \(\phi\), and this is the typical approach. If we’re concerned with the size of our formula, its more efficient to leave this clause out, and reduce the constant cases to \(\bot\) or \(\top\).

  2. This proof also shows that the set \( \{\neg, \wedge, \oplus\} \) is universal. Since only a single disjunct will be true at a time (if any), xor’ing them all together will still return a statement that is true whenever the original one is.

Conjunctive Normal Form#

It’s not hard to show that we can convert a formula to CNF as well using the following properties:

Distributivity

\[A\wedge(B\vee C) \Leftrightarrow (A\wedge B)\vee (A\wedge C) \]

\[ A\vee (B\wedge C) \Leftrightarrow (A \vee B) \wedge (A \vee C) \]

De Morgan’s Law’s

\[\neg(A\wedge B) \Leftrightarrow \neg A \vee \neg B\]

\[\neg(A\vee B) \Leftrightarrow \neg A \wedge \neg B \]

Double Negative Elimination

\[\neg \neg A \Leftrightarrow A\]

There is also an intuitive way to understand CNF as the logical dual of DNF. Where DNF looks at the rows of the truth table where the function is 1, CNF starts by looking at where the rows are 0 and in effect saying “If we want this function to be true, then all of the following valuations must be false”.

Other Ways to Universality#

Outside of a direct proof that a set of connectives is universal, we can do a proof by reduction: if we show that a set of connectives can be used to make every connective in some universal set, then our original set is also universal.

It’s worth noting here that although the sets \(\{\neg,\vee,\wedge\}\) and \(\{\neg,\oplus,\wedge\}\) are functionally complete they are not minimally so. With De Morgan’s laws we can dispence with either \(\vee\) or \(\wedge\) and show that \(\{\wedge, \neg\}\) and \(\{\vee, \neg\}\) are universal sets. If we allow another type of Boolean function, the constants, it can also be shown that the set \(\{\oplus, \wedge, \mathbb{1} \}\) is functionally complete, since \(a\wedge \mathbb{1} \Leftrightarrow \neg a\). This is a form we’ll come back to called the algebraic normal form, since it will allow us to represent Boolean functions as arithmetic formulas over \(\mathbb{F}_2\).

Classifying Binary Sheffer Functions#

A Boolean function is a Sheffer function if it is universal on its own. Playing around with the functions some, it becomes clear that not every binary function will be able to do this: constants are obviously out, as are projections and their negations. The first hint we have in finding these is the fact that the sets \(\{\wedge, \neg\}\) and \(\{\vee, \neg\}\) are universal: something about the combined behavior of these connectives is “enough” to get us all possible behaviours. So let’s consider their compositions, \(A\uparrow B\) (NAND) and \(A\downarrow B\) (NOR). These are defined by the following truth tables

\( x \) \(y\) \(x \mbox{ } \uparrow y\)
1 1 0
1 0 1
0 1 1
0 0 1
\( x \) \(y\) \(x \mbox{ } \downarrow y\)
1 1 0
1 0 0
0 1 0
0 0 1

It’s not hard to verify the following logical equivalances for \(\uparrow\), showing its universality:

\[\neg(A) \Leftrightarrow A\uparrow A\]

\[ A \wedge B \Leftrightarrow ( A\uparrow B) \uparrow (A\uparrow B) \]

and similarly for \(\downarrow\).

Final Thoughts#

Already we’ve seen some interesting structure in the Boolean binary functions. We’ve seen that there is an “algebra” of how they combine together, and that some connectives are more expressive than others, to the point where certain binary functions can express any other function. This answers the question that started this article: if we were on a desert island full of semiconductors we would only need enough time and one type of logic gate to make a computer to shitpost on.

Next time we’ll take a look at confirming if there are other binary Sheffer functions other than \(\uparrow\) and \(\downarrow\).

Open Questions#

  1. How many other normal forms are there?
  2. If we’re working in a multi-valued logic (or equivalently considering functions from \(\{0,1,...m\}^n \to \{0,1,...m\}\), what sort of normal forms exist?
  3. Do we still have Sheffer functions when we work over more values, or is there something about a two element set that allows this?

Stumbling Blocks#

  1. I’m rusty when it comes to writing proofs.
  2. Intuitively understand what the CNF and DNF mean was tricky. This was one of those things I proved but didn’t get the first time around.