## a canonicity proof via gradient induction

A *canonical form* for a language defines a representative among an equivalence class of terms. It can help identify equivalent terms in the language. Here I present a proof of the canonicity of the *sum-over-product* normal form for arithmetic, to demonstrate an interesting technique that I call *induction over derivatives*. A catchier name I thought of is *gradient induction*.

*untitled (reaper drone), trevor paglen, 2012, ICA/Boston*

First let's clarify what we mean by *canonical form*: when two expressions in a language, considered as programs, evaluate to the same result on all possible inputs, we say they are *semantically equivalent*. We therefore hope to find a "standard way" to write such expressions, so that when we rewrite any two expressions to the standard form, we can immediately tell if they are semantically equivalent by just looking at them. Such a standard form is considered *canonical* – we say that two terms, *e₁* and *e₂*, share the same canonical form *if and only if* they are *semantically equivalent*, where *semantically equivalent* means the terms always compute the same result given the same inputs. Formally:

canonicalize(e₁) ≡ canonicalize(e₂) ⇔ ∀x. eval(e₁, x) = eval(e₂, x)

Here ≡ denotes syntactic equality, and = denotes semantic (value) equality. In our case, the expressions are in the language of arithmetic (+, ×, *x*, ℝ):

Definition. An arithmetic expression is either a variable, a constant, the sum of two expressions, or the product of two expressions.

And our normal form is essentially the standard polynomials:

Definition. The *sum-over-product* normal form of an arithmetic expression is the sum of products of literals, where a literal is either a variable or a constant. Furthermore, we combine monomials that only differ in their coefficients, e.g. *2xy + 3xy* is rewritten into *5xy*.

One can rewrite an expression to SoP with the standard laws (associativity, commutativity & distributivity). That is, we keep pulling + up and merging terms. For example, the SoP canonical form of *(x + z)(x + y)* is *x² + xy + xz + yz*.

Proposition. The *sum-over-product normal form* is canonical: C_sop(e₁) ≡ C_sop(e₂) ⇔ ∀x. eval(e₁, x) = eval(e₂, x)

*Proof*. The left-to-right direction can be proven by structural induction over the SoP normal form syntax, together with the fact that the standard rewrite rules we use preserve the semantics of arithmetics.

I now prove the contrapositive of the backward direction:

C_sop(e₁) ≢ C_sop(e₂) ⇒ ∃x. eval(e₁, x) ≠ eval(e₂, x)

There are two cases for C_sop(e₁) ≢ C_sop(e₂):

1. *e₁* and *e₂* differ in their constant term (e.g. *e₁ = 2xy + 4* and *e₂ = 3yz + 7*), and
2. otherwise (e.g. *e₁ = 2xy + 4* and *e₂ = 3yz + 4*).

Note that we only look at the lonely constants, not the coefficients in other terms.

*Case 1* is simple: if two expressions have different constant terms, they evaluate to different results (i.e. the constants themselves) on the all-zero input.
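Case 1 can be checked concretely. Assuming the same hypothetical nested-tuple encoding of expressions (my own, not from the post), an evaluator that defaults every variable to zero exposes exactly the constant terms:

```python
# Hypothetical AST: ('var', name), ('const', c), ('add', a, b), ('mul', a, b).
def eval_expr(e, env):
    kind = e[0]
    if kind == 'const':
        return e[1]
    if kind == 'var':
        return env.get(e[1], 0)   # missing variables default to 0
    if kind == 'add':
        return eval_expr(e[1], env) + eval_expr(e[2], env)
    return eval_expr(e[1], env) * eval_expr(e[2], env)

# e1 = 2xy + 4 and e2 = 3yz + 7: on the all-zero input every monomial
# containing a variable vanishes, leaving only the constant terms.
e1 = ('add', ('mul', ('const', 2), ('mul', ('var', 'x'), ('var', 'y'))), ('const', 4))
e2 = ('add', ('mul', ('const', 3), ('mul', ('var', 'y'), ('var', 'z'))), ('const', 7))
assert eval_expr(e1, {}) == 4 and eval_expr(e2, {}) == 7
```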

To prove *case 2*, I break down the goal into two steps:

C_sop(e₁) ≢ C_sop(e₂) ⇒ ∃y. ∂e₁/∂y ≠ ∂e₂/∂y

and

∃y. ∂e₁/∂y ≠ ∂e₂/∂y ⇒ ∃x. eval(e₁, x) ≠ eval(e₂, x)
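This second step can be seen on the running case 2 example: since the derivatives in *y* differ, two points that differ only in *y* suffice to separate the expressions. The dict-of-monomials encoding of SoP forms is my own illustration, not from the post:

```python
from math import prod

def eval_poly(poly, env):
    """Evaluate an SoP form {monomial: coeff} at a point (prod(()) == 1)."""
    return sum(coeff * prod(env[v] for v in mono) for mono, coeff in poly.items())

# e1 = 2xy + 4 and e2 = 3yz + 4 (case 2: same constant term) have
# different derivatives in y. Two points differing only in y tell them apart:
p1 = {('x', 'y'): 2, (): 4}
p2 = {('y', 'z'): 3, (): 4}
x1 = {'x': 1, 'y': 0, 'z': 1}   # e1 = 4, e2 = 4: they still agree here...
x2 = {'x': 1, 'y': 1, 'z': 1}   # ...but e1 = 6, e2 = 7 differ
assert eval_poly(p1, x1) == eval_poly(p2, x1) == 4
assert eval_poly(p1, x2) != eval_poly(p2, x2)
```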

Recall that ≠ is *semantic* inequivalence.

The latter is simple: pick *x₁* and *x₂* that only differ in the *y* variable (from *∂y* above). Since the derivatives differ, we can always find such a pair *x₁* and *x₂* where either eval(e₁, x₁) ≠ eval(e₂, x₁) or eval(e₁, x₂) ≠ eval(e₂, x₂).

To prove C_sop(e₁) ≢ C_sop(e₂) ⇒ ∃y. ∂e₁/∂y ≠ ∂e₂/∂y, we perform induction *over the derivatives* of the expressions, with our original proof goal as the inductive hypothesis: C_sop(e₁) ≡ C_sop(e₂) ⇔ ∀x. eval(e₁, x) = eval(e₂, x). Since C_sop(e₁) ≢ C_sop(e₂), and since in case 2 the forms differ in a monomial that mentions at least one variable, we know ∃y. ∂e₁/∂y ≢ ∂e₂/∂y (syntactically). Since the derivative of a canonical form is also canonical (not that obvious, but you'll see it after thinking a little harder), by our inductive hypothesis, ∃y. ∂e₁/∂y ≠ ∂e₂/∂y (semantically).

The preceding induction is sound because taking the derivative makes any expression *simpler*, eventually bringing it to a constant. □
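Both facts the induction relies on – that the derivative of a canonical form is again canonical, and that it is strictly simpler – can be seen in a sketch of ∂/∂y on the dict-of-monomials representation of SoP forms (my own hypothetical encoding, not from the post):

```python
def ddy(poly, y):
    """Partial derivative of an SoP form {monomial: coeff}, where a monomial
    is a sorted tuple of variable names (repeated for powers)."""
    out = {}
    for mono, coeff in poly.items():
        k = mono.count(y)        # power of y in this monomial
        if k == 0:
            continue             # constant w.r.t. y: derivative vanishes
        rest = list(mono)
        rest.remove(y)           # power rule: d/dy (c * y^k * m) = c*k * y^(k-1) * m
        m = tuple(rest)
        out[m] = out.get(m, 0) + coeff * k
    return out

# d/dy (2xy + 4) = 2x and d/dy (3yz + 4) = 3z: syntactically different,
# even though the constant terms of the originals agreed (case 2).
assert ddy({('x', 'y'): 2, (): 4}, 'y') == {('x',): 2}
assert ddy({('y', 'z'): 3, (): 4}, 'y') == {('z',): 3}
```

Note that the output is again a dict of sorted monomials (so it is canonical), and each monomial loses one factor, so repeated differentiation terminates in a constant – which is what makes the induction well-founded.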

The main takeaway here is that, when we want to perform an inductive proof, we need every inductive step to make the terms *simpler*. Usually this is done by structural induction over the language definition; but when the language is differentiable, the derivative is another tool for inductively simplifying the terms. This can come in handy for the PL researcher, since we now know a wide range of languages are differentiable – not just polynomials!

*p.s.* Haotian Jiang & Sorawee Porncharoenwase pointed out a much simpler proof: given two semantically equivalent arithmetic expressions, their difference is always zero, so the expression that represents the difference has infinitely many roots. By the fundamental theorem of algebra, the two expressions must be the same polynomial, since otherwise their difference would be a non-zero polynomial, which has only finitely many roots.
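The forward direction of that argument is easy to see on the dict-of-monomials encoding of SoP forms (my own illustration, not from the post): subtracting two identical canonical forms leaves the zero polynomial, represented here as the empty dict.

```python
def poly_sub(p1, p2):
    """Difference of two SoP forms {monomial: coeff}; zero coefficients
    are dropped, so the zero polynomial is the empty dict."""
    out = dict(p1)
    for mono, coeff in p2.items():
        out[mono] = out.get(mono, 0) - coeff
    return {m: c for m, c in out.items() if c != 0}

# x(x + y) and x*x + x*y share the canonical form x^2 + xy, so their
# difference is the zero polynomial:
p1 = {('x', 'x'): 1, ('x', 'y'): 1}   # SoP of x(x + y)
p2 = {('x', 'x'): 1, ('x', 'y'): 1}   # SoP of x*x + x*y
assert poly_sub(p1, p2) == {}
```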