Common Lisp code optimisation

Common Lisp is one of the few languages which is both dynamic and also gives you a full native compiler and the ability to declare types so that the optimiser can provide more efficient code.

The following examples are with full optimisation enabled and safety flags off, just to get rid of the explicit type checks in the generated code.

Let's take a small function that simply adds 100 to its argument:

(defun foo (x)
  (+ x 100))

After defining the function (no need to run a separate compiler or anything like that, this is all in the REPL) I can see the generated machine code by calling (disassemble #'foo)

CL-USER> (disassemble #'foo)
; disassembly for FOO
; Size: 18 bytes. Origin: #x52BF4B06                          ; FOO
; 06:       BFC8000000       MOV EDI, 200
; 0B:       FF142500010052   CALL QWORD PTR [#x52000100]      ; GENERIC-+
; 12:       488BE5           MOV RSP, RBP
; 15:       F8               CLC
; 16:       5D               POP RBP
; 17:       C3               RET

OK, that's a bit much, and notice how it calls a library function to perform the actual addition. This is because Common Lisp supports things like rational numbers and bignums, and since there are no declarations the compiler have to assume that the argument can be any type of number.

Let's help the compiler a bit by telling it that the argument is in fact a FIXNUM (Lisp terminology for an integer that fits in a register):

(defun foo (x)
  (declare (type fixnum x))
  (+ x 100))

This is what the disassembly looks like now:

CL-USER> (disassemble #'foo)
; disassembly for FOO
; Size: 25 bytes. Origin: #x52BBBDC9                          ; FOO
; C9:       4883C264         ADD RDX, 100
; CD:       48D1E2           SHL RDX, 1
; D0:       710A             JNO L0
; D2:       48D1DA           RCR RDX, 1
; D5:       FF142560010052   CALL QWORD PTR [#x52000160]      ; ALLOC-SIGNED-BIGNUM-IN-RDX
; DC: L0:   488BE5           MOV RSP, RBP
; DF:       F8               CLC
; E0:       5D               POP RBP
; E1:       C3               RET

Interesting. It now does a regular machine addition but there is more stuff going on here. What really happens is that even though we know that the input will fit in a FIXNUM, we don't know if the result will. After all, the input could have been just right at the maximum value for a FIXNUM. The code checks for an overflow, and if that happens, it allocates a bignum and returns it.

Let's change the declaration to say that the input is simply an integer between 0 and 10000000. This way the compiler can be sure the result will fit in a FIXNUM:

(defun foo (x)
  (declare (type (integer 0 10000000) x))
  (+ x 100))

And now let's disassemble the code:

CL-USER> (disassemble #'foo)
; disassembly for FOO
; Size: 13 bytes. Origin: #x52BFC186                          ; FOO
; 86:       4881C2C8000000   ADD RDX, 200
; 8D:       488BE5           MOV RSP, RBP
; 90:       F8               CLC
; 91:       5D               POP RBP
; 92:       C3               RET

Nice, now we can see that it simply performs a simple ADD instruction and returns the result.

For completeness sake, I want to explain two things that may be confusing:

First of all, why is it adding 200 and not 100? This is because SBCL (which I am using for these examples) use the lowest bits of a register for type tags. If the least significant bit is 0, then the rest of the bits represents a FIXNUM. Since the value to add is a constant, the compiler simply adds 200 instead of emitting instructions to do bit shifting.

The second question may be why there is a CLC instruction there. This is because Common Lisp supports multiple return values (a function can return any number of arguments, although returning a single one is the typical case). If the carry flag is cleared after a function call, that means that the function returned a single value, which is stored in RDX. If the carry flag is set, then that means there are multiple return values. When returning multiple values, the primary return value is still in RDX which means that a caller who doesn't care about the other values can simply ignore the carry flag. It's pretty clever if you ask me.

What's really cool about all of this is that I can do these things without ever restarting my Lisp runtime. I get the benefit of a high level language where I can work with the code without having to restart anything, but I can still take advantage of a native compiler and really good performance.

There are a lot of declarations one can apply. Not just type declarations, but several others that can be used to control the compiler. All the details can be found in the reference: http://clhs.lisp.se/Body/s_declar.htm#declare

I can be reached on the Fediverse @loke@functional.cafe