RFC-31

by Darius Kazemi, Jan 31 2019

In 2019 I'm reading one RFC a day in chronological order starting from the very first one. More on this project here. There is a table of contents for all my RFC posts.

Down with long boxes

RFC-31 is titled “Binary Message Forms in Computer Networks”. It's authored by Daniel Bobrow of BB&N and William R. Sutherland of Lincoln Laboratory.

The technical content

This is a short technical paper. Its purpose is to provide a kind of grammar for describing the data content of messages sent over a computer network. The introduction talks about the contemporaneous state of the art of describing network messages:

Most attempts at describing the content of binary messages jump immediately into a consideration of the bit codings to be used. Long, thin rectangles are drawn to represent the binary bit stream; this stream is sliced up into boxes, and tables generally describe the bit options for each box.

This is true; I've seen a lot of these rectangles in ARPANET documents. Here's one example from RFC-13:

a technical drawing of a big long box with labels showing what the different bit fields mean

Anyway, the paper proposes a sort of symbolic logic drawing from set theory to formally describe the contents of these messages, rather than these box diagrams.

This logic works a little bit like you might specify a computer language. I'll give my own example, since their “simple” example is actually a pretty complex hierarchical structure describing graphical data. Instead let's imagine we want to describe a simple image format where we store pixels of different colors to be drawn at different coordinates on a screen.

Basically, we define our simple fields (primitives), then give them “equivalents” which are basically ENUM values, then define our “characterizations” which are more complex fields made of the simple fields. Lastly we list out the size of the simple fields in number of bits.

Here's our basic image format:

   Title: Very simple color image format that only makes sense for this blog post.

     Simple Fields:
        OPT - A control field
        COLOR - Numerical value representing a color
        COORD - Numerical value of single coordinate on one axis
        COUNT - Number of units in message

     Characterizations:
        CPAIR   <- COORD = 2
        POINT   <- CPAIR + COLOR + COLOR + COLOR
        PICTURE  <- '1' OPT + POINT = N + '0' OPT

     Simple Field Sizes:
        OPT   1
        COORD 14
        COLOR 4

So we say a coordinate pair (CPAIR) is two coordinates. A point is a coordinate pair plus 3 color values (for red, green, and blue). A picture is a special 1 control message followed by N points followed by a special 0 control message. We use one bit for OPT because it's just 0 or 1, 14 bits for COORD because we might be displaying pixels all over the place, and 4 bits for color because we only want 15 levels of color for each color primitive.

There are also rules for conditional values — “if the value of field X is 1, then also include this extra bit of data”. Specifically an expression like [V = C1 ⸧ PPAIRS] can be read as “field V is equal to C1 implies that PPAIRS goes here”. That is often read as “implies” in logical notation and is what they choose to use as their conditional.

Analysis

The transcription loses a bunch of information because, somewhat ironically, it seems like the transcription is sticking to ASCII characters. But the machines these documents were written on provided far more than just ASCII characters. As I mentioned above the formulas in the paper included the material implication symbol but were transcribed to > in the official transcription. In my opinion this makes the paper way less clear, as > has a pretty solid semantic meaning of “greater than”, which is not at all what the authors meant.

While inspecting original RFC documents at the Computer History Museum, I noticed that there's a transcription error in the official IETF version of RFC-31. Namely, there is a missing line from a formula on the last page. Here's the page with the line circled by me:

It reads '1110' ⸧ PPAIRS = '14' D] and makes up the second line of an expression where the first line is in the IETF transcript. It seems like it was just left off by accident since it was on a different line. So the full formula in the “Conditional Alternatives” section should in fact read:

SM := W : F1 + CPAIR + [W = C1 ⸧ CPAIR / C2 ⸧ PPAIRS /
'1110' ⸧ PPAIRS = '14' D]

It doesn't actually matter though! Just a little erratum for you.

How to follow this blog

You can subscribe to this blog's RSS feed or if you're on a federated ActivityPub social network like Mastodon or Pleroma you can search for the user “@365-rfcs@write.as” and follow it there.

About me

I'm Darius Kazemi. I'm a Mozilla Fellow and I do a lot of work on the decentralized web with both ActivityPub and the Dat Project.