365 RFCs

Commenting on one RFC a day in honor of the 50th anniversary of the first RFC.

by Darius Kazemi, Mar 4 2019

In 2019 I'm reading one RFC a day in chronological order starting from the very first one. More on this project here. There is a table of contents for all my RFC posts.

Belated

RFC-63 is titled “Belated Network Meeting Report”, authored July 31st, 1970 by Vint Cerf of UCLA.

The technical content

This RFC is indeed belated! It's meeting minutes for a meeting that happened almost 3 months prior on May 8th, 1970 at Lincoln Laboratory. This is the meeting to discuss Lincoln's proposal for a Local Interaction Language that was proposed back in RFC-43.

The document itself is a short summary so I will summarize that summary by saying: they talked about some things and came to some agreements related to the message data type system in RFC-42. My complaint about that RFC was it was a bit unclear and unspecific. This meeting ironed out a few specifics and made it more implementable.

Analysis

There is barely any mention of Local Interaction Language in this document other than to mention that it was discussed. According to RFC-43, learning about LIL was the purpose of this meeting. So now I'm going to put on my highly opinionated and possibly projecting personal experience hat. I have been to technical meetings where the pretext is we are going to discuss some technology that nobody really understands or particularly cares to learn about. You might be open to hearing about the tech but the real reason you go is that having a bunch of your colleagues in the same room is useful. And boy there were lots of east coast and west coast ARPANET implementors (or their representatives) in the room for this meeting. Twenty-seven by my count. The tone of this RFC suggests that people politely listened to the LIL presentation and then there was “considerable discussion” of message data types for the remainder of the meeting.

Further reading

The author notes that the full details of the Local Interaction Language are available in a Lincoln Lab technical report titled “Graphics”, which you can read here. As far as I can tell it was never implemented.

How to follow this blog

You can subscribe to this blog's RSS feed or if you're on a federated ActivityPub social network like Mastodon or Pleroma you can search for the user “@365-rfcs@write.as” and follow it there.

About me

I'm Darius Kazemi. I'm a Mozilla Fellow and I do a lot of work on the decentralized web with both ActivityPub and the Dat Project.

by Darius Kazemi, Mar 3 2019

A revision of another shot at interprocess communication

RFC-62 is titled “A System for Interprocess Communication in a Resource Sharing Computer Network”, authored August 3rd, 1970 by Dave Walden of BBN.

The technical content

This RFC is a revision of RFC-61. It adds:

  • a more “narrative” introduction
  • a conclusion section
  • several new technical questions and possible answers that arose since the previous revision
    • how to match SEND and RECEIVE buffers for size
    • how to maintain unique numbers across a geographically diverse network
    • how to maintain a high bandwidth connection between remote processes

It is also generally edited for clarity and layout.

The paper is overall very similar so I won't go into detail. (Also: I am tired. The last ten RFCs have been very long and very technical.)

Further reading

As I mentioned in my RFC-61 post, these drafts form the basis of a paper that would eventually be published in April 1972 in Communications of the ACM and is fairly heavily cited by other papers. The full paper is hosted by Walden at his web site.

by Darius Kazemi, Mar 2 2019

Another shot at interprocess communication

RFC-61 is titled “A Note on Interprocess Communication in a Resource Sharing Computer Network”, authored July 17th, 1970 by Dave Walden of BBN.

After a long career in computing and management, Walden has taken to researching various aspects of computing history. I have already written here about the history of BBN that he co-edited (I'm partway through and it's great). He contacted me early on in this project to offer valuable contextual information from his point of view as someone involved in the IMP/ARPANET project at BBN, for which I am grateful.

The technical content

This document is a draft of a study Walden is in the process of writing at BBN. The paper tackles “inter-process communication” on a network, which is the same thing that the host-host protocol and NCP combination proposed by Crocker et al had been attempting to solve. Specifically the study is trying to lay out a standard by which all computers on a network can invoke programs and move data around, rather than an ad-hoc system that would be different for every single pair of computers on the network. The ideas herein would never be fully implemented as they are laid out but certainly influenced the development of other such schemes.

The document makes reference to the host-host paper that Carr, Crocker, and Cerf proposed at the 1970 Spring Joint Computer Conference just a few months prior.

Here for the first time in an RFC we get a clear definition of “monitor” and “process” in their operating system context. In full:

The model time-sharing system has two pieces: the monitor and the processes. The monitor performs several functions, including switching control from process to process as appropriate (e.g., when a process has used “enough” time or when an interrupt occurs), managing core and the swapping medium [memory], controlling the passing of control from one process to another (i.e., protection mechanisms), creating processes, caring for sleeping processes, etc.

The processes perform most of the functions normally thought of as being supervisor functions in a time-sharing system (system processes) as well as the normal user functions (user processes). A typical system process is the disc handler or the file system. For efficiency reasons it may be useful to think of system processes as being locked in core.

A process can talk to the monitor and ask it to do the following things:

  • start another process
  • halt or pause the current process
  • send a message to another specific process
  • prepare to receive a message from another specific process
  • send a message to ANY process (so, broadcast, basically)
  • receive a message from ANY process (so, consume any incoming messages)
  • get a “unique” number from the monitor (presumably to uniquely identify messages being sent out)

The actual commands are a little more wonky than this. A SEND is actually a request to another process to “SEND something to me” so it operates logically backwards from how you might assume. But all the above functionality is there even if the grammar is weird.

The paper then lays out how this kind of communication could happen at a single physical computer. The examples are very clear.

The next section generalizes this communication to communications between a local host and a remote host (so, over the network).

I really like this image that Walden paints of what a networked computer might look like in an abstract sense:

Consider first a simple configuration of processes distributed around the points of a star. At each point of the star there is an autonomous time-sharing system. A rather large, smart computer system, called the Network Controller, exists at the center of the star. No processes can run in this center system, but rather it should be thought of as an extension of the monitor of each time-sharing system in the network.

So the computer has many processes on it, and every process talks to a special computer system (called the NCP in other RFCs), and that computer system really ought to be thought of as part of the core operating system itself. This Network Controller acts as a broker between all the processes on the computer and the rest of the network. Walden makes the claim that

if the Network Controller is able to perform the operations SEND, RECEIVE, SEND FROM ANY, RECEIVE ANY, and UNIQUE and that if all of the monitors in all of the time-sharing systems in the network do not perform these operations themselves but rather ask the Network Controller to perform these operations for them, then we have solved the problem of interprocess communication between remote processes. We have no further change to make. [emphasis mine]

Next he says that somewhere on the computer is a table of unmatched SEND and RECEIVE requests (or RECEIVE ANY requests that haven't gotten anything yet). This table of unmatched pairs lets a computer listen for outstanding messages it hasn't yet received.
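To make the table of unmatched requests concrete, here is a minimal Python sketch. The class and method names (`NetworkController`, `send`, `receive`) are my own hypothetical choices, not anything from the RFC, and the sketch ignores the RFC's actual SEND grammar, ports, and buffer sizes. It only shows the idea of parking an unmatched request until its counterpart shows up.

```python
from collections import defaultdict, deque


class NetworkController:
    """Toy broker that pairs SEND and RECEIVE requests (hypothetical API)."""

    def __init__(self):
        # Tables of unmatched requests, keyed by (sender, receiver).
        self.pending_sends = defaultdict(deque)
        self.pending_receives = defaultdict(deque)
        self.delivered = []  # (sender, receiver, message) in delivery order

    def send(self, sender, receiver, message):
        key = (sender, receiver)
        if self.pending_receives[key]:
            # A matching RECEIVE is already waiting: deliver immediately.
            self.pending_receives[key].popleft()
            self.delivered.append((sender, receiver, message))
            return True
        # No match yet: park the SEND in the table.
        self.pending_sends[key].append(message)
        return False

    def receive(self, receiver, sender):
        key = (sender, receiver)
        if self.pending_sends[key]:
            # A parked SEND matches: hand its message over.
            message = self.pending_sends[key].popleft()
            self.delivered.append((sender, receiver, message))
            return message
        # Nothing to receive yet: remember that the receiver is waiting.
        self.pending_receives[key].append(receiver)
        return None


nc = NetworkController()
nc.send("A", "B", "hello")      # no RECEIVE outstanding, so this is parked
message = nc.receive("B", "A")  # matches the parked SEND
```

A real implementation would also need RECEIVE ANY and some way to time out stale entries, both of which this sketch leaves out.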

The last big piece is that the UNIQUE number function must provide numbers that are unique across the network. This is handled by giving out unique blocks of numbers to each computer to parcel out on their end as needed.
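The block scheme for unique numbers could look something like the sketch below. Everything here is an assumption made for illustration: the names `BlockAllocator` and `Host`, the block size, and the idea that a host asks for a fresh block only once its current one runs out. The RFC does not specify any of these details.

```python
class BlockAllocator:
    """Central authority handing out disjoint number ranges (hypothetical)."""

    def __init__(self, block_size=1000):
        self.block_size = block_size
        self.next_start = 0

    def allocate_block(self):
        start = self.next_start
        self.next_start += self.block_size
        return range(start, start + self.block_size)


class Host:
    """A host that parcels out numbers from its current block."""

    def __init__(self, allocator):
        self.allocator = allocator
        self.block = iter(())  # empty until the first request

    def unique(self):
        try:
            return next(self.block)
        except StopIteration:
            # Current block is used up (or never fetched): get a new one.
            self.block = iter(self.allocator.allocate_block())
            return next(self.block)


allocator = BlockAllocator(block_size=2)
ucla, bbn = Host(allocator), Host(allocator)
ids = [ucla.unique(), bbn.unique(), ucla.unique(), ucla.unique()]
```

Because the blocks never overlap, two hosts can hand out numbers independently and only talk to the central allocator when a block runs dry.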

Because this is a BBN paper, and BBN designed and implemented the IMP (recall: basically a proto-router), the IMP is considered of great importance. The IMP is hardly even mentioned in RFCs by the various west coast entities. It's fully “black boxed” to everyone except for the people who invented it. But BBN knows what is in the black box and thus has some things to say about it.

In the scheme put forward by this RFC, in the ARPANET context, the IMP itself, rather than the Host, will run the NCP. This is a new proposal; prior to this RFC the idea was that the Host would be the sole arbiter of interprocess communications. It seems to me that the main reason for the IMP role in this is so that there is some central source of unique numbers that is still “neutral”. That way it's not just some Host machine at UCLA doling out number ranges to the other computers but a more neutral party in the network. The other reason for putting the NCP software on the IMP is that the IMP is known hardware and software. You only need to write the NCP once for the IMP system. Since every host has the same type of IMP installed on site, you no longer have to write a new NCP for every new kind of computer you connect to the network.

Analysis

This paper is long but, in my opinion, well worth reading (or perhaps just read its final edition, linked in the “Further reading” section below). To me, this is by far the clearest elucidation in the RFC series to date of how processes can talk to each other over the network (and there have been many attempts!).

Here is a photo of a few pages of this RFC that I took at the Computer History Museum's warehouse.

Four handsomely typewritten pages fanned out on a desk.

You might not be impressed with what you see here but after combing through dozens and dozens of these documents, there is something uniquely beautiful about any BBN document I come across. For example, note that the headings are written in an entirely different type face from the body of the document (compare the “T” in “A MODEL FOR A TIME-SHARING” with the “T” that begins several body paragraphs). There is italicized and bolded text. You'll have to trust me on this but the physical paper quality is really nice. According to Walden himself in one of our email correspondences, BBN had an illustration department, an editor, and a secretary that did excellent work when it came to corporate document preparation and publication.

Further reading

This RFC is an early draft of a paper that would eventually be published in April 1972 in Communications of the ACM and is fairly heavily cited by other papers. The full paper is hosted by Walden at his web site.

Walden has a short memoir here.

Here is The Nucleus of a Multiprogramming System, the 1970 paper by Per Brinch Hansen cited in the RFC.

by Darius Kazemi, Mar 1 2019

A simpler NCP

RFC-60 is titled “A Simplified NCP Protocol”, authored July 13th, 1970 by Richard Kalin of MIT.

The technical content

This document proposes a simpler Network Control Program than the one laid out in RFC-55. The simplification mostly comes from a central premise: that we can assume that all (or nearly all) communications will be bidirectional. The author points out that simple communication theory dictates that a useful communication will be a conversation between two entities, a feedback loop of sorts.

By assuming bidirectional communication, you essentially cut the complexity of the NCP in half. Much of the paper is proving this intuitive idea out, and provides a set of commands that can be implemented.

Interestingly, the assumption in this RFC is that larger foreign hosts will lie about the amount of storage they have in their buffers. What a host can do (and, I believe, in the eyes of this author pretty much should do) is say “I can get away with claiming X amount of storage because statistically speaking, nobody is ever going to need all of that storage at once.”

An extreme analogy would be having a one-room hotel in a remote location and saying “my hotel can handle up to 365 guests” when really you mean 365 guests per year at a rate of one each day. If your hotel is remote enough that two guests will basically never need a room at the same time, you can make that claim and it'll never be a problem! (As I said, this is an extreme analogy.)
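To put a rough number on "statistically speaking", here is a small sketch of my own. It assumes each promised buffer slot is in use independently with the same probability, which is a simplification I am introducing and not a model from the RFC. The function name and parameters are hypothetical.

```python
from math import comb


def overflow_probability(promised, real, p_active):
    """P(more than `real` of the `promised` slots are in use at once).

    Assumes every promised slot is busy independently with
    probability `p_active` (a binomial model, chosen for illustration).
    """
    return sum(
        comb(promised, k) * p_active**k * (1 - p_active) ** (promised - k)
        for k in range(real + 1, promised + 1)
    )


# A host that promises 20 slots but really has 5, with each slot
# busy 5% of the time under the assumption above.
risk = overflow_probability(promised=20, real=5, p_active=0.05)
```

The lower the chance that any one slot is busy, the more a host can promise beyond what it really has before the risk becomes noticeable.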

Unlike RFC-59, this document touches on the trade-offs between “coding simplicity” and technical performance.

Analysis

I like this author's approach to persuasion. They end the RFC with the following paragraph:

It is the hope of the author that the above protocol presents an attractive alternative to that proposed by RFC 54 and its additions. Although it appears at a late date, it should not be more than a minor jolt to implementation efforts. It is simple enough to be implemented quickly. If adopted, a majority of the present sites could be talking intelligently with one another by the end of the summer.

Basically: I'm sorry I'm late, I think this will be easy to program, and implementing it will get us where we all want to be very soon.

Further reading

A bit of an aside, but two incredibly important publications appeared in 1948: Norbert Wiener's book [Cybernetics](https://en.wikipedia.org/wiki/Cybernetics:_Or_Control_and_Communication_in_the_Animal_and_the_Machine), and Claude Shannon's paper “A Mathematical Theory of Communication”. The ideas in these works completely changed scientific thought. I mention them here because the casual claim in this RFC that “[a]ll communication requires a cyclical flow of information” is only possible in a post-Wiener and post-Shannon world.

Here's a brief biography of Wiener.

by Darius Kazemi, Feb 28 2019

The case for perfect

RFC-59 is another odd one. It's missing the standard RFC cover page and as such I had to poke around for its title, which is available on page 2 and beyond in the header. It's called “Flow Control – Fixed Versus Demand Allocation”. It is authored by Edwin W. Meyer of MIT's Project MAC and dated June 27th, 1970.

This RFC also features an erratum (correction). Errata are a formal aspect of the RFC series. Since an RFC, once submitted, can never be changed in place, instead corrections happen by appending errata, which are short notes that say, for example, “this section had a typo, it should read xyz instead of abc”. Here is the errata entry for this RFC. The errata entry corrects an incorrect reference to a different RFC number.

The technical content

This RFC is a strongly worded disagreement with the official host-host protocol offering by Crocker et al in RFC-54, particularly the section of that RFC on “Flow Control”.

They list a number of disadvantages with the RFC-54 flow control scheme, which I'll excerpt here verbatim:

(i)  chronic underutilization of resources,

(ii) highly restricted bandwidth,

(iii) considerable overhead under normal operation,

(iv) insufficient flexibility under conditions of increasing load,

(v)  it optimizes for the wrong set of conditions, and

(vi) the scheme breaks down because of message length indeterminacy.

They go on to describe these in detail, but the crux of the document is this: the RFC-54 scheme has these six problems, the authors do their best to prove it (and convincingly, I think), and they implore the NWG to use a previously proposed flow control scheme instead.

But even though the technical details of this paper seem correct... well... I'll save it for the analysis section, which is where my opinions come in.

Analysis

Once again this seems like a battle of “good enough” versus “perfect”. In RFC-54, the authors state

[t]his new procedure has demonstrable limitations, but has the advantages that it is more cleanly implementable and will support initial network use.

Specifically, the authors of RFC-54 say “this isn't perfect but we can build it today and it will be fine for now.” The authors of this RFC disagree, opening with what they see as the six major defects of the RFC-54 scheme.

When the authors of this document lay the pros and cons of their scheme side by side with the pros and cons of the RFC-54 scheme, one thing remains curiously absent: ease of implementation, which was very specifically a key advantage laid out by the authors of RFC-54, in fact probably the key advantage.

To my eye, this document is either written in bad faith, or written by authors who are unable to see the social compromises necessary to get things working at all instead of working correctly. Now I might just be fully projecting here and saying this will probably get me some angry emails, but... this does seem like the kind of thing to expect from the Project MAC team, who gave us Multics, where to my knowledge, the abstract idea of technical correctness took precedence over concepts like ease of use.

Further reading

For you documentation and organization nerds, here's the 2008 proposal for how the RFC errata system works today.

Also for Latin nerds: I use “errata entry” instead of “erratum” most of the time because the convention of the RFC Editors is to avoid the Latin singular.

by Darius Kazemi, Feb 27 2019

A big box of paper

RFC-58 is titled “Logical Message Synchronization”, and authored by Thomas P. Skinner of MIT's Project MAC.

The technical content

This document is primarily an attempt to urge the NWG to settle an unsettled matter. The matter in question is that of physical versus logical message boundaries. I'm a bit out of my depth here but I think I understand the basics and will attempt to summarize.

A physical message is a certain number of bits of data. Imagine a piece of standard office printer paper that can only hold so many typewritten characters: that piece of paper is the physical message. The paper may contain dozens of sentences, or it may contain the word “hi”, but one piece of paper is still one piece of paper and is the same size as another piece of paper from the same box.

The logical message is the actual information that is being conveyed. So a large piece of paper (the physical message) may contain a very short message like “hi” (the logical message). Or on the other extreme, a single logical message may span multiple physical messages (imagine an entire novel delivered over multiple pieces of paper).

Continuing the paper metaphor: assume we've put a bunch of paper documents in a cardboard box and mailed them to our friend. It is easy for the friend to see the boundaries of a physical message. The boundary is just the pieces of paper themselves. But where do we draw boundaries of logical messages? In our paper example we'd need to agree on some kind of notice to the reader that you've reached the end of my message. Maybe we type “END OF MESSAGE” at the end of our document, or maybe we number our pages 1 of 3, 2 of 3, 3 of 3. Maybe in case the pages get mixed up in the box with other documents, we need to put a unique ID on our document so that I can tell my 1 of 3 from someone else's 1 of 3. And so on.
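The "1 of 3, 2 of 3" idea plus a unique document ID can be sketched in a few lines of Python. This is only my illustration of the paper metaphor, not a scheme from the RFC, and the function names and tuple layout are hypothetical.

```python
def split_logical(doc_id, text, page_size):
    """Cut one logical message into fixed-size pages (physical messages).

    Each page is labeled (doc_id, page_number, total_pages, chunk).
    """
    chunks = [text[i:i + page_size] for i in range(0, len(text), page_size)] or [""]
    total = len(chunks)
    return [(doc_id, n, total, chunk) for n, chunk in enumerate(chunks, start=1)]


def reassemble(pages):
    """Group shuffled pages by document ID and rebuild complete documents."""
    by_doc = {}
    for doc_id, n, total, chunk in pages:
        doc = by_doc.setdefault(doc_id, {"total": total, "parts": {}})
        doc["parts"][n] = chunk
    complete = {}
    for doc_id, doc in by_doc.items():
        # Only return documents for which every page has arrived.
        if len(doc["parts"]) == doc["total"]:
            complete[doc_id] = "".join(
                doc["parts"][n] for n in range(1, doc["total"] + 1)
            )
    return complete


box = split_logical("novel", "abcdefg", 3) + split_logical("note", "hi", 3)
box.reverse()  # pages get mixed up in the box
documents = reassemble(box)
```

The price of this approach is that every physical message carries a few extra fields, which is exactly the kind of trade-off Skinner wants settled early.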

Ultimately these solutions are rather technical but they actually boil down to the same kinds of solutions you'd use in an office filing system, and I won't belabor them here. The important bit of this document is at the end:

I have not intended to suggest a solution to the problem, but merely bring it to light. If we want to restrict logical messages to begin on physical boundaries we must plan this early in the game. (It probably works out that way in most cases anyway.) Other schemes can be tried later. We must, however, face up to this problem fairly soon.

Mostly, Skinner just wants everyone to agree on a solution as soon as possible.

Analysis

I think Skinner is right and they need to agree ASAP on a solution. I think he is also right when he says

I can think of several solutions to the problem at the moment. None of them seems to be very good.

I don't think there will ever be a good solution to this kind of problem. Just good enough. And that's okay.

Further reading

I've linked it here before, but this is a good background on MIT's Project MAC, which is where the Multics operating system was initially created.

This is Thomas Skinner's website.

by Darius Kazemi, Feb 26 2019

Edge cases

RFC-57 is called “Thoughts and Reflections on NWG/RFC #54” and is authored by Mike Kraley and John Newkirk of Harvard University. It's dated June 19th, 1970.

The technical content

As the title suggests, this document is related to RFC-54, although RFC-48 also plays a big role.

To jog your memory if you've been reading along with this blog, RFC-54 is the first ever submission of a formal specification as an RFC, and the authors of this RFC were co-authors on that document. RFC-54 was an attempt at a truly-really-definitely-official host-host protocol, which is the basic language that computers on the network use to talk to each other. This document is kind of an addendum to that host-host protocol. It contains ideas that arose during the writing of the protocol but were untested or under-discussed, so the authors felt they should leave them out of the official protocol and instead discuss them in this separate document.

First, the authors distinguish between a “real error” and an “overflow condition”. Basically a real error is when there is an actual bug in the software or hardware that causes a problem. The idea there is someone messed up in their implementation and they need to fix it. An “overflow condition” is when, for example, there's a traffic jam on the network and messages can't be sent. Nobody did anything wrong here. There were simply too many people using the ARPANET at one time.

Specifically, in RFC-54 an overflow condition can trigger an error message that looks like a real error! So the programmer who gets the error message might waste a bunch of time looking for bugs in their code when really the problem was just congestion on the network.

They go on to specify a bunch of error types and error metadata that would make error messages far more verbose and useful for debugging.

The other section of the document proposes a solution for when a remote computer crashes. It needs a way to come back online and say, “hey, I had crashed but I'm back now, please don't forget about me.” This is provided through a proposed pair of RESET and RESET REPLY messages: a host broadcasts RESET out to the network when coming back from a crash or manual reset, and the other hosts answer with RESET REPLY.
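Here is a toy Python sketch of that RESET handshake. The class name, the method names, and the choice to have peers drop their old connection state for the rebooted host are all my assumptions for illustration; the RFC's actual message formats are not modeled.

```python
class ResettableHost:
    """Toy host that announces itself after a crash (hypothetical API)."""

    def __init__(self, name):
        self.name = name
        self.connections = set()  # names of hosts we hold connection state for
        self.log = []

    def come_back_online(self, network):
        # After a crash we remember nothing, so start from a clean slate
        # and send RESET to every other host.
        self.connections.clear()
        for peer in network:
            if peer is not self:
                peer.on_reset(self)

    def on_reset(self, sender):
        # Assumption: drop stale state about the rebooted host, then reply.
        self.connections.discard(sender.name)
        sender.on_reset_reply(self)

    def on_reset_reply(self, sender):
        self.log.append(("RESET REPLY", sender.name))


a, b, c = ResettableHost("a"), ResettableHost("b"), ResettableHost("c")
b.connections.add("a")  # b still thinks it has a connection to a
a.come_back_online([a, b, c])
```

After the exchange, every peer has acknowledged the rebooted host and nobody is holding on to connection state from before the crash.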

Analysis

Both of these proposals seem reasonable. If I were reading this RFC in June 1970 I would give it my vote of approval.

I had to laugh a bit at this paragraph:

Overflow does not require serious consideration if it is a significantly rare occurrence. We do not believe this will be the case, and we further believe that its absence will be an unnecessary restriction upon the user.

What an amazing double negative! My translation: “None of this will matter if traffic jams are going to be rare occurrences on the network, but we are pretty certain that there are going to be a ton of traffic jams so please listen to us.”

by Darius Kazemi, Feb 25 2019

A bit of a hit-and-run

RFC-56 is by Ed Belove and Dave Black of Harvard University, and Bob Flegel and Lamar G. Farquar of University of Utah. It's dated June 1970 (no more specific date than that). It's titled “Third Level Protocol: Logger Protocol”.

The technical content

This is outlining something called a “logger protocol” but it doesn't seem to have anything to do with “logging” in the modern sense of keeping automated statistics and notes on what is happening with a computer program.

Instead this is a way “to allow a user teletype to communicate with a foreign monitor”. I assume this is using “monitor” in its operating systems sense from this era: a monitor is a piece of software deep in the core of an operating system that keeps an eye out (“monitors”) for what programs are doing.

It proposes, among other things:

  • the network should adopt USASCII for communication for all remote connections
  • the previously-specified interrupt (<INT>) command should be used as a “panic button” for breaking out of programs (what you'd use ctrl-c for in many modern consoles)
    • every local host (or local teletype?) would choose its own seldom-used character that would then be mapped to the <INT>
    • due to an edge case involving special character collisions, you'd have to custom map characters on a foreign-host by foreign-host basis
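The "panic button" mapping in the list above might be sketched like this. The `INT` token and the function name are stand-ins I made up, and the sketch skips the character-collision edge case entirely.

```python
INT = "<INT>"  # stand-in token for the interrupt command


def encode_keystrokes(keys, local_break_char):
    """Replace the locally chosen break character with the <INT> token."""
    return [INT if key == local_break_char else key for key in keys]


# Assume this teletype picked "~" as its seldom-used break character.
outgoing = encode_keystrokes("ls~", "~")
```

The collision problem shows up as soon as a foreign host actually needs "~" as ordinary input, which is why the mapping would have to be customized per foreign host.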

Analysis

This is an odd RFC. It seems out of touch with the rest of the series in tone, style, jargon, and also content. Broadly, and at risk of sounding harsh, this RFC seems entirely irrelevant to the discourse in the NWG at the moment of publication. What this RFC is proposing is adding an extra layer of complexity on top of the Host-Host protocol in order to accomplish precisely what, to my reading, the Host-Host protocol was designed for. The only thing it really would do in addition to the base protocol is make the whole thing more restrictive in terms of local implementations, and would in fact require local implementations to know ahead of time certain information about their remote host (such as what its 'break' character is). This may have made some amount of sense a year prior when there was no consensus, but this is coming right after 3 months of painstaking consensus-building.

The fact that these authors would never author another RFC does seem to corroborate that this was a bit of a hit-and-run.

(Another David Black would co-author RFC 2474 in 1998 but it seems from his online resume that he is not the same David Black who was at Harvard in 1970.)

by Darius Kazemi, Feb 24 2019

Squishy amoebas

RFC-55 is by Newkirk, Kraley, Postel, and Crocker. This is the same set of authors as RFC-54 but in reverse order; I assume they are using academic convention and as such it's the Harvard group that is the primary author of this RFC. But maybe not! I could easily imagine the group just agreeing to swap author positions between the two RFCs for the sake of equity. It's dated June 19th, 1970 and titled “A Prototypical Implementation of the NCP”.

The technical content

This document is an example implementation of a Network Control Program. Basically it's the program on the HOST machine that negotiates between the network and the operating system.

This document is emphatically NOT a specification.

There is, of course, absolutely no requirement to implement anything which is contained in this document. The only rigid rules which an NCP must conform to are stated in NWG/RFC#54. This description is intended only as an example, not as a model.

This is because the authors can't know what the internals of every single computer that attempts to connect to ARPANET will look like. But they feel it is a good idea to at least provide an example that people could look to. (And they are right.)

They define a set of basic assumptions about the computer they are describing in this paper. This theoretical computer is a time-shared computer with multiple users. Individual programs are run by specific users. Programs have their own internal “ports” for data to flow in and out (for example, a port might connect a program to a file in memory so it can write to that file). A port can have a socket attached to it; this socket is the port's interface to the network.

Next they define a set of pseudocode-style commands for the NCP: things like CONNECT my socket to a foreign socket, TRANSMIT data over the connection, and CLOSE the connection. These are defined in a fairly detailed way, with arguments to the commands defined and various status codes enumerated.
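To make the shape of those commands concrete, here is a minimal Python sketch of my own. It is not taken from the RFC: the class name ToyNCP, the method signatures, and the three status codes are all hypothetical stand-ins, and nothing actually goes over a network.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    # Hypothetical status codes; the RFC enumerates its own set.
    OK = "ok"
    NOT_CONNECTED = "not connected"
    ALREADY_CONNECTED = "already connected"


@dataclass
class ToyNCP:
    # Maps a local socket number to the foreign socket it is connected to.
    connections: dict = field(default_factory=dict)
    # Record of what was "sent": (local socket, foreign socket, payload).
    sent: list = field(default_factory=list)

    def connect(self, my_socket: int, foreign_socket: int) -> Status:
        if my_socket in self.connections:
            return Status.ALREADY_CONNECTED
        self.connections[my_socket] = foreign_socket
        return Status.OK

    def transmit(self, my_socket: int, data: bytes) -> Status:
        if my_socket not in self.connections:
            return Status.NOT_CONNECTED
        self.sent.append((my_socket, self.connections[my_socket], data))
        return Status.OK

    def close(self, my_socket: int) -> Status:
        if self.connections.pop(my_socket, None) is None:
            return Status.NOT_CONNECTED
        return Status.OK
```

Every call returns a status instead of raising an exception, which is roughly the flavor of a system-call interface where the caller inspects a code after each request.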

Then they provide a kind of block-level system understanding (with accompanying diagram) of the NCP. There's an input buffer and an output buffer as described at length in the intro to RFC-54. There's a sub-program that interprets inputs from the network and shunts them to the right place (this is an error so do xyz with it, this is a message for the user so do abc with it). There's another sub-program that prioritizes different messages to be sent as output to the network (control messages take priority over regular data). And the system call interpreter is a sub-program that interprets input from the local user and does various things with it.
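The output side, where control messages jump ahead of regular data, can be sketched with a priority queue. This is my own illustration in Python, assuming just two priority classes; the names OutputQueue, CONTROL, and DATA are made up and the RFC does not prescribe this data structure.

```python
import heapq
from itertools import count

CONTROL, DATA = 0, 1  # lower number is sent first


class OutputQueue:
    def __init__(self):
        self._heap = []
        # Tie-breaker so messages of the same kind leave in arrival order.
        self._order = count()

    def put(self, kind, message):
        heapq.heappush(self._heap, (kind, next(self._order), message))

    def get(self):
        _kind, _seq, message = heapq.heappop(self._heap)
        return message

    def __len__(self):
        return len(self._heap)
```

If you put "data 1", then "control A", then "data 2" into the queue, they come out as "control A", "data 1", "data 2".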

Next they provide descriptions of the data tables for the program, outlining how important data about the network is stored by the NCP. This includes records of which links are free to use, which HOSTs are online, which connections are active, etc. You may recall from my RFC-44 post that Edgar Codd of IBM was in the process of inventing the modern relational database at this time. This kind of thing in a modern system would be stored in a database, but here it just lives in a series of data structures (a small difference but worth noting for historical context).
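Here is a rough Python sketch of what "tables as plain data structures" might look like. The three tables mirror the ones I listed above, but the field names, the four-link pool, and the rule of handing out the lowest free link are my own assumptions, not the layout in the RFC.

```python
from dataclasses import dataclass, field


@dataclass
class NCPTables:
    # Hypothetical pool of link numbers that are free to use.
    free_links: set = field(default_factory=lambda: set(range(1, 5)))
    # host number -> True if we believe the host is online.
    hosts_online: dict = field(default_factory=dict)
    # local socket -> (host, foreign socket, link) for active connections.
    connections: dict = field(default_factory=dict)

    def open_connection(self, local_socket, host, foreign_socket):
        if not self.hosts_online.get(host, False):
            raise LookupError(f"host {host} is not known to be online")
        if not self.free_links:
            raise RuntimeError("no free links")
        link = min(self.free_links)  # assumption: lowest free link first
        self.free_links.remove(link)
        self.connections[local_socket] = (host, foreign_socket, link)
        return link

    def close_connection(self, local_socket):
        _host, _foreign, link = self.connections.pop(local_socket)
        self.free_links.add(link)  # the link becomes free again
```

Opening a connection touches all three tables at once, which is exactly the kind of bookkeeping a database transaction would handle today.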

Analysis

Worth pointing out is a note in the document that says “[s]quishy amoeba-like objects” (!!!) in the diagram represent component programs. The amoeba-like objects were transcribed as boxes in the RFC transcription process, but here are my photos of the two accompanying diagrams for the RFC. These are from my visit to the archives of the Computer History Museum.

[Two photos of the diagrams: each flow chart is full of amoeba-looking blobs instead of rectangles]

I've been fascinated by the biological nature of many of these early internet drawings. I was almost wondering if I was projecting my interpretations onto the drawings but right here in the document the original authors claim the figures are “squishy” and “amoeba-like”! I assume this is to show that these programs have indefinite conceptual boundaries, whereas the queues are a well-defined data structure and as such get a well-defined shape.

I've also noticed that sometime in the past ten or so RFCs, the writing convention changed: what was once a HOST (all caps) is now a Host (title case). I will probably use Host from here on to match the convention of the documents.

How to follow this blog

You can subscribe to this blog's RSS feed or if you're on a federated ActivityPub social network like Mastodon or Pleroma you can search for the user “@365-rfcs@write.as” and follow it there.

by Darius Kazemi, Feb 23 2019

A proffering

RFC-54 has a whopping four authors. To date we've seen a maximum of two listed on one RFC. They are: Steve Crocker, Jon Postel, and two newcomers to the series, John Newkirk and Mike Kraley of Harvard. Recall that Harvard was added to the distribution list in RFC-52. This is dated June 18, 1970 and titled “An Official Protocol Proffering” (say that five times fast).

The technical content

This is the promised first-ever attempt at an official protocol being published as an RFC. Per the rules laid out in RFC-53, they give a deadline for comment of July 13th 1970, a little less than a month after the date of this RFC (so probably about 3 full weeks for people to respond once they've received this RFC in the mail).

The first section is an overview and defense of the concepts in the document.

They have decided to single out “only network-wide issues” for an official standard that people will be forced to adopt. They still want to let everything on the HOST side be as open to interpretation as possible. As long as you speak the lingua franca of the ARPANET in public, the ARPANET doesn't care what language you use at home.

In the course of designing this current protocol, we have come to understand that flow control is more complex than we imagined.

By “flow control” they mean controlling network traffic so it doesn't get backed up. I was wondering why they seemed to assume up to this point that the solution would be trivial; of course, I am reading these documents with literally half a century of hindsight. Now we finally see them bump into these problems.

The basic proposal is that when a HOST A wants to send data to HOST B, it is up to B to maintain a memory buffer and let A know how much buffer space remains. And A is supposed to respect that. A physical analogy: suppose you want to ship 1000 crates of goods to a warehouse, but they have a limited number of storage bays. So you say “I want to send you 1000 crates of goods” and they reply “we only have room for 50 crates today”, and then you decide whether to send them 50 crates or just cancel sending entirely because it's not worth it to you.
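The warehouse analogy translates pretty directly into code. Below is a minimal Python sketch of receiver-driven flow control under my own assumptions: the Receiver class, the send_all helper, and the idea that the receiving program drains a fixed number of crates per round are all invented for illustration. The sender here keeps shipping in rounds and never cancels, which is only one of the choices the analogy allows.

```python
class Receiver:
    def __init__(self, capacity):
        self.capacity = capacity
        self.buffer = []

    def space_remaining(self):
        return self.capacity - len(self.buffer)

    def accept(self, crates):
        if len(crates) > self.space_remaining():
            raise OverflowError("sender exceeded the advertised space")
        self.buffer.extend(crates)

    def drain(self, n):
        # The local program consumes up to n items, freeing buffer space.
        del self.buffer[:n]


def send_all(items, receiver, drain_per_round):
    """Send items in rounds, never exceeding what the receiver advertises."""
    if drain_per_round <= 0:
        raise ValueError("receiver must drain something or we never finish")
    rounds = 0
    pending = list(items)
    while pending:
        room = receiver.space_remaining()  # B tells A how much space is left
        batch, pending = pending[:room], pending[room:]
        receiver.accept(batch)             # A respects that limit
        receiver.drain(drain_per_round)
        rounds += 1
    return rounds
```

With 1000 crates, room for 50, and a receiver that empties all 50 bays each round, send_all takes 20 rounds. That is the "probably too conservative" trade-off in miniature: the sender never overruns the buffer, but it also never sends faster than the receiver says it can.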

The authors claim that this algorithm simplifies some of the existing designs for flow control, so there will be fewer commands over the network to keep track of. They would like to keep this initial version of flow control as simple as possible, even at the cost of being probably too conservative in how much data is sent over the network. They are open to future protocols that are more complex but I get the sense that they just want this thing to ship, you know?

They also propose two meetings, one at UCLA and one at Harvard, for people to attend with any questions they have about the protocol. They specify that they want at most one programmer from each institution to attend.

The second section is more of a technical specification.

First they define a “connection” as “a simplex communication path [...] between two processes”. In other words, a connection is a one-way flow of data between two different computer programs over the network.

The main purpose of this protocol is to establish how connections are created, how flow control is to work in an open connection, and how they are terminated.

The way a connection is opened is through two “sockets”. A socket has a unique ID known throughout the network and corresponds to a particular program running on a particular HOST. You open a connection saying “I want this socket to connect to that socket”, which is another way of saying “I want this program on my computer to connect to that program on your computer.” (A close analogue to a socket today would be a combination of an IP address and a port number.)

Every connection gets its own link. This is in contrast to earlier proposals of “multiplexing systems” (like RFC-38) where multiple connections could share the same link. (Maybe you recall from my RFC-11 post that multiplexing is when you combine two or more signals in such a way that you can send them over the same communication channel but untangle them from one another on receipt.)

When a socket is in use it is reserved. Basically, a program can't use one socket to talk to multiple other programs on the network at the same time. Connections and their attendant sockets are one-to-one.
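The rules from the last few paragraphs (one-way connections between two sockets, one link per connection, and sockets reserved while in use) fit in a small Python sketch. This is my own toy model: the ConnectionTable class, the (host, number) tuples used as socket IDs, and the sequential link numbering are all hypothetical.

```python
class SocketInUse(Exception):
    pass


class ConnectionTable:
    def __init__(self):
        self._peer = {}      # socket -> the one socket it is connected to
        self._link = {}      # (send socket, receive socket) -> link number
        self._next_link = 1  # hypothetical link numbering

    def open(self, send_socket, recv_socket):
        if send_socket == recv_socket:
            raise ValueError("a connection needs two distinct sockets")
        for s in (send_socket, recv_socket):
            if s in self._peer:
                raise SocketInUse(f"socket {s} is already reserved")
        # Reserve both ends: each socket has exactly one peer.
        self._peer[send_socket] = recv_socket
        self._peer[recv_socket] = send_socket
        # Every connection gets its own link; nothing is shared.
        link = self._next_link
        self._next_link += 1
        self._link[(send_socket, recv_socket)] = link
        return link

    def close(self, send_socket, recv_socket):
        del self._link[(send_socket, recv_socket)]
        del self._peer[send_socket]
        del self._peer[recv_socket]
```

Because a connection is simplex, a two-way conversation between two programs would need two entries in this table, one in each direction, using four sockets in total.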

The final section lays out some more general thoughts.

They identify three areas where future protocols might improve on this one: better flow control, error handling, and what to do when a computer is only part-time connected to the network.

They “suggest that hosts come onto the network gingerly.” Specifically they recommend that a HOST talk to its own sockets first as a test mechanism; then it should talk to a HOST that is known to be working well; then it can try talking to all HOSTs. The folks at UCLA offer to be the test case for anyone who wants it.
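That staged approach is easy to express as a loop. Here is a small Python sketch of my own, where bring_up and the can_talk callback are hypothetical names and the "test" is whatever check you plug in; the RFC only describes the order of the stages, not any code.

```python
def bring_up(my_host, trusted_host, all_hosts, can_talk):
    """Return the hosts reached, stopping after the first stage that fails."""
    stages = [
        [my_host],       # 1. talk to our own sockets first
        [trusted_host],  # 2. then a host known to be working well
        # 3. only then everyone else
        [h for h in all_hosts if h not in (my_host, trusted_host)],
    ]
    reached = []
    for stage in stages:
        ok = [h for h in stage if can_talk(h)]
        reached.extend(ok)
        if len(ok) < len(stage):
            break  # don't widen the test until this stage is clean
    return reached
```

The point of the early exit is the same as the RFC's: if you can't even talk to yourself, there's no reason to start poking at the rest of the network.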

Apparently local sites have already been using the IMP as a kind of local router! “For example, Harvard is connecting its PDP-1 to its PDP-10 via an IMP”. That's pretty cool. They beseech people doing this to use caution so as not to like, bring down the entire ARPANET by accident.

Analysis

I suspect one reason there are four authors on this is both to have a lot of eyes on this first official specification attempt and to show that two very different institutions (UCLA and Harvard) are on board with it. You know, for once it's not just those zany kids Crocker and Postel at UCLA coming up with this stuff!

One question I have is: how are sockets “known throughout the network”? Partly it's by convention. Even numbered sockets are assumed to be for receiving data; odd numbered sockets are assumed to be for sending data. But beyond that I'm not sure how everyone is supposed to know about everyone else's socket IDs.
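The even/odd convention is simple enough to check in a couple of lines of Python. The function names here are mine, and the rule that a connection needs one sending end and one receiving end is my reading of "simplex" rather than a quote from the RFC.

```python
def socket_direction(socket_number: int) -> str:
    # Even-numbered sockets receive; odd-numbered sockets send.
    return "receive" if socket_number % 2 == 0 else "send"


def can_connect(a: int, b: int) -> bool:
    # A one-way connection needs one sending end and one receiving end.
    return socket_direction(a) != socket_direction(b)
```

So socket 5 could connect to socket 8, but not to socket 7. This tells you which direction a socket faces, though it still doesn't answer my question of how you learn that socket 8 exists in the first place.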

Everything about this particular protocol proposal screams “simple” to me. It's relatively easy to implement. It's not maximally efficient but it gets the job done. I like it a lot. It's frankly the first proposal I've seen here that I believe I would not have pushed back on had I been one of these original NWG members.

I sort of assumed that this kind of thing would be literally the first thing we'd see in like RFC-1, but over the past 14 months of RFCs, we have seen proposals of medium complexity (like the original HOST-HOST protocol) all the way up to extremely high complexity (DEL, NIL).

It took them more than a year to arrive at something simple!
