by Darius Kazemi, May 27 2019
RFC-147 is titled “The Definition of a Socket”. It's authored by Joel M. Winett of Lincoln Laboratory, dated May 7th, 1971.
The technical content
This RFC attempts to formally define what a socket is. The first time the term “socket” was used in an RFC was way back in RFC-33, more than a year prior to the publication of this RFC, so it seems like it's about time.
A socket is defined to be the unique identification to or from which information is transmitted in the network.
A socket is a 32-bit identifier. Even-numbered sockets are for receiving information, odd-numbered sockets are for sending information. Each socket maps to a “port”, which is a local input or output device, either physical (hardware like a printer) or logical (pure software like a file system). Simultaneously, that socket maps to a “process”, or program. I think the implication is that programs thus also map to I/O devices.
Some sockets should be universally known and agreed upon by convention. For example, the socket for initial connection should be the same on all hosts. Otherwise you'd need to maintain a reference to the initial connection socket of every computer on the network in order to make things work. Having the same socket for all initial connections means nobody has to keep track of this: all you need to do to connect to UCLA is know UCLA's network ID and then connect to the universally agreed-upon socket for initial connections.
Importantly, sockets are related to accounting procedures, aka “how do we track who does what and possibly charge them money for the privilege?” If an NCP logs the amount of time each remote socket is connected to a local socket, and if the remote socket can be associated with some user account on the remote server, then you can charge money for usage accordingly (or figure out who deleted your file, etc.).
Given the importance of user identification, the author recommends using the 32 bits of a socket like so:
- 8 bits to identify the host server
- 16 bits to identify the user
- 7 bits for a unique number that is mapped to a local process ID
- 1 bit for the send/receive polarity
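To make the proposed layout concrete, here's a minimal Python sketch that packs and unpacks the four fields. The field order (host in the high bits, polarity in the lowest bit) is my assumption, chosen so that odd values come out as send sockets. The function names and the `tag` label for the 7-bit process number are hypothetical and not taken from the RFC.

```python
# Field widths from the recommendation: 8 + 16 + 7 + 1 = 32 bits.
HOST_BITS, USER_BITS, TAG_BITS = 8, 16, 7


def pack_socket(host: int, user: int, tag: int, sending: bool) -> int:
    """Combine the four fields into one 32-bit socket number.

    Assumed layout (most significant first): host, user, tag, polarity.
    """
    if not 0 <= host < 2 ** HOST_BITS:
        raise ValueError("host must fit in 8 bits")
    if not 0 <= user < 2 ** USER_BITS:
        raise ValueError("user must fit in 16 bits")
    if not 0 <= tag < 2 ** TAG_BITS:
        raise ValueError("tag must fit in 7 bits")
    # Lowest bit is the polarity: 0 (even) = receive, 1 (odd) = send.
    return (host << 24) | (user << 8) | (tag << 1) | int(sending)


def unpack_socket(sock: int) -> dict:
    """Split a 32-bit socket number back into its assumed fields."""
    return {
        "host": (sock >> 24) & 0xFF,
        "user": (sock >> 8) & 0xFFFF,
        "tag": (sock >> 1) & 0x7F,
        "sending": bool(sock & 1),
    }
```

Under this assumed layout, `pack_socket(1, 2, 3, True)` produces an odd number, which matches the send polarity, and `unpack_socket` recovers the original fields.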
So in a modern context, by this definition, a socket would be something like the combination of an IP address and a port, though a modern port is full-duplex, whereas a socket is half-duplex, requiring an even/odd socket pair to cover both sending and receiving. (So like, we use port 80 for HTTP traffic by convention, but this specification would mean using port 80 for receiving HTTP traffic and port 81 for sending HTTP traffic.)
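The even/odd convention can be sketched in a few lines of Python. The helper names are hypothetical, and pairing a socket with its neighbor by flipping the lowest bit is my own illustration of the port 80/81 example, not something the RFC spells out.

```python
def is_receive_socket(sock: int) -> bool:
    """Even-numbered sockets receive; odd-numbered sockets send."""
    return sock % 2 == 0


def paired_socket(sock: int) -> int:
    """Return the other half of an even/odd pair by flipping the lowest bit."""
    return sock ^ 1
```

So `paired_socket(80)` gives 81 and `paired_socket(81)` gives 80, with 80 being the receive side of the pair.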
Apportioning 8 bits for host server identification in a socket allows for only 256 hosts, which probably seemed like a lot in May 1971 but would be woefully insufficient within a few years.
I think it's funny and cute that Winett signed his cover page with not just the title, his name, and affiliation, but also the machine the RFC was prepared on, an IBM 360/67. This is kind of like “Sent from my iPhone” at the bottom of an email.
There's an excellent 2015 paper by Bradley Fidler and Amelia Acker (available free online) that goes into the history of sockets on the ARPANET and proposes that sockets are both a kind of infrastructure and metadata.