by Darius Kazemi, December 4 2019

In 2019 I'm reading one RFC a day in chronological order starting from the very first one. More on this project here. There is a table of contents for all my RFC posts.

The real world of users and servers

RFC-338 is titled “EBCDIC/ASCII Mapping for Network RJE”. It's authored by Bob Braden of the UCLA Campus Computing Network and dated May 17, 1972.

The technical content

Okay so now we are back to some bread-and-butter ARPANET stuff: character encoding. In RFC-112, a survey of ARPANET sites conducted by Raytheon, we learned that as of April 1971, a sizable minority of ARPANET computers internally used EBCDIC as their character encoding, rather than using ASCII or something else. EBCDIC was essentially the in-house choice of IBM mainframes at the time, including the popular IBM 360 series.

The author of this RFC points out that in RFC-183, Joel Winett “sounded a clarion call for all EBCDIC sites to join in defining a Network standards mapping.” Braden's own site, UCLA's Campus Computing Network, has had reports from users of its Remote Job Entry system who have been experiencing bugs related to EBCDIC/ASCII conversion.

To get a sense of how maddening the (non) overlap of different standards was, check out this Venn diagram drawn by Braden for the RFC.

A hand drawn diagram of five overlapping rectangles showing what characters are supported by: Basic EBCDIC, 33/35, full ASCII, AT&T TWX (Mod 33/35 tty), and PL1 Set.

Braden has decided to incorporate into NETRJS (the program for RJE operation) the second and third character mappings laid out in RFC-183. This warms my heart because when I was reading RFC-183 I noted that the first character mapping, the one suggested by IBM, is “frankly weird”. I'm heartened that Braden-in-1972 agrees with my low assessment of IBM's mappings.

Braden talks about an error they made originally: they assumed that people executing IBM 360 programs would not be providing ASCII characters that are not supported by EBCDIC in the first place. These characters ( [, ], {, }, ^, ` ) were transcoded to EBCDIC question marks. The problem here is that some of these IBM 360 programs were designed to manipulate ASCII text from the ARPANET! So they absolutely did need to be able to encode every available ASCII character in that case.
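To make the failure concrete, here is a minimal sketch in Python of what a lossy ASCII-to-EBCDIC conversion like this looks like. This is not Braden's mapping or any of the tables from RFC-183: I'm assuming Python's built-in `cp037` codec (a later IBM EBCDIC code page) as a stand-in for the supported characters, and the function names and the `UNSUPPORTED` set are made up for illustration. It also assumes the input is plain ASCII.

```python
# Hypothetical illustration only: this is NOT the mapping from RFC-338 or RFC-183.
# cp037 is a later IBM EBCDIC code page, used here as a stand-in.

# The six ASCII characters the post lists as having no EBCDIC equivalent.
UNSUPPORTED = set("[]{}^`")

# The question mark sits at 0x6F in EBCDIC (cp037).
EBCDIC_QUESTION_MARK = 0x6F


def ascii_to_ebcdic_lossy(text: str) -> bytes:
    """Convert ASCII text to EBCDIC, replacing unsupported characters with '?'."""
    out = bytearray()
    for ch in text:
        if ch in UNSUPPORTED:
            out.append(EBCDIC_QUESTION_MARK)
        else:
            out.extend(ch.encode("cp037"))
    return bytes(out)


def ebcdic_to_ascii(data: bytes) -> str:
    """Convert EBCDIC bytes back to text."""
    return data.decode("cp037")


original = "a[0] = {x}"
round_tripped = ebcdic_to_ascii(ascii_to_ebcdic_lossy(original))
print(round_tripped)  # a?0? = ?x?
```

Once the brackets and braces have collapsed into question marks there is no way to recover them on the way back out, which is exactly the problem for a 360 program whose whole job is to manipulate ASCII text from the network.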

There are even more problems with Model 33/35 Teletype users, as you might be able to infer from the above diagram. Braden offers a third mapping that he declares “is ugly, but it is probably the best we can do.”

Braden concludes with an observation that ARPANET discussions tend to put all the burden on user sites to map things into formats that are network-compatible, but he says that “in the real world of users and servers, the server will have to do the adapting”. By my estimate, in 1972 Bob Braden was probably the single most experienced human being when it came to providing and maintaining heavily used remote services over the ARPANET, so this is extremely hard-earned and novel wisdom for the time.

How to follow this blog

You can subscribe to this blog's RSS feed or if you're on a federated ActivityPub social network like Mastodon or Pleroma you can search for the user “@365-rfcs@write.as” and follow it there.

About me

I'm Darius Kazemi. I'm an independent technologist and artist. I do a lot of work on the decentralized web with ActivityPub, including a Node.js reference implementation, an RSS-to-ActivityPub converter, and a fork of Mastodon, called Hometown. You can support my work via my Patreon.