Monday, January 21, 2013

Converting ITS documentation

I've been playing around a bit more with the Incompatible Timesharing System (ITS). I figured out enough networking on my Mac to be able to use the Java Supdup terminal application to interact with KLH-10 simulating the KS10 architecture.

As part of my study, I have also been converting some INFO files to Texinfo, starting with the file INFO;DDT > (slow link to ITS system), which Bjorn Victor converted to an HTML version (DDT Primer). I like Texinfo because it can (theoretically) produce nice-looking books for an ebook reader like my new Kindle Paperwhite, as well as Info files for Emacs navigation, and HTML.

Actually distributing these files is problematic. ITS generally seems to have operated with a pretty free notion of distribution: send a magnetic tape to pretty much anyone with a PDP-10 who asked, without executing any kind of explicit license agreement, or even having copyright notices on the text. Anyone with access to the system could take it upon themselves to edit these files. It is pretty much impossible to tell who wrote or collaborated on these files, at what time, for whom they were working, and what their employer (mostly the AI Lab at MIT) agreed to allow. Some of these files were electronic versions of AI memos and other MIT publications.

MIT apparently released a fraction of the ITS code (not enough to actually use, and without the documents) under the GPL, but fuller distributions have been made informally, probably originating with the former operators of the ITS systems. Some initial versions of these dumps had a bunch of personal e-mail files and the personal data in the user database, and later versions were scrubbed of most of this information. Even these scrubbed versions are generally unavailable, though this may be from negligence in the web hosting rather than the result of a legal takedown request.

Technically speaking, the ebook conversions are still a bit messy: I am using dbtoepub (a Ruby script converting DocBook to EPUB) and a 'texinfo-to-mobi' shell script which invokes makeinfo, dbtoepub, and Amazon's kindlegen binary to create a Kindle document. I have issues with the Table of Contents, Chapter headings for untitled chapters, and links from the index not navigating to the ideal place. I also found Emacs Info doesn't like it if I use a UTF-8 multibyte character: it seems to improperly count by characters where makeinfo counted by bytes.
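To make the byte-versus-character mismatch concrete, here is a minimal Python sketch. The sample text and the offsets() helper are made up for illustration; this is not how makeinfo or Emacs actually compute positions, only a demonstration of how the two ways of counting drift apart once a multibyte character appears.

```python
# Hypothetical illustration: the same position measured in characters and in
# UTF-8 bytes. Each lozenge (U+25CA) is one character but three bytes.

def offsets(text, marker):
    """Return (character offset, UTF-8 byte offset) of marker within text."""
    char_offset = text.index(marker)
    byte_offset = len(text[:char_offset].encode("utf-8"))
    return char_offset, byte_offset

# Made-up sample: two lozenges appear before the "Node:" line.
sample = "key: \u25ca\u25ca\nNode: Next"

chars, nbytes = offsets(sample, "Node:")
print("character offset:", chars)
print("byte offset:     ", nbytes)
print("drift:           ", nbytes - chars)
```

Every lozenge before a given position adds two to the gap, so a reader that counts characters against a table written in bytes would land progressively further from the intended spot.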

I am using UTF-8 to provide a (SAIL-style?) lozenge rendering of "Altmode." Altmode is ASCII code ESC (27 decimal, 33 octal, 0x1B hex), although sometimes it is octal 175; it seems ASCII and terminals changed what those codes meant over time. Much of the time, Altmode echoes as a dollar sign $. In Supdup (see RFC 734), octal 4033 is apparently the ◊ lozenge character, and *that* can be used to echo Altmode. I like the lozenge better; it definitely stands out in Courier-style type more than $ does. Unicode denotes this as U+25CA. The TeX side of Texinfo doesn't really "get" Unicode, or even very much in the way of "funny characters," but I was able to find a TeX macro in plain.tex for \diamond that I could hack into my Texinfo document, and my conversion tools translated this to Unicode acceptably.
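The various code values above are easy to check with a few lines of Python. This is just arithmetic on the numbers quoted in this post; the describe() helper is a made-up name, and the observation about the low seven bits of 4033 is my own reading of the number, not a statement of what RFC 734 defines.

```python
# Arithmetic check of the Altmode-related codes mentioned above.

def describe(code):
    """Format a character code in decimal, octal, and hex."""
    return f"{code} decimal, {code:o} octal, 0x{code:02X} hex"

ESC = 0o33            # ASCII ESC, the usual Altmode code
OLD_ALTMODE = 0o175   # the alternate code; modern ASCII uses it for '}'
SUPDUP_CODE = 0o4033  # the Supdup code quoted above
LOZENGE = "\u25ca"    # Unicode LOZENGE

print(describe(ESC))
print("octal 175 in modern ASCII:", repr(chr(OLD_ALTMODE)))
# The low seven bits of octal 4033 are octal 33, i.e. the ESC code again.
print("low 7 bits of 4033:", oct(SUPDUP_CODE & 0o177))
print("UTF-8 bytes of the lozenge:", LOZENGE.encode("utf-8").hex(" "))
```

The last line shows why the lozenge matters to tools that count bytes: it is a single character but takes three bytes in UTF-8.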

The "keystroke" index is a bit funky: it doesn't like my altmode characters. Metavariable notation using <var> is not supported by the semantic markup: Info would show it as VAR, which I think is a really bad choice when all the other command keys are in upper case. Likewise for Control characters; I'd like ^Z, etc., to look nice and be semantically understood for indexing, but not use C-Z or C-z notation. I'd like to be able to specify initial Texinfo variable settings from the command line (so that I could say 'makeinfo --altmodechar=$' or '--altmodechar=\033' to render my Info files in other ways), but I can only specify boolean flags that way.
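One possible workaround, sketched here in Python, would be to post-process the generated text and swap the lozenge for whatever Altmode rendering is wanted. This is not something makeinfo provides: render_altmode() is a hypothetical helper, and it assumes the lozenge appears literally as U+25CA in the output and is used for nothing else.

```python
# Hypothetical post-processing step: replace the lozenge used for Altmode
# with another rendering, such as "$" or a literal ESC character.

LOZENGE = "\u25ca"

def render_altmode(text, altmode_char="$"):
    """Return text with every lozenge replaced by altmode_char."""
    return text.replace(LOZENGE, altmode_char)

# Made-up sample line, not taken from the DDT documentation.
line = "Type \u25ca to end the command."
print(render_altmode(line))                  # dollar-sign rendering
print(repr(render_altmode(line, "\x1b")))    # literal ESC rendering
```

A caveat under the same assumptions: replacing a three-byte character with a one-byte one changes the length of the file in bytes, so any byte offsets recorded in a generated Info file would probably need to be regenerated afterwards.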