TRS-80 Model II TRSDOS: paranoia strikes deep

Eric — Sat, 31 Mar 2012 09:32:59 +0000

It looks like the developers of TRS-80 Model II TRSDOS were very paranoid that someone might be able to bypass the filesystem and access data on a floppy directly. I’m not sure if their primary concern was file password protection, or if they had other reasons. Obviously you could write a program that accesses the floppy directly, by talking to the FDC and DMAC chips yourself, and there’s not really anything that can be done to prevent that.

Oddly enough, this was exactly the opposite of what Apple did in Apple DOS. Apple published the APIs to read and write sectors (RWTS), but never published the “File Manager” APIs that allowed access to the file system through means other than passing commands through the character output vector (e.g., the BASIC statement PRINT CHR$(4);"OPEN FOO").

I’ll mostly describe how things work in Model II TRSDOS 1.2, the earliest version I’ve been able to obtain. I haven’t studied 2.0 nearly as much yet. The TRSDOS 1.2 “kernel” consists of three parts, while later versions are more monolithic.

The Model II boot ROM loads all of drive 0 track 0 (single density, 26 sectors of 128 bytes) into memory starting at 0e00. It then looks for the four characters “DIAG” at 1400 and “BOOT” at 1000. If either is missing, it refuses to proceed. Otherwise it calls the code at 1404, which in TRSDOS is a simple hardware diagnostic. When that returns, it jumps to the first stage boot loader code at 1004. Some other operating systems don’t bother with a diagnostic, and just start their boot code at 1404, never returning to the ROM.
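To make the address arithmetic concrete, here is a minimal Python sketch of the signature check as described above. It is only a model of the behavior, not a disassembly of the ROM; the function name and return value are made up for illustration.

```python
# Model of the boot ROM's track 0 check, based only on the description above.
# check_track0() and its return format are hypothetical, not ROM internals.

SECTOR_SIZE = 128          # single density
SECTORS_PER_TRACK = 26
LOAD_ADDR = 0x0E00         # where track 0 is loaded
BOOT_ADDR = 0x1000         # "BOOT" signature, boot code follows at 0x1004
DIAG_ADDR = 0x1400         # "DIAG" signature, diagnostic follows at 0x1404


def check_track0(track0: bytes):
    """Return the two entry points if both signatures are present, else None."""
    if len(track0) != SECTOR_SIZE * SECTORS_PER_TRACK:
        raise ValueError("expected a full track 0 image (26 x 128 bytes)")

    def at(addr: int, length: int) -> bytes:
        # Convert a memory address to an offset within the track image.
        offset = addr - LOAD_ADDR
        return track0[offset:offset + length]

    if at(DIAG_ADDR, 4) != b"DIAG" or at(BOOT_ADDR, 4) != b"BOOT":
        return None  # the ROM refuses to proceed
    return {"diag_entry": DIAG_ADDR + 4, "boot_entry": BOOT_ADDR + 4}
```

With a load address of 0e00, “BOOT” sits at offset 0x200 in the track image (the start of the fifth sector) and “DIAG” at offset 0x600 (the start of the thirteenth).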

The first stage boot loader actually understands the TRSDOS filesystem enough to find the directory entries of files in TRSDOS load module format, and load them into memory. In 1.2, it loads “IODVRS/SYS” and then “TRSDOS/SYS”, and jumps into the latter. The Model II TRSDOS filesystem is similar in many regards to that of Model I TRSDOS, but not enough to actually be compatible. Unsurprisingly, it looks like an intermediate step in the evolution from Model I TRSDOS to Model III TRSDOS. As in Model III TRSDOS, files can only have a single directory entry, with a limited number of extents.

IODVRS/SYS contains, as the name implies, the low level I/O drivers for the system, including the keyboard, display, printer, and floppy drives, the dispatching for system (SVC) calls, and a few utility SVCs. However, it only contains the SVC handlers for services 0-28, the I/O functions and basic utility SVCs. Note in particular that it contains no file system code. IODVRS/SYS is conceptually similar to the CP/M BIOS, though lacking CP/M’s charming simplicity. IODVRS/SYS provides several undocumented SVCs for internal use by TRSDOS, including floppy subsystem initialization (13), floppy sector read (14), and floppy sector write (16). Note that at the time IODVRS/SYS is loaded, no call is made into it to initialize it.

TRSDOS/SYS, however, is called after being loaded. It basically performs the TRSDOS initialization that only has to happen at boot time. It has another implementation of filesystem reading and load module format handling, very similar to what is present in the stage 1 boot, but now instead of talking to the FDC and DMAC directly, it uses the undocumented floppy SVCs described previously. After various initialization, it loads SYSRES/SYS and jumps into it.

SYSRES/SYS contains the filesystem code and other relatively high-level TRSDOS infrastructure code. It generally relies on SVC calls into IODVRS/SYS to perform all I/O, and has very little other dependence on IODVRS/SYS internals. This is conceptually similar to the CP/M BDOS. It loads system overlays to handle some SVCs and user commands. Overlays SYS0/SYS through SYS9/SYS are small overlays, occupying one disk granule (five sectors) and loading into 2200-26ff. Other overlays may be larger, and load at 2800 or higher. Many of the overlays do depend on knowledge of the internals of SYSRES/SYS, directly accessing subroutines and data structures without the use of vector tables or the like. This means that SYSRES/SYS and the overlays must have been built at the same time, and would generally not be interchangeable with earlier or later releases.
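As a quick sanity check on the small overlay numbers, the load region and the granule size can be compared. The sector size is not stated above, so the sketch below derives it from the two figures given rather than assuming it.

```python
# Arithmetic check on the small overlay region, using only the figures above.
# The per-sector size is derived here, not taken from TRSDOS documentation.

OVERLAY_START = 0x2200
OVERLAY_END = 0x26FF        # inclusive
SECTORS_PER_GRANULE = 5

region_bytes = OVERLAY_END - OVERLAY_START + 1
bytes_per_sector, remainder = divmod(region_bytes, SECTORS_PER_GRANULE)

print(f"overlay region: {region_bytes} bytes")
print(f"implied sector size: {bytes_per_sector} bytes (remainder {remainder})")
```

The region divides evenly into five sectors, which is consistent with a small overlay filling exactly one granule.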

Anyhow, getting back to the paranoia part. Someone apparently decided that simply not documenting the SVCs that provide sector-level access to the floppy was not sufficient to thwart those that might want to bypass the filesystem. After TRSDOS/SYS uses those SVCs for its part in the boot process, it actually removes them from the SVC vector table, and sets up jumps to them at undocumented internal TRSDOS locations 1130 (read sector) and 1133 (write sector).

In TRSDOS 1.2, access to all of the system files, including overlays, is done through the file system. The system files have normal file system entries. Unlike Model I TRSDOS, neither the system file directory entries nor the file contents need to be in any special location on the disk.

In TRSDOS 2.0, things are much more monolithic. The stage 1 boot code only loads and jumps into a single file, SYSRES/SYS. The boot code does not care where this file is located, but other parts of the system do. All of the overlays, small and large, are stored in a single file, SYSTEM/SYS, which is required to start on the track after the primary directory. The first sector of SYSTEM/SYS contains a kind of overlay directory that gives the track and sector numbers at which each overlay starts.

There is perhaps some advantage to putting all of the overlays in a single file, since the number of directory entries on the diskette is limited to 96. However, the need for a second, special directory mechanism for overlays is ugly, even if it is only a simple one. Requiring the system files to be at fixed locations on the disk (at least relative to the primary directory) might be a reasonable requirement if it yielded some performance gain, but it generally doesn’t. (With 1.2, the system files are set up when the disk is formatted, so even though they could be anywhere, in practice they are grouped together.)

TRSDOS 2.0 introduced changes to the disk organization, such that TRSDOS 1.2 and 2.0 diskettes are not interchangeable, except that the 2.0 XFERSYS utility can convert a 1.2 diskette to 2.0 format. The disk organization changes are basically gratuitous, and don’t provide any benefit to the user, while obviously being a great inconvenience to users with TRSDOS 1.2. They mashed the GAT (granule allocation table) and HIT (hash index table), which were sectors 1 and 2 of the directory track in 1.2, into just sector 1 in 2.0. In 1.2, the directory occupied sectors 3-26, while in 2.0 it occupies sectors 2-25. The only apparent rationale for doing this is to free up sector 26 on the directory track. In TRSDOS 1.2, sector 26 was not used on any track but the directory track, for any purpose. In TRSDOS 2.0, sector 26 of every track is used to store five bytes of unique disk ID, to better detect disk changes. (It has been suggested that those bytes might also have been used for software copy protection.) However, rather than mashing the GAT and HIT together, which made it impossible to support larger disks such as double-sided disks, they could easily have special-cased the directory track(s) and stored the disk ID in either the GAT or HIT sector.
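The two directory track layouts are easier to compare side by side. The sketch below simply restates the sector assignments described above as a table; it assumes the 96-entry directory limit applies to both versions, which I have not verified for 1.2.

```python
# Directory track layout in TRSDOS 1.2 versus 2.0, as described above.
# Assumption: the 96-entry directory limit is the same in both versions.

DIRECTORY_TRACK_LAYOUT = {
    "1.2": {"GAT": [1], "HIT": [2],
            "directory": list(range(3, 27)), "disk ID": []},
    "2.0": {"GAT+HIT": [1],
            "directory": list(range(2, 26)), "disk ID": [26]},
}
MAX_DIRECTORY_ENTRIES = 96


def entries_per_sector(version: str) -> int:
    """Directory entries per directory sector implied by the layout."""
    sectors = DIRECTORY_TRACK_LAYOUT[version]["directory"]
    return MAX_DIRECTORY_ENTRIES // len(sectors)


for version, layout in DIRECTORY_TRACK_LAYOUT.items():
    count = len(layout["directory"])
    print(f"TRSDOS {version}: {count} directory sectors, "
          f"{entries_per_sector(version)} entries per sector")
```

Both versions end up with 24 directory sectors, so the directory itself gained nothing from the change; only the position of the sectors moved.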

TRSDOS 4.0 introduced much larger changes to the disk organization, in order to support double-sided disks and hard disks. I haven’t yet begun to dig into the 4.0 code.

Reverse-engineering a binary-only Linux library and executable

Eric — Sun, 21 Oct 2007 20:44:14 +0000

I’ve been trying to reverse-engineer a particular proprietary binary-only Linux library, to learn what algorithms it uses. Unsurprisingly, the vendor ships both the library and the application that calls it without debugging information and with symbols stripped. However, because the application links to the library dynamically, the function names of the library entry points are still present. Better yet, many of the library entry points are C++ methods, so the mangled names encode the argument profile. Initially I had a hard time setting breakpoints on the C++ methods in gdb because I was trying to use the mangled name, but then I discovered that once the library is loaded, you can give a partial command like

break 'Class::Method

then hit tab and get autocompletion. The single quote is important. If there are multiple matches, either because you didn’t type the full method name, or because there are multiple methods of the same name but with different argument profiles, it will give a list of the possible choices.

I wanted to start debugging only after a specific file has been opened, but many other files are opened first. It took me a little while to come up with a suitable conditional breakpoint command for the open() system call:

br open if strcmp(*((char**)($rsp+48)),"/lib64/libm.so.6")==0

The 48 is the offset in the stack frame of the path pointer, and is probably different on other architectures.

I haven’t yet figured out much detail of the data structures used by the library, but I found which library function does the specific transformation in which I’m interested. The arguments don’t seem to point directly to the data I care about, so they probably point to objects that in turn point to the data. Eventually I’ll have to puzzle it out, but for now I’ve found a way to identify the areas of memory which the function alters. (There may be an easier way.)

I break on entry to the function, then use gdb’s gcore command to save a core dump. Then I use the finish command, which will run the function and break again when it returns. I do a second gcore, create hex dumps of the two core dumps using objdump -s, and diff the two dumps.
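The objdump and diff steps could also be replaced by a small script that compares the two core files byte for byte. This is a rough sketch, not what I actually ran: the function name is my own, it assumes both dumps have the same layout, and the offsets it reports are positions in the core file, not virtual addresses (mapping them back would require reading the ELF program headers).

```python
# Sketch: report the byte ranges that differ between two core dumps.
# Assumes both files have identical layout; offsets are file offsets.
import sys


def changed_ranges(before: bytes, after: bytes):
    """Return a list of (offset, length) runs where the two buffers differ."""
    limit = min(len(before), len(after))
    ranges = []
    start = None
    for i in range(limit):
        if before[i] != after[i]:
            if start is None:
                start = i            # a new run of differences begins
        elif start is not None:
            ranges.append((start, i - start))
            start = None
    if start is not None:            # a run that extends to the end
        ranges.append((start, limit - start))
    return ranges


if __name__ == "__main__" and len(sys.argv) == 3:
    with open(sys.argv[1], "rb") as f1, open(sys.argv[2], "rb") as f2:
        for offset, length in changed_ranges(f1.read(), f2.read()):
            print(f"{offset:#010x}  {length} bytes")
```

Run with the two core file names as arguments, it prints one line per changed region. A pure Python loop will be slow on large dumps, but it avoids producing and diffing two enormous hex listings.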

It would be really nice to have a gdb command to search memory. I haven’t yet found such a command, though the idea was discussed on a gdb mailing list back in 2001.
