Tuesday, February 17, 2026

Tracing execution of a program without sources on RSX11M-Plus

   Recently, a colleague, Mark Matlock, has been working to correct some issues in a  program that runs under RSX11M-Plus. No sources are available for that program, which presents some challenges.

  If this were being done under VAX/VMS, it would be trivial. There, you can RUN/DEBUG any program, whether it was linked /DEBUG or not, and get control of it, set breakpoints, examine data and instructions, and modify both as needed. Once you've done that and seen what needs to be changed, the PATCH utility lets you use the information gathered via DEBUG to easily make permanent changes to the executable image.

  But, if "ifs" and "buts" were candy and nuts, we'd all have a happy Christmas. This program runs under RSX, where the tools are not so rich as in VAX/VMS. 

  The RSX family has a middlin' selection of debugging tools available. Of the ones I've used, there are the good (PDP-11 Debug and Everhart's DDT22) and the not so good (ODT, aka Odious Debugging Tool - hope you have up to date printouts of your program's listing, map files, and an HP16 calculator handy). But all of them require task building the tools into the executable task image. That's not gonna happen here, since there are no source or object files available.

  I gave this some thought. There is a bit in the Processor Status Word (henceforth to be referred to as the PSW) called the T-bit. When set, it causes the processor to trap through vector 14 after each instruction is executed. A while back I'd written some code that runs in a no OS environment, that used the T-bit trap to interrupt execution after every instruction and copy the instruction to a circular buffer (detailed in my blog entry "Using the PDP11 T-bit Trap"). Something like that would be handy here, to see where the problem spots were and what was being executed when they occurred.

 But that code ran on a machine with no OS (Rules? There are no rules!).

 This time around, we have to follow the RSX rules to get the work done.

  So, first, we need someplace to write the address and instruction values as each trap occurs. The easiest place to store data while running in a system sort of mode (like via the trap vector) is Pool, the Dynamic Storage Region, aka primary pool. It's a relatively scarce resource, but not as scarce as it used to be since secondary pool and directive regions came along. We can allocate around a couple thousand bytes, on most systems, by calling kernel routine $alocb in a privileged task.

         mov     #poosiz,r1      ;size of data block
         call    $alocb          ;allocate it from pool
         mov     r0,pooadr       ;store its address in pooadr

  So that's where the data will be written. OK, so now we need a place to put the code that is pointed to by vector 14, and will be run whenever the task executes an instruction. My first impulse was to also load that into Pool, since that's the way we did things way, way back in the day, when if you said RSX, you were talking about RSX11D. But way back then was before Instruction and Data space separation was implemented in RSX. I&D space means that your code uses a different mapping of 16 bit addresses to 18 or 22 bit physical memory, depending on whether it is executing code or accessing data. It's a great feature that can double the amount of space available to your task...if it's got the right mix of data and code.

  That means, for instance, that the two references to 4000 in the code below don't necessarily reach the same physical memory, even though both use 16 bit virtual address 4000.

4000:    CLR     R0
         MOV     #240,4000

  The PDP11 can tell by the kind of access whether it's an I space reference (the fetching of the instruction in the first line) or a D space reference (the second argument of the second line).

 There are 3 sets of memory mapping registers on the larger and newer PDP11s. They're called APRs - Active Page Registers. The three sets are for the 3 processor modes - Kernel, Supervisor and User modes. Each mode has 16 APRs. Each set of 16 APRs is divided into 2 sets, one for mapping instructions and one for mapping data - I&D, ya dig?  Each APR maps up to 8 KB of 16 bit space to up to 8 KB of physical memory. So, in each mode, 64KB of instructions and 64 KB of data can be addressed at a time.
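To make the APR arithmetic concrete, here is a small Python model of how a 16 bit virtual address selects one of the 8 page registers in a space and gets relocated. It is only a sketch: the page address values in `i_pars` and `d_pars` are made up for illustration and are not what RSX actually loads. It shows the same virtual address 4000 landing in two different physical locations depending on whether the I or D registers are used.

```python
PAGE_BITS = 13                 # 2**13 = 8192 bytes = 8 KB per page
PAGES_PER_SPACE = 8            # 8 APRs for I space, 8 for D space, per mode

def split_virtual(va):
    """Return (apr_index, offset_in_page) for a 16 bit virtual address."""
    if not 0 <= va <= 0o177777:
        raise ValueError("not a 16 bit address")
    return va >> PAGE_BITS, va & ((1 << PAGE_BITS) - 1)

def physical(va, pars):
    """Relocate va through a list of 8 page address values.

    Each page address value is in units of 64 bytes (32 words)."""
    apr, offset = split_virtual(va)
    return pars[apr] * 64 + offset

# Hypothetical page address values, chosen only for this example.
i_pars = [0o1000] + [0] * 7    # I space page 0 starts at physical 100000
d_pars = [0o2000] + [0] * 7    # D space page 0 starts at physical 200000

va = 0o4000
print(f"I space: {va:06o} -> {physical(va, i_pars):08o}")
print(f"D space: {va:06o} -> {physical(va, d_pars):08o}")
print("bytes addressable per space:", PAGES_PER_SPACE * (1 << PAGE_BITS))
```

When the I and D registers for a page hold the same value, as described below for low core, both kinds of access reach the same physical memory.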

When you are executing via the T-bit trap, your code is using  RSX's kernel mode  APRs. The RSX kernel maps primary pool using Data space APRs - so if you copy code into primary pool and try to execute it from there, it will actually look somewhere else for it - wherever the Kernel I space APRs are pointing. But, there's an easy alternative. There is a region RSX maintains called ICB pool. It is used for Interrupt Control Blocks, which contain some code along with some data. Since they have to be accessible both as code and data, RSX puts them into low core, where the kernel has set the I and D APRs to both point to the same physical memory locations. The upshot of that is, I can load code into ICB pool via MOV instructions, which use the Data  APRs, and then  trap to the same location and execute that code there, which uses Instruction APRs.

  ICB pool is also a limited resource, like primary Pool, but we only need a handful of bytes of code to get the PC and the value it points to written to primary pool at each trap. ICB pool is allocated thusly:

        mov     $icavl,r0       ;icb pool listhead to r0
        sub     #2,r0           ;subtract 2, since that's how it's done
        mov     #codsiz,r1      ;size arg goes into r1
        call    $aloc1          ;call general pool allocator
        mov     r0,codadr       ;store the address           


  Shucks, we're getting into TL;DR territory and haven't really talked much about the code yet. I'm thinking of changing the name of this blog to TL;DR.

  Alright, alright alright...I wrote a program that allocates some Pool and ICB pool. Logical names are created (POOADR and ICBADR) that contain the addresses of the two allocated blocks, so you can see what they were for debugging, and so that the other programs involved in this project will know where they are.

TRACSPACE.MAC

TRACSPACE.CMD

  Next comes the program that prepares the Pool area as a circular buffer, loads into ICB pool the routine that saves PC addresses and instructions into that buffer, and sets vector 14 to point to that routine. It's a separate program from TRACSPACE so that you can rerun it to clear out the data space and reload new tracing code without reallocating Pool. Reallocate Pool too many times and you'll run out - and that's bad.

TRACLOAD.MAC

TRACLOAD.CMD
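The circular buffer idea is easy to model outside of RSX. The Python below is a sketch of the concept only: the class name, the entry layout (one PC and one instruction word per entry) and the `enabled` flag are assumptions made for illustration, not the actual layout TRACLOAD sets up in Pool.

```python
class TraceBuffer:
    """Toy model of a circular trace buffer (hypothetical layout)."""

    def __init__(self, capacity):
        self.entries = [None] * capacity
        self.index = 0          # next slot to be written
        self.count = 0          # number of valid entries so far
        self.enabled = False    # tracing starts out OFF

    def record(self, pc, instruction):
        """What the trap routine would do after each traced instruction."""
        if not self.enabled:
            return
        self.entries[self.index] = (pc, instruction)
        self.index = (self.index + 1) % len(self.entries)
        self.count = min(self.count + 1, len(self.entries))

    def oldest_first(self):
        """Return the saved entries from oldest to most recent."""
        size = len(self.entries)
        start = (self.index - self.count) % size
        return [self.entries[(start + i) % size] for i in range(self.count)]


buf = TraceBuffer(capacity=3)
buf.enabled = True
for pc, ins in [(0o1000, 0o012701), (0o1004, 0o005000),
                (0o1006, 0o000240), (0o1010, 0o000207)]:
    buf.record(pc, ins)

for pc, ins in buf.oldest_first():
    print(f"{pc:06o} {ins:06o}")
```

With room for three entries and four instructions recorded, the first one is overwritten and only the last three are printed, which is the whole point of a circular buffer: you always have the most recent history, however long the program has been running.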

  It would be good to be able to stop and start tracing - you only have so much room in the circular buffer, so it's useful to be able to turn tracing off and on as needed. TRACON allows you to stop, start and check on the status of tracing. It also allows you to specify a number of instructions to delay before starting to trace. The default state of tracing is OFF, figuring that more often than not, you need to get the program being traced to some known state before you start.

TRACON.MAC

TRACON.CMD

  These three programs get the system loaded and ready to start tracing. Now all I need is to set the T-bit in the PSW of the program I want to trace.

  I could have written a program that finds the task in memory and modifies its PSW. But that sounded like a lot of work. Instead, I'd like for the program to start with the T-bit already set.

  I had a look at where a typical task gets its initial setting of the PSW. It turns out that there is a word in the task header (the header is a part of the task's file on disk, that gets loaded into memory when a task is installed, loaded and run) that is documented as the initial value of the PSW, at offset H.IPS in the header. That sounded good - I thought to myself, "almost done". All I had to do was modify the task image so that word has the T-bit set, and it would wind up set and trapping when it runs. I wrote TRACTASK.MAC, a program that opens a task file, locates that word, and sets or clears the T-bit, bit 4. Most tasks have the value 0xF00F (octal 170017) in that field (some have 0xF80F - not sure why - that 8 would indicate use of the alternate register set on some PDP11 models - but AFAIK, no DEC OS uses the alternate register set. Anyway, that's not what we're here for now...). After it's set, it is 0xF01F (or 0xF81F, depending). Here's the program that makes that change in a task:

TRACTASK.MAC

TRACTASK.CMD
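For anyone who wants to check the bit arithmetic on the initial PSW values mentioned above, here is a short Python sketch. It only models the 16 bit word; the function names are mine, and it has nothing to do with how TRACTASK actually locates the word in the task file.

```python
T_BIT = 0o20   # bit 4 of the PSW, the trace trap bit

def toggle_t_bit(psw):
    """Flip the T-bit in a 16 bit PSW value."""
    return (psw ^ T_BIT) & 0o177777

def psw_fields(psw):
    """Break a PSW value into its fields."""
    return {
        "current_mode": (psw >> 14) & 3,    # 3 = user mode
        "previous_mode": (psw >> 12) & 3,
        "register_set": (psw >> 11) & 1,    # the "8" in 0xF80F
        "priority": (psw >> 5) & 7,
        "t_bit": (psw >> 4) & 1,
        "nzvc": psw & 0o17,                 # condition codes
    }

for psw in (0xF00F, 0xF80F):
    traced = toggle_t_bit(psw)
    print(f"{psw:06o} ({psw:#06x}) -> {traced:06o} ({traced:#06x})")
    print("   ", psw_fields(traced))
```

Running the toggle twice gets you back to the original value, which matches TRACTASK's behavior of flipping the bit each time it is run on a task.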

   I INStalled the modified task and ran it. Then I checked the circular buffer in pool. Nothing. Well, that was a disappointment. I had thought I was done. Even stranger, I dumped the task file and had a look at the header to make sure the T-bit was set. It wasn't. I assumed I had just mistakenly not set it, or there was a problem with my setting program. I set it again, and dumped it out to make sure the T-bit was set. It was. I re-installed it and ran it again. Still no results in the Pool buffer. Even crazier...I dumped the task file out again. The T-bit in the header was now clear again. I repeated this test a few times, and concluded that the act of INStalling a task causes (at least) that bit in the header to be rewritten in the task file - who knows what else. I'd never heard that INStalling a task can cause it to be modified, but there you have it.

  I had a look at the sources for INS.TSK. I found a line of code that overwrites the value of the PSW in the header in memory when the task is being installed.

        MOV     #170017,H.IPS(R4)  

   And further on, there's some complicated logic that writes the header back to the task file. Crazy, nicht wahr?

  Anyway, not a real problem. I copied INS.TSK to INZ.TSK. Then, in the copy, I  ZAPped the 3 words  of that instruction to octal 240 (the NOP instruction, which does nothing at all, which is one of my favorite things). I installed INZ.TSK with a different task name. I created a new task to do all this, because if I messed up patching INS.TSK....how was I gonna INStall the original unpatched one?

>INS INZ.TSK/TASK=...INZ

  and then used it to install a test program to be traced

>INZ TEST.TSK

  Then I ran the modified program again, and checked the pool buffer for traces. Still nothing. At this point, this was going like most of my internals projects go.

 Internals Projects Life Cycle 

But, I'm too stupid to know when to quit, so I GREPed every .MAC source file on the system for H.IPS, figuring that RSX was trapping it somewhere else as well as in INS.

  And I found it....in three separate places in LOADR.MAC (the curiously named source file for ...LDR).

  Two instances of 

       CMPB    #17,140000+H.IPS ;SUCCESSFUL READ ?

  and one of

        CMPB    #17,H.IPS(R0)   ;VALID TASK IMAGE?

  17 octal is bits N, Z, V and C set...and T bit clear. In all three of these places, if the CMPB is not equal, they branch off somewhere that fails the load of the task. Geez, DEC sure went to a lot of trouble to avoid letting people load a task that has the T-bit already set. In any case, copying LDR.TSK to another file with a different name and trying to install it and use it, like I did with INS.TSK, wasn't going to work here. LDR is special - it doesn't get loaded and run like a regular task (an old Zen koan for RSX - "what loads ...LDR?"). It's INStalled and FIXed by VMR during a SYSGEN. (Little known FIX fact - after a task is INStalled and FIXed, you can delete its .TSK file and it will still run fine.) I could patch the LDR.TSK file and then VMR it into a new system...but, if something goes wrong I wind up with an unusable system until I SYSGEN a new one. Not a good outcome...
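Here is what that byte compare amounts to, modeled in Python. This is a sketch of the test only, under the assumption that the CMPB looks at the low byte of the header word (the byte at the even address on a PDP11); the function name is made up.

```python
def passes_loader_check(initial_psw):
    """Model of CMPB #17,H.IPS: compare the low byte of the word with 017."""
    return (initial_psw & 0o377) == 0o17

for psw in (0o170017, 0o174017, 0o170037, 0o174037):
    verdict = "loads" if passes_loader_check(psw) else "rejected"
    print(f"{psw:06o}: {verdict}")
```

The register set bit lives in the high byte, so 0xF80F (174017) gets through the check just like 0xF00F does. The T-bit is in the low byte, so any value with it set fails the compare.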

  Instead, I decided to patch ...LDR in memory. It's FIXed, so it's not going to move around or reload from its disk image, so in memory changes will stick until a reboot. I could have used OPE to patch it, but that would require using specific addresses. Plus OPE is a pain in the sitz-platz to use in a command file. Instead, I wrote LDRFIX, a program that searches for the three places where ...LDR monkeys with the initial PSW, and NOPs them out. Searching for these instances gives this a tiny chance of working on some other version of RSX. Doing it as an in memory patch means that if something goes wrong, a reboot erases those changes and gets you back to normal - without a big SYSGEN or VMR repair effort.

LDRFIX.MAC

LDRFIX.CMD
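The search-and-NOP approach can be sketched in a few lines of Python. This is a model of the technique, not of LDRFIX itself: the function name and the sample image are made up, and since the word encodings of the LDR instructions aren't shown in this post, the demo pattern reuses the three INS words from the ZAP session further down purely as sample data.

```python
NOP = 0o240

def nop_out(words, pattern):
    """Overwrite every occurrence of pattern in words with NOPs.

    words is a list of 16 bit values, modified in place.
    Returns the number of occurrences patched."""
    hits = 0
    i = 0
    while i + len(pattern) <= len(words):
        if words[i:i + len(pattern)] == pattern:
            words[i:i + len(pattern)] = [NOP] * len(pattern)
            hits += 1
            i += len(pattern)
        else:
            i += 1
    return hits

# Made-up image: one word before and one after the target instruction.
image = [0o016401, 0o012764, 0o170037, 0o000014, 0o016401]
pattern = [0o012764, 0o170037, 0o000014]

patched = nop_out(image, pattern)
print(patched, [f"{w:06o}" for w in image])
```

Returning the hit count matters. A patcher that expects three matches and finds some other number is probably looking at a different version of the code, and the safe move at that point is to change nothing.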

  OK, now I run the program I wanted to trace (so long ago in this story), and, success! PCs and instructions get recorded for every instruction executed.

  It's a tortured, complicated path we took to get here, so let me try to sum it up a little bit.

Have a look at your free Pool space and ICB pool space. Use RMD.

If needed, modify the definition of POOSIZ in TRACSPACE, TRACLOAD and TRACREAD. Make sure that it's not too big for the space available, and is the same in all 3 programs. Check the value of ICBSIZ in TRACSPACE, and make sure it's not too big. Assemble and link them, and TRACON and LDRFIX as well.

@TRACSPACE
@TRACLOAD
@TRACREAD
@TRACTASK
@TRACON
@LDRFIX

  TRACSPACE, TRACLOAD, and LDRFIX need to be run once after each system reboot. Task INZ.TSK needs to be created by zapping INS.TSK, just once. It needs to be INStalled once per boot. See below for details.

  TRACLOAD can be rerun later if desired,  to clear out the Pool buffer and reload the vector code.

  TRACTASK needs to be run once on a task you want to trace, to set its initial PSW. Be aware, that if you forget or have some reason to INStall it using INS instead of INZ, the T-bit will be cleared by INS, and will have to be reset again with TRACTASK.

  TRACON needs to be run to control the state of tracing. Since the default state is OFF, you'll have to run it to set the state to ON. It needs to be done once per boot.

TRACREAD can be run to have a look at the contents of the Pool buffer. Or you can just use OPE /KNLD to view the buffer, using the value of logical POOADR (sho log pooadr/all).

RUN TRACSPACE to create blocks in Pool and ICB pool.

RUN TRACLOAD to load the trace code and point  the T-bit vector at it, and prepare Pool for the circular buffer.

RUN LDRFIX to patch out the T-bit tests in ...LDR

RUN TRACTASK to toggle the T-bit in the header of the task to be traced. It toggles the bit on and off each time you run it on a task.

Copy [1,54]INS.TSK to INZ.TSK. Use ZAP in absolute mode to deposit 3 words of 240 at location 14706. Zap's a funny little utility - you can't enter the filename on the command line, gotta just enter ZAP, then give the filename and /AB switch to the ZAP> prompt. Anyway....

>ZAP
ZAP>inz.tsk/ab
_14706/
000:014706/ 012764
_240
_
000:014710/ 170037
_240
_
000:014712/ 000014
_240
_
000:014714/ 016401
_X
>

  Once per boot, Install INZ.TSK as task ...INZ

>INS INZ/TASK=...INZ
>

Then, once per boot, use INZ to INZtall the task whose T-bit you toggled a few steps ago

>INZ PROGRAM.TSK

If you want tracing to start immediately, use TRACON to set tracing on. TRACLOAD sets tracing off when it prepares the trace buffer, because you usually don't want to trace from the very beginning of a program. You can also set tracing to start after a specified number of instructions. Here, we set it on from the start.

RUN TRACON
TRACON Version V01A01 - 12-JAN-2026 - Control-G Consultants
Input and output radix is Octal.
HELP command will print a summary of confusing commands.
TRC>
help
Start   - Sets number of instructions to skip before tracing starts
On      - Begin or resume tracing
Off      - Stop tracing
Check - Print current value of tracing control word
Exit     - Exit the program
Quit    - Exit the program
Exit and Quit do the same thing
Check value, high bit set indicates tracing is off, clear idicates on
I call this "Yes, we have no bananas" logic.
TRC>ON
ON
>

  And now, finally, you can run the program in question and accumulate some traces. If you're still reading this now, after all of this persiflage, you must really want to trace some program's execution....

>RUN PROGRAM

  You can read the trace buffer with OPE, using the address of the pool block contained in the logical pooadr (>sho log pooadr/all), which was also printed out when you ran TRACSPACE. Or you can dump it out using TRACREAD.

TRACREAD.MAC

TRACREAD.CMD


TRACREAD will start at the oldest entry and print them in order up to the most recent.

>RUN TRACREAD

TRACREAD Version V01A01 - 14-FEB-2026 - Control-G Consultants

Pool block address is
055750
Circular index value
000010

Control Word
100000

012701
000006
007073
000010
000011

...und so weiter. 


  Time for a disclaimer. These programs change mode to kernel, allocate scarce critical resources, change APRs to map all over the place, and modify things in important programs. I can pretty much guarantee there are bugs in them. There's plenty that can go wrong in this scenario. Use at your own risk. Let me know if you find any problems. Heck, let me know if you ever use any of this stuff, or even read this far.

