Wednesday, January 19, 2022

Using the PDP11 T-Bit trap

  Recently, I've been writing a lot of PDP-11/73 code that runs with no OS. After a little practice, I've gotten the hang of it, more or less, and once you're used to it, it's really no harder than writing code for RSX. In some ways, it's easier... Device drivers? Don't need 'em! Rules? There are no rules!

  Where it suffers in comparison is in the debugging tools available. The machine itself provides Micro-ODT, a limited subset of RSX ODT - basically, you can read and change values by address or register number.

  So that leads to the old classic technique of the diagnostic print - adding code to the program that prints out values as it executes, as you look for where a problem is occurring. And that works pretty well, although it can be a bit of an effort, and time-consuming as well.

  On one of these projects, I ran into a problem where an RTS instruction was being overwritten by...something (this programming environment doesn't support read-only PSECTs for code). After attacking with diagnostic prints and Micro-ODT, I wasn't getting anywhere - I couldn't home in on exactly when it happened, or what the culprit was. There were thousands of lines of code, and some of the code was interrupt driven, so things weren't happening in a simple linear fashion.

  It occurred to me that I could use the T-bit trap to determine exactly what instruction was overwriting the RTS.

  The T-bit is a bit in the PDP-11 processor status word (bit 4, mask value 16. decimal, or 20 octal). When set, it causes a trap through vector location 14 (octal) after each instruction executes. You can load this trap vector to point to a routine, which does...whatever it needs to do. It can then return by doing an RTT (Return from Trap) instruction. Basically. There are some more fine points to it, but for this project, that's it in a nutshell.

  So, I coded up a routine to check the value of the mauled RTS instruction after each instruction. When the routine detects that the RTS has been changed, it halts the machine, to allow for further inspection via Micro-ODT. Additionally, each time it is called, it copies the return PC from the stack into a simplified circular list. When the T-bit is set, this code will execute after every instruction. It will give me a log of where it's been, and halt right after the instruction that mauls the RTS instruction. Should take me right to the scene of the crime, nicht wahr? Here's the code.

  Lessee, a few notes about the routine first...the PC list is just a block of words. The first word is used as the address of the next free entry in the list. When the list is full, that pointer is set back to the beginning of the list - thus it's more or less "circular". Label ZAZEN is the address of the RTS instruction that is getting clobbered. At TTRAPX, we check to see if it is still 207 - the octal value of an RTS PC instruction. If so, proceed. If not, it's like Jim Morrison said... "WAIT! There's been a slaughter here!" Halt and figure out what just happened. Oh, and push is a macro that translates to mov    arg,-(sp), and pop translates to mov    (sp)+,arg.
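
  For anyone following along without those macros handy, here is a minimal sketch of how push and pop could be defined in MACRO-11. It is reconstructed from the one-line description above, so treat it as an assumption rather than the exact definitions used in this project.

```macro11
        .macro  push    arg
        mov     arg,-(sp)        ;pre-decrement SP, then store arg
        .endm   push

        .macro  pop     arg
        mov     (sp)+,arg        ;fetch top of stack, then bump SP
        .endm   pop
```

  With these in place, push #60 expands to mov #60,-(sp), and pop r0 expands to mov (sp)+,r0.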


ttrap:
        push    r0               ;save r0
        mov     ttab,r0          ;address of next free entry to r0
        mov     2(sp),(r0)+      ;save return PC in that entry
        mov     r0,ttab          ;update next free entry pointer
        cmp     #ttab+2000.,r0   ;are we at the end of the table?
        bge     ttrapx           ;if not, pass on
        mov     #ttab+2,ttab     ;if so, reset to top of table

ttrapx: cmp     zazen,#207       ;is the RTS instruction still there?
        bne     ttrapy           ;if not, halt
        pop     r0               ;restore r0
        rtt                      ;get back and execute the next
                                 ;instruction

ttrapy: halt                     ;something overwrote the RTS!
                                 ;break out the .LST file and
                                 ;get to looking!

ttab:   .word   ttab+2           ;stores addr of next free wd in ttab
        .blkw   1024.            ;1000 words, plus some slop at end
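
  Reading the log back is a matter of stepping backward from the pointer held in the first word of ttab. As a rough sketch, fetching the most recently logged PC could look like the routine below. The label lastpc is made up for illustration, and the routine assumes at least one PC has been logged (otherwise it just returns whatever is in the last slot).

```macro11
lastpc: mov     ttab,r0          ;address of next free entry
        cmp     r0,#ttab+2       ;did the pointer just wrap to the top?
        bne     1$               ;no - newest entry is one word back
        mov     #ttab+2002.,r0   ;yes - newest entry is the last slot,
                                 ;at ttab+2000.
1$:     mov     -(r0),r0         ;step back one word and fetch that PC
        rts     pc               ;newest logged PC is returned in r0
```

  The same arithmetic works by hand from Micro-ODT after the halt: open ttab, take the address it holds, and examine the words just below that address to walk backward through the trace.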


  OK, now all I gotta do is load the vector, set the T-bit, run the program, and open the champagne. Warmed by the mellow glow of my own cleverness, I proceeded to do just that. I added this at the top of my program.

   mov     #ttrap,@#14   ;load t-bit vector with a debug routine
   mov     #340,@#16     ;load a regular sort of PSW in the vector
                         ;priority 7 - don't wanna get interrupted

   psw = 177776          ;16 bit address of the PSW
                         ;we're using 16 bit addresses for this
                         ;project - it's a utility, not an OS.

   tbit = 16.            ;bit 4 is the T-bit, 16. as a mask value
   bis     #tbit,@#psw   ;this is an 11/73, so I can directly access
                         ;the psw, or use MTPS.

  But - it didn't work. I tried it again, and displayed the PSW after I set it. It didn't "take" - the bit was not set. I tried it again using the MTPS (Move To Processor Status word) instruction - but no soap there either.
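
  A quick way to confirm whether the bit "took" is to read the PSW back and test it. This is only a sketch, assuming the psw and tbit symbols defined above; the labels tclr and tset are hypothetical.

```macro11
        mov     @#psw,r0         ;read the PSW back
        bit     #tbit,r0         ;is the T-bit on?
        bne     tset             ;yes - tracing is active
tclr:   halt                     ;no - the write didn't take; r0 holds
                                 ;the PSW for a look via Micro-ODT
tset:                            ;carry on with the program from here
```

  After a direct write like the one above, this lands on the halt, with the T-bit position in r0 still clear.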

  A little research found a footnote in one of the PDP-11 processor handbooks that pointed out that you can't set the T-bit like the rest of the bits in the PSW - you have to create a PC/PSW pair of words on the stack, with the bit set in the PSW word, then let it take effect when that word gets loaded into the PSW by an RTT instruction. OK, then, easy enough to do - I pushed a new PSW and a return address on the stack, and then did an RTT instruction to make it happen.


   push    #60       ;push a PSW value with T-bit set, priority 1,
                     ;on the stack
   push    #zzz      ;push the address of the rest o' the program
   rtt               ;then return from trap, setting the T-bit and
                     ;resuming execution

zzz:    (rest of program goes here)


  And that set the T-bit just fine.
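
  The same trick should work in reverse for switching tracing off again. Inside the trap routine, the interrupted program's PSW sits on the stack just above the return PC, so clearing bit 4 there before the RTT means the PSW comes back with the T-bit off. A sketch, assuming it runs in ttrap after r0 has been popped (so the PC is at (sp) and the PSW at 2(sp)) and that tbit is defined as above; the label ttroff is hypothetical.

```macro11
ttroff: bic     #tbit,2(sp)      ;clear the T-bit in the saved PSW
        rtt                      ;return with tracing switched off
```

  That would let the trap routine stop tracing on its own once some condition is met, instead of halting.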

  At that point, I got a trap to my routine after every instruction. When the RTS instruction was overwritten, the machine halted, and I used Micro-ODT to have a look at the PC log buffer, to see what instruction overwrote it, and how it got there. Only...the log showed that the instruction that did it was part of a timing loop that merely incremented a register and tested its value. It didn't write memory at all, much less write to where the RTS instruction was. The instructions pointed to by the PCs stored in the list leading up to it were all the same: more iterations of the timing loop. There was no way that this caused the overwrite. Yet, here it was. A little more thinking made me realize that if the CPU hadn't overwritten the instruction, then it had to have been DMA from a controller card. I suspect that the RQDX3 was doing the overwrite - the RTS instruction was located adjacent to an RQDX3 buffer, and the RQDX3 accesses host buffers via DMA.

  I fiddled with the RQDX3 code for a bit, and couldn't figure out why it was machine gunning DMA where it wasn't welcome. So, for a solution, I put a couple of words between the RTS instruction and the RQDX3 buffer, and the overwrite landed on those words instead of the RTS instruction. Not sure exactly why the RQDX3 is writing a word before its buffer starts, but there's plenty I don't know about how the RQDX3 works. Maybe it's alignment related. But it's results that I'm interested in on this project... The spice must flow. All's well that ends - and the T-bit trap gave me the clue I needed to figure it out.

