Wednesday, January 19, 2022

Using the PDP11 T-Bit trap

   Recently, I've been writing a lot of PDP-11/73 code that runs with no OS. After a little practice, I've gotten the hang of it, more or less, and after you're used to it, it's really no harder than writing code for RSX. In some ways, it's easier...Device drivers? Don't need em! Rules? There are no rules!

  Where it suffers in comparison is the debugging tools available. The machine itself provides Micro-ODT, a limited subset of RSX ODT - basically you can read and change values by address or register number.

  So that leads to the old classic technique of diagnostic print - adding code to the program that prints out values as it executes, as you look for where a problem is occurring. And that works pretty well, although it can be a bit of an effort, and time consuming as well.

  On one of these projects, I ran into a problem where an RTS instruction was being overwritten by...something (this programming environment doesn't support read only PSECTS for code). After attacking with diagnostic prints and Micro-ODT, I wasn't getting anywhere - I couldn't home in on exactly when it happened, or what was the culprit. There were thousands of lines of code, and some of the code was interrupt driven, so things weren't happening in a simple linear fashion.

  It occurred to me that I could use the T-bit trap to determine exactly what instruction was overwriting the RTS.

  The T-bit is a bit in the PDP11 processor status word (bit 4, mask value 16., or 20 octal). When set, it causes a trap via vector location 14 after each instruction executes. You can load this trap vector to point to a routine, which does...whatever it needs to do. It can then return by doing an RTT (Return from Trap) instruction. Basically. There are some more fine points to it, but for this project, that's it in a nutshell.

  So, I coded up a routine to check the value of the mauled RTS instruction after each instruction. When the routine detects that the RTS has been changed, it halts the machine, to allow for further inspection via Micro-ODT. Additionally, each time it is called, it copies the return PC from the stack into a simplified circular list. When the T-bit is set, this code will execute after every instruction. It will give me a log of where it's been, and halt right after the instruction that mauls the RTS instruction. Should take me right to the scene of the crime, nicht wahr? Here's the code.

  Lessee, a few notes about the routine first...the PC list is just a block of words. The first word is used as the address of the next free entry in the list. When the list is full, that pointer is set back to the beginning of the list - thus it's more or less "circular". Label ZAZEN is the address of the RTS instruction that is getting clobbered. At TTRAPX, we check to see if it is still 207 - the octal value of an RTS PC instruction. If so, proceed. If not, it's like Jim Morrison said... "WAIT! There's been a slaughter here!" Halt and figure out what just happened. Oh, and push is a macro that translates to mov    arg,-(sp), and pop translates to mov    (sp)+,arg.


ttrap:
        push    r0               ;save r0
        mov     ttab,r0          ;address of next free entry to r0
        mov     2(sp),(r0)+      ;save return PC in that entry
        mov     r0,ttab          ;update next free entry pointer
        cmp     #ttab+2000.,r0   ;are we at the end of the table?
        bge     ttrapx           ;if not, pass on
        mov     #ttab+2,ttab     ;if so, reset to top of table

ttrapx: cmp     zazen,#207       ;is the RTS instruction still there?
        bne     ttrapy           ;if not, halt
        pop     r0               ;restore r0
        rtt                      ;get back and execute the next
                                 ;instruction

ttrapy:  halt                    ;something overwrote the RTS!
                                 ;break out the .LST file and
                                 ;get to looking!

ttab:   .word   ttab+2           ;stores addr of next free wd in ttab
        .blkw   1024.            ;1000 words, plus some slop at end


  OK, now all I gotta do is load the vector, set the T-bit, run the program, and open the champagne. Warmed by the mellow glow of my own cleverness, I proceeded to do just that. I added this at the top of my program.

   mov     #ttrap,@#14   ;load t-bit vector with a debug routine
   mov     #340,@#16     ;load a regular sort of PSW in the vector
                         ;priority 7 - don't wanna get interrupted

   psw = 177776          ;16 bit address of the PSW
                         ;we're using 16 bit addresses for this
                         ;project - it's a utility, not an OS.

   tbit = 16.            ;bit 4 is the T-bit, 16. as a mask value
   bis     #tbit,@#psw   ;this is an 11/73, so I can directly access
                         ;the psw, or use MTPS.

   But - it didn't work. I tried it again, and displayed the PSW after I set it. It didn't "take" - the bit was not set. I tried it again using the MTPS (Move To Processor Status word) instruction - but no soap there either.

  A little research found a footnote in one of the PDP11 processor handbooks that pointed out that you can't set the T-bit like the rest of the bits in the PSW - you have to create a PC/PSW pair of words on the stack, with the bit set in the PSW word, then let it take effect when that word gets put into the PSW during an RTT instruction. OK, then, easy enough to do - I pushed a new PSW and a return address on the stack, and then did an RTT instruction to make it happen.


   push    #60       ;push a PSW value with T-bit set, priority 1,
                     ;on the stack
   push    #zzz      ;push the address of the rest o' the program
   rtt               ;then return from trap, setting the T-bit and
                     ;resuming execution

zzz:    (rest of program goes here)


  And that set the T-bit just fine.
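
  The two PSW values used so far (340 in the trap vector, 60 pushed for the RTT) are easy to misread in octal, so here's a little Python helper that pulls them apart. It's just a sketch - the function names are mine, and it only looks at the priority field (bits 7:5) and the T-bit (bit 4).

```python
T_BIT = 0o20          # bit 4 of the PSW, 16. decimal
PRIORITY_SHIFT = 5    # priority lives in bits 7:5


def decode_psw(psw):
    """Return (priority, t_bit_set) for a PDP-11 PSW value."""
    priority = (psw >> PRIORITY_SHIFT) & 0o7
    return priority, bool(psw & T_BIT)


def make_psw(priority, t_bit):
    """Build a PSW value from a priority (0-7) and a T-bit flag."""
    return (priority << PRIORITY_SHIFT) | (T_BIT if t_bit else 0)


print(decode_psw(0o340))   # the value loaded into the trap vector
print(decode_psw(0o60))    # the value pushed before the RTT
```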

   At that point, I got a trap to my routine after every instruction. When the RTS instruction was overwritten, the machine halted, and I used Micro-ODT to have a look at the PC log buffer, to see what instruction overwrote it, and how it got there. Only...the log showed that the instruction that did it was part of a timing loop that merely incremented a register and tested its value. It didn't write memory at all, much less write to where the RTS instruction was. The instructions pointed to by the PCs stored in the list leading up to it were all the same, more iterations of the timing loop. There was no way that this caused the overwrite. Yet, here it was. A little more thinking made me realize that if the CPU hadn't overwritten the instruction, then it had to have been DMA from a controller card. I suspect that the RQDX3 was doing the overwrite - the RTS instruction was located adjacent to an RQDX3 buffer, and the RQDX3 accesses host buffers via DMA.

 I fiddled with the RQDX3 code for a bit, and couldn't figure out why it was machine gunning DMA where it wasn't welcome. So, for a solution, I put a couple of words between the RTS instruction and the RQDX3 buffer, and the overwrite landed on those words instead of the RTS instruction. Not sure exactly why the RQDX3 is writing a word before its buffer starts, but, there's plenty I don't know about how the RQDX3 works. Maybe it's alignment related. But, it's results that I'm interested in on this project... The spice must flow. All's well that ends - and the T-bit trap gave me the clue I needed to figure it out.


Thursday, January 6, 2022

VAXstation 1 Software Install - part 5 - RQDX3 IO

  The main difficulty in reading and writing disk blocks with the RQDX3 was a lack of documentation. There's no programmer's guide to the RQDX3. There are generic guides to using MSCP, and there is some info about programming UDA50s available, which was useful since MSCP devices are all programmed pretty much the same. It took quite a bit of experimentation to get simple single block reads and writes to work. But, all's well that ends. Bob Schor's commented disassembly of the RT11 MSCP bootstrap was very helpful. The best doccos I found for doing this project were online in the usual places...

Mass_Storage_Control_Protocol_Ver_2.4.0_Jun92.txt
AA-L621A-TK_UnibusPortDescription_1982.pdf
AA-L619A-TK_MSCP_BasicDiscFnsV1.2_Apr82.pdf

  QBUS configuration rules put the first RQDX3 at 172150, which translates to ^X20001468 in MicroVAX 1 IO address speak. If you have a second RQDX3 in a MicroVAX 1, well, you're on your own. It will go in floating address space. The interrupt vector of the 1st RQDX3 is 154, which we don't care about at all, since we don't use disk IO interrupts in this project - it's hard enough to get all of this to work without trying to overlap disk and ethernet IO.
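
  If you want to work out where some other QBUS card lands, the translation can be sketched in a few lines of Python. The base value and the 13-bit mask below are my reading of the one example above (172150 octal becoming ^X20001468), so treat them as assumptions rather than gospel, and the function name is made up.

```python
QBUS_IO_BASE = 0x20000000   # assumed base of QBUS IO space, inferred from the example
IO_PAGE_MASK = 0o17777      # low 13 bits: offset within the 8 KB IO page


def qbus_to_microvax(addr16):
    """Map a 16-bit PDP-11 style IO page address to a MicroVAX 1 address."""
    return QBUS_IO_BASE + (addr16 & IO_PAGE_MASK)


print(hex(qbus_to_microvax(0o172150)))
```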

  The RQDX3 only has two registers, which doesn't make it easier to deal with, since they are  overloaded to hell and gone, doing a dozen different things, depending on what's happening. They are rich in state, and it's up to you to keep track of what state things are in. The two registers are called IP and SA. 

 The IP register has two uses. When written with any value, it causes a hard reset of the controller. When read while things are operating, it causes the controller to initiate "polling", whatever that is. Just kidding, I know what that is. I'm just not all that interested in it for this project.

  The SA register (where do they get these crazy names?) has four functions (see? I told you, they overload the hell out of these two registers). When read during initialization, it returns data related to initialization. When written during initialization, it sends information to the controller. When read during normal operation, it returns status and error info. When a 0 is written to it, at any time, it tells the controller that a purge has been completed, whatever that is. Seriously, we don't use it in this program, so I don't need it.

  The real communication with the card is done a lot like the way the DEQNA does it - through lists of buffers that the card DMAs in and out of memory, willy-nilly, anytime it feels like it. Again, just kidding. You can control when it happens. More or less.

  But enough of these vague generalities - let's get right on to initialization. 

  To get ready for this, you need to prepare a table of four "steps" - word values that will be passed into the RQDX3 during initialization. The table looks like this...


    istbl:
    ;Step 1 - assorted configuration bits
              .word   ^X8000
                        ;Step 1 bits
                        ;[15] = 1,
                        ;[14] (WR) = 0,
                        ;[13:11] (cmd  ring size as power of 2) = 0
                        ;[10:8]  (resp ring size as power of 2) = 0,
                        ;[7]  (Interrupt Enable/Disable) = 0
                        ;[6:0] (int vect/4) = 0
                        ;command and response rings = 1 element
                        ;no interrupts
                        ;no interrupt vector.

    ;Step 2 - Ring base address, low 16 bits. Entered here as 0,
    ;         since PIC coding requires that we fill it in at
    ;         runtime
    zazz:     .word   0

    ;Step 3 - Ring base address high 6 bits (always 0 for us),
    ;         and Purge/Poll bits (likewise, 0 for us).
              .word   0

    ;Step 4 - The GO bit. Finishes up the initialization
              .word   1

    ;End Marker - a null longword to signal the end
              .long   0
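
  The Step 1 word is the only one in the table with any real structure to it. Here's a Python sketch that builds it from the bit fields listed in the comments, in case you want bigger rings or an interrupt vector. The function name and its arguments are mine - only the bit positions come from the table above.

```python
def step1_word(cmd_ring_log2=0, resp_ring_log2=0, int_enable=False, vector=0):
    """Assemble the Step 1 initialization word from its fields."""
    if not (0 <= cmd_ring_log2 <= 7 and 0 <= resp_ring_log2 <= 7):
        raise ValueError("ring sizes are 3-bit powers of 2")
    if vector % 4 or not 0 <= vector // 4 <= 0x7F:
        raise ValueError("vector must be a multiple of 4 that fits in 7 bits")
    word = 0x8000                     # bit 15 is always 1
    word |= cmd_ring_log2 << 11       # bits 13:11, command ring size
    word |= resp_ring_log2 << 8       # bits 10:8, response ring size
    word |= 0x80 if int_enable else 0 # bit 7, interrupt enable
    word |= vector // 4               # bits 6:0, interrupt vector / 4
    return word


print(hex(step1_word()))   # one-entry rings, no interrupts
```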


  Since the table of values used for initialization requires the address of the command and response rings (to be stored in location zazz, above), let's discuss setting them up here, even though they don't get used until we do the ONLINE command.

  We indicated that we have only one command and one response packet, in the first entry (Step 1) above. You specify how many you have as a power of 2 - we entered zero for each there, so 2 to the 0th power is 1. RP and CP are the response and command packets, below. You put their addresses into the command and response rings table. Gotta do almost all of this at run time, since we're writing PIC here. Each entry in the rings table consists of the address of a packet, and an ownership flag word - when it's ^X8000, we are saying that the RQDX3 owns it. When the RQDX3 finishes processing it, it sets the flag to be non-negative - that's how we know it's done, and that we can use it again - and more important, when doing a read or write command, we then know that the card is through and the data is now valid.

rp:    .blkb    64
cp:    .blkb    64

rings:
rpd:    .word    0
        .word    0
cpd:    .word    0
        .word    0


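                moval    rp,rpd           ;address of the response packet
                movw     #^X8000,rpd+2    ;flag word - the RQDX3 owns it
                moval    cp,cpd           ;address of the command packet
                movw     #^X8000,cpd+2    ;word write - a longword here
                                          ;would run past the end of rings

                moval    rings,zazz       ;ring base address into Step 2
                                          ;(high half lands in the Step 3 word)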


  OK, once all of this is set up, to start the initialization, you write anything into the IP register. CLRW writes a 0 to it.

    ;Define some symbols for the registers
           rcsr   = ^X20001468     ;RQDX3 CSR starts here
           rip    = rcsr           ;1st reg is called IP
           rsa    = rip+2          ;2nd reg is called SA

           clrw    @#rip

  I should mention that longword accesses aren't allowed to the IO space on the MicroVAX - use word instructions. Also, data structures need to be longword aligned, or difficult-to-debug problems can occur.


  So that starts things moving. Now, read a word from the SA register. If it's negative, an error has occurred. If it's positive, check and see if the current step bit is set in it (the initial match value is ^X0800). If that bit is set, shift the match value one bit left, and then write the first value from istbl into the SA register. Get another value out of the SA register, and test it for negative and bit match again. If it's not negative and the bit matches, write the next istbl value to SA. Repeat these steps until the match value hits ^X8000. If you get that far successfully, then the card is now initialized.
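
  The handshake reads more easily as a loop, so here it is sketched in Python against a stand-in controller. FakeController is purely hypothetical - it just shows the next step bit after each SA write - and the function names and the poll limit are mine, not from any DEC software. The second table value is a dummy ring address.

```python
ERROR_BIT = 0x8000    # SA bit 15: the controller reports an error
FIRST_STEP = 0x0800   # initial match value from the text


def initialize(ctl, istbl, max_polls=1000):
    """Feed the step words in istbl to the controller, one per step bit."""
    ctl.write_ip(0)                   # any write to IP starts initialization
    match = FIRST_STEP
    for value in istbl:
        for _ in range(max_polls):
            sa = ctl.read_sa()
            if sa & ERROR_BIT:
                raise RuntimeError(f"controller error, SA={sa:#06x}")
            if sa & match:
                break
        else:
            raise TimeoutError(f"step bit {match:#06x} never appeared")
        match <<= 1                   # next step bit to look for
        ctl.write_sa(value)
    return match                      # 0x8000 after four successful steps


class FakeController:
    """Hypothetical stand-in that advances one step per SA write."""
    def __init__(self):
        self.sa = 0
        self.written = []

    def write_ip(self, value):
        self.sa = FIRST_STEP
        self.written = []

    def read_sa(self):
        return self.sa

    def write_sa(self, value):
        self.written.append(value)
        self.sa <<= 1


ctl = FakeController()
final = initialize(ctl, [0x8000, 0x1234, 0x0000, 0x0001])
print(hex(final), [hex(w) for w in ctl.written])
```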


  OK, then you have to put the card into the online state. It's a whole 'nother story. The ONLINE command uses the command packets and rings we discussed above. Let's talk about the contents of the packets themselves.


 The command packet for the ONLINE command is laid out thusly... 


                   31                             0               
                  +-------------------------------+              
                  |   command reference number    |              
                  +---------------+---------------+              
                  |   reserved    |  unit number  |              
                  +---------------+-------+-------+              
                  |   modifiers   | rsvd  | opcode|              
                  +---------------+-------+-------+              
                  |  unit flags   |    reserved   |              
                  +---------------+---------------+              
                  |           reserved            |              
                  +-------------------------------+              
                  |                               |              
                  +---        reserved         ---+              
                  |                               |              
                  +-------------------------------+              
                  |  device dependent parameters  |              
                  +-------------------------------+              
                  |           reserved            |              
                  +-------------------------------+              


  Command reference number - we don't use it - we're doing one command at a time, so we don't need to keep track of them. Unit number for me is always gonna be 0. If you need a different unit number - you know what to do. The opcode is mscp$k_op_onlin, ^X0009. Modifiers - don't need any. Unit flags, likewise. Device dependent parameters - none. All in all, it's a pretty simple packet.
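
  To double check the offsets in that diagram, here's the ONLINE command packet built with Python's struct module. This is a host-side sketch only: the function name is made up, the field order is my reading of the diagram (low bits on the right, little-endian in memory), and the result is padded to the 64 bytes reserved for cp.

```python
import struct

MSCP_OP_ONLINE = 0x09   # mscp$k_op_onlin
PACKET_SIZE = 64        # matches the .blkb 64 reserved for each packet


def online_packet(unit=0):
    """Build an ONLINE command packet as a bytes object."""
    body = struct.pack(
        "<IHHBBHHH4x8xI4x",
        0,               # command reference number (unused)
        unit,            # unit number
        0,               # reserved
        MSCP_OP_ONLINE,  # opcode
        0,               # reserved
        0,               # modifiers
        0,               # reserved
        0,               # unit flags
        0,               # device dependent parameters
    )
    return body.ljust(PACKET_SIZE, b"\0")


pkt = online_packet()
print(len(pkt), pkt[:12].hex())
```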

  The response packet is where the RQDX3 tells us how the command went. For the ONLINE command, it is laid out like this...

                   31                             0               
                  +-------------------------------+              
                  |   command reference number    |              
                  +---------------+---------------+              
                  |sequence number|  unit number  |              
                  +---------------+-------+-------+              
                  |    status     | flags |endcode|              
                  +---------------+-------+-------+              
                  |  unit flags   |multiunit code |              
                  +---------------+---------------+              
                  |      reserved         |spndles|              
                  +-------------------------------+              
                  |                               |              
                  +---     unit identifier     ---+              
                  |                               |              
                  +-------------------------------+              
                  |     media type identifier     |              
                  +-------------------------------+              
                  |           reserved            |              
                  +-------------------------------+              
                  |           unit size           |              
                  +-------------------------------+              
                  |      volume serial number     |              
                  +-------------------------------+              


  Command reference number and sequence numbers - I don't need 'em. Endcode - forget about it. The status is where the status of the action is returned - we check it for errors. All the rest, don't need 'em. The media type identifier, unit size and volume serial number are interesting, but...don't really need 'em for this project.


  OK, so you've set up these packets, and have gone through initialization as per above. Now, to execute the ONLINE command in the command packet,  write anything but zero into the IP register to launch things into motion. Then, loop until the value in rpd+2 is not negative, which means the online command has completed. Check the response packet status field and see if there was an error (success will be zero, anything else was an error). What could be simpler?

            movw   #1,@#rip

  Then you loop on the value in rpd+2 until it is non-negative

            1$:    tstw    rpd+2
                   blss    1$

  OK, all of this work, and all we've done is put the disk successfully online. Now, let's see if we can write a block (that we've presumably received on the ethernet - remember the whole point of this project?)


  Writing a block (and reading one, for that matter) is not all that different from the ONLINE process. You set up the command and response packets, set up the ring table, and then write a non-zero value to the IP register. The main difference is in the command packet - it's a little different.


                   31                             0              
                  +-------------------------------+             
                  |   command reference number    |             
                  +---------------+---------------+             
                  |   reserved    |  unit number  |             
                  +---------------+-------+-------+             
                  |   modifiers   |  caa  | opcode|             
                  +---------------+-------+-------+             
                  |          byte count           |             
                  +-------------------------------+             
                  |                               |             
                  +---         buffer          ---+             
                  |                               |             
                  +---       descriptor        ---+             
                  |                               |             
                  +-------------------------------+             
                  |      logical block number     |             
                  +---------------+---------------+             
                  |   entry id    | hrn or entloc | (optional)  
                  +---------------+---------------+             



  You have to enter a byte count (512, since we're dealing with disk blocks), a logical block number (they start at 0), and a buffer descriptor. The format of a buffer descriptor includes support for all sorts of exotic situations, none of which apply here, using a QBUS and an RQDX3. It boils down to, put the address of the (read or write) buffer in the first longword, and zeroes in the other two. Unit number - we're still using 0. The opcodes for read and write, mscp$k_op_read and mscp$k_op_write, are ^X21 and ^X22. When you get the packet set up, set up the cpd, rpd and rings table again (if you need to), and once again write something to the IP register to make the magic happen. Loop on the value of rpd+2, just like for the ONLINE command, to wait for it to complete. And, if everything has gone just right, your block of data should have been read or written....
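
  Same trick for the read/write packet: a Python struct sketch to pin down where each field lands. Again, the function name and arguments are mine, the layout is my reading of the diagram above, and the buffer address in the example is a made-up value.

```python
import struct

MSCP_OP_READ = 0x21     # mscp$k_op_read
MSCP_OP_WRITE = 0x22    # mscp$k_op_write
BLOCK_SIZE = 512
PACKET_SIZE = 64


def transfer_packet(opcode, buffer_addr, lbn, unit=0, byte_count=BLOCK_SIZE):
    """Build a READ or WRITE command packet as a bytes object."""
    body = struct.pack(
        "<IHHBBHII8xI",
        0,            # command reference number (unused)
        unit,         # unit number
        0,            # reserved
        opcode,       # opcode
        0,            # caa
        0,            # modifiers
        byte_count,   # byte count
        buffer_addr,  # buffer descriptor, first longword (the other two are 0)
        lbn,          # logical block number
    )
    return body.ljust(PACKET_SIZE, b"\0")


pkt = transfer_packet(MSCP_OP_WRITE, buffer_addr=0x7400, lbn=5)
print(pkt[:32].hex())
```

  Note that the logical block number ends up at byte offset 28 of the packet.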


  In using the above techniques to read and write, I noticed that every few hundred thousand reads or writes, the RQDX3 hangs up - never indicates completion. When that happens, all I do is go through the initialization and online steps again, and then do the read or write again.
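
  That recovery strategy boils down to "poll with a limit, and if the limit runs out, start over". Here it is as a Python sketch. The three callables (start_command, read_flag, reinitialize) are hypothetical placeholders for the real register and ring accesses, and the poll limit and attempt count are arbitrary numbers I picked.

```python
OWNED_BY_CONTROLLER = 0x8000   # flag word value while the RQDX3 owns the entry


def wait_for_completion(read_flag, max_polls):
    """Poll the response ring flag word until the controller releases it."""
    for _ in range(max_polls):
        if not read_flag() & OWNED_BY_CONTROLLER:
            return
    raise TimeoutError("controller never released the response descriptor")


def run_with_recovery(start_command, read_flag, reinitialize,
                      attempts=3, max_polls=10_000):
    """Issue a command; on a hang, re-initialize and try it again."""
    for _ in range(attempts):
        start_command()
        try:
            wait_for_completion(read_flag, max_polls)
            return
        except TimeoutError:
            reinitialize()   # init sequence plus ONLINE, then retry
    raise RuntimeError("command did not complete after re-initialization")


# Tiny demonstration: the first attempt hangs, the retry succeeds.
state = {"hung": True, "reinits": 0, "starts": 0}

def start():
    state["starts"] += 1

def flag():
    return OWNED_BY_CONTROLLER if state["hung"] else 0

def reinit():
    state["hung"] = False
    state["reinits"] += 1

run_with_recovery(start, flag, reinit)
print(state)
```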

  And a note about the returned statuses. A status of 0 is success. Errors take a myriad of formats. The most common one I've run into is invalid input data. This error code consists of offset*256+1. The offset refers to the offset in the command packet of the field that is invalid - so for instance, with a returned status of ^X1C01, the 1 in the low byte tells us, invalid input. Divide it all by 256 and you get ^X1C - that's an offset into the command packet of 28, which is the logical block number field - you get a 1C01 error when you try and write to a logical block number that doesn't exist (i.e., too high for the size of the disk).
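
  Here's that arithmetic as a few lines of Python. It only knows about the two cases described above (success, and the offset*256+1 invalid input form) and lumps everything else together. The function name and the returned tuples are just for illustration.

```python
INVALID_INPUT = 1   # low byte of the "invalid input data" status


def decode_status(status):
    """Split a returned status word into a (kind, detail) pair."""
    if status == 0:
        return ("success", None)
    if status & 0xFF == INVALID_INPUT:
        return ("invalid input", status >> 8)   # offset of the bad field
    return ("other error", status)


print(decode_status(0x1C01))
```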

  So, that's a sketchy, incomplete and confusing summary of how to talk to an RQDX3 in the simplest way possible. Poorly written, I'm well aware - but, this, along with the documentation sources I mentioned above, should give you a fighting chance to make an RQDX3 do something. And pretty much everything I've described applies to RQDX3's when used in a QBUS PDP11 as well as in a QBUS MicroVAX. Actually, when I wrote the PDP11 versions of the code for this project, it was easier than the MicroVAX version - PDP11 code loads at 0, instead of 7000, so I didn't have to write it PIC. And the IO page addressing is simpler on the PDP11, which made finding the cards easier.

 Next and final installment - the programs that use all this info to  load and save disk images to a VAXstation/MicroVAX 1, and QBUS PDP11s as well.