Thursday, August 25, 2022

Software Differences between the 6800 and the 6801/6803

For those not familiar with the 6801, the issues when trying to run 6800 code on the 6801 or 6803 would fall under the following categories: 

(1) Differences in the condition code calculations for CPX;

(2) Differences in function of any op-codes undefined in the 6800 which may be used;

(3) Timing differences for software timing loops;

(4) Differences in the memory map, which depend on the mode the MPU is operating in.

(Remember that the 6803 is a 6801 with the ROM missing or disabled.) 

Revisiting this a week later, I realize that I've left out something important by not mentioning the rest of the 6801's extensions to the 6800:

  • A and B accumulators are concatenated to form the D double accumulator for certain new instructions. Condition codes fully reflect the results in all 16 bits, which is more important than just getting things done in fewer instructions.
  • The new instructions that work with the D accumulator are
    • LDD and STD, 16-bit load and store of the double accumulator;
    • ADDD and SUBD, 16-bit add to and subtract from the double accumulator; and
    • LSRD and ASLD/LSLD, 16-bit logical shifts right and left of the double accumulator.
  • The new MUL instruction also works with the double accumulator, sort of. It's an 8-bit by 8-bit unsigned multiply of the contents of A and B, leaving the 16-bit result in D.
    You can synthesize a 16-bit multiply using four of these and appropriate ADD instructions, which uses a few more bytes than shifting and adding, but is about seven times as fast.
  • Improvements in the condition code calculations for CPX, the compare with X instruction which I already mentioned above.
  • The PSHX and PULX instructions push X to and pop it from the return address stack.
  • The ABX instruction adds B to X, which is very helpful in accessing record fields and such.
  • The JSR instruction has a new direct page addressing mode, which can be used to speed calling a few carefully selected short, heavily used routines.
  • There is now an official branch never instruction, BRN, effectively a two-byte NOP. This can be useful in debugging and in simplifying code generation in some high-level languages.
  • There were changes in Motorola's assembler and the manual itself which I kind of shrugged my shoulders at, but which may be of interest:
    • LSL is added as an alias of ASL.  (Note that LSLD is an alias of ASLD.)
    • BHS and BLO are added as aliases of BCC and BCS, respectively.
      (Branch High or Same == Branch Carry Clear)
      (Branch Low == Branch Carry Set)
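
As a rough sketch of the 16-bit multiply mentioned in the list above, the following routine builds a 16-bit by 16-bit unsigned multiply with a 32-bit result from four MUL instructions plus ADDD. The names MCAND, MPLIER, PROD, MUL16 and ADDMID and both ORG addresses are assumptions made up for this illustration, and this is only one of several possible arrangements.

```asm
* Sketch only: labels and addresses are hypothetical.
        ORG  $0080        assumed: internal RAM
MCAND   RMB  2            multiplicand, high byte first
MPLIER  RMB  2            multiplier, high byte first
PROD    RMB  4            32-bit product, high byte first
        ORG  $C000        assumed code address
* PROD = MCAND * MPLIER, unsigned
MUL16   LDAA MCAND+1      low byte of multiplicand
        LDAB MPLIER+1     low byte of multiplier
        MUL               D = low * low
        STD  PROD+2       becomes the low 16 bits
        LDAA MCAND        high byte of multiplicand
        LDAB MPLIER       high byte of multiplier
        MUL               D = high * high
        STD  PROD         becomes the high 16 bits
        LDAA MCAND        high byte of multiplicand
        LDAB MPLIER+1     low byte of multiplier
        MUL               first cross product
        BSR  ADDMID       add it into the middle 16 bits
        LDAA MCAND+1      low byte of multiplicand
        LDAB MPLIER       high byte of multiplier
        MUL               second cross product, fall through
* Add D into bytes 1 and 2 of PROD, carrying into byte 0
ADDMID  ADDD PROD+1
        STD  PROD+1       STD leaves the carry flag alone
        BCC  ADDMDX
        INC  PROD         propagate the carry to the top byte
ADDMDX  RTS
```

With $FFFF in both MCAND and MPLIER, this should leave $FFFE0001 in PROD. The carry into the top byte cannot overflow, because the full product always fits in 32 bits.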


    Details of Code Porting Issues:

    (I'm summarizing information from the MC6801RM (AD2) MC6801 8-bit Single-chip Microcomputer Reference Manual.)

     (1) On the 6800, the CPX instruction is not recommended for anything other than comparing X for equality -- in other words, you would generally want to follow CPX on the 6800 with either BEQ or BNE, but not with other branches. 

    On the 6801, CPX affects all the condition codes correctly for the 16-bit comparison, and it can be used in the full range of signed and unsigned comparisons. If the 6800 code confines itself to the recommended use of CPX, there should be no problem.  

    But much of the existing 6800 code does do tricky things with CPX. In particular, it is often used as a NO-OP, on the assumption that CPX will not affect the carry flag, in order to hide another op-code in the operand field. For example, on the 6800, with

    SKIPR CPX #$8601

    executing through SKIPR would only see a probably meaningless comparison of X with hex 8601. But branching to SKIPR+1 would see, instead, LDAA #1. This is a fragile optimization, of course, and the effort to use it usually costs more than whatever it was supposed to gain. 

    What to do with this? It only costs 1 more byte to spell it out properly, and on the 6801 it actually uses 1 clock cycle less (3 cycles for BRA vs. 4 for the immediate-mode CPX):

    SKIPR  BRA  NOLOAD
    LOAD1  LDAA #1
    NOLOAD

    If that 1 byte is fatal, you can probably find something nearby to clean up slightly and save a byte, perhaps using one of the 6801 extensions mentioned above. You really don't want such fragile optimizations, anyway.
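
    Going the other way, code that only has to run on the 6801 can lean on the corrected CPX flags. The sketch below steps through a table of fixed-size records with ABX and ends the loop with an unsigned BLO after CPX. The record size, the table, the label names and the ORG addresses are all assumptions made up for the illustration.

    ```asm
    * Sketch only: sizes, labels and addresses are hypothetical.
    RECSZ   EQU  6            assumed record size in bytes
            ORG  $0080        assumed: internal RAM
    TABLE   RMB  60           ten records of RECSZ bytes each
    TBLEND  EQU  *            first address past the table
            ORG  $C000        assumed code address
    * Clear the first byte of every record
    CLRFLG  LDX  #TABLE
            LDAB #RECSZ
    NEXT    CLR  0,X          clear first byte of this record
            ABX               X = X + B, step to the next record
            CPX  #TBLEND
            BLO  NEXT         loop while X < TBLEND, unsigned
            RTS
    ```

    This is not portable back to the 6800, and not only because of ABX. There, CPX leaves the carry flag untouched, so BLO would test the carry cleared by CLR and the loop would fall through after the first record.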

    (2) is similar to (1), in that the incompatibilities are the result of tricks engineers really should avoid. (You don't want to find yourself faced with an undefined op-code changing its behavior in future mask sets, not to mention possible architecture extensions like the 68HC11, among other things.) Again, you can usually find places in nearby code to clean up, or to take advantage of the new 6801 instructions in a way that allows avoiding undefined op-code abuse, even if using the undefined op-code actually did save a byte. 

    (3) is the downside of the improved timings for the 6801 instructions. There's nothing to do here but recalculate the timing constants, which should be enough. (But note the exception for CPX timings.) 

    Maybe I should try to give a summary of the improved timings:

    • Branches take 1 cycle less (3 vs. 4). 
    • Branch to subroutine, BSR, takes 2 cycles less (6 vs. 8).
    • Indexed mode binary operand byte instructions (ADDA/B n,X; etc.) take 1 cycle less (4 vs. 5).
    • Indexed mode unary byte instructions (ASL n,X; etc.) take 1 cycle less (6 vs. 7).
    • Indexed mode 16-bit load and store instructions (LDS n,X; etc.) take 1 cycle less (5 vs. 6).
    • CPX takes one cycle more except in indexed mode, I guess to get the flags right doing it 8 bits at a time:
      • immediate is 4 for 6801 vs. 3 for 6800, 
      • direct page is 5 vs. 4, 
      • indexed is 6 for both processors,
      • extended/absolute address is 6 vs. 5.
    • Inherent mode 16-bit instructions (DES, TSX, etc.) take 1 cycle less (3 vs. 4).
    • JMP in indexed mode takes 1 less (3 vs. 4).
    • JSR gets some nice improvements: 
      • 5 cycles in direct page mode, which is not available on the 6800, as mentioned above,
      • 6 vs. 8 in indexed mode,
      • 6 vs. 9 in extended/absolute address mode.
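
    As a small worked example of recalculating a timing constant from the figures above, take the common DEX/BNE delay loop. The 1 MHz E clock, the 1 ms target, and the names DELAY, DLOOP and COUNT are assumptions for the illustration. Call and return overhead is ignored.

    ```asm
    * Sketch only: clock rate, target delay and labels are hypothetical.
    * Loop body is DEX + BNE:
    *   6800: 4 + 4 = 8 cycles per pass, 1000 / 8 = 125 passes
    *   6801: 3 + 3 = 6 cycles per pass, 1000 / 6 = about 167 passes
    COUNT   EQU  167          was 125 when built for the 6800
            ORG  $C000        assumed code address
    DELAY   LDX  #COUNT
    DLOOP   DEX
            BNE  DLOOP
            RTS
    ```

    Leaving the 6800 constant of 125 in place would shorten the delay to roughly three quarters of the intended time on the 6801.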

    (4) The differences in the memory map show up at the bottom, top, and middle of memory:

    At the top of memory, the 6801/6803 define new interrupt vectors for the built-in peripheral devices. 6800 code probably puts something at those addresses that will need to be moved, and/or the 6801/3 code will need to include instructions that mask out the associated device interrupts.
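
    A minimal sketch of a vector table laid out for the 6801/6803 follows. The vector order is as I understand it from the Motorola documentation and should be checked against the reference manual. The labels START, IDLE and STRAY and the code address are assumptions for the illustration. In a real port, the IRQ1, SWI and NMI entries would point at the handlers carried over from the 6800 code.

    ```asm
    * Sketch only: labels and the code address are hypothetical.
            ORG  $F800        assumed code address
    START   LDS  #$00FF       stack at the top of internal RAM
    IDLE    BRA  IDLE         placeholder for the ported program
    STRAY   RTI               catch-all for interrupts never enabled
    * Vectors: the first four entries are new on the 6801/6803
            ORG  $FFF0
            FDB  STRAY        serial communications interface
            FDB  STRAY        timer overflow
            FDB  STRAY        timer output compare
            FDB  STRAY        timer input capture
            FDB  STRAY        IRQ1 (IRQ on the 6800)
            FDB  STRAY        SWI
            FDB  STRAY        NMI
            FDB  START        reset
    ```

    Anything the 6800 code kept at $FFF0 through $FFF7 has to move somewhere else to make room for the four new entries.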

    In the middle of memory, you may find the internal ROM on the 6801, but not on the 6803. 

    The 6801 in several of its operating modes will have ROM somewhere in the middle of memory -- usually from $F800 to $FFFF, but not always. You can switch the ROM out of the memory map by choosing the operating mode.

    In particular, the 6803, which has no functional ROM, should only be operated in either mode 2 or mode 3, which are the (mostly) external bus modes. Mode 2 keeps the internal RAM (addresses $0080 to $00FF) in the memory map, and Mode 3 switches it out. These two modes can be used to avoid conflicts when existing 6800 code wants to use the addresses where the ROM would be.

    At the bottom of memory, you have the devices themselves and the built-in RAM. The built-in RAM is less likely to get in the way, but you can use operating mode 3 to switch it out of the memory map.

    The built-in peripheral interface registers cannot all be switched out. There will always be devices at addresses $0000 to $0003 and $0008 to $001F.

    Since the direct page is special, 6800 code really should be written so that any conflict with the addresses at the bottom of memory can be fixed by just moving some of the direct page variables up a bit. But quite a lot of code is written in a way that conflicts and can't be fixed that easily.
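
    In the easy case, where the variables are referenced only by label, the fix can be as small as changing one ORG. The sketch below assumes a mode that keeps the internal RAM in the map. The variable names are made up for the illustration.

    ```asm
    * Sketch only: variable names are hypothetical.
    * The 6800 original might have started its scratch variables at 0:
    *       ORG  $0000
    * On the 6801/6803, start above the built-in registers instead:
            ORG  $0080        internal RAM, still in the direct page
    TEMP    RMB  2
    PTR     RMB  2
    COUNT   RMB  1
    ```

    Code that refers to TEMP, PTR and COUNT by name still assembles to direct page addressing without further changes. Code that hard-codes the addresses gets no help from this.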

    One example is Very Tiny Language. (VTL-2 is what you will probably find if you go hunting for it.) VTL-2 basically allocates the language's variables A, B, ... Z to addresses defined by their ASCII values, which are in the direct page. I have found a way to mostly work around this for VTL (see my recent posts on adapting VTL-2 to hardware other than the MITS/ALTAIR 680), but I'm not entirely confident it's a perfect work-around.

    Anyway, that probably covers the differences.

    If you want another example of how these changes actually affect code, I have adapted the fig-Forth interpreter model for the 6800 to the 6801, optimizing it with the new instructions. (I may have missed some possible optimizations.) That source may be interesting to examine. You can find it either in my adaptation of Joe H. Allen's EXORsim:

    https://osdn.net/users/reiisi/pf/exorsim6801/scm/tree/master/
    or in the source tree of my 6800/6801 assembler:

    https://sourceforge.net/p/asm68c/code/ci/master/tree/fig-forth/
