Introduction to Byte Arithmetic
on the 68000
If you haven't already worked through the introduction to byte arithmetic on the 6800, 6801, and 6809, you should. I explained the byte math there, so I'm going to focus on 68000 code here, and on the differences between the 68000 code and the 6800/6801/6809 code.
I'm going to show three different versions for the 68000. The first version follows, as closely as possible, the 6800/6801/6809 code.
But to reduce the back-and-forth between the editor, the assembler, and the
debugger, and to reduce the number of files to edit, I'm putting the addition
and subtraction code together into a single file:
OPT LIST,SYMTAB ; Options we want for the stand-alone assembler.
MACHINE MC68000 ; because there are a lot the assembler can do.
OPT DEBUG ; We want labels for debugging.
OUTPUT
***********************************************************************
* Adding and subtracting two 8-bit numbers in memory
* and storing them in memory,
* transliterating the 6800 code --
*
EVEN
ENTRY JMP ADDM8
* The constants and result storage for the additions:
NL1 DC.B 34 ; just an arbitrary small number
NR1 DC.B 66 ; another arbitrary small number
C1 DS.B 1 ; To hold one bit of carry from the sum.
R1 DS.B 1 ; To hold the eight bit sum.
NL2 DC.B 132 ; Somewhat larger arbitrary number
NR2 DC.B 188 ; And another
C2 DS.B 1 ; carry from 2nd sum
R2 DS.B 1 ; 2nd sum
* I could have coordinated these better,
* But I'll rename the constants for the subtractions
* to ML1, ML2, and ML3:
ML1 DC.B 66 ; just an arbitrary small number
MR1 DC.B 34 ; another arbitrary small number
RH1 DS.B 1 ; To hold borrow/sign byte from the difference.
RL1 DS.B 1 ; To hold the eight bit difference.
ML2 DC.B 132 ; Somewhat larger arbitrary number
MR2 DC.B 188 ; And another
RH2 DS.B 1 ; borrow/sign from 2nd difference
RL2 DS.B 1 ; 2nd difference
ML3 DC.B 34 ; just an arbitrary small number
MR3 DC.B 66 ; another arbitrary small number
RH3 DS.B 1 ; To hold borrow/sign byte from the difference.
RL3 DS.B 1 ; To hold the eight bit difference.
EVEN
ADDM8 CLR.B D6 ; clear a register for the carry bit
MOVE.B NL1,D7 ; Get the augend (left side addend).
ADD.B NR1,D7 ; Add the addend (right side addend).
* 8-bit result is safely in D7.
ROXL.B #1,D6 ; Recover the eXtend carry bit.
MOVE.B D6,C1 ; Save it away.
MOVE.B D7,R1 ; Save the sum itself.
*
CLR.B D6 ; 2nd carry bit
MOVE.B NL2,D7 ; 2nd augend (left side addend)
ADD.B NR2,D7 ; 2nd addend (right side addend)
MOVE.B D7,R2 ; 2nd sum, 8 bits
* MOVE does not alter X bit, carry eXtend still safe.
ROXL.B #1,D6 ; 2nd carry bit
MOVE.B D6,C2 ; Save it away.
* Could also use the ADDX.B with D5 pre-cleared method
* used in the subtraction examples below.
*
NOP
NOP
SUBM8 CLR.B D5 ; Need a constant zero. (No SUBX #0,Dn.)
CLR.B D6 ; clear register D6 for the borrow/sign extension
MOVE.B ML1,D7 ; Get the minuend (left side).
SUB.B MR1,D7 ; Subtract the subtrahend (right side).
* Result is safely in D7.
SUBX.B D5,D6 ; Recover the eXtend borrow bit, sign extended.
MOVE.B D6,RH1 ; Save high byte away.
MOVE.B D7,RL1 ; Save the difference low byte.
*
CLR.B D5 ; should still be clear, actually.
CLR.B D6 ; 2nd sign extension
MOVE.B ML2,RL2 ; 2nd minuend to memory, just because we can
MOVE.B MR2,D7 ; 2nd subtrahend in register
SUB.B D7,RL2 ; 8-bit result safely saved away
SUBX.B D5,D6 ; 2nd borrow, sign extended
MOVE.B D6,RH2 ; Save high byte away.
*
CLR.B D5 ; Again, should still be clear.
CLR.B D6 ; 3rd borrow bit
MOVE.B ML3,D7 ; 3rd minuend (left side)
SUB.B MR3,D7 ; 3rd subtrahend (right side)
MOVE.B D7,RL3 ; Save 8-bit result away
* MOVE does not alter X bit, borrow eXtend still safe.
SUBX.B D5,D6 ; 3rd borrow, sign extended.
MOVE.B D6,RH3 ; Save it away.
*
NOP
NOP
The comments pretty much explain what I've done with the variable names, and also pretty much what is going on in the code.
Go ahead and open up a second browser window with the 6800/6801/6809 code and compare.
You can see that we can use memory and registers in much the same way as on the 8-bit processors, with some odd exceptions.
You'll note that having lots of data registers gives us more options in how to organize our code. In some places, I deliberately altered some steps to show that. But, in other places, I was forced to make some changes because of those odd exceptions.
Other than having lots of registers, specific things to pay attention to:
(1) Rotating bits into memory is limited to 16-bit wide memory targets. We can't rotate only 8 bit targets in memory. But we can rotate 8-bit targets in a data register, so we do that.
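Speaking of rotating, rotating through carry on the 68000 does not go through the Carry flag!
It goes through the eXtended carry flag, thus, we have the mnemonic, ROXL. (The Carry flag does get a copy of the last bit rotated out, but nothing is rotated in from it.)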
Why not ROL?
Lemme 'splain!
Rotating through carry is actually a 9-bit rotate.
That's 8 bits in the target plus the carry (on the 6800, etc.) or the eXtended carry (on the 68000).
So you don't really have an 8-bit rotate on the 6800/6801/6809.
Fortunately, 9-bit rotates are what is most commonly needed, but, if 9 bits is too many, you have to short-circuit the carry with some odd logic that I'm not going to show yet.
The 68000 gives you true 8-bit rotates -- and 16-bit and 32-bit -- without going through the (eXtended) carry, and ROtate Left by 8/16/32 bits (without going through the eXtended carry) is what ROL and ROR mean on the 68000.
Erk. Lots of explanation we didn't think we wanted just here, but keep it in mind. It'll come in useful later. Or, perhaps, it's a waste of micro-instructions in the 68000 CPU, but that's something to think about later.
Anyway, that's part of what's behind the differences in the bit rotation code.
(Why split the carry function into Carry and eXtended carry?
SHHHH!
You're not supposed to ask that.
Heh. Yet.)
(2) The Carry flag is cleared by MOVEs. The eXtended carry flag is not. This
allows some variation in execution order with the X flag, which can come in
handy sometimes. It also forces execution order sometimes.
(3) Instead of ADd with Carry (ADC) and SuBtract with Carry (SBC), the 68000 has ADD with eXtend and SUBtract with eXtend, ADDX and SUBX.
... aaaaand ...
Where other instructions on the 68000 are really flexible as to where the source or target are, the ADDX and SUBX instructions are not. (There is no CMPX. The closest relative is CMPM, compare memory, which is just as restricted but does not involve X.)
You can ADDX or SUBX two data registers. Not memory-to-register, not register-to-memory. Or you can ADDX or SUBX memory-to-memory in auto-decrement (pre-decrement) mode. And you can CMPM memory-to-memory in auto-increment (post-increment) mode.
No immediate mode operands. No ADDX #0.
ARRRGGGGGGHHHHHHH!!!!!! WHY?!?!?!?!?!?
Sigh. There is reason to this madness, sort-of. Maybe just madness. We'll come
back to this.
For now, it can be made to work by using another register to hold the zero.
Yes, that RISC trick.
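The three points above can be sketched in a few lines, in the same Motorola syntax as the listing. This is an illustration, not part of the original files: the test values and the choice of D0 through D6 are arbitrary, and it assumes nothing else touches the condition codes in between.

```asm
* (1) ROL is a true 8-bit rotate; ROXL is a 9-bit rotate through X.
        MOVE.B  #$81,D0     ; D0.B = %10000001
        ROL.B   #1,D0       ; D0.B = %00000011, C = 1, X untouched
        MOVE.B  #$81,D1     ; D1.B = %10000001
        ANDI    #$EF,CCR    ; clear X (bit 4 of the CCR) so we know what rotates in
        ROXL.B  #1,D1       ; D1.B = %00000010, the old bit 7 is now in X (and C)
        ROXL.B  #1,D1       ; D1.B = %00000101, X (and C) are 0 again
* (2) MOVE clears C but leaves X alone.
        MOVE.B  #200,D2
        ADD.B   #100,D2     ; 300 = $12C: D2.B = $2C, C = 1, X = 1
        MOVE.B  D2,D3       ; C = 0 now, but X is still 1
* (3) ADDX with a zeroed register picks X up as a 0 or a 1.
        CLR.B   D5          ; the constant zero (CLR does not touch X either)
        CLR.B   D6          ; where the carry will land
        ADDX.B  D5,D6       ; D6.B = 0 + 0 + X = 1
```

The last three lines are the addition-side counterpart of the SUBX.B D5,D6 used in the subtraction code: ADDX gives a 0 or 1 carry byte, where SUBX gives a 0 or $FF borrow/sign byte.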
Well, a partial response to the madness is due here. The 68000 design project began about the time certain departments within IBM were just digging into the 801 processor. Patterson would not go on sabbatical to DEC for a couple of years yet. Not just Motorola, everybody was exploring unknown territory, going boldly where no man (in history) had gone before. Motorola made some mistakes. We all made mistakes -- and are now living with the consequences.
RISC isn't the Holy Grail, anyway.
Guys like me will daydream about what might have been, but daydreaming has limits.
We can work with less-than-optimal. With some care, we can get pretty close to
what should have been, and that's why I'm writing this tutorial.
Let me step off the soap box, and we'll take a look at some nifty gadgetry in the 68000 that provides an alternate way of doing this. But first, copy the code to a text file, save it, maybe as arithm8.s or something, and assemble it with something like
$ vasmm68k_mot -Ftos -no-opt -o ARITHM8.PRG -L arithm8.lst arithm8.s
Get Hatari running, go through the
CTRL-Z, CD to the target emulated directory, ALT-BREAK, set the "b pc = TEXT :once" breakpoint, (c)ontinue
protocol and use the debugger to (s)tep through the code, showing (r)egister and (m)emory contents as appropriate.
The final results in memory should be
> m TEXT 20
00013D10: 4e f9 00 01 3d 2a 22 42 00 64 84 bc 01 40 42 22 N...=*"B.d...@B"
00013D20: 00 20 84 bc ff c8 22 42 ff e0 42 06 1e 39 00 01 . ...."B..B..9..
Well, the actual address of TEXT should be (may be?) different, but the
contents should be there. Here are the same bytes with the results you want
marked in [brackets], each right after the pair of numbers we added or
subtracted:
4e f9 00 01 3d 2a 22 42 [00 64] 84 bc [01 40] 42 22
[00 20] 84 bc [ff c8] 22 42 [ff e0] 42 06 1e 39 00 01
And you'll notice that I'm so focused I didn't include exit code. That's okay,
just (q)uit when you're done.
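If you would rather have the program return to the desktop on its own, a GEMDOS Pterm0 call in place of the final NOPs should do it. This is a sketch, not part of the original listing, and it assumes the program is running under TOS, as it does in Hatari with a -Ftos executable. The label DONE is made up for illustration.

```asm
DONE    CLR.W   -(SP)       ; GEMDOS function number 0, Pterm0
        TRAP    #1          ; call GEMDOS; Pterm0 does not return
```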
Okay, that nifty gadgetry I promised:
OPT LIST,SYMTAB ; Options we want for the stand-alone assembler.
MACHINE MC68000 ; because there are a lot the assembler can do.
OPT DEBUG ; We want labels for debugging.
OUTPUT
***********************************************************************
* Adding and subtracting two 8-bit numbers in memory
* and storing them in memory,
* Using the Set conditional instruction to capture the Carry/borrow bit
* instead of the eXtended carry/borrow bit
*
EVEN
ENTRY JMP ADDM8
* The constants and result storage for the additions:
NL1 DC.B 34 ; just an arbitrary small number
NR1 DC.B 66 ; another arbitrary small number
C1 DS.B 1 ; To hold one bit of carry from the sum.
R1 DS.B 1 ; To hold the eight bit sum.
NL2 DC.B 132 ; Somewhat larger arbitrary number
NR2 DC.B 188 ; And another
C2 DS.B 1 ; carry from 2nd sum
R2 DS.B 1 ; 2nd sum
* I could have coordinated these better,
* But I'll rename the constants for the subtractions
* to ML1, ML2, and ML3:
ML1 DC.B 66 ; just an arbitrary small number
MR1 DC.B 34 ; another arbitrary small number
RH1 DS.B 1 ; To hold one bit of borrow from the difference.
RL1 DS.B 1 ; To hold the eight bit difference.
ML2 DC.B 132 ; Somewhat larger arbitrary number
MR2 DC.B 188 ; And another
RH2 DS.B 1 ; borrow from 2nd difference
RL2 DS.B 1 ; 2nd difference
ML3 DC.B 34 ; just an arbitrary small number
MR3 DC.B 66 ; another arbitrary small number
RH3 DS.B 1 ; To hold one bit of borrow from the difference.
RL3 DS.B 1 ; To hold the eight bit difference.
EVEN
ADDM8 CLR.B C1 ; clear memory for the carry bit/sign byte
MOVE.B NL1,D7 ; Get the augend (left side addend).
ADD.B NR1,D7 ; Add the addend (right side addend).
* Result is safely in D7, carry in C,
* But a MOVE will destroy the Carry in C.
SCS C1 ; Recover the Carry bit.
AND.B #1,C1 ; Mask off the unnecessary bits.
MOVE.B D7,R1 ; Save the 8-bit sum itself.
*
CLR.B C2 ; memory for 2nd carry bit/sign byte
MOVE.B NL2,R2 ; 2nd augend to result memory
MOVE.B NR2,D7 ; 2nd addend to a register
ADD.B D7,R2 ; 8-bit result safely stored
* A MOVE will destroy the Carry in C.
SCS C2 ; 2nd carry bit
AND.B #1,C2 ; Mask off the unnecessary bits.
*
NOP
NOP
SUBM8 CLR.B RH1 ; memory for 1st borrow/sign byte
MOVE.B ML1,D7 ; Get the 1st minuend (left side).
SUB.B MR1,D7 ; Subtract the 1st subtrahend (right side).
* Result is safely in D7, borrow in C,
* But a MOVE will destroy the borrow in C.
SCS RH1 ; 1st sign byte/borrow, DO NOT MASK!
MOVE.B D7,RL1 ; Save the result low byte.
*
CLR.B RH2 ; 2nd borrow/sign extension
MOVE.B ML2,RL2 ; 2nd minuend (left side) to memory
MOVE.B MR2,D7 ; 2nd subtrahend (right side)
SUB.B D7,RL2 ; 2nd result safely stored.
* A MOVE will destroy the borrow in C.
SCS RH2 ; 2nd borrow, sign extended
*
CLR.B D6 ; 3rd borrow/sign byte in register
MOVE.B ML3,D7 ; 3rd minuend (left side)
SUB.B MR3,D7 ; 3rd subtrahend (right side)
SCS D6 ; Get sign/borrow in register
* A MOVE will destroy the borrow in C.
MOVE.B D7,RL3 ; Save 3rd difference, low byte.
MOVE.B D6,RH3 ; Save sign/borrow away.
*
NOP
NOP
(1) Set conditionally is an interesting instruction. You can set a byte to all 1s on the specified condition (and to all 0s otherwise). It does not test eXtended carry, but it does test Carry. So you can use Set on Carry Set (SCS) immediately after an ADD or SUBtract or such, but if you do a MOVE, the Carry is gone.
(2) After an ADD, since we only want the carry from the high bit of the byte, all 1s from a SCS is too many 1s. So we can mask all but the bottom 1 out with AND immediate 1.
(3) AND is one of several instructions that can operate directly on memory, so if the target of SCS is in memory, we can mask the target directly.
Note that it takes more time to mask something in memory, because the
processor has to read it out of memory into a hidden latch to work on it,
then write it back. This is true of any operator that works directly on
memory, including the Set conditionally operators. So there are trade-offs.
(4) After a SUBtract, SCS setting all ones on borrow is actually what we want,
since all 1s is how we know it's negative. So there's no need to mask after a
subtract.
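As a variation on the points above, here is the second addition done with the SCS target in a data register, so the fix-up costs no extra memory read and write. It also uses NEG.B instead of AND.B #1 to turn $FF into 1, which is a swapped-in technique and not what the listing does. A sketch only, using the labels from the listing above:

```asm
        MOVE.B  NL2,D7      ; 132
        ADD.B   NR2,D7      ; + 188 = 320 = $140: D7.B = $40, C = 1
        SCS     D6          ; D6.B = $FF because Carry is set
        NEG.B   D6          ; 0 - $FF = $01 in 8 bits (and 0 stays 0)
        MOVE.B  D6,C2       ; carry byte = $01
        MOVE.B  D7,R2       ; sum low byte = $40
```

The SCS has to come before either MOVE.B to memory, since those MOVEs clear Carry.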
Oddly enough, as nifty as this Set conditionally and operate-directly-on-memory gadgetry is, it turns out not to be very exciting. But that is not necessarily bad.
After stepping through and watching the registers and memory, make one final check to see that the values stored are correct, as in the first example. Compare below:
00013D10: 4e f9 00 01 3d 2a 22 42 00 64 84 bc 01 40 42 22 N...=*"B.d...@B"
00013D20: 00 20 84 bc ff c8 22 42 ff e0 42 39 00 01 3d 18 . ...."B..B9..=.
Note that the instructions following the results are different from the first example, as they should be. Here are the same bytes with the results marked in [brackets]:
4e f9 00 01 3d 2a 22 42 [00 64] 84 bc [01 40] 42 22
[00 20] 84 bc [ff c8] 22 42 [ff e0] 42 39 00 01 3d 18
Finally, thinking about the 6801 and 6809's double accumulator instructions, the 68000 has the ability to do both 16-bit and 32-bit math and loads and stores.
But it has no easy way to concatenate the least significant bytes of two registers similar to the 6801/6809 double accumulator instructions. Well, okay, it's not that hard, I guess, but it's harder than just being able to STD.
And there are restrictions, such as 16-bit and 32-bit operands in memory have
to be 16-bit aligned. (Some of the later processors in the family relax this
requirement.)
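To make "not that hard" concrete, here is one way to glue the low bytes of two data registers together, roughly what STD does for A and B on the 6801/6809. A sketch only: it assumes the high byte is in D6 and the low byte in D7, and RESULT is a made-up name for a word-aligned variable (DS.W 1 after an EVEN).

```asm
        LSL.W   #8,D6       ; move the high byte up into bits 15-8 of D6
        MOVE.B  D7,D6       ; drop the low byte into bits 7-0
        MOVE.W  D6,RESULT   ; store both bytes with one 16-bit write
```

That is three instructions and a clobbered D6 where the 6801/6809 needs one instruction, which is the "harder than just being able to STD" part.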
To see how that will play out on the 68000, here's some code:
OPT LIST,SYMTAB ; Options we want for the stand-alone assembler.
MACHINE MC68000 ; because there are a lot the assembler can do.
OPT DEBUG ; We want labels for debugging.
OUTPUT
***********************************************************************
* Adding and subtracting two 8-bit numbers in memory
* and storing them in memory,
* Using 68000's 16-bit arithmetic --
*
EVEN
ENTRY JMP ADDM8
* Instructions are always on even boundaries,
* and use an even number of bytes.
* Remember that the 68000 requires 16-bit and 32-bit accesses
* to be on 2-byte even boundaries.
* The constants for the additions:
NL1 DC.B 34 ; just an arbitrary small number
NR1 DC.B 66 ; another arbitrary small number
EVEN
NSUM1 DS.W 1 ; 2 bytes to hold carry/sign and 8-bit sum
NDH1 EQU NSUM1 ; carry/sign byte
NDL1 EQU NSUM1+1 ; 8-bit sum
NL2 DC.B 132 ; Somewhat larger arbitrary number
NR2 DC.B 188 ; And another
EVEN
NSUM2 DS.W 1 ; carry/sign and 8-bit sum
NDH2 EQU NSUM2 ; carry/sign byte
NDL2 EQU NSUM2+1 ; 8-bit sum
* I could have coordinated these better,
* But I'll rename the constants for the subtractions
* to ML1, ML2, and ML3:
ML1 DC.B 66 ; just an arbitrary small number
MR1 DC.B 34 ; another arbitrary small number
EVEN
MDIFF1 DS.W 1 ; borrow/sign and 8-bit difference
MDH1 EQU MDIFF1 ; To hold borrow/sign from the difference.
MDL1 EQU MDIFF1+1 ; To hold the eight bit difference.
ML2 DC.B 132 ; Somewhat larger arbitrary number
MR2 DC.B 188 ; And another
EVEN
MDIFF2 DS.W 1 ; borrow/sign and 8-bit difference
MDH2 EQU MDIFF2 ; borrow/sign from 2nd difference
MDL2 EQU MDIFF2+1 ; 2nd difference
ML3 DC.B 34 ; just an arbitrary small number
MR3 DC.B 66 ; another arbitrary small number
EVEN
MDIFF3 DS.W 1 ; borrow/sign and 8-bit difference
MDH3 EQU MDIFF3 ; To hold borrow/sign from 3rd difference.
MDL3 EQU MDIFF3+1 ; 3rd eight bit difference.
EVEN
ADDM8 CLR.W D6 ; Prepare left side for 16-bit math.
MOVE.B NL1,D6 ; Get the augend (left side addend).
CLR.W D7 ; Prepare right side for 16-bit math.
MOVE.B NR1,D7 ; Get the addend (right side addend).
ADD.W D7,D6 ; Add right into left.
* Result is safely in D6, including carry.
MOVE.W D6,NSUM1 ; Save the 8-bit sum along with the carry/sign.
*
CLR.W D6 ; Prepare left side for 16-bit math.
CLR.W D7 ; Prepare right side for 16-bit math.
MOVE.B NL2,D6 ; Get the augend (left side addend).
MOVE.B NR2,D7 ; Get the addend (right side addend).
ADD.W D7,D6 ; Add right into left.
* Result is safely in D6, including carry.
MOVE.W D6,NSUM2 ; Save the 8-bit sum along with the carry/sign.
*
NOP
NOP
SUBM8 CLR.W D6 ; Prepare left side for 16-bit math.
MOVE.B ML1,D6 ; Get the minuend (left side).
CLR.W D7 ; Prepare right side for 16-bit math.
MOVE.B MR1,D7 ; Get the subtrahend (right side).
SUB.W D7,D6 ; Subtract right from left.
* Result is safely in D6, including borrow/sign.
MOVE.W D6,MDIFF1 ; Save the 8-bit difference along with the borrow/sign.
*
CLR.W D6 ; Prepare left side for 16-bit math.
CLR.W D7 ; Prepare right side for 16-bit math.
MOVE.B ML2,D6 ; Get the minuend (left side).
MOVE.B MR2,D7 ; Get the subtrahend (right side).
SUB.W D7,D6 ; Subtract right from left.
* Result is safely in D6, including borrow/sign.
MOVE.W D6,MDIFF2 ; Save the 8-bit difference along with the borrow/sign.
*
CLR.W D6 ; Prepare left side for 16-bit math.
MOVE.B ML3,D6 ; Get the minuend (left side).
CLR.W D7 ; Prepare right side for 16-bit math.
MOVE.B MR3,D7 ; Get the subtrahend (right side).
SUB.W D7,D6 ; Subtract right from left.
* Result is safely in D6, including borrow/sign.
MOVE.W D6,MDIFF3 ; Save the 8-bit difference along with the borrow/sign.
*
NOP
NOP
Check the results:
00013D10: 4e f9 00 01 3d 2a 22 42 00 64 84 bc 01 40 42 22 N...=*"B.d...@B"
00013D20: 00 20 84 bc ff c8 22 42 ff e0 42 46 1c 39 00 01 . ...."B..BF.9..
The sums and differences, marked in [brackets]:
4e f9 00 01 3d 2a 22 42 [00 64] 84 bc [01 40] 42 22
[00 20] 84 bc [ff c8] 22 42 [ff e0] 42 46 1c 39 00 01
The comments do tell a lot about what is happening, but they are a little terse.
Knowing what the code looks like, I can say that all the EVEN directives in
there are unnecessary. But I put them in for emphasis, and just in case I do
something strange with the code later when I've forgotten a lot about it. And
for anyone else who needs to read the code.
Why am I using 16-bit addition and subtraction here, when the 6800, etc. code did not?
If we use byte addition and subtraction, we end up with the carry/borrow in the flags, but not in the register. That means we have to have instructions to bring it in, which we have already seen. We're trying to avoid that.
Well, there's an approach I haven't mentioned, using branches, but I am deliberately avoiding that. It looks like this:
...
ADDM8 MOVE.B NL1,D7 ; Get the augend (left side addend).
ADD.B NR1,D7 ; Add the addend (right side addend).
BCC.S ADDM8NC
OR.W #$0100,D7 ; set bit 8
BRA.S ADDM8M
ADDM8NC AND.W #$FEFF,D7 ; clear bit 8
ADDM8M AND.W #$01FF,D7 ; clear the rest
MOVE.W D7,NSUM1 ; Save the 8-bit sum along with the carry/sign.
...
SUBM8 MOVE.B ML1,D7 ; Get the minuend (left side).
SUB.B MR1,D7 ; Subtract the subtrahend (right side).
BCC.S SUBM8NB
OR.W #$FF00,D7 ; set the borrow/sign
BRA.S SUBM8M
SUBM8NB AND.W #$00FF,D7 ; clear the borrow/sign
SUBM8M MOVE.W D7,MDIFF1 ; Save the 8-bit difference along with the borrow/sign.
...
For the addition code, the 68000 Bit Set (BSET) and Bit Clear (BCLR)
instructions could also be used, but you still have to mask off the remaining
bits with the final AND.W, because you don't know what they were when you
started.
And I am deliberately avoiding mentioning that, you understand.
:-/
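For the record, the BSET/BCLR variation on the addition fragment might look like the following. It is a sketch of the idea just described, not something from the original listings, and like the fragment above it would replace the start of ADDM8 in the 16-bit file rather than be added to it.

```asm
ADDM8   MOVE.B  NL1,D7      ; Get the augend (left side addend).
        ADD.B   NR1,D7      ; Add the addend (right side addend).
        BCC.S   ADDM8NC     ; No carry? Go clear bit 8.
        BSET    #8,D7       ; Carry: set bit 8.
        BRA.S   ADDM8M
ADDM8NC BCLR    #8,D7       ; No carry: clear bit 8.
ADDM8M  AND.W   #$01FF,D7   ; Bits 15-9 were never initialized, so mask them off.
        MOVE.W  D7,NSUM1    ; Save the 8-bit sum along with the carry.
```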
If you're wondering, it's good to avoid branches if it doesn't cost too much to do so.
Back to the file we were working on -- because we are going to use word arithmetic to do byte math, we need to clear the lower 16 bits of the registers we are using. Otherwise, we don't know what's up there in the higher order byte, and it's likely to be stuff that trashes our results.
We can clear both words and then load both words, or we can clear and load and clear and load. Either way.
We clear them because the bytes are unsigned. If they were signed, we could use the EXT.W instruction to sign-extend them after loading, instead.
This is a pattern on the 68000 -- for extending unsigned data that will be widened, you clear the width you need before loading the data. For extending signed data, you load and then sign EXTend the data.
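Side by side, the two patterns look something like this. It is a sketch, not part of the listing: it loads the same byte, NR2 (188, or $BC), both ways, and the registers are an arbitrary choice.

```asm
* Unsigned: clear first, then load. 188 stays 188.
        CLR.W   D6
        MOVE.B  NR2,D6      ; D6.W = $00BC
* Signed: load first, then extend. $BC as a signed byte is -68.
        MOVE.B  NR2,D7
        EXT.W   D7          ; D7.W = $FFBC
```

The order matters in the unsigned case: a CLR.W after the load would wipe out the byte just loaded.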
After that, I hope it's all straightforward. Do the math in 16-bit width, store the result in 16-bit width. Subtracting at the wider width takes care of the high byte for you.
Does this feel like the approach we should have been looking at from the start? I hope so.
By the way, RISC CPUs use this approach pretty much exclusively. It's part of
the concept of RISC.
While we're talking about it, how would we go about expanding the 8-bit operands to use 16-bit math on the 6809? Maybe we should take a quick look at that next.