Sunday, September 22, 2024

ALPP 02-06 -- Introduction to Byte Arithmetic on the 68000

(Title Page/Index)

 

If you haven't already worked through the introduction to byte arithmetic on the 6800, 6801, and 6809, you should. I explained the byte math there, so I'm going to focus on 68000 code here, and on the differences between the 68000 code and the 6800/6801/6809 code. 

I'm going to show three different versions for the 68000. The first version follows, as closely as possible, the 6800/6801/6809 code. 

But to reduce the back-and-forth between the editor, the assembler, and the debugger, and to reduce the number of files to edit, I'm putting the addition and subtraction code together into a single file:

	OPT LIST,SYMTAB	; Options we want for the stand-alone assembler.
	MACHINE MC68000	; Name the CPU, because there are a lot the assembler can do.
	OPT DEBUG	; We want labels for debugging.
	OUTPUT
***********************************************************************
* Adding and subtracting two 8-bit numbers in memory
* and storing them in memory,
* transliterating the 6800 code --
*
	EVEN
ENTRY	JMP	ADDM8


* The constants and result storage for the additions:

NL1	DC.B	34	; just an arbitrary small number
NR1	DC.B	66	; another arbitrary small number
C1	DS.B	1	; To hold one bit of carry from the sum.
R1	DS.B	1	; To hold the eight bit sum.

NL2	DC.B	132	; Somewhat larger arbitrary number
NR2	DC.B	188	; And another
C2	DS.B	1	; carry from 2nd sum
R2	DS.B	1	; 2nd sum


* I could have coordinated these better,
* But I'll rename the constants for the subtractions 
* to ML1, ML2, and ML3:

ML1	DC.B	66	; just an arbitrary small number
MR1	DC.B	34	; another arbitrary small number
RH1	DS.B	1	; To hold borrow/sign byte from the difference.
RL1	DS.B	1	; To hold the eight bit difference.

ML2	DC.B	132	; Somewhat larger arbitrary number
MR2	DC.B	188	; And another
RH2	DS.B	1	; borrow/sign from 2nd difference
RL2	DS.B	1	; 2nd difference

ML3	DC.B	34	; just an arbitrary small number
MR3	DC.B	66	; another arbitrary small number
RH3	DS.B	1	; To hold borrow/sign byte from the difference.
RL3	DS.B	1	; To hold the eight bit difference.



	EVEN
ADDM8	CLR.B	D6	; clear a register for the carry bit
	MOVE.B	NL1,D7	; Get the augend (left side addend).
	ADD.B	NR1,D7	; Add the addend (right side addend).
* 8-bit result is safely in D7.
	ROXL.B	#1,D6	; Recover the eXtend carry bit.
	MOVE.B	D6,C1	; Save it away.
	MOVE.B	D7,R1	; Save the sum itself.
*
	CLR.B	D6	; 2nd carry bit
	MOVE.B	NL2,D7	; 2nd augend (left side addend)
	ADD.B	NR2,D7	; 2nd addend (right side addend)
	MOVE.B	D7,R2	; 2nd sum, 8 bits
* MOVE does not alter X bit, carry eXtend still safe.
	ROXL.B	#1,D6	; 2nd carry bit
	MOVE.B	D6,C2	; Save it away.
* Could also use the ADDX.B with D5 pre-cleared method
* used in the subtraction examples below.
*
	NOP
	NOP


SUBM8	CLR.B	D5	; Need a constant zero. (No SUBX #0,Dn.)
	CLR.B	D6	; clear register D6 for the borrow/sign extension
	MOVE.B	ML1,D7	; Get the minuend (left side).
	SUB.B	MR1,D7	; Subtract the subtrahend (right side).
* Result is safely in D7.
	SUBX.B	D5,D6	; Recover the eXtend borrow bit, sign extended.
	MOVE.B	D6,RH1	; Save high byte away.
	MOVE.B	D7,RL1	; Save the difference low byte.
*
	CLR.B	D5	; should still be clear, actually.
	CLR.B	D6	; 2nd sign extension
	MOVE.B	ML2,RL2	; 2nd minuend to memory, just because we can
	MOVE.B	MR2,D7	; 2nd subtrahend in register
	SUB.B	D7,RL2	; 8-bit result safely saved away
	SUBX.B	D5,D6	; 2nd borrow, sign extended
	MOVE.B	D6,RH2	; Save high byte away.
*
	CLR.B	D5	; Again, should still be clear.
	CLR.B	D6	; 3rd borrow bit
	MOVE.B	ML3,D7	; 3rd minuend (left side)
	SUB.B	MR3,D7	; 3rd subtrahend (right side)
	MOVE.B	D7,RL3	; Save 8-bit result away
* MOVE does not alter X bit, borrow eXtend still safe.
	SUBX.B	D5,D6	; 3rd borrow, sign extended.
	MOVE.B	D6,RH3	; Save it away.
*
	NOP
	NOP

The comments pretty much explain what I've done with the variable names, and also what is going on in the code.

Go ahead and open up a second browser window with the 6800/6801/6809 code and compare.

You can see that we can use memory and registers in much the same way as on the 8-bit processors, with some odd exceptions. 

You'll note that having lots of data registers gives us more options in how to organize our code. In some places, I deliberately altered some steps to show that. But, in other places, I was forced to make some changes because of those odd exceptions. 

Other than having lots of registers, specific things to pay attention to:

(1) Rotating bits in memory is limited to 16-bit wide targets, one bit at a time. We can't rotate an 8-bit target in memory. But we can rotate 8-bit targets in a data register, so we do that.

Speaking of rotating, rotating through the carry on the 68000 does not use the Carry flag!

It uses the eXtended carry flag, X. Thus, we have the mnemonic, ROXL.

Why not ROL? 

Lemme 'splain!

Rotating through carry is actually a 9-bit rotate.

That's 8 bits in the target plus the carry (on the 6800, etc.) or the eXtended carry (on the 68000). 

So you don't really have an 8-bit rotate on the 6800/6801/6809. 

Fortunately, 9-bit rotates are what we most commonly need, but, if 9 bits is too many, you have to short-circuit the carry with some odd logic that I'm not going to show yet.

The 68000 gives you true 8-bit rotates -- and 16-bit and 32-bit -- without going through the (eXtended) carry. Rotating within 8/16/32 bits, without going through the eXtended carry, is what ROL and ROR (ROtate Left and ROtate Right) mean on the 68000. (They do still leave a copy of the last bit rotated around in the Carry flag.)
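
As a minimal sketch of the difference, here is the same byte rotated both ways. The registers D0 and D1 and the value $81 are arbitrary choices for illustration, not anything from the listing above:

	ANDI	#$EF,CCR	; Clear X so we know what will get rotated in.
	MOVE.B	#$81,D0	; %10000001
	ROL.B	#1,D0	; 8-bit rotate: bit 7 wraps around to bit 0, D0.B = $03.
* C = 1 (a copy of the bit that wrapped), X is untouched.
	MOVE.B	#$81,D1	; %10000001 again (MOVE leaves X alone).
	ROXL.B	#1,D1	; 9-bit rotate: the old X (0) comes in at bit 0, D1.B = $02.
* The old bit 7 is now in X (and in C).

Step through it in the debugger and watch the X and C bits in the status register along with D0 and D1.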

Erk. Lots of explanation we didn't think we wanted just here, but keep it in mind. It'll come in useful later. Or, perhaps, it's a waste of micro-instructions in the 68000 CPU, but that's something to think about later.

Anyway, that's part of what's behind the differences in the bit rotation code.

(Why split the carry function into Carry and eXtended carry? 

SHHHH! 

You're not supposed to ask that. 

Heh. Yet.)

(2) The Carry flag is cleared by MOVEs. The eXtended carry flag is not. This allows some variation in execution order with the X flag, which can come in handy sometimes. It also forces execution order sometimes.
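
A small sketch of what that means in practice. The registers and the values 200 and 100 are arbitrary, chosen only so the byte addition carries:

	MOVE.B	#200,D0
	ADD.B	#100,D0	; 300 doesn't fit in a byte: D0.B = 44 ($2C), C = 1, X = 1.
	MOVE.B	D0,D1	; MOVE clears C, but leaves X alone.
	SCS	D2	; D2.B = $00 -- too late, the carry in C is already gone.
	CLR.B	D3	; CLR also clears C and leaves X alone.
	ROXL.B	#1,D3	; D3.B = $01 -- X still remembers the carry.

So the X flag survives being moved around, and the C flag has to be captured right away.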

(3) Instead of ADd with Carry (ADC) and SuBtract with Carry (SBC), the 68000 has ADD with eXtend and SUBtract with eXtend, ADDX and SUBX.

... aaaaand ...

Where other instructions on the 68000 are really flexible as to where the source or target are, the ADDX and SUBX instructions are not.

You can ADDX or SUBX two data registers. Not memory-to-register, not register-to-memory. Or you can ADDX or SUBX memory-to-memory in auto-decrement (pre-decrement) mode. (There is no CMPX. The memory-to-memory compare is CMPM, which works only in auto-increment mode and does not use the X flag.)

No immediate mode operands. No ADDX #0.

ARRRGGGGGGHHHHHHH!!!!!!     WHY?!?!?!?!?!?

Sigh. There is reason to this madness, sort-of. Maybe just madness. We'll come back to this.

For now, it can be made to work by using another register to hold the zero.

Yes, that RISC trick.
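
For the additions, the same trick might look something like the following sketch, which could stand in for the second addition in the listing above. It borrows D5 as the zero register, the same choice the subtraction code makes:

	CLR.B	D5	; Need a constant zero. (No ADDX #0,Dn.)
	CLR.B	D6	; 2nd carry byte
	MOVE.B	NL2,D7	; 2nd augend (left side addend)
	ADD.B	NR2,D7	; 2nd addend (right side addend)
	MOVE.B	D7,R2	; MOVE does not alter X, carry eXtend still safe.
	ADDX.B	D5,D6	; D6 = 0 + 0 + X, so D6 is the carry, 0 or 1.
	MOVE.B	D6,C2	; Save it away.

CLR does not alter X either, so the order of the two CLRs relative to the ADD is not critical.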

Well, a partial response to the madness is due here. The 68000 design project began about the time certain departments within IBM were just digging into the 801 processor. Patterson would not go on sabbatical to DEC for a couple of years yet. Not just Motorola, everybody was exploring unknown territory, going boldly where no man (in history) had gone before. Motorola made some mistakes. We all made mistakes -- and are now living with the consequences.

RISC isn't the Holy Grail, anyway.

Guys like me will daydream about what might have been, but daydreaming has limits. 

We can work with less-than-optimal. With some care, we can get pretty close to what should have been, and that's why I'm writing this tutorial.

Let me step off the soap box, and we'll take a look at some nifty gadgetry in the 68000 that provides an alternate way of doing this. But first, copy the code to a text file, save it, maybe as arithm8.s or something, and assemble it with something like

$ vasmm68k_mot -Ftos -no-opt -o ARITHM8.PRG -L arithm8.lst arithm8.s

Get Hatari running, go through the 

CTRL-Z, CD to the target emulated directory, ALT-BREAK, set the "b pc = TEXT :once" breakpoint, (c)ontinue 

protocol and use the debugger to (s)tep through the code, showing (r)egister and (m)emory contents as appropriate.  

The final results in memory should be
> m TEXT 20
00013D10: 4e f9 00 01 3d 2a 22 42 00 64 84 bc 01 40 42 22   N...=*"B.d...@B"
00013D20: 00 20 84 bc ff c8 22 42 ff e0 42 06 1e 39 00 01   . ...."B..B..9..

Well, the actual address of TEXT should be (may be?) different, but the contents should be there. After the six bytes of the JMP instruction, the data comes in groups of four bytes: the two numbers we added or subtracted, followed by the two bytes of the result you want:

22 42 00 64   (34 + 66 = 100)
84 bc 01 40   (132 + 188 = 320)
42 22 00 20   (66 - 34 = 32)
84 bc ff c8   (132 - 188 = -56)
22 42 ff e0   (34 - 66 = -32)

And you'll notice that I'm so focused I didn't include exit code. That's okay, just (q)uit when you're done.

Okay, that nifty gadgetry I promised:

	OPT LIST,SYMTAB	; Options we want for the stand-alone assembler.
	MACHINE MC68000	; Name the CPU, because there are a lot the assembler can do.
	OPT DEBUG	; We want labels for debugging.
	OUTPUT
***********************************************************************
* Adding and subtracting two 8-bit numbers in memory
* and storing them in memory,
* Using the Set conditional instruction to capture the Carry/borrow bit
* instead of the eXtended carry/borrow bit 
*
	EVEN
ENTRY	JMP	ADDM8


* The constants and result storage for the additions:

NL1	DC.B	34	; just an arbitrary small number
NR1	DC.B	66	; another arbitrary small number
C1	DS.B	1	; To hold one bit of carry from the sum.
R1	DS.B	1	; To hold the eight bit sum.

NL2	DC.B	132	; Somewhat larger arbitrary number
NR2	DC.B	188	; And another
C2	DS.B	1	; carry from 2nd sum
R2	DS.B	1	; 2nd sum


* I could have coordinated these better,
* But I'll rename the constants for the subtractions 
* to ML1, ML2, and ML3:

ML1	DC.B	66	; just an arbitrary small number
MR1	DC.B	34	; another arbitrary small number
RH1	DS.B	1	; To hold one bit of borrow from the difference.
RL1	DS.B	1	; To hold the eight bit difference.

ML2	DC.B	132	; Somewhat larger arbitrary number
MR2	DC.B	188	; And another
RH2	DS.B	1	; borrow from 2nd difference
RL2	DS.B	1	; 2nd difference

ML3	DC.B	34	; just an arbitrary small number
MR3	DC.B	66	; another arbitrary small number
RH3	DS.B	1	; To hold one bit of borrow from the difference.
RL3	DS.B	1	; To hold the eight bit difference.



	EVEN
ADDM8	CLR.B	C1	; clear memory for the carry bit/sign byte
	MOVE.B	NL1,D7	; Get the augend (left side addend).
	ADD.B	NR1,D7	; Add the addend (right side addend).
* Result is safely in D7, carry in C,
* But a MOVE will destroy the Carry in C.
	SCS	C1	; Recover the Carry bit.
	AND.B	#1,C1	; Mask off the unnecessary bits.
	MOVE.B	D7,R1	; Save the 8-bit sum itself.
*
	CLR.B	C2	; memory for 2nd carry bit/sign byte
	MOVE.B	NL2,R2	; 2nd augend to result memory
	MOVE.B	NR2,D7	; 2nd addend to a register
	ADD.B	D7,R2	; 8-bit result safely stored
* A MOVE will destroy the Carry in C.
	SCS	C2	; 2nd carry bit
	AND.B	#1,C2	; Mask off the unnecessary bits.
*
	NOP
	NOP


SUBM8	CLR.B	RH1	; memory for 1st borrow/sign byte
	MOVE.B	ML1,D7	; Get the 1st minuend (left side).
	SUB.B	MR1,D7	; Subtract the 1st subtrahend (right side).
* Result is safely in D7, borrow in C,
* But a MOVE will destroy the borrow in C.
	SCS	RH1	; 1st sign byte/borrow, DO NOT MASK!
	MOVE.B	D7,RL1	; Save the result low byte.
*
	CLR.B	RH2	; 2nd borrow/sign extension
	MOVE.B	ML2,RL2	; 2nd minuend (left side) to memory
	MOVE.B	MR2,D7	; 2nd subtrahend (right side)
	SUB.B	D7,RL2	; 2nd result safely stored.
* A MOVE will destroy the borrow in C.
	SCS	RH2	; 2nd borrow, sign extended
*
	CLR.B	D6	; 3rd borrow/sign byte in register
	MOVE.B	ML3,D7	; 3rd minuend (left side)
	SUB.B	MR3,D7	; 3rd subtrahend (right side)
	SCS	D6	; Get sign/borrow in register
* A MOVE will destroy the borrow in C.
	MOVE.B	D7,RL3	; Save 3rd difference, low byte.
	MOVE.B	D6,RH3	; Save sign/borrow away.
*
	NOP
	NOP

(1) Set conditionally is an interesting instruction. It sets a byte to all 1s if the specified condition is true, and to all 0s if it is not. It does not test eXtended carry, but it does test Carry. So you can use Set on Carry Set (SCS) immediately after an ADD or SUBtract or such, but if you do a MOVE first, the Carry is gone.

(2) After an ADD, since we only want the carry out of the high bit of the byte, all 1s from a SCS is too many 1s. So we mask out all but the bottom 1 with AND immediate 1.

(3) AND is one of several instructions that can operate directly on memory, so if the target of SCS is in memory, we can mask the target directly. 

Note that it takes more time to mask something in memory, because the processor has to read it out of memory into a hidden latch to work on it, then write it back. This is true of any operator that works directly on memory, including the Set conditionally operators. So there are trade-offs.

(4) After a SUBtract, SCS setting all ones on borrow is actually what we want, since all 1s is how we know it's negative. So there's no need to mask after a subtract.

(5) ADD and SUBtract are also operators that can work directly on memory. Again, they take more time operating on memory than they do operating on registers.
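
If the extra memory traffic matters, one alternative is to capture and mask the carry in a register and then store it once. Here is a sketch for the first addition, using D6 as a scratch register (my choice for illustration, the listing above does not require it):

	MOVE.B	NL1,D7	; Get the augend (left side addend).
	ADD.B	NR1,D7	; Add the addend; the carry is in C.
	SCS	D6	; D6.B = $FF on carry, $00 on no carry. No CLR needed first.
	AND.B	#1,D6	; Mask down to a single bit, in the register.
	MOVE.B	D6,C1	; One plain write to memory for the carry.
	MOVE.B	D7,R1	; Save the 8-bit sum itself.

Since SCS writes the whole byte either way, clearing the target beforehand is optional. Whether this ordering is actually faster is something to check against the instruction timing tables rather than take from this sketch.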

Oddly enough, as nifty as this Set conditionally and operate-directly-on-memory gadgetry is, it turns out not to be very exciting. But that is not necessarily bad. 

After stepping through and watching the registers and memory, make one final check to see that the values stored are correct, as in the first example. Compare below: 

00013D10: 4e f9 00 01 3d 2a 22 42 00 64 84 bc 01 40 42 22   N...=*"B.d...@B"
00013D20: 00 20 84 bc ff c8 22 42 ff e0 42 39 00 01 3d 18   . ...."B..B9..=.

Note that the instructions following the results are different from the first example, as they should be. The data bytes, from 22 42 00 64 through 22 42 ff e0, should be the same as in the first example.

Finally, thinking about the 6801 and 6809's double accumulator instructions, the 68000 has the ability to do both 16-bit and 32-bit math and loads and stores.

But it has no easy way to concatenate the least significant bytes of two registers similar to the 6801/6809 double accumulator instructions. Well, okay, it's not that hard, I guess, but it's harder than just being able to STD.
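
A sketch of one way to do that concatenation, assuming the high byte is sitting in the low byte of D6, the low byte is in the low byte of D7, and NSUM1 is a word-aligned 16-bit variable (all three are assumptions for illustration):

	LSL.W	#8,D6	; Move D6's low byte up to bits 15-8; bits 7-0 become 0.
	MOVE.B	D7,D6	; Drop D7's low byte into bits 7-0; bits 15-8 are untouched.
	MOVE.W	D6,NSUM1	; Store both bytes at once, high byte first in memory.

That's three instructions where the 6801 and 6809 just need the one STD.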

And there are restrictions, such as 16-bit and 32-bit operands in memory have to be 16-bit aligned. (Some of the later processors in the family relax this requirement.)

To see how that will play out on the 68000, here's some code:

	OPT LIST,SYMTAB	; Options we want for the stand-alone assembler.
	MACHINE MC68000	; Name the CPU, because there are a lot the assembler can do.
	OPT DEBUG	; We want labels for debugging.
	OUTPUT
***********************************************************************
* Adding and subtracting two 8-bit numbers in memory
* and storing them in memory,
* Using 68000's 16-bit arithmetic --
*
	EVEN
ENTRY	JMP	ADDM8
* Instructions are always on even boundaries, 
* and use an even number of bytes.

* Remember that the 68000 requires 16-bit and 32-bit accesses 
* to be on 2-byte even boundaries.

* The constants for the additions:

NL1	DC.B	34	; just an arbitrary small number
NR1	DC.B	66	; another arbitrary small number
	EVEN
NSUM1	DS.W	1	; 2 bytes to hold carry/sign and 8-bit sum
NDH1	EQU	NSUM1	; carry/sign byte
NDL1	EQU	NSUM1+1	; 8-bit sum

NL2	DC.B	132	; Somewhat larger arbitrary number
NR2	DC.B	188	; And another
	EVEN
NSUM2	DS.W	1	; carry/sign and 8-bit sum
NDH2	EQU	NSUM2	; carry/sign byte
NDL2	EQU	NSUM2+1	; 8-bit sum


* I could have coordinated these better,
* But I'll rename the constants for the subtractions 
* to ML1, ML2, and ML3:

ML1	DC.B	66	; just an arbitrary small number
MR1	DC.B	34	; another arbitrary small number
	EVEN
MDIFF1	DS.W	1	; borrow/sign and 8-bit difference
MDH1	EQU	MDIFF1	; To hold borrow/sign from the difference.
MDL1	EQU	MDIFF1+1	; To hold the eight bit difference.

ML2	DC.B	132	; Somewhat larger arbitrary number
MR2	DC.B	188	; And another
	EVEN
MDIFF2	DS.W	1	; borrow/sign and 8-bit difference
MDH2	EQU	MDIFF2	; borrow/sign from 2nd difference
MDL2	EQU	MDIFF2+1	; 2nd difference

ML3	DC.B	34	; just an arbitrary small number
MR3	DC.B	66	; another arbitrary small number
	EVEN
MDIFF3	DS.W	1	; borrow/sign and 8-bit difference
MDH3	EQU	MDIFF3	; To hold borrow/sign from 3rd difference.
MDL3	EQU	MDIFF3+1	; 3rd eight bit difference.



	EVEN
ADDM8	CLR.W	D6	; Prepare left side for 16-bit math.
	MOVE.B	NL1,D6	; Get the augend (left side addend).
	CLR.W	D7	; Prepare right side for 16-bit math.
	MOVE.B	NR1,D7	; Get the addend (right side addend).
	ADD.W	D7,D6	; Add right into left.
* Result is safely in D6, including carry.
	MOVE.W	D6,NSUM1	; Save the 8-bit sum along with the carry/sign.
*
	CLR.W	D6	; Prepare left side for 16-bit math.
	CLR.W	D7	; Prepare right side for 16-bit math.
	MOVE.B	NL2,D6	; Get the augend (left side addend).
	MOVE.B	NR2,D7	; Get the addend (right side addend).
	ADD.W	D7,D6	; Add right into left.
* Result is safely in D6, including carry.
	MOVE.W	D6,NSUM2	; Save the 8-bit sum along with the carry/sign.
*
	NOP
	NOP

SUBM8	CLR.W	D6	; Prepare left side for 16-bit math.
	MOVE.B	ML1,D6	; Get the minuend (left side).
	CLR.W	D7	; Prepare right side for 16-bit math.
	MOVE.B	MR1,D7	; Get the subtrahend (right side).
	SUB.W	D7,D6	; Subtract right from left.
* Result is safely in D6, including borrow.
	MOVE.W	D6,MDIFF1	; Save the 8-bit difference along with the borrow/sign.
*
	CLR.W	D6	; Prepare left side for 16-bit math.
	CLR.W	D7	; Prepare right side for 16-bit math.
	MOVE.B	ML2,D6	; Get the minuend (left side).
	MOVE.B	MR2,D7	; Get the subtrahend (right side).
	SUB.W	D7,D6	; Subtract right from left.
* Result is safely in D6, including borrow.
	MOVE.W	D6,MDIFF2	; Save the 8-bit difference along with the borrow/sign.
*
	CLR.W	D6	; Prepare left side for 16-bit math.
	MOVE.B	ML3,D6	; Get the minuend (left side).
	CLR.W	D7	; Prepare right side for 16-bit math.
	MOVE.B	MR3,D7	; Get the subtrahend (right side).
	SUB.W	D7,D6	; Subtract right from left.
* Result is safely in D6, including borrow.
	MOVE.W	D6,MDIFF3	; Save the 8-bit difference along with the borrow/sign.
*
	NOP
	NOP

Check the results:

00013D10: 4e f9 00 01 3d 2a 22 42 00 64 84 bc 01 40 42 22   N...=*"B.d...@B"
00013D20: 00 20 84 bc ff c8 22 42 ff e0 42 46 1c 39 00 01   . ...."B..BF.9..

The sums and differences are the same bytes as in the first two examples: 00 64 and 01 40 for the sums, and 00 20, ff c8, and ff e0 for the differences.

The comments do tell a lot about what is happening, but they are a little terse.

Knowing what the code looks like, I can say that all the EVEN directives in there are unnecessary. But I put them in for emphasis, and just in case I do something strange with the code later when I've forgotten a lot about it. And for anyone else who needs to read the code.

I hope the data declarations and allocations are clear enough. Unlike high-level languages, assembler can't enforce the intended access widths and boundaries. The best we can do is provide labels where access is intended, with comments about the intended width. Trailing equates can help when you intend alternate access points and widths.

Why am I using 16-bit addition and subtraction here, when the 6800, etc. code did not?

If we use byte addition and subtraction, we end up with the carry/borrow in the flags, but not in the register. That means we have to have instructions to bring it in, which we have already seen. We're trying to avoid that.

Well, there's an approach I haven't mentioned, using branches, but I am deliberately avoiding that. It looks like this:

	...
ADDM8	MOVE.B	NL1,D7	; Get the augend (left side addend).
	ADD.B	NR1,D7	; Add the addend (right side addend).
	BCC.S	ADDM8NC
	OR.W	#$0100,D7	; set bit 8
	BRA.S	ADDM8M
ADDM8NC	AND.W	#$FEFF,D7	; clear bit 8
ADDM8M	AND.W	#$01FF,D7	; clear the rest
	MOVE.W	D7,NSUM1	; Save the 8-bit sum along with the carry/sign.
	...
SUBM8	MOVE.B	ML1,D7	; Get the minuend (left side).
	SUB.B	MR1,D7	; Subtract the subtrahend (right side).
	BCC.S	SUBM8NB
	OR.W	#$FF00,D7	; set the borrow/sign
	BRA.S	SUBM8M
SUBM8NB	AND.W	#$00FF,D7	; clear the borrow/sign
SUBM8M	MOVE.W	D7,MDIFF1	; Save the 8-bit difference along with the borrow/sign.
	...

For the addition code, the 68000 Bit Set (BSET) and Bit Clear (BCLR) instructions could also be used, but you still have to mask off the remaining bits with the final AND.W, because you don't know what they were when you started.

And I am deliberately avoiding mentioning that, you understand.

:-/
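
For the record, a sketch of what the BSET/BCLR variation on the addition might look like, using the same labels as the fragment above:

ADDM8	MOVE.B	NL1,D7	; Get the augend (left side addend).
	ADD.B	NR1,D7	; Add the addend (right side addend).
	BCC.S	ADDM8NC
	BSET	#8,D7	; carry: set bit 8
	BRA.S	ADDM8M
ADDM8NC	BCLR	#8,D7	; no carry: clear bit 8
ADDM8M	AND.W	#$01FF,D7	; Bits 15-9 are still unknown, so mask them off.
	MOVE.W	D7,NSUM1	; Save the 8-bit sum along with the carry.

It saves nothing over the OR.W/AND.W version, which is one more reason not to dwell on it.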

If you're wondering, it's good to avoid branches if it doesn't cost too much to do so.

Back to the file we were working on -- because we are going to use word arithmetic to do byte math, we need to clear the lower 16 bits of the registers we are using. Otherwise, we don't know what's up there in the higher order byte, and it's likely to be stuff that trashes our results.

We can clear both registers and then load both bytes, or we can clear and load and clear and load. Either way.

We clear them because the bytes are unsigned. If they were signed, we could use the EXT.W instruction to sign-extend them after loading, instead.

This is a pattern on the 68000 -- for extending unsigned data that will be widened, you clear the width you need before loading the data. For extending signed data, you load and then sign EXTend the data.
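
A minimal sketch of the two patterns side by side, loading the same byte (NL2, which holds 132, or $84) both ways. Treating NL2 as signed here is only for the sake of the comparison:

	CLR.W	D6	; Unsigned: clear first,
	MOVE.B	NL2,D6	; then load. D6.W = $0084 (132).
	MOVE.B	NL2,D7	; Signed: load first,
	EXT.W	D7	; then sign-extend. D7.W = $FF84 (-124).

The same bit pattern ends up as two different 16-bit numbers, so you have to know which kind of data you have before you widen it.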

After that, I hope it's all straightforward. Do the math in 16-bit width, store the result in 16-bit width. Subtracting at the wider width takes care of the high byte for you. 

Does this feel like the approach we should have been looking at from the start? I hope so.

By the way, RISC CPUs use this approach pretty much exclusively. It's part of the concept of RISC.

While we're talking about it, how would we go about expanding the 8-bit operands to use 16-bit math on the 6809?  Maybe we should take a quick look at that next.

