Wednesday, October 2, 2024

ALPP 02-09 -- On the Beach with Parameters -- 16-bit Arithmetic on the 6801

(Title Page/Index)

 

So we've worked through three different ways to pass parameters at run-time on the 6800.

Now let's see how the 6801 extensions to the 6800 come into play with all of that.

The declarations we borrowed from the improved Hello World examples don't really change from the 6800 code, but the stack initialization, pushes, and pops are improved by PULX and LDD/STD:
	ORG	$80	; MDOS and EXbug docs say it should be okay here.
ENTRY	JMP	START
	NOP		; Just want even addressed pointers for no reason.
PSP	RMB	2	; parameter stack pointer
SSAVE	RMB	2	; a place to keep S so we can return clean
*
*
	ORG	$2000	; MDOS says this is a good place for usr stuff
NOENTRY	JMP	START
	RMB	2	; a little bumper space
SSTKLIM	RMB	31	; 16 levels of call, max
SSTKBAS	RMB	1	; 6800 is post-dec (post-store-decrement) push
	RMB	2	; a little bumper space
PSTKLIM	RMB	64	; 16 levels of call at two parameters per call
PSTKBAS	RMB	2	; bumper space -- parameter stack is pre-dec
*
*
INISTKS	LDX	#PSTKBAS	; Set up the parameter stack
	STX	PSP
	PULX		; get return address
	STS	SSAVE	; Save what the monitor gave us.
	LDS	#SSTKBAS	; Move to our own stack
	JMP	0,X	; return via X
*
PPOP16	LDX	PSP
	LDD	0,X
	INX
	INX
	STX	PSP
	RTS
*
PPSH16	LDX	PSP
	DEX
	DEX
	STX	PSP
	STD	0,X
	RTS

What about LD16I?

We now have the LDD instruction to explicitly load immediate values to the A:B pair like this:

VALUE	EQU	$1234
	...
	LDD	#VALUE

Of course, we can even load an address into the A:B pair like this:

BUFFER	RMB	80	; text buffer
	...
	LDD	#BUFFER

So we don't need LD16I at all! Hoorah, hoorah! 

If we needed it, it would be much cleaner to write, but we don't!

* Don't need LD16I.
* If we needed it, it would look like this, but we don't.
*
* You could use it like this:
*	JSR	LD16I	; load D immediate
*	FDB	$1234	; "immediate" 16-bit value to load
*	JSR	SOMEWHERE ; or some other executable code.
*
* LD16I	PULX		; point to the instruction stream
*	LDD	0,X	; from instruction stream
*	JMP	2,X	; return to the byte after the constant.
*
* But use
*	LDD	#$1234	; 16 bits!
* instead.

What for are you looking at me strange like that again? 

(cough)

Actually, remembering this little bit of syntactic sugar may come in handy down the road, for such things as pointing to tables of constants kept in the code itself.

And that's part of the rest of the story on that little snippet. We look forward to using it.
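
As a rough sketch of that idea (not necessarily how it will be used later), the same PULX trick can hand back a pointer to a table stored right after the JSR. The names INLTBL and TBLPTR, and the layout with a leading count byte, are made up for illustration. TBLPTR is assumed to be a two-byte variable declared with RMB in the direct page, and the count is assumed to be 127 or less so that doubling it still fits in B:

* Hypothetical use:
*	JSR	INLTBL	; returns with X pointing at the first entry
*	FCB	3	; count of 16-bit entries (127 max)
*	FDB	$0001,$0010,$0100
*	...		; execution resumes here
*
INLTBL	PULX		; X points at the count byte
	LDAB	0,X	; B = number of 16-bit entries
	INX		; X points at the first entry
	STX	TBLPTR	; remember where the table starts
	ASLB		; B = table size in bytes
	ABX		; X points just past the table
	PSHX		; that becomes the new return address
	LDX	TBLPTR	; hand the table pointer back in X
	RTS

PSHX stacks the low byte first and the high byte second, the same layout JSR leaves behind, so the final RTS lands on the first instruction after the table.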

Anyway, referring back to Wozniak's Sweet 16 virtual machine, we find that key elements of Sweet 16's 16-bit functionality are present in the 6801's native instruction set, and what remains is dead simple to implement. Combined with the 6801's new direct page mode for JSR, we could even make a really nifty and clean 16-bit relative BRanch Always. More fun than a barrel of monkeys. Later.

How can we improve our addition and subtraction subroutines?

* input parameters:
*   16-bit left, right
* output parameter:
*   16-bit sum
ADD16	LDX	PSP
	LDD	2,X	; left 
	ADDD	0,X	; right
	INX		; adjust parameter stack first
	INX
	STX	PSP
	STD	0,X	; sum (N, Z, & C flags should be correct)
	RTS
* Flags: Specifically,
*        N and Z get set correctly by the final store double;
*        C should make it through manipulating X and storing D.
*        V gets walked on.
*
* input parameters:
*   16-bit left, right
* output parameter:
*   16-bit difference
SUB16	LDX	PSP
	LDD	2,X	; left
	SUBD	0,X	; right
	INX		; adjust parameter stack first
	INX
	STX	PSP
	STD	0,X	; difference (N, Z, & C flags should be correct)
	RTS
* Flags: Specifically,
*        N and Z get set correctly by the final store double;
*        C should make it through manipulating X and storing D.
*        V gets walked on.

That speeds things up a bit, but, surprisingly, what sticks out most is that maintaining the software stack now well outweighs the meat of the function.

Bummer.

On the other hand, we will often find ourselves directly using the new 16-bit wide ADDD and SUBD instructions instead of calling these routines.
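
For instance, bumping one 16-bit variable by another takes three instructions and no call at all. COUNT and STEP here are hypothetical two-byte variables, declared with RMB like the others:

	LDD	COUNT	; 16-bit load
	ADDD	STEP	; 16-bit add; N, Z, V, and C come from the add
	STD	COUNT	; 16-bit store (N and Z reflect the sum, V is cleared)

If the oVerflow flag matters, test it between the ADDD and the STD, because the store clears it.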

BUT THERE'S MORE!

Notice those comments. Careful organization of the code allows us to keep the Zero, Negative, and Carry flags for the caller to use. oVerflow gets walked on. If we needed it, we could preserve it with some TPA and bit twiddling and TAP, like we did in the 6800 code, but, really, we'd just use the ADDD and SUBD instructions directly if we needed the oVerflow flag.

(Or, really, any of the flags, but, please be patient with this. There is a madness to my methods. Or something.)
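
As a sketch of what a caller can do with those surviving flags, assume the two operands are already on the parameter stack, and that WRAPPED, ISZERO, and BORROW are hypothetical labels within short-branch range:

	JSR	ADD16
	BCS	WRAPPED	; carry out of bit 15: the unsigned sum did not fit
	BEQ	ISZERO	; the 16-bit sum was exactly zero
*
	JSR	SUB16
	BCS	BORROW	; borrow: left was less than right, unsigned

Neither JSR nor RTS touches the condition codes, so the branches see exactly what the routine left behind.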

So, here's the complete test frame for the software parameter stack on the 6801:

* 16-bit addition and subtraction for 6801 on parameter stack,
* with test code
* Joel Matthew Rees, October 2024
*
NATWID	EQU	2	; 2 bytes in the CPU's natural integer
*
*
* Blank line will end assembly.
	ORG	$80	; MDOS and EXbug docs say it should be okay here.
ENTRY	JMP	START
	NOP		; Just want even addressed pointers for no reason.
PSP	RMB	2	; parameter stack pointer
SSAVE	RMB	2	; a place to keep S so we can return clean
*
*
	ORG	$2000	; MDOS says this is a good place for usr stuff
NOENTRY	JMP	START
	RMB	2	; a little bumper space
SSTKLIM	RMB	31	; 16 levels of call, max
SSTKBAS	RMB	1	; 6800 is post-dec (post-store-decrement) push
	RMB	2	; a little bumper space
PSTKLIM	RMB	64	; 16 levels of call at two parameters per call
PSTKBAS	RMB	2	; bumper space -- parameter stack is pre-dec
*
*
INISTKS	LDX	#PSTKBAS	; Set up the parameter stack
	STX	PSP
	PULX		; get return address
	STS	SSAVE	; Save what the monitor gave us.
	LDS	#SSTKBAS	; Move to our own stack
	JMP	0,X	; return via X
*
PPOP16	LDX	PSP
	LDD	0,X
	INX
	INX
	STX	PSP
	RTS
*
PPSH16	LDX	PSP
	DEX
	DEX
	STX	PSP
	STD	0,X
	RTS
*
* Don't need LD16I.
* If we needed it, it would look like this, but we don't.
*
* You could use it like this:
*	JSR	LD16I	; load D immediate
*	FDB	$1234	; "immediate" 16-bit value to load
*	JSR	SOMEWHERE ; or some other executable code.
*
* LD16I	PULX		; point to the instruction stream
*	LDD	0,X	; from instruction stream
*	JMP	2,X	; return to the byte after the constant.
*
* But use
*	LDD	#$1234	; 16 bits!
* instead.
*
*
* input parameters:
*   16-bit left, right
* output parameter:
*   16-bit sum
ADD16	LDX	PSP
	LDD	2,X	; left 
	ADDD	0,X	; right
	INX		; adjust parameter stack first
	INX
	STX	PSP
	STD	0,X	; sum (N, Z, & C flags should be correct)
	RTS
* Flags: Specifically,
*        N and Z get set correctly by the final store double;
*        C should make it through manipulating X and storing D.
*        V gets walked on.
*
* input parameters:
*   16-bit left, right
* output parameter:
*   16-bit difference
SUB16	LDX	PSP
	LDD	2,X	; left
	SUBD	0,X	; right
	INX		; adjust parameter stack first
	INX
	STX	PSP
	STD	0,X	; difference (N, Z, & C flags should be correct)
	RTS
* Flags: Specifically,
*        N and Z get set correctly by the final store double;
*        C should make it through manipulating X and storing D.
*        V gets walked on.
*
*
START	JSR	INISTKS
*
	LDD	#$1234
	JSR	PPSH16
	LDD	#$CDEF
	JSR	PPSH16
	JSR	ADD16	; result should be $E023
	LDD	#$8765
	JSR	PPSH16
	JSR	SUB16	; result should be $58BE
	LDX	PSP
	LDD	0,X	; load the result into A:B
*
DONE	LDS	SSAVE	; restore the monitor stack pointer
	NOP
	NOP		; landing pad

Make sure you've copied everything correctly, step through it, try other constants. Convince yourself that you'd rather use the 6801 than the 6800.

(Why didn't Motorola release the 6801 core in a package that could be dropped into a socket for the 6800? Yeah, yeah, I was the unpaying customer with great demands.)

And let's see how it might look with the single interleaved stack discipline I keep disparaging:

* 16-bit addition and subtraction for 6801 on return stack,
* with test code
* Joel Matthew Rees, October 2024
*
NATWID	EQU	2	; 2 bytes in the CPU's natural integer
*
*
* Blank line will end assembly.
	ORG	$80	; MDOS and EXbug docs say it should be okay here.
ENTRY	JMP	START
	NOP		; Just want even addressed pointers for no reason.
SSAVE	RMB	2	; a place to keep S so we can return clean
*
*
	ORG	$2000	; MDOS says this is a good place for usr stuff
NOENTRY	JMP	START
	NOP		; bump to aligned
	RMB	2	; a little bumper space
SSTKLIM	RMB	95	; (64+31) roughly 16 levels of call, max
SSTKBAS	RMB	1	; 6800 is post-dec (post-store-decrement) push
	RMB	2	; a little bumper space
*
*
INISTKS	PULX		; Get return address.
	STS	SSAVE	; Save what the monitor gave us.
	LDS	#SSTKBAS	; Move to our own stack
	JMP	0,X	; return via X
*
*
* input parameters:
*   16-bit left, right
* output parameter:
*   16-bit sum
ADD16	TSX
	LDD	4,X	; left
	ADDD	2,X	; right
ADD16S	STD	4,X	; sum
	LDX	0,X	; return address before we deallocate it
	INS		; drop return address
	INS
	INS		; drop right-hand addend
	INS
	JMP	0,X	; return
*
* input parameters:
*   16-bit left, right
* output parameter:
*   16-bit difference
SUB16	TSX
	LDD	4,X	; left
	SUBD	2,X	; right
	BRA	ADD16S	; Steal code.
* Could steal code this way in the parameter stack example, as well.
*
*
START	JSR	INISTKS
*
	LDD	#$1234
	PSHB		; push low byte first, so the high byte lands at the lower address
	PSHA
	LDD	#$CDEF
	PSHB
	PSHA
	JSR	ADD16	; result should be $E023
	LDD	#$8765
	PSHB
	PSHA
	JSR	SUB16	; result should be $58BE
	PULA
	PULB
*
DONE	LDS	SSAVE	; restore the monitor stack pointer
	NOP
	NOP		; landing pad

Again, being able to use the native push and pop instructions seems to clean up the code significantly.

But we are still playing dodgy games avoiding the return address, and those games will still tend to keep you too amused late at night.
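
As the comment in the listing notes, the same code stealing would work in the parameter stack version. A sketch, borrowing the ADD16S label name for the shared tail:

ADD16	LDX	PSP
	LDD	2,X	; left
	ADDD	0,X	; right
ADD16S	INX		; shared tail: drop one parameter
	INX
	STX	PSP
	STD	0,X	; result (N, Z, & C flags as before)
	RTS
*
SUB16	LDX	PSP
	LDD	2,X	; left
	SUBD	0,X	; right
	BRA	ADD16S	; Steal code.

Assuming the assembler uses direct-page addressing for PSP, the shared tail is seven bytes and the BRA is two, so this saves five bytes at the cost of one extra branch on every subtraction.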

And let's try it using a scratch area in the direct page (DP) to pass values in and out:

* 16-bit addition and subtraction for 6801 via scratch pad,
* with test code
* Joel Matthew Rees, October 2024
*
NATWID	EQU	2	; 2 bytes in the CPU's natural integer
*
*
* Blank line will end assembly.
	ORG	$80	; MDOS and EXbug docs say it should be okay here.
ENTRY	JMP	START
	NOP		; Just want even addressed pointers for no reason.
SSAVE	RMB	2	; a place to keep S so we can return clean
* parameter/scratch area for leaf functions only:
NLFT	RMB	2	; binary operator left side parameter
NRT	RMB	2	; binary operator right side parameter
NRES	RMB	2	; unary/binary operator result
NTEMP	RMB	2	; general scratch register
NPAR	EQU	NLFT	; unary operator parameter
NSCRAT	EQU	NLFT	; 
*
*
	ORG	$2000	; MDOS says this is a good place for usr stuff
NOENTRY	JMP	START
	NOP		; bump to aligned
	RMB	2	; a little bumper space
SSTKLIM	RMB	31	; roughly 16 levels of call, max
SSTKBAS	RMB	1	; 6800 is post-dec (post-store-decrement) push
	RMB	2	; a little bumper space
*
*
INISTKS	PULX		; get return address
	STS	SSAVE	; Save what the monitor gave us.
	LDS	#SSTKBAS	; Move to our own stack
	JMP	0,X	; return via X
*
*
* Don't need PPOP and PPSH
*
*
* input parameters:
*   16-bit left, right
* output parameter:
*   16-bit sum
ADD16	LDD	NLFT
	ADDD	NRT
ADD16S	STD	NRES	; sum
	RTS
*
* input parameters:
*   16-bit left, right
* output parameter:
*   16-bit difference
SUB16	LDD	NLFT
	SUBD	NRT
	STD	NRES	; difference
	RTS
* Stealing code would only save 1 byte.
*
*
START	JSR	INISTKS
*
	LDD	#$1234
	STD	NLFT
	LDD	#$CDEF
	STD	NRT
	JSR	ADD16	; result should be $E023
	LDD	NRES
	STD	NLFT
	LDD	#$8765
	STD	NRT
	JSR	SUB16	; result should be $58BE
	LDD	NRES
*
* Repeat, with native instructions:
	LDD	#$1234
	ADDD	#$CDEF
	SUBD	#$8765
*
DONE	LDS	SSAVE	; restore the monitor stack pointer
	NOP
	NOP		; landing pad

Dramatic?

But still, all of that? Just to write the equivalent of

	LDD	#$1234
	ADDD	#$CDEF
	SUBD	#$8765

??

Yeah, I jest. Again, there are things you cannot reduce to constants at design- or compile-time.
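
For example, here is a sketch of summing a table whose contents are only known at run time. TABLE, TBLEND, and TOTAL are hypothetical names, with TBLEND marking the first byte past the last 16-bit entry, and the table is assumed to hold at least one entry:

SUMTBL	LDX	#TABLE	; point at the first entry
	LDD	#0	; running total
SUMLP	ADDD	0,X	; add the entry X points at
	INX
	INX		; step to the next entry
	CPX	#TBLEND	; past the end?
	BNE	SUMLP
	STD	TOTAL	; total wraps modulo 65536 if it overflows
	RTS

No assembler can fold that loop into a constant, and the ADDD in the middle is doing the same work ADD16 does, just without the call.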

But, even though it appears dramatic, you might be able to see a trend here. Let's see how that trend continues on the 6809.

