On the Beach with Parameters -- 16-bit Arithmetic on the 6801
So we've worked through three different ways to pass parameters at run-time on the 6800.
Now let's see how the 6801 extensions to the 6800 come into play with all of that.
        ORG   $80 ; MDOS and EXbug docs say it should be okay here.
ENTRY   JMP   START
        NOP ; Just want even addressed pointers for no reason.
PSP     RMB   2 ; parameter stack pointer
SSAVE   RMB   2 ; a place to keep S so we can return clean
*
*
        ORG   $2000 ; MDOS says this is a good place for usr stuff
NOENTRY JMP   START
        RMB   2 ; a little bumper space
SSTKLIM RMB   31 ; 16 levels of call, max
SSTKBAS RMB   1 ; 6800 is post-dec (post-store-decrement) push
        RMB   2 ; a little bumper space
PSTKLIM RMB   64 ; 16 levels of call at two parameters per call
PSTKBAS RMB   2 ; bumper space -- parameter stack is pre-dec
*
*
INISTKS LDX   #PSTKBAS ; Set up the parameter stack
        STX   PSP
        PULX ; get return address
        STS   SSAVE ; Save what the monitor gave us.
        LDS   #SSTKBAS ; Move to our own stack
        JMP   0,X ; return via X
*
PPOP16  LDX   PSP
        LDD   0,X
        INX
        INX
        STX   PSP
        RTS
*
PPSH16  LDX   PSP
        DEX
        DEX
        STX   PSP
        STD   0,X
        RTS
What about LD16I?
We now have the LDD instruction to explicitly load immediate values to the A:B pair like this:
VALUE   EQU   $1234
        ...
        LDD   #VALUE
Of course, we can even load an address into the A:B pair like this:
BUFFER  RMB   80 ; text buffer
        ...
        LDD   #BUFFER
So we don't need LD16I at all! Hoorah, hoorah!
If we needed it, it would be much cleaner to write, but we don't!
* Don't need LD16I.
* If we needed it, it would look like this, but we don't.
*
* You could use it like this:
*         JSR   LD16I ; load D immediate
*         FDB   $1234 ; "immediate" 16-bit value to load
*         JSR   SOMEWHERE ; or some other executable code.
*
* LD16I   PULX ; point to the instruction stream
*         LDD   0,X ; from instruction stream
*         JMP   2,X ; return to the byte after the constant.
*
* But use
*         LDD   #$1234 ; 16 bits!
* instead.
What for are you looking at me strange like that again?
(cough)
Actually, remembering this little bit of syntactic sugar may come in handy
down the road, for such things as pointing to tables of constants kept in the
code itself.
And that's part of the rest of the story on that little snippet. We look
forward to using it.
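As a rough sketch of that table-pointing idea, the same return-address trick can hand back a pointer to constants embedded in the instruction stream. TBLPTR, POWERS, and the table contents are names and values made up here for illustration, not code from this series:

```
* Hypothetical sketch: TBLPTR, POWERS, and the table values are
* invented for illustration.
        JSR   TBLPTR    ; comes back with X pointing at POWERS
POWERS  FDB   1,10,100,1000 ; four 16-bit constants in the code itself
        LDD   4,X       ; execution resumes here; loads the third entry, 100
* ... more code ...
*
TBLPTR  PULX            ; return address is the address of the table
        JMP   8,X       ; skip the 8 bytes of table; X is unchanged
```

As written, TBLPTR only works for an 8-byte table. A more general version would need the table length stored somewhere it could find it.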
Anyway, referring back to Wozniak's Sweet 16 virtual machine, we find that key
elements of Sweet 16's 16-bit functionality are present in the 6801's native
instruction set, and what remains is dead simple to implement. Combined with
the 6801's new direct page mode for JSR, we could even make a really nifty and
clean 16-bit relative BRanch Always. More fun than a barrel of monkeys.
Later.
How can we improve our addition and subtraction subroutines?
* input parameters:
*   16-bit left, right
* output parameter:
*   16-bit sum
ADD16   LDX   PSP
        LDD   2,X ; left
        ADDD  0,X ; right
        INX ; adjust parameter stack first
        INX
        STX   PSP
        STD   0,X ; sum (N, Z, & C flags should be correct)
        RTS
* Flags: Specifically,
*   N and Z get set correctly by the final store double;
*   C should make it through manipulating X and storing D.
*   V gets walked on.
*
* input parameters:
*   16-bit left, right
* output parameter:
*   16-bit difference
SUB16   LDX   PSP
        LDD   2,X ; left
        SUBD  0,X ; right
        INX ; adjust parameter stack first
        INX
        STX   PSP
        STD   0,X ; difference (N, Z, & C flags should be correct)
        RTS
* Flags: Specifically,
*   N and Z get set correctly by the final store double;
*   C should make it through manipulating X and storing D.
*   V gets walked on.
That speeds things up a bit, but, surprisingly, what sticks out most is that maintaining the software stack now well outweighs the meat of the function.
Bummer.
On the other hand, we will often find ourselves directly using the new 16-bit wide ADDD and SUBD instructions instead of calling these routines.
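For example, when both operands already sit in memory, the whole operation fits in three instructions with no call overhead. This is only a sketch; COUNT and STEP are hypothetical variables, assumed to be declared somewhere in RAM:

```
* Hypothetical variables, for illustration only.
COUNT   RMB   2         ; running total
STEP    RMB   2         ; amount to add each time
* ...
        LDD   COUNT
        ADDD  STEP      ; N, Z, V, and C all reflect the 16-bit add
        STD   COUNT     ; the store resets N and Z from D and clears V; C survives
```

If the caller needs oVerflow, it has to test it between the ADDD and the STD.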
BUT THERE'S MORE!
Notice those comments. Careful organization of the code allows us to keep the Zero, Negative, and Carry flags for the caller to use. oVerflow gets walked on. If we needed it, we could preserve it with some TPA, bit twiddling, and TAP, like we did in the 6800 code. But, really, we'd just use the ADDD and SUBD instructions directly if we needed the oVerflow flag.
(Or, really, any of the flags, but, please be patient with this. There is a
madness to my methods. Or something.)
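For the curious, here is one way to hang on to oVerflow without TPA and TAP. It is an alternative sketched under my own assumptions, not the approach used in the 6800 code: branch on V while it is still valid, then set it again with SEV after the stores have cleared it. ADD16V and ADD16O are hypothetical labels, and the whole thing relies on INX changing only the Z flag.

```
* Hypothetical variant of ADD16 that also returns V.
* ADD16V and ADD16O are invented labels.
ADD16V  LDX   PSP
        LDD   2,X       ; left
        ADDD  0,X       ; right; N, Z, V, and C are all valid here
        INX             ; INX touches only Z, so V and C survive
        INX
        BVS   ADD16O    ; overflow? take the path that restores V
        STX   PSP
        STD   0,X       ; sum; the store leaves V clear, which is correct here
        RTS
ADD16O  STX   PSP
        STD   0,X       ; sum; N and Z set from D, V cleared
        SEV             ; put V back; SEV changes no other flag
        RTS
```

The duplicated tail is the price of keeping the flag, which is one more reason to reach for ADDD directly when V matters.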
So, here's the complete test frame for the software parameter stack on the 6801:
* 16-bit addition and subtraction for 6801 on parameter stack,
* with test code
* Joel Matthew Rees, October 2024
*
NATWID  EQU   2 ; 2 bytes in the CPU's natural integer
*
*
* Blank line will end assembly.
        ORG   $80 ; MDOS and EXbug docs say it should be okay here.
ENTRY   JMP   START
        NOP ; Just want even addressed pointers for no reason.
PSP     RMB   2 ; parameter stack pointer
SSAVE   RMB   2 ; a place to keep S so we can return clean
*
*
        ORG   $2000 ; MDOS says this is a good place for usr stuff
NOENTRY JMP   START
        RMB   2 ; a little bumper space
SSTKLIM RMB   31 ; 16 levels of call, max
SSTKBAS RMB   1 ; 6800 is post-dec (post-store-decrement) push
        RMB   2 ; a little bumper space
PSTKLIM RMB   64 ; 16 levels of call at two parameters per call
PSTKBAS RMB   2 ; bumper space -- parameter stack is pre-dec
*
*
INISTKS LDX   #PSTKBAS ; Set up the parameter stack
        STX   PSP
        PULX ; get return address
        STS   SSAVE ; Save what the monitor gave us.
        LDS   #SSTKBAS ; Move to our own stack
        JMP   0,X ; return via X
*
PPOP16  LDX   PSP
        LDD   0,X
        INX
        INX
        STX   PSP
        RTS
*
PPSH16  LDX   PSP
        DEX
        DEX
        STX   PSP
        STD   0,X
        RTS
*
* Don't need LD16I.
* If we needed it, it would look like this, but we don't.
*
* You could use it like this:
*         JSR   LD16I ; load D immediate
*         FDB   $1234 ; "immediate" 16-bit value to load
*         JSR   SOMEWHERE ; or some other executable code.
*
* LD16I   PULX ; point to the instruction stream
*         LDD   0,X ; from instruction stream
*         JMP   2,X ; return to the byte after the constant.
*
* But use
*         LDD   #$1234 ; 16 bits!
* instead.
*
*
* input parameters:
*   16-bit left, right
* output parameter:
*   16-bit sum
ADD16   LDX   PSP
        LDD   2,X ; left
        ADDD  0,X ; right
        INX ; adjust parameter stack first
        INX
        STX   PSP
        STD   0,X ; sum (N, Z, & C flags should be correct)
        RTS
* Flags: Specifically,
*   N and Z get set correctly by the final store double;
*   C should make it through manipulating X and storing D.
*   V gets walked on.
*
* input parameters:
*   16-bit left, right
* output parameter:
*   16-bit difference
SUB16   LDX   PSP
        LDD   2,X ; left
        SUBD  0,X ; right
        INX ; adjust parameter stack first
        INX
        STX   PSP
        STD   0,X ; difference (N, Z, & C flags should be correct)
        RTS
* Flags: Specifically,
*   N and Z get set correctly by the final store double;
*   C should make it through manipulating X and storing D.
*   V gets walked on.
*
*
START   JSR   INISTKS
*
        LDD   #$1234
        JSR   PPSH16
        LDD   #$CDEF
        JSR   PPSH16
        JSR   ADD16 ; result should be $E023
        LDD   #$8765
        JSR   PPSH16
        JSR   SUB16 ; result should be $58BE
        LDX   PSP
        LDD   0,X ; load the result into A:B
*
DONE    LDS   SSAVE ; restore the monitor stack pointer
        NOP
        NOP ; landing pad
Make sure you've copied everything correctly, step through it, and try other constants. Convince yourself that you'd rather use the 6801 than the 6800.
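For instance, a pair of constants that wraps around shows the flags coming back to the caller. This is a sketch to drop in just before DONE in the listing above; WRAPOK is a hypothetical label, and the branches are only there so you can see which way things went while stepping:

```
* Hypothetical extra test; WRAPOK is an invented label.
        LDD   #$FFFF
        JSR   PPSH16
        LDD   #$0001
        JSR   PPSH16
        JSR   ADD16     ; result should be $0000
        BCC   DONE      ; not expected: the add should carry out
        BNE   DONE      ; not expected: the sum should be zero
WRAPOK  NOP             ; landing pad: C and Z both came back set
```

If the stepping lands on WRAPOK, both Carry and Zero survived the trip through the stack adjustment, as the comments on ADD16 claim.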
(Why didn't Motorola release the 6801 core in a package that could be dropped
into a socket for the 6800? Yeah, yeah, I was the unpaying customer with great
demands.)
And let's see how it might look with the single interleaved stack discipline I
keep disparaging:
* 16-bit addition and subtraction for 6801 on return stack,
* with test code
* Joel Matthew Rees, October 2024
*
NATWID  EQU   2 ; 2 bytes in the CPU's natural integer
*
*
* Blank line will end assembly.
        ORG   $80 ; MDOS and EXbug docs say it should be okay here.
ENTRY   JMP   START
        NOP ; Just want even addressed pointers for no reason.
SSAVE   RMB   2 ; a place to keep S so we can return clean
*
*
        ORG   $2000 ; MDOS says this is a good place for usr stuff
NOENTRY JMP   START
        NOP ; bump to aligned
        RMB   2 ; a little bumper space
SSTKLIM RMB   95 ; (64+31) roughly 16 levels of call, max
SSTKBAS RMB   1 ; 6800 is post-dec (post-store-decrement) push
        RMB   2 ; a little bumper space
*
*
INISTKS PULX ; Get return address.
        STS   SSAVE ; Save what the monitor gave us.
        LDS   #SSTKBAS ; Move to our own stack
        JMP   0,X ; return via X
*
*
* input parameters:
*   16-bit left, right
* output parameter:
*   16-bit sum
ADD16   TSX
        LDD   4,X ; left
        ADDD  2,X ; right
ADD16S  STD   4,X ; sum
        LDX   0,X ; return address before we deallocate it
        INS ; drop return address
        INS
        INS ; drop right-hand addend
        INS
        JMP   0,X ; return
*
* input parameters:
*   16-bit left, right
* output parameter:
*   16-bit difference
SUB16   TSX
        LDD   4,X ; left
        SUBD  2,X ; right
        BRA   ADD16S ; Steal code.
* Could steal code this way in the parameter stack example, as well.
*
*
START   JSR   INISTKS
*
        LDD   #$1234
        PSHB ; push in correct order
        PSHA
        LDD   #$CDEF
        PSHB
        PSHA
        JSR   ADD16 ; result should be $E023
        LDD   #$8765
        PSHB
        PSHA
        JSR   SUB16 ; result should be $58BE
        PULA
        PULB
*
DONE    LDS   SSAVE ; restore the monitor stack pointer
        NOP
        NOP ; landing pad
Again, being able to use the native push and pop instructions seems to clean
up the code significantly.
But we are still playing dodgy games avoiding the return address, and those games will still tend to keep you too amused late at night.
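As the comment in that listing notes, the code-stealing trick carries over to the parameter stack version, too. A sketch of how that might look, borrowing the ADD16S label name for the shared tail:

```
* Sketch: parameter stack ADD16 and SUB16 sharing one tail.
ADD16   LDX   PSP
        LDD   2,X       ; left
        ADDD  0,X       ; right
ADD16S  INX             ; shared tail: drop the right-hand operand
        INX
        STX   PSP
        STD   0,X       ; result replaces the left-hand operand
        RTS
*
SUB16   LDX   PSP
        LDD   2,X       ; left
        SUBD  0,X       ; right
        BRA   ADD16S    ; Steal code.
```

Assuming the assembler picks direct addressing for PSP, that trades seven bytes of duplicated tail for a two-byte branch, at the cost of the extra branch every time SUB16 runs.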
And let's try it using a scratch area in the direct page (DP) to pass values in and out:
* 16-bit addition and subtraction for 6801 via scratch pad,
* with test code
* Joel Matthew Rees, October 2024
*
NATWID  EQU   2 ; 2 bytes in the CPU's natural integer
*
*
* Blank line will end assembly.
        ORG   $80 ; MDOS and EXbug docs say it should be okay here.
ENTRY   JMP   START
        NOP ; Just want even addressed pointers for no reason.
SSAVE   RMB   2 ; a place to keep S so we can return clean
* parameter/scratch area for leaf functions only:
NLFT    RMB   2 ; binary operator left side parameter
NRT     RMB   2 ; binary operator right side parameter
NRES    RMB   2 ; unary/binary operator result
NTEMP   RMB   2 ; general scratch register for
NPAR    EQU   NLFT ; unary operator parameter
NSCRAT  EQU   NLFT ;
*
*
        ORG   $2000 ; MDOS says this is a good place for usr stuff
NOENTRY JMP   START
        NOP ; bump to aligned
        RMB   2 ; a little bumper space
SSTKLIM RMB   31 ; roughly 16 levels of call, max
SSTKBAS RMB   1 ; 6800 is post-dec (post-store-decrement) push
        RMB   2 ; a little bumper space
*
*
INISTKS PULX ; get return address
        STS   SSAVE ; Save what the monitor gave us.
        LDS   #SSTKBAS ; Move to our own stack
        JMP   0,X ; return via X
*
*
* Don't need PPOP and PPSH
*
*
* input parameters:
*   16-bit left, right
* output parameter:
*   16-bit sum
ADD16   LDD   NLFT
        ADDD  NRT
ADD16S  STD   NRES ; sum
        RTS
*
* input parameters:
*   16-bit left, right
* output parameter:
*   16-bit difference
SUB16   LDD   NLFT
        SUBD  NRT
        STD   NRES ; difference
        RTS
* Stealing code would only save 1 byte.
*
*
START   JSR   INISTKS
*
        LDD   #$1234
        STD   NLFT
        LDD   #$CDEF
        STD   NRT
        JSR   ADD16 ; result should be $E023
        LDD   NRES
        STD   NLFT
        LDD   #$8765
        STD   NRT
        JSR   SUB16 ; result should be $58BE
        LDD   NRES
*
* Repeat, with native instructions:
        LDD   #$1234
        ADDD  #$CDEF
        SUBD  #$8765
*
DONE    LDS   SSAVE ; restore the monitor stack pointer
        NOP
        NOP ; landing pad
Dramatic?
But still, all of that? Just to write the equivalent of
        LDD   #$1234
        ADDD  #$CDEF
        SUBD  #$8765
??
Yeah, I jest. Again, there are things you cannot reduce to constants at
design- or compile-time.
But, even though it appears dramatic, you might be able to see a trend here. Let's see how that trend continues on the 6809.