On the Beach with Parameters -- 16-bit Arithmetic on the 6809 with Direct Page Moved
Having worked through three different ways to pass parameters at run-time on the 6809, we remembered that the 6809 has the direct page register. Let's use it, repeating the three ways to pass parameters.
Why?
Because I want to focus on that idea of moving the direct page before doing this all on the 68000.
These are very minor changes to the parameter stack and combined stack versions, but the changes are more significant (if still minor) for the statically allocated parameters (in the direct page) version. When you step through, pay attention to the direct page register, and to the object code and the actual address accessed when direct page mode is used to access variables in the direct page -- the SSAVE variable (and the new DPSAVE and FINAL variables) and, in the "direct page" version, the parameter variables themselves.
I've been abbreviating my references, by the way, in a way that I should not have, referring to the statically allocated parameters as direct page parameters or some such. This makes sense on the 6809, and sort-of makes sense on the 6800/6801, but it doesn't map directly to the 68000, and won't map directly to processors without a direct page.
And we want to think carefully about how we map the concept to the 6800/6801.
Understanding what we're doing here will help when we move on to the 68000,
and, later, if someone picks up other processors.
Let's look first at the separate parameter stack version, starting with the declarations. Where we have declared SSAVE in page zero up to this point, we're now declaring it out in page $20.
ORG $2000 ; MDOS says this is a good place for usr stuff.
* SETDP $20 ; some other assemblers
SETDP $2000 ; EXORsim
*
ENTRY LBRA START
NOP ; Just want even addressed pointers for no reason.
SSAVE RMB 2 ; a place to keep S so we can return clean
DPSAVE RMB 2 ; a place to keep DP so we can return clean
FINAL RMB 2 ; Final result in DP variable (to show we can)
The SETDP declarations here should not be necessary for the assembler. I put them here more as comments, to indicate to the human reader that we intend to set the DP to point here.
And, as I've noted, different assemblers have different semantics for the SETDP declarative. The ones I generally use just take the page number, but EXORsim's assembler wants the whole base address.
I've added a DPSAVE to save the DP we get from the monitor.
I've also added a FINAL variable to store the final result in, just as a kind of interpretive demonstration.
And that's it. After that, I move up to page $21 to declare the stacks, to
show that the stacks don't have to be in the direct page. They can be if
there's room, but I don't want anyone thinking they have to be.
SETDP 0 ; Not yet set up
ORG $2100 ; Give the DP room.
RMB 2 ; a little bumper space
SSTKLIM RMB 32 ; 16 levels of call, max
* ; 6809 is pre-dec (pre-store-decrement) push
SSTKBAS RMB 2 ; a little bumper space
PSTKLIM RMB 64 ; 16 levels of call at two parameters per call
PSTKBAS RMB 2 ; bumper space -- parameter stack is pre-dec
Following the stack declarations is the stack initialization routine, where we
fairly carefully get the monitor's DP and put it in Y, then calculate out the
page number by relative addressing and move the base address from X to D,
where we can access the page number in A and TransFeR it to DP.
And then we SETDP for the duration of the source, until we restore DP at the end.
Once DP is set and declared, I use the direct page variables to save the DP
and S that we get from the monitor ROM. When you check the code, you'll see
that the addresses are given in short form, as offsets from the base address
that DP points to.
INISTKS TFR DP,A
CLRB
TFR D,Y ; save old DP base for a moment
LEAX ENTRY,PCR ; Set up new DP base
TFR X,D
TFR A,DP ; Now we can access DP variables correctly.
* SETDP $20 ; some other assemblers
SETDP $2000 ; EXORsim
STY DPSAVE ; technically only need to save high byte
LEAU PSTKBAS,PCR ; Set up the parameter stack
PULS X ; get return address
STS SSAVE ; Save what the monitor gave us.
LEAS SSTKBAS,PCR ; Move to our own stack
JMP ,X ; return via X
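To make the short-form addressing concrete, here is a small sketch of what to look for in the listing. It assumes the layout above (ENTRY at $2000, a three-byte LBRA and a one-byte NOP, so SSAVE, DPSAVE and FINAL land at $2004, $2006 and $2008). It also assumes the assembler accepts the common Motorola-style `<` and `>` prefixes for forcing direct and extended mode. That is an assumption; EXORsim's assembler may or may not accept them, so check before relying on them. The object bytes in the comments come from the 6809 opcode map, not from an actual listing.

```asm
* Sketch only: direct versus extended encodings of the same stores,
* assuming DP has already been loaded with $20 as in INISTKS.
* The < and > mode-forcing prefixes are an assumption about the assembler.
        SETDP   $2000   ; EXORsim style; $20 for some other assemblers
        STD     <FINAL  ; direct:   DD 08    -> DP:$08 = $2008
        STD     >FINAL  ; extended: FD 20 08 -> $2008
        STS     <SSAVE  ; direct:   10 DF 04 -> DP:$04 = $2004
        STY     <DPSAVE ; direct:   10 9F 06 -> DP:$06 = $2006
```

In the direct forms, only the low byte of the address appears in the object code. The CPU supplies the high byte from DP at run time, which is why DP has to be loaded before the first such access.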
You might be wondering whether a full 16-bit DP base register might have been more reasonable. I think so, myself. It would have allowed better granularity for locating whatever you put in the direct page.
I assume that Motorola was planning on the shorter DP using fewer resources in the CPU and fewer cycles in the DP relative accesses. I'm not sure it worked out that way. DP accesses cost about as much as short offset indexed register accesses.
(And you hear me again muttering about the lack of DP mode in the index mode postbyte.)
From there until just before DONE, the rest of the source code is the same, and the only effect is on accesses to variables in the direct page, which now land in page $20 instead of page $00.
Just before the DONE label, I've stored the result in FINAL, and then at DONE
I restore the stack pointer and direct page base that the monitor gave us, and
that's that.
LDD ,U++ ; load the result into A:B
STD FINAL
*
DONE LDS SSAVE ; restore the monitor stack pointer
LDD DPSAVE ; restore the monitor DP
TFR A,DP
SETDP 0 ; For lack of a better way to set it.
NOP
NOP ; landing pad
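The comment on the STY DPSAVE line notes that only the high byte really needs saving. A minimal sketch of a one-byte variant follows, assuming DPSAVE is declared as RMB 1 (the listings here keep it at two bytes). It uses EXG to swap the new page number into DP while catching the monitor's value in A. This variant is untested here, so treat it as an illustration rather than a drop-in replacement:

```asm
* Sketch only: one-byte save of the monitor's DP.
* Assumes DPSAVE is declared as RMB 1 in the new direct page.
        LEAX    ENTRY,PCR ; base address of the new direct page
        TFR     X,D       ; A = page number, B = low byte (unused)
        EXG     A,DP      ; DP = new page, A = the monitor's DP
        SETDP   $2000     ; EXORsim style; $20 for some other assemblers
        STA     DPSAVE    ; direct mode store, valid now that DP is set
* ... rest of the initialization as before ...
* and at DONE:
        LDA     DPSAVE    ; fetch the monitor's DP while ours is still active
        TFR     A,DP      ; hand the direct page back
        SETDP   0
```

This frees Y during initialization and saves a byte of direct page. The cost is that the saved value no longer looks like a base address when you inspect memory.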
Here's the full source for the parameter stack version:
* 16-bit addition and subtraction for 6809 on parameter stack
* using the direct page,
* with test code
* Joel Matthew Rees, October 2024
*
NATWID EQU 2 ; 2 bytes in the CPU's natural integer
*
*
* Blank line will end assembly.
ORG $2000 ; MDOS says this is a good place for usr stuff.
* SETDP $20 ; some other assemblers
SETDP $2000 ; EXORsim
*
ENTRY LBRA START
NOP ; Just want even addressed pointers for no reason.
SSAVE RMB 2 ; a place to keep S so we can return clean
DPSAVE RMB 2 ; a place to keep DP so we can return clean
FINAL RMB 2 ; Final result in DP variable (to show we can)
*
*
SETDP 0 ; Not yet set up
ORG $2100 ; Give the DP room.
RMB 2 ; a little bumper space
SSTKLIM RMB 32 ; 16 levels of call, max
* ; 6809 is pre-dec (pre-store-decrement) push
SSTKBAS RMB 2 ; a little bumper space
PSTKLIM RMB 64 ; 16 levels of call at two parameters per call
PSTKBAS RMB 2 ; bumper space -- parameter stack is pre-dec
*
*
INISTKS TFR DP,A
CLRB
TFR D,Y ; save old DP base for a moment
LEAX ENTRY,PCR ; Set up new DP base
TFR X,D
TFR A,DP ; Now we can access DP variables correctly.
* SETDP $20 ; some other assemblers
SETDP $2000 ; EXORsim
STY DPSAVE ; technically only need to save high byte
LEAU PSTKBAS,PCR ; Set up the parameter stack
PULS X ; get return address
STS SSAVE ; Save what the monitor gave us.
LEAS SSTKBAS,PCR ; Move to our own stack
JMP ,X ; return via X
*
* PPOP and PPUSH are completely unnecessary,
* but if we had to have them, here's one way to do it:
*PPOP16 LDD ,U++
* RTS
*
*PPSH16 STD ,--U
* RTS
*
* Or, of course,
*PPOP16 PULU A,B
* RTS
*
*PPSH16 PSHU A,B
* RTS
*
*
* Don't need LD16I.
* If we needed it, it could look like this, but we don't.
*
* You could use it like this:
* LBSR LD16I ; load D immediate
* FDB $1234 ; "immediate" 16-bit value to load
* BSR SOMEWHERE ; or some other executable code.
*
* LD16I PULS X ; point to the instruction stream
* LDD ,X ; from instruction stream
* JMP 2,X ; return to the byte after the constant.
*
* But use
* LDD #$1234 ; 16 bits!
* instead.
*
* And if we need to index ROMmed tables or such,
* we have something much better for that, too:
*
* TABLE FCB SOMETHING
* ...
* LEAX TABLE,PCR
*
*
* We often will not need these, but we'll go ahead and define them:
*
* input parameters:
* 16-bit left, right
* output parameter:
* 16-bit sum
ADD16 LDD 2,U ; left
ADDD ,U++ ; right
STD ,U ; sum (N, Z, & C flags should be correct)
RTS
* Flags: Specifically,
* N and Z get set correctly by the final store double;
* C should make it through auto-incrementing U and storing D.
* V gets cleared.
*
* input parameters:
* 16-bit left, right
* output parameter:
* 16-bit difference
SUB16 LDD 2,U ; left
SUBD ,U++ ; right
STD ,U ; difference (N, Z, & C flags should be correct)
RTS
* Flags: Specifically,
* N and Z get set correctly by the final store double;
* C should make it through auto-incrementing U and storing D.
* V gets cleared.
*
*
* Let's use what we have:
START LBSR INISTKS
*
LDD #$1234
PSHU A,B
LDD #$CDEF
PSHU A,B
LBSR ADD16 ; result should be $E023
LDD #$8765
PSHU A,B
LBSR SUB16 ; result should be $58BE
LDD ,U++ ; load the result into A:B
STD FINAL
*
DONE LDS SSAVE ; restore the monitor stack pointer
LDD DPSAVE ; restore the monitor DP
TFR A,DP
SETDP 0 ; For lack of a better way to set it.
NOP
NOP ; landing pad
For the combined stack version that I keep disparaging (so that you understand
that I don't think it's the way things should be done), the changes are
basically the same, except that there is one fewer stack to set up:
* 16-bit addition and subtraction for 6809 on return stack
* using the direct page,
* with test code
* Joel Matthew Rees, October 2024
*
NATWID EQU 2 ; 2 bytes in the CPU's natural integer
*
*
* Blank line will end assembly.
ORG $2000 ; MDOS says this is a good place for usr stuff.
* SETDP $20 ; some other assemblers
SETDP $2000 ; EXORsim
*
ENTRY LBRA START
NOP ; Just want even addressed pointers for no reason.
SSAVE RMB 2 ; a place to keep S so we can return clean
DPSAVE RMB 2 ; a place to keep DP so we can return clean
FINAL RMB 2 ; Final result in DP variable (to show we can)
*
*
SETDP 0 ; Not yet set up
ORG $2100 ; Give the DP room.
RMB 2 ; a little bumper space
SSTKLIM RMB 96 ; (64+32) roughly 16 levels of call, max
* ; 6809 is pre-dec (pre-store-decrement) push
SSTKBAS RMB 2 ; a little bumper space
*
*
INISTK TFR DP,A
CLRB
TFR D,Y ; save old DP base for a moment
LEAX ENTRY,PCR ; Set up new DP base
TFR X,D
TFR A,DP ; Now we can access DP variables correctly.
* SETDP $20 ; some other assemblers
SETDP $2000 ; EXORsim
STY DPSAVE ; technically only need to save high byte
PULS X ; get return address
STS SSAVE ; Save what the monitor gave us.
LEAS SSTKBAS,PCR ; Move to our own stack
JMP ,X ; return via X
*
* input parameters:
* 16-bit left, right
* output parameter:
* 16-bit sum
ADD16 PULS X ; get return address out of the way
LDD 2,S ; left
ADDD ,S++ ; right
STD ,S ; sum (N, Z, & C flags should be correct)
JMP ,X ; return
*
* input parameters:
* 16-bit left, right
* output parameter:
* 16-bit difference
SUB16 PULS X ; get return address out of the way
LDD 2,S ; left
SUBD ,S++ ; right
STD ,S ; difference (N, Z, & C flags should be correct)
JMP ,X ; return
*
*
START LBSR INISTK
*
LDD #$1234
PSHS A,B
LDD #$CDEF
PSHS A,B
LBSR ADD16 ; result should be $E023
LDD #$8765
PSHS A,B
LBSR SUB16 ; result should be $58BE
LDD ,S++ ; load the result into A:B
STD FINAL
*
DONE LDS SSAVE,PCR ; restore the monitor stack pointer
LDD DPSAVE ; restore the monitor DP
TFR A,DP
SETDP 0 ; For lack of a better way to set it.
NOP
NOP ; landing pad
[EDIT JMR202510059924:]
See the edits in the above code from the version of this that does not move the direct page, for the mistake I made while dancing around the return address. The code above is fixed now.
[END EDIT JMR202510059924.]
And the changes really are basically the same for the DP version, where we expect to see the most effect. I've included the statically allocated (scratch) parameter variables in the direct page because that's basically where such parameters should go, in the use of the DP that I am promoting here:
* 16-bit addition and subtraction for 6809 via DP scratch pad
* using the direct page,
* with test code
* Joel Matthew Rees, October 2024
*
NATWID EQU 2 ; 2 bytes in the CPU's natural integer
*
*
* Blank line will end assembly.
ORG $2000 ; MDOS says this is a good place for usr stuff.
* SETDP $20 ; some other assemblers
SETDP $2000 ; EXORsim
*
ENTRY LBRA START
NOP ; Just want even addressed pointers for no reason.
SSAVE RMB 2 ; a place to keep S so we can return clean
DPSAVE RMB 2 ; a place to keep DP so we can return clean
FINAL RMB 2 ; Final result in DP variable (to show we can)
* parameter/scratch area for leaf functions only:
NLFT RMB 2 ; binary operator left side parameter
NRT RMB 2 ; binary operator right side parameter
NRES RMB 2 ; unary/binary operator result
NTEMP RMB 2 ; general scratch register for
NPAR EQU NLFT ; unary operator parameter
NSCRAT EQU NLFT ;
*
*
SETDP 0 ; Not yet set up
ORG $2100 ; Give the DP room.
RMB 2 ; a little bumper space
SSTKLIM RMB 32 ; roughly 16 levels of call, max
* ; 6809 is pre-dec (pre-store-decrement) push
SSTKBAS RMB 2 ; a little bumper space
*
*
INISTK TFR DP,A
CLRB
TFR D,Y ; save old DP base for a moment
LEAX ENTRY,PCR ; Set up new DP base
TFR X,D
TFR A,DP ; Now we can access DP variables correctly.
* SETDP $20 ; some other assemblers
SETDP $2000 ; EXORsim
STY DPSAVE ; technically only need to save high byte
PULS X ; get return address
STS SSAVE ; Save what the monitor gave us.
LEAS SSTKBAS,PCR ; Move to our own stack
JMP ,X ; return via X
*
*
* Don't need PPOP and PPSH, but wait 'til we need SCRPSH!
*
*
* input parameters:
* 16-bit left, right
* output parameter:
* 16-bit sum
ADD16 LDD NLFT
ADDD NRT
ADD16S STD NRES ; sum
RTS
*
* input parameters:
* 16-bit left, right
* output parameter:
* 16-bit difference
SUB16 LDD NLFT
SUBD NRT
STD NRES ; difference
RTS
* Stealing code would only save 1 byte.
*
*
START LBSR INISTK
*
LDD #$1234
STD NLFT
LDD #$CDEF
STD NRT
LBSR ADD16 ; result should be $E023
LDD NRES
STD NLFT
LDD #$8765
STD NRT
LBSR SUB16 ; result should be $58BE
LDD NRES
STD FINAL
*
* Repeat, with native instructions:
LDD #$1234
ADDD #$CDEF
SUBD #$8765
*
DONE LDS SSAVE,PCR ; restore the monitor stack pointer
LDD DPSAVE ; restore the monitor DP
TFR A,DP
SETDP 0 ; For lack of a better way to set it.
NOP
NOP ; landing pad
What should go in the direct page? Different people have different ideas.
For my part, the monitor ROM should point DP to where the principal I/O registers are, perhaps, when it is accessing them, and otherwise point it to where the monitor's statically allocated variables are.
Then, every process should point DP to its own statically allocated variables,
both global to the process and local to the individual functions of the
process. This allows a certain degree of actual separation of process variable
spaces.
For the record, if the monitor is able to handle allocation of the direct page and the stacks, the monitor itself should set them up for the processes and the processes should not have to save them. This would provide the greatest separation.
And now we can begin to see what the point of all my ramblings about stacks and such is -- logical separation of access to variables by whether they are statically (globally) allocated or dynamically (locally) allocated.
Can we do something like a local static allocation area for the 6800/6801?
Well, if we have a local base (LB?) pointer somewhat analogous to the PSP parameter stack pointer, most likely declared (and allocated) right there with the PSP, we could get such a thing, but, as with the cost of the software stack, it would come at a small cost. We'd have to load it into X every time we need it, wiping out whatever pointer was in X, and thrashing X even more.
But such a local base pointer would not need the maintenance PSP needs, which means it would not cost as much to use.
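As a rough sketch of that first option, assume a 6801 (for the 16-bit LDD/ADDD/STD) and a page-zero pointer LB that the OS or monitor points at the current process's static area. LB, the L-prefixed offsets, and the layout are all hypothetical names invented for this illustration:

```asm
* Sketch only, 6801: statically allocated parameters reached through
* a local base pointer instead of a movable direct page.
* LB, LNLFT, LNRT and LNRES are hypothetical names.
LNLFT   EQU     0       ; offsets within the process's static area
LNRT    EQU     2
LNRES   EQU     4
*
LB      RMB     2       ; declare this in page zero, next to PSP
*
ADD16   LDX     LB      ; wipes out whatever X held
        LDD     LNLFT,X ; left
        ADDD    LNRT,X  ; right
        STD     LNRES,X ; sum
        RTS
```

Inside the leaf routine the extra cost over the 6809 direct page version is the LDX. The caller also has to load X from LB before storing the parameters and again before fetching the result, which is the thrashing of X mentioned above.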
Another option would be to have an area in the page zero direct page of the 6800/6801, probably adjacent to the PSP, which the multi-tasking OS or monitor would copy to private space when switching processes.
I'll try to talk about both those options when we have a better opportunity.
Yeah. If you have a hardware app with a very small number of concurrent processes, the 6801 isn't really a bad option, no worse than the Z-80, maybe a little better.