Thursday, October 3, 2024

ALPP 02-11 -- On the Beach with Parameters -- 16-bit Arithmetic on the 6809 with Direct Page Moved



Having worked through three different ways to pass parameters at run-time on the 6809, we remembered that the 6809 has the direct page register. Let's use it, repeating the three ways to pass parameters.

Why?

Because I want to focus on that idea of moving the direct page before doing this all on the 68000.

These are very minor changes to the parameter stack and combined stack versions, but the changes are more significant (if still minor) for the statically allocated parameters (in the direct page) version. When you step through, pay attention to the direct page register, and to the object code and the actual address accessed when direct page mode is used to access variables in the direct page -- the SSAVE variable (and the new DPSAVE and FINAL variables) in all three versions, and the parameter variables themselves in the "direct page" version.

I've been abbreviating my references, by the way, in a way that I should not have, referring to the statically allocated parameters as direct page parameters or some such. This makes sense on the 6809, and sort-of makes sense on the 6800/6801, but it doesn't map directly to the 68000, and won't map directly to processors without a direct page. 

And we want to think carefully about how we map the concept to the 6800/6801.

Understanding what we're doing here will help when we move on to the 68000, and, later, if someone picks up other processors.

Let's look first at the separate parameter stack version, starting with the declarations. Where we declared SSAVE in page zero to this point, we're declaring it out in page $20 now.

	ORG	$2000	; MDOS says this is a good place for usr stuff.
*	SETDP	$20	; some other assemblers
	SETDP	$2000	; EXORsim
*
ENTRY	LBRA	START
	NOP		; Just want even addressed pointers for no reason.
SSAVE	RMB	2	; a place to keep S so we can return clean
DPSAVE	RMB	2	; a place to keep DP so we can return clean
FINAL	RMB	2	; Final result in DP variable (to show we can)

The SETDP declarations here should not be necessary for the assembler. I put them here more as comments, to indicate to the human reader that we intend to set the DP to point here. 

And, as I've noted, different assemblers have different semantics for the SETDP declarative. The ones I generally use just take the page number, but EXORsim's assembler wants the whole base address. 
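
If you would rather not depend on how a given assembler reads SETDP, many 6809 assemblers also let you force the addressing mode on an individual operand, with a < prefix for direct and a > prefix for extended. Whether the assembler you are using accepts these prefixes is an assumption you will need to check. A minimal sketch, using the SSAVE declared above:

	LDS	<SSAVE	; forced direct mode: one-byte address, relative to DP
	LDS	>SSAVE	; forced extended mode: full two-byte address, DP ignored

The two only reach the same memory when DP really does hold $20, which is the promise SETDP makes to the assembler on our behalf.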

I've added a DPSAVE to save the DP we get from the monitor.

I've also added a FINAL variable to store the final result in, just as a kind of interpretive demonstration.

And that's it. After that, I move up to page $21 to declare the stacks, to show that the stacks don't have to be in the direct page. They can be if there's room, but I don't want anyone thinking they have to be.

	SETDP	0	; Not yet set up
	ORG	$2100	; Give the DP room.
	RMB	2	; a little bumper space
SSTKLIM	RMB	32	; 16 levels of call, max
* 			; 6809 is pre-dec (pre-store-decrement) push
SSTKBAS	RMB	2	; a little bumper space
PSTKLIM	RMB	64	; 16 levels of call at two parameters per call
PSTKBAS	RMB	2	; bumper space -- parameter stack is pre-dec

Following the stack declarations is the stack initialization routine, where we fairly carefully get the monitor's DP and put it in Y, then calculate the new base address into X by PC-relative addressing and move it from X to D, where we can access the page number in A and TransFeR it to DP.

And then we SETDP for the duration of the source, until we restore DP at the end.

Once DP is set and declared, I use the direct page variables to save the DP and S that we get from the monitor ROM. When you check the code, you'll see that the addresses are given in short form, as offsets from the base address that DP points to.

INISTKS	TFR	DP,A
	CLRB
	TFR	D,Y		; save old DP base for a moment
	LEAX	ENTRY,PCR	; Set up new DP base
	TFR	X,D
	TFR	A,DP		; Now we can access DP variables correctly.
*	SETDP	$20	; some other assemblers
	SETDP	$2000	; EXORsim
	STY	DPSAVE		; technically only need to save high byte
	LEAU	PSTKBAS,PCR	; Set up the parameter stack
	PULS	X		; get return address
	STS	SSAVE		; Save what the monitor gave us.
	LEAS	SSTKBAS,PCR	; Move to our own stack
	JMP	,X	; return via X
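
To make the short form concrete, this is what I would expect the two direct page stores above to assemble to, next to the extended forms they replace. The addresses assume the declarations shown earlier, which put SSAVE at $2004 and DPSAVE at $2006 (three bytes of LBRA and one of NOP in front of them). The byte values are hand-assembled from the 6809 opcode tables, not copied from an EXORsim listing, so check them against your own listing.

* source		object code	address accessed
*	STY	DPSAVE	10 9F 06	DP:$06 -> $2006 (direct)
*	STS	SSAVE	10 DF 04	DP:$04 -> $2004 (direct)
* and if the assembler believed DP were still 0:
*	STY	DPSAVE	10 BF 20 06	$2006 (extended)
*	STS	SSAVE	10 FF 20 04	$2004 (extended)

Each direct page access is one byte shorter, and the $20 that disappears from the object code is the part the DP register now supplies at run time.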

You might be wondering whether a full 16-bit DP base register might have been more reasonable. I think so, myself. It would have allowed better granularity for locating whatever you put in the direct page. 

I assume that Motorola was planning on the shorter DP using less resources in the CPU and fewer cycles in the DP relative accesses. I'm not sure it worked out that way. DP accesses cost as much as short offset indexed register accesses.

(And you hear me again muttering about the lack of DP mode in the index mode postbyte.)
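
For reference, these are the sizes and cycle counts for a 16-bit load in the modes in question, as I read them from the MC6809 instruction tables. Treat the numbers as something to verify against your own copy of the data sheet. FINAL is the direct page variable declared above.

	LDD	FINAL	; direct:                2 bytes, 5 cycles
	LDD	,U	; indexed, no offset:    2 bytes, 5 cycles
	LDD	2,U	; indexed, 5-bit offset: 2 bytes, 6 cycles
* (the extended form of LDD FINAL would be 3 bytes, 6 cycles)

So direct mode matches a zero-offset indexed access and beats a constant-offset one by a single cycle, with no saving in bytes over either.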

From there until just before DONE, the rest of the source code is the same. The effect shows up in the accesses to variables in the direct page, which now reach them in page $20 instead of page $00.

Just before the DONE label, I've stored the result in FINAL, and then at DONE I restore the stack pointer and direct page base that the monitor gave us, and that's that.

	LDD	,U++	; load the result into A:B
	STD	FINAL
*
DONE	LDS	SSAVE	; restore the monitor stack pointer
	LDD	DPSAVE	; restore the monitor DP
	TFR	A,DP
	SETDP	0	; For lack of a better way to set it.
	NOP
	NOP		; landing pad
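
As the comment on STY DPSAVE says, only the high byte really needs saving. One way to save just that byte, sketched here as an untested variation on INISTKS, is to park the old DP on the monitor's stack until the new DP is in place. This assumes the monitor's stack has a byte to spare. DPSAVE could then shrink to RMB 1, and Y is left alone.

INISTKS	PSHS	DP		; old DP waits on the monitor's stack
	LEAX	ENTRY,PCR	; Set up new DP base
	TFR	X,D
	TFR	A,DP		; Now we can access DP variables correctly.
	SETDP	$2000	; EXORsim
	PULS	A		; old DP back off the stack
	STA	DPSAVE		; one byte is all we keep
	LEAU	PSTKBAS,PCR	; Set up the parameter stack
	PULS	X		; get return address
	STS	SSAVE		; Save what the monitor gave us.
	LEAS	SSTKBAS,PCR	; Move to our own stack
	JMP	,X	; return via X

The matching restore at DONE would then load a single byte:

DONE	LDS	SSAVE	; restore the monitor stack pointer
	LDA	DPSAVE	; restore the monitor DP
	TFR	A,DP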

Here's the full source for the parameter stack version:

* 16-bit addition and subtraction for 6809 on parameter stack
* using the direct page,
* with test code
* Joel Matthew Rees, October 2024
*
NATWID	EQU	2	; 2 bytes in the CPU's natural integer
*
*
* Blank line will end assembly.
	ORG	$2000	; MDOS says this is a good place for usr stuff.
*	SETDP	$20	; some other assemblers
	SETDP	$2000	; EXORsim
*
ENTRY	LBRA	START
	NOP		; Just want even addressed pointers for no reason.
SSAVE	RMB	2	; a place to keep S so we can return clean
DPSAVE	RMB	2	; a place to keep DP so we can return clean
FINAL	RMB	2	; Final result in DP variable (to show we can)
*
*
	SETDP	0	; Not yet set up
	ORG	$2100	; Give the DP room.
	RMB	2	; a little bumper space
SSTKLIM	RMB	32	; 16 levels of call, max
* 			; 6809 is pre-dec (pre-store-decrement) push
SSTKBAS	RMB	2	; a little bumper space
PSTKLIM	RMB	64	; 16 levels of call at two parameters per call
PSTKBAS	RMB	2	; bumper space -- parameter stack is pre-dec
*
*
INISTKS	TFR	DP,A
	CLRB
	TFR	D,Y		; save old DP base for a moment
	LEAX	ENTRY,PCR	; Set up new DP base
	TFR	X,D
	TFR	A,DP		; Now we can access DP variables correctly.
*	SETDP	$20	; some other assemblers
	SETDP	$2000	; EXORsim
	STY	DPSAVE		; technically only need to save high byte
	LEAU	PSTKBAS,PCR	; Set up the parameter stack
	PULS	X		; get return address
	STS	SSAVE		; Save what the monitor gave us.
	LEAS	SSTKBAS,PCR	; Move to our own stack
	JMP	,X	; return via X
*
* PPOP and PPUSH are completely unnecessary, 
* but if we had to have them, here's one way to do it:
*PPOP16	LDD	,U++
*	RTS
*
*PPSH16	STD	,--U
*	RTS
*
* Or, of course,
*PPOP16	PULU	A,B
*	RTS
*
*PPSH16	PSHU	A,B
*	RTS
*
*
* Don't need LD16I.
* If we needed it, it could look like this, but we don't.
*
* You could use it like this:
*	LBSR	LD16I	; load D immediate
*	FDB	$1234	; "immediate" 16-bit value to load
*	BSR	SOMEWHERE ; or some other executable code.
*
* LD16I	PULS	X	; point to the instruction stream
*	LDD	,X	; from instruction stream
*	JMP	2,X	; return to the byte after the constant.
*
* But use
*	LDD	#$1234	; 16 bits!
* instead.
*
* And if we need to index ROMmed tables or such, 
* we have something much better for that, too:
*
* TABLE	FCB	SOMETHING
*	...
* 	LEAX	TABLE,PCR
*
*
* We often will not need these, but we'll go ahead and define them:
*
* input parameters:
*   16-bit left, right
* output parameter:
*   16-bit sum
ADD16	LDD	2,U	; left 
	ADDD	,U++	; right
	STD	,U	; sum (N, Z, & C flags should be correct)
	RTS
* Flags: Specifically,
*        N and Z get set correctly by the final store double;
*        C should make it through manipulating U and storing D.
*        V gets cleared.
*
* input parameters:
*   16-bit left, right
* output parameter:
*   16-bit difference
SUB16	LDD	2,U	; left
	SUBD	,U++	; right
	STD	,U	; difference (N, Z, & C flags should be correct)
	RTS
* Flags: Specifically,
*        N and Z get set correctly by the final store double;
*        C should make it through manipulating U and storing D.
*        V gets cleared.
*
*
* Let's use what we have:
START	LBSR	INISTKS
*
	LDD	#$1234
	PSHU	A,B
	LDD	#$CDEF
	PSHU	A,B
	LBSR	ADD16	; result should be $E023
	LDD	#$8765
	PSHU	A,B
	LBSR	SUB16	; result should be $58BE
	LDD	,U++	; load the result into A:B
	STD	FINAL
*
DONE	LDS	SSAVE	; restore the monitor stack pointer
	LDD	DPSAVE	; restore the monitor DP
	TFR	A,DP
	SETDP	0	; For lack of a better way to set it.
	NOP
	NOP		; landing pad
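
The expected results in the comments can be checked by hand, a byte at a time, the way an 8-bit accumulator would have to do it:

* $1234 + $CDEF
*   low bytes:  $34 + $EF     = $123 -> $23, carry 1
*   high bytes: $12 + $CD + 1 = $E0  -> $E0, carry 0
*   sum:        $E023, C clear
* $E023 - $8765
*   low bytes:  $23 - $65     -> $BE, borrow 1
*   high bytes: $E0 - $87 - 1 = $58, borrow 0
*   difference: $58BE, C clear

ADDD and SUBD each do the whole thing, carry or borrow included, in one instruction.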

The changes are basically the same for the combined stack version, except that there is one less stack to set up. (I keep disparaging this version so that you understand that I don't think it's the way things should be done.)

* 16-bit addition and subtraction for 6809 on return stack
* using the direct page,
* with test code
* Joel Matthew Rees, October 2024
*
NATWID	EQU	2	; 2 bytes in the CPU's natural integer
*
*
* Blank line will end assembly.
	ORG	$2000	; MDOS says this is a good place for usr stuff.
*	SETDP	$20	; some other assemblers
	SETDP	$2000	; EXORsim
*
ENTRY	LBRA	START
	NOP		; Just want even addressed pointers for no reason.
SSAVE	RMB	2	; a place to keep S so we can return clean
DPSAVE	RMB	2	; a place to keep DP so we can return clean
FINAL	RMB	2	; Final result in DP variable (to show we can)
*
*
	SETDP	0	; Not yet set up
	ORG	$2100	; Give the DP room.
	RMB	2	; a little bumper space
SSTKLIM	RMB	96	; (64+32) roughly 16 levels of call, max
* 			; 6809 is pre-dec (pre-store-decrement) push
SSTKBAS	RMB	2	; a little bumper space
*
*
INISTK	TFR	DP,A
	CLRB
	TFR	D,Y		; save old DP base for a moment
	LEAX	ENTRY,PCR	; Set up new DP base
	TFR	X,D
	TFR	A,DP		; Now we can access DP variables correctly.
*	SETDP	$20	; some other assemblers
	SETDP	$2000	; EXORsim
	STY	DPSAVE		; technically only need to save high byte
	PULS	X		; get return address
	STS	SSAVE		; Save what the monitor gave us.
	LEAS	SSTKBAS,PCR	; Move to our own stack
	JMP	,X	; return via X
*
* input parameters:
*   16-bit left, right
* output parameter:
*   16-bit sum
ADD16	PULS	X	; get return address out of the way
	LDD	2,S	; left 
	ADDD	,S++	; right
	STD	,S	; sum (N, Z, & C flags should be correct)
	JMP	,X	; return
*
* input parameters:
*   16-bit left, right
* output parameter:
*   16-bit difference
SUB16	PULS	X	; get return address out of the way
	LDD	2,S	; left 
	SUBD	,S++	; right
	STD	,S	; difference (N, Z, & C flags should be correct)
	JMP	,X	; return
*
*
START	LBSR	INISTK
*
	LDD	#$1234
	PSHS	A,B
	LDD	#$CDEF
	PSHS	A,B
	LBSR	ADD16	; result should be $E023
	LDD	#$8765
	PSHS	A,B
	LBSR	SUB16	; result should be $58BE
	LDD	,S++	; load the result into A:B
	STD	FINAL
*
DONE	LDS	SSAVE,PCR	; restore the monitor stack pointer
	LDD	DPSAVE	; restore the monitor DP
	TFR	A,DP
	SETDP	0	; For lack of a better way to set it.
	NOP
	NOP		; landing pad

[EDIT JMR202510059924:]

See the edits in the above code from the version of this that does not move the direct page, for the mistake I made while dancing around the return address. The code above is fixed now.

[END EDIT JMR202510059924.]

And the changes really are basically the same for the DP version, where we expect to see the most effect. I've included the statically allocated (scratch) parameter variables in the direct page because that's basically where such parameters should go, in the use of the DP that I am promoting here:

* 16-bit addition and subtraction for 6809 via DP scratch pad
* using the direct page,
* with test code
* Joel Matthew Rees, October 2024
*
NATWID	EQU	2	; 2 bytes in the CPU's natural integer
*
*
* Blank line will end assembly.
	ORG	$2000	; MDOS says this is a good place for usr stuff.
*	SETDP	$20	; some other assemblers
	SETDP	$2000	; EXORsim
*
ENTRY	LBRA	START
	NOP		; Just want even addressed pointers for no reason.
SSAVE	RMB	2	; a place to keep S so we can return clean
DPSAVE	RMB	2	; a place to keep DP so we can return clean
FINAL	RMB	2	; Final result in DP variable (to show we can)
* parameter/scratch area for leaf functions only:
NLFT	RMB	2	; binary operator left side parameter
NRT	RMB	2	; binary operator right side parameter
NRES	RMB	2	; unary/binary operator result
NTEMP	RMB	2	; general scratch register for 
NPAR	EQU	NLFT	; unary operator parameter
NSCRAT	EQU	NLFT	; 
*
*
	SETDP	0	; Not yet set up
	ORG	$2100	; Give the DP room.
	RMB	2	; a little bumper space
SSTKLIM	RMB	32	; roughly 16 levels of call, max
*			; 6809 is pre-dec (pre-store-decrement) push
SSTKBAS	RMB	2	; a little bumper space
*
*
INISTK	TFR	DP,A
	CLRB
	TFR	D,Y		; save old DP base for a moment
	LEAX	ENTRY,PCR	; Set up new DP base
	TFR	X,D
	TFR	A,DP		; Now we can access DP variables correctly.
*	SETDP	$20	; some other assemblers
	SETDP	$2000	; EXORsim
	STY	DPSAVE		; technically only need to save high byte
	PULS	X		; get return address
	STS	SSAVE		; Save what the monitor gave us.
	LEAS	SSTKBAS,PCR	; Move to our own stack
	JMP	,X	; return via X
*
*
* Don't need PPOP and PPSH, but wait 'til we need SCRPSH!
*
*
* input parameters:
*   16-bit left, right
* output parameter:
*   16-bit sum
ADD16	LDD	NLFT
	ADDD	NRT
ADD16S	STD	NRES	; sum
	RTS
*
* input parameters:
*   16-bit left, right
* output parameter:
*   16-bit difference
SUB16	LDD	NLFT
	SUBD	NRT
	STD	NRES	; difference
	RTS
* Stealing code would only save 1 byte.
*
*
START	LBSR	INISTK
*
	LDD	#$1234
	STD	NLFT
	LDD	#$CDEF
	STD	NRT
	LBSR	ADD16	; result should be $E023
	LDD	NRES
	STD	NLFT
	LDD	#$8765
	STD	NRT
	LBSR	SUB16	; result should be $58BE
	LDD	NRES
	STD	FINAL
*
* Repeat, with native instructions:
	LDD	#$1234
	ADDD	#$CDEF
	SUBD	#$8765
*
DONE	LDS	SSAVE,PCR	; restore the monitor stack pointer
	LDD	DPSAVE	; restore the monitor DP
	TFR	A,DP
	SETDP	0	; For lack of a better way to set it.
	NOP
	NOP		; landing pad
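
The comment about stealing code refers to the ADD16S label: SUB16 could branch into the tail of ADD16 instead of repeating it. A sketch of what that would look like, which I have not assembled:

SUB16	LDD	NLFT
	SUBD	NRT
	BRA	ADD16S	; 2 bytes, replacing STD NRES and RTS (2 + 1 bytes)

That saves the one byte and costs the three cycles of the BRA on every call, which is not much of a trade.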

What should go in the direct page? Different people have different ideas.

For my part, the monitor ROM should perhaps point DP to where the principal I/O registers are while it is accessing them, and otherwise point it to where the monitor's statically allocated variables are.

Then, every process should point DP to its own statically allocated variables, both global to the process and local to the individual functions of the process. This allows a certain degree of actual separation of process variable spaces.

For the record, if the monitor is able to handle allocation of the direct page and the stacks, the monitor itself should set them up for the processes and the processes should not have to save them. This would provide the greatest separation. 

And now we can begin to see what the point of all my ramblings about stacks and such is -- logical separation of access to variables by whether they are statically (globally) allocated or dynamically (locally) allocated.

Can we do something like a local static allocation area for the 6800/6801?

Well, if we have a local base (LB?) pointer somewhat analogous to the PSP parameter stack pointer, most likely declared (and allocated) right there with the PSP, we could get such a thing, but, as with the cost of the software stack, it would come at a small cost. We'd have to load it into X every time we need it, wiping out whatever pointer was in X, and thrashing X even more. 

But such a local base pointer would not need the maintenance PSP needs, which means it would not cost as much to use.
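
A rough sketch of the local base idea on the 6801, just to show the shape of the cost. LB and the L-prefixed offsets are hypothetical names, and I am assuming LB lives in page zero and has already been pointed at the process's static area:

LB	RMB	2	; hypothetical local base pointer, in page zero near PSP
LNLFT	EQU	0	; hypothetical offsets from the local base
LNRT	EQU	2
LNRES	EQU	4
*
ADD16	LDX	LB	; whatever pointer was in X is gone now
	LDD	LNLFT,X
	ADDD	LNRT,X
	STD	LNRES,X	; sum
	RTS

LB is only read here, never updated, which is the maintenance saving over PSP. On the 6800 proper, the 16-bit load, add, and store would have to be done a byte at a time through A and B, but the LDX LB at the top would be the same.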

Another option would be to have an area in the page zero direct page of the 6800/6801, probably adjacent to the PSP, which the multi-tasking OS or monitor would copy to private space when switching processes.

I'll try to talk about both those options when we have a better opportunity.

So. Why not just use the 6801?

Yeah. If you have a hardware app with a very small number of concurrent processes, the 6801 isn't really a bad option, no worse than the Z-80, maybe a little better.

Let's take a look at all this on the 68000.
