Sunday, September 1, 2024

ALPP 02-03 -- Foothold! (Split Stacks ... barely on the Beach) -- 6809

(Title Page/Index)

 

The differences we saw between the 6800 and the 6801 were relatively minor, but not unimportant. The very small advances in the 6801 make it significantly easier to keep the CPU responsive and stable while handling interrupts and other kinds of concurrency.

And now we finally get a look at some of the important differences between the 6800/6801 and the 6809. It may be a little mind-boggling. You will really want to open up separate browser windows and compare the code for each, preferably on separate monitors.

Even if you don't quite follow what's going on, compare the code. The more times you see this from different points of view, the easier it gets.

Again, as I have said, one major purpose of the split stack is to separate return addresses from parameters and local variables, which produces code that is somewhat more robust and less difficult to debug.

And, to repeat myself (again?): you probably won't believe me yet, but the split stack also simplifies and streamlines the call interface for the subroutines and procedures you write.

Let's start taking a look at why. 

The 6809 includes the same ability to concatenate A and B into the 16-bit accumulator D that the 6801 has, with A as the high byte and B as the low byte. (Historically, the 6809's design actually came first.)

The 6809 also gives us two more index registers, or four, depending on how you count, adding Y and U to X and S. Where the 6800/6801 let only X take an offset and serve as a general index register, on the 6809 all four, plus PC, can take an offset. Additionally, U specifically supports pushes and pops, and we shall be using it instead of PSP for the parameter stack.
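
To make the point about offsets concrete, here is a small sketch of constant offsets on each of those registers. The offsets and what they might be pointing at are made up for illustration, and MYVAR is a hypothetical label, not something from the code in this chapter:

	LDA	1,X	; byte at X+1
	LDB	-1,Y	; byte at Y-1 (6809 offsets are signed)
	LDD	2,U	; 16-bit value at U+2, the second cell on the parameter stack
	LDX	2,S	; 16-bit value in the two bytes at S+2 and S+3
	LDA	MYVAR,PCR	; byte at MYVAR, addressed relative to the program counter

None of these change the index register itself. They only use it, plus the offset, to find the operand.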

(We'll talk about the 6809's partially indexable DP and how to use it well later.)

Enough sales talk, let's take a look at how OUTC can change on the 6809.

OUTC will use the 6809's pop instruction (PULU) for U:

* Walks on contents of B.
OUTC	PULU	D	; Get the character, 
	TFR	B,A	; Put it where XOUTCH wants it.
	JSR	XOUTCH	; output via monitor ROM
	RTS

Or we could do two PULU A instructions and preserve B.

* Does not Walk on B.
OUTC	PULU	A	; Get the high byte out of the way, 
	PULU	A	; Get the character where XOUTCH wants it.
	JSR	XOUTCH	; output via monitor ROM
	RTS

But the 6809 also provides a way to access the U stack directly:

* Preserves B.
OUTC	LDA	1,U	; Get the character in A where XOUTCH wants it.
	LEAU	2,U	; Got it in A, now drop it from the stack.
	JSR	XOUTCH	; output via monitor ROM
	RTS

What if we even want to preserve A? (We probably don't, but ...):

OUTC	PSHS	A	; Save A
	LDA	1,U	; Get the character in A where XOUTCH wants it.
	LEAU	2,U	; Got it in A, now drop it from the stack.
	JSR	XOUTCH	; output via monitor ROM
	PULS	A,PC	; Pop and return in one instruction.

Load Effective Address shows up again, this time replacing a separate INCrement instruction for U, or, rather, replacing what would have been two INU instructions if the 6809 had them. 

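For source registers other than PC, LEA adds the signed offset to the contents of the specified index register and puts the result in the destination register. Here the source and destination are both U, so the effect is to add 2 to U.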

For PC relative source addresses (which we will see again below), the assembler figures out the offset from the label, so you use the label in the source code, but the offset is assembled to the object code.
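
A few LEA forms side by side may help. This is only a sketch, and MSG is a hypothetical label standing in for any data you have defined:

	LEAU	2,U	; U = U + 2: drop one 16-bit cell from the parameter stack
	LEAU	-2,U	; U = U - 2: open up one cell without storing anything in it
	LEAY	4,X	; Y = X + 4: source and destination can be different registers
	LEAX	MSG,PCR	; X = address of MSG, wherever the code happens to be loaded

One detail to keep in mind: LEAX and LEAY set the Z flag according to the result, while LEAU and LEAS leave the condition codes alone.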

(If you have a macro assembler, you can define INU as LEAU 1,U, but that's rather wasteful if you use it more than once in a row.)

And there are several other interesting ways we could do this.

Again, if you haven't already, open up separate browser windows for the introduction to split stacks on the 6800 and the 6801 so you can compare.

Note that X is not used, so there is no need to push and pop it. It's preserved.

Note also that PPUSHD and PPOPD simply go away (along with PPUSHX and PPOPX), replaced by the 6809's more flexible PSHU and PULU instructions.
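
As a sketch of what that flexibility looks like, here are two parameters going onto the U stack and coming back off, first one register at a time and then with a register list. The values are arbitrary:

	LDD	#$1234	; A=$12, B=$34
	LDX	#$5678
	PSHU	D	; U drops by 2; $12 is at 0,U and $34 is at 1,U
	PSHU	X	; U drops by 2 more; $5678 is now the top cell
	PULU	X	; pop the top cell back into X
	PULU	D	; pop the next cell back into D
*
	PSHU	X,D	; one instruction: X is stored first, D ends up on top
	PULU	X,D	; pulls D from the top, then X

The CPU fixes the order in which a register list is pushed and pulled, so the order you write the registers in does not matter. Note that this leaves D on top, the opposite of the two separate pushes above.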

The 6809 gives us some really wonderful options for setting up the stack pointers. Not only can we pop the return address directly to X like we can on the 6801, but we can save the monitor's stack pointer on our own stack, eliminating yet another global variable:

INISTKS	LEAU	PSTKBAS,PCR	; Set up the parameter stack
	PULS	X	; 6809 lets us do this -- return address in X
	TFR	S,Y	; Save what the monitor gave us.
	LEAS	SSTKBAS,PCR	; Move to our own stack
	PSHS	Y	; Save monitor's stack pointer on ours
	JMP	,X	; return via X

When we're done, we can restore the monitor's stack pointer with a simple load:

DONE	LDS	,S	; restore the monitor stack pointer
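
That load only works if S has come back to exactly where INISTKS left it. A hedged alternative, assuming the layout used in this chapter (INISTKS leaves the monitor's pointer in the two bytes just below SSTKBAS) and an assembler that accepts an expression in front of ,PCR, is to fetch the saved pointer by address instead:

DONE	LDS	SSTKBAS-2,PCR	; reload the monitor's S from where INISTKS saved it

This takes a couple more bytes, but it still gets back to the monitor's stack if something has left the return stack out of balance. It does not fix the imbalance, of course.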

Let's look at the complete code to test character output:

* Essential control codes
LF	EQU	$0A	; line feed
CR	EQU	$0D	; carriage return
* Essential monitor ROM routines
XOUTCH	EQU	$F018
*
NATWID	EQU	2	; 2 bytes in the CPU's natural integer
*
*
* Blank line will end assembly.
	ORG	$80	; MDOS and EXbug docs say it should be okay here.
ENTRY	JMP	START
	NOP		; Just want even addressed pointers for no reason.
* No need for PSP in 6809 --
* U will be our parameter stack pointer,
* And we will save S on our own stack.
*
	ORG	$2000	; MDOS says this is a good place for user stuff
NOENTRY	JMP	START
	RMB	2	; a little bumper space
SSTKLIM	RMB	32	; 16 levels of call, max
* 6809 is pre-dec (decrement before store) push
SSTKBAS	RMB	2	; a little bumper space
PSTKLIM	RMB	64	; 16 levels of call at two parameters per call
PSTKBAS	RMB	2	; bumper space -- parameter stack is pre-dec
* (But this example only uses two levels.)
*
*
INISTKS	LEAU	PSTKBAS,PCR	; Set up the parameter stack
	PULS	X	; 6809 lets us do this -- return address in X
	TFR	S,Y	; Save what the monitor gave us.
	LEAS	SSTKBAS,PCR	; Move to our own stack
	PSHS	Y	; Save monitor's stack pointer on ours
	JMP	,X	; return via X
*
* No need for PPOPD or PPUSHD in 6809 code.
*
* We can handle the parameter stack directly.
* Preserves B.
OUTC	LDA	1,U	; Get the character in A where XOUTCH wants it.
	LEAU	2,U	; Got it in A, now drop it from the stack.
	JSR	XOUTCH	; output via monitor ROM
	RTS
*
*
START	LBSR	INISTKS
*
	LDD	#'H
	PSHU	D
	LBSR	OUTC	; 6809 lets us do long relative branches
*
DONE	LDS	,S	; restore the monitor stack pointer
	NOP
	NOP		; landing pad

And that's it.

Change back to the directory for JHAllen's unmodified EXORsim and run an instance of EXORsim 6809, assemble the source, set a breakpoint at the landing pad NOPs, and run it. (Refer back to Hello World for 6809 in the 8-bit Hello World chapter if you don't remember how to do this.) 

Do set the breakpoint. 

EXORsim is nice enough to break on undefined opcodes, but, with the 6809, zero is a defined opcode -- the direct-page form of the NEGate instruction. So it won't stop. It will wander off, eventually start executing the monitor or something, and end up jumping into MDOS or something, which will be confusing.

Be aware, the actual 6800 is not nice enough to break on undefined opcodes. So if you were getting lazy about breakpoints on the 6800 and 6801, remember you can't do that on real hardware.

This is so short that I almost want to do the 68000 code here in the same chapter and be done with it. But I want you to be able to compare the 68000 code to the 6809 code side-by-side, so we'll keep that separate.

So, it's time to look at the string routines?

Really straightforward.

* First 6809 Foothold on the split stack beach,
* by Joel Matthew Rees September 2024, Copyright 2024 -- All rights reserved.
*
* Essential control codes
LF	EQU	$0A	; line feed
CR	EQU	$0D	; carriage return
NUL	EQU	0
*
* Essential monitor ROM routines
XOUTCH	EQU	$F018
*
NATWID	EQU	2	; 2 bytes in the CPU's natural integer
*
*
* Blank line will end assembly.
	ORG	$80	; MDOS and EXbug docs say it should be okay here.
ENTRY	JMP	START
	NOP		; Just want even addressed pointers for no reason.
* We will save S on our own stack.
*
*
	ORG	$2000	; MDOS says this is a good place for user stuff
NOENTRY	JMP	START
	RMB	2	; a little bumper space
SSTKLIM	RMB	32	; 16 levels of call, max
* 6809 is pre-dec (decrement before store) push
SSTKBAS	RMB	2	; a little bumper space
PSTKLIM	RMB	64	; 16 levels of call at two parameters per call
PSTKBAS	RMB	2	; bumper space -- parameter stack is pre-dec
*
*
HELLO	FCB	CR,LF	; Put message at beginning of line
	FCC	"Ashi-ba ga dekita!"	; Whatever the user wants here.
	FCB	CR,LF,NUL	; Put the debugger's output on a new line.
*
*
INISTKS	LEAU	PSTKBAS,PCR	; Set up the parameter stack
	PULS	X	; 6809 lets us do this -- return address in X
	TFR	S,Y	; Save what the monitor gave us.
	LEAS	SSTKBAS,PCR	; Move to our own stack
	PSHS	Y	; Save monitor's stack pointer on ours
	JMP	,X	; return via X
*
*
* No need for PPOPD, PPUSHD, PPOPX, or PPUSHX in 6809 code.
*
* We can handle the parameter stack directly.
* Preserves B.
OUTC	LDA	1,U	; Get the character in A where XOUTCH wants it.
	LEAU	2,U	; Got it in A, now drop it from the stack.
	BSR	OUTCV	; output via monitor ROM
	RTS
*
OUTCV	JMP	XOUTCH	; common hook
*
* And we can handle the parameter stack directly here, too.
OUTS	PULU	X	; get the string pointer
OUTSL	LDA	,X+	; get the byte, update the pointer
	BEQ	OUTDN	; if NUL, leave
	BSR	OUTCV	; use the same call OUTC uses.
	BRA	OUTSL	; next character
OUTDN	RTS
*
* All the newlines we need are in the string.
*
*
START	LBSR	INISTKS
*
	LEAX	HELLO,PCR
	PSHU	X
	LBSR	OUTS	; Our string has its own CR/LF this time around.
*
DONE	LDS	,S	; restore the monitor stack pointer
	NOP
	NOP		; landing pad

Note the use of the auto-increment index mode in the string output routine. 
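
The same auto-increment mode works with Y, so a string copy comes out almost as short. This is a sketch only: STRCPY, STRCPL, and BUFFER are made-up names, and the convention of passing the source pointer in the top cell with the destination pointer under it is an assumption for this example:

* Walks on A, X, and Y.
STRCPY	PULU	X,Y	; source pointer (top cell) in X, destination (next cell) in Y
STRCPL	LDA	,X+	; get a byte, bump the source pointer
	STA	,Y+	; store it, bump the destination pointer
	BNE	STRCPL	; STA sets Z from the byte stored, so loop until the NUL is copied
	RTS

A call might look like this, with BUFFER being a hypothetical RMB area big enough for the string:

	LEAX	HELLO,PCR
	LEAY	BUFFER,PCR
	PSHU	X,Y	; Y is stored first, X ends up in the top cell
	LBSR	STRCPY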

Before we leave, I think I should point out something I briefly mentioned in the 6800 and 6801 code examples. It is quite easy to have OUTS call OUTC, like this:

OUTS	PULU	X	; get the string pointer
OUTSL	LDA	,X+	; get the byte, update the pointer
	BEQ	OUTDN	; if NUL, leave
	PSHU	A	; push the character as the low byte
	CLR	,-U	; push the high byte of zero for OUTC
	BSR	OUTC	; use OUTC itself.
	BRA	OUTSL	; next character
OUTDN	RTS

We can't just CLeaR B and PuSH D on U, because that will be in the wrong byte order. But the predecrement mode CLR works nicely for putting the high byte of zero in the right place.
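
If you would rather push the whole cell with a single PSHU D, the character has to be moved into B first. Here is a sketch of that variant, replacing the PSHU A and CLR ,-U lines in the loop, at the cost of walking on B:

	TFR	A,B	; character into the low byte of D
	CLRA		; high byte of zero
	PSHU	D	; push the 16-bit cell; the character lands at 1,U
	BSR	OUTC	; use OUTC itself.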

We do need to be careful to check that the stack remains balanced after the parameter pushing and popping, particularly since it can be a little distracting that it's happening in a loop.

You can see that takes a bit of extra code and ends up pushing and popping a little redundantly. 

On the 6800 and 6801, that's going to be a full call and return penalty to let PPUSHD and PPOPD handle it. On the 6809, it's not nearly as much of a penalty, since the 6809 provides the extra stack, PSHU, PULU, and the predecrement addressing modes. But it's still redundant code that takes room in the object and takes cycles to execute.

On the other hand, having OUTS call OUTC allows all character output issues to be taken care of in OUTC.

As a compromise, I'm having both OUTS and OUTC call OUTCV. 

(You might think a branch would work instead of a call, stealing the monitor's RTS, and, indeed, it should work. It often does, and it has been done in existing, functioning code. But there's a certain lack of clarity in doing so, and I have other things I want you to be thinking about.)
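
For reference only, the branch-instead-of-call version would look something like this sketch. It assumes XOUTCH ends with an ordinary RTS and does not look at anything on the return stack other than its own return address:

OUTC	LDA	1,U	; Get the character in A where XOUTCH wants it.
	LEAU	2,U	; Got it in A, now drop it from the stack.
	JMP	XOUTCH	; no RTS here: XOUTCH's own RTS returns to OUTC's caller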

Again, lightly tested, should work for you. Play with the code a little while we're here. And think about how to make sure the stacks have stayed in balance.
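
One way to check, as a sketch: just before DONE, compare both stack pointers against where INISTKS should have left them. UNBAL is a hypothetical label for a second breakpoint, and the immediate operands assume the code is running at the address it was assembled for:

	CMPU	#PSTKBAS	; parameter stack empty again?
	BNE	UNBAL
	CMPS	#SSTKBAS-2	; return stack holding only the saved monitor pointer?
	BNE	UNBAL
DONE	LDS	,S	; restore the monitor stack pointer
	NOP
	NOP		; landing pad
UNBAL	NOP		; hypothetical second landing pad, for when a stack is out of balance

Set breakpoints at both landing pads. Which one you stop at tells you whether the stacks came out even.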

(A note, while we're looking at it: From early code examples from Motorola and even the CPU's order of saving registers on interrupt, it appears that the original intent in the design of the 6800 might have been that A would be the low byte and B the overflow -- high -- byte, when used together -- and that, for some reason, that intent was inverted in the 6801 and 6809. That may be the reason the monitor looks for the character in A instead of B.)

After you're done playing a bit, let's move on to getting this foothold on the 68000.

