Saturday, August 17, 2024

ALPP 02-01 -- Foothold! (Split Stacks ... barely on the Beach) -- 6800



We've had a look at getting characters out on the screen on the EXORciser and Atari ST emulators. Maybe you also took a quick detour through my debugging puzzles. Maybe you've even read my meanderings about studying other processors.

Anyway, we've had a look at using the resources a monitor or a BIOS can provide to help us use a CPU to put out strings of text. But the EXbug monitor ROM and the Atari ST BIOS both use a different discipline from the one I promised to show you. So we don't yet really have that foothold on the beach we are attacking.

Now, we could pervert our simulators and go completely raw on them, replacing the monitor, BIOS, DOS, etc. with code that uses that discipline, but we want some cooked code to take with us when we do -- and a lot more experience. 

Yes, at some point we'll do a little bit-twiddling and BLiTting. But first we'll use the character I/O that the monitor and BIOS provide, and prepare some interesting code for when we get there. 

If we limit our use of the existing monitor and BIOS code to the character I/O routines, we make it simpler to move the code to other targets.

To that end, our project for this chapter is to write a bit of glue code for character and text/string terminal I/O on the EXORciser under EXbug, to connect between the split stack discipline I use and the single-stack discipline used in EXbug. Later, we'll extend to using persistent store -- disk I/O. There also, we'll limit our use of code from the other disciplines, to make it easier to port the code we produce to other targets.

Why split stack? 

What is split stack? 

It's going to be easier to show than to explain, but a little hand-waving at the start might help, so I'll give it a try. See (as I wave my hands), it's like this ...

Unless you went spelunking in the EXbug monitor ROM, you may not have noticed it in there. But the PSH (push) and PUL (pop) instructions on the 6800 and 6801 push and pop registers -- on the same stack as the return address. This is the case on most currently commercially successful CPUs. 

Even on the 6809 and 68000, which provide push and pop instructions for more than one stack, the typical run-time models tend to support and use only a single stack.

(I realize this is jumping in a little deep all of a sudden, but we need to get a look at where we are going. It will become clear after a few more topics.)

A single stack is all fine and good, better than none, anyway, but what happens when you forget what you pushed and try to return to a parameter (in other words, a parameter gets interpreted as a return address)? Or when you pop one byte too many and try to return to an address that is part return address and part something else? 

Yes, things blow up. Or freeze. Or both. Or, worse, silently fail and leave you with hidden corruption in your data. Or, somebody with a far too clever and sneaky bent leaves too much data in your network buffers, overwrites a return address on your stack, and starts remote controlling your computer to do nefarious things like reveal your credit card numbers and send mail in your name to your friends to lure them into traps.

No, that's not being too alarmist, although the split stack is not a complete and perfect cure for all bad coding woes.

What the split stack can do is keep the return address separate from the parameters and local variables, so that if you screw up in popping, pushing, dropping, etc., at least your code eventually returns to where it was supposed to return instead of somewhere completely irrelevant. It returns in a bad state, yes, but in a somewhat controlled state.

I'm getting too excited about this. Let's look at some ways of doing the glue code:

OUTC	LDX	PSP	; get the parameter stack pointer
	LDAA	1,X	; get the low byte where the EXbug's 7-bit character should be.
	INX		; drop the passed character off the stack
	INX
	STX	PSP	; update the stack pointer
	JSR	XOUTCH	; output A via monitor ROM
	RTS

You were wondering what we were going to use for a parameter stack pointer, weren't you? The 6800 has only one stack, after all, and it is the return address stack I was just getting excited about (not) using.

What we're going to do is implement a software stack. The variable PSP will be its stack pointer, and we can load it into X to use it.

Of course, if we have to maintain it every time we use it, we may find it a bit unwieldy to use. So we can borrow from the Forth playbook and define a couple of routines to do that for us: 

  • Parameter POP Double accumulator and 
  • Parameter PUSH Double accumulator

like this:

PPOPD	LDX	PSP
	LDAA	0,X
	LDAB	1,X
	INX
	INX
	STX	PSP
	RTS
*
PPUSHD	LDX	PSP
	DEX
	DEX
	STX	PSP
	STAA	0,X
	STAB	1,X
	RTS

Under the split-stack discipline, parameters are usually passed on the stack. These two routines will be a little different from that, in that they will use the accumulators to pass the value being pushed or popped. 

We can use PPOPD in our new OUTC as follows. It uses a few more cycles, but the OUTC routine will take fewer bytes than the first version I showed above:

OUTC	JSR	PPOPD	; get the character in B
	TBA		; put it where XOUTCH wants it.
	JSR	XOUTCH	; output via monitor ROM
	RTS
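
To put rough numbers on that trade, here is my own tally for the two versions. It assumes PSP sits in the direct page and takes the counts from the standard 6800 instruction tables, so treat it as a sketch and check it against your own reference. The cycle totals leave out whatever XOUTCH itself takes, since that is the same either way.

First OUTC (handles PSP itself):
	LDX  PSP       2 bytes   4 cycles
	LDAA 1,X       2 bytes   5 cycles
	INX            1 byte    4 cycles
	INX            1 byte    4 cycles
	STX  PSP       2 bytes   5 cycles
	JSR  XOUTCH    3 bytes   9 cycles
	RTS            1 byte    5 cycles
	total         12 bytes  36 cycles

Second OUTC (calls PPOPD):
	JSR  PPOPD     3 bytes   9 cycles
	(inside PPOPD, shared code)     32 cycles
	TBA            1 byte    2 cycles
	JSR  XOUTCH    3 bytes   9 cycles
	RTS            1 byte    5 cycles
	total          8 bytes  57 cycles

So the second version is four bytes shorter at each place it is used, and pays about twenty cycles for the extra call and return and for loading a high byte it then ignores.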

And we can pass the character in using PPUSHD as follows:

	CLRA
	LDAB	#'H
	JSR	PPUSHD
	JSR	OUTC

You may be wondering what happens to X. And, if you think about it, A and B.

Under this approach, if you have something important in the index register or the accumulators, you have to save it before calling routines like these. We could use another global variable to save X while we are using it for something like this, but we would then need to worry (more) about interrupts and re-entrancy. (Recursion, we can explicitly disallow.) That is, the more such globals we use, the more we have to worry about them.

On the 6801, we could save X by using PSHX to push it to the return address stack, which on the one hand brings us back toward those problems I mentioned about mixing return addresses with saved registers, but on the other hand avoids the interrupt-time issues of globally referenced variables. On the 6809 and 68000, we have enough index registers and addressing modes to not have to resort to this kind of game, and we'll look at those when we have got reasonable solutions to this for the 6800 and 6801.
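
As a rough sketch of those two options, suppose X holds something we want to keep across the call to OUTC. XKEEP is a hypothetical two-byte variable I'm making up for the illustration. It is not in any of the listings in this chapter.

* 6800: park X in a global.
* Not safe if an interrupt routine also uses XKEEP.
	STX	XKEEP
	CLRA
	LDAB	#'H
	JSR	PPUSHD
	JSR	OUTC
	LDX	XKEEP	; X is back
*
* 6801 only: park X on the return address stack instead.
	PSHX
	CLRA
	LDAB	#'H
	JSR	PPUSHD
	JSR	OUTC
	PULX		; must balance the PSHX before any RTS

The 6801 version has no global to fight over, but if the PULX is ever skipped, the next RTS returns to whatever X was holding.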

Let's see what this looks like using PPUSHD and PPOPD as above:

* Essential monitor ROM routines
XOUTCH	EQU	$F018
*
NATWID	EQU	2	; 2 bytes in the CPU's natural integer
*
*
* Blank line will end assembly.
	ORG	$80	; MDOS and EXbug docs say it should be okay here.
ENTRY	JMP	START
	NOP		; Just want even addressed pointers for no reason.
PSP	RMB	2	; parameter stack pointer
SSAVE	RMB	2	; a place to keep S so we can return clean
*
	ORG	$2000	; MDOS says this is a good place for user stuff
NOENTRY	JMP	START
	RMB	2	; a little bumper space
SSTKLIM	RMB	31	; 16 levels of call, max
SSTKBAS	RMB	1	; 6800 is post-dec (post-store-decrement) push
	RMB	2	; a little bumper space
PSTKLIM	RMB	64	; 16 levels of call at two parameters per call
PSTKBAS	RMB	2	; bumper space -- parameter stack is pre-dec
* (But this example only uses two levels.)
*
*
INISTKS	LDX	#PSTKBAS	; Set up the parameter stack
	STX	PSP
	TSX		; point to return address
	LDX	0,X	; return address in X
	INS		; drop the return pointer on stack
	INS
	STS	SSAVE	; Save what the monitor gave us.
	LDS	#SSTKBAS	; Move to our own stack
	JMP	0,X	; return via X
*
PPOPD	LDX	PSP
	LDAA	0,X
	LDAB	1,X
	INX
	INX
	STX	PSP
	RTS
*
PPUSHD	LDX	PSP
	DEX
	DEX
	STX	PSP
	STAA	0,X
	STAB	1,X
	RTS
*
*
OUTC	JSR	PPOPD	; get the character in B
	TBA		; put it where XOUTCH wants it.
	JSR	XOUTCH	; output via monitor ROM
	RTS
*
*
START	JSR	INISTKS
	CLRA
	LDAB	#'H
	JSR	PPUSHD
	JSR	OUTC
*
DONE	LDS	SSAVE	; restore the monitor stack pointer
	NOP
	NOP		; landing pad

I've lightly tested it, and it should run. Go ahead and give it a try. Check previous chapters if you've forgotten something. 

And remember the (h)elp command if there's something you want to try but don't know how. No promises, but there are things in EXORsim I haven't explained yet that might be helpful.

Now, just for completeness, here's the same thing, but letting the code handle PSP directly, instead of by subroutines, per the first example:

* Essential monitor ROM routines
XOUTCH	EQU	$F018
*
NATWID	EQU	2	; 2 bytes in the CPU's natural integer
*
*
* Blank line will end assembly.
	ORG	$80	; MDOS and EXbug docs say it should be okay here.
ENTRY	JMP	START
	NOP		; Just want even addressed pointers for no reason.
PSP	RMB	2	; parameter stack pointer
SSAVE	RMB	2	; a place to keep S so we can return clean
*
	ORG	$2000	; MDOS says this is a good place for user stuff
NOENTRY	JMP	START
	RMB	2	; a little bumper space
SSTKLIM	RMB	31	; 16 levels of call, max
SSTKBAS	RMB	1	; 6800 is post-dec (post-store-decrement) push
	RMB	2	; a little bumper space
PSTKLIM	RMB	64	; 16 levels of call at two parameters per call
PSTKBAS	RMB	2	; bumper space -- parameter stack is pre-dec
*
*
INISTKS	LDX	#PSTKBAS	; Set up the parameter stack
	STX	PSP
	TSX		; point to return address
	LDX	0,X	; return address in X
	INS		; drop the return pointer on stack
	INS
	STS	SSAVE	; Save what the monitor gave us.
	LDS	#SSTKBAS	; Move to our own stack
	JMP	0,X	; return via X
*
*
OUTC	LDX	PSP	; get the parameter stack pointer
	LDAA	1,X	; get the low byte where the EXbug's 7-bit character should be.
	INX		; drop the passed character off the stack
	INX
	STX	PSP	; update the stack pointer
	JSR	XOUTCH	; output A via monitor ROM
	RTS
*
*
START	JSR	INISTKS
	LDX	PSP
	DEX
	DEX
	STX	PSP
	CLR	0,X
	LDAB	#'H
	STAB	1,X
	JSR	OUTC
*
DONE	LDS	SSAVE	; restore the monitor stack pointer
	NOP
	NOP		; landing pad

About the use of the addresses down at $80, that's in the 6800's direct page. (It's called page zero on some other CPUs.) 

The direct page is in the same overall address space as everything else in the 6800, but you can address it in either of two ways when using binary memory-to-register instructions, specifically, either extended addressing or direct page addressing. Direct page addressing is shorter and quicker.

To explain, if we have a 2-byte pointer variable at address $0080, we can load it into X with either 

FE 00 80            LDX >$0080

or

DE 80               LDX <$0080

Using the angle bracket symbols (greater-than and less-than) to force extended mode and direct mode addressing only works on some assemblers. It does not work in EXORsim's interactive mode assembler. Don't get hung up on that; just remember that EXORsim's assembler chooses direct page for you when it can.

Let's look closely at the object code for the two different op-codes:

  • $FE is the op-code for extended mode LDX.
    It has a 2-byte address field, so it takes 3 bytes to encode and runs in 5 cycles.
  • $DE is the op-code for direct-page mode addressing LDX.
    It has a 1-byte address field, so it takes 2 bytes to encode and runs in 4 cycles.

So if we keep a virtual stack pointer down there, we can save some bytes and cycles referencing it. 

In other words, with PSP allocated in the direct page, it's just a tad bit faster to load and store, and just a tad byte shorter code, as well. 
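
For the listings in this chapter, PSP lands at $0084 (ORG $80, then three bytes of JMP and one NOP). Assuming the standard 6800 op-code table, the two ways of reaching it would assemble as below. The angle brackets are only there to show which mode is meant; this is for reading, not for typing into EXORsim.

DE 84               LDX <PSP    ; direct,   2 bytes, 4 cycles
DF 84               STX <PSP    ; direct,   2 bytes, 5 cycles
FE 00 84            LDX >PSP    ; extended, 3 bytes, 5 cycles
FF 00 84            STX >PSP    ; extended, 3 bytes, 6 cycles

PPOPD and PPUSHD each do one LDX and one STX on PSP, so with PSP in the direct page each routine is two bytes shorter and each call is two cycles quicker.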

Of course, then you have to be careful to make sure that putting PSP down there won't conflict with other stuff down there, kind of like being careful that using U on the 6809 or A6 on the 68000 as a parameter stack pointer won't conflict with the way the OS uses those registers. 

Yeah, there is software that uses variables that will conflict with our variables at $80, but I assume we aren't using those at the same time we're doing this. If so, we can move our variables a bit higher, maybe to $90.

For a quick detour, how about if we did like everyone else and passed the character on the return address stack? PSHB and PSHA are pretty quick.

NO_OUTC	PULA	; Get the high byte out of the way, we think.
	PULA	; get the low byte where the EXbug's 7-bit character should be, we think.
	JSR	XOUTCH	; output A via monitor ROM, we think
	RTS		; we think

Keeeewelll! Why didn't we just do this in the first place?!!?!

(Cough.)

Talk about trying to output the low byte of the return address instead of the character passed, and trying to return to the character that was supposed to be the parameter as if it were an address -- $0048.

Dang. Okay, what about this?

HMMOUTC	PULX	; get the return address out of the way.
	PULA	; Get the high byte out of the way.
	PULA	; get the low byte where the EXbug's 7-bit character should be.
	JSR	XOUTCH	; output A via monitor ROM
	JMP	0,X	; return through X (XOUTCH preserves X, doesn't it?)

That would actually work on the 6801, since the 6801 has a PULX. But the 6800 does not.

AAAARRRRGGGHHHH!!!!!!! NO FAIR!!

Heh. Okay, let's try something that would actually work on the 6800.

OUTC1S	TSX	; Point to the return address on the stack.
	LDAA	3,X	; Skip the return address and high byte.
	LDX	0,X	; get the return address
	INS		; get rid of what we no longer need
	INS		; bump S past the return address
	INS
	INS		; past the character passed in
	JSR	XOUTCH	; output A via monitor ROM
	JMP	0,X	; return through X

Now, unless you have already read up on how TSX works, you should be wondering why the offsets in that are not off by one. 

Motorola, in their wisdom, made TSX and TXS adjust the address as it moves between X and S, so that the post-decrement push nature of the 6800's S is hidden away. Cool, and convenient, but it bites you when you forget, save S to memory with STS, and then load that value into X for some reason.
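
A small sketch of that trap, using a hypothetical two-byte scratch variable SCRATCH (my name for the illustration, not something defined above):

	TSX		; X = S + 1, pointing at the last byte pushed
	LDAA	0,X	; so the top byte on the stack is at 0,X
*
	STS	SCRATCH	; SCRATCH = S, the raw value, no adjustment
	LDX	SCRATCH	; X = S, pointing at the next free byte
	LDAA	1,X	; now the top byte on the stack is at 1,X

Same stack, same byte, but every offset is one higher when S got into X by way of memory.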

But you still have to do stuff to maintain the stack if you put parameters on the return stack on the 6800.

Okay, back from the detour. Let's look at getting strings output. I know, I know, there's already a lot to chew on in this chapter, but we're trying to build a beachhead, and we need a foothold. We aren't there yet. 

We want a routine to output a string with some kind of terminator. EXbug used EOT ($04) as a terminator. The Atari ST's DOS calls used a NUL ($00).

The programming language C uses a NUL as a terminator for many of its standard library functions. There are some problems in doing so, but it's an easy function to define. Point to the string, grab characters in sequence and send them to the output device until we grab a NUL.

This requires learning how to repeat sections of code in a loop, and we'll show how to do that with conditional and absolute branches. (We've already had a soft introduction in the workarounds for EXORsim6801.)

After most instructions that load, store, or compute a value, the mnemonic BEQ means Branch if that value was EQual to zero.

After a CoMPare instruction or a SUBtract instruction, it means Branch if the two operands of the previous compare or subtract were EQual -- in other words, if the difference is zero.

BNE means Branch if Not Equal to zero. Or, after a compare or subtract, Branch if Not Equal.
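
Here is a small fragment showing both readings of BEQ, as a sketch. GOTNUL and GOTEOT are placeholder labels for code that would live somewhere nearby, and X is assumed to already point at a byte we want to examine.

	LDAA	0,X	; loading A sets Z if the byte loaded is zero
	BEQ	GOTNUL	; so this branches when the byte is NUL
	CMPA	#$04	; compare A with EOT, leaving A unchanged
	BEQ	GOTEOT	; this branches when A equals $04
* Neither NUL nor EOT: execution falls through to here.

Swapping either BEQ for BNE would branch in exactly the opposite cases.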

Mnemonics can be a little dicey.

Some instructions have more or less unfortunate mnemonics. At least, I might wish the engineers had chosen "BR" instead of "BRA" for BRanch Always. Oh, well. Motorola was not the first to use the mnemonic by any means. 

It's best not to get too hung up on linguistic infelicities.

In a sort of pseudo-code, the string output code might be written something like

Point to first character of the string.
Do in a loop:
	Get the character.
	If it is not NUL, 
		output the character.
until you get to a NUL.

Brainstorming --

If we load the string address into A and B and push it on the parameter stack, we know how to access it as a local variable now, don't we? 

No?

Yes. Have a look:

	LDX	#HELLO
	STX	XWORK
	LDAA	XWORK	; there are other ways to do this.
	LDAB	XWORK+1
	JSR	PPUSHD	; local variable to point into the string
	LDX	PSP	; point to the local variable

So we got the address of the string onto the parameter stack, and we grabbed the top of the parameter stack and that points to the local copy of the pointer to the string.

But then what? The moment we use X to point to the string, we've lost our pointer to the local variables. And if we load our pointer to the local variable, we've lost our pointer to the string. 

To get a little better view of what we want to do, let's see if we can handle the parameter stack directly and write a string output routine something like this:

OUTS	LDX	PSP	; get the parameter stack pointer
* Point to first character of the string:
	LDX	0,X	; point to the string
* Do in a loop:
* 	Get the character.
OUTSL	LDAA	0,X	; get the byte that's out there
* 	If it is not NUL, 
	BEQ	OUTDN	; if NUL, leave
* 		output the character.
	JSR	XOUTCH	; Call through EXbug
	INX		; point to the next
* until you get to a NUL.
	BRA	OUTSL	; do the next character
OUTDN	LDX	PSP	; drop pointer from parameter stack
	INX
	INX
	STX	PSP
	RTS

So, to recap, we

  • get the pointer to the top of the parameter stack;
  • get the pointer to the string, leaving PSP as it was;
  • (The label OUTSL marks the beginning of the loop.)
  • load accumulator A via the pointer to the string,
    setting the Z flag if it's a NUL;
  • if the Z flag is set,
    branch out of the loop, to label OUTDN;
  • otherwise, call XOUTCH to output the character in A;
  • increment the pointer to the string;
  • branch unconditionally to OUTSL,
    the beginning of the loop, to go again.
  • (The label OUTDN marks the first instruction location
    after the loop body.)

And after the loop is complete, at label OUTDN, we

  • get the pointer to the top of the parameter stack back;
  • increment it twice; and
  • update the top of the stack
    to drop the pointer from the stack.

It looks like it should work.

Here's the complete test program, which light testing shows does work:

* Essential control codes
LF	EQU	$0A	; line feed
CR	EQU	$0D	; carriage return
NUL	EQU	0	; ASCII NUL
*
* Essential monitor ROM routines
XOUTCH	EQU	$F018
*
NATWID	EQU	2	; 2 bytes in the CPU's natural integer
*
*
* Blank line will end assembly.
	ORG	$80	; MDOS and EXbug docs say it should be okay here.
ENTRY	JMP	START
	NOP		; Just want even addressed pointers for no reason.
PSP	RMB	2	; parameter stack pointer
SSAVE	RMB	2	; a place to keep S so we can return clean
*
* XWORK must not be used by any routine that calls another routine!
* It must also not be used for values with long duration.
XWORK	RMB	2	; a place to work on X for very short calculations.
*
	ORG	$2000	; MDOS says this is a good place for user stuff
NOENTRY	JMP	START
	RMB	2	; a little bumper space
SSTKLIM	RMB	31	; 16 levels of call, max
SSTKBAS	RMB	1	; 6800 is post-dec (post-store-decrement) push
	RMB	2	; a little bumper space
PSTKLIM	RMB	64	; 16 levels of call at two parameters per call
PSTKBAS	RMB	2	; bumper space -- parameter stack is pre-dec
*
*
HELLO	FCB	CR,LF	; Put message at beginning of line
	FCC	"Ashi-gakari ga dekita!"	; Whatever the user wants here.
	FCB	CR,LF,NUL	; Put the debugger's output on a new line.
*
*
INISTKS	LDX	#PSTKBAS	; Set up the parameter stack
	STX	PSP
	TSX		; point to return address
	LDX	0,X	; return address in X
	INS		; drop the return pointer on stack
	INS
	STS	SSAVE	; Save what the monitor gave us.
	LDS	#SSTKBAS	; Move to our own stack
	JMP	0,X	; return via X
*
*
OUTC	LDX	PSP	; get the parameter stack pointer
	LDAA	1,X	; get the low byte where the EXbug's 7-bit character should be.
	INX		; drop the passed character off the stack
	INX
	STX	PSP	; update the stack pointer
	BSR	OUTCV	; output A via monitor ROM
	RTS
*
OUTCV	JMP	XOUTCH	; Centralize the calls into the monitor.
*
OUTS	LDX	PSP	; get the parameter stack pointer
	LDX	0,X	; get the string pointer
OUTSL	LDAA	0,X	; get the byte out there
	BEQ	OUTDN	; if NUL, leave
	BSR	OUTCV	; use the same call OUTC uses.
	INX		; point to the next
	BRA	OUTSL	; next character
OUTDN	LDX	PSP	; drop pointer from parameter stack
	INX
	INX
	STX	PSP
	RTS
*
*
START	JSR	INISTKS
	LDX	#HELLO	; Other assemblers allow splitting addresses in half.
	STX	XWORK
	LDAA	XWORK
	LDAB	XWORK+1
	LDX	PSP
	DEX
	DEX
	STX	PSP
	STAA	0,X
	STAB	1,X
	JSR	OUTS
*
DONE	LDS	SSAVE	; restore the monitor stack pointer
	NOP
	NOP		; landing pad

Note that I am centralizing the call into the monitor ROM so that, if you use this code on another platform, you only need to change the address that OUTCV jumps to. 

Also, the FCC directive, Form Constant Character, is used for making strings. Many assemblers will allow strings to be assembled with FCB, but some will require FCC instead.
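
As a tiny illustration of what FCC produces, here are the same three ASCII bytes laid down twice, once as a string and once spelled out. HI and HI2 are hypothetical labels of mine, not part of the test program.

HI	FCC	"Hi!"	; three bytes: $48, $69, $21
HI2	FCB	$48,$69,$21	; the same three bytes, spelled out

Neither directive adds a terminator, which is why the HELLO string above has its NUL added explicitly with FCB.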

Now, if we want to cut this up into re-usable subroutines, like PPUSHD and PPOPD, we probably want to define PPUSHX and PPOPX. If we are careful, PPUSHX and PPOPX can re-use PPUSHD and PPOPD.

* Essential control codes
LF	EQU	$0A	; line feed
CR	EQU	$0D	; carriage return
NUL	EQU	0	; ASCII NUL
*
* Essential monitor ROM routines
XOUTCH	EQU	$F018
*
NATWID	EQU	2	; 2 bytes in the CPU's natural integer
*
*
* Blank line will end assembly.
	ORG	$80	; MDOS and EXbug docs say it should be okay here.
ENTRY	JMP	START
	NOP		; Just want even addressed pointers for no reason.
PSP	RMB	2	; parameter stack pointer
SSAVE	RMB	2	; a place to keep S so we can return clean
*
* XWORK must not be used by any routine that calls another routine!
* XWORK must also not be used for values with long duration.
XWORK	RMB	2	; a place to work on X for very short calculations.
* More accurately, it must not be in use when a routine that uses it 
* calls another routine.
*
* XSTKSV is strictly for PPUSHX and PPOPX
XSTKSV	RMB	2	; a place to keep X for pushing and popping.
*
	ORG	$2000	; MDOS says this is a good place for user stuff
NOENTRY	JMP	START
	RMB	2	; a little bumper space
SSTKLIM	RMB	31	; 16 levels of call, max
SSTKBAS	RMB	1	; 6800 is post-dec (post-store-decrement) push
	RMB	2	; a little bumper space
PSTKLIM	RMB	64	; 16 levels of call at two parameters per call
PSTKBAS	RMB	2	; bumper space -- parameter stack is pre-dec
*
*
HELLO	FCB	CR,LF	; Put message at beginning of line
	FCC	"Ashi-gakari ga dekita!"	; Whatever the user wants here.
	FCB	CR,LF,NUL	; Put the debugger's output on a new line.
*
*
INISTKS	LDX	#PSTKBAS	; Set up the parameter stack
	STX	PSP
	TSX		; point to return address
	LDX	0,X	; return address in X
	INS		; drop the return pointer on stack
	INS
	STS	SSAVE	; Save what the monitor gave us.
	LDS	#SSTKBAS	; Move to our own stack
	JMP	0,X	; return via X
*
*
PPOPD	LDX	PSP
	LDAA	0,X
	LDAB	1,X
	INX
	INX
	STX	PSP
	RTS
*
PPOPX	BSR	PPOPD
	STAA	XSTKSV
	STAB	XSTKSV+1
	LDX	XSTKSV
	RTS
*
PPUSHD	LDX	PSP
	DEX
	DEX
	STX	PSP
	STAA	0,X
	STAB	1,X
	RTS
*
PPUSHX	STX	XSTKSV
	LDAA	XSTKSV
	LDAB	XSTKSV+1
	BRA	PPUSHD	; rob RTS
*
OUTC	JSR	PPOPD	; get the character in B
	TBA		; put it where XOUTCH wants it.
	BSR	OUTCV	; output A via monitor ROM
	RTS
*
OUTCV	JMP	XOUTCH
*
OUTS	JSR	PPOPX	; get the string pointer
OUTSL	LDAA	0,X	; get the byte out there
	BEQ	OUTDN	; if NUL, leave
	BSR	OUTCV	; use the same call OUTC uses.
	INX		; point to the next
	BRA	OUTSL	; next character
OUTDN	RTS
*
*
START	JSR	INISTKS
	LDX	#HELLO	; There are other ways to push the address.
	JSR	PPUSHX
	JSR	OUTS
*
DONE	LDS	SSAVE	; restore the monitor stack pointer
	NOP
	NOP		; landing pad

How does that work? 

We hide the resource conflicts in the pushes and pops of the glue subroutines.

And, incidentally, here we can see that, when the code is close, we can use branches instead of jumps, including Branch to SubRoutine instead of Jump to SubRoutine.

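Branches on the 6800 and 6801 use byte offsets instead of 2-byte absolute addresses, and are limited to a range of -128 to +127 from the address after the offset. This saves a byte. On the 6800, BSR also saves a cycle compared to JSR with an extended address, although BRA actually takes a cycle more than JMP with an extended address.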
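
To make the offset arithmetic concrete, here is the OUTS loop from the listing above assembled by hand, with addresses written relative to OUTSL so the numbers don't depend on where the code lands. This is my own working from the standard 6800 op-code table, so check it against your assembler's output. OUTCV sits six bytes before OUTSL (three bytes of JMP XOUTCH, then three bytes of JSR PPOPX).

+0  A6 00           OUTSL LDAA 0,X
+2  27 05                 BEQ OUTDN   ; next instruction is +4, and 4 + 5 = +9
+4  8D F4                 BSR OUTCV   ; next is +6, and 6 + $F4 (-12) = -6
+6  08                    INX
+7  20 F7                 BRA OUTSL   ; next is +9, and 9 + $F7 (-9) = +0
+9  39              OUTDN RTS

In each case the offset is counted from the address of the instruction following the branch, not from the branch itself.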

Also, branch offsets are relative to the program counter rather than absolute addresses, but that is not as important as we might wish it were -- not until the 6809 and 68000 (which need a chapter or two of their own to explain, way down the road).

It's worth noting here that, if we intend to use the code above to make our own monitor or BIOS, we'll need to consider a number of things we're ignoring here, such as what to do during interrupts -- and how to make sure the stacks have remained in balance.
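
As one very small step in that direction, here is a sketch of a balance check that could replace the DONE code in the test programs above. It only catches a parameter stack that ends up somewhere other than where INISTKS put it, nothing cleverer. DONEOK and PSTKBAD are hypothetical labels of mine.

DONE	LDX	PSP	; where did the parameter stack end up?
	CPX	#PSTKBAS	; compare with where INISTKS put it
	BEQ	DONEOK	; Z set: pushes and pops balanced
* Out of balance: spin here so it shows up in the debugger.
PSTKBAD	BRA	PSTKBAD
DONEOK	LDS	SSAVE	; restore the monitor stack pointer
	NOP
	NOP		; landing pad

If the program hangs at PSTKBAD, breaking into the debugger and looking at PSP shows how far off the stack is and in which direction.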

And, with that, I think we are ready to see how the 6801 improves things just a little.

