Foothold!
(Split Stacks ... barely on the Beach)
6800
We've had a look at getting characters out on the screen on the EXORciser and
Atari ST emulators. Maybe you also took a quick detour through my debugging puzzles. Maybe you've even read my meanderings about studying other processors.
Anyway, we've had a look at using the resources a monitor or a BIOS can provide
to help us use a CPU to put out strings of text. But the EXbug monitor ROM and the Atari ST BIOS
both use different disciplines than the discipline I promised to show you. So we don't yet really have that foothold on the beach we are attacking.
Now, we could pervert our simulators and go completely raw on them, replacing the monitor, BIOS, DOS, etc. with code that uses that discipline, but we want some cooked code to take with us when we do -- and a lot more experience.
Yes, at some point we'll do a little bit-twiddling and BLiTting. But first we'll use the character I/O that the monitor and BIOS provide, and prepare some interesting code for when we get there.
If we limit our use of the
existing monitor and BIOS code to the character I/O routines, we make it simpler to move the code to other targets.
To that end, our project for this chapter is to write a bit of glue code for character and text/string terminal I/O on the EXORciser under EXbug, to connect between the split stack discipline I use and the single-stack discipline used in EXbug. Later, we'll extend to using persistent store -- disk I/O. There also, we'll limit our use of code from the other disciplines, to make it easier to port the code we produce to other targets.
Why split stack?
What is split stack?
It's going to be easier to show than to explain, but a little hand-waving at the start might help, so I'll give it a try. See (as I wave my hands), it's like this ...
Unless you went spelunking in the EXbug monitor ROM, you may not have noticed it in there. But the PSH (push) and PUL (pop) instructions on the 6800 and 6801 push and pop registers -- on the same stack as the return address. This is the case on most currently commercially successful CPUs.
Even the typical run-time models implemented on the 6809 and 68000, which provide push and pop instructions for multiple stacks, tend to support and use only a single stack.
(I realize this is jumping in a little deep all of a sudden, but we need to get a look at where we are going. It will become clear after a few more topics.)
A single stack is all fine and good, better than none, anyway, but what happens when you forget what you pushed on and try to return to a parameter (in other words, interpreted as an address)? Or when you pop one too many bytes of parameter off and try to return to an address that is part of a parameter and part of something else?
Yes, things blow up. Or freeze. Or both. Or, worse, silently fail and leave
you with hidden corruption in your data. Or, somebody with a far too clever
and sneaky bent leaves too much data in your network buffers, overwrites a
return address on your stack, and starts remote controlling your computer to
do nefarious things like reveal your credit card numbers and send mail in your
name to your friends to lure them into traps.
No, that's not being too alarmist, although the split stack is not a complete
and perfect cure for all bad coding woes.
What the split stack can do is keep the return address separate from the
parameters and local variables, so that if you screw up in popping, pushing,
dropping, etc., at least your code eventually returns where it might have been
supposed to be instead of somewhere completely irrelevant. It returns in a bad
state, yes, but in a somewhat controlled state.
I'm getting too excited about this. Let's look at some ways of doing the glue
code:
OUTC LDX PSP ; get the parameter stack pointer
LDAA 1,X ; get the low byte where the EXbug's 7-bit character should be.
INX ; drop the passed character off the stack
INX
STX PSP ; update the stack pointer
JSR XOUTCH ; output A via monitor ROM
RTS
You were wondering what we were going to use for a parameter stack pointer, weren't you? The 6800 has only one stack, after all, and it is the return address stack I was just getting excited about (not) using.
What we're going to do is implement a software stack. The variable PSP will be its stack pointer, and we can load it into X to use it.
Of course, if we have to maintain it every time we use it, we may find it a bit unwieldy to use. So we can borrow from the Forth playbook and define a couple of routines to do that for us:
- Parameter POP Double accumulator and
- Parameter PUSH Double accumulator
like this:
PPOPD LDX PSP
LDAA 0,X
LDAB 1,X
INX
INX
STX PSP
RTS
*
PPUSHD LDX PSP
DEX
DEX
STX PSP
STAA 0,X
STAB 1,X
RTS
Under the split-stack discipline, parameters are usually passed on the stack. These two routines will be a little different from that, in that they will use the accumulators to pass the value being pushed or popped.
We can use PPOPD in our new OUTC as follows. It uses a few more cycles, but the OUTC
routine will take fewer bytes than the first version I showed above:
OUTC JSR PPOPD ; get the character in B
TBA ; put it where XOUTCH wants it.
JSR XOUTCH ; output via monitor ROM
RTS
And we can pass the character in using PPUSHD as follows:
CLRA
LDAB #'H
JSR PPUSHD
JSR OUTC
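Since the parameter stack is last-in, first-out, the order of the pushes matters. Here is a small sketch, assuming the PPUSHD and OUTC routines shown above and that XOUTCH returns normally, which pushes two characters and then prints them:

```asm
        CLRA
        LDAB  #'i       ; pushed first, so it comes off last
        JSR   PPUSHD
        CLRA
        LDAB  #'H       ; pushed last, so it comes off first
        JSR   PPUSHD
        JSR   OUTC      ; pops and prints H
        JSR   OUTC      ; pops and prints i
```

Each call to OUTC consumes exactly one 2-byte cell, so after the second call the parameter stack is back where it started.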
You may be wondering what happens to X. And, if you think about it, A and B.
Under this approach, if you have something important in the index register or the accumulators, you have to save it before calling routines like these. We could use another global variable to save X while we are using it for something like this, but we would then need to worry (more) about interrupts and re-entrancy. (Recursion, we can explicitly disallow.) That is, the more such globals we use, the more we have to worry about them.
On the 6801, we could save X by using PSHX to push it to the return address stack, which on the one hand brings us back toward those problems I mentioned about mixing return addresses with saved registers, but on the other hand avoids the interrupt-time issues of globally referenced variables. On the 6809 and 68000, we have enough index registers and addressing modes to not have to resort to this kind of game, and we'll look at those when we have got reasonable solutions to this for the 6800 and 6801.
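As a rough sketch of that 6801 option (not something we will rely on in this chapter), a pop routine that preserves the caller's X might look like the following. The name PPOPDX is made up for this illustration, and PSHX and PULX exist only on the 6801, not on the 6800:

```asm
PPOPDX  PSHX            ; 6801 only: save caller's X on the return stack
        LDX   PSP       ; get the parameter stack pointer
        LDAA  0,X       ; high byte of the top cell
        LDAB  1,X       ; low byte of the top cell
        INX             ; drop the cell
        INX
        STX   PSP       ; update the stack pointer
        PULX            ; 6801 only: restore caller's X
        RTS
```

The PSHX and PULX are balanced inside the routine, so the return address is back on top of the S stack when RTS executes. Forget one of them, and we are right back in the trouble described earlier.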
Let's see what this looks like using PPUSHD and PPOPD as above:
* Essential monitor ROM routines
XOUTCH EQU $F018
*
NATWID EQU 2 ; 2 bytes in the CPU's natural integer
*
*
* Blank line will end assembly.
ORG $80 ; MDOS and EXbug docs say it should be okay here.
ENTRY JMP START
NOP ; Just want even addressed pointers for no reason.
PSP RMB 2 ; parameter stack pointer
SSAVE RMB 2 ; a place to keep S so we can return clean
*
ORG $2000 ; MDOS says this is a good place for user stuff
NOENTRY JMP START
RMB 2 ; a little bumper space
SSTKLIM RMB 31 ; 16 levels of call, max
SSTKBAS RMB 1 ; 6800 is post-dec (post-store-decrement) push
RMB 2 ; a little bumper space
PSTKLIM RMB 64 ; 16 levels of call at two parameters per call
PSTKBAS RMB 2 ; bumper space -- parameter stack is pre-dec
* (But this example only uses two levels.)
*
*
INISTKS LDX #PSTKBAS ; Set up the parameter stack
STX PSP
TSX ; point to return address
LDX 0,X ; return address in X
INS ; drop the return pointer on stack
INS
STS SSAVE ; Save what the monitor gave us.
LDS #SSTKBAS ; Move to our own stack
JMP 0,X ; return via X
*
PPOPD LDX PSP
LDAA 0,X
LDAB 1,X
INX
INX
STX PSP
RTS
*
PPUSHD LDX PSP
DEX
DEX
STX PSP
STAA 0,X
STAB 1,X
RTS
*
*
OUTC JSR PPOPD ; get the character in B
TBA ; put it where XOUTCH wants it.
JSR XOUTCH ; output via monitor ROM
RTS
*
*
START JSR INISTKS
CLRA
LDAB #'H
JSR PPUSHD
JSR OUTC
*
DONE LDS SSAVE ; restore the monitor stack pointer
NOP
NOP ; landing pad
I've lightly tested it, so it should run. Go ahead and give it a try. Check previous chapters if you've forgotten something.
And remember the (h)elp command if there's something you want to try but don't
know how. No promises, but there are things in EXORsim I haven't explained yet
that might be helpful.
Now, just for completeness, here's the same thing, but letting the code handle PSP directly, instead of by subroutines, per the first example:
* Essential monitor ROM routines
XOUTCH EQU $F018
*
NATWID EQU 2 ; 2 bytes in the CPU's natural integer
*
*
* Blank line will end assembly.
ORG $80 ; MDOS and EXbug docs say it should be okay here.
ENTRY JMP START
NOP ; Just want even addressed pointers for no reason.
PSP RMB 2 ; parameter stack pointer
SSAVE RMB 2 ; a place to keep S so we can return clean
*
ORG $2000 ; MDOS says this is a good place for user stuff
NOENTRY JMP START
RMB 2 ; a little bumper space
SSTKLIM RMB 31 ; 16 levels of call, max
SSTKBAS RMB 1 ; 6800 is post-dec (post-store-decrement) push
RMB 2 ; a little bumper space
PSTKLIM RMB 64 ; 16 levels of call at two parameters per call
PSTKBAS RMB 2 ; bumper space -- parameter stack is pre-dec
*
*
INISTKS LDX #PSTKBAS ; Set up the parameter stack
STX PSP
TSX ; point to return address
LDX 0,X ; return address in X
INS ; drop the return pointer on stack
INS
STS SSAVE ; Save what the monitor gave us.
LDS #SSTKBAS ; Move to our own stack
JMP 0,X ; return via X
*
*
OUTC LDX PSP ; get the parameter stack pointer
LDAA 1,X ; get the low byte where the EXbug's 7-bit character should be.
INX ; drop the passed character off the stack
INX
STX PSP ; update the stack pointer
JSR XOUTCH ; output A via monitor ROM
RTS
*
*
START JSR INISTKS
LDX PSP
DEX
DEX
STX PSP
CLR 0,X
LDAB #'H
STAB 1,X
JSR OUTC
*
DONE LDS SSAVE ; restore the monitor stack pointer
NOP
NOP ; landing pad
About the use of the addresses down at $80, that's in the 6800's direct page. (It's called page zero on some other CPUs.)
The direct page is in the same overall address space as everything else in the 6800, but you can address it in either of two ways when using binary memory-to-register instructions, specifically, either extended addressing or direct page addressing. Direct page addressing is shorter and quicker.
To explain, if we have a 2-byte pointer variable at address $0080, we can load it into X with either
FE 00 80 LDX >$0080
or
DE 80 LDX <$0080
Using angle brackets (the greater-than and less-than symbols) to force extended mode and direct mode addressing only works on some assemblers. It does not work on EXORsim's interactive mode assembler. Don't get hung up on that, just remember that EXORsim's assembler chooses direct page for you when it can.
Let's look closely at the object code for the two different op-codes:
- $FE is the op-code for extended mode LDX. It has a 2-byte address field, so it takes 3 bytes to encode and runs in 5 cycles.
- $DE is the op-code for direct-page mode LDX. It has a 1-byte address field, so it takes 2 bytes to encode and runs in 4 cycles.
So if we keep a virtual stack pointer down there, we can save some bytes and cycles referencing it.
In other words, with PSP allocated in the direct page, it's just a tad bit faster to load and store, and just a tad byte shorter code, as well.
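To put numbers on that, here is the start of a push with the object code and 6800 cycle counts in the comments. This sketch assumes PSP lands at $0084, which is where ORG $80 followed by a 3-byte JMP and a 1-byte NOP would put it. The $2000 address in the comparison is just a stand-in for anywhere outside the direct page:

```asm
PSP     EQU   $0084     ; assumed direct-page location of PSP
        LDX   PSP       ; DE 84     2 bytes, 4 cycles (direct)
        DEX             ; 09
        DEX             ; 09
        STX   PSP       ; DF 84     2 bytes, 5 cycles (direct)
* The same two references to a variable at a hypothetical $2000
* would have to use extended mode:
*       LDX   $2000     ; FE 20 00  3 bytes, 5 cycles
*       STX   $2000     ; FF 20 00  3 bytes, 6 cycles
```

That is two bytes and two cycles saved each time the pointer is loaded and stored, which adds up when every push and pop does it.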
Of course, then you have to be careful to make sure that putting PSP down there won't conflict with other stuff down there, kind of like being careful that using U on the 6809 or A6 on the 68000 as a parameter stack pointer won't conflict with the way the OS uses those registers.
Yeah, there is software that uses variables that will conflict with our variables at $80, but I assume we aren't using those at the same time we're doing this. If so, we can move our variables a bit higher, maybe to $90.
For a quick detour, how about if we did like everyone else and passed the character on the return address stack? PSHB and PSHA are pretty quick.
NO_OUTC PULA ; Get the high byte out of the way, we think.
PULA ; get the low byte where the EXbug's 7-bit character should be, we think.
JSR XOUTCH ; output A via monitor ROM, we think
RTS ; we think
Keeeewelll! Why didn't we just do this in the first place?!!?!
(Cough.)
Talk about trying to output the low byte of the return address instead of the character passed, and trying to return to the character that was supposed to be the parameter as if it were an address -- $0048.
Dang. Okay, what about this?
HMMOUTC PULX ; get the return address out of the way.
PULA ; Get the high byte out of the way.
PULA ; get the low byte where the EXbug's 7-bit character should be.
JSR XOUTCH ; output A via monitor ROM
JMP 0,X ; return through X (XOUTCH preserves X, doesn't it?)
That would actually work on the 6801, since the 6801 has a PULX. But the 6800
does not.
AAAARRRRGGGHHHH!!!!!!! NO FAIR!!
Heh. Okay, let's try something that would actually work on the 6800.
OUTC1S TSX ; Point to the return address on the stack.
LDAA 3,X ; Skip the return address and high byte.
LDX 0,X ; get the return address
INS ; get rid of what we no longer need
INS ; bump S past the return address
INS
INS ; past the character passed in
JSR XOUTCH ; output A via monitor ROM
JMP 0,X ; return through X
Now, unless you have already read up about how TSX works, you should be wondering why the offsets in that are not one off.
Motorola, in their wisdom, made TSX and TXS adjust the address by one on its way between X and S (TSX loads X with S+1, and TXS loads S with X-1), so that the post-decrement push nature of the 6800 S is hidden away. Cool, and convenient, but it bites you when you forget and save S to memory and then try to load it to X for some reason.
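A short sketch of the difference, assuming a 2-byte scratch variable in RAM called XWORK (any spare two bytes would do):

```asm
* Two ways to reach the byte most recently pushed on the S stack.
        PSHA            ; store A at S, then S = S - 1
        TSX             ; X = S + 1, so X points at the byte just pushed
        LDAB  0,X       ; B = the pushed byte
*
        STS   XWORK     ; XWORK = S exactly, no adjustment
        LDX   XWORK     ; X = S, one byte below the byte just pushed
        LDAB  1,X       ; so the same byte is at offset 1, not 0
        INS             ; drop the pushed byte again
```

Both LDAB instructions fetch the same byte. Only the offset changes, because only TSX makes the adjustment.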
But you still have to do stuff to maintain the stack if you put parameters on the return stack on the 6800.
Okay, back from the detour. Let's look at getting strings output. I know, I know, there's already a lot to chew on in this chapter, but we're trying to build a beachhead, and we need a foothold. We aren't there yet.
We want a routine to output a string with some kind of terminator. EXbug used EOT ($04) as a terminator. The Atari ST's DOS calls used a NUL ($00).
The programming language C uses a NUL as a terminator for many of its standard library functions. There are some problems in doing so, but it's an easy function to define. Point to the string, grab characters in sequence and send them to the output device until we grab a NUL.
This requires learning how to repeat sections of code in a loop, and we'll
show how to do that with conditional and absolute branches. (We've already had a soft introduction in the workarounds for EXORsim6801.)
After many instructions, the mnemonic BEQ means Branch if the result was EQual to zero.
After a CoMPare instruction or a SUBtract instruction, it means Branch if the two operands of the previous compare or subtract were EQual -- in other words, if the difference is zero.
BNE means Branch if Not Equal to zero. Or, after a compare or subtract, Branch if Not Equal.
Mnemonics can be a little dicey.
Some instructions have more or less unfortunate mnemonics. At least, I might wish the engineers had chosen "BR" instead of "BRA" for BRanch Always. Oh, well. Motorola was not the first to use the mnemonic by any means.
It's best not to get too hung up on linguistic infelicities.
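To make BEQ and BNE a little more concrete before we use them, here are two small sketches. BUFFER is a hypothetical label for 16 bytes of RAM, and the routine names are made up for this illustration:

```asm
* Clear a 16-byte buffer with a counted loop.
CLRBUF  LDX   #BUFFER   ; point to the first byte
        LDAB  #16       ; number of bytes to clear
CLRBFL  CLR   0,X       ; clear the byte X points to
        INX             ; point to the next byte
        DECB            ; count down; sets Z when B reaches zero
        BNE   CLRBFL    ; not zero yet, go around again
        RTS
*
* Test the character in A against EOT with a compare.
ISEOT   CMPA  #$04      ; computes A - $04, keeps only the flags
        BEQ   ISEOTY    ; Z set: A held EOT
        CLRB            ; B = 0 means "not EOT"
        RTS
ISEOTY  LDAB  #1        ; B = 1 means "EOT"
        RTS
```

In the first routine, BNE tests the flags left by DECB, the last flag-setting instruction before it. In the second, CMPA does a subtraction without keeping the difference, so BEQ here really does mean "branch if the two were equal".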
In a sort of pseudo-code, the string output code might be written something like
Point to first character of the string.
Do in a loop:
Get the character.
If it is not NUL,
output the character.
until you get to a NUL.
Brainstorming --
If we load the string address into A and B and push it on the parameter stack, we know how to access it as a local variable now, don't we?
No?
Yes. Have a look:
LDX #HELLO
STX XWORK
LDAA XWORK ; there are other ways to do this.
LDAB XWORK+1
JSR PPUSHD ; local variable to point into the string
LDX PSP ; point to the local variable
So we got the address of the string onto the parameter stack, and we grabbed the top of the parameter stack and that points to the local copy of the pointer to the string.
But then what? The moment we use X to point to the string, we've lost our pointer to the local variables. And if we load our pointer to the local variable, we've lost our pointer to the string.
To get a little better view of what we want to do, let's see if we can handle
the parameter stack directly and write a string output routine something like
this:
OUTS LDX PSP ; get the parameter stack pointer
* Point to first character of the string:
LDX 0,X ; point to the string
* Do in a loop:
* Get the character.
OUTSL LDAA 0,X ; get the byte that's out there
* If it is not NUL,
BEQ OUTDN ; if NUL, leave
* output the character.
JSR XOUTCH ; Call through EXbug
INX ; point to the next
* until you get to a NUL.
BRA OUTSL ; do the next character
OUTDN LDX PSP ; drop pointer from parameter stack
INX
INX
STX PSP
RTS
So, to recap, we
- get the pointer to the top of the parameter stack;
- get the pointer to the string, leaving PSP as it was;
- (The label OUTSL marks the beginning of the loop.)
- load accumulator A via the pointer to the string, setting the Z flag if it's a NUL;
- if the Z flag is set, branch out of the loop, to label OUTDN;
- otherwise, call XOUTCH to output the character in A;
- increment the pointer to the string;
- branch unconditionally to OUTSL, the beginning of the loop, to go again.
- (The label OUTDN marks the first instruction location after the loop body.)
And after the loop is complete, at label OUTDN, we
- get the pointer to the top of the parameter stack back;
- increment it twice; and
- update PSP, the top of the stack, to drop the pointer from the stack.
It looks like it should work.
Here's the complete test program, which light testing shows does work:
* Essential control codes
LF EQU $0A ; line feed
CR EQU $0D ; carriage return
NUL EQU 0 ; ASCII NUL
*
* Essential monitor ROM routines
XOUTCH EQU $F018
*
NATWID EQU 2 ; 2 bytes in the CPU's natural integer
*
*
* Blank line will end assembly.
ORG $80 ; MDOS and EXbug docs say it should be okay here.
ENTRY JMP START
NOP ; Just want even addressed pointers for no reason.
PSP RMB 2 ; parameter stack pointer
SSAVE RMB 2 ; a place to keep S so we can return clean
*
* XWORK must not be used by any routine that calls another routine!
* It must also not be used for values with long duration.
XWORK RMB 2 ; a place to work on X for very short calculations.
*
ORG $2000 ; MDOS says this is a good place for user stuff
NOENTRY JMP START
RMB 2 ; a little bumper space
SSTKLIM RMB 31 ; 16 levels of call, max
SSTKBAS RMB 1 ; 6800 is post-dec (post-store-decrement) push
RMB 2 ; a little bumper space
PSTKLIM RMB 64 ; 16 levels of call at two parameters per call
PSTKBAS RMB 2 ; bumper space -- parameter stack is pre-dec
*
*
HELLO FCB CR,LF ; Put message at beginning of line
FCC "Ashi-gakari ga dekita!" ; Whatever the user wants here.
FCB CR,LF,NUL ; Put the debugger's output on a new line.
*
*
INISTKS LDX #PSTKBAS ; Set up the parameter stack
STX PSP
TSX ; point to return address
LDX 0,X ; return address in X
INS ; drop the return pointer on stack
INS
STS SSAVE ; Save what the monitor gave us.
LDS #SSTKBAS ; Move to our own stack
JMP 0,X ; return via X
*
*
OUTC LDX PSP ; get the parameter stack pointer
LDAA 1,X ; get the low byte where the EXbug's 7-bit character should be.
INX ; drop the passed character off the stack
INX
STX PSP ; update the stack pointer
BSR OUTCV ; output A via monitor ROM
RTS
*
OUTCV JMP XOUTCH ; Centralize the calls into the monitor.
*
OUTS LDX PSP ; get the parameter stack pointer
LDX 0,X ; get the string pointer
OUTSL LDAA 0,X ; get the byte out there
BEQ OUTDN ; if NUL, leave
BSR OUTCV ; use the same call OUTC uses.
INX ; point to the next
BRA OUTSL ; next character
OUTDN LDX PSP ; drop pointer from parameter stack
INX
INX
STX PSP
RTS
*
*
START JSR INISTKS
LDX #HELLO ; Other assemblers allow splitting addresses in half.
STX XWORK
LDAA XWORK
LDAB XWORK+1
LDX PSP
DEX
DEX
STX PSP
STAA 0,X
STAB 1,X
JSR OUTS
*
DONE LDS SSAVE ; restore the monitor stack pointer
NOP
NOP ; landing pad
Note that I am centralizing the call into the monitor ROM so that, if you use this code on another platform, you only need to change the address that OUTCV jumps to.
Also, the FCC directive, Form Constant Character, is used for making strings. Many assemblers will allow strings to be assembled with FCB, but some will require FCC instead.
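Since EXbug itself terminates strings with EOT rather than NUL, a variant for EOT-terminated strings is a small change: loading A no longer sets the flag we want, so we add a compare. This is only a sketch. The labels OUTSE, OUTSEL and OUTSED are made up here, it leans on XOUTCH preserving X just as OUTS does, and the BSR assumes OUTCV is within branch range:

```asm
EOT     EQU   $04       ; ASCII end-of-transmission
*
OUTSE   LDX   PSP       ; get the parameter stack pointer
        LDX   0,X       ; get the string pointer
OUTSEL  LDAA  0,X       ; get the byte out there
        CMPA  #EOT      ; compare sets Z if the byte is EOT
        BEQ   OUTSED    ; if EOT, leave
        BSR   OUTCV     ; same vector into the monitor that OUTS uses
        INX             ; point to the next
        BRA   OUTSEL    ; next character
OUTSED  LDX   PSP       ; drop pointer from parameter stack
        INX
        INX
        STX   PSP
        RTS
```

To try it, the message would need to end with an FCB of $04 instead of NUL.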
Now, if we want to cut this up into re-usable subroutines, like PPUSHD and
PPOPD, we probably want to define PPUSHX and PPOPX. If we are careful, PPUSHX
and PPOPX can re-use PPUSHD and PPOPD.
* Essential control codes
LF EQU $0A ; line feed
CR EQU $0D ; carriage return
NUL EQU 0 ; ASCII NUL
*
* Essential monitor ROM routines
XOUTCH EQU $F018
*
NATWID EQU 2 ; 2 bytes in the CPU's natural integer
*
*
* Blank line will end assembly.
ORG $80 ; MDOS and EXbug docs say it should be okay here.
ENTRY JMP START
NOP ; Just want even addressed pointers for no reason.
PSP RMB 2 ; parameter stack pointer
SSAVE RMB 2 ; a place to keep S so we can return clean
*
* XWORK must not be used by any routine that calls another routine!
* XWORK must also not be used for values with long duration.
XWORK RMB 2 ; a place to work on X for very short calculations.
* More accurately, it must not be in use when a routine that uses it
* calls another routine.
*
* XSTKSV is strictly for PPUSHX and PPOPX
XSTKSV RMB 2 ; a place to keep X for pushing and popping.
*
ORG $2000 ; MDOS says this is a good place for user stuff
NOENTRY JMP START
RMB 2 ; a little bumper space
SSTKLIM RMB 31 ; 16 levels of call, max
SSTKBAS RMB 1 ; 6800 is post-dec (post-store-decrement) push
RMB 2 ; a little bumper space
PSTKLIM RMB 64 ; 16 levels of call at two parameters per call
PSTKBAS RMB 2 ; bumper space -- parameter stack is pre-dec
*
*
HELLO FCB CR,LF ; Put message at beginning of line
FCC "Ashi-gakari ga dekita!" ; Whatever the user wants here.
FCB CR,LF,NUL ; Put the debugger's output on a new line.
*
*
INISTKS LDX #PSTKBAS ; Set up the parameter stack
STX PSP
TSX ; point to return address
LDX 0,X ; return address in X
INS ; drop the return pointer on stack
INS
STS SSAVE ; Save what the monitor gave us.
LDS #SSTKBAS ; Move to our own stack
JMP 0,X ; return via X
*
*
PPOPD LDX PSP
LDAA 0,X
LDAB 1,X
INX
INX
STX PSP
RTS
*
PPOPX BSR PPOPD
STAA XSTKSV
STAB XSTKSV+1
LDX XSTKSV
RTS
*
PPUSHD LDX PSP
DEX
DEX
STX PSP
STAA 0,X
STAB 1,X
RTS
*
PPUSHX STX XSTKSV
LDAA XSTKSV
LDAB XSTKSV+1
BRA PPUSHD ; rob RTS
*
OUTC JSR PPOPD ; get the character in B
TBA ; put it where XOUTCH wants it.
BSR OUTCV ; output A via monitor ROM
RTS
*
OUTCV JMP XOUTCH
*
OUTS JSR PPOPX ; get the string pointer
OUTSL LDAA 0,X ; get the byte out there
BEQ OUTDN ; if NUL, leave
BSR OUTCV ; use the same call OUTC uses.
INX ; point to the next
BRA OUTSL ; next character
OUTDN RTS
*
*
START JSR INISTKS
LDX #HELLO ; There are other ways to push the address.
JSR PPUSHX
JSR OUTS
*
DONE LDS SSAVE ; restore the monitor stack pointer
NOP
NOP ; landing pad
How does that work?
We hide the resource conflicts in the pushes and pops of the glue subroutines.
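Once the glue is in place, other stack operations in the Forth style come cheaply. As one possible sketch (PDUP is a name I'm making up here, not part of the program above), duplicating the top cell of the parameter stack can lean on PPUSHD:

```asm
PDUP    LDX   PSP       ; point to the top cell
        LDAA  0,X       ; copy it into A:B without popping it
        LDAB  1,X
        JMP   PPUSHD    ; push the copy; PPUSHD's RTS returns for us
*
* Usage sketch: print the greeting twice.
        LDX   #HELLO
        JSR   PPUSHX    ; one pointer on the parameter stack
        JSR   PDUP      ; now two copies
        JSR   OUTS      ; consumes one
        JSR   OUTS      ; consumes the other
```

PDUP uses JMP rather than BRA so that it does not matter how far away PPUSHD ends up. Like the other glue routines, it clobbers X, A and B.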
And, incidentally, here we can see that, when the code is close, we can use branches instead of jumps, including Branch to SubRoutine instead of Jump to SubRoutine.
Branches on the 6800 and 6801 use 1-byte signed offsets instead of 2-byte absolute addresses, and are limited to a range of -128 to +127 from the address after the offset. This saves a byte and, for BSR in place of an extended-mode JSR on the 6800, a cycle as well.
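Here is how the offsets would work out for the OUTS loop. The addresses are assumed for illustration only (OUTCV at $204A, OUTSL at $2050). Your assembler will put things wherever your ORG and the preceding code dictate. The offset is always the target address minus the address of the instruction following the branch:

```asm
OUTSL   LDAA  0,X       ; $2050: A6 00
        BEQ   OUTDN     ; $2052: 27 05   $2059 - $2054 = +5
        BSR   OUTCV     ; $2054: 8D F4   $204A - $2056 = -12 = $F4
        INX             ; $2056: 08
        BRA   OUTSL     ; $2057: 20 F7   $2050 - $2059 = -9 = $F7
OUTDN   RTS             ; $2059: 39
```

Backward branches have negative offsets, stored in two's complement, which is why -9 shows up as $F7.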
Also, branch offsets are not absolute addresses, but that is not as important
as we might wish it were -- not until the 6809 and 68000 (which needs a chapter or two of its own to explain, way down the road).
It's worth noting here that, if we intend to use the code above to make our own monitor or BIOS, we'll need to consider a number of things we're ignoring here, such as what to do during interrupts -- and how to make sure the stacks have remained in balance.
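As one small example of a balance check, and only a sketch, a routine called just before DONE could compare PSP against the value INISTKS gave it. The name CHKPSP and the choice to complain with a question mark are assumptions for this illustration:

```asm
CHKPSP  LDX   PSP       ; where the parameter stack is now
        CPX   #PSTKBAS  ; where INISTKS started it
        BEQ   CHKOK     ; equal: every push was matched by a pop
        LDAA  #'?       ; hypothetical complaint: print a question mark
        JSR   XOUTCH
CHKOK   RTS
```

On the 6800, CPX sets the Z flag correctly for the full 16-bit comparison, so BEQ and BNE are safe to use after it. It tells us nothing about what went wrong, only that something did.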
And, with that, I think we are ready to see how the 6801 improves things just
a little.