Balancing on the Beach --
Comparing Pointers,
Checking the Stack
Four different processors, three different parameter passing modes. And
natural width (address width) arithmetic -- 16-bit on the 8-bit processors,
but
32-bit on the "16-bit" 68000.
And I really don't want to leave a certain set of questions begging.
* How do you make sure that your use of statically allocated parameters (and variables) has no internal conflicts? How do you make sure that you aren't using, say, a parameter/variable to allocate an I/O buffer, then, before you finish the allocation maintenance code, you go off and start a routine to collect de-allocated buffers and use the same variable for a conflicting purpose?
Discipline!
There are several disciplines for static parameters and variables, and the disciplines allow the use of automated tools that can check for conflicting uses.
If you don't have such tools, you have to analyze their use yourself. Discipline helps, but, ultimately, you want to avoid using more statically allocated parameters and variables than you can track and analyze.
I'll talk about various approaches to discipline as we go, variable naming and
use, file and directory strategies and such -- because we really can't avoid
some degree of static allocation.
* How do you make sure your return address stack is balanced?
Well, if you don't put temporaries and parameters and such on the return address stack, that's a tautology. If you come back from every subroutine you call, your return address stack is balanced. If you don't, it isn't.
Well, there are other reasons for not coming back, such as crashing the stack
and infinite loops, topics which we will brush on here and there.
* What if you do use a combined stack?
Actually, if you use a combined stack and you manage to return from every call, you can be pretty sure it's balanced -- given the exceptions mentioned above. But when you have a lot of parameters and variables on that stack, the probability of not coming back increases.
So you use a stack frame, which I have been waiting to explain until we come to some functions that are complicated enough to need a stack frame
Otherwise, just make sure you have a pop for every push and never have too many pops, and watch for your code to fail to come back from a subroutine call.
And if it doesn't return?
8-0Yeah, seeing things blow up and figuring out which call it didn't return from can become something of a black art. So you tend to overuse the stack frame.
* If you use a separate parameter stack, how do you make sure you've kept it
balanced? The return stack and the parameter stack might get out of sync,
might they not? And you could return from every call, but still leave
parameters out there to be processed, or end up having tried to process
parameters that were never pushed out there.
Just make sure you have a pop for every push and never have too many pops.
Simple. ...
And how do you do that?
AAAARRRRRGGGGGHHHHHH!!!!!
No magic.
But we do have strategies, some of which we use and some of which we don't,
depending on various things. A few of them are
- Testing pushes and pops before you do them.
-
Periodically checking that the parameter stack pointer is pointing at valid
parameter stack space.
-
Providing bumpers, or buffer zones between the stack space and other spaces,
and periodically checking that nothing has been written into those zones.
- Saving the stack pointer when you enter a function and checking that it matches before you leave.
-
Using a stack frame on the parameter stack, in effect pushing everything at
once, and popping it all at once.
- Letting the hardware allocate the parameter stack in an area of memory isolated from the rest, so that overflows and underflows cause memory errors.
These strategies all work on the single combined stack, as well, by the
way.
We really aren't prepared to talk about all of those, but we can talk about some of them.
For instance, here's code to check the stack pointer on exit and periodically
check it mid-flight, on the 68000:
* This is not the best way for every use.
...
PSTKLIM DS.L 32 ; roughly 16 levels of call at two parameters per call
PSTKBAS DS.L 1 ; bumper space -- parameter stack is pre-dec
* ...
START BSR.W INISTKS
* ...
CMP.L #PSTKBAS,A6
BHI.W STKERR
CMP.L #PSTKLIM,A6
BLO.W STKERR
* ...
DONE CMP.L #PSTKBAS,A6
BEQ.S EXIT
STKERR * ...
* ...
EXIT MOVEM.L A4SAVE-LB_ADDR(A5),A4-A7 ; restore the monitor's A4-A7
NOP
NOP ; landing pad
What is this CoMPare instruction?
It effectively subtracts the operands, but doesn't store the result. It only effects the flags, so you can keep your value but branch on the basis of the comparison.
Many of the branch instructions Motorola implements on their CPUs test multiple flags, and the mnemonics they specify are supposed to help you remember what the combinations mean.
- HI means that the target of the CoMPare is HIgher (unsigned compare) than the source. (Carry is set.) So if A6 is higher than PSTKBAS, it branches to STKERR. Remember, stack pushed to low memory, so it's "upside-down", so to speak.
- LO means the target is LOwer (unsigned) than the source. (Carry is set and Zero is clear.) So if A6 is lower than PSTKLIM, it branches to STKERR.
- EQ means the source is EQual to the target. (Zero flag is set.) So if, after the body of the main function is done, A6 is equal to PSTKBAS -- nothing left on stack, and not pointing beyond the bottom of stack -- it branches around the STKERR function to EXIT.
- NE, by the way, means Not Equal. (Zero flag is clear.)
This is essentially something we could do for all four processors -- For the 6800:
* This is not the best way for every use.
* 6800 CPX flags are incomplete,
* but Z is valid.
* And N should be valid all by itself,
* even though the manual says
* "not intended for conditional branching".
* ...
PSTKLIM RMB 64 ; 16 levels of call at two parameters per call
PSTKBAS RMB 2 ; bumper space -- parameter stack is pre-dec
* ...
START JSR INISTKS
* ...
LDX PSP
CPX #PSTKLIM ; Z is valid, N should be valid
BEQ TSTOVR ; have to test separately
BPL TSTOVR
JMP STKERR
TSTOVR CPX #PSTKBAS
BEQ CONT
BPL STKERR
CONT * ...
* ...
DONE LDX PSP
CPX #PSTKBAS
BEQ EXIT
STKERR * ...
* ...
EXIT LDS SSAVE ; restore the monitor stack pointer
NOP
NOP ; landing pad
For the 6801:
* This is not the best way for every use.
* 6801 CPX flags are all valid
* ...
PSTKLIM RMB 64 ; 16 levels of call at two parameters per call
PSTKBAS RMB 2 ; bumper space -- parameter stack is pre-dec
* ...
START JSR INISTKS
* ...
LDX PSP
CPX #PSTKLIM ; All flags valid
BLO STKERR
CPX #PSTKBAS
BHI STKERR
CONT * ...
* ...
DONE LDX PSP
CPX #PSTKBAS
BEQ EXIT
STKERR * ...
* ...
EXIT LDS SSAVE ; restore the monitor stack pointer
NOP
NOP ; landing pad
For the 6809 using absolute addressing:
* This is not the best way for every use.
* 6809 CMPU flags are all valid
* ...
PSTKLIM RMB 64 ; 16 levels of call at two parameters per call
PSTKBAS RMB 2 ; bumper space -- parameter stack is pre-dec
* ...
START LBSR INISTKS
* ...
CMPU #PSTKLIM
BLO STKERR
CMPU #PSTKBAS
BHI STKERR
CONT * ...
* ...
DONE CMPU #PSTKBAS
BEQ EXIT
STKERR * ...
* ...
EXIT LDS SSAVE ; restore the monitor stack pointer
NOP
NOP ; landing pad
But we had initialized U via PC relative addressing. so we should use LEA to calculate the addresses and do it this way:
* This is not the best way for every use.
* 6809 CMPU flags are all valid
* ...
PSTKLIM RMB 64 ; 16 levels of call at two parameters per call
PSTKBAS RMB 2 ; bumper space -- parameter stack is pre-dec
* ...
START LBSR INISTKS
* ...
LEAX PSTKLIM,PCR
PSHS X
CMPU ,S++
BLO STKERR
LEAX PSTKBAS,PCR
PSHS X
CMPU ,S++
BHI STKERR
CONT * ...
* ...
DONE CMPU #PSTKBAS
BEQ EXIT
STKERR * ...
* ...
EXIT LDS SSAVE ; restore the monitor stack pointer
NOP
NOP ; landing pad
And, come to think of it, our initialization code for the 68000 was also PC
relative, so here's how to do it PC relative on the 68000:
* This is not the best way for every use.
*...
PSTKLIM DS.L 32 ; roughly 16 levels of call at two parameters per call
PSTKBAS DS.L 1 ; bumper space -- parameter stack is pre-dec
*...
START BSR.W INISTKS
* ...
LEA PSTKBAS(PC),A0
CMP.L A0,A6
BHI.W STKERR
LEA PSTKLIM(PC),A0
CMP.L A0,A6
BLO.W STKERR
* ...
DONE LEA PSTKBAS(PC),A0
CMP.L A0,A6
BEQ.S EXIT
STKERR * ...
* ...
EXIT MOVEM.L A4SAVE-LB_ADDR(A5),A4-A7 ; restore the monitor's A4-A7
NOP
NOP ; landing pad
That's pretty heavy duty. If it's just on the main function, I suppose it isn't so bad. Particularly, it's something we might insert in the main function when we are debugging and need to watch what's happening to the stack.
But if checking the stack on function entry and exit is going to be that hard, do we really want to work the processor that hard?
(This is one of the places where current CPUs are all lacking. The CPU should define limit registers for the stacks and handle these checks at run-time transparently. Current practice is to isolate the single stack in its own MMU segment, frame function calls in rigid stack frames, and let the MMU hardware trap stack-out-of-bounds. But there are ways hostile code, or code running wild, can get around that in most modern MMU implementations. We are too focused on speed.)
Simply checking that the stack is balanced between entry and exit isn't quite as good as full bounds checking, but it does help. If all functions remain balanced, the only problem we might have is stack overflow.
(cough mumble functions that allocate or deallocate stack mutter mumble)
* This is not the best way for every use.
FENTRY MOVE.L A4,-(A7) ; Save A4 on entry to function
MOVE.L A6,A4 ; Mark the current allocation depth.
...
CMP.L A4,A6 ; subtract A4 from A6, don't store result
BNE.W STKERR ; STKERR will have to deal with the saved A4
MOVE.L (A7)+,A4
RTS
And you insert the meat of the function in place of the ellipsis.
How would it look on the 6809? We don't have enough registers to just use one, but we could push the marker to the return stack. Of course, then we have to be careful to keep that in balance, but if we let the return stack go out of balance we've got problems anyway.
* This is not the best way for every use.
FENTRY PSHS U
...
CMPU ,S++ ; compare and pop
BNE STKERR
RTS
And, if you're thinking that we could have done that on the 68000, yeah, we
could have, but we don't seem to be using A4 for anything, and it was
convenient.
How would it look on the 6801?
* This is not the best way for every use.
FENTRY LDX PSP
PSHX
...
PULX
CPX PSP ; inverted compare,
BEQ FEXIT ; but we're only looking at equal
JMP STKERR
FEXIT RTS
And how about the 6800? Can it be done reasonably effeciently?
We're gonna need a subroutine, but that means we're going to be fighting with the return address. But, yeah it can be done. It's a lotta code.
PMARK DES ; Need space.
DES
TSX
LDAA 2,X ; return address
LDAB 3,X
STAA 0,X ; move it down
STAB 1,X
LDAA PSP ; get the mark
LDAB PSP+1
STAA 2,X ; move it in
STAB 3,X
RTS
* ...
FENTRY JSR PMARK
* ... ; meat of function here.
TSX
LDX 0,X
CPX PSP
BEQ FEXIT
JMP STKERR
FEXIT RTS
I guess that's not so bad. Maybe.
But, wait!
(Told you so.)
So far, our functions have cleaned up their own stacks, and the parameter stack pointer on entrance is not same as the stack pointer on exit. It changes. We can't use these.
Or can we, with some modifications?
Usually, we know how much the function is going adjust the stack by on return. Could we save the mark pre-adjusted?
Hmm.
Let's look at pre-adjusted balance checks on the 68000.
68000, with adjusted A4:
FENTRY MOVE.L A4,-(A7) ; Save A4 on entry to function
LEA ADJVAL(A6),A4 ; Mark the target allocation depth.
...
CMP.L A4,A6
BNE.W STKERR ; STKERR will have to deal with the saved A4
MOVE.L (A7)+,A4
RTS
68000, adjusted and pushed on return stack:
FENTRY LEA ADJVAL(A6),A0 ; Use a volatile register
MOVE.L A0,-(A7) ; Save pre-adjusted on entry to function
* ...
CMP.L (A7),A6 ; deliberately leave it on the stack for STKERR
BNE.W STKERR
LEA NATWID(A7),A7
RTS
And on the 6809, adjusted and pushed on the return stack:
FENTRY LEAX ADJVAL,U
PSHS X
* ...
CMPU ,S ; leave it on stack for STKERR
BNE STKERR
LEAS NATWID,S
RTS
On the 6801, again, adjusted and pushed on the return stack:
FENTRY LDD #ADJVAL ; Unsigned ABX doesn't help.
ADDD PSP
PSHB
PSHA
* ...
PULX
CPX PSP ; inverted compare,
BEQ FEXIT ; but we're only looking at equal
JMP STKERR ; STKERR can see both mark in X and PSP
FEXIT RTS
And, finally, on the 6800 (wow!):
* Signed offset in B, -128 to +127
PMARK DES ; Need space.
DES
TSX
LDAA 2,X ; return address
STAA 0,X ; move it down
LDAA 3,X ; leave B alone
STAA 1,X
CLRA
TSTB
BPL PMARKA
COMA ; sign extend
PMARKA ADDB PSP+1
ADCA PSP
STAA 2,X ; move it in
STAB 3,X
RTS
* ...
FENTRY LDAB #ADJVAL
JSR PMARK
* ...
TSX
LDX 0,X
CPX PSP
BEQ FEXIT
JMP STKERR
FEXIT RTS
Okay, so it's possible to mark the stack. But maybe we only want to do it for those functions that we aren't confident are going to stay balanced because different branches of the code do different things.
Or when we are debugging.
Am I sure that using a combined stack, with stack frames, is not going to be easier, more efficient, and safer?
Is it time to look at stack frames?
No! NO! Noooooooooooo!
I DON'T want to do that!
First I want to show you how to get numeric output, so that we don't have to depend on the debugger to see the effects of our code! We know enough to get binary output.
Sigh.
Besides, we really need something more complex than what we have at this point to make it make sense. Don't we?
Maybe not, especially if we start with frame pointers on a split stack.
No comments:
Post a Comment