Wednesday, October 16, 2024

ALPP 02-16 -- One Foot on One Beach, One Foot on Another -- Stack Frame for Single Stack: 68000 and 6809

  One foot on One Beach, One Foot on Another --
Stack Frame for Single Stack:
68000 and 6809

(Title Page/Index)

Now that I've shown you how you can do stack frames on split-stack run-times, both with and without frame pointers, at long last, I'm going to show you "normal" single interleaved stack stack frames -- one kind, anyway.

There are several ways to do this, and there is a particular reason (although not a very good reason) I've picked this one.

Let's get a look at the single stack being used for parameters, temporaries, and variables without frames during a routine. 

Below, nnnn is stuff we don't really know about, but it's not garbage. RA_0 is the return address to whatever called the routine we are in. T1_1 and T1_2 are either temporaries or variables, we don't care which:

nnnn


RA_0

T1_1

T1_2 <= _RSP_
????

And here's how it looks after the routine we are in pushes parameters P2_1 and P2_2 and enters a second routine:

nnnn


RA_0

T1_1

T1_2

P2_1

P2_2

RA_1 <= _RSP_
????

And after a third call:

nnnn


RA_0

T1_1

T1_2

P2_1

P2_2

RA_1

T2_1

T2_2

T2_3

P3_1

RA_2 <= _RSP_
????

And it's hard to look at the stack and tell what's what. If i hadn't been tracking the calls, labeling what was pushed as it was pushed, I wouldn't have this map. 

Each routine knows it's own context (maybe), but heaven help it if it has to access a caller's context.

And when a debugger looks inside, well, a good debugger can read the context of each call from the source code and build its map backwards, but it has to be a really good debugger, the sort that you don't have until your platform has been on the market for several years.

The engineer doing the debugging can do what a good debugger can, but he takes much more time to do so, and that slows down debugging and creates opportunities for mistakes. 

For a variety of reasons, we want to impose some order on that stack.

Assume your routines on the 68000 have this entry and exit protocol:
	MOVE.L	A6,-(A7)
	MOVE.L	A7,A6
	LEA	-FRAMESIZE(A7),A7
*	...
	MOVE.L	A6,A7
	MOVE.L	(A7)+,A6

With this protocol, here's how the stack looks before the first routine pushes parameters for the first call:

nnnn


RA_0

FA_0 <= _FP_
T1_1

T1_2 <= _RSP_
????

There's something new in there. There's a frame pointer (FP), and it's pointing to frame address 0 (FA_0) -- maybe the saved frame address of whatever called the first routine, whatever that might be.

And here's how it looks after it pushes parameters and calls a second routine, but before it executes the entry protocol:

nnnn


RA_0

FA_0 <= _FP_
T1_1

T1_2

P2_1

P2_2

RA_1 <= _RSP_
????

And here's how it looks after it executes the protocol:

nnnn


RA_0

FA_0 <= _FA_1
T1_1

T1_2

P2_1

P2_2

RA_1

FA_1 <= _FP_
T2_1

T2_2

T2_3 <= _RSP_
????

The variables and temporaries of the second routine won't be initialized yet, but the frame has been constructed, and it is relatively easy to walk it backwards to the previous frame. 

Here's what it looks like after the third call, with 1 parameter and 1 variable or temporary allocated:

nnnn


RA_0

FA_0 <= _FA_1
T1_1

T1_2

P2_1

P2_2

RA_1

FA_1 <= FA_2
T2_1

T2_2

T2_3

P3_1

RA_2

FA_2 <= _FP_
T3_1 <=
_RSP_
????

Still easy to walk.

Of course it does require knowing what a context is supposed to look like, to access that context. That's not a problem. The engineer or debugger only needs to look at the context of the routine of interest.

But it's also relatively easy for a rogue routine to walk all over the stack, including frame links and return addresses, which is why I don't care for it.

Hmm. This indicates another approach I could show for doing stack frames in the split stack discipline. But I'm not going to do that, even though it seems interesting. I'll leave that as an exercise for the reader. I'm not a fan of stack frames, even well constructed ones where the parts that hold them together are kept separate from what each frame contains.

Why did I use this protocol? I said there was a reason.

The snippet above isn't really enough to get an idea of what the code will look like. Let's look at the actual code for all four CPUs.

I ought to save the 68000 for last, but I'll spoil the surprise and do it first. Otherwise, the 6800 and 6801 code are going to wear both of us out and we won't be able to appreciate it.  

[JMR2024191759 note: I need to edit this. The code snippets and the explanation are slightly wrong. But, until I can get back to it, i have a concrete example of this which should clear things up, especially for the 68000, at: https://joels-programming-fun.blogspot.com/2024/10/alpp-02-18-ascending-wrong-island-single-stack-frame-example-68000.html.]

* calling protocol for single-stack stack frames on 68000
* with A6 as the frame pointer, 
* and routines saving the frame pointer
* and allocating the new frame on entry.
ROUTINE1
*	...
	MOVE.L	PARAMETER1,-(A7)
	MOVE.L	PARAMETER2,-(A7)
	BSR.W	ROUTINE2
	LEA	PARAMETER_SIZE(A7),A7    ; drop 
	MOVE.L	D0,RESULTVAR_OFFSET(A6)    ; negative offset
*	...

Wait. 

For the linked list of frames to be valid, every routine has to use the same protocol. Every routine that is a routine, anyway. Let's show that:

* calling protocol for single-stack stack frames on 68000
* with A6 as the frame pointer, 
* and routines saving the frame pointer
* and allocating the new frame on entry.
FRAME_SIZE	SET	SOMETHING	; ROUTINE1's frame size
ROUTINE1
	LINK	A6,#FRAME_SIZE	; This context's frame size
*	...
	MOVE.L	PARAMETER1,-(A7)
	MOVE.L	PARAMETER2,-(A7)
	BSR.W	ROUTINE2
	LEA	PARAMETER_SIZE(A7),A7	; drop parameters
	MOVE.L	D0,RESULTVAR_OFFSET(A6)	; negative offset
*	...
	UNLK	A6
	RTS
* and the caller moves results where they go in its own frame 
* before using the return value register, which is usually immediately
* after dropping the parameters that are no longer in use.
*	...
*
*
* called routine entry protocol for single-stack stack frames
* with A6 as the frame pointer,
* and routines saving the frame pointer
* and allocating the new frame on entry.
* (Caller, of course, pushes call parameters before call,
* and pops return parameters to where they go on return.):
FRAME_SIZE	SET	SOMETHING	; ROUTINE2's frame size
ROUTINE2
	LINK	A6,#FRAME_SIZE
*	...
* No code for dealing with large return values. Can't do that.
* Large return values have to be dealt with outside of the stack frame.	
*	...	
	MOVE.L	PARAMETER2OFFSET+8(A6),D1	; dodge frame link and return address
*	...
	SUB.L	PARAMETER1OFFSET+8(A6),D3	; dodge frame link and return address
*	...
	MOVE.L	(A6),A0	; get the caller's frame pointer
	ADD.L	CALLER.VARIABLENOFFSET(A0),D5	; or something -- negative offset
*	...
* routine return protocol in all cases:
	LEA	FRAMESIZE-RETURNSIZE(A6),A6
	UNLK	A6
	RTS

That looks pretty, doesn't it?

WHAT is that LINK instruction? And the UNLK?

There's the reason for the protocol I chose. The 68000 does it for us. Trade five instructions for two, reduce the apparent cost of frames, help maintain their integrity. 

Get in the way of large return values.

How do we do large return values using this kind of stack frame?

Well, let me tell you. It involves something called static allocation from a memory pool, constructors and destructors, and garbage collection, and ...

Or you can do it the C way, and the caller passes, as a parameter, an explicit pointer to where the return value is ultimately supposed to go anyway. There is elegance to that solution, and it is the one I usually recommend for returning large values anyway. 

I'll have more to say about what constitutes a large return value and what constitutes a large return value later, when we have something (relatively) concrete to look at.

Unfortunately, we do not have the LINK and UNLK instructions in the 6809's repertoire. But it's no great misfortune:

* 6809
* calling protocol for single-stack stack frames on 6809
* with U as the frame pointer
* and routines saving the frame pointer
* and allocating the new frame on entry.
FRAME_SIZE	SET	SOMETHING	; ROUTINE1's frame size
ROUTINE1
	PSHS	U
	TFR	S,U
	LEAS	-FRAMESIZE,S
*	...
	LDX	PARAMETER1
	LDD	PARAMETER2
	PSHS	D,X
	LBSR	ROUTINE2
	LEAS	PARAMETER_SIZE,S	; drop parameters
	STD	RESULTVAR_OFFSET,U	; store result -- negative offset
*	...
	TFR	U,S		; restore/deallocate
	PULS	U
	RTS
* and the caller moves results where they go in its own frame 
* before using the return value register, which is usually immediately
* after dropping the parameters that are no longer in use.
*	...
*
*
* called routine entry protocol for single-stack stack frames
* with U as the frame pointer,
* and routines saving the frame pointer
* and allocating the new frame on entry.
* (Caller, of course, pushes call parameters before call,
* and pops return parameters to where they go on return.):
FRAME_SIZE	SET	SOMETHING	; ROUTINE2's frame size
ROUTINE2
	PSHS	U
	TFR	S,U
	LEAS	-FRAMESIZE,S
*	...
* No code for dealing with large return values. Can't do that.
* Large return values have to be dealt with outside of the stack frame.	
*	...	
	LDD	PARAMETER2OFFSET,U	; access the second parameter
*	...
	SUBD	PARAMETER1OFFSET,U	; access the first parameter
*	...
	LDX	,U			; get the caller's frame pointer
	ADDD	CALLER.PARAMETERNOFFSET,X	; or something -- negative offset
*	...
	TFR	U,S		; restore/deallocate
	PULS	U
	RTS

Nicely enough done.

But, the 6801 and the 6800 -- with the conflicts between uses of the stack, index, accumulator, offset calculations, and so on, lack of LINK and UNLK instructions is great misfortune on the 6800 and 6801. (But how would adding those instructions be done?) 

It's tempting to just punt on the 6800 and 6801. The sort of code required to do it goes way beyond what we've looked at so far.

You can mix and match various stack frames and frameless in the split stack runtime, as long as you either don't reach into the caller's stack frame, or at least know what the caller's stack frame looks like so you know how to reach in. 

But you can't mix and match in a single-stack runtime -- not without losing the benefits of the stack frame.

Really, if we're being sensible, we're not usually going to want to bother with stack frames anyway. 

So I'm going to recommend that you move ahead to numeric binary output, rather than dig further into stack frames on the 6801 and 6800 (or the 68000 or 6809).

(Title Page/Index)


 

No comments:

Post a Comment