Saturday, October 19, 2024

ALPP 02-18 -- Ascending the Wrong Island -- Single-stack Stack Frame Example: 68000

I guess it didn't sit there at the bottom of the pool so long, after all. Concrete examples are useful.

  Ascending the Wrong Island --
Single-stack Stack Frame Example:
68000

(Title Page/Index)

In the abstract, single-stack stack frames on the 6801 seem rather unreasonable.

Concrete examples can help clear things up a bit. The example I give below is overly simplified, but it should help show how things can work. Below the single stack example, I am providing the equivalent in split stack with explicit frame pointer, for comparison. 

It should also show how trying to be too general made the explanation for the 6801 too tedious for me to follow, even as I was writing it. Is it any wonder it should have been tedious for you?

Why are we even bothering? 

Because this is in essence the discipline and paradigm that underlies the code produced by pretty much all "modern" compilers. We need to understand it to work with it at the level we need to work with it, and we can't work with computers these days without working with it.

We'll start with the 68000 code because that is the most concise, and work through to the more verbose code in the 6800.

Let the comments be your guide.

	OPT LIST,SYMTAB	; Options we want for the stand-alone assembler.
	MACHINE MC68000	; because there are a lot the assembler can do.
	OPT DEBUG	; We want labels for debugging.
	OUTPUT
***********************************************************************
*
* 16-bit addition as example of single-stack stack frame discipline on 68000
* with test code
* Joel Matthew Rees, October 2024
*
NATWID	EQU	4	; 4 bytes in the CPU's natural integer
HLFNAT	EQU	2	; half natural integer
*
*
	EVEN
LB_ADDR	EQU	*
ENTRY	BRA.W	START
	NOP		; A little buffer zone.
	NOP
A4SAVE	DS.L	1	; a place to keep A4 to A7 so we can return clean
A5SAVE	DS.L	1	; using it as pseudo-DP
A6SAVE	DS.L	1	; FP
A7SAVE	DS.L	1	; SP
	DS.L	2	; gap
HPPTR	DS.L	1	; heap pointer (not yet managed)
HPALL	DS.L	1	; heap allocation pointer
	DS.L	2	; gap
FINAL	DS.L	1	; unused statically allocated variable
GAP1	DS.L	51	; gap, make it an even 256 bytes.
*
	DS.L	2	; a little bumper space
SSTKLIM	DS.L	16*8	; 16 levels of call, with room for stack frames
* 			; 68000 is pre-dec (pre-store-decrement) push
SSTKBAS	DS.L	4	; for canary return
	DS.L	2	; bumper
HBASE	DS.L	$1000	; heap space (not yet managing it)
HLIM	DS.L	2	; bumper
*
*
	EVEN
INISTKS	MOVEM.L	(A7)+,A0	; get the return address from some other stack
	LEA	LB_ADDR(PC),A3
	MOVEM.L	A4-A7,A4SAVE-LB_ADDR(A3)	; Store away what the BIOS gives us.
	MOVE.L	A3,A5	; set up our local base (pseudo-DP)
	LEA	SSTKBAS+4*NATWID-LB_ADDR(A5),A7	; set up our return stack
	PEA	-NATWID(A7)	; self-link for fake frame
	PEA	STKUNDR(PC)	; fake return to stack underflow handler
	PEA	-NATWID(A7)	; self-link for fake frame
	MOVE.L	A7,A6		; set up fake frame pointer
	PEA	STKUNDR(PC)	; fake return to stack underflow handler
	LEA	HBASE-LB_ADDR(A5),A4	; as if we actually had a heap
	MOVE.L	A4,HPPTR-LB_ADDR(A5)
	MOVE.L	A4,HPALL-LB_ADDR(A5)
	JMP	(A0)		; return via A0
*

***
* Stack after LINK #0 when functions are called by MAIN
* with two parameters
* (#0 means no local variables)
* We will return result in D0:D1
* [<SELF>  ] <= <SELF>
* [STKUNDR ]
* [<SELF>  ] <= <SELF>,FRMPTRX
* [STKUNDR ]SSTKBAS
* [FRMPTRX=SSTKBAS+NATWID ] <= FRMPTR0
* [RETADR0 ] 
* [FRMPTR0 ] <= FRMPTR1
* [--------]
* [--------] 
* [PARAM2_1]
* [PARAM2_2]
* [RETADR1 ] 
* [FRMPTR1 ] <= FP,SP
*

* Signed 16 bit add to 32 bit result
* Why do this? Stack cell is 32-bit, parameters are 16.
* Handle sign overflow without losing precision.
* input parameters:
*   16-bit left, right in 32-bit
* output parameter:
*   17-bit sum in 32-bit D1
ADD16S	LINK	A6,#0
	MOVE.W	2*NATWID+HLFNAT(A6),D0	; right (16-bit only)
	EXT.L	D0
	MOVE.W	3*NATWID+HLFNAT(A6),D1	; add to left
	EXT.L	D1
	ADD.L	D0,D1
	UNLK	A6
	RTS		; return, *** all flags valid!! ***
*
* Unsigned 16 bit add to 32 bit result
* input parameters:
*   16-bit left, right in 32-bit
* output parameter:
*   17-bit sum in 32-bit D1
ADD16U	LINK	A6,#0
	CLR.L	D0
	MOVE.W	2*NATWID+HLFNAT(A6),D0	; right (16-bit only)
	CLR.L	D1
	MOVE.W	3*NATWID+HLFNAT(A6),D1	; add to left
	ADD.L	D0,D1
	UNLK	A6
	RTS		; return, *** all flags valid!! ***
*
* Etc.
*

***
* Stack after LINK #0 when functions are called by MAIN
* with one parameter
* (#0 means no local variables)
* We will return result in D0:D1
* [<SELF>  ] <= <SELF>
* [STKUNDR ]
* [<SELF>  ] <= <SELF>,FRMPTRX
* [STKUNDR ]SSTKBAS
* [FRMPTRX=SSTKBAS+NATWID ] <= FRMPTR0
* [RETADR0 ] 
* [FRMPTR0 ] <= FRMPTR1
* [VAR1_1--]
* [VAR1_2--] 
* [PARAM2_1]
* [RETADR1 ] 
* [FRMPTR1 ] <= FP,SP

* To show how to walk the stack --
* Add 16-bit signed parameter
* to 32 bit caller's 2nd 32-bit internal variable.
* input parameter:
*   16-bit addend in 32-bit
* target parameter in caller
*   2nd 32-bit variable at offset -2*NATWID
* no output parameter:
SUB16SI	LINK	A6,#0
	MOVE.W	2*NATWID+HLFNAT(A6),D1
	EXT.L	D1
	MOVE.L	(A6),A0		; get caller's frame pointer
	ADD.L	D1,-2*NATWID(A0)	; add to caller's 2nd variable
	UNLK	A6
	RTS		; return, *** all flags valid!! ***
*
*
***
* Stack after LINK
* [<SELF>  ] <= <SELF>
* [STKUNDR ]
* [<SELF>  ] <= <SELF>,FRMPTRX
* [STKUNDR ]SSTKBAS
* [FRMPTRX=SSTKBAS+NATWID ] <= FRMPTR0
* [RETADR0 ] 
* [FRMPTR0 ] <= FP
* [VAR1_1--]
* [VAR1_2--] <= SP
*
MAIN	LINK	A6,#-2*NATWID	; room for 2 variables
	CLR.L	-NATWID(A6)
	CLR.L	-2*NATWID(A6)
	MOVE.L	#$1234,-(A7)
	MOVE.L	#$CDEF,-(A7)
	BSR.W	ADD16U	; result in D1 should be $E023
	LEA	2*NATWID(A7),A7	; could reuse, instead
	MOVE.L	D1,-(A7)
	MOVE.L	#$8765,-(A7)
	BSR.W	ADD16S	; result in D1 should be $FFFF6788 (and carry set)
	LEA	2*NATWID(A7),A7
	MOVE.L	D1,-2*NATWID(A6)
	MOVE.L	#$A5A5,-(a7)
	BSR.W	SUB16SI		; result in 2nd variable should be FFFF0D2D (Carry set)
	MOVE.L	-2*NATWID(A6),FINAL-LB_ADDR(A5)	; store the result
	UNLK	A6
	RTS
*
***
* Stack at START:
* (what BIOS/OS gave us) <= SP (A7)
***
* (who knows?) <= FP (A6)
***
*
* Stack after initialization:
* [<SELF>  ] <= <SELF>
* [STKUNDR ]
* [<SELF>  ] <= <SELF>,FP
* [STKUNDR ]SSTKBAS <= SP
***
* Stack after LINK (at call to MAIN)
* [<SELF>  ] <= <SELF>
* [STKUNDR ]
* [<SELF>  ] <= <SELF>,FRMPTRX
* [STKUNDR ]SSTKBAS
* [FRMPTRX=SSTKBAS+NATWID ] <= SP,FP
*
START	BSR.W	INISTKS
*
	LINK	A6,#0	; initialize frame list
*
	BSR.W	MAIN
*
*
DONE	MOVEM.L	A4SAVE-LB_ADDR(A5),A4-A7	; restore the monitor's A4-A7
	NOP
	NOP		; landing pad
	NOP
	NOP
* One way to return to the OS or other calling program
	CLR.W	-(A7)	; there should be enough room on the caller's stack
	TRAP	#1	;	quick exit
*
STKUNDR	EQU	DONE	; want better error handling
ERROR	EQU	STKUNDR

I've tested the code. It runs, and it builds the stack frames and tears them down as advertised. I'm not going to guarantee that it can be generalized. Neither will I guarantee that it can be generated by some appropriately constructed compiler. 

[JMR202410232030 note: 

Something I realized while transliterating the code for this onto the 6809,  I could put the STKUNDR label on one of the NOPs after DONE, and I would be able to set breakpoints separately on DONE and on STKUNDR. Likewise the ERROR label. 

This would allow me to tell by which breakpoint I hit whether I'd safely gotten through the code, had tried to return through one of the stack underflow fake returns, or (hypothetically) had jumped to ERROR.

JMR202410232030 note end.]

Just for comparison, let's see what that would look like with split stacks and, say, a literal frame pointer. Just for grins.

	OPT LIST,SYMTAB	; Options we want for the stand-alone assembler.
	MACHINE MC68000	; because there are a lot the assembler can do.
	OPT DEBUG	; We want labels for debugging.
	OUTPUT
***********************************************************************
*
* 16-bit addition as example of split-stack, literal frame pointer
* stack frame discipline on 68000
* with test code
* Joel Matthew Rees, October 2024
*
NATWID	EQU	4	; 4 bytes in the CPU's natural integer
HLFNAT	EQU	2	; half natural integer
*
*
	EVEN
LB_ADDR	EQU	*
ENTRY	BRA.W	START
	NOP		; A little buffer zone.
	NOP
A4SAVE	DS.L	1	; (FP) a place to keep A4 to A7 so we can return clean
A5SAVE	DS.L	1	; using it as pseudo-DP
A6SAVE	DS.L	1	; using it as parameter stack pointer
A7SAVE	DS.L	1	; SP
	DS.L	2	; gap
HPPTR	DS.L	1	; heap pointer (not yet managed)
HPALL	DS.L	1	; heap allocation pointer
	DS.L	2	; gap
FINAL	DS.L	1	; unused statically allocated variable
GAP1	DS.L	51	; gap, make it an even 256 bytes.
*
	DS.L	2	; a little bumper space
SSTKLIM	DS.L	16*2	; 16 levels of call, with room for frame pointers
* 			; 68000 is pre-dec (pre-store-decrement) push
SSTKBAS	DS.L	3	; for canary return
	DS.L	2	; bumper
PSTKLIM	DS.L	16*4	; roughly 16 levels of call
PSTKBAS	DS.L	2	; bumper
HBASE	DS.L	$1000	; heap space (not yet managing it)
HLIM	DS.L	2	; bumper
*
*
	EVEN
INISTKS	MOVEM.L	(A7)+,A0	; get the return address from some other stack
	LEA	LB_ADDR(PC),A3
	MOVEM.L	A4-A7,A4SAVE-LB_ADDR(A3)	; Store away what the BIOS gives us.
	MOVE.L	A3,A5	; set up our local base (pseudo-DP)
	LEA	SSTKBAS+4*NATWID-LB_ADDR(A5),A7	; set up our return stack
	LEA	PSTKBAS-LB_ADDR(A5),A6	; set up our parameter stack
	PEA	STKUNDR(PC)	; fake return to stack underflow handler
	MOVE.L	A6,-(A7)	; empty frame pointer for fake frame
	PEA	STKUNDR(PC)	; fake return to stack underflow handler
	LEA	HBASE-LB_ADDR(A5),A4	; as if we actually had a heap
	MOVE.L	A4,HPPTR-LB_ADDR(A5)
	MOVE.L	A4,HPALL-LB_ADDR(A5)
	MOVE.L	A6,A4		; synch FP and PSP
	JMP	(A0)		; return via A0
*

***
* Return stack when functions are called by MAIN
* Return stack on entry:
* [STKUNDR ]
* [<EMPTYP>]
* [STKUNDR ]SSTKBAS
* [FRMPTRm1==<EMPTYP>]
* [RETADR0 ]
* [FRMPTR0==<EMPTYP>] <= RSP
* [RETADR1 ]
*
* Return stack after link:
* [STKUNDR ]
* [<EMPTYP>]
* [STKUNDR ]SSTKBAS
* [FRMPTRm1==<EMPTYP>]
* [RETADR0 ]
* [FRMPTR0==<EMPTYP>]
* [RETADR1 ]
* [FRMPTR1 ] <= RSP
*
* Parameter stack when called by MAIN
* with two 16-bit parameters,
* after mark (no local allocation)
* [<unknown>] <= FRMPTR0,FRMPTR1
* [32:VAR1_1--]
* [32:VAR1_2--]
* [16:PARAM2_1]
* [16:PARAM2_2] <= PSP,FP
*

* Signed 16 bit add to 32 bit result
* Handle sign overflow without losing precision.
* input parameters:
*   16-bit left, right
* output parameter:
*   17-bit sum in 32-bit
ADD16S	MOVE.L	A4,-(A7)	; link, mark, and restore could be optimized out.
	MOVE.L	A6,A4	; mark
	MOVE.W	(A6)+,D0	; right (16-bit only)
	EXT.L	D0
	MOVE.W	(A6)+,D1	; add to left
	EXT.L	D1
	ADD.L	D0,D1
	MOVE.L	D1,-(A6)
	MOVE.L	(A7)+,A4	; restore FP
	RTS		; return, *** all flags valid!! ***
*
* Unsigned 16 bit add to 32 bit result
* input parameters:
*   16-bit left, right in 32-bit
* output parameter:
*   17-bit sum in 32-bit D1
ADD16U	MOVE.L	A4,-(A7)	; link
	MOVE.L	A6,A4	; mark
	CLR.L	D0
	MOVE.W	(A6)+,D0	; right (16-bit only)
	CLR.L	D1
	MOVE.W	(A6)+,D1	; add to left
	ADD.L	D0,D1
	MOVE.L	D1,-(A6)
	MOVE.L	(A7)+,A4	; restore FP
	RTS		; return, *** all flags valid!! ***
*
* Etc.
*

***
* Parameter stack when called by MAIN
* with one 16-bit parameters,
* after mark (no local allocation)
* [<unknown>] <= FRMPTR0,FRMPTR1
* [32:VAR1_1--]
* [32:VAR1_2--]
* [16:PARAM2_1] <= PSP,FP

* To show how to walk the stack --
* Add 16-bit signed parameter
* to 32 bit caller's 2nd 32-bit internal variable.
* input parameter:
*   16-bit addend in 32-bit
* target parameter in caller
*   2nd 32-bit variable at offset -2*NATWID
* no output parameter:
SUB16SI	MOVE.L	A4,-(A7)	; link
	MOVE.L	A6,A4	; mark
	MOVE.W	(A6)+,D1
	EXT.L	D1
	MOVE.L	(A7),A0		; get caller's frame pointer from return stack
	ADD.L	D1,-2*NATWID(A0)	; add to caller's 2nd variable
	MOVE.L	(A7)+,A4	; restore FP
	RTS		; return, *** all flags valid!! ***
*
*
***
* Return stack on entry:
* [STKUNDR ]
* [<EMPTYP>]
* [STKUNDR ]SSTKBAS
* [FRMPTRm1==<EMPTYP>]
* [RETADR0 ] <= RSP
*
* Return stack after link:
* [STKUNDR ]
* [<EMPTYP>]
* [STKUNDR ]SSTKBAS
* [FRMPTRm1==<EMPTYP>]
* [RETADR0 ]
* [FRMPTR0==<EMPTYP>] <= RSP
*
* Parameter stack after mark and local allocation
* [<unknown>] <= FP,FRMPTR0
* [VAR1_1--]
* [VAR1_2--] <= PSP
*
MAIN	MOVE.L	A4,-(A7)	; link
	MOVE.L	A6,A4	; mark
	LEA	-2*NATWID(A6),A6	; allocate
	MOVE.W	#$1234,-(A6)
	MOVE.W	#$CDEF,-(A6)
	BSR.W	ADD16U	; result on parameter stack should be $E023
	LEA	HLFNAT(A6),A6	; adjust to 16 bit, could be optimized out
	MOVE.W	#$8765,-(A6)
	BSR.W	ADD16S	; result on parameter stack should be $FFFF6788 (and carry set)
	MOVE.L	(A6)+,-2*NATWID(A4)	; save in local
	MOVE.W	#$A5A5,-(a6)
	BSR.W	SUB16SI		; result in 2nd variable should be FFFF0D2D (Carry set)
	MOVE.L	-2*NATWID(A4),FINAL-LB_ADDR(A5)	; store the result
	MOVE.L	(A7)+,A4	; restore FP
	RTS
*
***
* Stack at START:
* (what BIOS/OS gave us) <= SP (A7)
***
* (who knows?) <= FP (A6)
***
*
***
* Return stack will always be in pairs:
* [RETADRNN  ]
* [CALLERFMNN]
*
* Return stack after initialization:
* [STKUNDR ]
* [<EMPTYP>]
* [STKUNDR ]SSTKBAS <= RSP
*
* Return stack after link:
* [STKUNDR ]
* [<EMPTYP>]
* [STKUNDR ]SSTKBAS
* [FRMPTRm1==<EMPTYP>] <= RSP
*
* Parameter stack after initialization, mark:
* [<unknown] <= PSP,FP==<EMPTYP>
*
START	BSR.W	INISTKS
	MOVE.L	A4,-(A7)	; link
	MOVE.L	A6,A4	; mark
*
	BSR.W	MAIN
*
*
DONE	NOP
	NOP		; landing pad
	NOP
	NOP
	MOVE.L	(A7)+,A4
	MOVEM.L	A4SAVE-LB_ADDR(A5),A4-A7	; restore the monitor's A4-A7
	NOP
	NOP		; landing pad
	NOP
	NOP
* One way to return to the OS or other calling program
	CLR.W	-(A7)	; there should be enough room on the caller's stack
	TRAP	#1	;	quick exit
*
STKUNDR	EQU	DONE	; want better error handling
ERROR	EQU	STKUNDR

[JMR202410232040 note: See the note after the single stack code, above, and the 6809 parallel code concerning making the labels STKUNDR and ERROR more useful without actually defining error routines.]

Since you're here, you probably want to look at the 6809 code for the same, but, if not, go ahead and move on to getting numeric output in binary.


(Title Page/Index)


 

 

 

 

No comments:

Post a Comment