I guess
it didn't sit there at the bottom of the pool
so long, after all. Concrete examples are useful.
Ascending the Wrong Island --
Single-stack Stack Frame Example:
68000
In the abstract,
single-stack stack frames on the 6801 seem rather unreasonable.
Concrete examples can help clear things up a bit. The example I give below is overly simplified, but it should help show how things can work. Below the single stack example, I am providing the equivalent in split stack with explicit frame pointer, for comparison.
It should also show how trying to be too general made the explanation for the 6801 too tedious for me to follow, even as I was writing it. Is it any wonder it should have been tedious for you?
Why are we even bothering?
Because this is in essence the discipline and paradigm that underlies the code
produced by pretty much all "modern" compilers. We need to understand it to
work with it at the level we need to work with it, and we can't work with
computers these days without working with it.
We'll start with the 68000 code because that is the most concise, and work through to the more verbose code in the 6800.
Let the comments be your guide.
OPT LIST,SYMTAB ; Options we want for the stand-alone assembler.
MACHINE MC68000 ; because there are a lot the assembler can do.
OPT DEBUG ; We want labels for debugging.
OUTPUT
***********************************************************************
*
* 16-bit addition as example of single-stack stack frame discipline on 68000
* with test code
* Joel Matthew Rees, October 2024
*
NATWID EQU 4 ; 4 bytes in the CPU's natural integer
HLFNAT EQU 2 ; half natural integer
*
*
EVEN
LB_ADDR EQU *
ENTRY BRA.W START
NOP ; A little buffer zone.
NOP
A4SAVE DS.L 1 ; a place to keep A4 to A7 so we can return clean
A5SAVE DS.L 1 ; using it as pseudo-DP
A6SAVE DS.L 1 ; FP
A7SAVE DS.L 1 ; SP
DS.L 2 ; gap
HPPTR DS.L 1 ; heap pointer (not yet managed)
HPALL DS.L 1 ; heap allocation pointer
DS.L 2 ; gap
FINAL DS.L 1 ; unused statically allocated variable
GAP1 DS.L 51 ; gap, make it an even 256 bytes.
*
DS.L 2 ; a little bumper space
SSTKLIM DS.L 16*8 ; 16 levels of call, with room for stack frames
* ; 68000 is pre-dec (pre-store-decrement) push
SSTKBAS DS.L 4 ; for canary return
DS.L 2 ; bumper
HBASE DS.L $1000 ; heap space (not yet managing it)
HLIM DS.L 2 ; bumper
*
*
EVEN
INISTKS MOVEM.L (A7)+,A0 ; get the return address from some other stack
LEA LB_ADDR(PC),A3
MOVEM.L A4-A7,A4SAVE-LB_ADDR(A3) ; Store away what the BIOS gives us.
MOVE.L A3,A5 ; set up our local base (pseudo-DP)
LEA SSTKBAS+4*NATWID-LB_ADDR(A5),A7 ; set up our return stack
PEA -NATWID(A7) ; self-link for fake frame
PEA STKUNDR(PC) ; fake return to stack underflow handler
PEA -NATWID(A7) ; self-link for fake frame
MOVE.L A7,A6 ; set up fake frame pointer
PEA STKUNDR(PC) ; fake return to stack underflow handler
LEA HBASE-LB_ADDR(A5),A4 ; as if we actually had a heap
MOVE.L A4,HPPTR-LB_ADDR(A5)
MOVE.L A4,HPALL-LB_ADDR(A5)
JMP (A0) ; return via A0
*
***
* Stack after LINK #0 when functions are called by MAIN
* with two parameters
* (#0 means no local variables)
* We will return result in D0:D1
* [<SELF> ] <= <SELF>
* [STKUNDR ]
* [<SELF> ] <= <SELF>,FRMPTRX
* [STKUNDR ]SSTKBAS
* [FRMPTRX=SSTKBAS+NATWID ] <= FRMPTR0
* [RETADR0 ]
* [FRMPTR0 ] <= FRMPTR1
* [--------]
* [--------]
* [PARAM2_1]
* [PARAM2_2]
* [RETADR1 ]
* [FRMPTR1 ] <= FP,SP
*
* Signed 16 bit add to 32 bit result
* Why do this? Stack cell is 32-bit, parameters are 16.
* Handle sign overflow without losing precision.
* input parameters:
* 16-bit left, right in 32-bit
* output parameter:
* 17-bit sum in 32-bit D1
ADD16S LINK A6,#0
MOVE.W 2*NATWID+HLFNAT(A6),D0 ; right (16-bit only)
EXT.L D0
MOVE.W 3*NATWID+HLFNAT(A6),D1 ; add to left
EXT.L D1
ADD.L D0,D1
UNLK A6
RTS ; return, *** all flags valid!! ***
*
* Unsigned 16 bit add to 32 bit result
* input parameters:
* 16-bit left, right in 32-bit
* output parameter:
* 17-bit sum in 32-bit D1
ADD16U LINK A6,#0
CLR.L D0
MOVE.W 2*NATWID+HLFNAT(A6),D0 ; right (16-bit only)
CLR.L D1
MOVE.W 3*NATWID+HLFNAT(A6),D1 ; add to left
ADD.L D0,D1
UNLK A6
RTS ; return, *** all flags valid!! ***
*
* Etc.
*
***
* Stack after LINK #0 when functions are called by MAIN
* with one parameter
* (#0 means no local variables)
* We will return result in D0:D1
* [<SELF> ] <= <SELF>
* [STKUNDR ]
* [<SELF> ] <= <SELF>,FRMPTRX
* [STKUNDR ]SSTKBAS
* [FRMPTRX=SSTKBAS+NATWID ] <= FRMPTR0
* [RETADR0 ]
* [FRMPTR0 ] <= FRMPTR1
* [VAR1_1--]
* [VAR1_2--]
* [PARAM2_1]
* [RETADR1 ]
* [FRMPTR1 ] <= FP,SP
* To show how to walk the stack --
* Add 16-bit signed parameter
* to 32 bit caller's 2nd 32-bit internal variable.
* input parameter:
* 16-bit addend in 32-bit
* target parameter in caller
* 2nd 32-bit variable at offset -2*NATWID
* no output parameter:
ADD16SI LINK A6,#0
MOVE.W 2*NATWID+HLFNAT(A6),D1
EXT.L D1
MOVE.L (A6),A0 ; get caller's frame pointer
ADD.L D1,-2*NATWID(A0) ; add to caller's 2nd variable
UNLK A6
RTS ; return, *** all flags valid!! ***
*
*
***
* Stack after LINK
* [<SELF> ] <= <SELF>
* [STKUNDR ]
* [<SELF> ] <= <SELF>,FRMPTRX
* [STKUNDR ]SSTKBAS
* [FRMPTRX=SSTKBAS+NATWID ] <= FRMPTR0
* [RETADR0 ]
* [FRMPTR0 ] <= FP
* [VAR1_1--]
* [VAR1_2--] <= SP
*
MAIN LINK A6,#-2*NATWID ; room for 2 variables
CLR.L -NATWID(A6)
CLR.L -2*NATWID(A6)
MOVE.L #$1234,-(A7)
MOVE.L #$CDEF,-(A7)
BSR.W ADD16U ; result in D1 should be $E023
LEA 2*NATWID(A7),A7 ; could reuse, instead
MOVE.L D1,-(A7)
MOVE.L #$8765,-(A7)
BSR.W ADD16S ; result in D1 should be $FFFF6788 (and carry set)
LEA 2*NATWID(A7),A7
MOVE.L D1,-2*NATWID(A6)
MOVE.L #$A5A5,-(a7)
BSR.W ADD16SI ; result in 2nd variable should be FFFF0D2D (Carry set)
MOVE.L -2*NATWID(A6),FINAL-LB_ADDR(A5) ; store the result
UNLK A6
RTS
*
***
* Stack at START:
* (what BIOS/OS gave us) <= SP (A7)
***
* (who knows?) <= FP (A6)
***
*
* Stack after initialization:
* [<SELF> ] <= <SELF>
* [STKUNDR ]
* [<SELF> ] <= <SELF>,FP
* [STKUNDR ]SSTKBAS <= SP
***
* Stack after LINK (at call to MAIN)
* [<SELF> ] <= <SELF>
* [STKUNDR ]
* [<SELF> ] <= <SELF>,FRMPTRX
* [STKUNDR ]SSTKBAS
* [FRMPTRX=SSTKBAS+NATWID ] <= SP,FP
*
START BSR.W INISTKS
*
LINK A6,#0 ; initialize frame list
*
BSR.W MAIN
*
*
DONE MOVEM.L A4SAVE-LB_ADDR(A5),A4-A7 ; restore the monitor's A4-A7
NOP
NOP ; landing pad
NOP
NOP
* One way to return to the OS or other calling program
CLR.W -(A7) ; there should be enough room on the caller's stack
TRAP #1 ; quick exit
*
STKUNDR EQU DONE ; want better error handling
ERROR EQU STKUNDR
I've tested the code. It runs, and it builds the stack frames and tears them down as advertised. I'm not going to guarantee that it can be generalized. Neither will I guarantee that it can be generated by some appropriately constructed compiler.
[JMR202411182142 edit:]
Just realized, while working on this for the 6800, that the name for SUB16SI did not agree with what it is doing. So I'm fixing it, calling it ADD16SI, instead.
[JMR202411182142 edit end.]
[JMR202410232030 note:]
Something I realized while transliterating the code for this onto the 6809, I could put the STKUNDR label on one of the NOPs after DONE, and I would be able to set breakpoints separately on DONE and on STKUNDR. Likewise the ERROR label.
This would allow me to tell by which breakpoint I hit whether I'd safely gotten through the code, had tried to return through one of the stack underflow fake returns, or (hypothetically) had jumped to ERROR.
[JMR202410232030 note end.]
Just for comparison, let's see what that would look like with split stacks and, say, a literal frame pointer. Just for grins. (And you can use a separate browser window to compare by just opening this chapter again.)
OPT LIST,SYMTAB ; Options we want for the stand-alone assembler.
MACHINE MC68000 ; because there are a lot the assembler can do.
OPT DEBUG ; We want labels for debugging.
OUTPUT
***********************************************************************
*
* 16-bit addition as example of split-stack, literal frame pointer
* stack frame discipline on 68000
* with test code
* Joel Matthew Rees, October 2024
*
NATWID EQU 4 ; 4 bytes in the CPU's natural integer
HLFNAT EQU 2 ; half natural integer
*
*
EVEN
LB_ADDR EQU *
ENTRY BRA.W START
NOP ; A little buffer zone.
NOP
A4SAVE DS.L 1 ; (FP) a place to keep A4 to A7 so we can return clean
A5SAVE DS.L 1 ; using it as pseudo-DP
A6SAVE DS.L 1 ; using it as parameter stack pointer
A7SAVE DS.L 1 ; SP
DS.L 2 ; gap
HPPTR DS.L 1 ; heap pointer (not yet managed)
HPALL DS.L 1 ; heap allocation pointer
DS.L 2 ; gap
FINAL DS.L 1 ; unused statically allocated variable
GAP1 DS.L 51 ; gap, make it an even 256 bytes.
*
DS.L 2 ; a little bumper space
SSTKLIM DS.L 16*2 ; 16 levels of call, with room for frame pointers
* ; 68000 is pre-dec (pre-store-decrement) push
SSTKBAS DS.L 3 ; for canary return
DS.L 2 ; bumper
PSTKLIM DS.L 16*4 ; roughly 16 levels of call
PSTKBAS DS.L 2 ; bumper
HBASE DS.L $1000 ; heap space (not yet managing it)
HLIM DS.L 2 ; bumper
*
*
EVEN
INISTKS MOVEM.L (A7)+,A0 ; get the return address from some other stack
LEA LB_ADDR(PC),A3
MOVEM.L A4-A7,A4SAVE-LB_ADDR(A3) ; Store away what the BIOS gives us.
MOVE.L A3,A5 ; set up our local base (pseudo-DP)
LEA SSTKBAS+4*NATWID-LB_ADDR(A5),A7 ; set up our return stack
LEA PSTKBAS-LB_ADDR(A5),A6 ; set up our parameter stack
PEA STKUNDR(PC) ; fake return to stack underflow handler
MOVE.L A6,-(A7) ; empty frame pointer for fake frame
PEA STKUNDR(PC) ; fake return to stack underflow handler
LEA HBASE-LB_ADDR(A5),A4 ; as if we actually had a heap
MOVE.L A4,HPPTR-LB_ADDR(A5)
MOVE.L A4,HPALL-LB_ADDR(A5)
MOVE.L A6,A4 ; synch FP and PSP
JMP (A0) ; return via A0
*
***
* Return stack when functions are called by MAIN
* Return stack on entry:
* [STKUNDR ]
* [<EMPTYP>]
* [STKUNDR ]SSTKBAS
* [FRMPTRm1==<EMPTYP>]
* [RETADR0 ]
* [FRMPTR0==<EMPTYP>] <= RSP
* [RETADR1 ]
*
* Return stack after link:
* [STKUNDR ]
* [<EMPTYP>]
* [STKUNDR ]SSTKBAS
* [FRMPTRm1==<EMPTYP>]
* [RETADR0 ]
* [FRMPTR0==<EMPTYP>]
* [RETADR1 ]
* [FRMPTR1 ] <= RSP
*
* Parameter stack when called by MAIN
* with two 16-bit parameters,
* after mark (no local allocation)
* [<unknown>] <= FRMPTR0,FRMPTR1
* [32:VAR1_1--]
* [32:VAR1_2--]
* [16:PARAM2_1]
* [16:PARAM2_2] <= PSP,FP
*
* Signed 16 bit add to 32 bit result
* Handle sign overflow without losing precision.
* input parameters:
* 16-bit left, right
* output parameter:
* 17-bit sum in 32-bit
ADD16S MOVE.L A4,-(A7) ; link, mark, and restore could be optimized out.
MOVE.L A6,A4 ; mark
MOVE.W (A6)+,D0 ; right (16-bit only)
EXT.L D0
MOVE.W (A6)+,D1 ; add to left
EXT.L D1
ADD.L D0,D1
MOVE.L D1,-(A6)
MOVE.L (A7)+,A4 ; restore FP
RTS ; return, *** all flags valid!! ***
*
* Unsigned 16 bit add to 32 bit result
* input parameters:
* 16-bit left, right in 32-bit
* output parameter:
* 17-bit sum in 32-bit
ADD16U MOVE.L A4,-(A7) ; link
MOVE.L A6,A4 ; mark
CLR.L D0
MOVE.W (A6)+,D0 ; right (16-bit only)
CLR.L D1
MOVE.W (A6)+,D1 ; add to left
ADD.L D0,D1
MOVE.L D1,-(A6)
MOVE.L (A7)+,A4 ; restore FP
RTS ; return, *** all flags valid!! ***
*
* Etc.
*
***
* Parameter stack when called by MAIN
* with one 16-bit parameters,
* after mark (no local allocation)
* [<unknown>] <= FRMPTR0,FRMPTR1
* [32:VAR1_1--]
* [32:VAR1_2--]
* [16:PARAM2_1] <= PSP,FP
* To show how to walk the stack --
* Add 16-bit signed parameter
* to 32 bit caller's 2nd 32-bit internal variable.
* input parameter:
* 16-bit addend in 32-bit
* target parameter in caller
* 2nd 32-bit variable at offset -2*NATWID
* no output parameter:
ADD16SI MOVE.L A4,-(A7) ; link
MOVE.L A6,A4 ; mark
MOVE.W (A6)+,D1
EXT.L D1
MOVE.L (A7),A0 ; get caller's frame pointer from return stack
ADD.L D1,-2*NATWID(A0) ; add to caller's 2nd variable
MOVE.L (A7)+,A4 ; restore FP
RTS ; return, *** all flags valid!! ***
*
*
***
* Return stack on entry:
* [STKUNDR ]
* [<EMPTYP>]
* [STKUNDR ]SSTKBAS
* [FRMPTRm1==<EMPTYP>]
* [RETADR0 ] <= RSP
*
* Return stack after link:
* [STKUNDR ]
* [<EMPTYP>]
* [STKUNDR ]SSTKBAS
* [FRMPTRm1==<EMPTYP>]
* [RETADR0 ]
* [FRMPTR0==<EMPTYP>] <= RSP
*
* Parameter stack after mark and local allocation
* [<unknown>] <= FP,FRMPTR0
* [VAR1_1--]
* [VAR1_2--] <= PSP
*
MAIN MOVE.L A4,-(A7) ; link
MOVE.L A6,A4 ; mark
LEA -2*NATWID(A6),A6 ; allocate
MOVE.W #$1234,-(A6)
MOVE.W #$CDEF,-(A6)
BSR.W ADD16U ; result on parameter stack should be $E023
LEA HLFNAT(A6),A6 ; adjust to 16 bit, could be optimized out
MOVE.W #$8765,-(A6)
BSR.W ADD16S ; result on parameter stack should be $FFFF6788 (and carry set)
MOVE.L (A6)+,-2*NATWID(A4) ; save in local
MOVE.W #$A5A5,-(a6)
BSR.W ADD16SI ; result in 2nd variable should be FFFF0D2D (Carry set)
MOVE.L -2*NATWID(A4),FINAL-LB_ADDR(A5) ; store the result
MOVE.L (A7)+,A4 ; restore FP
RTS
*
***
* Stack at START:
* (what BIOS/OS gave us) <= SP (A7)
***
* (who knows?) <= FP (A6)
***
*
***
* Return stack will always be in pairs:
* [RETADRNN ]
* [CALLERFMNN]
*
* Return stack after initialization:
* [STKUNDR ]
* [<EMPTYP>]
* [STKUNDR ]SSTKBAS <= RSP
*
* Return stack after link:
* [STKUNDR ]
* [<EMPTYP>]
* [STKUNDR ]SSTKBAS
* [FRMPTRm1==<EMPTYP>] <= RSP
*
* Parameter stack after initialization, mark:
* [<unknown] <= PSP,FP==<EMPTYP>
*
START BSR.W INISTKS
MOVE.L A4,-(A7) ; link
MOVE.L A6,A4 ; mark
*
BSR.W MAIN
*
*
DONE NOP
NOP ; landing pad
NOP
NOP
MOVE.L (A7)+,A4
MOVEM.L A4SAVE-LB_ADDR(A5),A4-A7 ; restore the monitor's A4-A7
NOP
NOP ; landing pad
NOP
NOP
* One way to return to the OS or other calling program
CLR.W -(A7) ; there should be enough room on the caller's stack
TRAP #1 ; quick exit
*
STKUNDR EQU DONE ; want better error handling
ERROR EQU STKUNDR
[JMR202410232040 note: See the note after the single stack code, above, and the 6809 parallel code concerning making the labels STKUNDR and ERROR more useful without actually defining error routines.]
Since you're here, you probably want to look at the 6809 code for the same, but, if not, go ahead and move on to getting numeric output in binary.
And you probably don't want to forget that we have already seen this without stack frames.
[JMR202411290957 addendum:]
I believe I've mentioned a few niggles I have with the 68000.
This is not exactly a niggle, but, if I were daydreaming, since they did go to the trouble of providing a LINK instruction, it would have been nice to have a CALL instruction that pushes both the return address and the contents of a specified address register, as the parameter stack pointer, something like
CALL <address>:An
where An is the address register to push.
The return might look like
RET An
That would be asking for too much, really, but it would streamline the split-stack discipline.
It might also help with single-stack stack frames done a different way.
[JMR202411290957 addendum end.]
No comments:
Post a Comment