Well, this one did sit at the bottom of the pool for a little while. Real world stuff interfering. Concrete examples are still useful.
Ascending the Wrong Island --
Single-stack Stack Frame Example:
6809
This is more concrete work to elucidate the problems in single-stack stack frames on the 6801. I'm translating the concrete example for the 68000 to the 6809 here.
Again, I do not recommend a single stack discipline. But most of the current
"modern" software engineering infrastructure is built on this discipline, so
it helps to have code that allows us to compare the single stack approach with
the split stack approach. I am providing examples of both for the 6809 here,
the split stack example below the single stack example.
With the 6809 written and checked, it should become possible to write a concrete example for the 6801 and even the 6800.
Again, I'm leaving the discussion for the comments, in the (not quite realistic) hopes that the
comments will be more accurate than free-form prose.
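For anyone who finds the frame diagrams in the comments hard to keep in mind, a rough off-target model in Python may help. This is only a sketch under assumptions, not anything the assembler sees: the class `FrameModel`, its method names, the starting address $2164, and the stand-in return address $BEEF are all made up for illustration. It mimics the PSHS U / TFR S,U link used in the single-stack listing and shows why a callee finds its parameters at 4,U and 6,U, and why the caller's second 32-bit variable sits 8 bytes below the caller's frame pointer.

```python
class FrameModel:
    """Hypothetical model of a 6809 single stack: S is the stack pointer, U the frame pointer."""

    def __init__(self, stack_top=0x2164):
        self.mem = bytearray(0x10000)   # 64K address space
        self.s = stack_top              # S register
        self.u = stack_top              # U register (whatever it held before we started)

    def push16(self, value):
        """Push 16 bits the way PSHS does: pre-decrement, high byte at the lower address."""
        self.s -= 2
        self.mem[self.s] = (value >> 8) & 0xFF
        self.mem[self.s + 1] = value & 0xFF

    def read16(self, addr):
        """Read a big-endian 16-bit value."""
        return (self.mem[addr] << 8) | self.mem[addr + 1]

    def link(self):
        """PSHS U then TFR S,U: save the caller's frame pointer and start a new frame."""
        self.push16(self.u)
        self.u = self.s


m = FrameModel()
m.link()                    # MAIN marks its frame
main_fp = m.u
for _ in range(4):
    m.push16(0)             # two cleared 32-bit local variables
m.push16(0x1234)            # left parameter (PSHS D,X stores X first)
m.push16(0xCDEF)            # right parameter (D lands below X)
m.push16(0xBEEF)            # stand-in for the return address LBSR would push
m.link()                    # callee marks its frame

right = m.read16(m.u + 4)   # like LDD 4,U
left = m.read16(m.u + 6)    # like LDD 6,U
caller_fp = m.read16(m.u)   # like LDY ,U
second_var = caller_fp - 8  # where -8,Y points in SUB16SI
print(hex(left), hex(right), hex(caller_fp), hex(second_var))
```

The model leaves out the canary frames that INISTK builds and the extra link at START, so the absolute addresses will not match a real run. Only the offsets relative to the frame pointer are meant to carry over.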
* 16-bit addition as example of single-stack stack frame discipline on 6809
* using the direct page,
* with test code
* Joel Matthew Rees, October 2024
*
NATWID EQU 2 ; 2 bytes in the CPU's natural integer
*
*
* Blank line will end assembly.
ORG $2000 ; MDOS says this is a good place for usr stuff.
* SETDP $20 ; for some other assemblers
SETDP $2000 ; for EXORsim
*
ENTRY LBRA START
NOP ; Just want even addressed pointers for no reason.
NOP ; bumper
NOP
SSAVE RMB 2 ; a place to keep S so we can return clean
SSAVEX EQU 6 ; manufacture offsets for assemblers that can't do SSAVE-ENTRY
USAVE RMB 2 ; just for kicks, save U, too
USAVEX EQU SSAVEX+2
DPSAVE RMB 2 ; a place to keep DP so we can return clean
DPSAVEX EQU USAVEX+2
RMB 4 ; bumper
XWORK RMB 2 ; For saving an index register temporarily
XWORKX EQU DPSAVEX+6
HPPTR RMB 2 ; heap pointer (not yet managed)
HPPTRX EQU XWORKX+2
HPALL RMB 2 ; heap allocation pointer
HPALLX EQU HPPTRX+2
RMB 4 ; bumper
FINAL RMB 4 ; 32-bit Final result in DP variable (to show we can)
FINALX EQU HPALLX+6
GAP1 RMB 2 ; Mark the bottom of the gap
GAP1X EQU FINALX+4
*
LB_ADDR EQU ENTRY
*
*
SETDP 0 ; Not yet set up
ORG $2100 ; Give the DP room.
RMB 4 ; a little bumper space
SSTKLIM RMB 96 ; roughly 16 levels of call
SSTKLIMX EQU $104
* ; 6809 is pre-dec (pre-store-decrement) push
SSTKBAS RMB 6 ; for canary return
SSTKBASX EQU SSTKLIMX+96
SSTKFAK RMB 2 ; fake frame pointer, self-link
SSTKFAKX EQU SSTKBASX+6
SSTKBMP RMB 4 ; a little bumper space
SSTKBMPX EQU SSTKFAKX+2
*
HBASE RMB $1024 ; Not using or managing heap yet.
HBASEX EQU SSTKBMPX+4
HLIM RMB 4 ; bumper
HLIMX EQU HBASEX+$1024
*
*
* If we had DP relative in postbyte,
* and if DP were defined for 2-byte transfers as DP:00,
* we could do this:
*INISTK LEAX 0,DP
* LEAY ENTRY,PCR ; Set up new DP base
* TFR Y,DP ; I think this would actually work, but isn't documented.
* STX <DPSAVE
* (If wishes were fishes ....)
* Calculate DP because we don't have DP relative in index postbyte:
INISTK TFR DP,A
CLRB
TFR D,X ; save old DP base for a moment
LEAY ENTRY,PCR ; Set up new DP base
TFR Y,D
TFR A,DP ; Now we can access DP variables correctly.
* SETDP $20 ; some other assemblers
SETDP $2000 ; EXORsim
STX <DPSAVE ; technically only need to save high byte
STU <USAVE
PULS X ; get return address
STS <SSAVE ; Save what the monitor gave us.
LEAS SSTKFAKX,Y ; Move to our own stack
STS ,S ; self-link as fake frame pointer
LEAY STKUNDR,PCR ; fake return to stack underflow handler
PSHS Y ; Using U would conflict with frame pointer use
* STS ,--S ; This would not work even if emulated correctly
LEAU -2,S ; self-link as fake frame pointer
PSHS U ; U is FP, S and U equal
PSHS Y ; one more fake return to handler
* Because we don't have DP (long) relative in postbyte,
* and can't do
* LEAY HBASEX,DP
* calculate it:
CLRB ; A still has run-time DP
ADDD #HBASEX ; calculate EA
TFR D,Y ; as if we actually had a heap
STY <HPPTR
STY <HPALL
JMP ,X ; return via X
*
***
* Stack after LINK #0 when functions are called by MAIN
* with two parameters
* (#0 means no local variables)
* We will return result in D:X
* [<SELF> ] <= <SELF>
* [STKUNDR ]
* [<SELF> ] <= <SELF>,FRMPTRX
* [STKUNDR ]SSTKBAS
* [FRMPTRX=SSTKBAS+NATWID ] <= FRMPTR0
* [RETADR0 ]
* [FRMPTR0 ] <= FRMPTR1
* [--------]
* [--------]
* [PARAM2_1]
* [PARAM2_2]
* [RETADR1 ]
* [FRMPTR1 ] <= FP,SP
*
* Signed 16 bit add to 32 bit result
* Handle sign overflow without losing precision.
* input parameters:
* 16-bit left 1st pushed, right 2nd
* output parameter:
* 17-bit sum in 32-bit D:X D high, X low
ADD16S PSHS U ; mark
TFR S,U ; link, no allocate
LDX #-1 ; sign extend right
TST 4,U ; sign bit, anyway
BMI ADD16SR
LEAX 1,X ; 0
ADD16SR PSHS X ; push right extension
LDX #-1 ; negative
LDD 6,U ; left
BMI ADD16SL
LEAX 1,X ; 0
ADD16SL PSHS X ; push left extension
ADDD 4,U ; add right
TFR D,X ; save low
PULS D ; get left sign extension
ADCB 1,S ; carry is still safe
ADCA ,S ; high word complete
TFR U,S ; result is in D:X
PULS U ; unlink
RTS ; C, N valid, Z not valid
*
* Unsigned 16 bit add to 32 bit result
* input parameters:
* 16-bit left, right
* output parameter:
* 17-bit sum in 32-bit D:X D high
ADD16U PSHS U ; mark
TFR S,U ; link, no allocate
LDD 6,U ; left
ADDD 4,U ; add right
TFR D,X ; save low
LDD #0 ; extend
ADCB #0 ; extend Carry unsigned (could ROL in)
TFR U,S ; unlink (unnecessary here, but ...)
PULS U
RTS ; C, N valid, Z not valid
*
* Etc.
*
***
* Stack after LINK #0 when functions are called by MAIN
* with one parameter
* (#0 means no local variables)
* Nothing is returned in registers (the caller's variable is updated)
* [<SELF> ] <= <SELF>
* [STKUNDR ]
* [<SELF> ] <= <SELF>,FRMPTRX
* [STKUNDR ]SSTKBAS
* [FRMPTRX=SSTKBAS+NATWID ] <= FRMPTR0
* [RETADR0 ]
* [FRMPTR0 ] <= FRMPTR1
* [VAR1_1--]
* [VAR1_2--]
* [PARAM2_1]
* [RETADR1 ]
* [FRMPTR1 ] <= FP,SP
*
* To show how to walk the stack --
* Add 16-bit signed parameter
* to 32 bit caller's 2nd 32-bit internal variable.
* input parameter:
* 16-bit addend
* target parameter in caller
* 2nd 32-bit variable at offset -4*NATWID (-8) from caller's FP
* no output parameter:
SUB16SI PSHS U ; mark
TFR S,U ; link, no allocate
LDY ,U ; get caller's FP back
LDX #-1 ; sign extend (only) parameter
TST 4,U
BMI SUB16SIP
LEAX 1,X
SUB16SIP PSHS X
LDD -6,Y ; caller's 2nd variable, low
ADDD 4,U ; 1st (only) parameter
STD -6,Y ; update low half
LDD -8,Y ; caller's 2nd variable, high
ADCB 1,S
ADCA ,S
STD -8,Y
TFR U,S ; unlink
PULS U
RTS ; C, N valid, Z not valid
*
*
***
* Stack after LINK
* [<SELF> ] <= <SELF>
* [STKUNDR ]
* [<SELF> ] <= <SELF>,FRMPTRX
* [STKUNDR ]SSTKBAS
* [FRMPTRX=SSTKBAS+NATWID ] <= FRMPTR0
* [RETADR0 ]
* [FRMPTR0 ] <= FP
* [32:VAR1_1]
* [32:VAR1_2] <= SP
*
MAIN PSHS U ; mark
TFR S,U ; link
* LEAS -8,S ; allocate 2 32-bit variables
* LDD #0 ; (showing how to access)
* STD -8,U ; clear the variables
* STD -6,U ; there is a slightly faster way
* STD -4,U
* STD -2,U
* slightly faster, fewer bytes, too:
LDD #0
TFR D,X
PSHS D,X
PSHS D,X
*
LDX #$1234
* PSHS X ; yes we could push D and X together
LDD #$CDEF
* PSHS D
PSHS D,X
LBSR ADD16U ; result in D:X should be $E023
LEAS 4,S ; could reuse instead of dropping
* PSHS X
LDD #$8765
* PSHS D
PSHS D,X
LBSR ADD16S ; result in D:X should be $FFFF6788 (and carry set)
LEAS 4,S
STD -8,U
STX -6,U
LDD #$A5A5
PSHS D
LBSR SUB16SI ; result in 2nd variable should be FFFF0D2D (Carry set)
LDD -8,U
STD <FINAL
LDD -6,U
STD <FINAL+2
TFR U,S
PULS U
RTS ; C, N valid, Z not valid
*
*
***
* Stack at START:
* (what BIOS/OS gave us) <= SP (S)
***
* (who knows?) <= FP (U)
***
*
* Stack after initialization:
* [<SELF> ] <= <SELF>
* [STKUNDR ]
* [<SELF> ] <= <SELF>,FP
* [STKUNDR ]SSTKBAS <= SP
***
* Stack after LINK (at call to MAIN)
* [<SELF> ] <= <SELF>
* [STKUNDR ]
* [<SELF> ] <= <SELF>,FRMPTRX
* [STKUNDR ]SSTKBAS
* [FRMPTRX=SSTKBAS+NATWID ] <= SP,FP
*
START NOP
LBSR INISTK
NOP
*
PSHS U ; mark
TFR S,U ; link
*
LBSR MAIN
*
DONE NOP
ERROR NOP ; define error labels as something not DONE, anyway
STKUNDR NOP
LDS <SSAVE ; restore the monitor stack pointer
LDD <DPSAVE ; restore the monitor DP last
TFR A,DP
SETDP 0 ; For lack of a better way to set it.
NOP
NOP ; landing pad to set breakpoint at
NOP
NOP
JMP [$FFFE] ; alternatively, jmp through reset vector
*
* Anyway, if running in EXORsim, after RESETting,
* Ctrl-C should bring you back to EXORsim monitor,
* but not necessarily to your program in a runnable state.
Note, in the above, that moving the STKUNDR and ERROR labels away from DONE
makes it possible to set a breakpoint at DONE which would not be taken if the
code failed to finish properly. This would be the case if it returned to
STKUNDR via a fake return or (hypothetically) jumped to ERROR.
Again, I've tested the code. It runs. It builds the stack frames and tears them down as advertised. And, as always, I will not guarantee that this code can be generalized. Nor will I guarantee that it can be generated by any real compiler.
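The expected results quoted in MAIN's comments are easy to cross-check off-target. The Python below is one way to do that, offered as a sketch with hypothetical helper names (`sign_extend16`, `add16u`, `add16s`, `add_s16_to_32`). It models only the arithmetic, with "carry" meaning the carry out of bit 31, and none of the stack handling.

```python
MASK16 = 0xFFFF
MASK32 = 0xFFFFFFFF


def sign_extend16(value):
    """Widen a 16-bit pattern to 32 bits by copying bit 15 upward."""
    value &= MASK16
    return (value | 0xFFFF0000) if (value & 0x8000) else value


def add16u(left, right):
    """Unsigned 16 + 16 giving a 32-bit sum (17 significant bits)."""
    return (left & MASK16) + (right & MASK16)


def add16s(left, right):
    """Signed 16 + 16 giving (32-bit sum, carry out of bit 31)."""
    total = sign_extend16(left) + sign_extend16(right)
    return total & MASK32, total >> 32


def add_s16_to_32(accumulator, addend):
    """Add a sign-extended 16-bit addend to a 32-bit accumulator."""
    total = (accumulator & MASK32) + sign_extend16(addend)
    return total & MASK32, total >> 32


# Same sequence of calls as MAIN makes.
first = add16u(0x1234, 0xCDEF)
second, carry2 = add16s(first & MASK16, 0x8765)
third, carry3 = add_s16_to_32(second, 0xA5A5)
print(f"{first:08X} {second:08X} {third:08X}")
```

The three values printed correspond to the results the comments in MAIN call out after ADD16U, ADD16S, and SUB16SI, in that order.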
Again for comparison and for grins, let's see what it might look like with
split stacks and a literal frame pointer.
* 16-bit addition as example of split-stack stack frame discipline on 6809
* using the direct page,
* with test code
* Joel Matthew Rees, October 2024
*
NATWID EQU 2 ; 2 bytes in the CPU's natural integer
*
*
* Blank line will end assembly.
ORG $2000 ; MDOS says this is a good place for usr stuff.
* SETDP $20 ; for lwasm and some other assemblers
SETDP $2000 ; for EXORsim and some other assemblers
*
ENTRY LBRA START
NOP ; Just want even addressed pointers for no reason.
NOP ; bumper
NOP
SSAVE RMB 2 ; a place to keep S so we can return clean
SSAVEX EQU 6 ; manufacture offsets for assemblers that can't do SSAVE-ENTRY
USAVE RMB 2 ; just for kicks, save U, too
USAVEX EQU SSAVEX+2
DPSAVE RMB 2 ; a place to keep DP so we can return clean
DPSAVEX EQU USAVEX+2
RMB 4 ; bumper
XWORK RMB 2 ; For saving an index register temporarily
XWORKX EQU DPSAVEX+6
FMTMP RMB 2 ; For saving the stack mark in Y temporarily
FMTMPX EQU XWORKX+2
HPPTR RMB 2 ; heap pointer (not yet managed)
HPPTRX EQU FMTMPX+2
HPALL RMB 2 ; heap allocation pointer
HPALLX EQU HPPTRX+2
RMB 4 ; bumper
FINAL RMB 4 ; 32-bit Final result in DP variable (to show we can)
FINALX EQU HPALLX+6
GAP1 RMB 2 ; Mark the bottom of the gap
GAP1X EQU FINALX+4
*
LB_ADDR EQU ENTRY
*
*
SETDP 0 ; Not yet set up
ORG $2100 ; Give the DP room.
RMB 4 ; a little bumper space
SSTKLIM RMB 32 ; roughly 16 levels of call
SSTKLIMX EQU $104
* ; 6809 is pre-dec (pre-store-decrement) push
SSTKBAS RMB 6 ; for canary return
SSTKBASX EQU SSTKLIMX+32 ; must match the RMB 32 at SSTKLIM
SSTKFAK RMB 2 ; fake frame pointer, self-link
SSTKFAKX EQU SSTKBASX+6
SSTKBMP RMB 4 ; a little bumper space
SSTKBMPX EQU SSTKFAKX+2
PSTKLIM RMB 64 ; 16 levels of call at two parameters per call
PSTKLIMX EQU SSTKBMPX+4
PSTKBAS RMB 4 ; bumper space -- parameter stack is pre-dec
PSTKBASX EQU PSTKLIMX+64
*
HBASE RMB $1024 ; Not using or managing heap yet.
HBASEX EQU PSTKBASX+4
HLIM RMB 4 ; bumper
HLIMX EQU HBASEX+$1024
*
*
* If we had DP relative in postbyte,
* and if DP were defined for 2-byte transfers as DP:00,
* we could do this:
*INISTK LEAX 0,DP
* LEAY ENTRY,PCR ; Set up new DP base
* TFR Y,DP ; I think this would actually work, but isn't documented.
* STX <DPSAVE
* (If wishes were fishes ....)
* Calculate DP because we don't have DP relative in index postbyte:
INISTKS TFR DP,A
CLRB
TFR D,X ; save old DP base for a moment
LEAY ENTRY,PCR ; Set up new DP base
TFR Y,D
TFR A,DP ; Now we can access DP variables correctly.
* SETDP $20 ; some other assemblers
SETDP $2000 ; EXORsim
STX <DPSAVE ; technically only need to save high byte
STU <USAVE
PULS X ; get return address
STS <SSAVE ; Save what the monitor gave us.
LEAS SSTKFAKX,Y ; Move to our own return stack
LEAU PSTKBASX,Y ; and our own parameter stack
LEAY STKUNDR,PCR ; fake return to stack underflow handler
PSHS Y
PSHS U ; fake link to empty stack
PSHS Y ; one more fake return to stack underflow handler
CLRB ; A still has run-time DP
ADDD #HBASEX ; calculate EA
TFR D,Y ; as if we actually had a heap
STY <HPPTR
STY <HPALL
JMP ,X ; return via X
*
*
***
* Return stack when functions are called by MAIN
* Return stack on entry:
* [STKUNDR ]
* [<EMPTYP>]
* [STKUNDR ]SSTKBAS
* [FRMPTRm1==<EMPTYP>]
* [RETADR0 ]
* [FRMPTR0==<EMPTYP>] <= RSP
* [RETADR1 ]
*
* Return stack after link:
* [STKUNDR ]
* [<EMPTYP>]
* [STKUNDR ]SSTKBAS
* [FRMPTRm1==<EMPTYP>]
* [RETADR0 ]
* [FRMPTR0==<EMPTYP>]
* [RETADR1 ]
* [FRMPTR1 ] <= RSP
*
* Parameter stack when called by MAIN
* with two 32-bit local variables
* and two 16-bit parameters,
* after mark (no local allocation)
* [<unknown>] <= FRMPTR0,FRMPTR1
* [32:VAR1_1--]
* [32:VAR1_2--]
* [16:PARAM2_1]
* [16:PARAM2_2] <= PSP,FP
*
* Signed 16 bit add to 32 bit result
* Handle sign overflow without losing precision.
* input parameters:
* 16-bit left, right
* output parameter:
* 17-bit sum in 32-bit
ADD16S PSHS Y ; link, mark, and restore could be optimized out.
TFR U,Y ; mark
LDX #-1 ; sign extend right
TST ,Y ; sign bit, anyway (Use Y to show it can be used.)
BMI ADD16SR
LEAX 1,X ; 0
ADD16SR PSHU X ; push right extension
LDX #-1 ; negative
LDD 2,Y ; left
BMI ADD16SL
LEAX 1,X ; 0
ADD16SL PSHU X ; push left extension
ADDD ,Y ; add right
STD 2,Y ; save low
PULU D ; get left sign extension
ADCB 1,U ; carry is still safe
ADCA ,U++ ; high word complete, tricky postinc
STD ,Y
PULS Y ; restore FP
RTS ; C, N valid, Z not valid
*
* Alternative: no link, mark, or restore:
*ADD16S LDX #-1 ; sign extend right
* TST ,U ; sign bit, anyway (no mark in Y in this version)
* BMI ADD16SR
* LEAX 1,X ; 0
*ADD16SR PSHU X ; push right extension
* LDX #-1 ; negative
* LDD 4,U ; left
* BMI ADD16SL
* LEAX 1,X ; 0
*ADD16SL PSHU X ; push left extension
* ADDD 4,U ; add right
* STD 6,U ; save low
* PULU D ; get left sign extension
* ADCB 1,U ; carry is still safe
* ADCA ,U++ ; high word complete, sneaky postinc
* STD ,U
* RTS ; C, N valid, Z not valid
*
* Unsigned 16 bit add to 32 bit result
* input parameters:
* 16-bit left, right
* output parameter:
* 17-bit sum in 32-bit, on the parameter stack
ADD16U PSHS Y ; link, mark, and restore could be optimized out.
TFR U,Y ; mark
LDD 2,Y ; left
ADDD ,Y ; add right
STD 2,Y ; save low
LDD #0 ; extend
ROLB ; extend Carry unsigned (could ADC #0)
STD ,Y
PULS Y ; restore FP
RTS ; C, N valid, Z not valid
*
* Etc.
*
*
***
* Parameter stack when called by MAIN
* with one 16-bit parameter,
* after mark (no local allocation)
* [<unknown>] <= FRMPTR0,FRMPTR1
* [32:VAR1_1--]
* [32:VAR1_2--]
* [16:PARAM2_1] <= PSP,FP
*
* To show how to walk the stack --
* Add 16-bit signed parameter
* to 32 bit caller's 2nd 32-bit internal variable.
* input parameter:
* 16-bit addend
* target parameter in caller
* 2nd 32-bit variable at offset -4*NATWID (-8) from caller's FP
* no output parameter:
SUB16SI PSHS Y ; link, mark, and restore could be optimized out.
TFR U,Y ; mark
LDX ,S ; get caller's FP back
LDD #-1 ; sign extend (single) parameter
TST ,Y
BMI SUB16SIP
LDD #0
SUB16SIP PSHU D ; save sign extension
LDD -6,X ; caller's 2nd variable, low
ADDD ,Y ; single parameter
STD -6,X ; update low half
LDD -8,X ; caller's 2nd variable, high
ADCB 1,U ; sign extension low byte
ADCA ,U ; high byte
STD -8,X ; store result
TFR Y,U ; drop sign extension (back to mark; the parameter stays)
PULS Y ; restore FP
RTS ; C, N valid, Z not valid
*
*
*
***
* Return stack on entry:
* [STKUNDR ]
* [<EMPTYP>]
* [STKUNDR ]SSTKBAS
* [FRMPTRm1==<EMPTYP>]
* [RETADR0 ] <= RSP
*
* Return stack after link:
* [STKUNDR ]
* [<EMPTYP>]
* [STKUNDR ]SSTKBAS
* [FRMPTRm1==<EMPTYP>]
* [RETADR0 ]
* [FRMPTR0==<EMPTYP>] <= RSP
*
* Parameter stack after mark and local allocation
* [<unknown>] <= FP,FRMPTR0
* [VAR1_1--]
* [VAR1_2--] <= PSP
*
MAIN PSHS Y ; link
TFR U,Y ; mark
LDD #0 ; allocate and initialize
TFR D,X
PSHU D,X
PSHU D,X
LDX #$1234
LDD #$CDEF
PSHU D,X
LBSR ADD16U ; 32-bit result on parameter stack should be $0000E023
LEAU 2,U ; drop high part (could be optimized out).
LDD #$8765
PSHU D
LBSR ADD16S ; result on parameter stack should be $FFFF6788 (and carry set)
PULU D,X
STX -6,Y
STD -8,Y
LDD #$A5A5
PSHU D
LBSR SUB16SI ; result in 2nd variable should be FFFF0D2D (Carry set)
LDD -6,Y
STD <FINAL+2
LDD -8,Y
STD <FINAL
PULS Y
RTS
*
*
***
* Stack at START:
* (what BIOS/OS gave us) <= SP (S)
***
* (who knows?) <= FP (Y)
***
*
***
* Return stack will always be in pairs:
* [RETADRNN ]
* [CALLERFMNN]
*
* Return stack after initialization:
* [STKUNDR ]
* [<EMPTYP>]
* [STKUNDR ]SSTKBAS <= RSP
*
* Return stack after link:
* [STKUNDR ]
* [<EMPTYP>]
* [STKUNDR ]SSTKBAS
* [FRMPTRm1==<EMPTYP>] <= RSP
*
* Parameter stack after initialization, mark:
* [<unknown>] <= PSP,FP==<EMPTYP>
*
START LBSR INISTKS
PSHS U ; link
TFR U,Y ; mark in Y (will often not be used).
*
LBSR MAIN
*
*
DONE NOP
ERROR NOP ; define error labels as something not DONE, anyway
STKUNDR NOP
LDS <SSAVE ; restore the monitor stack pointer
LDU <USAVE ; restore U
LDD <DPSAVE ; restore the monitor DP last
TFR A,DP
SETDP 0 ; For lack of a better way to set it.
NOP
NOP ; landing pad to set breakpoint at
NOP
NOP
JMP [$FFFE] ; alternatively, jmp through reset vector
*
* Anyway, if running in EXORsim, after RESETting,
* Ctrl-C should bring you back to EXORsim monitor,
* but not necessarily to your program in a runnable state.
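To compare the two disciplines without an assembler handy, a loose Python model of the split-stack call may be useful. It is only a sketch under assumptions, not a translation of the listing: `SplitStacks`, `call`, and `add16u` are hypothetical names, stack cells are 16-bit list entries rather than bytes, and the frame mark is a list index rather than an address. What it tries to show is that return addresses and saved marks live on one stack while parameters and results live on the other, so a routine like ADD16U can overwrite its parameters with its result without ever touching a return address.

```python
class SplitStacks:
    """Hypothetical model: rstack plays S (return stack), pstack plays U (parameter stack)."""

    def __init__(self):
        self.rstack = []    # return addresses and saved frame marks only
        self.pstack = []    # parameters, locals and results only; end of list is the top
        self.mark = 0       # frame mark (Y in the listing), here an index into pstack

    def call(self, func, return_addr):
        self.rstack.append(return_addr)   # LBSR pushes the return address on S
        self.rstack.append(self.mark)     # PSHS Y: link
        self.mark = len(self.pstack)      # TFR U,Y: mark
        func(self)
        self.mark = self.rstack.pop()     # PULS Y: restore the caller's mark
        return self.rstack.pop()          # RTS pops the return address


def add16u(stacks):
    """Replace two 16-bit parameters with a 32-bit sum, high word on top."""
    right = stacks.pstack[-1]
    left = stacks.pstack[-2]
    total = left + right
    stacks.pstack[-2] = total & 0xFFFF    # low word where left was (STD 2,Y)
    stacks.pstack[-1] = total >> 16       # high word where right was (STD ,Y)


stacks = SplitStacks()
stacks.pstack += [0x1234, 0xCDEF]         # PSHU D,X: left, then right on top
returned_to = stacks.call(add16u, "RETADR1")
high = stacks.pstack.pop()                # the high word comes off first
low = stacks.pstack.pop()
print(f"{high:04X}{low:04X}", returned_to, stacks.rstack)
```

In the single-stack version the same parameters sit directly above the return address and saved frame pointer, which is why that callee works at fixed offsets past them instead of reusing the parameter slots for its result.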
As a reminder, we've already seen what this kind of code looks like without stack frames.
Sometime relatively soon, I'm going to try to get this example converted to the 6801. It should be concrete enough. But it has a lot of pointer manipulation done the hard way in it, so I'm going to do another chapter on address math first. If you aren't interested in long sequences of INX and DEX, go ahead and move on to
getting numeric output in binary.