Thursday, November 28, 2024

ALPP 02-31 -- More Looking in the Rear-view Mirror -- Single-stack No Frame Example: 6801

More treasure from the bottom of the pool.

(Title Page/Index)

 

Not much to say that I haven't already said. We've seen frameless for the 6800, both the single-stack frameless discipline of one chapter back and the split-stack frameless discipline that we just finished. I half think I should leave the 6801, 6809, and 68000 versions as exercises for the interested reader, but I'm a sucker for easy puzzles, so I'll post them anyway. There are still plenty of things an interested reader can think of to try for him- or herself.

One thing to pay attention to as you go through is that I have left the utility routines out. With the 6801's 16-bit instructions (PSHX, ABX, LDD, ADDD, STD, and friends), doing the work in-line is not that much more code than a JSR, and I didn't want to hide what's going on. That's how much of an improvement the 6801 is over the 6800.

The down side of doing it in-line (by hand) is that there are more opportunities for mistakes.
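As a rough sketch of the difference (the labels P6800 and P6801 are made up for this illustration, and the 6800 half is just one plausible way to write it, not the code from the earlier chapters), here is the same small job done twice: push two 16-bit values, add them in-line, and drop them again.

```asm
* Sketch only: pushing and adding two 16-bit values,
* first with 6800 instructions only, then with the 6801 additions.
* P6800 and P6801 are hypothetical labels.
	OPT	6801
	ORG	$3000
* 6800 only: everything goes a byte at a time through A and B.
P6800	LDAB	#$34	; push $1234, low byte first
	PSHB
	LDAA	#$12
	PSHA
	LDAB	#$EF	; push $CDEF the same way
	PSHB
	LDAA	#$CD
	PSHA
	TSX		; X points at the last cell pushed
	LDAA	2,X	; left operand, high byte
	LDAB	3,X	; left operand, low byte
	ADDB	1,X	; add the low bytes first
	ADCA	0,X	; then the high bytes with carry
	INS		; drop both cells
	INS
	INS
	INS
	RTS		; A:B = $E023
* 6801: PSHX, LDD, and ADDD work on whole 16-bit cells.
P6801	LDX	#$1234
	PSHX
	LDX	#$CDEF
	PSHX
	TSX		; X points at the last cell pushed
	LDD	2,X	; left operand
	ADDD	0,X	; right operand
	INS		; drop both cells
	INS
	INS
	INS
	RTS		; D = $E023
```

Once the 6801 version is this short, hiding it behind a JSR to a utility routine saves very little.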

Go ahead and read the code and compare, and if you are not sure you understand what's going on, single-step through the code.

* 16-bit addition as example of single-stack no frame discipline on 6801,
* with test code
* Joel Matthew Rees, October, November 2024
*
	OPT	6801
NATWID	EQU	2	; 2 bytes in the CPU's natural integer
*
*
* Blank line will end assembly.
	ORG	$80	; MDOS says this is a good place for usr stuff.
*
ENTRY	JMP	START
	NOP		; Just want even addressed pointers for no reason.
	NOP		; bumper
	NOP		; 6 bytes to this point.
SSAVE	RMB	2	; a place to keep S so we can return clean
	RMB	4	; bumper
* All of the pseudo-registers must be saved and restored on context switch,
* cannot be accessed during interrupt service.
XWORK	RMB	2	; For saving an index register temporarily
DWORK	RMB	2	; For saving D temporarily
LB_BASE	RMB	2	; For process local variables
HPPTR	RMB	2	; heap pointer (not yet managed)
HPALL	RMB	2	; heap allocation pointer
HPLIM	RMB	2	; heap limit
* End of pseudo-registers
	RMB	4	; bumper
GAP1	RMB	2	; Mark the bottom of the gap
*
*
*
	ORG	$2000	; Give the DP room.
LB_ADDR	RMB	4	; a little bumper space
FINAL	RMB	4	; 32-bit Final result in DP variable (to show we can)
FINALX	EQU	4
	RMB	4	; buffer
STKLIM	RMB	192	; roughly 16 to 20 levels of call
STKLIMX	EQU	FINALX+8
STKBAS	RMB	4	; for canary return
STKBASX	EQU	STKLIMX+192
STKFAK	RMB	2	; fake frame pointer, self-link
STKFAKX	EQU	STKBASX+4	; 6801 is post-dec (post-store-decrement) push
STKBMP	RMB	4	; a little bumper space
STKBMPX	EQU	STKFAKX+2	; But we are going to init S through X
*
* My assembler limits RMBs to $100 long, so we'll use a different way.
HBASE	RMB	1	; $1024 or something	; Not using or managing heap yet.
HBASEX	EQU	STKBMPX+4
*HLIM	RMB	4	; bumper
*HLIMX	EQU	HBASEX+$100	; 1024
*
*
	ORG	$3000
CDBASE	JMP	ERROR		; more bumpers
	NOP
INISTK	LDX	#LB_ADDR	; set up process local space
	STX	LB_BASE		; local space functional
	LDD	LB_BASE		; bootstrap own stack
	ADDD	#STKBASX
	STD	XWORK	; avoid using BIOS stack
	LDX	XWORK	; ready own stack pointer
*
	PULA		; pop real return address
	PULB
	STS	SSAVE	; save stack pointer from monitor ROM
	TXS		; move to our own stack (let TXS convert it)
	PSHB		; put return address on own stack
	PSHA		; stack now ready for interrupts, utility routines
*
	LDD	#STKUNDR
	STD	0,X	; in the cell beyond empty stack pointer
	STD	2,X	; and the next cell, for good measure
*
	LDD	LB_BASE	
	ADDD	#HBASEX
	STD	HPPTR		; as if we were ready to use heap
	STD	HPALL
	LDD	#CDBASE
	SUBD	#4
	STD	HPLIM
	RTS		; finally done, now can return
*
***
* Not generating a stack frame
*
* Cross-section of general stack structure in called routine:
* [{LOCVAR}] for calling routine
* [{TEMP}  ] for calling routine
* [PARAM   ] from calling routine
* [RETADR  ] to calling routine
* [LOCVAR  ] for called -- current -- routine
* [TEMP    ] for called -- current -- routine
* [(PARAM) ] to be passed to a further call
*
* Broader cross-section, showing nesting for routine 3, in-flight:
* [RETADR1 ] 
* [LOCVAR2 ]
* [TEMP2   ]
* [PARAM3  ]
* [RETADR2 ]
* [LOCVAR3 ]
* [TEMP3   ]
* [(PARAM4)] <= SP (return stack pointer (6800 S is byte below))
***
*
***
* Utility routines left out
*
* Let the caller do allocation after.
*
* Stack at entry, before allocation
* when functions are called by MAIN
* with two 32-bit parameters
* We will return result in D:X
* [STKUNDR ]
* [STKUNDR ]STKBAS
* [RETADR0 ] 
* [32:VAR1_1]
* [32:VAR1_2]
* [PARAM2_1]
* [PARAM2_2]
* [RETADR1 ] <= SP (return stack pointer (6800 S is byte below))
*
* Signed 16 bit add to 32 bit result
* Handle sign overflow without losing precision.
* input parameters:
*   16-bit left 1st pushed, right 2nd
* output parameter:
*   17-bit sum in 32-bit D:X D high, X low
* Does not alter the parameters.
ADD16S	TSX		; no local allocations
	LDAA	#(-1)	; prepare for sign extension
	TST	4,X	; the left-hand operand sign bit
	BMI	ADD16SR
	CLRA		; zero extend
ADD16SR	PSHA		; push left extension
	PSHA		; left sign cell below X now
	LDAA	#(-1)	; reload
	TST	2,X	; the right-hand operand sign bit
	BMI	ADD16SL
	CLRA		; zero extend
ADD16SL	PSHA		; push right extension
	PSHA
	TSX		; point to sign extensions (4 temporary bytes on stack)
	LDD	8,X	; left-hand low cell
	ADDD	6,X	; right-hand low cell
	STD	XWORK	; save low half of result
	LDD	2,X	; left-hand extension
	ADCB	1,X	; right-hand extension
	ADCA	0,X	; high half done
*
	INS		; fastest to just drop the temporaries
	INS
	INS
	INS
	LDX	XWORK	; get low half of result
	RTS		; result is in D:X
*
* Unsigned 16 bit add to 32 bit result
* input parameters:
*   16-bit left, right
* output parameter:
*   17-bit sum in 32-bit D:X D high
ADD16U	TSX		; no local allocations
	LDD	4,X	; left
	ADDD	2,X	; right
	STD	XWORK	; save low half
	LDD	#0
	ADCB	#0
*
	LDX	XWORK	; get low half of result
	RTS		; result is in D:X
*
* Etc.
*
***
*
* Stack at entry when functions are called by MAIN
* with a pointer parameter and an addend
* (no frame, no local variables)
* [STKUNDR ]
* [STKUNDR ]STKBAS
* [RETADR0 ] 
* [32:VAR1_1]
* [32:VAR1_2] 
* [PARAM2_1] (pointer)
* [PARAM2_2] (addend)
* [RETADR1 ] <= SP (return stack pointer (6800 S is byte below))
*
* To show how to access caller's local through pointer
* instead of walking stack --
* Add 16-bit signed parameter
* to caller's 32-bit internal variable.
* input parameters:
*   16-bit pointer to 32-bit integer
*   16-bit addend
* no output parameter
ADD16SI	TSX		; no local allocations up front
	LDAA	#(-1)
	TST	2,X	; high byte of parameter
	BMI	ADD16SIP
	CLRA
ADD16SIP	PSHA	; save the sign extension half (2 temporary bytes on stack)
	PSHA
	LDX	4,X	; get caller's pointer
	LDD	2,X	; caller's 2nd variable, low
	TSX
	ADDD	4,X	; parameter
	LDX	6,X	; caller's pointer
	STD	2,X	; save result low half away
	LDD	0,X	; caller's 2nd variable, high
	TSX
	ADCB	1,X	; sign extension half
	ADCA	0,X
	LDX	6,X	; caller's pointer
	STD	0,X	; save result high half away
*
	INS		; drop temporary 
	INS
	RTS		; no result to load
*
*
***
* Stack after local allocation
* [STKUNDR ]
* [STKUNDR ]STKBAS
* [RETADR0 ] 
* [32:VAR1_1]
* [32:VAR1_2] <= SP
*
MAIN	LDX	#0
	PSHX		; four pushes is only one byte more than a call. 
	PSHX
	PSHX
	PSHX
*
	LDX	#$1234	; parameters
	PSHX
	LDX	#$CDEF
	PSHX
	JSR	ADD16U	; result in D:X should be $E023
	INS		; could reuse instead of dropping
	INS
	INS
	INS
	PSHX		; low half
	LDX	#$8765
	PSHX
	JSR	ADD16S	; result in D:X should be $FFFF6788
	STX	XWORK
	STD	DWORK
	INS		; could reuse instead of dropping
	INS
	INS
	INS
	TSX
	LDD	XWORK
	STD	2,X
	LDD	DWORK
	STD	0,X
*	LDAB	#0	; calculate pointer
*	ABX		; would use ABX here if there were an offset.
	PSHX
	LDX	#$A5A5
	PSHX
	JSR	ADD16SI		; result in 2nd variable should be FFFF0D2D
	INS		; drop parameters
	INS
	INS
	INS
	TSX
	LDD	2,X		; low half
	LDX	LB_BASE		; store it in FINAL, in process local space
	STD	FINALX+2,X
	TSX
	LDD	0,X		; high half
	LDX	LB_BASE
	STD	FINALX,X
*
	TSX
	LDAB	#8
	ABX
	TXS
	RTS
*
*
***
* Stack at START:
* (what BIOS/OS gave us) <= SP
***
* (who knows?) <= FP
***
* (who knows?) <= VBP
***
*
* Stack after initialization:
* [STKUNDR ]
* [STKUNDR ]STKBAS <= SP
***
*
START	NOP
	JSR	INISTK
	NOP
*
	JSR	MAIN
*
DONE	NOP
ERROR	NOP	; define error labels as something not DONE, anyway
STKUNDR	NOP
	LDS	SSAVE	; restore the monitor stack pointer
	NOP
	NOP		; landing pad to set breakpoint at
	NOP
	NOP
	LDX	$FFFE	; alternatively, jmp through reset vector
	JMP	0,X
*
* Anyway, if running in EXORsim, after RESETting,
* Ctrl-C should bring you back to EXORsim monitor, 
* but not necessarily to your program in a runnable state.

If you've seen enough, binary output is still waiting. (And it will still be waiting in a few more hours or days, really.)

If not, split stack with no stack frames is also great on the 6801, even a bit better than what we saw here.

 -- 

Maybe this would be a good place to bring up (again?) the regret I have that Motorola didn't include an SBX (subtract B from X) instruction in the 6801. It would have been useful in the stack allocation code, as you can see from where I used (and didn't use) ABX. An add-immediate-to-index op-code (AIX?) would also have been useful, whether 16-bit to do both allocation and deallocation, or signed 8-bit, or unsigned 8-bit paired with a subtract-immediate-from-X (SIX?) instruction.
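To make the asymmetry concrete, here is a minimal sketch of allocating and then dropping a block of stack on the 6801 as it actually is. XWORK is the pseudo-register from the listing above. ALLOCN and the label DEMO are made up for the illustration. For small blocks, repeated PSHX (as in MAIN) is still shorter.

```asm
* Sketch only: allocate ALLOCN bytes of stack, then drop them.
* ALLOCN and DEMO are hypothetical names; XWORK is as in the listing.
	OPT	6801
	ORG	$80
XWORK	RMB	2	; scratch for moving X through D
	ORG	$3000
ALLOCN	EQU	8	; bytes to allocate
* Allocation: no SBX, so X has to go through memory to D and back.
DEMO	TSX		; X = S+1
	STX	XWORK
	LDD	XWORK
	SUBD	#ALLOCN	; 16-bit subtract in D
	STD	XWORK
	LDX	XWORK
	TXS		; S = X-1, now ALLOCN bytes lower
* Deallocation: ABX adds unsigned B to X, so it is short.
	TSX		; X = S+1
	LDAB	#ALLOCN
	ABX
	TXS		; S is back where it was at entry
	RTS
```

S only changes at the TXS instructions, so the stack itself stays consistent throughout. The detour does tie up XWORK, though, which is one of the pseudo-registers that has to be saved and restored on a context switch.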

Yeah, more daydreams. Sorry. --

