Friday, November 1, 2024

ALPP 02-22 -- Some Address Math for the 6809

(Title Page/Index)

Maybe it feels like going around in circles, but address math is so important that I think I should show you explicit 6809 counterparts to the utility address math routines I've shown you for the 6801 and for the 6800.

When instructions become more general, they often take more bytes to encode. And when you generalize an operation, it often takes more instructions to implement -- even on a CPU with a more powerful instruction set. And the more you repeat those multiple instructions, the more opportunity you have to make mistakes.

This is why we define utility routines like we just looked at for the 6800 and 6801.

But in practice, on the 6809, you usually don't need such routines. The instructions themselves do most of the work.

To get a sense of how the size is affected in real code, you will want to compare these examples I give to the concrete examples I have given -- and give later -- for the other processors.

As much as having actual instructions to do the work for you improves things, the more important improvement is eliminating almost all need for pseudo-registers that have to be managed when switching processes.

Remember to read the code and the comments in the code, and open up separate browser windows to compare side-by-side with the 6800 and 6801. Reading code is important.

Let me say it again: 

No need for pseudo-registers on either the 6809 or 68000!

Unless you really want to synthesize a third stack or something on the 6809. 

Almost -- That's modulo per-process global variables, depending on how you handle them. And modulo some use of stack as temporaries instead of pseudo-registers, because stack is just a better place for temporaries, and is so easily accessed on the 6809.

Let's look at the 6809 code. 

You'll (hopefully) notice that mapping the abstract operations to the 6809 works out somewhat differently than for the 6800 and 6801. So I'm showing the 6809 code in a single block and relying more on comments in the code. The order of presentation is roughly the same, so it should be easy enough to find what to compare with what.

One of the reasons I demonstrate an alternate way to NEGate the Double accumulator is to demonstrate a very useful way to use the stack to avoid using temporary variables in memory. (I guess I need to go back and make this explicit in the 6801 and 6800 address math chapters.)
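The general shape of that stack-as-temporary idiom, as a minimal sketch (the operations chosen here are my own illustrations rather than any of the routines in this chapter, and are untested):

```asm
* Add B to A without a temporary variable in memory:
	PSHS	B	; park B on the stack
	ADDA	,S+	; A = A + B, and the stack is clean again
* Compare X with D the same way (16 bits, so pop with ,S++):
	PSHS	D	; park D on the stack
	CMPX	,S++	; flags reflect X - D, S is back where it started
```

The push and the auto-incrementing operand balance each other, so there is nothing to allocate, nothing to clean up, and nothing to save when switching processes.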

Do not miss the fact that the 6809 has four indexable registers, and all the address math instructions work for all four indexable registers -- where the routines may not! Where I say in-line, that means just use the instructions rather than calling the routines.  

[JMR202411070913 addendum:]

I don't think I've explained the "here pointer" symbol and idiom yet:

ESPHIB	EQU	*

In Motorola assemblers, an asterisk where the assembler expects an address means the location of the current instruction or directive, thus, "here". I will have to explain it further later.
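A minimal sketch of the idiom in use, assuming a typical Motorola-style assembler (TABLE, TABEND, and TABSIZ are hypothetical names for illustration):

```asm
TABLE	FCB	1,2,3,4	; four bytes of data
TABEND	EQU	*	; "here" is the address just past the table
TABSIZ	EQU	TABEND-TABLE	; table size in bytes, computed by the assembler
```

Because the assembler does the subtraction, TABSIZ stays correct when you add or remove entries.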

[JMR202411070913 addendum end.]

(If you're wondering, fix the mnemonics for the required register -- LEAX for X, LEAY for Y, LEAU for U, LEAS for S, etc. And don't forget the addressing mode index registers. And, no, don't include the RTS at the end when you're inserting the code in-line. 8-/ I know you caught all that, but some people just copy-and-paste without thinking.)
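For instance, a minimal sketch of the same address math in-lined on the other index registers (the choice of registers and offsets here is arbitrary, for illustration only):

```asm
* In-line signed byte offset on Y:
	LEAY	B,Y	; Y = Y + sign-extended B
* In-line 16-bit offset on U:
	LEAU	D,U	; U = U + D
* In-line unsigned byte offset on S:
	CLRA		; zero extend B into A, destroys A
	LEAS	D,S	; S = S + D, no return address to juggle
```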

* 6809 pointer math
*	ORG	$80
*	...
*XOFFA	RMB	1 ; don't need these at all
*XOFFB	RMB	1
*XOFFSV	RMB	2
*	...

	ORG	SOMETHING
* All of these work fine in-line, rather than called as subroutines
*
* Two ways to negate D on the 6809:
NEGD	COMA		; 6800 version -- still no NEGD
	NEGB            ; and sign extending doesn't help.
	BNE	NEGDX	; or BCS. but BNE works -- extends 0
	INCA
NEGDX	RTS
*
NEGDS	PSHS	D	; slightly slower, uses stack
	LDD	#0
	SUBD	,S++
	RTS
*
* Unsigned byte offset
* Absolutely should in-line. X only.
ADDBX	ABX	; X only
	RTS
*
* For unsigned byte offset other than X, zero extend B into A
* Destroys A.
* Should in-line for Y or U. Should use ABX for X. Must in-line for S.
ADDBY	CLRA	; for Y/U/S, zero extend B for unsigned offset
	LEAY	D,Y
	RTS
*
* Signed byte offset
* Should in-line for X, Y or U. Must in-line for S.
ADSBX	LEAX	B,X	; sign extended B, Y/U/S also
	RTS
*
* Signed byte offset, subtracted (note: B = -128 cannot be negated in 8 bits)
* Should in-line for X, Y or U. Must in-line for S.
SBSBX	NEGB		; signed subtract B, Y/U/S also
	LEAX	B,X
	RTS
*
* Unsigned byte offset, subtracted, zero extend into A
* Destroys A
* Could in-line for X, Y or U. Must in-line for S.
SUBBX	CLRA	; B is unsigned, therefore positive
* 16-bit offset, must in-line for S.
SUBDX	COMA		; no NEGD
	NEGB
	BNE	ADDDX	; or BCS. but BNE works -- extends
	INCA
* 16-bit offset, must in-line for S
ADDDX	LEAX	D,X	; Y/U/S also
	RTS

* Alternatively, use D for explicit subtraction
* Here as an example of math that can be done,
* probably not as a useful subroutine.
SUBBXS	CLRA	; B is unsigned, destroys A
SUBDXS	PSHS	D	; for subtraction
	EXG	X,D	; X to subtract, save D
	SUBD	,S++	; do the subtraction
	EXG	X,D	; Offset result to X, restore D
	RTS

* No particular reason to try to use ABX in signed byte offset.
* This is a solution to a puzzle, not useful code.
* You don't really want to do this.
ADDSBX	TSTB
	BPL	ADDSBXA
	LEAX	B,X	; Absolutely no reason not to use this in the first place.
	RTS
ADDSBXA	ABX
	RTS

*************
* For S stack
* As mentioned above, just in-line the LEAS.
* These are also provided as a solution to a puzzle,
* not as useful code.
*
* Signed byte offset
ADSBS	PULS	X	; get return address, restore stack address
	LEAS	B,S	; you really could just in-line this.
	JMP	,X	; return via X
*
* Unsigned byte offset, zero extend A, destroys A, X
ADDBS	CLRA		; just in-line the CLRA and the LEAS D,S
* 16-bit offset
ADDDS	PULS	X	; get return address, restore stack address
	LEAS	D,S
	JMP	,X	; return
*
* Do you really want to do this?
* Unsigned byte offset, zero extend into A, destroys A
SUBBS	CLRA
SUBDS	COMA
	NEGB
	BNE	ADDDS	; or BCS. but BNE works -- extend
	INCA
	BRA	ADDDS	; let ADDDS handle the return address and the math
* Do the math in D for explicit subtraction
* No more useful than the rest of this for X.
* Here just as an example of math that can be done.
SUBBSS	CLRA	; B is unsigned, destroys A
SUBDSS	LDX	,S	; get return address
	STD	,S	; save D
	TFR	S,D	; get S without endangering the stack
	ADDD	#2	; adjust for having D on the stack
	SUBD	,S	; finally subtract the offset
* Alternative 1, leaves D destroyed
	TFR	D,S	; update stack pointer
	JMP	,X	; return via X
* Alternative 2, restores offset in D
	PSHS	D	; working realllllly hard not to destroy D.
	LDD	2,S	; got the offset
	LDS	,S	; update S
	JMP	,X

* INX and DEX trains and INS and DES trains are meaningless.
* HOWEVER, just to remind ourselves:
* (And all of these work for Y and U, too but IN-LINE them!!)
* (They work for S if in-lined, as well.)
ADD16X	LEAX	16,X
	RTS
ADD14X	LEAX	14,X
	RTS
SUB16X	LEAX	-16,X
	RTS
* Etc. In-line these.
INX	LEAX	1,X	; Sigh. In-line it. Do not make trains with it. Please.
	RTS
DEX	LEAX	-1,X	; See INX. In-line it. Do not make trains with it. PLEASE.
	RTS
*
* More solutions to puzzles.
* If you called these, you would have to juggle the return address as shown.
* You don't want to do that.
* Just in-line the LEAS instructions.
* Then there's no return address to juggle, no messing with X.
* DO NOT USE THIS CODE other than examples of silly walks.
ADD16S	PULS	X
	LEAS	16,S
	JMP	,X
* etc.
* Could all be replaced with just LEAS	16,S; in-line!
* That's actually cheaper than just the instruction JSR!!!


* And stacks restricted within page boundaries make no sense at all on the 6809.
* Pseudo-register somewhere in DP:
QSP	RMB	2	; a synthetic stack pointer Q
	...
	ORG	SOMETHING
	RMB	4	; buffer zone
QSTKLIM	RMB	64
QSTKBAS	RMB	4	; buffer zone
SSTKLIM	RMB	32
SSTKBAS	RMB	4	; buffer zone
	...

* signed B for synthetic stack:
ADBQSP	LDX	QSP
ADBQSX	LEAX	B,X	; does the whole pointer, negatives, too
	STX	QSP
	RTS
*
* unsigned B and D for synthetic stack:
ADUQSP	CLRA		; unsigned B entry point
ADDQSP	LDX	QSP
ADDQSX	LEAX	D,X	; does the whole pointer, negatives, too
	STX	QSP
	RTS
*
* Choose whether you want to negate D or move it around, and see above.
* Or just decide you can add a negative instead of subtracting
*
* Destroys A
SBSQSP	SEX	; sign extend B into A (Yes, that's the mnemonic.)
	BRA	SBDQSP
SBUQSP	CLRA	; B is unsigned, therefore positive
* 16-bit offset
SBDQSP	COMA		; no NEGD
	NEGB
	BNE	ADDQSP	; or BCS. but BNE works -- extends
	INCA
	BRA	ADDQSP	; ADDQSP loads X from QSP before the add

* Alternatively, use D for explicit subtraction
SBSQSPS	SEX	; sign extend B into A (Yes, that's the mnemonic.)
	BRA	SBDQSPS
SBUQSPS	CLRA	; B is unsigned, destroys A
SBDQSPS	PSHS	D	; for subtraction
	LDD	QSP	; Get things in the right place
	SUBD	,S++	; do the subtraction
	STD	QSP	; update
	RTS

* More stuff that there is no reason to do.
* Just in-line the LEAS B,S
ADBSP	PULS	X	; return address
	LEAS	B,S	; signed B, but full 16-bit address math.
	JMP	,X
*
* Just in-line the LEAS D,S
* D for return stack (but we saw this above):
ADDSP	PULS	X	; return address
	LEAS	D,S
	JMP	,X
*
* Just in-line the NEGB and LEAS B,S. Still cheaper than the call.
* signed B for return stack:
SBBSP	PULS	X	; return address
	NEGB
	LEAS	B,S	; full 16-bit address math
	JMP	,X	; return via X, the return address is no longer on the stack
*
* This one might be worth a routine for,
* if you actually have to do it.
* D for return stack (but we saw this above):
SBDSP	PULS	X	; return address
	COMA
	NEGB
	BNE	SBDSPM
	INCA
SBDSPM	LEAS	D,S
	JMP	,X
* or
SBDSPS	LDX	,S	; return address
	STD	,S	; offset
	TFR	S,D
	ADDD	#2	; adjust it
	SUBD	,S
	TFR	D,S
	JMP	,X

As you can see, the 6809 handles almost all the address math you need without subroutines.

Uhm, until we get to arrays, but let's not do that yet.

[JMR202411031752 correction:]

In the comments to the code, I suggested (or asserted?) that there would be no reason on the 6809 to allocate a stack entirely within a single page so that the stack pointer math would never overflow, and the increment and decrement could be handled with the INC and DEC instructions only, ignoring overflow.

On my way to bed last night, I realized that would not entirely be true.

Pointer variables in the direct page cannot be indirected without loading the variable into an index register. So if your top of stack pointer is process local, there would be no point in not using the auto-inc/dec modes and LEA instructions to do the index updates.

But if the synthesized stack or queue is global to all processes (such as a system resource allocation stack or queue), it may be reasonable to use absolute (extended mode) addressing, in which case memory indirection is available. In that case, it may be completely sensible to use the optimization of no-overflow INC or DEC in a stack or queue allocated entirely within a single page:

* A synthetic stack contained entirely in a page,
* using absolute (extended mode) addressing:
	ORG	$400	; anywhere that ESPLOB to ESPHIB-1 are all within a page
ESPLOB	RMB	4	; bumper, lowest related address
ESPLIM	RMB	64	; 32 2-byte items possible on stack
ESPBAS	RMB	4	; bumper
ESPHIB	EQU	*	; highest related address (plus 1)
	...
ESP	RMB	2	; only the low byte will change
	...
EPSHD	DEC	ESP+1	; stack all within a page!
	DEC	ESP+1	; no carry
	STD	[ESP]	; indirection
	RTS
*
EPOPD	LDD	[ESP]	; indirection
	INC	ESP+1	; stack all within a page!
	INC	ESP+1	; no carry
	RTS
*
ADDBESP	ADDB	ESP+1	; signed
	STB	ESP+1	
	RTS
*
SUBBESP	PSHS	B	; unsigned
	LDB	ESP+1	; only the low byte changes
	SUBB	,S+	; subtract the offset, clean the stack
	STB	ESP+1
	RTS
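The routines above assume ESP already points into the stack area. A minimal sketch of setting it up, assuming the stack grows downward from ESPBAS as EPSHD implies (EINIT is a hypothetical label, and like the rest, this is untested):

```asm
* Hypothetical initialization for the page-contained stack:
EINIT	LDD	#ESPBAS	; empty stack: first push lands at ESPBAS-2
	STD	ESP	; high byte is set once and never changes
	RTS
* Hypothetical usage:
	JSR	EINIT
	LDD	#$1234
	JSR	EPSHD	; $1234 is now on the synthetic stack
	CLRA
	CLRB
	JSR	EPOPD	; D is $1234 again
```

Nothing here checks for overflow or underflow. The bumpers only buy a little slack, so limit checks would still be the caller's job.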

Hopefully, I can devote a chapter or three to giving this proper treatment somewhere down the road.

[JMR202411031752 correction end.]

Oh, and I have mentioned, I think, the DP register, and how it isn't as fully supported as I'd have liked.

The DP can be used as a base for per-process global variables (in other words, variables local to the process, but globally/statically allocated within the process). I discussed this to a certain extent in the 6800 addressing math chapter.

* On the 6800 or 6801, this would be referenced by a process-local
* LOCALBASE or similar pseudo-register, which I almost forgot to talk about.
* How to get the effective address of a variable in DP:
* Instead of 
*	LEAX	<VAR
* or
*	LEAX	VAR,DP
* or even 
*	LEAX	VAR-DPBASE,DP
* which we do not have in the 6809,
* we can do this --
*
* Given 
	ORG	$nn00		; even 256-byte page address
	SETDP	$nn
DPBASE	EQU	*
*	...
VAR	RMB	m
*
* In-line snippets --
* For variable VAR within 256 bytes of DPBASE:
	...
	LDB	#VAR-DPBASE	; put the offset in DP in B (unsigned)
	TFR	DP,A		; pull the base address high byte into A
	TFR	D,X		; move it to X
	...
*
* Using DP when VAR is 256 bytes or more away from DPBASE:
	...
	TFR	DP,A		; pull the base address high byte into A
	CLRB			; make the full base address
	ADDD	#VAR-DPBASE	; add the offset
	TFR	D,X		; move it to X
	...
*
* Or, if the assembler lets us split the offset up with advanced math:
	...
	TFR	DP,A
	LDB	#(VAR-DPBASE)&$FF	; bit-and mask -- no carry!
	ADDA	#(VAR-DPBASE)/$100	; add the high byte
	TFR	D,X
	...
* 
* As subroutines --
* unsigned offset in B:
LEADPUX	TFR	DP,A		; pull the base address high byte into A
	TFR	D,X		; move it to X
	RTS
*
* unsigned offset in D:
LEADPDX	PSHS	D		; save the offset on the stack
	TFR	DP,A		; pull the base address high byte into A
	CLRB			; make the full base address
	ADDD	,S++		; add the offset, clean the stack
	TFR	D,X		; move it to X
	RTS
*
* Because DP is not in the index post-byte,
* in some applications, it may be better to keep 
* LOCBAS as a pseudo-register,
* in which case it would look like this --
* for small offsets < 128: 
ADDLBB	LDX	<LOCBAS	; but do this in-line!
	LEAX	B,X
	RTS
* for 127 < offset < 256, maybe, maybe not:
ADDLBU	CLRA		; unsigned offset
* for larger offsets
ADDLBD	LDX	<LOCBAS	; and definitely do this in-line, too!
	LEAX	D,X
	RTS	
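When the offset is a constant known at assembly time, in-lining can go one step further and fold the offset into the addressing mode. A minimal sketch, with VAROFF as a hypothetical offset constant (untested, like the rest):

```asm
VAROFF	EQU	10	; hypothetical offset of a variable from the local base
	LDX	<LOCBAS	; pseudo-register holds the base address
	LDD	VAROFF,X	; fetch the variable directly, no LEAX needed
	LEAY	VAROFF,X	; or take its address in Y, leaving X as the base
```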

As with the previous two chapters, I have not tested the code. It should run, modulo typos.

Even though I keep saying things like "in-line this", and "you don't need that", it may be hard to visualize the impact that 6809 addressing modes have on addressing math until we compare the stack frame code for the 6800 and 6801 to the stack frame code for the 6809.

Likewise the 68000. But let's get an overview of addressing math on the 68000 before we take a look at a concrete example of stack frames on the 6801. And on our way to addressing math on the 68000, let's take a detour for multi-byte negation on the 6809.

Or you can jump ahead to getting numeric output in binary.

