Some Address Math for the 6809
Maybe it feels like going around in circles, but address math is so important that I think I should show you explicit 6809 counterparts to the utility address math routines I've shown you for the 6801 and for the 6800.
When instructions become more general, they often take more bytes to encode. And when you generalize an operation, it often takes more instructions to implement -- even with a more powerful instruction set CPU. And the more you repeat those multiple instructions, the more opportunity you have to make mistakes.
This is why we define utility routines like we just looked at for the 6800 and
6801.
But in practice, on the 6809, this usually turns out not to be the case: most of the time a single instruction does the address math for you.
To get a sense of how the size is affected in real code, you will want to compare these examples I give to the concrete examples I have given -- and give later -- for the other processors.
As much as having actual instructions to do the work for you improves things,
the more important improvement is eliminating almost all need for
pseudo-registers that have to be managed when switching processes.
Let me say it again:
No need for pseudo-registers on either the 6809 or 68000!
Unless you really want to synthesize a third stack or something on the 6809.
Almost -- That's modulo per-process global variables, depending on how you
handle them. And modulo some use of stack as temporaries instead of
pseudo-registers, because stack is just a better place for temporaries, and is
so easily accessed on the 6809.
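As a small sketch of what using the stack for temporaries looks like (untested, like the rest of the code in this chapter, and not taken from the listing below), here are two in-line fragments that would need a pseudo-register or scratch variable on the 6800:

```asm
* D = D + X without a temporary variable in memory:
 PSHS X ; park X on the S stack
 ADDD ,S++ ; add it to D and drop it from the stack in one step
* Compare X with D the same way (only the flags change):
 PSHS D
 CMPX ,S++ ; flags from X - D, and the stack is clean again
```

The auto-increment in `,S++` is what makes this cheap: the temporary disappears as part of the instruction that uses it.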
Let's look at the 6809 code.
You'll (hopefully) notice that mapping the abstract operations to the 6809 works out somewhat differently than for the 6800 and 6801. So I'm showing the 6809 code in a single block and relying more on comments in the code. The order of presentation is roughly the same, so it should be easy enough to find what to compare with what.
One of the reasons I demonstrate an alternate way to NEGate the Double
accumulator is to demonstrate a very useful way to use the stack to avoid
using temporary variables in memory. (I guess I need to go back and make this
explicit in the 6801 and 6800 address math chapters.)
Do not miss the fact that the 6809 has four indexable registers, and all the address math instructions work for all four indexable registers -- where the routines may not! Where I say in-line, that means just use the instructions rather than calling the routines.
[JMR202411070913 addendum:]
I don't think I've explained the "here pointer" symbol and idiom yet:
ESPHIB EQU *
In Motorola assemblers, an asterisk where the assembler could parse an address means the location of the current instruction or directive, thus, "here". I will have to explain it further later.
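A minimal illustration of the idiom, with made-up labels (TABLE, TABEND, TABLEN and HANG are hypothetical, not part of the code in this chapter):

```asm
TABLE FCB 1,2,3,5,8 ; five bytes of data
TABEND EQU * ; "here" is now the first address past the table
TABLEN EQU TABEND-TABLE ; the assembler works out 5
HANG BRA * ; branch to "here", so it spins in place
```

Details such as label length limits and comment syntax vary between assemblers, so treat this as a sketch.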
[JMR202411070913 addendum end.]
(If you're wondering, fix the mnemonics for the required register -- LEAX for X, LEAY for Y, LEAU for U, LEAS for S, etc. And don't forget the addressing mode index registers. And, no, don't include the RTS at the end when you're inserting the code in-line. 8-/ I know you caught all that, but some people just copy-and-paste without thinking.)
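To make that concrete, here is roughly what a few of the in-line forms look like for registers other than X. This is a sketch only, untested like the rest, and it just reuses instructions that appear in the listing below:

```asm
* Unsigned byte offset on Y, in-line (destroys A):
 CLRA ; zero extend B into D
 LEAY D,Y
* Signed byte offset on U, in-line:
 LEAU B,U ; B is sign extended
* 16-bit offset on S, in-line, so no return address to juggle:
 LEAS D,S
* Constant offsets need no accumulator at all:
 LEAS -4,S ; allocate four bytes on the stack
 LEAY 2,Y ; step Y to the next 16-bit item
```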
* 6809 pointer math
* ORG $80
* ...
*XOFFA RMB 1 ; don't need these at all
*XOFFB RMB 1
*XOFFSV RMB 2
* ...
ORG SOMETHING
* All of these work fine in-line, rather than called as subroutines
*
* Two ways to negate D on the 6809:
NEGD COMA ; 6800 version -- still no NEGD
NEGB ; and sign extending doesn't help.
BNE NEGDX ; or BCS. but BNE works -- extends 0
INCA
NEGDX RTS
*
NEGDS PSHS D ; slightly slower, uses stack
LDD #0
SUBD ,S++
RTS
*
* Unsigned byte offset
* Absolutely should in-line. X only.
ADDBX ABX ; X only
RTS
*
* For unsigned byte offset other than X, zero extend B into A
* Destroys A.
* Should in-line for Y or U. Should use ABX for X. Must in-line for S.
ADDBY CLRA ; for Y/U/S, zero extend B for unsigned offset
LEAY D,Y
RTS
*
* Signed byte offset
* Should in-line for X, Y or U. Must in-line for S.
ADSBX LEAX B,X ; sign extended B, Y/U/S also
RTS
*
* Signed byte offset
* Should in-line for X, Y or U. Must in-line for S.
SBSBX NEGB ; signed subtract B, Y/U/S also
LEAX B,X
RTS
*
* Unsigned byte offset, zero extend A
* Destroys A
* Could in-line for X, Y or U. Must in-line for S.
SUBBX CLRA ; B is unsigned, therefore positive
* 16-bit offset, must in-line for S.
SUBDX COMA ; no NEGD
NEGB
BNE ADDDX ; or BCS. but BNE works -- extends
INCA
* 16-bit offset, must in-line for S
ADDDX LEAX D,X ; Y/U/S also
RTS
* Alternatively, use D for explicit subtraction
* Here as an example of math that can be done,
* probably not as a useful subroutine.
SUBBXS CLRA ; B is unsigned, destroys A
SUBDXS PSHS D ; for subtraction
EXG X,D ; X to subtract, save D
SUBD ,S++ ; do the subtraction
EXG X,D ; Offset result to X, restore D
RTS
* No particular reason to try to use ABX in signed byte offset.
* This is a solution to a puzzle, not useful code.
* You don't really want to do this.
ADDSBX TSTB
BPL ADDSBXA
LEAX B,X ; Absolutely no reason not to use this in the first place.
RTS
ADDSBXA ABX
RTS
*************
* For S stack
* As mentioned above, just in-line the LEAS.
* These are also provided as a solution to a puzzle,
* not as useful code.
*
* Signed byte offset
ADSBS PULS X ; get return address, restore stack address
LEAS B,S ; you really could just in-line this.
JMP ,X ; return via X
*
* Unsigned byte offset, zero extend A, destroys A, X
ADDBS CLRA ; just in-line the CLRA and the LEAS D,S
* 16-bit offset
ADDDS PULS X ; get return address, restore stack address
LEAS D,S
JMP ,X ; return
*
* Do you really want to do this?
* Unsigned byte offset, zero extend into A, destroys A
SUBBS CLRA
SUBDS COMA
NEGB
BNE ADDDS ; or BCS. but BNE works -- extend
INCA
BRA ADDDS ; let ADDDS handle the return address and the math
* Do the math in D for explicit subtraction
* No more useful than the rest of this for X.
* Here just as an example of math that can be done.
SUBBSS CLRA ; B is unsigned, destroys A
SUBDSS LDX ,S ; get return address
STD ,S ; save D
TFR S,D ; get S without endangering the stack
ADDD #2 ; adjust for having D on the stack
SUBD ,S ; finally subtract the offset
* Alternative 1, leaves D destroyed
TFR D,S ; update stack pointer
JMP ,X ; return via X
* Alternative 2, restores offset in D
PSHS D ; working realllllly hard not to destroy D.
LDD 2,S ; got the offset
LDS ,S ; update S
JMP ,X
* INX and DEX trains and INS and DES trains are meaningless.
* HOWEVER, just to remind ourselves:
* (And all of these work for Y and U, too but IN-LINE them!!)
* (They work for S if in-lined, as well.)
ADD16X LEAX 16,X
RTS
ADD14X LEAX 14,X
RTS
SUB16X LEAX -16,X
RTS
* Etc. In-line these.
INX LEAX 1,X ; Sigh. In-line it. Do not make trains with it. Please.
RTS
DEX LEAX -1,X ; See INX. In-line it. Do not make trains with it. PLEASE.
RTS
*
* More solutions to puzzles.
* If you called these, you would have to juggle the return address as shown.
* You don't want to do that.
* Just in-line the LEAS instructions.
* Then there's no return address to juggle, no messing with X.
* DO NOT USE THIS CODE other than examples of silly walks.
ADD16S PULS X
LEAS 16,S
JMP ,X
* etc.
* Could all be replaced with just LEAS 16,S; in-line!
* That's actually cheaper than just the instruction JSR!!!
* And stacks restricted within page boundaries make no sense at all on the 6809.
* Pseudo-register somewhere in DP:
QSP RMB 2 ; a synthetic stack pointer Q
...
ORG SOMETHING
RMB 4 ; buffer zone
QSTKLIM RMB 64
QSTKBAS RMB 4 ; buffer zone
SSTKLIM RMB 32
SSTKBAS RMB 4 ; buffer zone
...
* signed B for synthetic stack:
ADBQSP LDX QSP
ADBQSX LEAX B,X ; does the whole pointer, negatives, too
STX QSP
RTS
*
* unsigned B and D for synthetic stack:
ADUQSP CLRA ; unsigned B entry point
ADDQSP LDX QSP
ADDQSX LEAX D,X ; does the whole pointer, negatives, too
STX QSP
RTS
*
* Choose whether you want to negate D or move it around, and see above.
* Or just decide you can add a negative instead of subtracting
*
* Destroys A
SBSQSP SEX ; sign extend B into A (Yes, that's the mnemonic.)
BRA SBDQSP
SBUQSP CLRA ; B is unsigned, therefore positive
* 16-bit offset
SBDQSP COMA ; no NEGD
NEGB
BNE ADDQSP ; or BCS. but BNE works -- extends
INCA
BRA ADDQSP ; ADDQSP loads X from QSP before adding
* Alternatively, use D for explicit subtraction
SBSQSPS SEX ; sign extend B into A (Yes, that's the mnemonic.)
BRA SBDQSPS
SBUQSPS CLRA ; B is unsigned, destroys A
SBDQSPS PSHS D ; for subtraction
LDD QSP ; Get things in the right place
SUBD ,S++ ; do the subtraction
STD QSP ; update
RTS
* More stuff that there is no reason to do.
* Just in-line the LEAS B,S
ADBSP PULS X ; return address
LEAS B,S ; signed B, but full 16-bit address math.
JMP ,X
*
* Just in-line the LEAS D,S
* D for return stack (but we saw this above):
ADDSP PULS X ; return address
LEAS D,S
JMP ,X
*
* Just in-line the NEGB and LEAS B,S, Still cheaper than the call.
* signed B for return stack:
SBBSP PULS X ; return address
NEGB
LEAS B,S ; full 16-bit address math
JMP ,X ; return via X -- the return address is no longer on the stack
*
* This one might be worth a routine for,
* if you actually have to do it.
* D for return stack (but we saw this above):
SBDSP PULS X ; return address
COMA
NEGB
BNE SBDSPM
INCA
SBDSPM LEAS D,S
JMP ,X
* or
SBDSPS LDX ,S ; return address
STD ,S ; offset
TFR S,D
ADDD #2 ; adjust it
SUBD ,S
TFR D,S
JMP ,X
As you can see, the 6809 just basically does almost all the address math you need without subroutines.
Uhm, until we get to arrays, but let's not do that yet.
[JMR202411031752 correction:]
In the comments to the code, I suggested (or asserted?) that there would be no reason on the 6809 to allocate a stack entirely within a single page so that the stack pointer math would never overflow, and the increment and decrement could be handled with the INC and DEC instructions only, ignoring overflow.
On my way to bed last night, I realized that would not entirely be true.
Pointer variables in the direct page cannot be indirected without loading the variable into an index register. So if your top of stack pointer is process local, there would be no point in not using the auto-inc/dec modes and LEA instructions to do the index updates.
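For comparison, here is a sketch of the process-local case, with the pointer QSP kept in the direct page as in the listing above. The labels QPSHD and QPOPD are made up for illustration. I am assuming the stack grows downward, that QSP points at the top item, and that X is free to be destroyed. No limit checking is shown.

```asm
* Push D on the synthetic stack:
QPSHD LDX <QSP ; the pointer has to come into an index register anyway
 STD ,--X ; pre-decrement by 2 and store, full 16-bit math
 STX <QSP ; update the pointer
 RTS
*
* Pop D from the synthetic stack:
QPOPD LDX <QSP
 LDD ,X++ ; load and post-increment by 2
 STX <QSP
 RTS
```

Since the auto-decrement and auto-increment do full 16-bit address math, confining this stack to a single page buys nothing here.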
But if the synthesized stack or queue is global to all processes (such as a
system resource allocation stack or queue), it may be reasonable to use
absolute (extended mode) addressing, in which case memory indirection is
available. In that case, it may be completely sensible to use the optimization
of no-overflow INC or DEC in a stack or queue allocated entirely within a
single page:
* A synthetic stack contained entirely in a page,
* using absolute (extended mode) addressing:
ORG $400 ; anywhere that ESPLOB to ESPHIB-1 are all within a page
ESPLOB RMB 4 ; bumper, lowest related address
ESPLIM RMB 64 ; 32 2-byte items possible on stack
ESPBAS RMB 4 ; bumper
ESPHIB EQU * ; highest related address (plus 1)
...
ESP RMB 2 ; only the low byte will change
...
EPSHD DEC ESP+1 ; stack all within a page!
DEC ESP+1 ; no carry
STD [ESP] ; indirection
RTS
*
EPOPD LDD [ESP] ; indirection
INC ESP+1 ; stack all within a page!
INC ESP+1 ; no carry
RTS
*
ADDBESP ADDB ESP+1 ; signed
STB ESP+1
RTS
*
SUBBESP PSHS B ; unsigned
LDB ESP+1 ; low byte of the pointer
SUBB ,S+ ; subtract the offset, no borrow out of the page
STB ESP+1
RTS
Hopefully, I can devote a chapter or three to giving this proper treatment
somewhere down the road.
[JMR202411031752 correction end.]
Oh, and I have mentioned, I think, the DP register, how it isn't as fully supported as I'd have liked.
The DP can be used as a base for per-process global variables (in other words,
variables local to the process, but globally/statically allocated within the
process). I discussed this to a certain extent in the 6800 addressing math chapter.
* On the 6800 or 6801, this would be referenced by a process-local
* LOCALBASE or similar pseudo-register, which I almost forgot to talk about.
* How to get the effective address of a variable in DP:
* Instead of
* LEAX <VAR
* or
* LEAX VAR,DP
* or even
* LEAX VAR-DPBASE,DP
* which we do not have in the 6809,
* we can do this --
*
* Given
ORG $nn00 ; even 256-byte page address
SETDP $nn
DPBASE EQU *
* ...
VAR RMB m
*
* In-line snippets --
* For variable VAR within 256 bytes of DPBASE:
...
LDB #VAR-DPBASE ; put the offset in DP in B (unsigned)
TFR DP,A ; pull the base address high byte into A
TFR D,X ; move it to X
...
*
* Using DP when VAR is 256 bytes or more away from DPBASE:
...
TFR DP,A ; pull the base address high byte into A
CLRB ; make the full base address
ADDD #VAR-DPBASE ; add the offset
TFR D,X ; move it to X
...
*
* Or, if the assembler lets us split the offset up with advanced math:
...
TFR DP,A
LDB #(VAR-DPBASE)&$FF ; bit-and mask -- no carry!
ADDA #(VAR-DPBASE)/$100 ; add the high byte
TFR D,X
...
*
* As subroutines --
* unsigned offset in B:
LEADPUX TFR DP,A ; pull the base address high byte into A
TFR D,X ; move it to X
RTS
*
* unsigned offset in D:
LEADPDX PSHS D ; save the offset
TFR DP,A ; pull the base address high byte into A
CLRB ; make the full base address
ADDD ,S++ ; add the offset
TFR D,X ; move it to X
RTS
*
* Because DP is not in the index post-byte,
* in some applications, it may be better to keep
* LOCBAS as a pseudo-register,
* in which case it would look like this --
* for small offsets < 128:
ADDLBB LDX <LOCBAS ; but do this in-line!
LEAX B,X
RTS
* for 127 < offset < 256, maybe, maybe not:
ADDLBU CLRA ; unsigned offset
* for larger offsets
ADDLBD LDX <LOCBAS ; and definitely do this in-line, too!
LEAX D,X
RTS
As with the previous two chapters, I have not tested the code. It should run, modulo typos.
Even though I keep saying things like "in-line this", and "you don't need that", it may be hard to visualize the impact that 6809 addressing modes have on addressing math until we compare the stack frame code for the 6800 and 6801 to the stack frame code for the 6809.
Likewise the 68000. But let's get an overview of addressing math on the 68000
before we take a look at a concrete example of stack frames on the 6801. And
on our way to addressing math on the 68000, let's take
a detour for multi-byte negation on the 6809.