Sequentially Accessing a Simple List on the 6800, 6801, 6809, and 68000
Now that we've learned to treat that list of small integers as something like a list, let's access that list sequentially. Why? Because sometimes you want to work through a list in order.
We've done most of this before, so it should go quickly. (Right?)
Again, we are going to re-use source from the exercise we just finished, changing it slightly:
ENTRY   JMP     START
*
BYTTBL  FCB     8
        FCB     5
        FCB     2
        FCB     7
        FCB     4
*
START   LDX     #BYTTBL
        LDAB    0,X
        INX
        ADDB    0,X
        INX
        ADDB    0,X
        INX
        ADDB    0,X
        INX
        ADDB    0,X
        NOP
DONE    NOP
Don't you think that's a minor change?
Heh. Well, I have a reason for thinking so.
Can you guess what it's going to do? Can you guess what INX means?
INX is a mnemonic for INcrement X. Now you know. And now you may be guessing what it does.
Assemble and run it according to the patterns we've been using, referring to
the previous chapters if you can't remember details:
- Get a session of the 6800 version of EXORsim running with the --mon switch;
- start an (a)ssembly at $1000, or maybe $2000 this time;
- copy this source and paste it in to assemble it;
- (d)isassemble it from appropriate addresses to see what assembled;
- set a breakpoint at the landing pad NOP (the DONE label);
- turn (t)racing on;
- use the (c)ontinue command to run it from the ENTRY point. Or you can (s)tep
through each instruction, instead, if you want.
Confirm that the summation occurs correctly as before, and watch the value of
X as the CPU works through the code.
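If you want to know what to expect before you trace, here is a rough model of the same sequence in Python (not 6800 code). The names `memory`, `TABLE_ADDR`, and `run` are made up for this sketch. The table address $1003 is an assumption: it supposes you assembled at $1000, with the 3-byte JMP sitting in front of the table.

```python
# Rough model of the 6800 listing: B accumulates, X walks the table.
# TABLE_ADDR is an assumption (assembly at $1000, 3-byte JMP first).
TABLE_ADDR = 0x1003
TABLE = [8, 5, 2, 7, 4]


def run(table_addr=TABLE_ADDR, table=TABLE):
    """Return (final X, final B, trace of (X, B) after each access)."""
    memory = {table_addr + i: v for i, v in enumerate(table)}
    x = table_addr                      # LDX #BYTTBL
    b = memory[x]                       # LDAB 0,X
    trace = [(x, b)]
    for _ in range(len(table) - 1):
        x = (x + 1) & 0xFFFF            # INX (X is 16 bits wide)
        b = (b + memory[x]) & 0xFF      # ADDB 0,X (B is 8 bits wide)
        trace.append((x, b))
    return x, b, trace


if __name__ == "__main__":
    x, b, trace = run()
    for addr, total in trace:
        print(f"X=${addr:04X}  B=${total:02X}")
```

If the model matches the simulator, B should end at $1A (decimal 26), with X four bytes past where it started.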
Now, you may notice that inserting the INX instructions uses more code and
more processor cycles. But it maintains a pointer in X to the currently
interesting item in the list, which is what we are trying to do this time.
Because we can. And because there will be times when that's what we want to
do, and we will need to know how.
This is called the post-increment mode of addressing, since we INcrement X after each access.
Each access? Well, we don't have to increment after the last access, unless our algorithm requires the pointer treatment to be uniform. Right now, we just want to look at the difference, more than use these in some explicit algorithm.
Why is it called post-increment access mode instead of post-access increment mode? I'm not sure. Post-access incrementing is what we are doing. But mathematicians can use strange grammar sometimes. (And strange grammars, as well, but that's a whole 'nuther story.)
A partial explanation can be found in the terms "post-inc" and "pre-dec", which are common jargon in discussions of low-level computer programming. In both cases, the redundant word "access" has been deleted:
- post-access increment => post-access inc => post-inc
- pre-access decrement => pre-access dec => pre-dec
And, yes, the technical literature then re-uses the coinage, including in the less-used pre-inc and post-dec modes that we will visit later on.
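To pin down the word order, here is a small Python sketch of the two common modes, with a list standing in for memory. The helper names `load_post_inc` and `load_pre_dec` are invented for illustration. They are not instructions on any of these CPUs.

```python
# Post-inc and pre-dec modeled on a list standing in for memory.
# load_post_inc and load_pre_dec are made-up helper names.


def load_post_inc(memory, ptr):
    """Access first, then increment: returns (value, new pointer)."""
    value = memory[ptr]
    return value, ptr + 1


def load_pre_dec(memory, ptr):
    """Decrement first, then access: returns (value, new pointer)."""
    ptr -= 1
    return memory[ptr], ptr


if __name__ == "__main__":
    memory = [8, 5, 2, 7, 4]
    ptr = 0
    value, ptr = load_post_inc(memory, ptr)   # reads item 0, ptr moves to 1
    print(value, ptr)
    value, ptr = load_pre_dec(memory, ptr)    # ptr moves back to 0, reads item 0
    print(value, ptr)
```

Notice that a post-inc followed by a pre-dec lands on the same item again, which is part of why the two tend to be mentioned as a pair.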
We can deal with this, I think?
The 6801 is again the same for this, both the source code and the object. It will be a bit quicker -- slightly reduced cycle counts for the 6801 on both the indexed mode operators and the INX instruction. It's a bit of a rinse-and-repeat, but go ahead and check the simulation on the 6801, to keep the processes fresh in your head.
INX on the 6809
You might think it surprising and counter-intuitive, but the 6809 has no INX instruction.
WHAT?!? No INX?!?!?!?
So, of course, it has something better. So to speak.
Oh, it really is better, but if you start with a familiarity with the incrementing instructions, it might be easy to miss.
You've seen the LEA instruction already, in relation to position independent
addressing. Now see what you think of this:
LEAX 1,X
Load into index register X the address 1 past the current address in X.
(If you define this as a macro called INX, then you have an INX instruction
for the 6809, but I do not want to talk about macros yet. So just forget I
mentioned it, okay? ;-P)
Why? you ask, would a person want to invoke all the complexities of LEA just to increment X?
Why not? That's basically what the address calculation unit is for, isn't it? -- calculating addresses?
Here's a version for the 6809:
ENTRY   JMP     START
*
BYTTBL  FCB     8
        FCB     5
        FCB     2
        FCB     7
        FCB     4
*
START   LEAX    BYTTBL,PCR
        LDB     0,X
        LEAX    1,X     ; INX equivalent
        ADDB    0,X
        LEAX    1,X
        ADDB    0,X
        LEAX    1,X
        ADDB    0,X
        LEAX    1,X
        ADDB    0,X
        NOP
DONE    NOP
Give it a try.
Admittedly, LEAX 1,X on the 6809 is a 2-byte instruction, where INX on the 6800 and 6801 is a 1-byte instruction. And LEAX 1,X takes 5 clock cycles, where INX only takes 4 on the 6800 and 3 on the 6801.
So, ... Why?
Adding fancy instructions does take room in the op-code map. The index mode post-byte that the 6809 indexing modes require takes another cycle to process, too. And Motorola developed the 6801 after the 6809, so it benefits from some improvements in circuitry design and layout that Motorola picked up in-between.
It's a definite question why Motorola never improved the 6809, but what I heard from probably reliable sources was that Motorola didn't want the 6809 eating into the 68000's market.
And it would have. Initially.
At similar memory cycle timing, for primarily 8-bit stuff, the 6809 can be
faster than the 68000 -- if there is very little use of division and only
light use of multiplication of integers larger than 8 bits. And re-using 6809
source code on the 68000 can be kind of tricky, if you aren't using a
discipline like the one I will be demonstrating here once we get the basics
down. So upgrading from the 6809 to the 68000 is not a given.
Well, so then Motorola would have been stuck with customers wanting upgraded
6809s, like the customers that asked for upgraded 6800s that didn't require
conversion like the 6809 did, and management was worried about losing focus,
if I understand correctly.
It's a legitimate worry when you don't have a lot of talented engineers in the pool available to hire from. But ... what about the 6805 with its weird 8-bit index register? And that even weirder bit-serial 6804 that you probably will never hear about? Why then did Motorola do those?
Incidentally, the series that started with the 6805, from my inexpert survey,
was probably as profitable for Motorola as any other series, and was longer
lived than most, including the 68HC11. Both are still hiding in a lot of the
electronics you use every day. I want to do chapters for both of them at some
point, but I need to find decent open source/libre emulators and support
software. Or build my own.
(And there is a non-Motorola side-path I'm deliberately ignoring here. Maybe sometime later.)
Oh, there I go, getting distracted again.
So, yeah, LEAX 1,X does not seem to be a win.
But LEAX 2,X instead of two INX instructions? Two bytes, either way. One instruction and just 5 cycles on the 6809, vs. two instructions and 6 cycles total for the 6801, 8 cycles total for the 6800. That's a win. And the 6809 has more magic in that index post-byte, to cover for the more common not-a-win case of incrementing by 1:
ENTRY   JMP     START
*
BYTTBL  FCB     8
        FCB     5
        FCB     2
        FCB     7
        FCB     4
*
START   LEAX    BYTTBL,PCR
        LDB     ,X+
        ADDB    ,X+     ; INX implied
        ADDB    ,X+
        ADDB    ,X+
        ADDB    ,X      ; Following the 6800 code, no final increment
        NOP
DONE    NOP
Of course you're going to give this one a try and make sure it works as
advertised.
Is it a win?
In this source code, you take care of the post-access increment in the same
instruction as the load or add. The assembler source makes it clear that the
access is post-inc, whereas INX instructions are not always for post-inc
accesses. Win for clarity, also saves a few cycles on each access, vs. the
ADDB 0,X ; INX pair.
But converting 6800 source code to the 6809 becomes, through this kind of magic, shall we say, not straightforward. It will require allocating time for an engineer to work through the source, possibly adding bugs that will then need to be fixed. And much of the existing 6800 source code was definitely put together without any real discipline, and is thus hard to understand, and ...
In my way of thinking, that added engineering time could be considered an investment. More than one engineer will end up understanding the code. And it will be an opportunity to preemptively attack undiscovered bugs in the original code.
But even today (especially today?), management (and accounting and boards of
directors) do not seem to understand that kind of investment in intangible
capital.
If you're interested in comparing the op-codes and cycle counts of the processors, do a web search for the programming manuals. Look for something like
Motorola 6801 programming manual
You should find the PDFs on Bitsavers and the Internet Archive, and in the other
usual places. The 6809 even has an HTML manual on-line. (Thank you, Maddes and
the docs team.)
No, actually, even if you don't think you are all that interested in details
like byte and cycle counts, go get the manuals anyway. You need them. Don't
forget the 68000 manual.
INX on the 68000
Let's go back and get the 68000 source from the most recent example:
        EVEN
ENTRY   JMP     START
*
BYTTBL  DC.B    8       ; byte data doesn't have to be aligned.
        DC.B    5
        DC.B    2
        DC.B    7
        DC.B    4
*
        EVEN            ; But 68K code does have to be even aligned.
START   MOVE.L  #BYTTBL,A0
        MOVE.B  0(A0),D1
        ADD.B   1(A0),D1
        ADD.B   2(A0),D1
        ADD.B   3(A0),D1
        ADD.B   4(A0),D1
        NOP
DONE    NOP
* One way to return to the OS or other calling program
        clr.w   -(sp)   ; there should be enough room on the caller's stack
        trap    #1      ; quick exit
Let's try that with a 68000 equivalent of INX:
        OPT     LIST,SYMTAB     ; Options we want for the stand-alone assembler.
        MACHINE MC68000 ; because there are a lot the assembler can do.
        OPT     DEBUG   ; We want labels for debugging.
        OUTPUT
***********************************************************************
        EVEN
ENTRY   JMP     START
*
BYTTBL  DC.B    8       ; byte data doesn't have to be aligned.
        DC.B    5
        DC.B    2
        DC.B    7
        DC.B    4
*
        EVEN            ; But 68K code does have to be even aligned.
START   LEA     BYTTBL(PC),A0
        MOVE.B  (A0),D1
        ADDQ.L  #1,A0   ; INX equivalent
        ADD.B   (A0),D1
        ADDQ.L  #1,A0
        ADD.B   (A0),D1
        ADDQ.L  #1,A0
        ADD.B   (A0),D1
        ADDQ.L  #1,A0
        ADD.B   (A0),D1 ; No trailing INX.
        NOP
DONE    NOP
* One way to return to the OS or other calling program
        clr.w   -(sp)   ; there should be enough room on the caller's stack
        trap    #1      ; quick exit
So, do you see that? The 68000 actually has a direct equivalent of the INX instruction.
What is INX, really? It's an instruction to add 1 to the index register in the 6800/6801. If we had an ADDX instruction for the 6809, to ADD some source to the index register, INX would be the approximate equivalent of
ADDX #1 ; => INX (theoretical)
We don't have an ADDX on the 6809, but we do have something like it on the 68000. (We actually have an ADDX on the 68000, but that's something rather different, which we will discuss later.)
On the 68000, any address register could be our index register (equivalent of
the 6809/6800/6801 X). Since we used A0 in the previous chapter, we'll use it
now. (It agrees with stack order, by the way.) Here's an approximate
equivalent of the 6800/6801 INX instruction:
ADD.L #1,A0 ; INX equivalent
ADD.L immediate, long add. Why long? Because addresses in the 68000 are long, unless you specifically know they are not. And we don't know that. But that means we have to waste bytes in the code for a 32-bit version of the small integer 1. In hexadecimal, the combined op-code and immediate argument would be 48 bits:
$D1FC
$0000
$0001
In binary,
1101000111111100
0000000000000000
0000000000000001
For the curious, the various fields of the instruction are
1101: add address
000: A0 is destination
111: size long (32 bit) operation
111 100: immediate source
0000000000000000: high 16 bits
0000000000000001: low 16 bits
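If you would rather check the field packing than take my word for it, here is a minimal Python sketch that shifts the fields listed above into place. The function name `adda_long_immediate` is made up for this example, and it handles only this one form of the instruction (immediate source, address register destination, long size).

```python
# Pack the fields listed above into the 16-bit op-code word.
# adda_long_immediate is a made-up name for this sketch.


def adda_long_immediate(areg, value):
    """Return the three 16-bit words of ADD.L #value,An."""
    if not 0 <= areg <= 7:
        raise ValueError("address register must be 0..7")
    opword = (
        (0b1101 << 12)      # add address
        | (areg << 9)       # destination register
        | (0b111 << 6)      # size long
        | 0b111100          # immediate source
    )
    value &= 0xFFFFFFFF
    return [opword, value >> 16, value & 0xFFFF]


if __name__ == "__main__":
    words = adda_long_immediate(0, 1)
    print(" ".join(f"${w:04X}" for w in words))   # $D1FC $0000 $0001
```

Three words out, which is the 48 bits counted above.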
That's a lot of zero bits, just to increment by 1.
So Motorola defined the ADDQ instruction that I use above. ADDQ increments by
any immediate value from 1 to 8, with the value specified internal to the
single 16-bit instruction. In hexadecimal, the complete instruction is
$5288
In binary,
0101001010001000
That's 16 bits instead of 48. Again, for the curious, the various fields are
0101: add quick
001: data (number to add, 0 means 8)
0: addq (1 is subq)
10: size long (32 bit) operation
001: target is address register
000: A0
SUBQ is the corollary decrement. And, by the way, either can target any
register. (Memory, too.)
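The same kind of check works for the quick form. This Python sketch packs the ADDQ fields listed above. The name `addq_long_an` is an invention for this example, and it only covers the long-sized, address-register-target case we are using here. The `subtract` flag flips the one bit that turns ADDQ into SUBQ.

```python
# Pack the ADDQ/SUBQ fields into the single 16-bit instruction word.
# addq_long_an is a made-up name; it covers only the .L, An-target form.


def addq_long_an(data, areg, subtract=False):
    """Return the word for ADDQ.L #data,An (or SUBQ.L if subtract)."""
    if not 1 <= data <= 8:
        raise ValueError("quick data must be 1..8")
    if not 0 <= areg <= 7:
        raise ValueError("address register must be 0..7")
    data_field = data & 0b111           # 8 wraps around to 000
    return (
        (0b0101 << 12)                  # add quick
        | (data_field << 9)             # number to add
        | ((1 if subtract else 0) << 8) # 0 is addq, 1 is subq
        | (0b10 << 6)                   # size long
        | (0b001 << 3)                  # target is address register
        | areg                          # which address register
    )


if __name__ == "__main__":
    print(f"${addq_long_an(1, 0):04X}")                  # $5288
    print(f"${addq_long_an(8, 0):04X}")                  # $5088
    print(f"${addq_long_an(1, 0, subtract=True):04X}")   # $5388
```

The middle line shows the "0 means 8" trick: asking for 8 leaves the data field all zeros.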
Refer back to the last chapter and
- Copy the source to a text file in the host OS;
- Save it in your working directory with a name you'll maybe recognize, 8 characters or less;
- Open a terminal command-line shell and change to your working directory;
- Assemble it with vasm, don't forget the -Ftos TOS output and -no-opt optimization switches;
- Start a Hatari session;
- In Hatari:
  - Hit Ctrl-C to drop out of GEM into the EmuTOS CPM shell;
  - Change to the working directory in EmuTOS;
  - Hit Alt-PAUSE to invoke the debugger;
- Move with the mouse to the debugger in the command-line terminal and
  - Set a breakpoint (b) on TEXT segment entry;
  - Continue (c) execution in the EmuTOS shell;
- Run the executable you created with vasm;
- Use the mouse to go back to the debugger;
- Disassemble the code to make sure it's all where it should be;
- Find the address of the START label and disassemble from there, too;
- Show the registers (r); and
- Step (s) through the code, watching the index and sum change;
- Continue (c) back to EmuTOS, or quit (q) back to the host OS.
Convinced?
We could also use LEA instead of ADDQ, like we did on the 6809. The
instruction would look like
LEA 1(A0),A0
It takes an extra word (two bytes) of code, as compared with ADDQ, and it takes more cycles to complete, but it can be done.
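For a side-by-side look at the three ways to bump A0 by 1, here is a small Python tally of their sizes. The ADD.L and ADDQ.L words are the ones worked out above. The LEA words ($41E8 $0001) are not from this chapter. They are my reading of the 68000 manual, so treat them as an assumption and check them against your own disassembly. The names `ENCODINGS` and `size_in_bytes` are made up for this sketch.

```python
# Sizes of three ways to add 1 to A0, as lists of 16-bit words.
# The LEA encoding is an assumption to verify in the debugger.
ENCODINGS = {
    "ADD.L #1,A0": [0xD1FC, 0x0000, 0x0001],
    "LEA 1(A0),A0": [0x41E8, 0x0001],
    "ADDQ.L #1,A0": [0x5288],
}


def size_in_bytes(words):
    """Each 68000 instruction word is 2 bytes."""
    return 2 * len(words)


if __name__ == "__main__":
    for source, words in ENCODINGS.items():
        hex_words = " ".join(f"${w:04X}" for w in words)
        print(f"{source:14} {size_in_bytes(words)} bytes  {hex_words}")
```

That should come out as 6, 4, and 2 bytes, in that order.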
We are more interested in the post-inc mode:
ADD.B (A0)+,D1
Let's do that now:
        OPT     LIST,SYMTAB     ; Options we want for the stand-alone assembler.
        MACHINE MC68000 ; because there are a lot the assembler can do.
        OPT     DEBUG   ; We want labels for debugging.
        OUTPUT
***********************************************************************
        EVEN
ENTRY   JMP     START
*
BYTTBL  DC.B    8       ; byte data doesn't have to be aligned.
        DC.B    5
        DC.B    2
        DC.B    7
        DC.B    4
*
        EVEN            ; But 68K code does have to be even aligned.
START   LEA     BYTTBL(PC),A0
        MOVE.B  (A0)+,D1        ; post-access inc implied
        ADD.B   (A0)+,D1
        ADD.B   (A0)+,D1
        ADD.B   (A0)+,D1
        ADD.B   (A0),D1 ; Leaving out the trailing post-inc.
        NOP
DONE    NOP
* One way to return to the OS or other calling program
        clr.w   -(sp)   ; there should be enough room on the caller's stack
        trap    #1      ; quick exit
By the way, do you see that push instruction before the trap back to EmuTOS?
:)
I'll explain later, maybe in the next example.
For now, go ahead and copy this source into a text file, assemble it, and run it in the debugger, and then we will be done with this list of small constants -- for a while, at least.
Once you have that tested, and are satisfied that it works as advertised, you might want to clean up your working directory a bit. Make a subdirectory called step1 or basicadr or s1_adr or something, and move all the 68000 code files we have made to it.
Next, let's see if we can put some of this together to put out a message, and hopefully this all will begin to make some sense.