Monday, January 6, 2025

ALPP 03-XX (17) -- Demonstrating Left Shift -- 6800

I'm putting this here for reference. Eventually, I plan to do a chapter on shifts, and most of this will be demonstrated there. I've only tested part of the code.

Demonstrating Left Shift -- 6800

(Title Page/Index)

 

I've shown you some theoretical background on bit shifting left and multiplying by powers of 2, and I want to move ahead because we can't print out the results in decimal yet. 

But I talked it over with God. Oh, some people will understand, some people won't. I could call it a hunch -- a strong hunch, strong enough to keep me from proceeding -- if you prefer. 

And the result? There's a lot of code in these. Read through the 6809 and 68000 versions, scan the other two, and test one or more if you are inclined. Come back for reference if things get murky when we talk about synthesizing the multiplication and division routines.

I didn't want to consume four posts for this, to show the rigging and the test code for each processor, but it's going to be four posts even without the rigging framework. 

But you've already seen everything in the rigging framework anyway, in the single character input chapters. I'm going to let you move the new stuff into the rigging framework yourself this time.

Starting with the 6800 code for character input, open rt_rig03_6800.asm in a text editor and save it as rt_rig04_6800.asm. Keep it open, then open inkey_6800.asm (or whatever you saved it as) and save it as shftst_6800.asm (or whatever). In shftst_6800.asm, change the inclusion (EXP) line to include rt_rig04_6800.asm instead of rt_rig03_6800.asm.

Now cut INCHAR and INCHNE out of shftst_6800.asm, from the comments on AECHO to the hook to INCHV, and move them someplace appropriate in rt_rig04_6800.asm. You might want to move them in two or three separate pieces, or you might want to move those lines all together to the same place. Your choice.

Also grab PDUP for the 6800 and 6801.

Save the two files and make sure they still assemble and run as in chapter 3-10.

Now, if you want to do this part yourself, cut the test code out of shftst_6800.asm and replace it with appropriate 6800 code from the last chapter. Yes, that does mean you'll need to convert the 6809 code to 6800 code. It's not hard, just tedious, and instructive.

If you don't want to do the conversion yourself, or if you want to see how I'd do it, the following is some demonstration code I produced. A little less than two thirds of the way down, I realized I was heading the wrong direction and quit testing it. Some of the remaining code is known not to do what I intended, and the rest of the remaining code is not tested, but I'm leaving it here for reference.

For the 6800, assuming PSP in X and the bytes to test on the parameter stack:
* test shifts and multiplies for 6800 (EXORsim)
* using parameter stack,
* with test frame
* Copyright Joel Matthew Rees, December 2024
*
	EXP	rt_rig04_6800.asm
****************
* Program code:
*
*
INX8	INX
INX7	INX
INX6	INX
	INX
INX4	INX
	INX
	INX
	INX
	RTS
*
DEX8	DEX
	DEX
DEX6	DEX
	DEX
DEX4	DEX
	DEX
	DEX
	DEX
	RTS
*
* Unrolled 64-bit integer shift left 1 bit:
LSL64	LSL	7,X	; least significant byte (byte 7)
	ROL	6,X	; next more significant byte (byte 6)
	ROL	5,X	; next more significant byte (byte 5)
	ROL	4,X	; next more significant byte (byte 4)
	ROL	3,X	; next more significant byte (byte 3)
	ROL	2,X	; next more significant byte (byte 2)
	ROL	1,X	; next more significant byte (byte 1)
	ROL	0,X	; most significant byte (byte 0)
	RTS
*
* 64-bit integer shift left 1 bit in a loop:
LSL64LP	LDX	PSP	; but do not update!
	JSR	INX7	; point to last byte
	LDAA	#7	; bytes to ROL
	LSL	0,X	; least significant byte starts with LSL
SHL64L	DEX		; carry not affected
	ROL	0,X	; next more significant byte
	DECA		; count down, carry not affected
	BNE	SHL64L	; do next
	RTS
* Ends with X pointing to most significant byte
*
PGSTRT	LDAB	#$5A	; a common bit pattern to watch move
	ROLB		; $B4	9-bit rotate left (if carry was clear coming in)
	ROLB		; $68
	ROLB		; $D1
	ROLB		; $A2
	ROLB		; $45
	ROLB		; $8B
	ROLB		; $16
	ROLB		; $2D
	ROLB		; $5A
	NOP		; Pause for a look.
*
	ROLB
	ADCB	#0	; $B4	8-bit rotate left
	ROLB
	ADCB	#0	; $69
	ROLB
	ADCB	#0	; $D2
	ROLB
	ADCB	#0	; $A5
	TBA		; copy to A, another common bit pattern to watch move
	ROLB
	ADCB	#0	; $4B
	ROLB
	ADCB	#0	; $96
	ROLB
	ADCB	#0	; $2D
	ROLB
	ADCB	#0	; $5A
	NOP
*
	LSLB		; 1st: $A55A	16-bit shift left
	ROLA		; $4AB4
	LSLB		; 2nd
	ROLA		; $9568
	LSLB		; 3rd 
	ROLA		; $2AD0
	LSLB		; 4th 
	ROLA		; $55A0
	NOP
	LSLB		; 5th 
	ROLA		; $AB40
	LSLB		; 6th 
	ROLA		; $5680
	LSLB		; 7th 
	ROLA		; $AD00
	LSLB		; 8th 
	ROLA		; $5A00
	NOP
*
	LDX	PSP	; 32-bit shift left mixed stack/register
	DEX		; allocate two bytes
	DEX
	STX	PSP
	LDAA	#$87
	LDAB	#$65
	STAB	1,X
	STAA	0,X	; $8765 on stack
	LDAA	#$43
	LDAB	#$21	; $4321 in D
	LSLB		; least significant byte
	ROLA		; next more significant byte
	ROL	1,X	; next more significant byte on stack
	ROL	0,X	; most significant byte on stack
	LSLB		; 2nd time
	ROLA
	ROL	1,X
	ROL	0,X
	LSLB		; 3rd time
	ROLA
	ROL	1,X
	ROL	0,X
	LSLB		; 4th time
	ROLA
	ROL	1,X
	ROL	0,X	; result -- 7654:3210
	NOP
*
	LDAB	#$10	; set up test data
	STAB	1,X	; X still has PSP
	ADDB	#$22
	LDAA	#7
T64LP	STX	PSP	; allocate before store
	STAB	0,X
	ADDB	#$22
	DEX
	DECA
	BNE	T64LP
	NOP
* Check the contents of the parameter stack when done.
	LDX	PSP	; sync X and PSP
	JSR	LSL64	; unrolled loop
	JSR	LSL64LP	; loop
	JSR	LSL64
	JSR	LSL64LP
	NOP
* Should be shifted left one hexadecimal digit.
	JSR	LSL64
	JSR	LSL64LP
	JSR	LSL64
	JSR	LSL64LP
	NOP
* Should be shifted left another hexadecimal digit,
* which is a full byte!
* But that's a hard way to shift left 8 bits.
* Let's try an easier, quicker way:
	LDAA	#7
LS8BITL	LDAB	1,X
	STAB	0,X
	INX
	DECA
	BNE	LS8BITL
	CLR	0,X	; zero-fill the vacated low byte (already $00 with this data)
* Wasn't that fast?
	NOP
	LDX	PSP
	JSR	INX8	; drop them all
	STX	PSP	
	NOP
* Multiply 8 bits by 4
	LDAB	#65	; $41
	LSLB	; multiply by 2
	LSLB	; ignore carry and multiply by 2
	NOP
* Multiply 16 bits by 4
	LDAB	#65
	CLRA
	LSLB	; multiply by 2
	ROLA	; catch the carry
	LSLB	; and again
	ROLA	; catch the carry again
	NOP
* Multiply 16 bits by 16
	LDAB	#65
	CLRA
	LSLB	; multiply by 2
	ROLA	; catch the carry
	LSLB	; and again
	ROLA	; catch the carry again
	LSLB	; and again
	ROLA	; catch the carry again
	LSLB	; and again
	ROLA	; catch the carry again
	NOP
* Multiply 16 bits by 16, using loop
	LDAB	#65
	CLRA
	JSR	PPSHD	; PSP in X on return
	LDAB	#4
	JSR	PPSHD
	LDAA	2,X	; operand on parameter stack
	LDAB	3,X
MUL16WL	LSLB	; multiply by 2
	ROLA	; catch the carry
	DEC	1,X
	BNE	MUL16WL
	NOP
	INX		; drop count
	INX
	STX	PSP
	STAA	0,X	; save result
	STAB	1,X
	NOP		; re-use 2 bytes of allocation
* Other powers of 2: 2^7 == 128
	LDAB	#83	; $53 X 128
	CLRA		; for the high bits
	LSLB		; 1st
	ROLA
	LSLB		; 2nd
	ROLA
	LSLB		; 3rd
	ROLA
	LSLB		; 4th
	ROLA
	LSLB		; 5th
	ROLA
	LSLB		; 6th
	ROLA
	LSLB		; 7th
	ROLA
	STAA	0,X	; $2980 == 10624
	STAB	1,X
	NOP
* compare going the other direction,
* ends with high byte in B, low byte in A:
	LDAB	#83	; $53 X 128
	CLRA	; for result
	LSRB	; bit 0 to carry, B becomes high byte
	RORA	; bit 0 of B now in bit 7 of A
	NOP
* Saturation math:
	LDAB	#83	; $53 X 128
	LSRB		; bit 0 to carry
	RORB		; now to bit 7
	ANDB	#$80	; chop off the lost, double-shifted high bits
	NOP
* Extend the saturation math with recovery (de-optimization):
	LDAB	#83	; $53 X 128 (9 bits => 7 to left is 2 to right)
	LSRB		; bit 0 to carry
	RORB		; now to bit 7
	TBA
	ROLA		; bring the high bits back into position
	ANDB	#$80	; chop off the high bits
	ANDA	#$7F	; chop off the low bits
	NOP
* and again, more efficiently, but not most efficiently
	LDAB	#83	; $53 X 128 (8 bits => 7 to left is 1 to right)
	TBA		; make two halves
	LSRA		; bit 0 to carry, high bits
	RORB		; now to bit 7 (bit 0 to carry)
	ANDB	#$80	; chop off the high bits
	NOP
* 2^6 == 64
	LDAB	#83	; $53 X 64
	CLRA		; for the high bits
	LSLB		; 1st
	ROLA
	LSLB		; 2nd
	ROLA
	LSLB		; 3rd
	ROLA
	LSLB		; 4th
	ROLA
	LSLB		; 5th
	ROLA
	LSLB		; 6th
	ROLA
	STAA	0,X	; $14C0 == 5312
	STAB	1,X
	NOP
* compare going the other direction,
* ends with high byte in B, low byte in A:
	LDAB	#83	; $53 X 64
	CLRA	; for result
	LSRB	; bit 0 to carry
	RORA	; old bit 0 of B now in bit 7 of A
	LSRB	; old bit 1 of B to carry
	RORA	; old bit 1,0 of B now in bit 7,6 of A
	NOP
* Saturation math:
	LDAB	#83	; $53 X 64
	LSRB		; bit 0 to carry
	RORB		; now to bit 7, old bit 1 to carry
	RORB		; now to bit 7,6 in order
	ANDB	#$C0	; chop off the remainder
	NOP
* Extend the saturation math with recovery (de-optimization):
	LDAB	#83	; $53 X 64 (9 bits => 6 to left is 3 to right)
	LSRB		; bit 0 to carry
	RORB		; now to bit 7, old bit 1 to carry
	RORB		; now to bit 7,6 in order
	TBA
	ROLA		; recover high bits including last carry
	ANDB	#$C0	; chop off the high bits
	ANDA	#$3F	; chop off the low bits
	NOP
* and again, copying first
	LDAB	#83	; $53 X 64 (8 bits => 6 to left is 2 to right)
	TBA		; make two halves
	LSRA		; bit 0 to carry, high bits
	RORB		; now to bit 7 (bit 0 to carry)
	LSRA		; bring the high bits into place (bit 1 to C)
	RORB		; now to bit 7,6 in order
	ANDB	#$C0	; chop off the high bits
	NOP
* 2^5 == 32
	LDAB	#83	; $53 X 32
	CLRA		; for the high bits
	LSLB		; 1st
	ROLA
	LSLB		; 2nd
	ROLA
	LSLB		; 3rd
	ROLA
	LSLB		; 4th
	ROLA
	LSLB		; 5th
	ROLA
	STAA	0,X	; $0A60 == 2656
	STAB	1,X
	NOP
* compare going the other direction,
* ends with high byte in B, low byte in A:
	LDAB	#83	; $53 X 32
	CLRA		; for the high bits
	LSRB	; bit 0 to carry
	RORA	; old bit 0 of B now in bit 7 of A
	LSRB	; old bit 1 of B to carry
	RORA	; old bit 1,0 of B now in bit 7,6 of A
	LSRB	; old bit 2 of B to carry
	RORA	; old bit 2,1,0 of B now in bit 7,6,5 of A
	NOP
* Saturation math:
	LDAB	#83	; $53 X 32
	LSRB		; bit 0 to carry
	RORB		; now to bit 7, old bit 1 to carry
	RORB		; now to bit 7,6 in order, old bit 2 to carry
	RORB		; now to bit 7,6,5 in order
	ANDB	#$E0	; chop off the high bits
	NOP
* Extend the saturation math with recovery:
	LDAB	#83	; $53 X 32 (9 bits => 5 to left is 4 to right)
	LSRB		; bit 0 to carry
	RORB		; now to bit 7, old bit 1 to carry
	RORB		; now to bit 7,6 in order, old bit 2 to carry
	RORB		; now to bit 7,6,5 in order
	TBA
	ROLA		; recover last carry into high bits
	ANDB	#$E0	; chop off the remainder
	ANDA	#$1F	; chop off the low bits
	NOP
* and again, copying first
	LDAB	#83	; $53 X 32 (8 bits => 5 to left is 3 to right)
	TBA		; make two halves
	LSRA		; bit 0 to carry, high bits
	RORB		; now to bit 7 (bit 0 to carry)
	LSRA		; bring the high bits into place (bit 1 to C)
	RORB		; now to bit 7,6 in order
	LSRA		; once more (old bit 2 to C)
	RORB		; now to bit 7,6,5 in order
	ANDB	#$E0	; chop off the high bits
	NOP
* balance the stack
	INX
	INX
	STX	PSP	; clear stack
	NOP
*
* From here down either hasn't been tested
* or doesn't function as I intended.
*
* shift left by 13 by right rotation => multiply by 8192, lose high bits
	LDAA	#$41	; $4153 == 16723
	LDAB	#$53
	LSRB		; bit 0 to carry
	RORA		; now to bit 15, bit 8 to carry
	RORB		; bit 8 to bit 7, old bit 1 to carry
	RORA		; old bit 1 to bit 15 in order
	RORB		; old bit 9 to bit 7, old bit 2 to carry
	RORA		; old bit 2 to bit 15 in order
	RORB		; old bit 10 to bit 7 (old bit 3 to carry)
	ANDA	#$E0	; chop off the old top bits, ignore carry
	CLRB
	NOP
* shift left by 13 => multiply by 8192, capture all bits:
	LDAA	#$41	; $4153 == 16723
	LDAB	#$53
	JSR	DEX4	; pre-allocate
	STX	PSP
	LSRB		; bit 0 to carry
	RORA		; now to bit 15, bit 8 to carry
	RORB		; bit 8 to bit 7, old bit 1 to carry
	RORA		; old bit 1 to bit 15 in order
	RORB		; old bit 9 to bit 7, old bit 2 to carry
	RORA		; old bit 2 to bit 15 in order
	RORB		; old bit 10 to bit 7 (old bit 3 to carry)
	CLR	3,X	; least significant, ignore carry
	CLR	2,X	; save a place for lower middle byte
	STAB	1,X	; save next more significant byte
	TAB		; copy to split out high and low
	ANDB	#$E0	; chop off the high bits
	STAB	2,X	; save the lower middle byte
	ANDA	#$1F	; chop off low bits
	STAA	0,X	; save high bits
	NOP
* Check the results before continuing.
	NOP
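*
* A possible repair for the capture-all version above (a sketch only,
* not run on EXORsim): shifting left 13 is the same as moving the
* value up two bytes and then shifting right 3. So treat A:B as the
* top two bytes, and let a cleared byte on the stack catch the three
* bits that fall out of B. Allocates and drops its own 4 bytes.
	LDX	PSP	; start from a known PSP
	JSR	DEX4	; allocate 4 more bytes for this result
	STX	PSP
	LDAA	#$41	; $4153 == 16723
	LDAB	#$53
	CLR	2,X	; lower middle byte catches old bits 2,1,0
	CLR	3,X	; least significant byte is always zero
	LSRA		; bit 8 to carry
	RORB		; bit 8 to bit 7 of B, bit 0 to carry
	ROR	2,X	; bit 0 to bit 7 of the catch byte
	LSRA		; 2nd time
	RORB
	ROR	2,X
	LSRA		; 3rd time
	RORB
	ROR	2,X
	STAA	0,X	; high byte
	STAB	1,X	; upper middle byte
	NOP		; expect $08 $2A $60 $00 at 0,X through 3,X
	JSR	INX4	; drop this result, leave the earlier one
	STX	PSP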
* shift left directly by 13 => multiply by 8192, capture all bits:
	DEX		; 2 placeholders, for middle upper byte	
	DEX		; and high byte
	STX	PSP
	LDAA	#$41	; $4153 == 16723
	LDAB	#$53
	LSLB
	ROLA
	ROL	1,X	; catch 1
	LSLB
	ROLA
	ROL	1,X	; catch 2
	LSLB
	ROLA
	ROL	1,X	; catch 3
	LSLB
	ROLA
	ROL	1,X	; catch 4
	LSLB
	ROLA
	ROL	1,X	; catch 5
	LSLB
	ROLA
	ROL	1,X	; catch 6
	LSLB
	ROLA
	ROL	1,X	; catch 7
	LSLB
	ROLA
	ROL	1,X	; catch 8
	DEX		; for high byte final resting place
	CLR	0,X	; not completely filled
	LSLB
	ROLA
	ROL	1,X	; catch 9
	ROL	0,X	; catch 1
	LSLB
	ROLA
	ROL	1,X	; catch 10
	ROL	0,X	; catch 2
	LSLB
	ROLA
	ROL	1,X	; catch 11
	ROL	0,X	; catch 3
	LSLB
	ROLA
	ROL	1,X	; catch 12
	ROL	0,X	; catch 4
	LSLB
	ROLA
	ROL	1,X	; catch 13
	ROL	0,X	; catch 5
	NOP
* stop to check
	NOP
	JSR	INX6	; drop all the above
	STX	PSP
	NOP
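*
* A possible repair for the direct version above (a sketch only,
* not run on EXORsim; SL13LP is a made-up label):
* keep all four result bytes on the parameter stack, clear the top
* two, and shift the whole 32 bits left 13 times in a loop.
* Allocates and drops its own 4 bytes.
	LDX	PSP	; start from a known PSP
	JSR	DEX4	; allocate 4 bytes for the result
	STX	PSP
	CLR	0,X	; high byte starts clear
	CLR	1,X	; upper middle byte starts clear
	LDAA	#$41	; $4153 == 16723
	STAA	2,X
	LDAA	#$53
	STAA	3,X
	LDAA	#13	; bits to shift
SL13LP	LSL	3,X	; least significant byte
	ROL	2,X
	ROL	1,X
	ROL	0,X	; most significant byte
	DECA		; count down
	BNE	SL13LP
	NOP		; expect $08 $2A $60 $00 at 0,X through 3,X
	JSR	INX4	; drop the result
	STX	PSP
	NOP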
* 8-bit-wide rotation
* accumulator-wide ROL by 3 / ROR by 5 using the stack:
	LDX	PSP	; just to be sure, and remind ourselves
	DEX		; temp
	STX	PSP
	LDAA	#83	; $53
	STAA	0,X	; copy to stack
	LSL	0,X	; shift left by 3
	LSL	0,X
	LSL	0,X
	LSRA		; shift right by 5
	LSRA		; (more shifts, use the faster shift accumulator)
	LSRA
	LSRA
	LSRA
	ORAA	0,X	; put results together	
	NOP
* accumulator-wide ROL by 3 / ROR by 5 using ABA:
	LDAB	#83	; $53
	TBA	; copy
	LSLA	; shift left by 3
	LSLA
	LSLA
	LSRB	; shift right by 5
	LSRB
	LSRB
	LSRB
	LSRB
	ABA	; put the results together
	NOP
* accumulator-wide ROL by 3 / ROR by 5 using ADC #0 trick:
	LDAB	#83	; $53
	LSLB
	ADCB	#0
	LSLB
	ADCB	#0
	LSLB
	ADCB	#0
	NOP
* ugly accumulator-wide ROR by 5 / ROL by 3 using branch and set:
	LDAB	#83	; $53
	LSRB		; clears bit 7
	BCC	RR8BN1
	ORAB	#$80	; set it for the carry
RR8BN1	LSRB
	BCC	RR8BN2
	ORAB	#$80
RR8BN2	LSRB
	BCC	RR8BN3
	ORAB	#$80
RR8BN3	LSRB
	BCC	RR8BN4
	ORAB	#$80
RR8BN4	LSRB
	BCC	RR8BN5
	ORAB	#$80
RR8BN5	NOP		; next instruction
* Not as ugly accumulator-wide ROR by 5 / ROL by 3,
* but uses both accumulators to avoid branches:
	LDAB	#83	; $53
	TBA
	LSRA	; get lowest bit in carry first
	RORB
	LSRA	; get 2nd bit in carry first
	RORB
	LSRA	; get 3rd bit in carry first
	RORB
	LSRA	; get 4th bit in carry first
	RORB
	LSRA	; get 5th bit in carry first
	RORB
	NOP
* Compare result before dropping
	INX		; drop temp
	STX	PSP
	NOP
* 16-bit integer rotate left 3 / right 13  on 6800:
	LDAA	#$41	; $4153 == 16723
	LDAB	#$53
	LSLB		; clear bottom bit on shifting left
	ROLA
	ADCB	#0	; push the top carry in (16-bit rotation)	
	LSLB
	ROLA
	ADCB	#0	; push the top carry in (16-bit rotation)	
	LSLB
	ROLA
	ADCB	#0	; push the top carry in (16-bit rotation)
	NOP
* 16-bit integer rotate right 3 / left 13  on 6800:
	DEX		; temp to grab bit with
	STX	PSP
	STAB	0,X	; copy
	LSR	0,X	; get bottom bit
	RORA		; rotate it into top byte
	RORB		; 1 bit complete
	LSR	0,X	; get next bottom bit
	RORA
	RORB		; 2nd bit complete
	LSR	0,X	; get next bottom bit
	RORA
	RORB		; 3rd bit complete
	NOP		; Should be back to $4153
	INX
	STX	PSP
*
	RTS
*
	END	ENTRY
Now let's look at multiplying by some small constants that aren't powers of two.

 
