From a conversation on studying 6809 programming via random number generation in the FB group MOTOROLA 6809 / 6309, 6800 ASSEMBLY LANGUAGE PROGRAMMING:
The source Metcalf shows and asks for comments on is as follows:
xrnd:
ldd #1 ; seed must non-zero
rora
rorb
eorb xrnd+1
stb xrnd+1
rorb
eorb xrnd+2
tfr b,a
eora xrnd+1
std xrnd+1
rts
(20221010: John has put the code up on github to make it generally available:
https://github.com/impomatic/xorshift798)
Note that what the code does is feed bits of a remembered number back into the number in a controlled and predetermined fashion, making it a pseudo-random number generator.
And note that it does not make use of any of the interesting features of the
6809.
Also note that he is using something called self-modifying code, usually considered an ugly hack, but here displaying a peculiar aesthetic -- it eliminates separate state initialization code by incorporating the initial state value in the source code directly into the object machine code itself, and allowing the code to modify it where it is, using the immediate value in the object code as the running state value.
Self modifying code does have disadvantages.
Here, the primary disadvantage is that you can't put the object code in ROM. It has to run out of RAM.
Another disadvantage when it is done this way is that the code cannot be moved from where it was assembled without patching addresses using a linkage editor. If the code is moved without relinking from the address it was assembled at, the self-modifying part will modify something other than the state value, generally resulting in erratic behavior. Hopefully, it will crash before it corrupts anything.
There are other more esoteric disadvantages such as information leakage that I will only refer to obliquely below as I use the code as a springboard to introduce some of the more interesting features of the 6809.
[JMR202207180845:
And, in the morning light, I realize I didn't do that. I'll add the seed routines and the oblique mentions now, or sometime today, in between on-line family conference, shopping for the family, laundry, etc. It's my day off today.
]
I'll run the code through an assembler so that you can see what the object code will look like. Note particularly that, with the origin for Metcalf's code set at hexadecimal 1000, the state is located at address 1001. (Addresses are the leftmost column below, instructions begin at $1000 and $1003. $CC0001 is the op-code that gets modified. And most-significant-byte-first is eminently readable.)
( xrnd.asm):00001 *
( xrnd.asm):00002 * John Metcalf's random number generator code from FB 6809/6309, 6800 group,
( xrnd.asm):00003 * with my running commentary:
( xrnd.asm):00004 org $1000 ; Keeping the direct page at bay.
1000 ( xrnd.asm):00005 xrnd:
1000 CC0001 ( xrnd.asm):00006 ldd #1 ; Use immediate as static initial "previous".
1003 46 ( xrnd.asm):00007 rora ; Who cares what the cat^H^H^Hcarry drags in?
1004 56 ( xrnd.asm):00008 rorb ; Complete divide by 2, watch remainder in carry.
1005 F81001 ( xrnd.asm):00009 eorb xrnd+1 ; Mask high byte into halved low.
1008 F71001 ( xrnd.asm):00010 stb xrnd+1 ; Overwrite previous value high byte in immediate.
100B 56 ( xrnd.asm):00011 rorb ; Divide product low byte by 2 again, feeding remainder back into bit 7.
100C F81002 ( xrnd.asm):00012 eorb xrnd+2 ; Mask former low byte into product.
100F 1F98 ( xrnd.asm):00013 tfr b,a ; Dup product low to high, kill bits 15 to 9 of half-high.
1011 B81001 ( xrnd.asm):00014 eora xrnd+1 ; Fold intermediate low in again for new high.
1014 FD1001 ( xrnd.asm):00015 std xrnd+1 ; Store full final result in immediate.
1017 39 ( xrnd.asm):00016 rts ; return result in D.
( xrnd.asm):00017 *
( xrnd.asm):00018 * Clever, but ignores a lot of what makes the 6809 special.
( xrnd.asm):00019 *
( xrnd.asm):00020 * The seed routine, seed in D --
( xrnd.asm):00021 * Set the point in the pseudo-random sequence,
( xrnd.asm):00022 * force a 0 seed to 1.
( xrnd.asm):00023 * Even though it seems too short to bother with a call and return,
( xrnd.asm):00024 * modifying code from arbitrary other places in code is a recipe for disaster.
1018 ( xrnd.asm):00025 xrndseed:
1018 FD1001 ( xrnd.asm):00026 std xrnd+1
101B 2603 ( xrnd.asm):00027 bne xrndseedgo ; 0?
101D 7C1002 ( xrnd.asm):00028 inc xrnd+2 ; => 1
1020 ( xrnd.asm):00029 xrndseedgo
1020 39 ( xrnd.asm):00030 rts
( xrnd.asm):00031 *
( xrnd.asm):00032 *
( xrnd.asm):00033 *-------------------
( xrnd.asm):00034 *
( xrnd.asm):00035 * Watch what happens when the direct page comes into play:
( xrnd.asm):00036 org $1100
11 ( xrnd.asm):00037 setdp $11
1100 ( xrnd.asm):00038 xrndvdp:
1100 CC0001 ( xrnd.asm):00039 ldd #1
1103 46 ( xrnd.asm):00040 rora
1104 56 ( xrnd.asm):00041 rorb
1105 D801 ( xrnd.asm):00042 eorb xrndvdp+1
1107 D701 ( xrnd.asm):00043 stb xrndvdp+1
1109 56 ( xrnd.asm):00044 rorb
110A D802 ( xrnd.asm):00045 eorb xrndvdp+2
110C 1F98 ( xrnd.asm):00046 tfr b,a
110E 9801 ( xrnd.asm):00047 eora xrndvdp+1
1110 DD01 ( xrnd.asm):00048 std xrndvdp+1
1112 39 ( xrnd.asm):00049 rts
( xrnd.asm):00050 *
( xrnd.asm):00051 * The seed routine, seed again in D --
( xrnd.asm):00052 * Set the point in the pseudo-random sequence,
( xrnd.asm):00053 * force a 0 seed to 1.
( xrnd.asm):00054 * Again, keep the self-modifying stuff under control.
1113 ( xrnd.asm):00055 xrndvdpseed:
1113 DD01 ( xrnd.asm):00056 std xrndvdp+1
1115 2602 ( xrnd.asm):00057 bne xrndvdpseedgo ; 0?
1117 0C02 ( xrnd.asm):00058 inc xrndvdp+2 ; => 1
1119 ( xrnd.asm):00059 xrndvdpseedgo
1119 39 ( xrnd.asm):00060 rts
( xrnd.asm):00061 *
( xrnd.asm):00062 * When the direct page is set to $11 and the source code tells the assembler it is,
( xrnd.asm):00063 * the direct page call is automatically used, saving a byte and a cycle:
111A 9D00 ( xrnd.asm):00064 jsr xrndvdp
( xrnd.asm):00065 * compare:
111C BD1000 ( xrnd.asm):00066 jsr xrnd
( xrnd.asm):00067 *
( xrnd.asm):00068 *
( xrnd.asm):00069 *-------------------
( xrnd.asm):00070 *
( xrnd.asm):00071 * Still using self-modifying code for the static initialization,
( xrnd.asm):00072 * but not using extended/absolute or direct page;
( xrnd.asm):00073 * instead, using 5 extra cycles total to make it position-independent:
( xrnd.asm):00074 org $1200
1200 ( xrnd.asm):00075 xrndvpic:
1200 CC0001 ( xrnd.asm):00076 ldd #1
1203 46 ( xrnd.asm):00077 rora
1204 56 ( xrnd.asm):00078 rorb
1205 E88CF9 ( xrnd.asm):00079 eorb xrndvpic+1,pcr
1208 E78CF6 ( xrnd.asm):00080 stb xrndvpic+1,pcr
120B 56 ( xrnd.asm):00081 rorb
120C E88CF3 ( xrnd.asm):00082 eorb xrndvpic+2,pcr
120F 1F98 ( xrnd.asm):00083 tfr b,a
1211 A88CED ( xrnd.asm):00084 eora xrndvpic+1,pcr
1214 ED8CEA ( xrnd.asm):00085 std xrndvpic+1,pcr
1217 39 ( xrnd.asm):00086 rts
( xrnd.asm):00087 *
( xrnd.asm):00088 * The seed routine, seed again in D --
( xrnd.asm):00089 * Set the point in the pseudo-random sequence,
( xrnd.asm):00090 * force a 0 seed to 1.
( xrnd.asm):00091 * Again, keep the self-modifying stuff under control.
1218 ( xrnd.asm):00092 xrndvpicseed:
1218 ED8CE6 ( xrnd.asm):00093 std xrndvpic+1,pcr
121B 2603 ( xrnd.asm):00094 bne xrndvpicseedgo ; 0?
121D 6C8CE2 ( xrnd.asm):00095 inc xrndvpic+2,pcr ; => 1
1220 ( xrnd.asm):00096 xrndvpicseedgo
1220 39 ( xrnd.asm):00097 rts
( xrnd.asm):00098 *
( xrnd.asm):00099 * The above code can be copied as-is to any location in memory
( xrnd.asm):00100 * and called there without any patching or linkage editing.
( xrnd.asm):00101 * Note that, if Y does not need to be preserved, we could do this
( xrnd.asm):00102 * -- which is just better than break-even in terms of cycles and byte count:
( xrnd.asm):00103 org $1200
1200 ( xrnd.asm):00104 xrndvpicea:
1200 CC0001 ( xrnd.asm):00105 ldd #1
1203 46 ( xrnd.asm):00106 rora
1204 56 ( xrnd.asm):00107 rorb
1205 318CF9 ( xrnd.asm):00108 leay xrndvpicea+1,pcr
1208 E8A4 ( xrnd.asm):00109 eorb ,y
120A E7A4 ( xrnd.asm):00110 stb ,y
120C 56 ( xrnd.asm):00111 rorb
120D E821 ( xrnd.asm):00112 eorb 1,y
120F 1F98 ( xrnd.asm):00113 tfr b,a
1211 A8A4 ( xrnd.asm):00114 eora ,y
1213 EDA4 ( xrnd.asm):00115 std ,y
1215 39 ( xrnd.asm):00116 rts
( xrnd.asm):00117 *
( xrnd.asm):00118 * The seed routine, seed again in D --
( xrnd.asm):00119 * Set the point in the pseudo-random sequence,
( xrnd.asm):00120 * force a 0 seed to 1.
( xrnd.asm):00121 * Again, keep the self-modifying stuff under control.
1216 ( xrnd.asm):00122 xrndvpiceaseed:
1216 318CE8 ( xrnd.asm):00123 leay xrndvpicea+1,pcr ; Using Y again, but costs more than it saves here.
1219 EDA4 ( xrnd.asm):00124 std ,y
121B 2602 ( xrnd.asm):00125 bne xrndvpiceaseedgo ; 0?
121D 6C21 ( xrnd.asm):00126 inc 1,y ; => 1
121F ( xrnd.asm):00127 xrndvpiceaseedgo
121F 39 ( xrnd.asm):00128 rts
( xrnd.asm):00129 *
( xrnd.asm):00130 *
( xrnd.asm):00131 *-------------------
( xrnd.asm):00132 *
( xrnd.asm):00133 * Dropping the self-modifying code, using a variable in the direct page,
( xrnd.asm):00134 * and using explicit initialization:
( xrnd.asm):00135 *
( xrnd.asm):00136 org $2000
20 ( xrnd.asm):00137 setdp $20
( xrnd.asm):00138 *
2000 ( xrnd.asm):00139 xrndvromprev rmb 2
( xrnd.asm):00140 org $2300
( xrnd.asm):00141 * The seed routine, seed again in D --
( xrnd.asm):00142 * Set the point in the pseudo-random sequence,
( xrnd.asm):00143 * force a 0 seed to 1.
2300 ( xrnd.asm):00144 xrndvromseed:
2300 104D ( xrnd.asm):00145 tstd
2302 2603 ( xrnd.asm):00146 bne xrndvrominitgo
( xrnd.asm):00147 * Fall through:
( xrnd.asm):00148 *
2304 ( xrnd.asm):00149 xrndvrominit: ; No parameters.
2304 CC0001 ( xrnd.asm):00150 ldd #1
2307 ( xrnd.asm):00151 xrndvrominitgo
2307 DD00 ( xrnd.asm):00152 std xrndvromprev
2309 39 ( xrnd.asm):00153 rts
( xrnd.asm):00154 *
230A ( xrnd.asm):00155 xrndvrom:
230A DC00 ( xrnd.asm):00156 ldd xrndvromprev
230C 46 ( xrnd.asm):00157 rora
230D 56 ( xrnd.asm):00158 rorb
230E D800 ( xrnd.asm):00159 eorb xrndvromprev
2310 D700 ( xrnd.asm):00160 stb xrndvromprev
2312 56 ( xrnd.asm):00161 rorb
2313 D801 ( xrnd.asm):00162 eorb xrndvromprev+1
2315 1F98 ( xrnd.asm):00163 tfr b,a
2317 9800 ( xrnd.asm):00164 eora xrndvromprev
2319 DD00 ( xrnd.asm):00165 std xrndvromprev
231B 39 ( xrnd.asm):00166 rts
( xrnd.asm):00167 *
( xrnd.asm):00168 * The above has the disadvantage of requiring initialization,
( xrnd.asm):00169 * but it can be in ROM, and the ROM can be anywhere in memory --
( xrnd.asm):00170 * subject to proper handling of what the DP points at.
( xrnd.asm):00171 *
( xrnd.asm):00172 * This disadvantage has the advantage that,
( xrnd.asm):00173 * if DP is different for each process,
( xrnd.asm):00174 * each process has its own state,
( xrnd.asm):00175 * and one process can't probe or force the sequence of another
( xrnd.asm):00176 * (-- without going hunting for the other's DP).
( xrnd.asm):00177 *
( xrnd.asm):00178 * Note that, for some reason, Motorola left dp out of the indexed modes,
( xrnd.asm):00179 * so that, if the assembler allows it,
( xrnd.asm):00180 * leax xrndv1sprev
( xrnd.asm):00181 * ends up reverting to extended (absolute) mode (darn it).
( xrnd.asm):00182 * (Technically, extended is also an undefined index mode in the 6809 spec,
( xrnd.asm):00183 * but the machine code does do what it would be expected to in Motorola parts.)
( xrnd.asm):00184 *
( xrnd.asm):00185 * You can take the address of variables in the DP,
( xrnd.asm):00186 * but it takes several instructions to do so, instead of just one LEA.
( xrnd.asm):00187 * Two general ways:
( xrnd.asm):00188 *
( xrnd.asm):00189 * (1) Keep a pointer to the DP in the DP area,
( xrnd.asm):00190 * but it has to be initialized, and if it gets overwritten code goes wonky.
( xrnd.asm):00191 *
( xrnd.asm):00192 * (2) Calculate it, probably using a couple of TFR instructions and working in D.
( xrnd.asm):00193 *
( xrnd.asm):00194 * It's tempting to show more code, but I really should leave it
( xrnd.asm):00195 * as an exercise for the reader, so you can convince yourself.
( xrnd.asm):00196 *
( xrnd.asm):00197 *
( xrnd.asm):00198 *-------------------
( xrnd.asm):00199 *
( xrnd.asm):00200 * Using no global declarations, parameters on S stack:
( xrnd.asm):00201 * (We decide X does not need to be preserved here.)
( xrnd.asm):00202 *
( xrnd.asm):00203 org $3000
( xrnd.asm):00204 *
( xrnd.asm):00205 * random takes one parameter on S stack,
0000 ( xrnd.asm):00206 xrndv1sstp set 0 ; pointer to the previous state
0002 ( xrnd.asm):00207 xrndv1slocsz set 2
( xrnd.asm):00208 *
3000 ( xrnd.asm):00209 xrndv1s:
3000 AE62 ( xrnd.asm):00210 ldx xrndv1sstp+2,s ; dodge return address on stack
3002 EC84 ( xrnd.asm):00211 ldd ,x
3004 2602 ( xrnd.asm):00212 bne xrndv1sgo
3006 C601 ( xrnd.asm):00213 ldb #1 ; Make sure it's not 0.
3008 ( xrnd.asm):00214 xrndv1sgo
3008 46 ( xrnd.asm):00215 rora
3009 56 ( xrnd.asm):00216 rorb
300A E884 ( xrnd.asm):00217 eorb ,x
300C E784 ( xrnd.asm):00218 stb ,x
300E 56 ( xrnd.asm):00219 rorb
300F E801 ( xrnd.asm):00220 eorb 1,x
3011 1F98 ( xrnd.asm):00221 tfr b,a
3013 A884 ( xrnd.asm):00222 eora ,x
3015 ED84 ( xrnd.asm):00223 std ,x
3017 3510 ( xrnd.asm):00224 puls x ; return address
3019 3262 ( xrnd.asm):00225 leas xrndv1slocsz,s ; deallocate parameters
301B 6E84 ( xrnd.asm):00226 jmp ,x return
( xrnd.asm):00227 *
( xrnd.asm):00228 * Note that it would be tempting to
301D ECF802 ( xrnd.asm):00229 ldd [xrndv1sstp+2,s] ; use the pointer directly
( xrnd.asm):00230 * but we don't have a nice and easy
( xrnd.asm):00231 * eorb 1,[xrndv1sstp+2,s] ; not a 6809 index mode
( xrnd.asm):00232 * to avoid (allocating and) using local intermediate temporaries,
( xrnd.asm):00233 * and the memory indirect is just slower anyway.
( xrnd.asm):00234 *
( xrnd.asm):00235 * Seed takes two parameters,
0002 ( xrnd.asm):00236 xrndv1sstptr set 2 ; pointer to the previous state
0000 ( xrnd.asm):00237 xrndv1sseedval set 0 ; seed value
0004 ( xrnd.asm):00238 xrndv1sseedlocsz set 4
( xrnd.asm):00239 *
3020 ( xrnd.asm):00240 xrndv1sseed:
3020 AE64 ( xrnd.asm):00241 ldx xrndv1sstptr+2,s ; dodge return address on stack
3022 EC02 ( xrnd.asm):00242 ldd xrndv1sseedval+2,x
3024 2601 ( xrnd.asm):00243 bne xrndv1sseedgo
3026 5C ( xrnd.asm):00244 incb
3027 ( xrnd.asm):00245 xrndv1sseedgo
3027 ED84 ( xrnd.asm):00246 std ,x
3029 3510 ( xrnd.asm):00247 puls x ; return address
302B 3264 ( xrnd.asm):00248 leas xrndv1sseedlocsz,s ; deallocate parameters
302D 6E84 ( xrnd.asm):00249 jmp ,x ; return
( xrnd.asm):00250 *
( xrnd.asm):00251 * The above is self-initializing,
( xrnd.asm):00252 * can be in ROM, can be located anywhere.
( xrnd.asm):00253 *
( xrnd.asm):00254 * Perhaps even more important,
( xrnd.asm):00255 * it allows using the same code with the states of as many separate
( xrnd.asm):00256 * random number generators as you want.
( xrnd.asm):00257 *
( xrnd.asm):00258 * And, sure, if you use D and X for the parameters, instead,
( xrnd.asm):00259 * you don't have to manipulate the stack frame and you can save a lot of code.
( xrnd.asm):00260 * Again, I leave it as an exercise for the reader.
( xrnd.asm):00261 * (Cough) Yeah, here I am reaching for an excuse to show stack frames on the 6809.
( xrnd.asm):00262 *
( xrnd.asm):00263 *
( xrnd.asm):00264 *-------------------
( xrnd.asm):00265 *
( xrnd.asm):00266 * One more example, also using no global declarations,
( xrnd.asm):00267 * but using a separate parameter stack:
( xrnd.asm):00268 * (For some reason, we decide this time that both X and D must be preserved.)
( xrnd.asm):00269 *
( xrnd.asm):00270 org $4000
( xrnd.asm):00271 *
( xrnd.asm):00272 * One input parameter and one output parameter on U stack:
0000 ( xrnd.asm):00273 xrndv2sst set 0 ; input parameter: pointer to the previous state
0000 ( xrnd.asm):00274 xrndv2sres set 0 ; output parameter: pseudo-random number
( xrnd.asm):00275 *
4000 ( xrnd.asm):00276 xrndv2s:
4000 3416 ( xrnd.asm):00277 pshs d,x ; save them
4002 AEC4 ( xrnd.asm):00278 ldx xrndv2sst,u ; nothing to dodge on U stack
4004 EC84 ( xrnd.asm):00279 ldd ,x
4006 2602 ( xrnd.asm):00280 bne xrndv2sgo
4008 C601 ( xrnd.asm):00281 ldb #1 ; Make sure it's not 0.
400A ( xrnd.asm):00282 xrndv2sgo
400A 46 ( xrnd.asm):00283 rora
400B 56 ( xrnd.asm):00284 rorb
400C E884 ( xrnd.asm):00285 eorb ,x
400E E784 ( xrnd.asm):00286 stb ,x
4010 56 ( xrnd.asm):00287 rorb
4011 E801 ( xrnd.asm):00288 eorb 1,x
4013 1F98 ( xrnd.asm):00289 tfr b,a
4015 A884 ( xrnd.asm):00290 eora ,x
4017 ED84 ( xrnd.asm):00291 std ,x
4019 EDC4 ( xrnd.asm):00292 std xrndv2sres,u
401B 3596 ( xrnd.asm):00293 puls x,d,pc ; restore and return
( xrnd.asm):00294 *
( xrnd.asm):00295 * Seed takes two input parameters, has no output parameters:
0002 ( xrnd.asm):00296 xrndv2sstptr set 2 ; pointer to the previous state
0000 ( xrnd.asm):00297 xrndv2sseedval set 0 ; seed value
( xrnd.asm):00298 *
401D ( xrnd.asm):00299 xrndv2sseed:
401D 3416 ( xrnd.asm):00300 pshs d,x ; save them
401F 3716 ( xrnd.asm):00301 pulu d,x ; get parameters
4021 ED84 ( xrnd.asm):00302 std ,x ; set seed
4023 2602 ( xrnd.asm):00303 bne xrndv2sseedgo
4025 6C01 ( xrnd.asm):00304 inc 1,x ; make sure it's not 0
4027 ( xrnd.asm):00305 xrndv2sseedgo
4027 3596 ( xrnd.asm):00306 puls d,x,pc ; return
( xrnd.asm):00307 *
( xrnd.asm):00308 * Also self-initializing, can be in ROM, ROM can be anywhere.
( xrnd.asm):00309 * Supports as many separate random number generators as you want.
( xrnd.asm):00310
No comments:
Post a Comment