joel's programming fun: September 2022

Saturday, September 24, 2022

A Short Description of VTL-2 Programming (Very Tiny Language pt 2)

(Continuing from VTL-2 Expressions, because some people want to know:)

As I understand it, Very Tiny Language essentially owes its existence to the paucity of support from MITS for the Altair 680. RAM boards from the manufacturer were not just expensive, but they were really hard to come by. And even if you had the RAM, putting anything in the RAM was painful. No floppies, no cassette tape, only the serial port, so you had to load BASIC or the assembler or other tools from paper tape, and that took a long, noisy several hours according to what I have heard. And without tools, it's hard to develop tools in the first place.

So Gary Shannon and Frank McCoy at the famous Computer Store made this very tiny programming language that fit into three ROM sockets on a single ROM board. 760 bytes of fairly tight code. (When I say fairly tight, don't get me wrong. I might kibitz their work, but they did a pretty good job.)

They took a bunch of short-cuts in the design of the language to keep it tight, which is something to keep in mind while we walk through using the language to write programs.

And it turned out to be useful enough that owners of the 8800 also wanted it, so they wrote a version for that. And then 3rd parties wrote versions for the 6502 and other processors.

Why?

Because it can turn your microcomputer into a programmable (integer) calculator even with very little ROM and RAM. (And it's also an example of a useful, minimal ad-hoc programming language.)

My post on expressions talked about using it without really programming in it. This post will be about programming in it.

Allocation variables!:

Before we can start putting VTL programs in memory, we have to make sure VTL knows where it can store the lines of program we feed it.

(Fasten your seat-belt and hang on, this detour-is-not-a-detour is a bit of a rough ride.)

We need to check the two allocation variables, the current allocation base -- & (ampersand) -- and the top of allocatable memory -- * (asterisk). (Yes, ampersand and asterisk are variables.)

?=&

will print the current allocation base. And

?=*

will print the top of allocatable memory -- except that they have to be correctly set first.

Yes, these are variables, and yes, that's what they are named.

If you want to know the total amount of memory available for programming (and the array, more later), type

?=*-&

-- if they happen to be correctly set. Which is what I should explain now.

To understand the problem, consider that Gary Shannon and Frank McCoy at one time were making a little money off the VTL ROMs they sold. Not much, though. They really didn't want to have lots of different versions of the ROM in their inventory.

Other than the CPU decision, they just wanted the user to be able to plug the ROMs into a ROM board, set the board's address, and turn on power. So they didn't want to program a RAM limit into the ROMs that would not apply to a large number of their customers, even if the customer could fix the limit. Better to make customers aware of what those variables are and why, and ask the customers to set them themselves.

Maybe.

Anyway, if you have the original ROMs or the original source (including the version I started with in bringing VTL-2 up on EXORsim), these variables are not set when VTL-2 starts up.

And, until they are set, you can't start programming.

If you are running VTL-2 inside an emulator, the emulator may do you the convenience of clearing memory before loading the interpreter -- in which case it will be obvious that they are not set.

Or it may not.

And if you are running real hardware, the variables will probably have garbage in them on boot-up, which may be obvious, or which may look valid to the interpreter (and may cause crashes if you try to edit or look at program lines).

On the other hand, if you are working with one of the executable images made from source in which I moved the VTL variables out of the direct page (all the rest of the versions I have posted), I went ahead and put start-up code in the interpreter that sets them -- specifically because I moved those variables and the initial allocation base is no longer a simple question of 264 on a 68XX CPU.

Similarly, C being what C is, VTL-C sets them for you. And others have done likewise, for similar reasons.

So, you need to take a look at them and figure out if they need to be set or not.

Top of allocatable RAM:

The top of allocatable RAM is fairly straightforward. Check it now:

?=*

It should be either equal to the amount of RAM you have (or have given it), if VTL resides (say, in a ROM) in an upper address range, or the amount of RAM minus 1K to 2K -- so that VTL, and maybe system stuff such as the stack, can reside in RAM above the allocatable area.

Confusing? Maybe I should draw a map:

0000:	Possible system stuff
(low address):	VTL variables, input buffer, and usually the VTL stack
(slightly higher address):	& (allocation base is here.)
	VTL allocatable space -- for program (and maybe an array)
	* (top of allocatable area is here.)
(still higher address):	possible system stuff (including, possibly, the host stack)
(even higher address, may be top half of memory):	VTL interpreter code image
(above that):	more system stuff (including, possibly, the host stack)

Hard numbers?

The top of the allocatable area should be something around 30000 or more for 32K RAM systems, 14000 or more for 16K, 6000 or more for 8K, etc. If the interpreter is up in the upper 32K of address space, where ROM often is, top of RAM may be exact powers of two.

However, in some systems, the system stuff at the bottom of RAM is huge, and general purpose RAM itself may not start until something like 16384. And general purpose RAM may not come in simple powers of 2. So you may need to figure out the total RAM size and add it to the physical beginning of general purpose RAM.

(For the MC-10, for example, you probably want to play around with it with my unedited sources in the XRoar emulator, then read the source and figure out how to adapt it to your hardware, to use your RAM configuration effectively. Your likely configurations are 4K, 8K, 20K, and ??, so top of allocatable space is likely to be 1 or 2 K less than one of those.)

(To be strictly accurate, in the MC-10 images and the CoCo images I've produced, I'm letting the VTL interpreter piggyback its stack on the stack the BASIC host uses. This is because the MC-10 interrupt routines, in particular, don't like the stack being somewhere else -- which means you really could eliminate the stack allocation in the source code, but I didn't want to pull too many rugs out from under things.

Without the necessity of maintaining BASIC's stuff, you would need less than 900 bytes at the top of RAM for the interpreter, not 2K. Top of RAM would be above 31000, 15000, 7000, etc.)
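As a rough cross-check of those numbers, here is a small Python sketch (Python rather than VTL, just for the arithmetic). The function name and the assumption that general purpose RAM starts at address 0 unless you say otherwise are mine, not something from the VTL sources.

```python
# Rough estimate of the top of the allocatable area (the * variable).
# Assumption: general purpose RAM starts at ram_start (0 by default) and
# reserved_above bytes at the top are kept for the interpreter, stack, etc.
def top_of_allocatable(ram_bytes, reserved_above, ram_start=0):
    """Return an estimated value for * in decimal."""
    return ram_start + ram_bytes - reserved_above

for kib in (32, 16, 8):
    ram = kib * 1024
    print(str(kib) + "K:",
          top_of_allocatable(ram, 2048), "with 2K reserved,",
          top_of_allocatable(ram, 900), "with 900 bytes reserved")
```

If your general purpose RAM starts somewhere other than 0, pass that address as ram_start and the same subtraction applies.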

Allocation base:

If you are running an original ROM or an image from the original source, the initial value of the allocation base depends on the CPU:

For the 6800, the allocation base at startup is 264, eight bytes above the stack area at the top of the direct page.
For the 8080, they moved the variables up to make space for the CPU interrupt shims at the bottom of memory, and the initial allocation base is 320.

If you are running an image from my modified sources and have not altered the label ZERO or the ORG before where ZERO is declared, the allocation base depends on which source you are using. You can check the assembler listing for the value of the label PRGM.

(Cough. Let me correct myself.)

If you are assembling it yourself from whatever sources, you should check the assembler listing.

Go ahead and find the listing, or re-run the assembler and get it, and look for the label PRGM. If the interpreter is setting things for you, a little bit after the label COLD, or a little before START, you'll see something like


      LDX	#PRGM
      STX	AMPR

That's the allocation base. And you'll see the code setting the STAR variable shortly before the START label. That's the theoretical top of memory.

(If you're running a C language version, it will most likely be 264 on startup. Or, if it's a 32-bit version, 528 or something. Maybe. Look for clues in the C source. Or assume it's good unless it crashes.)

At the time I write this post,

For my 6800 version with just the variables moved, the initial allocation base is hexadecimal $308, which is 776 in base ten.
For my general 6801 enabled, slightly optimized version with the variables moved, it is the same.
For the MC-10 using the general 6801 image, because there is a huge gap where nothing exists below video RAM and because VTL and its variables have to reside above video RAM and memory in use by interrupt routines and such, the initial allocation base is $4400, or 17408 in base ten.
For the Color Computer using the transliteration image, while we don't have that gap of wasted memory map like on the MC-10, we do have memory used by BASIC and interrupt routines, and by the video RAM, which have to reside below the variables. The initial allocation base I'm using is $1708, or 5896 in base ten.
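VTL-2 wants decimal, while assembler listings show hexadecimal, so a quick conversion helps. This is a minimal Python sketch; the helper name is made up for illustration.

```python
# Convert Motorola-style "$" hexadecimal (as seen in assembler listings)
# to the decimal numbers you would type into VTL-2.
def motorola_hex(text):
    """Convert a string like '$308' to an integer."""
    return int(text.lstrip("$"), 16)

for label in ("$108", "$308", "$4400", "$1708"):
    print(label, "=", motorola_hex(label))
```

The first entry, $108, is the 264 of the original 6800 ROM; the others are the values listed above.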

Whew. End of detour-is-not-a-detour.

If the values of the allocation variables don't look reasonable on startup, figure out what they should be and set them before you continue. Otherwise, you will be typing in code and VTL won't be remembering it.

One more point before we return to our originally scheduled programming:

New command:

Now that you have noted and remembered the initial value of the allocation base, consider what happens if you have a program in memory and you type

&=264

or whatever the initial allocation base was on your system (say, 5896 on the Color Computer or 17408 on the MC-10).

Presto.

VTL has forgotten your program and is ready for a new one.

Programming a counted loop in VTL-2:

With the allocation variables properly set, we can now type program lines in and list them.

In my bring-up process, I used several variations of a simple counting program because it was simple. We'll start by looking at that. If you've been through my expressions post, much of this will be familiar, and you'll be able to see some differences from BASIC.

10 A=0
20 A=A+1
30 ?="A=";
40 ?=A
50 ?=""
60 #=(A<10)*20
70 ?="BYE-BYE NOW"

Lines 10 and 20 look just like BASIC.

If you make a mistake, _ (the underscore key) allows you to back up one space, but the terminal doesn't back up or erase it from the screen. If you have trouble keeping track of what's happening this way, type @ (the at-each key) and start the line over from scratch.

Line 30 has that print command that reminds one of the BASIC shortcut key, but isn't the same. It will just print the string

A=

on the output device. And the semicolon will stop the VTL interpreter from putting a new-line after it.

Line 40 prints the value of A.

Line 50 doesn't have a trailing ; (semicolon), so printing the empty string puts a new line on the output device.

Line 60:

You may remember from the expressions post that # is the current line number. And I mentioned that setting it forces a jump. But what is that weird business about multiplying a comparison by 20?

There is no line 0 in VTL. Trying to jump to line 0 is a no-op.

The value of the comparison is 0 (false) or 1 (true). Multiply that by 20 and you get either 0 or 20. If it is 0, there is no jump. If it is 20, setting the line number to 20 causes a jump to line 20.

Say what?

This is VTL's version of BASIC's IF -- GOTO statement.

VTL re-uses the assignment syntax -- and the semantics -- not just for jumps, but for conditional jumps, too.

Just for the record, the left-to-right parsing means that we don't really need the parentheses on the condition. I put them in there to make the condition more obvious. Whether you do or not is up to you. Remember that more complex conditions will likely need the parentheses.
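If the multiply-by-a-comparison trick is still fuzzy, here is the same loop modelled in Python. This is only a sketch of the control flow described above, not how the interpreter is written, and the function names are mine.

```python
def jump_target(a):
    """Model line 60, #=(A<10)*20: the comparison is 0 or 1, times 20."""
    return int(a < 10) * 20

def run_counted_loop():
    """Model the seven-line counting program; return the lines it prints."""
    out = []
    a = 0                          # 10 A=0
    while True:
        a = a + 1                  # 20 A=A+1
        out.append("A=" + str(a))  # 30, 40 and 50 together print one line
        if jump_target(a) == 0:    # 60: target 0 means no jump, fall through
            break                  #     (otherwise the target is 20: loop)
    out.append("BYE-BYE NOW")      # 70
    return out

print("\n".join(run_counted_loop()))
```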

So, that's how this counted loop works. If you run it.

Can you guess how to run it?

That's right. Type

#=1

on a line by itself.

OK
#=1
A=1
A=2
A=3
A=4
A=5
A=6
A=7
A=8
A=9
A=10
BYE-BYE NOW

Huh? Why not #=10?

#=10 would also work, but VTL does you the convenience of looking for the next higher existing line number and starting there, so #=1 will run any program.
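That next-higher-line lookup can be sketched in Python with a sorted list and bisect. What happens when the target is past the last line is my assumption here (the sketch just reports that there is nothing to run); check your own VTL for the real behaviour.

```python
from bisect import bisect_left

def resolve_jump(line_numbers, target):
    """Return the line where execution would start, or None.

    line_numbers must be sorted ascending. A target of 0 is no jump.
    Assumption: a target beyond the last line finds nothing to run.
    """
    if target == 0:
        return None
    i = bisect_left(line_numbers, target)
    return line_numbers[i] if i < len(line_numbers) else None

program = [10, 20, 30, 40, 50, 60, 70]
print(resolve_jump(program, 1))   # starts at the first line
print(resolve_jump(program, 10))  # exact match
```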

What about line 0? Try it if you haven't yet.

Nothing? Remember, VTL forbids line 0.

That means that, when you type 0 by itself on a line, it does not try to edit, save, or delete a line 0. And the authors took advantage of that.

OK
0
10 A=0
20 A=A+1
30 ?="A=";
40 ?=A
50 ?=""
60 #=(A<10)*20
70 ?="BYE-BYE NOW"

Typing a 0 by itself on a line gives you a listing, instead -- if the allocation variables define memory to remember a program in, and if there is a program in there.

Some versions of VTL use an L instead of a 0 as a listing command, just for the record.

Programming Factorials in VTL-2

This one is from the PDF manual, but I've modified it just a little bit for no particular reason.

Since VTL wants us to be frugal, we'll take the cumulative approach.

Line 10, N will be the index number.

Line 20, F will be the cumulative factorial.

Line 30, we want to print N,

then, in line 40 print some text to show what it is,

and, in line 50, print F.

In line 60, we'll terminate the output line.

In line 70, we'll remember F in G.

In line 80, we'll increment N,

and in line 90, we'll use the new value of N and the old value of the factorial to obtain the factorial for the new N.

In line 100, we'll see if the new factorial is so large that it wraps around. This only works once, and we only know it works because we know it works. We might as well check to see if N is less than 9.

And, in fact, the first time through, it doesn't work. We don't have a less than or equal comparison, so we'll add a line for the equality at 95.

10 N=0
20 F=1
30 ?=N
40 ?="! = ";
50 ?=F
60 ?=""
70 G=F
80 N=N+1
90 F=F*N
95 #=(G=F)*30
100 #=(G<F)*30

and that prints out the factorials of 0 through 8.
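To see why it stops after 8, here is the same program modelled in Python with the arithmetic masked to 16 bits. It is a sketch of the logic only; the masking stands in for VTL's 16-bit unsigned variables.

```python
MASK = 0xFFFF  # 16-bit unsigned arithmetic, as in VTL-2

def factorial_lines():
    """Model the factorial program; return the lines it prints."""
    out = []
    n, f = 0, 1                            # 10, 20
    while True:
        out.append(str(n) + "! = " + str(f))  # 30 to 60
        g = f                              # 70 G=F
        n = (n + 1) & MASK                 # 80 N=N+1
        f = (f * n) & MASK                 # 90 F=F*N, wraps past 65535
        if g == f or g < f:                # 95 and 100: jump back to 30
            continue
        break                              # new F is smaller: it wrapped
    return out

print("\n".join(factorial_lines()))
```

9! is 362880, which wraps to 35200 in 16 bits. That is smaller than 8! (40320), so neither jump is taken and the program falls off the end.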

Subroutines in VTL:

Print the value of the (system) variable ! (exclamation mark).

?=!

If you just ran the program above, it should give you 101.

VTL does a little more than just set the line number when it sets the line number. (I told a little lie back there about the semantics. Or, at least, not the whole truth. It's doing a little more than re-using the semantics of assignment.)

Before VTL sets the # line number variable, it saves the current line number plus 1 in the ! (exclamation mark) variable.

So you get one level of call and return. Exclamation mark is the current return line number.

Don't get too excited, it's just one level. If you want to further call subroutines, you have to save the current return value first, yourself, somewhere.
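A toy model in Python may make the one-level limit clearer. The assumptions are mine: that a jump to 0 leaves everything alone, and that a subroutine returns by assigning the saved value back to the line number. The class and its names are hypothetical.

```python
class LineJumps:
    """Toy model of '#' (current line) and '!' (return line)."""

    def __init__(self):
        self.current = 0       # stands in for '#'
        self.return_line = 0   # stands in for '!'

    def set_line(self, target):
        """Model '#=target'. Assumption: a target of 0 changes nothing."""
        if target == 0:
            return
        self.return_line = self.current + 1  # saved before the jump
        self.current = target

m = LineJumps()
m.current = 50
m.set_line(1000)             # call a subroutine from line 50
saved = m.return_line        # keep the outer return line ourselves
m.current = 1010
m.set_line(2000)             # nested call: '!' is overwritten
inner_return = m.return_line
m.set_line(inner_return)     # inner routine returns
after_inner = m.current
m.set_line(saved)            # outer routine returns via our saved copy
print(saved, inner_return, after_inner, m.current)
```

Without the saved copy, the outer return line would be lost as soon as the nested call is made.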

There are other system variables, but I will stop here.

This should be enough to help you decide whether or not you want to go looking for the PDF manual with the rest of the system variables listed and some example programs to try.

One place the PDF manual for the Altair 680 can be found (with some other retro stuff) is at http://www.altair680kit.com/ .

The manual includes more sample code.

One downside of VTL-2 on the MC-10 or Color Computer is that we don't have a way to load the programs in other than typing them. If we had a true monitor, we could save and load the program area. Or someone could extend VTL-2 on these two, to create cassette tape save and load functions, perhaps -- someone besides me. I think I need to get back to some other projects.

Friday, September 23, 2022

A Short Description of VTL-2 Expressions (Very Tiny Language pt 1)

(Because some people want to know:)

The simplest way to describe VTL (Very Tiny Language) is to say it is very small, and it turns your 8-bit microcomputer into a programmable integer calculator. Typical assembler implementations can fit into well less than 1 kilobyte of ROM.

Compare this to the programming language BASIC, which can also be described as a way to turn your computer into a programmable desktop calculator. A typical small BASIC interpreter capable of handling floating point (fractional/scientific notation) numbers takes about 8 to 24 kilobytes of object code on a typical 8-bit microcomputer. BASIC on the original IBM PC took about 32 kilobytes. Bywater BASIC (something close to the original BASIC on the IBM 5150 PC) on a modern 64-bit computer takes about 180 kilobytes.

The Unix utility bc is another language that turns your computer into a calculator with arbitrary precision, and it has a typical object code image of about 90 kilobytes on a 64-bit computer.

Obviously, you won't get the functionality of a full BASIC or bc (or Forth) with VTL. But you can write small programs with it, so it's interesting.

Anyway, the available manuals all tend to be PDFs, and you may not want to go to the trouble of finding one and downloading it and hoping it's relevant to the implementation you have, so I'm writing down my impressions of VTL-2, with some examples.

This post focuses on expressions in VTL-2.

One word of warning, VTL hardly gives any feedback at all. In fact, my conversions to the MC-10 and the Color Computer don't even give you a cursor. (I should try to fix that, I suppose, but not today.)

But it does at least give you the

OK

prompt when it's ready for more commands.

No error messages. (Which I do not intend to try to fix.)

If you want to see whether the VTL interpreter you've just gone to the trouble of downloading, assembling, and loading into your target machine is running, try this:

?=1+1

Don't forget the = after the ? . This is not a dialect of BASIC.

It should print out 2 for you. That means the expression grammar part is working at some level.

So, let's proceed with syntax and semantics, etc. --

Assignment:

Assignment is done with the usual = symbol:

A=A+1

and such.

Variables:

All variables are pre-declared and pre-allocated. Since the variable names are limited to one letter or character, there aren't that many, and it takes less memory to have them already exist than to write code to let you declare and allocate them.

The 26 variables A through Z are integer variables you can use as you want.

Output:

Printing re-uses the assignment syntax:

?=A

prints whatever integer value is in A to your output device.

Note that it sort of looks like the short-cut available in many BASICs, but is not.

? A

does something hard to explain just yet, but does not really do what you want, even though it may look like it does.

Printing strings is a logical extension to the print syntax:

?="APPLES"

puts the string

APPLES

on your output device.

You want to know why the = is in there? Re-using the syntax allows re-using the code and keeping it small, is the best explanation I can think of. (I assume. I am neither Gary Shannon nor Frank McCoy. Ask them if you get a chance.)

More variables, including system variables:

Variables other than A through Z?

All variable names are one character.

Take a look at your nearest ASCII chart. (It should be possible to produce an EBCDIC version of VTL, but we won't talk about that here.)

If you're running a real operating system, you can use

man ascii

at the command line to get an ASCII chart. Otherwise, a quick web search can find you one.

Stock VTL doesn't give you separate variables with lower-case names.

It does give you variables with punctuation marks for names, from exclamation mark to caret, but some, like ? and the digits 0 through 9, are not directly available, and several of the others are used by the interpreter.

The current line number, for instance, is #. Of particular note,

#=1200

is how you jump to line 1200. Note, again, the re-use of the assignment syntax. Or, actually, this is a re-use of assignment itself, but needs some more explanation, later.

(If you are having trouble with VTL-2 bombing or freezing on you, try setting the allocation base and top of RAM variables to zero for now:

&=0
*=0

But if VTL-2 is stable, don't do that. I'll explain later, in the programming introduction.)

Expressions:

First, let's work through the fundamental VTL expressions.

+ is addition,
- is subtraction,
* is multiplication,
/ is division (leaving the remainder in the variable %),
= in expressions (cough) is comparison for equality,
> in expressions is comparison for greater-than-or-equal,
< in expressions is comparison for less than, and
() nests expressions, overriding the normal left-to-right parse.

Comparisons give you 0 if false, 1 if true, which I will show how to use later.

All assume 16-bit integer math.

So, let's look at some examples.

The expression we used as a test up there

?=1+1

was just adding 1 and 1 and printing the result.

Typing in

Z=26

sets the variable Z to 26. After that, you can type

?=Z

and it prints out the value of Z,

26

unless you've changed it since then.

Let's look at comparisons for a moment.

?=Z=26

might cross your eyes, but it will print out 1 for true. It's comparing the value of Z with 26, which is what we just set it to, isn't it?

?=Z<26

will print out 0 for false, and

?=Z>26

will print out 1 for true. Remember that > in VTL is not just greater than. It's the opposite of <, so it's greater than or equal. (I don't think I'd have done it quite that way, but this is not my toy.)
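The three comparisons can be written out in Python as a quick reference. This is a sketch of the semantics described above; the function names are made up.

```python
# VTL-2 comparisons give 1 for true and 0 for false.
def vtl_eq(a, b):
    """Model '=' inside an expression."""
    return int(a == b)

def vtl_lt(a, b):
    """Model '<'."""
    return int(a < b)

def vtl_ge(a, b):
    """Model '>', which in VTL-2 means greater than OR EQUAL."""
    return int(a >= b)

z = 26
print(vtl_eq(z, 26), vtl_lt(z, 26), vtl_ge(z, 26))
```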

Correcting input:

Continuing on,

?=31416/100000

Woops. Too many zeroes. If you tried backspace, you learned that doesn't work. (I think it could be made to work, but I haven't tried.) Unmodified VTL uses an underscore (back arrow on the MC-10 and Color Computer) to cancel single characters of input:

?=31416/100000_

but it does not erase them on the screen:

?=31416/100000_
3
OK

That's a little hard to get used to.

VTL also lets us cancel whole lines with at-each:

?=31416/100000@

Try it:

?=31416/100000@
OK
?=31416/10000
3
OK

So it divides 31,416 by 10,000 and prints the result. Following that with

?=%

prints the remainder

1416
OK

And that's helpful when you don't have a floating point or other fractional math built-in.
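In Python terms, the quotient and the remainder in % are what divmod gives you. A minimal sketch, with a made-up helper name, assuming 16-bit unsigned operands:

```python
def vtl_divide(a, b):
    """Return (quotient, remainder) for 16-bit unsigned a / b.

    The quotient models the value of the expression and the remainder
    models what is left in %. Dividing by zero raises ZeroDivisionError
    here; this sketch does not model what VTL does in that case.
    """
    return divmod(a & 0xFFFF, b & 0xFFFF)

quotient, remainder = vtl_divide(31416, 10000)
print(quotient, remainder)
```

Printing the quotient, a decimal point, and the remainder side by side is one way to fake a fixed-point result.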

Some limits:

Again, this is integer only, so

?=3.1416

gives a result that is hard to explain just yet.

Also, just to warn you in advance,

31416 / 10000

without the ?= at the front defines line 31416 containing the invalid expression / 10000 . Which leads us to the fact that there is no line 0, which the authors have turned to advantage.

0

on a line by itself tells the interpreter to list the current program memory -- If you have valid allocation variables for your program memory.

Valid allocation variables? That's a detour we want to avoid just now.

Back to expressions.

Operator precedence:

The parse is strictly left-to-right, unless you use parentheses. (Again, they are keeping the interpreter simple and the code small.)

So, let's look at something a little complicated. Say we are calculating the area of a circle.

A = πr²

VTL-2 does not have (the last time I looked) exponentiation, so, in VTL-2, that would be

A=p*r*r

except that we don't have floating point or any other fractional math, so we're going to have to scale it. And p is not pi, so we'll have to get π in there ourselves somehow.

Suppose we can just throw the fractional part away, and a single fractional digit plus an extra is sufficient:

A=(314*R*R)/100

Try it. Set R to an integer radius and use the expression above.

R=4
A=(314*R*R)/100
?=A

Check it in bc if you have access to bc:

scale=40
pi=a(1)*4
r=4
a=pi*r^2
a
50.2654824574366918154022941324720461471488

(Heh. bc is fun.)

Or check it in a BASIC:

bwBASIC: 10 p=3.1415927
bwBASIC: 20 r=4
bwBASIC: 30 a=p*r^2
bwBASIC: 40 print "Area:", a
bwBASIC: list
     10: p=3.1415927
     20: r=4
     30: a=p*r^2
     40: print "Area:", a
bwBASIC: run
Area:         50.2654832

Okay? (I think bc is more fun.)

Okay. We're somewhat satisfied.

What happens if we leave the parentheses out in VTL? Try it.

In this case, it works okay going from left to right. In fact, that is exactly what we want -- In this case.
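Here is the scaled calculation stepped through left to right in Python, with each intermediate result masked to 16 bits. It is a sketch of the arithmetic only, and the function name is mine.

```python
MASK = 0xFFFF  # 16-bit unsigned, as in VTL-2

def circle_area_scaled(r):
    """Model A=314*R*R/100, evaluated strictly left to right."""
    acc = (314 * r) & MASK
    acc = (acc * r) & MASK
    return acc // 100

for r in (4, 14, 15):
    print(r, circle_area_scaled(r))
```

With this scaling, 314*R*R stays under 65536 only up to R=14; at R=15 the intermediate value wraps and the answer is garbage.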

Use of parentheses:

Let's look at something a little more involved. Say we want to calculate a markup of 50% on a lot of 50 apples and 80 nashi pears. Apples are ¥125 each and nashi are ¥189. (Cheap today.)

A=50
N=80
P=(A*125+N*189)*150/100
?=P

Will this work with just the one set of parentheses? If you guess no, you're right.

397

What happened? Work it left-to-right instead of using the algebraic precedence you worked so hard to remember in middle school or elementary:

50 * 125 is 6250.
6250 + 80 is 6330.
6330 * 189 is 1196370. Woops. That's way too big for 16 bits.

(You may be wondering, so I'll tell you here. All variables in VTL are unsigned. If you handle negative numbers, you have to handle them by magnitude and somewhere remember that they are negative.)

But go ahead and plug it into VTL anyway. Unless you've got VTL running on a 68000 or some other 32-bit or 64-bit CPU (or written to calculate 32-bit integers on a 16-bit or 8-bit CPU or something), it gives you 16722 to this point, right?

(Break out bc and subtract 2^16*18 and, oh, 16722 is exactly what it should have given us at this point.)
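If you don't have bc handy, the same wraparound check can be done in Python. This is just the arithmetic from the steps above, masked to 16 bits.

```python
def wrap16(x):
    """Keep only the low 16 bits, as 16-bit unsigned math does."""
    return x & 0xFFFF

acc = wrap16(50 * 125)   # 6250
acc = wrap16(acc + 80)   # 6330
full = acc * 189         # 1196370, too big for 16 bits
acc = wrap16(full)       # what 16-bit math keeps
print(full, acc, full - 18 * 2**16)
```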

Okay, 50% is a half, so we should be able to just scale by 2 instead of 100. A and N should still be as we want them, so let's just repeat the expression setting P:

P=(A*125+N*189)*3/2

Oh, but there's something niggling at the back of your mind. It's going to blow up at 6330*189 again. It shouldn't. What's going wrong?

When parentheses don't say otherwise, it's working strictly left to right, right?

We have to have more parentheses because we have to calculate an intermediate value and VTL does not understand that algebraic precedence business that we spent so much time learning in middle school. (It's like some of the old desktop calculators from decades ago, or like some modern cheap calculators.)

I wonder if it will really work with more parentheses.

P=((A*125)+(N*189))*3/2

Does it give you 32055?

Okay, I guess it worked. bc says that's what it should give us. But it got close to the limits of 16-bit math in there. (Try each step by hand to watch it get close to blowing up.)
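For experimenting away from the target machine, here is a small left-to-right evaluator in Python. It is a sketch under my own assumptions (only + - * / and parentheses, single-letter variables, 16-bit unsigned results), not a port of the interpreter, and vtl_eval is a made-up name.

```python
MASK = 0xFFFF  # 16-bit unsigned

def vtl_eval(text, variables):
    """Evaluate an expression strictly left to right, honoring parentheses."""
    tokens = text.replace(" ", "")
    pos = 0

    def term():
        nonlocal pos
        ch = tokens[pos]
        if ch == "(":
            pos += 1
            value = expression()
            pos += 1                      # skip the closing ')'
            return value
        if ch.isdigit():
            start = pos
            while pos < len(tokens) and tokens[pos].isdigit():
                pos += 1
            return int(tokens[start:pos]) & MASK
        pos += 1
        return variables[ch] & MASK       # single-character variable

    def expression():
        nonlocal pos
        acc = term()
        while pos < len(tokens) and tokens[pos] != ")":
            op = tokens[pos]
            pos += 1
            rhs = term()
            if op == "+":
                acc = (acc + rhs) & MASK
            elif op == "-":
                acc = (acc - rhs) & MASK
            elif op == "*":
                acc = (acc * rhs) & MASK
            elif op == "/":
                acc = acc // rhs
            else:
                raise ValueError("unsupported operator: " + op)
        return acc

    return expression()

v = {"A": 50, "N": 80}
print(vtl_eval("(A*125+N*189)", v))
print(vtl_eval("((A*125)+(N*189))*3/2", v))
```

The first line shows the wrapped intermediate value, the second the fully parenthesized version that stays within 16 bits.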

Now we know something about VTL-2 expressions, and have seen some of the limits, and this short description is getting long. Let's save programming for another post.

*** The post on programming is here:
https://joels-programming-fun.blogspot.com/2022/09/short-description-vtl-2-programming-very-tiny-language-pt2.html

Monday, September 19, 2022

VTL-2 part 5, Transliterating to 6809

Tandy/Radio Shack TRS-80 Color Computer 1

16K Color Computer 1, by Wikimedia contributor Bilby, licensed under CC BY 3.0, via Wikimedia Commons

Well, the transliteration of the MC-10 version of VTL-2 in 6801 assembler to 6809 assembler on the Tandy Color Computer actually went pretty quickly once I made time, using the relationships between the 6800/6801 assembly language and runtime and the 6809 assembly language and runtime that I describe in my post on the software differences between the three CPUs.

But then I found myself using the assembly language equivalent of poking telltales to the screen to figure out where I'd fallen asleep at the wheel.

There was only one place, really, where I had inverted the transfer from B to A in the character output routine, or was it from A to B in the keyboard input? Something like that.

And it handled basic expressions, but wandered off in the ether any time I tried to type in a program. After going back over the transliteration with a fine-toothed comb and finding nothing (and falling asleep doing it) for several days in a row, then using more telltales to the screen to pinpoint where it was dying, I decided to look back in the commented disassembly of Color Computer extended BASIC.

It took only about half an hour to find the problem.

Color Computer BASIC uses a number of variables in the direct page.

I had dodged a few of the variables down around $C0 by starting the direct page variables at $D0. But there is a variable at $E2 used by the IRQ handler routine as a counter, and with the variables starting at $D0, SAVLIN landed on $E2. SAVLIN is essentially the most used variable during program editing.

But $00E2 is the place where the CoCo BASIC IRQ response return was counting screen refresh interrupts (60 times a second).

So I moved everything down to start at $C4, just after the PIA mask variable, and the variable list ended right before $E2.

So I didn't have to move any variables, and I didn't have to optimize the copy routines to be sensible and use X and Y together instead of X and the SRC and DEST variables in the DP.
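The address arithmetic behind that move can be checked in Python. This sketch assumes the variable block moved as a unit with the offsets between variables unchanged; the constant names are mine.

```python
# Addresses from the text above, in the direct page.
OLD_BASE = 0xD0      # where the VTL variables first started
NEW_BASE = 0xC4      # where they start after the move
IRQ_COUNTER = 0xE2   # the counter CoCo BASIC's IRQ handler updates

# SAVLIN collided with the counter, so its offset in the block is:
savlin_offset = IRQ_COUNTER - OLD_BASE
# Assumption: same offset after the move.
new_savlin = NEW_BASE + savlin_offset
# Room available before the block would run into the counter again:
room = IRQ_COUNTER - NEW_BASE

print(hex(savlin_offset), hex(new_savlin), room)
```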

And I can type in the test program I've been using:

10 A=0
20 A=A+1
30 ?=A
40 ?=""
50 #=(A<11)*20
60 ?="DONE"

and list it by typing

0

and hitting Enter. And I can run it by typing

#=1

and hitting Enter.

The source code is at https://osdn.net/users/reiisi/pastebin/8705. Copy or download it from there.

If you want to assemble it to run in 32K, fix the ORG before COLD. It needs less than 1K of RAM for the code. With another 1K to the end of RAM for the BASIC stack it borrows and as a buffer between BASIC and VTL-2, set the ORG at 2K before the end of RAM.
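The ORG arithmetic is simple enough to check in Python. This sketch assumes RAM starts at address 0 and that 2K is reserved at the top, as described above; the function name is hypothetical.

```python
def org_for(ram_kib, reserve_bytes=2048):
    """Return the ORG address that leaves reserve_bytes at the end of RAM.

    Assumption: RAM starts at address 0.
    """
    return ram_kib * 1024 - reserve_bytes

for kib in (16, 32):
    print(str(kib) + "K:", "$%04X" % org_for(kib))
```

For 16K this comes out to $3800, the address used in the EXEC and bin2cas commands in this post.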

You'll need LWTools to assemble it -- either download it or use mercurial or git to clone it, then compile it and copy the executables into your preferred place for user-local executables.

Then use the following command line or something similar to assemble the source:

lwasm --list --symbols -f decb -o VTL_6809_coco_translit.bin VTL_6809_coco_translit.asm

This will give you a .bin file that XRoar can load from XRoar's File menu. (Sometime I'll put screenshots here. Until then, refer to the MC-10 post.)

Run XRoar with the command line

xroar -machine coco -ram 16k &

for a 16K RAM Color Computer configuration. It will look a lot like what I show of XRoar running the MC-10 emulation in the MC-10 post.

If you load it from the .bin file, you'll need to type

EXEC &h3800

at the BASIC prompt to get VTL-2 running after loading it.

Another way to load it is to convert it to a .cas format and load it from the Tape Control dialog, as I described for the MC-10. You can convert the .bin file to a .cas file with a command like

bin2cas.pl -o VTL-2.CAS -C -l 0x3800 -e 0x3800 VTL_6809_coco_translit.bin

To get the bin2cas.pl tool, look for it under CAS Tools on the Dragon page on 6809.org.uk:

https://www.6809.org.uk/dragon/

If you specify the load and execute addresses in the bin2cas command line as above, you'll need to type

CLOAD
EXEC

at the Color Computer BASIC prompt.

You should be able to modify this source code to run on other 6809 computers with a little thought, especially if you walk through my posts on getting the 6800 and 6801 source running on the MC-10 and on EXORsim.

You should also find some interesting challenges left for the interested reader, such as modifying it to use the Y register in the edit routines, and to work with the DP moved out of BASIC's working area. Such things as that.

I think I'm off to other interesting projects now.

(Maybe. Not sure whether I want to go back and see if I can get fig-Forth running right on the 6809 first or whether I want to try my hand at writing my own VTL source, or a VTL-like language of my own design, etc. Or go back to trying to work on novels for a while. I probably need to get XRoar set up for GNU-debugging before anything else.)

[JMR202209231810: add]

I have written up a post on VTL expressions, here: https://joels-programming-fun.blogspot.com/2022/09/short-description-vtl-2-expressions-very-tiny-language-p1.html , which will help in testing and otherwise making use of the language. I should also shortly have a post up introducing programming in VTL-2.

[JMR202209231810: add end]

[JMR202210011737: add]

I now have my work on VTL-2 up in a private repository:

https://osdn.net/users/reiisi/pf/nsvtl/wiki/FrontPage

There is a downloadable version of VTL-2 for the Tandy MC-10 (6801) in the stock 4K RAM configuration in there, with source, executable as a .c10 file, and assembly listing for reference. Look for it in the directory mc10:

https://osdn.net/users/reiisi/pf/nsvtl/files/

I haven't wrapped it up for the Color Computer yet; expect that later. But you can see the latest source I'm working on for the Color Computer in the source tree:

https://osdn.net/users/reiisi/pf/nsvtl/scm/tree/master/

[JMR202210011737: add end]

Software Differences between the 6800, the 6801/6803, and the 6809

Following on from my previous post about the software differences between the 6800 and the 6801/6803, I'm going to try to set out the software differences between those and the 6809. You may want to read that post first for background.

In terms of history, the 6801 came after the 6809, not, as some seem to expect, before.

Certain of Motorola's customers looked at the 6809 and said they wouldn't know what to do with all of that, and couldn't they have a chip that was just the 6800 with some ROM and RAM built-in instead?

Motorola already had the 6802/6808 out, which combined the 6800 and 128 bytes of RAM. The 6802 also had most of the CPU clock generation circuitry built in. The 6808 (not directly related to the 68HC08 and later derivatives of the 6805) was an odd chip, basically the 6802 with the RAM disabled (presumably because the RAM failed Q/A tests).

The 6802 was intended to be paired with custom ROM versions of the 6846, which was similar to MOS Technology's RIOT chip, but definitely not the same. The 6846 had 2K of ROM, a parallel port, and a hardware timer device. (I had a Micro Chroma 68, which had the 6808 or 6802 paired with a 6846 with a monitor program in the ROM -- Fun little board for prototyping, and I should have done more than I did with it.)

As I said in the post linked above, Motorola borrowed a few ideas from the 6809, but designed the 6801 to be strictly compatible with the 6800 -- except for the new op-codes and the condition codes for the CPX compare X instruction. And they improved the instruction timing. But they failed to put in the missing direct-page versions of the unary operators like increment, decrement, and such.

And one of Motorola's big customers said the 6801 was still too much CPU, couldn't they strip it down more and make it cheaper?

And Motorola complied, producing the 6805, which is a true 8-bit CPU with only one accumulator and only an 8-bit-wide index register. But it has bit manipulation instructions for the direct page, and it has direct-page versions of all instructions. For the load, store, and arithmetic instructions, it has 16-bit offset indexed versions (the unary instructions get by with 8-bit offsets). On the 6805, instead of index+constant-offset, you (often) wanted to think constant-base+index. (Oh. And the initial 6805s had no push or pop, just branch and jump to subroutine. You had to do software stacks if you wanted them.) This was all in fewer transistors than the original 6800.

Motorola would evolve the 6805 in HCMOS versions, with more instructions and such. Some years later, Motorola brought out the 68HC08, which added a high byte of index to the 68HC05. (And then there was the 68HCS08 with push and pop, and the 'R08 and ....)

And later still, Motorola brought out the 68HC11, which added a Y index, bit manipulation instructions, and a little more to the 68HC01. The 68HC11 was strictly upward-compatible with the 6801. And we are roughly a decade ahead of the story now. But it's relevant.

As I say, the 6809 preceded the 6801 and 6805. The latter two benefited from lessons learned in the 6809, which the 6809 was never evolved to benefit from. (Nor was the 68000, unfortunately, until the CPU32, much later.)

Remember, the 6801 is almost (99.9% or so) perfectly object-code compatible with the 6800, but with faster instruction timings on a lot of instructions.

The 6805 is not object-code compatible, but inherits most of the overall instruction set and run-time model of the 6800 series. It also has better instruction times than the 6800, more effective use of memory space, and more flexible indexing.

The 6809 is not object-code compatible. Instructions are laid out a little differently, and index encoding is significantly different. It inherits most of the instruction set of the 6800, and it extends the register and run-time model of the 6800 series, but it does not have all the improved instruction timings. The instructions it is missing can be synthesized, but care has to be exercised in the process in many cases.

And it departs from the 6800 in an important, non-trivial way --

Pushes on the 6800 and 6801 are post-decrement. Pops (PULs) are pre-increment. The stack pointer S is always pointing to the next available byte on the stack. That's a bit of a departure from most stack implementations, but the 6800 makes up for it in the TSX (transfer S to X) instruction by adding 1 before the transfer. And it makes up for it in the TXS instruction (transfer X to S) by subtracting 1 before the transfer. So, after a TSX, X is pointing to the last item pushed.

But if you store S in RAM and load it into X with the 6800 or 6801, this adjustment does not happen. You have to remember to use INX (increment X) after the LDX (load X), or adjust the offset to X. In other words, on the 6800,


    PSHA
    TSX
    ORAA   0,X

but


    PSHA
    STS	   TEMP
    LDX	   TEMP
    ORAA   1,X	; adjusting the offset instead of INXing.

On the 6809, pushes are pre-decrement, and pops (PULs) are post-increment. So the stack register is always pointing to the last item pushed, without adjustment. If you play games with the stack in 6800 or 6801 code, you have to be careful with this when moving the code to the 6809. (But the games usually played with the stack on the 6800 or 6801 are pretty much beside-the-point on the 6809. There are much better ways. More below.)

Using the above example, on the 6809,


    PSHS   A
    TFR	   S,X
    ORA    ,X

and


    PSHS   A
    STS	   TEMP
    LDX    TEMP
    ORA	   ,X	; no adjusting the offset or index.

or, better yet,


    PSHS   A
    ORA	   ,S+  ; but this is getting a little ahead of the story.
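
To make the two conventions concrete, here is a small Python model. It is only a sketch, not a CPU emulator, and the class and method names are my own inventions for illustration. It tracks nothing but a stack pointer and a byte array, which is enough to show why the 6800 needs `1,X` after STS/LDX while the 6809 gets by with `,X`.

```python
class Stack6800:
    """6800/6801 convention: push is post-decrement, pull is pre-increment."""

    def __init__(self, mem, sp):
        self.mem, self.sp = mem, sp

    def push(self, byte):
        self.mem[self.sp] = byte  # store at S first...
        self.sp -= 1              # ...then decrement: S points at the next free byte

    def pull(self):
        self.sp += 1              # increment first...
        return self.mem[self.sp]  # ...then read

    def tsx(self):
        return self.sp + 1        # TSX hands X the address of the last item pushed


class Stack6809:
    """6809 convention: push is pre-decrement, pull is post-increment."""

    def __init__(self, mem, sp):
        self.mem, self.sp = mem, sp

    def push(self, byte):
        self.sp -= 1              # decrement first...
        self.mem[self.sp] = byte  # ...then store: S points at the item itself

    def pull(self):
        byte = self.mem[self.sp]
        self.sp += 1
        return byte


old = Stack6800(bytearray(256), 0xFF)
old.push(0x5A)
print(hex(old.sp), hex(old.tsx()))  # S sits one below the item; TSX corrects for that

new = Stack6809(bytearray(256), 0x100)
new.push(0x5A)
print(hex(new.sp))                  # S is the address of the item, no adjustment
```

The model ignores 16-bit wraparound and everything else about the processors; it is only meant to pin down where S points after a push.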

That's a long preamble, but I hope it will help you know what to look for as I compare the 6809 with the other two processors.

In broad overview, the differences in the 6809 are as follows, kind of in the order they tend to stand out:

Pairing A:B as double accumulator D, as in the 6801 (but not including 16-bit shifts).
Five indexable registers (X, Y, U, S, and PC) in the 6809, two of which (U and S) can be stack registers, as opposed to the 6800/6801 single X index register and single S stack register.
Movable direct page, with associated DP (upper 8 bits) direct page register. (This was not implemented as well as we might have hoped, more below.)
Extended interrupt model. (More registers means more time required to save them all, so an interrupt service that used only one or two registers could be assigned to the FIRQ fast interrupt request, and save time by only saving what it used.)
Two extra software interrupts (which allows the one-byte SWI to be used as a software breakpoint and the two-byte SWI2 and SWI3 to be used as system calls, if you want to do system calls with SWIs).
Stack model that always points directly at the top of stack, as mentioned above.
Compactly encoded extended indexing model:
- no offset,
- small (5 bit) signed constant offset,
- medium (8 bit) signed constant offset,
- long (16 bit) signed constant offset,
- A, B or D accumulator signed offset,
- post-increment and pre-decrement indexing (for efficient data moving -- goodbye stack blasting!),
- using the PC as an index register (as noted above),
- final memory indirect with any of the above,
- plus memory indirect with extended mode (but not, unfortunately, with direct page mode),
- and, for some reason, no way to use a load effective address (LEA) instruction to directly calculate the address of a variable in the direct page (further explained below).
Unary instructions all have direct-page mode, but are slower in direct-page mode than you might hope.
Effective address calculation instructions (LEA) to put the target addresses calculated in each of the indexed modes above into one of X, Y, U, or S. (JMP instruction allows indexed modes, so PC as a target is not needed.)
Note that LEAS and LEAU do not affect the condition codes, where LEAX and LEAY affect the Z bit.
No INX/DEX or INS/DES. Use LEAX 1,X/LEAX -1,X and LEAS 1,S/LEAS -1,S instead (in other words, for single increment/decrement. See below).
Full compare instructions for the four proper index registers, compatible with the 6801 CPX but not the 6800 CPX.
Multiple register push and pop for both S and U stacks.
- (And, hello, again, stack blasting. Hah.
  Be very careful if you do. Even with interrupts masked, this does not do the trick:
      PULU A,B,X,Y
      PSHS A,B,X,Y ; look at what happens when you repeat!
  Probably the best you should want to do is
      PULU A,B,X
      STD ,Y++
      STX ,Y++
  unless you like wee-hours-of-the-morning debugging sessions.)
Register-to-register transfer (TFR) instruction with source and destination registers encoded in a post-byte replaces the Txx transfer instructions in the 6800.
Register-with-register exchange (EXG) instruction that uses the same post-bytes as TFR.
Has ABX as in the 6801 (but not 6800), not affecting the condition codes.
LEAX B,X would be almost equivalent, except ABX treats B as unsigned and sets no flags by the result, where LEAX B,X treats B as a signed offset and sets the Z flag.
No ABA (add B to A) or SBA (subtract B from A). Instead, you can do it something like this: PSHS B followed by ADDA ,S+ for ABA (or SUBA ,S+ for SBA), which is slower, but essentially equivalent if your S stack pointer is pointing to valid memory.
Instead of SEC, CLC, etc., for working with the condition codes, the 6809 has
- ORCC #FLAGBITS ; for setting flags
- ANDCC #~FLAGBITS ; for clearing flags
Neither of the 16-bit shifts that the 6801 has.
For LSRD use LSRA followed by RORB.
For LSLD use LSLB followed by ROLA.
SYNC instruction can be used for fast synchronization to external input or to wait for an interrupt (slightly different from 6800/6801 WAI).
6809 has long branches, which aid in writing position-independent code. Conveniently, LBRA (long branch always) and LBSR are allocated single-byte op-codes to help motivate their use.
And the 6809 has BRN/LBRN as in the 6801.
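
About the stack-blasting warning in the list above: as far as I can tell, the trouble with repeating PULU A,B,X,Y / PSHS A,B,X,Y is that each six-byte group arrives intact, but successive groups land at descending addresses, so the groups come out in reverse order. Here is a rough Python model of both loops. The function names are made up for illustration, and it models only the memory traffic, not timing or interrupts.

```python
def blast_pulu_pshs(src: bytes) -> bytes:
    """Model repeating PULU A,B,X,Y / PSHS A,B,X,Y (6 bytes per pass)."""
    assert len(src) % 6 == 0
    dest = bytearray(len(src))
    u = 0                      # U walks up through the source
    s = len(dest)              # S starts just past the end of the destination
    while u < len(src):
        group = src[u:u + 6]   # PULU: A,B,X,Y are read from ascending addresses
        u += 6
        s -= 6                 # PSHS: S drops by 6, and the registers land in
        dest[s:s + 6] = group  # the same A,B,X,Y order within the group
    return bytes(dest)


def copy_pulu_std_stx(src: bytes) -> bytes:
    """Model repeating PULU A,B,X / STD ,Y++ / STX ,Y++ (4 bytes per pass)."""
    assert len(src) % 4 == 0
    dest = bytearray(len(src))
    u = y = 0
    while u < len(src):
        d, x = src[u:u + 2], src[u + 2:u + 4]  # PULU A,B,X
        u += 4
        dest[y:y + 2] = d                      # STD ,Y++
        y += 2
        dest[y:y + 2] = x                      # STX ,Y++
        y += 2
    return bytes(dest)


data = bytes(range(12))
print(list(blast_pulu_pshs(data)))    # six-byte groups come out in reverse order
print(list(copy_pulu_std_stx(data)))  # a straight copy
```

The second loop moves fewer bytes per pass, but the destination pointer climbs in the same direction as the source pointer, so nothing gets shuffled.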

I think that pretty much covers it, except the devil that is in the details.

In other words, you can map most single 6800 and 6801 instructions to single 6809 instructions. Those that don't map to single 6809 instructions either map to pairs of instructions or, especially in the case of the interrupt instructions, have to be fixed to account for the larger register set being pushed by the interrupt and for some technical differences implied by hardware in whatever design you're using.

But if you do that kind of unintelligent transliteration, you'll want to do at least a second pass. For instance,


    INX
    INX
    INX
    INX

which maps mechanically to,

    LEAX   1,X
    LEAX   1,X
    LEAX   1,X
    LEAX   1,X

really wants to be replaced with the single instruction


    LEAX    4,X

And


    LDAA    0,X
    LDAB    1,X
    INX
    INX

which maps mechanically to


    LDA    0,X
    LDB    1,X
    LEAX   1,X
    LEAX   1,X

really wants to be replaced with the single instruction


    LDD     ,X++

Unfortunately, the above kinds of optimizations tend to be hidden by the ways programmers tend to try to optimize 6800 code.

I think I've covered stack games pretty well. If you understand how to build a stack frame on the 6801 and understand what I've mentioned about the stacks and the TFR instruction above, you should probably be able to extrapolate how to do stack frames on the 6809.

Oh, parameter handling is a separate topic, but I guess I should at least give it a mention.

On the 6800 or 6801, if you are passing parameters on the call stack (which I don't prefer, myself), your calling routine might do something like this:


    PSHA    ; parameter in A
    JSR     SOMEFUN
    INS
    ...

and your called routine might do something like this:


SOMEFUN
    ...
    TSX
    LDAB    2,X   ; skip over return address to parameter
    ...
    RTS

On the 6809 it would look like this:


    PSHS    A    ; parameter in A
    LBSR    SOMEFUN
    LEAS    1,S
    ...

    ...
SOMEFUN
    ...
    LDB     2,S   ; skip over return address to parameter
    ...
    RTS

And since I've mentioned that I prefer splitting the stacks, on the 6800, a separate software parameter stack would look something like


PMSTKP      ; preferably in the direct page
    RMB     2
    ...
PUSHA       ; Looks like a lot of bother, I know.
    LDX     PMSTKP
    DEX
    STX     PMSTKP
    STAA    0,X
    RTS
* (We need this, too:)
INCPS1      ; Looks like more bother, yes.
    LDX    PMSTKP
    INX
    STX    PMSTKP
    RTS
    ...

    ...
*** Woops! not this:    JMP     PUSHA   ; parameter in A
    JSR    PUSHA   ; parameter in A (this)
    JSR    SOMEFUN
*** Facepalm! not this:    INS
    JSR    INCPS1  ; increment parameter stack (this)
    ...
    
    ...
SOMEFUN
    ...
    LDX     PMSTKP
    LDAB    0,X   ; no return address to worry about
    ...
    RTS

On the 6809 the separate software stack pointer is in U:


    PSHU    A    ; parameter in A
    LBSR    SOMEFUN
    LEAU    1,U  ; drop the parameter from the U stack
    ...

    ...
SOMEFUN
    ...
    LDB     ,U   ; no return address to worry about
    ...
    RTS

Which I think is very clean.

I need to talk a little about the direct page register.

The DP register allows a potential trap. If interrupt handler routines use variables in the direct page, they should load and set their own DP, or they should use the full extended address.

For example, if you have a timer variable at $00E2 that is incremented in the IRQ service routine, you might do this:


    SETDP   0
    ...
IRQ 
    ...
    INC   $00E2
    ...
    RTI

and the assembler would assemble that as a direct page reference.

If the user code that gets interrupted has DP set to $02, the address that will get incremented at interrupt time will be $02E2, not $00E2, which is not what you want at all.

So what you should do is


    ...
    SETDP   0
IRQ 
    LDA   #0
    TFR   A,DP
    ...
    INC   $00E2  ; now the address is right
    ...
    RTI   ; restores the interrupted DP

Or you might do


    ...
IRQ 
    LDA   #0
    TFR   A,DP
    ...
    INC   >$00E2  ; force extended mode
    ...
    RTI

(I think I'm not getting that backwards, > to force extended and < to force direct page.) If your assembler doesn't allow both of the above, it is not a decent 6809 assembler. Get another one.

I think I need to show how to calculate the effective address of a direct-page variable:


DPBASE EQU n*$100
    ORG   DPBASE
    ...
DPVAR RMB 1
    ...
    ORG   CODE
    SETDP   n
    LDA   #n
    TFR   A,DP
    ...
    LDB   #DPVAR-DPBASE
    TFR   DP,A
    TFR   D,X   ; X now contains the address of DPVAR
    ...
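
The arithmetic behind both the interrupt trap and the address calculation above is just gluing two bytes together, with DP as the high byte. A minimal Python sketch, with function names that are mine and purely illustrative:

```python
def direct_address(dp: int, offset: int) -> int:
    """Effective address of a direct-page access: DP supplies the high byte."""
    return ((dp & 0xFF) << 8) | (offset & 0xFF)


def split_address(addr: int) -> tuple:
    """Inverse: the (DP value, low byte) pair for a 16-bit address."""
    return (addr >> 8) & 0xFF, addr & 0xFF


# The interrupt trap: the same direct-page operand $E2 under two DP values.
print(hex(direct_address(0x00, 0xE2)))  # what the handler meant to touch
print(hex(direct_address(0x02, 0xE2)))  # what it touches if DP is still $02

# TFR DP,A / LDB #DPVAR-DPBASE / TFR D,X builds the address the same way:
# A (from DP) is the high byte of D, B is the low byte.
dp, low = split_address(0x02E2)
print(hex(direct_address(dp, low)))
```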

And I need to mention the PC relative addressing mode, even though you won't be considering it when moving 6800 or 6801 code to the 6809. This is just so you can get a real idea of why the 6809 is so interesting when it doesn't automatically run 6800 code faster.

Let's look at a simple seven-constant table buried in code that needs to be position independent:


*
MFCONST:
    FCB   CONST0
    FCB   CONST1
    FCB   CONST2
    FCB   CONST3
    FCB   CONST4
    FCB   CONST5
    FCB   CONST6
*
MOREFUN
    ... 
    CMPB  #7
    BHS   MFERROR
*** ERK! Not this:    LDX   MFCONST,PCR  ; the assembler should calculate the offset for you.
    LEAX  MFCONST,PCR  ; the assembler should calculate the offset for you. (this)
    LDA   B,X
    ...

Without the PCR addressing, you'd need a loader to patch the address of MFCONST used in MOREFUN to be able to run MOREFUN at arbitrary addresses. With the PCR addressing, MOREFUN and the constant table can be in ROM.
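
If it helps to see why the PCR form survives relocation, here is a small Python sketch of the offset arithmetic. The addresses are hypothetical, picked only for the example, and the function names are my own. The assembler stores the distance from the address of the following instruction to the target, and the CPU adds that distance back to the PC at run time.

```python
def pcr_offset(next_pc: int, target: int) -> int:
    """What the assembler stores: target minus the address of the next
    instruction, wrapped to 16 bits."""
    return (target - next_pc) & 0xFFFF


def pcr_resolve(next_pc: int, offset: int) -> int:
    """What the CPU computes at run time: PC plus the stored offset."""
    return (next_pc + offset) & 0xFFFF


# Hypothetical layout: the table at $4000, the LEAX ending at $4010.
table, after_leax = 0x4000, 0x4010
offset = pcr_offset(after_leax, table)
print(hex(offset))

# Move code and table together by any amount; the stored offset still works.
for delta in (0x0000, 0x1234, 0xB000):
    moved_pc = (after_leax + delta) & 0xFFFF
    moved_table = (table + delta) & 0xFFFF
    assert pcr_resolve(moved_pc, offset) == moved_table
```

An absolute address, by contrast, would have to be rewritten for every value of delta, which is the patching job the loader would otherwise be stuck with.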

[JMR202210011737: add]

As an example of source code that is being kept very parallel, I now have my work on VTL-2 up in a private repository:

https://osdn.net/users/reiisi/pf/nsvtl/wiki/FrontPage

You can peruse the source tree and compare files:

https://osdn.net/users/reiisi/pf/nsvtl/scm/tree/master/

[JMR202210011737: add end]

Saturday, September 3, 2022

Programming Tandy/Radio Shack's MC-10 in Assembly Language

What you were probably looking for is here:

https://joels-programming-fun.blogspot.com/2022/08/trs-mc-10-assembly-lang-pt1-vtl-2.html

This is where that rant originally resided, but the URL and the content didn't quite match, so I moved it.

I will probably add links for more assembly language information for the MC-10 here later.