Tuesday, June 27, 2017

Looking at Memory Layout

The programming language C allows you to look at memory layout.

Why would you want to do this?

To get a look at how dangerous a stack crash or smash might be.

In the C run time environment, there is usually a region of memory for statically allocated variables, another for variables allocated through the library by calling malloc() and friends, and another for variables allocated dynamically on the stack. And, of course, there is a region for the runnable object code, can't forget that.

Early on, in the old world of sixteen-bit computing, the allocations often looked something like this:


0xFFFF
  stack (dynamic variables, stack frames, return pointers)
0xFxxx ← SP
  gap
  heap (malloc()ed variables, etc.)
  statically allocated variables
0xF000
  application code
0x8000
  operating system code, variables, etc.
0x0000


There were many variations, of course. An M6800 memory map might look something like this:


0xFFFF
  reset vectors
  I/O ports and monitor code
0xFC00
  operating system code
0x8000
  stack (dynamic variables, stack frames, return pointers)
0x7xxx ← SP
  gap
  heap (malloc()ed variables, etc.)
  statically allocated variables
0x4000
  application code
0x0080
  operating system variables
0x0000


Either way, the problem is easy to see. In this kind of map, the stack grows downward, and the malloc()ed variables in the heap fill memory upward. There is a convenience of sorts, in that an application can be designed to use more of one and less of the other.

But they share the same space.

If a user pushes an application beyond its design limits the two regions can overlap. Since physical memory doesn't magically multiply and increase when the two overlap, writing to one overwrites what is in the other.

Imagine a chalkboard. You and a friend are working on a math problem at the chalkboard. You're doing the math on the left, and your friend is writing down the results that you get on the right.

Things go okay, until the two of you have about 20 results, and she starts erasing your work so she can write down your results.

You're trying to cooperate, but you just don't have enough room.

And after she erases your work to write down your results, if you have to go back, you don't know where to go back to. Your work is gone.

Or, say, you erase the results she has been so kind as to write down for you, to continue your work. That's just as bad, because you still end up not knowing where you are and how to go back so you can move ahead.

A larger address space and more memory don't make the problem go away; they just give you more room before things go boom. (There is a way to mitigate these problems, which I will talk about in another blog post on CPU design.)

That's the basic reason you want to know about the layout of your memory.

Now, I'm not going to do all of your homework for you, but this little bit of code can get you started:



/* stkreveal.c
// A Q&D tool to reveal the structure of the C runtime address space.
// by Joel Matthew Rees, Amagasaki, Japan, 28 June 2017.
// Copyright 2017 Joel Matthew Rees, claimed to the extent possible.
// This source is very much derivative of so many other sources
// that claiming copyright should be meaningless.
// But lawyers are known to quibble for money,
// so I assert all such copyright as I can.
// Permission to use:
// Author extends permission to use the source
// and the object it produces when compiled freely
// for any and all purposes except to assert intellectual property claims.
// Author makes no claims about the safety or reliability of the code.
// USE AT YOUR OWN RISK.
*/


#include <stdlib.h>
#include <limits.h>
#include <stdio.h>


/* #define AUTOBLOCKSIZE */

#if defined AUTOBLOCKSIZE
#  if INT_MAX > 0x7fff
#     define BLOCKSIZE 0x10000000L
#  else
#     define BLOCKSIZE 0x4000
#  endif
#else
#  define BLOCKSIZE 0x1000
#endif


void reportaddress( char * label, char * addr )
{
   printf( "%32s: %20p (%22lu)\n", label, (void *) addr, (unsigned long) addr );
}


char globalvariable[ BLOCKSIZE ];


int findandshow( int * argc, char * argv[] )
{
   char * mallocedblock = malloc( BLOCKSIZE );
   int flag = mallocedblock != NULL;

   reportaddress( "program parameters", (char *) argv );
   reportaddress( "main() parameters", (char *) argc );
   reportaddress( "findandshow() parameters", (char *) &argc );
   reportaddress( "mallocedblock", mallocedblock );
   reportaddress( "globalvariable", globalvariable );
   reportaddress( "reportaddress() function", (char *) reportaddress );
   if ( flag )
   {
      free( mallocedblock );
   }
   return flag;
}

int main( int argc, char * argv[] )
{
   return findandshow( &argc, argv ) ? EXIT_SUCCESS : EXIT_FAILURE;  
}



(If you copy and paste, remember to make sure the brackets and ampersands, etc., survived. I explain where to get a compiler in several places, here, for instance.)

On one of my computers, I get the following:


me@mycomputer:~/games$ cc -Wall -o stkreveal stkreveal.c
me@mycomputer:~/games$ ./stkreveal
              program parameters:       0x7ffc1411c938 (       140720645196088)
               main() parameters:       0x7ffc1411c84c (       140720645195852)
        findandshow() parameters:       0x7ffc1411c818 (       140720645195800)
                   mallocedblock:            0x24d3010 (              38613008)
                  globalvariable:             0x600b60 (               6294368)
        reportaddress() function:             0x40059c (               4195740)



The actual numbers change from run to run because of address space layout randomization (ASLR), which is used to slow down attackers who want to make use of this little game we are playing.

The remainder of this project is how to figure out what address is showing you what, where.

And to figure out how to protect yourself in a world where Intel doesn't protect you.

And you might be interested in the blog post that started this one, which is still in progress, about half done now. It will come in four parts:
  1. explaining the currently published issues,
  2. outlining a better approach to addressing regions,
  3. describing how CPU manufacturers could completely protect the return address stack,
  4. and describing my ideal processor that probably shall never be.
[JMR201707011051:

Maybe it won't be giving away too many clues to mention a few useful tools if you are running a *nix workstation or something similar. Fellow blogspot denizen gallier2 reminded me on my main blog (#1 in the list of rants above) of the pmap utility. Reading the man page on that will point you to some other useful things, as well.

(mmap() and brk() may be of particular interest.)

To use pmap on a particular process, say an instance of bc or gforth or ruby, you would use ps to find out the process id:
$ ps wwaux | grep gforth
me       3281  0.0  0.1  22856  1412 pts/2    S+   11:00   0:00 gforth
me       3593  0.0  0.0   7412   884 pts/1    S+   11:01   0:00 grep gforth

The line of interest is the one in which the command is "gforth", not "grep gforth". If that output is confusing, you can use pgrep, to help figure out which is which:

$ pgrep -l gforth
3281 gforth
Now that you know the process id, you can use pmap:
$ pmap 3281
3281:   gforth
0000000000400000    100K r-x--  /usr/bin/gforth-0.7.0
...
ffffffffff600000      4K r-x--    [ anon ]
 total            22860K
If you want to use pmap on the little program I showed you here, add a little wait-for-user-input loop at the end:




int main( int argc, char * argv[] )
{
   int ch = 0;
   int rval = findandshow( &argc, argv );
   while ( ( ( ch = fgetc( stdin ) ) != 'q' )
            && ( ch != EOF ) )
   {
      /* do nothing. */
   }
   return rval ? EXIT_SUCCESS : EXIT_FAILURE;  
}



 Typing 'q' and hitting enter will stop the program. While it's running, you can open another terminal window, find the process id with "pgrep stkreveal", and take a pmap of it.

Beyond that is your homework. Have fun.

But be careful with the knowledge you get from this sort of homework. Don't fall into the trap of thinking, "It's just a computer."

]


Tuesday, November 29, 2016

Using GMP to Approximate π

Okay, I guess I can now use the method as a way to show a little about how to use the Gnu Multi-Precision library.


The Gnu Multi-Precision library, GMP:

If your native language is not English, the English wikipedia article has links to wikipedia articles in other languages. It's not much beyond an overview, but it should help a little, and those pages link some non-English resources.

Wish I had time to write this all in Japanese, or even to try to figure out how to annotate what I've written properly.

Once you get used to GMP, you'll probably find MPFR and MPC quite useful, as well. Probably in that order. (MPC builds new functionality onto MPFR which builds new functionality on GMP, if I read the docs correctly.) These add flexibility and methods of dealing with precision issues, and functions for complex numbers.

Note that GMP includes rational arithmetic -- explicit fractions!

Also note that the manual strongly suggests that, for what I'm doing here with multi-precision math, the MPFR library is more complete. If I were attempting a formal discussion of precision, yes, I would go with MPFR. But I'm not, and we need to start somewhere that's not too hard, so I'm starting with GMP.



First, I converted the program I wrote using C's standard floating point math to the gmp libraries. The result is here:
https://osdn.net/users/reiisi/pastebin/4462

[JMR201705122219:

I should note that the program is not an efficient or fast way to calculate π. It's more of an easy demonstration of overly simplified use of multi-precision arithmetic. And, even there, I don't track the precision very carefully.

]

You can refer to my blog posts above for general explanations of what the program does. And comments in the source code should explain much of the differences between C's standard floating point math and the handling of GMP math, but I should probably add a few notes:

In the comments, I pretty much imply that mpf_t is a pointer. It isn't. Sorry. But having already pasted that into the paste buffer, I'll satisfy myself with an explanation here.

The type is a small structure containing a pointer, as is partially explained in the manual entry on memory management. Either way, you have to initialize it before you use it, as you see in the code. (You do have the code open in a text editor, or in another browser window, don't you?) 

C++ should (I think) allow using the multi-precision math almost the same as the standard floating point and integer types. I haven't tried, there may be some tricky stuff in loop syntax or such. I'm not a fan of C++, or of the idea of attempting to fold all program functionality onto existing syntax. This may have something to do with my coming from a FORTH background, where inventing new syntax on-the-fly doesn't interfere that much with the day-to-day workflow. (But it may create design nightmares down the road, and that's-a-different-story-sort-of.)

So using the GMP variables requires resigning yourself to C's function call syntax. But C makes that rather convenient, so it shouldn't be a problem, really.

The way they have set it up here, it requires the initialization to be separated from the declaration. Thus, all the variables of types provided by GMP that need to be accessible in the global scope of the main function are declared there at the top of the main function.

(I believe C99 and later allow us to declare them close to their first use, but I'm sticking with the most common idiom here.)

With all the globals declared, we are free to start ordinary procedural stuff, so the first thing I do is set the precision. The function call provided sets a precision that gives at least the number of bits of precision specified, including rounding. (MPFR allows you to be a little more precise so that you can control the loss of precision yourself.)
mpf_set_default_prec ( 67 )
ensures that we have at least 67 bits of precision. I want this to mimic (sort of) the default precision of bc when you call it as "bc -l" and load in the default math library.

You should want to know how I chose that 67 bits, so I'll tell you.

One binary digit (also called a "bit") allows you to represent two numbers, usually interpreted as zero and one. That's not enough to represent decimal numbers.

Two bits give you zero to three, which is great for base four. Three bits give you zero to seven which is great for base eight, or octal. Four bits give you zero to fifteen, great for base sixteen, or hexadecimal.

So, if we wanted twenty digits of base sixteen precision, 20 ✕ 4 or eighty bits would be enough. (And we could set it at eighty without causing too many issues, really.)

Since you can also represent zero to nine in four bits, four bits is enough for one digit of binary coded decimal. And twenty binary coded decimal (BCD) digits would also take eighty bits.

But GMP is not doing BCD. It's doing binary math. Eighty bits of integer binary gives you the ability to represent from 0 to (2^80 - 1). That's
00000000000000000000000000000000000000000000000000000000000000000000000000000000 
to
11111111111111111111111111111111111111111111111111111111111111111111111111111111
That is easy to convert to hexadecimal. Each group of four binary 0s is one hexadecimal 0, and each group of four binary 1s is one hexadecimal "F". So, in hexadecimal, that's
0x00000000000000000000
to
0xFFFFFFFFFFFFFFFFFFFF
which probably doesn't look very meaningful to you, unless you play with this stuff for fun like I sometimes do. How can we convert that to decimal numbers that we feel comfortable with? bc, of course!
me@mymachine:~/games/piArea$ bc -l
bc 1.06.95
Copyright 1991-1994, 1997, 1998, 2000, 2004, 2006 Free Software Foundation, Inc.
This is free software with ABSOLUTELY NO WARRANTY.
For details type `warranty'.
10^20
100000000000000000000
2^80-1
1208925819614629174706175
so it's from 0 to 1,208,925,819,614,629,174,706,175 (using the US centric notation). So, if we plug in 20 digits of 9s and output that in binary, we can count the number of bits. Uggh. ;-)
obase=2
99999999999999999999
1010110101111000111010111100010110101100011000011111111111111111111
But, wait, there's a better way!

The common logarithm (log base 10, or log10) of a number is strongly related to the number of decimal digits that it takes to write:
log10(10000) == 4
The base two logarithm is strongly related to the number of binary digits it takes to write:

log2(16) == 4
If only bc could give us a way to calculate log10 and log2. :)

bc gives us the natural logarithm (ln) in the form of "l()"! And any logb is related to any other logc like this:
logb(x) = logc(x) / logc(b)
So, let's set the output base back to ten and define log2
obase=10
define l2(x) { return l(x)/l(2); }
l2(16)
4.00000000000000000002
l2(10^20)
66.43856189774724695808
If you're using a reasonably modern bc, you'll probably want to call the function "log2" instead of "l2", since "l2" looks like "12":
define log2(x) { return l(x)/l(2); }
log2(10^20)
66.43856189774724695808
Having four tenths of a bit is kind of hard to do, so let's just say 67 instead of 66.4 bits, okay? And GMP gives us some extra bits and, by default, takes care of 4/5 rounding for us, which is a little more convenient here than what bc does. (We'll expect some difference in results at some point.)


Okay, now that the default precision is set, let's initialize all our variables. (We could have explicitly declared the precision on each initialization, but that would not have saved us the long explanation.)

We should note here that, if this program were more complex, especially if we were using GMP variables as local variables to functions other than main(), we would need to remember to clear them after use in order to avoid memory leaks. (I probably should have cleared them explicitly here to emphasize that.)

I avoid this allocation/deallocation problem here by declaring xsquared, ysquared, y, and deltaA in the outer block of main() rather than in the block local to the for loop, as I had in the native C code. (More on those, later.)

The initialize from string functions are really convenient:
mpf_init_set_str( mpf_t variable, char value[], int base )
so
mpf_init_set_str(deltaX, "0.025", 10);
initializes deltaX with 1/40 at 67 bits of precision (interpreting 0.025 as a base ten fraction).

When we have the value in one mpf_t variable, we can use that value directly in further initializations, as in
mpf_init_set( x, lastX );
And, if we really don't need to specify a value, just want to set the variables up for later use, we can do that, too:
mpf_init( deltaA );
These variables which I declared without initial values are the ones I'm using in the for loop, by the way.

If there are command line parameters (argc > 1), we have to convert the value for deltaX from the parameter. But the parameter is in a string, so that's straightforward:
mpf_set_str( deltaX, argv[ 1 ], 10 )
It's safer to be sure we got a good conversion when we are using unlimited precision, so I am checking the return value for indication of an error now. In the native C version, I figured there would be no harm in a bad conversion, but it probably would have been better to check.

Moving on to the interesting stuff, the for loop is not impacted as much as we might fear: 
for ( ; mpf_cmp( x, limit ) <= 0; mpf_add( x, x, deltaX ) )
The loop counter initialization here was completed explicitly when the variable was initialized, so I left it out of the loop. (It's just not a good idea to leave variables lying around declared but uninitialized. Unless you enjoy working with hidden bugs.)

The comparison expression
mpf_cmp( x, limit ) <= 0
is similar to saying
x <= limit
with standard C floating point types. mpf_cmp( a, b ) returns
  • an integer value less than zero (negative) if a is less than b, 
  • integer zero if a == b (something we should not expect or depend on),
  • and an integer greater than zero (positive) if a is greater than b.

Thus, the for loop continues until the value of x exceeds the limit.


The function call
mpf_add( x, x, deltaX )
is essentially the same thing as saying
x = x + deltaX
would be for standard C floating point types (except it does not yield the value assigned as a return value), thus, it means "increment x by deltaX".


The function parameters are listed in the order they would be written in our ordinary (modern infix) algebraic expressions, so, for each of add, mul (multiply), sub (subtract), and div (divide), the parameters are the same:
mpf_sub( mpf_t target, mpf_t left_op, mpf_t right_op )
Target gets the result, and the right op is added to, subtracted from, multiplied with, or divided into the left, as if it were written
target = left_op - right_op
in the case of mpf_sub, and the same with the rest. Again, an important difference is that these functions do not return a result, so we can't nest them and we have to deal with intermediate results ourselves.

Now you know why I have xsquared and deltaA as variables in the GMP version. We need some place to put the intermediate values. Thus
ysquared = 1.0 - ( x * x );
becomes
mpf_mul( xsquared, x, x );
mpf_ui_sub( ysquared, 1L, xsquared );
and
area += deltaX * y;
becomes
mpf_mul( deltaA, deltaX, y );
mpf_add( area, area, deltaA );
I'll repeat myself here -- I did not want xsquared, ysquared, y, and deltaA going out of scope in the loop, so I declared and initialized them in the outer block of main().

If I had a really good reason to keep them local to the loop block, I'd have needed to (re-)initialize them at the top of the loop and clear them at the bottom, to be sure the variables would be valid (and non-segment-faulting) and to keep memory from leaking. That takes enough extra time that I'd prefer to avoid it in a loop like this. And, of course, it gives more opportunity for bugs to creep in.

(In practice, I don't know of a compiler that would deliberately move loop local variables from iteration to iteration of a simple loop like this, but depending on that behavior is depending on something that is not promised in the standard. Not a good idea, and you'll forget to do it right in some places it really will matter.)

Finally, we can celebrate that GMP provides us with a version of printf that handles GMP variables. Making your own output functions is fun, but sometimes you don't need that kind of fun.

And that's about it. Play with the code and have good kinds of fun!

(I may use this as an excuse to show some ways to parse command line parameters, but, if I do, it will be in another post.)

Friday, November 18, 2016

Using π to Check Your Floating Point Accuracy

In my post on using areas of rectangles to approximate π in C, I blithely asserted that standard C floating point math would lose accuracy compared to bc. But I didn't explain how I came to that conclusion.

I'll show you a little bit about how I got there, tiptoeing around some of the steps where it's easy to get lost.

First, refresh your memory about the bc program:
And here's how we'll call the program from the command line:
echo "q=1;m=32;scale=41;" | cat - halfpiArea_by_m.bc | bc -l
The name for the number of rectangles, "m", was rather awkward, but "q" for quiet should be easy to remember.

I suppose you may be wondering how that works.

First, echo just echoes what it is given into the standard output stream.

Second, the output stream of echo is piped to the standard input stream for cat and the command line tells cat to continue with the contents of the source code of the bc program. This way, cat concatenates things so that the result is a program that sets q, m, and scale appropriately before it begins:
q=1;m=32;scale=41;
/* ... */
if ( m < 1 ) { m = 11; }
s=0;
w = 1/m;
Unless we set them first, q, m, and d start at zero, and scale starts at 20 when we give bc the "-l" option.

So cat assembles the lines to set q, m, and scale with the program source and sends it all to the standard output stream.

Then the command line pipes the output of "cat" to "bc -l", and the program starts with scale, m, and q set to 41, 32, and 1.

Setting m changes the number of rectangles to divide the circle into, thus changing the width of the rectangles. We'll change that and examine the resulting attempt to calculate π.

The C program, halfpiArea, accepts a width as a parameter.

(That was not the best choice. You can change it if you want. I'm using the program as it is.)

The width is the inverse of the number of rectangles because the radius of the circle is 1. So we can calculate the width using bc:
echo "1/32" | bc -l
which tells us that one thirty-second is .03125000000000000000 (0.03125). We can select the decimal fraction that bc gives us on the terminal screen and copy it. In a Linux OS shell, we can usually use right-click after selecting the text. In a Mac OS X shell, the usual copy keyboard shortcut works.

In MSWindows, I think you will have to click the application icon in the left of the title bar and select "Edit" from the menu that pops up, then select "Copy" from the submenu. ;-*

Then we can run the C program by typing the program name and pasting the fraction in after as follows:
./halfpiArea .03125000000000000000
Except we want to run the bc program after the C program, so that we can get the output of both on the terminal screen at the same time:



Now we can select the output estimates for π and copy them to a text editor, where we can compare them at leisure.

We will notice that the two programs produce nearly the same results for counts of rectangles that are powers of two. Two rectangles, for example:
jwr@fun:~/work/mathgames/piArea$ ./halfpiArea .5
(             -0.5,0.866025403784439): 0.433012701892219
(                0,                1): 0.933012701892219
(              0.5,0.866025403784439):  1.36602540378444
(                1,                0):  1.36602540378444
Almost Pi:  2.73205080756888
jwr@fun:~/work/mathgames/piArea$ echo "q=1;m=2;scale=41;" | cat - halfpiArea_by_m.bc | bc -l
3: (1.00000000000000000000000000000000000000000, 0): 1.3660254037844\
3864676372317075293618347140
2.73205080756887729352744634150587236694280
These are the same, except for the last digit of the C program, which is actually correctly rounded, rather than just different:
2.73205080756888
2.73205080756887729352744634150587236694280
Similarly, 2048 rectangles:
jwr@fun:~/work/mathgames/piArea$ echo "1/2048" | bc -l
.00048828125000000000
jwr@fun:~/work/mathgames/piArea$ ./halfpiArea .00048828125000000000
(   -0.99951171875,0.0312461850698753): 1.52569263036501e-05
...
(                1,                0):  1.57078998270572
Almost Pi:  3.14157996541144
jwr@fun:~/work/mathgames/piArea$ echo "q=1;m=2048;scale=41;" | cat - halfpiArea_by_m.bc | bc -l
2049: (1.00000000000000000000000000000000000000000, 0): 1.5707899827\
0572260019288624147880158332158
3.14157996541144520038577248295760316664316
which are the same to the last digit of the C output:
3.14157996541144
3.14157996541144520038577248295760316664316
In fact, the C program is well behaved all the way to 2^16:
jwr@fun:~/work/mathgames/piArea$ echo "2^16" | bc -l
65536
jwr@fun:~/work/mathgames/piArea$ echo "1/(2^16)" | bc -l
.00001525878906250000
jwr@fun:~/work/mathgames/piArea$ ./halfpiArea .00001525878906250000
Almost Pi:  3.14159258349584
jwr@fun:~/work/mathgames/piArea$ echo "q=1;m=2^16;scale=41;" | cat - halfpiArea_by_m.bc | bc -l
65537: (1.00000000000000000000000000000000000000000, 0): 1.570796291\
74791608251698437807118686209362
3.14159258349583216503396875614237372418724
jwr@fun:~/work/mathgames/piArea$ echo "q=1;m=2^16;scale=81;" | cat - halfpiArea_by_m.bc | bc -l
65537: (1.0000000000000000000000000000000000000000000000000000000000\
00000000000000000000000, 0): 1.5707962917479160825169843780711868627\
48822128541338008872254934835925225596242002
3.141592583495832165033968756142373725497644257082676017744509869671\
850451192484004
(Going to scale=81 did take a minute or four. I should time it:
scale=81
a(1)*4
3.141592653589793238462643383279502884197169399375105820974944592307816406286208996
Okay, three minutes, forty seconds.)

Here are the estimates:

3.14159258349584
3.14159258349583216503396875614237372418724
3.141592583495832165033968756142373725497644257082676017744509869671850451192484004 
The C program produces the same as the bc program to the C program's second-to-the-last digit.

Note that the bc program is showing precision loss out in the 37th digit. Again, remember why, from the fly-in-the-pi post.

And, just as a reminder (in bc -l),
scale=81
a(1)*4
3.141592653589793238462643383279502884197169399375105820974944592307816406286208996
So, even with all this effort, we have only calculated π to the sixth digit past the decimal point, which is also expected, see the theory post.

Those are not bad results, really, but if we don't choose the number of rectangles carefully, the C program output starts varying from the bc program quite quickly.

Ten, 100, and 1000 are good, but 10,000 is not. 11 is good, but 23 is not. Try a few, and see if you can guess which numbers work better and which work worse, and why.

Now, if our only purpose were to calculate π to six decimal digits, we'd be satisfied with using 2^16 for the count and be satisfied with standard C floating point math.

But, of course, that's not the real purpose.

What we are really talking about here is a way to get good solutions to equations involving integral calculus. (Was I being sneaky for not warning you in advance?)

So we do care about the differences, sometimes. We can't always select the convenient counts of rectangles to estimate with. Which is why I'm going to demonstrate using the Gnu Multi-Precision library real-soon-now.


Saturday, September 24, 2016

Calculating π (pi) in C

Maybe I need a better hobby.

Or maybe math really is this fun.

Okay, where this came from -- I gave my son a challenge, and he wrote a short program in perl to calculate π by the area of a circle.

So I wrote one in my favorite calculator, bc.

Now, it's a nice program, and I can check the accuracy with bc itself, using the arctangent function. But bc is not as fast as C, and this old Intel processor is not that fast, either.

And the program had a bug. Well, in my original explanation, I forgot to mention the problem of rounding. bc always rounds down. Or, rather, it just drops fractional digits past the current scale. This produces some notable effects, which are partially resolved in bc by doubling the scale.

Perhaps one reason bc is not as well known as it might be is that bc doesn't help you with rounding.

And it gives you scale instead of precision or accuracy.

So the limits of computers bite you pretty quick with bc, where other similar solutions help you ignore the limits -- sometimes until it's too late.

Which is part of the reason I'm writing this rant.

Anyway, I also wrote a nice little program for calculating π in standard C, using the language's ordinary math. It uses the same basic approach as I used in the bc version, except that the bc version I showed has a few more bells and whistles.

(I actually started with a very minimalistic version in bc, then wrote the C program and uploaded it to the pastebin linked above, then added the comments and extra features to the bc version before uploading it to a pastebin, if you are wondering why.)

So, if you want a full explanation of the method and the C source, look at my rants on the bc version (linked above) and the comments in the bc source. I'll limit my comments here to details of C programming and the standard C run-time environment, and issues of rounding and accuracy.

(Modifying the C source to do midpoint and so on is something I leave for you to play with.)

Here's the source of the standard C program, with extra comments:


/* Computing the value of pi by integrating the area of a (half of a) circle.
// by Joel Matthew Rees, 1 August 2013
// Copyright Joel Matthew Rees
//
// Fair use acknowledged.
//
// All rights to this expression of the method reserved.
//
// (Starting from scratch isn't that hard,
// and you'll like the results better.)
// -- Really, writing small programs like this from scratch
// is an excellent way to learn the basic grammar and runtime
// of a programming language. DIY!
// 
//
// Compile with the right libraries:
// cc -Wall -o halfpiArea halfpiArea.c -lm
//
// C language programs have to be compiled. You can't just run them.
// You have to save the source as halfpiArea.c
// and then run the compiler command above in a shell:
// http://joels-programming-fun.blogspot.com/2013/08/get-shell.html
//
// The compiler will catch simple errors in the source,
// and try to give you hints on how to fix them.
// Learning how to read the hints is a topic for another rant or more.
//
// If the compiler succeeds, the command line above will save
// the executable program in a file called halfpiArea,
// which you then run by typing 
// ./halfpiArea
// in the shell.
//
//
*/


/* The following lines help the compiler to figure out
// what functions you are using from the standard library
// and how you should be using them:
*/
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
/* They literally include information about those functions
// into this program.
*/


/* The main function of a C program in a standard runtime environment
// is pretty much always called "main()".
// And it has two or three standard arguments.
// I only tell the program about two:
// argc is the integer count of the command line arguments,
// and argv[] is the array of command line arguments, as character strings.
// I'll tell you more about the command line arguments below.
*/
int main( int argc, char * argv[] )
{
   /* All the variables here are declared as
   // double length floating point.
   // Accuracy is usually more than 11 useful digits,
   // Exponent range is usually more than about 10^(+-235).
   // Simplifying way too much,
   // you can write numbers from 1e-235 to 1e235,
   // and, if we get the math right,
   // we can guarantee at least 11 good digits for pi.
   //
   // (Actually, the math normally supported in modern C libraries
   // is IEEE 754, which would be 10^(+-308) and at least 15 good digits.
   // Check C Language Data Types and IEEE 754 on Wikipedia, stackexchange,
   // etc., for more information.
   //
   // If you want to know what your compiler gives,
   // #include <limits.h> and play with the math.)
   */
   double area = 0.0; /* Sum the area here. */
   double deltaX = 0.025; /* Width of the rectangles (by default 1/40). */
   double lastX = -1.0; /* Track the left side. */
   double x; /* The right side. */
   double almostPi; /* The approximation of pi. */
   /* You will notice that we declare the names and types
   // of the variables before we use them.
   // C also allows us to set the initial value as we declare them.
   */

   if ( argc > 1 ) /* Command line arguments? */
   {  deltaX = strtod( argv[ 1 ], NULL ); /* Assume it's a different width. */
   }
   /* This allows us to set a different width on the command line.
   // 1/11 would be 0.09090909090909090909 to twenty decimal places:
   // ./halfpiArea 0.09090909090909090909
   // 1/8 == 0.125, 1/10 == 0.1, 1/100 == 0.01, 1/200000 == 0.000005
   */

   /* Note that starting from the half delta converges at least one decimal digit faster.
   // (That's because the height of the rectangle crosses the circle at midpoint,
   // rather than left edge or right. Draw the pictures to see why.
   // This is right edge, but changing to midpoint or left edge is trivial.)
   */
   for ( x = -1.0 + deltaX; x <= 1.0; x += deltaX )
   {
      double ysquared = 1.0 - ( x * x );
      double y = sqrt( ysquared );
      /* deltaX = x - lastX; Reminder for humans, only! */
      lastX = x;
      area += deltaX * y;
      /* Suppress long listings of the areas.*/
      if ( deltaX > 0.00005 ) /* 1/20000 */
      {  printf( "(%17.15g,%17.15g): %17.15g\n", x, y, area );
         /* %17.15g is a floating point field
         // of 17 digits with 15 digits to the right of the decimal.
         */
      }
   }

   almostPi = area * 2;
   printf( "Almost Pi: %17.15g\n", almostPi );
   return EXIT_SUCCESS;
}



Calculating π to eleven digits works okay with C doubles. But if you look at the listing of the areas, you will see some odd differences, because C doubles round to their fixed binary precision, while bc carries as many exact decimal digits as you tell it to with the scale variable.

Play with different widths, compare to the bc version, and see what you can learn.

(Modern C, from about C99, also lets you select alternative floating point rounding modes through the <fenv.h> interface.)

And you should notice that C is very fast when compared to bc.

When you want to check the real value of π, remember:
echo "scale=41; a(1)*4;" | bc -l
from the shell command line, or
scale=41
a(1)*4
in bc; adjust scale to one more than the number of good digits you want.

And don't try 
./halfpiArea 0.0000000000000000000000000000000000000001
unless you really want to watch your computer contemplate its navel until you give up and shut it down. Start with
./halfpiArea 0.00001
and work your way up to 0.0000001 or so, where you get really noticeable waits.

And see if you can figure out whether midpoint is really better than right side, as I said it would be, when you start using widths smaller than 1/100 or so.

Oh. And play with the source code a bit, too.

[JMR201611182140:
I decided to post a few hints about playing with the programs:
http://joels-programming-fun.blogspot.com/2016/11/using-to-check-your-floating-point.html.  
]


[JMR201611292224:
Hopefully, I'll get a chance to post sample source with the GMP gnu multi-precision library "real soon now". (I've got one running, it just seems to take more time to write the rant than the source code.)
Here it is, demonstrating the principle with GMP:
http://joels-programming-fun.blogspot.com/2016/11/using-gmp-to-approximate-pi.html.
]

Wednesday, March 9, 2016

What is Spaghetti Code?

When I posted the rant on my political blog about the body of US law looking too much like spaghetti code, I borrowed some examples from the Wikipedia article on spaghetti code.

Then I tried to find a basic interpreter that would run those archaic examples.

:)

Setting up an environment to run this kind of archaic code is a bit beyond an introductory topic.

So, to keep the original rant from getting too cluttered, I'm posting some more modern code for those examples in this post.

First, the example that was called structured on Wikipedia:



110 FOR i = 1 TO 10
120     PRINT i; " squared = "; i * i
130 NEXT i
140 PRINT "Program Completed."
150 END 


All you need to do to get this running in a more modern BASIC is remove the line numbers and maybe spread the print statement out:



FOR i = 1 TO 10
    PRINT i;
    PRINT " squared = ";
    PRINT i * i
NEXT i
PRINT "Program Completed."
END 


Then the example that was called spaghetti, but I think isn't really:



10 i = 0
20 i = i + 1
30 PRINT i; " squared = "; i * i
40 IF i >= 10 THEN GOTO 60
50 GOTO 20
60 PRINT "Program Completed."
70 END


You'll need some labels, which is probably why they called it spaghetti. Here's how it might look for a more modern BASIC:



i = 0
L20:
i = i + 1
PRINT i;
PRINT  " squared = ";
PRINT i * i
IF i >= 10 THEN GOTO L60
GOTO L20
L60:
PRINT "Program Completed.";
END


And my example of simple spaghetti:


200 e = 0
210 GOTO 300
220 PRINT i; " squared = "; i * i
230 GOTO 300
260 REMARK Remember to set e before you start! 
300 ON e GOTO 310,330
310 i = 0 
320 e = 1 
330 IF i < 10 THEN i = i + 1 ELSE GOTO 350
340 GOTO 220
350 PRINT "Program Completed."
360 END 


This can be a little trickier, because many BASIC interpreters differ about how they support the IF/THEN/ELSE constructs. Not to mention whether they support the computed GOTO. The following should run in a modern interpreter:



e = 0
GOTO L300
L220:
PRINT i;
PRINT " squared = ";
PRINT i * i
GOTO L300
REM Remember to set e before you start! 
L300:
IF e = 0 THEN GOTO L310
GOTO L330
L310:
i = 0 
e = 1 
L330:
IF i >= 10 THEN GOTO L350
i = i + 1 
GOTO L220
L350:
PRINT "Program Completed."
END 


I used Basic256 as my more modern interpreter rather than Gambas3 because using labels in Gambas3 just feels funny. Also, I would have had to talk about setting up an environment for the code to run in.

Gambas3 is actually useful for production code, where Basic256 is more for helping people learn BASIC programming skills (which is why Basic256 already has the environment set up).

Just for amusement, here's how these would look in C --

The clean version:



#include <stdio.h>
#include <stdlib.h>

/* Print the first ten squares the simplest way possible:
*/


int main( int argct, char * argvar[] )
{
   int i;

   for ( i = 1; i <= 10; i = i + 1 )
   {
      printf( "%d squared = %d\n", i, i * i );
   }
   puts( "Program Completed." );
   return EXIT_SUCCESS;
}


A version using GOTOs:



#include <stdio.h>
#include <stdlib.h>

/* Print the first ten squares the simplest way possible:
*/


int main( int argct, char * argvar[] )
{
   int i;

   i = 0;
l20:
   i = i + 1;
   printf( "%d squared = %d\n", i, i * i );
   if ( i >= 10 )
      goto l60;
   goto l20;
l60:
   puts( "Program Completed." );
   return EXIT_SUCCESS;
}


The spaghetti-fied version:



#include <stdio.h>
#include <stdlib.h>

/* Print the first ten squares the simplest way possible:
*/


int main( int argct, char * argvar[] )
{
   int e, i;

   e = 0;
   goto l300;
l220:
   printf( "%d squared = %d\n", i, i * i );
   goto l300;
   /* Remember to set e before you start! */
l300:
   if ( e == 0 )
      goto l310;
   goto l330;
l310: 
   i = 0; 
   e = 1; 
l330:
   if ( i >= 10 )
      goto l350;
   i = i + 1; 
   goto l220;
l350:
   puts( "Program Completed." );
   return EXIT_SUCCESS;
}


Here's what a session compiling the code and running it should look like --



me@mybox:~/work/spaghetti$ cc -Wall -o spaghetti spaghetti.c
me@mybox:~/work/spaghetti$ ./spaghetti
1 squared = 1
2 squared = 4
3 squared = 9
4 squared = 16
5 squared = 25
6 squared = 36
7 squared = 49
8 squared = 64
9 squared = 81
10 squared = 100
Program Completed.



The amusing thing about the C source code is that even the worst example here will compile and run correctly on any standard C compiler. But you can tell which version will be easier to figure out and add features to in the future.

Just for a certain kind of completeness, here's a FORTH version of the same function:



: tensquares 
   cr
   11 1 do 
       i . ." squared = " i i * . cr
   loop ;



Which you would invoke by entering tensquares at the FORTH command line, giving the same output as above.

Doing a version with the FORTH equivalent of GOTOs would be even more of a time waster than the rest of this rant, so I won't bother. It's probably even less interesting.

Saturday, January 9, 2016

Stemming the Tide: RC Time Delays with bc

This one was inspired by this post on debian-user:
https://lists.debian.org/debian-use/2015/12/msg00627.html
I mentioned in my general blog that I feel kind of sad that this kind of information is getting harder to find:
http://reiisi.blogspot.jp/2015/12/about-breaks-my-heart.html
Someday I'll expand on the idea of too much information being less information in my freedom isn't free blog.

Today, I'm going to try to stem the tide a little bit, by talking about using tools that should be available on everyone's computer or smart phone terminal to help solve a moderately simple electronics problem.

One of the regular participants in the debian-user mailing list had a question about using an RC circuit. He has a DC powered milling machine that he needed to build a power supply for, and he wanted to build a switching power supply. 

Switching power supplies, without a slow-start circuit, draw enough power to throw your circuit breaker on start-up, so he needed to build a slow-start circuit. And he needed help with the physics and was asking for a tool to help do a units conversion.

The physics he needed are mostly explained on these pages on wikipedia:
https://en.wikipedia.org/wiki/RC_circuit
https://en.wikipedia.org/wiki/RC_time_constant
That helped me refresh my memory, as well.

He didn't need the conversion as much as the formula for the delay characteristics, and he also needed a calculator that would work the formula for him in a loop so he could try different sizes of capacitors and resistors and get an idea of the power draw over time.

There are lots of calculators that let you punch in numbers, but when you have to repeat the same series of calculations lots of times, you tend to make mistakes. That's one of the reasons being able to run a loop is important.

When you think of a calculation loop, you tend to think of programming languages. Perl, Python, Ruby, FORTH, etc. But programming is a scary topic for many people. (Most people? I think it should be taught in the primary grades, myself.) 

If you know about the interactive modes of the above four languages, you know that they can be used like a programmable calculator. 

Perl is installed as a part of almost every Unix-like OS. But perl's use of dollar signs makes it feel arcane, and if you have heard of regular expressions, you may think the name of the language makes your head implode. 

FORTH's postfix notation may also make you hesitate.

Python and Ruby may need to be installed, and installing may be scary.

Maybe all of that tends to make you shy away from programming.

But there is a basic, easy-to-use, multi-precision calculator, bc, that is available in just about every workstation or server installation of the various Unix and Unix-like OSes, including Linux and (I think) still on Mac OS X (but not Android!), and can even be installed on MSWindows. (But not so easily on Android! No-root-debian is a bit of an SD hog.)

Okay, so, enough with the sales pitch for bc. Let's see it in action on this problem.

WARNING!

I demonstrate some idealized circuits below. They are dangerous as shown for a variety of reasons. Electronics parts used incorrectly can burn, explode, cause serious damage, and cause permanent injury including blindness, lung damage, and death.

If you don't know how to build circuits like these correctly, you are very likely to hurt yourself and others badly. If you know how to build these circuits correctly, you should have no reason to build them.

At any rate, don't try to build them without proper supervision by people who know what they are doing.

If you really want to experiment with this stuff, find yourself a switching power supply kit from a reputable kit seller and build that, instead.

Switching power supplies are kind of like the bucket showers you use at camp. Fill it up and you can get a nice shower for a short time. As long as you keep it more than half-full, the stream from the shower is, well, not too bad.

Unfortunately, the bucket metaphor gets really laborious from there. Wikipedia to the rescue again, if you want to know more:
https://en.wikipedia.org/wiki/Switched-mode_power_supply
In the metaphor, we use a capacitor like a bucket:


But if you hook a capacitor across line voltage this way in your house, the capacitor suddenly draws a very huge current for a few milliseconds, 120 times a second.

That's dangerous! 

A few milliseconds every eight and a third milliseconds is not a short time, really. We'll see that better, below.  

[Update 20160111 part 2:
Not incidentally, most large capacitors are polarized. A polarized capacitor hooked as above across an AC house power source will definitely burn, and very probably explode. That is, I don't know of a polarized capacitor in production with a high enough reverse rating to withstand household AC power.

(Polarized capacitors must be hooked up correctly, positive and negative to the corresponding power wires. AC power is switching positive and negative 60 times a second, so there is no way to hook it up correctly without rectifying it first.)
]

[Updated to correctly describe the AC physics. Apologies to the first ten or so people who read this rant while I was forgetting to explicitly deal with the current reversals inherent in AC. JMR20160111]

If the power were DC, that current would quickly drop to a sort-of manageable current as the capacitor charges. [Update JMR20160111: In AC, as above, in the milliseconds after the capacitor is charged a little, it gets forcibly discharged, which is also potentially dangerous.]

If the capacitor is very large at all, it could pop the circuit breaker, cause brown-outs and spikes, and do other things that are not nice to your power circuit and the devices plugged in on it [Update JMR20160111: ... assuming the capacitor is properly rated and can survive being hooked up like this].

If your house is not well isolated electrically from your neighbors' houses, they would have serious reason to complain, too. 

So you want a resistor in series with the capacitor. And let's rectify the house current, while we're at it, so we can keep the capacitor charged.


Now, ultimately, we'll have some other circuitry, a relay, or an SCR or something, to take that resistor out (effectively short across it) once the capacitor is charged. Otherwise, the device is not going to get enough power to run, because the resistor will be eating up the power, instead.

So, you want to know how big a resistor you need for a capacitor large enough to drive the device. 

Say you don't want to draw more than 500 Watts instantaneously. 120 volts AC will push 170 volt peaks out of an ideal rectifier, so we'll use 170 volts in the formula. At start, the capacitor will be a dead short, so the resistor is what you pick. Remember Ohm's law:
  • E = IR and P = IE
  • R = E/I because we want R
  • I = P/E so we can get rid of I
  • substitute: R = E/(P/E)
  • normalize: R = E(E/P) or E²/P
  • substituting the numbers: R = 170²/500
You don't really need a calculator for this, do you? (Heh.)


Well, okay, you don't want to rely only on doing it in your head. Get a shell and fire up bc. You might as well bring in the math library, too:

$ bc -l
170*170/500
57.80000000000000000000
A sixty ohm, 500 watt resistor will keep the house from catching on fire, and be friendly to the circuit breaker and all your appliances and your neighbors.

Double checking,
170/60
2.83333333333333333333
120/60
2.00000000000000000000
120*2
240
2.833 instantaneous amps. That's about what to expect, given the numbers, but that's a lot of current. Let's see. 500 watts is a small to medium room heater. Even preferring the RMS power, 240 watts is still a lot of heat if the capacitor should ever fail shorted.

Which wouldn't protect the house from burning down in the worst case, after all.

So, instead of a 240 watt resistor, we actually want a resistive fuse able to handle about 500 watts for a fraction of a second, but burning out if it sustains more than 250 watts (2 amps) for more than several seconds.

But the mill then needs to be able to safely power down suddenly if we do it that way. So, probably, we'll actually use a high-power transistor here that senses the voltage on the capacitor and drops the impedance as the capacitor charges. But that's for a later step. Right now we want to know limits.

We want an idea how fast the capacitor will charge. That's where the ability to run loops in bc comes in handy:

You can do this in bc:


define vc(vs,t,r,c) {
  return vs * (1-e(-t/(r*c)));
}

scale = 5;
vs = 120;
r = 60;
c = .01;
for ( t=0.0; t<5.0; t += 0.1 ) {  
  vc = vc(vs,t,r,c );
  vr = vs - vc;
  a = vr / r;
  p = vr * a;
  print "t: ", t, "  vc:", vc, "  a: ", a, "  p: ", p,  "\n"
}


[Update JMR20160111:

Select between the lines in your browser, and paste with right-click into the terminal bc session you opened up for the simple calculations above. ("bc -l" if you didn't do that yet.) 

You may need to hit the return key after pasting the code in. 

It works in the version of bc in Debian. openbsd's bc wants variable names to be only one letter long:

define v(s,t,r,c) {
  return s * (1-e(-t/(r*c)));
}

scale = 5;
s = 120;
r = 60;
c = .01;
for ( t=0.0; t<5.0; t += 0.1 ) {  
  v = v(s,t,r,c );
  z = s - v;
  a = z / r;
  p = z * a;
  print "t: ", t, "  vc:", v, "  a: ", a, "  p: ", p,  "\n"
}

]

And the output looks like this:

t: 0  vc:0  a: 2.00000  p: 240.00000
t: .1  vc:18.42240  a: 1.69296  p: 171.96681
t: .2  vc:34.01640  a: 1.43306  p: 123.21965
t: .3  vc:47.21640  a: 1.21306  p: 88.29087
t: .4  vc:58.38960  a: 1.02684  p: 63.26402
t: .5  vc:67.84920  a: .86918  p: 45.32843
t: .6  vc:75.85560  a: .73574  p: 32.47880
t: .7  vc:82.63200  a: .62280  p: 23.27279
t: .8  vc:88.36920  a: .52718  p: 16.67512
t: .9  vc:93.22440  a: .44626  p: 11.94887
t: 1.0  vc:97.33560  a: .37774  p: 8.56125
t: 1.1  vc:100.81440  a: .31976  p: 6.13478
t: 1.2  vc:103.76040  a: .27066  p: 4.39541
t: 1.3  vc:106.25400  a: .22910  p: 3.14920
t: 1.4  vc:108.36360  a: .19394  p: 2.25676
t: 1.5  vc:110.15040  a: .16416  p: 1.61691
t: 1.6  vc:111.66240  a: .13896  p: 1.15859
t: 1.7  vc:112.94280  a: .11762  p: .83006
t: 1.8  vc:114.02640  a: .09956  p: .59473
t: 1.9  vc:114.94320  a: .08428  p: .42618
t: 2.0  vc:115.71960  a: .07134  p: .30536
t: 2.1  vc:116.37720  a: .06038  p: .21874
t: 2.2  vc:116.93280  a: .05112  p: .15679
t: 2.3  vc:117.40440  a: .04326  p: .11228
t: 2.4  vc:117.80280  a: .03662  p: .08046
t: 2.5  vc:118.14000  a: .03100  p: .05766
t: 2.6  vc:118.42560  a: .02624  p: .04131
t: 2.7  vc:118.66800  a: .02220  p: .02957
t: 2.8  vc:118.87200  a: .01880  p: .02120
t: 2.9  vc:119.04600  a: .01590  p: .01516
t: 3.0  vc:119.19240  a: .01346  p: .01087
t: 3.1  vc:119.31600  a: .01140  p: .00779
t: 3.2  vc:119.42160  a: .00964  p: .00557
t: 3.3  vc:119.51040  a: .00816  p: .00399
t: 3.4  vc:119.58600  a: .00690  p: .00285
t: 3.5  vc:119.64960  a: .00584  p: .00204
t: 3.6  vc:119.70360  a: .00494  p: .00146
t: 3.7  vc:119.74920  a: .00418  p: .00104
t: 3.8  vc:119.78760  a: .00354  p: .00075
t: 3.9  vc:119.82000  a: .00300  p: .00054
t: 4.0  vc:119.84760  a: .00254  p: .00038
t: 4.1  vc:119.87160  a: .00214  p: .00027
t: 4.2  vc:119.89080  a: .00182  p: .00019
t: 4.3  vc:119.90760  a: .00154  p: .00014
t: 4.4  vc:119.92200  a: .00130  p: .00010
t: 4.5  vc:119.93400  a: .00110  p: .00007
t: 4.6  vc:119.94480  a: .00092  p: .00005
t: 4.7  vc:119.95320  a: .00078  p: .00003
t: 4.8  vc:119.96040  a: .00066  p: .00002
t: 4.9  vc:119.96640  a: .00056  p: .00001

Which tells you that, for a 10,000 microfarad capacitor, the capacitor will be nearly fully charged after two seconds, and that the instantaneous power in the resistor has dropped below a quarter watt by 2.1 seconds.

Not counting the load of the device.

See how useful bc is?

[Update JMR20160111:

I left out a few things last Saturday night.

First, let's get a look at the charge curve for a straight wire (zero ohm resistor) in the rectified DC circuit. Remember that bare wire has a very small amount of resistance. We'll use 10 milli-ohms, and we'll focus on the first ten milliseconds.

Keep your bc session open, or start it again ("bc -l") and paste the above code in again. Then try this (as in the gnu bc version that works on debian):


r = 0.01;
for ( t=0.0; t<0.0101; t += 0.001 ) {  
  vc = vc(vs,t,r,c );
  vr = vs - vc;
  a = vr / r;
  p = vr * a;
  print "t: ", t, "  vc:", vc, "  a: ", a, "  p: ", p,  "\n"
}


Or, for the version of bc in openbsd:


r = 0.01;
for ( t=0.0; t<0.0101; t += 0.001 ) {  
  v = v(s,t,r,c );
  z = s - v;
  a = z / r;
  p = z * a;
  print "t: ", t, "  vc:", v, "  a: ", a, "  p: ", p,  "\n"
}



Here's what you should see:
t: 0  vc:0  a: 12000.00000  p: 1440000.00000
t: .001  vc:119.99520  a: .48000  p: .00230
t: .002  vc:120.00000  a: 0  p: 0
t: .003  vc:120.00000  a: 0  p: 0
t: .004  vc:120.00000  a: 0  p: 0
t: .005  vc:120.00000  a: 0  p: 0
t: .006  vc:120.00000  a: 0  p: 0
t: .007  vc:120.00000  a: 0  p: 0
t: .008  vc:120.00000  a: 0  p: 0
t: .009  vc:120.00000  a: 0  p: 0
t: .010  vc:120.00000  a: 0  p: 0
Wow!

12 kiloamps of instantaneous current!

1.44 megawatts of instantaneous power!

Fortunately, the resistance in the house wire will limit the current, so you wouldn't really draw that much. But it would try, and the circuit breaker would (hopefully) keep you from taking down the neighborhood grid. And you might be fumbling around in the circuit breaker box in the dark. Assuming you survive when the capacitor and rectifier probably explode and burn.

Now you have mathematical proof of why you don't want to build that circuit without the current limit resistor.

Do you wonder about the AC circuit? So do I, but I have a day job, so I'll save that for another rant. But it's not that hard. You just have to fix the loop so that it simulates the source voltage following a sine curve swinging between -170 and +170 volts.

A few comments about real power supplies in recent electronic (especially after about the year 1997 or so):

The switch that removes the current-limiting resistor should be driven by a schottky diode circuit or some similar circuit that monitors the voltage across the capacitor, removing the current limit as the capacitor gains sufficient charge to drive the device at its rated voltage. Maybe I'll draw a model circuit for that when I have more time.

But, truth be told, we don't do it this way any more. We use a small microcontroller, itself powered by a very low power switching power supply, to control some sort of high-powered semiconductor switch (SCR or something similar) to act as the current-limiting resistor in the above circuit. The microcontroller monitors the current draw with analog-to-digital inputs, for safety and soft startup.

It also monitors the voltage across the impedance device (SCR or whatever) directly, ramping the current-limiting impedance down smoothly as the power capacitor charges and the voltage across the impedance decreases.

I don't think I'm going to have time very soon to work up an example microcontroller-controlled switching power supply circuit, and the control logic for it, that wouldn't be covered by current patents. (Stupid patents.)

End-Update]


Monday, November 23, 2015

Small Prime Sieve in FORTH

An explanation of the algorithm here can be found in a blog entry on the Sieve of Eratosthenes in my math and English blog, and in a previous entry in this blog on primes in C, along with my reasons for doing this, such as I have.

FORTH requires a lot of explanation, so I'm going to show you a C-like version first.

Comments in FORTH are between parentheses. Or you can start a comment with a backslash and it will end at the end of the line.

So, just read the code naturally and read the explanation in the comments. It's surprisingly readable (because I have written it to be readable). It's also full of hidden surprises for those used to algol-like infix languages including C and Ada.

Here's the C-like source:



( Archetypical implementation of the sieve of eratosthenes in FORTH -- gforth --
\ Copyright Joel Rees, 2015
\ By Joel Rees, Amagasaki, Japan, 2015
\ All rights reserved.
\ Permission granted by the author to use this code
\ for any purpose, 
\ on condition that substantial use 
\ shall retain this copyright and permission notice.
\
\ Timing results with MAXSIEVE set to 4194304 and output commented out:
\ me@fun:~/work/mathgames/sieve$ time gforth forthsieve.fs -e bye

\ real 0m1.061s
\ user 0m0.988s
\ sys 0m0.012s
\
\ Comparing to the C:
\
\ me@fun:~/work/mathgames/sieve$ time ./sieve_c
\ 
\ real    0m0.457s
\ user    0m0.436s
\ sys    0m0.020s
\
\ A bit more than double the run-times, 
\ so, really, in the same ballpark,
\ if not quite in the same league for speed.
)


256 constant MAXSIEVE
MAXSIEVE 1- 2 / constant FINALPASS

5 constant DISPWIDTH ( enough digits to display MAXSIEVE )

create sieve MAXSIEVE allot


: sieveMain ( -- )
0 sieve c!      ( 0 is not prime. )
0 sieve 1+ c!   ( 1 is not prime. )
sieve MAXSIEVE 2 ?do    ( set flags to true for 2 to MAXSIEVE-1. )
   -1 over i + c! loop  ( sieve pointer remains on stack. ) 
FINALPASS 2 ?do ( clear flags at all multiples. )
   dup i + c@ if      ( don't bother if not prime. )
      MAXSIEVE i dup + ?do    ( start at first multiple -- double. )
         0 over i + c!      ( clear at this multiple. )
      j +loop           ( sieve still on stack. )
   then
loop            ( sieve still on stack. )
MAXSIEVE 0 ?do
   i DISPWIDTH .r ." : is "
   dup i + c@ 0= if 
      ." not " 
   then
   ." prime." cr
loop drop ;

sieveMain




Okay, it looks like it ought to make sense, right?

But, if you are new to FORTH, and used to algol-like languages like C and Ada, it doesn't. So, let's look at it again, with comments that explain the FORTH itself.

But before we do that I should explain the stack.

FORTH's native interpreter operates in a post-fix grammar. In other words,
MAXSIEVE - 1
is written
MAXSIEVE 1 -
When the interpreter sees the constant MAXSIEVE, it pushes it on the stack. Then it sees 1 and pushes it on the stack. Then it sees the subtraction operator and subtracts the top item on the stack from the second:

input       stack
MAXSIEVE    MAXSIEVE
1           MAXSIEVE 1
-           (result of MAXSIEVE - 1)

Or, borrowing from the code:
 MAXSIEVE 1- 2 / constant FINALPASS

input                 stack                    what happens
MAXSIEVE              MAXSIEVE                 Constant pushed on stack.
1-                    (result of decrement)    Top item of stack gets decremented.
2                     (result1) 2              Numeric constant pushed on stack.
/                     (result of division)     2nd on stack gets divided by top.
constant FINALPASS    (empty)                  Constant FINALPASS generated in symbol table.

Now we can look at the code again:



256 constant MAXSIEVE
\ Defines MAXSIEVE as a true constant.
\ Incidentally, gforth is case insensitive, so we could call it "maxsieve", too.
\ But we won't.

MAXSIEVE 1- 2 / constant FINALPASS
\ See the explanation above.

5 constant DISPWIDTH ( enough columns to display MAXSIEVE )
\ We want this for integer output.

create sieve MAXSIEVE allot
\ This names and allocates a byte array named "sieve".
\ "sieve" in the input will now leave the array address on the stack:
\ ( -- adr )


: sieveMain ( -- )
\ The colon operator defines a new word in the FORTH dictionary.
\ Or you can think of it as a new function or operator.
\ It is a symbol table entry with code that can be executed.
\ The ( -- ) is just a comment that shows that there are no stack effects.
\ Some dialects of FORTH actually parse the stack effects, by the way.

\ "c!" stores the 2nd on stack at the address on top:
\ The stack effect is ( n adr -- )

0 sieve c!      ( 0 is not prime. )
\ So, this stores a 0 in the first entry in the array,
\ for the integer zero.

0 sieve 1+ c!   ( 1 is not prime. ) \ And this stores a 0 in the second entry in the array,
\ for the integer one.


\ ?do 

\ Now we put the address on the stack again,
\ and set up a counted loop
\ with the limit and the initial count on stack
\ and start the loop, testing the limit against the count:
sieve MAXSIEVE 2 ?do    ( set flags to true for 2 to MAXSIEVE-1. )

\ "-1" is read as the numeric constant negative one.

\ "over" copies the 2nd on the stack and pushes it on top,
\ the stack effect is ( adr n -- adr n adr )

\ "i" pushes the current value of the loop counter on the stack.
\ The stack effect is ( -- counter )

\ "+" adds the top two numbers on stack, of course.
\ The stack effect is ( n1 n2 -- sum ).

\ "loop" marks the end of the loop.
\ It has no parameter stack effect: ( -- ),
\ but it removes the limit and current loop counter
\ from where they are stored.
\ It also branches back to the beginning of the loop,
\ if the limit has not been exceeded.

\ So, in the following code,
\ the loop counter is added to the array address,
\ and -1 is stored into the array as a byte value
\ at each byte in the array.

   -1 over i + c! loop  ( sieve pointer -- still on stack. )

\ This is the nested loop that clears the non-primes
\ by synthetic multiplication.
\ Note that "i" is always the innermost loop counter.
\ When we need to access the outer loop counter in the inner loop,
\ we do so with "j".

\ "?do" tests the limit against the initial value
\ and does not enter the loop if the limit is already exceeded.
\ "dup" duplicates the top of stack, effect is ( n -- n n ).
\ "c@" gets the byte value at the address on stack
\ and replaces the address with the value obtained.
\ Stack effect here is ( adr -- value )

\ "if" sets up a conditional evaluation.
\ Stack effect is ( flag -- ).
\ If the flag is zero, code up to "then" is skipped.
\ If the flag is non-zero, execution continues,
\ up to an optional "else".
\ The postfix-influenced interpretation of "then"
\ may be a little hard to get used to.
\ It is often aliased to "endif".

\ "+loop" is like loop, but it counts by the increment on top of stack,
\ stack effect is ( count -- )

FINALPASS 2 ?do ( clear flags at all multiples. )
\ Since the flag clearing pass starts at double each prime,
\ any prime over half of MAXSIEVE will never need to be looked at.

   dup i + c@ if      ( don't bother if not prime. )
      MAXSIEVE i dup + ?do    ( start at first multiple -- double. )
         0 over i + c!      ( clear at this multiple. )
      j +loop           ( sieve still on stack. )
\ Since j is the outer loop counter,
\ it is the current prime being scanned.
   then
loop            ( sieve still on stack. )

\ ".r" prints the 2nd integer on the stack
\ in a field width given by the top of stack.
\ "."" (sorry about the quotes) prints the following string,
\ up to the next quote.

\ "0=" inverts the flag on top of stack.
\ Non-zero become zero, zero becomes -1 .

\ "cr" writes a carriage return out.

\ "drop" drops a value from the stack,
\ stack effect is ( n -- ).

\ ";" terminates the colon definition.

MAXSIEVE 0 ?do
   i DISPWIDTH .r ." : is "
   dup i + c@ 0= if
      ." not "
   then
   ." prime." cr
loop drop ;

\ Now we can execute the newly defined function:

sieveMain
\ which generates the sieve and prints it.



As I said, this FORTH source looks like it was written by a C programmer. (It was. I am far more familiar with C than FORTH.)

You'll notice that there don't seem to be any local variables in FORTH. Modern FORTHs do provide named local variables, but most local values in FORTH are just parameters of expressions, kept anonymously on the stack.

Instead of local variables, FORTH traditionally cuts functions up into smaller pieces. With some care, those pieces can be heavily re-used, yielding more compact, highly readable code. (Then again, you can go too far, and the code becomes too dense to read.)

FORTH allows one to make new vocabularies, which lets us hide the small pieces from the rest of the program.

Finally, trying to use this on my bif-c revealed some of the bugs I haven't squashed yet. Trying it on the fig-FORTH for the 6800 that I transcribed some time back shows the wisdom of using the small pieces: they are far easier to debug. This works with the fig-FORTH, and, with the vocabulary stuff commented out, with my bif-c. (And now I have some debugging work to do on the bif-c.)


( Archetypical implementation of the sieve of eratosthenes in FORTH -- fig, bif-c --  )
( using more of the FORTH idiom. )
( Copyright Joel Rees, 2015 )
( By Joel Rees, Amagasaki, Japan, 2015 )
( All rights reserved. )
( Permission granted by the author to use this code )
( for any purpose,  )
( on condition that substantial use  )
( shall retain this copyright and permission notice. )



VOCABULARY sieve-local ( Make a local symbol table. )
sieve-local DEFINITIONS ( Switch to the local vocabulary. )


256 CONSTANT MAXSIEVE
MAXSIEVE 1 - 2 / CONSTANT FINALPASS

5 CONSTANT DISPWIDTH ( enough digits to display MAXSIEVE )


0 VARIABLE sieve ( Old FORTHs don't provide a default behavior for CREATE )
( gforth will leave the zero there. Old FORTHs need an initial value. )

   HERE sieve - DUP ( Old FORTHs don't provide a CELL width. )
   MAXSIEVE SWAP - ALLOT ( Allocate the rest of the byte array. )

   CONSTANT CELLWIDTH ( To show how this can be done. )

: sieveInit ( -- adr )
0 sieve C!      ( 0 is not prime. )
0 sieve 1+ C!   ( 1 is not prime. )
sieve MAXSIEVE 2 DO    ( set flags to true from 2 up to MAXSIEVE. )
   -1 OVER I + C! LOOP  ( sieve pointer -- still on stack. )
;

: primePass ( adr prime -- adr )
MAXSIEVE OVER DUP + DO    ( start at first multiple -- double. )
   OVER I + 0 SWAP C!     ( clear at this multiple. )
   DUP +LOOP              ( next multiple )
DROP ;      ( sieve address still on stack. )

: findPrimes ( adr -- adr )
FINALPASS 2 DO   ( clear flags at all multiples. )
   DUP I + C@ IF ( don't bother if not prime. )
      I primePass
   ENDIF
LOOP ;           ( sieve still on stack. )

: printPrimes ( adr -- )
MAXSIEVE 0 DO
   I DISPWIDTH .R ." : is "
   DUP I + C@ 0= IF
      ." not "
   ENDIF
   ." prime." CR
LOOP DROP ;


FORTH DEFINITIONS

: sieveMain ( -- )
[ sieve-local ] sieveInit
findPrimes
printPrimes ;


sieveMain



If you are wondering about the speed of this version compared to the long version, it's pretty close. With the print commented out and the size set to 4194304, here's what we get in gforth:
me@fun:~/work/mathgames/sieve$ time gforth forthsieve.fs -e bye

real    0m1.096s
user    0m1.080s
sys    0m0.008s
which looks about right for 7 instructions vs. 6 in the inner loop.

(In bif-c, with the buggy parts commented out, too, it takes about 5 seconds. I guess, while I'm debugging it, I should add an option to parse an input file and go, so I can get accurate timings. But timings really aren't meaningful yet, there.)

You might also be interested in the FORTH language sieve entry in Rosetta Code.
Comparing the times for that example, which uses the dictionary buffer without allocating it:
me@fun:~/work/mathgames/sieve$ time gforth rs_sieve.fs -e bye

real    0m1.463s
user    0m1.444s
sys    0m0.012s
in gforth.