Pelles C forum
Pelles C => Bug reports => Topic started by: melsmith on June 04, 2020, 04:47:16 pm

Hi All:
This is my first post.
Environment: the PellesC64 C compiler, running a short Harbour program to test factorial computations.
When running a Factorial function FACT(21), the result is:
51,090,942,171,709,430,000
Other compilers (e.g., Borland BCC 7.4 and MinGW 10.2) show the correct answer should be:
51,090,942,171,709,440,000
The PellesC64 results from FACT(1) thru FACT(20) are *correct* and identical to other compilers. However, FACT(21) is *incorrect*.
Mel Smith

Hi All:
Simplifying:
The following multiplication is incorrect:
21 * 2432902008176640000 *should be* equal to 51090942171709440000
Instead, Pelles64 produces the result: 51090942171709430000
MinGW 10.1 and BCC 7.4 produce the correct answer.
Mel

Such a big number requires a 128-bit integer to hold it, or a long double.
In the first case, if the Harbour compiler generated a 64-bit integer you should have experienced an overflow (probably a negative result); in the second case, using floats, the different result could simply be due to rounding.
In any case the problem is related to which C types the Harbour compiler generates with the different backend C compilers.
I suggest you ask the question on the Harbour forum.
Personally, I'll be glad if you share the answer with us. ;)

Oh joy, binary floating-point. My favorite subject. Not!
Well, if I try this code...
#include <stdio.h>

double tail_recursive_factorial_aux(double n, double acc) {
    return n < 1 ? acc : tail_recursive_factorial_aux(n - 1, acc * n);
}

double tail_recursive_factorial(double n) {
    return tail_recursive_factorial_aux(n, 1);
}

int main(void) {
    for (double n = 1; n < 25; n++) {
        union {
            double d;
            unsigned char b[8];
        } u;
        u.d = tail_recursive_factorial(n);
        printf("%3.0f! = %40.15f  0x%02hhX:0x%02hhX:0x%02hhX:0x%02hhX:0x%02hhX:0x%02hhX:0x%02hhX:0x%02hhX\n",
               n, u.d, u.b[7], u.b[6], u.b[5], u.b[4], u.b[3], u.b[2], u.b[1], u.b[0]);
    }
}
... the output looks a bit better with MSVC (Version 19.25.28614 for x64) than with Pelles C (I tried v10.0, but whatever).
For stuff like this I use the C runtime by P.J. Plauger, Dinkumware.
A quick web search turned up this, and little else:
https://github.com/Justme0/CLIB (https://github.com/Justme0/CLIB)
This version (possibly unlicensed) is older than my version (which I bought many years ago). For example, there is no C99 support like %a, but for the printf("%f", ...) case the code looks the same.
In the STDIO folder, the files forming the call chain for printf("%f", ...) look like:
PRINTF.C > XPRINTF.C > XPUTFLD.C > XLDTOB.C > XGENLD.C
where the most interesting part is probably a loop in XLDTOB.C forming a sequence of "ASCII" digits. I can't immediately figure out if there is a reasonably easy way to improve the code/precision.
Does anyone here feel like more of an expert on this...?

Hi Pelle and Frankie:
I'll return to my xHarbour newsgroup and report your comments to the poster who complained.
Thanks for the response(s).
Mel Smith
(a 'Repositarian' for the Harbour Language)
www.whosaway.com

Hi Pelle and Frankie:
Below is the Harbour C function that is the basis of the bug report. This function is part of
a larger file with several related functions and many headers.
You'll note that any factorial argument greater than 21 results in a return of 1.
Mel
**** the Factorial Function from Harbour *****
HB_FUNC( FACT )
{
   if( ISNUM( 1 ) )
   {
      int iInput = hb_parni( 1 );
      int i;
      double dResult = 1.0;

      if( ( iInput >= 0 ) && ( iInput < 22 ) )
      {
         for( i = 1; i <= iInput; i++ )
         {
            dResult *= ( double ) i;
         }
         hb_retnd( dResult );
      }
      else
      {
         hb_retnd( 1.0 );
      }
   }
   else
   {
      PHB_ITEM pSubst = NULL;
      int iArgErrorMode = ct_getargerrormode();

      if( iArgErrorMode != CT_ARGERR_IGNORE )
      {
         pSubst = ct_error_subst( ( USHORT ) iArgErrorMode, EG_ARG, CT_ERROR_FACT,
                                  NULL, "FACT", 0, EF_CANSUBSTITUTE, 1, hb_paramError( 1 ) );
      }
      if( pSubst != NULL )
      {
         hb_itemRelease( hb_itemReturnForward( pSubst ) );
      }
      else
      {
         hb_retnd( 0.0 );
      }
   }
}
***********

Frankie's statement is correct. A 64-bit double holds 53 mantissa bits (52 stored plus one implicit), which means at most about 15-17 valid decimal digits of precision, error accumulation not accounted for. Some C compilers work with 80-bit long doubles (64 mantissa bits, about 19 decimal digits) internally under the hood to increase precision and to keep error accumulation down where possible. This seems to have happened here.
Why one would calculate integer factorials with floating-point numbers at all is beyond me. Any 64-bit C compiler has int64_t integer math, and some even support a 128-bit integer type. Even if not, it is a simple task to add 128-bit integer functions, e.g. by using some intrinsics or hand-coded routines.

... the output looks a bit better with MSVC (Version 19.25.28614 for x64) than with Pelles C (I tried v10.0, but whatever).
That's the real news ;D Waiting for the release... 8)
Now going back to the issue, I think there is something strange in the conversion routines. Maybe a simplification in the algorithm, or a small bug, I can't say.
It is obvious that computing a factorial over natural numbers, meaning integers, can't give back real numbers, I mean numbers having a fractional part. This is the first strange point in the output from the double conversion. It should hold even for values that exceed the mantissa size (I mean the number requires more bits than are available in the mantissa field): such values can't have any fractional part (due to left shifting to align the exponent).
Starting from the 'double' layout, which uses 1 sign bit, 11 bits of biased exponent (bias 1023), and 52 mantissa bits, I wrote a simple conversion routine to check the output. For simplicity my code limits conversion to numbers that require fewer than 64 bits (handling larger ones would require at least 128-bit variables), so data is available up to 20!.
#include <stdio.h>
#include <stdint.h>
#include <string.h>

double tail_recursive_factorial_aux_double(double n, double acc)
{
    return n < 1 ? acc : tail_recursive_factorial_aux_double(n - 1, acc * n);
}

double tail_recursive_factorial_double(double n)
{
    return tail_recursive_factorial_aux_double(n, 1);
}

char *FromDouble2Asc(double val, char *buf)
{
    uint64_t out;
    union
    {
        double val;
        uint8_t b[8];
        uint64_t h;
    } cvt;

    cvt.val = val;
    int sign = (cvt.h & 0x8000000000000000) ? -1 : 1;
    int exponent = (int)((cvt.h >> 52) & 0x7ff) - 1023;
    uint64_t mantissa = (cvt.h & 0xfffffffffffff) | 0x10000000000000;  /* restore implicit bit */
    // printf("\tval=%20.3f => hex=0x%08llx  sign=%+d  exponent=%+04d  mantissa=%llu (0x%llx)\n", val, cvt.h, sign, exponent, mantissa, mantissa);
    if (!cvt.h)
    {
        strcpy(buf, "0.0");
        return buf;
    }
    if (exponent < 0)
    {
        strcpy(buf, "[exp < 0]");
        return buf;
    }
    else if (exponent <= 52)
    {
        out = mantissa >> (52 - exponent);
    }
    else if (exponent < 64)
    {
        out = mantissa << (exponent - 52);
    }
    else
    {
        strcpy(buf, "[exp > 63]");
        return buf;
    }
    sprintf(buf, "%c%llu", sign > 0 ? ' ' : '-', out);
    return buf;
}

int main(void)
{
    char buf[128];
    for (double n = 1; n < 25; n++)
    {
        union
        {
            double d;
            unsigned char b[8];
            uint64_t uint64;
        } u;
        u.d = tail_recursive_factorial_double(n);
        FromDouble2Asc(u.d, buf);
        printf("%3.0f! = %40.15f  0x%016llx  %20s\n", n, u.d, u.uint64, buf);
    }
}
Running the program you can see something strange starting from 12!.
Below I copy the program output for reference; the last column is the output of my conversion routine:
1! = 1.000000000000000  0x3ff0000000000000  1
2! = 2.000000000000000  0x4000000000000000  2
3! = 6.000000000000000  0x4018000000000000  6
4! = 24.000000000000000  0x4038000000000000  24
5! = 120.000000000000000  0x405e000000000000  120
6! = 720.000000000000000  0x4086800000000000  720
7! = 5040.000000000000000  0x40b3b00000000000  5040
8! = 40320.000000000000000  0x40e3b00000000000  40320
9! = 362880.000000000000000  0x4116260000000000  362880
10! = 3628800.000000000000000  0x414baf8000000000  3628800
11! = 39916800.000000000000000  0x418308a800000000  39916800
12! = 479001600.000000034924000  0x41bc8cfc00000000  479001600
13! = 6227020799.999999580904000  0x41f7328cc0000000  6227020800
14! = 87178291199.999991804361000  0x42344c3b28000000  87178291200
15! = 1307674367999.999958550000000  0x4273077775800000  1307674368000
16! = 20922789887999.999336890000000  0x42b3077775800000  20922789888000
17! = 355687428096000.012010330000000  0x42f437eeecd80000  355687428096000
18! = 6402373705727999.657392500000000  0x4336beecca730000  6402373705728000
19! = 121645100408832004.177300000000000  0x437b02b930689000  121645100408832000
20! = 2432902008176640141.755300000000000  0x43c0e1b3be415a00  2432902008176640000
21! = 51090942171709440648.555700000000000  0x4406283be9b5c620  [exp > 63]
22! = 1124000727777607680764.000000000000000  0x444e77526159f06c  [exp > 63]
23! = 25852016738884979858994.000000000000000  0x4495e5c335f8a4ce  [exp > 63]
24! = 620448401733239370510000.000000000000000  0x44e06c52687a7b9a  [exp > 63]
I haven't tested GCC or MSVC output, but I expect it to match my code.

That's the real news ;D Waiting for the release... 8)
I get that. It should be available within the next couple of weeks.
I have two annoying bugs to fix. Christian is looking at the German translation and web site.
About this topic:
After many trace lines, and searching through the CRT code, I realized that I could reuse some of the code I added to enhance precision in the scanf() family of functions (in v9, I think). More floating-point code I didn't write, but it does the job. Now my output compared with MSVC is "good enough".
There are apparently a few modern/popular methods for doing this kind of conversion, with possibly slightly better results, but also (AFAICT) involving more code. Nobody wants a big bloated C runtime, so I think "good enough" in this case is good enough.

I get that. It should be available within the next couple of weeks.
I have two annoying bugs to fix. Christian is looking at the German translation and web site.
What's new in version 10 ?

There are apparently a few modern/popular methods of doing this kind conversion, with possibly slightly better results, but also (AFAICT) involving more code. Nobody wants a big bloated C runtime, so I think "good enough" in this case is good enough.
Right decision. The issue, if it can really be called a problem, arises when the magnitude becomes very large (or very small), a range that should be of interest only to very specific applications, which are supposed to use better tools for such calculations (i.e. bignum libraries).
I confess I haven't investigated much (work currently takes a lot of energy, recovering delays due to COVID-19), but the point seems to be the use of floating-point variables throughout the conversion process, which introduces rounding errors once the converted value exceeds the mantissa capacity, propagating the rounding through the rest of the flow. The trick should be to mask the resolution when the magnitude of the value becomes large enough (i.e. once the maximum number of significant digits for the mantissa is reached, just output zeros :o ).
Waiting for release... ;)

Then there is always the fused multiply-add operation fma(), which has higher internal precision. So:
c = a * b; <==> c = fma(a, b, 0.);
I hope that Pelle's upcoming C version will support __int128 and quad-precision floating-point math.