unsigned __int64 overflow question

PaoloC13 · December 19, 2013, 02:36:34 AM

Hi all!
I have run into a problem that I do not understand.
I am trying to store values in a variable declared as unsigned __int64, but it seems I cannot exceed 9223372036854775807, the maximum of the (signed) __int64 type, even though I am still well below the upper limit of unsigned __int64, which should be 18446744073709551615.

I ran this test:

Code Select



	double d, k;
	__int64 i64;
	unsigned __int64 ui64;

	k =    1000000000000000.0;
	d = 9200000000000000000.0; // ...a little less than i64 max limit.
	i64 = (__int64)d; // Try to pass the value.
	d -= k; // So we should get the d < i64 condition. Now verify:

	if ( d < (double)i64 ) 
		Message("i64 condition checked!"); // True! :)

	d = 9300000000000000000.0; // ...a little more than i64 max limit.
	ui64 = (unsigned __int64)d; // Try to pass the value.
	d -= k; // So we should get the d < ui64 condition. Now verify:

	if ( d < (double)ui64 ) 
		Message("ui64 condition checked!"); // False! :(

Why can't I use the whole range made available by the type unsigned __int64?
I compile in Pelles C 7.00 32 bit edition on Windows XP, with the Ze option.
I will appreciate any suggestion.

TimoVJL · December 19, 2013, 12:22:47 PM

From MS link here

Quote: A floating value is converted to an integral value by first converting to a long, then from the long value to the specific integral value. The decimal portion of the floating value is discarded in the conversion to a long. If the result is still too large to fit into a long, the result of the conversion is undefined.

DoubleTrouble:
This is a binary format that occupies 64 bits (8 bytes) and its significand has a precision of 53 bits (about 16 decimal digits).

more links:
Floating_point#IEEE_754
c-unsigned-__int64-tofrom-double-conversions

frankie · December 19, 2013, 01:17:34 PM

I don't know whether it is really required, or whether the problem is related to how the PellesC preprocessor handles it, but to avoid the warning and encode the unsigned value correctly you must append the unsigned suffix 'u' to the constant.
Here is the sample derived from your code, with some modifications:

Code Select

	double d, k;
	__int64 i64;
	unsigned __int64 ui64;

	k =    1000000000000000.0;
	d = 9200000000000000000.0; // ...a little less than i64 max limit.
	i64 = 9200000000000000000; // Try to pass the value.
	d -= k; // So we should get the d < i64 condition. Now verify:

	if ( d < (double)i64 ) 
		message("i64 condition checked!"); // True! :)

	d = 9300000000000000000.0; // ...a little more than i64 max limit.
	ui64 = 9300000000000000000u; // Try to pass the value.
	d -= k; // So we should get the d < ui64 condition. Now verify:

	if ( d < (double)ui64 ) 
		message("ui64 condition checked!"); // True! :)

Anyway, as Timo pointed out in his post, take care with conversions when using different data types.

PaoloC13 · December 19, 2013, 03:46:19 PM

Thanks frankie, your code shows me that the variable ui64 is able to accept values that I was not able to pass at first.
Now the problem comes from my purpose, which is converting values derived from calculations on double-precision floating-point values to integer values. When setting up the overflow check I would like to take advantage of the whole range made available by the type unsigned __int64; if that is not possible I'll have to settle for lower limits.

I will take the time to carefully read and consider the proposals of Timo and then I'll come back to post.
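
For the overflow check described above, a minimal sketch in standard C might look like the following. The helper name double_to_u64_checked is hypothetical and not from this thread. It only verifies the range before casting; on the toolchain discussed here the cast itself may still go wrong for values at or above 2^63.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: store d in *out only when the truncated value of d
   fits in an unsigned 64-bit integer. Returns false otherwise (also for NaN). */
bool double_to_u64_checked(double d, uint64_t *out)
{
    /* 18446744073709551616.0 is 2^64, which a double represents exactly.
       UINT64_MAX itself is not representable, so compare against 2^64 with '<'. */
    if (!(d > -1.0 && d < 18446744073709551616.0))
        return false;
    *out = (uint64_t)d;   /* in range, so standard C defines this conversion */
    return true;
}
```

The comparison is written so that a NaN fails both tests and is rejected, and values between -1 and 0 are accepted because they truncate to 0.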

PaoloC13 · December 20, 2013, 12:05:42 PM

Then, I thoroughly read the link proposed by timovjl, "C++ unsigned __int64 to/from double conversions". It's a topic started by a programmer who experienced exactly the same problem. Here is how he presented it (translated from C++ to C):

Code Select


    double x = 0.0;  
    unsigned long long uint64 = 0;

    uint64 = 0x7000000000000000;
    x = (double)uint64;
    uint64 = (unsigned long long)x;    // uint64 is now '0x7000000000000000'

    uint64 = 0x9000000000000000;
    x = (double)uint64;
    uint64 = (unsigned long long)x;    // uint64 is now '0x8000000000000000', not '0x9000000000000000'

Unfortunately he did not get a solution. The only coherent answer that captures the problem arrives at post #9 and reads as follows:

Quote: At least under MSVC2005, this does appear to be a bug in the standard runtime. static_cast<unsigned long long>(double) produces a call to _ftol2, which does not respect unsigned-ness. A couple minutes on Google confirms that this bug still exists in VS2008 and maybe 2010 as well.

In post #18, the author of post #9 adds:

Quote: The problem is that the CPU instruction to do the conversion is not unsigned-aware. It only does signed conversions.

But I would say that the same code tested with the types __int32 and unsigned __int32 (adjusted to their limits) does not show the same problem.
So I'm not sure what to think.

jj2007 · December 20, 2013, 01:35:03 PM

Quote from: PaoloC13 on December 20, 2013, 12:05:42 PM

uint64 = 0x9000000000000000;
x = (double)uint64;
uint64 = (unsigned long long)x; // uint64 is now '0x8000000000000000', not '0x9000000000000000'

It's even worse, actually:

double x;
unsigned long long uint64;
__asm int 3; // set a breakpoint
uint64 = 0x9000000000000000;
__asm nop;
x = (double)uint64;
__asm nop;
uint64 = (unsigned long long)x; // uint64 is now '0x8000000000000000', not '0x9000000000000000'
__asm nop;
printf("The variable is %q", uint64);

Disassembly:

CPU Disasm
Address   Hex dump         Command                  Comments
00401053  |. CD 03         int 3
00401055  |. 90            nop
00401056  |. 90            nop
00401057  |. 90            nop
00401058  |. 68 00000080   push 80000000
0040105D  |. 6A 00         push 0
0040105F  |. 68 00704000   push offset 00407000     ; ASCII "The variable is %q"
00401064  |. E8 07030000   call printf

As you can see, Pelles C makes fun of your attempts to "convert" something, it just uses 0x80000000...

frankie · December 20, 2013, 01:58:10 PM

I made some tests.
The problem is the x87 FP unit, which is unaware of unsigned integers.
The 'funny' 0x80000000 is in reality the overflow indication, which is consistent because the number is not representable in a 64-bit signed variable.
This problem is not related to the compiler, but to the hardware (you will get the same result with any compiler).
I attach my personal workaround, which could be the base for your own improvements (glad to see the result...)
Another solution is to use a software FP library like GNU MPFR, or SSE or MMX.
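
The attachment is not reproduced here. As an illustration only, one common way to get around a signed-only conversion is to split the range at 2^63. The function name double_to_u64_split is hypothetical, and the sketch assumes the caller has already ensured 0 <= d < 2^64.

```c
#include <stdint.h>

/* Hypothetical workaround: convert a double in [0, 2^64) to uint64_t using
   only signed 64-bit conversions, which is all the x87 fistp offers. */
uint64_t double_to_u64_split(double d)
{
    const double two63 = 9223372036854775808.0;   /* 2^63, exact as a double */

    if (d < two63)
        return (uint64_t)(int64_t)d;              /* already fits the signed range */

    /* For d in [2^63, 2^64) the subtraction is exact and the remainder
       fits in int64_t; the top bit is then added back as an integer. */
    return (uint64_t)(int64_t)(d - two63) + 0x8000000000000000ULL;
}
```

With this sketch, 9300000000000000000.0 converts to 9300000000000000000 because the value never leaves the signed range during the floating-point to integer step.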

jj2007 · December 20, 2013, 05:27:10 PM

Quote from: frankie on December 20, 2013, 01:58:10 PM
The problem is the x87 FP unit that is unaware of unsigned integers.

The FPU is not the culprit. Sorry that the code below is assembler, but I see no other way to prove that...

Output:
Hex= 90000000 00000000
Dec= 10376293545756590080
Hex= 90000000 00000000
Dec= 10376293545756590080

include \masm32\MasmBasic\MasmBasic.inc ; download
.data
MyLongLong QWORD 9000000000000000h
MyDouble REAL8 123.456
MyConvertedLongLong QWORD ?

Init
PrintLine "Hex=", Tb$, Hex$(MyLongLong)
PrintLine "Dec=", Tb$, Str$("%u", MyLongLong)

fild MyLongLong ; these four instructions...
fstp MyDouble
fld MyDouble
fistp MyConvertedLongLong ; ... work like a charm

PrintLine "Hex=", Tb$, Hex$(MyConvertedLongLong)
PrintLine "Dec=", Tb$, Str$("%u", MyConvertedLongLong)
Inkey
Exit
end start

TimoVJL · December 20, 2013, 08:27:57 PM

Code Select

.model flat,stdcall
.386
.387

ExitProcess PROTO STDCALL :DWORD
INCLUDELIB kernel32.lib

printf PROTO C :DWORD,:VARARG
INCLUDELIB msvcrt.lib

.data
sMsg db "test",13,10,0
;sFmt db "%.1lf",13,10,"%llu   %llXh",13,10,0
sFmt db "%.1lf",13,10,"%I64u   %I64Xh",13,10,0
;md REAL8 1234567890123456789.0
;md REAL8 9223372036854775808.0
 md REAL8 9300000000000000000.0

.data?
mll QWORD ?
buf db 80 DUP(?)

.code
start:
		fld    md
		fistp  mll
		INVOKE printf, ADDR sFmt, md, mll, mll
		INVOKE  ExitProcess, eax
END start

outputs:

Code Select

1234567890123456800.0
1234567890123456768   112210F47DE98100h

Code Select

9223372036854775800.0
9223372036854775808   8000000000000000h

Code Select

9300000000000000000.0
9223372036854775808   8000000000000000h

Double_precision

jj2007 · December 20, 2013, 09:57:49 PM

Timo,

Of course you can't convert 930... to a QWORD, that is simply beyond the range. But it seems that printf cannot handle QWORDs properly (using a "legal" value of 9223372036854775000.0):

include \masm32\MasmBasic\MasmBasic.inc ; download
.data
sFmt db "%.1lf",13,10,"%llu",13,10, "%llXh",13,10,0
md REAL8 9223372036854775000.0
mll QWORD ?

Init
fld md ; convert from double...
fistp mll ; ... to long long
INVOKE crt_printf, ADDR sFmt, md, mll, mll
deb 4, "MB", md, mll, x:mll
Exit
end start

Output:

9223372036854774800.0
4294966272
7FFFFFFFh

MB
md 9.223372036854775e+18
mll 9223372036854774784
x:mll 7FFFFFFF FFFFFC00

The "hardware" part works just fine, it's printf that chokes.

EDIT:
I realise that I get results different from yours. Which msvcrt.lib are you using?
The point remains that it is not the FPU but rather the compiler, see the three nops in the disassembly. At compile time, it decides erroneously that 90...0h means overflow, and pushes 80..0h.
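
One possible reading of the printf output above, offered as an assumption rather than a confirmed diagnosis: if that printf consumes only 32 bits for each of %llu and %llX, the two values printed would be the low and high halves of the single QWORD 7FFFFFFF FFFFFC00. The helper names below are hypothetical; the sketch just does the arithmetic.

```c
#include <stdint.h>

/* Split a 64-bit value into the two 32-bit halves that a 32-bit x86 caller
   places on the stack (low half at the lower address). */
uint32_t low_dword(uint64_t v)  { return (uint32_t)(v & 0xFFFFFFFFu); }
uint32_t high_dword(uint64_t v) { return (uint32_t)(v >> 32); }
```

For 0x7FFFFFFFFFFFFC00 the low half is 0xFFFFFC00, which is 4294966272 in decimal, and the high half is 0x7FFFFFFF. These match the two lines printed by printf.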

PaoloC13 · December 21, 2013, 04:52:27 AM

My sin was thinking that could be a problem for beginners

TimoVJL · December 21, 2013, 03:27:57 PM

Quote from: PaoloC13 on December 21, 2013, 04:52:27 AM
My sin was thinking that could be a problem for beginners

That was a good question, judging by the number of replies. (A double problem as I see it: __ftol())

Bonus track:

Code Select

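// Assumes 0 <= x < 2^63; other inputs are not handled.
long long double_to_ll(double x)
{
    long long y = *(long long *)&x;           // raw IEEE-754 bits of x
    long long e = 0x3FF + 63 - (y >> 52);     // how far to shift right
    // significand with the implicit 1 placed at bit 63
    // (1ULL, not 1: shifting an int by 63 is undefined)
    unsigned long long m = 1ULL << 63 | (unsigned long long)y << 11;
    return m >> e & -(e < 64);                // mask gives 0 when x < 1
}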

@Jochen

Quote from: jj2007 on December 20, 2013, 09:57:49 PM
EDIT:
I realise that I get results different from yours. Which msvcrt.lib are you using?

I'm using WinXP's or Win7 X64's msvcrt.dll's and made msvcrt.lib with polib.exe

Quote from: jj2007 on December 20, 2013, 09:57:49 PM
EDIT:
The point remains that it is not the FPU but rather the compiler, see the three nops in the disassembly. At compile time, it decides erroneously that 90...0h means overflow, and pushes 80..0h.

Not an error, just optimized code: when __ftol() is used and the compiler knows the value statically, it folds the conversion at compile time. A compiler warning for that case is still missing, though.

frankie · December 21, 2013, 05:59:50 PM

The FP instructions:
fild -> load integer
fist -> store integer
fistp -> store integer and pop fp stack
work with *integers*; there is no instruction for unsigned integers.
Only the fistp instruction can handle 64-bit integers; the fist instruction is limited to 32-bit integers.
A number such as 9.3e+18 is too big for a signed 64-bit integer, but fits perfectly in an unsigned 64-bit integer. Unfortunately the Floating Point Unit does not know that unsigned integers exist, so if the number does not fit in a signed integer, then it is an overflow (0x80000000....).
An optimizing compiler removes instructions that it can detect at compile time as 'will never be executed', so if you want a faithful translation of your code you have to select 'no optimizations' in the compiler flags. PellesC optimizes well enough to detect such occurrences.

Now the memory layout of a double can be described as follows:

Code Select


#pragma pack(1)
typedef struct
{
	unsigned long long int mantissa:52;		//Mantissa or significand (fraction bits)
	unsigned long long int exponent:11;		//exponent biased by 1023
	unsigned long long int sign:1;			//sign of mantissa
} DOUBLE_FMT, *LP_DOUBLE_FMT;
#pragma pack()

The number represented is composed of an implicit 1 before the binary point followed by the fractional part, called the mantissa or significand, held in 52 bits. The whole mantissa is 53 bits long counting the implicit 1.
The exponent is the power of 2 by which we multiply the mantissa to obtain our number. The exponent can be positive or negative, to express a number >= 1 or < 1 respectively. However, the IEEE standard does not use a two's complement representation for the exponent, but biasing. The bias value (1023) represents 0: exponent fields less than 1023 represent numbers < 1 and exponent fields of 1023 or more represent numbers >= 1. Some exponent values (0h and 0x7ff) have special meanings; refer to the IEEE-754 standard.
So our number can be represented as:

number = (-1)^sign * 1.mantissa * 2^(exponent -1023)

This may seem complicated, but math operations in base 2 are very simple for the machine: multiplying the mantissa by the power of 2 given by the exponent simply requires shifting the mantissa by the exponent value, in a direction that depends on the exponent sign. Shift left for positive exponents and shift right for negative ones.
The sign can have two values (it is one bit wide), 1 or 0. Because any number raised to the power 0 gives 1 and raised to the power 1 gives itself, our number changes sign if sign is 1 and is unchanged if sign is 0 ((-1)^0=1 => 1*1.mantissa.... , (-1)^1=-1 => -1*1.mantissa....).
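
To make the formula above concrete, here is a small sketch that splits a double into its three fields and rebuilds the value from them. The names double_fields, split_double and join_double are hypothetical, and the sketch assumes an IEEE-754 binary64 double and a normal value (not zero, subnormal, infinity or NaN). It uses memcpy instead of a bit-field overlay so that it does not depend on bit-field layout.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper type mirroring the three IEEE-754 binary64 fields. */
typedef struct {
    uint64_t mantissa;   /* 52 fraction bits */
    unsigned exponent;   /* 11 bits, biased by 1023 */
    unsigned sign;       /* 1 bit */
} double_fields;

double_fields split_double(double d)
{
    uint64_t bits;
    double_fields f;
    memcpy(&bits, &d, sizeof bits);            /* copy the raw bits */
    f.mantissa = bits & 0xFFFFFFFFFFFFFULL;    /* low 52 bits */
    f.exponent = (unsigned)((bits >> 52) & 0x7FF);
    f.sign     = (unsigned)(bits >> 63);
    return f;
}

/* Rebuild a normal value: (-1)^sign * 1.mantissa * 2^(exponent - 1023). */
double join_double(double_fields f)
{
    double value = 1.0 + (double)f.mantissa / 4503599627370496.0;  /* 1.mantissa; divisor is 2^52 */
    int e = (int)f.exponent - 1023;                                /* remove the bias */
    for (; e > 0; --e) value *= 2.0;                               /* positive exponent: scale up */
    for (; e < 0; ++e) value /= 2.0;                               /* negative exponent: scale down */
    return f.sign ? -value : value;
}
```

For 9.3e18 the exponent field comes out as 1023 + 63, since the value lies between 2^63 and 2^64, and joining the fields again gives back the same double.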
A software routine converting a double to an unsigned 64-bit integer could look like this:

Code Select


	//MyDouble is the double to convert
	double MyDouble = 9.3e+18;
	//Cast double to the structure describing the format
	LP_DOUBLE_FMT pDouble = (LP_DOUBLE_FMT)&MyDouble;
	//Get mantissa promoted as integer part (not fraction)
	//and add the implicit integer part (53rd bit)
	ui64 = pDouble->mantissa | 0x10000000000000LLU;
	//Compute exponent subtracting bias and considering the fraction promotion
	int exponent = pDouble->exponent - 1023 - 52;
	//Adjust result by exponent
	if (exponent)
	{
		if (exponent > 0)
			ui64 <<= exponent;
		else
			ui64 >>= -exponent;
	}

	//Adjust for negative numbers
	if (pDouble->sign)
		ui64 = -ui64;

This code performs the same operations as Timo's sample taken from Jochen, which is somewhat poorly coded and documented (oh yes, black-magic coding...).
One last consideration concerns the approximation we get when using floating-point numbers. As seen, the mantissa holds only 53 bits, so only integers that fit in 53 bits are represented exactly; any greater value is in general just an approximation (the missing bits are replaced by zeroes):

Code Select


//Max exact value representable in a double
//(53 bits = 11111111111111111111111111111111111111111111111111111b)
#define MAXDOUBLEEXACT 9007199254740991LL

If your number requires, say, 54 bits, then in a double the last bit has no meaning, and you will see that the value can no longer change by units but only by the power of 2 corresponding to the missing bits. I.e. with 54 bits values change in steps of 2 (2^1), with 55 bits in steps of 4 (2^2), with 56 bits in steps of 8 (2^3) and so on....
So running a counter in a double is a very bad idea (at least for numbers bigger than MAXDOUBLEEXACT); better use an integer....
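
The step sizes described above can be checked with a round trip through double. This is a minimal sketch with a hypothetical helper name, survives_double. The volatile qualifier is there to force a real 64-bit store, so that an x87 build does not keep the value in an extended-precision register.

```c
#include <stdint.h>

/* Round-trip an integer through double and report whether it survived.
   Intended for values below 2^63, so the cast back stays in the signed range. */
int survives_double(uint64_t n)
{
    volatile double d = (double)n;   /* rounded to 53 significant bits here */
    return (uint64_t)d == n;
}
```

With this sketch, 9007199254740991 (MAXDOUBLEEXACT) and 9007199254740992 survive, 9007199254740993 does not, and 9007199254740994 survives again, which is the step of 2 expected for 54-bit values.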

P.S.
To pre-empt any comment that adding even small values (1.0, 10.0 or the like) to a value >= MAXDOUBLEEXACT still seems to give consistent results, I have to mention the FPU's internal data handling and rounding. The FPU generally uses extended double precision internally to allow better precision in floating-point calculations.
Extended double is 10 bytes long: the mantissa is 64 bits long including the integer part (integer part 1 bit, fractional part 63 bits), the exponent is 15 bits and the sign 1 bit.
The instructions that load and store floats convert between the internal format and the requested storage format (float, double, long double...).
So when calculations are performed in extended double the result is still fairly accurate, because the limits of the float representation have been moved up.....

Rounding was introduced in floating point because not all numbers can be represented exactly in base 2; in some cases the conversion generates a non-terminating fraction (to give an idea, think of 20/6=3.33333333333333....), and in this case the floating-point number is an approximation. The purpose of rounding is to get as close as possible to the real value by adding a small value that brings the significant figures close to our value.
For illustration, suppose the stored representation of 20.1 were 20.0999999999999999999999; adding a small rounding value such as 0.0000000000000000000001 then gives 20.1 as required.

jj2007 · December 22, 2013, 02:09:25 AM

Quote from: frankie on December 21, 2013, 05:59:50 PM
The FP instructions:
fild -> load integer
fist -> store integer
fistp -> store integer and pop fp stack
work with *integers*; there is no instruction for unsigned integers.
Only the fistp instruction can handle 64-bit integers; the fist instruction is limited to 32-bit integers.
A number such as 9.3e+18 is too big for a signed 64-bit integer, but fits perfectly in an unsigned 64-bit integer. Unfortunately the Floating Point Unit does not know that unsigned integers exist, so if the number does not fit in a signed integer, then it is an overflow (0x80000000....).

As already mentioned above, the issue is not only with the out-of-range number 9.3e+18. Let's look at a test case with a nice number, 12345678901234567890:

include \masm32\MasmBasic\MasmBasic.inc ; download
.data
MyLongLongA LONGLONG 12345678901234567890 ; 0xAB54A98CEB1F0AD2
MyDouble DOUBLE ?
MyLongLongB LONGLONG ?

Init
Dll "msvcr100" ; the standard Masm32 crt lib is not enough
Declare void printf, C:? ; don't return anything, C calling convention, vararg

printf(cfm$("MyLongLongA is %llX aka %llu\n"), MyLongLongA, MyLongLongA)

fild MyLongLongA ; these four instructions:
fstp MyDouble ; ... convert a LL to a double
fld MyDouble ; ... and back, and
fistp MyLongLongB ; ... they work like a charm

printf(cfm$("MyLongLongB is %llX aka %llu\n"), MyLongLongB, MyLongLongB)
Inkey
Exit
end start

Output (Assembler + MasmBasic):
MyLongLongA is AB54A98CEB1F0AD2 aka 12345678901234567890
MyLongLongB is AB54A98CEB1F0C00 aka 12345678901234568192

As you can see, MyLongLongB suffered a very small rounding error but otherwise it survived the conversion. Now to Pelles C:

#include <stdio.h>
#pragma warn(disable:2007) // don't show the "assembly not portable" warning
#pragma warn(disable:2118) // para not referenced

int main(int argc, char* argv[]) {
	double x;
	unsigned long long uint64;
	__asm int 3;
	uint64 = 0xAB54A98CEB1F0AD2;
	printf("MyLongLongA is %llX aka %llu\n", uint64, uint64);
	__asm nop;
	x = (double)uint64;
	__asm nop;
	uint64 = (unsigned long long)x; // this is __ftoll
	__asm nop;
	printf("MyLongLongB is %llX aka %llu\n", uint64, uint64);
}

Output (Pelles C):
MyLongLongA is AB54A98CEB1F0AD2 aka 12345678901234567890
MyLongLongB is 8000000000000000 aka 9223372036854775808

To understand the problem, I first had to find out how to disable all optimisations, but here is a look under the hood:

Disassembly (without printf's - this is the equivalent to the "four instructions" above)
int 3
mov dword ptr [ebp-10], EB1F0AD2
mov dword ptr [ebp-0C], AB54A98C
nop
mov eax, [ebp-10]   ; eax = EB1F0AD2
mov edx, [ebp-0C]   ; edx = AB54A98C
mov ecx, edx   ; ecx = AB54A98C
and edx, 7FFFFFFF   ; edx now 2B54A98C
push edx   ; hi DWORD (crippled!)
push eax   ; lo DWORD
fild qword ptr [esp]   ; 3.1223068643797920820e+18
add esp, 8
and ecx, 80000000   ; ecx AB... -> 80000000h
push ecx
push 0
fild qword ptr [esp]   ; -9.2233720368547758080e+18
add esp, 8
fsubp st(1), st
fstp qword ptr [ebp-8]
nop
fld qword ptr [ebp-8] ; now guess what is loaded into the FPU??
call __ftoll
mov [ebp-10], eax
mov [ebp-0C], edx
nop
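
The unsigned-to-double part of the listing above (from the first fild to fstp) can be read as the following C sketch. The function name is hypothetical and the mapping to the instructions is an interpretation of the disassembly. Note that the x87 does the two loads and the subtraction in extended precision and rounds once at fstp, while this sketch rounds at each step, so in rare cases the last bit could differ.

```c
#include <stdint.h>

/* Hypothetical C rendering of the listing: convert uint64_t to double using
   only signed 64-bit loads, which is what fild provides. */
double u64_to_double_signed_only(uint64_t u)
{
    /* and edx, 7FFFFFFF : clear the top bit so the value is a valid signed number */
    int64_t low63 = (int64_t)(u & 0x7FFFFFFFFFFFFFFFULL);
    double lo = (double)low63;                              /* first fild */

    /* and ecx, 80000000 : the top bit alone, read as signed, is -2^63 (or 0) */
    double hi = (u >> 63) ? -9223372036854775808.0 : 0.0;   /* second fild */

    return lo - hi;                                         /* fsubp: lo - (-2^63) */
}
```

For 0xAB54A98CEB1F0AD2 this sketch yields 12345678901234567168.0, in line with the 1.2345678901234567170e+19 reported for the fld.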

CPU and FPU don't care about signs, they treat (almost) everything as unsigned. Only jumps have their own rules. But if the compiler decides to be clever, voilà, it ends up in major acrobatics... ;-)

Now The Hammer at the end of the whole story: these compiler acrobatics did almost no harm to our nice number. The fld qword ptr [ebp-8] shows 1.2345678901234567170e+19.

What dumps the whole exercise in the end is ... __ftoll ... toll ... LOL

Merry Christmas to everybody,
JJ

TimoVJL · December 22, 2013, 12:21:37 PM

Example for POAsm too:

Code Select

.model flat,stdcall
.386
.387

ExitProcess PROTO STDCALL :DWORD
INCLUDELIB kernel32.lib

printf PROTO C :DWORD,:VARARG
INCLUDELIB msvcrt.lib
;INCLUDELIB msvcr100.lib

.data
sMsg db "test",13,10,0
sFmt db "%.1lf",13,10,"%I64u   %I64Xh",13,10,"%I64u   %I64Xh",13,10,0
mll1 QWORD 12345678901234567890

.data?
md REAL8 ?
mll2 QWORD ?
buf db 80 DUP(?)

.code
start:
		fild   mll1	; FILD    Load integer, fild mem64
		fstp   md	; FSTP    Floating point store and pop, fstp mem64
		fld    md	; FLD     Floating point load, fld mem64
		fistp  mll2	; FISTP   Store integer and pop, fistp mem64
		INVOKE printf, ADDR sFmt, md, mll1, mll1, mll2, mll2
		INVOKE  ExitProcess, eax
END start

msvcrt.def
polib.exe /DEF:msvcrt.def /OUT:msvcrt.lib

Code Select

LIBRARY msvcrt.dll
EXPORTS
	printf

output:

Code Select

-6101065172474983400.0
12345678901234567890   AB54A98CEB1F0AD2h
12345678901234568192   AB54A98CEB1F0C00h
