Bug in strtoull?

Started by Werner, July 08, 2021, 03:30:35 PM


Werner

In 7.22.1.7 p. 8 the standard stipulates:

"If the correct value is outside the range of representable values, LONG_MIN, LONG_MAX, LLONG_MIN, LLONG_MAX, ULONG_MAX, or ULLONG_MAX is returned (according to the return type and sign of the value, if any), and the value of the macro ERANGE is stored in errno."

In the following example Pelles C returns "errno == 0" although the passed string provides a negative value which is "outside the range of representable values" in an unsigned long long int variable.

Am I mistaken?


#include   <errno.h>
#include   <stdio.h>
#include   <stdlib.h>

int main(void)
{
   const char* p_minus_one = "-1";
   unsigned long long int result;
   
   errno = 0;
   printf("errno before \"strtoull\": %i\n", errno);

   result = strtoull(p_minus_one, NULL, 0);
   printf("result                   : %llu\n", result); // expected by ISO/IEC 9899: ok
   printf("errno after \"strtoull\" : %i\n", errno); // expected ERANGE = 34 but received 0: bug?
}

Pelle

Well, "-unsigned" is an acceptable idiom to use. strtoull() (as well as strtoul()) will accept a negative sign, and as long as the result fits everything is OK.
With two's complement, -1ull is the same as ULLONG_MAX which does fit (no part of the value is dropped).

You have a point, but I can't find much support either way in the standard. FWIW, Microsoft compiler/runtime seems to agree with me on this one (i.e. no error from your example).
/Pelle

MrBcx

Quote from: Pelle on July 08, 2021, 07:23:55 PM
  FWIW, Microsoft compiler/runtime seems to agree with me on this one (i.e. no error from your example).

FYI ... I get identical output from Lcc-Win32, Pelles, MSVC, Mingw, and Clang. 

errno before "strtoull": 0
result                   : 18446744073709551615
errno after "strtoull" : 0
Bcx Basic to C/C++ Translator
https://www.bcxbasiccoders.com

Werner

Thank you very much, Pelle and MrBcx, for your comments.

The correctness of the conversion of the subject sequence is beyond debate.

But being a lawyer in real life--although not a language lawyer--
I feel impelled to scrutinize the standard's wording.

And indeed, 7.22.1.7 p. 5 justifies, even enforces, the common treatment
of the case in question.

Nevertheless, I cannot appreciate this. After all, the strto... functions
are intended to convert real world numbers, not two's complements.
So, transforming true negative numbers into positive ones, while technically correct,
seems wrong to me. At least, the standard should spare a footnote.

I apologize for bothering you all with my personal opinions.

John Z

Werner, you have nothing to apologize for; open discourse is encouraged.

IMO this part of the standard 7.22.1.4 helps clarify:
"If the subject sequence has the expected form and the value of base is zero, the sequence of characters
starting with the first digit is interpreted as an integer constant according to the rules of 6.4.4.1. If
the subject sequence has the expected form and the value of base is between 2 and 36, it is used as
the base for conversion, ascribing to each letter its value as given above. If the subject sequence
begins with a minus sign, the value resulting from the conversion is negated (in the return type). A
pointer to the final string is stored in the object pointed to by endptr, provided that endptr is not a
null pointer."

The minus sign is applied only after the conversion of the digits has been completed. So for UL or ULL the digit conversion itself never processes a negative number: it converts the numerical portion and then applies the negation in the unsigned return type.

John Z

Pelle

Quote from: Werner on July 09, 2021, 10:44:41 AM
And indeed, 7.22.1.7 p. 5 justifies, even enforces, the common treatment
of the case in question.
Right, sloppy reading on my part.

strtoull() (from C99) is really just an extension of strtoul() (from C89). I don't have the complete history for C89, so I can only speculate. Since C89 was the first standard for a (by then) already popular language, I guess they wanted to include rather than exclude as many implementations as possible. Accepting "-unsigned" allows for more code reuse, which was more important during the 1980s with less RAM. Today we have embedded systems, and maybe there isn't enough incentive to mess with the standard in this area (even a footnote). As I said, I don't really know...

Quote from: Werner on July 09, 2021, 10:44:41 AM
I apologize for bothering you all with my personal opinions.
You always have good points, more interesting (to me) than many other posts in this forum. Keep it coming...  ;)
/Pelle

Grincheux

My solution : strtoul(rintl(fabs(number)))
or sprintf with %llu or %lu