
Author Topic: Why does pellesc cast hex char to compile-time codepage?

dienstag

  • Guest
Re: Why does pellesc cast hex char to compile-time codepage?
« Reply #15 on: December 14, 2012, 10:54:59 AM »
Quote
C99 standard 6.4.4.4 'Character constants' in paragraph 9 'constraints' specify
"The value of an octal or hexadecimal escape sequence shall be in the range of representable values for the type unsigned char for an integer character constant, or the unsigned type corresponding to wchar_t for a wide character constant."
So it must be unsigned.

That is the wrong conclusion. The standard says that if the hexadecimal sequence fits into an unsigned char, it is a correctly specified integer character constant, no matter whether char is signed or unsigned. The compiler should therefore take such bytes 1:1, even when the leading bit would make the value negative.

The standard says nothing about strings here; however, if you consider a string as a sequence of character constants, then a hexadecimal escape specifying a byte must appear 1:1 in the compiled code, no matter what those bits actually mean, even if they completely scramble the readability of the string.

Compilers should behave as GCC/VC/LCC do. They have done so all along, and there is no reason that source code which has worked for decades should now produce different results. Escape sequences were invented to put codes into characters and strings that usually do not belong there, for example to specify encrypted string constants.

Offline frankie

  • Global Moderator
  • Member
  • *****
  • Posts: 2096
Re: Why does pellesc cast hex char to compile-time codepage?
« Reply #16 on: December 14, 2012, 11:09:53 AM »
More subtly, it means that whatever you write is taken as unsigned.
I fully agree that correct compilation must be guaranteed for old code (though old code using wchar_t is not so widespread).
Of course, extending the rule from char constants to the constants in strings, as strings are composed of chars, is perfectly legal.
It is also perfectly legal for the compiler to behave as it likes, going by the last sentence of the character-constant semantics.
So, in the end, is there any way to classify this as a bug?
Or should we put it on the wish list?
What is your opinion?
It is better to be hated for what you are than to be loved for what you are not. - Andre Gide

CommonTater

  • Guest
Re: Why does pellesc cast hex char to compile-time codepage?
« Reply #17 on: December 14, 2012, 02:10:01 PM »
I would suggest that we need to be careful to understand the difference between a CHAR array and a STRING array (even though C doesn't actually have strings). 

If you want literal storage of every value in a char array ... don't use string functions to put them there.  (i.e. memcpy() instead of strcpy() etc.)

If on the other hand you actually are working with text strings, then the language and character set become a consideration.  For example: the symbol ü, #129, in code page 406 might be 213 in code page 1402 .... To correctly display text across those two languages, some translation of character values is necessary.

I would favour a compiler flag to disable localization functions... but with the complexity of a world with over 400 languages, I suggest it should default to "On".
 
« Last Edit: December 14, 2012, 02:12:44 PM by CommonTater »

Offline Stefan Pendl

  • Global Moderator
  • Member
  • *****
  • Posts: 582
    • Homepage
Re: Why does pellesc cast hex char to compile-time codepage?
« Reply #18 on: December 15, 2012, 09:35:51 AM »
Quote
Compilers should behave as GCC/VC/LCC do.

Why should one compiler behave like another one, when there is an ISO standard to follow?

Wouldn't this result in the same problem that IE puts on us, which defines its own standards apart from the ISO standard for HTML?

Any compiler must follow the ISO standard, but can add its own extensions for whatever reason.

If you don't like that Pelles C only implements the ISO standard without any extensions, you are free to use any other compiler that includes behavior different from the ISO standard.
---
Stefan

Proud member of the UltraDefrag Development Team

CommonTater

  • Guest
Re: Why does pellesc cast hex char to compile-time codepage?
« Reply #19 on: December 15, 2012, 08:39:45 PM »
Quote
If you don't like that Pelles C only implements the ISO standard without any extension, you are free to use any other compiler, that includes behavior that is different from the ISO standard.

Actually if you enable Pelles C extensions in the compiler settings or use /Zx on the command line, there are quite a number of extensions.  Each extension and function is clearly marked as "Not Standard C" in the help file, and there are hundreds of them.
 
NO compiler is obligated to follow any standard.  This is purely voluntary (although it makes sense that they would). 
 
From the help file...
Code: [Select]
/Zx option (POCC) [2.70]

Syntax: /Zx

Description: The /Zx option makes the compiler accept Pelle's extensions to standard C.

The currently supported extensions are:
  • Optional arguments - similar to C++.
  • Support for the GCC extension __typeof__ [4.00].
  • Support for the GCC extension __alignof__ (same as the __alignof operator) [5.00].
  • Support for the GCC case range extension: case expr ... expr [4.00].
  • Support for the GCC escape sequence \e (ASCII character ESC) [4.00].
  • Support for the GCC extension binary constants, using the 0b (or 0B) prefix followed by a sequence of '0' and '1' digits [6.00].

 Example 1: int test2(int a = 100)
{
    return a * 2;
}

int main(void)
{
    return test2();  // Not necessary to specify an argument to test2, the default value 100 is used in this case.
}

Example 2: #define pointer(T)  __typeof__(T *)
#define array(T, N)  __typeof__(T [N])

array(pointer(char), 4) arrname;
 
 Example 3: switch (c)
{
    case '0' ... '9': /* digit */
    case 'a' ... 'z': /* lower case (works for ASCII/ANSI) */
}
 
 Example 4: unsigned int mask = 0b1111;  /* binary 1111 is decimal 15 */

Also see the help files list of "Private #include files"
 
« Last Edit: December 15, 2012, 08:45:40 PM by CommonTater »

aMarCruz

  • Guest
Re: Why does pellesc cast hex char to compile-time codepage?
« Reply #20 on: May 30, 2014, 08:30:40 PM »
For all,
in Pelles C v8 there is no translation at compile time.
Test with:

Code: [Select]
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <tchar.h>

// char to UINT
#define CH2UINT(c) ((unsigned int)((unsigned char)(c)))

#define SDUMP6(s) (void)_tprintf(_T("%s: %x,%x,%x,%x,%x,%x\n"), #s, \
        CH2UINT(s[0]), CH2UINT(s[1]), CH2UINT(s[2]), \
        CH2UINT(s[3]), CH2UINT(s[4]), CH2UINT(s[5]))

char *s1  = "\xfa\xfb\xfc\xfd\xfe\xff";
char s2[] = {'\xfa','\xfb','\xfc','\xfd','\xfe','\xff'};
char s3[] = {0xfa, 0xfb, 0xfc, 0xfd, 0xfe, 0xff};

int _tmain(void)
{
    SDUMP6(s1);
    SDUMP6(s2);
    SDUMP6(s3);

    //note: don't use sizeof(s1) - s1 is a pointer, and its string literal has a zero terminator
    if (memcmp(s1,s2,sizeof(s2)) || memcmp(s2,s3,sizeof(s3)))
        _tprintf(_T("Buffers are NOT equal!\n"));

    return 0;
}

@beto