
Topic: UTF8 issue on Windows 7

Offline Vortex

  • Member
  • *
  • Posts: 802
    • http://www.vortex.masmcode.com
UTF8 issue on Windows 7
« on: August 30, 2023, 11:46:47 AM »
Hello,

pocc displays an error message when the /utf-8 option is specified on the command line. The problem appears on Windows 7 systems.

Code:
pocc /utf-8 -std:C11  /Tx86-coff -Ot -Ob1 -fp:precise -W1 -Ze -Zx test.c

E:\PellesC\Include\ctype.h(40): fatal error #1065: Failed converting input from 'U'.

It doesn't matter if the source code is ANSI, UTF8 or UTF8 with BOM.
Code it... That's all...

Offline John Z

  • Member
  • *
  • Posts: 796
Re: UTF8 issue on Windows 7
« Reply #1 on: August 30, 2023, 12:15:40 PM »
Hi Vortex,

I still have a Win 7 system running. Post your test.c file and I'll verify on
my system. Pelles C version 12, I assume?

John Z

Offline Vortex

  • Member
  • *
  • Posts: 802
    • http://www.vortex.masmcode.com
Re: UTF8 issue on Windows 7
« Reply #2 on: August 30, 2023, 12:39:55 PM »
Hi John,

Pelles C version : 12.00.2

The code does not matter; you will receive the same error message with the /utf-8 option on Windows 7:

Code:
#include <stdio.h>

int main(int argc, char *argv[])
{
  printf("test");
  return 0;
}

Offline John Z

  • Member
  • *
  • Posts: 796
Re: UTF8 issue on Windows 7
« Reply #3 on: August 30, 2023, 01:48:50 PM »
Thanks Vortex!

Confirmed: Win 7 Pro with V12 and Win 7 Home with V12 both show the issue.
Win 7 Home with V9 does not recognize the /UTF-8 option, as it was not implemented yet.

John Z

Based on the help file, it looks like this option was added in V11. Just FYI, it may be worth testing V11.
« Last Edit: August 30, 2023, 01:52:07 PM by John Z »

Offline Vortex

  • Member
  • *
  • Posts: 802
    • http://www.vortex.masmcode.com
Re: UTF8 issue on Windows 7
« Reply #4 on: August 30, 2023, 09:53:27 PM »
Hi John,

Thanks for your tests.

Offline MrBcx

  • Global Moderator
  • Member
  • *****
  • Posts: 176
    • Bcx Basic to C/C++ Translator
Re: UTF8 issue on Windows 7
« Reply #5 on: August 30, 2023, 11:00:26 PM »
Quote from: Vortex on August 30, 2023, 09:53:27 PM
Hi John,

Thanks for your tests.

Erol - I'm paying attention  ;)
Bcx Basic to C/C++ Translator
https://www.BcxBasicCoders.com

Offline TimoVJL

  • Global Moderator
  • Member
  • *****
  • Posts: 2091
Re: UTF8 issue on Windows 7
« Reply #6 on: August 31, 2023, 10:34:34 AM »
Code:
//#include <stdio.h>
int __cdecl printf(char*,...);
int main(int argc, char *argv[])
{
  printf("test");
  return 0;
}
Code:
10: pocc.exe -utf-8 test.c
fatal error: Unknown option: /utf-8.

11: pocc.exe -utf-8 test.c
test.c(5): fatal error #1065: Failed converting input using codepage 65001.

12: pocc.exe -utf-8 test.c
test.c(5): fatal error #1065: Failed converting input from 'U'.
« Last Edit: August 31, 2023, 10:36:08 AM by TimoVJL »
May the source be with you

Offline Pelle

  • Administrator
  • Member
  • *****
  • Posts: 2266
    • http://www.smorgasbordet.com
Re: UTF8 issue on Windows 7
« Reply #7 on: September 10, 2023, 07:50:03 PM »
The /utf8 option means the execution character set and the source character set are both UTF-8, i.e. a source file without a BOM must be UTF-8. 7-bit ASCII is a subset of UTF-8 and so will work; "exotic" ANSI characters will not.
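
To illustrate the point about ASCII versus "exotic" ANSI characters, here is a minimal sketch of a UTF-8 well-formedness check. It is not taken from pocc; the function name is_valid_utf8 is made up for this example. Pure 7-bit ASCII always passes, while a lone ANSI byte such as 0xE9 ("é" in Windows-1252) is rejected because it looks like the start of a multi-byte sequence that never completes.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: returns true if s[0..len) is well-formed UTF-8. */
bool is_valid_utf8(const unsigned char *s, size_t len)
{
    size_t i = 0;
    while (i < len) {
        unsigned char b = s[i];
        size_t extra;       /* continuation bytes expected after the lead byte */
        unsigned long cp;   /* code point being decoded */
        unsigned long min;  /* smallest code point allowed for this length */

        if (b < 0x80) {                 /* 7-bit ASCII: always valid */
            i++;
            continue;
        } else if ((b & 0xE0) == 0xC0) {
            extra = 1; cp = b & 0x1F; min = 0x80;
        } else if ((b & 0xF0) == 0xE0) {
            extra = 2; cp = b & 0x0F; min = 0x800;
        } else if ((b & 0xF8) == 0xF0) {
            extra = 3; cp = b & 0x07; min = 0x10000;
        } else {
            return false;               /* stray continuation or invalid lead byte */
        }

        if (len - i - 1 < extra)
            return false;               /* sequence is truncated */

        for (size_t k = 1; k <= extra; k++) {
            unsigned char c = s[i + k];
            if ((c & 0xC0) != 0x80)
                return false;           /* not a continuation byte */
            cp = (cp << 6) | (c & 0x3F);
        }

        /* Reject overlong encodings, surrogates and out-of-range values. */
        if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return false;

        i += extra + 1;
    }
    return true;
}
```

With this check, the bytes of "test" are accepted, the Windows-1252 bytes 74 E9 73 74 are rejected, and the UTF-8 bytes 74 C3 A9 73 74 are accepted.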

I don't have a Win7 machine for a quick test right now; I will see if it's possible to set one up ...
/Pelle

Offline John Z

  • Member
  • *
  • Posts: 796
Re: UTF8 issue on Windows 7
« Reply #8 on: September 13, 2023, 03:01:55 PM »
OK interesting .....

Quote from: Pelle on September 10, 2023, 07:50:03 PM
The /utf8 option means the execution character set and the source character set are both UTF-8, i.e. a source file without a BOM must be UTF-8. 7-bit ASCII is a subset of UTF-8 and so will work; "exotic" ANSI characters will not.

I don't have a Win7 machine for a quick test right now; I will see if it's possible to set one up ...

I note that the above mentions /utf8, not /utf-8, so I tried that on Win 7 Home with Pelles C v12.002: the object file was created and there was NO error message. Next I made up a random fake command switch and got the "unknown option" error, as I should, and no obj file was created.

Then I tried /utf: no error. /ut: no error. /u shows the switches list (and is a valid switch to undefine a pp symbol, although it should be uppercase). As long as /ut is there, anything can come after it, like /utf99999, with no error....

So with the test.c file one can't really tell if utf-8 is working, because all its characters are 7-bit ASCII, which are always valid. We should try with valid characters unique to UTF-8 and/or a BOM. However, it appears that /utf8 or just /ut are valid switch inputs and adequate for Windows 7 and Pelles C version 12.00.02 with pocc version 12.0.1.0.

I'm not saying it makes sense  :P  Going to test more ....

John Z

Offline Pelle

  • Administrator
  • Member
  • *****
  • Posts: 2266
    • http://www.smorgasbordet.com
Re: UTF8 issue on Windows 7
« Reply #9 on: September 26, 2023, 06:39:16 PM »
Quote from: John Z on September 13, 2023, 03:01:55 PM
Then I tried /utf: no error. /ut: no error. /u shows the switches list (and is a valid switch to undefine a pp symbol, although it should be uppercase). As long as /ut is there, anything can come after it, like /utf99999, with no error....

Good catch! It's /utf-8, but a missing break makes invalid cases like /utf8 fall into the following option case, where it will silently be "handled" (as in not causing a diagnostic) ...
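
For anyone wondering how a missing break produces this behaviour, here is a minimal sketch. It is not pocc's actual parser: the names OptResult, parse_buggy and parse_fixed, and the 'v' option that stands in for "the following option case", are all made up for illustration.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical result codes for this sketch. */
typedef enum { OPT_UNKNOWN, OPT_UTF8, OPT_VERBOSE } OptResult;

static bool is_switch(const char *arg)
{
    return (arg[0] == '/' || arg[0] == '-') && arg[1] != '\0';
}

/* Buggy variant: a mistyped /utf8 slides into the next case. */
OptResult parse_buggy(const char *arg)
{
    if (!is_switch(arg))
        return OPT_UNKNOWN;
    switch (arg[1]) {
    case 'u':
        if (strcmp(arg + 1, "utf-8") == 0)
            return OPT_UTF8;
        /* falls through - BUG: the break is missing here */
    case 'v':
        return OPT_VERBOSE;   /* silently "handles" the bad option */
    default:
        return OPT_UNKNOWN;
    }
}

/* Fixed variant: the break sends bad spellings to the error path. */
OptResult parse_fixed(const char *arg)
{
    if (!is_switch(arg))
        return OPT_UNKNOWN;
    switch (arg[1]) {
    case 'u':
        if (strcmp(arg + 1, "utf-8") == 0)
            return OPT_UTF8;
        break;                /* anything else starting with 'u' is unknown */
    case 'v':
        return OPT_VERBOSE;
    default:
        break;
    }
    return OPT_UNKNOWN;       /* caller would report "Unknown option" */
}
```

In this sketch parse_buggy("/utf8") returns OPT_VERBOSE with no diagnostic, while parse_fixed("/utf8") returns OPT_UNKNOWN. Both accept "/utf-8".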

About the original problem: I managed to boot up (literally) an old inherited laptop with Windows 7. The problem boiled down to the horror show of an API function called WideCharToMultiByte(); the flags and other parameters must match the Windows version (and the online documentation isn't exactly helpful) ...
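
For readers who have not used the function, here is a minimal sketch of a conservative UTF-16 to UTF-8 conversion. It is not pocc's code, and the thread does not say which flag or parameter actually failed on Windows 7. It only follows what the Microsoft documentation states for CP_UTF8: dwFlags must be 0 or WC_ERR_INVALID_CHARS, and lpDefaultChar and lpUsedDefaultChar must be NULL. The helper name utf16_to_utf8 is hypothetical, and the non-Windows branch is just a stub so the file compiles elsewhere.

```c
#include <stddef.h>
#ifdef _WIN32
#include <windows.h>
#endif

/*
 * Hypothetical helper: converts a NUL-terminated UTF-16 string to UTF-8.
 * Returns the number of bytes written (including the NUL), or 0 on failure.
 */
int utf16_to_utf8(const wchar_t *src, char *dst, int dst_size)
{
#ifdef _WIN32
    /*
     * Most conservative choice for CP_UTF8: no flags at all, and NULL for
     * both default-character parameters. Passing -1 as the input length
     * makes the function process the string up to and including the NUL.
     */
    return WideCharToMultiByte(CP_UTF8, 0, src, -1, dst, dst_size, NULL, NULL);
#else
    /* Stub: WideCharToMultiByte only exists on Windows. */
    (void)src;
    (void)dst;
    (void)dst_size;
    return 0;
#endif
}
```

When the call returns 0 on Windows, GetLastError() tells the cases apart: ERROR_INVALID_FLAGS for a flag combination the code page does not accept, ERROR_INVALID_PARAMETER for a bad parameter value.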
/Pelle