Pelles C forum

Pelles C => Feature requests => Topic started by: Robert on December 30, 2020, 10:40:32 AM

Title: printf UTF-8
Post by: Robert on December 30, 2020, 10:40:32 AM
printf UTF-8 ???
Someday ???
Someday soon ???

Code: [Select]

#include<stdio.h>

int main() {

    int στάσις = 1;
    στάσις++;
    printf("%d\n", στάσις);
    printf("%s", "δυσκατανοήτων");
    return 0;
}


Clang and GCC 10 output

Code: [Select]

2
δυσκατανοήτων


Pelles C 10 output

Code: [Select]

2
?????????????

Title: Re: printf UTF-8
Post by: TimoVJL on December 30, 2020, 11:34:30 AM
Those use msvcrt.dll
Code: [Select]
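#include<stdio.h>
#pragma comment(lib, "msvcrt.lib")  // import library for msvcrt.dll

void  __stdcall ExitProcess(int);
int  __stdcall SetConsoleOutputCP(int);
int main(void);  // forward declaration: main() is called before it is defined

// custom entry point: switch the console to UTF-8, then run main()
void __cdecl mainCRTStartup(void) {
    SetConsoleOutputCP(65001);
    ExitProcess(main());
}

int main(void) {

    int στάσις = 1;
    στάσις++;
    printf("%d\n", στάσις);
    printf("%s", u8"δυσκατανοήτων\n");
    return 0;
}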
Code: [Select]
2
δυσκατανοήτων
Title: Re: printf UTF-8
Post by: John Z on December 30, 2020, 12:12:57 PM
Hmmmm... msvcrt.lib/dll is not included with Pelles C. You'll need a copy, perhaps from MS Visual C; I see that I have it there, in C:\Program Files (x86)\DevStudio\VC\lib

BTW I tried
fwprintf(stderr, L"%ls\n",L"δυσκατανοήτων");

didn't do much better.

John Z
Title: Re: printf UTF-8
Post by: Robert on December 30, 2020, 12:33:18 PM
Thanks  TimoVJL.

I couldn't get your code to work, but the following did work in both 32-bit and 64-bit builds.

Code: [Select]

#include<stdio.h>
#include<windows.h>

int main(void) {
    SetConsoleOutputCP(65001);
    int στάσις = 1;
    στάσις++;
    printf("%d\n", στάσις);
    printf("%s", u8"δυσκατανοήτων");
    return 0;
}
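
A minimal sketch of a variation on the code above, assuming you would rather not leave the console in code page 65001 after the program exits: save the current output code page, switch to UTF-8 for the duration of the call, then restore it. The helper name print_utf8 is hypothetical, and treating a zero result from GetConsoleOutputCP as "nothing to restore" is an assumption of this sketch. Outside Windows the function simply prints.

```c
#include <stdio.h>
#ifdef _WIN32
#include <windows.h>
#endif

/* Hypothetical helper: print UTF-8 text, switching the console to
   code page 65001 only while the text is being written (Windows only).
   Returns what printf returns: the number of bytes written. */
static int print_utf8(const char *text)
{
#ifdef _WIN32
    UINT saved_cp = GetConsoleOutputCP();  /* assumption: 0 means nothing to restore */
    SetConsoleOutputCP(65001);             /* 65001 is the UTF-8 code page */
#endif
    int written = printf("%s", text);
    fflush(stdout);                        /* flush before the code page changes back */
#ifdef _WIN32
    if (saved_cp != 0)
        SetConsoleOutputCP(saved_cp);
#endif
    return written;
}
```

From main it could be called as print_utf8((const char *)u8"δυσκατανοήτων\n"). The console font still has to contain the glyphs.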

Title: Re: printf UTF-8
Post by: TimoVJL on December 30, 2020, 06:16:48 PM
@Robert, console mode is the problem: the font has to support more than basic ANSI.
@John Z, msvcrt.lib isn't part of Pelles C, but it was.
https://forum.pellesc.de/index.php?topic=7206.msg32907#msg32907
Title: Re: printf UTF-8
Post by: Vortex on December 30, 2020, 07:59:46 PM
Hi Robert,

Here is how to create the import library, one for 32-bit and one for 64-bit programming:

Code: [Select]
polib.exe /MACHINE:x86 /OUT:msvcrt.lib C:\Windows\SysWOW64\msvcrt.dll

polib.exe /MACHINE:x64 /OUT:msvcrt.lib C:\Windows\System32\msvcrt.dll

This example assumes that you are running a 64-bit system.

For a 32-bit OS:

Code: [Select]
polib.exe /MACHINE:x86 /OUT:msvcrt.lib C:\Windows\System32\msvcrt.dll
Title: Re: printf UTF-8
Post by: Robert on December 30, 2020, 09:02:22 PM
Hi Vortex:

Thank you for that useful information.

I do prefer a native Pelles C solution.

Still, it would be nice to be able to do

Code: [Select]

printf("%s", "δυσκατανοήτων");


without the u8 prefix, as is now possible in the newer versions of Clang and GCC.
Title: Re: printf UTF-8
Post by: TimoVJL on December 31, 2020, 05:09:31 PM
The console has to use Lucida Console, Consolas, or a similar TrueType font.
Title: Re: printf UTF-8
Post by: Robert on January 01, 2021, 04:59:36 AM
Sometimes I wonder, is it me or is it IT ?

Today, the code below runs as expected without the u8 prefix on the string literal, provided it is compiled with Pelles C, the console code page is set to 65001, and the console font contains Greek alphabet glyphs.

Code: [Select]

#include<stdio.h>

int main(void) {
    printf("%s", "δυσκατανοήτων");
    return 0;
}
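
One way to check what the compiler actually stored for an unprefixed literal is to inspect its bytes. The sketch below is a hypothetical helper (utf8_count is not part of any library) that returns the number of code points if the bytes are well-formed UTF-8 and -1 otherwise. It is deliberately simplified and does not reject overlong encodings or surrogates.

```c
#include <stddef.h>

/* Hypothetical helper: count code points in a NUL-terminated byte string.
   Returns -1 if the bytes are not structurally valid UTF-8.
   Simplified: overlong forms and surrogates are not rejected. */
static long utf8_count(const unsigned char *s)
{
    long count = 0;
    while (*s) {
        int extra;                                   /* continuation bytes expected */
        if (*s < 0x80)                extra = 0;     /* ASCII */
        else if ((*s & 0xE0) == 0xC0) extra = 1;     /* 2-byte sequence */
        else if ((*s & 0xF0) == 0xE0) extra = 2;     /* 3-byte sequence */
        else if ((*s & 0xF8) == 0xF0) extra = 3;     /* 4-byte sequence */
        else return -1;                              /* stray continuation or bad lead byte */
        s++;
        while (extra-- > 0) {
            if ((*s & 0xC0) != 0x80)                 /* also catches a premature NUL */
                return -1;
            s++;
        }
        count++;
    }
    return count;
}
```

Comparing the result with strlen gives three cases. A count smaller than strlen means multi-byte UTF-8 was stored. A count equal to strlen means only single-byte characters were stored, for example '?' substitutes. A result of -1 suggests a single-byte ANSI encoding.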


Title: Re: printf UTF-8
Post by: frankie on January 01, 2021, 01:13:31 PM
Sometimes I wonder, is it me or is it IT ?
It's IT!  >:( ;D ;D ;D ;D
The MS console is the worst piece of s...oftware ever written when multilingual support comes in...  ;)
Title: Re: printf UTF-8
Post by: Pelle on January 03, 2021, 08:02:22 PM
Since Pelles C is for Windows, the most obvious default runtime character set is/was "ANSI" (when no string prefix, for compatibility reasons) -- but this also means the "ANSI" character set that is in effect when compiling the source file(!)

This is far from perfect, but I'm not sure how to change the default without breaking old programs...  :-\
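
Until the default changes, one way to sidestep the compile-time "ANSI" code page is to keep the source file pure ASCII and build the UTF-8 bytes from code points at run time. The following is only a sketch under that assumption; utf8_encode is a hypothetical helper, not something provided by Pelles C.

```c
#include <stddef.h>

/* Hypothetical helper: encode one Unicode code point as UTF-8.
   Writes up to 4 bytes to out and returns the byte count,
   or 0 if cp is a surrogate or above U+10FFFF. */
static size_t utf8_encode(unsigned long cp, unsigned char out[4])
{
    if (cp < 0x80) {                       /* 1 byte: 0xxxxxxx */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {                      /* 2 bytes: 110xxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {                    /* 3 bytes */
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0;                      /* surrogates are not valid code points */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {                  /* 4 bytes */
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}
```

For example, utf8_encode(0x3B4, buf) stores the two bytes CE B4 for δ (U+03B4), regardless of which code page was active when the source was compiled.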