
Author Topic: printf UTF-8

Offline Robert

  • Member
  • *
  • Posts: 247
printf UTF-8
« on: December 30, 2020, 10:40:32 AM »
printf UTF-8 ???
Someday ???
Someday soon ???

Code: [Select]

#include<stdio.h>

int main() {

    int στάσις = 1;
    στάσις++;
    printf("%d\n", στάσις);
    printf("%s", "δυσκατανοήτων");
    return 0;
}


Clang, GCC 10 output

Code: [Select]

2
δυσκατανοήτων


Pelles C 10 output

Code: [Select]

2
?????????????


Offline TimoVJL

  • Global Moderator
  • Member
  • *****
  • Posts: 2122
Re: printf UTF-8
« Reply #1 on: December 30, 2020, 11:34:30 AM »
Those compilers (Clang and GCC) use msvcrt.dll. The same can be done from Pelles C:
Code: [Select]
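#include <stdio.h>
#pragma comment(lib, "msvcrt.lib")

/* Win32 prototypes declared by hand instead of including windows.h */
void __stdcall ExitProcess(int);
int  __stdcall SetConsoleOutputCP(int);

int main(void);  /* forward declaration: main is called before its definition */

/* Custom entry point: switch the console to UTF-8, then run main */
void __cdecl mainCRTStartup(void) {
    SetConsoleOutputCP(65001);  /* 65001 = UTF-8 code page */
    ExitProcess(main());
}

int main(void) {

    int στάσις = 1;
    στάσις++;
    printf("%d\n", στάσις);
    printf("%s", u8"δυσκατανοήτων\n");
    return 0;
}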
Code: [Select]
2
δυσκατανοήτων
May the source be with you

Offline John Z

  • Member
  • *
  • Posts: 865
Re: printf UTF-8
« Reply #2 on: December 30, 2020, 12:12:57 PM »
Hmmmm... msvcrt.lib/dll is not included with Pelles C. You'll need a copy, perhaps from MS Visual C; I see that I have it there, in C:\Program Files (x86)\DevStudio\VC\lib

BTW, I tried
fwprintf(stderr, L"%ls\n",L"δυσκατανοήτων");

It didn't do much better.

John Z

Offline Robert

  • Member
  • *
  • Posts: 247
Re: printf UTF-8
« Reply #3 on: December 30, 2020, 12:33:18 PM »
Thanks  TimoVJL.

I couldn't get your code to work, but the following did work in both 32-bit and 64-bit builds.

Code: [Select]

#include<stdio.h>
#include<windows.h>

int main(void) {
    SetConsoleOutputCP(65001);
    int στάσις = 1;
    στάσις++;
    printf("%d\n", στάσις);
    printf("%s", u8"δυσκατανοήτων");
    return 0;
}


Offline TimoVJL

  • Global Moderator
  • Member
  • *****
  • Posts: 2122
Re: printf UTF-8
« Reply #4 on: December 30, 2020, 06:16:48 PM »
@Robert, console mode is a problem; the font has to support more than basic ANSI.
@John Z, msvcrt.lib isn't part of Pelles C, but i was.
https://forum.pellesc.de/index.php?topic=7206.msg32907#msg32907
« Last Edit: December 30, 2020, 06:20:01 PM by TimoVJL »

Offline Vortex

  • Member
  • *
  • Posts: 870
    • http://www.vortex.masmcode.com
Re: printf UTF-8
« Reply #5 on: December 30, 2020, 07:59:46 PM »
Hi Robert,

Here is how to create the import library, one for 32-bit and the other one for 64-bit programming :

Code: [Select]
polib.exe /MACHINE:x86 /OUT:msvcrt.lib C:\Windows\SysWOW64\msvcrt.dll

polib.exe /MACHINE:x64 /OUT:msvcrt.lib C:\Windows\System32\msvcrt.dll

In this example, it's assumed that you are operating under a 64-bit system.

For 32-bit OS :

Code: [Select]
polib.exe /MACHINE:x86 /OUT:msvcrt.lib C:\Windows\System32\msvcrt.dll
« Last Edit: December 30, 2020, 08:01:20 PM by Vortex »
Code it... That's all...

Offline Robert

  • Member
  • *
  • Posts: 247
Re: printf UTF-8
« Reply #6 on: December 30, 2020, 09:02:22 PM »
Hi Vortex:

Thank you for that useful information.

I do prefer a native Pelles C solution.

Still, it would be nice to be able to do

Code: [Select]

printf("%s", "δυσκατανοήτων");


without the u8 prefix, as is possible now in the newer versions of Clang and GCC.

Offline TimoVJL

  • Global Moderator
  • Member
  • *****
  • Posts: 2122
Re: printf UTF-8
« Reply #7 on: December 31, 2020, 05:09:31 PM »
The console has to use Lucida Console, Consolas, or a similar TrueType font.

Offline Robert

  • Member
  • *
  • Posts: 247
Re: printf UTF-8
« Reply #8 on: January 01, 2021, 04:59:36 AM »
Sometimes I wonder, is it me or is it IT ?

Today, without the u8 prefix on the quoted string literal, the code below executes as expected when compiled with Pelles C, provided the console code page is set to 65001 and the console font contains Greek alphabet glyphs.

Code: [Select]

#include<stdio.h>

int main(void) {
    printf("%s", "δυσκατανοήτων");
    return 0;
}



Offline frankie

  • Global Moderator
  • Member
  • *****
  • Posts: 2113
Re: printf UTF-8
« Reply #9 on: January 01, 2021, 01:13:31 PM »
Quote from: Robert
Sometimes I wonder, is it me or is it IT ?

It's IT!  >:( ;D ;D ;D ;D
The MS console is the worst piece of s...oftware ever written when it comes to multilingual support...  ;)
« Last Edit: January 01, 2021, 01:15:19 PM by frankie »
"It is better to be hated for what you are than to be loved for what you are not." - Andre Gide

Offline Pelle

  • Administrator
  • Member
  • *****
  • Posts: 2266
    • http://www.smorgasbordet.com
Re: printf UTF-8
« Reply #10 on: January 03, 2021, 08:02:22 PM »
Since Pelles C is for Windows, the most obvious default runtime character set is/was "ANSI" (when no string prefix, for compatibility reasons) -- but this also means the "ANSI" character set that is in effect when compiling the source file(!)

This is far from perfect, but I'm not sure how to change the default without breaking old programs...  :-\
/Pelle