When the poide.exe IDE is shut down and restarted, the UTF-8 in this code
#include <stdio.h>

/* entry point */
int main(void)
{
    printf("Hello, world!\n");
    printf("Салом Ҷаҳон!\n");
    return 0;
}
is reloaded into the poide.exe IDE with question marks replacing the non-ASCII UTF-8 characters, as
#include <stdio.h>

/* entry point */
int main(void)
{
    printf("Hello, world!\n");
    printf("????? ?????!!\n");
    return 0;
}
Tofu
https://fonts.google.com/knowledge/glossary/tofu
Eroteme
https://en.wiktionary.org/wiki/eroteme
https://en.wikipedia.org/wiki/Question_mark
Hi Robert,
This is not really a bug. It may be a minor inconvenience, but here is the situation as I understand it.
Pelles C was originally ASCII/ANSI for all source files.
Pelles C then converted to having UTF-8 as the default for all source files.
It also supports UTF-16 for source files. When you create a new source file within the IDE it is automatically UTF-8. You will see that the source file tab also shows UTF-8 (or UTF-16). If it shows nothing but the name, the source file is at best ASCII/ANSI. When using 'OLD' source code, or creating the source code file outside of Pelles C with a plain text editor, it will be ASCII/ANSI.
Now the critical part is that the editor always works in UTF-8 by default. This allows UTF-8 characters to be entered in the source code page, but since that page is not identified as UTF-8, it will fail to display as expected when reloaded.
So the Export64 program, for example, does not show UTF-8 in the tab, so it is still ASCII/ANSI, even though the editor can make the 'display' show the character.
Using any editor that supports UTF-8, a source file can be created, or just resaved, with the encoding set to UTF-8.
I use TextPad, for example, to resave Export.c as Export_UTF8.c, and if you add it to the Export64 program you will see the source tab shows the encoding. If you run your test on this file it should 'pass' reloading.
Hope this was at least a little bit clear -
John Z
The other method is to create a blank source file in the IDE, then paste in the old source code. When saved it will be UTF-8.
Thanks John Z.
The file, created in EditPad, is initially a No-BOM UTF-8 file with only ASCII characters.
The file is then modified in the poide.exe IDE, adding UTF-8 glyphs beyond U+00FF.
The file is saved.
When opened in EditPad the file is reported as Windows 1252.
When re-opened in poide.exe, the UTF-8 glyphs beyond U+00FF have been replaced with erotemes.
If a No-BOM UTF-8 file, with at least one beyond U+00FF glyph, is initially loaded into poide.exe, then the file will be saved as UTF-8 No-BOM.
I will have to remember that.
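For anyone wondering where the erotemes come from, here is a minimal sketch of what a lossy save to a single-byte encoding does. It is not the IDE's actual code. The function name utf8_to_single_byte is hypothetical, and Latin-1 is used as a simplified stand-in for Windows 1252, so anything beyond U+00FF has no slot and becomes '?'.

```c
#include <stddef.h>

/* Hypothetical helper (not from Pelles C): converts UTF-8 text to Latin-1,
   used here as a simplified stand-in for a single-byte ANSI code page.
   Any code point above U+00FF is written as '?'.
   Assumes well-formed UTF-8 input; 'out' must have room for strlen(in) + 1 bytes.
   Returns the number of bytes written, not counting the terminator. */
size_t utf8_to_single_byte(const char *in, char *out)
{
    const unsigned char *p = (const unsigned char *)in;
    size_t n = 0;

    while (*p) {
        unsigned long cp;
        int extra;

        /* The lead byte tells how many continuation bytes follow. */
        if (*p < 0x80)      { cp = *p;        extra = 0; }
        else if (*p < 0xE0) { cp = *p & 0x1F; extra = 1; }
        else if (*p < 0xF0) { cp = *p & 0x0F; extra = 2; }
        else                { cp = *p & 0x07; extra = 3; }
        p++;

        /* Fold in six payload bits from each continuation byte. */
        while (extra-- > 0 && (*p & 0xC0) == 0x80)
            cp = (cp << 6) | (*p++ & 0x3F);

        /* One output byte per code point; no slot means '?'. */
        out[n++] = (cp <= 0xFF) ? (char)(unsigned char)cp : '?';
    }
    out[n] = '\0';
    return n;
}
```

Each Cyrillic letter in the test string is two bytes in UTF-8 but a single code point above U+00FF, so every letter collapses to one question mark while the space, the exclamation mark and the newline pass through unchanged. Once the file is written that way the original bytes are gone, which is why reloading cannot bring the glyphs back.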
I trust the programmer to choose the proper file format when saving.
I try to identify UTF-8 encoded text files without a BOM when loading into the IDE, but a UTF-8 text file without a BOM and without any UTF-8 encoded characters looks like any ASCII(/ANSI) text file.
A file that does not look like UTF-8 encoded text when loaded, and is then annotated with "exotic" characters and just saved again, will probably not go too well. The IDE can be smarter, but I don't want too many "helpful" dialogs either... (I usually do this when in a source code editor: go to "Properties", and check/change "Encoding").
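To illustrate why a BOM-less file with only ASCII characters cannot be told apart, here is a rough sketch of the kind of check a loader might do. It is an assumption for illustration only, not the IDE's real detection code; guess_encoding and the GUESS_* names are made up, and the validation is simplified (it does not reject overlong forms or surrogates).

```c
#include <stddef.h>

/* Hypothetical result codes for this sketch. */
enum text_guess { GUESS_ASCII, GUESS_UTF8, GUESS_OTHER };

/* Simplified heuristic: a BOM or at least one valid multi-byte sequence
   means UTF-8; pure 7-bit content is indistinguishable from ASCII/ANSI;
   anything else is treated as some other (ANSI) encoding. */
enum text_guess guess_encoding(const unsigned char *buf, size_t len)
{
    size_t i = 0;
    int saw_multibyte = 0;

    /* A UTF-8 BOM (EF BB BF) settles the question immediately. */
    if (len >= 3 && buf[0] == 0xEF && buf[1] == 0xBB && buf[2] == 0xBF)
        return GUESS_UTF8;

    while (i < len) {
        unsigned char c = buf[i];
        size_t extra, k;

        if (c < 0x80) {            /* plain ASCII byte, says nothing */
            i++;
            continue;
        }
        if (c >= 0xC2 && c <= 0xDF)      extra = 1;
        else if (c >= 0xE0 && c <= 0xEF) extra = 2;
        else if (c >= 0xF0 && c <= 0xF4) extra = 3;
        else return GUESS_OTHER;   /* not a legal UTF-8 lead byte */

        if (extra > len - i - 1)   /* sequence runs past the end */
            return GUESS_OTHER;
        for (k = 1; k <= extra; k++)
            if ((buf[i + k] & 0xC0) != 0x80)
                return GUESS_OTHER; /* missing continuation byte */

        saw_multibyte = 1;
        i += extra + 1;
    }
    return saw_multibyte ? GUESS_UTF8 : GUESS_ASCII;
}
```

With a check like this, a file holding only "Hello, world!" comes back as GUESS_ASCII whether or not the author meant it to be UTF-8, which is exactly the ambiguous case described above. Only a BOM or at least one multi-byte character gives the loader something to go on.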
Helpful dialog boxes can steal interest. Another reason to like Pelles-C.