Recent posts

#81
Add-ins / Re: Export C source as HTML or...
Last post by TimoVJL - January 18, 2026, 05:00:36 AM
Quote from: Robert on January 18, 2026, 12:17:30 AM
The RTF output from this code is different from the RTF output of John Z's IDE addin.

It might just have been created for a different purpose, like writing to a RichEdit control.
Also, it was only for ANSI source.

int printf(const char * restrict format, ...);

// https://stackoverflow.com/questions/5603559/one-file-lib-to-conv-utf8-char-to-wchar-t
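// Decodes one UTF-8 sequence (1 to 3 bytes) at *utf8 and advances *utf8 past it.
// Returns the code point, or 0 for a malformed sequence or a 4-byte lead byte.
unsigned short utf8_to_wchar(char **utf8)
{
    unsigned short c;
    int shiftCount;
    unsigned char v = (unsigned char)**utf8;    // unsigned, so bytes compare as 0x00..0xFF

    if (v == 0)
        return 0;    // end of string: nothing to consume
    (*utf8)++;    // consume the first byte
    if (v < 0x80)
        return v;    // plain ASCII
    if ((v & 0xE0) == 0xC0)    // lead byte of a 2-byte sequence
    {
        shiftCount = 1;
        c = v & 0x1F;
    }
    else if ((v & 0xF0) == 0xE0)    // lead byte of a 3-byte sequence
    {
        shiftCount = 2;
        c = v & 0xF;
    }
    else
        return 0;    // stray continuation byte or 4-byte lead: does not fit in 16 bits
    while (shiftCount)
    {
        v = (unsigned char)**utf8;
        if ((v & 0xC0) != 0x80)
            return 0;    // malformed: *utf8 stays on the offending byte
        (*utf8)++;
        c <<= 6;
        c |= (v & 0x3F);
        --shiftCount;
    }
    return c;
}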

int ShortToStrPos(int n, char *s)
{
    int i, sign, idx, nl, len;

    idx = 0;
/*    if ((sign = n) < 0) {    // record sign
        n = -n;    // make n positive
        idx++;
    }*/
    i = 0;
    nl = n;
    while ((nl /= 10) > 0)    /* count nums */
        idx++;
    len = idx+1;
    s[idx+1] = '\0';
    do {    /* generate digits in reverse order */
        s[idx--] = n % 10 + '0';    /* get next digit */
    } while ((n /= 10) > 0);    /* delete it */
//    if (sign < 0)
//        s[0] = '-';
    return len;
}

int __cdecl main(void)
{
    char utf8[] = u8"σκατ";
    char *p = utf8;
    while (*p) {
        if (*(unsigned char*)p > 127) {    // UTF8 ?
            unsigned short uc = utf8_to_wchar(&p);    // advances p
            printf("%Xh\t", (unsigned)uc);
        }
        else
            ++p;    // skip ASCII bytes, otherwise the loop never ends
    }
    printf("\n%p\n%p\n", utf8, p);
    return 0;
}

EDIT 2025-01-19: UNICODE version is in RE_Test3 and Esc closes the window, but there are still bugs.
#82
Add-ins / Re: Export C source as HTML or...
Last post by Robert - January 18, 2026, 12:17:30 AM
Quote from: TimoVJL on January 17, 2026, 01:21:16 PM
A small stupid C to RTF project to RichEdit.

It can help to debug some code.

Hi Timo:
The RTF output from this code is different from the RTF output of John Z's IDE addin.
I'm going to have a look at the output in an ImHex editor and see what's going on.

Do you know ImHex?
https://imhex.werwolv.net/
https://github.com/WerWolv/ImHex
#83
Add-ins / Re: Export C source as HTML or...
Last post by TimoVJL - January 17, 2026, 09:46:21 PM
It was created on the English version of Windows 7 SP1 x64.
As the source code is available, the crash point might be determined.
It was also collected from working code examples.

#84
Add-ins / Re: Export C source as HTML or...
Last post by TimoVJL - January 17, 2026, 01:21:16 PM
A small stupid C to RTF project to RichEdit.

It can help to debug some code.
#85
Add-ins / Re: Export C source as HTML or...
Last post by John Z - January 17, 2026, 10:47:20 AM
Hi Robert,

Enjoy  :)  -

Both HTML and PDF formats do support UTF-8, and PDF formats are amenable to UTF-16 and others.
HTML can support UTF-16 but it is 'strongly' discouraged in the HTML5 spec so UTF-8 is mainstream for HTML.

It seems the biggest challenge is UTF-8/16 in RTF. All non-ASCII characters must be encoded using the \uN? format, where N is the UTF-16 code unit written as a signed 16-bit decimal number.
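
To make the \uN? format concrete, here is a minimal sketch in C. The helper name rtf_escape_unit and its signature are hypothetical, not taken from the add-in or from Timo's project, and it assumes the input is already a UTF-16 code unit. It does not handle line breaks or other control characters.

```c
#include <stdio.h>

/* Hypothetical helper: write the RTF form of one UTF-16 code unit into buf
   and return the number of characters written (as snprintf does).
   - '\\', '{' and '}' are RTF control characters and get a backslash prefix.
   - Other ASCII is copied through unchanged.
   - Everything else becomes \uN?, where '?' is the fallback character
     shown by readers that do not understand \u. */
int rtf_escape_unit(unsigned short unit, char *buf, size_t size)
{
    if (unit == '\\' || unit == '{' || unit == '}')
        return snprintf(buf, size, "\\%c", (char)unit);
    if (unit < 0x80)
        return snprintf(buf, size, "%c", (char)unit);
    /* N is a signed 16-bit value, so units above 32767 are written negative. */
    int n = unit > 0x7FFF ? (int)unit - 0x10000 : (int)unit;
    return snprintf(buf, size, "\\u%d?", n);
}
```

For example, σ (U+03C3) comes out as \u963?. A character outside the BMP would need two calls, one per surrogate.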

So it can work, I just did the minimum to get it 'working' again.

While typing I see an update posted. I'll look at it more. It seems your focus is a console program and not a Windows program? If so, somewhere I saw a full-fledged console program for console Unicode display; maybe I can find it again ...

John Z
#86
Add-ins / Re: Export C source as HTML or...
Last post by Robert - January 17, 2026, 10:35:14 AM
Hi John Z:

Yeah, well, uh ... we've got a bit of work in front of us if we hope to deal properly with non ASCII code.

This code


#include <windows.h>
#include <stdio.h>

static int OrigCodePage;
static const char* σκατ;
static const char* δυσκατανοήτων;

int main(int argc, char* argv[])
{
    OrigCodePage = GetConsoleOutputCP();
    SetConsoleOutputCP(65001);
    σκατ = "σκατ doo, be, shoo, bop, ooh, dee, doo, sha-bam";
    δυσκατανοήτων = "δυσκατανοήτων difficult to understand";
    printf("%s%s%s\n", σκατ, " ", δυσκατανοήτων);
    _getch();
    SetConsoleOutputCP(OrigCodePage);
    return 1;
}


Exports from poide.exe IDE as HTML file:

#include <windows.h>
#include <stdio.h>

static int OrigCodePage;
static const char* УКБФ;
static const char* ДХУКБФБНПЎФЩН;

int main(int argc, char* argv[])
{
    OrigCodePage = GetConsoleOutputCP();
    SetConsoleOutputCP(65001);
    УКБФ = "УКБФ doo, be, shoo, bop, ooh, dee, doo, sha-bam";
    ДХУКБФБНПЎФЩН = "ДХУКБФБНПЎФЩН difficult to understand";
    printf("%s%s%s\n", УКБФ, " ", ДХУКБФБНПЎФЩН);
    _getch();
    SetConsoleOutputCP(OrigCodePage);
    return 1;
}


Pelles C project code in attached file Skat.zip
#87
Add-ins / Re: Export C source as HTML or...
Last post by Robert - January 17, 2026, 07:39:27 AM
Hi John Z:

Very complex, very interesting.

I have had a look at Pelle's and Timo's code, comparing it with yours, and have some ideas.

I will continue studying this and if I can get it to spit out anything intelligible, I'll let you know.

#88
Bug reports / Re: IDE Reload. Eroteme Replac...
Last post by Robert - January 17, 2026, 12:14:48 AM
Thanks John Z.

The file, created in EditPad, is initially a No-BOM UTF-8 file with only ASCII characters.
The file then is modified in poide.exe IDE adding UTF-8 glyphs beyond U+00FF.
The file is saved.
When opened in EditPad the file is reported as Windows 1252.
When re-opened in poide.exe, the UTF-8 glyphs beyond U+00FF have been replaced with erotemes.

If a No-BOM UTF-8 file with at least one glyph beyond U+00FF is initially loaded into poide.exe, then the file will be saved as UTF-8 No-BOM.

I will have to remember that.
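
A note on why the first load matters: without a BOM, a file containing only ASCII bytes gives a loader no evidence that it is UTF-8. The sketch below shows one way such a guess could be made. It is an assumption for illustration, not the IDE's actual logic, and the names text_kind and classify_text are hypothetical. It only checks lead and continuation bytes and does not reject overlong forms or surrogates.

```c
#include <stddef.h>

enum text_kind { TEXT_ASCII, TEXT_UTF8, TEXT_OTHER };

/* Hypothetical classifier: TEXT_ASCII if every byte is below 0x80,
   TEXT_UTF8 if at least one well-formed multi-byte sequence is present
   and nothing is malformed, TEXT_OTHER for anything else (for example
   a single-byte ANSI code page). */
enum text_kind classify_text(const unsigned char *s, size_t len)
{
    enum text_kind kind = TEXT_ASCII;
    size_t i = 0;

    while (i < len) {
        unsigned char b = s[i];
        size_t extra;    /* number of continuation bytes expected */

        if (b < 0x80) { i++; continue; }
        if ((b & 0xE0) == 0xC0)      extra = 1;
        else if ((b & 0xF0) == 0xE0) extra = 2;
        else if ((b & 0xF8) == 0xF0) extra = 3;
        else return TEXT_OTHER;      /* stray continuation or invalid lead */

        if (extra > len - i - 1)
            return TEXT_OTHER;       /* sequence runs past the end */
        for (size_t k = 1; k <= extra; k++)
            if ((s[i + k] & 0xC0) != 0x80)
                return TEXT_OTHER;   /* missing continuation byte */

        kind = TEXT_UTF8;
        i += extra + 1;
    }
    return kind;
}
```

With this kind of check, the ASCII-only file from EditPad classifies as TEXT_ASCII, while the same file with one σ in it (bytes CF 83) classifies as TEXT_UTF8.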

#89
Bug reports / Re: IDE Reload. Eroteme Replac...
Last post by John Z - January 16, 2026, 11:06:59 PM
Hi Robert,

This is not really a bug. It may be a minor inconvenience, but here is the situation as I understand it.

Pelles C was originally ASCII/ANSI for all source files.
Pelles C later switched to UTF-8 as the default for all source files.

It also supports UTF-16 for source files. When you create a new source file within the IDE it is automatically UTF-8. You will see that the source file tab also shows UTF-8 (or UTF-16). If it shows nothing but the name, the source file is at best ASCII/ANSI. When using 'OLD' source code, or creating the source file outside of Pelles C with a plain text editor, it will be ASCII/ANSI.

Now the critical part is that the editor always works in UTF-8 by default. This allows the editor to enter UTF-8 characters in the source code page, but since that page is not identified as UTF-8, it will fail to display as expected when reloaded.

So the Export64 program for example does not show UTF-8 in the tab so it is still ASCII/ANSI, even though the editor can make the 'display' show the character.

Using any editor that supports UTF-8, a source file can be created, or just resaved, with the encoding set to UTF-8.

I use TextPad for example to resave Export.c to Export_UTF8.c and if you add it to the Export64 program you will see the source tab shows the encoding.  If you run your test on this file it should 'pass' reloading -

Hope this was at least a little bit clear -

John Z

The other method is to create a blank source file in the IDE then paste in the old source code. When saved it will be UTF-8



#90
Beginner questions / Re: Small C Programs to Learn ...
Last post by Vortex - January 16, 2026, 10:01:48 PM
Hi jos,

In the manual supplied with Pelles C ( \PellesC\Bin\Help\help0009.chm ) search for this : Predefined preprocessor symbols (POCC)