Recent posts

#11
Add-ins / Re: Export C source as HTML or...
Last post by Robert - January 21, 2026, 03:56:32 AM
Quote from: TimoVJL on January 18, 2026, 05:00:36 AM
Quote from: Robert on January 18, 2026, 12:17:30 AM
The RTF output from this code is different from the RTF output of John Z's IDE addin.
It might just have been created for a different purpose, like writing to a RichEdit control.
Also, it was only for ANSI source.

int printf(const char * restrict format, ...);

// https://stackoverflow.com/questions/5603559/one-file-lib-to-conv-utf8-char-to-wchar-t
short utf8_to_wchar(char **utf8)
{
    short sz = 0;
    short c;
    char *p = *(char **)utf8;
    char v = (*p);
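    if (v >= 0)    // single-byte (ASCII) character, assuming char is signed
    {
        c = v;
        sz += c;
        ++p; (*utf8)++;
        return sz;    // without this, control falls through and returns 0
    }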
    int shiftCount = 0;
    if ((v & 0xE0) == 0xC0)
    {
        shiftCount = 1;
        c = v & 0x1F;
    }
    else if ((v & 0xF0) == 0xE0)
    {
        shiftCount = 2;
        c = v & 0xF;
    }
    else
        return 0;
    ++p; (*utf8)++;
    while (shiftCount)
    {
        v = *p;
        ++p; (*utf8)++;
        if ((v & 0xC0) != 0x80)
            return 0;
        c <<= 6;
        c |= (v & 0x3F);
        --shiftCount;
    }
    sz += c;
    return sz;
}

int ShortToStrPos(int n, char *s)
{
    int i, sign, idx, nl, len;

    idx = 0;
/*    if ((sign = n) < 0) {    // record sign
        n = -n;    // make n positive
        idx++;
    }*/
    i = 0;
    nl = n;
    while ((nl /= 10) > 0)    /* count nums */
        idx++;
    len = idx+1;
    s[idx+1] = '\0';
    do {    /* generate digits in reverse order */
        s[idx--] = n % 10 + '0';    /* get next digit */
    } while ((n /= 10) > 0);    /* delete it */
//    if (sign < 0)
//        s[0] = '-';
    return len;
}

int __cdecl main(void)
{
    char utf8[] = u8"σκατ";
    char *p = utf8;
    while (*p) {
        if (*(unsigned char*)p > 127) {    // UTF8 ?
            short uc = utf8_to_wchar(&p);
            printf("%Xh\t", uc);
        } else {
            ++p;    // skip ASCII bytes, otherwise the loop never advances
        }
    }
    printf("\n%p\n%p\n", utf8, p);
    return 0;
}

EDIT 2025-01-19: Unicode version in RE_Test3, and Esc closes the window, but there are still bugs

Hei TimoVJL:

The code snippet above is interesting. Thanks.

What bugs? I don't see any bugs in the RE_Test3 output.



#12
Add-ins / Re: Export C source as HTML or...
Last post by Vortex - January 18, 2026, 10:11:36 AM
Hi Timo,

My apologies, it was my mistake. Your application works fine, and I removed my previous message #41859.
#13
Add-ins / Re: Export C source as HTML or...
Last post by TimoVJL - January 18, 2026, 05:00:36 AM
Quote from: Robert on January 18, 2026, 12:17:30 AM
The RTF output from this code is different from the RTF output of John Z's IDE addin.
It might just have been created for a different purpose, like writing to a RichEdit control.
Also, it was only for ANSI source.

int printf(const char * restrict format, ...);

// https://stackoverflow.com/questions/5603559/one-file-lib-to-conv-utf8-char-to-wchar-t
short utf8_to_wchar(char **utf8)
{
    short sz = 0;
    short c;
    char *p = *(char **)utf8;
    char v = (*p);
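    if (v >= 0)    // single-byte (ASCII) character, assuming char is signed
    {
        c = v;
        sz += c;
        ++p; (*utf8)++;
        return sz;    // without this, control falls through and returns 0
    }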
    int shiftCount = 0;
    if ((v & 0xE0) == 0xC0)
    {
        shiftCount = 1;
        c = v & 0x1F;
    }
    else if ((v & 0xF0) == 0xE0)
    {
        shiftCount = 2;
        c = v & 0xF;
    }
    else
        return 0;
    ++p; (*utf8)++;
    while (shiftCount)
    {
        v = *p;
        ++p; (*utf8)++;
        if ((v & 0xC0) != 0x80)
            return 0;
        c <<= 6;
        c |= (v & 0x3F);
        --shiftCount;
    }
    sz += c;
    return sz;
}

int ShortToStrPos(int n, char *s)
{
    int i, sign, idx, nl, len;

    idx = 0;
/*    if ((sign = n) < 0) {    // record sign
        n = -n;    // make n positive
        idx++;
    }*/
    i = 0;
    nl = n;
    while ((nl /= 10) > 0)    /* count nums */
        idx++;
    len = idx+1;
    s[idx+1] = '\0';
    do {    /* generate digits in reverse order */
        s[idx--] = n % 10 + '0';    /* get next digit */
    } while ((n /= 10) > 0);    /* delete it */
//    if (sign < 0)
//        s[0] = '-';
    return len;
}

int __cdecl main(void)
{
    char utf8[] = u8"σκατ";
    char *p = utf8;
    while (*p) {
        if (*(unsigned char*)p > 127) {    // UTF8 ?
            short uc = utf8_to_wchar(&p);
            printf("%Xh\t", uc);
        } else {
            ++p;    // skip ASCII bytes, otherwise the loop never advances
        }
    }
    printf("\n%p\n%p\n", utf8, p);
    return 0;
}

EDIT 2025-01-19: Unicode version in RE_Test3, and Esc closes the window, but there are still bugs
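
For comparison with utf8_to_wchar in this post, here is a minimal sketch of a decoder that also covers ASCII and four-byte sequences and always moves past at least one byte, so a caller's loop cannot stall. The name utf8_decode and the choice of U+FFFD for malformed input are assumptions made for illustration and are not part of the posted project. It does not reject overlong forms or surrogates.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from the posted code): decodes one UTF-8
   sequence starting at *s, advances *s, and returns the code point.
   Call it only while **s != '\0'.
   Malformed input returns U+FFFD after consuming the lead byte and any
   valid continuation bytes seen so far.
   Overlong forms and surrogates are NOT rejected in this sketch. */
uint32_t utf8_decode(const char **s)
{
    const unsigned char *p = (const unsigned char *)*s;
    uint32_t cp;
    int extra;                        /* continuation bytes still expected */

    if (p[0] < 0x80)                { cp = p[0];        extra = 0; }
    else if ((p[0] & 0xE0) == 0xC0) { cp = p[0] & 0x1F; extra = 1; }
    else if ((p[0] & 0xF0) == 0xE0) { cp = p[0] & 0x0F; extra = 2; }
    else if ((p[0] & 0xF8) == 0xF0) { cp = p[0] & 0x07; extra = 3; }
    else {                            /* stray continuation or invalid lead */
        *s = (const char *)(p + 1);
        return 0xFFFD;
    }

    p++;
    while (extra-- > 0) {
        if ((*p & 0xC0) != 0x80) {    /* also stops at the terminating '\0' */
            *s = (const char *)p;
            return 0xFFFD;
        }
        cp = (cp << 6) | (uint32_t)(*p & 0x3F);
        p++;
    }
    *s = (const char *)p;
    return cp;
}
```

Returning a 32-bit unsigned value avoids the sign and range limits of short; code points above U+FFFF would still need splitting into a surrogate pair before being used as UTF-16.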
#14
Add-ins / Re: Export C source as HTML or...
Last post by Robert - January 18, 2026, 12:17:30 AM
Quote from: TimoVJL on January 17, 2026, 01:21:16 PM
A small stupid C to RTF project to RichEdit.

It can help to debug some code.

Hi Timo:
The RTF output from this code is different from the RTF output of John Z's IDE addin.
I'm going to have a look at the output in an ImHex editor and see what's going on.

Do you know ImHex?
https://imhex.werwolv.net/
https://github.com/WerWolv/ImHex
#15
Add-ins / Re: Export C source as HTML or...
Last post by TimoVJL - January 17, 2026, 09:46:21 PM
It was created on the English version of Windows 7 SP1 x64.
Since the source code is available, the crash point can probably be determined.
It was also put together from working code examples.

#16
Add-ins / Re: Export C source as HTML or...
Last post by TimoVJL - January 17, 2026, 01:21:16 PM
A small stupid C to RTF project to RichEdit.

It can help to debug some code.
#17
Add-ins / Re: Export C source as HTML or...
Last post by John Z - January 17, 2026, 10:47:20 AM
Hi Robert,

Enjoy  :)  -

Both HTML and PDF formats do support UTF-8, and PDF formats are amenable to UTF-16 and others.
HTML can support UTF-16 but it is 'strongly' discouraged in the HTML5 spec so UTF-8 is mainstream for HTML.

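It seems the biggest challenge is UTF-8/16 in RTF. All non-ASCII characters must be encoded using the \uN? format, where N is the UTF-16 code unit written as a signed 16-bit decimal number and ? is a fallback character for readers that do not understand \u.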
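
To make the RTF escape form concrete, here is a minimal sketch of turning UTF-8 text into RTF body text. It is not the code from the add-in or from any project in this thread. The function name rtf_escape_utf8 is invented for the example, the input is assumed to be well-formed UTF-8, and line breaks (which RTF expects as \par or \line) are left untouched.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper, not taken from any of the posted projects.
   Copies UTF-8 text to `out` as RTF body text: ASCII passes through
   (with \ { } escaped), everything else becomes \uN? escapes.
   Assumes well-formed UTF-8. Returns the output length, or -1 if
   `out` is too small. */
int rtf_escape_utf8(const char *in, char *out, size_t outsize)
{
    const unsigned char *p = (const unsigned char *)in;
    size_t used = 0;

    if (outsize == 0)
        return -1;
    out[0] = '\0';

    while (*p) {
        unsigned long cp;
        unsigned unit[2];
        int extra, count = 1, i, n;

        /* Decode one UTF-8 sequence into a code point. */
        if (*p < 0x80)                { cp = *p;        extra = 0; }
        else if ((*p & 0xE0) == 0xC0) { cp = *p & 0x1F; extra = 1; }
        else if ((*p & 0xF0) == 0xE0) { cp = *p & 0x0F; extra = 2; }
        else                          { cp = *p & 0x07; extra = 3; }
        p++;
        while (extra-- > 0 && *p)
            cp = (cp << 6) | (unsigned long)(*p++ & 0x3F);

        if (cp < 0x80) {
            int special = (cp == '\\' || cp == '{' || cp == '}');
            n = snprintf(out + used, outsize - used, "%s%c",
                         special ? "\\" : "", (int)cp);
            if (n < 0 || (size_t)n >= outsize - used)
                return -1;
            used += (size_t)n;
            continue;
        }

        /* Split into UTF-16 code units; above U+FFFF needs a surrogate pair. */
        if (cp > 0xFFFF) {
            cp -= 0x10000;
            unit[0] = 0xD800u + (unsigned)(cp >> 10);
            unit[1] = 0xDC00u + (unsigned)(cp & 0x3FF);
            count = 2;
        } else {
            unit[0] = (unsigned)cp;
        }

        for (i = 0; i < count; i++) {
            /* \u takes a signed 16-bit decimal, so 32768..65535 go negative. */
            int value = unit[i] > 32767 ? (int)unit[i] - 65536 : (int)unit[i];
            n = snprintf(out + used, outsize - used, "\\u%d?", value);
            if (n < 0 || (size_t)n >= outsize - used)
                return -1;
            used += (size_t)n;
        }
    }
    return (int)used;
}
```

With this sketch, the Greek letter σ (U+03C3, decimal 963) comes out as \u963? and a character above U+FFFF comes out as two \u escapes, one per surrogate.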

So it can work, I just did the minimum to get it 'working' again.

While typing, I saw that an update was posted. I'll look at it more; it seems your focus is a console program and not a Windows program? If so, I saw a full-fledged console program for console Unicode display somewhere; maybe I can find it again ...

John Z
#18
Add-ins / Re: Export C source as HTML or...
Last post by Robert - January 17, 2026, 10:35:14 AM
Hi John Z:

Yeah, well, uh ... we've got a bit of work in front of us if we hope to deal properly with non-ASCII code.

This code


#include <windows.h>
#include <stdio.h>

static int OrigCodePage;
static const char* σκατ;
static const char* δυσκατανοήτων;

int main(int argc, char* argv[])
{
    OrigCodePage = GetConsoleOutputCP();
    SetConsoleOutputCP(65001);
    σκατ = "σκατ doo, be, shoo, bop, ooh, dee, doo, sha-bam";
    δυσκατανοήτων = "δυσκατανοήτων difficult to understand";
    printf("%s%s%s\n", σκατ, " ", δυσκατανοήτων);
    _getch();
    SetConsoleOutputCP(OrigCodePage);
    return 1;
}


Exports from poide.exe IDE as HTML file:

#include <windows.h>
#include <stdio.h>

static int OrigCodePage;
static const char* УКБФ;
static const char* ДХУКБФБНПЎФЩН;

int main(int argc, char* argv[])
{
    OrigCodePage = GetConsoleOutputCP();
    SetConsoleOutputCP(65001);
    УКБФ = "УКБФ doo, be, shoo, bop, ooh, dee, doo, sha-bam";
    ДХУКБФБНПЎФЩН = "ДХУКБФБНПЎФЩН difficult to understand";
    printf("%s%s%s\n", УКБФ, " ", ДХУКБФБНПЎФЩН);
    _getch();
    SetConsoleOutputCP(OrigCodePage);
    return 1;
}


Pelles C project code in attached file Skat.zip
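
One way an exporter can avoid this kind of substitution is to copy the source bytes unchanged, escape only the HTML metacharacters, and declare the encoding in the page. The following is a minimal sketch under the assumption that the source buffer is already UTF-8. The names HTML_HEAD, HTML_TAIL and html_escape_utf8 are invented for the example, and it says nothing about how poide.exe or the add-in actually work.

```c
#include <stddef.h>
#include <string.h>

/* Page header declaring the encoding, so a browser does not have to
   guess a legacy code page. Written before the escaped source text. */
const char HTML_HEAD[] =
    "<!DOCTYPE html>\n<html><head><meta charset=\"utf-8\"></head><body><pre>\n";
const char HTML_TAIL[] = "</pre></body></html>\n";

/* Hypothetical helper: copies UTF-8 source text into `out`, replacing
   only &, < and > with entities. Bytes >= 0x80 are copied unchanged,
   so multi-byte characters survive as they are.
   Returns the output length, or -1 if `out` is too small. */
int html_escape_utf8(const char *src, char *out, size_t outsize)
{
    size_t used = 0;

    for (; *src; src++) {
        const char *rep;
        size_t len;

        switch (*src) {
        case '&': rep = "&amp;"; len = 5; break;
        case '<': rep = "&lt;";  len = 4; break;
        case '>': rep = "&gt;";  len = 4; break;
        default:  rep = src;     len = 1; break;   /* copy the byte as-is */
        }
        if (used + len >= outsize)      /* keep room for the terminator */
            return -1;
        memcpy(out + used, rep, len);
        used += len;
    }
    if (used >= outsize)
        return -1;
    out[used] = '\0';
    return (int)used;
}
```

An export would then write HTML_HEAD, the escaped text, and HTML_TAIL to a file opened in binary mode, so that no code page conversion happens on the way out.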
#19
Add-ins / Re: Export C source as HTML or...
Last post by Robert - January 17, 2026, 07:39:27 AM
Hi John Z:

Very complex, very interesting.

I have had a look at Pelles and Timo's code comparing it with yours and have some ideas.

I will continue studying this and if I can get it to spit out anything intelligible, I'll let you know.

#20
Bug reports / Re: IDE Reload. Eroteme Replac...
Last post by Robert - January 17, 2026, 12:14:48 AM
Thanks John Z.

The file, created in EditPad, is initially a No-BOM UTF-8 file with only ASCII characters.
The file then is modified in poide.exe IDE adding UTF-8 glyphs beyond U+00FF.
The file is saved.
When opened in EditPad the file is reported as Windows 1252.
When re-opened in poide.exe, the UTF-8 glyphs beyond U+00FF have been replaced with erotemes.

If a No-BOM UTF-8 file with at least one glyph beyond U+00FF is initially loaded into poide.exe, then the file will be saved as UTF-8 No-BOM.

I will have to remember that.
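
The behaviour described above is consistent with an editor that decides the encoding once, when the file is loaded: a file with no BOM and only ASCII bytes gives it nothing to tell UTF-8 apart from an ANSI code page. As a rough illustration only (this is a guess at the general technique, not a description of what poide.exe or EditPad actually do), a loader might check for a BOM and otherwise test whether the bytes form valid multi-byte UTF-8. The enum and function names are invented for the sketch.

```c
#include <stddef.h>

typedef enum { ENC_ASCII, ENC_UTF8_BOM, ENC_UTF8, ENC_ANSI } text_encoding;

/* Hypothetical classifier:
   - starts with EF BB BF                 -> ENC_UTF8_BOM
   - only bytes below 0x80                -> ENC_ASCII (could be ANSI or UTF-8)
   - at least one valid multi-byte UTF-8
     sequence and no invalid bytes        -> ENC_UTF8
   - anything else                        -> ENC_ANSI */
text_encoding guess_encoding(const unsigned char *buf, size_t len)
{
    size_t i = 0;
    int saw_multibyte = 0;

    if (len >= 3 && buf[0] == 0xEF && buf[1] == 0xBB && buf[2] == 0xBF)
        return ENC_UTF8_BOM;

    while (i < len) {
        size_t extra, k;

        if (buf[i] < 0x80) { i++; continue; }        /* plain ASCII byte */
        else if ((buf[i] & 0xE0) == 0xC0) extra = 1;
        else if ((buf[i] & 0xF0) == 0xE0) extra = 2;
        else if ((buf[i] & 0xF8) == 0xF0) extra = 3;
        else return ENC_ANSI;                        /* invalid lead byte */

        if (extra > len - i - 1)
            return ENC_ANSI;                         /* sequence is truncated */
        for (k = 1; k <= extra; k++)
            if ((buf[i + k] & 0xC0) != 0x80)
                return ENC_ANSI;                     /* bad continuation byte */

        i += extra + 1;
        saw_multibyte = 1;
    }
    return saw_multibyte ? ENC_UTF8 : ENC_ASCII;
}
```

Under this scheme an ASCII-only file falls into the ambiguous ENC_ASCII case, and whatever default the editor picks for that case decides how later non-ASCII edits are saved.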