Recent posts

#31
Add-ins / Re: Export C source as HTML or...
Last post by TimoVJL - January 22, 2026, 11:03:02 PM
Those are connected.
{\f1\fnil\fcharset161{\*\fname Courier New;}Courier New Greek;}
\f1\'f3\'ea\'e1\'f4

With Unicode (UTF-16LE) there is a bit less conversion, but you still have to find the right font charset for the characters.

https://www.oreilly.com/library/view/rtf-pocket-guide/9781449302047/ch04.html

I have little interest in that.
#32
Bug reports / Re: glu.h -- Silence warnings
Last post by Robert - January 22, 2026, 08:48:48 PM
Quote from: MrBcx on January 22, 2026, 04:04:36 PM
A minor update that silences strict warnings - 3 instances replacing () with (void).

Silencing nothing with a void.
An interesting concept to contemplate.
It fits well into the Zen class of "koan".

Thanks for this, warnings make me nervous.
#33
Add-ins / Re: Export C source as HTML or...
Last post by Robert - January 22, 2026, 08:38:24 PM
Quote from: TimoVJL on January 22, 2026, 12:42:11 PM
Better to show important things too:
{\rtf1\ansi\deff0{\fonttbl{\f0\fnil\fcharset0 Courier New;}{\f1\fnil\fcharset161{\*\fname Courier New;}Courier New Greek;}}
{\*\generator Msftedit 5.41.21.2510;}\viewkind4\uc1\pard\lang1035\f0\fs22 #include <windows.h>\par
#include <stdio.h>\par
\par
static int OrigCodePage;\par
static const char* \f1\'f3\'ea\'e1\'f4;\par
static const char* \'e4\'f5\'f3\'ea\'e1\'f4\'e1\'ed\'ef\'de\'f4\'f9\'ed;\par
Streamed parsing doesn't work, because the RTF header has to be separated out while processing.


The streamed parsing is a problem because the AddIn_GetSourceText function extracts UTF-16LE with embedded nulls. The code extracted by AddIn_GetSourceText should be converted to UTF-8, removing the embedded nulls, so that it can be processed with standard, non-wide, C functions.
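A minimal sketch of that conversion, in portable C. The function name utf16_to_utf8 and its buffer convention are made up for illustration, and it assumes the UTF-16LE text is already available as 16-bit code units on a little-endian host (true on Windows). On Windows, WideCharToMultiByte with CP_UTF8 should do the same job.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: convert n UTF-16 code units to UTF-8.
   Writes at most cap-1 bytes plus a terminating NUL into out and returns
   the number of bytes written (excluding the NUL).
   Unpaired surrogates are replaced with U+FFFD. */
size_t utf16_to_utf8(const uint16_t *in, size_t n, char *out, size_t cap)
{
    size_t o = 0;
    if (cap == 0)
        return 0;
    for (size_t i = 0; i < n; i++) {
        uint32_t cp = in[i];
        if (cp >= 0xD800 && cp <= 0xDBFF && i + 1 < n &&
            in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
            /* valid surrogate pair: combine into one code point */
            cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
            i++;
        } else if (cp >= 0xD800 && cp <= 0xDFFF) {
            cp = 0xFFFD;    /* lone surrogate */
        }
        size_t need = cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
        if (o + need >= cap)
            break;          /* keep room for the NUL */
        if (need == 1) {
            out[o++] = (char)cp;
        } else if (need == 2) {
            out[o++] = (char)(0xC0 | (cp >> 6));
            out[o++] = (char)(0x80 | (cp & 0x3F));
        } else if (need == 3) {
            out[o++] = (char)(0xE0 | (cp >> 12));
            out[o++] = (char)(0x80 | ((cp >> 6) & 0x3F));
            out[o++] = (char)(0x80 | (cp & 0x3F));
        } else {
            out[o++] = (char)(0xF0 | (cp >> 18));
            out[o++] = (char)(0x80 | ((cp >> 12) & 0x3F));
            out[o++] = (char)(0x80 | ((cp >> 6) & 0x3F));
            out[o++] = (char)(0x80 | (cp & 0x3F));
        }
    }
    out[o] = '\0';
    return o;
}
```

The result has no embedded nulls, so strlen, strstr and the other non-wide functions can be used on it.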

The RTF encoding of UTF-8 is beyond my understanding. For example, I don't see how to get from the eight-byte UTF-8 string

σκατ;

into the expected RTF representation

\'f3\'ea\'e1\'f4;
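As far as I can tell, the \'xx values are not UTF-8 at all. They are the single-byte Windows-1253 (Greek) codes of the characters, which is what \fcharset161 on font \f1 selects. For the lowercase letters α to ω the Windows-1253 byte is the Unicode code point minus 0x2D0, so σ (U+03C3) becomes F3. Here is a rough sketch under that assumption. The function name rtf_escape is hypothetical, only lowercase Greek gets the \'xx form, and everything else outside ASCII falls back to the RTF \uN? escape (which relies on \uc1 being in effect, as in the header shown above).

```c
#include <stdio.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: write the RTF form of n UTF-16 code units to out.
   Returns the number of characters written, or 0 if out is too small
   (or the input is empty). The caller must already have switched to a
   font with \fcharset161 for the \'xx bytes to show up as Greek. */
size_t rtf_escape(const uint16_t *in, size_t n, char *out, size_t cap)
{
    size_t o = 0;
    if (cap == 0)
        return 0;
    out[0] = '\0';
    for (size_t i = 0; i < n; i++) {
        unsigned c = in[i];
        char *dst = out + o;
        size_t room = cap - o;
        int w;
        if (c == '\\' || c == '{' || c == '}')
            w = snprintf(dst, room, "\\%c", (int)c);    /* RTF special characters */
        else if (c >= 0x20 && c < 0x7F)
            w = snprintf(dst, room, "%c", (int)c);      /* plain ASCII passes through */
        else if (c == '\n')
            w = snprintf(dst, room, "\\par\n");         /* end of line */
        else if (c >= 0x03B1 && c <= 0x03C9)
            /* assumption: lowercase Greek sits at a fixed offset in Windows-1253 */
            w = snprintf(dst, room, "\\'%02x", c - 0x02D0);
        else
            /* generic fallback: signed 16-bit decimal plus '?' as replacement */
            w = snprintf(dst, room, "\\u%d?", c > 32767 ? (int)c - 65536 : (int)c);
        if (w < 0 || (size_t)w >= room)
            return 0;
        o += (size_t)w;
    }
    return o;
}
```

Feeding it the code units of σκατ; gives \'f3\'ea\'e1\'f4; which matches the expected representation. A real converter would use a full code page table (or WideCharToMultiByte with code page 1253 on Windows) instead of the fixed offset.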

#34
Bug reports / glu.h -- Silence warnings
Last post by MrBcx - January 22, 2026, 04:04:36 PM
A minor update that silences strict warnings - 3 instances replacing () with (void).
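For anyone wondering why the change matters: before C23, empty parentheses in a declaration mean "parameters unspecified", not "no parameters", so strict warning levels flag them. A small illustration with made-up function names (these are not the actual glu.h declarations):

```c
/* Before C23 this declares a function WITHOUT a prototype:
   the compiler does not check arguments at call sites. */
int old_style();

/* (void) is a real prototype meaning "takes no arguments". */
int new_style(void);

int old_style() { return 1; }
int new_style(void) { return 2; }
```

With the (void) form, a call such as new_style(42) is rejected at compile time, while the old form may let it through silently.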


#35
Add-ins / Re: Export C source as HTML or...
Last post by TimoVJL - January 22, 2026, 12:42:11 PM
Better to show important things too:
{\rtf1\ansi\deff0{\fonttbl{\f0\fnil\fcharset0 Courier New;}{\f1\fnil\fcharset161{\*\fname Courier New;}Courier New Greek;}}
{\*\generator Msftedit 5.41.21.2510;}\viewkind4\uc1\pard\lang1035\f0\fs22 #include <windows.h>\par
#include <stdio.h>\par
\par
static int OrigCodePage;\par
static const char* \f1\'f3\'ea\'e1\'f4;\par
static const char* \'e4\'f5\'f3\'ea\'e1\'f4\'e1\'ed\'ef\'de\'f4\'f9\'ed;\par
Streamed parsing doesn't work, because the RTF header has to be separated out while processing.
#36
Add-ins / Re: Export C source as HTML or...
Last post by Robert - January 21, 2026, 10:48:55 AM
Quote from: TimoVJL on January 21, 2026, 07:14:06 AM
Quote
RTF SYNTAX

An RTF file consists of unformatted text, control words, control symbols, and groups. For ease of transport, a standard RTF file can consist of only 7-bit ASCII characters. (Converters that communicate with Microsoft Word for Windows or Microsoft Word for the Macintosh should expect 8-bit characters.) There is no set maximum line length for an RTF file.

RTF uses the ASCII characters 32 - 127 and some Latin-1 (ISO/IEC 8859-1) characters without encoding.

So I was just lazy about checking characters, like many others.
UTF-8 with a BOM can be given conditional processing.


Hi TimoVJL and John Z:

RTF SYNTAX.
Oh, that!
Yeah, well, I think I'm beginning to remember why I'm here.

Export C source etc.

Anyway, you solved what I had considered the hard part, that is, dealing with the UTF-16LE text which is what the export addin function AddIn_GetSourceTextW has to process.

However, as John Z mentioned and you yelled "RTF SYNTAX", I had to look and see what was expected from

static const char* σκατ;

and saw that its RTF encoding was

\par }{\rtlch\fcs1 \af67 \ltrch\fcs0 \f67\insrsid15157589\charrsid15157589 static const char* \'f3\'ea\'e1\'f4;

Hmmmm  :-\  :o

#37
Add-ins / Re: Export C source as HTML or...
Last post by TimoVJL - January 21, 2026, 07:14:06 AM
Quote
RTF SYNTAX

An RTF file consists of unformatted text, control words, control symbols, and groups. For ease of transport, a standard RTF file can consist of only 7-bit ASCII characters. (Converters that communicate with Microsoft Word for Windows or Microsoft Word for the Macintosh should expect 8-bit characters.) There is no set maximum line length for an RTF file.

RTF uses the ASCII characters 32 - 127 and some Latin-1 (ISO/IEC 8859-1) characters without encoding.

So I was just lazy about checking characters, like many others.
UTF-8 with a BOM can be given conditional processing.
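A possible shape for that conditional processing, as a sketch only. The helper name utf8_bom_length is hypothetical. It checks for the three-byte UTF-8 byte order mark EF BB BF at the start of the buffer.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns the number of leading bytes to skip,
   3 if buf starts with the UTF-8 byte order mark EF BB BF, otherwise 0. */
size_t utf8_bom_length(const unsigned char *buf, size_t len)
{
    static const unsigned char bom[3] = {0xEF, 0xBB, 0xBF};
    return (len >= 3 && memcmp(buf, bom, 3) == 0) ? 3 : 0;
}
```

If the result is 3, the rest of the buffer can go through a UTF-8 decoding path. If it is 0, the source could be treated as ANSI, or checked byte by byte for values above 127.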
#38
Add-ins / Re: Export C source as HTML or...
Last post by Robert - January 21, 2026, 04:37:53 AM
Quote from: TimoVJL on January 18, 2026, 05:00:36 AM
....

        if (*(unsigned char*)p > 127) {    // UTF8 ?
....



Hi TimoVJL:

"Nearly all invalid UTF-8 cases can be detected by looking at the first two bytes of a character (in fact, the first 12 bits)."

Quoted from:
'Validating UTF-8 In Less Than One Instruction Per Byte'
available at
https://arxiv.org/pdf/2010.03090.pdf

See also:
Ridiculously fast unicode (UTF-8) validation
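To make the "first two bytes" remark concrete, here is a plain scalar check. This is not the SIMD algorithm from the paper, just a sketch based on the well-formed byte ranges in the Unicode standard, and the name utf8_is_valid is my own. The lead byte fixes the sequence length and narrows the allowed range of the second byte, which is enough to catch overlong forms, surrogates and values above U+10FFFF. The remaining bytes only need the generic continuation test.

```c
#include <stddef.h>

/* Hypothetical helper: returns 1 if the n bytes at s are well-formed UTF-8, else 0. */
int utf8_is_valid(const unsigned char *s, size_t n)
{
    size_t i = 0;
    while (i < n) {
        unsigned char b0 = s[i];
        unsigned char lo = 0x80, hi = 0xBF;    /* allowed range of the second byte */
        size_t len;

        if (b0 < 0x80) {                       /* ASCII */
            i++;
            continue;
        } else if (b0 >= 0xC2 && b0 <= 0xDF) {
            len = 2;
        } else if (b0 >= 0xE0 && b0 <= 0xEF) {
            len = 3;
            if (b0 == 0xE0) lo = 0xA0;         /* rejects overlong 3-byte forms */
            if (b0 == 0xED) hi = 0x9F;         /* rejects surrogates D800..DFFF */
        } else if (b0 >= 0xF0 && b0 <= 0xF4) {
            len = 4;
            if (b0 == 0xF0) lo = 0x90;         /* rejects overlong 4-byte forms */
            if (b0 == 0xF4) hi = 0x8F;         /* rejects values above U+10FFFF */
        } else {
            return 0;                          /* 80..C1 and F5..FF never start a sequence */
        }

        if (n - i < len)
            return 0;                          /* truncated sequence */
        if (s[i + 1] < lo || s[i + 1] > hi)
            return 0;
        for (size_t k = 2; k < len; k++)
            if ((s[i + k] & 0xC0) != 0x80)
                return 0;
        i += len;
    }
    return 1;
}
```

So a test like *(unsigned char*)p > 127 tells you a byte is not ASCII, but not that what follows is valid UTF-8.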

Thanks again for the code.

Nothing is impossible.


#39
Add-ins / Re: Export C source as HTML or...
Last post by Robert - January 21, 2026, 03:56:32 AM
Quote from: TimoVJL on January 18, 2026, 05:00:36 AM
Quote from: Robert on January 18, 2026, 12:17:30 AM
The RTF output from this code is different from the RTF output of John Z's IDE addin.
It might just have been created for a different purpose, like writing to a RichEdit control.
Also, it was only for ANSI source.

int printf(const char * restrict format, ...);

// https://stackoverflow.com/questions/5603559/one-file-lib-to-conv-utf8-char-to-wchar-t
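// Decode one UTF-8 sequence (1 to 3 bytes, BMP only) and advance *utf8 past it.
// Returns 0 for an invalid or unsupported (4-byte) sequence.
unsigned short utf8_to_wchar(char **utf8)
{
    unsigned char *p = (unsigned char *)*utf8;
    unsigned char v = *p++;
    unsigned short c;
    int shiftCount;

    if (v < 0x80)    // ASCII: single byte, nothing more to decode
    {
        *utf8 = (char *)p;
        return v;
    }
    if ((v & 0xE0) == 0xC0)    // lead byte of a 2-byte sequence
    {
        shiftCount = 1;
        c = v & 0x1F;
    }
    else if ((v & 0xF0) == 0xE0)    // lead byte of a 3-byte sequence
    {
        shiftCount = 2;
        c = v & 0x0F;
    }
    else    // stray continuation byte or 4-byte lead: skip it
    {
        *utf8 = (char *)p;
        return 0;
    }
    while (shiftCount--)
    {
        if ((*p & 0xC0) != 0x80)    // not a continuation byte: stop here,
        {                           // so a terminating NUL is never skipped
            *utf8 = (char *)p;
            return 0;
        }
        c = (unsigned short)((c << 6) | (*p++ & 0x3F));
    }
    *utf8 = (char *)p;
    return c;
}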

int ShortToStrPos(int n, char *s)
{
    int i, sign, idx, nl, len;

    idx = 0;
/*    if ((sign = n) < 0) {    // record sign
        n = -n;    // make n positive
        idx++;
    }*/
    i = 0;
    nl = n;
    while ((nl /= 10) > 0)    /* count nums */
        idx++;
    len = idx+1;
    s[idx+1] = '\0';
    do {    /* generate digits in reverse order */
        s[idx--] = n % 10 + '0';    /* get next digit */
    } while ((n /= 10) > 0);    /* delete it */
//    if (sign < 0)
//        s[0] = '-';
    return len;
}

int __cdecl main(void)
{
    char utf8[] = u8"σκατ";
    char *p = utf8;
    while (*p) {
        if (*(unsigned char*)p > 127) {    // UTF8 ?
            unsigned short uc = utf8_to_wchar(&p);
            printf("%Xh\t", uc);
        } else
            p++;    // plain ASCII: step over it so the loop cannot stall
    }
    printf("\n%p\n%p\n", (void*)utf8, (void*)p);
    return 0;
}

EDIT 2025-01-19: UNICODE version in RE_Test3, and Esc closes the window, but there are still bugs

Hi TimoVJL:

The code snippet above is interesting. Thanks.

What bugs? I don't see any bugs in the RE_Test3 output.



#40
Add-ins / Re: Export C source as HTML or...
Last post by Vortex - January 18, 2026, 10:11:36 AM
Hi Timo,

My apologies, it was my mistake. Your application works fine and I removed my previous message #41859