Recent posts

#21
Assembly discussions / Re: Syslink demo
Last post by Vortex - January 24, 2026, 08:25:02 PM
Here is the 64-bit version.
#22
Add-ins / Re: Export C source as HTML or...
Last post by John Z - January 24, 2026, 10:59:02 AM
Hi Robert,

Quote from: Robert on January 23, 2026, 06:03:43 PM
My interest in your resurrection of the Pelles C Export addin is in the "Export to HTML" facility. I think it can handle Unicode identifiers and quotation mark embedded Unicode strings.

Yes - Export to HTML can easily handle UTF-8. I only did a quick update so far, but it should not take much to fix it.

Quote from: Robert on January 23, 2026, 06:03:43 PM
If you are interested in developing a Unicode capable "Export to HTML" facility, you might find some help studying the BCX translated C codes of the example on the webpage

No need - I have already written, as part of another Add-In program (LineCounter+  https://forum.pellesc.de/index.php?topic=10092.0 ), a module that outputs a file as UTF-8 HTML.  Since Pelles C now defaults to UTF-8 rather than a code page, it should be even easier.

So - I'll take a look at fixing that part of the Export Add-In.  Attached is an HTML output example from the LineCounter Add-in.

John Z

Like you said previously, optimistically,  "Nothing is impossible."  :)
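
A minimal sketch of why UTF-8 makes the HTML case easy, assuming the source text is already UTF-8: only &, < and > need escaping, and every byte of a multi-byte UTF-8 sequence can be copied through unchanged, provided the generated page declares <meta charset="utf-8">. The function name html_escape_utf8 is hypothetical and is not taken from the Export or LineCounter add-ins.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies UTF-8 text from src to dst, escaping the
   characters that are special in HTML. Bytes >= 0x80 belong to multi-byte
   UTF-8 sequences and are copied unchanged.
   Returns the number of bytes written (excluding the NUL terminator),
   or (size_t)-1 if dst is too small. */
size_t html_escape_utf8(const char *src, char *dst, size_t dstsize)
{
    size_t n = 0;

    assert(src != NULL && dst != NULL);
    for (; *src; ++src) {
        char one[2] = { *src, '\0' };
        const char *rep;

        switch (*src) {
        case '&': rep = "&amp;"; break;
        case '<': rep = "&lt;";  break;
        case '>': rep = "&gt;";  break;
        default:  rep = one;     break;
        }

        size_t len = strlen(rep);
        if (n + len + 1 > dstsize)      /* keep room for the terminator */
            return (size_t)-1;
        memcpy(dst + n, rep, len);
        n += len;
    }
    if (n + 1 > dstsize)                /* only reachable when dstsize is 0 */
        return (size_t)-1;
    dst[n] = '\0';
    return n;
}
```

Called on a line such as #include <stdio.h>, this would yield #include &lt;stdio.h&gt;, while identifiers or string literals containing Greek or other non-ASCII text pass through byte for byte.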
#23
Add-ins / Re: Export C source as HTML or...
Last post by Robert - January 23, 2026, 10:54:58 PM
Quote from: TimoVJL on January 23, 2026, 05:43:50 PM
How does that help RTF coding?

EDIT:
Quote from: Robert on January 23, 2026, 06:03:43 PM
Unfortunately, the RTFDEFS.H document referenced is not obviously available.
How to Obtain the WinWord Converter SDK (GC1039)

Thanks TimoVJL, the rtfdefs.h file is in the download, and the charset defines are:


// \fcharset, \cchs argument values
// some of these values may also be #defined in windows.h; here's the
// complete list
#define ANSI_CHARSET                  0
#define DEFAULT_CHARSET               1
#define SYMBOL_CHARSET                2
#define INVALID_CHARSET               3 // nil value
#define MAC_CHARSET                  77
#define SHIFTJIS_CHARSET            128 // CP 932: Japanese
#define HANGEUL_CHARSET             129 // CP 949: Korean
#define JOHAB_CHARSET               130
#define GB2312_CHARSET              134 // CP 936: PRC
#define CHINESEBIG5_CHARSET         136 // CP 950: Taiwan
#define GREEK_CHARSET               161
#define TURKISH_CHARSET             162
#define HEBREW_CHARSET              177
#define ARABIC_CHARSET              178
#define ARABICTRADITIONAL_CHARSET   179
#define ARABICUSER_CHARSET          180
#define HEBREWUSER_CHARSET          181
#define BALTIC_CHARSET              186
#define RUSSIAN_CHARSET             204
#define THAI_CHARSET                222
#define EASTEUROPE_CHARSET          238
#define PC437_CHARSET               254
#define OEM_CHARSET                 255

Correlating Unicode characters with the above RTF charset data may be possible by using the International Components for Unicode (ICU) library functions to process the locale data contained in the Unicode Common Locale Data Repository (CLDR).

https://github.com/unicode-org/icu
https://github.com/unicode-org/cldr

There are several RTF-charset to UTF-8 converters but what is needed here, for the Export to RTF addin, is a Unicode to RTF-charset converter.

Obviously, from the above list of charsets, the conversions from Unicode would be limited. For example, C coders working with the Native American Osage language script or the International Phonetic Alphabet (Hello, anyone out there ?) would be excluded.
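
One possible way around the charset limitation, as far as the RTF 1.5 specification is understood here, is the \uN control word: it carries a UTF-16 code unit as a signed 16-bit decimal number, followed by a fallback character for readers that do not understand it (one character when \uc1 is in effect). The sketch below assumes UTF-8 input and writes characters outside the Basic Multilingual Plane, such as Osage, as a surrogate pair of two \uN escapes. The function names are hypothetical, the decoder does no overlong-sequence checking, and line breaks (\par) are not handled.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Decodes one UTF-8 sequence at s into *cp.
   Returns the number of bytes consumed, or 0 if the sequence is malformed. */
static int utf8_decode(const unsigned char *s, unsigned long *cp)
{
    if (s[0] < 0x80) { *cp = s[0]; return 1; }

    int n = (s[0] & 0xE0) == 0xC0 ? 2
          : (s[0] & 0xF0) == 0xE0 ? 3
          : (s[0] & 0xF8) == 0xF0 ? 4 : 0;
    if (n == 0)
        return 0;

    unsigned long c = s[0] & (0xFFu >> (n + 1));
    for (int i = 1; i < n; ++i) {
        if ((s[i] & 0xC0) != 0x80)      /* also stops at a premature NUL */
            return 0;
        c = (c << 6) | (s[i] & 0x3Fu);
    }
    *cp = c;
    return n;
}

/* Hypothetical converter: UTF-8 text to RTF text using \uN? escapes.
   Returns 0 on success, -1 on malformed input or if dst is too small. */
int utf8_to_rtf_unicode(const char *src, char *dst, size_t dstsize)
{
    const unsigned char *s = (const unsigned char *)src;
    size_t used = 0;

    assert(src != NULL && dst != NULL);
    if (dstsize == 0)
        return -1;
    dst[0] = '\0';

    while (*s) {
        unsigned long cp;
        char tmp[32];
        int len = utf8_decode(s, &cp);

        if (len == 0)
            return -1;
        s += len;

        if (cp < 0x80) {
            if (cp == '\\' || cp == '{' || cp == '}')
                snprintf(tmp, sizeof tmp, "\\%c", (int)cp);   /* RTF specials */
            else
                snprintf(tmp, sizeof tmp, "%c", (int)cp);
        } else if (cp < 0x10000) {
            /* values above 32767 are written as negative numbers */
            snprintf(tmp, sizeof tmp, "\\u%d?",
                     cp > 32767 ? (int)cp - 65536 : (int)cp);
        } else {
            unsigned long v = cp - 0x10000;
            int hi = (int)(0xD800 + (v >> 10)) - 65536;
            int lo = (int)(0xDC00 + (v & 0x3FF)) - 65536;
            snprintf(tmp, sizeof tmp, "\\u%d?\\u%d?", hi, lo);
        }

        size_t tl = strlen(tmp);
        if (used + tl + 1 > dstsize)
            return -1;
        memcpy(dst + used, tmp, tl + 1);
        used += tl;
    }
    return 0;
}
```

With this approach the font charset no longer limits what can be written, but whether a given script actually displays still depends on the RTF reader and on an installed font that covers it.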

#24
Add-ins / Re: Export C source as HTML or...
Last post by Robert - January 23, 2026, 06:03:43 PM
Quote from: John Z on January 23, 2026, 12:59:59 PM
Assuming the code comments are in the user's default code page language, then
#include <windows.h>
#include <stdio.h>

int main(void) {
    UINT user_codepage = GetACP(); // Retrieve the system default Windows ANSI code page

    printf("The user's default Windows ANSI code page is: %u\n", user_codepage);

    // Optional: Keep the console window open to view the output
    printf("Press Enter to exit...");
    getchar();

    return 0;
}

or a variation thereof, can get the correct code page to use for encoding the output file(s).
Code snippet provided by Google 'AI' overview  :( however, only the GetACP() line is relevant :)

John Z

Hi John Z:

My inaccurate "Code Pages" statement should have read

"Ah yes, charsets and charset fonts."

There is some information in

https://www.biblioscape.com/rtf15_spec.htm

where it is written

Quote: \fcharsetN   Specifies the character set of a font in the font table. Values for N are defined by Windows header files, and in the file RTFDEFS.H accompanying this document.

Unfortunately, the RTFDEFS.H document referenced is not obviously available.

There is a webpage at

https://www.n2pdf.de/fileadmin/user_upload/n2pdf/files/en/help/client_enu/unicode.htm

that has a table of codepage - charset equivalents.
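
A minimal sketch of such a code page to charset lookup, so that the value returned by GetACP() in the snippet quoted above could be turned into an \fcharsetN argument. The pairings below are the usual Windows ones as best recalled and should be checked against the table on that page; the names cp_table and charset_from_codepage are hypothetical.

```c
#include <stddef.h>

/* One row per Windows ANSI code page and its RTF \fcharset value. */
struct cp_charset {
    unsigned codepage;
    int      charset;
};

static const struct cp_charset cp_table[] = {
    {  874, 222 },  /* THAI_CHARSET */
    {  932, 128 },  /* SHIFTJIS_CHARSET */
    {  936, 134 },  /* GB2312_CHARSET */
    {  949, 129 },  /* HANGEUL_CHARSET */
    {  950, 136 },  /* CHINESEBIG5_CHARSET */
    { 1250, 238 },  /* EASTEUROPE_CHARSET */
    { 1251, 204 },  /* RUSSIAN_CHARSET */
    { 1252,   0 },  /* ANSI_CHARSET */
    { 1253, 161 },  /* GREEK_CHARSET */
    { 1254, 162 },  /* TURKISH_CHARSET */
    { 1255, 177 },  /* HEBREW_CHARSET */
    { 1256, 178 },  /* ARABIC_CHARSET */
    { 1257, 186 },  /* BALTIC_CHARSET */
};

/* Returns the \fcharset value for a code page,
   or 1 (DEFAULT_CHARSET) when the code page is not in the table. */
int charset_from_codepage(unsigned codepage)
{
    for (size_t i = 0; i < sizeof cp_table / sizeof cp_table[0]; ++i)
        if (cp_table[i].codepage == codepage)
            return cp_table[i].charset;
    return 1;
}
```

Note that a UTF-8 code page (65001) has no charset equivalent, so it falls through to DEFAULT_CHARSET here.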

My interest in your resurrection of the Pelles C Export addin is in the "Export to HTML" facility. I think it can handle Unicode identifiers and quotation mark embedded Unicode strings. RTF ?? I really doubt it. PDF ?? Definitely beyond my pay grade.

If you are interested in developing a Unicode capable "Export to HTML" facility, you might find some help studying the BCX translated C codes of the example on the webpage

https://bcxbasiccoders.com/webhelp/html/bcxunicode.htm#widetoansi
#25
Add-ins / Re: Export C source as HTML or...
Last post by TimoVJL - January 23, 2026, 05:43:50 PM
How does that help RTF coding?

EDIT:
Quote from: Robert on January 23, 2026, 06:03:43 PM
Unfortunately, the RTFDEFS.H document referenced is not obviously available.
How to Obtain the WinWord Converter SDK (GC1039)


HTML
https://unicodelookup.com/
#26
Add-ins / Re: Export C source as HTML or...
Last post by John Z - January 23, 2026, 12:59:59 PM
Assuming the code comments are in the user's default code page language, then
#include <windows.h>
#include <stdio.h>

int main(void) {
    UINT user_codepage = GetACP(); // Retrieve the system default Windows ANSI code page

    printf("The user's default Windows ANSI code page is: %u\n", user_codepage);

    // Optional: Keep the console window open to view the output
    printf("Press Enter to exit...");
    getchar();

    return 0;
}

or a variation thereof, can get the correct code page to use for encoding the output file(s).
Code snippet provided by Google 'AI' overview  :( however, only the GetACP() line is relevant :)

John Z
#27
Add-ins / Re: Export C source as HTML or...
Last post by Robert - January 23, 2026, 03:25:53 AM
Quote from: TimoVJL on January 22, 2026, 11:03:02 PM
Those are connected.
{\f1\fnil\fcharset161{\*\fname Courier New;}Courier New Greek;}
\f1\'f3\'ea\'e1\'f4

With UTF-16LE there is a bit less conversion, but you still have to find the right font charset for the characters.

https://www.oreilly.com/library/view/rtf-pocket-guide/9781449302047/ch04.html

I have little interest in that.

Ah yes, Code Pages and code page fonts.

Thanks Timo.
#28
Add-ins / Re: Export C source as HTML or...
Last post by TimoVJL - January 22, 2026, 11:03:02 PM
Those are connected.
{\f1\fnil\fcharset161{\*\fname Courier New;}Courier New Greek;}
\f1\'f3\'ea\'e1\'f4

With UTF-16LE there is a bit less conversion, but you still have to find the right font charset for the characters.

https://www.oreilly.com/library/view/rtf-pocket-guide/9781449302047/ch04.html

I have little interest in that.
#29
Bug reports / Re: glu.h -- Silence warnings
Last post by Robert - January 22, 2026, 08:48:48 PM
Quote from: MrBcx on January 22, 2026, 04:04:36 PM
A minor update that silences strict warnings - 3 instances replacing () with (void).

Silencing nothing with a void.
An interesting concept to contemplate.
It fits well into the Zen class of "koan".

Thanks for this, warnings make me nervous.
#30
Add-ins / Re: Export C source as HTML or...
Last post by Robert - January 22, 2026, 08:38:24 PM
Quote from: TimoVJL on January 22, 2026, 12:42:11 PM
Better to show important things too:
{\rtf1\ansi\deff0{\fonttbl{\f0\fnil\fcharset0 Courier New;}{\f1\fnil\fcharset161{\*\fname Courier New;}Courier New Greek;}}
{\*\generator Msftedit 5.41.21.2510;}\viewkind4\uc1\pard\lang1035\f0\fs22 #include <windows.h>\par
#include <stdio.h>\par
\par
static int OrigCodePage;\par
static const char* \f1\'f3\'ea\'e1\'f4;\par
static const char* \'e4\'f5\'f3\'ea\'e1\'f4\'e1\'ed\'ef\'de\'f4\'f9\'ed;\par
Streamed parsing doesn't work, as the RTF header has to be separated while processing.


The streamed parsing is a problem because the AddIn_GetSourceText function extracts UTF-16LE with embedded nulls. The code extracted by AddIn_GetSourceText should be converted to UTF-8, removing the embedded nulls, so that it can be processed with standard, non-wide, C functions.
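
A minimal, portable sketch of that conversion step, assuming the extracted text is available as an array of UTF-16 code units (on Windows, WideCharToMultiByte with CP_UTF8 does the same job). ASCII characters in UTF-16LE carry a 0x00 high byte, which is what byte-oriented C functions mistake for a terminator; after conversion they are single bytes again. The function name utf16_to_utf8 is hypothetical, and nothing here is taken from the add-in SDK.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Converts n UTF-16 code units to NUL-terminated UTF-8 in out.
   Returns the number of bytes written (excluding the terminator), or
   (size_t)-1 on an unpaired surrogate or if out is too small. */
size_t utf16_to_utf8(const uint16_t *in, size_t n, char *out, size_t outsize)
{
    size_t o = 0;

    assert(in != NULL && out != NULL);
    for (size_t i = 0; i < n; ++i) {
        uint32_t cp = in[i];

        if (cp >= 0xD800 && cp <= 0xDBFF) {             /* high surrogate */
            if (i + 1 >= n || in[i + 1] < 0xDC00 || in[i + 1] > 0xDFFF)
                return (size_t)-1;
            cp = 0x10000 + ((cp - 0xD800) << 10)
               + (uint32_t)(in[i + 1] - 0xDC00);
            ++i;
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {      /* stray low surrogate */
            return (size_t)-1;
        }

        size_t need = cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
        if (o + need + 1 > outsize)                     /* room for the NUL */
            return (size_t)-1;

        if (need == 1) {
            out[o++] = (char)cp;
        } else if (need == 2) {
            out[o++] = (char)(0xC0 | (cp >> 6));
            out[o++] = (char)(0x80 | (cp & 0x3F));
        } else if (need == 3) {
            out[o++] = (char)(0xE0 | (cp >> 12));
            out[o++] = (char)(0x80 | ((cp >> 6) & 0x3F));
            out[o++] = (char)(0x80 | (cp & 0x3F));
        } else {
            out[o++] = (char)(0xF0 | (cp >> 18));
            out[o++] = (char)(0x80 | ((cp >> 12) & 0x3F));
            out[o++] = (char)(0x80 | ((cp >> 6) & 0x3F));
            out[o++] = (char)(0x80 | (cp & 0x3F));
        }
    }
    if (o + 1 > outsize)
        return (size_t)-1;
    out[o] = '\0';
    return o;
}
```

The caller passes the length in code units without the terminating null, so the only zero byte in the result is the terminator.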

The RTF encoding of UTF-8 is beyond my understanding; for example, the encoding of the eight-byte UTF-8 string

σκατ;

into the expected RTF representation

\'f3\'ea\'e1\'f4;
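
A possible explanation, sketched under the assumption that \fcharset161 text uses the Windows-1253 (Greek) code page: each \'xx is the single Windows-1253 byte for one character, not a UTF-8 byte. For the Greek letters U+0391 to U+03CE that byte appears to be the code point minus 0x2D0, so σ (U+03C3) becomes f3, κ (U+03BA) becomes ea, and so on. The function name is hypothetical, and on Windows the table lookup would normally be left to WideCharToMultiByte with code page 1253.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical converter: UTF-8 text limited to ASCII plus Greek letters
   is written as RTF text for a font declared with \fcharset161.
   Returns 0 on success, -1 on anything it cannot map or if dst is too small. */
int utf8_greek_to_rtf(const char *src, char *dst, size_t dstsize)
{
    const unsigned char *s = (const unsigned char *)src;
    size_t used = 0;

    assert(src != NULL && dst != NULL);
    if (dstsize == 0)
        return -1;
    dst[0] = '\0';

    while (*s) {
        char tmp[8];

        if (*s < 0x80) {
            if (*s == '\\' || *s == '{' || *s == '}')
                snprintf(tmp, sizeof tmp, "\\%c", *s);  /* RTF specials */
            else
                snprintf(tmp, sizeof tmp, "%c", *s);
            s += 1;
        } else if ((s[0] & 0xE0) == 0xC0 && (s[1] & 0xC0) == 0x80) {
            /* two-byte UTF-8 sequence: covers U+0080 to U+07FF */
            unsigned cp = ((s[0] & 0x1Fu) << 6) | (s[1] & 0x3Fu);

            /* assumed Windows-1253 layout; U+03A2 is unassigned */
            if (cp < 0x0391 || cp > 0x03CE || cp == 0x03A2)
                return -1;
            snprintf(tmp, sizeof tmp, "\\'%02x", cp - 0x02D0u);
            s += 2;
        } else {
            return -1;                  /* outside what charset 161 covers here */
        }

        size_t tl = strlen(tmp);
        if (used + tl + 1 > dstsize)
            return -1;
        memcpy(dst + used, tmp, tl + 1);
        used += tl;
    }
    return 0;
}
```

The output only renders as Greek when it follows a font switch such as \f1 that selects the charset 161 font from the font table, as in the header quoted above.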