Pelles C forum

Pelles C => Add-ins => Topic started by: Pelle on March 16, 2005, 07:34:45 PM

Title: Export C source as HTML or PDF file
Post by: Pelle on March 16, 2005, 07:34:45 PM
The attached Add-In can be used to export C source code as an HTML or PDF file - with syntax color highlighting. Maybe useful to someone...

Tested with 3.00 Beta, but will probably work with 2.90 too...

New Mar 18: Added export to PDF files...
New Mar 22: Added compression to PDF files... - thanks Timppa!

- Use Export.ppj for no compression.
- Use ExportZ.ppj for compression - in this case you also need ZLIB 1.2.2 (http://www.zlib.net) in the subdirectory zlib-1.2.2.

Pelle
Title: Export C source as HTML or PDF file
Post by: Vortex on March 16, 2005, 08:28:30 PM
Very nice work
Title: Export C source as HTML or PDF file
Post by: Gerome on March 17, 2005, 10:02:59 AM
Hello Pelles,

Very useful program :)
BTW, in the same taste, a C to XML would be nice :)
Title: Export C source as HTML or PDF file
Post by: Pelle on March 17, 2005, 10:21:57 AM
Quote from: "Vortex"
Very nice work

Thanks, Vortex!

Pelle
Title: Export C source as HTML or PDF file
Post by: Pelle on March 17, 2005, 10:27:29 AM
Hello,

Quote from: "Gerome"
Very useful program :)
BTW, in the same taste, a C to XML would be nice :)

Thanks!

I'm not sure about the best representation/format for XML. Do you have an example, or know where I can find it...?

Export to PDF files would be nice too, but I'm not sure how to do that either.

Pelle
Title: Export C source as HTML or PDF file
Post by: Gerome on March 17, 2005, 12:24:32 PM
Hi,

XML ?
Yes, I have ideas :)


<PROJECT>
  <FILENAME>MySample.c</FILENAME>
  <BEFOREFOOS>#include + declares + ...</BEFOREFOOS>
  <FOOS>
    <NAME>specialfoo</NAME>
    <BODY>void specialfoo(void)...</BODY>
    ...
  </FOOS>
  <FOOS>
    <NAME>otherfoo</NAME>
    <BODY>int *otherfoo(void)...</BODY>
    ...
  </FOOS>
</PROJECT>


would be a good start? :)
After that, one can imagine easily building documentation and/or a snippet database; combined with an XSD layer, it could be terribly useful :)

Title: Export C source as HTML or PDF file
Post by: Pelle on March 17, 2005, 04:30:44 PM
OK, but this requires a different parser that better understands various C elements, like functions. Maybe some day...

Pelle
Title: Export C source as HTML or PDF file
Post by: Justin Thyme on March 18, 2005, 06:02:23 AM
Quote from: "Pelle"

Export to PDF files would be nice too, but I'm not sure how to do that either.


If you check out SciTE, it has an export to PDF function (in SciTE itself, not Scintilla), although I'm not sure if the export maintains highlighting.  But it might be a start to get you going.

http://scintilla.sourceforge.net/

Good luck!
Title: Export C source as HTML or PDF file
Post by: Pelle on March 18, 2005, 10:41:51 AM
Quote from: "Justin Thyme"
If you check out SciTE, it has an export to PDF function (in SciTE itself, not Scintilla), although I'm not sure if the export maintains highlighting.  But it might be a start to get you going.

Thanks - I will look at it. I found a reference manual at Adobe. I got it working in a 'hack-ish' way - need to make it more reusable...

Pelle
Title: C to HTML Converter
Post by: Robert on December 23, 2005, 09:59:30 AM
Hi Pelle:

I have modified your C to HTML converter addin so that the size of the exported HTML is significantly decreased. I enclosed the exported <body> .. </body> with <pre> ... </pre>, allowing <br> to be discarded, and replaced "class" with "id", and replaced "span" with "code" allowing "&nbsp;" to be replaced with a space.

I have only used the output with I.E. 6.0 so programmers using other browsers may have issues with the HTML export.

Merry Christmas to all !

Robert Wishlaw
Title: Export C source as HTML or PDF file
Post by: Pelle on December 23, 2005, 10:52:00 AM
Hello Robert,

Cool - thanks!

...and Merry Christmas...!

Pelle
Title: Export C source as HTML or PDF file
Post by: kobold on December 23, 2005, 02:10:23 PM
Great thing, could be very useful - thx
Merry christmas!
Title: Re: C to HTML Converter
Post by: Robert on January 03, 2006, 01:15:56 AM
Attached is a revised version of Pelle's C to HTML converter in which the "id" selector has been reverted to "class".

Although "id" seems to work, according to the CSS 2.0 standard, it is not meant to be used in more than one element instance and so multiple instances are tagged as non-compliant in strict type checking editors.

This non-compliance will probably cause problems in the future as the browsers become more compliant to the CSS 2.0 standard.
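
The markup described above (tokens wrapped in <code class="..."> inside a <pre> block) could be produced along the following lines. This is only a minimal sketch and not the add-in's actual code: the function names html_escape and emit_token and the class name "kw" are made up for illustration.

```c
#include <stdio.h>
#include <string.h>

/* Append src to the NUL-terminated string in dst (capacity cap), escaping the
   three characters that are unsafe inside <pre>.
   Returns 0 on success, -1 if dst is too small. */
int html_escape(char *dst, size_t cap, const char *src)
{
    size_t used = strlen(dst);
    for (; *src; ++src) {
        const char *rep;
        char one[2] = { *src, '\0' };
        switch (*src) {
        case '<': rep = "&lt;";  break;
        case '>': rep = "&gt;";  break;
        case '&': rep = "&amp;"; break;
        default:  rep = one;     break;
        }
        size_t n = strlen(rep);
        if (used + n + 1 > cap)
            return -1;
        memcpy(dst + used, rep, n + 1);   /* copies the terminating NUL too */
        used += n;
    }
    return 0;
}

/* Append one highlighted token. cls is a hypothetical CSS class name;
   "class" may repeat across elements, which "id" may not. */
int emit_token(char *dst, size_t cap, const char *cls, const char *text)
{
    size_t used = strlen(dst);
    int n = snprintf(dst + used, cap - used, "<code class=\"%s\">", cls);
    if (n < 0 || (size_t)n >= cap - used)
        return -1;
    if (html_escape(dst, cap, text) != 0)
        return -1;
    used = strlen(dst);
    n = snprintf(dst + used, cap - used, "</code>");
    return (n < 0 || (size_t)n >= cap - used) ? -1 : 0;
}
```

Starting from an empty buffer, emit_token(buf, sizeof buf, "kw", "a<b") leaves <code class="kw">a&lt;b</code> in buf. Because the output sits inside <pre>, spaces and line breaks are written as-is and need neither "&nbsp;" nor <br>.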

Robert Wishlaw
Title: Export C source as HTML or PDF file
Post by: JohnF on January 06, 2006, 11:18:42 AM
Is it ok to include exportToHTML2.zip on my web site ?

John
Title: Is it ok to include exportToHTML2.zip on my web site ?
Post by: Robert on January 06, 2006, 08:47:31 PM
Hi John:

Yes, go ahead, I have no objections.

Robert Wishlaw
Title: Re: Is it ok to include exportToHTML2.zip on my web site ?
Post by: JohnF on January 06, 2006, 11:07:36 PM
Thanks.

John
Title: Re: Export C source as HTML or PDF file
Post by: Freddy on February 13, 2008, 07:36:38 PM
Hi Pelle!
When I try to compile this ADDIN using Pelles C 5.0 BETA I get the following error:
Quote
Building Export.obj.
C:\Arquivos de programas\PellesC\Include\addin.h(987): warning #2099: Missing type specifier.
C:\Arquivos de programas\PellesC\Include\addin.h(987): error #2001: Syntax error: expected ';' but found 'ADDIN_FIND_IN_FILES'.
C:\Arquivos de programas\PellesC\Include\addin.h(987): warning #2099: Missing type specifier.
*** Error code: 1 ***
Done.

I guess there is a little bug in addin.h?

Thanks
Title: Re: Export C source as HTML or PDF file
Post by: TimoVJL on October 09, 2013, 07:14:08 AM
Updated to support form feed / page break, and RTF too.

EDIT 2013-10-09: fix for CRLF (for Win7 WordPad?)
EDIT 2014-02-01: fix RTF font name.
Title: Re: Export C source as HTML or PDF file
Post by: Robert on October 13, 2013, 05:56:32 AM
Hi Timovjl:

Thank you for the update.

Robert Wishlaw
Title: Re: Export C source as HTML, PDF, or RTF file
Post by: John Z on January 15, 2026, 12:26:58 PM
Looking for some information on creating PDF files, I found an old add-in project created by multiple authors (Pelle, Timo, and Robert) way back in 2005 for Pelles C version 3 (I think), then updated with RTF support by Timo in 2013.  https://forum.pellesc.de/index.php?topic=471.15

Of course it no longer worked with the newer Pelles C version 13.00.9 - Soooo I've minimally hacked it to get it functional for the current version.  While it will now work with plain text, UTF-8, and UTF-16 source files, it will only produce accurate output if every code point is within the ANSI range.  This is OK for source code, but comments that use non-ANSI characters won't be displayed correctly.  That could be fixed too, but I'm not sure it would be worth the effort.

The project ZIP includes everything for a 64-bit version.

John Z
Title: Re: Export C source as HTML or PDF file
Post by: Robert on January 16, 2026, 10:19:44 AM
Hi John Z:

Wow! This takes me back to the "Realm of Long Long Ago".

About the HTML Export for UTF-8:
1. The HTML header requires more info for the page to render properly.
2. I think _setmode has to be used to get proper UTF-8 output. For details see
https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/setmode?view=msvc-170
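
For point 1, a header that declares the encoding might look like the sketch below. The function name format_html_header is hypothetical, and the sketch assumes the title is already HTML-escaped and that the caller writes the result as raw UTF-8 bytes (for example to a file opened in binary mode), which sidesteps the _setmode question rather than answering it.

```c
#include <stdio.h>

/* Format a minimal HTML5 prologue that declares UTF-8 into dst (capacity cap).
   title is assumed to be HTML-escaped already.
   Returns what snprintf returns: the length needed, or a negative value on error. */
int format_html_header(char *dst, size_t cap, const char *title)
{
    return snprintf(dst, cap,
        "<!DOCTYPE html>\n"
        "<html>\n"
        "<head>\n"
        "<meta charset=\"utf-8\">\n"
        "<title>%s</title>\n"
        "</head>\n"
        "<body>\n"
        "<pre>\n",
        title);
}
```

Without the <meta charset="utf-8"> line a browser has to guess the encoding, which is one way non-ASCII source text ends up displayed as the wrong characters.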

Where can I get a zlib64.lib ?
Building Export64.dll.
POLINK: fatal error: File not found: 'zlib64.lib'.
*** Error code: 1 ***

Interesting.
Thanks.

Title: Re: Export C source as HTML or PDF file
Post by: John Z on January 16, 2026, 11:45:34 AM
Hi Robert,

My apologies, I didn't check that the project zip was complete.  I updated the post with a project zip that includes the libs.

I'll look further into it (UTF-8) with the link you provided.

Hopefully good memories for you :)

John Z
Title: Re: Export C source as HTML or PDF file
Post by: Robert on January 16, 2026, 06:30:52 PM
Hi John Z:

Thanks, the zlib64 is good. Export is as expected with the limitations you have mentioned regarding UTF-8.

There is a separate but maybe connected problem with Pelles poide.exe IDE.

This code


#include <stdio.h>

/* entry point */
int main(void)
{
  printf("Hello, world!\n");
 
  return 0;
}


amended by adding a print in Tajik UTF-8


#include <stdio.h>

/* entry point */
int main(void)
{
  printf("Hello, world!\n");
  printf("Салом Ҷаҳон!\n");
 
  return 0;
}


is reloaded to the poide.exe IDE as


#include <stdio.h>

/* entry point */
int main(void)
{
  printf("Hello, world!\n");
  printf("????? ?????!!\n");
 
  return 0;
}


when the IDE is shut down and restarted.
Title: Re: Export C source as HTML or PDF file
Post by: TimoVJL on January 16, 2026, 06:34:08 PM
zlib 1.3.1 project, now with the missing header file that the dependencies forgot.

zlib 1.3.1 Release Notes (https://github.com/madler/zlib/releases)
Title: Re: Export C source as HTML or PDF file
Post by: Robert on January 16, 2026, 06:54:57 PM
Thanks Timo  8)
Title: Re: Export C source as HTML or PDF file
Post by: Robert on January 17, 2026, 07:39:27 AM
Hi John Z:

Very complex, very interesting.

I have had a look at Pelle's and Timo's code, comparing it with yours, and have some ideas.

I will continue studying this and if I can get it to spit out anything intelligible, I'll let you know.

Title: Re: Export C source as HTML or PDF file
Post by: Robert on January 17, 2026, 10:35:14 AM
Hi John Z:

Yeah, well, uh ... we've got a bit of work in front of us if we hope to deal properly with non-ASCII code.

This code


#include <windows.h>
#include <stdio.h>

static int OrigCodePage;
static const char* σκατ;
static const char* δυσκατανοήτων;

int main(int argc, char* argv[])
{
    OrigCodePage = GetConsoleOutputCP();
    SetConsoleOutputCP(65001);
    σκατ = "σκατ doo, be, shoo, bop, ooh, dee, doo, sha-bam";
    δυσκατανοήτων = "δυσκατανοήτων difficult to understand";
    printf("%s%s%s\n", σκατ, " ", δυσκατανοήτων);
    _getch();
    SetConsoleOutputCP(OrigCodePage);
    return 1;
}


Exports from poide.exe IDE as HTML file:

#include <windows.h>
#include <stdio.h>

static int OrigCodePage;
static const char* УКБФ;
static const char* ДХУКБФБНПЎФЩН;

int main(int argc, char* argv[])
{
    OrigCodePage = GetConsoleOutputCP();
    SetConsoleOutputCP(65001);
    УКБФ = "УКБФ doo, be, shoo, bop, ooh, dee, doo, sha-bam";
    ДХУКБФБНПЎФЩН = "ДХУКБФБНПЎФЩН difficult to understand";
    printf("%s%s%s\n", УКБФ, " ", ДХУКБФБНПЎФЩН);
    _getch();
    SetConsoleOutputCP(OrigCodePage);
    return 1;
}


Pelles C project code in attached file Skat.zip
Title: Re: Export C source as HTML or PDF file
Post by: John Z on January 17, 2026, 10:47:20 AM
Hi Robert,

Enjoy  :)  -

Both HTML and PDF formats do support UTF-8, and PDF formats are amenable to UTF-16 and others.
HTML can support UTF-16 but it is 'strongly' discouraged in the HTML5 spec so UTF-8 is mainstream for HTML.

It seems the biggest challenge is UTF-8/16 in RTF. All non-ASCII characters must be encoded using the \uN? format.
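
As a rough illustration of the \uN? form: N is the UTF-16 code unit written as a signed 16-bit decimal, and the "?" is the single fallback character that a reader without Unicode support shows instead (one fallback character matches \uc1). The function name rtf_put_unit below is made up, and the sketch assumes the text has already been split into UTF-16 code units.

```c
#include <stdio.h>
#include <stdint.h>
#include <string.h>

/* Append the RTF form of one UTF-16 code unit to the NUL-terminated string in
   dst (capacity cap). Returns 0 on success, -1 if dst is too small. */
int rtf_put_unit(char *dst, size_t cap, uint16_t unit)
{
    size_t used = strlen(dst);
    int n;

    if (unit == '\\' || unit == '{' || unit == '}') {
        /* RTF syntax characters must be escaped with a backslash. */
        n = snprintf(dst + used, cap - used, "\\%c", (char)unit);
    } else if (unit == '\n') {
        /* A source line break becomes a paragraph break. */
        n = snprintf(dst + used, cap - used, "\\par\n");
    } else if (unit >= 0x20 && unit < 0x7F) {
        /* Printable ASCII passes through unchanged. */
        n = snprintf(dst + used, cap - used, "%c", (char)unit);
    } else {
        /* \uN takes a signed 16-bit decimal, so units above 32767 go negative. */
        int value = unit > 32767 ? (int)unit - 65536 : (int)unit;
        n = snprintf(dst + used, cap - used, "\\u%d?", value);
    }
    return (n < 0 || (size_t)n >= cap - used) ? -1 : 0;
}
```

For example, the Greek letter σ (U+03C3, decimal 963) comes out as \u963?, and U+FFFD comes out as \u-3?.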

So it can work, I just did the minimum to get it 'working' again.

While typing I see an update posted. I'll look at it more; it seems your focus is a console program and not a Windows program?  If so, somewhere I saw a full-fledged console program for console Unicode display - maybe I can find it again ...

John Z
Title: Re: Export C source as HTML or PDF file
Post by: TimoVJL on January 17, 2026, 01:21:16 PM
A small stupid C to RTF project that writes to a RichEdit control.

It can help to debug some code.
Title: Re: Export C source as HTML or PDF file
Post by: TimoVJL on January 17, 2026, 09:46:21 PM
It was created on the Windows 7 SP1 x64 EN version.
As the source code is available, the crash point might be determined.
It was also put together from working code examples.

Title: Re: Export C source as HTML or PDF file
Post by: Robert on January 18, 2026, 12:17:30 AM
Quote from: TimoVJL on January 17, 2026, 01:21:16 PM
A small stupid C to RTF project that writes to a RichEdit control.

It can help to debug some code.

Hi Timo:
The RTF output from this code is different from the RTF output of John Z's IDE addin.
I'm going to have a look at the output in an ImHex editor and see what's going on.

Do you know ImHex?
https://imhex.werwolv.net/
https://github.com/WerWolv/ImHex
Title: Re: Export C source as HTML or PDF file
Post by: TimoVJL on January 18, 2026, 05:00:36 AM
Quote from: Robert on January 18, 2026, 12:17:30 AM
The RTF output from this code is different from the RTF output of John Z's IDE addin.

It might just have been created for a different purpose, like writing to a RichEdit control.
Also, it was only for ANSI source.

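#include <stdio.h>

// https://stackoverflow.com/questions/5603559/one-file-lib-to-conv-utf8-char-to-wchar-t
// Decodes one UTF-8 sequence of 1 to 3 bytes (BMP only) and advances *utf8 past it.
// Returns the code point, or 0 for an invalid or unsupported sequence.
unsigned short utf8_to_wchar(char **utf8)
{
    unsigned char *p = (unsigned char *)*utf8;
    unsigned char v = *p++;    // unsigned, so the tests below do not depend on char signedness
    unsigned short c;
    int shiftCount;

    if (v < 0x80)    // plain ASCII: return it directly instead of falling through
    {
        *utf8 = (char *)p;
        return v;
    }
    if ((v & 0xE0) == 0xC0)    // lead byte of a 2-byte sequence
    {
        shiftCount = 1;
        c = v & 0x1F;
    }
    else if ((v & 0xF0) == 0xE0)    // lead byte of a 3-byte sequence
    {
        shiftCount = 2;
        c = v & 0xF;
    }
    else    // stray continuation byte or 4-byte lead: skip it so the caller makes progress
    {
        *utf8 = (char *)p;
        return 0;
    }
    while (shiftCount)
    {
        v = *p;
        if ((v & 0xC0) != 0x80)    // not a continuation byte: leave it unread
        {
            *utf8 = (char *)p;
            return 0;
        }
        ++p;
        c = (unsigned short)((c << 6) | (v & 0x3F));
        --shiftCount;
    }
    *utf8 = (char *)p;
    return c;
}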

int ShortToStrPos(int n, char *s)
{
    int i, sign, idx, nl, len;

    idx = 0;
/*    if ((sign = n) < 0) {    // record sign
        n = -n;    // make n positive
        idx++;
    }*/
    i = 0;
    nl = n;
    while ((nl /= 10) > 0)    /* count nums */
        idx++;
    len = idx+1;
    s[idx+1] = '\0';
    do {    /* generate digits in reverse order */
        s[idx--] = n % 10 + '0';    /* get next digit */
    } while ((n /= 10) > 0);    /* delete it */
//    if (sign < 0)
//        s[0] = '-';
    return len;
}

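int __cdecl main(void)
{
    char utf8[] = u8"σκατ";
    char *p = utf8;
    while (*p) {
        if (*(unsigned char*)p > 127) {    // non-ASCII byte: decode the UTF-8 sequence
            unsigned short uc = utf8_to_wchar(&p);
            printf("%Xh\t", (unsigned)uc);
        } else
            ++p;    // plain ASCII: step over it, otherwise the loop never advances
    }
    printf("\n%p\n%p\n", (void *)utf8, (void *)p);
    return 0;
}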

EDIT 2025-01-19: UNICODE version in RE_Test3, and Esc closes the window, but there are still bugs
Title: Re: Export C source as HTML or PDF file
Post by: Vortex on January 18, 2026, 10:11:36 AM
Hi Timo,

My apologies, it was my mistake. Your application works fine and I removed my previous message #41859
Title: Re: Export C source as HTML or PDF file
Post by: Robert on January 21, 2026, 03:56:32 AM
Hei TimoVJL:

The code snippet above is interesting. Thanks.

What bugs ? I don't see bugs in RE_Test3 output.



Title: Re: Export C source as HTML or PDF file
Post by: Robert on January 21, 2026, 04:37:53 AM
Quote from: TimoVJL on January 18, 2026, 05:00:36 AM
....

        if (*(unsigned char*)p > 127) {    // UTF8 ?
....



Hei TimoVJL:

"Nearly all invalid UTF-8 cases can be detected by looking at the first two bytes of a character (in fact, the first 12 bits)."

Quoted from:
'Validating UTF-8 In Less Than One Instruction Per Byte'
available at
https://arxiv.org/pdf/2010.03090.pdf

See also:
Ridiculously fast unicode (UTF-8) validation (https://lemire.me/blog/2020/10/20/ridiculously-fast-unicode-utf-8-validation/)
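
For comparison with the "> 127" check, a plain byte-at-a-time validator is sketched below. It is not the vectorized method from the paper, only a straightforward scalar check of the same rules, and the function name utf8_is_valid is made up for illustration.

```c
#include <stddef.h>

/* Return 1 if the len bytes at s are well-formed UTF-8, else 0.
   Rejects overlong forms, surrogates (U+D800..U+DFFF) and values above U+10FFFF. */
int utf8_is_valid(const unsigned char *s, size_t len)
{
    size_t i = 0;
    while (i < len) {
        unsigned char b = s[i];
        size_t need;            /* continuation bytes expected after the lead byte */
        unsigned long cp, min;  /* decoded value and smallest value for this length */

        if (b < 0x80)                { i++; continue; }
        else if ((b & 0xE0) == 0xC0) { need = 1; cp = b & 0x1F; min = 0x80; }
        else if ((b & 0xF0) == 0xE0) { need = 2; cp = b & 0x0F; min = 0x800; }
        else if ((b & 0xF8) == 0xF0) { need = 3; cp = b & 0x07; min = 0x10000; }
        else return 0;          /* stray continuation byte or invalid lead byte */

        if (len - i <= need)
            return 0;           /* sequence is cut off by the end of the buffer */
        for (size_t k = 1; k <= need; k++) {
            if ((s[i + k] & 0xC0) != 0x80)
                return 0;       /* expected a continuation byte */
            cp = (cp << 6) | (s[i + k] & 0x3F);
        }
        if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return 0;
        i += need + 1;
    }
    return 1;
}
```

The eight bytes of "σκατ" pass, while the overlong pair C0 80, the encoded surrogate ED A0 80 and a lone lead byte CF are all rejected.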

Thanks again for the code.

Mikään ei ole mahdotonta. (Nothing is impossible.)


Title: Re: Export C source as HTML or PDF file
Post by: TimoVJL on January 21, 2026, 07:14:06 AM
Quote
RTF SYNTAX

An RTF file consists of unformatted text, control words, control symbols, and groups. For ease of transport, a standard RTF file can consist of only 7-bit ASCII characters. (Converters that communicate with Microsoft Word for Windows or Microsoft Word for the Macintosh should expect 8-bit characters.) There is no set maximum line length for an RTF file.

RTF uses ASCII 32 - 127 chars and some Latin-1 (ISO/IEC 8859-1) chars without encoding.

So I was just lazy about checking chars, like many others.
UTF-8 with a BOM can get conditional processing.
Title: Re: Export C source as HTML or PDF file
Post by: Robert on January 21, 2026, 10:48:55 AM
Hi TimoVJL and John Z:

RTF SYNTAX.
Oh that !
Yeah, well, I think I'm beginning to remember why I'm here.

Export C source etc.

Anyway, you solved what I had considered the hard part, that is, dealing with the UTF-16LE text which is what the export addin function AddIn_GetSourceTextW has to process.

However, as John Z mentioned it and you yelled "RTF SYNTAX", I had to look and see what was expected from

static const char* σκατ;

and saw that it was an RTF encoding of

\par }{\rtlch\fcs1 \af67 \ltrch\fcs0 \f67\insrsid15157589\charrsid15157589 static const char* \'f3\'ea\'e1\'f4;

Hmmmm  :-\  :o

Title: Re: Export C source as HTML or PDF file
Post by: TimoVJL on January 22, 2026, 12:42:11 PM
Better to show important things too:
{\rtf1\ansi\deff0{\fonttbl{\f0\fnil\fcharset0 Courier New;}{\f1\fnil\fcharset161{\*\fname Courier New;}Courier New Greek;}}
{\*\generator Msftedit 5.41.21.2510;}\viewkind4\uc1\pard\lang1035\f0\fs22 #include <windows.h>\par
#include <stdio.h>\par
\par
static int OrigCodePage;\par
static const char* \f1\'f3\'ea\'e1\'f4;\par
static const char* \'e4\'f5\'f3\'ea\'e1\'f4\'e1\'ed\'ef\'de\'f4\'f9\'ed;\par
Streamed parsing doesn't work, as the RTF header has to be separated out while processing.
Title: Re: Export C source as HTML or PDF file
Post by: Robert on January 22, 2026, 08:38:24 PM
The streamed parsing is a problem because the AddIn_GetSourceText function extracts UTF-16LE with embedded nulls. The code extracted by AddIn_GetSourceText should be converted to UTF-8, removing the embedded nulls, so that it can be processed with standard, non-wide, C functions.
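
The conversion step suggested here could look like the sketch below. It is a portable illustration only: the function name utf16_to_utf8 is hypothetical, and the input is assumed to be an array of host-order 16-bit code units, which is what UTF-16LE text amounts to on a little-endian machine.

```c
#include <stddef.h>
#include <stdint.h>

/* Convert n UTF-16 code units to NUL-terminated UTF-8 in dst (capacity cap).
   Unpaired surrogates become U+FFFD. Returns the number of bytes written
   (not counting the NUL), or (size_t)-1 if dst is too small. */
size_t utf16_to_utf8(const uint16_t *src, size_t n, char *dst, size_t cap)
{
    size_t o = 0;
    for (size_t i = 0; i < n; i++) {
        uint32_t cp = src[i];
        if (cp >= 0xD800 && cp <= 0xDBFF && i + 1 < n &&
            src[i + 1] >= 0xDC00 && src[i + 1] <= 0xDFFF) {
            /* Combine a high/low surrogate pair into one code point. */
            cp = 0x10000 + ((cp - 0xD800) << 10) + (uint32_t)(src[++i] - 0xDC00);
        } else if (cp >= 0xD800 && cp <= 0xDFFF) {
            cp = 0xFFFD;    /* lone surrogate: substitute the replacement character */
        }

        size_t need = cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
        if (o + need + 1 > cap)
            return (size_t)-1;    /* keep room for the terminating NUL */

        if (need == 1) {
            dst[o++] = (char)cp;
        } else if (need == 2) {
            dst[o++] = (char)(0xC0 | (cp >> 6));
            dst[o++] = (char)(0x80 | (cp & 0x3F));
        } else if (need == 3) {
            dst[o++] = (char)(0xE0 | (cp >> 12));
            dst[o++] = (char)(0x80 | ((cp >> 6) & 0x3F));
            dst[o++] = (char)(0x80 | (cp & 0x3F));
        } else {
            dst[o++] = (char)(0xF0 | (cp >> 18));
            dst[o++] = (char)(0x80 | ((cp >> 12) & 0x3F));
            dst[o++] = (char)(0x80 | ((cp >> 6) & 0x3F));
            dst[o++] = (char)(0x80 | (cp & 0x3F));
        }
    }
    if (cap == 0)
        return (size_t)-1;
    dst[o] = '\0';
    return o;
}
```

The result contains no embedded NUL bytes for ordinary text, so it can be handed to the usual char-based string functions. On Windows the same job is normally done with WideCharToMultiByte and CP_UTF8.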

The RTF encoding of UTF-8 is beyond my understanding, for example, the encoding of the eight-byte UTF-8 string

σκατ;

into the expected RTF representation

\'f3\'ea\'e1\'f4;

Title: Re: Export C source as HTML or PDF file
Post by: TimoVJL on January 22, 2026, 11:03:02 PM
Those are connected.
{\f1\fnil\fcharset161{\*\fname Courier New;}Courier New Greek;}
\f1\'f3\'ea\'e1\'f4

With UNICODE 16LE there is a bit less conversion, but you still have to find the right font charset for the chars.
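
To make the connection concrete: \fcharset161 selects the Greek charset (Windows code page 1253), and each \'hh is the byte that the character has in that code page. The sketch below covers only ASCII and the Greek run U+0390..U+03CE, which sits in code page 1253 at a fixed offset; the function names are made up, and escaping of \, { and } is left out.

```c
#include <stdio.h>
#include <string.h>

/* Map a code point to its Windows-1253 byte, or 0 if this sketch has no mapping.
   Only ASCII and the contiguous Greek run U+0390..U+03CE are covered. */
unsigned char cp1253_from_unicode(unsigned cp)
{
    if (cp < 0x80)
        return (unsigned char)cp;
    if (cp >= 0x0390 && cp <= 0x03CE && cp != 0x03A2)   /* U+03A2 is unassigned */
        return (unsigned char)(cp - 0x02D0);
    return 0;
}

/* Append UTF-8 text (1- and 2-byte sequences only) to the NUL-terminated string
   in dst as RTF for a \fcharset161 font. Returns 0 on success, -1 on unsupported
   input or if dst (capacity cap) is too small. */
int rtf_put_greek(char *dst, size_t cap, const char *utf8)
{
    const unsigned char *s = (const unsigned char *)utf8;
    size_t used = strlen(dst);

    while (*s) {
        unsigned cp;
        if (*s < 0x80) {
            cp = *s++;
        } else if ((*s & 0xE0) == 0xC0 && (s[1] & 0xC0) == 0x80) {
            cp = ((unsigned)(s[0] & 0x1F) << 6) | (s[1] & 0x3F);
            s += 2;
        } else {
            return -1;    /* longer or malformed sequences are out of scope here */
        }

        unsigned char b = cp1253_from_unicode(cp);
        int n;
        if (b == 0)
            return -1;    /* no mapping in this sketch */
        if (b < 0x80)
            n = snprintf(dst + used, cap - used, "%c", b);
        else
            n = snprintf(dst + used, cap - used, "\\'%02x", b);
        if (n < 0 || (size_t)n >= cap - used)
            return -1;
        used += (size_t)n;
    }
    return 0;
}
```

With this mapping σ (U+03C3) becomes 0xF3, κ 0xEA, α 0xE1 and τ 0xF4, which is the \'f3\'ea\'e1\'f4 sequence shown above.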

https://www.oreilly.com/library/view/rtf-pocket-guide/9781449302047/ch04.html

I have low interest for that.
Title: Re: Export C source as HTML or PDF file
Post by: Robert on January 23, 2026, 03:25:53 AM
Ah yes, Code Pages and code page fonts.

Thanks Timo.
Title: Re: Export C source as HTML or PDF file
Post by: John Z on January 23, 2026, 12:59:59 PM
Assuming the code comments are in the user's default code page language, then
#include <windows.h>
#include <stdio.h>

int main() {
    UINT user_codepage = GetACP(); // Retrieve the system default Windows ANSI code page

    printf("The user's default Windows ANSI code page is: %u\n", user_codepage);

    // Optional: Keep the console window open to view the output
    printf("Press Enter to exit...");
    getchar();

    return 0;
}

or variation thereof can get the correct code page to encode in the output file(s).
Code snippet provided by Google 'AI' overview -  :( however only the first line is relevant :)

John Z
Title: Re: Export C source as HTML or PDF file
Post by: TimoVJL on January 23, 2026, 05:43:50 PM
How does that help RTF coding?

EDIT:
Quote from: Robert on January 23, 2026, 06:03:43 PM
Unfortunately, the RTFDEFS.H document referenced is not obviously available.

How to Obtain the WinWord Converter SDK (GC1039) (https://support.microsoft.com/en-us/topic/how-to-obtain-the-winword-converter-sdk-gc1039-9d68ab16-2714-c0ac-436d-0e9239206835)


HTML
https://unicodelookup.com/
Title: Re: Export C source as HTML or PDF file
Post by: Robert on January 23, 2026, 06:03:43 PM
Hi John Z:

My inaccurate "Code Pages" statement should have stated

"Ah yes, charsets and charset fonts."

There is some information in

https://www.biblioscape.com/rtf15_spec.htm

where it is written

Quote
\fcharsetN   Specifies the character set of a font in the font table. Values for N are defined by Windows header files, and in the file RTFDEFS.H accompanying this document.

Unfortunately, the RTFDEFS.H document referenced is not obviously available.

There is a webpage at

https://www.n2pdf.de/fileadmin/user_upload/n2pdf/files/en/help/client_enu/unicode.htm

that has a table of codepage - charset equivalents.

My interest in your resurrection of the Pelles C Export addin is in the "Export to HTML" facility. I think it can handle Unicode identifiers and quotation mark embedded Unicode strings. RTF ?? I really doubt it. PDF ?? Definitely beyond my pay grade.

If you are interested in developing a Unicode capable "Export to HTML" facility, you might find some help studying the BCX translated C codes of the example on the webpage

https://bcxbasiccoders.com/webhelp/html/bcxunicode.htm#widetoansi
Title: Re: Export C source as HTML or PDF file
Post by: Robert on January 23, 2026, 10:54:58 PM
Quote from: TimoVJL on January 23, 2026, 05:43:50 PM
How does that help RTF coding?

EDIT:
Quote from: Robert on January 23, 2026, 06:03:43 PM
Unfortunately, the RTFDEFS.H document referenced is not obviously available.
How to Obtain the WinWord Converter SDK (GC1039) (https://support.microsoft.com/en-us/topic/how-to-obtain-the-winword-converter-sdk-gc1039-9d68ab16-2714-c0ac-436d-0e9239206835)

Thanks TimoVjl, the rtfdefs.h file is in the download and the charset defines are


// \fcharset, \cchs argument values
// some of these values may also be #defined in windows.h; here's the
// complete list
#define ANSI_CHARSET                  0
#define DEFAULT_CHARSET               1
#define SYMBOL_CHARSET                2
#define INVALID_CHARSET               3 // nil value
#define MAC_CHARSET                  77
#define SHIFTJIS_CHARSET            128 // CP 932: Japanese
#define HANGEUL_CHARSET             129 // CP 949: Korean
#define JOHAB_CHARSET               130
#define GB2312_CHARSET              134 // CP 936: PRC             
#define CHINESEBIG5_CHARSET         136 // CP 950: Taiwan
#define GREEK_CHARSET               161
#define TURKISH_CHARSET             162
#define HEBREW_CHARSET              177
#define ARABIC_CHARSET              178
#define ARABICTRADITIONAL_CHARSET   179
#define ARABICUSER_CHARSET          180
#define HEBREWUSER_CHARSET          181
#define BALTIC_CHARSET              186
#define RUSSIAN_CHARSET             204
#define THAI_CHARSET                222
#define EASTEUROPE_CHARSET          238
#define PC437_CHARSET               254
#define OEM_CHARSET                 255
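
/* The block below is an illustrative sketch, not part of rtfdefs.h. It shows
   one way a Windows code page number (for example the value GetACP returns)
   could be turned into a \fcharset value from the list above. The struct, the
   table and the function name charset_for_codepage are all hypothetical. */

```c
#include <stddef.h>

struct cp_charset {
    unsigned codepage;   /* Windows code page number */
    int charset;         /* matching \fcharset value */
};

static const struct cp_charset cp_table[] = {
    { 1252,   0 },  /* ANSI_CHARSET (Western) */
    {  932, 128 },  /* SHIFTJIS_CHARSET */
    {  949, 129 },  /* HANGEUL_CHARSET */
    {  936, 134 },  /* GB2312_CHARSET */
    {  950, 136 },  /* CHINESEBIG5_CHARSET */
    { 1253, 161 },  /* GREEK_CHARSET */
    { 1254, 162 },  /* TURKISH_CHARSET */
    { 1255, 177 },  /* HEBREW_CHARSET */
    { 1256, 178 },  /* ARABIC_CHARSET */
    { 1257, 186 },  /* BALTIC_CHARSET */
    { 1251, 204 },  /* RUSSIAN_CHARSET */
    {  874, 222 },  /* THAI_CHARSET */
    { 1250, 238 },  /* EASTEUROPE_CHARSET */
};

/* Return the \fcharset value for a code page, or 1 (DEFAULT_CHARSET)
   when the code page is not in the table. */
int charset_for_codepage(unsigned codepage)
{
    for (size_t i = 0; i < sizeof cp_table / sizeof cp_table[0]; i++)
        if (cp_table[i].codepage == codepage)
            return cp_table[i].charset;
    return 1;
}
```

/* Example: charset_for_codepage(1253) gives 161, the Greek charset used in the
   \fcharset161 font table entry earlier in this thread. */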

Correlation of Unicode chars to the above RTF charset data may be possible using the International Components for Unicode (ICU) library functions to process locale data contained in the Unicode Common Locale Data Repository (CLDR).

https://github.com/unicode-org/icu
https://github.com/unicode-org/cldr

There are several RTF-charset to UTF-8 converters but what is needed here, for the Export to RTF addin, is a Unicode to RTF-charset converter.

Obviously, from the above list of charsets, the conversions from Unicode would be limited. For example, C coders working with the Native American Osage language script or the International Phonetic Alphabet (Hello, anyone out there ?) would be excluded.

Title: Re: Export C source as HTML or PDF file
Post by: John Z on January 24, 2026, 10:59:02 AM
Hi Robert,

Quote from: Robert on January 23, 2026, 06:03:43 PM
My interest in your resurrection of the Pelles C Export addin is in the "Export to HTML" facility. I think it can handle Unicode identifiers and quotation mark embedded Unicode strings.

Yes - Export to HTML can easily handle UTF-8. I only did a quick update, but it should not take too much to fix it.

Quote from: Robert on January 23, 2026, 06:03:43 PM
If you are interested in developing a Unicode capable "Export to HTML" facility, you might find some help studying the BCX translated C codes of the example on the webpage

No need - I have previously written, as part of another Add-In program (LineCounter+  https://forum.pellesc.de/index.php?topic=10092.0 ), a module that outputs a file in UTF-8 HTML.  Since Pelles C now defaults to UTF-8 rather than a code page, it should be even easier.

So - I'll take a look at fixing that part of the Export Add-In.  Attached is an HTML output example from the LineCounter Add-in.

John Z

Like you said previously, optimistically,  "Nothing is impossible."  :)
Title: Re: Export C source as HTML or PDF file
Post by: John Z on January 31, 2026, 12:30:14 PM
Making progress  :)

Some help used from 'vibe' coding too.

Here is the first output trial utf8 source to utf8 html.
More improvements to be done before posting new project files.

So this is just preliminary look.

John Z
Title: Re: Export C source as HTML or PDF file
Post by: TimoVJL on January 31, 2026, 03:35:58 PM
A good thing, that Add-In is still updated  :)
Title: Re: Export C source as HTML or PDF file
Post by: John Z on January 31, 2026, 03:58:25 PM
Happy to do it.

Here is another output with actual UTF-8 characters :) I realized the first didn't have any 'special' characters.

The version is almost complete.  It also has an optional Dark output mode, and an optional Line Number output mode.  Just removing any nonsense I might have put in.

John Z
Title: Re: Export C source as HTML or PDF file
Post by: John Z on January 31, 2026, 05:21:03 PM
Attached is what I'm calling version 1.3.

It adds the ability to export a UTF-encoded source file into a UTF-encoded HTML file.
It supports color coding. If unwanted, change the colors in the source to black.
It supports output with line numbers added, if wanted.
It supports Dark background output, if wanted. (don't use black characters then  ;) )
It will partially work for UTF-16LE, but I didn't try any characters unique to UTF-16LE (maybe later) as UTF-8 was the focus.

All sources included in the project zip as usual.
A Dark Mode with Line Numbers example is attached too.
Check the readme file for more information.

Done...

John Z

Title: Re: Export C source as HTML or PDF file
Post by: Robert on January 31, 2026, 10:02:38 PM
Thank you John Z. An Excellent job done !  ;D  8)