C declaration explainer

frankie · September 18, 2017, 12:57:45 AM

Playing around with compiler techniques, I produced something that, at this stage, can explain C declarations in more or less plain English, even very complicated ones.
For example, from a declaration such as:

void (*bsd_signal(int sig, void (*func)(int)))(int);

You will get:

// 'bsd_signal' is: function having 2 parameters (int, pointer to function having 1 parameter (int) returning void) returning pointer to function having 1 parameter (int) returning void.
void (*bsd_signal(int sig, void (*func)(int )))(int );
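
To check the explanation above by hand, here is a minimal sketch in C that writes the same kind of declaration twice: once with a typedef and once in the raw form. The names handler_t, my_signal and on_event are hypothetical, chosen only for illustration, and the function body is a stand-in, not the real bsd_signal. The raw redeclaration is accepted by a compiler only if both spellings name the same type.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical alias: pointer to function having 1 parameter (int) returning void. */
typedef void (*handler_t)(int);

/* Typedef form: function having 2 parameters (int, handler_t) returning handler_t. */
handler_t my_signal(int sig, handler_t func);

/* Raw form of the same declaration. A compiler rejects this redeclaration
   if its type differs from the one above. */
void (*my_signal(int sig, void (*func)(int)))(int);

/* Stand-in body (an assumption, not the real bsd_signal):
   remember the new handler and return the previous one. */
static handler_t installed = NULL;

handler_t my_signal(int sig, handler_t func)
{
    handler_t previous = installed;
    assert(sig > 0);        /* this sketch assumes positive signal numbers */
    installed = func;
    return previous;
}

/* Example handler with the required shape. */
void on_event(int sig)
{
    (void)sig;
}
```

Calling my_signal(2, on_event) the first time returns a null pointer (no previous handler), and a later call returns on_event.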

You can run the executable from the command line:

fcc -T:Ast-C test_decl.c

The file test_decl.c is a sample source included in the zip.
The executable is based on my compiler, so the I/O uses files only.
After compilation, you'll find in the directory a freshly created file named test_decl.AstOut.c (the input file name with .AstOut inserted before the extension). You can also use the -Fo switch to change the output file name. For example:

fcc -T:Ast-C -Fo:test_out.c test_decl.c

This will create an output file named test_out.c.
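
The default naming rule (test_decl.c becomes test_decl.AstOut.c) can be sketched in C as below. This is only an illustration of the rule as described in this post, not code from fcc; the helper name astout_name and its return convention are assumptions.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of fcc: builds "<name>.AstOut<ext>" from
   "<name><ext>". Returns 0 on success, -1 if the buffer is too small. */
int astout_name(const char *input, char *out, size_t outsize)
{
    const char *base = input;
    const char *dot;
    const char *p;
    int n;

    assert(input != NULL && out != NULL);

    /* Skip any directory part so a dot in a folder name is not
       mistaken for the extension separator. */
    for (p = input; *p != '\0'; p++)
        if (*p == '/' || *p == '\\')
            base = p + 1;

    /* Last dot of the file name; no dot means no extension. */
    dot = strrchr(base, '.');
    if (dot == NULL)
        dot = base + strlen(base);

    /* Everything before the dot, then ".AstOut", then the extension. */
    n = snprintf(out, outsize, "%.*s.AstOut%s", (int)(dot - input), input, dot);
    return (n >= 0 && (size_t)n < outsize) ? 0 : -1;
}
```

With a buffer of 64 bytes, astout_name("test_decl.c", buf, sizeof buf) stores "test_decl.AstOut.c" in buf.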

Added an advanced demo with a test file that does more than explain declarations.

jack · September 18, 2017, 12:49:15 PM

Hello frankie,
so this translates C declarations to English? It could be helpful to a beginner like me.

frankie · September 18, 2017, 01:00:53 PM

Hello Jack,
yes, it is the first use of a bigger project.
It will (maybe) become a C compiler in the future.
For now it generates explanations in English, and it would help me in the development if everybody reported problematic declarations (so I can debug the code).
It also understands many other constructs (but not the preprocessor, the switch statement, goto, enums and a few other things).
You can try something like:

int array_of_ints[10][10];
_Static_assert(sizeof (array_of_ints)  == sizeof(int)*10*10, "'array_of_ints' doesn't " "occupy" " 400 bytes!");
_Static_assert(sizeof(char)  == 1, "'char' is " "not" " equal to 1!");
_Static_assert(sizeof(short) == 2,
	"'short'"
	" is not"
	" equal to 2!");
_Static_assert(sizeof(double) == 8, "'double' is not equal to 8!");
_Static_assert((1<<2)+1 == 2*2+1, "5" " != "
	"5"
	" ...");

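This shows sizeof, arrays and string literal concatenation (the asserted sizes assume a platform with 4-byte int, 2-byte short and 8-byte double). Or: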

typedef int INTEGER;
INTEGER int1 = 100;
float FloatVal = 0.34e-1;
float FloatHex = 0x1.99999ap-4;
typedef char * CHAR_PTR;
CHAR_PTR pChar;
int fnproto(int a, char *c, float f);
const unsigned int * restrict const * restrict const flt;

That shows constants (decimal and hexadecimal floating point) and qualified declarations. Or a function definition:

int fn(int a, int b)
{
    typedef int MyInt;
    MyInt c = (MyInt){2};
    int d = c + 1, z = d-1+array_of_ints[1][1];
    _Static_assert((1<<2)+1 == 2*2+1, "5 != 5 ...");

    if (flt == 0)
    {
        c = (MyInt)(void *)(c + d--) >> 2;
    }
    else
    {
        flt = 0;
    }

    do
    {
        c = c+1;
    }
    while (c<100);

    while (++c > c-a)
    {
        c = c - 1;
        c += 2;
        d |= c;
    }

    int res = (c + a) * b - 2  != 0 ? a >> (int)((3 & 1)) : (int)((b - a) % 4);
    return (c + a) * b - 2  != 0 ? a + (3 & 1) : (b - a) % 4;
    c = 3;
}

This will trigger a warning for unreachable code (the assignment c = 3; after the return statement).
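
One simple way a check like this can work is to scan a block's statement list and report the first statement that follows a return. The sketch below is only an illustration under that assumption and is not taken from fcc; StmtKind and first_unreachable are hypothetical names, and a real compiler would also have to handle nested blocks, loops and branches.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical statement kinds for a flat block (no nesting). */
typedef enum { STMT_DECL, STMT_EXPR, STMT_RETURN } StmtKind;

/* Returns the index of the first statement that can never execute
   because a return precedes it in the same block, or -1 if none. */
int first_unreachable(const StmtKind *body, size_t n)
{
    size_t i;

    assert(body != NULL || n == 0);

    /* Only a return that is not the last statement leaves dead code behind. */
    for (i = 0; i + 1 < n; i++)
        if (body[i] == STMT_RETURN)
            return (int)(i + 1);
    return -1;
}
```

For the tail of fn above (a declaration, a return, then an assignment), the scan reports index 2, the position of the assignment.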

frankie · September 25, 2017, 12:21:42 AM

The advanced demo (see above) shows the capabilities of fcc.

The preprocessor and enums are not handled yet; they will come later. For now, the core.

jack · September 25, 2017, 03:03:18 AM

thank you frankie

Jokaste · October 28, 2017, 05:11:26 PM

When will you make the same program for assembler?

frankie · October 28, 2017, 05:45:11 PM

Quote from: Jokaste on October 28, 2017, 05:11:26 PM
When will you make the same program for assembler?

I'm working on it.
A compiler is a complicated thing. I can't even think of producing assembler output before the preprocessor is working, and I'm dealing with that now.
As soon as I have something, I'll publish it.

Thanks for trusting me.

Jokaste · October 28, 2017, 07:05:49 PM

It can help you...

Quote
Sched
=====
Sched takes specially commented source code and reoutputs the code in an order
that can be more optimally executed by an in-order but superscalar or
pipelined CPU.
Why would I do this?
====================
Most modern CPUs of relevance have out-of-order execution engines, that will
automatically schedule instructions to extract better parallelism anyways. So
what is the point?
1) Well, as a side effect, sched exposes the critical path dependencies in a
kind of obvious way. This will allow you to see where you need to apply
code tweaks to improve your performance as opposed to throwing out red
herrings at you.
2) Different OOE architectures vary in quality. Certainly none of them that I
have encountered can perform perfect reordering. If you do the reordering
yourself, then there is no issue about the quality of OOE engine for your
CPU.
3) Itanium and Sparc are in-order processors. So are a lot of really
low-end processors and DSPs. So this is still useful for some platforms.
Ok how do I make it work?
=========================
Each line of code in the input corresponds to "one cycle" (yes, this is
dramatically simplified from reality, but it works for integer code on most
CPUs.) In each line of your code you need a tail comment which describes the
resources that are read and written. Then simply feed the input to sched
and it will output the code reordered.
The tail comment includes a resource description in the following format:
@ wr(...) rd (...)
The ellipses (...) are replaced by the sequence of comma separated resources
either written or read, respectively. For example:
mov eax, ebx ; @ wr(eax), rd(ebx)
add eax, ecx ; @ wr(eax), rd(eax,ecx)
shl eax, 2 ; @ wr(eax), rd(eax)
mov edx, ebx ; @ wr(edx), rd(ebx)
add edx, ecx ; @ wr(edx), rd(edx,ecx)
shl edx, 2 ; @ wr(edx), rd(edx)
After run through sched will give the following:
mov eax, ebx ; @ wr(eax), rd(ebx)
mov edx, ebx ; @ wr(edx), rd(ebx)
; /* cycle 1 */
add eax, ecx ; @ wr(eax), rd(eax,ecx)
add edx, ecx ; @ wr(edx), rd(edx,ecx)
; /* cycle 2 */
shl eax, 2 ; @ wr(eax), rd(eax)
shl edx, 2 ; @ wr(edx), rd(edx)
; /* cycle 3 */
Note that the "resources" are just arbitrary (case sensitive) strings, and no
interpretation of the source code is done. So obviously you could write some
C code, and assume a 3-operand CPU model:
_t0 = a + b; /* @ wr(_t0), rd(a,b) */
_t1 = c + d; /* @ wr(_t1), rd(c,d) */
a = _t0 + _t1; /* @ wr(a), rd(_t1,_t0) */
_t2 = a | b; /* @ wr(_t2), rd(a,b) */
_t3 = c | d; /* @ wr(_t3), rd(c,d) */
b = _t3 | _t2; /* @ wr(b), rd(_t3,_t2) */
_t4 = a ^ b; /* @ wr(_t4), rd(a,b) */
_t5 = c ^ d; /* @ wr(_t5), rd(c,d) */
c = _t5 ^ _t4; /* @ wr(c), rd(_t5,_t4) */
which outputs:
_t0 = a + b; /* @ wr(_t0), rd(a,b) */
_t1 = c + d; /* @ wr(_t1), rd(c,d) */
_t3 = c | d; /* @ wr(_t3), rd(c,d) */
_t5 = c ^ d; /* @ wr(_t5), rd(c,d) */
; /* cycle 1 */
a = _t0 + _t1; /* @ wr(a), rd(_t1,_t0) */
; /* cycle 2 */
_t2 = a | b; /* @ wr(_t2), rd(a,b) */
; /* cycle 3 */
b = _t3 | _t2; /* @ wr(b), rd(_t3,_t2) */
; /* cycle 4 */
_t4 = a ^ b; /* @ wr(_t4), rd(a,b) */
; /* cycle 5 */
c = _t5 ^ _t4; /* @ wr(c), rd(_t5,_t4) */
; /* cycle 6 */
Woohoo!
This looks like a hack I wrote in a couple hours
================================================
It is. But it uses modules that I didn't have in my toolchest until
relatively recently. I was asked by someone at a CPU vendor company to build
such a tool for an old CPU whose OOE capabilities were relatively modest. The
problem is that the C language by itself is piss poor at string, and ADT
handling using its native libraries, and at the time I didn't quite
understand that solving the most general problem helps solve the specific
problem.
Over time I have been writing simple components to solve standard problems in
C with the same ease in which I can solve problems in higher level languages.
I have also seen enough CPU architectures now, that it has dawned on me that
assuming you have infinite number of pipelines allows you to see all the
parallelism inherent in the software source itself.
It should also be said, that to solve the problem exactly for a finite
pipelined machine (with other strange restrictions like slotting or random
scheduling when faced with too much parallelism, as a couple x86 CPU
micro-architectures are limited by) is extremely difficult, as it requires
a full search.
This is the result, now that I know the right way to do it.
Is this free?
=============
Well, I am covering it with the BSD licence. So it's very free, but not quite
public domain -- just don't go taking credit for my work.
--
Paul Hsieh
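
The scheduling idea in the quoted README (one line per cycle, infinite pipelines, ordering driven only by the wr/rd resource lists) can be sketched in C as follows. This is not the Sched source: the Line struct, the bitmask encoding of resources and the function assign_cycles are assumptions made for illustration. It also assumes that any conflict on a resource (read after write, write after write, write after read) pushes the later line into a later cycle, which is a conservative guess about how Sched treats such cases. The sketch only assigns a cycle to each line; printing the lines grouped by cycle is left out.

```c
#include <assert.h>
#include <stddef.h>

/* One source line, with its resources encoded as bitmasks
   (one bit per resource name; the encoding is an assumption). */
typedef struct {
    unsigned wr;   /* resources written by the line */
    unsigned rd;   /* resources read by the line */
} Line;

/* Stores in cycle[i] the earliest 1-based cycle in which line i can run,
   assuming unlimited parallel pipelines. Returns the number of cycles used. */
int assign_cycles(const Line *lines, size_t n, int *cycle)
{
    int last = 0;
    size_t i, j;

    assert((lines != NULL && cycle != NULL) || n == 0);

    for (i = 0; i < n; i++) {
        int c = 1;
        for (j = 0; j < i; j++) {
            int conflict = (lines[j].wr & lines[i].rd) != 0    /* read after write  */
                        || (lines[j].wr & lines[i].wr) != 0    /* write after write */
                        || (lines[j].rd & lines[i].wr) != 0;   /* write after read  */
            /* A dependent line must run at least one cycle after the
               line it depends on. */
            if (conflict && cycle[j] + 1 > c)
                c = cycle[j] + 1;
        }
        cycle[i] = c;
        if (c > last)
            last = c;
    }
    return last;
}
```

Encoding eax, ebx, ecx and edx as bits 1, 2, 4 and 8, the six x86 lines of the first README example get cycles 1, 2, 3, 1, 2, 3, which matches the grouping shown in the quoted output.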

frankie · October 28, 2017, 08:53:01 PM

Quote from: Jokaste on October 28, 2017, 07:05:49 PM
It can help you...

I'll have a look.
Thanks
