Hi,
I was playing around with the compiler optimization options and I stumbled across this.
int main(void)
{
unsigned char a;
a = 1;
a = a + 3;
return 1;
}
Using "pocc /Ox /Tx86-asm" the generated assembly output is:
mov cl,1
movzx eax,cl
add eax,3
mov ecx,eax
mov eax,1
@1:
ret
Is there a particular reason why this is not simply done as
mov cl,1
add cl, 3
Or MOV CL, 4
?
Quote from: "kobold"Or MOV CL, 4
?
I agree. However for that matter you can as well completely remove the code, as the variable "a" is never used :shock: . But that's advanced optimization and flow analysis, which I don't consider that important.
I found that the compiler does many unnecessary loads when dealing with register variables. Another example:
int b;
int main(void)
{
int *a;
a = &b;
b = *a;
return 1;
}
When optimization is on it generates:
mov eax,(_b)
mov ecx,eax
mov eax,dword [ecx]
mov dword [(_b)],eax
Clearly the better pick would be
mov ecx,(_b)
mov eax,dword [ecx]
mov dword [(_b)],eax
First case: The original LCC compiler from Princeton is basically designed to be retargetable, generating "acceptable" code, and having a mangeable source code tree. The front end will describe the (generic) machine code with as few "operations" as possible - to help the retargeting process. As with any C compiler, types smaller than 'int' will be promoted to 'int'. "Sizing down" types in the back end using the few available "operations" is doable up to a point, but not in all cases (and all cases shouldn't be handled). In C, the size of an 'int' is supposed to map to the most 'natural' and 'efficent' type on a certain machine. For this reason, most sensible C programmers only use 'char' and 'short' when they really must (in machine structures, and the like).
Any type of flow control isn't implemented in LCC. The register allocator is simple, maybe best suited for RISC processors. On a crappy architecture like X86, with far too few general registers, it will sometimes get lost. Adding flow control analyzis, a better register allocator, and many other optimizations, would probably require a staff of several full-time developers - which of course is completely unacceptable for a project like this...
Second case: I don't know how you got this result - I can't reproduce it...
Pelle
I agree with all your remarks. Also I want to point out that I'm perfectly happy with the performance of PelleC as is, I simply point out things that I think are of interest. I also don't think flow analysis is of much beneft as clean and good programming practice makes up for most of it.
So back to our first example.
int main(void)
{
unsigned char a;
a = 1;
a = a + 3;
return 1;
}
When compiling without optimization "pocc.exe /Tx86-asm", I get the following code:
mov byte [ebp+(-1)],1
add byte [ebp+(-1)],3
As you can see you don't promote the char to int in this situation and the code is pretty much the best possible.
So I'm just wondering why you promote in one case and not the other ? (BTW: I'm just wondering,not critizing as I almost never use optimization. Keeps my debugging easy.)
ds
This case is trivial because a constant can be "sized down" to a byte easily, but when you are optimizing and assigning values to *registers*, promotions come into play much more.
As I said before, it's possible to fix this up to a point. I have already spent several month on this. In the next version, your example will look something like this:
...
; unsigned char a;
; a = 1;
B101 mov cl,1
; a = a + 3;
80C103 add cl,3
...
It's a matter of adding more operation "patterns". The problem is that it's getting hard when you start combining simple operations into longer sequences. A single binary operation, like add (char1 + char2) can be handled, but two or more such operations are at least *very* hard. Adding char1 + char2 + char3, is one addition with an intermediate result, then this is added with char3 etc.
Another problem with adding more "patterns" is that the pattern-matching code in LCC will quickly bloat. With the current changes, the compiler is some 40-50% bigger. The "size of a diskette" is not a factor anymore, but I don't want the compiler to get too big either...
Here are some more examples:
; a = b = c = 1;
C645FF01 mov byte ptr [ebp-1],1
B301 mov bl,1
B101 mov cl,1
; a = b + c;
88D9 mov cl,bl
024DFF add cl,byte ptr [ebp-1]
Here it starts to look ugly again...
; a = a + b + c;
0FB6C1 movzx eax,cl
0FB6D3 movzx edx,bl
01D0 add eax,edx
0FB655FF movzx edx,byte ptr [ebp-1]
01D0 add eax,edx
88C1 mov cl,al
Pelle