Here's a small issue I ran into while using the Pelles C compiler.
This is a simple piece of code that computes the quotient and remainder of two unsigned integers:
#include <stddef.h> /* size_t */

/* Result of unsigned integer division. */
typedef struct st_stdiv_t {
    size_t quot; /* Quotient. */
    size_t rem;  /* Remainder. */
} stdiv_t;

stdiv_t stdiv(size_t numerator, size_t denominator)
{
    stdiv_t result;
    /* Ideally the compiler merges the following two operations into a single division instruction. */
    result.quot = numerator / denominator;
    result.rem = numerator % denominator;
    return result;
}
Now we paste this snippet into https://godbolt.org/ and choose x86-64 gcc.
We enable optimization level 3 with -O3 and look at the disassembly in the right-hand pane:
stdiv:
mov rax, rdi
xor edx, edx
div rsi
ret
We can see that gcc has merged the / and % operations into a single div instruction.
I also tested the above code with MSVC and clang. Both of them perform the same optimization.
Now let's see how the Pelles C compiler does. In the Pelles C IDE we open Project->Project option->Compiler->Code generation and select "maximize speed" and "Extra optimizations". The generated code is:
mov eax
xor edx
div ebx
mov esi
mov eax
xor edx
div ebx
mov eax
Unfortunately, the Pelles C compiler does not merge the two operations into a single div instruction.
The x86 div instruction divides the register pair edx:eax by its operand and stores the quotient in eax and the remainder in edx, so one execution produces both results.
This means that a single div instruction is enough instead of two, and the result can be returned directly in registers, which improves performance.
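Until the compiler does this on its own, one possible source-level workaround is to divide once and derive the remainder from the quotient, using the identity rem = numerator - quot * denominator. The following is only a sketch: stdiv_single is a hypothetical name that is not part of the original code, and whether Pelles C then emits a single div is an assumption that should be checked in its disassembly.

```c
#include <assert.h>
#include <stddef.h>

/* Result of unsigned integer division (same layout as in the post). */
typedef struct st_stdiv_t {
    size_t quot; /* Quotient. */
    size_t rem;  /* Remainder. */
} stdiv_t;

/* Hypothetical variant of stdiv: one explicit division, remainder derived
   by multiply and subtract instead of a second % operation. */
stdiv_t stdiv_single(size_t numerator, size_t denominator)
{
    stdiv_t result;
    assert(denominator != 0); /* Division by zero is undefined behavior. */
    result.quot = numerator / denominator;
    /* quot * denominator <= numerator, so this cannot wrap around. */
    result.rem = numerator - result.quot * denominator;
    return result;
}
```

This trades the second division for a multiplication and a subtraction, which are typically much cheaper than a division on x86.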
Could the back end of the Pelles C compiler be changed to support this?
It would reduce code size and speed up execution, since a division is one of the slower integer instructions.
PS: glibc (the GNU C library) implements its div function in the same way, so if we compile glibc with Pelles C we will not get the benefit of this optimization there either.
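For reference, this is how the standard ldiv function from <stdlib.h> is used from the caller's side. The helper name split_seconds is made up for illustration. The caller gets both results from one call, but how many div instructions run inside the call depends on how the library itself was compiled.

```c
#include <stdlib.h>

/* Hypothetical helper: split a number of seconds into minutes (quot)
   and leftover seconds (rem) with a single call to the standard ldiv(). */
ldiv_t split_seconds(long total_seconds)
{
    return ldiv(total_seconds, 60L);
}
```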