I was just taking a look at the code Pelles C generates and made a few small observations.
Just a note, these are only my opinions.
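Boolean return values are tested with a 'cmp ax, 0' instruction. This could easily be cut down in size: 'test ax, ax' sets the zero flag the same way and its encoding is one byte shorter (3 bytes instead of 4 in 32-bit code). It is never slower, although on most processors the two run at about the same speed, so the gain is mainly code size.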
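To put a number on that, here's a rough C sketch of the two encodings. The byte values are written out by hand for 32-bit mode, not dumped from Pelles output, and the helper names (bytes_saved_by_test, check_test_matches_cmp) are made up for illustration. It also checks that both forms leave the zero flag in the same state for every 16-bit value.

```c
#include <assert.h>
#include <stddef.h>

/* Hand-written 32-bit-mode encodings (an assumption: not taken from
   Pelles output; an assembler may also pick 66 3D 00 00 for the cmp,
   which is 4 bytes as well). */
static const unsigned char cmp_ax_0[]   = { 0x66, 0x83, 0xF8, 0x00 }; /* cmp ax, 0   */
static const unsigned char test_ax_ax[] = { 0x66, 0x85, 0xC0 };       /* test ax, ax */

/* Zero flag as 'cmp ax, 0' leaves it: set when ax - 0 is zero. */
static int zf_after_cmp(unsigned short ax)
{
    return (unsigned short)(ax - 0) == 0;
}

/* Zero flag as 'test ax, ax' leaves it: set when ax AND ax is zero. */
static int zf_after_test(unsigned short ax)
{
    return (unsigned short)(ax & ax) == 0;
}

/* Hypothetical helper: bytes saved each time test replaces cmp. */
size_t bytes_saved_by_test(void)
{
    return sizeof cmp_ax_0 - sizeof test_ax_ax;
}

/* Hypothetical self-check: both forms agree on ZF for all 16-bit values. */
void check_test_matches_cmp(void)
{
    unsigned long v;
    for (v = 0; v <= 0xFFFFul; ++v)
        assert(zf_after_cmp((unsigned short)v) == zf_after_test((unsigned short)v));
}
```

Only the zero flag is modelled here, since that's all a boolean test followed by je/jne looks at.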
When leaving a procedure that uses local variables, the stack and base pointers are restored with the following sequence:
mov esp, ebp
pop ebp
There's an instruction made just for this: 'leave' does exactly the same thing as the two instructions above, but it takes 1 byte instead of 3 and the processor only has to decode one instruction. It isn't necessarily faster (that depends on the processor, and the two usually perform about the same), so the gain is mainly code size.
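Here's the same kind of rough C sketch for the epilogue. Again, the bytes are hand-written 32-bit encodings rather than actual Pelles output, and toy_cpu, epilogue_effect and bytes_saved_by_leave are names I made up. The toy model indexes the stack in 4-byte slots instead of byte addresses just to keep it short.

```c
#include <assert.h>
#include <stddef.h>

/* Hand-written 32-bit-mode encodings (an assumption: 'mov esp, ebp'
   can also be encoded as 89 EC, which is 2 bytes as well). */
static const unsigned char epilogue_mov_pop[] = { 0x8B, 0xE5,   /* mov esp, ebp */
                                                  0x5D };       /* pop ebp      */
static const unsigned char epilogue_leave[]   = { 0xC9 };       /* leave        */

/* Toy machine state; esp and ebp are slot indexes, not byte addresses. */
struct toy_cpu {
    unsigned esp, ebp;
    unsigned stack[8];
};

/* What either epilogue does: drop the locals, then restore the
   caller's base pointer from the slot ebp points at. */
void epilogue_effect(struct toy_cpu *c)
{
    c->esp = c->ebp;            /* mov esp, ebp (first half of leave) */
    c->ebp = c->stack[c->esp];  /* pop ebp: read the saved value ...  */
    c->esp += 1;                /* ... and release its slot           */
}

/* Hypothetical helper: bytes saved per procedure exit. */
size_t bytes_saved_by_leave(void)
{
    return sizeof epilogue_mov_pop - sizeof epilogue_leave;
}
```

For example, with the saved base pointer (7) sitting in slot 5, ebp = 5 and esp = 2 (three slots of locals), the epilogue leaves esp = 6 and ebp = 7.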
Is there any reason the compiler doesn't generate these shorter forms? I may be wrong about this, or it may just be something inherited from LCC rather than specific to Pelles.