The problem turned out to be in the spiller/rewriter.
The way a pointed-to value is fetched from a post-incremented pointer, like:
c = *str++
is turned into something like this internally (LCSE=Local Common SubExpression):
LCSE-temp = *str
str = str + 1
c = LCSE-temp
This construct reduces the internal set of operators, which is a good thing, but needs careful handling all over.
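As a rough illustration of that lowering, the sketch below writes the same string copy twice in plain C: once with `*str++` and once with the three steps spelled out by hand. The function names and the `lcse_temp` variable are made up for this example and are not the compiler's internal representation.

```c
#include <stddef.h>

/* Source form: fetch through a post-incremented pointer. */
size_t copy_direct(char *dst, const char *str)
{
    size_t n = 0;
    char c;
    while ((c = *str++) != '\0')
        dst[n++] = c;
    dst[n] = '\0';
    return n;
}

/* Hand-lowered form; lcse_temp is a hypothetical stand-in for the
   compiler-generated temporary. */
size_t copy_lowered(char *dst, const char *str)
{
    size_t n = 0;
    char c;
    for (;;) {
        char lcse_temp = *str;  /* LCSE-temp = *str  */
        str = str + 1;          /* str = str + 1     */
        c = lcse_temp;          /* c = LCSE-temp     */
        if (c == '\0')
            break;
        dst[n++] = c;
    }
    dst[n] = '\0';
    return n;
}
```

Both functions should produce the same copy and the same length; the lowered form only needs load, add, and assignment.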
There was a logical error in that a GCSE (Global Common SubExpression) like:
GCSE-temp = <value>
(other code)
... first use of GCSE-temp
(other code)
... second use of GCSE-temp etc.
can be rewritten as:
(other code)
... first use of <value>
(other code)
... second use of <value>
This always works for a GCSE-temp, but not always for an LCSE-temp (due to the special construct above): if *str is substituted for LCSE-temp after str has been incremented, the wrong element is read.
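To make the failure mode concrete, here is a small C sketch of what goes wrong when a temporary from the post-increment construct is replaced by its defining expression. It only mimics the effect at source level; the function names and `lcse_temp` are invented for the example, and the actual spiller/rewriter code is not shown here.

```c
/* Correct: the temporary holds the value read before the increment. */
char fetch_with_temp(const char **pstr)
{
    const char *str = *pstr;
    char lcse_temp = *str;  /* LCSE-temp = *str */
    str = str + 1;          /* str = str + 1    */
    char c = lcse_temp;     /* c = LCSE-temp    */
    *pstr = str;
    return c;
}

/* Incorrect rewrite: the temporary is dropped and its use is replaced
   by *str, which is now evaluated after the increment. */
char fetch_substituted(const char **pstr)
{
    const char *str = *pstr;
    str = str + 1;          /* str = str + 1          */
    char c = *str;          /* c = *str (one too far) */
    *pstr = str;
    return c;
}
```

With the pointer at the start of "ab", the first function returns 'a' and the second returns 'b'. A GCSE-temp does not have this problem, as described above.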
The spiller/rewriter is rarely touched (the last time was apparently in 2010), so this logical error has remained undetected for a long time. I guess it needed a register-starved architecture like X86 to trigger (most people are presumably using X64 these days).
It took a long time to track down the optimizer problem, mainly because the bug disappears when compiling with debugging info:
- Full debugging info can at best be used to find logical errors, but since most optimizer passes that move code around are disabled in this mode, there is little chance of finding code generator problems.
- Line number debugging info will not disable any optimizer passes, but line numbers can still get lost (complicating debugging) because it can be hard to explain sequences like this:
line #1
line #8
line #3
line #9
line #2
The common concept is to relate to the source file, but how to do that without confusing everyone involved?