Another optimizer bug (with strlen)

Started by jullien, August 12, 2014, 05:21:16 PM


jullien

Found in RC6 (I don't recall this bug happening in RC5).

With the following sample, the last strlen call should compute the length of "foo-bar" (i.e. 7), but it always returns the length of INIT (2 with the current value).
It happens at least with the x64 version using -Ox (-O1 is OK).

c:\PellesC\bin\cc -Ze -Zx -Gz -W0 -Tamd64-coff -Ox -Ob2 foo.c

#include <stdio.h>
#include <string.h>

#define INIT   "xx"

void
test( const char *name )
{
   int len;
   char testfile[512];
   char c;

   (void)strcpy( testfile, INIT );

   c = testfile[strlen(testfile)-1];

   if( c != '/' && c != '\\' ) {
      (void)strcat( testfile, "/" );
   }

   (void)strcpy(testfile, name);
   printf("<%s>\n", testfile);
   len = (int)strlen( testfile ); // always size of INIT (i.e. 2 using "xx")
   printf("len=%d <%s>\n", len, testfile);
}

int
main()
{
   test("foo-bar");
}

Christian

neo313

Here is a simplified version; compile it with speed optimizations:


#include <stdio.h>
#include <string.h>

//#pragma function( strlen )

int main( void )
{
    char testfile[512] = { 0 };

    strcpy( testfile, "xy" );
    printf( "%d, should be 2\n", (int)strlen( testfile ) );
    strcpy( testfile, "this is a test" );
    printf( "%d, should be 14\n", (int)strlen( testfile ) );

    return 0;
}


This only happens with speed optimizations; size optimization is not affected.
Removing the first call to strlen removes the bug.
Looking at the assembly, the old strlen result stored in a register is reused instead of the length being computed again.

After trying the intrinsic options, I found the culprit to be strlen. If you enable speed optimizations and disable the intrinsic for strlen, the bug goes away.
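To check whether a given build is affected, one option is to compare strlen against a hand-written length loop over the same overwrite sequence. This is only a sketch: plain_len and count_stale_strlen are made-up names, and the volatile pointer is my assumption about how to keep the optimizer from reusing an earlier result.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hand-written length loop (hypothetical helper). Reading through a
   volatile pointer forces a real load of every byte from memory. */
size_t plain_len(const char *s)
{
    const volatile char *p = s;
    size_t n = 0;

    assert(s != NULL);
    while (p[n] != '\0')
        n++;
    return n;
}

/* Repeats the overwrite sequence from the simplified sample and counts
   how often strlen disagrees with plain_len. 0 means no stale result. */
int count_stale_strlen(void)
{
    char buf[512] = { 0 };
    int mismatches = 0;

    strcpy(buf, "xy");
    if (strlen(buf) != plain_len(buf))
        mismatches++;

    strcpy(buf, "this is a test");
    if (strlen(buf) != plain_len(buf))
        mismatches++;

    return mismatches;
}
```

On an unaffected build count_stale_strlen() should return 0. A nonzero result with speed optimizations enabled would point to the stale strlen value described above.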

Superlokkus

I found this too, with 8.00.19, i.e. RC#4 and RC#6. I have a text file with many lines, but with any speed optimization strlen always returns the same value.
I attached a screenshot showing the difference between the debugger result and the program output.

frankie

This bug has surely been present since version 7.00.
The optimizer that inlines the strlen code is too ... optimistic  >:( and doesn't take into account any modification made to the string.
"It is better to be hated for what you are than to be loved for what you are not." - Andre Gide

frankie

A workaround that seems to work and still allows code optimization is:
#pragma function(strlen)
Using this pragma in all affected modules forces a real strlen function call and avoids the optimization of the inlined strlen code (which is normally removed and replaced by the result of the very first strlen computation).
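A minimal sketch of how the pragma could sit in an affected module. The __POCC__ guard is my addition, so that other compilers don't complain about an unknown pragma, and length_after_overwrite is a made-up function used only for illustration.

```c
#include <assert.h>
#include <string.h>

/* Assumption: __POCC__ is the macro predefined by Pelles C. The guard
   keeps the pragma away from compilers that do not know it. */
#ifdef __POCC__
#pragma function(strlen)   /* call the library strlen, do not inline it */
#endif

/* Hypothetical helper: copies two strings into the same buffer in turn.
   Stores the length measured after the first copy in *first_len and
   returns the length measured after the second copy. */
size_t length_after_overwrite(const char *first, const char *second,
                              size_t *first_len)
{
    char buf[512];

    assert(first != NULL && second != NULL && first_len != NULL);
    assert(strlen(first) < sizeof buf && strlen(second) < sizeof buf);

    strcpy(buf, first);
    *first_len = strlen(buf);

    strcpy(buf, second);
    return strlen(buf);    /* must reflect the second string */
}
```

Called with "xy" and "this is a test", this should give 2 and 14. The pragma goes after the include here, as in the simplified sample earlier in the thread.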

neo313

Quote from: frankie on August 28, 2014, 12:44:25 PM
A workaround that seems to work and still allows code optimization is:
#pragma function(strlen)
Using this pragma in all affected modules forces a real strlen function call and avoids the optimization of the inlined strlen code (which is normally removed and replaced by the result of the very first strlen computation).

I just remembered we have this problem.

I have put #pragma function(strlen) at the start of Pelles C's string.h header (\PellesC\Include). This solves the entire problem, since no one else is using the header. Or is there a better place for it?


frankie

Since strlen is not the only function that causes problems when inlined, maybe the best thing to do is to use the compiler's /Ob0 option to disable all inlining...  ;)

neo313

Quote from: frankie on October 14, 2014, 09:04:07 PM
Since strlen is not the only function that causes problems when inlined, maybe the best thing to do is to use the compiler's /Ob0 option to disable all inlining...  ;)

Which are the other functions that we know are affected?

frankie

Potentially all inlined functions.
The bug is not the inlining itself, but the optimization performed after the inlining.
During optimization the compiler often misses changes to variables, removing the whole function and assuming that the return value is the same as that of the previous call. E.g. if you have a loop comparing strings, after optimization of the inlined code the compiler may use the result of the very first comparison for the whole loop...  :(
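To make the loop case concrete, here is a hypothetical pattern of the kind that would be at risk. It is not a confirmed reproduction: count_matches and the scratch buffer are invented for illustration, and whether the optimizer actually mishandles strcmp this way has not been verified in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical example: counts how many entries equal key. Every entry
   is copied into the same scratch buffer, so the comparison has to
   re-read the buffer on each pass through the loop. */
size_t count_matches(const char *const *items, size_t n, const char *key)
{
    char scratch[64];
    size_t hits = 0;
    size_t i;

    assert(items != NULL && key != NULL);

    for (i = 0; i < n; i++) {
        /* Bounded copy; longer entries are truncated to fit. */
        strncpy(scratch, items[i], sizeof scratch - 1);
        scratch[sizeof scratch - 1] = '\0';

        if (strcmp(scratch, key) == 0)
            hits++;
    }
    return hits;
}
```

If the result of the first comparison were reused for every iteration, the function would return either 0 or n instead of the real number of matches, which makes the failure easy to spot in a test.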