I've been trying to see if I can build LuaJIT using Pelles C, and came across this bug with strcpy() when /Ot is used with pocc, in both x64 and x86 mode, which causes strcpy() to copy its given string to a location one byte before where it's supposed to.
The attached file contains a copy of the function where the error occurs, which in LuaJIT is supposed to modify a string like "%02X" in a buffer into "%02lX". Note that the attached file does not exhibit the error until lines 10 and/or 11 are un-commented, at which point the behavior appears.
The program should generate the following output:
String in buf is: "%l "
but when the bug manifests, the output is instead:
String in buf is: "l "
Hi,
You might want to give a few more details, which version of Pelles C, etc.
I can only partially reproduce your results using Pelles C 10.006
outputs
String in buf is: "%l " //10&11 commented
String in buf is: "%li " //10 uncommented
String in buf is: "%l " //10 commented 11 uncomented
String in buf is: "%li " //both uncommented
In this test
only when line 10 is uncommented results in a different output AND
it is
independent of optimization settings i.e. no optimizations produce same results as max speed.
Then too I'm unclear on the test because the example you gave of the purpose was to change
Quote from: mbrittsess on October 29, 2021, 02:22:15 PM
a string like "%02X" in a buffer into "%02lX"
which seems to expand the string by 1 character ?
Pelles C is also warning the buf size is too small . . .warning #2237: No room for null terminator from literal string in 'char [4]'. Which may not be the issue but is not a good thing.
If you use the Pelles ZIP feature you can zip the entire project and post it. Might be helpful .
John Z
Update:
Also the issue happens AFTER strcpy when trying to stuff the buffer so it does not seem to be a strcpy issue itself.
I'm not sure it is a bug at all at least in version 10.006
Line 10 stuffs an "i", line 11 stuffs a NULL. looks right to me.
As this is an optimization case, strcpy isn't reason for it.1: #include <stdio.h>
2: #include <string.h>
3: #include <stdlib.h>
4:
5: void addintlen( char *form )
addintlen:
6: {
7: size_t l = strlen(form);
[0000000000000000] 48C7C0FFFFFFFF mov rax,FFFFFFFFFFFFFFFF
[0000000000000007] 660F1F840000000000 nop [rax+rax+0]
[0000000000000010] 4883C001 add rax,1
[0000000000000014] 803C0100 cmp byte ptr [rcx+rax],0
[0000000000000018] 75F6 jne 0000000000000010
8: char spec = form[ l - 1 ];
[000000000000001A] 8A5401FF mov dl,byte ptr [rcx+rax-1]
9: strcpy( form+l-1, "l" );
[000000000000001E] 4C8D4401FF lea r8,[rcx+rax-1]
[0000000000000023] 6641C740FF6C00 mov word ptr [r8-1],6C
10: form[ l + sizeof("l") - 2 ] = spec;
[000000000000002A] 881401 mov byte ptr [rcx+rax],dl
11: form[ l + sizeof("l") - 1 ] = '\0';
[000000000000002D] C644010100 mov byte ptr [rcx+rax+1],0
[0000000000000032] C3 ret
[0000000000000033] 0F1F4000 nop [rax+0]
[0000000000000037] 660F1F840000000000 nop [rax+rax+0]
12: };
msvc 2019addintlen:
00000000 48C7C0FFFFFFFF mov rax, -1h
00000007 48FFC0 inc rax
0000000A 803C0100 cmp byte ptr [rcx+rax*1], 0h
0000000E 75F7 jnz L_7
00000010 4803C8 add rcx, rax
00000013 BA6C000000 mov edx, 6Ch
00000018 0FB641FF movzx eax, byte ptr [rcx-1h]
0000001C 668951FF mov word ptr [rcx-1h], dx
00000020 8801 mov byte ptr [rcx], al
00000022 C6410100 mov byte ptr [rcx+1h], 0h
00000026 C3 ret
This issue can't be replicated in version 10.006. There must be another setting involved. I tried all optimization settings and the results show there is no output change from any optimization, again version 10.006.
So mbrittsess please post more details on the Pelles C version and post a zip of the project so that all settings can be accounted for. This is under Project - Zip Files.
Here is the output from optimization test
NO OPTIMIZATION
String in buf is: "%l" both commented
String in buf is: "%li" 10 uncommented
String in buf is: "%l" 11 uncommented
String in buf is: "%li" 10 & 11 uncommented
MAX SPEED
String in buf is: "%l" both commented
String in buf is: "%li" 10 uncommented
String in buf is: "%l" 11 uncommented
String in buf is: "%li" 10 & 11 uncommented
MINIMIZE SIZE
String in buf is: "%l" both commented
String in buf is: "%li" 10 uncommented
String in buf is: "%l" 11 uncommented
String in buf is: "%li" 10 & 11 uncommented
MAXIMIZE SPEED MORE
String in buf is: "%l" both commented
String in buf is: "%li" 10 uncommented
String in buf is: "%l" 11 uncommented
String in buf is: "%li" 10 & 11 uncommented
MINIMIZE SIZE MORE
String in buf is: "%l" both commented
String in buf is: "%li" 10 uncommented
String in buf is: "%l" 11 uncommented
String in buf is: "%li" 10 & 11 uncommented
whether right or wrong the output is not changing based on optimization settings.
John Z
I am not using Pelles C 10.006, I am using Pelles C 11.00
I am also not using any project file, I am invoking the binary tools directly from the cmdline.
At this point, how you choose to interpret what precisely this bug is, is irrelevant. The fact is, I will show that certain changes to code, which
should not result in any changes in output, nevertheless do so. Where
precisely the error lies is not a matter of concern to me, it suffices that I have identified the narrow area and steps for reproducing it.
The warnings about terminating nuls in buffers are irrelevant, I have constructed and initialized this array very specifically to set up this example.
Let me lay out the behavior exhibited by variants of this code very carefully. Here is the initial version of my build script, a very small one constructed specifically for showing this bug:
@echo off
set PC=C:\Program Files\PellesC
set PATH=%PATH%;%PC%\Bin
pocc /std:C17 /Tx64-coff /Ze /Zx /J /I "%PC%\Include" test.c
polink /MACHINE:X64 /SUBSYSTEM:CONSOLE "/LIBPATH:%PC%\Lib" "/LIBPATH:%PC%\Lib\Win64" test.obj
Note the
lack of the
/Ot flag when compiling.
And this is the code I am using, specifically to demonstrate this bug:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
void addintlen( char *form )
{
size_t l = strlen(form);
char spec = form[ l - 1 ];
strcpy( form+l-1, "l" );
//form[ l + sizeof("l") - 2 ] = spec;
//form[ l + sizeof("l") - 1 ] = '\0';
};
int main ( int argc, char *argv[ argc ] )
{
char buf[4] = "%i\0\0";
addintlen( buf );
fputs( "String in buf is: \"", stdout );
fwrite( buf, 1, 4, stdout );
fputs( "\"\n", stdout );
return EXIT_SUCCESS;
};
Note that the two given lines are commented-out.
I will now lay out several variants on compiler flags and source code modifications, and show how they change the output.
The
expected output is:
String in buf is: "%l "Now, under four scenarios, the output is:
- No changes: String in buf is: "%l "
- The line #pragma intrinsic( strcpy ) added to the source: String in buf is: "%l "
- The flag /Ot added to compiler options: String in buf is: "%l "
- The flat /Ot added to compiler options and line #pragma function( strcpy ) added to source: String in buf is: "%l "
Note that in this case, all outputs are the same.
Now, as before, but this time, those two commented-out lines are un-commented. The expected output for all of them would be:
String in buf is: "%li "Now, build and run with the same four scenarios as before:
- No changes: String in buf is: "%li "
- The line #pragma intrinsic( strcpy ) added to the source: String in buf is: "%li "
- The flag /Ot added to compiler options: String in buf is: "l "
- The flag /Ot added to compiler options and line #pragma function( strcpy ) added to source: String in buf is: "%li "
Note the change in that
third case, where speed optimization is enabled, and the compiler has not been forbidden from inlining
strcpy(). The output changes compared to other scenarios, when it should not. This mere fact is sufficient proof that a bug is present.
No need to be irritated. It is important to know the version to help troubleshooting.
The results on version 11 support the bug report. And show that there was a change
between version 10 and version 11 which ultimately can be helpful determining the cause.
Here are the same results with Version 11.
No Optimizations
String in buf is: "%l" both commented
String in buf is: "%li" 10 uncommented
String in buf is: "%l" 11 uncommented
String in buf is: "%li" 10 & 11 uncommented
MAXIMIZE SPEED
String in buf is: "%l" both commented
String in buf is: "l" 10 uncommented
String in buf is: "l" 11 uncommented
String in buf is: "l" 10 & 11 uncommented
MINIMIZE SIZE
String in buf is: "%l" both commented
String in buf is: "%li" 10 uncommented
String in buf is: "%l" 11 uncommented
String in buf is: "%li" 10 & 11 uncommented
MAXIMIZE SPEED + EXTRA
String in buf is: "%l" both commented
String in buf is: "l" 10 uncommented
String in buf is: "l" 11 uncommented
String in buf is: "l" 10 & 11 uncommented
MINIMIZE SIZE + EXTRA
String in buf is: "%l" both commented
String in buf is: "%li" 10 uncommented
String in buf is: "%l" 11 uncommented
String in buf is: "%li" 10 & 11 uncommented
John Z
"Well done" - I can find two bugs in addintlen():
1) For some reason (I can't say why yet), the alias analysis is unable to figure out that "spec = ..." can be propagated across "strcpy( form+l-1, "l" )".
This can be worked around for now by adding "volatile" to "spec", like so:
volatile char spec = form[ l - 1 ];
2) The "strcpy( form+l-1, "l" )" statement is simplified (because of the literal string), but with the address not being one of &object, &(object + constant), pointer, (pointer + constant), an -apparently- rarely used code path in the compiler gets confused and keeps a bogus offset of -1 (in the transformed address). There are numerous ways to work around this for now by rewriting this in a less convoluted way...