Pelles C forum
Pelles C => Bug reports => Topic started by: jonathan on January 31, 2019, 11:50:57 PM
-
I need to have arrays and variables aligned on a 32-byte boundary for the AVX packed instructions. I use the following when I declare my variables:
float _Alignas(32) ionea[8]={0,0,0,0,0,0,0,0};
float _Alignas(32) qonea[8]={0,0,0,0,0,0,0,0};
It does not work reliably. Sometimes they are aligned, other times they are not. What's going on?
-
A simple test:
int __cdecl main(void)
{
// float _Alignas(32) ionea[8]={0,0,0,0,0,0,0,0};
// float _Alignas(32) qonea[8]={0,0,0,0,0,0,0,0};
float __declspec(align(32)) ionea[8]={0,0,0,0,0,0,0,0};
float __declspec(align(32)) qonea[8]={0,0,0,0,0,0,0,0};
printf("%p %p\n", (void *)ionea, (void *)qonea);
printf("%u %u\n", (unsigned)((unsigned long long)ionea & 31), (unsigned)((unsigned long long)qonea & 31));
return 0;
}
test_align.c(2): warning #2224: Alignment of 'stack' exceeds 16 bytes; ignored.
poccmain:
00000000 4883EC68 sub rsp, 68h
00000004 0F57C0 xorps xmm0, xmm0
00000007 F30F11442440 movss dword ptr [rsp+40h], xmm0
msvcmain:
00000000 4055 push rbp
00000002 4881EC80000000 sub rsp, 80h
00000009 488D6C2440 lea rbp, [rsp+40h]
0000000E 4883E5E0 and rbp, -20h
00000012 0F57C0 xorps xmm0, xmm0
00000015 F30F114520 movss dword ptr [rbp+20h], xmm0
clangmain:
00000000 55 push rbp
00000001 4883EC70 sub rsp, 70h
00000005 488D6C2470 lea rbp, [rsp+70h]
0000000A 4883E4E0 and rsp, -20h
0000000E 0F57C0 xorps xmm0, xmm0
00000011 0F29442450 movaps xmmword ptr [rsp+50h], xmm0
EDIT: fix missing float
-
Pelles C has no support for over-aligned stack variables on x64 (perhaps in the future, but probably not).
It's often possible to skip local stack variables like these and load a SIMD register directly by using one of the "initialize" intrinsics, like _mm_set_ps(), _mm256_set_ps(), _mm256_set1_ps(), etc.
...
__m256 whatever1 = _mm256_set_ps(0.0f, 0.0f, 0.0f, 0.0f, 0.0f, 0.0f, 0.0f, 0.0f);
...
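If the data really does have to live in a local array, one common workaround is to over-allocate the array and round the pointer up by hand. This is not something the compiler does for you and it was not suggested above; it is only a minimal sketch. The names align_up32 and fill_aligned are made up for illustration, and the slack calculation assumes a 4-byte float, as on x64.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: round a float pointer up to the next 32-byte boundary. */
static float *align_up32(float *p)
{
    uintptr_t addr = (uintptr_t)p;
    return (float *)((addr + 31u) & ~(uintptr_t)31u);
}

/* Fills 8 manually aligned floats with `value` and returns their sum. */
static float fill_aligned(float value)
{
    /* 8 floats of payload plus 7 floats (28 bytes) of slack. Assumption:
       float is 4 bytes and 4-byte aligned, so rounding up to a 32-byte
       boundary skips at most 28 bytes. */
    float raw[8 + 7];
    float *ionea = align_up32(raw);

    /* The rounded pointer is 32-byte aligned regardless of where raw landed. */
    assert(((uintptr_t)ionea & 31u) == 0);

    float sum = 0.0f;
    for (int i = 0; i < 8; i++) {
        ionea[i] = value;
        sum += ionea[i];
    }
    return sum;
}
```

The aligned pointer stays valid only as long as the raw array is in scope, so it should not be returned from the function that declares the array.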