New submission from STINNER Victor <vstin...@redhat.com>:
On x86-64, clang -O3 compiles the following function:

PyCArgObject *
PyCArgObject_new(void)
{
    PyCArgObject *p;
    p = PyObject_New(PyCArgObject, &PyCArg_Type);
    if (p == NULL)
        return NULL;
    p->pffi_type = NULL;
    p->tag = '\0';
    p->obj = NULL;
    memset(&p->value, 0, sizeof(p->value));
    return p;
}

like that:

   0x00007fffe9c6acb0 <+0>:     push   rax
   0x00007fffe9c6acb1 <+1>:     mov    rdi,QWORD PTR [rip+0xe308]        # 0x7fffe9c78fc0
   0x00007fffe9c6acb8 <+8>:     call   0x7fffe9c5e8a0 <_PyObject_New@plt>
   0x00007fffe9c6acbd <+13>:    test   rax,rax
   0x00007fffe9c6acc0 <+16>:    je     0x7fffe9c6acdf <PyCArgObject_new+47>
   0x00007fffe9c6acc2 <+18>:    mov    QWORD PTR [rax+0x20],0x0
   0x00007fffe9c6acca <+26>:    mov    BYTE PTR [rax+0x28],0x0
   0x00007fffe9c6acce <+30>:    xorps  xmm0,xmm0
   0x00007fffe9c6acd1 <+33>:    movaps XMMWORD PTR [rax+0x30],xmm0
   0x00007fffe9c6acd5 <+37>:    mov    QWORD PTR [rax+0x40],0x0
   0x00007fffe9c6acdd <+45>:    pop    rcx
   0x00007fffe9c6acde <+46>:    ret
   0x00007fffe9c6acdf <+47>:    xor    eax,eax
   0x00007fffe9c6ace1 <+49>:    pop    rcx
   0x00007fffe9c6ace2 <+50>:    ret

The problem is that movaps requires the memory address to be aligned on 16 bytes, whereas PyObject_New() uses the pymalloc allocator (the requested size is 80 bytes, and pymalloc serves allocations up to 512 bytes), and pymalloc only guarantees alignment on 8 bytes.

If PyObject_New() returns an address that is not aligned on 16 bytes, PyCArgObject_new() crashes immediately with a segmentation fault (SIGSEGV).

CPython must be compiled using -fmax-type-align=8 to avoid this alignment crash.
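To make the failure condition concrete, here is a minimal sketch of the alignment arithmetic. The helper names are hypothetical and are not part of CPython. The 0x30 offset is the one movaps stores to in the disassembly above. Since 0x30 is a multiple of 16, the store is 16-byte aligned only when the object address itself is.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if addr is a multiple of alignment.
 * alignment must be a non-zero power of two. */
static int
is_aligned(uintptr_t addr, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (addr & (uintptr_t)(alignment - 1)) == 0;
}

/* Offset of the 16-byte store, taken from "[rax+0x30]" in the disassembly. */
#define VALUE_STORE_OFFSET 0x30

/* Hypothetical helper: given the address returned by the allocator,
 * tells whether the movaps store at base + 0x30 is 16-byte aligned. */
static int
value_field_is_16_aligned(uintptr_t base)
{
    return is_aligned(base + VALUE_STORE_OFFSET, 16);
}
```

For an object address ending in 0x8, which an allocator with 8-byte alignment is allowed to return, value_field_is_16_aligned() returns 0. That is the case where movaps faults.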
With -fmax-type-align=8, clang emits the expected machine code:

   0x00007fffe9caacb0 <+0>:     push   rax
   0x00007fffe9caacb1 <+1>:     mov    rdi,QWORD PTR [rip+0xe308]        # 0x7fffe9cb8fc0
   0x00007fffe9caacb8 <+8>:     call   0x7fffe9c9e8a0 <_PyObject_New@plt>
   0x00007fffe9caacbd <+13>:    test   rax,rax
   0x00007fffe9caacc0 <+16>:    je     0x7fffe9caacdf <PyCArgObject_new+47>
   0x00007fffe9caacc2 <+18>:    mov    QWORD PTR [rax+0x20],0x0
   0x00007fffe9caacca <+26>:    mov    BYTE PTR [rax+0x28],0x0
   0x00007fffe9caacce <+30>:    xorps  xmm0,xmm0
   0x00007fffe9caacd1 <+33>:    movups XMMWORD PTR [rax+0x30],xmm0
   0x00007fffe9caacd5 <+37>:    mov    QWORD PTR [rax+0x40],0x0
   0x00007fffe9caacdd <+45>:    pop    rcx
   0x00007fffe9caacde <+46>:    ret
   0x00007fffe9caacdf <+47>:    xor    eax,eax
   0x00007fffe9caace1 <+49>:    pop    rcx
   0x00007fffe9caace2 <+50>:    ret

The "movaps" instruction becomes a "movups" instruction: "a" stands for "aligned" in movaps, whereas "u" stands for "unaligned" in movups.

----------
components: Build
messages: 340087
nosy: vstinner
priority: normal
severity: normal
status: open
title: clang expects memory aligned on 16 bytes, but pymalloc aligns to 8 bytes
versions: Python 3.8

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue36618>
_______________________________________