https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112513
Bug ID: 112513
Summary: Misoptimization of argument
Product: gcc
Version: 12.2.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c++
Assignee: unassigned at gcc dot gnu.org
Reporter: alexander.gr...@tu-dresden.de
Target Milestone: ---

Created attachment 56569
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56569&action=edit
Preprocessed source

In the NVIDIA NCCL library (https://github.com/NVIDIA/nccl) I came across a SIGSEGV in __strncmp_sse42 that happens "sometimes" when the code is compiled with GCC 12 at -O2 and higher, but does not happen at lower optimization levels or with GCC 11.

The original stacktrace looks like this:

#0  0x00002aaaabd82e3a in __strcmp_sse42 () from /lib64/libc.so.6
#1  0x00002aab18d83a6e in xmlGetAttrIndex (index=<synthetic pointer>, attrName=0x2aab18e2820c "familyid", node=0x2aae9c108160) at graph/xml.h:67
#2  xmlSetAttrInt (value=143, attrName=0x2aab18e2820c "familyid", node=0x2aae9c108160) at graph/xml.h:167
#3  ncclTopoGetXmlFromCpu (cpuNode=cpuNode@entry=0x2aae9c108160, xml=xml@entry=0x2aae9c0d1f20) at graph/xml.cc:436

Moving the `strncmp(key, attrName, MAX_STR_LEN) == 0` comparison out into a separate function, so that the arguments are visible in the debugger, gives this backtrace:

#0  0x00002aaaabd83c00 in __strncmp_sse42 () from /lib64/libc.so.6
#1  0x00002aab18d75a9f in cmpFromXml (attrName=0x89300800 <error: Cannot access memory at address 0x89300800>, key=0x2aaeac107eb0 "numaid") at graph/xml.h:65
#2  xmlGetAttrIndex (index=<synthetic pointer>, attrName=0x89300800 <error: Cannot access memory at address 0x89300800>, node=0x2aaeac107db0) at graph/xml.h:73
#3  xmlSetAttrInt (node=node@entry=0x2aaeac107db0, attrName=attrName@entry=0x89300800 <error: Cannot access memory at address 0x89300800>, value=143) at graph/xml.h:174
#4  0x00002aab18d77de4 in ncclTopoGetXmlFromCpu (cpuNode=cpuNode@entry=0x2aaeac107db0, xml=xml@entry=0x2aaeac0d1b70) at graph/xml.cc:437

So it looks like the `attrName` parameter gets corrupted somehow.
The call site of `xmlSetAttrInt` is `NCCLCHECK(xmlSetAttrInt(cpuNode, "familyid", familyId));`, so that parameter is a string constant, and the same constant is already used earlier by `NCCLCHECK(xmlGetAttrIndex(cpuNode, "familyid", &index));`. I suspect the `index` parameter is involved.

Many modifications make the bug disappear, such as removing the `NCCLCHECK` macro (basically an `if(error) return error;` wrapper) or adding fprintf statements to `xmlGetAttrIndex` or `cmpFromXml`.

The compile command is `g++ -fPIC -fvisibility=hidden -std=c++11 -O2 -g -ggdb3 -c graph/xml.cc`; the preprocessed source is attached.

This needs minimization, but as the crash only happens when the code is compiled into a library used by a Python package from a script, I don't know how to do that. So I hope that there will be something obvious to someone familiar with the optimizations in GCC.
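
For readers who do not want to open the attachment, the call pattern described above can be sketched roughly as follows. This is a simplified illustration, not the NCCL source: the struct layout (`XmlNodeSketch`), the values of `MAX_STR_LEN` and `MAX_ATTR_COUNT`, the `ncclResult_t` enumerators, the exact body of `NCCLCHECK`, the control flow between the two calls, and the helper `setFamilyIdSketch` are all assumptions made for this example. It is not expected to reproduce the crash; it only shows how the string constant and the `index` out-parameter flow through the calls.

```cpp
// Simplified sketch of the call pattern in this report; NOT the NCCL source.
// Assumed for illustration: struct layout, constant values, enumerators,
// the NCCLCHECK body, and the helper setFamilyIdSketch.
#include <cstdio>
#include <cstring>

#define MAX_STR_LEN 255
#define MAX_ATTR_COUNT 16

enum ncclResult_t { ncclSuccess = 0, ncclInternalError = 1 };

// Early-return wrapper: "basically an if(error) return error;".
#define NCCLCHECK(call) do { \
    ncclResult_t res_ = (call); \
    if (res_ != ncclSuccess) return res_; \
  } while (0)

// Hypothetical stand-in for the XML node type.
struct XmlNodeSketch {
  struct {
    char key[MAX_STR_LEN + 1];
    char value[MAX_STR_LEN + 1];
  } attrs[MAX_ATTR_COUNT];
  int nAttrs;
};

// Linear search over the attribute keys; *index stays -1 if nothing matches.
ncclResult_t xmlGetAttrIndex(XmlNodeSketch* node, const char* attrName, int* index) {
  *index = -1;
  for (int a = 0; a < node->nAttrs; a++) {
    if (strncmp(node->attrs[a].key, attrName, MAX_STR_LEN) == 0) {
      *index = a;
      break;
    }
  }
  return ncclSuccess;
}

// Looks up attrName, appends it if missing, then stores value as text.
ncclResult_t xmlSetAttrInt(XmlNodeSketch* node, const char* attrName, int value) {
  int index;
  NCCLCHECK(xmlGetAttrIndex(node, attrName, &index));
  if (index == -1) {
    if (node->nAttrs >= MAX_ATTR_COUNT) return ncclInternalError;
    index = node->nAttrs++;
    strncpy(node->attrs[index].key, attrName, MAX_STR_LEN);
    node->attrs[index].key[MAX_STR_LEN] = '\0';
  }
  snprintf(node->attrs[index].value, MAX_STR_LEN + 1, "%d", value);
  return ncclSuccess;
}

// Mirrors the two calls quoted in the report: the same string constant is
// passed first to xmlGetAttrIndex and then to xmlSetAttrInt.
ncclResult_t setFamilyIdSketch(XmlNodeSketch* cpuNode, int familyId) {
  int index;
  NCCLCHECK(xmlGetAttrIndex(cpuNode, "familyid", &index));
  if (index == -1) {  // assumed condition, for illustration only
    NCCLCHECK(xmlSetAttrInt(cpuNode, "familyid", familyId));
  }
  return ncclSuccess;
}
```

In this sketch, after inlining, `xmlSetAttrInt` runs its own `xmlGetAttrIndex` lookup right after the caller's lookup with the same constant, which matches the nesting of frames #1 to #3 in the backtraces above.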