[ https://issues.apache.org/jira/browse/KUDU-3517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Alexey Serbin reassigned KUDU-3517:
-----------------------------------

    Assignee: Alexey Serbin

> Kudu servers crash on Graviton3 (aarch64) instances in EC2
> ----------------------------------------------------------
>
>                 Key: KUDU-3517
>                 URL: https://issues.apache.org/jira/browse/KUDU-3517
>             Project: Kudu
>          Issue Type: Bug
>          Components: CLI, client, master, tserver
>    Affects Versions: 1.17.0
>         Environment: Graviton3 instances in EC2
>            Reporter: Alexey Serbin
>            Assignee: Alexey Serbin
>            Priority: Critical
>              Labels: ARM, aarch64
>
> Kudu masters and tablet servers built from the source code released with Kudu
> 1.17.0 crash with SIGSEGV when running on Graviton3 (aarch64) instances in
> EC2.
> Upon closer examination, it turned out the problem happens when
> StackCollector tries to symbolize a thread's stack, and an example of the
> trace looked like below. The stack trace has been collected under GDB when
> running a smoke test with the kudu CLI tool:
> {{kudu perf loadgen <master_rpc_addr> \-\-table_num_replicas=3 \-\-num_rows_per_thread=1000000}}:
> {noformat}
> #0  access_mem (as=0x3304418 <local_addr_space>, addr=7745970402396146688, val=0xfffff325ca18, write=0, arg=0xfffff325ce70)
>     at /root/Projects/kudu/thirdparty/src/libunwind-1.6.2/src/aarch64/Ginit.c:337
> #1  0x0000000000a97ac0 in is_plt_entry (c=0xfffff325ce70)
>     at /root/Projects/kudu/thirdparty/src/libunwind-1.6.2/src/aarch64/Gstep.c:43
> #2  0x0000000000a97fdc in _ULaarch64_step (cursor=0xfffff325ce70)
>     at /root/Projects/kudu/thirdparty/src/libunwind-1.6.2/src/aarch64/Gstep.c:171
> #3  0x00000000025050c8 in kudu::StackTrace::Collect (this=this@entry=0xfffff325d7d8, skip_frames=skip_frames@entry=0)
>     at /root/Projects/kudu/src/kudu/util/debug-util.cc:612
> #4  0x0000000002507f64 in kudu::StackTrace::Collect (this=this@entry=0xfffff325d7d8, skip_frames=skip_frames@entry=0)
>     at /root/Projects/kudu/src/kudu/util/debug-util.cc:579
> #5  0x000000000259c390 in kudu::(anonymous namespace)::SubmitSpinLockProfileData (contendedlock=0x4ed8a220, wait_cycles=2966400)
>     at /root/Projects/kudu/src/kudu/util/spinlock_profiling.cc:229
> {noformat}
> The crash happens with SIGSEGV somewhere in the libunwind code, and that
> looks very similar to what's reported in [this github
> issue|https://github.com/libunwind/libunwind/issues/260].

--
This message was sent by Atlassian Jira
(v8.20.10#820010)