[ 
https://issues.apache.org/jira/browse/KUDU-3517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexey Serbin reassigned KUDU-3517:
-----------------------------------

    Assignee: Alexey Serbin

> Kudu servers crash on Graviton3 (aarch64) instances in EC2
> ----------------------------------------------------------
>
>                 Key: KUDU-3517
>                 URL: https://issues.apache.org/jira/browse/KUDU-3517
>             Project: Kudu
>          Issue Type: Bug
>          Components: CLI, client, master, tserver
>    Affects Versions: 1.17.0
>         Environment: Graviton3 instances in EC2
>            Reporter: Alexey Serbin
>            Assignee: Alexey Serbin
>            Priority: Critical
>              Labels: ARM, aarch64
>
> Kudu masters and tablet servers built from the source code released with Kudu 
> 1.17.0 crash with SIGSEGV when running on Graviton3 (aarch64) instances in 
> EC2.
> Closer examination showed that the crash happens when StackCollector tries 
> to symbolize a thread's stack; an example stack trace is shown below. The 
> trace was collected under GDB while running a smoke test with the kudu CLI 
> tool: {{kudu perf loadgen <master_rpc_addr> \-\-table_num_replicas=3 
> \-\-num_rows_per_thread=1000000}}:
> {noformat}
> #0  access_mem (as=0x3304418 <local_addr_space>, addr=7745970402396146688, 
>     val=0xfffff325ca18, write=0, arg=0xfffff325ce70)
>     at /root/Projects/kudu/thirdparty/src/libunwind-1.6.2/src/aarch64/Ginit.c:337
> #1  0x0000000000a97ac0 in is_plt_entry (c=0xfffff325ce70)
>     at /root/Projects/kudu/thirdparty/src/libunwind-1.6.2/src/aarch64/Gstep.c:43
> #2  0x0000000000a97fdc in _ULaarch64_step (cursor=0xfffff325ce70)
>     at /root/Projects/kudu/thirdparty/src/libunwind-1.6.2/src/aarch64/Gstep.c:171
> #3  0x00000000025050c8 in kudu::StackTrace::Collect (
>     this=this@entry=0xfffff325d7d8, skip_frames=skip_frames@entry=0)
>     at /root/Projects/kudu/src/kudu/util/debug-util.cc:612
> #4  0x0000000002507f64 in kudu::StackTrace::Collect (
>     this=this@entry=0xfffff325d7d8, skip_frames=skip_frames@entry=0)
>     at /root/Projects/kudu/src/kudu/util/debug-util.cc:579
> #5  0x000000000259c390 in kudu::(anonymous namespace)::SubmitSpinLockProfileData (
>     contendedlock=0x4ed8a220, wait_cycles=2966400)
>     at /root/Projects/kudu/src/kudu/util/spinlock_profiling.cc:229
> {noformat}
> The SIGSEGV is raised inside the libunwind code, which looks very similar 
> to what's reported in [this GitHub 
> issue|https://github.com/libunwind/libunwind/issues/260].
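
Frame #0 of the trace shows access_mem being asked to read from an address that does not look like a valid user-space pointer. As an illustration only, the sketch below shows one common way on Linux to test whether an address is readable without risking a SIGSEGV: let the kernel copy one byte from that address into a pipe, and treat a failed write(2) as "not readable". The helper name IsReadableAddress is hypothetical; this is not code from Kudu or libunwind, and it is not presented as the fix for this issue.

```cpp
#include <cerrno>
#include <cstdint>
#include <unistd.h>

// Hypothetical helper (not part of Kudu or libunwind): returns true if one
// byte at `addr` can be read by this process. The kernel is asked to copy
// the byte into a pipe; for an unreadable source address write(2) fails
// with EFAULT instead of the process receiving SIGSEGV.
bool IsReadableAddress(uintptr_t addr) {
  int fds[2];
  if (pipe(fds) != 0) {
    return false;  // Cannot probe; conservatively report "not readable".
  }
  ssize_t n;
  do {
    n = write(fds[1], reinterpret_cast<const void*>(addr), 1);
  } while (n < 0 && errno == EINTR);
  close(fds[0]);
  close(fds[1]);
  return n == 1;
}
```

Under these assumptions, probing the address of a live local variable returns true, while probing the addr value from frame #0 (7745970402396146688) returns false on a 64-bit Linux system. Creating a pipe per probe is costly, so an unwinder that used this approach would presumably keep the pipe open across calls.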

