Public bug reported:

Systemd appears to crash frequently on Azure v6 arm64 VM sizes during the
initial (provisioning) boot.  I caught the crash on my first attempt to
reproduce it on Standard_D2pds_v6 with
canonical:ubuntu-24_04-lts:server-arm64:latest.

[   14.082815] temp-vm-cpatterson-eastus2-t02190915031 systemd[1]: Reloading requested from client PID 1208 ('systemctl') (unit walinuxagent.service)...
[   14.082982] temp-vm-cpatterson-eastus2-t02190915031 systemd[1]: Reloading...
[   14.096465] temp-vm-cpatterson-eastus2-t02190915031 systemd[1]: Caught <SEGV> from PID -535718928.
...
[  108.535662] temp-vm-cpatterson-eastus2-t02190915031 systemd[1]: Caught <SEGV>, dumped core as pid 1227.
[  108.536032] temp-vm-cpatterson-eastus2-t02190915031 systemd[1]: Freezing execution.
...
[  184.052078] temp-vm-cpatterson-eastus2-t02190915031 systemd-journald[136]: Failed to send WATCHDOG=1 notification message: Connection refused

(gdb) bt
#0  syscall () at ../sysdeps/unix/sysv/linux/aarch64/syscall.S:39
#1  0x0000e14d999df8e4 in missing_rt_tgsigqueueinfo (info=0xffffddda1ca0, sig=11, tid=<optimized out>, tgid=1227) at ../src/basic/missing_syscall.h:384
#2  propagate_signal (sig=sig@entry=11, siginfo=siginfo@entry=0xffffddda1ca0) at ../src/basic/signal-util.c:301
#3  0x0000bb509f70e9bc [PAC] in crash (sig=11, siginfo=0xffffddda1ca0, context=<optimized out>) at ../src/core/crash-handler.c:94
#4  <signal handler called>
#5  0x0000e14d99d2944c in unit_active_state (u=u@entry=0xbb50c8aeeb10) at ../src/core/unit.c:941
#6  0x0000e14d99d2d454 in unit_may_gc (u=0xbb50c8aeeb10) at ../src/core/unit.c:465
#7  0x0000e14d99d2e7d8 [PAC] in unit_add_to_gc_queue (u=u@entry=0xbb50c8aeeb10) at ../src/core/unit.c:535
#8  0x0000e14d99d2efe0 [PAC] in unit_clear_dependencies (u=0xbb50c8b0a7c0) at ../src/core/unit.c:656
#9  unit_free (u=0xbb50c8b0a7c0) at ../src/core/unit.c:797
#10 0x0000e14d99ce228c [PAC] in manager_clear_jobs_and_units.part.0.lto_priv.0 (m=m@entry=0xbb50c8acf790) at ../src/core/manager.c:1594
#11 0x0000e14d99ce2624 [PAC] in manager_clear_jobs_and_units (m=0xbb50c8acf790) at ../src/core/manager.c:1591
#12 manager_reload (m=m@entry=0xbb50c8acf790) at ../src/core/manager.c:3568
#13 0x0000bb509f709338 [PAC] in invoke_main_loop (ret_error_message=0xffffddda3178, ret_switch_root_init=<synthetic pointer>, ret_switch_root_dir=<synthetic pointer>, ret_fds=0xffffddda3168, ret_retval=<synthetic pointer>, saved_rlimit_memlock=0xffffddda31a0, saved_rlimit_nofile=0xffffddda31b0, m=0xbb50c8acf790) at ../src/core/main.c:1982
#14 main (argc=1, argv=0xffffddda3668) at ../src/core/main.c:3106

In this particular case the reload was requested by WALinuxAgent, but I
have evidence of other crashes when cloud-init requests the reload:

2025-01-02T10:18:03.007621+00:00 localhost systemd[1]: Reloading requested from client PID 871 ('systemctl') (unit cloud-init-local.service)...
2025-01-02T10:18:03.007662+00:00 localhost systemd[1]: Reloading...
2025-01-02T10:18:03.010753+00:00 localhost systemd[1]: Caught <SEGV>, from unknown sender process.

In some cases I don't even see a reload before the crash, but I will try
to get a crash dump for those cases.
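
To help triage more instances, here is a minimal sketch of a log scanner
that pairs each "Caught <SEGV>" message from PID 1 with the most recent
reload request seen before it, if any.  It is only an illustration, not part
of any existing tool: the function name find_segv_events and the returned
dict keys are hypothetical, and it assumes each journal message sits on a
single line (as in the original journal output, not as wrapped by email).

```python
import re

# Messages logged by PID 1, in either timestamp layout shown in this report.
PID1_MSG = re.compile(r"systemd\[1\]: (?P<msg>.*)$")
# "Reloading requested from client PID 1208 ('systemctl') (unit foo.service)..."
RELOAD_REQ = re.compile(
    r"Reloading requested from client PID (\d+) \('([^']+)'\) \(unit ([^)]+)\)"
)


def find_segv_events(lines):
    """Return one dict per 'Caught <SEGV>' message logged by systemd[1].

    'reload_unit' is the unit named in the most recent reload request seen
    earlier in the input, or None if no reload request preceded the crash.
    """
    events = []
    last_reload_unit = None
    for line in lines:
        match = PID1_MSG.search(line)
        if not match:
            continue  # not a PID 1 message (e.g. systemd-journald[136])
        msg = match.group("msg")
        request = RELOAD_REQ.search(msg)
        if request:
            last_reload_unit = request.group(3)
        elif msg.startswith("Caught <SEGV>"):
            events.append({"message": msg, "reload_unit": last_reload_unit})
    return events


if __name__ == "__main__":
    sample = [
        "2025-01-02T10:18:03.007621+00:00 localhost systemd[1]: Reloading "
        "requested from client PID 871 ('systemctl') "
        "(unit cloud-init-local.service)...",
        "2025-01-02T10:18:03.007662+00:00 localhost systemd[1]: Reloading...",
        "2025-01-02T10:18:03.010753+00:00 localhost systemd[1]: Caught <SEGV>, "
        "from unknown sender process.",
    ]
    for event in find_segv_events(sample):
        print(event["reload_unit"], "->", event["message"])
```

A crash with no preceding reload request is reported with reload_unit set
to None, which should make the "no reload seen" cases easy to pick out.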

** Affects: systemd (Ubuntu)
     Importance: Undecided
         Status: New

** Attachment added: "systemd crash dump on reload requested by walinuxagent"
   https://bugs.launchpad.net/bugs/2098861/+attachment/5858765/+files/_usr_lib_systemd_systemd.0.crash

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/2098861

Title:
  systemd 255.4-1ubuntu8.5 crashing on arm64 (Azure v6 VM sizes)

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/systemd/+bug/2098861/+subscriptions


-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
