This seems to have resolved itself, though the root cause remains unclear to me.

The OSSes and MDS may have had connectivity issues that cleared when we reset
the OPA fabric, in conjunction with rebooting the entire Lustre backend.
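For anyone hitting similar symptoms, a minimal sketch of the LNet-level checks we would use to rule the fabric in or out before rebooting anything (the NID below is a placeholder, not one of our actual servers):

```shell
# Show the local LNet interfaces and their status
lnetctl net show

# Ping a server NID over LNet; a hang or failure here points at the
# fabric/LNet layer rather than at Lustre itself.
# [email protected] is a hypothetical NID - substitute your MGS/MDS NID.
lctl ping [email protected]

# On a client, dump each import's connection state to see which
# targets are disconnected or stuck in recovery
lctl get_param '*.*.import' | grep -E 'target:|state:'
```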


On Wed, 2022-07-06 at 22:31 +0000, Nehring, Shane R [LAS] via lustre-discuss
wrote:
> Hello all,
> 
> We recently rebuilt our filesystem on 2.15.0 due to issues we were seeing
> on our legacy volume. The MDS and OSSes are RHEL 8.6. The MDT is ZFS, as
> are all the OSTs. We built from b2_15 to get the RHEL 8.6 compat commits.
> 
> It's been working pretty well for the past week or so, until we had a
> client crash earlier today. Said client wasn't able to access the mount
> when it came back, so we decided to reboot the MDS.
> 
> The mounts for the MDTs hung in llog_process_or_fork for about 25 minutes
> before actually beginning the mounting process and starting recovery.
> Similarly, on two clients that had connectivity issues with the filesystem
> and were rebooted, we see them stuck in llog_process_or_fork.
> 
> stack traces:
> client:
> [<0>] llog_process_or_fork+0x43c/0x560 [obdclass]
> [<0>] llog_process+0x10/0x20 [obdclass]
> [<0>] class_config_parse_llog+0x1e9/0x3e0 [obdclass]
> [<0>] mgc_process_cfg_log+0x709/0xd80 [mgc]
> [<0>] mgc_process_log+0x6c3/0x800 [mgc]
> [<0>] mgc_process_config+0xb53/0xe60 [mgc]
> [<0>] lustre_process_log+0x5fa/0xad0 [obdclass]
> [<0>] ll_fill_super+0x739/0x1190 [lustre]
> [<0>] lustre_fill_super+0xf4/0x4a0 [lustre]
> [<0>] mount_nodev+0x48/0xa0
> [<0>] legacy_get_tree+0x27/0x40
> [<0>] vfs_get_tree+0x25/0xb0
> [<0>] do_mount+0x2e2/0x950
> [<0>] ksys_mount+0xb6/0xd0
> [<0>] __x64_sys_mount+0x21/0x30
> [<0>] do_syscall_64+0x5b/0x1a0
> [<0>] entry_SYSCALL_64_after_hwframe+0x65/0xca
> 
> server:
> [<0>] llog_process_or_fork+0x43c/0x560 [obdclass]
> [<0>] llog_process+0x10/0x20 [obdclass]
> [<0>] class_config_parse_llog+0x1e9/0x3e0 [obdclass]
> [<0>] mgc_process_cfg_log+0x709/0xd80 [mgc]
> [<0>] mgc_process_log+0x6c3/0x800 [mgc]
> [<0>] mgc_process_config+0xb53/0xe60 [mgc]
> [<0>] lustre_process_log+0x5fa/0xad0 [obdclass]
> [<0>] ll_fill_super+0x739/0x1190 [lustre]
> [<0>] lustre_fill_super+0xf4/0x4a0 [lustre]
> [<0>] mount_nodev+0x48/0xa0
> [<0>] legacy_get_tree+0x27/0x40
> [<0>] vfs_get_tree+0x25/0xb0
> [<0>] do_mount+0x2e2/0x950
> [<0>] ksys_mount+0xb6/0xd0
> [<0>] __x64_sys_mount+0x21/0x30
> [<0>] do_syscall_64+0x5b/0x1a0
> [<0>] entry_SYSCALL_64_after_hwframe+0x65/0xca
> 
> Any recommendations or insight would be appreciated.
> 
> Thanks
> 
> Shane
> _______________________________________________
> lustre-discuss mailing list
> [email protected]
> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


