On Tue, May 21, 2019 at 6:10 AM Ryan Leimenstoll <[email protected]> wrote:
>
> Hi all,
>
> We recently encountered an issue where our CephFS filesystem was
> unexpectedly set to read-only. Looking at the logs from the daemons, I can
> see the following:
>
> On the MDS:
> ...
> 2019-05-18 16:34:24.341 7fb3bd610700 -1 mds.0.89098 unhandled write error (90) Message too long, force readonly...
> 2019-05-18 16:34:24.341 7fb3bd610700  1 mds.0.cache force file system read-only
> 2019-05-18 16:34:24.341 7fb3bd610700  0 log_channel(cluster) log [WRN] : force file system read-only
> 2019-05-18 16:34:41.289 7fb3c0616700  1 heartbeat_map is_healthy 'MDSRank' had timed out after 15
> 2019-05-18 16:34:41.289 7fb3c0616700  0 mds.beacon.objmds00 Skipping beacon heartbeat to monitors (last acked 4.00101s ago); MDS internal heartbeat is not healthy!
> ...
>
> On one of the OSDs it was most likely targeting:
> ...
> 2019-05-18 16:34:24.140 7f8134e6c700 -1 osd.602 pg_epoch: 682796 pg[49.20b( v 682796'15706523 (682693'15703449,682796'15706523] local-lis/les=673041/673042 n=10524 ec=245563/245563 lis/c 673041/673041 les/c/f 673042/673042/0 673038/673041/668565) [602,530,558] r=0 lpr=673041 crt=682796'15706523 lcod 682796'15706522 mlcod 682796'15706522 active+clean] do_op msg data len 95146005 > osd_max_write_size 94371840 on osd_op(mds.0.89098:48609421 49.20b 49:d0630e4c:::mds0_sessionmap:head [omap-set-header,omap-set-vals] snapc 0=[] ondisk+write+known_if_redirected+full_force e682796) v8
> 2019-05-18 17:10:33.695 7f813466b700  0 log_channel(cluster) log [DBG] : 49.31c scrub starts
> 2019-05-18 17:10:34.980 7f813466b700  0 log_channel(cluster) log [DBG] : 49.31c scrub ok
> 2019-05-18 22:17:37.320 7f8134e6c700 -1 osd.602 pg_epoch: 683434 pg[49.20b( v 682861'15706526 (682693'15703449,682861'15706526] local-lis/les=673041/673042 n=10525 ec=245563/245563 lis/c 673041/673041 les/c/f 673042/673042/0 673038/673041/668565) [602,530,558] r=0 lpr=673041 crt=682861'15706526 lcod 682859'15706525 mlcod 682859'15706525 active+clean] do_op msg data len 95903764 > osd_max_write_size 94371840 on osd_op(mds.0.91565:357877 49.20b 49:d0630e4c:::mds0_sessionmap:head [omap-set-header,omap-set-vals,omap-rm-keys] snapc 0=[] ondisk+write+known_if_redirected+full_force e683434) v8
> ...
>
> During this time there were some health concerns with the cluster.
> Significantly, since the error above seems to be related to the SessionMap,
> we had a client with a few requests blocked for over 35948 secs (it's a
> member of a compute cluster, so we let the node drain/finish jobs before
> rebooting). We have also had some issues with certain OSDs running older
> hardware staying up/responding to heartbeats in a timely manner after
> upgrading to Nautilus, although that seems to be an iowait/load issue that
> we are actively working to mitigate separately.
>
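For reference, the rejection in the OSD log above is a plain size comparison: the OSD refuses any single write whose payload exceeds osd_max_write_size, which defaults to 90 MiB (the 94371840 bytes shown in the log). A minimal sketch of that check against the two logged payload sizes:

```shell
# Sketch of the OSD's do_op size check. 94371840 = 90 MiB is the
# default osd_max_write_size, matching the value in the log lines.
max=$((90 * 1024 * 1024))
for len in 95146005 95903764; do
  if [ "$len" -gt "$max" ]; then
    echo "msg data len $len > osd_max_write_size $max -> (90) Message too long"
  fi
done
```

Both SessionMap writes are a few hundred KiB over the limit, which is why the MDS gets EMSGSIZE (errno 90) back and forces the filesystem read-only.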
This prevents the MDS from trimming completed requests recorded in the session, which results in a very large session item. To recover, blacklist the client that has the blocked requests, then restart the MDS.

> We are running Nautilus 14.2.1 on RHEL7.6. There is only one MDS rank,
> with an active/standby setup between two MDS nodes. MDS clients are
> mounted using the RHEL7.6 kernel driver.
>
> My read here would be that the MDS is sending too large a message to the
> OSD; however, my understanding was that the MDS should be using
> osd_max_write_size to determine the size of that message [0]. Is this
> maybe a bug in how this is calculated on the MDS side?
>
> Thanks!
> Ryan Leimenstoll
> [email protected]
> University of Maryland Institute for Advanced Computer Studies
>
> [0] https://www.spinics.net/lists/ceph-devel/msg11951.html
> _______________________________________________
> ceph-users mailing list
> [email protected]
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
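For the archives, the recovery described above (blacklist the stuck client, then restart the MDS) can be sketched with standard Nautilus-era commands. The client address below is a placeholder; take the real addr/nonce from the `session ls` output for the session with the large backlog:

```shell
# 1. List MDS sessions and find the client whose requests are stuck
#    (look for a session with a large num_completed_requests backlog).
ceph tell mds.0 session ls

# 2. Blacklist that client so the MDS can drop its session.
#    (192.0.2.10:0/123456789 is a placeholder addr/nonce.)
ceph osd blacklist add 192.0.2.10:0/123456789

# 3. Fail the active MDS so the standby takes over; the replacement
#    no longer carries the oversized session entry.
ceph mds fail 0
```

This is a sketch, not a verified procedure; double-check the session output before blacklisting, since blacklisting evicts the client and discards its dirty state.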
