Hi Dongdong,

> is simple and can be applied cleanly.

I understand this statement from a developer's perspective. Now try to explain to a user with a cephadm-deployed containerized cluster how to build a container from source, point cephadm at this container, and what to do on the next upgrade. I think "simple" depends on context. Applying a patch to a production system is currently an expert operation, I'm afraid.

If you have instructions for building a ceph-container image with the patch applied, I would be very interested. I was asking for a source container for exactly this reason, and as far as I can tell from that conversation, it is quite a project in itself. The thread was "Re: Building ceph packages in containers? [was: Ceph debian/ubuntu packages build]", but I can't find it on the mailing list any more. There seems to be an archived version: https://www.spinics.net/lists/ceph-users/msg73231.html
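Just to illustrate the kind of procedure I have in mind, here is a rough outline. This is untested; the release tag, registry and image names below are made up, and the patch URL simply points at the commit you linked:

    # check out the source matching the running release and apply the patch
    git clone https://github.com/ceph/ceph.git && cd ceph
    git checkout v17.2.5    # whichever release the cluster actually runs
    curl -L https://github.com/ceph/ceph/commit/f43f596aac97200a70db7a70a230eb9343018159.patch | git am

    # build packages and a Containerfile that layers them onto the matching
    # quay.io/ceph/ceph release image (not shown here), then push the result
    # to a registry that every cluster host can reach
    podman build -t registry.example.com/ceph/ceph:v17.2.5-onode-fix .
    podman push registry.example.com/ceph/ceph:v17.2.5-onode-fix

    # tell cephadm to redeploy the daemons from the custom image
    ceph orch upgrade start --image registry.example.com/ceph/ceph:v17.2.5-onode-fix

Every one of these steps has its pitfalls (build dependencies, matching the exact release, hosting a registry all nodes can reach, and having to repeat the exercise on the next upgrade), which is why I wouldn't call it simple from an operator's point of view.

Regarding the restart workaround: I assume the counter to watch is the onode item count in the OSD's mempool statistics, i.e. something like the following (osd.12 is just a placeholder, and the exact pool name and JSON path may differ between releases):

    ceph daemon osd.12 dump_mempools | jq '.mempool.by_pool.bluestore_cache_onode.items'

Please correct me if that is the wrong counter.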
Best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14

________________________________________
From: Dongdong Tao <[email protected]>
Sent: 11 January 2023 04:30:14
To: Frank Schilder
Cc: Igor Fedotov; [email protected]; [email protected]
Subject: Re: [ceph-users] Re: OSD crash on Onode::put

Hi Frank,

I don't have an operational workaround. The patch
https://github.com/ceph/ceph/pull/46911/commits/f43f596aac97200a70db7a70a230eb9343018159
is simple and can be applied cleanly.

Yes, restarting the OSD will clear the pool entries. You can restart it when the bluestore_onode item count is very low (e.g. less than 10) if it really helps, but I think you'll need to tune and monitor the performance until you find a number that is most suitable for your cluster. But it can't help with the crash, since the crash itself is basically a restart.

Regards,
Dongdong

On Tue, Jan 10, 2023 at 8:21 PM Serkan Çoban <[email protected]> wrote:

Is slot 19 inside the chassis? Do you check the chassis temperature? I sometimes see a higher failure rate for HDDs inside the chassis than for those at the front of the chassis. In our case it was related to the temperature difference.

On Tue, Jan 10, 2023 at 1:28 PM Frank Schilder <[email protected]> wrote:
>
> Following up on my previous post: we have identical OSD hosts. The very
> strange observation now is that all outlier OSDs are in exactly the same
> disk slot on these hosts. We have 5 problematic OSDs and they are all in
> slot 19 on 5 different hosts. This is an extremely strange and unlikely
> coincidence.
>
> Are there any specific conditions for this problem to be present or
> amplified that could have to do with hardware?
>
> Best regards,
> =================
> Frank Schilder
> AIT Risø Campus
> Bygning 109, rum S14
> _______________________________________________
> ceph-users mailing list -- [email protected]
> To unsubscribe send an email to [email protected]
_______________________________________________
ceph-users mailing list -- [email protected]
To unsubscribe send an email to [email protected]
