Good morning Istvan,

sadly no, it’s not fixed. I only have an idea of what might trigger the problem and how I can try to mitigate it.
I still don’t know what these errors are or why they happen. I refuse to believe that RGW "loses" data when OSDs become unstable.

Have a good start to the week
Boris

> On 22.08.2022 at 05:12, Szabo, Istvan (Agoda) <istvan.sz...@agoda.com> wrote:
>
> Hi,
>
> So has your problem been fixed?
>
> Istvan Szabo
> Senior Infrastructure Engineer
> ---------------------------------------------------
> Agoda Services Co., Ltd.
> e: istvan.sz...@agoda.com
> ---------------------------------------------------
>
> -----Original Message-----
> From: Boris Behrens <b...@kervyn.de>
> Sent: Monday, August 22, 2022 12:48 AM
> To: ceph-users@ceph.io
> Subject: [ceph-users] Re: Ceph Octopus RGW 15.2.17 - files not available in rados while still in bucket index
>
> I just checked something else, and it looks like this problem happens when our SSD OSDs get marked as laggy because of the GC bug:
> https://tracker.ceph.com/issues/53585
>
> :2022-08-18T22:00:12.257+0000 7fb9dbe62700 0 log_channel(cluster) log [INF] : osd.263 marked itself dead as of e658014
> :2022-08-18T22:01:48.727+0000 7fb9dbe62700 0 log_channel(cluster) log [INF] : osd.242 marked itself dead as of e658018
> :2022-08-18T22:03:07.898+0000 7fb9dbe62700 0 log_channel(cluster) log [INF] : osd.263 marked itself dead as of e658023
> :2022-08-18T22:10:54.963+0000 7fb9dbe62700 0 log_channel(cluster) log [INF] : osd.242 marked itself dead as of e658028
>
> Our S3 cluster is also used for our backup center, which receives RBD exports from our RBD clusters (these are usually multiple GB/TB in size).
> We added some SSD OSDs and put all of our non-data pools on these SSD OSDs.
>
> This helped to relieve some pressure on the cluster when the GC goes nuts. Maybe these two things happen together.
>
>> On Sun, 21 Aug 2022 at 19:34, Boris Behrens <b...@kervyn.de> wrote:
>>
>> Cheers everybody,
>>
>> I had this issue some time ago, and we thought it was fixed, but it seems to be happening again.
>> We have files, uploaded by one of our customers, that are only available in the index but not in RADOS.
>>
>> At first we thought this might be a bug (https://tracker.ceph.com/issues/54528) that got fixed with the last point release, but it seems not. And only one customer has this problem. At the moment we think it is some very weird usage of the S3 API (they developed their own library, using the AWS SDK for .NET as a basis) together with multipart uploads.
>>
>> I am also not sure HOW they do the upload, because it is a backup that gets uploaded every day and they seem to have multiple of them. I haven't gone through all of our logs, but I managed to pull one lifecycle of a file from the logs. It showed very strange errors at the end, and I couldn't find anything about this error.
>>
>> Hope someone can tell me what this is and how I can fix it.
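For anyone who wants to pull the lifecycle of a single object out of the RGW access log in the same way, here is a minimal sketch. It assumes the log lines have the shape seen in this thread (timestamp, quoted request line, status, size). The regex, the function names and the step labels are illustrative and not part of Ceph, and the user agent in the sample lines is shortened.

```python
import re

# Assumed access-log shape, taken from the excerpts in this thread:
#   <timestamp> "<METHOD> <path>[?<query>] HTTP/1.1" <status> <bytes> ...
LINE_RE = re.compile(
    r'^(?P<ts>\S+) "(?P<method>[A-Z]+) (?P<path>[^?\s]+)'
    r'(?:\?(?P<query>\S*))? HTTP/[\d.]+" (?P<status>\d{3}) (?P<size>\d+)'
)


def classify(method, params):
    """Map an S3 request to a multipart lifecycle step (hypothetical labels)."""
    if method == "POST" and "uploads" in params:
        return "initiate"
    if method == "PUT" and "uploadId" in params and "partNumber" in params:
        return "upload_part"
    if method == "POST" and "uploadId" in params:
        return "complete"
    if method == "DELETE" and "uploadId" in params:
        return "abort"
    return method.lower()


def object_lifecycle(lines, path):
    """Return (timestamp, step, status, upload_id) tuples for one object path."""
    events = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m or m["path"] != path:
            continue
        query = m["query"] or ""
        # "a=b&c" -> {"a": "b", "c": ""}
        params = dict(p.partition("=")[::2] for p in query.split("&") if p)
        events.append(
            (m["ts"], classify(m["method"], params), int(m["status"]), params.get("uploadId"))
        )
    # Timestamps share one ISO-like format, so string order is time order.
    return sorted(events, key=lambda e: e[0])


UPLOAD = "2~ehGVVRPV3LnWW31bRmBEcOHSKB_zJAs"
KEY = "/sql-backup-de/IM_DIFFERENTIAL_22.bak"
sample = [
    '2022-08-18T22:01:08.930838+0000 "POST /sql-backup-sde/IM_DIFFERENTIAL_22.bak?uploads HTTP/1.1" 200 271 - "Boto3" -',
    f'2022-08-18T22:01:09.108374+0000 "POST {KEY}?uploads HTTP/1.1" 200 270 - "Boto3" -',
    f'2022-08-18T22:01:09.852362+0000 "PUT {KEY}?uploadId={UPLOAD}&partNumber=1 HTTP/1.1" 200 8388608 - "Boto3" -',
    f'2022-08-18T22:02:26.275201+0000 "POST {KEY}?uploadId={UPLOAD} HTTP/1.1" 500 304 - "Boto3" -',
    f'2022-08-18T22:02:30.999129+0000 "DELETE {KEY}?uploadId={UPLOAD} HTTP/1.1" 204 0 - "Boto3" -',
    f'2022-08-18T22:04:29.542210+0000 "POST {KEY}?uploadId={UPLOAD} HTTP/1.1" 200 334 - "Boto3" -',
]

events = object_lifecycle(sample, KEY)
ops = [op for _, op, _, _ in events]
for event in events:
    print(event)
```

Filtering by path rather than by upload ID keeps the initiate request and the plain GETs in the picture, since neither carries an `uploadId` in the request line.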
>>
>> Cheers
>> Boris
>>
>> Strange errors:
>> 2022-08-18T22:04:29.538+0000 7f7ba9fcb700 0 req 9033182355071581504 183.407425780s s3:complete_multipart WARNING: failed to remove object sql-backup-de:_multipart_IM_DIFFERENTIAL_22.bak.2~ehGVVRPV3LnWW31bRmBEcOHSKB_zJAs.meta
>> 2022-08-18T22:04:29.542+0000 7f7ba9fcb700 0 req 9033182355071581504 183.411425768s s3:complete_multipart WARNING: failed to unlock CLUSTERUUID.BUCKET.INDENTIFIER__multipart_IM_DIFFERENTIAL_22.bak.2~ehGVVRPV3LnWW31bRmBEcOHSKB_zJAs.meta
>>
>> Full log (trimmed when only partNumber changed):
>> 2022-08-18T22:01:08.894838+0000 "GET /sql-backup-sde/IM_DIFFERENTIAL_22.bak HTTP/1.1" 200 315392 - "Boto3/1.24.23 Python/3.10.5 Linux/5.10.102-flatcar Botocore/1.27.23" -
>> 2022-08-18T22:01:08.930838+0000 "POST /sql-backup-sde/IM_DIFFERENTIAL_22.bak?uploads HTTP/1.1" 200 271 - "Boto3/1.24.23 Python/3.10.5 Linux/5.10.102-flatcar Botocore/1.27.23" -
>> 2022-08-18T22:01:09.108374+0000 "POST /sql-backup-de/IM_DIFFERENTIAL_22.bak?uploads HTTP/1.1" 200 270 - "Boto3/1.24.23 Python/3.10.5 Linux/5.10.102-flatcar Botocore/1.27.23" -
>> 2022-08-18T22:01:09.472368+0000 "PUT /sql-backup-sde/IM_DIFFERENTIAL_22.bak?uploadId=2~KX75VPCYFOZRPRLo5L0ytQuyp-nzbrT&partNumber=4 HTTP/1.1" 200 2523136 - "Boto3/1.24.23 Python/3.10.5 Linux/5.10.102-flatcar Botocore/1.27.23" -
>> ..
>> 2022-08-18T22:01:09.619099+0000 "PUT /sql-backup-sde/IM_DIFFERENTIAL_22.bak?uploadId=2~KX75VPCYFOZRPRLo5L0ytQuyp-nzbrT&partNumber=2 HTTP/1.1" 200 8388608 - "Boto3/1.24.23 Python/3.10.5 Linux/5.10.102-flatcar Botocore/1.27.23" -
>> 2022-08-18T22:01:09.706836+0000 "POST /sql-backup-sde/IM_DIFFERENTIAL_22.bak?uploadId=2~KX75VPCYFOZRPRLo5L0ytQuyp-nzbrT HTTP/1.1" 200 334 - "Boto3/1.24.23 Python/3.10.5 Linux/5.10.102-flatcar Botocore/1.27.23" -
>> 2022-08-18T22:01:09.852362+0000 "PUT /sql-backup-de/IM_DIFFERENTIAL_22.bak?uploadId=2~ehGVVRPV3LnWW31bRmBEcOHSKB_zJAs&partNumber=1 HTTP/1.1" 200 8388608 - "Boto3/1.24.23 Python/3.10.5 Linux/5.10.102-flatcar Botocore/1.27.23" -
>> ..
>> 2022-08-18T22:01:26.098900+0000 "PUT /sql-backup-de/IM_DIFFERENTIAL_22.bak?uploadId=2~ehGVVRPV3LnWW31bRmBEcOHSKB_zJAs&partNumber=161 HTTP/1.1" 200 8388608 - "Boto3/1.24.23 Python/3.10.5 Linux/5.10.102-flatcar Botocore/1.27.23" -
>> 2022-08-18T22:02:14.103386+0000 "GET /sql-backup-de/IM_DIFFERENTIAL_22.bak HTTP/1.1" 200 4194304 - "Boto3/1.24.23 Python/3.10.5 Linux/5.10.102-flatcar Botocore/1.27.23" -
>> 2022-08-18T22:02:26.275201+0000 "POST /sql-backup-de/IM_DIFFERENTIAL_22.bak?uploadId=2~ehGVVRPV3LnWW31bRmBEcOHSKB_zJAs HTTP/1.1" 500 304 - "Boto3/1.24.23 Python/3.10.5 Linux/5.10.102-flatcar Botocore/1.27.23" -
>> 2022-08-18T22:02:27.787178+0000 "POST /sql-backup-de/IM_DIFFERENTIAL_22.bak?uploadId=2~ehGVVRPV3LnWW31bRmBEcOHSKB_zJAs HTTP/1.1" 500 304 - "Boto3/1.24.23 Python/3.10.5 Linux/5.10.102-flatcar Botocore/1.27.23" -
>> 2022-08-18T22:02:29.386586+0000 "POST /sql-backup-de/IM_DIFFERENTIAL_22.bak?uploadId=2~ehGVVRPV3LnWW31bRmBEcOHSKB_zJAs HTTP/1.1" 500 304 - "Boto3/1.24.23 Python/3.10.5 Linux/5.10.102-flatcar Botocore/1.27.23" -
>> 2022-08-18T22:02:30.911130+0000 "POST /sql-backup-de/IM_DIFFERENTIAL_22.bak?uploadId=2~ehGVVRPV3LnWW31bRmBEcOHSKB_zJAs HTTP/1.1" 500 304 - "Boto3/1.24.23 Python/3.10.5 Linux/5.10.102-flatcar Botocore/1.27.23" -
>> 2022-08-18T22:02:30.999129+0000 "DELETE /sql-backup-de/IM_DIFFERENTIAL_22.bak?uploadId=2~ehGVVRPV3LnWW31bRmBEcOHSKB_zJAs HTTP/1.1" 204 0 - "Boto3/1.24.23 Python/3.10.5 Linux/5.10.102-flatcar Botocore/1.27.23" -
>> 2022-08-18T22:02:42.782544+0000 "GET /sql-backup-de/IM_DIFFERENTIAL_22.bak HTTP/1.1" 200 0 - "Boto3/1.24.23 Python/3.10.5 Linux/5.10.102-flatcar Botocore/1.27.23" -
>> 2022-08-18T22:04:29.538+0000 7f7ba9fcb700 0 req 9033182355071581504 183.407425780s s3:complete_multipart WARNING: failed to remove object sql-backup-de:_multipart_IM_DIFFERENTIAL_22.bak.2~ehGVVRPV3LnWW31bRmBEcOHSKB_zJAs.meta
>> 2022-08-18T22:04:29.542210+0000 "POST /sql-backup-de/IM_DIFFERENTIAL_22.bak?uploadId=2~ehGVVRPV3LnWW31bRmBEcOHSKB_zJAs HTTP/1.1" 200 334 - "Boto3/1.24.23 Python/3.10.5 Linux/5.10.102-flatcar Botocore/1.27.23" -
>> 2022-08-18T22:04:29.542+0000 7f7ba9fcb700 0 req 9033182355071581504 183.411425768s s3:complete_multipart WARNING: failed to unlock CLUSTERUUID.BUCKET.INDENTIFIER__multipart_IM_DIFFERENTIAL_22.bak.2~ehGVVRPV3LnWW31bRmBEcOHSKB_zJAs.meta
>>
>> --
>> The "UTF-8 problems" self-help group will, as an exception, meet in the large hall this time.
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
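One thing that can be read off the timestamps quoted in this thread: if the duration printed after the request id in the `complete_multipart` warning (183.411425768s) is the elapsed time of that request, which is an assumption here, then the request that finally returned 200 at 22:04:29 was already in flight when the 500 responses, the abort, and two of the "marked itself dead" events occurred. The sketch below only does that arithmetic; the helper names are made up for illustration, and the overlap is a correlation, not proof of a cause.

```python
from datetime import datetime, timedelta


def parse_ts(ts):
    # Timestamps in this thread look like 2022-08-18T22:04:29.542+0000
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%S.%f%z")


def request_window(end_ts, elapsed_seconds):
    """Return (start, end) of a request, given its log time and elapsed time."""
    end = parse_ts(end_ts)
    return end - timedelta(seconds=elapsed_seconds), end


def events_in_window(events, window):
    """Keep only the events whose timestamp falls inside the request window."""
    start, end = window
    return [(ts, label) for ts, label in events if start <= parse_ts(ts) <= end]


# Assumption: 183.411425768s is the elapsed time of req 9033182355071581504.
window = request_window("2022-08-18T22:04:29.542+0000", 183.411425768)

# Timestamps copied from the logs in this thread.
events = [
    ("2022-08-18T22:00:12.257+0000", "osd.263 marked itself dead"),
    ("2022-08-18T22:01:48.727+0000", "osd.242 marked itself dead"),
    ("2022-08-18T22:02:26.275201+0000", "complete (POST uploadId) -> 500"),
    ("2022-08-18T22:02:30.999129+0000", "abort (DELETE uploadId) -> 204"),
    ("2022-08-18T22:03:07.898+0000", "osd.263 marked itself dead"),
    ("2022-08-18T22:10:54.963+0000", "osd.242 marked itself dead"),
]

hits = events_in_window(events, window)
print("request window:", window[0].isoformat(), "->", window[1].isoformat())
for ts, label in hits:
    print(ts, label)
```

Under that assumption the window starts at about 22:01:26.13, right after the last part upload logged at 22:01:26.098900, and four of the six listed events fall inside it.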