Hi Nathan,

Any luck on your case? We're observing the same errors with 18.4.2 in the RGW 
daemons with Haproxy in front and are yet to figure out the root cause.
Additionally our logs get hit from time to time with the following:
debug 2024-10-10T09:31:42.213+0000 7f51dd83d640  0 req 8849896936637595128 
0.063999131s s3:complete_multipart WARNING: failed to unlock 
4d377818-805a-49fc-b96e-958045b407c1.19533300.2__multipart_1a6a75.20241010_093028_01680_n89e2.external-exchange-0.247/0/0_0.data.2~uVth32IWeO92ZNqlQEVtl5DtvIY-hKP.meta

Our thoughts are currently pointing into different timeouts between Haproxy and 
anything initiating multipart uploads. Things like boto3 for Python (aws sdk 
has default 60s read and connect timeouts), S3FileIO for Iceberg tables, 
various s3 libraries from Hadoop. They might be having shorter timeouts that 
Haproxy has therefore leaving Haproxy retrying connections while the aws sdk 
(for example) already timed out and initiated a new request.


Thanks,
Laimis J. @ oxy
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io

Reply via email to