[ceph-users] Re: quincy v17.2.4 QE Validation status

2022-09-17 Thread Venky Shankar
On Wed, Sep 14, 2022 at 1:33 AM Yuri Weinstein  wrote:
>
> Details of this release are summarized here:
>
> https://tracker.ceph.com/issues/57472#note-1
> Release Notes - https://github.com/ceph/ceph/pull/48072
>
> Seeking approvals for:
>
> rados - Neha, Travis, Ernesto, Adam
> rgw - Casey
> fs - Venky

FS approved.

> orch - Adam
> rbd - Ilya, Deepika
> krbd - missing packages, Adam Kr is looking into it
> upgrade/octopus-x - missing packages, Adam Kr is looking into it
> ceph-volume - Guillaume is looking into it
>
> Please reply to this email with approval and/or trackers of known
> issues/PRs to address them.
>
> Josh, Neha - LRC upgrade pending major suites approvals.
> RC release - pending major suites approvals.
>
> Thx
> YuriW
>
> ___
> Dev mailing list -- d...@ceph.io
> To unsubscribe send an email to dev-le...@ceph.io
>


-- 
Cheers,
Venky

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: [ceph-users] OSD Crash in recovery: SST file contains data beyond the point of corruption.

2022-09-17 Thread Benjamin Naber

Hey Igor,

I just wanted to thank you for the help!
With the flag you told me about, I was able to bring up the OSD container, then I
marked it out and let Ceph rebalance the PGs and objects to other OSDs.
Yesterday all PGs became available again, so I could wipe the failed OSD drive and
create a new one. Now everything works without the rocksdb flag and seems to be
fine!
THANK YOU SO MUCH!
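
For anyone finding this thread later, roughly what I did (just a sketch from memory;
osd.4 was the affected OSD here, adjust the id, host and device to your setup):

  ceph osd out osd.4                                    # mark the OSD out, let PGs rebalance to other OSDs
  ceph -s                                               # wait until all PGs are active+clean again
  ceph config rm osd.4 bluestore_rocksdb_options_annex  # drop the rocksdb recovery flag
  ceph osd purge osd.4 --yes-i-really-mean-it           # remove the old OSD from the cluster
  ceph orch device zap <host> /dev/sdX --force          # wipe the failed drive (host/path are placeholders)
  ceph orch daemon add osd <host>:/dev/sdX              # redeploy a fresh OSD on it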

Regards

Ben

On Tuesday, September 13, 2022 11:26 CEST, Igor Fedotov wrote:
  
Hi Benjamin,
sorry for the confusion, this should be kSkipAnyCorruptedRecords, not 
kSkipAnyCorruptedRecord.
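
I.e. presumably something like this (reusing the osd.4 target from your earlier command):

  ceph config set osd.4 bluestore_rocksdb_options_annex "wal_recovery_mode=kSkipAnyCorruptedRecords"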
 
Thanks,
Igor

On 9/12/2022 11:26 PM, Benjamin Naber wrote:

Hi Igor,

looks like the setting won't work, the container now starts with a different 
error message saying the setting is an invalid argument.
Did I do something wrong by setting: ceph config set osd.4 
bluestore_rocksdb_options_annex "wal_recovery_mode=kSkipAnyCorruptedRecord" ?

debug 2022-09-12T20:20:45.044+ 8714e040  1 bluefs add_block_device bdev 
1 path /var/lib/ceph/osd/ceph-4/block size 2.7 TiB
debug 2022-09-12T20:20:45.044+ 8714e040  1 bluefs mount
debug 2022-09-12T20:20:45.044+ 8714e040  1 bluefs _init_alloc shared, 
id 1, capacity 0x2baa100, block size 0x1
debug 2022-09-12T20:20:45.608+ 8714e040  1 bluefs mount 
shared_bdev_used = 0
debug 2022-09-12T20:20:45.608+ 8714e040  1 
bluestore(/var/lib/ceph/osd/ceph-4) _prepare_db_environment set db_paths to 
db,2850558889164 db.slow,2850558889164
debug 2022-09-12T20:20:45.608+ 8714e040 -1 rocksdb: Invalid argument: 
No mapping for enum : wal_recovery_mode
debug 2022-09-12T20:20:45.608+ 8714e040 -1 rocksdb: Invalid argument: 
No mapping for enum : wal_recovery_mode
debug 2022-09-12T20:20:45.608+ 8714e040  1 rocksdb: do_open load 
rocksdb options failed
debug 2022-09-12T20:20:45.608+ 8714e040 -1 
bluestore(/var/lib/ceph/osd/ceph-4) _open_db erroring opening db:
debug 2022-09-12T20:20:45.608+ 8714e040  1 bluefs umount
debug 2022-09-12T20:20:45.608+ 8714e040  1 bdev(0xec8e3c00 
/var/lib/ceph/osd/ceph-4/block) close
debug 2022-09-12T20:20:45.836+ 8714e040  1 bdev(0xec8e2400 
/var/lib/ceph/osd/ceph-4/block) close
debug 2022-09-12T20:20:46.088+ 8714e040 -1 osd.4 0 OSD:init: unable to 
mount object store
debug 2022-09-12T20:20:46.088+ 8714e040 -1  ** ERROR: osd init failed: 
(5) Input/output error

Regards and many thanks for the help!

Ben
On Monday, September 12, 2022 21:14 CEST, Igor Fedotov wrote:

Hi Benjamin,

honestly, the following advice is unlikely to help, but you may want to
try setting bluestore_rocksdb_options_annex to one of the following options:

- wal_recovery_mode=kTolerateCorruptedTailRecords

- wal_recovery_mode=kSkipAnyCorruptedRecord


The indication that the setting is in effect would be the respective
value at the end of the following log line:

debug 2022-09-12T17:37:05.574+ a8316040 4 rocksdb:
Options.wal_recovery_mode: 2


It should show 0 and 3, respectively.
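
One quick way to check after restarting the OSD (just a sketch, assuming a
cephadm-managed container named osd.4):

  cephadm logs --name osd.4 | grep -i Options.wal_recovery_mode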


Hope this helps,

Igor


On 9/12/2022 9:09 PM, Benjamin Naber wrote:
> Hi Everybody,
>
> I'm struggling for a couple of days now with a degraded Ceph cluster.
> It's a simple 3-node cluster with 6 OSDs, 3 SSD based, 3 HDD based. A couple 
> of days ago one of the nodes crashed due to a hard disk failure; I replaced 
> the hard disk and the recovery process started without any issues.
> While the node was still recovering, the newly replaced OSD drive switched to 
> backfillfull. And this is where the pain started. I added another node, 
> bought a hard drive, and wiped the replacement OSD.
> The cluster then was a 4-node cluster with 3 OSDs for the SSD pool and 
> 4 OSDs for the HDD pool.
> Then I started the recovery process from the beginning. At this point Ceph 
> also started a reassignment of misplaced objects.
> Then a power failure hit one of the remaining nodes, and now I'm 
> stuck with a degraded cluster and 49 pgs inactive, 3 pgs incomplete.
> The OSD container on the node that lost power doesn't come up anymore due to a 
> rocksdb error. Any advice on how to recover the corrupt rocksdb?
> Container log and rocksdb error:
>
> https://pastebin.com/gvGJdubx
>
> Regards and thanks for your help!
>
> Ben
>
>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-l

[ceph-users] Requested range is not satisfiable

2022-09-17 Thread Rok Jaklič
Hi,

we are trying to copy a big file (over 400 GB) to the Ceph cluster using a
MinIO client. The copy, or rather the transfer, takes a lot of time (2 days
for example) because of a slow connection.
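
For reference, the transfer is roughly the following (a sketch; the alias,
endpoint and bucket names are placeholders):

  mc alias set cephrgw http://rgw.example.com:8080 ACCESS_KEY SECRET_KEY
  mc cp /data/360GB.bigfile.img cephrgw/mybucket/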

Usually somewhere near the end (but it looks random) we get an error like:
 Failed to copy `/360GB.bigfile.img`. The requested range is not
satisfiable

We are using RGW.

Any ideas why?

Kind regards,
Rok
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io