Hi,

I have set up a 3-host cluster with 30 OSDs in total. The cluster is health OK
with no warnings whatsoever. I created an RBD pool and 14 images, which were
all rbd-mirrored to a second cluster (disconnected since the problems began),
and set up an iSCSI interface. Then I connected a Windows Server 2019 machine
through iSCSI, mounted all 14 drives, and created a spanned volume across all
of them. Everything was working fine, but I had to disconnect the server, so I
disconnected the iSCSI interface; when I tried to reconnect, the volume was
unusable and the drives seemed stuck. I ended up rebooting each cluster node
and then later, since I still couldn't use my images, removed and recreated
all of them.

On this second run all was good: I had robocopy syncing files to my Ceph
cluster for almost a week and had already copied more than 5 TB of data when
the Windows server got stuck. I'm still not sure why, but some services like
FTP were responding while others, including login, were not. So I reset the
Windows server, and when it was back up my spanned volume was bad again. I've
been trying to recover it for the last 2 days, but without success.

Right now all images are disconnected. There are no locks (I found some at one
point and removed them, though I'm not sure who was holding them) and no
watchers on any of the images, but the 3 images that had data on them are
corrupt or locked somehow. Nothing I try works on them; every operation gets
stuck. I can edit the images' config, but not for these 3. I can create
snapshots, but not on these 3. I managed to mount the images over iSCSI on a
Linux box, but on these 3 commands like fdisk and parted hang. The Ceph
dashboard shows stats such as read and write rates for all images except
these 3.

It seems something inside these images is broken or stuck, but, as I said,
there are no locks on them.
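
For reference, this is roughly how I've been checking for locks and watchers:
a minimal sketch using the Ceph Python bindings (python3-rados / python3-rbd),
where the pool name "rbd" and image name "image01" are placeholders for my
real names:

import rados
import rbd

POOL = "rbd"       # placeholder pool name
IMAGE = "image01"  # placeholder image name

cluster = rados.Rados(conffile="/etc/ceph/ceph.conf")
cluster.connect()
try:
    ioctx = cluster.open_ioctx(POOL)
    try:
        with rbd.Image(ioctx, IMAGE) as image:
            # Lock holders, same information as "rbd lock ls".
            print("lockers:", image.list_lockers())
            # Watchers, same information as "rbd status".
            # (Opening the image may itself show up as a watcher.)
            print("watchers:", list(image.watcher_list()))
    finally:
        ioctx.close()
finally:
    cluster.shutdown()

Both come back empty, even for the 3 bad images.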

I have tried a lot of options, and somehow my cluster now has some RGW pools
that I have no idea where they came from.
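
In case it helps, this is how I'm listing the pools (a minimal sketch, same
Python bindings as above):

import rados

cluster = rados.Rados(conffile="/etc/ceph/ceph.conf")
cluster.connect()
try:
    # Print every pool in the cluster; the unexpected RGW pools
    # show up in this list alongside my RBD pool.
    for pool in cluster.list_pools():
        print(pool)
finally:
    cluster.shutdown()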

Any idea what I should do?

--
Salsa


