[ceph-users] Re: PGs down

2020-12-22 Thread Igor Fedotov
Hi Jeremy, good to know you managed to bring your OSDs up. Have you been able to reweight them to 0 and migrate data out of these "broken" OSDs? If so, I suggest redeploying them - the corruption is still in the DB and it might pop up one day. If not, please do that first - you might still h

[ceph-users] kvm vm cephfs mount hangs on osd node (something like umount -l available?) (help wanted going to production)

2020-12-22 Thread Marc Roos
I have a VM on an OSD node (which can reach the host and other nodes via the macvtap interface, used by both host and guest). I just did a simple bonnie++ test and everything seems to be fine. Yesterday, however, the dovecot process apparently caused problems (only using cephfs for an archive names

[ceph-users] Re: kvm vm cephfs mount hangs on osd node (something like umount -l available?) (help wanted going to production)

2020-12-22 Thread Marc Roos
Just got this during bonnie test, trying to do an ls -l on the cephfs. I also have this kworker process constantly at 40% when doing this bonnie++ test. [35281.101763] INFO: task bash:1169 blocked for more than 120 seconds. [35281.102064] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" di
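For readers collecting the same symptom from their own guests, the hung-task warning quoted above can be picked out of kernel log text mechanically. The following is a minimal sketch, assuming the message format shown in the snippet; the function name `find_hung_tasks` and the regular expression are illustrative and not part of any Ceph or kernel tooling.

```python
import re

# Matches lines such as:
#   [35281.101763] INFO: task bash:1169 blocked for more than 120 seconds.
# The pattern is an assumption based on the single message quoted in this thread.
HUNG_TASK = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+INFO: task (?P<comm>.+?):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds\."
)


def find_hung_tasks(log_text):
    """Return (timestamp, command, pid, seconds) for each hung-task warning."""
    return [
        (float(m["ts"]), m["comm"], int(m["pid"]), int(m["secs"]))
        for m in HUNG_TASK.finditer(log_text)
    ]
```

Feeding it the output of `dmesg` (captured as a string) would list which processes were reported as blocked and when, which may help correlate the hang with the bonnie++ run.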

[ceph-users] Re: kvm vm cephfs mount hangs on osd node (something like umount -l available?) (help wanted going to production)

2020-12-22 Thread Marc Roos
I can live-migrate the vm in this locked up state to a different host without any problems. -Original Message- To: ceph-users Subject: [ceph-users] kvm vm cephfs mount hangs on osd node (something like umount -l available?) (help wanted going to production) I have a vm on a osd nod

[ceph-users] Can big data use Ceph?

2020-12-22 Thread fantastic2085
Can big data use Ceph? For example, can Hive, HBase, or Spark use Ceph? Is https://github.com/ceph/cephfs-hadoop no longer maintained? ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io

[ceph-users] Re: Can big data use Ceph?

2020-12-22 Thread Brian :
Have a search for cern and ceph. On Tuesday, December 22, 2020, fantastic2085 wrote: > Can big data use Ceph?For example, can Hive Hbase Spark use Ceph? > https://github.com/ceph/cephfs-hadoop is no longer maintain? > ___ > ceph-users mailing list -- ce

[ceph-users] Re: Can big data use Ceph?

2020-12-22 Thread Marc Roos
I am not really familiar with Spark, but I often see it used in combination with Mesos. They have now implemented a CSI solution that should enable access to Ceph. I have been trying to get this to work [1]. I assume being able to scale tasks with distributed block devices or the cephfs would

[ceph-users] Re: kvm vm cephfs mount hangs on osd node (something like umount -l available?) (help wanted going to production)

2020-12-22 Thread Marc Roos
Is there not some genius out there who can shed some light on this? ;) Currently I am not able to reproduce this. Thus it would be nice to have some procedure at hand that resolves stale cephfs mounts nicely. -Original Message- To: ceph-users Subject: [ceph-users] kvm vm cephfs mount ha

[ceph-users] Re: Can big data use Ceph?

2020-12-22 Thread Matt Benjamin
Ceph RGW is frequently used as a backing store for Hadoop and Spark (S3A connector). Matt On Tue, Dec 22, 2020 at 5:29 AM fantastic2085 wrote: > > Can big data use Ceph?For example, can Hive Hbase Spark use Ceph? > https://github.com/ceph/cephfs-hadoop is no longer maintain? > __
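As a rough illustration of the S3A approach mentioned above, the sketch below builds the Hadoop S3A properties that would typically point a Hadoop or Spark job at an RGW endpoint. The property names are standard Hadoop S3A keys and the `spark.hadoop.` prefix is Spark's usual way of passing Hadoop settings; the helper names, the endpoint, and the credentials are hypothetical placeholders, not values from this thread.

```python
def s3a_conf_for_rgw(endpoint, access_key, secret_key, use_ssl=False):
    """Return Hadoop S3A properties for an S3-compatible endpoint such as RGW.

    Path-style access is enabled on the assumption that the gateway is
    addressed by host name rather than by per-bucket DNS names.
    """
    return {
        "fs.s3a.endpoint": endpoint,
        "fs.s3a.access.key": access_key,
        "fs.s3a.secret.key": secret_key,
        "fs.s3a.path.style.access": "true",
        "fs.s3a.connection.ssl.enabled": "true" if use_ssl else "false",
    }


def as_spark_conf(props):
    """Prefix Hadoop properties so they can be passed as Spark settings."""
    return {"spark.hadoop." + key: value for key, value in props.items()}
```

For example, `as_spark_conf(s3a_conf_for_rgw("rgw.example.internal:7480", "KEY", "SECRET"))` yields settings that could be supplied to Spark via `--conf`, after which data would be addressed with `s3a://bucket/path` URIs.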

[ceph-users] Re: kvm vm cephfs mount hangs on osd node (something like umount -l available?) (help wanted going to production)

2020-12-22 Thread Eugen Block
Hi, there have been several threads about hanging cephfs mounts; one quite long thread [1] describes a couple of debugging options but also advises against mounting cephfs on OSD nodes in a production environment. Do you see blacklisted clients with 'ceph osd blacklist ls'? If the ans

[ceph-users] Re: osd_pglog memory hoarding - another case

2020-12-22 Thread Kalle Happonen
For anybody facing similar issues, we wrote a blog post about everything we faced, and how we worked through it. https://cloud.blog.csc.fi/2020/12/allas-november-2020-incident-details.html Cheers, Kalle - Original Message - > From: "Kalle Happonen" > To: "Dan van der Ster" , "ceph-user

[ceph-users] Re: Data migration between clusters

2020-12-22 Thread Kalle Happonen
Hi Istvan, I'm not sure it helps, but here are at least some pitfalls we faced when migrating radosgws between clusters. https://cloud.blog.csc.fi/2019/12/ceph-object-storage-migraine-i-mean.html Cheers, Kalle - Original Message - > From: "Szabo, Istvan (Agoda)" > To: "ceph-users" > Sen

[ceph-users] Re: Debian repo for ceph-iscsi

2020-12-22 Thread Chris Palmer
Hi Joachim, Thanks for that pointer. I've pulled ceph-iscsi from there and am trying to get things going now on buster. The problem I have at the moment though is with python3-rtslib-fb. That hasn't been backported to buster, and the latest in the main buster repo is 2.1.66, but ceph-iscsi r

[ceph-users] Ceph rgw & dashboard problem

2020-12-22 Thread Mika Saari
Hi, Using Ceph Octopus installed with cephadm here. Version running currently is 15.2.6. There are 3 machines running the cluster. Machine names are introduced in /etc/hosts in long (FQDN) & short forms but ceph hostnames of the servers are in short form (not sure if this has any effect). rdb sid

[ceph-users] Re: PGs down

2020-12-22 Thread Jeremy Austin
Hi Igor, I had taken the OSDs out already, so bringing them up allowed a full rebalance to occur. I verified that they were not exhibiting ATA or SMART-reportable errors, wiped them and re-added. I will deep scrub. Thanks again! Jeremy On Mon, Dec 21, 2020 at 11:39 PM Igor Fedotov wrote: >

[ceph-users] Failing OSD RocksDB Corrupt

2020-12-22 Thread Ashley Merrick
Hello, I had some faulty power cables on some OSDs in one server, which caused lots of IO issues and disks appearing/disappearing. This has been corrected now; 2 of the 10 OSDs are working, however 8 are failing to start due to what looks to be a corrupt DB. When running a ceph-bluestore-tool fsck I g

[ceph-users] Re: Debian repo for ceph-iscsi

2020-12-22 Thread Chris Palmer
Pulling the package python3-rtslib-fb_2.1.71-3_all.deb from bullseye and manually installing it on buster seems to have done the trick. On 22/12/2020 13:20, Chris Palmer wrote: Hi Joachim, Thanks for that pointer. I've pulled ceph-iscsi from there and am trying to get things going now on bust

[ceph-users] Re: diskprediction_local fails with python3-sklearn 0.22.2

2020-12-22 Thread Reed Dier
I'm going to resurrect this thread in hopes that in the 6 months since, someone has found a solution? After recently upgrading my mgr's to 20.04 and 15.2.8, the diskprediction_local module is failing for me in the exact same manner. > $ dpkg -l | grep sklearn > ii python3-sklearn

[ceph-users] Re: Larger number of OSDs, cheroot, cherrypy, limits + containers == broken

2020-12-22 Thread Ken Dreyer
Thanks David! A couple of things have happened since the last update. The primary Fedora cheroot package maintainer updated cheroot from 8.5.0 to 8.5.1 in Rawhide. I've rebuilt this for el8 and put it into a new repository here: https://fedorapeople.org/~ktdreyer/bz1907005/ There are a few more s

[ceph-users] after octopus cluster reinstall, rbd map fails with timeout

2020-12-22 Thread Philip Brown
More banging on my prototype cluster, and I ran into an odd problem. It used to be that when I created an rbd device and then tried to map it, it would initially fail, saying I had to disable some features. Then I would just run the suggested disable line -- usually rbd feature disable poolname/rbdname object-ma