[ceph-users] Re: RGW bucket check --check-objects -fix failed

2019-09-05 Thread EDH - Manuel Rios Fernandez
Checking the shards, the bucket has 64, but shard 48efb8c3-693c-4fe0-bbe4-fdc16f590a82.16313306.1.1 seems to be missing. radosgw-admin fix won't recreate it; any recommendation? Maybe lower the shard count? Regards Manuel From: EDH - Manuel Rios Fernandez Sent: Thursday, 5
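
A hedged sketch of how one might confirm which index shard objects actually exist for this bucket. The index pool name default.rgw.buckets.index and the grep pattern are assumptions based on a default Nautilus RGW setup; substitute the real bucket name for BUCKET.

    # look up the bucket instance id/marker and the configured shard count
    radosgw-admin bucket stats --bucket=BUCKET
    radosgw-admin bucket limit check
    # list the per-shard index objects (.dir.<marker>.<shard>) in the index pool
    rados -p default.rgw.buckets.index ls | grep '^\.dir\.48efb8c3-693c-4fe0-bbe4-fdc16f590a82.16313306.1'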

[ceph-users] Re: Followup: weird behaviour with ceph osd pool create and the "crush-rule" parameter (suddenly changes behaviour)

2019-09-05 Thread Robert LeBlanc
On Thu, Sep 5, 2019 at 2:12 AM wrote: > As far as I could tell, there wasn't anything in the logs (at the default > log levels), other than the forwarded request to the mons... > > (I did read the documentation, which is why I was using the long form... > it's the inconsistency which is problemat

[ceph-users] Re: v14.2.3 Nautilus rpm dependency problem: ceph-selinux-14.2.3-0.el7.x86_64 Requires: selinux-policy-base >= 3.13.1-229.el7_6.15

2019-09-05 Thread Ning Li
I found out what was going on. I had set enabled=0 in the [updates] section of the repo file. Enabling the updates repo makes it work. [updates] name=CentOS-7 - Updates - mirrors.aliyun.com failovermethod=priority baseurl=http://mirrors.aliyun.com/centos/7/updates/$basearch/ http://mirrors.aliyuncs.com/centos
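
For reference, a minimal sketch of the repo stanza with the fix applied; the name and baseurl are taken from the message above, and enabled=1 is the relevant change.

    [updates]
    name=CentOS-7 - Updates - mirrors.aliyun.com
    failovermethod=priority
    baseurl=http://mirrors.aliyun.com/centos/7/updates/$basearch/
    enabled=1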

[ceph-users] v14.2.3 Nautilus rpm dependency problem: ceph-selinux-14.2.3-0.el7.x86_64 Requires: selinux-policy-base >= 3.13.1-229.el7_6.15

2019-09-05 Thread Ning Li
Hi list: When I yum install ceph-14.2.3, I run into the following dependency issue: --> Processing Dependency: selinux-policy-base >= 3.13.1-229.el7_6.15 for package: 2:ceph-selinux-14.2.3-0.el7.x86_64 ---> Package groff-base.x86_64 0:1.22.2-8.el7 will be installed ---> Package libnetfilter_connt
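
A quick sketch of how to check whether the enabled repos actually provide the required selinux-policy-base build (per the follow-up above, the needed build comes from the CentOS 7 updates repo):

    # list every selinux-policy-base build the configured repos offer
    yum --showduplicates list selinux-policy-base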

[ceph-users] Re: ceph mons stuck in electing state

2019-09-05 Thread Nick
Hi Huang, Thanks for offering to help, but this original issue with the ceph-mons not connecting was already diagnosed last week as a possible networking error at the hardware level. We originally removed all the mons except one to force it to come online without waiting for a quorum, and the netw

[ceph-users] Re: Applications slow in VMs running RBD disks

2019-09-05 Thread fengyd
Hi, 1. Find the Qemu pid with "ps ax" 2. Count the FDs used by Qemu with "ls /proc/$pid/fd/ | wc -w" // $pid is the Qemu pid 3. cat /proc/$pid/limits | grep -E 'open files | processes' Do you have the configuration file /etc/libvirt/qemu.conf on the computer? Find the line with max_
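
A minimal sketch of the qemu.conf settings being referred to; the numeric values are placeholders, not recommendations, and the limits only apply to QEMU processes started after libvirtd is restarted.

    # /etc/libvirt/qemu.conf
    max_files = 32768       # per-VM open-file limit
    max_processes = 8192    # per-VM process limit

    # restart libvirtd so newly started VMs pick up the limits
    systemctl restart libvirtd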

[ceph-users] Re: ceph mons stuck in electing state

2019-09-05 Thread Nick
Hi Ashley, The only change I made was increasing the osd_max_backfills from 3 to 10 at first, and when that ended up causing more problems than it helped, it was lowering the setting back down to 3 that took the cluster offline. I've actually been working on this issue for a week now and my compa
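
For reference, a sketch of the two usual ways to change osd_max_backfills at runtime; whether the centralized config form is available depends on the release in use.

    # inject on all running OSDs
    ceph tell osd.* injectargs '--osd_max_backfills 3'
    # or via the centralized config store (Mimic/Nautilus and later)
    ceph config set osd osd_max_backfills 3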

[ceph-users] RGW bucket check --check-objects -fix failed

2019-09-05 Thread EDH - Manuel Rios Fernandez
Hi, We're at 14.2.2. We just found a broken bucket index and are trying to repair it with the common commands. "radosgw-admin bucket check --check-objects --fix" finishes instantly, but the bucket should hold nearly 60-70 TB of data. [root@CEPH-MON01 home]# radosgw-admin bucket check --check-objects --bucket B
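
A sketch of the check-then-fix sequence being attempted, with BUCKET as a placeholder for the affected bucket name; running the check without --fix first shows what it would report.

    radosgw-admin bucket check --bucket=BUCKET --check-objects
    radosgw-admin bucket check --bucket=BUCKET --check-objects --fix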

[ceph-users] Re: Heavily-linked lists.ceph.com pipermail archive now appears to lead to 404s

2019-09-05 Thread Florian Haas
On 03/09/2019 18:42, Ilya Dryomov wrote: > On Tue, Sep 3, 2019 at 6:29 PM Florian Haas wrote: >> >> Hi, >> >> replying to my own message here in a shameless attempt to re-up this. I >> really hope that the list archive can be resurrected in one way or >> another... > > Adding David, who managed t

[ceph-users] Re: disk failure

2019-09-05 Thread Anthony D'Atri
Are you using Filestore? If so, directory splitting can manifest this way. Check your networking too: packet loss between OSD nodes, or between OSD nodes and the mons, can also manifest this way, say if bonding isn't working properly or you have a bad link. But as suggested below, check the OSD
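
If Filestore splitting is suspected, a sketch of how to inspect the relevant thresholds on one OSD (run on the OSD host; osd.0 is a placeholder):

    ceph daemon osd.0 config get filestore_split_multiple
    ceph daemon osd.0 config get filestore_merge_threshold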

[ceph-users] Re: disk failure

2019-09-05 Thread Nathan Fish
Disks failing should cause the OSD to exit, be marked down, and after around 15 minutes marked out. That's routine. An OSD flapping is something you need to look into. It could be a flaky drive, or extreme load as was mentioned. On Thu, Sep 5, 2019 at 2:27 PM solarflow99 wrote: > > disks are exp

[ceph-users] Re: disk failure

2019-09-05 Thread solarflow99
Disks are expected to fail, and every once in a while I'll lose one, so that was expected and didn't come as any surprise to me. Are you suggesting failed drives almost always stay down and out? On Thu, Sep 5, 2019 at 11:13 AM Ashley Merrick wrote: > I would suggest checking the logs and seein

[ceph-users] Re: disk failure

2019-09-05 Thread Ashley Merrick
I would suggest checking the logs and seeing the exact reason it's being marked out. If the disk is being hit hard and there are heavy I/O delays, then Ceph may see that as a delayed reply outside of the set window and mark it out. There are some variables that can be changed to give an OSD more t
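
A sketch of how to inspect the settings that define that window; osd.0 is a placeholder, and the "ceph config get" form assumes a Mimic/Nautilus-style centralized config (on older releases check ceph.conf or the daemon admin socket instead).

    # how long peers wait before reporting an OSD's heartbeats as failed
    ceph daemon osd.0 config get osd_heartbeat_grace
    # how long a down OSD stays "in" before being marked out automatically
    ceph config get mon mon_osd_down_out_interval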

[ceph-users] Re: disk failure

2019-09-05 Thread solarflow99
no, I mean ceph sees it as a failure and marks it out for a while On Thu, Sep 5, 2019 at 11:00 AM Ashley Merrick wrote: > Is your HD actually failing and vanishing from the OS and then coming back > shortly? > > Or do you just mean your OSD is crashing and then restarting it self > shortly later

[ceph-users] Re: disk failure

2019-09-05 Thread Ashley Merrick
Is your HD actually failing and vanishing from the OS and then coming back shortly? Or do you just mean your OSD is crashing and then restarting itself shortly later? On Fri, 06 Sep 2019 01:55:25 +0800 solarflo...@gmail.com wrote One of the things I've come to notice is when HDD

[ceph-users] disk failure

2019-09-05 Thread solarflow99
One of the things I've come to notice is that when HDD drives fail, they often recover in a short time and get added back to the cluster. This causes the data to rebalance back and forth, and if I set the noout flag I get a health warning. Is there a better way to avoid this?
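
Two hedged options for the situation described above. The HEALTH_WARN while noout is set is expected and purely informational; the 1800-second value is just an example, and "ceph config set" assumes a Mimic/Nautilus-style config store.

    # option 1: stop automatic mark-out entirely while investigating, then clear it
    ceph osd set noout
    ceph osd unset noout
    # option 2: lengthen the automatic mark-out delay instead (value in seconds)
    ceph config set mon mon_osd_down_out_interval 1800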

[ceph-users] Re: CephFS+NFS For VMWare

2019-09-05 Thread Maged Mokhtar
This is an old thread, but it could be useful for others. I found out the discrepancy in VMware vMotion speed under iSCSI is probably due to the "emulate_3pc" config attribute of the LIO target. If it is set to 0, then yes, VMware will issue I/O in 64 KB blocks, so the bandwidth will indeed be around 25 MB/s
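
A sketch of flipping that attribute on a plain LIO block backstore with targetcli; the backstore name rbd0 is an assumption, and ceph-iscsi/gwcli-managed user:rbd backstores are configured differently.

    targetcli /backstores/block/rbd0 set attribute emulate_3pc=1
    targetcli saveconfig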

[ceph-users] Re: Best osd scenario + ansible config?

2019-09-05 Thread Robert LeBlanc
On Tue, Sep 3, 2019 at 5:03 AM Yoann Moulin wrote: > > As for your EC 7+5 I would have gone for something like 8+3, as then you > have a spare node active in the cluster and can still provide full > protection in the event of a failure of a node. > > Makes sense! On another cluster, I have an EC 7
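
A sketch of creating the 8+3 layout discussed above; the profile name, pool name, and PG count are placeholders.

    ceph osd erasure-code-profile set ec-8-3 k=8 m=3 crush-failure-domain=host
    ceph osd pool create ecpool 128 128 erasure ec-8-3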

[ceph-users] Re: ceph fs crashes on simple fio test

2019-09-05 Thread Robert LeBlanc
On Tue, Sep 3, 2019 at 11:33 AM Frank Schilder wrote: > Hi Robert and Paul, > > sad news. I did a 5-second single-thread test after setting > osd_op_queue_cut_off=high on all OSDs and MDSs. Here are the current settings: > > [root@ceph-01 ~]# ceph config show osd.0 > NAME
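
For reference, a sketch of verifying and setting the option discussed in this thread. The commands assume the Nautilus-style centralized config seen in the quoted output, and the option is generally read at OSD start-up, so an OSD restart is assumed to be needed for it to take effect.

    # confirm the current value on one OSD
    ceph config show osd.0 | grep osd_op_queue
    # set it cluster-wide for all OSDs
    ceph config set osd osd_op_queue_cut_off high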

[ceph-users] bluestore_default_buffered_write

2019-09-05 Thread Fyodor Ustinov
Hi! Can anybody help me: if I turn on bluestore_default_buffered_write, will I get write-back or write-through behaviour? We can't work this out from the documentation. And the second question: is there, in general, an analogue of write-back in the OSD (I perfectly understand the danger of s
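
A sketch of how to inspect the option on a running OSD (osd.0 is a placeholder; run on the OSD host). As far as I understand it, the option only controls whether newly written data is kept in the BlueStore cache for later reads; the client ack still waits for the write to be committed, so it does not turn the OSD into a volatile write-back cache.

    ceph daemon osd.0 config get bluestore_default_buffered_write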

[ceph-users] Re: CEPH 14.2.3

2019-09-05 Thread Abhishek Lekshmanan
"Sasha Litvak" writes: > I wonder if it is possible to send an announcement at least a 1 business > day before you run the sync. You can state the date of the packages > availability in the announcement. This is just a suggestion. Maybe I can send a heads up in the devel/users list once we sta

[ceph-users] Re: Followup: weird behaviour with ceph osd pool create and the "crush-rule" parameter (suddenly changes behaviour)

2019-09-05 Thread aoanla
As far as I could tell, there wasn't anything in the logs (at the default log levels), other than the forwarded request to the mons... (I did read the documentation, which is why I was using the long form... it's the inconsistency which is problematic, especially since it also happens if you re

[ceph-users] Re: rados + radosstriper puts fail with "large" input objects (mimic/nautilus, ec pool)

2019-09-05 Thread Thomas Byrne - UKRI STFC
The placement groups are created, and the pool is completely functional for non-striper use. I can 'rados put' test objects just fine into a pool created with any k value. It's when the --striper option of 'rados put' is used to invoke libradosstriper that it fails on pools where k is n
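
A minimal reproduction sketch of the failing case, assuming a profile with k not a power of two; the profile/pool/object names, k/m values, and PG counts are placeholders, and per the report above the failure needs a "large" input object.

    ceph osd erasure-code-profile set k5m1 k=5 m=1 crush-failure-domain=host
    ceph osd pool create striper-test 32 32 erasure k5m1
    rados -p striper-test put plain-obj ./largefile               # plain put: works
    rados -p striper-test --striper put striped-obj ./largefile   # striper put: the failing case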

[ceph-users] Re: rados + radosstriper puts fail with "large" input objects (mimic/nautilus, ec pool)

2019-09-05 Thread Eugen Block
Hi, I'm not sure if this could really be a bug. We have configured several EC pools with different profiles, many of them with k not a power of 2, e.g. k3 m7 and k7 m11 on Luminous, and k5 m1 in a Nautilus test cluster, and I haven't had any issues with these profiles yet. Can you share the rule

[ceph-users] Re: Followup: weird behaviour with ceph osd pool create and the "crush-rule" parameter (suddenly changes behaviour)

2019-09-05 Thread Eugen Block
Hi, just a short note on automatic rule creation. A SUSE engineer pointed me to the fact that this is actually documented [1] in the pool creation section; under [crush-rule-name] it says: For replicated pools it is the rule specified by the osd pool default crush rule config variable. Th
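
A sketch of the two behaviours being contrasted; the pool name, PG counts, and rule name are placeholders.

    # relies on the osd_pool_default_crush_rule config variable
    ceph osd pool create mypool 64 64
    # names the rule explicitly (the long form discussed in the thread)
    ceph osd pool create mypool 64 64 replicated my-rule
    # confirm which rule the pool actually got
    ceph osd pool get mypool crush_rule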

[ceph-users] Re: Proposal to disable "Avoid Duplicates" on all ceph.io lists

2019-09-05 Thread Marco Gaiarin
Hello! Ilya Dryomov wrote... > I'm proposing that we disable this handler for all ceph.io lists. +1 -- dott. Marco Gaiarin GNUPG Key ID: 240A3D66 Associazione ``La Nostra Famiglia'' http://www.lanostrafamiglia.it/ Polo FVG -