Re: [ceph-users] Right way to delete OSD from cluster?

2019-02-28 Thread Fyodor Ustinov
Hi! Yes. But I am a little surprised by what is written in the documentation: http://docs.ceph.com/docs/mimic/rados/operations/add-or-rm-osds/ --- Before you remove an OSD, it is usually up and in. You need to take it out of the cluster so that Ceph can begin rebalancing and copying its data to

Re: [ceph-users] ceph osd pg-upmap-items not working

2019-02-28 Thread Dan van der Ster
It looks like that somewhat unusual crush rule is confusing the new upmap cleaning. (debug_mon 10 on the active mon should show those cleanups). I'm copying Xie Xingguo, and probably you should create a tracker for this. -- dan On Fri, Mar 1, 2019 at 3:12 AM Kári Bertilsson wrote: > > This i
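Raising the mon debug level Dan refers to can be done at runtime; a rough sketch, assuming mon.a is the active monitor and the default log location:
    ceph tell mon.a injectargs '--debug_mon 10/10'
    # watch /var/log/ceph/ceph-mon.a.log for the upmap clean-up messages, then restore the default
    ceph tell mon.a injectargs '--debug_mon 1/5'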

Re: [ceph-users] Mimic 13.2.4 rbd du slowness

2019-02-28 Thread Glen Baars
Here is the strace result.
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- --------
 99.94    0.236170         790       299         5 futex
  0.06    0.000136           0       365           brk
  0.00    0.00                 0

Re: [ceph-users] Mimic 13.2.4 rbd du slowness

2019-02-28 Thread David Turner
Have you used strace on the du command to see what it's spending its time doing? On Thu, Feb 28, 2019, 8:45 PM Glen Baars wrote: > Hello Wido, > > The cluster layout is as follows: > > 3 x Monitor hosts ( 2 x 10Gbit bonded ) > 9 x OSD hosts ( > 2 x 10Gbit bonded, > LSI cachecade and write cache
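For anyone repeating this, strace's summary mode is usually the quickest way to see where the time goes; a sketch with the pool/image name as a placeholder:
    # -c prints a per-syscall summary, -f follows child threads
    strace -c -f rbd du rbd/myimagename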

Re: [ceph-users] rbd unmap fails with error: rbd: sysfs write failed rbd: unmap failed: (16) Device or resource busy

2019-02-28 Thread David Turner
Why are you mapping the same rbd to multiple servers? On Wed, Feb 27, 2019, 9:50 AM Ilya Dryomov wrote: > On Wed, Feb 27, 2019 at 12:00 PM Thomas <74cmo...@gmail.com> wrote: > > > > Hi, > > I have noticed an error when writing to a mapped RBD. > > Therefore I unmounted the block device. > > Then
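Before force-unmapping a busy device it is worth checking what still holds it; a sketch with /dev/rbd0 and the mount point as placeholders (-o force discards in-flight I/O, so it is a last resort):
    lsof /dev/rbd0                  # open file handles on the device
    findmnt /dev/rbd0               # still mounted somewhere?
    umount /mnt/rbd0                # unmount first
    rbd unmap /dev/rbd0
    rbd unmap -o force /dev/rbd0    # only if you accept losing outstanding writes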

Re: [ceph-users] PG Calculations Issue

2019-02-28 Thread David Turner
Those numbers look right for a pool only containing 10% of your data. Now continue to calculate the pg counts for the remaining 90% of your data. On Wed, Feb 27, 2019, 12:17 PM Krishna Venkata wrote: > Greetings, > > > I am having issues in the way PGs are calculated in > https://ceph.com/pgcalc
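The per-pool arithmetic the pgcalc page applies is roughly: target PGs per OSD x OSD count x %data / replica size, rounded to a power of two. A sketch with made-up numbers (100 OSDs, size 3, target of 100 PGs per OSD):
    pool holding 10% of the data: 100 * 100 * 0.10 / 3 ~= 333  -> 512 PGs
    pool holding 90% of the data: 100 * 100 * 0.90 / 3 = 3000  -> 4096 PGs
    # (2048 is more than 25% below the raw value, so pgcalc rounds up)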

Re: [ceph-users] redirect log to syslog and disable log to stderr

2019-02-28 Thread David Turner
You can always set it in your ceph.conf file and restart the mgr daemon. On Tue, Feb 26, 2019, 1:30 PM Alex Litvak wrote: > Dear Cephers, > > In mimic 13.2.2 > ceph tell mgr.* injectargs --log-to-stderr=false > Returns an error (no valid command found ...). What is the correct way to > inject m
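A minimal sketch of that ceph.conf route; the section placement and the mgr unit name depend on your deployment, so treat these as assumptions:
    [mgr]
        log to stderr = false
        log to syslog = true
    # then restart the active mgr, e.g.
    systemctl restart ceph-mgr@$(hostname -s)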

Re: [ceph-users] Right way to delete OSD from cluster?

2019-02-28 Thread David Turner
The reason is that an osd still contributes to the host weight in the crush map even while it is marked out. When you out and then purge, the purging operation removes the osd from the map and changes the weight of the host, which changes the crush map and data moves. By weighting the osd to 0.0, th
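A sketch of that drain-first sequence, with osd.12 as a placeholder (purge needs Luminous or newer; older releases need crush remove / auth del / osd rm instead):
    ceph osd crush reweight osd.12 0.0       # drain the data first, so the host weight only changes once
    # wait for rebalancing to finish (ceph -s; Luminous+ also has 'ceph osd safe-to-destroy')
    ceph osd out 12
    systemctl stop ceph-osd@12               # on the OSD host
    ceph osd purge 12 --yes-i-really-mean-it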

Re: [ceph-users] Configuration about using nvme SSD

2019-02-28 Thread 韦皓诚
I have tried to divide an nvme disk into four partitions. However, rados bench showed no significant performance improvement. nvme with partitions: 1 node, 3 nvme, 12 osd, 166066 iops at 4K read; nvme without partitions: 1 node, 3 nvme, 3 osd, 163336 iops at 4K read. My ceph version is 12.2.4. What
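For what it's worth, ceph-volume releases newer than the 12.2.4 mentioned here can split a device without manual partitioning; a hedged sketch:
    # create 4 LVM-backed bluestore OSDs on one NVMe device
    # (requires a ceph-volume with the 'lvm batch' subcommand, which 12.2.4 predates)
    ceph-volume lvm batch --bluestore --osds-per-device 4 /dev/nvme0n1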

Re: [ceph-users] ceph osd pg-upmap-items not working

2019-02-28 Thread Kári Bertilsson
This is the pool pool 41 'ec82_pool' erasure size 10 min_size 8 crush_rule 1 object_hash rjenkins pg_num 512 pgp_num 512 last_change 63794 lfor 21731/21731 flags hashpspool,ec_overwrites stripe_width 32768 application cephfs removed_snaps [1~5] Here is the relevant crush rule: rule ec_pool

Re: [ceph-users] Mimic 13.2.4 rbd du slowness

2019-02-28 Thread Glen Baars
Hello Wido, The cluster layout is as follows: 3 x Monitor hosts ( 2 x 10Gbit bonded ) 9 x OSD hosts ( 2 x 10Gbit bonded, LSI cachecade and write cache drives set to single, All HDD in this pool, no separate DB / WAL. With the write cache and the SSD read cache on the LSI card it seems to perform

Re: [ceph-users] rbd space usage

2019-02-28 Thread Matthew H
It looks like he used 'rbd map' to map his volume. If so, then yes just run fstrim on the device. If it's an instance with a cinder, or a nova ephemeral disk (on ceph) then you have to use virtio-scsi to run discard in your instance. From: ceph-users on behalf

Re: [ceph-users] rbd space usage

2019-02-28 Thread Jack
Ha, that was your issue. RBD does not know that your space (on the filesystem level) is now free to use. You have to trim your filesystem; see fstrim(8) as well as the discard mount option. The related SCSI commands have to be passed down the stack, so you may need to check at other levels (for insta
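A sketch of that trim step, reusing the mount points from the df output earlier in the thread:
    fstrim -v /mnt/nfsroot/rbd0            # one-off: return free filesystem blocks to the image
    # or mount with continuous discard (can cost performance on some stacks)
    mount -o discard /dev/rbd0 /mnt/nfsroot/rbd0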

Re: [ceph-users] rbd space usage

2019-02-28 Thread Matthew H
I think the command you are looking for is 'rbd du', for example: rbd du rbd/myimagename From: ceph-users on behalf of solarflow99 Sent: Thursday, February 28, 2019 5:31 PM To: Jack Cc: Ceph Users Subject: Re: [ceph-users] rbd space usage yes, but: # rbd showmappe

Re: [ceph-users] rbd space usage

2019-02-28 Thread solarflow99
yes, but:
# rbd showmapped
id pool image snap device
0  rbd  nfs1  -    /dev/rbd0
1  rbd  nfs2  -    /dev/rbd1
# df -h
Filesystem  Size  Used  Avail  Use%  Mounted on
/dev/rbd0   8.0T  4.8T   3.3T   60%  /mnt/nfsroot/rbd0
/dev/rbd1   9.8T   34M   9.8T    1%  /mnt/nfsroot/rbd1
only 5T is tak

Re: [ceph-users] rbd space usage

2019-02-28 Thread Jack
Are you not using a 3-replica pool? (15745GB + 955GB + 1595M) * 3 ~= 51157G (there is overhead involved) Best regards, On 02/28/2019 11:09 PM, solarflow99 wrote: > thanks, I still can't understand what's taking up all the space 27.75 > > On Thu, Feb 28, 2019 at 7:18 AM Mohamad Gebai wrote: > >
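Spelled out: (15745 GB + 955 GB + ~1.6 GB) x 3 is roughly 50,100 GB, so the remaining ~1,000 GB up to the quoted 51,157 GB (about 2%) is the replication/filesystem overhead Jack refers to.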

Re: [ceph-users] rbd space usage

2019-02-28 Thread solarflow99
thanks, I still can't understand what's taking up all the space 27.75 On Thu, Feb 28, 2019 at 7:18 AM Mohamad Gebai wrote: > On 2/27/19 4:57 PM, Marc Roos wrote: > > They are 'thin provisioned' meaning if you create a 10GB rbd, it does > > not use 10GB at the start. (afaik) > > You can use 'rbd -

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-02-28 Thread Igor Fedotov
Also I think it makes sense to create a ticket at this point. Any volunteers? On 3/1/2019 1:00 AM, Igor Fedotov wrote: Wondering if somebody would be able to apply a simple patch that periodically resets StupidAllocator? Just to verify/disprove the hypothesis that it's allocator related On 2/28/2

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-02-28 Thread Igor Fedotov
Wondering if somebody would be able to apply a simple patch that periodically resets StupidAllocator? Just to verify/disprove the hypothesis that it's allocator related On 2/28/2019 11:57 PM, Stefan Kooman wrote: Quoting Wido den Hollander (w...@42on.com): Just wanted to chime in, I've seen thi

Re: [ceph-users] MDS_SLOW_METADATA_IO

2019-02-28 Thread Patrick Donnelly
On Thu, Feb 28, 2019 at 12:49 PM Stefan Kooman wrote: > > Dear list, > > After upgrading to 12.2.11 the MDSes are reporting slow metadata IOs > (MDS_SLOW_METADATA_IO). The metadata IOs would have been blocked for > more than 5 seconds. We have one active, and one active standby MDS. All > storage

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-02-28 Thread Stefan Kooman
Quoting Wido den Hollander (w...@42on.com): > Just wanted to chime in, I've seen this with Luminous+BlueStore+NVMe > OSDs as well. Over time their latency increased until we started to > notice I/O-wait inside VMs. On a Luminous 12.2.8 cluster with only SSDs we also hit this issue I guess. After
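The latency creep described here can be watched with the built-in counters; a sketch, with osd.0 as a placeholder:
    ceph osd perf                   # commit/apply latency for every OSD at a glance
    ceph daemon osd.0 perf dump     # detailed per-daemon counters, run on the OSD host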

[ceph-users] MDS_SLOW_METADATA_IO

2019-02-28 Thread Stefan Kooman
Dear list, After upgrading to 12.2.11 the MDSes are reporting slow metadata IOs (MDS_SLOW_METADATA_IO). The metadata IOs would have been blocked for more than 5 seconds. We have one active, and one active standby MDS. All storage on SSD (Samsung PM863a / Intel DC4500). No other (OSD) slow ops repo

[ceph-users] Fuse-Ceph mount timeout

2019-02-28 Thread Doug Bell
I am having trouble where all of the clients attached to a Ceph cluster are timing out when trying to perform a fuse mount of the cephfs volume. # ceph-fuse -f -m 10.1.2.157,10.1.2.194,10.0.2.191 /v --keyring /etc/ceph/ceph.client.admin.keyring --name client.admin -o debug 2019-02-21 20:13:46.7072
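When every client times out like this, it is worth first confirming plain monitor reachability and key validity; a sketch reusing the same monitor addresses:
    ceph --connect-timeout 10 -m 10.1.2.157 --name client.admin \
         --keyring /etc/ceph/ceph.client.admin.keyring -s
    nc -zv 10.1.2.157 6789          # is the mon port reachable at all?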

Re: [ceph-users] collectd problems with pools

2019-02-28 Thread Reed Dier
I've been collecting with collectd since Jewel, and experienced the growing pains when moving to Luminous and collectd-ceph needing to be reworked to support Luminous. It is also worth mentioning that in Luminous+ there is an Influx plugin for ceph-mgr that has some per pool statistics. Reed
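Enabling that mgr influx module looks roughly like this on Luminous/Mimic; the config-key paths are my assumption from that era's docs, so check the module documentation for your release:
    ceph mgr module enable influx
    ceph config-key set mgr/influx/hostname influx.example.com
    ceph config-key set mgr/influx/database ceph
    ceph config-key set mgr/influx/username ceph
    ceph config-key set mgr/influx/password secret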

Re: [ceph-users] radosgw sync falling behind regularly

2019-02-28 Thread Christian Rice
Yeah my bad on the typo, not running 12.8.8 ☺ It’s 12.2.8. We can upgrade and will attempt to do so asap. Thanks for that, I need to read my release notes more carefully, I guess! From: Matthew H Date: Wednesday, February 27, 2019 at 8:33 PM To: Christian Rice , ceph-users Subject: Re: rado

Re: [ceph-users] collectd problems with pools

2019-02-28 Thread Matthew Vernon
Hi, On 28/02/2019 17:00, Marc Roos wrote: Should you not be pasting that as an issue on github collectd-ceph? I hope you don't mind me asking, I am also using collectd and dumping the data to influx. Are you downsampling with influx? ( I am not :/ [0]) It might be "ask collectd-ceph authors n

Re: [ceph-users] RBD poor performance

2019-02-28 Thread Maged Mokhtar
Hi Mark, The 38K iops for a single OSD is quite good. For the 4 OSDs, I think the 55K iops may start to be impacted by network latency on the server node. It will be interesting to know, when using something more common like 3x replica, what additional amplification factor we see over the replic

Re: [ceph-users] collectd problems with pools

2019-02-28 Thread Marc Roos
Should you not be pasting that as an issue on github collectd-ceph? I hope you don't mind me asking, I am also using collectd and dumping the data to influx. Are you downsampling with influx? ( I am not :/ [0]) [0] https://community.influxdata.com/t/how-does-grouping-work-does-it-work/7936

[ceph-users] collectd problems with pools

2019-02-28 Thread Matthew Vernon
Hi, We monitor our Ceph clusters (production is Jewel, test clusters are on Luminous) with collectd and its official ceph plugin. The one thing that's missing is per-pool outputs - the collectd plugin just talks to the individual daemons, none of which have pool details in - those are availa
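Until the plugin grows pool support, the per-pool numbers are exposed by the cluster itself and can be scraped separately; a sketch:
    ceph df detail --format json-pretty        # per-pool usage and object counts
    ceph osd pool stats --format json-pretty   # per-pool client I/O rates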

Re: [ceph-users] [EXTERNAL] Re: Multi-Site Cluster RGW Sync issues

2019-02-28 Thread Benjamin . Zieglmeier
The output has 57000 lines (and growing). I’ve uploaded the output to: https://gist.github.com/zieg8301/7e6952e9964c1e0964fb63f61e7b7be7 Thanks, Ben From: Matthew H Date: Wednesday, February 27, 2019 at 11:02 PM To: "Benjamin. Zieglmeier" Cc: "ceph-users@lists.ceph.com" Subject: [EXTERNAL] Re

Re: [ceph-users] rbd space usage

2019-02-28 Thread Mohamad Gebai
On 2/27/19 4:57 PM, Marc Roos wrote: > They are 'thin provisioned' meaning if you create a 10GB rbd, it does > not use 10GB at the start. (afaik) You can use 'rbd -p rbd du' to see how much of these devices is provisioned and see if it's coherent. Mohamad > > > -Original Message- > From

Re: [ceph-users] Fwd: Re: Blocked ops after change from filestore on HDD to bluestore on SDD

2019-02-28 Thread Uwe Sauter
olcDbShmKey only applies to the BDB and HDB backends but I'm using the new MDB backend. On 28.02.19 at 14:47, Marc Roos wrote: > If you have disk io every second with your current settings, which I > also had with 'default' settings, there are some optimizations you can > do, bringing it down to

Re: [ceph-users] Cephfs recursive stats | rctime in the future

2019-02-28 Thread Yan, Zheng
On Thu, Feb 28, 2019 at 5:33 PM David C wrote: > > On Wed, Feb 27, 2019 at 11:35 AM Hector Martin wrote: >> >> On 27/02/2019 19:22, David C wrote: >> > Hi All >> > >> > I'm seeing quite a few directories in my filesystem with rctime years in >> > the future. E.g >> > >> > ]# getfattr -d -m ceph.d

[ceph-users] Bluestore lvm wal and db in ssd disk with ceph-ansible

2019-02-28 Thread Andres Rojas Guerrero
Hi all, I have another newbie question. We are trying to deploy a Mimic ceph cluster with bluestore, with the WAL and DB data on SSD disks. For this we are using the ceph-ansible approach; we have seen that ceph-ansible has a playbook to create the lvm structure (lv-create.yml) but it seems only
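As an alternative to lv-create.yml, the osds role can also be pointed at pre-built LVs directly; a hedged sketch of group_vars/osds.yml for one OSD with its DB and WAL on an SSD volume group (all names are placeholders and the exact variables vary between ceph-ansible versions):
    osd_scenario: lvm
    lvm_volumes:
      - data: data-lv1
        data_vg: hdd-vg1
        db: db-lv1
        db_vg: ssd-vg
        wal: wal-lv1
        wal_vg: ssd-vg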

Re: [ceph-users] [Ceph-community] How does ceph use the STS service?

2019-02-28 Thread Sage Weil
On Thu, 28 Feb 2019, Matthew H wrote: > This feature is in the Nautilus release. > > The first release (14.1.0) of Nautilus is available from > download.ceph.com as of last Friday. Please keep in mind this is a release candidate. The first official stable nautilus release will be 14.2.0 in a w

Re: [ceph-users] Fwd: Re: Blocked ops after change from filestore on HDD to bluestore on SDD

2019-02-28 Thread Uwe Sauter
I already sent my configuration to the list about 3.5h ago but here it is again:
[global]
    auth client required = cephx
    auth cluster required = cephx
    auth service required = cephx
    cluster network = 169.254.42.0/24
    fsid = 753c9bbd-74bd-4fea-8c1e-88da775c5ad4
    keyring = /etc/pve/priv/$clu

Re: [ceph-users] Fwd: Re: Blocked ops after change from filestore on HDD to bluestore on SDD

2019-02-28 Thread Marc Roos
If you have disk io every second with your current settings, which I also had with 'default' settings, there are some optimizations you can do, bringing it down to every 50 seconds or so. Adding the olcDbShmKey will allow slapd to access the db cache. I am getting an error of sharedmemory s

Re: [ceph-users] Fwd: Re: Blocked ops after change from filestore on HDD to bluestore on SDD

2019-02-28 Thread Matthew H
Could you send your ceph.conf file over please? Are you setting any tunables for OSD or Bluestore currently? From: ceph-users on behalf of Uwe Sauter Sent: Thursday, February 28, 2019 8:33 AM To: Marc Roos; ceph-users; vitalif Subject: Re: [ceph-users] Fwd: Re:

[ceph-users] ceph tracker login failed

2019-02-28 Thread M Ranga Swami Reddy
I tried to log in to the ceph tracker - it is failing with the OpenID URL. I tried to log in at http://tracker.ceph.com/login with my OpenID: https://code.launchpad.net/~swamireddy ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi

Re: [ceph-users] Fwd: Re: Blocked ops after change from filestore on HDD to bluestore on SDD

2019-02-28 Thread Uwe Sauter
Do you have anything particular in mind? I'm using the mdb backend with maxsize = 1GB but currently the files are only about 23MB. > > I am having quite a few openldap servers (slaves) running also, make > sure to use proper caching that saves a lot of disk io. > > > > > -Original Messag

Re: [ceph-users] Fwd: Re: Blocked ops after change from filestore on HDD to bluestore on SDD

2019-02-28 Thread Marc Roos
I am running quite a few openldap servers (slaves) as well; make sure to use proper caching, which saves a lot of disk io. -Original Message- Sent: 28 February 2019 13:56 To: uwe.sauter...@gmail.com; Uwe Sauter; Ceph Users Subject: *SPAM* Re: [ceph-users] Fwd: Re: Blocked

Re: [ceph-users] Fwd: Re: Blocked ops after change from filestore on HDD to bluestore on SDD

2019-02-28 Thread Виталий Филиппов
"Advanced power loss protection" is in fact a performance feature, not a safety one. 28 февраля 2019 г. 13:03:51 GMT+03:00, Uwe Sauter пишет: >Hi all, > >thanks for your insights. > >Eneko, > >> We tried to use a Samsung 840 Pro SSD as OSD some time ago and it was >a no-go; it wasn't that perfo

Re: [ceph-users] osd exit common/Thread.cc: 160: FAILED assert(ret == 0)--10.2.10

2019-02-28 Thread lin zhou
Thanks Greg. I found the limit; it is /proc/sys/kernel/threads-max. I count thread numbers using: ps -eo nlwp | tail -n +2 | awk '{ num_threads += $1 } END { print num_threads }' (output: 97981). On Thu, Feb 28, 2019 at 10:33 AM, lin zhou wrote: > Thanks, Greg. Your reply is always so fast. > > I check my system these li
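For completeness, the limit (and its usual companion pid_max) can be inspected and raised like this; the values below are placeholders:
    sysctl kernel.threads-max kernel.pid_max
    # raise persistently, e.g. in /etc/sysctl.d/90-ceph.conf:
    kernel.threads-max = 4194304
    kernel.pid_max = 4194304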

[ceph-users] Fwd: Re: Blocked ops after change from filestore on HDD to bluestore on SDD

2019-02-28 Thread Uwe Sauter
Hi all, thanks for your insights. Eneko, > We tried to use a Samsung 840 Pro SSD as OSD some time ago and it was a > no-go; it wasn't that performance was bad, it > just didn't work for the kind of use an OSD requires. Any HDD was better than it (the > disk was healthy and had been used in a > softw
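The usual quick check for whether an SSD is suitable as a Ceph journal/WAL device is a single-job synchronous 4k write test; a sketch (destructive if pointed at a raw device, so use a scratch disk or a file):
    fio --name=sync-write-test --filename=/dev/sdX --direct=1 --sync=1 \
        --rw=write --bs=4k --numjobs=1 --iodepth=1 --runtime=60 --time_based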

Re: [ceph-users] Blocked ops after change from filestore on HDD to bluestore on SDD

2019-02-28 Thread Uwe Sauter
On 28.02.19 at 10:42, Matthew H wrote: > Have you made any changes to your ceph.conf? If so, would you mind copying > them into this thread? No, I just deleted an OSD, replaced the HDD with an SSD and created a new OSD (with bluestore). Once the cluster was healthy again, I repeated with the next OSD.

Re: [ceph-users] REQUEST_SLOW across many OSDs at the same time

2019-02-28 Thread Matthew H
Is fstrim or discard enabled for these SSDs? If so, how did you enable it? I've seen similar issues with poor controllers on SSDs. They tend to block I/O when trim kicks off. Thanks, From: ceph-users on behalf of Paul Emmerich Sent: Friday, February 22, 201
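On a typical systemd host, both questions can be answered like this:
    systemctl is-enabled fstrim.timer           # is the periodic trim job on?
    systemctl list-timers fstrim.timer
    findmnt -o TARGET,OPTIONS | grep discard    # filesystems mounted with continuous discard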

Re: [ceph-users] Blocked ops after change from filestore on HDD to bluestore on SDD

2019-02-28 Thread Matthew H
Have you made any changes to your ceph.conf? If so, would you mind copying them into this thread? From: ceph-users on behalf of Vitaliy Filippov Sent: Wednesday, February 27, 2019 4:21 PM To: Ceph Users Subject: Re: [ceph-users] Blocked ops after change from fi

Re: [ceph-users] Cephfs recursive stats | rctime in the future

2019-02-28 Thread David C
On Wed, Feb 27, 2019 at 11:35 AM Hector Martin wrote: > On 27/02/2019 19:22, David C wrote: > > Hi All > > > > I'm seeing quite a few directories in my filesystem with rctime years in > > the future. E.g > > > > ]# getfattr -d -m ceph.dir.* /path/to/dir > > getfattr: Removing leading '/' from abs

Re: [ceph-users] [Ceph-community] How does ceph use the STS service?

2019-02-28 Thread Matthew H
This feature is in the Nautilus release. The first release (14.1.0) of Nautilus is available from download.ceph.com as of last Friday. From: ceph-users on behalf of admin Sent: Thursday, February 28, 2019 4:22 AM To: Pritha Srivastava; Sage Weil; ceph-us...@ce

Re: [ceph-users] [Ceph-community] How does ceph use the STS service?

2019-02-28 Thread admin
Hi, can you tell me the version that includes STS lite? Thanks, myxingkong ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

Re: [ceph-users] Mimic 13.2.4 rbd du slowness

2019-02-28 Thread Wido den Hollander
On 2/28/19 9:41 AM, Glen Baars wrote: > Hello Wido, > > I have looked at the libvirt code and there is a check to ensure that > fast-diff is enabled on the image and only then does it try to get the real > disk usage. The issue for me is that even with fast-diff enabled it takes > 25min to g

Re: [ceph-users] Mimic 13.2.4 rbd du slowness

2019-02-28 Thread Glen Baars
Hello Wido, I have looked at the libvirt code and there is a check to ensure that fast-diff is enabled on the image and only then does it try to get the real disk usage. The issue for me is that even with fast-diff enabled it takes 25min to get the space usage for a 50TB image. I had considere
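When fast-diff is enabled but rbd du stays slow, it is worth confirming the object map is actually valid for the image and its snapshots; a sketch with rbd/bigimage as a placeholder:
    rbd info rbd/bigimage                  # features should list object-map and fast-diff
    rbd object-map check rbd/bigimage      # reports whether the map is flagged invalid
    rbd object-map rebuild rbd/bigimage    # rebuild; repeat per snapshot if needed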

Re: [ceph-users] ceph osd pg-upmap-items not working

2019-02-28 Thread Dan van der Ster
Hi, pg-upmap-items became more strict in v12.2.11 when validating upmaps. E.g., it now won't let you put two PGs in the same rack if the crush rule doesn't allow it. Where are OSDs 23 and 123 in your cluster? What is the relevant crush rule? -- dan On Wed, Feb 27, 2019 at 9:17 PM Kári Bertilss
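Both questions can be answered directly from the cluster; a sketch using the pool and rule names quoted elsewhere in the thread:
    ceph osd find 23
    ceph osd find 123
    ceph osd crush rule dump ec_pool
    ceph osd pool get ec82_pool crush_rule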