Hi!
Yes. But I am a little surprised by what is written in the documentation:
http://docs.ceph.com/docs/mimic/rados/operations/add-or-rm-osds/
---
Before you remove an OSD, it is usually up and in. You need to take it out of
the cluster so that Ceph can begin rebalancing and copying its data to
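For reference, the first step that page describes can be sketched as (osd id hypothetical):
ceph osd out 1
# watch the rebalance finish before removing the osd:
ceph -w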
It looks like that somewhat unusual crush rule is confusing the new
upmap cleaning.
(debug_mon 10 on the active mon should show those cleanups).
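A sketch of turning that logging on (mon id hypothetical):
ceph tell mon.a injectargs '--debug_mon 10'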
I'm copying Xie Xingguo, and probably you should create a tracker for this.
-- dan
On Fri, Mar 1, 2019 at 3:12 AM Kári Bertilsson wrote:
>
> This i
Here is the strace result.
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 99.94    0.236170         790       299         5 futex
  0.06    0.000136           0       365           brk
  0.00    0.00 0
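That summary is the output format of 'strace -c'; a sketch of how it might have been captured (image name hypothetical):
strace -c rbd du rbd/nfs1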
Have you used strace on the du command to see what it's spending its time
doing?
On Thu, Feb 28, 2019, 8:45 PM Glen Baars
wrote:
> Hello Wido,
>
> The cluster layout is as follows:
>
> 3 x Monitor hosts ( 2 x 10Gbit bonded )
> 9 x OSD hosts (
> 2 x 10Gbit bonded,
> LSI cachecade and write cache
Why are you mapping the same rbd to multiple servers?
On Wed, Feb 27, 2019, 9:50 AM Ilya Dryomov wrote:
> On Wed, Feb 27, 2019 at 12:00 PM Thomas <74cmo...@gmail.com> wrote:
> >
> > Hi,
> > I have noticed an error when writing to a mapped RBD.
> > Therefore I unmounted the block device.
> > Then
Those numbers look right for a pool only containing 10% of your data. Now
continue to calculate the pg counts for the remaining 90% of your data.
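As a hedged worked example of that arithmetic (cluster numbers illustrative), following the pgcalc formula:
pg_num = (target PGs per OSD * OSD count * %data) / replica size
       = (100 * 100 * 0.90) / 3
       = 3000 -> round to the next power of two: 4096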
On Wed, Feb 27, 2019, 12:17 PM Krishna Venkata
wrote:
> Greetings,
>
>
> I am having issues with the way PGs are calculated in
> https://ceph.com/pgcalc
You can always set it in your ceph.conf file and restart the mgr daemon.
On Tue, Feb 26, 2019, 1:30 PM Alex Litvak
wrote:
> Dear Cephers,
>
> In mimic 13.2.2
> ceph tell mgr.* injectargs --log-to-stderr=false
> Returns an error (no valid command found ...). What is the correct way to
> inject m
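A sketch of the ceph.conf route suggested above (mgr instance name hypothetical):
[mgr]
log to stderr = false
# then restart the daemon:
systemctl restart ceph-mgr@host1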
The reason is that an osd still contributes to the host weight in the crush
map even while it is marked out. When you out and then purge, the purge
operation removes the osd from the map and changes the weight of the host,
which changes the crush map and data moves. By weighting the osd to 0.0,
th
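A sketch of that drain-first sequence (osd id hypothetical):
ceph osd crush reweight osd.12 0.0
# wait for backfill to finish, then:
ceph osd out 12
ceph osd purge 12 --yes-i-really-mean-it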
I have tried to divide an nvme disk into four partitions. However, rados
bench showed no significant performance improvement.
nvme with partitions: 1 node, 3 nvme, 12 osd: 166066 iops in 4K read
nvme without partitions: 1 node, 3 nvme, 3 osd: 163336 iops in 4K read
My ceph version is 12.2.4.
What
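For reference, a 4K read benchmark of the kind quoted above might be run like this (pool name and runtime hypothetical):
rados bench -p testpool 60 write -b 4096 -t 16 --no-cleanup
rados bench -p testpool 60 rand -t 16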
This is the pool
pool 41 'ec82_pool' erasure size 10 min_size 8 crush_rule 1 object_hash
rjenkins pg_num 512 pgp_num 512 last_change 63794 lfor 21731/21731 flags
hashpspool,ec_overwrites stripe_width 32768 application cephfs
removed_snaps [1~5]
Here is the relevant crush rule:
rule ec_pool
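(The rule text is cut off above; it can be dumped in full with: ceph osd crush rule dump ec_pool)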
Hello Wido,
The cluster layout is as follows:
3 x Monitor hosts ( 2 x 10Gbit bonded )
9 x OSD hosts (
2 x 10Gbit bonded,
LSI cachecade and write cache drives set to single,
All HDD in this pool,
no separate DB / WAL. With the write cache and the SSD read cache on the LSI
card it seems to perform
It looks like he used 'rbd map' to map his volume. If so, then yes just run
fstrim on the device.
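A sketch, reusing a mount point that appears later in this thread (hypothetical for this host):
fstrim -v /mnt/nfsroot/rbd0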
If it's an instance with a cinder volume, or a nova ephemeral disk (on ceph), then you
have to use virtio-scsi to run discard in your instance.
From: ceph-users on behalf
Ha, that was your issue
RBD does not know that your space (on the filesystem level) is now free
to use
You have to trim your filesystem, see fstrim(8) as well as the discard
mount option
The related SCSI commands have to be passed down the stack, so you may
need to check at other levels (for insta
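A sketch of the mount-option route (device and mount point hypothetical):
mount -o discard /dev/rbd0 /mnt/nfsroot/rbd0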
I think the command you are looking for is 'rbd du'
example
rbd du rbd/myimagename
From: ceph-users on behalf of solarflow99
Sent: Thursday, February 28, 2019 5:31 PM
To: Jack
Cc: Ceph Users
Subject: Re: [ceph-users] rbd space usage
yes, but:
# rbd showmappe
yes, but:
# rbd showmapped
id pool image snap device
0  rbd  nfs1  -    /dev/rbd0
1  rbd  nfs2  -    /dev/rbd1
# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/rbd0 8.0T 4.8T 3.3T 60% /mnt/nfsroot/rbd0
/dev/rbd1 9.8T 34M 9.8T 1% /mnt/nfsroot/rbd1
only 5T is tak
Aren't you using a 3-replica pool?
(15745GB + 955GB + 1595M) * 3 ~= 51157G (there is overhead involved)
Best regards,
On 02/28/2019 11:09 PM, solarflow99 wrote:
> thanks, I still can't understand what's taking up all the space 27.75
>
> On Thu, Feb 28, 2019 at 7:18 AM Mohamad Gebai wrote:
>
>
thanks, I still can't understand what's taking up all the space 27.75
On Thu, Feb 28, 2019 at 7:18 AM Mohamad Gebai wrote:
> On 2/27/19 4:57 PM, Marc Roos wrote:
> > They are 'thin provisioned' meaning if you create a 10GB rbd, it does
> > not use 10GB at the start. (afaik)
>
> You can use 'rbd -
Also I think it makes sense to create a ticket at this point. Any
volunteers?
On 3/1/2019 1:00 AM, Igor Fedotov wrote:
Wondering if somebody would be able to apply a simple patch that
periodically resets StupidAllocator?
Just to verify/disprove the hypothesis that it's allocator related.
On 2/28/2
Wondering if somebody would be able to apply a simple patch that
periodically resets StupidAllocator?
Just to verify/disprove the hypothesis that it's allocator related.
On 2/28/2019 11:57 PM, Stefan Kooman wrote:
Quoting Wido den Hollander (w...@42on.com):
Just wanted to chime in, I've seen thi
On Thu, Feb 28, 2019 at 12:49 PM Stefan Kooman wrote:
>
> Dear list,
>
> After upgrading to 12.2.11 the MDSes are reporting slow metadata IOs
> (MDS_SLOW_METADATA_IO). The metadata IOs would have been blocked for
more than 5 seconds. We have one active, and one active standby MDS. All
> storage
Quoting Wido den Hollander (w...@42on.com):
> Just wanted to chime in, I've seen this with Luminous+BlueStore+NVMe
> OSDs as well. Over time their latency increased until we started to
> notice I/O-wait inside VMs.
On a Luminous 12.2.8 cluster with only SSDs we also hit this issue I
guess. After
Dear list,
After upgrading to 12.2.11 the MDSes are reporting slow metadata IOs
(MDS_SLOW_METADATA_IO). The metadata IOs would have been blocked for
more than 5 seconds. We have one active, and one active standby MDS. All
storage on SSD (Samsung PM863a / Intel DC4500). No other (OSD) slow ops
repo
I am having trouble where all of the clients attached to a Ceph cluster are
timing out when trying to perform a fuse mount of the cephfs volume.
# ceph-fuse -f -m 10.1.2.157,10.1.2.194,10.0.2.191 /v --keyring
/etc/ceph/ceph.client.admin.keyring --name client.admin -o debug
2019-02-21 20:13:46.7072
I've been collecting with collectd since Jewel, and experienced the growing
pains when moving to Luminous, when collectd-ceph needed to be reworked to
support Luminous.
It is also worth mentioning that in Luminous+ there is an Influx plugin for
ceph-mgr that has some per pool statistics.
Reed
Yeah my bad on the typo, not running 12.8.8 ☺ It’s 12.2.8. We can upgrade and
will attempt to do so asap. Thanks for that, I need to read my release notes
more carefully, I guess!
From: Matthew H
Date: Wednesday, February 27, 2019 at 8:33 PM
To: Christian Rice , ceph-users
Subject: Re: rado
Hi,
On 28/02/2019 17:00, Marc Roos wrote:
Should you not be pasting that as an issue on github collectd-ceph? I
hope you don't mind me asking, I am also using collectd and dumping the
data to influx. Are you downsampling with influx? ( I am not :/ [0])
It might be "ask collectd-ceph authors n
Hi Mark,
The 38K iops for a single OSD is quite good. For the 4 OSDs, I think the
55K iops may start to be impacted by network latency on the server node.
It will be interesting to know when using something more common like 3x
replica, what additional amplification factor we see over the replic
Should you not be pasting that as an issue on github collectd-ceph? I
hope you don't mind me asking, I am also using collectd and dumping the
data to influx. Are you downsampling with influx? ( I am not :/ [0])
[0]
https://community.influxdata.com/t/how-does-grouping-work-does-it-work/7936
Hi,
We monitor our Ceph clusters (production is Jewel, test clusters are on
Luminous) with collectd and its official ceph plugin.
The one thing that's missing is per-pool outputs - the collectd plugin
just talks to the individual daemons, none of which have pool details in
- those are availa
The output has 57000 lines (and growing). I’ve uploaded the output to:
https://gist.github.com/zieg8301/7e6952e9964c1e0964fb63f61e7b7be7
Thanks,
Ben
From: Matthew H
Date: Wednesday, February 27, 2019 at 11:02 PM
To: "Benjamin. Zieglmeier"
Cc: "ceph-users@lists.ceph.com"
Subject: [EXTERNAL] Re
On 2/27/19 4:57 PM, Marc Roos wrote:
> They are 'thin provisioned' meaning if you create a 10GB rbd, it does
> not use 10GB at the start. (afaik)
You can use 'rbd -p rbd du' to see how much of these devices is
provisioned and see if it's coherent.
Mohamad
>
>
> -Original Message-
> From
olcDbShmKey only applies to BDB and HDB backends but I'm using the new MDB
backend.
On 28.02.19 at 14:47, Marc Roos wrote:
> If you are seeing disk io every second with your current settings - which I
> also had with 'default' settings - there are some optimizations you can
> do, bringing it down to
On Thu, Feb 28, 2019 at 5:33 PM David C wrote:
>
> On Wed, Feb 27, 2019 at 11:35 AM Hector Martin wrote:
>>
>> On 27/02/2019 19:22, David C wrote:
>> > Hi All
>> >
>> > I'm seeing quite a few directories in my filesystem with rctime years in
>> > the future. E.g
>> >
>> > ]# getfattr -d -m ceph.d
Hi all, I have another newbie question: we are trying to deploy a Mimic
ceph cluster with bluestore, with the WAL and DB data on SSD disks.
For this we are using the ceph-ansible approach; we have seen that
ceph-ansible has a playbook to create the lvm structure
(lv-create.yml) but it seems only
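For reference, a hedged sketch of the lvm_volumes layout that ceph-ansible's osds.yml accepts (LV/VG names hypothetical):
lvm_volumes:
  - data: data-lv1
    data_vg: vg_hdd1
    db: db-lv1
    db_vg: vg_ssd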
On Thu, 28 Feb 2019, Matthew H wrote:
> This feature is in the Nautilus release.
>
> The first release (14.1.0) of Nautilus is available from
> download.ceph.com as of last Friday.
Please keep in mind this is a release candidate. The first official
stable nautilus release will be 14.2.0 in a w
I already sent my configuration to the list about 3.5h ago, but here it is again:
[global]
auth client required = cephx
auth cluster required = cephx
auth service required = cephx
cluster network = 169.254.42.0/24
fsid = 753c9bbd-74bd-4fea-8c1e-88da775c5ad4
keyring = /etc/pve/priv/$clu
If you are seeing disk io every second with your current settings - which I
also had with 'default' settings - there are some optimizations you can
do, bringing it down to one io every 50 seconds or so. Adding the olcDbShmKey
will allow for slapd to access the db cache.
I am getting an error of sharedmemory s
Could you send your ceph.conf file over please? Are you setting any tunables
for OSD or Bluestore currently?
From: ceph-users on behalf of Uwe Sauter
Sent: Thursday, February 28, 2019 8:33 AM
To: Marc Roos; ceph-users; vitalif
Subject: Re: [ceph-users] Fwd: Re:
I tried to log in to the ceph tracker - it is failing with the OpenID URL.
I tried with my OpenID:
http://tracker.ceph.com/login
my id: https://code.launchpad.net/~swamireddy
Do you have anything particular in mind? I'm using mdb backend with maxsize =
1GB but currently the files are only about 23MB.
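A sketch of how that limit is raised via cn=config (database DN hypothetical):
dn: olcDatabase={1}mdb,cn=config
changetype: modify
replace: olcDbMaxSize
olcDbMaxSize: 1073741824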
>
> I have quite a few openldap servers (slaves) running as well; make
> sure to use proper caching, which saves a lot of disk io.
>
>
>
>
> -Original Messag
I have quite a few openldap servers (slaves) running as well; make
sure to use proper caching, which saves a lot of disk io.
-Original Message-
Sent: 28 February 2019 13:56
To: uwe.sauter...@gmail.com; Uwe Sauter; Ceph Users
Subject: *SPAM* Re: [ceph-users] Fwd: Re: Blocked
"Advanced power loss protection" is in fact a performance feature, not a safety
one.
On February 28, 2019 13:03:51 GMT+03:00, Uwe Sauter wrote:
>Hi all,
>
>thanks for your insights.
>
>Eneko,
>
>> We tried to use a Samsung 840 Pro SSD as OSD some time ago and it was
>a no-go; it wasn't that perfo
Thanks Greg. I found the limit: it is /proc/sys/kernel/threads-max.
I counted the total number of threads using:
ps -eo nlwp | tail -n +2 | awk '{ num_threads += $1 } END { print num_threads }'
which currently reports 97981.
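If the limit needs raising, a sketch (value illustrative):
sysctl -w kernel.threads-max=200000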
On Thu, Feb 28, 2019 at 10:33 AM, lin zhou wrote:
> Thanks, Greg. Your reply always so fast.
>
> I check my system these li
Hi all,
thanks for your insights.
Eneko,
> We tried to use a Samsung 840 Pro SSD as OSD some time ago and it was a
> no-go; it wasn't that performance was bad, it
> just didn't work for the kind of use an OSD requires. Any HDD was better than it (the
> disk was healthy and had been used in a
> softw
On 28.02.19 at 10:42, Matthew H wrote:
> Have you made any changes to your ceph.conf? If so, would you mind copying
> them into this thread?
No, I just deleted an OSD, replaced the HDD with an SSD and created a new OSD (with
bluestore). Once the cluster was healthy again, I
repeated with the next OSD.
Is fstrim or discard enabled for these SSDs? If so, how did you enable it?
I've seen similar issues with poor controllers on SSDs. They tend to block I/O
when trim kicks off.
Thanks,
From: ceph-users on behalf of Paul Emmerich
Sent: Friday, February 22, 201
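A quick way to check whether a device advertises discard support, as asked above (device name hypothetical):
lsblk --discard /dev/sda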
Have you made any changes to your ceph.conf? If so, would you mind copying them
into this thread?
From: ceph-users on behalf of Vitaliy
Filippov
Sent: Wednesday, February 27, 2019 4:21 PM
To: Ceph Users
Subject: Re: [ceph-users] Blocked ops after change from fi
On Wed, Feb 27, 2019 at 11:35 AM Hector Martin
wrote:
> On 27/02/2019 19:22, David C wrote:
> > Hi All
> >
> > I'm seeing quite a few directories in my filesystem with rctime years in
> > the future. E.g
> >
> > ]# getfattr -d -m ceph.dir.* /path/to/dir
> > getfattr: Removing leading '/' from abs
This feature is in the Nautilus release.
The first release (14.1.0) of Nautilus is available from download.ceph.com as
of last Friday.
From: ceph-users on behalf of admin
Sent: Thursday, February 28, 2019 4:22 AM
To: Pritha Srivastava; Sage Weil; ceph-us...@ce
Hi, can you tell me which version includes STS lite? Thanks, myxingkong
On 2/28/19 9:41 AM, Glen Baars wrote:
> Hello Wido,
>
> I have looked at the libvirt code and there is a check to ensure that
> fast-diff is enabled on the image and only then does it try to get the real
> disk usage. The issue for me is that even with fast-diff enabled it takes
> 25min to g
Hello Wido,
I have looked at the libvirt code and there is a check to ensure that fast-diff
is enabled on the image and only then does it try to get the real disk usage.
The issue for me is that even with fast-diff enabled it takes 25min to get the
space usage for a 50TB image.
I had considere
Hi,
pg-upmap-items became more strict in v12.2.11 when validating upmaps.
E.g., it now won't let you put two PGs in the same rack if the crush
rule doesn't allow it.
Where are OSDs 23 and 123 in your cluster? What is the relevant crush rule?
-- dan
On Wed, Feb 27, 2019 at 9:17 PM Kári Bertilss
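For reference, upmap entries of the kind v12.2.11 now validates are created with commands of this form (pg id hypothetical, osd ids from the message above):
ceph osd pg-upmap-items 41.7 23 123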