Is the score improving?
ceph balancer eval
It should be decreasing over time as the variances drop toward zero.
You mentioned a crush optimize code at the beginning... how did that
leave your cluster? The mgr balancer assumes that the crush weight of
each OSD is equal to its size in TB.
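A quick way to sanity-check that assumption (just a sketch; <id> and the weight
are placeholders):
ceph osd df tree                                 # compare each OSD's CRUSH weight with its SIZE
ceph osd crush reweight osd.<id> <size-in-TiB>   # only if a weight clearly doesn't match the disk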
Do y
Hi all,
I am running a small Ceph cluster (1 MON and 3 OSDs), and it works fine.
However, I have a doubt about the two networks (public and cluster) that an OSD
uses.
There is a reference from Mellanox
(https://community.mellanox.com/docs/DOC-2721) on how to configure 'ceph.conf'.
However, after r
Hi,
On 01.03.2018 at 09:03, Dan van der Ster wrote:
> Is the score improving?
>
> ceph balancer eval
>
> It should be decreasing over time as the variances drop toward zero.
>
> You mentioned a crush optimize code at the beginning... how did that
> leave your cluster? The mgr balancer assum
On Thu, Mar 1, 2018 at 9:31 AM, Stefan Priebe - Profihost AG
wrote:
> Hi,
> On 01.03.2018 at 09:03, Dan van der Ster wrote:
>> Is the score improving?
>>
>> ceph balancer eval
>>
>> It should be decreasing over time as the variances drop toward zero.
>>
>> You mentioned a crush optimize code
Hi,
On 01.03.2018 at 09:42, Dan van der Ster wrote:
> On Thu, Mar 1, 2018 at 9:31 AM, Stefan Priebe - Profihost AG
> wrote:
>> Hi,
>> On 01.03.2018 at 09:03, Dan van der Ster wrote:
>>> Is the score improving?
>>>
>>> ceph balancer eval
>>>
>>> It should be decreasing over time as the varia
On Thu, Mar 1, 2018 at 9:52 AM, Stefan Priebe - Profihost AG
wrote:
> Hi,
>
> On 01.03.2018 at 09:42, Dan van der Ster wrote:
>> On Thu, Mar 1, 2018 at 9:31 AM, Stefan Priebe - Profihost AG
>> wrote:
>>> Hi,
>>> On 01.03.2018 at 09:03, Dan van der Ster wrote:
>>>> Is the score improving?
>
On 01.03.2018 at 09:58, Dan van der Ster wrote:
> On Thu, Mar 1, 2018 at 9:52 AM, Stefan Priebe - Profihost AG
> wrote:
>> Hi,
>>
>> On 01.03.2018 at 09:42, Dan van der Ster wrote:
>>> On Thu, Mar 1, 2018 at 9:31 AM, Stefan Priebe - Profihost AG
>>> wrote:
Hi,
On 01.03.2018 at 09:03
On Thu, Mar 1, 2018 at 10:24 AM, Stefan Priebe - Profihost AG
wrote:
>
> On 01.03.2018 at 09:58, Dan van der Ster wrote:
>> On Thu, Mar 1, 2018 at 9:52 AM, Stefan Priebe - Profihost AG
>> wrote:
>>> Hi,
>>>
>>> On 01.03.2018 at 09:42, Dan van der Ster wrote:
On Thu, Mar 1, 2018 at 9:31 AM,
On Thu, Mar 1, 2018 at 10:38 AM, Dan van der Ster wrote:
> On Thu, Mar 1, 2018 at 10:24 AM, Stefan Priebe - Profihost AG
> wrote:
>>
>> On 01.03.2018 at 09:58, Dan van der Ster wrote:
>>> On Thu, Mar 1, 2018 at 9:52 AM, Stefan Priebe - Profihost AG
>>> wrote:
Hi,
On 01.03.2018 at
On 02/28/2018 11:51 PM, Sage Weil wrote:
> On Wed, 28 Feb 2018, Dan Mick wrote:
>
>> Would anyone else appreciate a Google Calendar invitation for the
>> CDMs? Seems like a natural.
>
> Funny you should mention it! I was just talking to Leo this morning
> about creating a public Ceph Events cal
On Thu, Mar 1, 2018 at 10:40 AM, Dan van der Ster wrote:
> On Thu, Mar 1, 2018 at 10:38 AM, Dan van der Ster wrote:
>> On Thu, Mar 1, 2018 at 10:24 AM, Stefan Priebe - Profihost AG
>> wrote:
>>>
>>> On 01.03.2018 at 09:58, Dan van der Ster wrote:
On Thu, Mar 1, 2018 at 9:52 AM, Stefan Prie
Stefan,
How many OSDs and how much RAM are in each server?
bluestore_cache_size=6G will not mean that each OSD uses at most 6GB of RAM, right?
Our bluestore HDD OSDs with bluestore_cache_size at 1G use ~4GB of total
RAM; the cache is only one part of a bluestore OSD's memory usage.
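For reference, this is roughly what we have in ceph.conf (our values, not a
recommendation):
[osd]
bluestore_cache_size = 1073741824   # 1 GiB of cache; total RSS per OSD still ends up around 4GB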
Kind regards,
Caspar
I have recently updated to Luminous (12.2.4) and I have noticed that
using "ceph -w" only produces an initial output like the one below but
never gets updated afterwards. Is this a feature? I was used to the old
behaviour, which was constantly producing info.
Here is what I get as initial outpu
I was testing IO and I created a bench pool.
But when I try to delete it, I get:
Error EPERM: pool deletion is disabled; you must first set the
mon_allow_pool_delete config option to true before you can destroy a
pool
So I run:
ceph tell mon.\* injectargs '--mon-allow-pool-delete=true'
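and then, assuming the test pool is literally named 'bench', the delete itself:
ceph osd pool rm bench bench --yes-i-really-really-mean-it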
Ah!
So you think this is done by design?
However, that command is very, very useful.
Please add it to the documentation.
Next time it will save me 2-3 hours.
On 01/03/2018 06:12, Sébastien VIGNERON wrote:
Hi Max,
I had the same issue (under Ubuntu 16.04) but I have read the
ceph-de
On Thu, Mar 1, 2018 at 12:03 PM, Georgios Dimitrakakis
wrote:
> I have recently updated to Luminous (12.2.4) and I have noticed that using
> "ceph -w" only produces an initial output like the one below but never gets
> updated afterwards. Is this a feature because I was used to the old way that
>
Nice, thanks, I will try that soon.
Can you tell me how to change the log level to info for the balancer module?
On 01.03.2018 at 11:30, Dan van der Ster wrote:
> On Thu, Mar 1, 2018 at 10:40 AM, Dan van der Ster wrote:
>> On Thu, Mar 1, 2018 at 10:38 AM, Dan van der Ster
>> wrote:
>>> On Thu, Ma
Xen by Citrix used to be a very good hypervisor.
However, they used a very old kernel until 7.1.
The distribution doesn't allow you to add packages from yum, so you need
to hack it.
I have helped to develop the installer of the unofficial plugin:
https://github.com/rposudnevskiy/RBDSR
However
On Thu, Mar 1, 2018 at 1:08 PM, Stefan Priebe - Profihost AG
wrote:
> nice thanks will try that soon.
>
> Can you tell me how to change the log lever to info for the balancer module?
debug mgr = 4/5
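That line goes into ceph.conf on the mgr host. Alternatively, assuming your
active mgr id is 'a', you can set it at runtime via the admin socket:
ceph daemon mgr.a config set debug_mgr 4/5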
-- dan
Jaroslaw Owsiewski writes:
> What about this: https://tracker.ceph.com/issues/22015#change-105987 ?
It still has to wait for 12.2.5, unfortunately. 12.2.4 only contained some
critical build/ceph-disk fixes and whatever PRs had already passed QE
after 12.2.3.
>
> Regards
>
> --
> Jarek
>
> 2018-02-28 16:
Excellent! Good to know that the behavior is intentional!
Thanks a lot John for the feedback!
Best regards,
G.
On Thu, Mar 1, 2018 at 12:03 PM, Georgios Dimitrakakis
wrote:
I have recently updated to Luminous (12.2.4) and I have noticed that
using
"ceph -w" only produces an initial output
On 28/02/2018 18:16, David Turner wrote:
My thought is that in 4 years you could have migrated to a hypervisor
that will have better performance with Ceph than an added iSCSI layer.
I won't deploy VMs for ceph on anything that won't allow librbd to
work. Anything else is added complexity a
I totally understand and see your frustration here, but you have to keep
in mind that this is an open source project with a lot of volunteers.
If you have a really urgent need, you have the option to develop
such a feature on your own, or to pay someone to do the
work for you.
It'
>
> On Mar 1, 2018, at 8:04 PM, Max Cuttins wrote:
>
> I was testing IO and I created a bench pool.
>
> But if I tried to delete I get:
> Error EPERM: pool deletion is disabled; you must first set the
> mon_allow_pool_delete config option to true before you can destroy a pool
>
> So I run:
>
> ceph tell
On 1 March 2018 13:04, Max Cuttins wrote:
I was testing IO and I created a bench pool.
But if I tried to delete I get:
Error EPERM: pool deletion is disabled; you must first set the
mon_allow_pool_delete config option to true before you can destroy a
pool
So I run:
ceph tell
It's not necessary to restart a mon if you just want to delete a pool,
even if the "not observed" message appears. And I would not recommend
permanently enabling the "easy" way of deleting a pool. If you are
not able to delete the pool after "ceph tell mon ...", try this:
ceph daemon mon. c
Quoting Caspar Smit (caspars...@supernas.eu):
> Stefan,
>
> How many OSD's and how much RAM are in each server?
Currently 7 OSDs and 128 GB RAM. The max will be 10 OSDs in these servers,
with 12 cores (at least one core per OSD).
> bluestore_cache_size=6G will not mean each OSD is using max 6GB RAM right?
A
You mean documentation like `ceph-deploy --help` or `man ceph-deploy` or
the [1] online documentation? Spoiler: they all document and explain what
`--release` does. I do agree that the [2] documentation talking about
deploying a luminous cluster should mention it if jewel was left the
default insta
On another note, is there any work being done for persistent group
reservations support for Ceph/LIO compatibility? Or just a rough estimate :)
Would love to see Redhat/Ceph support this type of setup. I know Suse
supports it as of late.
Sam
On Mar 1, 2018 07:33, "Kai Wagner" wrote:
> I total
There has been some chatter on the ML questioning the need to separate out
the public and private subnets for Ceph. The trend seems to be toward
simplifying your configuration, which for some means not specifying multiple
subnets here. I haven't heard of anyone complaining about network problems
with putt
Hello,
I've tried changing a lot of things in the configuration and using ceph-fuse, but
nothing makes it work better... When I deploy the git repository it becomes
much slower until I remount the FS (just executing systemctl stop nginx &&
umount /mnt/ceph && mount -a && systemctl start nginx). It happe
It's very high on our priority list to get a solution merged in the
upstream kernel. There was a proposal to use DLM to distribute the PGR
state between target gateways (a la the SCST target) and it's quite
possible that would have the least amount of upstream resistance since
it would work for all
`ceph pg stat` might be cleaner to watch than `ceph status | grep
pgs`. I also like watching `ceph osd pool stats`, which breaks down all IO
by pool. You also have the option of the dashboard mgr service which has a
lot of useful information including the pool IO breakdown.
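For example (just a sketch; the dashboard module ships with Luminous but has to
be enabled first):
ceph mgr module enable dashboard    # then point a browser at the active mgr
watch -n 2 ceph osd pool stats      # quick per-pool IO view in a shell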
On Thu, Mar 1, 201
With default memory settings, the general rule is 1GB of RAM per 1TB of OSD. If
you have a 4TB OSD, you should plan to have at least 4GB of RAM. This was the
recommendation for filestore OSDs, though it was a bit more memory than the
OSDs really needed. From what I've seen, this rule is a little more appropriate with
bluestor
Hi Max,
> On Feb 28, 2018, at 10:06 AM, Max Cuttins wrote:
>
> This is true, but having something that just works, in order to have minimum
> compatibility and start to decommission old disks, is something you should think
> about.
> You'll have ages to improve and get better performance. But
Using CephFS for something like this is about the last thing I would do.
Does it need to be on a networked posix filesystem that can be mounted on
multiple machines at the same time? If so, then you're kinda stuck and we
can start looking at your MDS hardware and see if there are any MDS
settings
Hi,
Still seeing this on Luminous 12.2.2:
When I do a ceph pg deep-scrub on the pg or a ceph osd deep-scrub on the
primary osd, I get the message
"instructing pg 5.238 on osd.356 to deep-scrub",
but nothing happens on that OSD. I waited a day, but the timestamp I see
in ceph pg dump hasn't changed
On Thu, 1 Mar 2018 09:11:21 -0500, Jason Dillaman wrote:
> It's very high on our priority list to get a solution merged in the
> upstream kernel. There was a proposal to use DLM to distribute the PGR
> state between target gateways (a la the SCST target) and it's quite
> possible that would have t
On 02/28/2018 10:06 AM, Max Cuttins wrote:
On 28/02/2018 15:19, Jason Dillaman wrote:
On Wed, Feb 28, 2018 at 7:53 AM, Massimiliano Cuttini
wrote:
I was building ceph in order to use with iSCSI.
But I just see from the docs that need:
CentOS 7.5
(which is not available yet, it's still
They added `ceph pg force-backfill` but there is nothing to force
scrubbing yet aside from the previously mentioned tricks. You should be
able to adjust osd_max_scrubs until the PGs you want to scrub get
going.
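Something along these lines (the value is only an example; set it back to the
default afterwards):
ceph tell osd.\* injectargs '--osd-max-scrubs=2'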
On Thu, Mar 1, 2018 at 9:30 AM Kenneth Waegeman
wrote:
> Hi,
>
> Still seeing
Hi Jason,
That's awesome. Keep up the good work guys, we all love the work you are
doing with that software!!
Sam
On Mar 1, 2018 09:11, "Jason Dillaman" wrote:
> It's very high on our priority list to get a solution merged in the
> upstream kernel. There was a proposal to use DLM to distribut
I wonder when EMC/NetApp are going to start giving away production-ready
bits that fit into your architecture.
At least support for this feature is coming in the near term.
I say keep on keepin' on. Kudos to the Ceph team (and maybe more teams) for
taking care of the hard stuff for us.
On T
Hello,
Our problem is that the webpage is on an autoscaling group, so a newly created
machine is not always updated and always needs to have the latest data.
I've tried several ways to do it:
- Local Storage synced: Sometimes the sync fails and data is not updated
- NFS: If NFS server goes down,
Hello,
I would like to point out that we are running ceph + redundant iSCSI gateways,
connecting the LUNs to an esxi+vcsa-6.5 cluster with Red Hat support.
We did encounter a few bumps on the road to production, but those got
fixed by Red Hat engineering and are included in the rhel7.5 and 4.17
kernel
Hello,
With Bluestore, I have a couple of questions regarding the case of
separate partitions for block.wal and block.db.
Let's take the case of an OSD node that contains several OSDs (HDDs) and
also contains one SSD drive for storing WAL partitions and another
one for storing DB partitio
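(For context, the kind of layout I mean would be created roughly like this with
ceph-volume; the device names are placeholders:)
ceph-volume lvm create --bluestore --data /dev/sdb \
    --block.db /dev/nvme0n1p1 --block.wal /dev/nvme1n1p1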
This removes ceph, or any other networked storage, completely, but git has
triggers. If your website is stored in git and you just need to make sure
that nginx always has access to the latest data, just configure git
triggers to auto-update the repository when there is a commit to the
repository fr
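A minimal sketch of such a hook (paths and branch are hypothetical):
#!/bin/sh
# hooks/post-receive in the bare repo: check out the pushed content
# into the nginx document root
GIT_WORK_TREE=/var/www/html git checkout -f master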
This aspect of osds has not changed from filestore with SSD journals to
bluestore with DB and WAL on SSDs. If the SSD fails, all osds using it
aren't lost and need to be removed from the cluster and recreated with a
new drive.
You can never guarantee data integrity on bluestore or filestore if a
s/aren't/are/ :)
Kind regards,
Caspar Smit
Systemengineer
SuperNAS
Dorsvlegelstraat 13
1445 PA Purmerend
t: (+31) 299 410 414
e: caspars...@supernas.eu
w: www.supernas.eu
2018-03-01 16:31 GMT+01:00 David Turner :
> This aspect of osds has not changed from filestore with SSD journa
Hello,
Some data is not in the git repository and also needs to be updated on all
servers at the same time (uploads...), which is why I'm searching for a
centralized solution.
I think I've found a "patch" to do it... All our servers are connected to a
manager, so I've created a task in that manager to stop
Almost...
On 01/03/2018 16:17, Heðin Ejdesgaard Møller wrote:
Hello,
I would like to point out that we are running ceph + redundant iSCSI gateways,
connecting the LUNs to an esxi+vcsa-6.5 cluster with Red Hat support.
We did encounter a few bumps on the road to production, but those got
fixed
I get:
#ceph daemon mon.0 config set mon_allow_pool_delete true
admin_socket: exception getting command descriptions: [Errno 13]
Permission denied
On 01/03/2018 14:00, Eugen Block wrote:
It's not necessary to restart a mon if you just want to delete a pool,
even if the "not observed" me
Probably priorities have changed since RedHat acquired Ceph/InkTank (
https://www.redhat.com/en/about/press-releases/red-hat-acquire-inktank-provider-ceph
) ?
Why support a competing hypervisor? Long term, switching to KVM seems to be the
solution.
- Rado
From: ceph-users On Behalf Of Max Cu
Indeed it makes sense, thanks!
And so, just for my own thinking, for the implementation of a new
Bluestore project, we really have to ask ourselves the question of
whether separating WAL/DBs significantly increases performance. If the
WAL/DB are on the same device as the bluestore data device
On Thu, Mar 01, 2018 at 04:57:59PM +0100, Hervé Ballans wrote:
:Can we find recent benchmarks on this performance issue related to the
:location of WAL/DBs ?
I don't have benchmarks but I have some anecdotes.
We previously had 4T NL-SAS (7.2k) filestore data drives with journals
on SSD (5:1 ssd:s
Hi,
connect to the ceph-node1 machine and run: ceph daemon mon.ceph-node1 config
set mon_allow_pool_delete true
You are just using the wrong parameter as an ID.
JC
> On Mar 1, 2018, at 07:41, Max Cuttins wrote:
>
> I get:
>
> #ceph daemon mon.0 config set mon_allow_pool_delete true
> admin_
When dealing with the admin socket you need to be an admin. `sudo` or
`sudo -u ceph` ought to get you around that.
I was able to delete a pool just by using the injectargs that you showed
above.
ceph tell mon.\* injectargs '--mon-allow-pool-delete=true'
ceph osd pool rm pool_name pool_name --yes
And now it worked.
Maybe there was a typo in my first command.
Sorry
On 01/03/2018 17:28, David Turner wrote:
When dealing with the admin socket you need to be an admin. `sudo` or
`sudo -u ceph` ought to get you around that.
I was able to delete a pool just by using the inj
I think this is a good question for everybody: how hard should it be to delete
a pool?
We ask you to type the pool name twice.
We ask you to add "--yes-i-really-really-mean-it".
We ask you to give the mons the ability to delete the pool (and remove this
ability ASAP afterwards).
... and then somebody of course asks us to restor
Is there a switch to turn on the display of specific OSD issues? Or
does the below indicate a generic problem, e.g. the network, and not any
specific OSD?
2018-02-28 18:09:36.438300 7f6dead56700 0
mon.roc-vm-sc3c234@0(leader).data_health(46) update_stats avail 56%
total 15997 MB, used 6154 MB, avail 9
`ceph health detail` should show you more information about the slow
requests. If the output is too much stuff, you can grep out for blocked or
something. It should tell you which OSDs are involved, how long they've
been slow, etc. The default is for them to show '> 32 sec' but that may
very wel
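e.g. something along the lines of:
ceph health detail | grep -Ei 'blocked|slow'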
Even with bluestore we saw memory usage plateau at 3-4GB with 8TB drives
filled to around 90%. One thing that does increase memory usage is the
number of clients simultaneously sending write requests to a particular
primary OSD if the write sizes are large.
Subhachandra
On Thu, Mar 1, 2018 at 6:1
On Thu, Mar 1, 2018 at 5:37 PM, Subhachandra Chandra
wrote:
> Even with bluestore we saw memory usage plateau at 3-4GB with 8TB drives
> filled to around 90%. One thing that does increase memory usage is the
> number of clients simultaneously sending write requests to a particular
> primary OSD if
On Thu, Mar 1, 2018 at 2:47 PM, David Turner wrote:
> `ceph health detail` should show you more information about the slow
> requests. If the output is too much stuff, you can grep out for blocked or
> something. It should tell you which OSDs are involved, how long they've
> been slow, etc. The
Blocked requests and slow requests are synonyms in ceph. They are 2 names
for the exact same thing.
On Thu, Mar 1, 2018, 10:21 PM Alex Gorbachev wrote:
> On Thu, Mar 1, 2018 at 2:47 PM, David Turner
> wrote:
> > `ceph health detail` should show you more information about the slow
> > requests.
Hi David,
Thank you for your reply. As I understand it, your experience with multiple subnets
suggests sticking to a single device. However, I have a powerful RDMA NIC
(100Gbps) with two ports and I have seen recommendations from Mellanox to
separate the
two networks. Also, I am planning on having q
On Thu, Mar 1, 2018 at 10:57 PM, David Turner wrote:
> Blocked requests and slow requests are synonyms in ceph. They are 2 names
> for the exact same thing.
>
>
> On Thu, Mar 1, 2018, 10:21 PM Alex Gorbachev wrote:
>>
>> On Thu, Mar 1, 2018 at 2:47 PM, David Turner
>> wrote:
>> > `ceph health de
The only communication on the private network for ceph is between the OSDs
for replication, erasure coding, backfilling, and recovery. Everything else
is on the public network, including communication with clients, mons, MDS,
rgw, and literally everything else.
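For reference, the split is just two options in ceph.conf (the subnets below are
made-up examples):
[global]
public network = 192.168.0.0/24    # clients, mons, MDS, rgw, everything else
cluster network = 192.168.1.0/24   # OSD<->OSD replication/backfill/recovery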
I haven't used RDMA, but from the qu
Dear all,
I wonder how we could support VM systems with ceph storage (block
devices)? My colleagues are waiting for my answer for VMware (vSphere 5), and
I myself use oVirt (RHEV); the default protocol is iSCSI.
I know that OpenStack/Cinder works well with ceph, and Proxmox (just heard)
too. But cu
Thank you for the reply and explanation. I will take a look at your reference
related to ML and
Ceph.
From: David Turner
Sent: Friday, March 2, 2018 2:12:18 PM
To: Justinas LINGYS
Cc: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] Ceph and multiple