Hi list,
Given a single node Ceph cluster (lab), I started out with the following
CRUSH rule:
> # rules
> rule replicated_ruleset {
> ruleset 0
> type replicated
> min_size 1
> max_size 10
> step take default
> step choose firstn 0 type osd
> step emit
> }
Meanwhile, t
Hi Micha,
Thank you very much for your prompt response. In an earlier process, I
already ran:
> $ ceph tell osd.* injectargs '--osd-max-backfills 1'
> $ ceph tell osd.* injectargs '--osd-recovery-op-priority 1'
> $ ceph tell osd.* injectargs '--osd-client-op-priority 63'
> $ ceph tell osd.* inject
Thank you very much, I'll start testing the logic prior to implementation.
On 06-07-16 19:20, Bob R wrote:
> See http://dachary.org/?p=3189 for some simple instructions on testing
> your crush rule logic.
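
For reference, a minimal sketch of how a rule can be tested offline with crushtool before it is applied. This is not taken from the linked post. The file names are hypothetical, rule 0 corresponds to "ruleset 0" in the rule quoted earlier, and the replica count is an assumption. The script only prints the commands unless DRY_RUN=0 is set.

```shell
# Print commands instead of running them unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Export, decompile, recompile and test a CRUSH map.
# crushmap.bin / crushmap.txt / crushmap.new are hypothetical file names.
test_crush_rule() {
    rule=$1
    replicas=$2
    run ceph osd getcrushmap -o crushmap.bin
    run crushtool -d crushmap.bin -o crushmap.txt
    # Edit crushmap.txt by hand at this point, then recompile it.
    run crushtool -c crushmap.txt -o crushmap.new
    run crushtool -i crushmap.new --test --rule "$rule" --num-rep "$replicas" --show-mappings
}

test_crush_rule 0 3
```

Only when the printed mappings look as expected would the new map be injected with "ceph osd setcrushmap -i crushmap.new".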
ceph-users mailing list
Hi Gaurav,
Unfortunately I'm not completely sure about your setup, but I guess it
makes sense to configure Cinder and Glance to use RBD as a backend. It
seems to me you're trying to store VM images directly on an OSD filesystem.
Please refer to http://docs.ceph.com/docs/master/rbd/rbd-openstack
Hi Gaurav,
The following snippets should suffice (for Cinder, at least):
> enabled_backends=rbd
> [rbd]
> volume_driver = cinder.volume.drivers.rbd.RBDDriver
> rbd_pool = cinder-volumes
> rbd_ceph_conf = /etc/ceph/ceph.conf
> rbd_flatten_volume_from_snapshot = false
> rbd_max_clone_d
Thank you everyone, I just tested and verified the ruleset and applied
it to some pools. Worked like a charm!
On 06-07-16 19:20, Bob R wrote:
> See http://dachary.org/?p=3189 for some simple instructions on testing
> your crush rule logic.
Hi Gaurav,
Have you distributed your Ceph authentication keys to your compute
nodes? And, do they have the correct permissions in terms of Ceph?
I'd recommend generating a UUID and using it for all your compute nodes.
This way, you can keep your configuration in libvirt constant.
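
A minimal sketch of the idea, assuming the libvirt secret mechanism described in the Ceph/OpenStack documentation: generate the UUID once and define a secret with that same UUID on every compute node. The secret label "client.cinder secret" and the key file name client.cinder.key are assumptions. The virsh commands are only printed, not run.

```shell
# Generate a UUID with uuidgen, falling back to the kernel's generator.
new_uuid() {
    uuidgen 2>/dev/null || cat /proc/sys/kernel/random/uuid
}

# Write a libvirt secret definition for the given UUID to stdout.
# The <name> value is only a label and an assumption here.
secret_xml() {
    cat <<EOF
<secret ephemeral='no' private='no'>
  <uuid>$1</uuid>
  <usage type='ceph'>
    <name>client.cinder secret</name>
  </usage>
</secret>
EOF
}

# Generate once, then reuse the same value on all compute nodes.
SECRET_UUID=${SECRET_UUID:-$(new_uuid)}
secret_xml "$SECRET_UUID"

# To be run by hand on each compute node (printed only):
echo "virsh secret-define --file secret.xml"
echo "virsh secret-set-value --secret $SECRET_UUID --base64 \$(cat client.cinder.key)"
```

The same UUID would then go into the rbd_secret_uuid setting, so the configuration is identical on every node.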
On 08-07-16 16:15, Gaurav Goyal wrote:
> For the section below, should I generate a separate UUID for both compute hosts?
I think there's still something misconfigured:
> Invalid: 400 Bad Request: Unknown scheme 'file' found in URI (HTTP 400)
It seems the RBD backend is not used as expected.
Have you configured both Cinder _and_ Glance to use Ceph?
On 08-07-16 17:33, Gaurav Goyal wrote:
> I re
Glad to hear it works now! Good luck with your setup.
On 11-07-16 17:29, Gaurav Goyal wrote:
> Hello it worked for me after removing the following parameter from
> /etc/nova/nova.conf file
Sorry, should have posted this to the list.
Forwarded Message
Subject: Re: [ceph-users] (no subject)
Date: Tue, 12 Jul 2016 08:30:49 +0200
From: Kees Meijs
To: Gaurav Goyal
Hi Gaurav,
It might seem a little far fetched, but I'd use the qemu-img(1) to
Hi Fran,
Fortunately, qemu-img(1) is able to directly utilise RBD (supporting
sparse block devices)!
Please refer to http://docs.ceph.com/docs/hammer/rbd/qemu-rbd/ for examples.
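
As a rough sketch of what this looks like in practice, assuming a qcow2 source file: the file, pool and image names are placeholders. The commands are only printed unless DRY_RUN=0 is set.

```shell
# Print commands instead of running them unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Convert a local qcow2 file straight into an RBD image (raw format),
# then show the resulting image. All three arguments are placeholders.
import_to_rbd() {
    src=$1
    pool=$2
    image=$3
    run qemu-img convert -f qcow2 -O raw "$src" "rbd:$pool/$image"
    run qemu-img info "rbd:$pool/$image"
}

import_to_rbd image.qcow2 mypool myimage
```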
On 13-07-16 09:18, Fran Barrera wrote:
> Can you explain how you do this procedure? I have the same prob
If qemu-img is able to handle RBD in a clever way (and I assume it
does) it is able to sparsely write the image to the Ceph pool.
But, it is an assumption! Maybe someone else could shed some light on this?
Or even better: read the source, the RBD handler specifically.
And last but not l
This is an OSD box running Hammer on Ubuntu 14.04 LTS with additional
systems administration tools:
> $ df -h | grep -v /var/lib/ceph/osd
> Filesystem Size Used Avail Use% Mounted on
> udev 5,9G 4,0K 5,9G 1% /dev
> tmpfs 1,2G 892K 1,2G 1% /run
> /dev/dm-1
Hi Cephers,
There's some physical maintenance I need to perform on an OSD node.
Very likely the maintenance is going to take a while since it involves
replacing components, so I would like to be well prepared.
Unfortunately it is not an option to add another OSD node or rebalance at
this time, so I
So to sum up, I'd best:
* set the noout flag
* stop the OSDs one by one
* shut down the physical node
* yank the OSD drives to prevent ceph-disk(8) from automatically
activating at boot time
* do my maintenance
* start the physical node
* reseat and activate the OSD drives
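
The Ceph side of the steps above can be sketched roughly as follows. The OSD IDs 0, 1 and 2 are hypothetical, "stop ceph-osd id=N" assumes Upstart (as on Ubuntu 14.04), and unsetting noout afterwards is my addition as the counterpart of the first step. Nothing is executed unless DRY_RUN=0 is set.

```shell
# Print commands instead of running them unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Set noout, then stop the given OSDs one by one.
# "stop ceph-osd id=N" is the Upstart form; adapt it to your init system.
before_maintenance() {
    run ceph osd set noout
    for id in "$@"; do
        run stop ceph-osd id="$id"
    done
}

# Once the node and its OSDs are back: clear the flag and check the cluster.
after_maintenance() {
    run ceph osd unset noout
    run ceph -s
}

before_maintenance 0 1 2
```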
Thanks guys, this worked like a charm. Activating the OSDs wasn't
necessary: it seemed udev(7) helped me with that.
On 13-07-16 14:47, Kees Meijs wrote:
> So to sum up, I'd best:
> * set the noout flag
> * stop the OSDs one by one
> * sh
Hi list,
It's probably something to discuss over coffee in Ede tomorrow but I'll
ask anyway: what HBA is best suited for Ceph nowadays?
In an earlier thread I read some comments about some "dumb" HBAs running
in IT mode but still being able to use cache on the HBA. Does it make
sense? Or, is th
Hi Jake,
On 19-09-17 15:14, Jake Young wrote:
> Ideally you actually want fewer disks per server and more servers.
> This has been covered extensively in this mailing list. Rule of thumb
> is that each server should have 10% or less of the capacity of your
> cluster.
That's very true, but let's f
Hi Cephers,
Using Ceph 0.94.9-1trusty we noticed severe I/O stalling during deep
scrubbing (vanilla parameters used with regard to scrubbing). I'm aware
this has been discussed before, but I'd like to share the parameters
we're going to evaluate:
* osd_scrub_begin_hour 1
* osd_scrub_end_hour 7
On 28-10-16 12:06, w...@42on.com wrote:
> I don't like this personally. Your cluster should be capable of doing
> a deep scrub at any moment. If not it will also not be able to handle
> a node failure during peak times.
Valid point and I totally agree. Unfortunately, the current load doesn't
Interesting... We're now running using deadline. In other posts I read
about noop for SSDs instead of CFQ.
Since we're using spinners with SSD journals, does it make sense to mix
the schedulers? E.g. CFQ for spinners _and_ noop for SSDs?
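
To make the question concrete, a sketch of how a per-device choice could be derived from the kernel's "rotational" flag. Whether mixing is actually a good idea is exactly what I'm asking; the cfq/noop mapping below is only the example from my question. The scheduler names assume a kernel of this era, and the script prints what it would write without changing anything.

```shell
# SYSFS can be pointed at a test directory; the default is the real /sys/block.
SYSFS=${SYSFS:-/sys/block}

# Pick a scheduler name from the rotational flag (1 = spinner, 0 = SSD).
pick_scheduler() {
    dev=$1
    if [ "$(cat "$SYSFS/$dev/queue/rotational")" = "1" ]; then
        echo cfq
    else
        echo noop
    fi
}

# Print the write that would set the scheduler for each sd* device.
for path in "$SYSFS"/sd*; do
    [ -e "$path/queue/rotational" ] || continue
    dev=$(basename "$path")
    echo "echo $(pick_scheduler "$dev") > $SYSFS/$dev/queue/scheduler"
done
```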
On 28-10-16 14:43, Wido den Hollander wrote:
> Make
there) a little
On 28-10-16 15:37, Kees Meijs wrote:
> Interesting... We're now running using deadline. In other posts I read
> about noop for SSDs instead of CFQ.
> Since we're using spinners with SSD journals; does it make sense to
> mix
Hi list,
Our current Ceph production cluster seems to suffer from performance
issues, so we decided to add a fully flash based cache tier (now running
with spinners and journals on separate SSDs).
We ordered SSDs (Intel), disk trays and read
> 13: (()+0xfbaa1) [0x5650637c1aa1]
> NOTE: a copy of the executable, or `objdump -rdS <executable>` is
> needed to interpret this.
> terminate called after throwing an instance of 'ceph::FailedAssertion'
Hope it helps.
On 24-11-16 13:06, Kees Meijs wrote:
Hi Burkhard,
A testing pool makes absolute sense, thank you.
About the complete setup, the documentation states:
> The cache tiering agent can flush or evict objects based upon the
> total number of bytes *or* the total number of objects. To specify a
> maximum number of bytes, execute the follo
Hi Nick,
All Ceph pools have very restrictive permissions for each OpenStack
service, indeed. Besides creating the cache pool and enabling it, no
additional parameters or configuration was done.
Do I understand correctly that access parameters (e.g. cephx keys) are needed
for a cache tier? If yes, it
Hi Nick,
Oh... In retrospect it makes sense in a way, but it does not as well. ;-)
To clarify: it makes sense since the cache is "just a pool" but it does
not since "it is an overlay and just a cache in between".
Anyway, something that should be well documented and warned about, if you
ask me.
allow rwx pool=cinder-vms, allow rx
> pool=glance-images"
I presume I should add *allow rwx pool=cache* in our case?
Thanks again,
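
A sketch of what I have in mind, under stated assumptions: the client name, the "mon 'allow r'" part and the existing OSD capability string are placeholders, and only the pool name "cache" is ours. As far as I know "ceph auth caps" replaces the complete set of capabilities, so the existing ones have to be repeated. The dry-run output drops the shell quoting and is for illustration only.

```shell
# Print commands instead of running them unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Append "allow rwx pool=<pool>" to an existing OSD capability string.
# $1 = client name, $2 = current OSD caps (copy them from "ceph auth get"),
# $3 = pool to add. The mon caps used here are an assumption.
grant_pool() {
    client=$1
    osd_caps=$2
    pool=$3
    run ceph auth get "$client"
    run ceph auth caps "$client" mon 'allow r' osd "$osd_caps, allow rwx pool=$pool"
}

grant_pool client.cinder "allow rwx pool=cinder-vms, allow rx pool=glance-images" cache
```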
On 24-11-16 15:55, Kees Meijs wrote:
> Oh... In retrospect it makes sense in a way, but it does not as well. ;-)
> To clarify: it makes sen
Hi list,
We're using CoW clones (using OpenStack via Glance and Cinder) to store
virtual machine images.
For example:
> # rbd info cinder-volumes/volume-a09bd74b-f100-4043-a422-5e6be20d26b2
> rbd image 'volume-a09bd74b-f100-4043-a422-5e6be20d26b2':
> size 25600 MB in 3200 objects
> order
Hi Wido,
Valid point. At this moment, we're using a cache pool with size = 2 and
would like to "upgrade" to size = 3.
Again, you're absolutely right... ;-)
Anyway, anything to consider or could we just:
1. Run "ceph osd pool set cache size 3".
2. Wait for rebalancing to complete.
3. Run "c
Hi Wido,
Since it's a Friday night, I decided to just go for it. ;-)
It took a while to rebalance the cache tier but all went well. Thanks
again for your valuable advice!
Best regards, enjoy your weekend,
On 07-12-16 14:58, Wido den Hollander wrote:
>> Anyway, any things to consider or cou
Hi guys,
In the past few months, I've read some posts about upgrading from
Hammer. Maybe I've missed something, but I didn't really read anything
on QEMU/KVM behaviour in this context.
At the moment, we're using:
> $ qemu-system-x86_64 --version
> QEMU emulator version 2.3.0 (Debian 1:2.3+dfsg-
Hi Wido,
At the moment, we're running Ubuntu 14.04 LTS using the Ubuntu Cloud
Archive. To be precise again, it's QEMU/KVM 2.3+dfsg-5ubuntu9.4~cloud2
linked to Ceph 0.94.8-0ubuntu0.15.10.1~cloud0.
So yes, it's all about running a newer QEMU/KVM on a not so new version
of Ubuntu.
Question is, are
Hi Wido,
Thanks again! Good to hear, it saves us a lot of upgrade trouble in advance.
If I'm not mistaken, we haven't done anything with CRUSH tunables. Any
pointers on how to make sure we really didn't?
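
For what it's worth, the one command I know of for inspecting the tunables currently in effect is sketched below. Comparing its output against the defaults of the release is still left to the reader. It is only printed unless DRY_RUN=0 is set.

```shell
# Print commands instead of running them unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Show the CRUSH tunables currently in effect on the cluster.
show_tunables() {
    run ceph osd crush show-tunables
}

show_tunables
```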
On 20-12-16 10:14, Wido den Hollander wrote:
> No, you don't. A Hammer/Jewel
Hi Ashley,
We experience (using Hammer) a similar issue. Not that I have a perfect
solution to share, but I felt like mentioning a "me too". ;-)
On a side note: we configured the correct weight per drive as well.
On 29-12-16 11:54, Ashley Merrick wrote:
> Hello,
> I currently
Thanks, I'll try a manual reweight at first.
Have a happy new year's eve (yes, I know it's a day early)!
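
A small sketch of the manual approach, assuming a hypothetical OSD id 12 and an override weight of 0.9. The override weight is a value between 0 and 1, so the helper checks that before doing anything. Commands are only printed unless DRY_RUN=0 is set.

```shell
# Print commands instead of running them unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Succeeds only for a plain decimal number between 0 and 1 inclusive.
valid_weight() {
    awk -v w="$1" 'BEGIN { exit !(w ~ /^[0-9]*\.?[0-9]+$/ && w >= 0 && w <= 1) }'
}

# Show per-OSD utilisation, then set the override weight of one OSD.
reweight_osd() {
    id=$1
    weight=$2
    if ! valid_weight "$weight"; then
        echo "weight must be between 0 and 1: $weight" >&2
        return 1
    fi
    run ceph osd df
    run ceph osd reweight "$id" "$weight"
}

reweight_osd 12 0.9
```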
On 30-12-16 11:17, Wido den Hollander wrote:
> For this reason you can do a OSD reweight by running the 'ceph osd
> reweight-by-utilization' command or do it manually with 'cep
Hi Cephers,
For the last months (well... years actually) we were quite happy using
Hammer. So far, there was no immediate reason to upgrade.
However, having seen Luminous providing support for BlueStore, it seemed
like a good idea to perform some upgrade steps.
Doing baby steps, I wanted
Hi David,
Thank you for pointing out the option.
On http://docs.ceph.com/docs/infernalis/release-notes/ one can read:
Ceph daemons now run as user and group ceph by default. The ceph
user has a static UID assigned by Fedora and Debian (also used by
derivative distributions like RHE
to fixing the
ownerships while maybe still running), maybe two but not all of them.
Although it's very likely it wouldn't make a difference, I'll try a ceph
pg repair for each PG.
To be continued again!
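
The "repair for each PG" part can be sketched as below. The line format matched by the awk filter ("pg 3.7d is active+clean+inconsistent, acting [...]") is an assumption about the output of "ceph health detail" on our versions. The repair commands are only printed unless DRY_RUN=0 is set.

```shell
# Print commands instead of running them unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Read "ceph health detail"-style text on stdin and print the ids of
# PGs whose state contains "inconsistent". The line format is assumed.
inconsistent_pgs() {
    awk '$1 == "pg" && $4 ~ /inconsistent/ { print $2 }'
}

# Read PG ids on stdin and issue one repair per PG.
repair_all() {
    while read -r pgid; do
        run ceph pg repair "$pgid"
    done
}

# Against a live cluster this would be:
#   ceph health detail | inconsistent_pgs | repair_all
```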
On 18-08-18 10:52, Kees Meijs wrote:
To be continued... O
700 -1 log_channel(cluster) log
[ERR] : 3.7d soid -5/007d/temp_3.7d_0_16204674_4402/head: failed
to pick suitable auth object
2018-08-18 18:29:23.159691 7efc52942700 -1 log_channel(cluster) log
[ERR] : 3.7d repair 1 errors, 0 fixed
I'll investigate further.
On 18-08-18
t be the case for one or maybe
two OSDs but definitely not all.
On 19-08-18 08:55, Kees Meijs wrote:
> I'll investigate further.
Ehrm, that should of course be rebuilding. (I.e. removing the OSD,
reformat, re-add.)
On 20-08-18 11:51, Kees Meijs wrote:
> Since there's temp in the name and we're running a 3-replica cluster,
> I'm thinking of just reboili
Hi David,
Thanks for your advice. My end goal is BlueStore so to upgrade to Jewel
and then Luminous would be ideal.
Currently all monitors are (successfully) running Infernalis, one OSD
node is running Infernalis and all other OSD nodes have Hammer.
I'll try freeing up one Infernalis OSD at first
Good afternoon Cephers,
While I'm fixing our upgrade-semi-broken cluster (see thread Upgrade to
Infernalis: failed to pick suitable auth object) I'm wondering about
ensuring client compatibility.
My end goal is BlueStore (i.e. running Luminous) and unfortunately I'm
obliged to offer Hammer client
Bad news: I've got a PG stuck in down+peering now.
Please advise.
On 20-08-18 12:12, Kees Meijs wrote:
> Thanks for your advice. My end goal is BlueStore so to upgrade to Jewel
> and then Luminous would be ideal.
> Currently all monitors are (successfully) running Infernali
27; thread 7f8962b2f700
> time 2018-08-20 13:06:33.709922
> osd/ReplicatedPG.cc: 10115: FAILED assert(r >= 0)
Restarting the OSDs seems to work.
On 20-08-18 13:14, Kees Meijs wrote:
> Bad news: I've got a PG stuck in down+peering now.
contain one inconsistent PG.
* Backfilling started.
* After hours and hours of backfilling, OSDs started to crash.
Other than restarting the "out" and stopped OSD for the time being
(haven't tried that yet) I'm quite lost.
Hopefully someone has some pointers for me.
y to catch some sleep.
Thanks, thanks!
Best regards,
On 20-08-18 21:46, Kees Meijs wrote:
Other than restarting the "out" and stopped OSD for the time being
(haven't tried that yet) I'm quite lost.
Hi Lincoln,
We're looking at (now existing) RBD support using KVM/QEMU, so this is
an upgrade path.
On 20-08-18 16:37, Lincoln Bryant wrote:
What interfaces do your Hammer clients need? If you're looking at
CephFS, we have had reasonable success moving our older clients (EL6)
Hello David,
Thank you and I'm terribly sorry; I was unaware I was starting new threads.
Off the top of my head I'd say "yes, it'll fit", but obviously I'll make
sure first.
On 21-08-18 16:34, David Turner wrote:
Ceph does not support downgrading OSDs. When you removed the single
Hi list,
A little update: meanwhile we added a new node consisting of Hammer OSDs
to ensure sufficient cluster capacity.
The upgraded node with Infernalis OSDs is completely removed from the
CRUSH map and the OSDs removed (obviously we didn't wipe the disks yet).
At the moment we're still runnin
Hi Maxime,
Given your remark below, what kind of SATA SSD do you recommend for OSD
On 15-01-17 21:33, Maxime Guyot wrote:
> I don’t have firsthand experience with the S3520, as Christian pointed out
> their endurance doesn’t make them suitable for OSDs in most case
Hi Cephers,
Long story short: I'd like to shrink our cache pool a little.
Is it safe to just alter cache target_max_bytes and wait for objects to
get evicted? Anything to take into account?
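
One cautious way to do this, sketched under assumptions: lower the value in several steps rather than in one go, pausing between steps. The byte values (400 GiB down to 300 GiB in 50 GiB steps) and the pause length are made up; "cache" is our pool name. Commands are only printed unless DRY_RUN=0 is set.

```shell
# Print commands instead of running them unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Lower target_max_bytes on pool $1 from $2 to $3 in steps of $4 bytes.
shrink_cache() {
    pool=$1
    current=$2
    target=$3
    step=$4
    [ "$step" -gt 0 ] || return 1
    while [ "$current" -gt "$target" ]; do
        current=$((current - step))
        if [ "$current" -lt "$target" ]; then
            current=$target
        fi
        run ceph osd pool set "$pool" target_max_bytes "$current"
        # In a real run, give the tiering agent time to evict before the next step.
        [ "${DRY_RUN:-1}" = "1" ] || sleep "${PAUSE:-600}"
    done
}

# Hypothetical values: 400 GiB -> 300 GiB in 50 GiB steps.
shrink_cache cache 429496729600 322122547200 53687091200
```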
Hi Cephers,
Although I might be stating an obvious fact: altering the parameter
works as advertised.
The only issue I encountered was that lowering the parameter too much at
once results in some slow requests because the cache pool is "full".
So in short: it works when lowering the parameter bit by b
ically and
we arrived at HEALTH_OK again!
Case closed: up to Jewel.
For everyone involved: a big, big and even bigger thank you for all
pointers and support!
On 10-09-18 16:43, Kees Meijs wrote:
A little update: meanwhile we added a new node consisting of Hammer OSDs
to ensure
Running ceph osd set require_jewel_osds seemed harmless in terms of
client compatibility so that's done already.
However, what about sortbitwise and tunables?
On 21-08-18 03:47, Kees Meijs wrote:
We're looking at (now existing) RBD support using KVM/QEM
There's a lot of data shuffling going on now, so fingers crossed.
On 12-11-18 09:14, Kees Meijs wrote:
> However, what about sortbitwise and tunables?
Hi Alex,
What kind of clients do you use? Is it KVM (QEMU) using NBD driver,
kernel, or...?
On 17-11-18 20:17, Alex Litvak wrote:
> Hello everyone,
> I am trying to troubleshoot cluster exhibiting huge spikes of latency.
> I cannot quite catch it because it happens during the lig
Hi Cephers,
Documentation on
http://docs.ceph.com/docs/master/rados/operations/erasure-code/ states:
> Choosing the right profile is important because it cannot be modified
> after the pool is created: a new pool with a different profile needs
> to be created and all objects from the previous poo
Thanks guys.
On 04-03-19 22:18, Smith, Eric wrote:
> This will cause data migration.
> -Original Message-
> From: ceph-users On Behalf Of Paul
> Emmerich
> Sent: Monday, March 4, 2019 2:32 PM
> To: Kees Meijs
> Cc: Ceph Users
> Subject: Re:
Experienced similar issues. Our cluster internal network (completely
separated) now has NOTRACK (no connection state tracking) iptables rules.
In full:
> # iptables-save
> # Generated by xtables-save v1.8.2 on Wed Jul 17 14:57:38 2019
> *filter
Hi list,
Yesterday afternoon we experienced a compute node outage in our
OpenStack (obviously Ceph backed) cluster.
We tried to (re)start compute instances again as fast as possible,
resulting in some KVM/RBD clients getting blacklisted. The problem was
spotted very quickly so we could remove the
Hi Paul,
Okay, thanks for clarifying. If we see the phenomenon again, we'll just
leave it be.
On 03-08-2019 14:33, Paul Emmerich wrote:
> The usual reason for blacklisting RBD clients is breaking an exclusive
> lock because the previous owner seemed to have crashed.
> Blacklisting the old own