Yes, I"d say they aren't related. Since you can repeat this issue
after a fresh VM boot, can you enable debug-level logging for said VM
(add "debug rbd = 20" to your ceph.conf) and recreate the issue. Just
to confirm, this VM doesn't have any features enabled besides
(perhaps) layering?
On Fri, Ju
Hi,
I have a CephFS cluster based on Ceph version: 10.2.5
(c461ee19ecbc0c5c330aca20f7392c9a00730367)
I use ceph-fuse to mount CephFS volume on Debian with Ceph version 10.2.5
I would like to set a quota on a CephFS folder:
# setfattr -n ceph.quota.max_bytes -v 10 /mnt/cephfs/foo
setfattr: /mnt
The problem seems to be reliably reproducible after a fresh reboot of the VM…
With this knowledge, I can cause the hung IO condition while having noscrub and
nodeepscrub set.
Does this confirm this is not-related to http://tracker.ceph.com/issues/20041 ?
--
Eric
On 6/22/17, 11:23 AM, "Hall, E
I set the "mon_data" configuration item and "user" configuration item in my
ceph.conf, and start ceph-mon using the user "ceph".
I tested directly calling the "ceph-mon" command to start the daemon as
"root" and as "ceph"; there was no problem. Only when starting through
systemctl did the start fail.
Dear all,
running all servers and clients on a CentOS release with a 3.10.*
kernel, I'm facing this choice:
* sacrifice TUNABLES and downgrade all the cluster to
CEPH_FEATURE_CRUSH_TUNABLES3 (which should be the right profile for
jewel on old kernel 3.10)
* sacrifice KERNEL RBD and map Cep
Hi Ashley,
I know; I was already expecting that the bottleneck would be the
minimum of network bandwidth and disk throughput (and it was currently
the disks, per my first email).
I think the write speed is still too low.
I read that removing the journal overhead is not a good idea.
However, I'm writing twice to an SSD...
Hi Mark,
having 2 nodes for testing allows me to downgrade replication to 2x
(until production).
The SSDs have the following product details:
* sequential read: 540MB/sec
* sequential write: 520MB/sec
As you state, my sequential write should be:
~600 * 2 (copies) * 2 (journal write per c
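The back-of-the-envelope math being set up above can be sketched as a toy model (the 2x replication and 2x filestore journal write are from the thread; treating them as a simple divisor of raw SSD bandwidth is my own simplification):

```python
def client_write_mb_s(aggregate_ssd_mb_s, copies=2, journal_writes=2):
    # With filestore, every client byte is written copies * journal_writes
    # times in total, so usable client bandwidth is roughly the raw SSD
    # bandwidth divided by that write-amplification factor.
    return aggregate_ssd_mb_s / (copies * journal_writes)

# e.g. two SSDs rated ~520 MB/s sequential write (figure from this thread):
print(client_write_mb_s(2 * 520))  # -> 260.0
```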
On Thu, Jun 22, 2017 at 5:31 PM, Casey Bodley wrote:
>
> On 06/22/2017 10:40 AM, Dan van der Ster wrote:
>>
>> On Thu, Jun 22, 2017 at 4:25 PM, Casey Bodley wrote:
>>>
>>> On 06/22/2017 04:00 AM, Dan van der Ster wrote:
I'm now running the three relevant OSDs with that patch. (Recompile
Very good to know!
Thanks for the info.
On 22/06/2017 20:15, Maged Mokhtar wrote:
Generally you can measure your bottleneck via a tool like
atop/collectl/sysstat and see how busy (ie %busy, %util ) your
resources are: cpu/disks/net.
As was pointed out, in your case you will most prob
*Of course, yes!*
The SSD bottleneck is the SATA controller.
If you use an NVMe/PCIe controller you get almost 2400MB/sec from the
same SSD instead of 580MB/sec.
2400MB/sec x 8 = ~19Gbit/sec
580MB/sec x 8 = ~5 Gbit/sec
If you don't trust me take a look at this benchmark between 2 really
common S
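The unit conversion in the figures above is just MB/s x 8 / 1000; a quick sketch:

```python
def mb_s_to_gbit_s(mb_s):
    # 1 MB/s = 8 Mbit/s; divide by 1000 for (decimal) Gbit/s
    return mb_s * 8 / 1000

print(mb_s_to_gbit_s(2400))  # NVMe/PCIe SSD: 19.2 Gbit/s (~19 as above)
print(mb_s_to_gbit_s(580))   # SATA SSD: 4.64 Gbit/s (~5 as above)
```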
Hello, everyone!
We are working on a project which uses RBD images (formatted with XFS) as
home folders for the project's users. The access speed and the overall
reliability have been pretty good, so far.
From the architectural perspective, our main focus is on providing a
seamless user experien
Only features enabled are layering and deep-flatten:
root@cephproxy01:~# rbd -p vms info c9c5db8e-7502-4acc-b670-af18bdf89886_disk
rbd image 'c9c5db8e-7502-4acc-b670-af18bdf89886_disk':
size 20480 MB in 5120 objects
order 22 (4096 kB objects)
block_name_prefix: rbd_data.f4e
CentOS 7.3's krbd supports Jewel tunables (CRUSH_TUNABLES5) and does
not support NBD since that driver is disabled out-of-the-box. As an
alternative for NBD, the goal is to also offer LIO/TCMU starting with
Luminous and the next point release of CentOS (or a vanilla >=4.12-ish
kernel).
On Fri, Jun
On Fri, Jun 23, 2017 at 8:47 AM, Hall, Eric wrote:
> I have debug logs. Should I open a RBD tracker ticket at
> http://tracker.ceph.com/projects/rbd/issues for this?
Yes, please. You might need to use the "ceph-post-file" utility if the
logs are too large to attach to the ticket. In that case,
You could move your Journal to another SSD this would remove the double write.
Ideally you’d want one or two PCIe NVME in the servers for the Journal.
Or if you can hold off a bit then bluestore, which removes the double write,
however is still handy to move some of the services to a seperate di
Hi,
On 06/23/2017 02:44 PM, Bogdan SOLGA wrote:
Hello, everyone!
We are working on a project which uses RBD images (formatted with XFS)
as home folders for the project's users. The access speed and the
overall reliability have been pretty good, so far.
From the architectural perspective, o
Hello,
We are in the process of evaluating the performance of a testing
cluster (3 nodes) with ceph jewel. Our setup consists of:
3 monitors (VMs)
2 physical servers each connected with 1 JBOD running Ubuntu Server 16.04
Each server has 32 threads @2.1GHz and 128GB RAM.
The disk distribution per
Hi Ashley,
You could move your Journal to another SSD this would remove the
double write.
If I move the journal to another SSD, I will lose an available OSD, so
that is to say an improvement of *x2* and then a decrease of *x½*...
this should not improve performance in any case on a full-SSD di
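The trade-off described above can be sketched with a toy model (my assumption: bandwidth is the only constraint, and a dedicated journal SSD must absorb every write once):

```python
def colocated_mb_s(n_ssd, ssd_mb_s, journal_writes=2):
    # journals colocated on the data SSDs: each SSD writes twice per byte
    return n_ssd * ssd_mb_s / journal_writes

def dedicated_journal_mb_s(n_ssd, ssd_mb_s):
    # one SSD dedicated to journals, the rest write data once each;
    # the slower of the two sides is the bottleneck
    return min(ssd_mb_s, (n_ssd - 1) * ssd_mb_s)

# With only 2 SSDs per box, the x2 journal gain and the lost OSD cancel:
print(colocated_mb_s(2, 520))          # -> 520.0
print(dedicated_journal_mb_s(2, 520))  # -> 520
```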
Sorry for not replying inline.
If you can get 6 OSDs per NVMe then, as long as you're getting a
decently rated NVMe, your bottleneck will be the NVMe, but it will
still be an improvement over your current bottleneck.
You could add two NVMe OSDs, but their higher performance would be lost
along with the other 12
This is the first release candidate for Luminous, the next long term
stable release.
Ceph Luminous will be the foundation for the next long-term
stable release series. There have been major changes since Kraken
(v11.2.z) and Jewel (v10.2.z).
Major Changes from Kraken
We have found that we can place 18 journals on the Intel 3700 PCI-e
devices comfortably. We also tried it with fio, adding more jobs to
ensure that performance did not drop off (via Sebastian Han's tests
described at
https://www.sebastien-han.fr/blog/2014/10/10/ceph-how-to-test-if-your-ssd-is-sui
Ashley,
but... instead of using NVMe for the journal, why not add 2 OSDs to the
cluster? Increasing the number of OSDs instead of improving the
performance of the current OSDs?
On 23/06/2017 15:40, Ashley Merrick wrote:
Sorry for not replying inline.
If you can get 6 OSD’s per a NVME as long as your gettin
But then you'd have a big mismatch of performance across your OSDs,
which is never recommended with Ceph.
It's all about what you can do with your current boxes' capacity to
increase performance across the whole OSD set.
,Ashley
Sent from my iPhone
On 23 Jun 2017, at 10:40 PM, Massimiliano Cuttini
http://tracker.ceph.com/issues/20393 created with supporting logs/info noted.
--
Eric
On 6/23/17, 7:54 AM, "Jason Dillaman" wrote:
On Fri, Jun 23, 2017 at 8:47 AM, Hall, Eric
wrote:
> I have debug logs. Should I open a RBD tracker ticket at
http://tracker.ceph.com/projects/rbd/issu
Not all servers are real CentOS servers.
Some of them are dedicated distributions locked at 7.2 with the kernel
fixed at 3.10.
Which, as far as I can understand, needs CRUSH_TUNABLES2 and not even 3!
http://cephnotes.ksperis.com/blog/2014/01/21/feature-set-mismatch-error-on-ceph-kernel-client
So
Hi Everybody,
I also see that a VM on top of this drive sees an even lower speed:
hdparm -Tt --direct /dev/xvdb
/dev/xvdb:
Timing O_DIRECT cached reads: 2596 MB in 2.00 seconds = 1297.42 MB/sec
Timing O_DIRECT disk reads: 910 MB in 3.00 seconds = 303.17 MB/sec
It seems there is a huge diff
Did you set "setuser match path" in your config? If you look at the
release notes for Infernalis, it outlines how to still use the ceph user.
Also to note below from Infernalis,
"Ceph daemons now run as user and group ceph by default. The ceph user has
a static UID assigned by Fedora and Debian (
What is the output of the following command? If a directory has no quota,
it should respond "0" as the quota.
# getfattr -n ceph.quota.max_bytes /mnt/cephfs/foo
I tested this in my home cluster that uses ceph-fuse to mount cephfs under
the david user (hence no need for sudo). I'm using Ubuntu 16
On Fri, Jun 23, 2017 at 4:59 PM, David Turner wrote:
> What is the output of the following command? If a directory has no quota,
> it should respond "0" as the quota.
> # getfattr -n ceph.quota.max_bytes /mnt/cephfs/foo
>
> I tested this in my home cluster that uses ceph-fuse to mount cephfs unde
I don't really have anything to add to this conversation, but I see emails
like this in the ML all the time. Have you looked through the archives?
Everything that's been told to you and everything you're continuing to ask
have been covered many many times.
http://lists.ceph.com/pipermail/ceph-use
If you have no control over what kernel the clients are going to use, then
I wouldn't even consider using the kernel driver for the clients. For me,
I would do anything to maintain the ability to use the object map which
would require the 4.9 kernel to use with the kernel driver. Because of
this
2017-06-23 18:06 GMT+02:00 John Spray :
> I can't immediately remember which version we enabled quota by default
> in -- you might also need to set "client quota = true" in the client's
> ceph.conf.
>
>
Do I need to set this option only on the host where I want to mount the
volume, or on all MDS hosts?
What
2017-06-23 17:59 GMT+02:00 David Turner :
> It might be possible that it doesn't want an absolute path and wants a
> relative path for setfattr, although my version doesn't seem to care. I
> mention that based on the getfattr response.
>
>
I did the test with a relative path and I got the same err
Two of our OSD systems hit 75% disk utilization, so I added another
system to try and bring that back down. The system was usable for a day
while the data was being migrated, but now the system is not responding
when I try to mount it:
mount -t ceph ceph-0,ceph-1,ceph-2,ceph-3:6789:/ /home -
I doubt the ceph version from 10.2.5 to 10.2.7 makes that big of a
difference. Read through the release notes since 10.2.5 to see if it
mentions anything about cephfs quotas.
On Fri, Jun 23, 2017 at 12:30 PM Stéphane Klein
wrote:
> 2017-06-23 17:59 GMT+02:00 David Turner :
>
>> It might be poss
# ceph health detail | grep 'ops are blocked'
# ceph osd blocked-by
My guess is that you have an OSD that is in a funky state blocking the
requests and the peering. Let me know what the output of those commands
are.
Also what are the replica sizes of your 2 pools? It shows that only 1 OSD
was l
Thanks for the response:
[root@ceph-control ~]# ceph health detail | grep 'ops are blocked'
100 ops are blocked > 134218 sec on osd.13
[root@ceph-control ~]# ceph osd blocked-by
osd num_blocked
A problem with osd.13?
Dan
On 06/23/2017 02:03 PM, David Turner wrote:
# ceph health detail | grep
Hi everybody,
I just realized that all my images are completely without features:
rbd info VHD-4c7ebb38-b081-48da-9b57-aac14bdf88c4
rbd image 'VHD-4c7ebb38-b081-48da-9b57-aac14bdf88c4':
size 102400 MB in 51200 objects
order 21 (2048 kB objects)
block_name_
Ok,
I get the point.
On 23/06/2017 17:42, Ashley Merrick wrote:
But then you'd have a big mismatch of performance across your OSDs,
which is never recommended with Ceph.
It's all about what you can do with your current boxes' capacity to
increase performance across the whole OSD set.
,Ash
Ok,
so if I understand your opinion correctly: if you cannot choose the
kernel, then you'd sacrifice kernel RBD immediately.
I was of the same opinion, but I'm still gathering opinions.
Can you tell me whether, by using nbd-rbd, I'm losing any features?
I just cannot understand if nbd is a sor
All of the features you are talking about likely require exclusive-lock,
which requires the 4.9 Linux kernel. You cannot map any RBDs that have
these features enabled with any kernel older than that.
The features you can enable are layering, exclusive-lock, object-map,
and fast-diff. You cann
Something about it is blocking the cluster. I would first try running this
command. If that doesn't work, then I would restart the daemon.
# ceph osd down 13
Marking it down should force it to reassert itself to the cluster without
restarting the daemon and stopping any operations it's working
I've never used nbd-rbd; I would use rbd-fuse. Its version should match
your cluster's running version, as it's a package compiled with each
Ceph release.
On Fri, Jun 23, 2017 at 3:58 PM Massimiliano Cuttini
wrote:
> Ok,
>
> so if I understand correctly your opinion: if you cannot choiche the
>
Ok,
at the moment my clients use only nbd-rbd; can I use all these
features, or is this something unavoidable?
I guess it's ok.
Reading around, it seems that a lost feature cannot be re-enabled, due
to backward compatibility with old clients.
... I guess I'll need to export and import into a new image fully f
What is your use case? That matters the most.
On Fri, Jun 23, 2017 at 4:31 PM David Turner wrote:
> I've never used nbd-rbd, I would use rbd-fuse. It's version should match
> your cluster's running version as it's a package compiled with each ceph
> release.
>
> On Fri, Jun 23, 2017 at 3:58 PM
I upgraded to Jewel from Hammer and was able to enable those features
on all of my rbds that were format 2, which yours is. Just test it on
some non-customer data and see how it goes.
On Fri, Jun 23, 2017, 4:33 PM Massimiliano Cuttini
wrote:
> Ok,
>
> At moment my client use only nbd-rbd, can I
On Fri, 23 Jun 2017, Abhishek L wrote:
> This is the first release candidate for Luminous, the next long term
> stable release.
I just want to reiterate that this is a release candidate, not the final
luminous release. We're still squashing bugs and merging a few last
items. Testing is welcome,
We are using replica 2 and min size is 2. A small amount of data is
sitting around from when we were running the default 3.
Looks like the problem started around here:
2017-06-22 14:54:29.173982 7f3c39f6f700 0 log_channel(cluster) log
[INF] : 1.2c9 deep-scrub ok
2017-06-22 14:54:29.690401 7f
I guess you updated those features before the commit that fixed this:
https://github.com/ceph/ceph/blob/master/src/include/rbd/features.h
As stated:
// features that make an image inaccessible for read or write by
/// clients that don't understand them
#define RBD_FEATURES_INCOMPATIBLE
What seems strange is that the features are *all disabled* when I
create some images,
while Ceph should use at least the Jewel default settings.
Do I need to put something in ceph.conf in order to use the default
settings?
On 23/06/2017 23:43, Massimiliano Cuttini wrote:
I guess you upd
2017-06-23 20:44 GMT+02:00 David Turner :
> I doubt the ceph version from 10.2.5 to 10.2.7 makes that big of a
> difference. Read through the release notes since 10.2.5 to see if it
> mentions anything about cephfs quotas.
>
Yes, same error with 10.2.7 :(
The general advice floating around is that you want CPUs with high
clock speeds rather than more cores, to reduce latency and increase
IOPS for SSD setups (see also
http://www.sys-pro.co.uk/ceph-storage-fast-cpus-ssd-performance/). So
something like an E5-2667V4 might bring better results in that sit
Your min_size=2 is why the cluster is blocking and you can't mount cephfs.
Those 2 PGs, while the cluster is performing the backfilling, are currently
only on 1 OSD (osd.13). That is not enough OSDs to satisfy the min_size,
so any requests for data on those PGs will block and wait until a second
O
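A minimal sketch of the rule described above (illustrative only; the real OSD peering logic is far more involved):

```python
def io_blocked(active_replicas, min_size):
    # Client IO to a PG blocks while fewer than min_size replicas are active.
    return active_replicas < min_size

# The situation in this thread: size=2, min_size=2, and the PG is
# temporarily on a single OSD (osd.13) during backfill:
print(io_blocked(active_replicas=1, min_size=2))  # -> True, requests hang
print(io_blocked(active_replicas=2, min_size=2))  # -> False, IO resumes
```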
It all depends on how you are creating your RBDs. Whatever you're using
is likely overriding the defaults with a custom line in its code.
What you linked did not say that you cannot turn on the features I
mentioned. There are indeed some features that cannot be enabled if
they have ever been
I do not have a single mention of quotas, or even MDS, in my config files
anywhere in my cluster... which is to say that everything is running on
default settings. What settings do you have explicitly stated in your
config file related to MDS and/or quotas on both your client server and
your MDS s