The interface and the switch should have the same MTU, and that should not cause
any issues (setting the switch MTU higher is always safe, though).
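If it helps, this is roughly how I would verify it on the host side (a sketch assuming
Linux with iproute2; the interface name is a placeholder):
  ip link show eth0                # shows the current interface MTU
  ip link set dev eth0 mtu 9000    # example: raise it for jumbo frames
and then compare that against the MTU/maximum frame size configured on the switch ports.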
Aren’t you encapsulating the mon communication in some SDN like Open vSwitch? Is
that a straight L2 connection?
I think this is worth investigating. For example a
Hi,
I'm not having much luck with my proof-of-concept Ceph deployment, so I'll just
ask the question here.
Does Ceph provide multiprotocol access to the same file system like Isilon's
OneFS (CIFS, NFS, Swift, HDFS) and, to a lesser extent, NetApp's Data ONTAP
(CIFS, NFS)?
Thanks,
Alex Dacre
On 03/06/15 10:12, Alexander Dacre wrote:
Hi,
I’m not having much luck with my proof-of-concept Ceph deployment, so
I’ll just ask the question here.
Does Ceph provide multiprotocol access to the same file system like
Isilon’s OneFS (CIFS, NFS, Swift, HDFS) and, to a lesser extent,
NetApp
On 06/02/2015 07:21 PM, Paul Evans wrote:
Kenneth,
My guess is that you’re hitting the cache_target_full_ratio on an
individual OSD, which is easy to do since most of us tend to think of
the cache_target_full_ratio as an aggregate of the OSDs (which it is not
according to Greg Farnum). Thi
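For reference, that ratio is set per cache pool; a hedged example of how it is usually
adjusted (the pool name "cache" and the values are only illustrative):
  ceph osd pool set cache cache_target_full_ratio 0.8
  ceph osd pool set cache cache_target_dirty_ratio 0.6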
On 06/02/2015 07:08 PM, Nick Fisk wrote:
Hi Kenneth,
I suggested an idea which may help with this; it is currently being
developed:
https://github.com/ceph/ceph/pull/4792
In short there is a high and low threshold with different flushing
priorities. Hopefully this will help with burst
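Assuming the pull request lands roughly as described, I would expect the knob to look
something like this (the name and value are my guess, not final):
  ceph osd pool set cache cache_target_dirty_high_ratio 0.6
i.e. flushing starts gently at the normal dirty ratio and becomes aggressive once the
high ratio is crossed.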
> -----Original Message-----
> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
> Kenneth Waegeman
> Sent: 03 June 2015 10:51
> To: Nick Fisk; ceph-users@lists.ceph.com
> Subject: Re: [ceph-users] bursty IO, ceph cache pool can not follow
evictions
>
>
>
> On 06/02/2015 0
Thanks for the response, John.
Sorry, I should have been a bit clearer. Let us assume that a user puts a file
into the object store via rados put test-object-1 testfile.txt --pool=data;
will that same file be accessible via CephFS?
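To make it concrete, the sequence I have in mind is something like this (the pool name
and mount point are just examples):
  rados put test-object-1 testfile.txt --pool=data
  ls /mnt/cephfs        # would testfile.txt (or test-object-1) show up here?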
Alex Dacre
Systems Engineer
+44 131 560 1466
From: John Spray
Thanks for a very helpful answer.
So if I understand correctly, then what I want (crash consistency with RPO>0)
isn’t possible right now in any way.
If there is no ordering in the RBD cache, then ignoring barriers sounds like a very
bad idea as well.
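For context, the client-side cache I am referring to is configured along these lines
(a sketch of a ceph.conf [client] section; the values are illustrative, not a
recommendation):
  [client]
  rbd cache = true
  rbd cache writethrough until flush = true
  rbd cache max dirty = 25165824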
Any thoughts on ext4 with journal_async_commit? That shou
Hi folks,
I've always been confused about the apply/commit latency numbers in "ceph
osd perf" output. I only know for sure that when they get too high,
performance is bad.
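For reference, the output I mean looks roughly like this (the numbers are made up):
  $ ceph osd perf
  osd fs_commit_latency(ms) fs_apply_latency(ms)
    0                    25                   310
    1                     2                     5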
My deployments have seen many different versions of ceph. Pre 0.80.7, I've
seen those numbers being pretty high. After upgrad
On 03/06/15 11:59, Alexander Dacre wrote:
Thanks for the response, John.
Sorry, I should have been a bit clearer. Let us assume that a user
puts a file into the object store via rados put test-object-1
testfile.txt --pool=data, will that same file be accessible via CephFS?
No, CephFS wo
Hi Alexander,
The Archipelago subproject of the Synnefo toolkit does something like this
on top of Ceph, mainly for virtual server images.
Regards
Karsten
You say “even when the cluster is doing nothing” - Are you seeing those numbers
on a completely idle cluster?
Even SSDs can go to sleep, as can CPUs (throttle/sleep states); memory gets
swapped/paged out, TCP connections die, the cache is empty... measuring a
completely idle cluster is not always re
On Wed, Jun 3, 2015 at 5:19 AM, Xu (Simon) Chen wrote:
> Hi folks,
>
> I've always been confused about the apply/commit latency numbers in "ceph
> osd perf" output. I only know for sure that when they get too high,
> performance is bad.
>
> My deployments have seen many different versions of ceph.
The TCP_NODELAY issue was with kernel rbd, *not* with the OSD. The Ceph messenger
code sets it by default.
BTW, I doubt TCP_NODELAY has anything to do with it.
Thanks & Regards
Somnath
From: Jan Schermer [mailto:j...@schermer.cz]
Sent: Wednesday, June 03, 2015 1:37 AM
To: cameron.scr...@solnet
And for reference, we see 0-1ms on Intel NVMe SSDs for commit latency.
On Wed, Jun 3, 2015 at 10:11 AM, Gregory Farnum wrote:
> On Wed, Jun 3, 2015 at 5:19 AM, Xu (Simon) Chen wrote:
> > Hi folks,
> >
> > I've always been confused about the apply/commit latency numbers in "ceph
> > osd perf" ou
First, the cluster is newly built, so it's literally sitting idle and doing
nothing.
Second, this new cluster has exactly the same hardware as my other
clusters, same kernel, and same journal device setup, OSD layout, etc. The
only difference is ceph version, 0.80.7 vs. hammer.
-Simon
On Wednesd
Hi All,
Am I correct in thinking that in the latest kernels, now that krbd is supported
via blk-mq, the maximum queue depth is now 128 and cannot be adjusted?
https://github.com/torvalds/linux/blob/master/drivers/block/rbd.c
3753: rbd_dev->tag_set.queue_depth = BLKDEV_MAX_RQ;
blkdev.h
We are experiencing a problem where nova is opening up all kinds of
sockets like:
nova-comp 20740 nova 1996u unix 0x8811b3116b40 0t0 41081179
/var/run/ceph/ceph-client.volumes.20740.81999792.asok
hitting the open file limits rather quickl
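As a stopgap (not a real fix), we are considering simply raising the open-file limit
for the nova user, e.g. in /etc/security/limits.conf:
  nova  soft  nofile  65536
  nova  hard  nofile  65536
but we would much rather see the underlying behaviour fixed.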
On 06/03/2015 02:31 PM, Robert LeBlanc wrote:
We are experiencing a problem where nova is opening up all kinds of
sockets like:
nova-comp 20740 nova 1996u unix 0x8811b3116b40 0t0 41081179
/var/run/ceph/ceph-client.volumes.20740.81999792.asok
hitting the open file limits rather quickly
Thank you for pointing to the information. I'm glad a fix is already
ready. I can't tell from https://github.com/ceph/ceph/pull/4657, will
this be included in the next point release of hammer?
Thanks,
-
Robert LeBlanc
GPG Fingerprin
On Mon, 1 Jun 2015, Gregory Farnum wrote:
> On Mon, Jun 1, 2015 at 6:39 PM, Paul Von-Stamwitz
> wrote:
> > On Fri, May 29, 2015 at 4:18 PM, Gregory Farnum wrote:
> >> On Fri, May 29, 2015 at 2:47 PM, Samuel Just wrote:
> >> > Many people have reported that they need to lower the osd recovery
>
On Wed, Jun 3, 2015 at 3:44 PM, Sage Weil wrote:
> On Mon, 1 Jun 2015, Gregory Farnum wrote:
>> On Mon, Jun 1, 2015 at 6:39 PM, Paul Von-Stamwitz
>> wrote:
>> > On Fri, May 29, 2015 at 4:18 PM, Gregory Farnum wrote:
>> >> On Fri, May 29, 2015 at 2:47 PM, Samuel Just wrote:
>> >> > Many people h
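For anyone following the thread, the settings usually meant by "lower the osd recovery
..." are, I assume, along these lines (the values are purely illustrative):
  ceph tell osd.* injectargs '--osd-max-backfills 1 --osd-recovery-max-active 1 --osd-recovery-op-priority 1'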
The interface MTU has to be 18 or more bytes lower than the switch MTU or
it just stops working. As far as I know the monitor communication is not
being encapsulated by any SDN.
Cameron Scrace
Infrastructure Engineer
Mobile +64 22 610 4629
Phone +64 4 462 5085
Email cameron.scr...@solnet.co.
Hmm... thanks for sharing this.
Any chance it depends on the switch?
Could you please share what NIC card and switch you are using?
Thanks & Regards
Somnath
From: cameron.scr...@solnet.co.nz [mailto:cameron.scr...@solnet.co.nz]
Sent: Wednesday, June 03, 2015 4:07 PM
To: Somnath Roy; Jan Schermer
Cc:
It most likely is the model of switch. In its settings the minimum frame
size you can set is 1518 and the default MTU is 1500, so it seems the switch wants
the 18-byte difference.
We are using a pair of Netgear XS712T switches and bonded pairs of Intel 10-Gigabit
X540-AT2 (rev 01) NICs with 3 VLANs.
Cameron Scrace
Infr
Surely this is to be expected... 1500 is the IP MTU, and 1518 is the maximum
Ethernet frame size (the 1500-byte payload plus the 14-byte Ethernet header and
the 4-byte FCS); an optional 802.1q VLAN tag adds another 4 bytes, for 1522.
Interface MTU typically means the IP MTU, whereas a layer-2 switch cares more
about layer-2 Ethernet frames, and so MTU in that context means the Ethernet
frame size.
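Spelled out, the arithmetic is simply:
  1500 (IP payload) + 14 (Ethernet header) + 4 (FCS) = 1518 bytes on the wire
  1518 + 4 (optional 802.1q tag) = 1522 bytes
which is exactly the 18-byte gap the switch is insisting on.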
On
On 06/03/2015 03:15 PM, Robert LeBlanc wrote:
Thank you for pointing to the information. I'm glad a fix is already
ready. I can't tell from https://github.com/ceph/ceph/pull/4657, will
this be included in the next point release of hammer?
It'll b
Hi list,
ceph version 0.87
On a small cluster, I’m experiencing a behaviour that I need to understand (I have
reproduced it on bigger clusters too).
I pushed 300MB of data with EC 2+1, so I see 300MB of data and 476MB used, right?
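(My own back-of-the-envelope check: with k=2, m=1, the raw usage should be
300MB x 3/2 = 450MB, so I read the 476MB as that plus some metadata/journal overhead.)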
I deleted the data in the buckets, and after the GC ran I see this report:
904 pgs:
I'm trying to set up some OSDs, and if I try to use a RAID device for the
journal disk it fails: http://pastebin.com/mTw6xzNV
The main issue I see is that the symlink in /dev/disk/by-partuuid is not
being made correctly. When I make it manually and try to activate, I still
get errors; it seems to
Hello,
Actually, after going through the changelogs with a fine-toothed comb and the ol'
Mark I eyeball, I think I might be seeing this:
---
osd: fix journal direct-io shutdown (#9073 Mark Kirkwood, Ma Jianpeng, Somnath
Roy)
---
The details in the various related bug reports certainly make it look
relate