Hi all,
We're reading from a Ceph Luminous pool using the librados asynchronous I/O
API. We're seeing some concerning memory usage patterns when we read many
objects in sequence.
The expected behaviour is that our memory usage stabilises at a small
amount, since we're just fetching objects and ign
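For context, a minimal sketch of the access pattern being described follows.
This is an assumed reconstruction, not the poster's actual code; the pool name
"testpool", the object names "obj_<i>" and the 4 MB read size are placeholders.

#include <rados/librados.h>
#include <stdio.h>
#include <stdlib.h>

/* Read many objects in sequence with the librados C async API and
 * discard the data, releasing each completion as soon as its read
 * has finished. */
int main(void)
{
    rados_t cluster;
    rados_ioctx_t ioctx;
    char *buf = malloc(4 * 1024 * 1024);   /* reused 4 MB read buffer */
    char oid[64];

    if (!buf)
        return 1;
    if (rados_create(&cluster, NULL) < 0 ||
        rados_conf_read_file(cluster, "/etc/ceph/ceph.conf") < 0 ||
        rados_connect(cluster) < 0)
        return 1;
    if (rados_ioctx_create(cluster, "testpool", &ioctx) < 0)
        return 1;

    for (int i = 0; i < 100000; i++) {
        rados_completion_t comp;
        snprintf(oid, sizeof(oid), "obj_%d", i);
        rados_aio_create_completion(NULL, NULL, NULL, &comp);
        rados_aio_read(ioctx, oid, comp, buf, 4 * 1024 * 1024, 0);
        rados_aio_wait_for_complete(comp);
        rados_aio_release(comp);   /* drop the completion straight away */
    }

    free(buf);
    rados_ioctx_destroy(ioctx);
    rados_shutdown(cluster);
    return 0;
}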
Quoting Yan, Zheng (uker...@gmail.com):
>
>
> please add '-f' option (trace child processes' syscall) to strace,
Good suggestion. We now see all apache child processes doing their thing.
We have been, on and off, stracing / debugging this issue. Nothing
obvious. We are still trying to get o
Hi,
Once in a while, today a bit more often, the MDS is logging the
following:
mds.mds1 [WRN] replayed op client.15327973:15585315,15585103 used ino
0x19918de but session next is 0x1873b8b
Nothing of importance is logged in the mds ("debug_mds_log": "1/5").
What does this warning messag
On 09/12/2018 05:29 AM, Daniel Goldbach wrote:
Hi all,
We're reading from a Ceph Luminous pool using the librados asynchronous
I/O API. We're seeing some concerning memory usage patterns when we
read many objects in sequence.
The expected behaviour is that our memory usage stabilises at a s
Yep, those completions are maintaining bufferlist references IIRC, so
they’re definitely holding the memory buffers in place!
On Wed, Sep 12, 2018 at 7:04 AM Casey Bodley wrote:
>
>
> On 09/12/2018 05:29 AM, Daniel Goldbach wrote:
> > Hi all,
> >
> > We're reading from a Ceph Luminous pool using
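To illustrate the point above about completions maintaining bufferlist
references, here is a hypothetical example (not the code from the original
report): if the completions are kept and only released after the whole batch,
the memory behind every read stays referenced, and resident memory only drops
back once the final release pass runs.

#include <rados/librados.h>

#define NOBJ     1024
#define BUF_LEN  (4 * 1024 * 1024)

/* Hypothetical batch reader: completions are released only at the end,
 * so the memory behind each read is held for the whole batch. */
void read_batch(rados_ioctx_t ioctx, char *oids[NOBJ], char *bufs[NOBJ])
{
    rados_completion_t comps[NOBJ];

    for (int i = 0; i < NOBJ; i++) {
        rados_aio_create_completion(NULL, NULL, NULL, &comps[i]);
        rados_aio_read(ioctx, oids[i], comps[i], bufs[i], BUF_LEN, 0);
    }
    for (int i = 0; i < NOBJ; i++) {
        rados_aio_wait_for_complete(comps[i]);
        rados_aio_release(comps[i]);  /* memory is only let go from here on */
    }
}

Releasing each completion as soon as its read has been consumed avoids holding
the whole batch in memory at once.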
Is osxfuse the only, and best performing, way to mount a ceph
filesystem on an osx client?
http://docs.ceph.com/docs/mimic/dev/macos/
I am now testing cephfs performance on a client with the fio libaio
engine. This engine does not exist on osx, but there is a posixaio engine. Does
anyone have ex
We are benchmarking a test machine which has:
8 cores, 64GB RAM
12 * 12 TB HDD (SATA)
2 * 480 GB SSD (SATA)
1 * 240 GB SSD (NVME)
Ceph Mimic
Baseline benchmark for HDD only (Erasure Code 4+2)
Write 420 MB/s, 100 IOPS, 150ms latency
Read 1040 MB/s, 260 IOPS, 60ms latency
Now we moved WAL to the SS
Hi Jan,
how did you move the WAL and DB to the SSD/NVMe? By recreating the
OSDs or a different approach? Did you check afterwards that the
devices were really used for that purpose? We had to deal with that a
couple of months ago [1] and it's not really obvious if the new
devices are real
If your writes are small enough (64k or smaller) they're being placed on
the WAL device regardless of where your DB is. If you change your testing
to use larger writes you should see a difference by adding the DB.
Please note that the community has never recommended using less than 120GB
DB for
Hi all,
I'm having trouble turning off the warning "1 pools have many more
objects per pg than average".
I've tried a lot of variations on the below, my current ceph.conf:
#...
[mon]
#...
mon_pg_warn_max_object_skew = 0
All of my monitors have been restarted.
Seems like I'm missing someth
The issue continues even when I do rados_aio_release(completion) at the end
of the readobj(..) definition in the example. Also, in our production code
we call rados_aio_release for each completion and we still see the issue
there. The release command doesn't guarantee instant release, so could it
b
[Moving this to ceph-users where it will get more eyeballs.]
On Wed, 12 Sep 2018, Andrew Cassera wrote:
> Hello,
>
> Any help would be appreciated. I just created two clusters in the lab. One
> cluster is running jewel 10.2.10 and the other cluster is running luminous
> 12.2.8. After creating
When having an hdd bluestore osd with collocated wal and db:
- What performance increase can be expected if one would move the wal to
an ssd?
- What performance increase can be expected if one would move the db to
an ssd?
- Would the performance be a lot if you have a very slow hdd (and thu
You already have a thread talking about benchmarking the addition of WAL
and DB partitions to an OSD. Why are you creating a new one about the
exact same thing? As with everything, the performance increase isn't
solely answerable by which drives you have; there are a lot of factors that
coul
Sorry, I was wrong that it was you. I just double checked. But there is a
new thread as of this morning about this topic, where someone is running
benchmark tests with numbers, titled "Benchmark does not show gains with DB
on SSD".
On Wed, Sep 12, 2018 at 12:20 PM David Turner wrote:
> You alread
What thread? I have posted this with this specific subject so it is easier
to find in the future, and this is not a 'sub question' of someone's
problem. I'm hoping for others to post their experience/results. I thought
that if CERN can give estimates, people here can too.
-Original Message-
From: D
That code bit is just "we have an incoming message with data", which is
what we'd expect, but it means it's not very helpful for tracking down the
source of any leaks.
My guess is still very much that somehow there are deallocations missing
here. Internally, the synchronous API is wrapping the async
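As a rough picture of what "the synchronous API is wrapping the async" means
from the caller's side, a simplified sketch follows; this is not the actual
librados internals, just the caller-visible equivalence: a blocking read
behaves like an async read that is waited on, harvested and released before
returning.

#include <rados/librados.h>

/* Simplified sync-over-async read: issue the async read, wait for it,
 * collect the result, and release the completion before returning. */
static int blocking_read(rados_ioctx_t ioctx, const char *oid,
                         char *buf, size_t len, uint64_t off)
{
    rados_completion_t comp;
    int r = rados_aio_create_completion(NULL, NULL, NULL, &comp);
    if (r < 0)
        return r;

    r = rados_aio_read(ioctx, oid, comp, buf, len, off);
    if (r == 0) {
        rados_aio_wait_for_complete(comp);
        r = rados_aio_get_return_value(comp);  /* bytes read, or -errno */
    }
    rados_aio_release(comp);  /* nothing should outlive the call */
    return r;
}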
Eugene:
Between tests we destroyed the OSDs and created them from scratch. We used
a Docker image to deploy Ceph on one machine.
I've seen that there are WAL/DB partitions created on the disks.
Should I also check somewhere in ceph config that it actually uses those?
David:
We used 4MB writes.
I kn
I couldn't find any sign of a networking issue at the OS or switches. No
changes have been made in those to get the cluster stable again. I
looked through a couple of OSD logs and here is a selection of some of the most
frequent errors they were getting. Maybe something below is more obvious
to you.
On 12/09/18 17:06, Ján Senko wrote:
We are benchmarking a test machine which has:
8 cores, 64GB RAM
12 * 12 TB HDD (SATA)
2 * 480 GB SSD (SATA)
1 * 240 GB SSD (NVME)
Ceph Mimic
Baseline benchmark for HDD only (Erasure Code 4+2)
Write 420 MB/s, 100 IOPS, 150ms latency
Read 1040 MB/s, 260 IOPS,
Did you restart the mons or inject the option?
Paul
2018-09-12 17:40 GMT+02:00 Chad William Seys :
> Hi all,
> I'm having trouble turning off the warning "1 pools have many more objects
> per pg than average".
>
> I've tried a lot of variations on the below, my current ceph.conf:
>
> #...
> [mo
On Tue, Sep 11, 2018 at 5:32 PM Benjamin Cherian
wrote:
> Ok, that’s good to know. I was planning on using an EC pool. Maybe I'll
> store some of the larger kv pairs in their own objects or move the metadata
into its own replicated pool entirely. If the storage mechanism is the
> same, is ther
Hi Paul,
Yes, all monitors have been restarted.
Chad.
Hi
Replying to the list - I replied directly instead, by accident.
---
Sorry I was camping for a week and was disconnected without data for the most
part
Yes it is over iSCSI - 2 iscsi nodes
We've set both iscsi and vmware hosts to SUSE recommended settings.
In addition we've set round robi
Any chance you know the LBA or byte offset of the corruption so I can
compare it against the log?
On Wed, Sep 12, 2018 at 8:32 PM wrote:
>
> Hi Jason,
>
> On 2018-09-10 11:15:45-07:00 ceph-users wrote:
>
> On 2018-09-10 11:04:20-07:00 Jason Dillaman wrote:
>
>
> > In addition to this, we are se
On 2018-09-12 17:35:16-07:00 Jason Dillaman wrote:
Any chance you know the LBA or byte offset of the corruption so I can
compare it against the log?
The LBAs of the corruption are 0xA74F000 through 175435776
On Wed, Sep 12, 2018 at 8:32 PM wrote:
>
> Hi Jason,
On Wed, Sep 12, 2018 at 10:15 PM wrote:
>
> On 2018-09-12 17:35:16-07:00 Jason Dillaman wrote:
>
>
> Any chance you know the LBA or byte offset of the corruption so I can
> compare it against the log?
>
> The LBAs of the corruption are 0xA74F000 through 175435776
Are you saying the corruption sta
Greg, Paul,
Thank you for the feedback. This has been very enlightening. One last
question (for now at least). Are there any expected performance impacts
from having I/O to multiple pools from the same client? (Given how RGW and
CephFS store metadata, I would hope not, but I thought I'd ask.) Base
Nope, there shouldn’t be any impact apart from the potential issues that
arise from breaking up the I/O stream, which in the case of either a
saturated or mostly-idle RADOS cluster should not be an issue.
-Greg
On Wed, Sep 12, 2018 at 9:24 PM Benjamin Cherian
wrote:
> Greg, Paul,
>
> Thank you for