[ceph-users] Scrubbing not using Idle thread?

2016-11-08 Thread Nick Fisk
Hi, I have all the normal options set in ceph.conf (priority and class for disk threads), however scrubs look like they are running as the standard BE/4 class in iotop. Running 10.2.3. E.g. PG dump (shows that OSD 1 will be scrubbing): pg_stat objects mip degr misp unf bytes log
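For anyone searching the archive, a quick sketch of how to reproduce this observation (osd.1 is just the OSD from the example above; run the daemon command on the OSD host):

    ceph pg dump | grep -i scrubbing                 # PG states, e.g. active+clean+scrubbing
    ceph daemon osd.1 config show | grep ioprio      # disk thread ioprio class/priority actually in effect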

Re: [ceph-users] Scrubbing not using Idle thread?

2016-11-08 Thread Dan van der Ster
Hi Nick, That's expected since jewel, which moved the scrub IOs out of the disk thread and into the 'op' thread. They can now be prioritized using osd_scrub_priority, and you can experiment with osd_op_queue = prio/wpq to see if scrubs can be made more transparent with the latter, newer, queuing i
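A minimal sketch of the settings Dan mentions (values are examples, not recommendations; osd_op_queue cannot be injected at runtime and needs an OSD restart):

    # ceph.conf, [osd] section
    osd_op_queue = wpq         # alternative to the jewel default 'prio' queue
    osd_scrub_priority = 1     # default 5; lower relative weight for scrub I/O

    # scrub priority can also be changed on running OSDs
    ceph tell osd.* injectargs '--osd_scrub_priority 1'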

Re: [ceph-users] Replication strategy, write throughput

2016-11-08 Thread Christian Balzer
On Tue, 8 Nov 2016 08:55:47 +0100 Andreas Gerstmayr wrote: > 2016-11-07 3:05 GMT+01:00 Christian Balzer : > > > > Hello, > > > > On Fri, 4 Nov 2016 17:10:31 +0100 Andreas Gerstmayr wrote: > > > >> Hello, > >> > >> I'd like to understand how replication works. > >> In the paper [1] several replicat

Re: [ceph-users] Scrubbing not using Idle thread?

2016-11-08 Thread Nick Fisk
> -Original Message- > From: Dan van der Ster [mailto:d...@vanderster.com] > Sent: 08 November 2016 08:38 > To: Nick Fisk > Cc: Ceph Users > Subject: Re: [ceph-users] Scrubbing not using Idle thread? > > Hi Nick, > > That's expected since jewel, which moved the scrub IOs out of the disk

Re: [ceph-users] Deep scrubbing causes severe I/O stalling

2016-11-08 Thread Kees Meijs
Hi, As promised, our findings so far: * For the time being, the new scrubbing parameters work well. * Using CFQ for spinners and NOOP for SSDs seems to spread load over the storage cluster a little better than deadline does. However, overall latency seems (just a feeling, no numbers t
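A sketch of the scheduler switch described above (device names are placeholders; the change does not persist across reboots, so it usually goes into a udev rule or rc.local). Note that the ioprio-based scrub/disk-thread settings only take effect under CFQ:

    echo cfq  > /sys/block/sdb/queue/scheduler     # spinner
    echo noop > /sys/block/sdc/queue/scheduler     # SSD
    cat /sys/block/sdb/queue/scheduler             # verify: noop deadline [cfq]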

Re: [ceph-users] Deep scrubbing causes severe I/O stalling

2016-11-08 Thread Stefan Priebe - Profihost AG
On 08.11.2016 at 10:17, Kees Meijs wrote: > Hi, > > As promised, our findings so far: > > * For the time being, the new scrubbing parameters work well. Which parameters do you refer to? Currently we're on hammer. > * Using CFQ for spinners and NOOP for SSDs seems to spread load over >
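The exact parameters Kees used are not shown in the snippet above; the ones most commonly tuned for this, which also exist in hammer, look roughly like the sketch below (values are examples only):

    [osd]
    osd_scrub_sleep = 0.1              # pause between scrub chunks
    osd_scrub_chunk_max = 5            # objects per scrub chunk (default 25)
    osd_scrub_load_threshold = 2.0     # don't start scrubs above this load average
    osd_deep_scrub_interval = 2419200  # seconds; spread deep scrubs over ~4 weeks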

[ceph-users] ceph 10.2.3 release

2016-11-08 Thread M Ranga Swami Reddy
Hello, Can you please confirm if Ceph 10.2.3 is ready for production use? Thanks Swami

Re: [ceph-users] ceph 10.2.3 release

2016-11-08 Thread Sean Redmond
Hi, Yes, this is pretty stable; I am running it in production. Thanks On Tue, Nov 8, 2016 at 10:38 AM, M Ranga Swami Reddy wrote: > Hello, > Can you please confirm if Ceph 10.2.3 is ready for production use. > > Thanks > Swami

[ceph-users] aaa

2016-11-08 Thread 张鹏
aaa

Re: [ceph-users] cephfs kernel driver - failing to respond to cache pressure

2016-11-08 Thread 张鹏
I think you may increase mds_bal_fragment_size_max; the default is 10 > > On Oct 4, 2016, at 10:30 AM, John Spray wrote: > > > >> On Tue, Oct 4, 2016 at 5:09 PM, Stephen Horton > wrote: > >> Thank you John. Both my Openstack hosts and the VMs are all running > 4.4.0-38-generic #57-Ubuntu SMP x
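A sketch of how such MDS settings can be changed at runtime (mds.a is a placeholder for the active MDS; values are examples only). For the "failing to respond to cache pressure" warning itself, mds_cache_size is often the more relevant knob:

    ceph tell mds.a injectargs '--mds_bal_fragment_size_max 200000'
    ceph tell mds.a injectargs '--mds_cache_size 400000'    # default 100000 inodes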

Re: [ceph-users] Fwd: Hammer OSD memory increase when add new machine

2016-11-08 Thread Sage Weil
> -- Forwarded message -- > From: Dong Wu > Date: 2016-10-27 18:50 GMT+08:00 > Subject: Re: [ceph-users] Hammer OSD memory increase when add new machine > To: huang jun > Cc: ceph-users > > > 2016-10-27 17:50 GMT+08:00 huang jun : > > how do you add the new machine ? > > does i

Re: [ceph-users] Fwd: Hammer OSD memory increase when add new machine

2016-11-08 Thread zphj1987
I remember CERN had a 30PB test ceph cluster and the OSDs used more memory than usual, and they tuned the osdmap epochs. If it is the osdmap that makes it use more memory, I think you could test with fewer cached osdmap epochs and see if anything changes; the default mon_min_osdmap_epochs is 500. zphj1987 2016-11-

Re: [ceph-users] Bluestore + erasure coding memory usage

2016-11-08 Thread Mark Nelson
Heya, Sorry got distracted with other stuff yesterday. Any chance you could run this for longer? It's tough to tell what's going on from this run unfortunately. Maybe overnight if possible. Thanks! Mark On 11/08/2016 01:10 AM, bobobo1...@gmail.com wrote: Just bumping this and CCing dir

Re: [ceph-users] Bluestore + erasure coding memory usage

2016-11-08 Thread bobobo1...@gmail.com
Unfortunately I don't think overnight is possible. The OOM will kill it in hours, if not minutes. Will the output be preserved/usable if the process is uncleanly terminated? On 8 Nov 2016 08:33, "Mark Nelson" wrote: > Heya, > > Sorry got distracted with other stuff yesterday. Any chance you cou

Re: [ceph-users] Bluestore + erasure coding memory usage

2016-11-08 Thread Mark Nelson
It should be running much slower through valgrind, so it probably won't accumulate very quickly. That was the problem with the earlier trace: there wasn't enough memory used yet to really get us out of the weeds. If it's still accumulating quickly, try to wait until the OSD is up to 4+GB RSS if yo
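For reference, one way to run the experiment Mark describes: start the OSD in the foreground under valgrind's massif tool and watch RSS until it has grown to a few GB (the OSD id and paths are placeholders, and the thread does not say which valgrind tool was actually used):

    systemctl stop ceph-osd@0
    valgrind --tool=massif --massif-out-file=/tmp/massif.osd0 \
        /usr/bin/ceph-osd -f --cluster ceph --id 0 --setuser ceph --setgroup ceph
    # in another shell, watch memory growth:
    watch -n 60 'ps -o rss,etime -C ceph-osd'
    # afterwards, inspect the heap profile:
    ms_print /tmp/massif.osd0 | less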

[ceph-users] How to pick the number of PGs for a CephFS metadata pool?

2016-11-08 Thread Dan Jakubiec
Hello, Picking the number of PGs for the CephFS data pool seems straightforward, but how does one do this for the metadata pool? Any rules of thumb or recommendations? Thanks, -- Dan Jakubiec

[ceph-users] Ceph Days 2017??

2016-11-08 Thread McFarland, Bruce
Is there a schedule available for 2017 Ceph Days?

Re: [ceph-users] How to pick the number of PGs for a CephFS metadata pool?

2016-11-08 Thread Gregory Farnum
On Tue, Nov 8, 2016 at 9:37 AM, Dan Jakubiec wrote: > Hello, > > Picking the number of PGs for the CephFS data pool seems straightforward, but > how does one do this for the metadata pool? > > Any rules of thumb or recommendations? I don't think we have any good ones yet. You've got to worry abo

Re: [ceph-users] forward cache mode support?

2016-11-08 Thread Gregory Farnum
On Mon, Nov 7, 2016 at 2:39 AM, Henrik Korkuc wrote: > Hey, > > trying to activate forward mode for cache pool results in "Error EPERM: > 'forward' is not a well-supported cache mode and may corrupt your data. > pass --yes-i-really-mean-it to force." > > Change for this message was introduced few
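For the archive, the override that the quoted error message asks for looks like this (the cache-pool name is a placeholder; only use it if you understand why 'forward' is flagged as not well supported):

    ceph osd tier cache-mode <cache-pool> forward --yes-i-really-mean-it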

Re: [ceph-users] MDS Problems - Solved but reporting for benefit of others

2016-11-08 Thread Gregory Farnum
On Wed, Nov 2, 2016 at 2:49 PM, Nick Fisk wrote: > A bit more digging, the original crash appears to be similar (but not exactly > the same) as this tracker report > > http://tracker.ceph.com/issues/16983 > > I can see that this was fixed in 10.2.3, so I will probably look to upgrade. > > If the

[ceph-users] radosgw - http status 400 while creating a bucket

2016-11-08 Thread Andrei Mikhailovsky
Hello, I am having issues with creating buckets in radosgw. It started with an upgrade to version 10.2.x. When I am creating a bucket I get the following error on the client side: boto.exception.S3ResponseError: S3ResponseError: 400 Bad Request InvalidArgument my-new-bucket-31337 tx000

Re: [ceph-users] How to pick the number of PGs for a CephFS metadata pool?

2016-11-08 Thread Dan Jakubiec
Thanks Greg, makes sense. Our ceph cluster currently has 16 OSDs, each with an 8TB disk. Sounds like 32 PGs at 3x replication might be a reasonable starting point? Thanks, -- Dan > On Nov 8, 2016, at 14:02, Gregory Farnum wrote: > > On Tue, Nov 8, 2016 at 9:37 AM, Dan Jakubiec wrote: >> Hel
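A rough sketch of that starting point (pool names and the data-pool PG count are placeholders; PG counts can be raised later but not lowered on this release):

    ceph osd pool create cephfs_metadata 32 32
    ceph osd pool set cephfs_metadata size 3
    ceph osd pool create cephfs_data 512 512
    ceph fs new cephfs cephfs_metadata cephfs_data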

Re: [ceph-users] radosgw - http status 400 while creating a bucket

2016-11-08 Thread Yehuda Sadeh-Weinraub
On Tue, Nov 8, 2016 at 3:36 PM, Andrei Mikhailovsky wrote: > Hello > > I am having issues with creating buckets in radosgw. It started with an > upgrade to version 10.2.x > > When I am creating a bucket I get the following error on the client side: > > > boto.exception.S3ResponseError: S3ResponseE

Re: [ceph-users] radosgw - http status 400 while creating a bucket

2016-11-08 Thread Andrei Mikhailovsky
Hi Yehuda, I don't have a multizone setup. The radosgw service was configured about two years ago according to the documentation on ceph.com and hasn't changed through numerous version updates. All was working okay until I upgraded to version 10.2.x. Could you please point me in the right dir

Re: [ceph-users] radosgw - http status 400 while creating a bucket

2016-11-08 Thread Yehuda Sadeh-Weinraub
On Tue, Nov 8, 2016 at 5:05 PM, Andrei Mikhailovsky wrote: > Hi Yehuda, > > I don't have a multizone setup. The radosgw service was configured about two > years ago according to the documentation on ceph.com and haven't changed with > numerous version updates. All was working okay until i've upg
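For others hitting the same 400/InvalidArgument after a jewel upgrade, these read-only commands show what the gateway currently believes about its zone/zonegroup layout (a sketch; run them on the radosgw host):

    radosgw-admin zone get
    radosgw-admin zonegroup get
    radosgw-admin period get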

Re: [ceph-users] Fwd: Hammer OSD memory increase when add new machine

2016-11-08 Thread Dong Wu
Thanks. Going by the CERN 30PB cluster test, where the osdmap caches caused the memory increase, I'll test how these configs (osd_map_cache_size, osd_map_max_advance, etc.) influence the memory usage. 2016-11-08 22:48 GMT+08:00 zphj1987 : > I remember CERN had a test ceph cluster 30PB and the osd use more memer
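A sketch of how to inspect the values involved before and after such a test (osd.0 is a placeholder; the numbers are examples, not recommendations):

    ceph daemon osd.0 config show | egrep 'osd_map_cache_size|osd_map_max_advance'
    ceph daemon osd.0 status          # includes oldest_map / newest_map
    # ceph.conf [osd] section for the test run:
    #   osd_map_cache_size = 200
    #   osd_map_max_advance = 100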

Re: [ceph-users] Bluestore + erasure coding memory usage

2016-11-08 Thread bobobo1...@gmail.com
Okay, I left it for 3h and it seemed to actually stabilise at around 2.3G: http://ix.io/1DEK This was only after disabling other services on the system however. Generally this much RAM isn't available to Ceph (hence the OOM previously). On Tue, Nov 8, 2016 at 9:00 AM, Mark Nelson wrote: > It sho

Re: [ceph-users] Bluestore + erasure coding memory usage

2016-11-08 Thread bobobo1...@gmail.com
Ah, I was actually mistaken. After running without Valgrind, it seems I had misjudged how much it was slowed down. I'll leave it to run overnight as suggested. On Tue, Nov 8, 2016 at 10:44 PM, bobobo1...@gmail.com wrote: > Okay, I left it for 3h and it seemed to actually stabilise at around > 2.3G: h