We actually have 3 OSDs by default, but some users run 5. Typically we're
not looking at more than that. Should we try 64? I suppose I still don't
understand the tradeoffs here - using fewer PGs definitely makes the platform
start faster (and speeds up replication when adding new hosts), but I'm not
sure what having more PGs buys us.

We use the default pools plus radosgw, so we have 12 pools in total.

12 pools, 3 OSDs, with the pool size set to 3 (so each OSD holds a copy of
every PG).
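
For reference, here's my rough arithmetic on why the warning fires with those
numbers (assuming all 12 pools use the same pg_num; 128 is what we had before
lowering it):

    # 12 pools * 128 PGs per pool  = 1536 PGs in the cluster
    # 1536 PGs * 3 replicas        = 4608 PG copies
    # 4608 PG copies / 3 OSDs      = 1536 PGs per OSD  (over the 300 default)
    #
    # With 16 PGs per pool instead:
    # 12 * 16 * 3 / 3              =  192 PGs per OSD  (under 300, no warning)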

On Thu, May 7, 2015 at 11:49 PM, Somnath Roy <somnath....@sandisk.com>
wrote:

>  Sorry, I didn't read through it all. It seems you have 6 OSDs, so I would
> say 128 PGs per pool is not bad!
>
> But if you keep adding pools, you need to lower this number; generally ~64
> PGs per pool should achieve good parallelism with a small number of OSDs.
> If you grow your cluster, create pools with more PGs.
>
> Again, the warning threshold is a ballpark number; if you have more
> powerful compute and fast disks, you can safely ignore this warning.
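>
> (To illustrate - assuming a hypothetical pool name of "mypool" - a pool
> with 64 PGs can be created with:
>
>     ceph osd pool create mypool 64 64
>
> where the two numbers are pg_num and pgp_num.)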
>
>
>
> Thanks & Regards
>
> Somnath
>
>
>
> *From:* Somnath Roy
> *Sent:* Thursday, May 07, 2015 11:44 PM
> *To:* 'Chris Armstrong'
> *Cc:* Stuart Longland; ceph-users@lists.ceph.com
> *Subject:* RE: [ceph-users] "too many PGs per OSD" in Hammer
>
>
>
> Nope, 16 seems far too low for performance.
>
> How many OSDs do you have? And how many pools are you planning to create?
>
>
>
> Thanks & Regards
>
> Somnath
>
>
>
> *From:* Chris Armstrong [mailto:carmstr...@engineyard.com]
> *Sent:* Thursday, May 07, 2015 11:34 PM
> *To:* Somnath Roy
> *Cc:* Stuart Longland; ceph-users@lists.ceph.com
>
> *Subject:* Re: [ceph-users] "too many PGs per OSD" in Hammer
>
>
>
> Thanks for the details, Somnath.
>
>
>
> So it definitely sounds like 128 PGs per pool is way too many? I lowered
> ours to 16 on a new deploy and the warning is gone. I'm not sure that
> number is sufficient, though...
>
>
>
> On Wed, May 6, 2015 at 4:10 PM, Somnath Roy <somnath....@sandisk.com>
> wrote:
>
> Just checking, are you aware of this ?
>
> http://ceph.com/pgcalc/
>
> FYI, the warning is given based on the following logic.
>
>     int per = sum_pg_up / num_in;
>     if (per > g_conf->mon_pg_warn_max_per_osd) {
>         // raise warning..
>     }
>
> This does not consider any resources; it depends solely on the number of
> "in" OSDs and the total number of PGs in the cluster. The default
> mon_pg_warn_max_per_osd is 300, so it seems each OSD in your cluster is
> serving more than 300 PGs.
> It would be good to assign PGs to your pools with the above calculation in
> mind, i.e. no more than 300 PGs per OSD.
> But if your OSDs are on fast disks and the boxes have plenty of compute
> power, you may want to try more PGs per OSD. In that case, raise
> mon_pg_warn_max_per_osd to something big and the warning should go away.
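>
> (A sketch of how to do that - assuming, say, a limit of 1000 - either set
> it in ceph.conf under [mon], or inject it into the monitors at runtime:
>
>     ceph tell mon.* injectargs '--mon_pg_warn_max_per_osd 1000'
>
> The exact value here is just an example.)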
>
> Hope this helps,
>
> Thanks & Regards
> Somnath
>
>
> -----Original Message-----
> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
> Stuart Longland
> Sent: Wednesday, May 06, 2015 3:48 PM
> To: ceph-users@lists.ceph.com
> Subject: Re: [ceph-users] "too many PGs per OSD" in Hammer
>
> On 07/05/15 07:53, Chris Armstrong wrote:
> > Thanks for the feedback. That language is confusing to me, then, since
> > the first paragraph seems to suggest using a pg_num of 128 in cases
> > where we have less than 5 OSDs, as we do here.
> >
> > The warning below that is: "As the number of OSDs increases, choosing
> > the right value for pg_num becomes more important because it has a
> > significant influence on the behavior of the cluster as well as the
> > durability of the data when something goes wrong (i.e. the probability
> > that a catastrophic event leads to data loss).", which suggests that
> > this could be an issue with more OSDs, which doesn't apply here.
> >
> > Do we know if this warning is calculated based on the resources of the
> > host? If I try with larger machines, will this warning change?
>
> I'd be interested in an answer here too.  I just did an update from Giant
> to Hammer and struck the same dreaded error message.
>
> When I initially deployed Ceph (with Emperor), I worked out the following
> according to the formula given on the site:
>
> >     # We have: 3 OSD nodes with 2 OSDs each
> >     # giving us 6 OSDs total.
> >     # There are 3 replicas, so the recommended number of
> >     # placement groups is:
> >     #      6 * 100 / 3
> >     # which gives: 200 placement groups.
> >     # Rounding this up to the nearest power of two gives:
> >     osd pool default pg num = 256
> >     osd pool default pgp num = 256
>
> It seems this was a bad value to use.  I now have a biggish lump of data
> sitting in a pool with an inappropriate number of placement groups.
> Apparently I needed to divide this number by the number of pools, as in
> the sketch below.
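>
> Redone on a per-pool basis - a rough sketch, using a hypothetical count of
> 4 pools purely for illustration - that works out to something like:
>
>     # total PGs    = (6 OSDs * 100) / 3 replicas = 200
>     # PGs per pool = 200 / number of pools, rounded up to a power of two
>     # e.g. with 4 pools: 200 / 4 = 50  ->  64 PGs per pool
>
> which is a far smaller per-pool number than the 256 I used.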
>
> For now I've shut it up with the following:
>
> > [mon]
> >     mon warn on legacy crush tunables = false
> >     # New warning on move to Hammer
> >     mon pg warn max per osd = 2048
>
> The question is, how does one go about fixing this?  I'd rather not blow
> away production pools at this point, although right now we only have one
> major production load, so if we're going to do it at any time, now is the
> time to do it.
>
> The worst bit is that this will probably change, so I can see myself
> hitting this problem time and time again as new pools are added later.
>
> Is there a way of tuning the number of placement groups without destroying
> data?
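>
> (From what I understand - a sketch, using the default "rbd" pool as an
> example - pg_num can be increased on a live pool without destroying data,
> though it cannot be decreased, and the split does trigger data movement:
>
>     ceph osd pool set rbd pg_num 256
>     ceph osd pool set rbd pgp_num 256
>
> raising pg_num first and then pgp_num so the new PGs are actually used for
> placement.)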
>
> Regards,
> --
>      _ ___             Stuart Longland - Systems Engineer
> \  /|_) |                           T: +61 7 3535 9619
>  \/ | \ |     38b Douglas Street    F: +61 7 3535 9699
>    SYSTEMS    Milton QLD 4064       http://www.vrt.com.au



-- 
*Chris Armstrong* | Deis Team Lead | *Engine Yard* | t: @carmstrong_afk
<https://twitter.com/carmstrong_afk> | gh: carmstrong
<https://github.com/carmstrong>

Deis: github.com/deis/deis | docs.deis.io | #deis
<https://botbot.me/freenode/deis/>

Deis is now part of Engine Yard! http://deis.io/deis-meet-engine-yard/
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
