That’s great to know, Bryan. I’ve seen multiple locations for the code out there; which one is canonical? (lowercase c)
> On Jan 17, 2025, at 3:46 PM, Stillwell, Bryan <bstil...@akamai.com> wrote:
>
> The latest version (since September) switched to using the python rados
> bindings which not only fixes this problem, but also makes it much faster.
> It also has a fix I made that orders the upmaps so that data is moved off of
> OSDs before trying to move data on to them. This helps a lot on clusters
> with EC pools.
>
> Bryan
>
> From: Alexander Patrakov <patra...@gmail.com>
> Date: Friday, January 17, 2025 at 09:53
> To: Anthony D'Atri <anthony.da...@gmail.com>
> Cc: Kasper Rasmussen <kasper_steenga...@hotmail.com>, ceph-users@ceph.io
> Subject: [ceph-users] Re: Adding Rack to crushmap - Rebalancing multiple PB
> of data - advice/experience
>
> Hello Kasper,
>
> Please be aware that the current "upmap-remapped" script is flaky. It
> might just refuse to work, with this message:
>
> Error loading remapped pgs
>
> This has been traced to the fact that "ceph pg ls remapped -f json"
> sets its stderr to non-blocking mode, and that is the same file
> descriptor to which jq (which follows in the pipeline) writes. Thus,
> jq can get -EAGAIN and terminate prematurely.
>
> The problem is tracked as https://tracker.ceph.com/issues/67505
>
> Retrying the script might help.
>
> What's worse is that the whole reason for adding jq to the
> upmap-remapped script is another Ceph bug: it sometimes outputs
> invalid JSON (containing a literal inf or nan instead of a number),
> and this became much more common with Reef, as new fields were added
> that are commonly equal to inf or nan. This is tracked as
> https://tracker.ceph.com/issues/66215 and has a fix merged in a
> not-yet-released version.
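Until that fix ships, a possible stopgap (my own sketch based on the diagnosis above, not the upstream fix) is to give ceph a private stderr, so that the O_NONBLOCK flag it sets on fd 2 never lands on the terminal that jq also writes to:

    # Sketch only: redirect ceph's stderr away from the shared terminal so jq
    # can't hit EAGAIN. The log path is arbitrary, and the jq filter is
    # illustrative, not the exact one upmap-remapped.sh uses.
    ceph pg ls remapped -f json 2>/tmp/ceph-pg-ls.stderr |
      jq -r '.pg_stats[].pgid'

That won't do anything about the inf/nan JSON issue, but it should at least avoid the premature-exit race.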
> Maybe you should look into alternative tools, like
> https://github.com/digitalocean/pgremapper
>
> On Fri, Jan 17, 2025 at 11:43 PM Anthony D'Atri <anthony.da...@gmail.com> wrote:
> >
> > > On Jan 17, 2025, at 6:02 AM, Kasper Rasmussen <kasper_steenga...@hotmail.com> wrote:
> > >
> > > However I'm concerned with the amount of data that needs to be
> > > rebalanced, since the cluster holds multiple PB, and I'm looking for
> > > review of/input for my plan, as well as words of advice/experience from
> > > someone who has been in a similar situation.
> >
> > Yep, that’s why you want to use upmap-remapped. Otherwise the thundering
> > herd of data shuffling will DoS your client traffic, esp. since you’re
> > using spinners. Count on pretty much all data moving in the process, and
> > the convergence taking … maybe a week?
> >
> > > On Pacific: Data is marked as "degraded", and not misplaced as expected.
> > > I also see above 2000% degraded data (but that might be another issue)
> > >
> > > On Quincy: Data is marked as misplaced - which seems correct.
> >
> > I’m not specifically familiar with such a change, but that could be mainly
> > cosmetic, a function of how the percentage is calculated for objects / PGs
> > that are multiply remapped.
> >
> > In the depths of time I had clusters that would sometimes show a negative
> > number of RADOS objects to recover; it would bounce above and below zero a
> > few times as it converged to 0.
> >
> > > Instead balancing has been done by a cron job executing - ceph osd
> > > reweight-by-utilization 112 0.05 30
> >
> > I used a similar strategy with older releases. Note that this will
> > complicate your transition, as those relative weights are a function of the
> > CRUSH topology, so when the topology changes, likely some reweighted OSDs
> > will get much less than their fair share, and some will get much more. How
> > full is your cluster (ceph df)? It might not be a bad idea to
> > incrementally revert those all to 1.00000 if you have the capacity, and
> > disable the cron job.
> >
> > You’ll also likely want to switch to the balancer module for the
> > upmap-remapped strategy to incrementally move your data around. Did you
> > have it disabled for a specific reason?
> >
> > Updating to Reef before migrating might be to your advantage so that you
> > can benefit from performance and efficiency improvements since Pacific.
>
> --
> Alexander Patrakov
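For what it’s worth, the incremental unwind Anthony describes above would look roughly like this; treat it as a sketch rather than a recipe, the OSD id is a placeholder, and you’d pace the reweights by watching recovery and client latency between steps:

    # Stop the reweight-by-utilization cron job first, then check headroom.
    ceph df
    ceph osd df

    # Walk override reweights back to 1.00000 a few OSDs at a time,
    # letting the cluster settle between batches (osd.12 is just an example).
    ceph osd reweight 12 1.0

    # Once the overrides are back to 1.0, let the balancer take over in
    # upmap mode (the min-compat-client setting is required for pg-upmap;
    # skip it if it is already set).
    ceph osd set-require-min-compat-client luminous
    ceph balancer mode upmap
    ceph balancer on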