For me, it was the .rgw.meta pool that had very dense placement groups. The
OSDs would fail to start and would then commit suicide while trying to scan
the PGs. We had to remove all references to those placement groups just
to get the OSDs to start. It wasn't pretty.
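Something along these lines, roughly -- the OSD id and pgid below are
only placeholders, the OSD has to be stopped first, and exporting
before removing leaves a way back if the PG is needed later:

systemctl stop ceph-osd@45

# see which PGs this OSD actually holds
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-45 --op list-pgs

# keep a copy of the PG, then remove it from this OSD only
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-45 \
    --pgid 17.2a --op export --file /root/17.2a.export
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-45 \
    --pgid 17.2a --op remove --force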
On Mon, Aug 19, 2019, 2:09 AM
Yes, it's possible that they do, but since all of the affected OSDs are
still down and the monitors have been restarted since, all of those
pools have PGs in an unknown state and don't return anything in
ceph pg ls.
There weren't that many placement groups for the SSDs, but also I don't
This sounds familiar. Do any of the pools on the SSDs have a fairly
dense object-to-placement-group ratio, like more than 500k objects per
PG? (ceph pg ls)
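If the PG stats do come back, a quick way to spot the dense ones --
assuming Nautilus-style ceph pg ls output where OBJECTS is the second
column -- is something like:

ceph pg ls | awk 'NR > 1 && $2+0 > 500000 {print $1, $2}'

That just prints any PG reporting more than 500k objects; ceph pg
ls-by-pool <pool> narrows it to a single pool.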
On Thu, Aug 15, 2019 at 2:09 AM Troy Ablan wrote:
>
> Paul,
>
> Thanks for the reply. All of these seemed to fail except for pulling
> the osdmap from the live cluster.
>
> -Troy
>
> -[~:#]- ceph-objectstore-tool --op get-osdmap --data-path
> /var/lib/ceph/osd/ceph-45/ --file osdmap45
> terminate
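Since pulling the map from the live cluster is the one step that does
work, one thing worth trying -- sketch only, file names are arbitrary,
the OSD must be stopped, and it may need the specific epoch the OSD is
stuck on rather than the latest -- is injecting that known-good map
back into the broken OSD:

ceph osd getmap -o /tmp/osdmap.good    # or: ceph osd getmap <epoch> -o ...
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-45 \
    --op set-osdmap --file /tmp/osdmap.good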
On 8/18/19 6:43 PM, Brad Hubbard wrote:
That's this code.
switch (alg) {
case CRUSH_BUCKET_UNIFORM:
  size = sizeof(crush_bucket_uniform);
  break;
case CRUSH_BUCKET_LIST:
  size = sizeof(crush_bucket_list);
  break;
case CRUSH_BUCKET_TREE:
  size = sizeof(crush_bucket_tree);
  break;
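If the decode is tripping somewhere in that switch, it may be worth
pulling the cluster's current crush map and checking which bucket
algorithms it actually uses (sketch; file names are arbitrary):

ceph osd getcrushmap -o /tmp/crush.bin
crushtool -d /tmp/crush.bin -o /tmp/crush.txt
grep alg /tmp/crush.txt    # uniform / list / tree / straw / straw2

The "alg" values there correspond to the CRUSH_BUCKET_* cases above.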