> 
>>   *Hardware:* Intel NVMe drives,

Intel, or Solidigm? I highly recommend using SST (or if you got them from Dell, 
DSU) to update firmware. 

>>   *Fails:* Cloning via QEMU (librbd).

Check that your QEMU / Proxmox setup defaults to enabling the layering, object-map, 
and fast-diff feature flags on RBD volumes.  
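A quick way to check and fix, sketched with placeholder pool/image names:

```shell
# Inspect which feature flags an existing image carries.
# Pool "rbd" and image "vm-100-disk-0" are placeholders -- substitute yours.
rbd info rbd/vm-100-disk-0 | grep features

# Enable the flags on an image that lacks them
# (object-map and fast-diff require exclusive-lock).
rbd feature enable rbd/vm-100-disk-0 exclusive-lock object-map fast-diff

# Make them the default for newly created images:
# 61 = layering + exclusive-lock + object-map + fast-diff + deep-flatten,
# which is the upstream default.
ceph config set client rbd_default_features 61
```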

>> , the autoscaler increased the total PG
>> count to 16k. We suspect this excessive PG count might be contributing to
>> the cloning failures, though we aren't certain.

I’m skeptical.  

>> We would like to safely decrease the number of PGs back to a standard range

What do you consider a standard range? If you have one OSD deployed on each 
SSD, the PGS column from `ceph osd df` should be in the 200-300 range.  The 
autoscaler's mon_target_pg_per_osd defaults to 100, which is rather low for most 
clusters; plus it's actually a ceiling, not a target.  I suggest setting it to 300 
and mon_max_pg_per_osd to 600.  
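Concretely, that would be something like:

```shell
# Raise the autoscaler's per-OSD target and the hard per-OSD ceiling.
# Values follow the suggestion above; tune to taste.
ceph config set global mon_target_pg_per_osd 300
ceph config set global mon_max_pg_per_osd 600

# Then see what the autoscaler now wants for each pool:
ceph osd pool autoscale-status
```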

16k PGs, assuming no other large pools and replica size=3, would thus be just 
smurfy across ~240 OSDs, plus or minus.  How many do you have?
What PGS values do you have?
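Back-of-the-envelope, if anyone wants to check my arithmetic (assuming roughly 200 PG replicas per OSD, per the range above):

```shell
# 16384 PGs x 3 replicas = 49152 PG shards; at roughly 200 shards
# per OSD that works out to about 245 OSDs.
echo $(( 16384 * 3 / 200 ))   # prints 245
```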

>>   Decrease the PG count in small increments.

The Manager has done this for you since Nautilus, I think it was.  

>>   Monitor for I/O latency on the Intel NVMes.

Remember that iostat %util isn’t a great metric for that on NVMe.  And update your firmware!

>> 
>> *Questions for the list:*
>> 
>>   1. Are there known risks to "stepping down" PGs in increments of, say,
>>   10-20% at a time while the cluster is under load?

No, the Manager steps pg_num and pgp_num for you.  Remember to only set the 
pg_num target to a power of two.  
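For example (pool name is a placeholder; pick the power of two that lands your OSDs in that 200-300 PGS range):

```shell
# Set the target; the Manager walks pg_num/pgp_num down gradually
# rather than merging everything at once.
ceph osd pool set mypool pg_num 4096

# Watch progress:
ceph osd pool get mypool pg_num
```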


>>   2. Given the 16k PG count, is there a specific "recovery/backfill" throttle
>>   we should prioritize to ensure VM performance isn't impacted during the
>>   merge process?

osd_max_backfills <= 4
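As a sketch; note that on mClock-based releases (Quincy and later) the scheduler caps recovery itself, so the profile setting is the relevant knob there:

```shell
# Keep backfill gentle so client I/O keeps priority during the merge.
ceph config set osd osd_max_backfills 4

# On Quincy and later, favor client traffic via the mClock profile instead:
ceph config set osd osd_mclock_profile high_client_ops
```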

>> 
>> We are trying to decide if this approach is feasible or if we should create
>> a new cluster and migrate VMs

I doubt that would be necessary.  


>> . Any insights would be greatly appreciated.
>> 
>> Thank you,
>> 
>> Robert
>> _______________________________________________
>> ceph-users mailing list -- [email protected]
>> To unsubscribe send an email to [email protected]
