On 19.03.25 14:33:54, Ira Weiny wrote:
> Robert Richter wrote:
> > If a CXL memory device returns a broken zero LSA size in its memory
> > device information (Identify Memory Device (Opcode 4000h), CXL
> > spec. 3.1, 8.2.9.9.1.1), a divide error occurs in the libnvdimm
> > driver:
> > 
> >  Oops: divide error: 0000 [#1] PREEMPT SMP NOPTI
> >  RIP: 0010:nd_label_data_init+0x10e/0x800 [libnvdimm]
> > 
> > Code and flow:
> > 
> > 1) CXL Command 4000h returns LSA size = 0,
> > 2) config_size is assigned the zero LSA size (CXL pmem driver):
> > 
> > drivers/cxl/pmem.c:             .config_size = mds->lsa_size,
> > 
> > 3) max_xfer is set to zero (nvdimm driver):
> > 
> > drivers/nvdimm/label.c: max_xfer = min_t(size_t, ndd->nsarea.max_xfer, config_size);
> > drivers/nvdimm/label.c: if (read_size < max_xfer) {
> > drivers/nvdimm/label.c-         /* trim waste */
> > 
> > 4) DIV_ROUND_UP() causes division by zero:
> > 
> > drivers/nvdimm/label.c:         max_xfer -= ((max_xfer - 1) - (config_size - 1) % max_xfer) /
> > drivers/nvdimm/label.c:                     DIV_ROUND_UP(config_size, max_xfer);
> 
> I think this is the wrong DIV_ROUND_UP. It cannot be the one that fails,
> because read_size is never less than max_xfer here, is it?
> 
> I believe the failing DIV_ROUND_UP is the one after the if statement, here:
> 
>         /* Make our initial read size a multiple of max_xfer size */
>         read_size = min(DIV_ROUND_UP(read_size, max_xfer) * max_xfer,
>                         config_size);

Yes, it is this one.
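
For illustration only, here is a minimal userspace sketch of that
computation with a zero check in front of it. It is not the kernel code
and not necessarily what the patch does: the helper name
initial_read_size(), the min_size() helper, the local DIV_ROUND_UP
definition, and the -EINVAL return value are all assumptions made for
this example.

```c
#include <errno.h>
#include <stddef.h>

/* Local stand-in for the kernel macro; divides by d, so d must be non-zero. */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Hypothetical helper: smaller of two sizes. */
static size_t min_size(size_t a, size_t b)
{
	return a < b ? a : b;
}

/*
 * Hypothetical helper mirroring the quoted computation. A zero
 * config_size (LSA size) forces max_xfer to zero, so it is rejected
 * before DIV_ROUND_UP() can divide by it. The error code is an
 * assumption chosen for the sketch.
 */
static int initial_read_size(size_t config_size, size_t dev_max_xfer,
			     size_t read_size, size_t *out)
{
	size_t max_xfer = min_size(dev_max_xfer, config_size);

	if (max_xfer == 0)
		return -EINVAL;

	/* Make the initial read size a multiple of max_xfer. */
	*out = min_size(DIV_ROUND_UP(read_size, max_xfer) * max_xfer,
			config_size);
	return 0;
}
```

With config_size == 0 the helper returns early instead of reaching the
division; with non-zero sizes it rounds read_size up to a multiple of
max_xfer and caps the result at config_size.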

> 
> Apparently nvdimm_get_config_data() was intended to check for this implicitly,
> but by then it is too late.
> 
> Anyway, all this sidetracked me a bit.
> 
> I assume this is a broken device which is in the real world?  The fix looks
> fine.  But could you re-spin with a clean up of the commit message and I'll
> queue it up.

Yes, it was caused by a faulty device.

Sure, I will update the description and resend.

> 
> Reviewed-by: Ira Weiny <ira.we...@intel.com>

Thanks for the review,

-Robert
