cxl: Multi-Region CXL Type-3 Devices (Volatile and Persistent)

Gregory Price Tue, 03 Jan 2023 08:03:47 -0800

The fine-grained control would be a precursor to an emulated pooling
device.  If you can demonstrate it with a single attached device, you
could just implement an exclusivity table in a shared file and back the
shared memory with a file as well.  Boom, shared memory pool across
QEMU instances.
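
A rough sketch of what such an exclusivity table could look like, written
in Python rather than QEMU C just to keep it short.  Everything here is an
assumption for illustration: the class name, the on-disk layout (one 32-bit
owner id per extent, 0 meaning free), and the use of POSIX advisory locks
to serialise updates are all hypothetical, and none of it is existing QEMU
code.

```python
import fcntl
import os
import struct
import tempfile

FREE = 0        # hypothetical marker: extent not owned by any instance
ENTRY = "<I"    # assumed layout: one little-endian 32-bit owner id per extent
ENTRY_SIZE = struct.calcsize(ENTRY)


class ExclusivityTable:
    """Ownership table kept in a file that every instance opens."""

    def __init__(self, path, num_extents):
        self.fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
        self._lock()
        try:
            # Grow the file if needed; newly added bytes read back as FREE.
            if os.fstat(self.fd).st_size < num_extents * ENTRY_SIZE:
                os.ftruncate(self.fd, num_extents * ENTRY_SIZE)
        finally:
            self._unlock()

    def _lock(self):
        fcntl.lockf(self.fd, fcntl.LOCK_EX)   # advisory, per-process lock

    def _unlock(self):
        fcntl.lockf(self.fd, fcntl.LOCK_UN)

    def owner(self, extent):
        raw = os.pread(self.fd, ENTRY_SIZE, extent * ENTRY_SIZE)
        return struct.unpack(ENTRY, raw)[0]

    def claim(self, extent, instance_id):
        """Take the extent if it is free or already ours."""
        self._lock()
        try:
            if self.owner(extent) not in (FREE, instance_id):
                return False
            os.pwrite(self.fd, struct.pack(ENTRY, instance_id),
                      extent * ENTRY_SIZE)
            return True
        finally:
            self._unlock()

    def release(self, extent, instance_id):
        """Give the extent back, but only if we are the owner."""
        self._lock()
        try:
            if self.owner(extent) != instance_id:
                return False
            os.pwrite(self.fd, struct.pack(ENTRY, FREE), extent * ENTRY_SIZE)
            return True
        finally:
            self._unlock()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "cxl-pool.table")
        a = ExclusivityTable(path, num_extents=4)  # stands in for instance 1
        b = ExclusivityTable(path, num_extents=4)  # stands in for instance 2
        print(a.claim(0, 1), b.claim(0, 2), b.claim(1, 2))
```

The locks are per-process, so they only arbitrate between separate
processes, which is the multi-instance case being discussed.  The memory
itself would be a second shared file used as the backend; that part is not
shown.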


On Tue, Jan 3, 2023, 10:56 AM Jonathan Cameron <jonathan.came...@huawei.com>
wrote:

> On Tue, 20 Dec 2022 14:27:31 -0500
> Gregory Price <gregory.pr...@memverge.com> wrote:
>
> > On Tue, Dec 20, 2022 at 03:34:53PM +0000, Jonathan Cameron wrote:
> > > > However I don't think this is successful in creating the dax devices,
> > > > and therefore the reconfiguring into ram.
> > >
> > > Sure. I only bothered testing it in some dax modes rather than via kmem.
> > > It 'should' work but more testing is needed there.
> > >
> > > However, as you've noted, that only applies to the pmem regions at the moment.
> > > I wondered if you'd scripted the HDM decoder setup etc. for test purposes
> > > (so what the driver will do). The alternative to that would be enabling the
> > > driver support. Not sure if anyone is looking at that yet. The final alternative
> > > would be to port the existing EDK2-based support to work on QEMU.  All
> > > non-trivial jobs, so they may take a while.
> > >
> > > Jonathan
> >
> > Also, I'm relatively new to this corner of the kernel (mm, regions, dax,
> > etc.), so I need to spend a week or two of uninterrupted tinkering with
> > how adding new memory regions from these devices is actually "supposed
> > to work" in a dynamic-capacity world.
> >
> > At least in theory, the partitioning of persistent and volatile memory
> > regions on one of these type-3 devices should end up looking a bit like
> > dynamic capacity when doing runtime reconfiguring.
> >
> > For example, considering
> >
> > Device(512mb PMEM, 512mb VMEM), I'd want, at least I think
> >
> > CFMW-Volatile:    max window size(1024mb) - Numa 2
> > CFMW-Persistent:  max window size(512mb)  - Numa 3
> >
> > Then we'd need the kernel support for
> >
> > 1) Online 2x256mb volatile regions in Numa 2
> > 2) Online 2x256mb persistent regions in Numa 3
> > 3) Offline persistent region (256mb:512mb)
> > 4) Reconfigure device to 256Pmem/768Volatile
> >    a) change decoders in device accordingly
> > 5) Online 1x256mb volatile region in Numa 2
> >
> > The question is whether you can do this without offlining the other
> > adjacent regions.  I just don't know enough about the region subsystem
> > to say what is "correct" behavior here.
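
To make the five steps concrete, here is a toy Python model of the
sequence.  It is only a sketch: the class and method names are made up, it
tracks region sizes and ignores addresses entirely, and it simply assumes
the optimistic answer to the question above, namely that the partition
boundary can move while other regions stay online as long as everything
that is online still fits.  Whether the kernel region subsystem permits
that is exactly what is open.

```python
MB = 1 << 20


class Type3Model:
    """Hypothetical model of one device split into volatile + persistent."""

    def __init__(self, total, pmem_size):
        self.total = total
        self.pmem_size = pmem_size
        self.online = {"volatile": [], "persistent": []}  # region sizes

    def capacity(self, kind):
        if kind == "persistent":
            return self.pmem_size
        return self.total - self.pmem_size

    def used(self, kind):
        return sum(self.online[kind])

    def online_region(self, kind, size):
        if self.used(kind) + size > self.capacity(kind):
            raise ValueError(f"no {kind} capacity left for {size // MB}mb")
        self.online[kind].append(size)

    def offline_region(self, kind):
        # Drop the most recently onlined region of this kind.
        return self.online[kind].pop()

    def repartition(self, new_pmem_size):
        if not 0 <= new_pmem_size <= self.total:
            raise ValueError("partition boundary outside the device")
        # Assumed rule: allowed only if what is online fits the new split.
        if (self.used("persistent") > new_pmem_size
                or self.used("volatile") > self.total - new_pmem_size):
            raise RuntimeError("online regions do not fit the new partitioning")
        self.pmem_size = new_pmem_size


dev = Type3Model(total=1024 * MB, pmem_size=512 * MB)
for _ in range(2):
    dev.online_region("volatile", 256 * MB)    # step 1
for _ in range(2):
    dev.online_region("persistent", 256 * MB)  # step 2
dev.offline_region("persistent")               # step 3
dev.repartition(256 * MB)                      # step 4: 256 pmem / 768 volatile
dev.online_region("volatile", 256 * MB)        # step 5
```

Skipping step 3 makes step 4 fail in this model, since 512mb of persistent
regions would still be online against a 256mb partition.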
>
> Whilst you probably 'can' do fine-grained offline / online (to some
> degree anyway), I'm not sure if people consider it an important
> use case. If decoder reprogramming is involved, things will get very fiddly,
> so at least in the first instance I'd advocate just ripping it all down and
> building it up again.  Or, in the simple case, just block attempts to
> reconfigure the partitioning if either side is in use.
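
The two conservative policies above (refuse while anything is in use, or
tear everything down and rebuild) could be sketched like this in Python.
The `Partitioning` type, the `reconfigure` function and the `tear_down`
flag are hypothetical names for illustration only, not an existing kernel
or QEMU interface.

```python
from dataclasses import dataclass, field


@dataclass
class Partitioning:
    total_mb: int
    pmem_mb: int
    # Regions currently online, as (kind, size_mb) pairs.
    online: list = field(default_factory=list)


def reconfigure(p, new_pmem_mb, tear_down=False):
    """Move the volatile/persistent boundary using a conservative policy."""
    if not 0 <= new_pmem_mb <= p.total_mb:
        raise ValueError("partition boundary outside the device")
    if p.online:
        if not tear_down:
            # Simple case: block the attempt if either side is in use.
            raise RuntimeError("regions online; offline them first")
        # Otherwise rip everything down; the caller rebuilds afterwards.
        p.online.clear()
    p.pmem_mb = new_pmem_mb
```

With a region online, `reconfigure(p, 256)` raises and leaves the
partitioning untouched, while `reconfigure(p, 256, tear_down=True)` drops
all regions before moving the boundary.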
>
> >
> > On the device side, I need to go look at the mailbox commands to go
> > about implementing the reconfiguration / decoder reprogramming.
> >
> > I guess the "decoder" reprogramming is essentially changing the
> > read/write commands to adjust based on v/pmem_active vs v/pmem_size?
>
> Yup.  We also need multiple decoder support in general in QEMU.
> It's not that high on my list as my main focus this cycle is going
> to be on reducing the out of tree patch set by upstreaming stuff.
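
As a sketch of what "adjusting read/write based on the active sizes" might
mean on the device side, the Python below routes a device physical address
to one of two backends using the currently active partition sizes.  The
function name is hypothetical, and the layout (volatile capacity first,
persistent capacity immediately after it) is an assumption made for the
example, not taken from the patch.

```python
MB = 1 << 20


def decode_dpa(dpa, vmem_active, pmem_active):
    """Map a device physical address to (backend, offset in that backend).

    Assumed layout: [0, vmem_active) is volatile, followed directly by
    [vmem_active, vmem_active + pmem_active) as persistent.
    """
    if dpa < 0:
        raise ValueError("negative address")
    if dpa < vmem_active:
        return ("volatile", dpa)
    if dpa < vmem_active + pmem_active:
        return ("persistent", dpa - vmem_active)
    raise ValueError("address beyond active capacity")


# The same address lands in a different backend after repartitioning.
before = decode_dpa(600 * MB, vmem_active=512 * MB, pmem_active=512 * MB)
after = decode_dpa(600 * MB, vmem_active=768 * MB, pmem_active=256 * MB)
```

This is also why moving the boundary under a live region is risky: an
address that used to reach the persistent backend would silently start
reaching the volatile one.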
>
> >
> > I suppose I can look at this chunk next.
>
> Great.
>
> Jonathan
>
>
>

Re: [RFC v4 3/3] hw/cxl: Multi-Region CXL Type-3 Devices (Volatile and Persistent)
