Re: [Qemu-devel] [PATCH for-3.0 v2] pc: acpi: fix memory hotplug regression by reducing stub SRAT entry size

Igor Mammedov Fri, 10 Aug 2018 08:53:47 -0700

On Thu, 9 Aug 2018 13:10:44 -0300
Eduardo Habkost <ehabk...@redhat.com> wrote:


> On Thu, Aug 09, 2018 at 11:10:54AM +0200, Igor Mammedov wrote:
> > On Thu, 2 Aug 2018 10:39:22 -0300
> > Eduardo Habkost <ehabk...@redhat.com> wrote:
> >   
> > > On Thu, Aug 02, 2018 at 12:09:37PM +0200, Igor Mammedov wrote:  
> > > > On Tue, 31 Jul 2018 12:03:22 -0300
> > > > Eduardo Habkost <ehabk...@redhat.com> wrote:
> > > >     
> > > > > On Tue, Jul 31, 2018 at 11:53:40AM +0200, Igor Mammedov wrote:    
> > > > > > On Mon, 30 Jul 2018 17:26:24 -0300
> > > > > > Eduardo Habkost <ehabk...@redhat.com> wrote:
> > > > > >       
> > > > > > > On Mon, Jul 30, 2018 at 11:41:41AM +0200, Igor Mammedov wrote:    
> > > > > > >   
> > > > > > > > Commit 848a1cc1e (hw/acpi-build: build SRAT memory affinity 
> > > > > > > > structures for DIMM devices)
> > > > > > > > broke the first dimm hotplug in following cases:
> > > > > > > > 
> > > > > > > >  1: there is no coldplugged dimm in the last numa node
> > > > > > > >     but there is a coldplugged dimm in another node
> > > > > > > > 
> > > > > > > >   -m 4096,slots=4,maxmem=32G               \
> > > > > > > >   -object memory-backend-ram,id=m0,size=2G \
> > > > > > > >   -device pc-dimm,memdev=m0,node=0         \
> > > > > > > >   -numa node,nodeid=0                      \
> > > > > > > >   -numa node,nodeid=1
> > > > > > > > 
> > > > > > > >  2: if order of dimms on CLI is:
> > > > > > > >        1st plugged dimm in node1
> > > > > > > >        2nd plugged dimm in node0
> > > > > > > > 
> > > > > > > >   -m 4096,slots=4,maxmem=32G               \
> > > > > > > >   -object memory-backend-ram,size=2G,id=m0 \
> > > > > > > >   -device pc-dimm,memdev=m0,node=1         \
> > > > > > > >   -object memory-backend-ram,id=m1,size=2G \
> > > > > > > >   -device pc-dimm,memdev=m1,node=0         \
> > > > > > > >   -numa node,nodeid=0                      \
> > > > > > > >   -numa node,nodeid=1
> > > > > > > > 
> > > > > > > > (qemu) object_add memory-backend-ram,id=m2,size=1G
> > > > > > > > (qemu) device_add pc-dimm,memdev=m2,node=0
> > > > > > > > 
> > > > > > > > the first DIMM hotplug to any node except the last one
> > > > > > > > fails (Windows is unable to online it).
> > > > > > > > 
> > > > > > > > Length reduction of stub hotplug memory SRAT entry,
> > > > > > > > fixes issue for some reason.
> > > > > > > >         
> > > > > > > 
> > > > > > > I'm really bothered by the lack of automated testing for all
> > > > > > > these NUMA/ACPI corner cases.
> > > > > > > 
> > > > > > > This looks like a good candidate for an avocado_qemu test case.
> > > > > > > Can you show pseudo-code of how exactly the bug fix could be
> > > > > > > verified, so we can try to write a test case?      
> > > > > > Sadly I do it manually every time I'm suspect a patch would
> > > > > > affect the feature. On just has to check if a new memory device
> > > > > > appeared in device manager and it is in working state (started 
> > > > > > successfully).
> > > > > > One also need to run it against to test it against windows version
> > > > > > that supports memory hot-add (DC ed.).
> > > > > > 
> > > > > > It's typically what RHEL QE does, and they just found
> > > > > > a new case which wasn't on test list so proactive measures
> > > > > > wouldn't work here in any case as we didn't know about
> > > > > > this particular combination.
> > > > > > 
> > > > > > I'm not sure how it will work with upstream avocado though,
> > > > > > windows testing implies tester would need access to MSDN
> > > > > > subscription or multiple retail versions to test against.
> > > > > > So with windows it becomes expensive and complicated
> > > > > > hence I'd leave this job to QE which has resources and
> > > > > > upstream would benefit from downstream when a bug is found
> > > > > > (albeit it's a catch up game).      
> > > > > 
> > > > > I don't mean functional testing of Windows guests.  I'm just
> > > > > looking for a way we can ensure we won't reintroduce this
> > > > > particular bug later.  We should be able to encode known
> > > > > requirements of existing guest OSes in test code (especially the
> > > > > undocumented requirements).
> > > > >
> > > > > In other words, we need test code that will check if the entry
> > > > > you are adding below is still present and contains the right
> > > > > flags, so people won't remove it by mistake.    
> > > > known requirements are described in acpi code comment and commit
> > > > messages and maintainer are supposed to check if a change showed
> > > > by bios test is valid and doesn't regress existing state.    
> > > 
> > > I really believe computers are better at that than humans.  If we
> > > can't even describe to a computer how to look for mistakes, I
> > > don't think we can expect humans to spot them.
> > > 
> > >   
> > > > Parsing SRAT in test and ensuring that the last entry hasn't changed
> > > > won't help, we already have this by doing comparison with reference
> > > > SRAT.    
> > > 
> > > Comparison against reference SRAT is useless when a patch is
> > > expected to change the ACPI tables, which happens all the time.
> > > I think we can do better than that.
> > >   
> > > > 
> > > > And if there is a change, the only thing that can somewhat verify
> > > > it is a functional test with windows (known combinations at
> > > > least). Some new sequence/combination might regress it again
> > > > (like one described in commit). An Avocado functional test running
> > > > windows(es) might help if it will test random startup/hotplug 
> > > > combinations,
> > > > run by someone with rights to use windows.
> > > > 
> > > > I think that once I've contributed cpu hotplug testcases to autotest
> > > > but then there appears a new test suite and then another.
> > > > I don't really feel nor have capacity to deal with it, if someone
> > > > contributes testcase to Avocado and tells me how to easily use it,
> > > > I'd gladly run it with windows guests I have access to
> > > > whenever I review/test a patch that might affect windows.    
> > > 
> > > Running tests with real guests is useful, but costly.  I would
> > > like to be able to detect when we break known requirements
> > > without running a guest.
> > > 
> > > I'm not asking you to write that test code right now, but I'm
> > > trying to see if it's possible to do that and use it as reference
> > > for testing future ACPI patches.  Let's see if I can enumerate
> > > the known requirements for this stub entry that are not
> > > documented in the ACPI spec:
> > > 
> > > 1) Windows expect a stub entry with
> > >    MEM_AFFINITY_HOTPLUGGABLE|MEM_AFFINITY_ENABLED at the end of
> > >    the hotpluggable address space.
> > > 2) Linux won't enable SWIOTLB when booted with less 4GB unless
> > >    we add a stub entry to cover the whole hotpluggable address
> > >    space.
> > > 3) Windows expect that entry to be bound to the last NUMA node.
> > > 4) Windows won't behave well if the stub entry covers the whole
> > >    hotpluggable address space.  We just need to be at the end of
> > >    the address space, leaving a hole between the last DIMM and
> > >    this stub entry.  We add a 1-byte stub entry for that.
> > > 
> > > The last one is a new requirement we didn't know about, right?
> > > Is my description accurate?  What else we know about Windows
> > > expectations?
this requirements might be not complete, though.
It seems that 2.6 kernel isn't happy about 1 byte entry at the end
and discards SRAT, but restriction was removed since 3.0

So I'll try to find a middle ground that will work for old/new
kernels and windows.


> > that's all we know now. But to make this work from unit test
> > one would need to write SRAT table parser to locate and check
> > this entry.  
> 
> Thanks.  It's true that the test would have to parse the SRAT
> table, so it's not as trivial as I would like to.
> 
> > 
> > Anyways for me this would be test won't do much good beside of
> > telling that entry has changed somehow (which existing SRAT diff
> > already does). So it's the same warning from maintainer pov,
> > as I'd need to run real guest tests to ack or veto the change.
> > Even doing the later doesn't guarantee that it's a safe change
> > as 848a1cc1e readily demonstrates.  
> 
> Automated testing for specific non-documented requirements would
> help a reviewer to decide if the change requires more extensive
> testing.  The test output would carry more useful information
> than just "SRAT has changed".
It shows diff of how it's changed so reviewer should ask if change is
valid and to answer it one should do a lot of testing.
But if it would make a reviewer happier to see a better message that
SRAT has changed it is fine as well. Only requirements appear to be
a moving target.

Re: [Qemu-devel] [PATCH for-3.0 v2] pc: acpi: fix memory hotplug regression by reducing stub SRAT entry size

Reply via email to