Hi Babu,

On Wed, Aug 23, 2023 at 12:18:30PM -0500, Moger, Babu wrote:
> Date: Wed, 23 Aug 2023 12:18:30 -0500
> From: "Moger, Babu" <babu.mo...@amd.com>
> Subject: Re: [PATCH v3 14/17] i386: Use CPUCacheInfo.share_level to encode
>  CPUID[4]
> 
> Hi Zhao,
> 
> On 8/18/23 02:37, Zhao Liu wrote:
> > Hi Babu,
> > 
> > On Mon, Aug 14, 2023 at 11:03:53AM -0500, Moger, Babu wrote:
> >> Date: Mon, 14 Aug 2023 11:03:53 -0500
> >> From: "Moger, Babu" <babu.mo...@amd.com>
> >> Subject: Re: [PATCH v3 14/17] i386: Use CPUCacheInfo.share_level to encode
> >>  CPUID[4]
> >>
> >> Hi Zhao,
> >>
> >>
> >> On 8/14/23 03:22, Zhao Liu wrote:
> >>> Hi Babu,
> >>>
> >>> On Fri, Aug 04, 2023 at 10:48:29AM -0500, Moger, Babu wrote:
> >>>> Date: Fri, 4 Aug 2023 10:48:29 -0500
> >>>> From: "Moger, Babu" <babu.mo...@amd.com>
> >>>> Subject: Re: [PATCH v3 14/17] i386: Use CPUCacheInfo.share_level to 
> >>>> encode
> >>>>  CPUID[4]
> >>>>
> >>>> Hi Zhao,
> >>>>
> >>>> On 8/4/23 04:48, Zhao Liu wrote:
> >>>>> Hi Babu,
> >>>>>
> >>>>> On Thu, Aug 03, 2023 at 11:41:40AM -0500, Moger, Babu wrote:
> >>>>>> Date: Thu, 3 Aug 2023 11:41:40 -0500
> >>>>>> From: "Moger, Babu" <babu.mo...@amd.com>
> >>>>>> Subject: Re: [PATCH v3 14/17] i386: Use CPUCacheInfo.share_level to 
> >>>>>> encode
> >>>>>>  CPUID[4]
> >>>>>>
> >>>>>> Hi Zhao,
> >>>>>>
> >>>>>> On 8/2/23 18:49, Moger, Babu wrote:
> >>>>>>> Hi Zhao,
> >>>>>>>
> >>>>>>> Hitting this error after this patch.
> >>>>>>>
> >>>>>>> ERROR:../target/i386/cpu.c:257:max_processor_ids_for_cache: code 
> >>>>>>> should
> >>>>>>> not be reached
> >>>>>>> Bail out! ERROR:../target/i386/cpu.c:257:max_processor_ids_for_cache: 
> >>>>>>> code
> >>>>>>> should not be reached
> >>>>>>> Aborted (core dumped)
> >>>>>>>
> >>>>>>> Looks like share_level for all the caches for AMD is not initialized.
> >>>>>
> >>>>> I missed these change when I rebase. Sorry for that.
> >>>>>
> >>>>> BTW, could I ask a question? From a previous discussion[1], I understand
> >>>>> that the cache info is used to show the correct cache information in
> >>>>> new machine. And from [2], the wrong cache info may cause "compatibility
> >>>>> issues".
> >>>>>
> >>>>> Is this "compatibility issues" AMD specific? I'm not sure if Intel 
> >>>>> should
> >>>>> update the cache info like that. thanks!
> >>>>
> >>>> I was going to comment about that. Good that you asked me.
> >>>>
> >>>> Compatibility is qemu requirement.  Otherwise the migrations will fail.
> >>>>
> >>>> Any changes in the topology is going to cause migration problems.
> >>>
> >>> Could you please educate me more about the details of the "migration
> >>> problem"?
> >>>
> >>> I didn't understand why it was causing the problem and wasn't sure if I
> >>> was missing any cases.
> >>>
> >>
> >> I am not an expert on migration but I test VM migration sometimes.
> >> Here are some guidelines.
> >> https://developers.redhat.com/blog/2015/03/24/live-migrating-qemu-kvm-virtual-machines
> > 
> > Thanks for the material!
> > 
> >>
> >> When you migrate a VM to newer qemu using the same CPU type, migration
> >> should work seamless. That means list of CPU features should be compatible
> >> when you move to newer qemu version with CPU type.
> > 
> > I see. This patches set adds the "-smp cluster" command and the
> > "x-l2-cache-topo" command. Migration requires that the target and
> 
> Shouldn't the command x-l2-cache-topo disabled by default? (For example
> look at hw/i386/pc.c the property x-migrate-smi-count).

Thanks!

Since we add the default topology level in cache models, so the default
l2 topology is the level hardcoded in cache model.

>From this point, this option won't affect the migration between
different QEMU versions. If user doesn't change l2 to cluster, the
default l2 topology levels are the same (core level).

> 
> It will be enabled when user passes "-cpu x-l2-cache-topo=[core|cluster]".
> Current code enables it by default as far I can see.

I think the compatibility issue for x-migrate-smi-count is because it
has differnt default settings for different QEMU versions.

And for x-l2-cache-topo, it defaults to use the level hardcoded in cache
model, this is no difference between new and old QEMUs.

> 
> > source VM command lines are the same, so the new commands ensure that
> > the migration is consistent.
> > 
> > But this patch set also includes some topology fixes (nr_cores fix and
> > l1 cache topology fix) and encoding change (use APIC ID offset to encode
> > addressable ids), these changes would affect migration and may cause
> > CPUID change for VM view. Thus if this patch set is accepted, these
> > changes also need to be pushed into stable versions. Do you agree?
> 
> Yes. That sounds right.
> 
> > 
> > And about cache info for different CPU generations, migration usually
> > happens on the same CPU type, and Intel uses the same default cache
> > info for all CPU types. With the consistent cache info, migration is
> > also Ok. So if we don't care about the specific cache info in the VM,
> > it's okay to use the same default cache info for all CPU types. Right?
> 
> I am not sure about this. Please run migration tests to be sure.

We tested for these cases:

1. v3 <-> v3: same cli (same setting in x-l2-cache-topo) cases succeed.

2. v3 <-> master base (no v3 patches): same cli or v3 with default level
   (as hardcoded in cache models) cases succeed.

Thanks,
Zhao

Reply via email to