On Thu, 21 Sep 2017 08:04:55 +0200
Cédric Le Goater <c...@kaod.org> wrote:

> On 09/21/2017 05:54 AM, Nikunj A Dadhania wrote:
> > David Gibson <da...@gibson.dropbear.id.au> writes:
> >   
> >> On Wed, Sep 20, 2017 at 12:48:55PM +0530, Nikunj A Dadhania wrote:  
> >>> David Gibson <da...@gibson.dropbear.id.au> writes:
> >>>  
> >>>> On Wed, Sep 20, 2017 at 12:10:48PM +0530, Nikunj A Dadhania wrote:  
> >>>>> David Gibson <da...@gibson.dropbear.id.au> writes:
> >>>>>  
> >>>>>> On Wed, Sep 20, 2017 at 10:43:19AM +0530, Nikunj A Dadhania wrote:  
> >>>>>>> David Gibson <da...@gibson.dropbear.id.au> writes:
> >>>>>>>  
> >>>>>>>> On Wed, Sep 20, 2017 at 09:50:24AM +0530, Nikunj A Dadhania wrote:  
> >>>>>>>>> David Gibson <da...@gibson.dropbear.id.au> writes:
> >>>>>>>>>  
> >>>>>>>>>> On Fri, Sep 15, 2017 at 02:39:16PM +0530, Nikunj A Dadhania wrote: 
> >>>>>>>>>>  
> >>>>>>>>>>> David Gibson <da...@gibson.dropbear.id.au> writes:
> >>>>>>>>>>>  
> >>>>>>>>>>>> On Fri, Sep 15, 2017 at 01:53:15PM +0530, Nikunj A Dadhania 
> >>>>>>>>>>>> wrote:  
> >>>>>>>>>>>>> David Gibson <da...@gibson.dropbear.id.au> writes:
> >>>>>>>>>>>>>  
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> I thought, I am doing the same here for PowerNV, number of 
> >>>>>>>>>>>>>>> online cores
> >>>>>>>>>>>>>>> is equal to initial online vcpus / threads per core
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>    int boot_cores_nr = smp_cpus / smp_threads;
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Only difference that I see in PowerNV is that we have 
> >>>>>>>>>>>>>>> multiple chips
> >>>>>>>>>>>>>>> (max 2, at the moment)
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>         cores_per_chip = smp_cpus / (smp_threads * 
> >>>>>>>>>>>>>>> pnv->num_chips);  
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> This doesn't make sense to me.  Cores per chip should *always* 
> >>>>>>>>>>>>>> equal
> >>>>>>>>>>>>>> smp_cores, you shouldn't need another calculation for it.
> >>>>>>>>>>>>>>  
> >>>>>>>>>>>>>>> And in case user has provided sane smp_cores, we use it.  
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> If smp_cores isn't sane, you should simply reject it, not try 
> >>>>>>>>>>>>>> to fix
> >>>>>>>>>>>>>> it.  That's just asking for confusion.  
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> This is the case where the user does not provide a 
> >>>>>>>>>>>>> topology(which is a
> >>>>>>>>>>>>> valid scenario), not sure we should reject it. So qemu defaults
> >>>>>>>>>>>>> smp_cores/smt_threads to 1. I think it makes sense to 
> >>>>>>>>>>>>> over-ride.  
> >>>>>>>>>>>>
> >>>>>>>>>>>> If you can find a way to override it by altering smp_cores when 
> >>>>>>>>>>>> it's
> >>>>>>>>>>>> not explicitly specified, then ok.  
> >>>>>>>>>>>
> >>>>>>>>>>> Should I change the global smp_cores here as well ?  
> >>>>>>>>>>
> >>>>>>>>>> I'm pretty uneasy with that option.  
> >>>>>>>>>
> >>>>>>>>> Me too.
> >>>>>>>>>  
> >>>>>>>>>> It would take a fair bit of checking to ensure that changing 
> >>>>>>>>>> smp_cores
> >>>>>>>>>> is safe here. An easier to verify option would be to make the 
> >>>>>>>>>> generic
> >>>>>>>>>> logic which splits up an unspecified -smp N into cores and sockets
> >>>>>>>>>> more flexible, possibly based on machine options for max values.
> >>>>>>>>>>
> >>>>>>>>>> That might still be more trouble than its worth.  
> >>>>>>>>>
> >>>>>>>>> I think the current approach is the simplest and less intrusive, as 
> >>>>>>>>> we
> >>>>>>>>> are handling a case where user has not bothered to provide a 
> >>>>>>>>> detailed
> >>>>>>>>> topology, the best we can do is create single threaded cores equal 
> >>>>>>>>> to
> >>>>>>>>> number of cores.  
> >>>>>>>>
> >>>>>>>> No, sorry.  Having smp_cores not correspond to the number of cores 
> >>>>>>>> per
> >>>>>>>> chip in all cases is just not ok.  Add an error message if the
> >>>>>>>> topology isn't workable for powernv by all means.  But users having 
> >>>>>>>> to
> >>>>>>>> use a longer command line is better than breaking basic assumptions
> >>>>>>>> about what numbers reflect what topology.  
> >>>>>>>
> >>>>>>> Sorry to ask again, as I am still not convinced, we do similar
> >>>>>>> adjustment in spapr where the user did not provide the number of 
> >>>>>>> cores,
> >>>>>>> but qemu assumes them as single threaded cores and created
> >>>>>>> cores(boot_cores_nr) that were not same as smp_cores ?  
> >>>>>>
> >>>>>> What?  boot_cores_nr has absolutely nothing to do with adjusting the
> >>>>>> topology, and it certainly doesn't assume they're single threaded.  
> >>>>>
> >>>>> When we start a TCG guest and user provides following commandline, e.g.
> >>>>> "-smp 4", smt_threads is set to 1 by default in vl.c. So the guest boots
> >>>>> with 4 cores, each having 1 thread.  
> >>>>
> >>>> Ok.. and what's the problem with that behaviour on powernv?  
> >>>
> >>> As smp_thread defaults to 1 in vl.c, similarly smp_cores also has the
> >>> default value of 1 in vl.c. In powernv, we were setting nr-cores like
> >>> this:
> >>>
> >>>         object_property_set_int(chip, smp_cores, "nr-cores", 
> >>> &error_fatal);
> >>>
> >>> Even when there were multiple cpus (-smp 4), when the guest boots up, we
> >>> just get one core (i.e. smp_cores was 1) with single thread(smp_threads
> >>> was 1), which is wrong as per the command-line that was provided.  
> >>
> >> Right, so, -smp 4 defaults to 4 sockets, each with 1 core of 1
> >> thread.  If you can't supply 4 sockets you should error, but you
> >> shouldn't go and change the number of cores per socket.  
> > 
> > OK, that makes sense now. And I do see that smp_cpus is 4 in the above
> > case. Now looking more into it, i see that powernv has something called
> > "num_chips", isnt this same as sockets ? Do we need num_chips separately?  
> 
> yes that would do for cpus, but how do we retrieve the number of 
> sockets ? I don't see a smp_sockets. 
I'd suggest to rewrite QEMU again :)

more exactly, -smp parsing is global and sometimes doesn't suite
target device model/machine.
Idea was to make it's options machine properties to get rid of globals
and then let leaf machine redefine parsing behaviour.
here is Drew's take on it:

[Qemu-devel] [PATCH RFC 00/16] Rework SMP parameters
https://www.mail-archive.com/qemu-devel@nongnu.org/msg376961.html

considering there weren't pressing need, the series has been pushed
to the end of TODO list. Maybe you can revive it and make work for
pnv and other machines.


> If we start looking at such issues, we should also take into account 
> memory distribution :
> 
>       -numa node[,mem=size][,cpus=firstcpu[-lastcpu]][,nodeid=node]
it's interface based on cpu_index, which internal qemu number for
an 1 cpu execution context.

It would be better if one would use new interface with new machines
 -numa cpu,node-id=0,socket-id=0 ...

> 
> would allow us to define a set of cpus per node, cpus should be evenly 
> distributed on the nodes though, and also define memory per node, but 
> some nodes could be without memory.   
> 
> C. 
> 


Reply via email to