Responses to two messages plus additional thoughts below.

On Fri, Dec 2, 2016 at 11:32 AM, Len Weincier wrote:
>
> That's very useful. Inside the Joyent public cloud how many cores do you
> see inside an LX zone? When I spin up an LX zone I see all the cores on
> the machine, which makes sense. So some customers see 40 Cores ("Intel(R)
> Xeon(R) CPU E5-2670 v2 @ 2.50GHz") and some customers see 128 cores
> ("Intel(R) Xeon(R) CPU E7-4850 v4 @ 2.10GHz") regardless of the cpu_cap
> setting.
>

Inside the Joyent cloud, LX zones see all the cores on the machine in
question. Same as on your hardware.


> The problem is, as Ian said above, that when cpu_cap is less than the core
> count it's tricky for our customers.
>

Yes, software in a zone making inferences based on how many cores it can
see is problematic.


On Fri, Dec 2, 2016 at 5:19 PM, Ian Collins wrote:
>
> The problem with the current scheme is when you hit a capped zone with
> something like a parallel compile.  The build tool will see all of the
> cores and happily spin up a compile or two on each.  So on a decent machine
> like the one Len describes, there will be 64 CPU-intensive jobs running on
> a machine with a cap of one CPU...  The result is the load average goes
> through the roof and the box falls over.  If you do this in a KVM, the
> KVM will curl up and die.
>

Yes.


> The only way I know to properly confine a guest is to use a KVM with
> matching CPU cap and count.
>

Currently, yes, though depending on the software in question sometimes you
can override the degree of parallelism that it will use by setting an
explicit value.
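
As a rough sketch of that workaround, assuming whoever launches the build
knows the zone's cpu_cap: the cap is a percentage of a single CPU (100 is
one full core), so a job count can be derived from it instead of from the
visible core count. The helper names are hypothetical, and treating a cap
of 0 as "uncapped" is an assumption of this sketch; `-j` is the usual make
option for the number of parallel jobs.

```python
import math
import os


def jobs_for_cap(cpu_cap_percent, visible_cores):
    """Pick a job count from the zone's cpu_cap instead of the visible cores.

    cpu_cap_percent is a percentage of a single CPU (100 == one full core).
    """
    if cpu_cap_percent <= 0:
        # Assumption: 0 (or a missing value) means "no cap", so fall back
        # to whatever core count is visible.
        return max(1, visible_cores)
    capped = math.ceil(cpu_cap_percent / 100)
    # Never ask for more jobs than visible cores, and never fewer than one.
    return max(1, min(capped, visible_cores))


def make_command(cpu_cap_percent, visible_cores=None):
    """Build a make invocation with an explicit -j value."""
    if visible_cores is None:
        visible_cores = os.cpu_count() or 1
    return ["make", "-j", str(jobs_for_cap(cpu_cap_percent, visible_cores))]
```

On the 40-core machine Len describes, a zone with a cap of 100 would get
`make -j 1` rather than 40 or more concurrent compiles.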

Additional thoughts:
I wonder how bad it would be to allow the LX brand to alter how many CPUs
Linux system calls report, based on a zone property (probably a new one).
My first instinct is "not that bad", though my second instinct is "the
consequences of doing so are subtle and it's probably much worse than I
would think".

I would look into how to make sure those parallel compiles are using an
appropriate level of parallelism. I imagine the set of applications that
misbehave based on core count, while theoretically very large, will in
practice be relatively small, and helping customers work around those
issues might be the easiest solution of them all.
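
For software a customer controls, one possible shape for such a workaround
is to size the worker pool from an explicit setting rather than from the
detected core count. This is only a sketch: the ZONE_JOBS environment
variable and both function names are made up for illustration, and a
thread pool stands in for whatever pool the real application uses.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def worker_count(env=None, default=1):
    """Return an explicit worker count from the hypothetical ZONE_JOBS variable.

    Falls back to `default` when the variable is unset, non-numeric, or
    not a positive integer.
    """
    env = os.environ if env is None else env
    raw = env.get("ZONE_JOBS", "")
    try:
        value = int(raw)
    except ValueError:
        return default
    return value if value > 0 else default


def run_all(tasks, workers):
    """Run zero-argument callables on a pool of a fixed, explicit size."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves the order of the submitted tasks.
        return list(pool.map(lambda task: task(), tasks))
```

The point of the conservative default is that an unconfigured zone runs
one job at a time instead of one per visible core.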

-Nahum
