Possibly, but I think they'd be using the V100s rather than the CPUs. For reference:
rr42-03:~$ sudo cpupower monitor -l
Monitor "Nehalem" (4 states) - Might overflow after 922000000 s
C3      [C] -> Processor Core C3
C6      [C] -> Processor Core C6
PC3     [P] -> Processor Package C3
PC6     [P] -> Processor Package C6
Monitor "Mperf" (3 states) - Might overflow after 922000000 s
C0      [T] -> Processor Core not idle
Cx      [T] -> Processor Core in an idle state
Freq    [T] -> Average Frequency (including boost) in MHz

So that node is doing nothing much right now.

On 16 May 2018 at 04:29, John Hearns <hear...@googlemail.com> wrote:

> Blair,
> methinks someone is doing bitcoin mining on your systems when they are
> idle :-)
>
> I WAS going to say that maybe the cpupower utility needs an update to cope
> with that generation of CPUs.
> But /proc/cpuinfo never lies (does it ?)
>
> On 16 May 2018 at 13:22, Blair Bethwaite <blair.bethwa...@gmail.com>
> wrote:
>
>> On 15 May 2018 at 08:45, Wido den Hollander <w...@42on.com> wrote:
>>>
>>> > We've got some Skylake Ubuntu based hypervisors that we can look at to
>>> > compare tomorrow...
>>> >
>>>
>>> Awesome!
>>
>>
>> Ok, so results still inconclusive I'm afraid...
>>
>> The Ubuntu machines we're looking at (Dell R740s and C6420s running with
>> Performance BIOS power profile, which amongst other things disables cstates
>> and enables turbo) are currently running either a 4.13 or a 4.15 HWE kernel
>> - we needed 4.13 to support PERC10 and even get them booting from local
>> storage, then 4.15 to get around a prlimit bug that was breaking Nova
>> snapshots, so here we are. Where are you getting 4.16,
>> http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.16/ ?
>>
>> So interestingly in our case we seem to have no cpufreq driver loaded.
>> After installing linux-generic-tools (cause cpupower is supposed to
>> supersede cpufrequtils I think?):
>>
>> rr42-03:~$ uname -a
>> Linux rcgpudc1rr42-03 4.15.0-13-generic #14~16.04.1-Ubuntu SMP Sat Mar 17
>> 03:04:59 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
>>
>> rr42-03:~$ cat /proc/cmdline
>> BOOT_IMAGE=/vmlinuz-4.15.0-13-generic root=/dev/mapper/vg00-root ro
>> intel_iommu=on iommu=pt intel_idle.max_cstate=0 processor.max_cstate=1
>>
>> rr42-03:~$ lscpu
>> Architecture:          x86_64
>> CPU op-mode(s):        32-bit, 64-bit
>> Byte Order:            Little Endian
>> CPU(s):                36
>> On-line CPU(s) list:   0-35
>> Thread(s) per core:    1
>> Core(s) per socket:    18
>> Socket(s):             2
>> NUMA node(s):          2
>> Vendor ID:             GenuineIntel
>> CPU family:            6
>> Model:                 85
>> Model name:            Intel(R) Xeon(R) Gold 6150 CPU @ 2.70GHz
>> Stepping:              4
>> CPU MHz:               3400.956
>> BogoMIPS:              5401.45
>> Virtualization:        VT-x
>> L1d cache:             32K
>> L1i cache:             32K
>> L2 cache:              1024K
>> L3 cache:              25344K
>> NUMA node0 CPU(s):     0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34
>> NUMA node1 CPU(s):     1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31,33,35
>> Flags:                 fpu vme de pse tsc msr pae mce cx8 apic sep mtrr
>> pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe
>> syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts
>> rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64
>> monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca
>> sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c
>> rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb cat_l3 cdp_l3
>> invpcid_single pti intel_ppin mba tpr_shadow vnmi flexpriority ept vpid
>> fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm cqm mpx rdt_a
>> avx512f avx512dq rdseed adx smap clflushopt clwb intel_pt avx512cd avx512bw
>> avx512vl xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total
>> cqm_mbm_local ibpb ibrs stibp dtherm ida arat pln pts pku ospke
>>
>> rr42-03:~$ sudo cpupower frequency-info
>> analyzing CPU 0:
>>   no or unknown cpufreq driver is active on this CPU
>>   CPUs which run at the same hardware frequency: Not Available
>>   CPUs which need to have their frequency coordinated by software: Not Available
>>   maximum transition latency: Cannot determine or is not supported.
>> Not Available
>>   available cpufreq governors: Not Available
>>   Unable to determine current policy
>>   current CPU frequency: Unable to call hardware
>>   current CPU frequency: Unable to call to kernel
>>   boost state support:
>>     Supported: yes
>>     Active: yes
>>
>>
>> And of course there is nothing under sysfs (/sys/devices/system/cpu*).
>> But /proc/cpuinfo and cpupower-monitor show that we seem to be hitting
>> turbo freqs:
>>
>> rr42-03:~$ sudo cpupower monitor
>>     |Nehalem                    || Mperf
>> PKG |CORE|CPU | C3   | C6   | PC3  | PC6  || C0   | Cx   | Freq
>>    0|   0|   0|  0.00|  0.00|  0.00|  0.00||  0.05| 99.95|  3391
>>    0|   1|   4|  0.00|  0.00|  0.00|  0.00||  0.02| 99.98|  3389
>>    0|   2|   8|  0.00|  0.00|  0.00|  0.00||  0.14| 99.86|  3067
>>    0|   3|   6|  0.00|  0.00|  0.00|  0.00||  0.01| 99.99|  3385
>>    0|   4|   2|  0.00|  0.00|  0.00|  0.00||  0.09| 99.91|  3119
>>    0|   8|  12|  0.00|  0.00|  0.00|  0.00||  0.03| 99.97|  3312
>>    0|   9|  16|  0.00|  0.00|  0.00|  0.00||  0.11| 99.89|  3157
>>    0|  10|  14|  0.00|  0.00|  0.00|  0.00||  0.01| 99.99|  3352
>>    0|  11|  10|  0.00|  0.00|  0.00|  0.00||  0.05| 99.95|  3390
>>    0|  16|  20|  0.00|  0.00|  0.00|  0.00||  0.00|100.00|  3387
>>    0|  17|  24|  0.00|  0.00|  0.00|  0.00||  0.22| 99.78|  3115
>>    0|  18|  26|  0.00|  0.00|  0.00|  0.00||  0.01| 99.99|  3389
>>    0|  19|  22|  0.00|  0.00|  0.00|  0.00||  0.00|100.00|  3366
>>    0|  20|  18|  0.00|  0.00|  0.00|  0.00||  0.01| 99.99|  3392
>>    0|  24|  28|  0.00|  0.00|  0.00|  0.00||  0.00|100.00|  3376
>>    0|  25|  32|  0.00|  0.00|  0.00|  0.00||  0.05| 99.95|  3390
>>    0|  26|  34|  0.00|  0.00|  0.00|  0.00||  0.03| 99.97|  3391
>>    0|  27|  30|  0.00|  0.00|  0.00|  0.00||  0.01| 99.99|  3392
>>    1|   0|   1|  0.00|  0.00|  0.00|  0.00||  0.00|100.00|  3394
>>    1|   1|   5|  0.00|  0.00|  0.00|  0.00||  0.01| 99.99|  3378
>>    1|   2|   9|  0.00|  0.00|  0.00|  0.00||  0.00|100.00|  3393
>>    1|   3|   7|  0.00|  0.00|  0.00|  0.00||  0.01| 99.99|  3384
>>    1|   4|   3|  0.00|  0.00|  0.00|  0.00||  0.02| 99.98|  3391
>>    1|   8|  13|  0.00|  0.00|  0.00|  0.00||  0.01| 99.99|  3390
>>    1|   9|  17|  0.00|  0.00|  0.00|  0.00||  0.00|100.00|  3391
>>    1|  10|  15|  0.00|  0.00|  0.00|  0.00||  0.00|100.00|  3360
>>    1|  11|  11|  0.00|  0.00|  0.00|  0.00||  0.00|100.00|  3393
>>    1|  16|  21|  0.00|  0.00|  0.00|  0.00||  0.01| 99.99|  3397
>>    1|  17|  25|  0.00|  0.00|  0.00|  0.00||  0.00|100.00|  3391
>>    1|  18|  27|  0.00|  0.00|  0.00|  0.00||  0.00|100.00|  3376
>>    1|  19|  23|  0.00|  0.00|  0.00|  0.00||  0.00|100.00|  3334
>>    1|  20|  19|  0.00|  0.00|  0.00|  0.00||  0.00|100.00|  3387
>>    1|  24|  29|  0.00|  0.00|  0.00|  0.00||  0.00|100.00|  3377
>>    1|  25|  33|  0.00|  0.00|  0.00|  0.00||  0.01| 99.99|  3387
>>    1|  26|  35|  0.00|  0.00|  0.00|  0.00||  0.00|100.00|  3392
>>    1|  27|  31|  0.00|  0.00|  0.00|  0.00||  0.00|100.00|  3392
>>
>>
>> On a similar node with the 4.13 kernel we get similar reports from
>> cpupower-monitor, but oddly on 4.13 /proc/cpuinfo shows all cores at base
>> 2700.000 (on 4.15 it updates).
>>
>> We can try 4.16 tomorrow. But I wonder why we are already seeing turbo
>> even at idle and you aren't... only thing I can think of is that it must be
>> because our cstates are disabled in BIOS, indeed when looking in dmesg I
>> see:
>>
>> [ 1.274325] intel_idle: disabled
>>
>> So it stands to reason that intel_idle.max_cstate=0 is doing nothing for
>> either of us. What do you see from intel_idle on 4.16?
>>
>> --
>> Cheers,
>> ~Blairo
>>
>> _______________________________________________
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>

--
Cheers,
~Blairo
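
For anyone wanting to compare nodes without eyeballing the table, here is a minimal sketch of how the cpupower monitor output quoted above could be summarised. It assumes the ten-column Nehalem/Mperf layout shown in this thread and takes 2700 MHz as the base clock from the lscpu model name. The names parse_monitor, summarise and BASE_MHZ are made up for illustration and are not part of cpupower.

```python
import re
from statistics import mean

# Assumption: nominal (non-turbo) clock of the Xeon Gold 6150 in this thread.
BASE_MHZ = 2700

# A few rows copied from the cpupower monitor table in this thread.
SAMPLE = """\
    |Nehalem                    || Mperf
PKG |CORE|CPU | C3   | C6   | PC3  | PC6  || C0   | Cx   | Freq
   0|   0|   0|  0.00|  0.00|  0.00|  0.00||  0.05| 99.95|  3391
   0|   2|   8|  0.00|  0.00|  0.00|  0.00||  0.14| 99.86|  3067
   1|   0|   1|  0.00|  0.00|  0.00|  0.00||  0.00|100.00|  3394
   1|  19|  23|  0.00|  0.00|  0.00|  0.00||  0.00|100.00|  3334
"""


def parse_monitor(text):
    """Return a list of (pkg, core, cpu, c0_percent, freq_mhz) tuples."""
    rows = []
    for line in text.splitlines():
        # Drop mail quote markers and padding so quoted output also parses.
        line = line.lstrip("> ")
        fields = [f.strip() for f in re.split(r"\|+", line)]
        # Data rows have 10 fields and start with a numeric package id;
        # the two header lines fail this check and are skipped.
        if len(fields) != 10 or not fields[0].isdigit():
            continue
        pkg, core, cpu = (int(f) for f in fields[:3])
        rows.append((pkg, core, cpu, float(fields[7]), int(fields[9])))
    return rows


def summarise(rows, base_mhz=BASE_MHZ):
    """Summarise the Freq and C0 columns across all parsed CPUs."""
    freqs = [r[4] for r in rows]
    return {
        "cpus": len(rows),
        "min_mhz": min(freqs),
        "max_mhz": max(freqs),
        "mean_mhz": mean(freqs),
        "above_base": sum(f > base_mhz for f in freqs),
        "max_c0_percent": max(r[3] for r in rows),
    }


if __name__ == "__main__":
    print(summarise(parse_monitor(SAMPLE)))
```

Feeding it the full table rather than the four sample rows would show at a glance whether every core sits above base clock while C0 residency stays near zero, which is the pattern described in the quoted message.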