Hello dear Debian folks,

We run a Debian 9.2 build server on top of a VMware ESXi install on a quite powerful server: a Dell PowerEdge R730 with 2x Xeon E5-2683 v4 (16 cores per CPU, which makes 64 vCPUs with HT enabled). Both installations are fully updated, and the Dell firmware is up to date as well.

Now I do see stack traces in the Debian /var/log/messages file (attached), but no entries at the corresponding times in the underlying ESXi logs, so I tend to say it's a Debian (or kernel) problem. The traces occur under heavy load, and the server stops responding.

Unfortunately we're only evaluating VMware for this use case, so I cannot open a ticket there while I'm running in eval mode. There is only one VM on this physical server. And no, it was not my idea to run it on VMware; I was told to do so. I ran open-vm-tools and also tried the proprietary VMware tools: no difference.

Linux hostname 4.9.0-4-amd64 #1 SMP Debian 4.9.51-1 (2017-09-28) x86_64 GNU/Linux

Any ideas what I can do? Any help would be greatly appreciated.

Best regards
Tom Stocker

More info:

root@hostname:# cat /proc/meminfo
MemTotal: 121711504 kB
MemFree: 8760844 kB
MemAvailable: 119331608 kB
Buffers: 11026468 kB
Cached: 93449972 kB
SwapCached: 7924 kB
Active: 16845672 kB
Inactive: 87776128 kB
Active(anon): 73852 kB
Inactive(anon): 140096 kB
Active(file): 16771820 kB
Inactive(file): 87636032 kB
Unevictable: 91916 kB
Mlocked: 91916 kB
SwapTotal: 524287996 kB
SwapFree: 524234744 kB
Dirty: 48 kB
Writeback: 0 kB
AnonPages: 232092 kB
Mapped: 158928 kB
Shmem: 64996 kB
Slab: 7419836 kB
SReclaimable: 7161340 kB
SUnreclaim: 258496 kB
KernelStack: 14608 kB
PageTables: 30352 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 585143748 kB
Committed_AS: 1587700 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 0 kB
VmallocChunk: 0 kB
HardwareCorrupted: 0 kB
AnonHugePages: 0 kB
ShmemHugePages: 0 kB
ShmemPmdMapped: 0 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
DirectMap4k: 2940800 kB
DirectMap2M: 104013824 kB
DirectMap1G: 18874368 kB

root@hostname:# cat /proc/cpuinfo | less
core id : 1
cpu cores : 32
apicid : 1
initial apicid : 1
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts nopl xtopology tsc_reliable nonstop_tsc eagerfpu pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 invpcid rtm rdseed adx smap xsaveopt arat
bugs :
bogomips : 4199.99
clflush size : 64
cache_alignment : 64
address sizes : 43 bits physical, 48 bits virtual
power management:

[...]

processor : 63
vendor_id : GenuineIntel
cpu family : 6
model : 79
model name : Intel(R) Xeon(R) CPU E5-2683 v4 @ 2.10GHz
stepping : 1
microcode : 0xb000021
cpu MHz : 2099.078
cache size : 40960 KB
physical id : 1
siblings : 32
core id : 31
cpu cores : 32
apicid : 63
initial apicid : 63
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts nopl xtopology tsc_reliable nonstop_tsc eagerfpu pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 invpcid rtm rdseed adx smap xsaveopt arat
bugs :
bogomips : 4199.99
clflush size : 64
cache_alignment : 64
address sizes : 43 bits physical, 48 bits virtual
power management:
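
For readers skimming the /proc/meminfo figures above, here is a minimal sketch of how they could be condensed into a few derived numbers. It is only an illustration: the helper names parse_meminfo and summarize are hypothetical, the SAMPLE text is a subset of the values quoted in this post, and treating Buffers + Cached as "page cache" is a simplifying assumption.

```python
# Sketch: summarize a few /proc/meminfo fields (sizes are in kB).
# SAMPLE holds figures copied from the post; parse_meminfo and summarize
# are hypothetical helper names, not part of any existing tool.

SAMPLE = """\
MemTotal: 121711504 kB
MemFree: 8760844 kB
Buffers: 11026468 kB
Cached: 93449972 kB
AnonPages: 232092 kB
SwapTotal: 524287996 kB
SwapFree: 524234744 kB
HugePages_Total: 0
"""


def parse_meminfo(text):
    """Return {field: int}; sizes are in kB, counters are unitless."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep or not rest.split():
            continue  # skip blank or malformed lines
        fields[name.strip()] = int(rest.split()[0])
    return fields


def summarize(fields):
    """Derive page-cache share, anonymous share, and swap in use."""
    total = fields["MemTotal"]
    # Assumption: Buffers + Cached approximates the page cache.
    cache_kb = fields["Buffers"] + fields["Cached"]
    return {
        "cache_pct": round(100 * cache_kb / total, 1),
        "anon_pct": round(100 * fields["AnonPages"] / total, 1),
        "swap_used_kb": fields["SwapTotal"] - fields["SwapFree"],
    }


if __name__ == "__main__":
    for key, value in summarize(parse_meminfo(SAMPLE)).items():
        print(f"{key}: {value}")
```

With the figures from this post, the sketch reports roughly 85.8% of RAM in buffers and cache, about 0.2% in anonymous pages, and 53252 kB of swap in use. On a live system the same functions could be fed the contents of /proc/meminfo instead of SAMPLE.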

Attachment: messages