On 02.03.2015 18:15, Gerhard Wiesinger wrote:
On 02.03.2015 16:52, Gerhard Wiesinger wrote:
On 02.03.2015 10:26, Paolo Bonzini wrote:
On 01/03/2015 11:36, Gerhard Wiesinger wrote:
So far it has happened only on the PostgreSQL database VM. The kernel is
alive (ping works well), but ssh is not working.
Console window: after entering one character at the login prompt, it
crashed:
[1438.384864] Out of memory: Kill process 10115 (pg_dump) score 112 or
sacrifice child
[1438.384990] Killed process 10115 (pg_dump) total-vm: 340548kB,
anon-rss: 162712kB, file-rss: 220kB
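To list which processes the OOM killer hit, the kernel log can be filtered for the "Killed process" messages shown above. This is only a minimal sketch: it assumes unwrapped log lines as dmesg prints them, and the function name oom_victims is made up for illustration.

```shell
#!/bin/sh
# Print "PID NAME" for every process the OOM killer reports as killed.
# Reads kernel log text on stdin, e.g.:  dmesg | oom_victims
# (oom_victims is a hypothetical helper name, not an existing tool.)
oom_victims() {
    sed -n 's/.*Killed process \([0-9][0-9]*\) (\([^)]*\)).*/\1 \2/p'
}
```

Run inside the guest after a nightly backup, this would show whether pg_dump is always the victim or whether other processes get killed as well.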
Can you get a vmcore or at least sysrq-t output?
Yes, next time it happens I can analyze it.
I think there are 2 problems:
1.) OOM (Out of Memory) problem with the low memory settings and
kernel settings (see below)
2.) Instability problem which might depend on 1.)
What I've done so far (thanks to Andrey Korolyov for ideas and help):
a.) Updated machine type from pc-0.15 to pc-i440fx-2.2
virsh dumpxml database | grep "<type"
<type arch='x86_64' machine='pc-0.15'>hvm</type>
virsh edit database
virsh dumpxml database | grep "<type"
<type arch='x86_64' machine='pc-i440fx-2.2'>hvm</type>
SMBIOS is therefore updated from 2.4 to 2.8:
dmesg|grep -i SMBIOS
[ 0.000000] SMBIOS 2.8 present.
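The grep on the domain XML in step a.) can be narrowed to print only the machine type, which is easier to compare across several guests. A small sketch, assuming the single-quoted attribute style that virsh dumpxml prints above; machine_type is a hypothetical helper name.

```shell
#!/bin/sh
# Print the machine attribute of the <type> element from libvirt domain XML
# read on stdin, e.g.:  virsh dumpxml database | machine_type
# (machine_type is an illustrative name; it assumes single-quoted attributes.)
machine_type() {
    sed -n "s/.*<type [^>]*machine='\([^']*\)'.*/\1/p"
}
```

As far as I know, a machine type changed with virsh edit only takes effect after the guest has been fully shut down and started again, not after a reboot from inside the guest.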
b.) Switched to tsc clock, kernel parameters: clocksource=tsc
nohz=off highres=off
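To confirm that the guest really runs on the requested clocksource after booting with these parameters, the value can be read back from sysfs. A minimal sketch; the function name and the optional directory argument are assumptions added for illustration and testing.

```shell
#!/bin/sh
# Print the clocksource the running kernel currently uses.
# The optional argument overrides the sysfs directory (illustrative hook,
# handy for trying the function outside a real guest).
current_clocksource() {
    dir=${1:-/sys/devices/system/clocksource/clocksource0}
    cat "$dir/current_clocksource"
}
```

The file available_clocksource in the same directory lists what the kernel could switch to, so it shows whether tsc and kvm-clock are both offered.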
c.) Changed overcommit to 1
echo "vm.overcommit_memory = 1" > /etc/sysctl.d/overcommit.conf
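Writing the file alone does not change the running kernel, so it may be worth reading the live value back. The following is a sketch only; sysctl_read and its optional root argument are hypothetical names introduced here.

```shell
#!/bin/sh
# Read a sysctl key straight from procfs, e.g.:  sysctl_read vm.overcommit_memory
# The optional second argument replaces /proc/sys (illustrative hook for testing).
# Note: a file under /etc/sysctl.d is only applied once it is loaded, e.g. with
#   sysctl -p /etc/sysctl.d/overcommit.conf
# or at the next boot.
sysctl_read() {
    root=${2:-/proc/sys}
    # vm.overcommit_memory -> vm/overcommit_memory
    cat "$root/$(printf '%s' "$1" | tr . /)"
}
```

If this prints 0 after the echo above, the setting has been stored but not applied yet.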
d.) Tried 1 VCPU instead of 2
e.) Installed 512MB vRAM instead of 384MB
f.) Prepared for sysrq and vmcore
echo "kernel.sysrq = 1" > /etc/sysctl.d/sysrq.conf
sysctl -w kernel.sysrq=1
virsh send-key database KEY_LEFTALT KEY_SYSRQ KEY_T
virsh dump domain-name /tmp/dumpfile
g.) Further ideas, not yet done: disable memory ballooning by
blacklisting the balloon driver or removing it from the virsh XML config
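For idea g.), a quick check of whether a guest still has an active balloon device could look like the sketch below. It assumes the usual libvirt representation, where a disabled balloon appears as a memballoon element with model='none'; has_balloon is a made-up helper name.

```shell
#!/bin/sh
# Exit 0 if the domain XML on stdin defines a balloon device that is not
# model='none', exit 1 otherwise.
# Usage:  virsh dumpxml database | has_balloon && echo "balloon active"
# Guest-side alternative (assumption: the driver is built as a module):
#   echo "blacklist virtio_balloon" > /etc/modprobe.d/no-balloon.conf
has_balloon() {
    grep "<memballoon" | grep -qv "model='none'"
}
```

Disabling the device on the host side avoids having to touch the guest's module configuration at all.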
Summary:
1.) 512MB, tsc timer, 1VCPU, vm.overcommit_memory = 1: no OOM
problem, no crash
2.) 512MB, kvm_clock, 2VCPU, vm.overcommit_memory = 1: no OOM
problem, no crash
3.) 384MB, kvm_clock, 2VCPU, vm.overcommit_memory = 1: no OOM problem,
no crash
3b.) It happened again at the nightly backup with the same configuration
as in 3.) (384MB, kvm_clock, 2VCPU, vm.overcommit_memory = 1,
pc-i440fx-2.2): no OOM problem, ping ok, no reaction from the VM, BUT
CRASHED again
SYSRQ: no reaction from the VM
virsh send-key vm KEY_LEFTALT KEY_SYSRQ KEY_T
virsh dump vm file.core
error: Failed to core dump domain vm to file.core
error: internal error: unable to execute QEMU command 'migrate': State
blocked by non-migratable device '0000:00:09.0/ich9_ahci'
I removed the SATA controller, so the dump should work in the future.
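To find out which device blocks a dump without reading the whole error text, the device name can be cut out of the message. A small sketch under the assumption that the error keeps the wording shown above; blocking_device is a hypothetical name.

```shell
#!/bin/sh
# Print the device named in a "non-migratable device" error read on stdin.
# Expects the message on one line (the error above is wrapped by the mailer).
# Usage:  virsh dump vm file.core 2>&1 | blocking_device
# Possible alternative (an assumption, not tested here): a memory-only dump,
#   virsh dump --memory-only vm file.core
# which, as far as I know, does not go through the 'migrate' command.
blocking_device() {
    sed -n "s/.*non-migratable device '\([^']*\)'.*/\1/p"
}
```

If the memory-only variant works with the AHCI controller still present, removing the controller would not be needed just for getting a vmcore.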
Any further ideas?
Ciao,
Gerhard