Hi Paolo,

thanks for your work. Should I still apply your "old" patch to scsi-disk, or should I remove it?

Stefan

On 13.02.2013 14:39, Paolo Bonzini wrote:
> On 13/02/2013 13:55, Stefan Priebe - Profihost AG wrote:
>> Hi,
>> On 13.02.2013 12:36, Paolo Bonzini wrote:
>>> On 13/02/2013 10:07, Stefan Priebe - Profihost AG wrote:
>>>>>>>>
>>>>>>>> commit 47a150a4bbb06e45ef439a8222e9f46a7c4cca3f
>>>> ...
>>>>>> You can certainly try reverting it, but this patch is fixing a real bug.
>>>> Will try that. Yes but even if it fixes a bug and raises another one
>>>> (kvm segfault) which is the worst one. It should be fixed.
>>>
>>> The KVM segfault is exposing a potential consistency problem. What is
>>> worse is not obvious. Also, it is happening at reset time if this is
>>> the culprit. Reset usually happens at places where no data loss is caused.
>>>
>>> Can you find out what the VM was doing when it segfaulted? (Or even,
>>> can you place the corefile and kvm executable somewhere where I can
>>> download it?)
>>
>> Yes it was doing an fstrim -v / which resulted in:
>>
>> [45648.453698] end_request: I/O error, dev sda, sector 9066952
>
> Ok, very helpful. One thing is to find why this failed. This can
> come later though.
>
> First of all, please run
> "cat /sys/block/*/device/scsi_disk/*/provisioning_mode"
> in a VM with a similar configuration as the one that crashed last.
>
> Second, I attach another patch.
>
> Third, if possible please compile QEMU with --enable-trace-backend=simple,
> and run it with
>
> -trace events='bdrv_aio_discard
> scsi_req_cancel
> ',file=qemu.$$.trace
>
> This can give some clues. The files should remain quite small,
> so you can enable it on all VMs safely.
>
>> Sadly not as I don't have a core dump. The kvm processes are started
>> through various daemons and there seems no way to activate core dumps
>> for an already running process and I don't know which VM will crash next.
>
> Probably the next that invokes fstrim. :)
>
>>> If not, do your VMs reset themselves
>>> often for example?
>> No
>
> Ok, good to know.
>
>>> Can you reproduce it on non-rbd storage?
>> I don't have another storage type. ;-(
>
> No problem.
>
> Paolo
>