Update:
First of all: forget my observation about the 'system boot time'. I
mixed something up. The dom0 boot time did increase, but that was
probably caused by the LVM thin activation not being handled properly
during system boot.
One last thing I pulled from domu with the original kernel (4.9.51-1)
was this top output:
top - 20:41:03 up 6:18, 2 users, load average: 17.03, 6.98, 2.62
Tasks: 231 total, 1 running, 230 sleeping, 0 stopped, 0 zombie
%Cpu0 : 0.0 us, 0.0 sy, 0.0 ni, 0.0 id,100.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu1 : 0.0 us, 0.3 sy, 0.0 ni, 0.0 id, 99.7 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 8212616 total, 1907568 free, 1485276 used, 4819772 buff/cache
KiB Swap: 2097148 total, 2097148 free, 0 used. 6558984 avail Mem
At this point the system is more or less unusable; everything that
depends on IO is dead.
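
For anyone who wants to log the per-CPU wa value over time instead of
watching top, here is a minimal sketch. It assumes the usual Linux
/proc/stat layout (user, nice, system, idle, iowait, irq, softirq,
steal). The function names and the 5 second interval are my own choices
for illustration, not something taken from the affected system.

```python
import time


def parse_cpu_times(stat_text):
    """Return {cpu_name: (iowait_ticks, total_ticks)} for each per-CPU line."""
    times = {}
    for line in stat_text.splitlines():
        fields = line.split()
        # Keep only per-CPU lines (cpu0, cpu1, ...); skip the aggregate "cpu" line.
        if not fields or not fields[0].startswith("cpu") or fields[0] == "cpu":
            continue
        # user, nice, system, idle, iowait, irq, softirq, steal.
        # Guest time is already included in user, so it is not added again.
        ticks = [int(value) for value in fields[1:9]]
        times[fields[0]] = (ticks[4], sum(ticks))
    return times


def iowait_percent(before, after):
    """Return {cpu_name: iowait share in percent} between two samples."""
    result = {}
    for cpu, (wait_after, total_after) in after.items():
        if cpu not in before:
            continue
        wait_before, total_before = before[cpu]
        delta_total = total_after - total_before
        if delta_total > 0:
            result[cpu] = 100.0 * (wait_after - wait_before) / delta_total
        else:
            result[cpu] = 0.0
    return result


def sample_iowait(interval=5.0, path="/proc/stat"):
    """Take two samples `interval` seconds apart and return the iowait share."""
    with open(path) as stat_file:
        first = parse_cpu_times(stat_file.read())
    time.sleep(interval)
    with open(path) as stat_file:
        second = parse_cpu_times(stat_file.read())
    return iowait_percent(first, second)
```

Calling sample_iowait() in a loop and writing the result to a log file
should show whether wa climbs slowly or jumps to 100% at once.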
Currently my production domu has been running for over a week with the
latest backports kernel (linux-image-4.13.0-0.bpo.1-amd64). dom0 is
still on the current stretch kernel (4.9.51-1), and it seems stable for
now.
My guess would be some issue with the xen blkfront driver.
Around the end of last year I experienced something similar with jessie.
After some kernel updates those issues got better. They are not
completely gone: some jessie domu systems still need a reboot from time
to time due to rising wa, but the system stays responsive then; it just
gets slower and slower by the minute.