Package: linux-source-4.19
Version: 4.19.160-2
Followup-For: Bug #927184

We got hit by this bug after updating to the most recent kernel version. In our case it affected all disk drives, regardless of their type (rotating or SSD). After some research, we found out that the issue was triggered by a change of the default I/O scheduler that we had made before the update (and that took effect after the reboot). See the following console output:

root@server:~# cat /sys/block/sde/queue/scheduler
[mq-deadline] none
root@server:~# for a in $( seq 1 5 ) ; do sleep 1 ; cat /sys/block/sde/stat ; done
11452700 133667 706138701 5139372 168800292 66117118 3870603434 78079852 0 1156352620 1170874630 1819485 0 349746784 1614457
11452700 133667 706138701 5139372 168800309 66117137 3870603746 78079862 0 1156352630 1170874640 1819485 0 349746784 1614457
11452700 133667 706138701 5139372 168800417 66117189 3870604914 78079911 0 1156352640 1170874650 1819485 0 349746784 1614457
11452700 133667 706138701 5139372 168800431 66117205 3870605154 78080014 0 1156352740 1170874750 1819485 0 349746784 1614457
11452700 133667 706138701 5139372 168800491 66117250 3870606090 78080037 0 1156352750 1170874760 1819486 0 349746792 1614457
root@server:~# echo "none"> /sys/block/sde/queue/scheduler
root@server:~# for a in $( seq 1 5 ) ; do sleep 1 ; cat /sys/block/sde/stat ; done
11452709 133667 706138965 5139376 168803680 66118376 3870643746 78081646 0 1156354860 1170878030 1819488 0 349746808 1614459
11452710 133667 706138997 5139377 168804363 66118550 3870650634 78081962 0 1156355800 1170879210 1819489 0 349746816 1614459
11452710 133667 706138997 5139377 168804439 66118594 3870651642 78081993 0 1156356740 1170880150 1819489 0 349746816 1614459
11452710 133667 706138997 5139377 168804498 66118653 3870652762 78082022 0 1156357680 1170881090 1819489 0 349746816 1614459
11452710 133667 706138997 5139377 168804569 66118700 3870653834 78082053 0 1156358630 1170882040 1819489 0 349746816 1614459
root@server:~# echo "mq-deadline"> /sys/block/sde/queue/scheduler
root@server:~# for a in $( seq 1 5 ) ; do sleep 1 ; cat /sys/block/sde/stat ; done
11452710 133667 706138997 5139377 168804885 66118930 3870658594 78082208 0 1156361540 1170884950 1819491 0 349779592 1614461
11452710 133667 706138997 5139377 168804942 66118957 3870659346 78082236 0 1156361570 1170884980 1819491 0 349779592 1614461
11452711 133667 706139005 5139377 168805194 66119062 3870673810 78082346 0 1156361630 1170885050 1819491 0 349779592 1614461
11452711 133667 706139005 5139377 168805240 66119088 3870674394 78082368 0 1156361630 1170885050 1819491 0 349779592 1614461
11452711 133667 706139005 5139377 168805464 66119186 3870688410 78082457 0 1156361670 1170885130 1819491 0 349779592 1614461

As long as the mq-deadline scheduler is active, everything works as expected: the value in the 10th column (io_ticks) increases in realistic steps. After switching to the none scheduler, the increments grow to almost a full second whenever there is at least some activity on the drive, which in turn shows up as almost 100% disk utilization in tools that use these values. Note that the value in the 10th column is only supposed to increase while the value in the 9th column (I/Os currently in flight) is non-zero (according to https://www.kernel.org/doc/Documentation/iostats.txt ), yet I never saw a non-zero number in that column during my testing, which does not correspond to a device under 100% load.
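
For reference, a minimal sketch of the kind of calculation monitoring tools do on top of these counters (field numbers per the iostats documentation above; the device name sde and the 1-second sampling interval are just the values from my test, adjust as needed):

# Approximate %util from the 10th field (io_ticks, ms spent doing I/O)
dev=sde
t1=$( awk '{ print $10 }' /sys/block/$dev/stat )
sleep 1
t2=$( awk '{ print $10 }' /sys/block/$dev/stat )
echo "util: $(( (t2 - t1) / 10 ))%"   # delta in ms over a 1000 ms window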

Using mq-deadline in virtual machines and for SSDs is most likely not optimal, but switching back to it works as a workaround.
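
In case it helps others, one way to make that workaround persistent is a udev rule; a minimal sketch only, assuming the rule file name and the sd[a-z]* device match are adjusted to the local setup:

# Example only: set mq-deadline for all matching disks on add/change events
cat > /etc/udev/rules.d/60-io-scheduler.rules <<'EOF'
ACTION=="add|change", KERNEL=="sd[a-z]*", ATTR{queue/scheduler}="mq-deadline"
EOF
udevadm control --reload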
