Hi,

we are facing sporadic latency issues in our guests due to the synchronous nature of bdrv_check_byte_request on raw images:

#0  0x00007ff8962e2070 in lseek64 () from /lib64/libpthread.so.0
#1  0x00000000004d7f97 in raw_getlength (bs=0xcab010) at block/raw-posix.c:704
#2  0x00000000004b9d22 in bdrv_getlength (bs=0xcab010) at block.c:816
#3  0x00000000004b95f0 in bdrv_check_byte_request (bs=0xcab010, offset=233472, size=4096) at block.c:618
#4  0x00000000004b9664 in bdrv_check_request (bs=0xcab010, sector_num=456, nb_sectors=8) at block.c:632
#5  0x00000000004bb56a in bdrv_aio_readv (bs=0xcab010, sector_num=456, qiov=0xcf5d48, nb_sectors=8, cb=0x4eaddc <scsi_read_complete>, opaque=0xcf5cc0) at block.c:1554

The host tends to hang in lseek on some kernel mutex. Instead of
addressing this with the big hammer (a more preemptible kernel), I
wondered why we need that many raw_getlength requests with all their
file open/ioctl/close/whatever syscalls, and why in the asynchronous
I/O path.

Looking at bdrv_check_byte_request, I find

    if (bs->growable)
        return 0;

i.e. out-of-bound requests on growable devices are always handled by
the respective I/O handler. Only fixed-size devices (like the classic
raw disk image file...) go through

    len = bdrv_getlength(bs);

    if (offset < 0)
        return -EIO;

    if ((offset > len) || (len - offset < size))
        return -EIO;

for _each_ request!? Why? Looks like there is room for improvement,
and not only wrt latency, isn't there?

Jan

PS: This should also explain at least some of the latencies I once
measured with prio-boosted VCPU threads on PREEMPT-RT kernels.

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux
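
To make the per-request cost visible, the check quoted above can be reproduced as a minimal standalone sketch. Note that struct fake_bs, fake_getlength and the getlength_calls counter are hypothetical stand-ins introduced only for illustration; they are not QEMU's BlockDriverState or bdrv_getlength, and the real raw_getlength does syscalls where this sketch just bumps a counter.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the block device state (assumption, not QEMU's type). */
struct fake_bs {
    int growable;             /* non-zero: device may grow, check is skipped */
    int64_t length;           /* length a real getlength would have to query */
    unsigned getlength_calls; /* counts how often the length is queried */
};

/* Hypothetical stand-in for bdrv_getlength(); the real one ends up in lseek. */
static int64_t fake_getlength(struct fake_bs *bs)
{
    bs->getlength_calls++;
    return bs->length;
}

/* Same logic as the quoted check: one length query per non-growable request. */
static int check_byte_request(struct fake_bs *bs, int64_t offset, size_t size)
{
    int64_t len;

    if (bs->growable)
        return 0;

    len = fake_getlength(bs);

    if (offset < 0)
        return -EIO;

    /* offset <= len holds when the subtraction is evaluated, so the cast is safe */
    if ((offset > len) || ((uint64_t)(len - offset) < size))
        return -EIO;

    return 0;
}
```

In this sketch every request on a fixed-size device increments getlength_calls once, which mirrors the one-lseek-per-request pattern seen in the backtrace; a growable device never queries the length at all.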