On Tue, Jan 9, 2018 at 7:19 PM, Fam Zheng <f...@redhat.com> wrote:

> On Tue, 01/09 23:18, Ala Hino wrote:
> > qemu version:
> > qemu-kvm-rhev-2.9.0-16.el7_4.5.x86_64
> >
> > On Tue, Jan 9, 2018 at 11:10 PM, Ala Hino <ah...@redhat.com> wrote:
> >
> > > Hello,
> > >
> > > A user is hitting the following error when performing `qemu-img commit`:
> > >
> > > 2018-01-03 09:48:54,168+0000 ERROR (tasks/5) [root] Job
> > > u'e4d77997-db47-4c60-8f2c-019edebe9917' failed (jobs:217)
> > > Traceback (most recent call last):
> > >   File "/usr/lib/python2.7/site-packages/vdsm/jobs.py", line 154, in run
> > >     self._run()
> > >   File "/usr/share/vdsm/storage/sdm/api/merge.py", line 65, in _run
> > >     self.operation.wait_for_completion()
> > >   File "/usr/lib/python2.7/site-packages/vdsm/qemuimg.py", line 329, in wait_for_completion
> > >     self.poll(timeout)
> > >   File "/usr/lib/python2.7/site-packages/vdsm/qemuimg.py", line 324, in poll
> > >     self.error)
> > > QImgError: cmd=['/usr/bin/taskset', '--cpu-list', '0-31', '/usr/bin/nice',
> > > '-n', '19', '/usr/bin/ionice', '-c', '3', '/usr/bin/qemu-img', 'commit',
> > > '-p', '-t', 'none', '-b',
> > > u'/rhev/data-center/mnt/glusterSD/bms-gluster01-am5:_rhevm/ab27a13a-7a82-4cf4-86fd-3a6c697ccef9/images/3122bbad-e3d3-446a-90b9-eb6da08dad2a/7fa44a95-2988-4866-924c-329713989ad0',
> > > '-f', 'qcow2',
> > > u'/rhev/data-center/mnt/glusterSD/bms-gluster01-am5:_rhevm/ab27a13a-7a82-4cf4-86fd-3a6c697ccef9/images/3122bbad-e3d3-446a-90b9-eb6da08dad2a/8ba82ba8-32ea-4501-95e8-55ef42812c89'],
> > > ecode=-6, stdout=, stderr=qemu-img:
> > > block/file-posix.c:1774: find_allocation: Assertion `offs >= start' failed.
> > > , message=None
> > >
> > > Would appreciate any explanation of the error and what potentially could
> > > cause it.
>
> This is odd. The code around it is simply:
>
>     off_t offs;
>
>     /*
>      * SEEK_DATA cases:
>      * D1. offs == start: start is in data
>      * D2. offs > start: start is in a hole, next data at offs
>      * D3. offs < 0, errno = ENXIO: either start is in a trailing hole
>      *     or start is beyond EOF
>      *     If the latter happens, the file has been truncated behind
>      *     our back since we opened it. All bets are off then.
>      *     Treating like a trailing hole is simplest.
>      * D4. offs < 0, errno != ENXIO: we learned nothing
>      */
>     offs = lseek(s->fd, start, SEEK_DATA);
>     if (offs < 0) {
>         return -errno; /* D3 or D4 */
>     }
>     assert(offs >= start);
>
> We already checked "offs < 0" so it is non-negative. But according to the
> manpage:
>
> > SEEK_DATA
> >     Adjust the file offset to the next location in the file greater
> >     than or equal to offset containing data. If offset points to
> >     data, then the file offset is set to offset.
>
> What is the filesystem the image is on? Maybe the implementation has violated
> this and returned an offset before @start.
>
This is on glusterfs.

> Can you run the qemu-img command manually and reproduce it? If you can please
> collect the output after prepending to the command line with "strace -f -e
> lseek", so we can see what the kernel has returned for the lseek call.
>

This assertion failure seems similar to BZ 1451191, except that one was in a
gluster library function, glfs_lseek(), whereas in our case it is in lseek().

I am unable to reproduce this in our local in-house RHV (oVirt) lab. Because
this is on RHV (oVirt), running the qemu-img command manually would require
additional steps to finalize the images, and the end user is not willing to do
that because it is a production VM.

These are the specifics:

~~~
ovirt-engine-4.1.6.2-0.1.el7.noarch
Red Hat Virtualization Host 4.1 (el7.4) - 3.10.0-693.2.2.el7.x86_64
vdsm-4.19.31-1.el7ev.x86_64
qemu-kvm-rhev-2.9.0-16.el7_4.5.x86_64
libvirt-daemon-3.2.0-14.el7_4.3.x86_64
glusterfs-3.8.4-18.6.el7rhgs.x86_64
~~~

> Fam
>
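
Since rerunning `qemu-img commit` on the production VM is not an option, a read-only probe of the same lseek(SEEK_DATA) contract might be a less intrusive way to check whether the gluster mount ever returns an offset before the requested start. The following is only a sketch under assumptions, not code from vdsm or qemu: `probe_seek_data` is a hypothetical helper name, it needs Python 3.3 or later for `os.SEEK_DATA` (the vdsm on the host runs under Python 2.7, so a separate interpreter would be needed), and a clean result would not rule out the bug, since the failure may depend on the state of the file at the time of the commit.

```python
import errno
import os


def probe_seek_data(path, offsets):
    """Return (start, offs) pairs where lseek(SEEK_DATA) returned offs < start.

    The file is opened read-only and never modified. This mirrors the
    D1..D4 cases from the quoted find_allocation() comment.
    """
    violations = []
    fd = os.open(path, os.O_RDONLY)
    try:
        for start in offsets:
            try:
                offs = os.lseek(fd, start, os.SEEK_DATA)
            except OSError as e:
                if e.errno == errno.ENXIO:
                    # D3: start is in a trailing hole or beyond EOF
                    continue
                # D4: any other error tells us nothing, so surface it
                raise
            if offs < start:
                # This is the condition the qemu-img assertion rejects
                violations.append((start, offs))
    finally:
        os.close(fd)
    return violations
```

A possible use would be to call it on the image path from the failing command with start offsets stepped across the file size, for example every 1 MiB, and to report any pairs it returns. An empty list means no violation was observed for those offsets.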