Sorry guys, I was swamped with other stuff and then simply forgot to test
@alexng's patch.
I ran it overnight on my original test setup and it also worked for me.
--
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https:/
@Alex
I will try as soon as I have some spare time
@jsalisbury
Unfortunately I didn't have time to test the kernel from #341.
Including the #343 upstream commit should take care of our issue, and using the
patch from #345 (replacing my initial patch) should prevent any hv_vss_daemon
crashes in ex
@Alex
The modified timeout should take care of the issue, but I think it's a good idea
for the VSS daemon to issue a THAW before either exiting or trying to recover
--
@Alex
1) Yes, but I might need some help with that. Which list/maintainer should I
submit it to?
2) On our test machine, with both the Hyper-V host as well as the guest under
heavy I/O load, it was a few hundred ms but with high variance and spiking
(quite often) into the 2-4s range, with some
We moved some machines back to their regular backup schedule with the
new kernel, no problems so far
--
@Frederik: yes, as far as I have seen. The file systems are still frozen
between FREEZE and THAW, which in the case of timeouts, is >10s, I have seen
about 30s in some of our error cases. But they do recover.
I only tried the patched 4.4.0-34 version for now, though
Some testing would be apprecia
I spent some time investigating our issue further.
As far as I can tell, the main issue is that ioctl(FIFREEZE) can take a long
time when running VSS backups, and the default timeout is 10s.
This is very noticeable under load, with rare peaks of >5s seen, so 10s seems
plausible
If the timeout is
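To check how long the freeze itself takes on a given guest, something along these lines could be used. This is only a rough sketch, not what hv_vss_daemon actually does: `timed_freeze` and its parameters are made-up names, the ioctl numbers are the FIFREEZE/FITHAW values from <linux/fs.h>, and it must be run as root on a filesystem you can afford to stall for a moment.

```python
import fcntl
import os
import time

# _IOWR('X', 119, int) and _IOWR('X', 120, int) from <linux/fs.h>
FIFREEZE = 0xC0045877
FITHAW = 0xC0045878


def timed_freeze(mountpoint, hold_seconds=0.0, ioctl=fcntl.ioctl):
    """Freeze `mountpoint`, hold it briefly, thaw it again.

    Returns the number of seconds the FIFREEZE call took.
    `ioctl` is injectable so the function can be exercised without root.
    """
    fd = os.open(mountpoint, os.O_RDONLY)
    try:
        start = time.monotonic()
        ioctl(fd, FIFREEZE, 0)
        elapsed = time.monotonic() - start
        try:
            # Stand-in for the time the host needs to take its snapshot.
            time.sleep(hold_seconds)
        finally:
            # Always thaw once the freeze succeeded, even on errors.
            ioctl(fd, FITHAW, 0)
        return elapsed
    finally:
        os.close(fd)
```

Calling `timed_freeze("/mnt/scratch")` in a loop while the guest is under I/O load (the mountpoint is just an example) would show how close the freeze time gets to the 10s limit mentioned above.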
v4.8-rc2 (with patch 1/2 from #308) failed after 18h :(
With the first patch applied, the VSS daemon decides to quit, but a THAW
is missing after the FREEZE; there are the usual syscall timeouts
afterwards
kernel: sd 2:0:0:0: [storvsc] Add. Sense: Changed operating definition
kernel: sd 2:0:0:0: W
unfortunately still the same problem using the v4.8 tools
(on 4.4 and up it doesn't remount the filesystem read-only, it just hangs on any
write operation)
dmesg output:
[30626.788513] hv_utils: VSS: timeout waiting for daemon to reply
[30627.100164] hv_utils: VSS: Transaction not active
[30813.15
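For unattended test runs it can help to scan the kernel log for the failure signatures quoted in this thread instead of watching dmesg by hand. A minimal sketch, assuming the three message strings below (copied from the posts) are the ones worth flagging; the names `SIGNATURES` and `scan` are made up for this example:

```python
import re

# Message fragments copied from the log excerpts in this thread.
SIGNATURES = {
    "vss_timeout": re.compile(r"hv_utils: VSS: timeout waiting for daemon to reply"),
    "vss_inactive": re.compile(r"hv_utils: VSS: Transaction not active"),
    "device_offlined": re.compile(r"Device offlined - not ready after error recovery"),
}

# dmesg-style "[seconds.micros]" prefix; journal-style lines will not match.
TIMESTAMP = re.compile(r"^\[\s*(\d+\.\d+)\]")


def scan(lines):
    """Return (timestamp_or_None, signature_name) for every matching line."""
    hits = []
    for line in lines:
        match = TIMESTAMP.match(line)
        timestamp = float(match.group(1)) if match else None
        for name, pattern in SIGNATURES.items():
            if pattern.search(line):
                hits.append((timestamp, name))
    return hits
```

Feeding it the output of `dmesg` line by line gives a list of hits with their uptime timestamps, which makes it easier to compare when a run went bad.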
488347f seems stable after 25h
4.8-rc1 crashed/hung after ~7h, but I didn't have the 4.8 cloud tools (using the
4.4 ones),
@jsalisbury: maybe you could build those?
--
@jsalisbury: Yes, as far as I can trust my results
Currently running 488347f (which is reasonably close to 71425a9)
If the current one doesn't fail I would test d215f91 next, then 71425a9 and the
4.8-rc1 mainline
--
Currently re-running a few test kernels.
Current results:
3.13.0-35-generic #62-Commitd215f91Reverted: BAD
3.13.0-34-generic #61 @4c48c359b: GOOD
3.13.0-34-generic #61 @95d1181: GOOD
--
Update (no real results yet)
I tried to improve on the test cycle by stopping the hyper-v backup immediately
after it has begun, then waiting until the delta disks have been merged back
(rinse and repeat)
It took some time to get stable but it seems to have a 6-7x speedup compared to
the origina
Unfortunately, yes. I did not find any evidence of other Hyper-V/Host related
problems.
I will try to repro the crash at least once to be sure.
--
Unfortunately it crashed after about 80 hours.
I am currently running 4c48c35 from the original bisect (95h+)
But things seem increasingly random at this point.
I tweaked the IO load on the host and guest machine and it seems the crashes
are now reproducible a bit faster, but I think we might need
I meant crashed after 12 hours *g*
--
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1470250
Title:
[Hyper-V] Ubuntu 14.04.2 LTS Generation 2 SCSI Errors on VSS Based
Backups
Status in l
3.14rc crashed after 3.14-rc1
However isn't the crash to be expected in all the mainline kernels >=3.14 since
d215f91 has not been reverted in them?
--
Started run on 3.14-rc1
--
Both crashed, 3.15rc1 after about 20h, 3.14 only after 66h
I am starting to wonder if it might be a good idea to run the good(?)
3.13.0-86(+revert) kernel for a week or two to make sure it's actually good.
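To put a rough number on that worry: given the hours-to-failure already reported, one can ask what share of those failures a run of a given length would have caught. A back-of-the-envelope sketch only; `fraction_caught` is a made-up helper, the `observed` list is a hand-picked sample of failure times quoted in posts in this thread (not a complete record), and runs on different kernels are lumped together, which is a big simplification.

```python
def fraction_caught(failure_hours, cutoff_hours):
    """Share of observed failures that happened within `cutoff_hours`."""
    if not failure_hours:
        raise ValueError("need at least one observed failure")
    caught = sum(1 for hours in failure_hours if hours <= cutoff_hours)
    return caught / len(failure_hours)


# Hours until failure as quoted in various posts in this thread
# (sample only, mixed kernel versions).
observed = [2, 10, 12, 16, 20, 21, 23, 30, 66]

for cutoff in (24, 48, 72):
    print(cutoff, "h run would have caught",
          round(100 * fraction_caught(observed, cutoff)), "% of these failures")
```

With this sample a 24h run misses two of the nine failures and a 48h run still misses the 66h one, which is the argument for letting a supposedly good kernel run much longer before trusting it.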
--
Crashed after 30 and 12 hours.
--
Started run on 3.15, but there were no cloud tools in your build, so I
used the linux-cloud-tools from 3.16.0-76 (just copied the hv_* daemons
over)
--
And it crashed again after 10 hours..
Another thing: I looked at the timing of the (most recent) crashes a bit
and it seems like they always happen after the backup has completed,
when it's merging back the backup checkpoint disk (which can be quite
large under heavy I/O)
--
Unfortunately, it failed after a few hours. Trying to repro the crash a
second time.
--
.. and it failed :(
--
@jsalisbury:
Your newest build is now at ~35 hours without any issues; will keep it running
over the weekend
Could you maybe post the commits you reverted?
--
Unfortunately, it failed after 21 hours
BTW we are at 240TB written on our test server :P
--
@jsalisbury: started testing 3.16 "double revert"
--
The second run of the 3.19 vivid kernel crashed after 16 hours
The Utopic kernel (3.16) crashed after 23 hours (first run), restarting
--
The 3.19.0-61-generic you posted failed after 2 hours (currently re-running)
Also, isn't this 3.19 a vivid kernel?
sd 2:0:0:0: Device offlined - not ready after error recovery
sd 2:0:0:0: [sda] FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK
sd 2:0:0:0: [sda] CDB:
Write(10): 2a 00 00 a1 cc 00
@jsalisbury
Yes I can confirm that. Both 4.X kernels were run twice to make sure the crash
is reproducible, and the 3.13 which seems stable ran for a long time.
--
4.2.0-36-89fb4cdReverted crashed after 10 hours, same errors
--
No luck with the Xenial kernel (4.4.0-22), I could repro the crash 2
times (after a couple of hours each). Testing the Wily kernel next.
Here is the relevant part of the logs (both crashes produced near
identical logs):
Jun 07 20:19:44 muchcrash02 kernel: sd 2:0:0:0: Device offlined - not ready
d215f91-reverted stable for 90 hours
--
a1dd8c87 failed after 70(!) hours (VSS backups started to fail after 17
hours or so, but the file system was remounted read-only after 70 hours
total)
Another observation, something which I also noticed in previous "bad" runs:
Also almost instantly after starting backups, there were I/O errors
@Dino
Yes we could also reproduce the issue on Jessie (3.16) and I've also seen it in
testing/unstable, too
@jsalisbury
Assuming the bisect is correct this time, d215f91 seems the only likely suspect
Which kernel version did you test with this commit reverted?
Maybe some of the later merges rein
488347f3f is good after 48 hours
--
bb3becb good after 86 hours
I stopped the test run for now
--
bb3becb good after 38 hours
--
95d1181 good after 38 hours
--
95d1181 is still good after 25 hours, will keep it running for another
10 or so
--
4c48c359b is still good after 27 hours, starting on 95d1181
If I hit the error I will re-run it immediately to make sure it's bad
--
@jsalisbury:
As an update to my latest post:
I re-ran a few even older builds on the weekend:
adbb4e646 - good after 40 hours (bad before, might have been another victim of
the disk filling up, I think I might have screwed up this run too)
da1674843 - bad after 8 hours (consistent with previous tes
586fbce still running after 26 hours, I will keep it running over the
weekend
As far as I can tell, the previous 586fbce run might have been affected
by the same issue as the other two versions (5e6cf71/8321521). I do have
more confidence in the newer runs, but it would be good if multiple
people
8321521 is good after 40 hours, moving to 586fbce
--
5e6cf71 still good after 26 hours, switching to 8321521 next, if that
turns out good it might be a good idea to re-test 586fbce as well or we
can continue the bisect
@jsalisbury
were you able to reproduce the crash on the kernels with 1c8349a17 reverted?
@benjamin-ihrig
on one of our production ser
@jsalisbury, I moved back to testing only a single machine at a given time,
currently 5e6cf71 is running for ~6 hours, 83215219 is up next
we had 5e6cf71 and 83215219 running at the same time without any issues
for 24 hours *but* the problem seems to be easier to reproduce with only
one machine r
From 4.2.0-35-generic (lp1470250Commit1c8349a17Reverted), crashed after less
than 2 hours:
[ 7016.076017] sd 2:0:0:0: [storvsc] Sense Key : Unit Attention [current]
[ 7016.076062] sd 2:0:0:0: [storvsc] Add. Sense: Changed operating definition
[ 7016.076262] sd 2:0:0:0: Warning! Received an indic
I stopped running 3.13.0-34.60 and dfbdac2e after nearly 90 hours and 500
backups with no issues
Started re-running 5e6cf71 and 83215219
If 83215219 is bad, I will run the kernel with cd4842f4 reverted
--
both 3.13.0-34.60 and dfbdac2e running for 48 hours, no issues, both now have
gone through 260 backup cycles
I will keep them running for now
--
@jsalisbury
We have 3.13.0-34.60 already running for about 22 hours straight, no problems
yet, as well as dfbdac2e, which also runs fine for now.
I'll just keep it running for a few days
Also, unfortunately, our result for 5e6cf71 might be invalid because the test
machine ran out of disk space on
I did not explicitly test for that, but on our production server the issue went
away completely
But we can definitely try that, 48 hours of I/O torture should rule out any
non-VSS related issues
--
5e6cf71 crashed after 12 hours
until the next build is available I will let it churn on 3.13.0-34.60 just to
make sure it's stable
there are only 26 commits in 3402ec8..5e6cf71
--
8321521963a dead after 19 hours
--
We observed that as well.
The issue can occur just by creating a volume shadow copy of the volume the
Hyper-V disk is stored on (with the Hyper-V VSS writer)
Started running build 83215219 in the meantime.
I also thought about experimenting with creating shadow copies (volatile, with
writers) d
586fbce failed for me after 28 hours
It would be nice if we could have packages for maybe 2 further versions in the
bisect (the current one + good/fail one), so we can run new builds back to back.
--
Oops I actually meant adbb4e646 .. The one provided in post #181 as a
download
--
We have been running the da1674843 test kernel on another hyper-v server (an
older test machine), as well as 4.4.0_21 for comparison; the test kernel
failed after 14h, the 4.4.0_21 after 9h.
This takes much longer than on the original server (which was a 2-CPU 20-Core
256GB RAM machine), but we
For the backup stress test I really just used:
:start
wbadmin start backup -quiet -backupTarget:\\myserver\dummyshare -hyperv:"MYTESTVM"
goto start
In our case the VM server did not run anything else and the Ubuntu guest
was a minimal install, so the loop took only a couple of minutes, an
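If you want to know how many backup cycles a run survived and how long each took, the same loop could be driven from a small script on the host instead of a batch file. Just a sketch under the assumption that Python is available on the Hyper-V host: `run_cycles` is a made-up helper, the command is the wbadmin call from the batch loop above, and the quotes around MYTESTVM are dropped because no shell is involved.

```python
import subprocess
import time

# Same wbadmin invocation as in the batch loop, as an argument list.
COMMAND = [
    "wbadmin", "start", "backup", "-quiet",
    r"-backupTarget:\\myserver\dummyshare",
    "-hyperv:MYTESTVM",
]


def run_cycles(command, max_cycles, run=subprocess.run, clock=time.monotonic):
    """Run `command` up to `max_cycles` times; stop at the first failure.

    Returns the duration in seconds of every cycle that was started,
    including the failing one. `run` and `clock` are injectable for testing.
    """
    durations = []
    for _ in range(max_cycles):
        start = clock()
        result = run(command)
        durations.append(clock() - start)
        if result.returncode != 0:
            break
    return durations
```

`len(run_cycles(COMMAND, 1000))` then gives the number of backup cycles before wbadmin first reported an error. Note that this only sees failures on the host side; the guest can already be in trouble while wbadmin still returns success.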
I managed to break a test VM in 10-15min with minimal Ubuntu installs
(wily/xenial) by spamming wbadmin calls ( backing up only this single VM) in a
loop.. using PowerShell to create and delete snapshots in a loop seems to have
the same effect
However, after some time Hyper-V (and VSS) complaine