That's my understanding too, except in one of the scenarios I observed
100% SYS CPU for long stretches even when there was a significant amount
(~50GB) of the device unused.
However, if it was a soft lockup, it lasted for more than 8 hours, during which the machine was totally unresponsive to HTTP requests.
Repost of what I sent to the mailing list just now:
My current interpretation of this problem is that it is some pathological condition caused by not rebalancing and being nearly out of space for allocating more metadata, and hence it is rarely seen by anyone else (because most users are regularly rebalancing).
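For anyone wanting to check whether their filesystem is in a similar state, something like the following should show whether the metadata chunks are nearly full even though df still reports free space, and a filtered balance is the usual way to free up mostly-empty chunks (the mount point and the 5% threshold below are just examples; the LVM device is ours):

    # Show how much space is allocated to data vs. metadata, and how full each is
    sudo btrfs filesystem df /mnt/data
    sudo btrfs filesystem show /dev/mapper/vg-lv

    # Reclaim block groups that are less than 5% used, so metadata can be allocated again
    sudo btrfs balance start -dusage=5 -musage=5 /mnt/data

A full unfiltered balance would also work but rewrites everything and takes far longer.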
The production machine hasn't had a lockup since moving to
3.15.7-031507-generic (it's been up for 4 days) even though we could
reproduce the lockup on a new machine with that kernel using a snapshot
of the old volume.
Another twist is that on the production machine I'm now reliably seeing "No space left on device" errors.
smb: Yeah, the system the filesystem was created on was PV, and the device name was xvd*. It's now on HVM, also with xvd* device names.
The filesystem may have been originally created on an older version of
BTRFS from Ubuntu Saucy, which I suppose may not have detected the SSD?
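In case it is relevant, whether btrfs currently thinks it is on an SSD can be checked from the block layer's rotational flag and the mount options; roughly like this (device and mount names are just examples for this xvd*/LVM setup):

    # 0 means the kernel treats the device as non-rotational (SSD)
    cat /sys/block/xvdb/queue/rotational

    # btrfs adds "ssd" to its mount options when it auto-detects an SSD
    grep btrfs /proc/mounts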
btrfs was created with `mkfs.btrfs /dev/mapper/vg-lv`.
It isn't a hard requirement, except that it's a pain to migrate since that requires downtime to move the files, something I'd rather not do unless absolutely necessary. The machine freezes are inconvenient but represent only a few minutes of downtime.
(These observations are from otherwise unloaded test machines.)
On a dual-core machine, 100% system CPU usage with zero writes is seen on one core for 5-10 minutes, with the time spent in BTRFS threads.
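(The per-thread figures above can be seen with nothing more exotic than standard tools, e.g. something like this; the exact flags are incidental.)

    # Threads sorted by CPU usage; the btrfs worker threads show up at the top
    ps -eLo pid,tid,comm,pcpu --sort=-pcpu | head -20

    # Or watch it live, per thread
    top -H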
On a single thread machine 100% system CPU is used and I haven't yet
been able to cause it to hang entirely. I do observe almost
Hm, I'm not sure I can give a thorough description since I don't
understand enough about the exact workload myself. It is a fairly
arbitrary workload generated by our users.
In the end, it boils down to creating, reading and writing many (~20,000) SQLite files, ranging in size from 16 KB to 12 GB, across many folders.
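As a very rough sketch of a comparable synthetic workload, in case it helps anyone reproduce (paths, counts and sizes are made up, just to give the flavour of many SQLite databases being created and written across directories):

    # Requires the sqlite3 CLI; creates many small databases spread over
    # directories and inserts a 16KB blob into each, with some parallelism.
    for i in $(seq 1 1000); do
        dir="/mnt/data/load/$((i % 50))"
        mkdir -p "$dir"
        sqlite3 "$dir/db-$i.sqlite" \
            'CREATE TABLE IF NOT EXISTS t (k INTEGER, v BLOB);
             INSERT INTO t VALUES (1, randomblob(16384));' &
        # keep roughly 20 writers in flight
        if [ $((i % 20)) -eq 0 ]; then wait; fi
    done
    wait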
** Tags added: kernel-bug-exists-upstream
** Changed in: linux (Ubuntu)
Status: Incomplete => Confirmed
This gist contains a stack trace taken every 10 seconds with `echo l > /proc/sysrq-trigger` whilst the machine was spinning in the kernel but still responsive.
https://gist.github.com/pwaller/c7dd0f4807459acedcdf
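Roughly, the collection amounts to a loop like this (the iteration count and output file name are arbitrary; kernel.sysrq may need enabling first):

    # Dump all CPUs' backtraces into the kernel log every 10 seconds,
    # then save the accumulated output afterwards.
    sudo sysctl kernel.sysrq=1
    for i in $(seq 1 60); do
        echo l | sudo tee /proc/sysrq-trigger > /dev/null
        sleep 10
    done
    dmesg > sysrq-backtraces.txt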
The machine remained responsive for 5-10 minutes before becoming totally unresponsive.
Now reproduced on 3.16. I'm out of things to try for now.
I've got a way to rapidly reproduce the error now. I can do it reliably
with a turnaround time of 5-10 minutes.
I've reproduced the crash on the new kernel, so it has now been observed on both 3.13.0-32-generic and 3.15.7-031507-generic. I'll try 3.16 next.
I've also discovered a new stack trace.
I found an additional stack trace from a previous machine lockup.
[1093202.136107] INFO: task kworker/u30:1:31455 blocked for more than 120 seconds.
[1093202.141596]       Tainted: GF            3.13.0-30-generic #54-Ubuntu
[1093202.146201] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
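(If it recurs, the same blocked-task backtraces can be dumped on demand instead of waiting for the 120-second detector, assuming sysrq is enabled, e.g.:)

    # Dump stacks of all tasks stuck in uninterruptible (D) state
    sudo sysctl kernel.sysrq=1
    echo w | sudo tee /proc/sysrq-trigger > /dev/null
    dmesg | tail -n 200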
One thing I am unsure of: the bug did not manifest for at least 12 days of running originally, so I'm not sure it will be possible to reliably decide that it has been fixed by moving to a particular kernel. What is the standard here?
The crashes became more frequent: the approximate uptime before each crash was >12 days, then ~2 days, then 6 hours, then 1 hour.
I have since moved to 3.15.7-031507-generic.
One thing I have observed is that (on an EXT4 filesystem) /var/log/nginx/access.log contained ~2KB of NULL characters in place of any entries.
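(For what it's worth, the NULL padding is easy to confirm with something like the following; the path is just this machine's log file.)

    # -a treats the file as text despite the binary bytes, -P lets \x00 match NULs
    grep -caP '\x00' /var/log/nginx/access.log

    # Or count the NUL bytes directly
    tr -cd '\0' < /var/log/nginx/access.log | wc -c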
@brad-figg, apologies, I missed your response. Is there a way to generate the output without automatically uploading it? I would like to review it first. I tried `apport-cli --save`, but as far as I can tell it doesn't do anything unless there are existing crash files.
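Perhaps --save only takes effect in bug-filing mode; I'll try something along these lines next (flags as I understand them from the man page, so possibly wrong):

    # Collect the standard information for the linux package but write it
    # to a local file for review instead of uploading it.
    apport-cli -f -p linux --save /tmp/linux-bug.apport
    less /tmp/linux-bug.apport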
I've also started a thread on linux-btrfs:
http://thread.gmane.org/gmane.comp.file-systems.btrfs/37224
Public bug reported:
This has happened twice now.
I'm on an AWS EC2 m3.large instance with the official Ubuntu AMI ami-776d9700.
# cat /proc/version_signature
Ubuntu 3.13.0-32.57-generic 3.13.11.4
After running for many days, the machine locked up with the below messages appearing on the console.
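(For the record, on EC2 those console messages can be retrieved after the fact with the AWS CLI; the instance id below is a placeholder.)

    # Fetch the instance's console log, which includes the kernel messages
    aws ec2 get-console-output --instance-id i-0123abcd --output text > console.log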