[Bug 929941] Re: Kernel deadlock in scheduler on m2.{2, 4}xlarge EC2 instance

2012-02-29 Thread Stefan Bader
Yes, this value is used right now. The question was whether this could be moved by now (depending on the AWS rollout status). But anyway, I changed the patch to activate ticket spinlocks even when compiled for 3.0.2 or higher. Which would be the same situation we have right now, just

[Bug 929941] Re: Kernel deadlock in scheduler on m2.{2, 4}xlarge EC2 instance

2012-02-24 Thread Matt Wilson
The required CONFIG_XEN_COMPAT value for ec2 is documented here: http://docs.amazonwebservices.com/AWSEC2/latest/UserGuide/AdvancedUsers.html -- You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to the bug report. https://bugs.launchpad.net/bugs/9299

[Bug 929941] Re: Kernel deadlock in scheduler on m2.{2, 4}xlarge EC2 instance

2012-02-23 Thread Stefan Bader
Now added v2 builds which include the newer spinlock code (also pulling in some other changes to allow it to compile) and change XEN_COMPAT to 3.2 and later. Question would be whether it is a valid assumption that there won't be a Xen version older than 3.2 on EC2.

[Bug 929941] Re: Kernel deadlock in scheduler on m2.{2, 4}xlarge EC2 instance

2012-02-21 Thread Stefan Bader
Started to look into backporting the spinlock changes from the newer patchset. Without changing the XEN_COMPAT this would result in a non-ticket lock implementation (as mentioned before). Not sure how this behaves, but maybe you want to try. I uploaded kernel packages in that state to http://peopl

[Bug 929941] Re: Kernel deadlock in scheduler on m2.{2, 4}xlarge EC2 instance

2012-02-17 Thread Stefan Bader
Oops, sorry about that. The push there did not really indicate that the repo went into such an utter state of disaster. :( It is fixed up now.

[Bug 929941] Re: Kernel deadlock in scheduler on m2.{2, 4}xlarge EC2 instance

2012-02-16 Thread Matt Wilson
$ git clone git://kernel.ubuntu.com/smb/ubuntu-lucid.git
Cloning into ubuntu-lucid...
remote: error: Could not read b43f7c4d8d293aa9f47a7094852ebd5355e4f38f
remote: fatal: Failed to traverse parents of commit 3becab1d2df01d54a4e889cf2d69ccb902cd43c3
remote: aborting due to possible repository corr

[Bug 929941] Re: Kernel deadlock in scheduler on m2.{2, 4}xlarge EC2 instance

2012-02-16 Thread Stefan Bader
Oh, completely forgot to say: the comment I was talking about shows up in ec2-next in arch/x86/include/mach-xen/asm/spinlock_types.h.

[Bug 929941] Re: Kernel deadlock in scheduler on m2.{2, 4}xlarge EC2 instance

2012-02-16 Thread Stefan Bader
Matt, which commit is a bit complicated to say. Basically yes, the code is a merge between the 2.6.32 kernel code we have for 10.04 and the Xen patches SUSE had at that point in time. The "new" tree I am talking about was an effort to pick the patches from a newer release and try to work out what is mis

[Bug 929941] Re: Kernel deadlock in scheduler on m2.{2, 4}xlarge EC2 instance

2012-02-15 Thread Matt Wilson
Stefan, Which commit has the race condition comment? I'm aware of a problem with SUSE's kernel with regard to PV ticketlocks and HYPERVISOR_poll(), but I don't see any mention in upstream 3.2.x or XenLinux 2.6.18. Your 10.04 2.6.32-era kernel doesn't have ticketlocks, so the underlying hypervisor
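For readers unfamiliar with the term used throughout this thread: a ticket lock hands out numbered tickets and serves waiters strictly in arrival order. The following is a minimal user-space sketch in C11, not the code from the Xen, SUSE, or Ubuntu kernel trees; the names `ticket_lock_t`, `ticket_lock`, `ticket_unlock`, and `ticket_is_locked` are hypothetical and chosen only for illustration.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Illustrative only: hypothetical names, not taken from any kernel tree. */
typedef struct {
    atomic_uint next;   /* next ticket number to hand out */
    atomic_uint owner;  /* ticket number currently allowed to hold the lock */
} ticket_lock_t;

static void ticket_lock_init(ticket_lock_t *l)
{
    atomic_init(&l->next, 0);
    atomic_init(&l->owner, 0);
}

static void ticket_lock(ticket_lock_t *l)
{
    /* Take a ticket; waiters are served in the order tickets were taken. */
    unsigned int me = atomic_fetch_add_explicit(&l->next, 1,
                                                memory_order_relaxed);

    /* Spin until our ticket comes up. A paravirtualized guest might
     * block in the hypervisor here instead of spinning indefinitely. */
    while (atomic_load_explicit(&l->owner, memory_order_acquire) != me)
        ;
}

static void ticket_unlock(ticket_lock_t *l)
{
    /* Pass the lock to the holder of the next ticket. */
    atomic_fetch_add_explicit(&l->owner, 1, memory_order_release);
}

static bool ticket_is_locked(ticket_lock_t *l)
{
    /* Locked whenever a ticket has been issued but not yet released. */
    return atomic_load(&l->next) != atomic_load(&l->owner);
}
```

Because the lock is passed strictly in ticket order, a waiter whose virtual CPU is not currently scheduled stalls everyone queued behind it, which is why the interaction between ticket locks and the hypervisor matters in this discussion.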

[Bug 929941] Re: Kernel deadlock in scheduler on m2.{2, 4}xlarge EC2 instance

2012-02-15 Thread Stefan Bader
This gives me some headaches. So, I tried to figure out what would make sense to pick from the newer code related to spinlocks. The current code (our ec2 topic branch) seems at least to have a potentially dangerous place in xen_spin_kick. There it only checks whether any other cpu spins on the same

[Bug 929941] Re: Kernel deadlock in scheduler on m2.{2, 4}xlarge EC2 instance

2012-02-14 Thread Stefan Bader
Looking at the xen code used for the ec2 guest kernels, this is not overloading the generic spinlock struct with xen data. So at least that cannot overflow. That said, the whole xen spinlock code there is a snapshot from quite a while ago. And I had been working on importing a number of changes

[Bug 929941] Re: Kernel deadlock in scheduler on m2.{2, 4}xlarge EC2 instance

2012-02-14 Thread Stefan Bader
The most interesting part of the dmesg for me is that it gives a rough idea of what Xen version the host has. And it usually helps to check whether this correlates across all the cases where the hang happens. It looks like some interaction problem but the only code I can look at is the guest. Recent

[Bug 929941] Re: Kernel deadlock in scheduler on m2.{2, 4}xlarge EC2 instance

2012-02-13 Thread Matt Wilson
** Attachment added: "/proc/interrupts as an attachment" https://bugs.launchpad.net/ubuntu/+source/linux-ec2/+bug/929941/+attachment/2736482/+files/proc-interrupts.txt