Hi,
I am really sorry it took so long, but I was constantly preempted by other stuff. I hope I have good news for you, though. Johannes has found a nice way to overcome the deadlock issues from memcg OOM, which might help you. Would you be willing to test with his patch (http://permalink.gmane.org/gmane.linux.kernel.mm/101437)? Unlike my patch, which handles just the i_mutex case, his patch handles all possible locks.
I can backport the patch for your kernel (are you still using the 3.2 kernel, or have you moved to a newer one?).

On Fri 22-02-13 09:23:32, azurIt wrote:
> >Unfortunately I am not able to reproduce this behavior even if I try
> >to hammer OOM like mad so I am afraid I cannot help you much without
> >further debugging patches.
> >I do realize that experimenting in your environment is a problem but I
> >do not have many options left. Please do not use strace and rather collect
> >/proc/pid/stack instead. It would be also helpful to get group/tasks
> >file to have a full list of tasks in the group
> >
>
> Hi Michal,
>
> sorry that I didn't respond for a while. Today I installed a kernel with your
> two patches and I'm running it now. I'm still having problems with OOM, which
> is not able to handle low memory and is not killing processes. Here is some
> info:
>
> - data from cgroup 1258 while it was under OOM and no processes were killed
>   (so OOM didn't stop and the cgroup was frozen)
>   http://watchdog.sk/lkml/memcg-bug-6.tar.gz
>
> I noticed the problem at about 8:39 and waited until 8:57 (nothing happened). Then
> I killed process 19864, which seemed to help: other processes probably ended
> and the cgroup started to work. But the problem occurred again about 20 seconds
> later, so I killed all processes at 8:58. The problem has been occurring all the
> time since then. All processes (in that cgroup) are always in state 'D' when
> it occurs.
>
> - kernel log from boot until now
>   http://watchdog.sk/lkml/kern3.gz
>
> Btw, something probably also happened at about 3:09 but I wasn't able to
> gather any data because my 'load check script' killed all apache processes
> (load was more than 100).
>
> azur

-- 
Michal Hocko
SUSE Labs
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
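
The quoted request above asks for /proc/pid/stack of every task listed in the group's tasks file. A minimal sketch of how that collection could be scripted is below. The function names and the `proc_root` parameter are hypothetical and exist only for this illustration. The location of the tasks file depends on where the memory controller is mounted, so it is passed in rather than assumed, and reading /proc/<pid>/stack usually needs root.

```python
import os


def read_task_ids(tasks_path):
    """Return the PIDs listed in a cgroup 'tasks' file (one PID per line)."""
    with open(tasks_path) as f:
        return [int(line) for line in f if line.strip()]


def collect_stacks(tasks_path, proc_root="/proc"):
    """Map each PID in the group to the text of <proc_root>/<pid>/stack.

    proc_root is a hypothetical parameter so the function can be pointed
    at a copy of /proc; on a live system the default is what you want.
    """
    stacks = {}
    for pid in read_task_ids(tasks_path):
        stack_path = os.path.join(proc_root, str(pid), "stack")
        try:
            with open(stack_path) as f:
                stacks[pid] = f.read()
        except OSError as err:
            # The task may have exited in the meantime, or we lack permission.
            stacks[pid] = "<unreadable: %s>" % err.strerror
    return stacks


def format_report(stacks):
    """Render the collected stacks as one text block, sorted by PID."""
    parts = []
    for pid in sorted(stacks):
        parts.append("=== pid %d ===\n%s" % (pid, stacks[pid].rstrip("\n")))
    return "\n".join(parts)
```

Calling `print(format_report(collect_stacks(path)))`, with `path` set to the group's tasks file, while the group is stuck would give one snapshot. Taking a few snapshots some seconds apart shows whether the tasks in 'D' state are sitting in the same place each time.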