On Thu, May 17, 2007 at 07:49:31PM +0200, Tejun Heo wrote:
> Maneesh Soni wrote:
> > On Thu, May 17, 2007 at 05:04:23AM -0700, Greg KH wrote:
> >> On Wed, May 16, 2007 at 08:31:00PM +0200, Tejun Heo wrote:
> >>> sd->s_dentry updates made by dentry/inode reclamation are racy and can
> >>> lead to BUG() or oops. This is already fixed in -mm and the fix is
> >>> scheduled to be merged into upstream for 2.6.23 but the fix
> >>> reimplements sysfs dentry dropping and is too risky for -stable
> >>> kernels.
> >>>
> >
> > But was the synchronization fix tested by people facing the race? I still
> > don't understand the racy code path. The last google problem I saw had
> > s_dentry field as NULL.
>
> Please take a look at the following message.
>
>   http://article.gmane.org/gmane.linux.kernel/521729
>
> I could reproduce both races on my test machine fairly reliable with
> parallel find, cat, mount/mount while repeatedly ins/rmmoding a libata
> driver.
>

Thanks for the pointer.. earlier it got buried in the fat rework..

> >>> This is an interim solution for -stable kernels. sysfs reclamation is
> >>> disabled by default and can be enabled by using sysfs.enable_reclaim
> >>> kernel parameter. Note that dentries are still created on demand, so
> >>> attribute and symlinks nodes aren't allocated on creation. They're
> >>> allocated on first lookup and deallocated when the sysfs node is
> >>> removed.
> >> Ick, this is going to kill memory on big boxes (s390 and others) and I
> >> don't really want to apply this it if at all possible.
> >>
> > At least not make it default. This might create boot issues with these
> > boxes.
>
> Which makes oopsing the default. Fun! :-)
>

but.. avoid oops by not booting at all is more fun !! ;-)

> >> Maneesh, any other thoughts?
> >>
> > I actually wanted to investigate this oops but left it considering the
> > rework being done by Tejun. If this still make sense we can have some
> > more debug code stuffed there or get a crashdump (kdump) to get better
> > understanding of the race.
>
> The above message contains analysis of both races. I just ported the
> fixes. I have a different test machine now and can't reproduce the
> races with this one yet so I couldn't verify whether the patches
> actually fix the problem. I'll post the patches anyway. If anyone can
> reproduce these races, please verify the posted patches fix the problem.
>

I would prefer fixing the race instead of making attributes non-reclaimable.

Thanks
Maneesh

--
Maneesh Soni
Linux Technology Center,
IBM India Systems and Technology Lab,
Bangalore, India