On Fri, 26 Oct 2012, Thomas Hellstrom wrote:
> Hi, > > On 10/25/2012 11:27 PM, Ilija Hadzic wrote: >> >> Can you give the attached patch a whirl and let me know if it fixes the >> problem? >> >> As I indicated in my previous note, vmwgfx should be the only affected >> driver because it looks at dev_mapping in the open hook (others do it when >> they create an object, so they are not affected). >> >> I'll probably revise it (and I'll have some general questions about >> drm_open syscall) before officially send the patch, but I wanted to get >> something quickly to you to check if it fixes your problem. I hope that >> your vmwgfx test environment is such that you can reproduce the original >> problem. >> >> thanks, >> >> -- Ilija > > Yes, it appears like this patch fixes the problem. It'd be good to have it in > 3.7 (drm-fixes) with a cc to stable. > OK great. Thanks for testing. Before I send out an "official" patch, I have a few questions for those who have been around longer and can possibly reflect better than me on the history of drm_open syscall. Currently, before touching dev->dev_mapping field we grab dev->struct mutex. This has been introduced by Dave Airlie a long time ago in a2c0a97b784f837300f7b0869c82ab712c600952. I tried to preserve that in all patches where I touched dev_open, but looking at the code I don't think the mutex is necessary. Namely, drm_open is only set in drm_open, and concurrent openers are protected with drm_global_mutex. Other places (drivers) where dev->dev_mapping is accessed is read-only and dev_mapping is written at first open when there are no file descriptors around to issue any other call. Also, it doesn't look to me that any driver locks dev->struct_mutex before accessing dev->dev_mapping anyway. So I am thinking of dropping the mutex completely, but I would like to hear a second thought. The other issue, I noticed is that of the drm_setup() call fails, the open_count counter would remain incremented and I think we need to restore it back (or if I am missing something, would someone please enlighten me). This was also in the kernel all this time (and I have not noticed until now), so I "smuggled" that fix in the patch that I sent you. However, wonder if I should cut the separate patch for open_count fix. Actually, I think that I should cut three patches: one to drop the mutex, one to fix the open_count and one to fix your problem with dev_mapping and that probably all three should CC stable. Before I do that, I'd like to hear opinions of others. thanks, Ilija