Am 16.04.19 um 14:18 schrieb Daniel Vetter: > On Tue, Apr 16, 2019 at 11:06:54AM +0000, Koenig, Christian wrote: >> Am 16.04.19 um 12:54 schrieb Karol Herbst: >>> On Tue, Apr 16, 2019 at 11:12 AM Koenig, Christian >>> <christian.koe...@amd.com> wrote: >>>> Am 16.04.19 um 11:10 schrieb Karol Herbst: >>>>> On Tue, Apr 16, 2019 at 8:38 AM Christian König >>>>> <ckoenig.leichtzumer...@gmail.com> wrote: >>>>>> Am 16.04.19 um 02:35 schrieb Karol Herbst: >>>>>>> Kobjects are supposed to be dynamically allocated, but with recent >>>>>>> changes >>>>>>> this rule was violated. Reverting those commits fixes crashes when a drm >>>>>>> driver using TTM gets loaded again. >>>>>>> >>>>>>> The object in question is "ttm_mem_glob" declared inside >>>>>>> "include/drm/ttm/ttm_memory.h" and instatiated inside >>>>>>> "drivers/gpu/drm/ttm/ttm_memory.c". >>>>>>> >>>>>>> from "Documentation/kobject.txt": >>>>>>> "Because kobjects are dynamic, they must not be declared statically or >>>>>>> on >>>>>>> the stack, but instead, always allocated dynamically. Future versions >>>>>>> of >>>>>>> the kernel will contain a run-time check for kobjects that are created >>>>>>> statically and will warn the developer of this improper usage." >>>>>>> >>>>>>> Unloading ttm before reloading the driver workarounds that crash, >>>>>>> because >>>>>>> the memory backing the kobject member "kobj" is cleaned up. The >>>>>>> kobject_del >>>>>>> and kobject_put function never free or clean up the kobject object >>>>>>> leaving >>>>>>> it in an undefined state. >>>>>>> >>>>>>> I reverted a few more commits to make it less painful for me to rever >>>>>>> this >>>>>>> rather big change. >>>>>> Well, NAK. By reverting those change you also re-introduced the problems >>>>>> we originally fixed with those patches. >>>>>> >>>>>> Please work on a proper fix instead, >>>>>> Christian. >>>>> And which problem was that besides duplicated code (or maybe even a >>>>> bit of memory consumption if multiple ttm driver were used)? If I had >>>>> to choose between duplicated code and a crash, I choose the former. >>>>> >>>>> Maybe I missed the real reason why those changes are made, but the >>>>> commit messages don't really seem to tell me. >>>> The old implementation crashed because different drivers tried to >>>> allocate the same kobj. >>>> >>>> Crashing in one way is not better than crashing in another way. >>>> >>>> Christian. >>>> >>> how can that old crash be triggered? By loading two ttm based drivers >>> at the same time or by other means? >> Yes, exactly. Even worse it could be triggered by one driver >> instantiating multiple times at the same time, e.g two AMD GPUs in the >> same system. >> >> On the other hand I completely agree that using kobj static is >> completely nuts. The problem is that using a kobj was the wrong approach >> in the first place. >> >> In other words when you have something which controls a global behavior >> of a module, what do you use? Right, a module parameter. >> >> Point is that we can't get away from those kobj without breaking the >> uapi, so that is something which can't be done easily. > Randome idea: Push the kobj setup into drm.ko (and shrugg it off as > another lesson in how maybe uapi shouldn't have been designed, but hey not > our worst mistake by far). I think that'd be totally ok.
Yeah, thought about that as well. But I would rather re-design this from the scratch instead of just moving it over. And yes I agree with a bit of luck that UAPI is actually not used at all, so we could remove it sooner or later. Regards, Christian. > -Daniel _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel