On 12/21/2014 03:06 PM, Oded Gabbay wrote: > > > On 12/21/2014 02:19 PM, Christian König wrote: >> Am 21.12.2014 um 12:34 schrieb Oded Gabbay: >>> >>> >>> On 12/21/2014 01:27 PM, Christian König wrote: >>>> Am 20.12.2014 um 21:46 schrieb Oded Gabbay: >>>>> When amdkfd and radeon are compiled inside the kernel image (not as >>>>> modules), >>>>> radeon will load before amdkfd and will set *kfd2kgd to its interface >>>>> structure. Therefore, we must not set *kfd2kgd to NULL when amdkfd is >>>>> loaded >>>>> because it will override radeon's initialization and cause kernel BUG. >>>>> >>>>> Signed-off-by: Oded Gabbay <oded.gabbay at amd.com> >>>> >>>> You should probably rather fix the dependency between the two modules to >>>> get an >>>> determined load order instead of doing such nasty workarounds. >>>> >>>> Christian. >>> >>> The problem is that when modules are compiled inside the kernel, there is NO >>> determined load order and there is no mechanism to enforce that. If there >>> is/was such a mechanism, I would of course prefer to use it. >> >> There should be an determined order based on the symbol use, otherwise >> initializing most of the kernel modules wouldn't work as expected. For >> example >> radeon depends on the drm module must be loaded before radeon is loaded. > There should be, but when the modules are compiled in, they are loaded based > on > link order only, if they are in the same group, and the groups are loaded by a > pre-defined order. > The groups are: pure, core, postcore, arch, subsys, fs, device (which > represents > all the modules) and late. See init.h > > So radeon, amdkfd and amd_iommu_v2 are all in device group, and in the group > they are ordered by their link order. > > Yes, radeon loads after drm, because drm*.o are before radeon*.o in the > Makefile. See > http://stackoverflow.com/questions/5669647/linux-order-of-statically-linked-module-loading >
So I tried moving amdkfd inside the Makefile before radeon, and that made amdkfd load before radeon did. This solves my first problem - order between amdkfd and radeon. However, amd_iommu_v2 doesn't belong to the drm Makefile, and I don't want to move iommu before gpu, so I don't have a solution for the order between amdkfd and amd_iommu_v2. Oded > > >> >>> >>> Actually, I don't understand why the kernel doesn't enforce the order >>> according to the use of exported symbols, like it does with modules. >> >> Yeah, that's indeed rather strange. There must be something in the amdkfd >> code >> which broke that somehow. > IMO, that's a far-fetched guess. Could you point to something more specific ? > >> >> As far as I understand you the desired init order is radeon and amd_iommu_v2 >> first and then amdkfd, right? > Actually no. The preferred order is amd_iommu_v2, amdkfd and radeon last. This > is the order that happens when all three are built as modules. More > accurately, > radeon inits, but its init triggers amdkfd init, which triggers amd_iommu_v2 > init. So before radeon reaches its probe stage, all the modules were > initialized. > > So what happens when you boot with radeon, >> amd_iommu_v2 and amdkfd blacklisted for automatically load and only load >> amdkfd >> manually? > As said above, that's ok. >> >>> There will always be dependencies between kgd (radeon) and amdkfd and >>> between >>> amdkfd and amd_iommu_v2. I don't think I can eliminate those dependencies, >>> not >>> without a very complex solution. And the fact that this complex solution >>> occurs only in a very specific use case (all modules compiled in), makes me >>> less inclined to do that. >>> >>> So I don't see it as a "nasty workaround". I would call it just a >>> "workaround" >>> for a specific use case, which should be solved by a generic solution to the >>> kernel enforcing load orders. >> >> The normal kernel module handling already should provide the correct init >> order, >> so I would still call this a rather nasty workaround because we couldn't find >> the underlying problem. > Well, the normal kernel module handling doesn't work when all modules are > compiled in. I'm not a huge expert on this issue so I had Joerg Roedel help me > analyze this (thanks Joerg) and he agreed that there is no enforcement of > order > in this case. > >> >> Christian. >> >>> >>> Oded >>>> >>>>> --- >>>>> drivers/gpu/drm/amd/amdkfd/kfd_module.c | 5 ++--- >>>>> 1 file changed, 2 insertions(+), 3 deletions(-) >>>>> >>>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_module.c >>>>> b/drivers/gpu/drm/amd/amdkfd/kfd_module.c >>>>> index 95d5af1..236562f 100644 >>>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_module.c >>>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_module.c >>>>> @@ -34,7 +34,7 @@ >>>>> #define KFD_DRIVER_MINOR 7 >>>>> #define KFD_DRIVER_PATCHLEVEL 0 >>>>> -const struct kfd2kgd_calls *kfd2kgd; >>>>> +const struct kfd2kgd_calls *kfd2kgd = NULL; >>>>> static const struct kgd2kfd_calls kgd2kfd = { >>>>> .exit = kgd2kfd_exit, >>>>> .probe = kgd2kfd_probe, >>>>> @@ -84,14 +84,13 @@ EXPORT_SYMBOL(kgd2kfd_init); >>>>> void kgd2kfd_exit(void) >>>>> { >>>>> + kfd2kgd = NULL; >>>>> } >>>>> static int __init kfd_module_init(void) >>>>> { >>>>> int err; >>>>> - kfd2kgd = NULL; >>>>> - >>>>> /* Verify module parameters */ >>>>> if ((sched_policy < KFD_SCHED_POLICY_HWS) || >>>>> (sched_policy > KFD_SCHED_POLICY_NO_HWS)) { >>>> >> > _______________________________________________ > dri-devel mailing list > dri-devel at lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/dri-devel