On 6/9/19 2:56 AM, Mick wrote:
This sounds as if it may be related to a move from an older gcc to a newer version.

I'm not sure it's related to a gcc version:

# gcc-config -l
 [1] x86_64-pc-linux-gnu-6.4.0 *
 [2] x86_64-pc-linux-gnu-8.3.0

I think that gcc 8.3 might have been selected and I reverted to 6.4 thinking that it might have been part of the problem. I have since done an emerge -DuNeq @world with gcc 6.4 and the problem persists.
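To double-check which compiler actually built the running kernel, something like the following could be used. It is only a sketch: the helper name kernel_gcc_version is made up, and it assumes the older /proc/version layout that contains the words "gcc version" (newer kernels format this line differently).

```shell
# Hypothetical helper: read a /proc/version-style line on stdin and
# print the gcc version it mentions (prints nothing if none is found).
kernel_gcc_version() {
    sed -n 's/.*gcc version \([0-9][0-9.]*\).*/\1/p'
}

# On a live system the two values to compare would be:
#   kernel_gcc_version < /proc/version    # compiler that built the kernel
#   gcc -dumpversion                      # compiler currently selected
```

If the two differ, the modules and the kernel were built by different compilers, which is at least worth knowing before digging further.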

Checking my understanding:

1. The old modules, compiled with the old gcc and toolchain, worked fine.

Correct.

2. The new modules, compiled with the new gcc but old libtool, binutils and glibc worked (usually you update these or @system, before you update the whole world).

Correct.

3. The new modules, compiled with the new gcc and a toolchain rebuilt a second time, do not work (this would use libtool, binutils and glibc, now themselves compiled with the new gcc).

Correct.

4. All of the above happens with the old kernel, which was not rebuilt with the new toolchain.

Correct.

5. New kernel(s) compiled thereafter will not boot.

Correct.

You have not mentioned if you upgraded gcc.

I think that the first emerge -DuNeq @world did pull in a new gcc. But I have since selected gcc 6.4 as part of diagnostics. (See above.)

The error you get about modules failing to load sounds like a path/symlink error, or a linux headers error, or a change of arch.

I don't think it's a symlink error. (I've configured things to not automatically update the symlink.)

# ls -la /usr/src/linux
lrwxrwxrwx 1 root root 22 Sep 8 2018 /usr/src/linux -> linux-4.9.76-gentoo-r1/
# uname -a
Linux REDACTED 4.9.76-gentoo-r1 #1 SMP Thu Nov 15 22:23:44 MST 2018 x86_64 Intel(R) Core(TM) i7-2600 CPU @ 3.40GHz GenuineIntel GNU/Linux
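The same comparison can be scripted rather than eyeballed. This is a minimal sketch; the function name src_matches_release is hypothetical, and it assumes the usual Gentoo naming where the source directory is "linux-" followed by the kernel release.

```shell
# Hypothetical helper: succeed if a /usr/src/linux symlink target ($1)
# names the kernel release given in $2.
src_matches_release() {
    target=${1%/}          # drop a trailing slash, as in "linux-4.9.76-gentoo-r1/"
    target=${target##*/}   # keep only the last path component
    [ "${target#linux-}" = "$2" ]
}

# On a live system:
#   src_matches_release "$(readlink /usr/src/linux)" "$(uname -r)" && echo match
```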

As you can see, the machine has enough CPU that I can let it do the following to make sure that things are consistent. (At least I think that's what's happening.)

emerge -DuNeq @world && emerge --depclean && revdep-rebuild

That's my SOP. If that fails I usually try a --resume to see if the problem repeats, and if it's at the same place. If that fails for some reason, I'll fall back to a @system. Usually the failure is caused by something that I've done, disk space, ZFS version issues, etc.
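Written out as a script, that fallback order looks roughly like this. It is a sketch, not what I literally run: first_success is a made-up helper, and the exact options for the @system fallback are an assumption.

```shell
# Hypothetical helper: run each command string in order and stop at
# the first one that exits successfully.
first_success() {
    for cmd in "$@"; do
        if sh -c "$cmd"; then
            return 0
        fi
    done
    return 1
}

# Intended use (the @system options are assumed, not from my notes):
#   first_success 'emerge -DuNeq @world' \
#                 'emerge --resume' \
#                 'emerge -DuNeq @system'
```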

Since both the vbox and zfs modules fail to load, I would not think this is a zfs-only problem.

Agreed.

Have you tried forcing the loading of these modules?

modprobe --force --verbose <module_name>

No, not yet. I've never had any success forcing the kernel to load modules.

What errors do you get with the new non-booting kernels?

# modprobe --force --verbose vboxdrv
insmod /lib/modules/4.9.76-gentoo-r1/misc/vboxdrv.ko
modprobe: ERROR: could not insert 'vboxdrv': Exec format error

dmesg reports the following for each attempt to (force) load the module.

module: vboxdrv: Unknown rela relocation: 4
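For reference, the number in that dmesg line is an ELF relocation type. The sketch below decodes the first few x86_64 values as listed in the x86-64 psABI; the function name x86_64_reloc_name is made up, and the table is deliberately incomplete.

```shell
# Hypothetical helper: map a small x86_64 relocation type number to
# its psABI name. Only types 0-4 are covered.
x86_64_reloc_name() {
    case "$1" in
        0) echo R_X86_64_NONE ;;
        1) echo R_X86_64_64 ;;
        2) echo R_X86_64_PC32 ;;
        3) echo R_X86_64_GOT32 ;;
        4) echo R_X86_64_PLT32 ;;
        *) echo "unknown ($1)" ;;
    esac
}

x86_64_reloc_name 4    # prints R_X86_64_PLT32

# To count such relocations in the module itself (readelf is part of binutils):
#   readelf -r /lib/modules/4.9.76-gentoo-r1/misc/vboxdrv.ko | grep -c R_X86_64_PLT32
```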

Mick, I get the impression that you've got the correct understanding of my current situation. I'm interested to learn what you think should be done.
